# Hop Teaming Analysis: Database Creation

## Team_Blimp

![dirigibles_vs_blimps.jpg.webp](attachment:dirigibles_vs_blimps.jpg.webp)


For this project, the data was provided by our instructor, Michael Holloway. The project was a collaboration between Hayden Greer, Smita Misra, Tim Simpson, and Asha Maheshwari. This notebook mainly contains the code used to create the database.

Four datasets were used to create the database:

* Hop Teaming dataset: This dataset aims to capture referrals between healthcare providers based on Medicare claims.
  More information about the Hop Teaming data can be found at https://careset.com/docgraph-hop-teaming-dataset/.

* NPPES dataset: To supplement the Hop Teaming data, the NPPES Data Dissemination file was downloaded from https://download.cms.gov/nppes/NPI_Files.html

* Taxonomy Code dataset: The provider taxonomy code classification was downloaded from https://www.nucc.org/index.php/code-sets-mainmenu-41/provider-taxonomy-mainmenu-40/csv-mainmenu-57

* CBSA dataset: Downloaded from https://www.huduser.gov/portal/datasets/usps_crosswalk.html


In [1]:
import pandas as pd
import sqlite3
from tqdm.notebook import tqdm

## Hop Team 

In [2]:
referrals =  pd.read_csv('Data/DocGraph_Hop_Teaming_2018.csv', nrows = 100)
referrals

Unnamed: 0,from_npi,to_npi,patient_count,transaction_count,average_day_wait,std_day_wait
0,1508062167,1730166109,350,370,53.922,72.612
1,1508065640,1730166109,25,25,49.800,55.006
2,1508052093,1730166109,16,16,109.500,70.593
3,1508172545,1730166109,14,14,103.357,75.483
4,1508285131,1730166109,20,21,89.952,89.880
...,...,...,...,...,...,...
95,1508178229,1730166893,31,32,67.125,61.279
96,1508196445,1730166893,23,24,60.833,59.129
97,1508811076,1730166935,14,15,62.533,62.827
98,1508871252,1730166935,29,31,25.323,36.693


We wanted to eliminate "accidental" referrals, so we kept only the Hop Teaming rows where `transaction_count` is at least 50 and `average_day_wait` is less than 50.

In [3]:
db = sqlite3.connect('Data/hop_team.sqlite')

for chunk in tqdm(pd.read_csv('Data/DocGraph_Hop_Teaming_2018.csv', chunksize = 10000)):
    # Keep rows with a transaction_count of at least 50 and an average_day_wait below 50
    chunk = chunk[(chunk['transaction_count'] >= 50) & (chunk['average_day_wait'] < 50)]
    # Append the filtered chunk to the referral table
    chunk.to_sql('referral', db, if_exists = 'append', index = False)

0it [00:00, ?it/s]
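Because the loop writes with `if_exists = 'append'`, running the cell a second time would add the same rows again. One possible guard, sketched here on the assumption that rebuilding the table from scratch is acceptable, is to drop the table before loading. The helper name `load_referrals`, the in-memory database, and the toy `chunks` list are illustrative stand-ins, not part of the project.

```python
import sqlite3
import pandas as pd

# Toy stand-in for the chunks produced by pd.read_csv(..., chunksize=...)
chunks = [
    pd.DataFrame({'transaction_count': [370, 25], 'average_day_wait': [30.0, 49.8]}),
    pd.DataFrame({'transaction_count': [60, 80], 'average_day_wait': [75.0, 12.5]}),
]

def load_referrals(db, chunks):
    """Rebuild the referral table from scratch and return its row count."""
    db.execute('DROP TABLE IF EXISTS referral')   # makes re-runs safe
    for chunk in chunks:
        keep = chunk[(chunk['transaction_count'] >= 50) & (chunk['average_day_wait'] < 50)]
        keep.to_sql('referral', db, if_exists='append', index=False)
    return db.execute('SELECT COUNT(*) FROM referral').fetchone()[0]

db = sqlite3.connect(':memory:')   # in-memory database, for this example only
first_run = load_referrals(db, chunks)
second_run = load_referrals(db, chunks)   # same count, no duplicated rows
db.close()
```

With this pattern the row count after the second call matches the first, whereas a plain append would double it.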

## NPPES

In [4]:
nppes = pd.read_csv('Data/npidata_pfile_20050523-20230212.csv', nrows = 10000)



In [5]:
nppes.columns.tolist()

['NPI',
 'Entity Type Code',
 'Replacement NPI',
 'Employer Identification Number (EIN)',
 'Provider Organization Name (Legal Business Name)',
 'Provider Last Name (Legal Name)',
 'Provider First Name',
 'Provider Middle Name',
 'Provider Name Prefix Text',
 'Provider Name Suffix Text',
 'Provider Credential Text',
 'Provider Other Organization Name',
 'Provider Other Organization Name Type Code',
 'Provider Other Last Name',
 'Provider Other First Name',
 'Provider Other Middle Name',
 'Provider Other Name Prefix Text',
 'Provider Other Name Suffix Text',
 'Provider Other Credential Text',
 'Provider Other Last Name Type Code',
 'Provider First Line Business Mailing Address',
 'Provider Second Line Business Mailing Address',
 'Provider Business Mailing Address City Name',
 'Provider Business Mailing Address State Name',
 'Provider Business Mailing Address Postal Code',
 'Provider Business Mailing Address Country Code (If outside U.S.)',
 'Provider Business Mailing Address Telephone Nu

In [6]:
nppes.shape

(10000, 330)

The NPPES dataset contains a large number of fields, only a few of which are relevant:

* 'NPI'
* Entity Type, indicated by the 'Entity Type Code' field:
  * 1 = Provider (doctors, nurses, etc.)
  * 2 = Facility (Hospitals, Urgent Care, Doctors Offices)
* Entity Name: Either First/Last or Organization or Other Organization Name, contained in the following fields:
  * 'Provider Organization Name (Legal Business Name)'
  * 'Provider Last Name (Legal Name)'
  * 'Provider First Name'
  * 'Provider Middle Name'
  * 'Provider Name Prefix Text'
  * 'Provider Name Suffix Text'
  * 'Provider Credential Text'
* Address: Business Practice Location (not mailing), contained in the following fields:
  * 'Provider First Line Business Practice Location Address'
  * 'Provider Second Line Business Practice Location Address'
  * 'Provider Business Practice Location Address City Name'
  * 'Provider Business Practice Location Address State Name'
  * 'Provider Business Practice Location Address Postal Code'
* The provider's taxonomy code, which is contained in one of the 'Healthcare Provider Taxonomy Code*' columns. A provider can have up to 15 taxonomy codes, but we want the one which has Primary Switch = Y in the associated 'Healthcare Provider Primary Taxonomy Switch*' field. Note that this does not always occur in spot 1.

In [7]:
# Columns describing the provider's identity and practice location
base_columns = [
    'NPI',
    'Entity Type Code',
    'Provider Organization Name (Legal Business Name)',
    'Provider Last Name (Legal Name)',
    'Provider First Name',
    'Provider Middle Name',
    'Provider Name Prefix Text',
    'Provider Name Suffix Text',
    'Provider Credential Text',
    'Provider First Line Business Practice Location Address',
    'Provider Second Line Business Practice Location Address',
    'Provider Business Practice Location Address City Name',
    'Provider Business Practice Location Address State Name',
    'Provider Business Practice Location Address Postal Code',
]

# Taxonomy code / primary switch column pairs for slots 1 through 15
taxonomy_columns = [
    f'Healthcare Provider {field}_{i}'
    for i in range(1, 16)
    for field in ('Taxonomy Code', 'Primary Taxonomy Switch')
]

NPPES = nppes[base_columns + taxonomy_columns]

In [8]:
pd.set_option('display.max_columns', None)
NPPES.tail(20)

Unnamed: 0,NPI,Entity Type Code,Provider Organization Name (Legal Business Name),Provider Last Name (Legal Name),Provider First Name,Provider Middle Name,Provider Name Prefix Text,Provider Name Suffix Text,Provider Credential Text,Provider First Line Business Practice Location Address,Provider Second Line Business Practice Location Address,Provider Business Practice Location Address City Name,Provider Business Practice Location Address State Name,Provider Business Practice Location Address Postal Code,Healthcare Provider Taxonomy Code_1,Healthcare Provider Primary Taxonomy Switch_1,Healthcare Provider Taxonomy Code_2,Healthcare Provider Primary Taxonomy Switch_2,Healthcare Provider Taxonomy Code_3,Healthcare Provider Primary Taxonomy Switch_3,Healthcare Provider Taxonomy Code_4,Healthcare Provider Primary Taxonomy Switch_4,Healthcare Provider Taxonomy Code_5,Healthcare Provider Primary Taxonomy Switch_5,Healthcare Provider Taxonomy Code_6,Healthcare Provider Primary Taxonomy Switch_6,Healthcare Provider Taxonomy Code_7,Healthcare Provider Primary Taxonomy Switch_7,Healthcare Provider Taxonomy Code_8,Healthcare Provider Primary Taxonomy Switch_8,Healthcare Provider Taxonomy Code_9,Healthcare Provider Primary Taxonomy Switch_9,Healthcare Provider Taxonomy Code_10,Healthcare Provider Primary Taxonomy Switch_10,Healthcare Provider Taxonomy Code_11,Healthcare Provider Primary Taxonomy Switch_11,Healthcare Provider Taxonomy Code_12,Healthcare Provider Primary Taxonomy Switch_12,Healthcare Provider Taxonomy Code_13,Healthcare Provider Primary Taxonomy Switch_13,Healthcare Provider Taxonomy Code_14,Healthcare Provider Primary Taxonomy Switch_14,Healthcare Provider Taxonomy Code_15,Healthcare Provider Primary Taxonomy Switch_15
9980,1962401141,1.0,,MOSKOVIC,JEFFREY,,DR.,,DDS,73 MURIEL AVE,,LAWRENCE,NY,115591810.0,1223G0001X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9981,1770582942,1.0,,DIEZ,LORETTA,MARIE,MS.,,"M.A., L.P.",408 SAINT PETER ST,SUITE 429,SAINT PAUL,MN,551021130.0,103T00000X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9982,1235138454,1.0,,HOLCOMB,SHANON,,DR.,,D.C.,1 VALLEY ST,SUITE 106,CARLISLE,PA,170133193.0,111N00000X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9983,1497754618,1.0,,ROSA,LUIS,,,,LPCC,3469 FORTUNA DR,,AKRON,OH,443125281.0,101YM0800X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9984,1124027347,1.0,,NGUYEN,CHARLIE,C,MR.,,RPH,15626 ASHBOURNNE SPRINGS LN,,HOUSTON,TX,770952262.0,183500000X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9985,1942209168,1.0,,IQBAL,AMJAD,,DR.,,M.D.,4455 DRESSLER RD NW,,CANTON,OH,447182769.0,207RC0000X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9986,1366441503,1.0,,LAU,DIANE,M,,,MNT,2142 N COVE BLVD,,TOLEDO,OH,436063895.0,133V00000X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9987,1447259684,1.0,,VIDAL,ADA,,,,M.D.,471 BARNUM AVE,,BRIDGEPORT,CT,66082409.0,208000000X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9988,1770582918,1.0,,DAVIS,ANDREW,ALEXANDER,DR.,,PHARM.D.,4200 EAST NINTH AVE,,DENVER,CO,802620001.0,1835P1200X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
9989,1205835444,1.0,,DURANT,LAURA,SHARON,MRS.,,LCSW,1041 AUTUMN LEAF DR,,WINTER GARDEN,FL,347872111.0,1041C0700X,Y,,,,,,,,,,,,,,,,,,,,,,,,,,,,
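In this output the postal codes are parsed as floats (for example `66082409.0` for the Bridgeport, CT address), so a leading zero is lost. This could matter if the values are later matched against 5-digit ZIP codes such as those in the ZIP-to-CBSA crosswalk. The sketch below shows one way to recover a 5-digit ZIP. It assumes numeric US postal codes that are either 5-digit ZIPs or 9-digit ZIP+4 codes; `to_zip5` is a hypothetical helper, not something the notebook defines.

```python
import pandas as pd

def to_zip5(postal_code):
    """Return a 5-digit ZIP string, or None when the value is missing."""
    if pd.isna(postal_code):
        return None
    digits = str(int(float(postal_code)))   # 66082409.0 -> '66082409'
    # Assumption: values longer than 5 digits are 9-digit ZIP+4 codes and
    # shorter ones are plain 5-digit ZIPs; pad with the zeros that were lost.
    width = 9 if len(digits) > 5 else 5
    return digits.zfill(width)[:5]

# Two values from the output above, plus a short ZIP and a missing value
sample = pd.Series([115591810.0, 66082409.0, 6608.0, None])
zip5 = sample.map(to_zip5)
```

An alternative would be to read the postal code column as text in the first place (for instance through the `dtype` argument of `pd.read_csv`), which avoids the float conversion entirely.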


In [9]:
# Return the taxonomy code whose primary switch is 'Y'
def find_taxonomy_code(row):
    """Scan the 15 taxonomy slots of one NPPES row and return the primary code."""
    for i in range(1, 16):
        switch = f'Healthcare Provider Primary Taxonomy Switch_{i}'
        value = f'Healthcare Provider Taxonomy Code_{i}'
        if row[switch] == 'Y':
            return row[value]
    # No slot is flagged as primary
    return 'no taxonomy switch'
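Applying a Python function row by row can be slow on a file as large as NPPES. The following is a sketch of a possible vectorized alternative that should give the same result as the function above: the first slot whose switch is 'Y', or 'no taxonomy switch' when none is flagged. The name `primary_taxonomy` and its `slots` parameter are hypothetical, and the toy frame uses only two slots to keep the example short.

```python
import numpy as np
import pandas as pd

def primary_taxonomy(df, slots=15):
    """Vectorized lookup of the taxonomy code whose primary switch is 'Y'."""
    code_cols = [f'Healthcare Provider Taxonomy Code_{i}' for i in range(1, slots + 1)]
    switch_cols = [f'Healthcare Provider Primary Taxonomy Switch_{i}' for i in range(1, slots + 1)]
    codes = df[code_cols].to_numpy()
    flags = df[switch_cols].eq('Y').to_numpy()   # True where the switch is 'Y'
    first = flags.argmax(axis=1)                 # position of the first 'Y' (0 if none)
    picked = codes[np.arange(len(df)), first]
    # Fall back to the same sentinel the row-wise function returns
    result = np.where(flags.any(axis=1), picked, 'no taxonomy switch')
    return pd.Series(result, index=df.index)

# Toy data: primary in slot 1, primary in slot 2, and no primary at all
toy = pd.DataFrame({
    'Healthcare Provider Taxonomy Code_1': ['1223G0001X', '207RC0000X', None],
    'Healthcare Provider Primary Taxonomy Switch_1': ['Y', 'N', None],
    'Healthcare Provider Taxonomy Code_2': [None, '208000000X', None],
    'Healthcare Provider Primary Taxonomy Switch_2': [None, 'Y', None],
})
toy['primary_taxonomy_code'] = primary_taxonomy(toy, slots=2)
```

Whether this is worth the extra complexity depends on how long the row-wise version takes on the full file.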

In [10]:
db = sqlite3.connect('Data/hop_team.sqlite')

for chunk in tqdm(pd.read_csv('Data/npidata_pfile_20050523-20230212.csv', chunksize = 10000)):
    # Look up the primary taxonomy code for every row in the chunk
    chunk['primary_taxonomy_code'] = chunk.apply(find_taxonomy_code, axis = 1)
    # Keep only the relevant columns
    chunk = chunk[[
        'NPI',
        'Entity Type Code',
        'Provider Organization Name (Legal Business Name)',
        'Provider Last Name (Legal Name)',
        'Provider First Name',
        'Provider Middle Name',
        'Provider Name Prefix Text',
        'Provider Name Suffix Text',
        'Provider Credential Text',
        'Provider First Line Business Practice Location Address',
        'Provider Second Line Business Practice Location Address',
        'Provider Business Practice Location Address City Name',
        'Provider Business Practice Location Address State Name',
        'Provider Business Practice Location Address Postal Code',
        'primary_taxonomy_code'
    ]]
    # Clean up column names: lowercase, with underscores instead of spaces
    chunk.columns = [x.lower().replace(' ', '_') for x in chunk.columns]
    # Append the chunk to the nppes table
    chunk.to_sql('nppes', db, if_exists = 'append', index = False)

0it [00:00, ?it/s]


In [11]:
db.close()

## Taxonomy Code

In [12]:
taxonomy = pd.read_csv('Data/nucc_taxonomy_230.csv', encoding= 'unicode_escape')

taxonomy.columns = [x.lower().replace(' ', '_') for x in taxonomy.columns]

db = sqlite3.connect('Data/hop_team.sqlite')

taxonomy.to_sql('taxonomy', db, if_exists = 'append', index = False) 

873

In [13]:
db.close()

## CBSA

In [14]:
CBSA = pd.read_excel('Data/ZIP_CBSA_122021.xlsx')

db = sqlite3.connect('Data/hop_team.sqlite')

CBSA.to_sql('CBSA', db, if_exists = 'append', index = False)

47484

In [15]:
db.close()

In [16]:
db = sqlite3.connect('Data/hop_team.sqlite')
db.execute("""

SELECT 
    name
FROM 
    sqlite_schema
WHERE 
    type ='table' AND 
    name NOT LIKE 'sqlite_%';
    """).fetchall()

[('referral',), ('nppes',), ('taxonomy',), ('CBSA',)]

The 'hop_team.sqlite' database contains four tables:

* taxonomy
* CBSA
* referral
* nppes
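Beyond listing the table names, it can be useful to confirm that each table actually received rows. The sketch below counts the rows in every user table. It uses `sqlite_master`, the older alias of `sqlite_schema`, so that it also runs on older SQLite builds. The helper name `table_row_counts` and the toy in-memory tables are assumptions for illustration only.

```python
import sqlite3

def table_row_counts(db):
    """Return {table_name: row_count} for every user table in the database."""
    names = [row[0] for row in db.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    )]
    # Table names come from the database itself; quote them as identifiers
    return {
        name: db.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }

# Toy in-memory database standing in for Data/hop_team.sqlite
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE referral (from_npi INTEGER, to_npi INTEGER)')
db.execute('CREATE TABLE nppes (npi INTEGER)')
db.executemany('INSERT INTO referral VALUES (?, ?)',
               [(1508062167, 1730166109), (1508065640, 1730166109)])
db.execute('INSERT INTO nppes VALUES (1740284231)')
counts = table_row_counts(db)
db.close()
```

Pointing `sqlite3.connect` at the real database file instead of `':memory:'` would report the counts for the four tables listed above.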

In [17]:
# query to check table

query = """

SELECT *
FROM nppes
limit 5;

"""

with sqlite3.connect('Data/hop_team.sqlite') as db: 
    hop = pd.read_sql(query, db)
hop   

Unnamed: 0,npi,entity_type_code,provider_organization_name_(legal_business_name),provider_last_name_(legal_name),provider_first_name,provider_middle_name,provider_name_prefix_text,provider_name_suffix_text,provider_credential_text,provider_first_line_business_practice_location_address,provider_second_line_business_practice_location_address,provider_business_practice_location_address_city_name,provider_business_practice_location_address_state_name,provider_business_practice_location_address_postal_code,primary_taxonomy_code
0,1740284231,,,,,,,,,,,,,,no taxonomy switch
1,1346245800,,,,,,,,,,,,,,no taxonomy switch
2,1487650776,,,,,,,,,,,,,,no taxonomy switch
3,1033113022,,,,,,,,,,,,,,no taxonomy switch
4,1043216138,,,,,,,,,,,,,,no taxonomy switch


In [None]:
# Deleting table in database
# db = sqlite3.connect('Data/hop_team.sqlite')
# db.execute("DROP TABLE taxonomy")