# Goals

This notebook downloads the CMS.gov National Plan and Provider Enumeration System (NPPES) database and processes it into a developer-friendly format hosted on the Payless Health public S3 bucket at https://data.payless.health.

# Background

The Centers for Medicare & Medicaid Services (CMS) maintains a database of National Provider Identifier (NPI) numbers here:

https://www.cms.gov/Regulations-and-Guidance/Administrative-Simplification/NationalProvIdentStand/DataDissemination 

The database is updated monthly. The most recent update at the time of writing (July 2023) is available here:

https://download.cms.gov/nppes/NPPES_Data_Dissemination_July_2023.zip 

This database is needed to join NPI records against hospital price transparency data and Transparency in Coverage data.
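Because the file is republished monthly, the download link changes with each release. The sketch below builds the link for a given month. It assumes every release follows the same naming pattern as the July 2023 link above, which CMS does not guarantee; `nppes_monthly_url` is a hypothetical helper, not part of any CMS tooling.

```python
# Assumption: every monthly release is named like the July 2023 file above.
BASE_URL = "https://download.cms.gov/nppes"

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def nppes_monthly_url(year: int, month: int) -> str:
    """Build the (assumed) download URL for one monthly NPPES release."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    month_name = MONTH_NAMES[month - 1]
    return f"{BASE_URL}/NPPES_Data_Dissemination_{month_name}_{year}.zip"


print(nppes_monthly_url(2023, 7))
```

If the generated link returns a 404, check the data dissemination page linked above for the current file name.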

In [1]:
!wget https://download.cms.gov/nppes/NPPES_Data_Dissemination_July_2023.zip

--2023-07-28 11:41:13--  https://download.cms.gov/nppes/NPPES_Data_Dissemination_July_2023.zip
Resolving download.cms.gov (download.cms.gov)... 104.127.188.67
Connecting to download.cms.gov (download.cms.gov)|104.127.188.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 942006146 (898M) [application/zip]
Saving to: ‘NPPES_Data_Dissemination_July_2023.zip’


2023-07-28 11:44:43 (4.29 MB/s) - ‘NPPES_Data_Dissemination_July_2023.zip’ saved [942006146/942006146]
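Where `wget` is not available, the same download can be done from Python. This is a minimal sketch using only the standard library; `download` is a hypothetical helper, and the 1 MiB chunk size is an arbitrary choice. It streams the response to disk so the 898M archive never has to fit in memory.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: str, chunk_size: int = 1 << 20) -> int:
    """Stream `url` to `dest` in chunks and return the size of the saved file in bytes."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return Path(dest).stat().st_size


# Usage (not executed here because it fetches a large file over the network):
# size = download(
#     "https://download.cms.gov/nppes/NPPES_Data_Dissemination_July_2023.zip",
#     "NPPES_Data_Dissemination_July_2023.zip",
# )
```

For the July 2023 file, the returned size can be compared with the 942006146 bytes reported in the `wget` log above as a basic completeness check.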



In [2]:
!unzip NPPES_Data_Dissemination_July_2023.zip

Archive:  NPPES_Data_Dissemination_July_2023.zip
  inflating: othername_pfile_20050523-20230709.csv  
  inflating: othername_pfile_20050523-20230709_fileheader.csv  
  inflating: endpoint_pfile_20050523-20230709.csv  
  inflating: endpoint_pfile_20050523-20230709_fileheader.csv  
  inflating: pl_pfile_20050523-20230709.csv  
  inflating: pl_pfile_20050523-20230709_fileheader.csv  
  inflating: npidata_pfile_20050523-20230709.csv  
  inflating: npidata_pfile_20050523-20230709_fileheader.csv  
  inflating: NPPES_Data_Dissemination_Readme.pdf  
  inflating: NPPES_Data_Dissemination_CodeValues.pdf  
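The archive pairs each data file with a `_fileheader.csv` companion and ships two PDFs. The sketch below picks out only the data files by name. The regular expression is inferred from the listing above and may not hold for other releases; `data_files` is a hypothetical helper.

```python
import re

# Pattern inferred from the July 2023 listing: <kind>_pfile_<start>-<end>.csv
DATA_FILE = re.compile(r"^(?P<kind>[a-z]+)_pfile_(?P<start>\d{8})-(?P<end>\d{8})\.csv$")


def data_files(names):
    """Map each file kind (npidata, pl, ...) to its data CSV, skipping headers and PDFs."""
    found = {}
    for name in names:
        match = DATA_FILE.match(name)
        if match:
            found[match.group("kind")] = name
    return found


archive_listing = [
    "othername_pfile_20050523-20230709.csv",
    "othername_pfile_20050523-20230709_fileheader.csv",
    "endpoint_pfile_20050523-20230709.csv",
    "endpoint_pfile_20050523-20230709_fileheader.csv",
    "pl_pfile_20050523-20230709.csv",
    "pl_pfile_20050523-20230709_fileheader.csv",
    "npidata_pfile_20050523-20230709.csv",
    "npidata_pfile_20050523-20230709_fileheader.csv",
    "NPPES_Data_Dissemination_Readme.pdf",
    "NPPES_Data_Dissemination_CodeValues.pdf",
]

print(data_files(archive_listing)["npidata"])
```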


In [3]:
!head npidata_pfile_20050523-20230709.csv

"NPI","Entity Type Code","Replacement NPI","Employer Identification Number (EIN)","Provider Organization Name (Legal Business Name)","Provider Last Name (Legal Name)","Provider First Name","Provider Middle Name","Provider Name Prefix Text","Provider Name Suffix Text","Provider Credential Text","Provider Other Organization Name","Provider Other Organization Name Type Code","Provider Other Last Name","Provider Other First Name","Provider Other Middle Name","Provider Other Name Prefix Text","Provider Other Name Suffix Text","Provider Other Credential Text","Provider Other Last Name Type Code","Provider First Line Business Mailing Address","Provider Second Line Business Mailing Address","Provider Business Mailing Address City Name","Provider Business Mailing Address State Name","Provider Business Mailing Address Postal Code","Provider Business Mailing Address Country Code (If outside U.S.)","Provider Business Mailing Address Telephone Number","Provider Business Mailing Address Fax Number",

In [2]:
!ls -lh npidata_pfile_20050523-20230709.csv

-rw-r--r--@ 1 me  staff   8.8G Jul 10 03:57 npidata_pfile_20050523-20230709.csv


In [7]:
!wc -l npidata_pfile_20050523-20230709.csv

 7890767 npidata_pfile_20050523-20230709.csv
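The shell checks above can be approximated in Python without loading the 8.8G file into memory. This is a sketch, assuming the file is UTF-8 encoded (not confirmed by the source); `csv_shape` is a hypothetical helper. Unlike `wc -l`, it counts CSV records rather than physical lines, so a quoted field containing a line break counts once, and the header is excluded from the row count.

```python
import csv


def csv_shape(path: str, encoding: str = "utf-8"):
    """Return (column_count, data_row_count) while streaming the file row by row."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = sum(1 for _ in reader)
    return len(header), rows


# Usage (a full pass over the real file takes a while):
# columns, rows = csv_shape("npidata_pfile_20050523-20230709.csv")
```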


# Plan SQL Query using ChatGPT / Claude
```

SQL code snippet:
```
SELECT *
FROM read_csv('https://data.cityofnewyork.us/api/views/erm2-nwe9/rows.csv?accessType=DOWNLOAD',
    header=True,
    delim=',',
    quote='"',
    columns={'Unique Key': 'BIGINT',
    'Created Date': 'VARCHAR',
    'Closed Date': 'VARCHAR',
    'Agency': 'VARCHAR',
    'Agency Name': 'VARCHAR',
    'Complaint Type': 'VARCHAR',
    'Descriptor': 'VARCHAR',
    'Location Type': 'VARCHAR',
    'Incident Zip': 'VARCHAR',
    'Incident Address': 'VARCHAR',
    'Street Name': 'VARCHAR',
    'Cross Street 1': 'VARCHAR',
    'Cross Street 2': 'VARCHAR',
    'Intersection Street 1': 'VARCHAR',
    'Intersection Street 2': 'VARCHAR',
    'Address Type': 'VARCHAR',
    'City': 'VARCHAR',
    'Landmark': 'VARCHAR',
    'Facility Type': 'VARCHAR',
    'Status': 'VARCHAR',
    'Due Date': 'VARCHAR',
    'Resolution Description': 'VARCHAR',
    'Resolution Action Updated Date': 'VARCHAR',
    'Community Board': 'VARCHAR',
    'BBL': 'VARCHAR',
    'Borough': 'VARCHAR',
    'X Coordinate (State Plane)': 'VARCHAR',
    'Y Coordinate (State Plane)': 'VARCHAR',
    'Open Data Channel Type': 'VARCHAR',
    'Park Facility Name': 'VARCHAR',
    'Park Borough': 'VARCHAR',
    'Vehicle Type': 'VARCHAR',
    'Taxi Company Borough': 'VARCHAR',
    'Taxi Pick Up Location': 'VARCHAR',
    'Bridge Highway Name': 'VARCHAR',
    'Bridge Highway Direction': 'VARCHAR',
    'Road Ramp': 'VARCHAR',
    'Bridge Highway Segment': 'VARCHAR',
    'Latitude': 'DOUBLE',
    'Longitude': 'DOUBLE',
    'Location': 'VARCHAR'}) 
LIMIT 10;
```

Documentation for `read_csv` function parameters from https://duckdb.org/docs/data/csv/overview.html#parameters:

```
 Parameters
Name 	Description 	Type 	Default
all_varchar 	Option to skip type detection for CSV parsing and assume all columns to be of type VARCHAR. 	bool 	false
auto_detect 	Enables auto detection of parameters 	bool 	true
columns 	A struct that specifies the column names and column types contained within the CSV file (e.g. {'col1': 'INTEGER', 'col2': 'VARCHAR'}). 	struct 	(empty)
compression 	The compression type for the file. By default this will be detected automatically from the file extension (e.g. t.csv.gz will use gzip, t.csv will use none). Options are none, gzip, zstd. 	varchar 	auto
dateformat 	Specifies the date format to use when parsing dates. See Date Format 	varchar 	(empty)
decimal_separator 	The decimal separator of numbers 	varchar 	.
delim or sep 	Specifies the string that separates columns within each row (line) of the file. 	varchar 	,
escape 	Specifies the string that should appear before a data character sequence that matches the quote value. 	varchar 	"
filename 	Whether or not an extra filename column should be included in the result. 	bool 	false
force_not_null 	Do not match the specified columns’ values against the NULL string. In the default case where the NULL string is empty, this means that empty values will be read as zero-length strings rather than NULLs. 	varchar[] 	[]
header 	Specifies that the file contains a header line with the names of each column in the file. 	bool 	false
hive_partitioning 	Whether or not to interpret the path as a hive partitioned path. 	bool 	false
ignore_errors 	Option to ignore any parsing errors encountered - and instead ignore rows with errors. 	bool 	false
max_line_size 	The maximum line size in bytes 	bigint 	2097152
names 	The column names as a list. Example here. 	varchar[] 	(empty)
new_line 	Set the new line character(s) in the file. Options are '\r','\n', or '\r\n'. 	varchar 	(empty)
normalize_names 	Boolean value that specifies whether or not column names should be normalized, removing any non-alphanumeric characters from them. 	bool 	false
nullstr 	Specifies the string that represents a NULL value. 	varchar 	(empty)
parallel 	Whether or not the experimental parallel CSV reader is used. 	bool 	false
quote 	Specifies the quoting string to be used when a data value is quoted. 	varchar 	"
sample_size 	The number of sample rows for auto detection of parameters. 	bigint 	20480
skip 	The number of lines at the top of the file to skip. 	bigint 	0
timestampformat 	Specifies the date format to use when parsing timestamps. See Date Format 	varchar 	(empty)
types or dtypes 	The column types as either a list (by position) or a struct (by name). Example here. 	varchar[] or struct 	(empty)
union_by_name 	Whether the columns of multiple schemas should be unified by name, rather than by position. 	bool 	false
```

Please take the above SQL code snippet and documentation of the `read_csv` function used in the query, and rewrite it to load the following file located at `/Users/me/dropbox/projects/data_build_tool_payless.health/npidata_pfile_20050523-20230709.csv` which has the following header:


```
"NPI","Entity Type Code","Replacement NPI","Employer Identification Number (EIN)","Provider Organization Name (Legal Business Name)","Provider Last Name (Legal Name)","Provider First Name","Provider Middle Name","Provider Name Prefix Text","Provider Name Suffix Text","Provider Credential Text","Provider Other Organization Name","Provider Other Organization Name Type Code","Provider Other Last Name","Provider Other First Name","Provider Other Middle Name","Provider Other Name Prefix Text","Provider Other Name Suffix Text","Provider Other Credential Text","Provider Other Last Name Type Code","Provider First Line Business Mailing Address","Provider Second Line Business Mailing Address","Provider Business Mailing Address City Name","Provider Business Mailing Address State Name","Provider Business Mailing Address Postal Code","Provider Business Mailing Address Country Code (If outside U.S.)","Provider Business Mailing Address Telephone Number","Provider Business Mailing Address Fax Number","Provider First Line Business Practice Location Address","Provider Second Line Business Practice Location Address","Provider Business Practice Location Address City Name","Provider Business Practice Location Address State Name","Provider Business Practice Location Address Postal Code","Provider Business Practice Location Address Country Code (If outside U.S.)","Provider Business Practice Location Address Telephone Number","Provider Business Practice Location Address Fax Number","Provider Enumeration Date","Last Update Date","NPI Deactivation Reason Code","NPI Deactivation Date","NPI Reactivation Date","Provider Gender Code","Authorized Official Last Name","Authorized Official First Name","Authorized Official Middle Name","Authorized Official Title or Position","Authorized Official Telephone Number","Healthcare Provider Taxonomy Code_1","Provider License Number_1","Provider License Number State Code_1","Healthcare Provider Primary Taxonomy Switch_1","Healthcare Provider Taxonomy 
Code_2","Provider License Number_2","Provider License Number State Code_2","Healthcare Provider Primary Taxonomy Switch_2","Healthcare Provider Taxonomy Code_3","Provider License Number_3","Provider License Number State Code_3","Healthcare Provider Primary Taxonomy Switch_3","Healthcare Provider Taxonomy Code_4","Provider License Number_4","Provider License Number State Code_4","Healthcare Provider Primary Taxonomy Switch_4","Healthcare Provider Taxonomy Code_5","Provider License Number_5","Provider License Number State Code_5","Healthcare Provider Primary Taxonomy Switch_5","Healthcare Provider Taxonomy Code_6","Provider License Number_6","Provider License Number State Code_6","Healthcare Provider Primary Taxonomy Switch_6","Healthcare Provider Taxonomy Code_7","Provider License Number_7","Provider License Number State Code_7","Healthcare Provider Primary Taxonomy Switch_7","Healthcare Provider Taxonomy Code_8","Provider License Number_8","Provider License Number State Code_8","Healthcare Provider Primary Taxonomy Switch_8","Healthcare Provider Taxonomy Code_9","Provider License Number_9","Provider License Number State Code_9","Healthcare Provider Primary Taxonomy Switch_9","Healthcare Provider Taxonomy Code_10","Provider License Number_10","Provider License Number State Code_10","Healthcare Provider Primary Taxonomy Switch_10","Healthcare Provider Taxonomy Code_11","Provider License Number_11","Provider License Number State Code_11","Healthcare Provider Primary Taxonomy Switch_11","Healthcare Provider Taxonomy Code_12","Provider License Number_12","Provider License Number State Code_12","Healthcare Provider Primary Taxonomy Switch_12","Healthcare Provider Taxonomy Code_13","Provider License Number_13","Provider License Number State Code_13","Healthcare Provider Primary Taxonomy Switch_13","Healthcare Provider Taxonomy Code_14","Provider License Number_14","Provider License Number State Code_14","Healthcare Provider Primary Taxonomy Switch_14","Healthcare Provider 
Taxonomy Code_15","Provider License Number_15","Provider License Number State Code_15","Healthcare Provider Primary Taxonomy Switch_15","Other Provider Identifier_1","Other Provider Identifier Type Code_1","Other Provider Identifier State_1","Other Provider Identifier Issuer_1","Other Provider Identifier_2","Other Provider Identifier Type Code_2","Other Provider Identifier State_2","Other Provider Identifier Issuer_2","Other Provider Identifier_3","Other Provider Identifier Type Code_3","Other Provider Identifier State_3","Other Provider Identifier Issuer_3","Other Provider Identifier_4","Other Provider Identifier Type Code_4","Other Provider Identifier State_4","Other Provider Identifier Issuer_4","Other Provider Identifier_5","Other Provider Identifier Type Code_5","Other Provider Identifier State_5","Other Provider Identifier Issuer_5","Other Provider Identifier_6","Other Provider Identifier Type Code_6","Other Provider Identifier State_6","Other Provider Identifier Issuer_6","Other Provider Identifier_7","Other Provider Identifier Type Code_7","Other Provider Identifier State_7","Other Provider Identifier Issuer_7","Other Provider Identifier_8","Other Provider Identifier Type Code_8","Other Provider Identifier State_8","Other Provider Identifier Issuer_8","Other Provider Identifier_9","Other Provider Identifier Type Code_9","Other Provider Identifier State_9","Other Provider Identifier Issuer_9","Other Provider Identifier_10","Other Provider Identifier Type Code_10","Other Provider Identifier State_10","Other Provider Identifier Issuer_10","Other Provider Identifier_11","Other Provider Identifier Type Code_11","Other Provider Identifier State_11","Other Provider Identifier Issuer_11","Other Provider Identifier_12","Other Provider Identifier Type Code_12","Other Provider Identifier State_12","Other Provider Identifier Issuer_12","Other Provider Identifier_13","Other Provider Identifier Type Code_13","Other Provider Identifier State_13","Other Provider Identifier 
Issuer_13","Other Provider Identifier_14","Other Provider Identifier Type Code_14","Other Provider Identifier State_14","Other Provider Identifier Issuer_14","Other Provider Identifier_15","Other Provider Identifier Type Code_15","Other Provider Identifier State_15","Other Provider Identifier Issuer_15","Other Provider Identifier_16","Other Provider Identifier Type Code_16","Other Provider Identifier State_16","Other Provider Identifier Issuer_16","Other Provider Identifier_17","Other Provider Identifier Type Code_17","Other Provider Identifier State_17","Other Provider Identifier Issuer_17","Other Provider Identifier_18","Other Provider Identifier Type Code_18","Other Provider Identifier State_18","Other Provider Identifier Issuer_18","Other Provider Identifier_19","Other Provider Identifier Type Code_19","Other Provider Identifier State_19","Other Provider Identifier Issuer_19","Other Provider Identifier_20","Other Provider Identifier Type Code_20","Other Provider Identifier State_20","Other Provider Identifier Issuer_20","Other Provider Identifier_21","Other Provider Identifier Type Code_21","Other Provider Identifier State_21","Other Provider Identifier Issuer_21","Other Provider Identifier_22","Other Provider Identifier Type Code_22","Other Provider Identifier State_22","Other Provider Identifier Issuer_22","Other Provider Identifier_23","Other Provider Identifier Type Code_23","Other Provider Identifier State_23","Other Provider Identifier Issuer_23","Other Provider Identifier_24","Other Provider Identifier Type Code_24","Other Provider Identifier State_24","Other Provider Identifier Issuer_24","Other Provider Identifier_25","Other Provider Identifier Type Code_25","Other Provider Identifier State_25","Other Provider Identifier Issuer_25","Other Provider Identifier_26","Other Provider Identifier Type Code_26","Other Provider Identifier State_26","Other Provider Identifier Issuer_26","Other Provider Identifier_27","Other Provider Identifier Type 
Code_27","Other Provider Identifier State_27","Other Provider Identifier Issuer_27","Other Provider Identifier_28","Other Provider Identifier Type Code_28","Other Provider Identifier State_28","Other Provider Identifier Issuer_28","Other Provider Identifier_29","Other Provider Identifier Type Code_29","Other Provider Identifier State_29","Other Provider Identifier Issuer_29","Other Provider Identifier_30","Other Provider Identifier Type Code_30","Other Provider Identifier State_30","Other Provider Identifier Issuer_30","Other Provider Identifier_31","Other Provider Identifier Type Code_31","Other Provider Identifier State_31","Other Provider Identifier Issuer_31","Other Provider Identifier_32","Other Provider Identifier Type Code_32","Other Provider Identifier State_32","Other Provider Identifier Issuer_32","Other Provider Identifier_33","Other Provider Identifier Type Code_33","Other Provider Identifier State_33","Other Provider Identifier Issuer_33","Other Provider Identifier_34","Other Provider Identifier Type Code_34","Other Provider Identifier State_34","Other Provider Identifier Issuer_34","Other Provider Identifier_35","Other Provider Identifier Type Code_35","Other Provider Identifier State_35","Other Provider Identifier Issuer_35","Other Provider Identifier_36","Other Provider Identifier Type Code_36","Other Provider Identifier State_36","Other Provider Identifier Issuer_36","Other Provider Identifier_37","Other Provider Identifier Type Code_37","Other Provider Identifier State_37","Other Provider Identifier Issuer_37","Other Provider Identifier_38","Other Provider Identifier Type Code_38","Other Provider Identifier State_38","Other Provider Identifier Issuer_38","Other Provider Identifier_39","Other Provider Identifier Type Code_39","Other Provider Identifier State_39","Other Provider Identifier Issuer_39","Other Provider Identifier_40","Other Provider Identifier Type Code_40","Other Provider Identifier State_40","Other Provider Identifier 
Issuer_40","Other Provider Identifier_41","Other Provider Identifier Type Code_41","Other Provider Identifier State_41","Other Provider Identifier Issuer_41","Other Provider Identifier_42","Other Provider Identifier Type Code_42","Other Provider Identifier State_42","Other Provider Identifier Issuer_42","Other Provider Identifier_43","Other Provider Identifier Type Code_43","Other Provider Identifier State_43","Other Provider Identifier Issuer_43","Other Provider Identifier_44","Other Provider Identifier Type Code_44","Other Provider Identifier State_44","Other Provider Identifier Issuer_44","Other Provider Identifier_45","Other Provider Identifier Type Code_45","Other Provider Identifier State_45","Other Provider Identifier Issuer_45","Other Provider Identifier_46","Other Provider Identifier Type Code_46","Other Provider Identifier State_46","Other Provider Identifier Issuer_46","Other Provider Identifier_47","Other Provider Identifier Type Code_47","Other Provider Identifier State_47","Other Provider Identifier Issuer_47","Other Provider Identifier_48","Other Provider Identifier Type Code_48","Other Provider Identifier State_48","Other Provider Identifier Issuer_48","Other Provider Identifier_49","Other Provider Identifier Type Code_49","Other Provider Identifier State_49","Other Provider Identifier Issuer_49","Other Provider Identifier_50","Other Provider Identifier Type Code_50","Other Provider Identifier State_50","Other Provider Identifier Issuer_50","Is Sole Proprietor","Is Organization Subpart","Parent Organization LBN","Parent Organization TIN","Authorized Official Name Prefix Text","Authorized Official Name Suffix Text","Authorized Official Credential Text","Healthcare Provider Taxonomy Group_1","Healthcare Provider Taxonomy Group_2","Healthcare Provider Taxonomy Group_3","Healthcare Provider Taxonomy Group_4","Healthcare Provider Taxonomy Group_5","Healthcare Provider Taxonomy Group_6","Healthcare Provider Taxonomy Group_7","Healthcare Provider Taxonomy 
Group_8","Healthcare Provider Taxonomy Group_9","Healthcare Provider Taxonomy Group_10","Healthcare Provider Taxonomy Group_11","Healthcare Provider Taxonomy Group_12","Healthcare Provider Taxonomy Group_13","Healthcare Provider Taxonomy Group_14","Healthcare Provider Taxonomy Group_15","Certification Date"
"1679576722","1","","","","WIEBE","DAVID","A","","","M.D.","","","","","","","","","","PO BOX 2168","","KEARNEY","NE","688482168","US","3088652512","3088652506","3500 CENTRAL AVE","","KEARNEY","NE","688472944","US","3088652512","3088652506","05/23/2005","07/08/2007","","","","M","","","","","","207X00000X","12637","NE","Y","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","46969","01","KS","BCBS","1553","01","NE","BCBS","645540","01","KS","FIRSTGUARD","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","X","","","","","","","","","","","","","","","","","","","","","",""
"1588667638","1","","","","PILCHER","WILLIAM","C","DR.","","MD","","","","","","","","","","1824 KING STREET","SUITE 300","JACKSONVILLE","FL","322044736","US","9043881820","9043881827","1824 KING STREET","SUITE 300","JACKSONVILLE","FL","322044736","US","9043881820","9043881827","05/23/2005","05/29/2014","","","","M","","","","","","207RC0000X","ME68414","FL","Y","207RC0000X","032024","GA","N","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","0897705","01","FL","AETNA","510265","01","GA","BCBS","00532485C","05","GA","","208143","01","FL","AVMED","251286600","05","FL","","27888","01","FL","BCBS","00706626A","05","GA","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","","","","","","","","","","","","","","","","","","","",""
"1497758544","2","","<UNAVAIL>","CUMBERLAND COUNTY HOSPITAL SYSTEM, INC","","","","","","","CAPE FEAR VALLEY HOME HEALTH AND HOSPICE","3","","","","","","","","3418 VILLAGE DR","","FAYETTEVILLE","NC","283044552","US","9106096740","","3418 VILLAGE DR","","FAYETTEVILLE","NC","283044552","US","9106096740","","05/23/2005","09/26/2011","","","","","NAGOWSKI","MICHAEL","","CEO","9106096700","251G00000X","HC0283","NC","Y","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","3401562","05","NC","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","MR.","","","","","","","","","","","","","","","","","",""
"1306849450","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","03/03/2021","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""
"1215930367","1","","","","GRESSOT","LAURENT","","DR.","","M.D.","","","","","","","","","","17323 RED OAK DR","","HOUSTON","TX","770901243","US","2814405006","2814406149","17323 RED OAK DR","","HOUSTON","TX","770901243","US","2814405006","2814406149","05/23/2005","11/25/2014","","","","M","","","","","","174400000X","H6257","TX","N","207RH0003X","H6257","TX","Y","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","","","","","","","","","","","","","","","","","","","",""
"1023011178","2","","<UNAVAIL>","COLLABRIA CARE","","","","","","","NAPA VALLEY HOSPICE & ADULT DAY SERVICES","4","","","","","","","","414 S JEFFERSON ST","","NAPA","CA","945594515","US","7072589080","7072582476","414 S JEFFERSON ST","","NAPA","CA","945594515","US","7072589080","7072582476","05/23/2005","06/09/2020","","","","","ANDERSON","DONALD","WAYNE","ASSISTANT SECRETARY OF ENROLLMENT","4255255392","251G00000X","100000741","CA","Y","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","HPC01537G","05","CA","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","","JR.","","","","","","","","","","","","","","","","","06/09/2020"
"1932102084","1","","","","ADUSUMILLI","RAVI","K","","","MD","","","","","","","","","","2940 N MCCORD RD","","TOLEDO","OH","436151753","US","4198423000","4198423048","2940 N MCCORD RD","","TOLEDO","OH","436151753","US","4198423000","4198423048","05/23/2005","04/23/2012","","","","M","","","","","","207RC0000X","35069014","OH","Y","207RC0000X","4301081344","MI","N","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","0178623","05","OH","","P00751116","01","","RAILROAD MEDICARE","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","","","","","","","","","","","","","","","","","","","",""
"1841293990","1","","","","WORTSMAN","SUSAN","","","","MA-CCC","","","","","","","","","","68 ROCKLEDGE RD","APT 1C","HARTSDALE","NY","105303455","US","2124814464","","425 E 25TH ST","","NEW YORK","NY","100102547","US","2124814464","","05/23/2005","07/08/2007","","","","F","","","","","","231H00000X","000396-1","NY","Y","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","","","","","","","","","","","","","","","","","","","",""
"1750384806","1","","","","BISBEE","ROBERT","","DR.","","MD","","","","","","","","","","5219 CITY BANK PKWY STE 214","","LUBBOCK","TX","794073537","US","8067810360","8067820097","808 JOLIET AVE UNIT 120","","LUBBOCK","TX","794151148","US","8067610540","8067610451","05/23/2005","03/10/2017","","","","M","","","","","","207R00000X","J8461","TX","Y","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","N","","","","","","","","","","","","","","","","","","","","","",""
```
```
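As an alternative to prompting a language model, the `columns` struct can be generated directly from the header line. The sketch below is not the approach taken in this notebook. It assumes `VARCHAR` is an acceptable default for every column and that specific columns are overridden by hand; `duckdb_columns_struct` is a hypothetical helper, and the three-column header is a shortened stand-in for the real one.

```python
import csv
import io


def duckdb_columns_struct(header_line: str, overrides=None, default_type: str = "VARCHAR") -> str:
    """Render a DuckDB `columns={...}` struct literal from one CSV header line."""
    overrides = overrides or {}
    # Parse with the csv module so quoted names containing commas stay intact.
    names = next(csv.reader(io.StringIO(header_line)))
    entries = []
    for name in names:
        sql_type = overrides.get(name, default_type)
        quoted = name.replace("'", "''")  # escape single quotes for SQL
        entries.append(f"    '{quoted}': '{sql_type}'")
    return "{\n" + ",\n".join(entries) + "\n}"


# Shortened stand-in for the real header, for illustration only.
header = '"NPI","Entity Type Code","Replacement NPI"'
print(duckdb_columns_struct(header, overrides={"NPI": "BIGINT", "Replacement NPI": "BIGINT"}))
```

The printed struct can be pasted into the `columns=` argument of `read_csv`. Generating it from the file's own header avoids transcription errors across several hundred column names.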

# Test SQL query in notebook

In [2]:
# Load duckdb, which lets us efficiently load large files
import duckdb

# Import jupysql Jupyter extension to create SQL cells
%load_ext sql

# Configure jupysql to return results as Pandas DataFrames and to simplify the output printed to the notebook.
%config SqlMagic.autopandas = True

%config SqlMagic.feedback = False
%config SqlMagic.displaycon = False

# Connect jupysql to DuckDB using a SQLAlchemy-style connection string. Connect either to an in-memory DuckDB or to a file-backed database.
%sql duckdb:///:memory:

In [5]:
%%sql
SELECT *
FROM read_csv('npidata_pfile_20050523-20230709.csv', 
  header=True,
  delim=',',
  quote='"',
  nullstr='',
  dateformat='%m/%d/%Y',
  parallel=false,
  columns={
    'NPI': 'BIGINT',
    'Entity Type Code': 'VARCHAR',
    'Replacement NPI': 'BIGINT', 
    'Employer Identification Number (EIN)': 'VARCHAR',
    'Provider Organization Name (Legal Business Name)': 'VARCHAR',
    'Provider Last Name (Legal Name)': 'VARCHAR',
    'Provider First Name': 'VARCHAR',
    'Provider Middle Name': 'VARCHAR',
    'Provider Name Prefix Text': 'VARCHAR',
    'Provider Name Suffix Text': 'VARCHAR',
    'Provider Credential Text': 'VARCHAR',
    'Provider Other Organization Name': 'VARCHAR',
    'Provider Other Organization Name Type Code': 'VARCHAR',
    'Provider Other Last Name': 'VARCHAR',
    'Provider Other First Name': 'VARCHAR',
    'Provider Other Middle Name': 'VARCHAR',
    'Provider Other Name Prefix Text': 'VARCHAR',
    'Provider Other Name Suffix Text': 'VARCHAR',
    'Provider Other Credential Text': 'VARCHAR',
    'Provider Other Last Name Type Code': 'INT',
    'Provider First Line Business Mailing Address': 'VARCHAR',
    'Provider Second Line Business Mailing Address': 'VARCHAR',
    'Provider Business Mailing Address City Name': 'VARCHAR',
    'Provider Business Mailing Address State Name': 'VARCHAR',
    'Provider Business Mailing Address Postal Code': 'VARCHAR',
    'Provider Business Mailing Address Country Code (If outside U.S.)': 'VARCHAR',
    'Provider Business Mailing Address Telephone Number': 'VARCHAR',
    'Provider Business Mailing Address Fax Number': 'VARCHAR',
    'Provider First Line Business Practice Location Address': 'VARCHAR',
    'Provider Second Line Business Practice Location Address': 'VARCHAR',
    'Provider Business Practice Location Address City Name': 'VARCHAR',
    'Provider Business Practice Location Address State Name': 'VARCHAR',
    'Provider Business Practice Location Address Postal Code': 'VARCHAR',
    'Provider Business Practice Location Address Country Code (If outside U.S.)': 'VARCHAR',
    'Provider Business Practice Location Address Telephone Number': 'VARCHAR',
    'Provider Business Practice Location Address Fax Number': 'VARCHAR',
    'Provider Enumeration Date': 'VARCHAR',
    'Last Update Date': 'VARCHAR',
    'NPI Deactivation Reason Code': 'VARCHAR',
    'NPI Deactivation Date': 'VARCHAR',
    'NPI Reactivation Date': 'VARCHAR',
    'Provider Gender Code': 'VARCHAR', 
    'Authorized Official Last Name': 'VARCHAR',
    'Authorized Official First Name': 'VARCHAR',
    'Authorized Official Middle Name': 'VARCHAR',
    'Authorized Official Title or Position': 'VARCHAR',
    'Authorized Official Telephone Number': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_1': 'VARCHAR',
    'Provider License Number_1': 'VARCHAR',
    'Provider License Number State Code_1': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_1': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_2': 'VARCHAR',
    'Provider License Number_2': 'VARCHAR',
    'Provider License Number State Code_2': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_2': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_3': 'VARCHAR',
    'Provider License Number_3': 'VARCHAR',
    'Provider License Number State Code_3': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_3': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_4': 'VARCHAR',
    'Provider License Number_4': 'VARCHAR',
    'Provider License Number State Code_4': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_4': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_5': 'VARCHAR',
    'Provider License Number_5': 'VARCHAR',
    'Provider License Number State Code_5': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_5': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_6': 'VARCHAR',
    'Provider License Number_6': 'VARCHAR',
    'Provider License Number State Code_6': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_6': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_7': 'VARCHAR',
    'Provider License Number_7': 'VARCHAR',
    'Provider License Number State Code_7': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_7': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_8': 'VARCHAR',
    'Provider License Number_8': 'VARCHAR',
    'Provider License Number State Code_8': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_8': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_9': 'VARCHAR',
    'Provider License Number_9': 'VARCHAR',
    'Provider License Number State Code_9': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_9': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_10': 'VARCHAR',
    'Provider License Number_10': 'VARCHAR',
    'Provider License Number State Code_10': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_10': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_11': 'VARCHAR',
    'Provider License Number_11': 'VARCHAR',
    'Provider License Number State Code_11': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_11': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_12': 'VARCHAR',
    'Provider License Number_12': 'VARCHAR',
    'Provider License Number State Code_12': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_12': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_13': 'VARCHAR',
    'Provider License Number_13': 'VARCHAR',
    'Provider License Number State Code_13': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_13': 'VARCHAR',   
    'Healthcare Provider Taxonomy Code_14': 'VARCHAR',
    'Provider License Number_14': 'VARCHAR',
    'Provider License Number State Code_14': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_14': 'VARCHAR',  
    'Healthcare Provider Taxonomy Code_15': 'VARCHAR',
    'Provider License Number_15': 'VARCHAR',
    'Provider License Number State Code_15': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_15': 'VARCHAR',
    'Other Provider Identifier_1': 'VARCHAR',
    'Other Provider Identifier Type Code_1': 'VARCHAR',
    'Other Provider Identifier State_1': 'VARCHAR',
    'Other Provider Identifier Issuer_1': 'VARCHAR',
    'Other Provider Identifier_2': 'VARCHAR',
    'Other Provider Identifier Type Code_2': 'VARCHAR',
    'Other Provider Identifier State_2': 'VARCHAR',
    'Other Provider Identifier Issuer_2': 'VARCHAR',
    'Other Provider Identifier_3': 'VARCHAR',
    'Other Provider Identifier Type Code_3': 'VARCHAR',
    'Other Provider Identifier State_3': 'VARCHAR',
    'Other Provider Identifier Issuer_3': 'VARCHAR',
    'Other Provider Identifier_4': 'VARCHAR',
    'Other Provider Identifier Type Code_4': 'VARCHAR',
    'Other Provider Identifier State_4': 'VARCHAR',
    'Other Provider Identifier Issuer_4': 'VARCHAR',
    'Other Provider Identifier_5': 'VARCHAR',
    'Other Provider Identifier Type Code_5': 'VARCHAR',
    'Other Provider Identifier State_5': 'VARCHAR',
    'Other Provider Identifier Issuer_5': 'VARCHAR',
    'Other Provider Identifier_6': 'VARCHAR',
    'Other Provider Identifier Type Code_6': 'VARCHAR',
    'Other Provider Identifier State_6': 'VARCHAR',
    'Other Provider Identifier Issuer_6': 'VARCHAR',
    'Other Provider Identifier_7': 'VARCHAR',
    'Other Provider Identifier Type Code_7': 'VARCHAR',
    'Other Provider Identifier State_7': 'VARCHAR',
    'Other Provider Identifier Issuer_7': 'VARCHAR',
    'Other Provider Identifier_8': 'VARCHAR',
    'Other Provider Identifier Type Code_8': 'VARCHAR',
    'Other Provider Identifier State_8': 'VARCHAR',
    'Other Provider Identifier Issuer_8': 'VARCHAR',
    'Other Provider Identifier_9': 'VARCHAR',
    'Other Provider Identifier Type Code_9': 'VARCHAR',
    'Other Provider Identifier State_9': 'VARCHAR',
    'Other Provider Identifier Issuer_9': 'VARCHAR',
    'Other Provider Identifier_10': 'VARCHAR',
    'Other Provider Identifier Type Code_10': 'VARCHAR',
    'Other Provider Identifier State_10': 'VARCHAR',
    'Other Provider Identifier Issuer_10': 'VARCHAR',
    'Other Provider Identifier_11': 'VARCHAR',
    'Other Provider Identifier Type Code_11': 'VARCHAR',
    'Other Provider Identifier State_11': 'VARCHAR',
    'Other Provider Identifier Issuer_11': 'VARCHAR',
    'Other Provider Identifier_12': 'VARCHAR',
    'Other Provider Identifier Type Code_12': 'VARCHAR',
    'Other Provider Identifier State_12': 'VARCHAR',
    'Other Provider Identifier Issuer_12': 'VARCHAR',
    'Other Provider Identifier_13': 'VARCHAR',
    'Other Provider Identifier Type Code_13': 'VARCHAR',
    'Other Provider Identifier State_13': 'VARCHAR',
    'Other Provider Identifier Issuer_13': 'VARCHAR',
    'Other Provider Identifier_14': 'VARCHAR',
    'Other Provider Identifier Type Code_14': 'VARCHAR', 
    'Other Provider Identifier State_14': 'VARCHAR',
    'Other Provider Identifier Issuer_14': 'VARCHAR',
    'Other Provider Identifier_15': 'VARCHAR',
    'Other Provider Identifier Type Code_15': 'VARCHAR',
    'Other Provider Identifier State_15': 'VARCHAR',
    'Other Provider Identifier Issuer_15': 'VARCHAR',
    'Other Provider Identifier_16': 'VARCHAR',
    'Other Provider Identifier Type Code_16': 'VARCHAR',
    'Other Provider Identifier State_16': 'VARCHAR',
    'Other Provider Identifier Issuer_16': 'VARCHAR',
    'Other Provider Identifier_17': 'VARCHAR',
    'Other Provider Identifier Type Code_17': 'VARCHAR',
    'Other Provider Identifier State_17': 'VARCHAR',
    'Other Provider Identifier Issuer_17': 'VARCHAR',
    'Other Provider Identifier_18': 'VARCHAR',
    'Other Provider Identifier Type Code_18': 'VARCHAR',
    'Other Provider Identifier State_18': 'VARCHAR',
    'Other Provider Identifier Issuer_18': 'VARCHAR',
    'Other Provider Identifier_19': 'VARCHAR',
    'Other Provider Identifier Type Code_19': 'VARCHAR',
    'Other Provider Identifier State_19': 'VARCHAR',
    'Other Provider Identifier Issuer_19': 'VARCHAR',
    'Other Provider Identifier_20': 'VARCHAR',
    'Other Provider Identifier Type Code_20': 'VARCHAR',
    'Other Provider Identifier State_20': 'VARCHAR',
    'Other Provider Identifier Issuer_20': 'VARCHAR',
    'Other Provider Identifier_21': 'VARCHAR',
    'Other Provider Identifier Type Code_21': 'VARCHAR',
    'Other Provider Identifier State_21': 'VARCHAR',
    'Other Provider Identifier Issuer_21': 'VARCHAR',
    'Other Provider Identifier_22': 'VARCHAR',
    'Other Provider Identifier Type Code_22': 'VARCHAR',
    'Other Provider Identifier State_22': 'VARCHAR',
    'Other Provider Identifier Issuer_22': 'VARCHAR',
    'Other Provider Identifier_23': 'VARCHAR',
    'Other Provider Identifier Type Code_23': 'VARCHAR',
    'Other Provider Identifier State_23': 'VARCHAR',
    'Other Provider Identifier Issuer_23': 'VARCHAR',
    'Other Provider Identifier_24': 'VARCHAR',  
    'Other Provider Identifier Type Code_24': 'VARCHAR',
    'Other Provider Identifier State_24': 'VARCHAR',
    'Other Provider Identifier Issuer_24': 'VARCHAR',
    'Other Provider Identifier_25': 'VARCHAR',
    'Other Provider Identifier Type Code_25': 'VARCHAR',
    'Other Provider Identifier State_25': 'VARCHAR',
    'Other Provider Identifier Issuer_25': 'VARCHAR',
    'Other Provider Identifier_26': 'VARCHAR',
    'Other Provider Identifier Type Code_26': 'VARCHAR',
    'Other Provider Identifier State_26': 'VARCHAR',
    'Other Provider Identifier Issuer_26': 'VARCHAR',
    'Other Provider Identifier_27': 'VARCHAR',
    'Other Provider Identifier Type Code_27': 'VARCHAR',
    'Other Provider Identifier State_27': 'VARCHAR',
    'Other Provider Identifier Issuer_27': 'VARCHAR',
    'Other Provider Identifier_28': 'VARCHAR',
    'Other Provider Identifier Type Code_28': 'VARCHAR',
    'Other Provider Identifier State_28': 'VARCHAR',
    'Other Provider Identifier Issuer_28': 'VARCHAR',
    'Other Provider Identifier_29': 'VARCHAR',
    'Other Provider Identifier Type Code_29': 'VARCHAR',
    'Other Provider Identifier State_29': 'VARCHAR',
    'Other Provider Identifier Issuer_29': 'VARCHAR',
    'Other Provider Identifier_30': 'VARCHAR',
    'Other Provider Identifier Type Code_30': 'VARCHAR',
    'Other Provider Identifier State_30': 'VARCHAR',
    'Other Provider Identifier Issuer_30': 'VARCHAR',
    'Other Provider Identifier_31': 'VARCHAR',
    'Other Provider Identifier Type Code_31': 'VARCHAR',
    'Other Provider Identifier State_31': 'VARCHAR',
    'Other Provider Identifier Issuer_31': 'VARCHAR',
    'Other Provider Identifier_32': 'VARCHAR',
    'Other Provider Identifier Type Code_32': 'VARCHAR',
    'Other Provider Identifier State_32': 'VARCHAR',
    'Other Provider Identifier Issuer_32': 'VARCHAR',
    'Other Provider Identifier_33': 'VARCHAR',
    'Other Provider Identifier Type Code_33': 'VARCHAR',
    'Other Provider Identifier State_33': 'VARCHAR',
    'Other Provider Identifier Issuer_33': 'VARCHAR',
    'Other Provider Identifier_34': 'VARCHAR',
    'Other Provider Identifier Type Code_34': 'VARCHAR',
    'Other Provider Identifier State_34': 'VARCHAR',
    'Other Provider Identifier Issuer_34': 'VARCHAR',
    'Other Provider Identifier_35': 'VARCHAR',
    'Other Provider Identifier Type Code_35': 'VARCHAR',
    'Other Provider Identifier State_35': 'VARCHAR',
    'Other Provider Identifier Issuer_35': 'VARCHAR',
    'Other Provider Identifier_36': 'VARCHAR',
    'Other Provider Identifier Type Code_36': 'VARCHAR',
    'Other Provider Identifier State_36': 'VARCHAR',
    'Other Provider Identifier Issuer_36': 'VARCHAR',
    'Other Provider Identifier_37': 'VARCHAR',
    'Other Provider Identifier Type Code_37': 'VARCHAR',
    'Other Provider Identifier State_37': 'VARCHAR',
    'Other Provider Identifier Issuer_37': 'VARCHAR',
    'Other Provider Identifier_38': 'VARCHAR',
    'Other Provider Identifier Type Code_38': 'VARCHAR',
    'Other Provider Identifier State_38': 'VARCHAR',
    'Other Provider Identifier Issuer_38': 'VARCHAR',
    'Other Provider Identifier_39': 'VARCHAR',
    'Other Provider Identifier Type Code_39': 'VARCHAR',
    'Other Provider Identifier State_39': 'VARCHAR',
    'Other Provider Identifier Issuer_39': 'VARCHAR',
    'Other Provider Identifier_40': 'VARCHAR',
    'Other Provider Identifier Type Code_40': 'VARCHAR',
    'Other Provider Identifier State_40': 'VARCHAR',
    'Other Provider Identifier Issuer_40': 'VARCHAR',
    'Other Provider Identifier_41': 'VARCHAR',
    'Other Provider Identifier Type Code_41': 'VARCHAR',
    'Other Provider Identifier State_41': 'VARCHAR',
    'Other Provider Identifier Issuer_41': 'VARCHAR',
    'Other Provider Identifier_42': 'VARCHAR',
    'Other Provider Identifier Type Code_42': 'VARCHAR',
    'Other Provider Identifier State_42': 'VARCHAR',
    'Other Provider Identifier Issuer_42': 'VARCHAR',
    'Other Provider Identifier_43': 'VARCHAR',
    'Other Provider Identifier Type Code_43': 'VARCHAR',
    'Other Provider Identifier State_43': 'VARCHAR',
    'Other Provider Identifier Issuer_43': 'VARCHAR',
    'Other Provider Identifier_44': 'VARCHAR',
    'Other Provider Identifier Type Code_44': 'VARCHAR',
    'Other Provider Identifier State_44': 'VARCHAR',
    'Other Provider Identifier Issuer_44': 'VARCHAR',
    'Other Provider Identifier_45': 'VARCHAR',
    'Other Provider Identifier Type Code_45': 'VARCHAR',
    'Other Provider Identifier State_45': 'VARCHAR',
    'Other Provider Identifier Issuer_45': 'VARCHAR',
    'Other Provider Identifier_46': 'VARCHAR',
    'Other Provider Identifier Type Code_46': 'VARCHAR',
    'Other Provider Identifier State_46': 'VARCHAR',
    'Other Provider Identifier Issuer_46': 'VARCHAR',
    'Other Provider Identifier_47': 'VARCHAR',
    'Other Provider Identifier Type Code_47': 'VARCHAR',
    'Other Provider Identifier State_47': 'VARCHAR',
    'Other Provider Identifier Issuer_47': 'VARCHAR',
    'Other Provider Identifier_48': 'VARCHAR',
    'Other Provider Identifier Type Code_48': 'VARCHAR',
    'Other Provider Identifier State_48': 'VARCHAR',
    'Other Provider Identifier Issuer_48': 'VARCHAR',
    'Other Provider Identifier_49': 'VARCHAR',
    'Other Provider Identifier Type Code_49': 'VARCHAR',
    'Other Provider Identifier State_49': 'VARCHAR',
    'Other Provider Identifier Issuer_49': 'VARCHAR',  
    'Other Provider Identifier_50': 'VARCHAR',
    'Other Provider Identifier Type Code_50': 'VARCHAR',
    'Other Provider Identifier State_50': 'VARCHAR',
    'Other Provider Identifier Issuer_50': 'VARCHAR',
    'Is Sole Proprietor': 'VARCHAR',
    'Is Organization Subpart': 'VARCHAR', 
    'Parent Organization LBN': 'VARCHAR',
    'Parent Organization TIN': 'VARCHAR',
    'Authorized Official Name Prefix Text': 'VARCHAR',
    'Authorized Official Name Suffix Text': 'VARCHAR',
    'Authorized Official Credential Text': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_1': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_2': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_3': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_4': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_5': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_6': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_7': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_8': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_9': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_10': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_11': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_12': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_13': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_14': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_15': 'VARCHAR',
    'Certification Date': 'VARCHAR'
  }
)
LIMIT 10;

Unnamed: 0,NPI,Entity Type Code,Replacement NPI,Employer Identification Number (EIN),Provider Organization Name (Legal Business Name),Provider Last Name (Legal Name),Provider First Name,Provider Middle Name,Provider Name Prefix Text,Provider Name Suffix Text,...,Healthcare Provider Taxonomy Group_7,Healthcare Provider Taxonomy Group_8,Healthcare Provider Taxonomy Group_9,Healthcare Provider Taxonomy Group_10,Healthcare Provider Taxonomy Group_11,Healthcare Provider Taxonomy Group_12,Healthcare Provider Taxonomy Group_13,Healthcare Provider Taxonomy Group_14,Healthcare Provider Taxonomy Group_15,Certification Date
0,1679576722,1.0,,,,WIEBE,DAVID,A,,,...,,,,,,,,,,NaT
1,1588667638,1.0,,,,PILCHER,WILLIAM,C,DR.,,...,,,,,,,,,,NaT
2,1497758544,2.0,,<UNAVAIL>,"CUMBERLAND COUNTY HOSPITAL SYSTEM, INC",,,,,,...,,,,,,,,,,NaT
3,1306849450,,,,,,,,,,...,,,,,,,,,,NaT
4,1215930367,1.0,,,,GRESSOT,LAURENT,,DR.,,...,,,,,,,,,,NaT
5,1023011178,2.0,,<UNAVAIL>,COLLABRIA CARE,,,,,,...,,,,,,,,,,2020-06-09
6,1932102084,1.0,,,,ADUSUMILLI,RAVI,K,,,...,,,,,,,,,,NaT
7,1841293990,1.0,,,,WORTSMAN,SUSAN,,,,...,,,,,,,,,,NaT
8,1750384806,1.0,,,,BISBEE,ROBERT,,DR.,,...,,,,,,,,,,NaT
9,1669475711,1.0,,,,SUNG,BIN,SHENG,,,...,,,,,,,,,,2020-03-01
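
The `columns={...}` argument in the query above lists every column by hand. As a rough sketch, assuming the first line of the CSV (or the accompanying `_fileheader.csv` file) holds the quoted column names, the same clause could be generated in Python. The helper name `build_columns_clause` and the `TYPE_OVERRIDES` mapping are hypothetical; the overrides mirror the three non-`VARCHAR` types used in the query above.

```python
import csv
import io

# Hypothetical mapping of the columns that are not VARCHAR in the query above.
TYPE_OVERRIDES = {
    "NPI": "BIGINT",
    "Replacement NPI": "BIGINT",
    "Provider Other Last Name Type Code": "INT",
}


def build_columns_clause(header_line, overrides=TYPE_OVERRIDES, default_type="VARCHAR"):
    """Return a columns={...} clause for read_csv built from one CSV header line."""
    # csv.reader handles the quoted, comma-separated header format.
    names = next(csv.reader(io.StringIO(header_line)))
    entries = []
    for name in names:
        sql_type = overrides.get(name, default_type)
        # Double any single quotes so the name stays a valid SQL string literal.
        escaped = name.replace("'", "''")
        entries.append(f"    '{escaped}': '{sql_type}'")
    return "columns={\n" + ",\n".join(entries) + "\n  }"


# A three-column sample header; the real file has many more columns.
sample_header = '"NPI","Entity Type Code","Replacement NPI"'
clause = build_columns_clause(sample_header)
print(clause)
```

To use it on the real data, one could read the first line of `npidata_pfile_20050523-20230709.csv`, pass it to the helper, and paste or interpolate the result into the `read_csv(...)` call. This also avoids maintaining two hand-typed copies of the column list when the same schema is reused in a later query.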


In [21]:
%%sql
COPY (SELECT *
FROM read_csv('npidata_pfile_20050523-20230709.csv', 
  header=True,
  delim=',',
  quote='"',
  nullstr='',
  dateformat='%m/%d/%Y',
  parallel=false,
  columns={
    'NPI': 'BIGINT',
    'Entity Type Code': 'VARCHAR',
    'Replacement NPI': 'BIGINT', 
    'Employer Identification Number (EIN)': 'VARCHAR',
    'Provider Organization Name (Legal Business Name)': 'VARCHAR',
    'Provider Last Name (Legal Name)': 'VARCHAR',
    'Provider First Name': 'VARCHAR',
    'Provider Middle Name': 'VARCHAR',
    'Provider Name Prefix Text': 'VARCHAR',
    'Provider Name Suffix Text': 'VARCHAR',
    'Provider Credential Text': 'VARCHAR',
    'Provider Other Organization Name': 'VARCHAR',
    'Provider Other Organization Name Type Code': 'VARCHAR',
    'Provider Other Last Name': 'VARCHAR',
    'Provider Other First Name': 'VARCHAR',
    'Provider Other Middle Name': 'VARCHAR',
    'Provider Other Name Prefix Text': 'VARCHAR',
    'Provider Other Name Suffix Text': 'VARCHAR',
    'Provider Other Credential Text': 'VARCHAR',
    'Provider Other Last Name Type Code': 'INT',
    'Provider First Line Business Mailing Address': 'VARCHAR',
    'Provider Second Line Business Mailing Address': 'VARCHAR',
    'Provider Business Mailing Address City Name': 'VARCHAR',
    'Provider Business Mailing Address State Name': 'VARCHAR',
    'Provider Business Mailing Address Postal Code': 'VARCHAR',
    'Provider Business Mailing Address Country Code (If outside U.S.)': 'VARCHAR',
    'Provider Business Mailing Address Telephone Number': 'VARCHAR',
    'Provider Business Mailing Address Fax Number': 'VARCHAR',
    'Provider First Line Business Practice Location Address': 'VARCHAR',
    'Provider Second Line Business Practice Location Address': 'VARCHAR',
    'Provider Business Practice Location Address City Name': 'VARCHAR',
    'Provider Business Practice Location Address State Name': 'VARCHAR',
    'Provider Business Practice Location Address Postal Code': 'VARCHAR',
    'Provider Business Practice Location Address Country Code (If outside U.S.)': 'VARCHAR',
    'Provider Business Practice Location Address Telephone Number': 'VARCHAR',
    'Provider Business Practice Location Address Fax Number': 'VARCHAR',
    'Provider Enumeration Date': 'VARCHAR',
    'Last Update Date': 'VARCHAR',
    'NPI Deactivation Reason Code': 'VARCHAR',
    'NPI Deactivation Date': 'VARCHAR',
    'NPI Reactivation Date': 'VARCHAR',
    'Provider Gender Code': 'VARCHAR', 
    'Authorized Official Last Name': 'VARCHAR',
    'Authorized Official First Name': 'VARCHAR',
    'Authorized Official Middle Name': 'VARCHAR',
    'Authorized Official Title or Position': 'VARCHAR',
    'Authorized Official Telephone Number': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_1': 'VARCHAR',
    'Provider License Number_1': 'VARCHAR',
    'Provider License Number State Code_1': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_1': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_2': 'VARCHAR',
    'Provider License Number_2': 'VARCHAR',
    'Provider License Number State Code_2': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_2': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_3': 'VARCHAR',
    'Provider License Number_3': 'VARCHAR',
    'Provider License Number State Code_3': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_3': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_4': 'VARCHAR',
    'Provider License Number_4': 'VARCHAR',
    'Provider License Number State Code_4': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_4': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_5': 'VARCHAR',
    'Provider License Number_5': 'VARCHAR',
    'Provider License Number State Code_5': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_5': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_6': 'VARCHAR',
    'Provider License Number_6': 'VARCHAR',
    'Provider License Number State Code_6': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_6': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_7': 'VARCHAR',
    'Provider License Number_7': 'VARCHAR',
    'Provider License Number State Code_7': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_7': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_8': 'VARCHAR',
    'Provider License Number_8': 'VARCHAR',
    'Provider License Number State Code_8': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_8': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_9': 'VARCHAR',
    'Provider License Number_9': 'VARCHAR',
    'Provider License Number State Code_9': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_9': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_10': 'VARCHAR',
    'Provider License Number_10': 'VARCHAR',
    'Provider License Number State Code_10': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_10': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_11': 'VARCHAR',
    'Provider License Number_11': 'VARCHAR',
    'Provider License Number State Code_11': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_11': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_12': 'VARCHAR',
    'Provider License Number_12': 'VARCHAR',
    'Provider License Number State Code_12': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_12': 'VARCHAR',
    'Healthcare Provider Taxonomy Code_13': 'VARCHAR',
    'Provider License Number_13': 'VARCHAR',
    'Provider License Number State Code_13': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_13': 'VARCHAR',   
    'Healthcare Provider Taxonomy Code_14': 'VARCHAR',
    'Provider License Number_14': 'VARCHAR',
    'Provider License Number State Code_14': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_14': 'VARCHAR',  
    'Healthcare Provider Taxonomy Code_15': 'VARCHAR',
    'Provider License Number_15': 'VARCHAR',
    'Provider License Number State Code_15': 'VARCHAR',
    'Healthcare Provider Primary Taxonomy Switch_15': 'VARCHAR',
    'Other Provider Identifier_1': 'VARCHAR',
    'Other Provider Identifier Type Code_1': 'VARCHAR',
    'Other Provider Identifier State_1': 'VARCHAR',
    'Other Provider Identifier Issuer_1': 'VARCHAR',
    'Other Provider Identifier_2': 'VARCHAR',
    'Other Provider Identifier Type Code_2': 'VARCHAR',
    'Other Provider Identifier State_2': 'VARCHAR',
    'Other Provider Identifier Issuer_2': 'VARCHAR',
    'Other Provider Identifier_3': 'VARCHAR',
    'Other Provider Identifier Type Code_3': 'VARCHAR',
    'Other Provider Identifier State_3': 'VARCHAR',
    'Other Provider Identifier Issuer_3': 'VARCHAR',
    'Other Provider Identifier_4': 'VARCHAR',
    'Other Provider Identifier Type Code_4': 'VARCHAR',
    'Other Provider Identifier State_4': 'VARCHAR',
    'Other Provider Identifier Issuer_4': 'VARCHAR',
    'Other Provider Identifier_5': 'VARCHAR',
    'Other Provider Identifier Type Code_5': 'VARCHAR',
    'Other Provider Identifier State_5': 'VARCHAR',
    'Other Provider Identifier Issuer_5': 'VARCHAR',
    'Other Provider Identifier_6': 'VARCHAR',
    'Other Provider Identifier Type Code_6': 'VARCHAR',
    'Other Provider Identifier State_6': 'VARCHAR',
    'Other Provider Identifier Issuer_6': 'VARCHAR',
    'Other Provider Identifier_7': 'VARCHAR',
    'Other Provider Identifier Type Code_7': 'VARCHAR',
    'Other Provider Identifier State_7': 'VARCHAR',
    'Other Provider Identifier Issuer_7': 'VARCHAR',
    'Other Provider Identifier_8': 'VARCHAR',
    'Other Provider Identifier Type Code_8': 'VARCHAR',
    'Other Provider Identifier State_8': 'VARCHAR',
    'Other Provider Identifier Issuer_8': 'VARCHAR',
    'Other Provider Identifier_9': 'VARCHAR',
    'Other Provider Identifier Type Code_9': 'VARCHAR',
    'Other Provider Identifier State_9': 'VARCHAR',
    'Other Provider Identifier Issuer_9': 'VARCHAR',
    'Other Provider Identifier_10': 'VARCHAR',
    'Other Provider Identifier Type Code_10': 'VARCHAR',
    'Other Provider Identifier State_10': 'VARCHAR',
    'Other Provider Identifier Issuer_10': 'VARCHAR',
    'Other Provider Identifier_11': 'VARCHAR',
    'Other Provider Identifier Type Code_11': 'VARCHAR',
    'Other Provider Identifier State_11': 'VARCHAR',
    'Other Provider Identifier Issuer_11': 'VARCHAR',
    'Other Provider Identifier_12': 'VARCHAR',
    'Other Provider Identifier Type Code_12': 'VARCHAR',
    'Other Provider Identifier State_12': 'VARCHAR',
    'Other Provider Identifier Issuer_12': 'VARCHAR',
    'Other Provider Identifier_13': 'VARCHAR',
    'Other Provider Identifier Type Code_13': 'VARCHAR',
    'Other Provider Identifier State_13': 'VARCHAR',
    'Other Provider Identifier Issuer_13': 'VARCHAR',
    'Other Provider Identifier_14': 'VARCHAR',
    'Other Provider Identifier Type Code_14': 'VARCHAR', 
    'Other Provider Identifier State_14': 'VARCHAR',
    'Other Provider Identifier Issuer_14': 'VARCHAR',
    'Other Provider Identifier_15': 'VARCHAR',
    'Other Provider Identifier Type Code_15': 'VARCHAR',
    'Other Provider Identifier State_15': 'VARCHAR',
    'Other Provider Identifier Issuer_15': 'VARCHAR',
    'Other Provider Identifier_16': 'VARCHAR',
    'Other Provider Identifier Type Code_16': 'VARCHAR',
    'Other Provider Identifier State_16': 'VARCHAR',
    'Other Provider Identifier Issuer_16': 'VARCHAR',
    'Other Provider Identifier_17': 'VARCHAR',
    'Other Provider Identifier Type Code_17': 'VARCHAR',
    'Other Provider Identifier State_17': 'VARCHAR',
    'Other Provider Identifier Issuer_17': 'VARCHAR',
    'Other Provider Identifier_18': 'VARCHAR',
    'Other Provider Identifier Type Code_18': 'VARCHAR',
    'Other Provider Identifier State_18': 'VARCHAR',
    'Other Provider Identifier Issuer_18': 'VARCHAR',
    'Other Provider Identifier_19': 'VARCHAR',
    'Other Provider Identifier Type Code_19': 'VARCHAR',
    'Other Provider Identifier State_19': 'VARCHAR',
    'Other Provider Identifier Issuer_19': 'VARCHAR',
    'Other Provider Identifier_20': 'VARCHAR',
    'Other Provider Identifier Type Code_20': 'VARCHAR',
    'Other Provider Identifier State_20': 'VARCHAR',
    'Other Provider Identifier Issuer_20': 'VARCHAR',
    'Other Provider Identifier_21': 'VARCHAR',
    'Other Provider Identifier Type Code_21': 'VARCHAR',
    'Other Provider Identifier State_21': 'VARCHAR',
    'Other Provider Identifier Issuer_21': 'VARCHAR',
    'Other Provider Identifier_22': 'VARCHAR',
    'Other Provider Identifier Type Code_22': 'VARCHAR',
    'Other Provider Identifier State_22': 'VARCHAR',
    'Other Provider Identifier Issuer_22': 'VARCHAR',
    'Other Provider Identifier_23': 'VARCHAR',
    'Other Provider Identifier Type Code_23': 'VARCHAR',
    'Other Provider Identifier State_23': 'VARCHAR',
    'Other Provider Identifier Issuer_23': 'VARCHAR',
    'Other Provider Identifier_24': 'VARCHAR',  
    'Other Provider Identifier Type Code_24': 'VARCHAR',
    'Other Provider Identifier State_24': 'VARCHAR',
    'Other Provider Identifier Issuer_24': 'VARCHAR',
    'Other Provider Identifier_25': 'VARCHAR',
    'Other Provider Identifier Type Code_25': 'VARCHAR',
    'Other Provider Identifier State_25': 'VARCHAR',
    'Other Provider Identifier Issuer_25': 'VARCHAR',
    'Other Provider Identifier_26': 'VARCHAR',
    'Other Provider Identifier Type Code_26': 'VARCHAR',
    'Other Provider Identifier State_26': 'VARCHAR',
    'Other Provider Identifier Issuer_26': 'VARCHAR',
    'Other Provider Identifier_27': 'VARCHAR',
    'Other Provider Identifier Type Code_27': 'VARCHAR',
    'Other Provider Identifier State_27': 'VARCHAR',
    'Other Provider Identifier Issuer_27': 'VARCHAR',
    'Other Provider Identifier_28': 'VARCHAR',
    'Other Provider Identifier Type Code_28': 'VARCHAR',
    'Other Provider Identifier State_28': 'VARCHAR',
    'Other Provider Identifier Issuer_28': 'VARCHAR',
    'Other Provider Identifier_29': 'VARCHAR',
    'Other Provider Identifier Type Code_29': 'VARCHAR',
    'Other Provider Identifier State_29': 'VARCHAR',
    'Other Provider Identifier Issuer_29': 'VARCHAR',
    'Other Provider Identifier_30': 'VARCHAR',
    'Other Provider Identifier Type Code_30': 'VARCHAR',
    'Other Provider Identifier State_30': 'VARCHAR',
    'Other Provider Identifier Issuer_30': 'VARCHAR',
    'Other Provider Identifier_31': 'VARCHAR',
    'Other Provider Identifier Type Code_31': 'VARCHAR',
    'Other Provider Identifier State_31': 'VARCHAR',
    'Other Provider Identifier Issuer_31': 'VARCHAR',
    'Other Provider Identifier_32': 'VARCHAR',
    'Other Provider Identifier Type Code_32': 'VARCHAR',
    'Other Provider Identifier State_32': 'VARCHAR',
    'Other Provider Identifier Issuer_32': 'VARCHAR',
    'Other Provider Identifier_33': 'VARCHAR',
    'Other Provider Identifier Type Code_33': 'VARCHAR',
    'Other Provider Identifier State_33': 'VARCHAR',
    'Other Provider Identifier Issuer_33': 'VARCHAR',
    'Other Provider Identifier_34': 'VARCHAR',
    'Other Provider Identifier Type Code_34': 'VARCHAR',
    'Other Provider Identifier State_34': 'VARCHAR',
    'Other Provider Identifier Issuer_34': 'VARCHAR',
    'Other Provider Identifier_35': 'VARCHAR',
    'Other Provider Identifier Type Code_35': 'VARCHAR',
    'Other Provider Identifier State_35': 'VARCHAR',
    'Other Provider Identifier Issuer_35': 'VARCHAR',
    'Other Provider Identifier_36': 'VARCHAR',
    'Other Provider Identifier Type Code_36': 'VARCHAR',
    'Other Provider Identifier State_36': 'VARCHAR',
    'Other Provider Identifier Issuer_36': 'VARCHAR',
    'Other Provider Identifier_37': 'VARCHAR',
    'Other Provider Identifier Type Code_37': 'VARCHAR',
    'Other Provider Identifier State_37': 'VARCHAR',
    'Other Provider Identifier Issuer_37': 'VARCHAR',
    'Other Provider Identifier_38': 'VARCHAR',
    'Other Provider Identifier Type Code_38': 'VARCHAR',
    'Other Provider Identifier State_38': 'VARCHAR',
    'Other Provider Identifier Issuer_38': 'VARCHAR',
    'Other Provider Identifier_39': 'VARCHAR',
    'Other Provider Identifier Type Code_39': 'VARCHAR',
    'Other Provider Identifier State_39': 'VARCHAR',
    'Other Provider Identifier Issuer_39': 'VARCHAR',
    'Other Provider Identifier_40': 'VARCHAR',
    'Other Provider Identifier Type Code_40': 'VARCHAR',
    'Other Provider Identifier State_40': 'VARCHAR',
    'Other Provider Identifier Issuer_40': 'VARCHAR',
    'Other Provider Identifier_41': 'VARCHAR',
    'Other Provider Identifier Type Code_41': 'VARCHAR',
    'Other Provider Identifier State_41': 'VARCHAR',
    'Other Provider Identifier Issuer_41': 'VARCHAR',
    'Other Provider Identifier_42': 'VARCHAR',
    'Other Provider Identifier Type Code_42': 'VARCHAR',
    'Other Provider Identifier State_42': 'VARCHAR',
    'Other Provider Identifier Issuer_42': 'VARCHAR',
    'Other Provider Identifier_43': 'VARCHAR',
    'Other Provider Identifier Type Code_43': 'VARCHAR',
    'Other Provider Identifier State_43': 'VARCHAR',
    'Other Provider Identifier Issuer_43': 'VARCHAR',
    'Other Provider Identifier_44': 'VARCHAR',
    'Other Provider Identifier Type Code_44': 'VARCHAR',
    'Other Provider Identifier State_44': 'VARCHAR',
    'Other Provider Identifier Issuer_44': 'VARCHAR',
    'Other Provider Identifier_45': 'VARCHAR',
    'Other Provider Identifier Type Code_45': 'VARCHAR',
    'Other Provider Identifier State_45': 'VARCHAR',
    'Other Provider Identifier Issuer_45': 'VARCHAR',
    'Other Provider Identifier_46': 'VARCHAR',
    'Other Provider Identifier Type Code_46': 'VARCHAR',
    'Other Provider Identifier State_46': 'VARCHAR',
    'Other Provider Identifier Issuer_46': 'VARCHAR',
    'Other Provider Identifier_47': 'VARCHAR',
    'Other Provider Identifier Type Code_47': 'VARCHAR',
    'Other Provider Identifier State_47': 'VARCHAR',
    'Other Provider Identifier Issuer_47': 'VARCHAR',
    'Other Provider Identifier_48': 'VARCHAR',
    'Other Provider Identifier Type Code_48': 'VARCHAR',
    'Other Provider Identifier State_48': 'VARCHAR',
    'Other Provider Identifier Issuer_48': 'VARCHAR',
    'Other Provider Identifier_49': 'VARCHAR',
    'Other Provider Identifier Type Code_49': 'VARCHAR',
    'Other Provider Identifier State_49': 'VARCHAR',
    'Other Provider Identifier Issuer_49': 'VARCHAR',
    'Other Provider Identifier_50': 'VARCHAR',
    'Other Provider Identifier Type Code_50': 'VARCHAR',
    'Other Provider Identifier State_50': 'VARCHAR',
    'Other Provider Identifier Issuer_50': 'VARCHAR',
    'Is Sole Proprietor': 'VARCHAR',
    'Is Organization Subpart': 'VARCHAR',
    'Parent Organization LBN': 'VARCHAR',
    'Parent Organization TIN': 'VARCHAR',
    'Authorized Official Name Prefix Text': 'VARCHAR',
    'Authorized Official Name Suffix Text': 'VARCHAR',
    'Authorized Official Credential Text': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_1': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_2': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_3': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_4': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_5': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_6': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_7': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_8': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_9': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_10': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_11': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_12': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_13': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_14': 'VARCHAR',
    'Healthcare Provider Taxonomy Group_15': 'VARCHAR',
    'Certification Date': 'VARCHAR'
  }
)
-- LIMIT 1000 -- uncomment to test
) TO './national_plan_and_provider_identifiers.parquet' (COMPRESSION ZSTD);

Unnamed: 0,Count
0,7890766
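
The type mapping above repeats the same four `Other Provider Identifier` columns for each numbered slot, followed by fifteen `Healthcare Provider Taxonomy Group` columns. As a minimal sketch, the repetitive part could be generated instead of typed by hand. The helper names `repeated_columns` and `to_duckdb_struct` are hypothetical, and the sketch assumes the numbered slots run from 1 to 50 (and 1 to 15) with every column typed as `VARCHAR`, as in the listing.

```python
# Templates in the same order the columns appear in the NPPES file.
OTHER_IDENTIFIER_TEMPLATES = [
    "Other Provider Identifier_{n}",
    "Other Provider Identifier Type Code_{n}",
    "Other Provider Identifier State_{n}",
    "Other Provider Identifier Issuer_{n}",
]


def repeated_columns(templates, count, dtype="VARCHAR"):
    """Build {column name: type} for slots 1..count, keeping file order."""
    return {
        template.format(n=n): dtype
        for n in range(1, count + 1)
        for template in templates
    }


def to_duckdb_struct(columns):
    """Render a mapping as the text of a {'name': 'TYPE', ...} literal."""
    body = ",\n".join(
        f"    '{name}': '{dtype}'" for name, dtype in columns.items()
    )
    return "{\n" + body + "\n}"


# Assumption: 50 identifier slots and 15 taxonomy groups, as in the listing.
other_identifier_columns = repeated_columns(OTHER_IDENTIFIER_TEMPLATES, 50)
taxonomy_group_columns = repeated_columns(
    ["Healthcare Provider Taxonomy Group_{n}"], 15
)

print(len(other_identifier_columns), len(taxonomy_group_columns))
```

The rendered text from `to_duckdb_struct` can be pasted or interpolated into the SQL cell; the remaining one-off columns would still need to be listed explicitly.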


# Load the data for visualization

In [22]:
import vegafusion as vf
import polars as pl
import altair as alt
alt.data_transformers.disable_max_rows()
alt.renderers.enable('html')

# Configure DuckDB connection
vf.runtime.set_connection("duckdb")

# Enable Mime Renderer
vf.enable(row_limit=100000000)

vegafusion.enable(mimetype='html', row_limit=100000000, embed_options=None)

In [23]:
provider_identifiers = pl.read_parquet('national_plan_and_provider_identifiers.parquet')

In [24]:
provider_identifiers

NPI,Entity Type Code,Replacement NPI,Employer Identification Number (EIN),Provider Organization Name (Legal Business Name),Provider Last Name (Legal Name),Provider First Name,Provider Middle Name,Provider Name Prefix Text,Provider Name Suffix Text,Provider Credential Text,Provider Other Organization Name,Provider Other Organization Name Type Code,Provider Other Last Name,Provider Other First Name,Provider Other Middle Name,Provider Other Name Prefix Text,Provider Other Name Suffix Text,Provider Other Credential Text,Provider Other Last Name Type Code,Provider First Line Business Mailing Address,Provider Second Line Business Mailing Address,Provider Business Mailing Address City Name,Provider Business Mailing Address State Name,Provider Business Mailing Address Postal Code,Provider Business Mailing Address Country Code (If outside U.S.),Provider Business Mailing Address Telephone Number,Provider Business Mailing Address Fax Number,Provider First Line Business Practice Location Address,Provider Second Line Business Practice Location Address,Provider Business Practice Location Address City Name,Provider Business Practice Location Address State Name,Provider Business Practice Location Address Postal Code,Provider Business Practice Location Address Country Code (If outside U.S.),Provider Business Practice Location Address Telephone Number,Provider Business Practice Location Address Fax Number,Provider Enumeration Date,…,Other Provider Identifier State_47,Other Provider Identifier Issuer_47,Other Provider Identifier_48,Other Provider Identifier Type Code_48,Other Provider Identifier State_48,Other Provider Identifier Issuer_48,Other Provider Identifier_49,Other Provider Identifier Type Code_49,Other Provider Identifier State_49,Other Provider Identifier Issuer_49,Other Provider Identifier_50,Other Provider Identifier Type Code_50,Other Provider Identifier State_50,Other Provider Identifier Issuer_50,Is Sole Proprietor,Is Organization Subpart,Parent Organization LBN,Parent Organization TIN,Authorized Official Name Prefix Text,Authorized Official Name Suffix Text,Authorized Official Credential Text,Healthcare Provider Taxonomy Group_1,Healthcare Provider Taxonomy Group_2,Healthcare Provider Taxonomy Group_3,Healthcare Provider Taxonomy Group_4,Healthcare Provider Taxonomy Group_5,Healthcare Provider Taxonomy Group_6,Healthcare Provider Taxonomy Group_7,Healthcare Provider Taxonomy Group_8,Healthcare Provider Taxonomy Group_9,Healthcare Provider Taxonomy Group_10,Healthcare Provider Taxonomy Group_11,Healthcare Provider Taxonomy Group_12,Healthcare Provider Taxonomy Group_13,Healthcare Provider Taxonomy Group_14,Healthcare Provider Taxonomy Group_15,Certification Date
i64,str,i64,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,i32,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,…,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str
1679576722,"""1""",,,,"""WIEBE""","""DAVID""","""A""",,,"""M.D.""",,,,,,,,,,"""PO BOX 2168""",,"""KEARNEY""","""NE""","""688482168""","""US""","""3088652512""","""3088652506""","""3500 CENTRAL A…",,"""KEARNEY""","""NE""","""688472944""","""US""","""3088652512""","""3088652506""","""05/23/2005""",…,,,,,,,,,,,,,,,"""X""",,,,,,,,,,,,,,,,,,,,,,
1588667638,"""1""",,,,"""PILCHER""","""WILLIAM""","""C""","""DR.""",,"""MD""",,,,,,,,,,"""1824 KING STRE…","""SUITE 300""","""JACKSONVILLE""","""FL""","""322044736""","""US""","""9043881820""","""9043881827""","""1824 KING STRE…","""SUITE 300""","""JACKSONVILLE""","""FL""","""322044736""","""US""","""9043881820""","""9043881827""","""05/23/2005""",…,,,,,,,,,,,,,,,"""N""",,,,,,,,,,,,,,,,,,,,,,
1497758544,"""2""",,"""<UNAVAIL>""","""CUMBERLAND COU…",,,,,,,"""CAPE FEAR VALL…","""3""",,,,,,,,"""3418 VILLAGE D…",,"""FAYETTEVILLE""","""NC""","""283044552""","""US""","""9106096740""",,"""3418 VILLAGE D…",,"""FAYETTEVILLE""","""NC""","""283044552""","""US""","""9106096740""",,"""05/23/2005""",…,,,,,,,,,,,,,,,,"""N""",,,"""MR.""",,,,,,,,,,,,,,,,,,
1306849450,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,…,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1215930367,"""1""",,,,"""GRESSOT""","""LAURENT""",,"""DR.""",,"""M.D.""",,,,,,,,,,"""17323 RED OAK …",,"""HOUSTON""","""TX""","""770901243""","""US""","""2814405006""","""2814406149""","""17323 RED OAK …",,"""HOUSTON""","""TX""","""770901243""","""US""","""2814405006""","""2814406149""","""05/23/2005""",…,,,,,,,,,,,,,,,"""N""",,,,,,,,,,,,,,,,,,,,,,
1023011178,"""2""",,"""<UNAVAIL>""","""COLLABRIA CARE…",,,,,,,"""NAPA VALLEY HO…","""4""",,,,,,,,"""414 S JEFFERSO…",,"""NAPA""","""CA""","""945594515""","""US""","""7072589080""","""7072582476""","""414 S JEFFERSO…",,"""NAPA""","""CA""","""945594515""","""US""","""7072589080""","""7072582476""","""05/23/2005""",…,,,,,,,,,,,,,,,,"""N""",,,,"""JR.""",,,,,,,,,,,,,,,,,"""06/09/2020"""
1932102084,"""1""",,,,"""ADUSUMILLI""","""RAVI""","""K""",,,"""MD""",,,,,,,,,,"""2940 N MCCORD …",,"""TOLEDO""","""OH""","""436151753""","""US""","""4198423000""","""4198423048""","""2940 N MCCORD …",,"""TOLEDO""","""OH""","""436151753""","""US""","""4198423000""","""4198423048""","""05/23/2005""",…,,,,,,,,,,,,,,,"""N""",,,,,,,,,,,,,,,,,,,,,,
1841293990,"""1""",,,,"""WORTSMAN""","""SUSAN""",,,,"""MA-CCC""",,,,,,,,,,"""68 ROCKLEDGE R…","""APT 1C""","""HARTSDALE""","""NY""","""105303455""","""US""","""2124814464""",,"""425 E 25TH ST""",,"""NEW YORK""","""NY""","""100102547""","""US""","""2124814464""",,"""05/23/2005""",…,,,,,,,,,,,,,,,"""N""",,,,,,,,,,,,,,,,,,,,,,
1750384806,"""1""",,,,"""BISBEE""","""ROBERT""",,"""DR.""",,"""MD""",,,,,,,,,,"""5219 CITY BANK…",,"""LUBBOCK""","""TX""","""794073537""","""US""","""8067810360""","""8067820097""","""808 JOLIET AVE…",,"""LUBBOCK""","""TX""","""794151148""","""US""","""8067610540""","""8067610451""","""05/23/2005""",…,,,,,,,,,,,,,,,"""N""",,,,,,,,,,,,,,,,,,,,,,
1669475711,"""1""",,,,"""SUNG""","""BIN""","""SHENG""",,,"""M. D.""",,,,,,,,,,"""600 JEFFERSON …",,"""LAFAYETTE""","""LA""","""705016987""","""US""","""2813460018""","""2813460913""","""7629 TIKI DR""",,"""FULSHEAR""","""TX""","""774411548""","""US""","""2813460018""","""2813460913""","""05/23/2005""",…,,,,,,,,,,,,,,,"""Y""",,,,,,,,,,,,,,,,,,,,,,"""03/01/2020"""
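
The preview shows date columns such as `Provider Enumeration Date` and `Certification Date` as strings like `"05/23/2005"`, since they were loaded as `VARCHAR`. A minimal sketch of converting one such value into a real date follows. The function name `parse_nppes_date` is hypothetical, and the sketch assumes every non-empty value uses the `MM/DD/YYYY` layout seen in the preview, which has not been checked against the full file.

```python
from datetime import date, datetime
from typing import Optional


def parse_nppes_date(value: Optional[str]) -> Optional[date]:
    """Parse an MM/DD/YYYY string; return None for missing or blank values."""
    if value is None or value.strip() == "":
        return None
    # strptime raises ValueError if the text does not match the assumed layout.
    return datetime.strptime(value.strip(), "%m/%d/%Y").date()


print(parse_nppes_date("05/23/2005"))  # 2005-05-23
print(parse_nppes_date(None))          # None
```

Letting malformed values raise `ValueError` makes it obvious if the layout assumption does not hold for some rows.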


In [25]:
provider_identifiers.schema

{'NPI': Int64,
 'Entity Type Code': Utf8,
 'Replacement NPI': Int64,
 'Employer Identification Number (EIN)': Utf8,
 'Provider Organization Name (Legal Business Name)': Utf8,
 'Provider Last Name (Legal Name)': Utf8,
 'Provider First Name': Utf8,
 'Provider Middle Name': Utf8,
 'Provider Name Prefix Text': Utf8,
 'Provider Name Suffix Text': Utf8,
 'Provider Credential Text': Utf8,
 'Provider Other Organization Name': Utf8,
 'Provider Other Organization Name Type Code': Utf8,
 'Provider Other Last Name': Utf8,
 'Provider Other First Name': Utf8,
 'Provider Other Middle Name': Utf8,
 'Provider Other Name Prefix Text': Utf8,
 'Provider Other Name Suffix Text': Utf8,
 'Provider Other Credential Text': Utf8,
 'Provider Other Last Name Type Code': Int32,
 'Provider First Line Business Mailing Address': Utf8,
 'Provider Second Line Business Mailing Address': Utf8,
 'Provider Business Mailing Address City Name': Utf8,
 'Provider Business Mailing Address State Name': Utf8,
 'Provider Business 

# Visualizations

In [26]:
# Bar chart: record count by provider gender code
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Provider Gender Code:N',
    y='count()',
)

In [27]:
# Bar chart: record count by mailing-address state
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Provider Business Mailing Address State Name:N',
    y='count()',
)

In [28]:
# Bar chart: record count by practice-location state
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Provider Business Practice Location Address State Name:N',
    y='count()',
)

In [29]:
# Bar chart: record count by first taxonomy code
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Healthcare Provider Taxonomy Code_1:N',
    y='count()',
)

In [30]:
# Bar chart: record count by state of the first license number
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Provider License Number State Code_1:N',
    y='count()',
)

In [32]:
# Bar chart: record count by type code of the first other identifier
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Other Provider Identifier Type Code_1:N',
    y='count()',
)

In [33]:
# Bar chart: record count by state of the first other identifier
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Other Provider Identifier State_1:N',
    y='count()',
)

In [35]:
# Bar chart: record count by entity type code
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Entity Type Code:N',
    y='count()',
)

In [38]:
# Bar chart: record count by parent organization TIN
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Parent Organization TIN:N',
    y='count()',
)

In [39]:
# Bar chart: record count by first taxonomy group
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Healthcare Provider Taxonomy Group_1:N',
    y='count()',
)

In [40]:
# Bar chart: record count by employer identification number (EIN)
alt.Chart(provider_identifiers).mark_bar().encode(
    x='Employer Identification Number (EIN):N',
    y='count()',
)
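
The chart cells above differ only in the column name, so they could be collapsed into one helper. The following is a minimal sketch, not the notebook's own code: `nominal` and `count_bar_chart` are hypothetical names, and the Altair import is deferred so that the string helper can be used without Altair installed.

```python
def nominal(column: str) -> str:
    """Return the Altair shorthand that treats `column` as a nominal field."""
    return f"{column}:N"


def count_bar_chart(frame, column: str):
    """Bar chart of record counts per distinct value of `column`."""
    import altair as alt  # deferred import; only needed when a chart is built

    return alt.Chart(frame).mark_bar().encode(
        x=nominal(column),
        y="count()",
    )


print(nominal("Entity Type Code"))  # Entity Type Code:N
```

In the notebook this would be called as, for example, `count_bar_chart(provider_identifiers, 'Entity Type Code')`, producing the same encoding as the individual cells.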

# System information

In [41]:
import duckdb
duckdb.__version__

'0.8.1'