### Import Data Sets from the following sources and transform to one record per zip code:

* 2016 Census population estimates, demographics, household income, and insured population by zip code from factfinder.census.gov
* 2016 OSHPD hospital and emergency room utilization data by zip code from oshpd.ca.gov
* CA Area Deprivation Index by zip code from University of Wisconsin School of Medicine and Public Health
* ESRI Tapestry consumer segmentation data by zip code from ESRI / ArcGIS
* 2016 Opioid and schedule drug utilization by zip code from CA Controlled Substance Utilization Review and Evaluation System (https://oag.ca.gov/cures)

### Area Deprivation Index (https://www.neighborhoodatlas.medicine.wisc.edu/)
The Area Deprivation Index (ADI) is used here as an indicator of socioeconomic deprivation, which we will demonstrate is a key driver of high emergency and inpatient healthcare utilization. Originally created by the Health Resources and Services Administration nearly three decades ago, the ADI is composed of 17 education, employment, housing-quality, and poverty measures. These measures were originally drawn from long-form Census data and have since been updated by Dr. Amy Kind’s research team at the University of Wisconsin School of Medicine and Public Health to incorporate more recent American Community Survey (ACS) data.

ADI data is published in a large flat file at the census block group level. It was aggregated to the zip code level in a separate Jupyter notebook (ED_Visits_ADI_RAND.ipynb), exported to CSV, and merged with the rest of the data in this notebook.
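
The aggregation notebook is not reproduced here. The fractional rank values in the merged data suggest the ranks were averaged within each zip code, so the following is only a minimal sketch of that idea. The toy rows, the presence of a `Zip` column on each record, and the choice of a simple unweighted mean are all assumptions.

```python
import pandas as pd

# Toy block-group records. The ZIP assignment and the values are made up
# for illustration; the real ADI file layout may differ.
adi_block_groups = pd.DataFrame({
    "Zip": [90001, 90001, 90002, 90002],
    "ADI_STATERNK": [8, 9, 7, 10],
    "ADI_NATRANK": [44, 48, 50, 52],
})

# Average the ranks within each ZIP code to get one record per ZIP.
adi_by_zip = (
    adi_block_groups
    .groupby("Zip", as_index=False)[["ADI_STATERNK", "ADI_NATRANK"]]
    .mean()
)
print(adi_by_zip)
```

A population-weighted mean would arguably be more faithful for zip codes with unevenly sized block groups, but it needs population counts that are not shown in this notebook.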

### Merge data sets into one dataframe and calculate ratios:

* Hospital and ED utilization per 1,000 population
* Schedule 2-4 controlled substance utilization per 1,000 population
* Opioid Rx utilization per 1,000 population
* Mortality rates per 1,000 population
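
Every ratio in this list follows the same pattern: event count divided by total population, times 1,000. As a small worked example, here is the ED rate for zip 90001 using the counts that appear in this notebook's outputs. The helper name `rate_per_1000` is hypothetical.

```python
def rate_per_1000(events, population):
    """Return the number of events per 1,000 residents."""
    return events / population * 1000

# ZIP 90001 in the 2016 data: 24,554 ED-only visits and 57,942 residents.
ed_rate = rate_per_1000(24554, 57942)
print(round(ed_rate, 1))
```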

## Import Python Modules and all Data Sets

In [2]:
import pandas as pd
import numpy as np

In [34]:
# Add lat and lon coordinates for CA zip codes to create gmap heat map later
us_zip_code_coord_df = pd.read_csv("Resources/US Zip Codes from 2013 Government Data.csv")
ca_zip_code_coord_df = us_zip_code_coord_df[(us_zip_code_coord_df['ZIP']>=90001) & (us_zip_code_coord_df['ZIP']<=97635)]
ca_zip_code_coord_df = ca_zip_code_coord_df.rename(columns={'ZIP':'Zip','LAT':'Lat','LNG':'Lng'})

ca_zip_code_coord_df.count()

Zip    2184
Lat    2184
Lng    2184
dtype: int64

In [35]:
# Read 2016 census population estimates with some age/sex detail
ca_census_demo_df = pd.read_csv("Resources/ACS_16_5YR_DP05_with_ann.csv",skiprows=1)
ca_census_demo_df = ca_census_demo_df.rename(columns={'Id2':'Zip','Estimate; SEX AND AGE - Total population':'Total Pop',
                                                      'Estimate; SEX AND AGE - Total population - Male':'Male Pop',
                                                      'Percent; SEX AND AGE - Total population - Male':'Pct Male',
                                                      'Estimate; SEX AND AGE - Total population - Female':'Female Pop',
                                                      'Percent; SEX AND AGE - Total population - Female':'Pct Female'})
ca_census_demo_df = ca_census_demo_df[['Zip','Total Pop','Male Pop','Pct Male','Female Pop','Pct Female']]

# drop non-CA zips
ca_census_demo_df = ca_census_demo_df.drop([0,1,2,3,4,5]) # NV zips
ca_census_demo_df['Zip'].max()

97635
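
Dropping the Nevada rows by hard-coded positions works only while the source file keeps the same row order. A sketch of a value-based alternative, reusing the zip bounds applied to the coordinate file above, could look like this. The toy table and the constant names are assumptions.

```python
import pandas as pd

# Toy stand-in for a census table whose first rows are non-CA ZIP codes.
# The first, second, and last population values are made up.
census = pd.DataFrame({
    "Zip": [89010, 89019, 90001, 90002, 97635],
    "Total Pop": [500, 2500, 57942, 51826, 200],
})

# Same bounds the notebook uses for the coordinate file.
CA_ZIP_MIN, CA_ZIP_MAX = 90001, 97635

# between() is inclusive on both ends by default.
ca_only = census[census["Zip"].between(CA_ZIP_MIN, CA_ZIP_MAX)].reset_index(drop=True)
print(ca_only["Zip"].tolist())
```

Note that `reset_index(drop=True)` renumbers the rows from 0, whereas the cells in this notebook keep the original row labels.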

In [36]:
# Census data for employment status in past 12 months
ca_census_emp_df = pd.read_csv("Resources/ACS_16_5YR_S2303_with_ann.csv",skiprows=1)
ca_census_emp_df = ca_census_emp_df.rename(columns={'Id2':'Zip','Total; Estimate; Population 16 to 64 years':'Total Employable Pop',
                                                    'Total; Estimate; Workers 16 to 64 years who worked full-time, year-round':'Full Time Employed Pop',
                                                    'Percent Total; Estimate; Workers 16 to 64 years who worked full-time, year-round':'Pct Full Time Employed'})
ca_census_emp_df = ca_census_emp_df[['Zip','Total Employable Pop','Full Time Employed Pop','Pct Full Time Employed']]

ca_census_emp_df = ca_census_emp_df.replace('-', '0')
ca_census_emp_df['Pct Full Time Employed'] = ca_census_emp_df['Pct Full Time Employed'].apply(pd.to_numeric)

# drop non-CA zips
ca_census_emp_df = ca_census_emp_df.drop([0,1,2,3,4,5]) # NV zips
ca_census_emp_df.dtypes

Zip                         int64
Total Employable Pop        int64
Full Time Employed Pop      int64
Pct Full Time Employed    float64
dtype: object

In [37]:
# Census data for household counts and median household income
ca_census_income_df = pd.read_csv("Resources/ACS_16_5YR_S1903_with_ann.csv",skiprows=1)
ca_census_income_df = ca_census_income_df.rename(columns={'Id2':'Zip','Total; Estimate; Households':'Total Households',
                                                      'Median income (dollars); Estimate; Households':'Household Median Income'})
ca_census_income_df = ca_census_income_df[['Zip','Total Households','Household Median Income']]

# convert Household Income to number
ca_census_income_df = ca_census_income_df.replace('-', '0')
ca_census_income_df = ca_census_income_df.replace('2,500-', '2500')
ca_census_income_df = ca_census_income_df.replace('250,000+', '250000')
ca_census_income_df['Household Median Income'] = ca_census_income_df['Household Median Income'].apply(pd.to_numeric)

# drop non-CA zips
ca_census_income_df = ca_census_income_df.drop([0,1,2,3,4,5]) # NV zips
ca_census_income_df.dtypes

Zip                        int64
Total Households           int64
Household Median Income    int64
dtype: object
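
The income cleanup above replaces each special marker one at a time. A more general sketch, shown on a toy series, strips every non-numeric character before converting. It differs from the cell above in one respect that should be weighed before adopting it: a bare `-` becomes `NaN` rather than `0`.

```python
import pandas as pd

# Toy values mirroring the markers handled above.
raw_income = pd.Series(["34323", "2,500-", "250,000+", "-"])

# Remove everything except digits and the decimal point, then convert.
# An empty string (from a bare "-") is coerced to NaN.
cleaned = raw_income.str.replace(r"[^0-9.]", "", regex=True)
income = pd.to_numeric(cleaned, errors="coerce")
print(income.tolist())
```

Keeping missing medians as `NaN` avoids treating an unreported income as a true zero in later statistics, at the cost of a float column instead of an integer one.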

In [38]:
# Census data for health insurance coverage
ca_census_insured_df = pd.read_csv("Resources/ACS_16_5YR_S2701_with_ann.csv",skiprows=1)
ca_census_insured_df = ca_census_insured_df.rename(columns={'Id2':'Zip',
                                                            'Total; Estimate; Civilian noninstitutionalized population':'Insurable Pop',
                                                            'Insured; Estimate; Civilian noninstitutionalized population':'Insured Pop',
                                                            'Percent Insured; Estimate; Civilian noninstitutionalized population':'Pct Insured',
                                                            'Uninsured; Estimate; Civilian noninstitutionalized population':'Uninsured Pop'})
ca_census_insured_df = ca_census_insured_df[['Zip','Insurable Pop','Insured Pop','Pct Insured','Uninsured Pop']]

# drop non-CA zips
ca_census_insured_df = ca_census_insured_df.drop([0,1,2,3,4,5]) # NV zips
ca_census_insured_df.head()

Unnamed: 0,Zip,Insurable Pop,Insured Pop,Pct Insured,Uninsured Pop
6,90001,57935,44211,76.3,13724
7,90002,51826,39186,75.6,12640
8,90003,70208,52204,74.4,18004
9,90004,63059,47278,75.0,15781
10,90005,39338,26087,66.3,13251


In [39]:
# Combine census data sets into one
ca_census_df0 = ca_zip_code_coord_df.merge(ca_census_demo_df, on='Zip', how='inner')
ca_census_df1 = ca_census_df0.merge(ca_census_emp_df, on='Zip', how='inner')
ca_census_df2 = ca_census_df1.merge(ca_census_income_df, on='Zip', how='inner')
ca_census_df = ca_census_df2.merge(ca_census_insured_df, on='Zip', how='inner')

ca_census_df.head()

Unnamed: 0,Zip,Lat,Lng,Total Pop,Male Pop,Pct Male,Female Pop,Pct Female,Total Employable Pop,Full Time Employed Pop,Pct Full Time Employed,Total Households,Household Median Income,Insurable Pop,Insured Pop,Pct Insured,Uninsured Pop
0,90001,33.974027,-118.249509,57942,29520,50.9,28422,49.1,37501,15020,61.9,13594,34323,57935,44211,76.3,13724
1,90002,33.949099,-118.246737,51826,24877,48.0,26949,52.0,33749,12093,59.3,12543,32520,51826,39186,75.6,12640
2,90003,33.964131,-118.272783,70208,33953,48.4,36255,51.6,45068,16965,61.3,16496,31878,70208,52204,74.4,18004
3,90004,34.076198,-118.310722,63095,31141,49.4,31954,50.6,45646,21540,63.7,22431,43180,63059,47278,75.0,15781
4,90005,34.059163,-118.306892,39338,19592,49.8,19746,50.2,28125,14861,70.1,16086,31485,39338,26087,66.3,13251
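
The chain of intermediate frames above can also be written as a single fold over a list. This is a sketch on toy frames, not a replacement for the cell; the frame names and most values are illustrative.

```python
from functools import reduce

import pandas as pd

# Toy frames that share a "Zip" key.
coords = pd.DataFrame({"Zip": [90001, 90002, 90003], "Lat": [33.97, 33.95, 33.96]})
demo = pd.DataFrame({"Zip": [90001, 90002], "Total Pop": [57942, 51826]})
insured = pd.DataFrame({"Zip": [90001, 90002, 90004], "Pct Insured": [76.3, 75.6, 75.0]})

# Inner-merge each frame onto the running result, left to right.
frames = [coords, demo, insured]
merged = reduce(lambda left, right: left.merge(right, on="Zip", how="inner"), frames)
print(merged)
```

Only zips present in every frame survive, which is the same behavior as the chained inner merges.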


In [40]:
# Read 2016 CA OSHPD Utilization file
ca_hosp_2016_df = pd.read_csv("Resources/POMS2016.csv")

# Limit to data that can be aggregated by pt zip code
ca_hosp_2016_df = ca_hosp_2016_df[['pzip','year','pattype','discharges']]
ca_hosp_2016_df = ca_hosp_2016_df.rename(columns={'year':'Year'})

# Use groupby to summarize discharges by pzip, year and pattype
ca_hosp_2016_df = ca_hosp_2016_df.groupby(['pzip','Year','pattype']).sum()
ca_hosp_2016_df = ca_hosp_2016_df.reset_index()
ca_hosp_2016_df = ca_hosp_2016_df.set_index(['pzip','Year'])
ca_hosp_2016_df = ca_hosp_2016_df.pivot_table(index=['pzip','Year'], columns='pattype',values='discharges').reset_index()

# drop records with no zip code (e.g. ARIZONA, HOMELESS, etc. in pzip column)
ca_hosp_2016_df = ca_hosp_2016_df.drop([2672,2673,2674,2675,2676,2677,2678])

# convert pzip from string to int
ca_hosp_2016_df['pzip'] = ca_hosp_2016_df['pzip'].apply(pd.to_numeric)

#ca_hosp_2016_df.count()
ca_hosp_2016_df.head()

pattype,pzip,Year,AS Only,ED Only,Inpatient,Inpatient from ED
0,90001,2016,1798.0,24554.0,3628.0,3286.0
1,90002,2016,1661.0,25168.0,3421.0,3314.0
2,90003,2016,2111.0,32061.0,4610.0,4648.0
3,90004,2016,2115.0,15820.0,2751.0,2639.0
4,90005,2016,1107.0,8650.0,1579.0,1488.0
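
The reshape above relies on hard-coded row positions to remove non-zip labels. The sketch below shows one alternative on toy data: coerce `pzip` to numeric first, drop what fails to parse, and let `pivot_table` do the summing. The sample rows are made up.

```python
import pandas as pd

# Toy long-format utilization records, including a non-ZIP label.
long_df = pd.DataFrame({
    "pzip": ["90001", "90001", "90001", "90002", "HOMELESS"],
    "Year": [2016] * 5,
    "pattype": ["ED Only", "ED Only", "Inpatient", "ED Only", "ED Only"],
    "discharges": [100, 50, 20, 70, 5],
})

# Labels that are not numeric become NaN and are dropped.
long_df["pzip"] = pd.to_numeric(long_df["pzip"], errors="coerce")
long_df = long_df.dropna(subset=["pzip"]).astype({"pzip": int})

# One row per ZIP and year, one column per patient type, summed discharges.
wide_df = long_df.pivot_table(
    index=["pzip", "Year"], columns="pattype", values="discharges", aggfunc="sum"
).reset_index()
print(wide_df)
```

A zip with no records for a given patient type ends up with `NaN` in that column, so counts can differ slightly between columns after the pivot.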


In [41]:
# Read CURES prescription counts by year, zip code and drug schedule
drug_sched_df = pd.read_csv("Resources/schedule_drugs_table9_presciptions_by_year_locale_drugschedule.csv")

# create dataframe with only 2016 drug data
drug_sched_2016_df = drug_sched_df.loc[drug_sched_df['xYear']==2016]

drug_sched_2016_df = drug_sched_2016_df[['Zip','County','State','xYear','Schedule_Group','RxCount']]
drug_sched_2016_df = drug_sched_2016_df.set_index(['Zip','County','State','xYear'])
drug_sched_2016_df = drug_sched_2016_df.pivot_table(index=['Zip','County','State','xYear'], columns='Schedule_Group', values='RxCount').reset_index()
drug_sched_2016_df = drug_sched_2016_df.rename(columns={'2':'C2','3':'C3','4':'C4','All':'C2-C4'})
#drug_sched_2016_df['C2 per 1,000'] = drug_sched_2016_df['C2']/drug_sched_2016_df['Population']
drug_sched_2016_df.head()


Schedule_Group,Zip,County,State,xYear,C2,C3,C4,C2-C4
0,90001,Los Angeles,CA,2016,843,201,1710,2754
1,90002,Los Angeles,CA,2016,357,767,1059,2183
2,90003,Los Angeles,CA,2016,1329,259,2440,4028
3,90004,Los Angeles,CA,2016,3582,1707,5002,10291
4,90005,Los Angeles,CA,2016,439,1403,1434,3276


In [42]:
# Read CURES prescription metrics by year and zip code
drug_metrics_df = pd.read_csv("Resources/schedule_drugs_table2_metrics_by_year_and_zip.csv")

# create dataframe with only 2016 drug data
drug_metrics_2016_df = drug_metrics_df.loc[drug_metrics_df['xYear']==2016]

#drug_metrics_df.sort_values('Zip')
# what % of records are unique Zips
drug_metrics_2016_df['Zip'].nunique()/drug_metrics_2016_df['Zip'].count()

rx_count_2016_df = drug_metrics_2016_df[['Zip','Rx_count_Pat','Pat_count','RxCount_Total_Pat']]
rx_count_2016_df = rx_count_2016_df.rename(columns={'Rx_count_Pat':'Opioid Rx Count',
                                                    'Pat_count':'Pop w Opioid Rx',
                                                    'RxCount_Total_Pat':'Tot Opioid Rx Fills'})


# Merge controlled Rx with total Rx count
rx_data_df = rx_count_2016_df.merge(drug_sched_2016_df,on='Zip', how='outer')
rx_data_df = rx_data_df[['Zip','County','State','Opioid Rx Count','Pop w Opioid Rx','Tot Opioid Rx Fills','C2','C3','C4','C2-C4']]

rx_data_df.head()

Unnamed: 0,Zip,County,State,Opioid Rx Count,Pop w Opioid Rx,Tot Opioid Rx Fills,C2,C3,C4,C2-C4
0,90001,Los Angeles,CA,7879,4202,7717,843,201,1710,2754
1,90002,Los Angeles,CA,10927,5072,10701,357,767,1059,2183
2,90003,Los Angeles,CA,12814,6073,12512,1329,259,2440,4028
3,90004,Los Angeles,CA,7149,3987,6832,3582,1707,5002,10291
4,90005,Los Angeles,CA,4292,2342,4188,439,1403,1434,3276


In [43]:
# Read Area Deprivation Index aggregated by zip code
ca_adi_df = pd.read_csv("Resources/CA_ADI_by_ZIP.csv")
ca_adi_df = ca_adi_df.rename(columns={'ADI_STATERNK':'ADI State Rank','ADI_NATRANK':'ADI Natl Rank'})
ca_adi_df.head()

Unnamed: 0,Zip,ADI State Rank,ADI Natl Rank
0,90001,8.007138,45.794018
1,90002,8.125497,50.517541
2,90003,7.97254,46.906751
3,90004,4.238019,28.200175
4,90005,7.225923,57.775878


In [44]:
# Read Tapestry segments by CA zip
tapestry_df = pd.read_csv("Resources/Tapestry_by_CA_Zip.csv")
tapestry_df = tapestry_df.rename(columns={'ZIP Code':'Zip','Dominant Tapestry Segment Number per Zip':'Tapestry Seg Nr',
                                         'Dominant Tapestry Segment Code per Zip':'Tapestry Seg Code',
                                         'Dominant Tapestry Segment Name per Zip':'Tapestry Seg Name',
                                         'Dominant Lifemode Code':'Lifemode Code',
                                         'Dominant Lifemode Group Name':'Lifemode Group'})
tapestry_df.head()

Unnamed: 0,Zip,NAME,Tapestry Seg Nr,Tapestry Seg Code,Tapestry Seg Name,Lifemode Code,Lifemode Group
0,90001.0,Los Angeles,61.0,13B,Las Casas,13.0,Next Wave
1,90002.0,Los Angeles,61.0,13B,Las Casas,13.0,Next Wave
2,90003.0,Los Angeles,61.0,13B,Las Casas,13.0,Next Wave
3,90004.0,Los Angeles,60.0,13A,International Marketplace,13.0,Next Wave
4,90005.0,Los Angeles,62.0,13C,NeWest Residents,13.0,Next Wave


In [45]:
# Read CA Dept of Insurance underserved zip code flags
ca_doi_df = pd.read_csv("Resources/CA_DOI_Underserved_Zips.csv")
ca_doi_df.head()

Unnamed: 0,Zip,CA DOI Underserved Flag
0,90001,Y
1,90002,Y
2,90003,Y
3,90004,Y
4,90005,Y


In [46]:
ca_mort_df = pd.read_csv("Resources/deaths_by_zipcode_by_cause_1999_present.csv")

# create dataframe with only 2016 mortality data
ca_2016_mort_df = ca_mort_df.loc[ca_mort_df['Year']==2016]

ca_2016_mort_df = ca_2016_mort_df.rename(columns={'ZIP Code':'Zip'})
ca_2016_mort_df = ca_2016_mort_df.set_index(['Zip']).reset_index()
ca_2016_mort_df = ca_2016_mort_df.pivot_table(index=['Zip'], columns='Causes of Death',values='Count').reset_index()
ca_2016_mort_df = ca_2016_mort_df.replace(np.nan, 0)
ca_2016_mort_df['Total Deaths'] = ca_2016_mort_df['ALZ']+ca_2016_mort_df['CAN']+ca_2016_mort_df['CLD']+ \
                                  ca_2016_mort_df['DIA']+ca_2016_mort_df['HTD']+ca_2016_mort_df['HYP']+ \
                                  ca_2016_mort_df['INJ']+ca_2016_mort_df['LIV']+ca_2016_mort_df['NEP']+ \
                                  ca_2016_mort_df['OTH']+ca_2016_mort_df['PNF']+ca_2016_mort_df['STK']+ \
                                  ca_2016_mort_df['SUI']

ca_2016_mort_df.head()

Causes of Death,Zip,ALZ,CAN,CLD,DIA,HTD,HYP,INJ,LIV,NEP,OTH,PNF,STK,SUI,Total Deaths
0,90001,5,39,7,14,77,9,21,13,4,49,4,12,3,257
1,90002,6,58,12,18,67,6,11,8,8,48,8,19,4,273
2,90003,7,71,11,12,101,7,24,10,6,80,9,18,4,360
3,90004,14,77,14,9,73,9,13,5,5,28,22,15,5,289
4,90005,8,42,5,7,49,4,6,2,0,26,10,9,2,170
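
The total above is built by adding each cause column by name. A shorter sketch of the same calculation uses a row-wise sum over the cause columns. Only three of the thirteen causes are included here to keep the example small.

```python
import pandas as pd

# Toy frame with a subset of the cause-of-death columns.
mort = pd.DataFrame({
    "Zip": [90001, 90002],
    "ALZ": [5, 6],
    "CAN": [39, 58],
    "HTD": [77, 67],
})

# Sum every column except the key across each row.
cause_cols = [col for col in mort.columns if col != "Zip"]
mort["Total Deaths"] = mort[cause_cols].sum(axis=1)
print(mort)
```

This version picks up any cause code present in the pivoted data, which helps if the set of codes changes between years.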


## Merge all relevant data into one dataframe

In [47]:
# Merge all relevant data sets
# Census and hospital data
combo1_df = ca_census_df.merge(ca_hosp_2016_df, left_on='Zip', right_on='pzip', how='inner')
# add Rx data
combo2_df = combo1_df.merge(rx_data_df, on='Zip', how='inner')
# add Area Deprivation Index
combo3_df = combo2_df.merge(ca_adi_df, on='Zip', how='inner')
# add Tapestry segmentation data
combo4_df = combo3_df.merge(tapestry_df, on='Zip', how='inner')
# add CA Dept of Insurance underserved zip code flag
combo5_df = combo4_df.merge(ca_doi_df, on='Zip', how='left')
combo5_df['CA DOI Underserved Flag'] = combo5_df['CA DOI Underserved Flag'].fillna('N')  # populates zip codes not in CA DOI list
# add CA Mortality by Cause data by zip code
combo6_df = combo5_df.merge(ca_2016_mort_df, on='Zip', how='inner')
combo6_df.count()

Zip                        1498
Lat                        1498
Lng                        1498
Total Pop                  1498
Male Pop                   1498
Pct Male                   1498
Female Pop                 1498
Pct Female                 1498
Total Employable Pop       1498
Full Time Employed Pop     1498
Pct Full Time Employed     1498
Total Households           1498
Household Median Income    1498
Insurable Pop              1498
Insured Pop                1498
Pct Insured                1498
Uninsured Pop              1498
pzip                       1498
Year                       1498
AS Only                    1497
ED Only                    1498
Inpatient                  1498
Inpatient from ED          1498
County                     1498
State                      1498
Opioid Rx Count            1498
Pop w Opioid Rx            1498
Tot Opioid Rx Fills        1498
C2                         1498
C3                         1498
C4                         1498
C2-C4   
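
The merged frame has 1,498 rows, compared with 2,184 rows in the coordinate file, because each inner join keeps only zips found on both sides. To see which side loses rows at a given step, an outer merge with `indicator=True` can be used as an audit. The frames below are toy data, and the unmatched zip 99999 is made up.

```python
import pandas as pd

# Toy census and hospital frames with one unmatched ZIP on each side.
census = pd.DataFrame({"Zip": [90001, 90002, 90003], "Total Pop": [57942, 51826, 70208]})
hosp = pd.DataFrame({"pzip": [90001, 90002, 99999], "ED Only": [24554.0, 25168.0, 10.0]})

# The _merge column reports whether each row matched on both sides.
audit = census.merge(hosp, left_on="Zip", right_on="pzip", how="outer", indicator=True)
print(audit["_merge"].value_counts())
```

Rows marked `left_only` or `right_only` are the ones an inner join would discard.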

In [48]:
# Add Ambulatory Surgery (AS), Emergency Dept (ED), Inpatient (discharges) and Inpatient from ED utilization rates per 1,000 pop
combo6_df['AS per 1,000'] = combo6_df['AS Only'] / combo6_df['Total Pop']*1000
combo6_df['ED per 1,000'] = combo6_df['ED Only'] / combo6_df['Total Pop']*1000
combo6_df['IP per 1,000'] = combo6_df['Inpatient'] / combo6_df['Total Pop']*1000
combo6_df['IP via ED per 1,000'] = combo6_df['Inpatient from ED'] / combo6_df['Total Pop']*1000

# Calculate drug utilization ratios per 1,000 population
combo6_df['C2 per 1,000'] = combo6_df['C2']/combo6_df['Total Pop']*1000
combo6_df['C3 per 1,000'] = combo6_df['C3']/combo6_df['Total Pop']*1000
combo6_df['C4 per 1,000'] = combo6_df['C4']/combo6_df['Total Pop']*1000
combo6_df['C2-C4 per 1,000'] = combo6_df['C2-C4']/combo6_df['Total Pop']*1000
combo6_df['Opioid Rx per 1,000'] = combo6_df['Opioid Rx Count']/combo6_df['Total Pop']*1000
combo6_df['Pop w Opioid Rx per 1,000'] = combo6_df['Pop w Opioid Rx']/combo6_df['Total Pop']*1000

# Calculate mortality rates per 1,000 population by top causes of death
combo6_df['Total Deaths per 1,000'] = combo6_df['Total Deaths']/combo6_df['Total Pop']*1000
combo6_df['Alzheimers Deaths per 1,000'] = combo6_df['ALZ']/combo6_df['Total Pop']*1000
combo6_df['Cancer Deaths per 1,000'] = combo6_df['CAN']/combo6_df['Total Pop']*1000
combo6_df['CLRD Deaths per 1,000'] = combo6_df['CLD']/combo6_df['Total Pop']*1000
combo6_df['Diabetes Deaths per 1,000'] = combo6_df['DIA']/combo6_df['Total Pop']*1000
combo6_df['Heart Disease Deaths per 1,000'] = combo6_df['HTD']/combo6_df['Total Pop']*1000
combo6_df['Hypertension Deaths per 1,000'] = combo6_df['HYP']/combo6_df['Total Pop']*1000
combo6_df['Accidental Deaths per 1,000'] = combo6_df['INJ']/combo6_df['Total Pop']*1000
combo6_df['Chronic Liver Disease Deaths per 1,000'] = combo6_df['LIV']/combo6_df['Total Pop']*1000
combo6_df['Nephrotic Diseases Deaths per 1,000'] = combo6_df['NEP']/combo6_df['Total Pop']*1000
combo6_df['Other Deaths per 1,000'] = combo6_df['OTH']/combo6_df['Total Pop']*1000
combo6_df['Pneumonia and Influenza Deaths per 1,000'] = combo6_df['PNF']/combo6_df['Total Pop']*1000
combo6_df['Stroke Deaths per 1,000'] = combo6_df['STK']/combo6_df['Total Pop']*1000
combo6_df['Suicide Deaths per 1,000'] = combo6_df['SUI']/combo6_df['Total Pop']*1000

combo6_df.count()

Zip                                         1498
Lat                                         1498
Lng                                         1498
Total Pop                                   1498
Male Pop                                    1498
Pct Male                                    1498
Female Pop                                  1498
Pct Female                                  1498
Total Employable Pop                        1498
Full Time Employed Pop                      1498
Pct Full Time Employed                      1498
Total Households                            1498
Household Median Income                     1498
Insurable Pop                               1498
Insured Pop                                 1498
Pct Insured                                 1498
Uninsured Pop                               1498
pzip                                        1498
Year                                        1498
AS Only                                     1497
ED Only             
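
Since every rate column uses the same formula, the repeated assignments can be driven from a mapping of source column to rate column. The following is a sketch on two zips and two measures using values from the outputs above; the name `rate_columns` is hypothetical.

```python
import pandas as pd

# Two ZIP codes with values taken from the outputs in this notebook.
df = pd.DataFrame({
    "Zip": [90001, 90002],
    "Total Pop": [57942, 51826],
    "ED Only": [24554.0, 25168.0],
    "C2": [843, 357],
})

# Map each count column to the name of its per-1,000 rate column.
rate_columns = {"ED Only": "ED per 1,000", "C2": "C2 per 1,000"}
for source_col, rate_col in rate_columns.items():
    df[rate_col] = df[source_col] / df["Total Pop"] * 1000

print(df[["Zip", "ED per 1,000", "C2 per 1,000"]])
```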

## Clean final merged data and export to csv file

In [49]:
# Keep only the needed columns, dropping duplicates such as pzip
combo6_df = combo6_df[['Zip','Lat','Lng','County','State','Year','Total Pop','Male Pop','Pct Male','Female Pop','Pct Female',
                       'Total Employable Pop','Full Time Employed Pop','Pct Full Time Employed','Total Households',
                       'Household Median Income','Insurable Pop','Insured Pop','Pct Insured','Uninsured Pop',
                       'ADI State Rank','ADI Natl Rank','CA DOI Underserved Flag','Tapestry Seg Nr','Tapestry Seg Code',
                       'Tapestry Seg Name','Lifemode Code','Lifemode Group','AS Only','ED Only','Inpatient',
                       'Inpatient from ED','Opioid Rx Count','Pop w Opioid Rx','Tot Opioid Rx Fills',
                       'ALZ','CAN','CLD','DIA','HTD','HYP','INJ','LIV','NEP','OTH','PNF','STK','SUI','Total Deaths',
                       'Total Deaths per 1,000','Alzheimers Deaths per 1,000','Cancer Deaths per 1,000',
                       'CLRD Deaths per 1,000','Diabetes Deaths per 1,000','Heart Disease Deaths per 1,000',
                       'Hypertension Deaths per 1,000','Accidental Deaths per 1,000','Chronic Liver Disease Deaths per 1,000',
                       'Nephrotic Diseases Deaths per 1,000','Other Deaths per 1,000','Pneumonia and Influenza Deaths per 1,000',
                       'Stroke Deaths per 1,000','Suicide Deaths per 1,000','C2','C3','C4','C2-C4','AS per 1,000',
                       'ED per 1,000','IP per 1,000','IP via ED per 1,000','C2 per 1,000','C3 per 1,000','C4 per 1,000',
                       'C2-C4 per 1,000','Opioid Rx per 1,000','Pop w Opioid Rx per 1,000']]

# Clean null and infinite values from data
combo7_df = combo6_df.replace([np.inf, -np.inf], np.nan)
combo8_df = combo7_df.dropna(axis=0)

# Save Clean Combined Data to csv file
combo8_df.to_csv("Combined_Hosp_Drug_Util_v3.csv", encoding='utf-8')

combo8_df.count()

Zip                                         1495
Lat                                         1495
Lng                                         1495
County                                      1495
State                                       1495
Year                                        1495
Total Pop                                   1495
Male Pop                                    1495
Pct Male                                    1495
Female Pop                                  1495
Pct Female                                  1495
Total Employable Pop                        1495
Full Time Employed Pop                      1495
Pct Full Time Employed                      1495
Total Households                            1495
Household Median Income                     1495
Insurable Pop                               1495
Insured Pop                                 1495
Pct Insured                                 1495
Uninsured Pop                               1495
ADI State Rank      
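
The cleaning step drops the row count from 1,498 to 1,495. One likely contributor is division by a zero population, which produces infinite or undefined rates. The toy example below shows how `replace` plus `dropna` removes such rows; the zips and values are made up.

```python
import numpy as np
import pandas as pd

# Toy frame: one normal ZIP and two with zero population.
df = pd.DataFrame({
    "Zip": [90001, 90002, 90003],
    "Total Pop": [1000, 0, 0],
    "ED Only": [250.0, 12.0, 0.0],
})

# Nonzero / 0 gives inf and 0 / 0 gives NaN in pandas.
df["ED per 1,000"] = df["ED Only"] / df["Total Pop"] * 1000

# Convert infinities to NaN, then drop every row that contains a NaN.
clean = df.replace([np.inf, -np.inf], np.nan).dropna(axis=0)
print(clean)
```

Because `dropna(axis=0)` removes a row when any column is missing, a single missing value such as an absent `AS Only` count also excludes that zip from the export.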