### Health Professional Shortage Areas (HPSA)

These data provide areas designated by HRSA as having shortages of primary care, dental care, or mental health providers. HRSA’s Bureau of Health Workforce (BHW) develops shortage designation criteria and uses them to decide whether or not a geographic area or population group is a Health Professional Shortage Area (HPSA), Medically Underserved Area (MUA), or Medically Underserved Population (MUP). More than 34 federal programs depend on HPSA/MUA/MUP designations to determine eligibility or as a funding preference. About 20 percent of the U.S. population resides in primary medical care HPSAs. HPSAs may have shortages of primary medical care, dental, or mental health providers; may be urban or rural areas; population groups; or medical or other public facilities.

**HPSAs** are defined service areas that demonstrate a critical shortage of primary care physicians, dentists or mental health providers. A HPSA can be a distinct geographic area (such as a county, grouping, census tract, township or borough), a specific population group within a defined geographic area (such as the population under 200 percent of poverty) or a specific public or non-profit facility (such as a prison).

source: https://data.hrsa.gov/data/download

I think the variable of interest here is the location at the county level, i.e. the county
FIPS code in the "State and County Federal Information Processing Standard Code" column
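
County FIPS codes are five-character strings (two-digit state plus three-digit county), so leading zeros matter when filtering or joining on them. A minimal sketch, assuming some codes may have been parsed as integers; `normalize_fips` is a hypothetical helper, not something from the HRSA file or this notebook:

```python
import pandas as pd

def normalize_fips(codes: pd.Series) -> pd.Series:
    """Return county FIPS codes as zero-padded 5-character strings."""
    # cast to string first so integer-parsed codes (e.g. 4019) can be padded
    return codes.astype(str).str.strip().str.zfill(5)

# toy values only: an integer code, a clean string, and a string with stray whitespace
toy = pd.Series([4019, "53075", " 8021"])
fips = normalize_fips(toy)
print(fips.tolist())
```

With codes in this shape, a filter such as `df['FIPS'] == '04019'` behaves the same regardless of how the column was originally parsed.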

* HPSA Score : 
    - This attribute represents the Health Professional Shortage Area (HPSA) Score developed by the National Health Service Corps (NHSC) in determining priorities for assignment of clinicians. The scores range from 0 to 26 where the higher the score, the greater the priority.

Others that we might find useful if we do time series:
*   HPSA Designation Date 
*   HPSA Designation Last Update Date 
*   HPSA Status : ['Withdrawn' 'Designated' 'Proposed For Withdrawal']
*   'HPSA Withdrawn Date String', 'Withdrawn Date'
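
If we do go the time-series route, the date columns would need parsing first. A rough sketch on a few toy rows, assuming the dates are `MM/DD/YYYY` strings as they appear in the table output further down; the per-year count is just one illustrative aggregation:

```python
import pandas as pd

toy = pd.DataFrame({
    "HPSA Designation Date": ["03/24/1987", "09/18/2017", "07/21/2017", "06/22/2022"],
    "HPSA Status": ["Proposed For Withdrawal", "Designated", "Designated", "Designated"],
})

# parse the date strings; unparseable values become NaT instead of raising
toy["designation_dt"] = pd.to_datetime(
    toy["HPSA Designation Date"], format="%m/%d/%Y", errors="coerce"
)

# number of designations per calendar year
per_year = toy.groupby(toy["designation_dt"].dt.year).size()
print(per_year)
```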



notes:
- averaged the HPSA scores of all Geographic HPSA rows within each FIPS county area
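
A minimal sketch of that averaging on toy data. It also shows one possible variation, which is an assumption rather than what this notebook does: collapsing to one row per `HPSA ID` and county before averaging, so that an HPSA split across many component rows (e.g. census tracts) is not counted several times. The scores and IDs below are made up.

```python
import pandas as pd

toy = pd.DataFrame({
    "FIPS": ["04019"] * 5,
    "HPSA ID": ["A", "A", "A", "B", "B"],  # hypothetical IDs
    "HPSA Score": [10, 10, 10, 11, 11],
})

# plain mean over every row: HPSAs with more component rows weigh more
row_mean = toy.groupby("FIPS")["HPSA Score"].mean()

# one row per (FIPS, HPSA ID) first, then average per county
per_hpsa = toy.drop_duplicates(subset=["FIPS", "HPSA ID"])
hpsa_mean = per_hpsa.groupby("FIPS")["HPSA Score"].mean()

print(row_mean["04019"], hpsa_mean["04019"])
```

Which of the two means is preferable depends on whether each component row or each HPSA should count once; the difference is usually small but not zero.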




In [None]:

import requests
import pandas as pd
data_path = 'data'



In [None]:
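# read the county FIPS column as str so leading zeros (e.g. '04019') are preserved
shortage_df = pd.read_csv('https://data.hrsa.gov//DataDownload/DD_Files/BCD_HPSA_FCT_DET_PC.csv', dtype={'State and County Federal Information Processing Standard Code': str})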

In [None]:
# keep only active designations
shortage_df = shortage_df[shortage_df['HPSA Status'].isin(['Designated', 'Proposed For Withdrawal'])]
# shortage_df_remove_dups = shortage_df.drop(['HPSA Geography Identification Number', 'HPSA Component Name'], axis=1)
# .copy() so later changes do not act on a view of shortage_df
geo_shortage_designated_df = shortage_df[shortage_df['Designation Type'] == 'Geographic HPSA'].copy()
# 'HPSA Component Name' varies (e.g. by census tract), but the other values are the same for that FIPS county,
# so we drop duplicates. Note: drop_duplicates() only removes rows identical in every column,
# so rows that differ only in component name are kept
geo_shortage_designated_df = geo_shortage_designated_df.drop_duplicates()

geo_shortage_designated_df = geo_shortage_designated_df.rename(columns={'State and County Federal Information Processing Standard Code': 'FIPS'})

In [None]:
geo_shortage_designated_df[geo_shortage_designated_df['FIPS']=='04019'][['HPSA Component Name', 'FIPS', 'Designation Type', 'HPSA Name', 'HPSA Score']]


Unnamed: 0,HPSA Component Name,FIPS,Designation Type,HPSA Name,HPSA Score
15580,41.19,4019,Geographic HPSA,Sahuarita,10
15581,43.29,4019,Geographic HPSA,Sahuarita,10
15582,43.26,4019,Geographic HPSA,Sahuarita,10
15583,43.27,4019,Geographic HPSA,Sahuarita,10
15584,43.23,4019,Geographic HPSA,Sahuarita,10
20082,"Census Tract 53, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
20083,"Census Tract 40.52, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
20084,"Census Tract 40.51, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
20085,"Census Tract 40.50, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
20086,"Census Tract 40.54, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11


In [None]:

# geo_shortage_designated_df[['FIPS', 'HPSA Score']].groupby('FIPS').count().sort_values(by='HPSA Score', ascending=False)

geo_shortage_designated_df

Unnamed: 0,HPSA Name,HPSA ID,Designation Type,HPSA Discipline Class,HPSA Score,Primary State Abbreviation,HPSA Status,HPSA Designation Date,HPSA Designation Last Update Date,Metropolitan Indicator,...,Rural Status Code,State Abbreviation,FIPS,State FIPS Code,State Name,U.S. - Mexico Border 100 Kilometer Indicator,U.S. - Mexico Border County Indicator,Data Warehouse Record Create Date,Data Warehouse Record Create Date Text,Unnamed: 65
207,Rock Lake/LA Cross Service Area,1538714311,Geographic HPSA,Primary Care,15,WA,Proposed For Withdrawal,03/24/1987,09/09/2021,Unknown,...,R,WA,53075,53,Washington,N,N,07/18/2022,2022/07/18,
209,Enumclaw,1538700586,Geographic HPSA,Primary Care,12,WA,Designated,09/18/2017,09/10/2021,Unknown,...,N,WA,53033,53,Washington,N,N,07/18/2022,2022/07/18,
222,Island County,1538396778,Geographic HPSA,Primary Care,13,WA,Designated,07/21/2017,09/10/2021,Unknown,...,R,WA,53029,53,Washington,N,N,07/18/2022,2022/07/18,
228,North Spokane County,1538067837,Geographic HPSA,Primary Care,16,WA,Designated,09/15/2017,09/08/2021,Unknown,...,N,WA,53063,53,Washington,N,N,07/18/2022,2022/07/18,
634,Morrow County,1413297457,Geographic HPSA,Primary Care,11,OR,Designated,09/21/2020,09/10/2021,Unknown,...,R,OR,41049,41,Oregon,N,N,07/18/2022,2022/07/18,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
63620,Conejos County,1086689702,Geographic HPSA,Primary Care,21,CO,Designated,04/05/2017,09/10/2021,Unknown,...,R,CO,08021,8,Colorado,N,N,07/18/2022,2022/07/18,
63624,Moffat County,1086098043,Geographic HPSA,Primary Care,12,CO,Proposed For Withdrawal,04/04/2017,09/09/2021,Unknown,...,R,CO,08081,8,Colorado,N,N,07/18/2022,2022/07/18,
63824,MSSA 188.1-Big Bend/Montgomery Creek/Oak Run,1066701732,Geographic HPSA,Primary Care,14,CA,Designated,06/22/2022,06/22/2022,Unknown,...,R,CA,06089,6,California,N,N,07/18/2022,2022/07/18,
63825,MSSA 73/Bieber,1066701137,Geographic HPSA,Primary Care,10,CA,Proposed For Withdrawal,04/24/2017,09/10/2021,Unknown,...,R,CA,06035,6,California,N,N,07/18/2022,2022/07/18,


In [None]:

geo_shortage_designated_df_trunc = geo_shortage_designated_df[['FIPS', 'HPSA Score']]
# a FIPS county can appear in multiple rows, so average the scores per FIPS
grouped_shortages_fips_df = geo_shortage_designated_df_trunc.groupby('FIPS')['HPSA Score'].mean().reset_index()

In [None]:
grouped_shortages_fips_df['FIPS'].nunique()

965

In [None]:
grouped_shortages_fips_df

Unnamed: 0,FIPS,HPSA Score
0,01001,12.0
1,01007,10.0
2,01009,7.0
3,01017,14.0
4,01019,11.0
...,...,...
960,69085,16.0
961,69100,16.0
962,69110,16.0
963,69120,16.0


In [None]:

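import os
os.makedirs(f'{data_path}/processed', exist_ok=True)  # make sure the output folder exists
grouped_shortages_fips_df.to_csv(f'{data_path}/processed/FIPS_HPSA_PC.csv', index=False)
grouped_shortages_fips_df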

Unnamed: 0,FIPS,HPSA Score
0,01001,12.0
1,01007,10.0
2,01009,7.0
3,01017,14.0
4,01019,11.0
...,...,...
960,69085,16.0
961,69100,16.0
962,69110,16.0
963,69120,16.0


#### If we want to do subpopulations:
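
One way this could look, sketched on toy rows: average the score per county separately for each `HPSA Population Type`, giving one column per subpopulation. The values are made up, and the wide layout is only an assumption about what would be convenient downstream.

```python
import pandas as pd

toy = pd.DataFrame({
    "FIPS": ["01001", "01001", "01001", "01007"],
    "HPSA Population Type": [
        "Geographic Population",
        "Low Income Population HPSA",
        "Low Income Population HPSA",
        "Geographic Population",
    ],
    "HPSA Score": [12, 14, 16, 10],
})

# mean score per county and population type; missing combinations become NaN
by_pop = toy.pivot_table(
    index="FIPS",
    columns="HPSA Population Type",
    values="HPSA Score",
    aggfunc="mean",
)
print(by_pop)
```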


In [None]:
# If we wanted to do subpopulations: 
shortage_df['HPSA Population Type'].unique()

array(['Native American Population HPSA', 'Geographic Population',
       'Low Income Population HPSA', nan,
       'Low Income Homeless Population HPSA',
       'Low Income Homeless Migrant Seasonal Worker Population HPSA',
       'Low Income Migrant Seasonal Worker Population HPSA',
       'Medicaid Eligible Population HPSA',
       'Low Income Migrant Farmworker Population HPSA',
       'Migrant Farmworker Population HPSA',
       'Low Income Homeless Migrant Farmworker Population HPSA',
       'Other Population HPSA', 'Homeless Population HPSA',
       'Migrant Seasonal Worker Population HPSA'], dtype=object)

In [None]:
# If we want to do subcategories
import numpy as np
# work on a copy so the new columns are not added to geo_shortage_designated_df
# (note: this frame is already restricted to 'Geographic HPSA' rows)
shortage_df = geo_shortage_designated_df.copy()

# core_cities_df['HPSA Low Income Pop. Score'] = core_cities_df.apply(lambda x: x['HPSA Score'] if x['HPSA Population Type'] is 'Low Income Population HPSA' else np.nan)
shortage_df['HPSA Low Income Pop. Score'] = np.where(shortage_df['HPSA Population Type'] == 'Low Income Population HPSA', shortage_df['HPSA Score'], np.nan)
shortage_df['HPSA Geo Score'] = np.where(shortage_df['HPSA Population Type'] == 'Geographic Population', shortage_df['HPSA Score'], np.nan)
# rows that fall into neither subcategory
shortage_df[shortage_df['HPSA Low Income Pop. Score'].isna() & shortage_df['HPSA Geo Score'].isna()]