### Health Professional Shortage Areas (HPSA)

These data provide areas designated by HRSA as having shortages of primary care, dental care, or mental health providers. HRSA’s Bureau of Health Workforce (BHW) develops shortage designation criteria and uses them to decide whether or not a geographic area or population group is a Health Professional Shortage Area (HPSA), Medically Underserved Area (MUA), or Medically Underserved Population (MUP). More than 34 federal programs depend on HPSA/MUA/MUP designations to determine eligibility or as a funding preference. About 20 percent of the U.S. population resides in primary medical care HPSAs. HPSAs may have shortages of primary medical care, dental, or mental health providers; may be urban or rural areas; population groups; or medical or other public facilities.

**HPSAs** are defined service areas that demonstrate a critical shortage of primary care physicians, dentists or mental health providers. A HPSA can be a distinct geographic area (such as a county, grouping, census tract, township or borough), a specific population group within a defined geographic area (such as the population under 200 percent of poverty) or a specific public or non-profit facility (such as a prison).

source: https://data.hrsa.gov/data/download

I think the variable of interest here is the county-level location, identified by the
FIPS code in the "State and County Federal Information Processing Standard Code" column, together with:

* HPSA Score : 
    - This attribute represents the Health Professional Shortage Area (HPSA) Score developed by the National Health Service Corps (NHSC) in determining priorities for assignment of clinicians. The scores range from 0 to 26 where the higher the score, the greater the priority.

Other columns that we might find useful if we do a time series:
*   HPSA Designation Date 
*   HPSA Designation Last Update Date 
*   HPSA Status : ['Withdrawn' 'Designated' 'Proposed For Withdrawal']
*   'HPSA Withdrawn Date String', 'Withdrawn Date'
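If the date columns listed above are used for a time series, they first need to be parsed. The sketch below uses made-up toy rows and assumes the dates are `MM/DD/YYYY` strings, as the `HPSA Designation Date` values in the cell outputs suggest; the format of `Withdrawn Date` and the helper `active_on` are assumptions for illustration, not part of the HRSA file or this notebook.

```python
import pandas as pd

# Toy rows mimicking the date columns; IDs and dates are made up for illustration.
toy = pd.DataFrame({
    "HPSA ID": ["A1", "B2", "C3"],
    "HPSA Status": ["Designated", "Withdrawn", "Designated"],
    "HPSA Designation Date": ["03/17/2022", "07/25/1978", "12/31/2018"],
    "Withdrawn Date": [None, "09/10/2021", None],
})

# Parse MM/DD/YYYY strings (assumed format); missing or unparseable values become NaT.
for col in ["HPSA Designation Date", "Withdrawn Date"]:
    toy[col] = pd.to_datetime(toy[col], format="%m/%d/%Y", errors="coerce")


def active_on(df, when):
    """Hypothetical helper: rows designated on or before `when` and not withdrawn by then."""
    when = pd.Timestamp(when)
    started = df["HPSA Designation Date"] <= when
    not_withdrawn = df["Withdrawn Date"].isna() | (df["Withdrawn Date"] > when)
    return df[started & not_withdrawn]


active_2020 = active_on(toy, "2020-01-01")
print(active_2020[["HPSA ID", "HPSA Designation Date", "Withdrawn Date"]])
```

Calling `active_on` once per year would give a yearly snapshot of designations, which could then be averaged per FIPS county in the same way as the current scores.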



Notes:
- The HPSA scores are averaged over all rows that fall in the same FIPS county area.




In [1]:

import requests
import pandas as pd
import os
# On Colab, read and write data on the shared Google Drive; otherwise use a local folder
if 'COLAB_GPU' in os.environ:
    from google.colab import drive
    drive.mount('/drive')
    data_path = '/drive/Shared drives/Capstone/notebooks/data'
else:
    data_path = 'data'


In [2]:
shortage_df = pd.read_csv('https://data.hrsa.gov//DataDownload/DD_Files/BCD_HPSA_FCT_DET_PC.csv')
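County FIPS codes carry leading zeros (for example `04019`), which are lost if pandas infers the column as an integer. The filter on `'04019'` further down suggests the column already loads as text here, so the following is only a sketch of how to make that explicit. It uses a small in-memory CSV with made-up rows rather than the HRSA download.

```python
import io
import pandas as pd

FIPS_COL = "State and County Federal Information Processing Standard Code"

# Made-up two-row CSV standing in for the real download.
csv_text = (
    f"HPSA Name,{FIPS_COL},HPSA Score\n"
    "Ajo,04019,17\n"
    "Hickman County,47081,13\n"
)

# Default type inference turns the codes into integers and drops the leading zero.
as_default = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps the five-character code intact.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={FIPS_COL: str})

# Codes that were already read as integers can be repaired by zero-padding to width 5.
repaired = as_default[FIPS_COL].astype(str).str.zfill(5)

print(as_default[FIPS_COL].tolist())
print(as_text[FIPS_COL].tolist())
print(repaired.tolist())
```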

In [3]:
# Keep designations that are still in effect (including those proposed for withdrawal)
shortage_df = shortage_df[shortage_df['HPSA Status'].isin(['Designated', 'Proposed For Withdrawal'])]
# Get geographic-type shortages (by county)
geo_shortage_designated_df = shortage_df[shortage_df['Designation Type'] == 'Geographic HPSA']
# 'HPSA Component Name' varies (e.g. by census tract) while the other values repeat
# within a FIPS county, so drop rows that are exact duplicates
geo_shortage_designated_df = geo_shortage_designated_df.drop_duplicates()

# Reassign instead of inplace=True to avoid modifying a slice of shortage_df
geo_shortage_designated_df = geo_shortage_designated_df.rename(columns={'State and County Federal Information Processing Standard Code': 'FIPS'})
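One thing to keep in mind: `drop_duplicates()` with no arguments only removes rows that match in every column, so rows that differ only in `HPSA Component Name` are all kept, and a designation spanning many census tracts counts once per tract in a later per-county mean. The sketch below shows one possible alternative on made-up toy rows: keep one row per designation per county. The `HPSA ID` values are hypothetical, and whether this weighting is preferable is a judgment call that the notebook does not make.

```python
import pandas as pd

# Toy rows for one county: one single-tract HPSA and one HPSA spanning three tracts.
toy = pd.DataFrame({
    "FIPS": ["04019"] * 4,
    "HPSA ID": ["X1", "X2", "X2", "X2"],  # hypothetical IDs
    "HPSA Name": ["Ajo", "Tanque Verde", "Tanque Verde", "Tanque Verde"],
    "HPSA Component Name": [
        "Census Tract 52",
        "Census Tract 40.50",
        "Census Tract 40.51",
        "Census Tract 40.52",
    ],
    "HPSA Score": [17, 11, 11, 11],
})

# Full-row deduplication keeps all four rows because the component names differ.
full_row = toy.drop_duplicates()

# Keeping one row per (county, designation) counts each HPSA once.
per_designation = toy.drop_duplicates(subset=["FIPS", "HPSA ID"])

tract_weighted_mean = full_row.groupby("FIPS")["HPSA Score"].mean()
designation_mean = per_designation.groupby("FIPS")["HPSA Score"].mean()

print(tract_weighted_mean)
print(designation_mean)
```

In this toy case the tract-weighted mean is 12.5 while the per-designation mean is 14.0, so the choice affects the county score whenever designations cover different numbers of tracts.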

In [4]:
geo_shortage_designated_df[geo_shortage_designated_df['FIPS']=='04019'][['HPSA Component Name', 'FIPS', 'Designation Type', 'HPSA Name', 'HPSA Score']]


Unnamed: 0,HPSA Component Name,FIPS,Designation Type,HPSA Name,HPSA Score
15543,"Census Tract 52, Pima County, Arizona",4019,Geographic HPSA,Ajo,17
15571,"Census Tract 40.43, Pima County, Arizona",4019,Geographic HPSA,Tucson South East,9
15572,"Census Tract 40.42, Pima County, Arizona",4019,Geographic HPSA,Tucson South East,9
18276,"Census Tract 40.50, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
18277,"Census Tract 40.51, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
18278,"Census Tract 40.54, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
18279,"Census Tract 40.53, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
18280,"Census Tract 40.52, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
18281,"Census Tract 53, Pima County, Arizona",4019,Geographic HPSA,Tanque Verde,11
18289,9410,4019,Geographic HPSA,Pascua Yaqui Tribe,20


In [5]:
geo_shortage_designated_df

Unnamed: 0,HPSA Name,HPSA ID,Designation Type,HPSA Discipline Class,HPSA Score,Primary State Abbreviation,HPSA Status,HPSA Designation Date,HPSA Designation Last Update Date,Metropolitan Indicator,...,Rural Status Code,State Abbreviation,FIPS,State FIPS Code,State Name,U.S. - Mexico Border 100 Kilometer Indicator,U.S. - Mexico Border County Indicator,Data Warehouse Record Create Date,Data Warehouse Record Create Date Text,Unnamed: 65
410,Hickman County,1476771931,Geographic HPSA,Primary Care,13,TN,Designated,03/17/2022,03/17/2022,Unknown,...,R,TN,47081,47,Tennessee,N,N,08/16/2022,2022/08/16,
411,Hardeman County,1476702836,Geographic HPSA,Primary Care,13,TN,Designated,12/07/2021,12/07/2021,Unknown,...,R,TN,47069,47,Tennessee,N,N,08/16/2022,2022/08/16,
412,Van Buren County,1476548687,Geographic HPSA,Primary Care,18,TN,Proposed For Withdrawal,04/23/2019,09/10/2021,Unknown,...,R,TN,47175,47,Tennessee,N,N,08/16/2022,2022/08/16,
413,Wayne County,1476410063,Geographic HPSA,Primary Care,11,TN,Designated,12/31/2018,03/23/2022,Unknown,...,R,TN,47181,47,Tennessee,N,N,08/16/2022,2022/08/16,
415,Wayne County,1476346552,Geographic HPSA,Primary Care,11,TN,Proposed For Withdrawal,03/23/2021,09/10/2021,Unknown,...,R,TN,47181,47,Tennessee,N,N,08/16/2022,2022/08/16,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
64209,Aleutians West Borough,1026580765,Geographic HPSA,Primary Care,10,AK,Designated,06/17/1992,03/14/2022,Unknown,...,R,AK,02016,2,Alaska,N,N,08/16/2022,2022/08/16,
64232,North Slope Borough,1025693889,Geographic HPSA,Primary Care,17,AK,Designated,07/25/1978,08/20/2021,Unknown,...,R,AK,02185,2,Alaska,N,N,08/16/2022,2022/08/16,
64241,Prince Of Wales-Hyder Census Area,1025475653,Geographic HPSA,Primary Care,10,AK,Proposed For Withdrawal,11/22/2011,03/24/2022,Unknown,...,R,AK,02198,2,Alaska,N,N,08/16/2022,2022/08/16,
64244,Yukon-Koyukuk Census Area,1025407997,Geographic HPSA,Primary Care,17,AK,Designated,03/14/1984,09/02/2021,Unknown,...,R,AK,02290,2,Alaska,N,N,08/16/2022,2022/08/16,


In [6]:

geo_shortage_designated_df_trunc = geo_shortage_designated_df[['FIPS', 'HPSA Score']]
# A FIPS code can appear on multiple rows, so average the scores per county
grouped_shortages_fips_df = geo_shortage_designated_df_trunc.groupby('FIPS')['HPSA Score'].mean().reset_index()

In [7]:
len(grouped_shortages_fips_df['FIPS'].unique())

969

In [8]:
grouped_shortages_fips_df

Unnamed: 0,FIPS,HPSA Score
0,01001,12.0
1,01007,10.0
2,01009,7.0
3,01017,14.0
4,01019,11.0
...,...,...
964,69085,16.0
965,69100,16.0
966,69110,16.0
967,69120,16.0


In [9]:

grouped_shortages_fips_df.to_csv(f'{data_path}/processed/healthcare_shortages.csv', index=False)
grouped_shortages_fips_df

Unnamed: 0,FIPS,HPSA Score
0,01001,12.0
1,01007,10.0
2,01009,7.0
3,01017,14.0
4,01019,11.0
...,...,...
964,69085,16.0
965,69100,16.0
966,69110,16.0
967,69120,16.0


#### If we want to do subpopulations:


In [10]:
# If we wanted to do subpopulations: 
shortage_df['HPSA Population Type'].unique()

array(['Low Income Population HPSA', nan, 'Geographic Population',
       'Medicaid Eligible Population HPSA',
       'Low Income Homeless Population HPSA',
       'Low Income Migrant Farmworker Population HPSA',
       'Low Income Homeless Migrant Farmworker Population HPSA',
       'Migrant Seasonal Worker Population HPSA',
       'Low Income Homeless Migrant Seasonal Worker Population HPSA',
       'Native American Population HPSA', 'Other Population HPSA',
       'Homeless Population HPSA',
       'Low Income Migrant Seasonal Worker Population HPSA',
       'Migrant Farmworker Population HPSA'], dtype=object)

In [11]:
# If we want to do subcategories
import numpy as np
# Work on a copy so the new columns are not written to a slice.
# Note: this frame is already restricted to 'Geographic HPSA' designations.
shortage_df = geo_shortage_designated_df.copy()

# One score column per population type; rows of any other type get NaN
shortage_df['HPSA Low Income Pop. Score'] = np.where(shortage_df['HPSA Population Type'] == 'Low Income Population HPSA', shortage_df['HPSA Score'], np.nan)
shortage_df['HPSA Geo Score'] = np.where(shortage_df['HPSA Population Type'] == 'Geographic Population', shortage_df['HPSA Score'], np.nan)
# Rows that belong to neither of the two population types
shortage_df[shortage_df['HPSA Low Income Pop. Score'].isna() & shortage_df['HPSA Geo Score'].isna()]

Unnamed: 0,HPSA Name,HPSA ID,Designation Type,HPSA Discipline Class,HPSA Score,Primary State Abbreviation,HPSA Status,HPSA Designation Date,HPSA Designation Last Update Date,Metropolitan Indicator,...,FIPS,State FIPS Code,State Name,U.S. - Mexico Border 100 Kilometer Indicator,U.S. - Mexico Border County Indicator,Data Warehouse Record Create Date,Data Warehouse Record Create Date Text,Unnamed: 65,HPSA Low Income Pop. Score,HPSA Geo Score
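If per-population scores are wanted at the county level, one option is to pivot on `HPSA Population Type` instead of adding one column per type by hand. This is a minimal sketch on made-up toy rows, assuming the input still contains the population-group designations (that is, it has not been filtered to `Geographic HPSA` only).

```python
import pandas as pd

# Toy rows: FIPS codes and scores are made up for illustration.
toy = pd.DataFrame({
    "FIPS": ["01001", "01001", "01007", "01007"],
    "HPSA Population Type": [
        "Geographic Population",
        "Low Income Population HPSA",
        "Low Income Population HPSA",
        "Low Income Population HPSA",
    ],
    "HPSA Score": [12, 16, 10, 14],
})

# One row per county, one column per population type, mean score in each cell.
# Counties with no designation of a given type get NaN in that column.
by_type = toy.pivot_table(
    index="FIPS",
    columns="HPSA Population Type",
    values="HPSA Score",
    aggfunc="mean",
)

print(by_type)
```

The resulting frame has one column for every population type present in the data, so new types need no extra code.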
