### Research Question: What is the relationship between internet access benefit uptake and SNAP benefit uptake?

#### NOTES

Background from BDT: The Affordable Connectivity Program (ACP) is a new FCC program that helps ensure households can afford the broadband they need for work, school, healthcare and more. It replaced the Emergency Broadband Benefit (EBB) program (source: https://www.fcc.gov/acp). Because ACP is newer and offers more benefits, the priority is to compare ACP enrollment with SNAP enrollment by geography (i.e., county). Since SNAP enrollment qualifies a household for ACP, this comparison will allow BDT to identify regions where ACP enrollment is low relative to SNAP enrollment. The comparison must use household-level SNAP data because ACP is a household benefit. BDT currently operates in seven states: Colorado, Maryland, Michigan, New York, North Carolina, Pennsylvania, and South Carolina.

Locations of Data: ACP data is available at https://www.usac.org/about/affordable-connectivity-program/acp-enrollment-and-claims-tracker/#enrollment-and-claims-by-zipcode-and-county for 2022 (and there is a data dictionary at https://www.usac.org/wp-content/uploads/about/documents/acp/Data-Dictionary.pdf). SNAP data is located at https://www.fns.usda.gov/pd/supplemental-nutrition-assistance-program-snap.  

Challenge: To understand current gaps, it is ideal to use the most recent data available at the lowest common geographic granularity (county). ACP data is available at the county level through October 2022. County-level SNAP data lags significantly: the most recent data posted as of December 2022 is from January 2022, about a year old. This is a limitation of the analysis, since a household enrolled in SNAP in January 2022 may no longer be enrolled or may have moved out of that county. The two most recent data sets are compared anyway, but the result is heavily caveated by the time difference between the data points and may or may not show useful directional trends. In addition, older ACP data (January 2022) is compared with the SNAP data so that both cover the same timeframe. Lastly, the county-level SNAP data does not break New York down by county; only a state-level figure is available, which prevents county-level analysis for New York.

#### OPEN QUESTIONS

1. Is it beneficial to compare old SNAP data with current ACP data? Will this comparison be able to highlight current gaps?
2. The SNAP data divides persons and households into Public Assistance Participation and Non Public Assistance Participation. Currently the assumption is to use the combined Public Assistance Participation and Non Public Assistance Participation ("Calc: SNAP Total PA and Non-PA Households") to get the total households using SNAP. Is this valid? 
3. Some ACP data has the county information redacted as \*\*NOT AVAILABLE\*\*. Is it correct to assume there is a low risk of falsely identifying a county as having low ACP adoption because of this missing data?
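
One way to approach open question 3 is to measure how much of each state's ACP enrollment sits in the redacted rows. The sketch below uses toy rows with placeholder counts (not real ACP values) and a hypothetical helper named `redacted_share`; it assumes the reduced ACP table with `State`, `County Name` and `Total Subscribers` columns.

```python
import pandas as pd

# Toy rows in the shape of the reduced ACP table; the subscriber counts are
# illustrative placeholders, not real ACP values.
toy_acp = pd.DataFrame({
    "State": ["AL", "AL", "AL", "CO", "CO"],
    "County Name": ["AUTAUGA COUNTY", "BALDWIN COUNTY", "**NOT AVAILABLE**",
                    "DENVER COUNTY", "**NOT AVAILABLE**"],
    "Total Subscribers": [1900, 2700, 400, 9800, 200],
})


def redacted_share(acp_month: pd.DataFrame) -> pd.Series:
    """Share of each state's subscribers that sit in redacted county rows."""
    is_redacted = acp_month["County Name"].eq("**NOT AVAILABLE**")
    # Keep the subscriber count only for redacted rows, 0 elsewhere.
    redacted = acp_month["Total Subscribers"].where(is_redacted, 0)
    totals = acp_month.groupby("State")["Total Subscribers"].sum()
    redacted_totals = redacted.groupby(acp_month["State"]).sum()
    return (redacted_totals / totals).rename("Redacted Share")


share = redacted_share(toy_acp)
print(share)
```

If the share is small for the states of interest, the redacted rows are unlikely to change which counties look low on ACP adoption; a large share would argue for more caution.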


#### DATA LOAD

In [108]:
#import libraries
import pandas as pd
pd.set_option('display.max_rows', None)

In [109]:
#import data
acp=pd.read_excel('../data-sources/acp/ACP-Households-by-County-January-October-2022.xlsx',dtype={'State FIPS': object}) #acp data
sjan22=pd.read_excel('../data-sources/snap/Jan-2022.xlsx',skiprows=3,usecols=['Substate/Region','Calc: SNAP Total PA and Non-PA Households']) #January 2022 snap data
#sjul21=pd.read_excel ('../data-sources/snap/Jul-2021.xlsx',skiprows=3, usecols=('Substate/Region','Calc: SNAP Total PA and Non-PA Households')) #July 2021 snap data
#sjul20=pd.read_excel ('../data-sources/snap/Jul-2020.xlsx',skiprows=3, usecols=('Substate/Region','Calc: SNAP Total PA and Non-PA Households')) #July 2020 snap data
#sjan21=pd.read_excel ('../data-sources/snap/Jan-2021.xls',skiprows=3,usecols=('Substate/Region','Calc: SNAP Total PA and Non-PA Households')) #January 2021 snap data

#### ACP DATA EXPLORATION

In [110]:
acp.head()

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers
0,2022-01-01,AL,ALABAMA,AUTAUGA COUNTY,1,1,46.0,-1.0,20.0,53.0,118.0,618.0,2.0,440.0,874.0,1934.0
1,2022-01-01,AL,ALABAMA,BALDWIN COUNTY,1,3,-13.0,0.0,78.0,114.0,179.0,521.0,5.0,959.0,1281.0,2766.0
2,2022-01-01,AL,ALABAMA,BARBOUR COUNTY,1,5,41.0,0.0,56.0,25.0,122.0,881.0,1.0,444.0,417.0,1743.0
3,2022-01-01,AL,ALABAMA,BIBB COUNTY,1,7,0.0,0.0,14.0,23.0,37.0,9.0,5.0,182.0,304.0,500.0
4,2022-01-01,AL,ALABAMA,BLOUNT COUNTY,1,9,12.0,-1.0,18.0,46.0,75.0,196.0,11.0,293.0,573.0,1073.0


In [111]:
acp.tail()

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers
32712,2022-10-01,PR,PUERTO RICO,**NOT AVAILABLE**,72,**NOT AVAILABLE**,0.0,0.0,-97.0,18.0,-79.0,0.0,0.0,45.0,133.0,178.0
32713,2022-10-01,VI,VIRGIN ISLANDS,ST. CROIX ISLAND,78,010,0.0,0.0,-5.0,195.0,190.0,0.0,1.0,100.0,1791.0,1892.0
32714,2022-10-01,VI,VIRGIN ISLANDS,ST. JOHN ISLAND,78,020,0.0,0.0,0.0,4.0,4.0,0.0,0.0,0.0,26.0,26.0
32715,2022-10-01,VI,VIRGIN ISLANDS,ST. THOMAS ISLAND,78,030,0.0,0.0,5.0,117.0,122.0,0.0,0.0,86.0,1229.0,1315.0
32716,2022-10-01,VI,VIRGIN ISLANDS,**NOT AVAILABLE**,78,**NOT AVAILABLE**,0.0,0.0,0.0,5.0,5.0,0.0,0.0,0.0,8.0,8.0


In [112]:
acp[acp['County Name']=='**NOT AVAILABLE**']

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers
67,2022-01-01,AL,ALABAMA,**NOT AVAILABLE**,1,**NOT AVAILABLE**,15.0,0.0,6.0,3.0,24.0,161.0,1.0,96.0,137.0,395.0
93,2022-01-01,AK,ALASKA,**NOT AVAILABLE**,2,**NOT AVAILABLE**,0.0,0.0,0.0,-1.0,-1.0,0.0,0.0,5.0,8.0,13.0
98,2022-01-01,AS,AMERICAN SAMOA,**NOT AVAILABLE**,60,**NOT AVAILABLE**,0.0,0.0,0.0,2.0,2.0,0.0,0.0,0.0,4.0,4.0
114,2022-01-01,AZ,ARIZONA,**NOT AVAILABLE**,4,**NOT AVAILABLE**,1.0,0.0,11.0,10.0,22.0,3.0,0.0,467.0,41.0,511.0
190,2022-01-01,AR,ARKANSAS,**NOT AVAILABLE**,5,**NOT AVAILABLE**,-1.0,0.0,14.0,4.0,17.0,4.0,0.0,186.0,120.0,310.0
248,2022-01-01,CA,CALIFORNIA,**NOT AVAILABLE**,6,**NOT AVAILABLE**,1.0,0.0,1.0,7.0,9.0,9.0,0.0,55.0,68.0,132.0
306,2022-01-01,CO,COLORADO,**NOT AVAILABLE**,8,**NOT AVAILABLE**,2.0,0.0,2.0,5.0,9.0,9.0,0.0,51.0,141.0,201.0
321,2022-01-01,CT,CONNECTICUT,**NOT AVAILABLE**,9,**NOT AVAILABLE**,0.0,0.0,0.0,8.0,8.0,3.0,0.0,5.0,19.0,27.0
325,2022-01-01,DE,DELAWARE,**NOT AVAILABLE**,10,**NOT AVAILABLE**,0.0,0.0,1.0,1.0,2.0,1.0,0.0,1.0,5.0,7.0
394,2022-01-01,FL,FLORIDA,**NOT AVAILABLE**,12,**NOT AVAILABLE**,9.0,0.0,5.0,24.0,38.0,61.0,1.0,102.0,249.0,413.0


In [113]:
acp[acp['State Name']=='ALABAMA']

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers
0,2022-01-01,AL,ALABAMA,AUTAUGA COUNTY,1,001,46.0,-1.0,20.0,53.0,118.0,618.0,2.0,440.0,874.0,1934.0
1,2022-01-01,AL,ALABAMA,BALDWIN COUNTY,1,003,-13.0,0.0,78.0,114.0,179.0,521.0,5.0,959.0,1281.0,2766.0
2,2022-01-01,AL,ALABAMA,BARBOUR COUNTY,1,005,41.0,0.0,56.0,25.0,122.0,881.0,1.0,444.0,417.0,1743.0
3,2022-01-01,AL,ALABAMA,BIBB COUNTY,1,007,0.0,0.0,14.0,23.0,37.0,9.0,5.0,182.0,304.0,500.0
4,2022-01-01,AL,ALABAMA,BLOUNT COUNTY,1,009,12.0,-1.0,18.0,46.0,75.0,196.0,11.0,293.0,573.0,1073.0
5,2022-01-01,AL,ALABAMA,BULLOCK COUNTY,1,011,0.0,0.0,74.0,-3.0,71.0,5.0,0.0,197.0,189.0,391.0
6,2022-01-01,AL,ALABAMA,BUTLER COUNTY,1,013,41.0,0.0,13.0,8.0,62.0,659.0,3.0,364.0,486.0,1512.0
7,2022-01-01,AL,ALABAMA,CALHOUN COUNTY,1,015,16.0,-1.0,99.0,129.0,243.0,286.0,5.0,1522.0,1747.0,3560.0
8,2022-01-01,AL,ALABAMA,CHAMBERS COUNTY,1,017,61.0,0.0,160.0,40.0,261.0,1097.0,1.0,788.0,549.0,2435.0
9,2022-01-01,AL,ALABAMA,CHEROKEE COUNTY,1,019,14.0,-1.0,15.0,19.0,47.0,206.0,71.0,226.0,289.0,792.0


In [114]:
acp.dtypes

Data Month                                              datetime64[ns]
State                                                           object
State Name                                                      object
County Name                                                     object
State FIPS                                                      object
County FIPS                                                     object
Net New Enrollments Alternative Verification Process           float64
Net New Enrollments Verified by School                         float64
Net New Enrollments Lifeline                                   float64
Net New Enrollments National Verifier Application              float64
Net New Enrollments total                                      float64
Total Alternative Verification Process                         float64
Total Verified by School                                       float64
Total Lifeline                                                 float64
Total 

In [115]:
acp.describe(include='all')

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers
count,32717,32717,32717,32717,32717.0,32717,32715.0,32715.0,32715.0,32715.0,32715.0,32715.0,32715.0,32715.0,32715.0,32715.0
unique,10,56,56,1967,56.0,331,,,,,,,,,,
top,2022-10-01 00:00:00,TX,TEXAS,**NOT AVAILABLE**,48.0,**NOT AVAILABLE**,,,,,,,,,,
freq,3280,2545,2545,536,2545.0,536,,,,,,,,,,
first,2022-01-01 00:00:00,,,,,,,,,,,,,,,
last,2022-10-01 00:00:00,,,,,,,,,,,,,,,
mean,,,,,,,57.481492,-0.027113,17.939294,97.213969,172.607642,922.3849,2.090906,1517.899343,1316.392236,3758.767385
std,,,,,,,353.990986,0.817687,304.69379,416.662211,729.774093,4372.251688,12.379857,5755.727289,5220.603708,14243.974117
min,,,,,,,-1533.0,-24.0,-27109.0,-3366.0,-7237.0,0.0,0.0,0.0,0.0,0.0
25%,,,,,,,0.0,0.0,-3.0,4.0,6.0,1.0,0.0,88.0,88.0,225.0


#### ACP DATA CLEANUP

In [116]:
#check for redacted counties
acp[acp['County FIPS']=='**NOT AVAILABLE**']

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers
67,2022-01-01,AL,ALABAMA,**NOT AVAILABLE**,1,**NOT AVAILABLE**,15.0,0.0,6.0,3.0,24.0,161.0,1.0,96.0,137.0,395.0
93,2022-01-01,AK,ALASKA,**NOT AVAILABLE**,2,**NOT AVAILABLE**,0.0,0.0,0.0,-1.0,-1.0,0.0,0.0,5.0,8.0,13.0
98,2022-01-01,AS,AMERICAN SAMOA,**NOT AVAILABLE**,60,**NOT AVAILABLE**,0.0,0.0,0.0,2.0,2.0,0.0,0.0,0.0,4.0,4.0
114,2022-01-01,AZ,ARIZONA,**NOT AVAILABLE**,4,**NOT AVAILABLE**,1.0,0.0,11.0,10.0,22.0,3.0,0.0,467.0,41.0,511.0
190,2022-01-01,AR,ARKANSAS,**NOT AVAILABLE**,5,**NOT AVAILABLE**,-1.0,0.0,14.0,4.0,17.0,4.0,0.0,186.0,120.0,310.0
248,2022-01-01,CA,CALIFORNIA,**NOT AVAILABLE**,6,**NOT AVAILABLE**,1.0,0.0,1.0,7.0,9.0,9.0,0.0,55.0,68.0,132.0
306,2022-01-01,CO,COLORADO,**NOT AVAILABLE**,8,**NOT AVAILABLE**,2.0,0.0,2.0,5.0,9.0,9.0,0.0,51.0,141.0,201.0
321,2022-01-01,CT,CONNECTICUT,**NOT AVAILABLE**,9,**NOT AVAILABLE**,0.0,0.0,0.0,8.0,8.0,3.0,0.0,5.0,19.0,27.0
325,2022-01-01,DE,DELAWARE,**NOT AVAILABLE**,10,**NOT AVAILABLE**,0.0,0.0,1.0,1.0,2.0,1.0,0.0,1.0,5.0,7.0
394,2022-01-01,FL,FLORIDA,**NOT AVAILABLE**,12,**NOT AVAILABLE**,9.0,0.0,5.0,24.0,38.0,61.0,1.0,102.0,249.0,413.0


In [117]:
#blank out the **NOT AVAILABLE** placeholder in County FIPS (literal match, no regex)
acp['County FIPS'] = acp['County FIPS'].str.replace('**NOT AVAILABLE**', '', regex=False)

In [118]:
acp[acp['County FIPS']=='**NOT AVAILABLE**']

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers


In [119]:
acp.iloc[67,:]

Data Month                                              2022-01-01 00:00:00
State                                                                    AL
State Name                                                          ALABAMA
County Name                                               **NOT AVAILABLE**
State FIPS                                                               01
County FIPS                                                                
Net New Enrollments Alternative Verification Process                     15
Net New Enrollments Verified by School                                    0
Net New Enrollments Lifeline                                              6
Net New Enrollments National Verifier Application                         3
Net New Enrollments total                                                24
Total Alternative Verification Process                                  161
Total Verified by School                                                  1
Total Lifeli

In [120]:
#concatenate State and County FIPS
acp['FIPS'] = acp['State FIPS'].astype(str) + acp['County FIPS'].astype(str)

In [121]:
acp.head()

Unnamed: 0,Data Month,State,State Name,County Name,State FIPS,County FIPS,Net New Enrollments Alternative Verification Process,Net New Enrollments Verified by School,Net New Enrollments Lifeline,Net New Enrollments National Verifier Application,Net New Enrollments total,Total Alternative Verification Process,Total Verified by School,Total Lifeline,Total National Verifier Application,Total Subscribers,FIPS
0,2022-01-01,AL,ALABAMA,AUTAUGA COUNTY,1,1,46.0,-1.0,20.0,53.0,118.0,618.0,2.0,440.0,874.0,1934.0,1001
1,2022-01-01,AL,ALABAMA,BALDWIN COUNTY,1,3,-13.0,0.0,78.0,114.0,179.0,521.0,5.0,959.0,1281.0,2766.0,1003
2,2022-01-01,AL,ALABAMA,BARBOUR COUNTY,1,5,41.0,0.0,56.0,25.0,122.0,881.0,1.0,444.0,417.0,1743.0,1005
3,2022-01-01,AL,ALABAMA,BIBB COUNTY,1,7,0.0,0.0,14.0,23.0,37.0,9.0,5.0,182.0,304.0,500.0,1007
4,2022-01-01,AL,ALABAMA,BLOUNT COUNTY,1,9,12.0,-1.0,18.0,46.0,75.0,196.0,11.0,293.0,573.0,1073.0,1009
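
The concatenation only yields a valid five-digit county FIPS when both parts keep their leading zeros (two digits for the state, three for the county). As a minimal sketch, assuming the codes were read as integers and lost their zeros, `str.zfill` can restore the padding before joining. The toy frame below is hypothetical.

```python
import pandas as pd

# Hypothetical codes that lost their leading zeros (e.g. read as integers).
toy = pd.DataFrame({"State FIPS": [1, 8, 36], "County FIPS": [1, 31, 5]})

# Pad the state code to 2 digits and the county code to 3 before joining.
state = toy["State FIPS"].astype(str).str.zfill(2)
county = toy["County FIPS"].astype(str).str.zfill(3)
toy["FIPS"] = state + county

print(toy["FIPS"].tolist())
```

Redacted rows with an empty county code would need to be excluded or handled separately, since padding an empty string would produce a code ending in `000`.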


In [122]:
#reduce columns
acp.drop(['State FIPS',
          'County FIPS',
          'Net New Enrollments Alternative Verification Process',
          'Net New Enrollments Verified by School',
          'Net New Enrollments Lifeline',
          'Net New Enrollments National Verifier Application',
          'Net New Enrollments total',
          'Total Alternative Verification Process',
          'Total Verified by School',
          'Total Lifeline',
          'Total National Verifier Application'
          ], axis=1,inplace=True)

In [123]:
acp=acp.loc[:, ['Data Month','FIPS','State','State Name','County Name','Total Subscribers']]

In [124]:
acp.head()

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Total Subscribers
0,2022-01-01,1001,AL,ALABAMA,AUTAUGA COUNTY,1934.0
1,2022-01-01,1003,AL,ALABAMA,BALDWIN COUNTY,2766.0
2,2022-01-01,1005,AL,ALABAMA,BARBOUR COUNTY,1743.0
3,2022-01-01,1007,AL,ALABAMA,BIBB COUNTY,500.0
4,2022-01-01,1009,AL,ALABAMA,BLOUNT COUNTY,1073.0


In [125]:
#check for null
acp.isnull().values.any()

True

In [126]:
#inspect null
acp[acp['Total Subscribers'].isnull()]

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Total Subscribers
23209,2022-08-01,11,DC,DISTRICT OF COLUMBIA,**NOT AVAILABLE**,
26489,2022-09-01,11,DC,DISTRICT OF COLUMBIA,**NOT AVAILABLE**,


In [127]:
acp[acp['State']=='DC']

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Total Subscribers
326,2022-01-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,32119.0
3591,2022-02-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,34645.0
6856,2022-03-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,38549.0
10123,2022-04-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,40761.0
13390,2022-05-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,41961.0
16655,2022-06-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,40514.0
19928,2022-07-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,41569.0
23208,2022-08-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,40713.0
23209,2022-08-01,11,DC,DISTRICT OF COLUMBIA,**NOT AVAILABLE**,
26488,2022-09-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,42939.0


In [128]:
#fill NaN
acp['Total Subscribers'] = acp['Total Subscribers'].fillna(0)

In [129]:
#convert subscribers to integer
acp['Total Subscribers'] = acp['Total Subscribers'].astype(int)

In [130]:
acp.dtypes

Data Month           datetime64[ns]
FIPS                         object
State                        object
State Name                   object
County Name                  object
Total Subscribers             int32
dtype: object
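
Filling the missing DC subscriber counts with 0 makes a redacted row look like a row with zero enrollment. An alternative, sketched here under the assumption that keeping the gap visible is preferable, is the pandas nullable integer dtype `Int64`, which stores whole numbers and still allows missing values. The toy values are the August and September DC counts shown above plus one missing entry.

```python
import pandas as pd

# Two DC counts from the table above and one missing value.
toy = pd.DataFrame({"Total Subscribers": [40713.0, None, 42939.0]})

# "Int64" (capital I) is the nullable integer dtype; NaN becomes <NA>.
toy["Total Subscribers"] = toy["Total Subscribers"].astype("Int64")

print(toy["Total Subscribers"])
print(toy["Total Subscribers"].sum())  # missing values are skipped in the sum
```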

In [131]:
acp[acp['State']=='DC']

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Total Subscribers
326,2022-01-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,32119
3591,2022-02-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,34645
6856,2022-03-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,38549
10123,2022-04-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,40761
13390,2022-05-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,41961
16655,2022-06-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,40514
19928,2022-07-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,41569
23208,2022-08-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,40713
23209,2022-08-01,11,DC,DISTRICT OF COLUMBIA,**NOT AVAILABLE**,0
26488,2022-09-01,11001,DC,DISTRICT OF COLUMBIA,DISTRICT OF COLUMBIA,42939


In [132]:
#extract select months of data
ajan22= acp[acp['Data Month']=='2022-01-01 00:00:00']
aoct22= acp[acp['Data Month']=='2022-10-01 00:00:00']

In [133]:
ajan22 = ajan22.rename(columns={'Total Subscribers': 'Jan22 ACP Total Households'})
aoct22 = aoct22.rename(columns={'Total Subscribers': 'Oct22 ACP Total Households'})

In [134]:
aoct22.head()

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Oct22 ACP Total Households
29437,2022-10-01,1001,AL,ALABAMA,AUTAUGA COUNTY,2985
29438,2022-10-01,1003,AL,ALABAMA,BALDWIN COUNTY,3759
29439,2022-10-01,1005,AL,ALABAMA,BARBOUR COUNTY,2830
29440,2022-10-01,1007,AL,ALABAMA,BIBB COUNTY,727
29441,2022-10-01,1009,AL,ALABAMA,BLOUNT COUNTY,1875


In [135]:
ajan22.head()

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Jan22 ACP Total Households
0,2022-01-01,1001,AL,ALABAMA,AUTAUGA COUNTY,1934
1,2022-01-01,1003,AL,ALABAMA,BALDWIN COUNTY,2766
2,2022-01-01,1005,AL,ALABAMA,BARBOUR COUNTY,1743
3,2022-01-01,1007,AL,ALABAMA,BIBB COUNTY,500
4,2022-01-01,1009,AL,ALABAMA,BLOUNT COUNTY,1073


In [136]:
ajan22.dtypes

Data Month                    datetime64[ns]
FIPS                                  object
State                                 object
State Name                            object
County Name                           object
Jan22 ACP Total Households             int32
dtype: object

In [137]:
ajan22.describe(include='all')

Unnamed: 0,Data Month,FIPS,State,State Name,County Name,Jan22 ACP Total Households
count,3263,3263.0,3263,3263,3263,3263.0
unique,1,3263.0,56,56,1954,
top,2022-01-01 00:00:00,20039.0,TX,TEXAS,**NOT AVAILABLE**,
freq,3263,1.0,254,254,53,
first,2022-01-01 00:00:00,,,,,
last,2022-01-01 00:00:00,,,,,
mean,,,,,,2971.883849
std,,,,,,11028.629632
min,,,,,,0.0
25%,,,,,,180.0


#### SNAP DATA EXPLORATION

In [138]:
sjan22.head()

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households
0,0100101 AL EBT AUTAUGA CO FS OFF,3545.0
1,0100301 AL EBT BALDWIN CO FS OFF,9474.0
2,0100501 AL EBT BARBOUR CO FS OFF,2848.0
3,0100701 AL EBT BIBB CO FS OFF,1706.0
4,0100901 AL EBT BLOUNT CO FS OFF,3004.0


In [139]:
sjan22.tail()

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households
2641,5600002 WY EBT WYOMING SD PASS,14033.0
2642,U.S. Summary,21658121.0
2643,Data is Subject to Revision. The data reflect...,
2644,,
2645,,


In [157]:
#New York rows (relies on the 'FIPS Code' column created under SNAP DATA CLEANUP)
sjan22[sjan22['FIPS Code'].str.startswith('36')]

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households,FIPS Code
1517,3600002 NY EBT NEW YORK STATE,1626534.0,36000


#### SNAP DATA CLEANUP

In [140]:
#remove summary line
sjan22=sjan22[sjan22['Substate/Region'] != 'U.S. Summary']

In [141]:
#remove footnote line
sjan22=sjan22[sjan22['Substate/Region'] != 'Data is Subject to Revision.  The data reflected is the latest available data.']

In [142]:
#remove empty rows
sjan22=sjan22.dropna(how='all')

In [143]:
sjan22.tail()

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households
2637,5514901 WI EBT BAD RIVER TRIBAL COUNC,165.0
2638,5515101 WI EBT LAC DU FLAMBEAU TRIBAL,535.0
2639,5515301 WI EBT SOKAOGON TRIBAL AGENCY,96.0
2640,5515501 WI EBT POTAWATOMI TRIBE,17.0
2641,5600002 WY EBT WYOMING SD PASS,14033.0
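
The exact-string comparisons above depend on matching the summary and footnote text character for character. A possible alternative, assuming every data row starts with a seven-digit office code as in the rows displayed here, is to keep only rows that match that pattern. The toy frame reuses a few rows from the outputs above.

```python
import pandas as pd

toy = pd.DataFrame({
    "Substate/Region": [
        "0100101 AL EBT AUTAUGA CO FS OFF",
        "5600002 WY EBT WYOMING SD PASS",
        "U.S. Summary",
        "Data is Subject to Revision.  The data reflected is the latest available data.",
        None,
    ],
    "Calc: SNAP Total PA and Non-PA Households": [3545.0, 14033.0, 21658121.0, None, None],
})

# True only for rows whose label begins with 7 digits followed by whitespace;
# na=False treats empty rows as non-matching.
is_data_row = toy["Substate/Region"].str.match(r"\d{7}\s", na=False)
cleaned = toy[is_data_row]

print(cleaned)
```

This drops the summary, footnote and empty rows in one step, at the cost of silently discarding any data row that does not follow the assumed code format.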


In [144]:
#extract FIPS code
sjan22['FIPS Code']=sjan22['Substate/Region'].str[:5]

In [145]:
sjan22.head()

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households,FIPS Code
0,0100101 AL EBT AUTAUGA CO FS OFF,3545.0,1001
1,0100301 AL EBT BALDWIN CO FS OFF,9474.0,1003
2,0100501 AL EBT BARBOUR CO FS OFF,2848.0,1005
3,0100701 AL EBT BIBB CO FS OFF,1706.0,1007
4,0100901 AL EBT BLOUNT CO FS OFF,3004.0,1009


In [146]:
sjan22.dtypes

Substate/Region                               object
Calc: SNAP Total PA and Non-PA Households    float64
FIPS Code                                     object
dtype: object

In [147]:
#check for null
sjan22.isnull().values.any()

False

In [148]:
sjan22[sjan22.duplicated(['FIPS Code'], keep=False)]

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households,FIPS Code
4,0100901 AL EBT BLOUNT CO FS OFF,3004.0,1009
5,0100999 AL EBT AESAP - Alabama Dept. of HR,0.0,1009
159,0515992 AR EBT 92-1 Field Operations,0.0,5159
160,0515996 AR EBT 96-1 Field Operations,0.0,5159
531,1703100 IL EBT ILLINOIS,0.0,17031
532,1703101 IL EBT MED FIELD OPS - NORTH,3059.0,17031
533,1703106 IL EBT COOK - NORTHSIDE,11368.0,17031
534,1703108 IL EBT COOK - WICKER PARK,0.0,17031
535,1703109 IL EBT COOK - LOWER NORTH,21084.0,17031
536,1703110 IL EBT COOK - HUMBOLDT PARK,41110.0,17031


In [149]:
sjan22[sjan22['FIPS Code']=='41000']

Unnamed: 0,Substate/Region,Calc: SNAP Total PA and Non-PA Households,FIPS Code
1835,4100005 OR EBT ADULT & FAMILY SVCS DIV,371610.0,41000
1836,4100005 OR SSI ADULT & FAMILY SVCS DIV,35446.0,41000
1837,4100005 OR WRI ADULT & FAMILY SVCS DIV,17.0,41000


In [154]:
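#sum households across offices that share a FIPS code (numeric column only)
sjan22sum=sjan22.groupby('FIPS Code',as_index=False)['Calc: SNAP Total PA and Non-PA Households'].sum()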

In [155]:
sjan22sum[sjan22sum['FIPS Code']=='41000']

Unnamed: 0,FIPS Code,Calc: SNAP Total PA and Non-PA Households
1747,41000,407073.0
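
As a worked check of the aggregation, the three Oregon rows shown earlier can be summed on their own: 371610 + 35446 + 17 = 407073, which matches the grouped value. The sketch below repeats the prefix extraction and grouping on just those rows.

```python
import pandas as pd

# The three Oregon rows displayed in the notebook.
oregon = pd.DataFrame({
    "Substate/Region": [
        "4100005 OR EBT ADULT & FAMILY SVCS DIV",
        "4100005 OR SSI ADULT & FAMILY SVCS DIV",
        "4100005 OR WRI ADULT & FAMILY SVCS DIV",
    ],
    "Calc: SNAP Total PA and Non-PA Households": [371610.0, 35446.0, 17.0],
})

# First five characters of the label are the state + county FIPS prefix.
oregon["FIPS Code"] = oregon["Substate/Region"].str[:5]

# Select the numeric column explicitly so text columns are not aggregated.
summed = oregon.groupby("FIPS Code", as_index=False)[
    "Calc: SNAP Total PA and Non-PA Households"
].sum()

print(summed)
```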


In [162]:
sjan22sum = sjan22sum.rename(columns={'Calc: SNAP Total PA and Non-PA Households': 'JAN22 SNAP Households','FIPS Code':'FIPS'})

In [163]:
sjan22sum.dtypes

FIPS                      object
JAN22 SNAP Households    float64
dtype: object

In [164]:
sjan22sum.describe(include='all')

Unnamed: 0,FIPS,JAN22 SNAP Households
count,2549.0,2549.0
unique,2549.0,
top,55045.0,
freq,1.0,
mean,,8496.713
std,,45873.01
min,,0.0
25%,,723.0
50%,,1785.0
75%,,4532.0


#### DOWNLOAD TRANSFORMED DATA

In [54]:
sjan22sum.to_csv('../data-sources/snap/sjan22sum.csv')
ajan22.to_csv('../data-sources/acp/ajan22.csv')
aoct22.to_csv('../data-sources/acp/aoct22.csv')

#### JOIN ACP & SNAP DATA

In [56]:
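#outer join on FIPS; asep22 (September 2022 ACP extract) is not created in the cells above, which build only ajan22 and aoct22
df = pd.merge(sjan22sum.assign(FIPS=sjan22sum.FIPS.astype(str)),
              asep22.assign(FIPS=asep22.FIPS.astype(str)),
              how='outer', on='FIPS')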

In [57]:
df.head()

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers
0,1001,3545.0,ALABAMA,AUTAUGA COUNTY,2944.0
1,1003,9474.0,ALABAMA,BALDWIN COUNTY,3603.0
2,1005,2848.0,ALABAMA,BARBOUR COUNTY,2764.0
3,1007,1706.0,ALABAMA,BIBB COUNTY,697.0
4,1009,3004.0,ALABAMA,BLOUNT COUNTY,1799.0
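
An outer join keeps rows that exist in only one of the two tables, so it is worth counting how many FIPS codes matched. A minimal sketch using the `indicator=True` option of `pd.merge`, which adds a `_merge` column, is shown below. The SNAP values come from rows displayed in this notebook; the ACP counts and the `ACP Households` column name are hypothetical.

```python
import pandas as pd

snap = pd.DataFrame({
    "FIPS": ["01001", "01003", "36000"],
    "JAN22 SNAP Households": [3545.0, 9474.0, 1626534.0],
})
# Hypothetical ACP counts, for illustration only.
acp = pd.DataFrame({
    "FIPS": ["01001", "01003", "36001"],
    "ACP Households": [100, 200, 300],
})

merged = pd.merge(snap, acp, how="outer", on="FIPS", indicator=True)

# "both" = matched, "left_only" = SNAP without ACP, "right_only" = ACP without SNAP.
counts = merged["_merge"].value_counts()
print(counts)
```

In this toy case the New York state-level SNAP row (`36000`) and the New York county ACP row (`36001`) fail to match, which mirrors the New York limitation described in the notes.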


In [63]:
df[df['County Name']=='**NOT AVAILABLE**']

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers
2549,1,,ALABAMA,**NOT AVAILABLE**,32.0
2581,2,,ALASKA,**NOT AVAILABLE**,2.0
2583,4,,ARIZONA,**NOT AVAILABLE**,152.0
2584,5,,ARKANSAS,**NOT AVAILABLE**,25.0
2585,6,,CALIFORNIA,**NOT AVAILABLE**,21.0
2589,8,,COLORADO,**NOT AVAILABLE**,25.0
2598,9,,CONNECTICUT,**NOT AVAILABLE**,6.0
2600,10,,DELAWARE,**NOT AVAILABLE**,2.0
2601,11,,DISTRICT OF COLUMBIA,**NOT AVAILABLE**,0.0
2604,12,,FLORIDA,**NOT AVAILABLE**,24.0


In [64]:
df=df[df['County Name']!='**NOT AVAILABLE**']

In [69]:
df.to_csv('../data-sources/acp/acpsnapmerge.csv')

In [72]:
df['Difference']= df['JAN22 SNAP Households']-df['Sep22 ACP Total Subscribers']
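
The absolute difference tends to rank the most populous counties first. A ratio of ACP households to SNAP households is one possible complement, sketched here with two Michigan rows from this notebook's merged table. The column name `ACP to SNAP Ratio` is an assumption made for illustration.

```python
import pandas as pd

# Two Michigan rows copied from the merged table shown in this notebook.
toy = pd.DataFrame({
    "County Name": ["WAYNE COUNTY", "SANILAC COUNTY"],
    "JAN22 SNAP Households": [214779.0, 2736.0],
    "Sep22 ACP Total Subscribers": [147381.0, 1131.0],
})

snap = toy["JAN22 SNAP Households"]
acp = toy["Sep22 ACP Total Subscribers"]

toy["Difference"] = snap - acp
# Hypothetical column: ACP households per SNAP household.
# Counties reporting 0 SNAP households become NaN instead of dividing by zero.
toy["ACP to SNAP Ratio"] = acp / snap.where(snap > 0)

ranked = toy.sort_values("ACP to SNAP Ratio")
print(ranked)
```

Wayne County has by far the larger absolute difference, but Sanilac County has the lower ratio, so the two measures can point to different counties.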

#### INSPECT COMBINED DATA 

In [65]:
df[df['State Name']=='COLORADO']['JAN22 SNAP Households'].nlargest(n=5)

230    42910.0
235    37703.0
265    20986.0
244    16989.0
276    14107.0
Name: JAN22 SNAP Households, dtype: float64

In [66]:
df[df['State Name']=='MICHIGAN'].sort_values(by=['JAN22 SNAP Households']).tail(5)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers
1140,26081,36062.0,MICHIGAN,KENT COUNTY,22479.0
1124,26049,42225.0,MICHIGAN,GENESEE COUNTY,27963.0
1162,26125,51353.0,MICHIGAN,OAKLAND COUNTY,31265.0
1149,26099,53831.0,MICHIGAN,MACOMB COUNTY,34598.0
1181,26163,214779.0,MICHIGAN,WAYNE COUNTY,147381.0


In [67]:
df[df['State Name']=='MICHIGAN'].sort_values(by=['Sep22 ACP Total Subscribers']).tail(5)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers
1140,26081,36062.0,MICHIGAN,KENT COUNTY,22479.0
1124,26049,42225.0,MICHIGAN,GENESEE COUNTY,27963.0
1162,26125,51353.0,MICHIGAN,OAKLAND COUNTY,31265.0
1149,26099,53831.0,MICHIGAN,MACOMB COUNTY,34598.0
1181,26163,214779.0,MICHIGAN,WAYNE COUNTY,147381.0


In [68]:
#sjul20.head()

In [74]:
#sjan21.head()

In [75]:
df.dtypes

FIPS                            object
JAN22 SNAP Households          float64
State Name                      object
County Name                     object
Sep22 ACP Total Subscribers    float64
Difference                     float64
dtype: object

In [79]:
df[df['State Name']=='MICHIGAN'].sort_values(by=['Difference']).tail(20)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
1129,26059,2902.0,MICHIGAN,HILLSDALE COUNTY,1572.0,1330.0
1161,26123,3893.0,MICHIGAN,NEWAYGO COUNTY,2551.0,1342.0
1146,26093,4194.0,MICHIGAN,LIVINGSTON COUNTY,2617.0,1577.0
1175,26151,2736.0,MICHIGAN,SANILAC COUNTY,1131.0,1605.0
1157,26115,7771.0,MICHIGAN,MONROE COUNTY,5849.0,1922.0
1122,26045,5273.0,MICHIGAN,EATON COUNTY,2987.0,2286.0
1145,26091,5260.0,MICHIGAN,LENAWEE COUNTY,2906.0,2354.0
1179,26159,5462.0,MICHIGAN,VAN BUREN COUNTY,3059.0,2403.0
1110,26021,11165.0,MICHIGAN,BERRIEN COUNTY,7361.0,3804.0
1137,26075,10350.0,MICHIGAN,JACKSON COUNTY,6312.0,4038.0


In [80]:
#Colorado, Maryland, Michigan, New York, North Carolina, Pennsylvania, and South Carolina.

In [115]:
df[df['State Name']=='COLORADO'].sort_values(by=['Difference'],ascending=False)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
235,8041,37703.0,COLORADO,EL PASO COUNTY,22261.0,15442.0
230,8031,42910.0,COLORADO,DENVER COUNTY,29574.0,13336.0
265,8101,20986.0,COLORADO,PUEBLO COUNTY,13226.0,7760.0
249,8069,13963.0,COLORADO,LARIMER COUNTY,7128.0,6835.0
276,8123,14107.0,COLORADO,WELD COUNTY,7462.0,6645.0
244,8059,16989.0,COLORADO,JEFFERSON COUNTY,11342.0,5647.0
221,8013,9494.0,COLORADO,BOULDER COUNTY,5962.0,3532.0
256,8083,2284.0,COLORADO,MONTEZUMA COUNTY,664.0,1620.0
232,8035,3549.0,COLORADO,DOUGLAS COUNTY,2109.0,1440.0
257,8085,2738.0,COLORADO,MONTROSE COUNTY,1466.0,1272.0


In [116]:
df[df['State Name']=='MARYLAND'].sort_values(by=['Difference'],ascending=False)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
1098,24510,125590.0,MARYLAND,BALTIMORE CITY,54046.0,71544.0
1090,24033,85463.0,MARYLAND,PRINCE GEORGE'S COUNTY,22471.0,62992.0
1077,24005,71869.0,MARYLAND,BALTIMORE COUNTY,24894.0,46975.0
1089,24031,42150.0,MARYLAND,MONTGOMERY COUNTY,13823.0,28327.0
1076,24003,35629.0,MARYLAND,ANNE ARUNDEL COUNTY,9083.0,26546.0
1095,24043,14835.0,MARYLAND,WASHINGTON COUNTY,5048.0,9787.0
1086,24025,14063.0,MARYLAND,HARFORD COUNTY,5030.0,9033.0
1087,24027,12687.0,MARYLAND,HOWARD COUNTY,3908.0,8779.0
1082,24017,11744.0,MARYLAND,CHARLES COUNTY,3168.0,8576.0
1096,24045,12348.0,MARYLAND,WICOMICO COUNTY,4607.0,7741.0


In [119]:
df[df['State Name']=='MICHIGAN'].sort_values(by=['Difference'],ascending=False)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
1181,26163,214779.0,MICHIGAN,WAYNE COUNTY,147381.0,67398.0
1162,26125,51353.0,MICHIGAN,OAKLAND COUNTY,31265.0,20088.0
1149,26099,53831.0,MICHIGAN,MACOMB COUNTY,34598.0,19233.0
1124,26049,42225.0,MICHIGAN,GENESEE COUNTY,27963.0,14262.0
1140,26081,36062.0,MICHIGAN,KENT COUNTY,22479.0,13583.0
1132,26065,20999.0,MICHIGAN,INGHAM COUNTY,11955.0,9044.0
1160,26121,17301.0,MICHIGAN,MUSKEGON COUNTY,10442.0,6859.0
1180,26161,14496.0,MICHIGAN,WASHTENAW COUNTY,8930.0,5566.0
1112,26025,11957.0,MICHIGAN,CALHOUN COUNTY,6988.0,4969.0
1173,26147,10769.0,MICHIGAN,ST. CLAIR COUNTY,5992.0,4777.0


In [124]:
df[df['State Name']=='NEW YORK'].sort_values(by=['Difference'],ascending=False).head(5)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
2997,36001,,NEW YORK,ALBANY COUNTY,19077.0,
2998,36003,,NEW YORK,ALLEGANY COUNTY,2314.0,
2999,36005,,NEW YORK,BRONX COUNTY,129976.0,
3000,36007,,NEW YORK,BROOME COUNTY,14420.0,
3001,36009,,NEW YORK,CATTARAUGUS COUNTY,4437.0,
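
Because SNAP reports New York only at the state level, one option is a state-level comparison: sum the county ACP households and divide by the statewide SNAP figure. The sketch below uses a hypothetical helper, `state_acp_to_snap_ratio`, and only the five New York rows displayed above, so the total is a partial sum for illustration and not a result.

```python
import pandas as pd


def state_acp_to_snap_ratio(acp_counties, acp_column, snap_state_households):
    """Sum county ACP households for one state and divide by the state SNAP total."""
    acp_total = acp_counties[acp_column].sum()
    return acp_total, acp_total / snap_state_households


# Only the five New York rows displayed above, so this total is a partial sum.
ny_sample = pd.DataFrame({
    "County Name": ["ALBANY COUNTY", "ALLEGANY COUNTY", "BRONX COUNTY",
                    "BROOME COUNTY", "CATTARAUGUS COUNTY"],
    "Sep22 ACP Total Subscribers": [19077.0, 2314.0, 129976.0, 14420.0, 4437.0],
})
# Statewide SNAP households from the "3600002 NY EBT NEW YORK STATE" row.
ny_snap_households = 1626534.0

acp_total, ratio = state_acp_to_snap_ratio(
    ny_sample, "Sep22 ACP Total Subscribers", ny_snap_households
)
print(acp_total)
```

Applied to all New York counties in the merged table, the same helper would give a single statewide ratio to set beside the county-level results for the other six states.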


In [122]:
df[df['State Name']=='NORTH CAROLINA'].sort_values(by=['Difference'],ascending=False)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
1490,37119,86662.0,NORTH CAROLINA,MECKLENBURG COUNTY,52742.0,33920.0
1471,37081,56353.0,NORTH CAROLINA,GUILFORD COUNTY,37173.0,19180.0
1522,37183,52435.0,NORTH CAROLINA,WAKE COUNTY,34402.0,18033.0
1456,37051,42653.0,NORTH CAROLINA,CUMBERLAND COUNTY,30687.0,11966.0
1504,37147,20963.0,NORTH CAROLINA,PITT COUNTY,9601.0,11362.0
1464,37067,34839.0,NORTH CAROLINA,FORSYTH COUNTY,25020.0,9819.0
1462,37063,24898.0,NORTH CAROLINA,DURHAM COUNTY,16237.0,8661.0
1508,37155,23577.0,NORTH CAROLINA,ROBESON COUNTY,15169.0,8408.0
1441,37021,20364.0,NORTH CAROLINA,BUNCOMBE COUNTY,13026.0,7338.0
1466,37071,22012.0,NORTH CAROLINA,GASTON COUNTY,16321.0,5691.0


In [123]:
df[df['State Name']=='PENNSYLVANIA'].sort_values(by=['Difference'],ascending=False)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
1798,42101,266602.0,PENNSYLVANIA,PHILADELPHIA COUNTY,132316.0,134286.0
1749,42003,91042.0,PENNSYLVANIA,ALLEGHENY COUNTY,45155.0,45887.0
1770,42045,38314.0,PENNSYLVANIA,DELAWARE COUNTY,20160.0,18154.0
1793,42091,30537.0,PENNSYLVANIA,MONTGOMERY COUNTY,13161.0,17376.0
1787,42079,32482.0,PENNSYLVANIA,LUZERNE COUNTY,15818.0,16664.0
1812,42129,24333.0,PENNSYLVANIA,WESTMORELAND COUNTY,10589.0,13744.0
1783,42071,26509.0,PENNSYLVANIA,LANCASTER COUNTY,13861.0,12648.0
1786,42077,27379.0,PENNSYLVANIA,LEHIGH COUNTY,14818.0,12561.0
1756,42017,21602.0,PENNSYLVANIA,BUCKS COUNTY,9250.0,12352.0
1814,42133,26108.0,PENNSYLVANIA,YORK COUNTY,14009.0,12099.0


In [104]:
df[df['State Name']=='SOUTH CAROLINA'].sort_values(by=['Difference'],ascending=False).tail(100)

Unnamed: 0,FIPS,JAN22 SNAP Households,State Name,County Name,Sep22 ACP Total Subscribers,Difference
1825,45019,16930.0,SOUTH CAROLINA,CHARLESTON COUNTY,10391.0,6539.0
1861,45091,10422.0,SOUTH CAROLINA,YORK COUNTY,5635.0,4787.0
1817,45003,10510.0,SOUTH CAROLINA,AIKEN COUNTY,6560.0,3950.0
1823,45015,9454.0,SOUTH CAROLINA,BERKELEY COUNTY,5603.0,3851.0
1822,45013,5257.0,SOUTH CAROLINA,BEAUFORT COUNTY,2764.0,2493.0
1844,45057,5098.0,SOUTH CAROLINA,LANCASTER COUNTY,2917.0,2181.0
1830,45029,3960.0,SOUTH CAROLINA,COLLETON COUNTY,1808.0,2152.0
1843,45055,4386.0,SOUTH CAROLINA,KERSHAW COUNTY,2530.0,1856.0
1839,45047,5184.0,SOUTH CAROLINA,GREENWOOD COUNTY,3351.0,1833.0
1827,45023,3177.0,SOUTH CAROLINA,CHESTER COUNTY,1420.0,1757.0
