# SNAP Gap in Los Angeles County: Spatial Patterns and Predictors
#### Meaghan Woody


## Urban Planning 213 Final Project 

#### **Background:**

One of the most urgent priorities for anti-hunger nonprofit organizations and CalFresh agencies in California is targeted outreach to eligible but unenrolled households. However, few tools exist to systematically identify communities where enrollment is low even though need is high. The “SNAP gap” refers to the proportion of people who are eligible for, but not enrolled in, the federally funded Supplemental Nutrition Assistance Program (SNAP), known in California as CalFresh. California's SNAP participation rate (81% in 2022, 67% in 2020) ranks below that of half of US states, yet the state has the highest number of SNAP recipients (USDA, 2022). In 2020, an estimated 280,000 LA County households were reported to be eligible for CalFresh but not receiving it. In general, to receive CalFresh benefits, a household's gross income must be below the gross-income eligibility standard of 130% of the Federal Poverty Level (FPL).
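
Stated as a formula (the symbols $E$ and $P$ are introduced here for illustration only), with $E$ the number of eligible people or households and $P$ the number of those who participate:

$$
\text{SNAP gap} = \frac{E - P}{E} = 1 - \frac{P}{E}
$$

Under this definition the gap is the complement of the participation rate, so a participation rate of 81% corresponds to a gap of 19%.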

Understanding neighborhood-level aspects of SNAP participation can greatly inform outreach efforts. Mapping tools can help identify and share opportunities to maximize efforts by targeting local areas with the lowest participation rates. Low SNAP participation rates in California cannot be attributed to a single factor; they reflect a multitude of factors spanning policy, politics, administration, demographics, and cultural attitudes. This project therefore aims to identify predictors of the SNAP gap through an exploratory analysis.

#### **Research aims:** 

1. Map the distribution of SNAP gap and food insecurity in Los Angeles County at varying geographic units.
2. Conduct a spatial cluster analysis of SNAP gap rates.
3. Identify predictors that explain SNAP gap using machine learning models trained on American Community Survey (ACS) variables.

#### **References:**

1. U.S. Department of Agriculture, Food and Nutrition Service. (2025, April 18). Reaching those in need: Estimates of state SNAP participation rates in 2022. https://www.fns.usda.gov/research/snap/state-participation-rates/2022
2. Los Angeles Regional Food Bank. (n.d.). Closing the SNAP Gap. Retrieved June 11, 2025, from https://www.lafoodbank.org/stories/closing-the-snap-gap/

#### **Workflow of notebooks:**

`0_Project intro & Data prep`: Data cleaning for all the ACS variables

`1_Mapping SNAP Rates`: Aims 1 and 2

`2_Machine Learning`: Aim 3

### 1) Access Census Data using the Census Bureau API

The table below lists the variables pulled from the American Community Survey (ACS) 2023 5-year estimates through the Census Bureau API.

Variable names are listed at https://api.census.gov/data/2023/acs/acs5/variables.html. ACS specifies the denominator for most variables. The variables used in the analysis are rates.

Note: additional variables could be investigated in the future: English language proficiency, single-parent households, disability, and households with children and older adults.

| **ACS Description** | **Numerator Var** | **Denominator Var** | **New Variable Name** |
|---------------------|-------------------|----------------------|------------------------|
| **== Demographics ==** | | | |
| Foreign-born: Not a U.S. citizen (share of foreign-born population) | B05002_021E | B05002_013E | %_foreignborn |
| Population aged 65+ (males + females) | B01001_020E to B01001_025E, B01001_044E to B01001_049E | B01001_002E + B01001_026E | %_seniors |
| Hispanic or Latino | B03003_003E | B03003_001E | %_hispanic |
| Black or African American alone | B02001_003E | B02001_001E | %_black |
| Asian alone | B02001_005E | B02001_001E | %_asian |
| **== Socioeconomics ==** | | | |
| Income < 0.50 FPL | C17002_002E | C17002_001E | part of snap_ebne denominator |
| Income 0.50–0.99 FPL | C17002_003E | C17002_001E | part of snap_ebne denominator |
| Income 1.00–1.24 FPL | C17002_004E | C17002_001E | part of snap_ebne denominator |
| Income 1.25–1.49 FPL | C17002_005E | C17002_001E | part of snap_ebne denominator |
| Median household income | B19013_001E | N/A | median_income |
| Employed (age 16+) | B23025_004E | B23025_002E | %_employed |
| Enrolled in college, undergraduate years | B14001_008E | B14001_002E | %_undergrad |
| **== Household ==** | | | |
| Average household size | B25010_001E | N/A | avg_hh_size |
| Renter occupied housing units | B25003_003E | B25003_001E | %_renters |
| Rent burdened (30% or more of income) | B25070_007E to B25070_010E | B25070_001E | %_rent_burd |
| No Internet access | B28002_013E | B28002_001E | %_no_int |
| No vehicle available (owner-occupied units) | B25044_003E | B25044_001E | %_no_vehic |
| **== SNAP ==** | | | |
| Household did not receive SNAP, income < FPL | B22003_006E | C17002_002E+C17002_003E+C17002_004E+C17002_005E | snap_ebne |
| (ALL) Household received SNAP | B22003_002E | B22003_001E | snap_rate |

Use the Census API to request all the variables in the table above for every census tract in Los Angeles County (FIPS 06037: state 06, county 037).

In [5]:
import json
import requests
import pandas as pd

r = requests.get(
    "https://api.census.gov/data/2023/acs/acs5?"
    "get=B05002_021E,B05002_013E,"
    "B01001_002E,B01001_026E,"
    "B01001_020E,B01001_021E,B01001_022E,B01001_023E,B01001_024E,B01001_025E,"
    "B01001_044E,B01001_045E,B01001_046E,B01001_047E,B01001_048E,B01001_049E,"
    "B03003_003E,B03003_001E,"
    "B02001_003E,B02001_005E,B02001_001E,"
    "C17002_001E,C17002_002E,C17002_003E,C17002_004E,C17002_005E,"
    "B19013_001E,"
    "B23025_004E,B23025_002E,"
    "B14001_008E,B14001_002E,"
    "B25010_001E,"
    "B25003_003E,B25003_001E,"
    "B25070_007E,B25070_008E,B25070_009E,B25070_010E,B25070_001E,"
    "B28002_013E,B28002_001E,"
    "B25044_003E,B25044_001E,"
    "B22003_006E,B22003_002E,B22003_001E"
    "&for=tract:*&in=state:06%20county:037"
)
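r.raise_for_status()  # stop here if the API returned an HTTP error
censusdata = r.json()  # first row holds the column names; all values arrive as strings
census_snap = pd.DataFrame(censusdata[1:], columns=censusdata[0])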
census_snap.head()

Unnamed: 0,B05002_021E,B05002_013E,B01001_002E,B01001_026E,B01001_020E,B01001_021E,B01001_022E,B01001_023E,B01001_024E,B01001_025E,...,B28002_013E,B28002_001E,B25044_003E,B25044_001E,B22003_006E,B22003_002E,B22003_001E,state,county,tract
0,347,1608,2035,2117,101,120,55,50,63,16,...,125,1558,59,1558,144,283,1558,6,37,101110
1,347,2122,2130,2068,49,66,149,54,32,115,...,19,1407,0,1407,27,160,1407,6,37,101122
2,600,1559,1585,1849,13,50,74,27,34,0,...,84,1357,12,1357,66,324,1357,6,37,101220
3,761,2100,2046,1885,65,57,100,19,0,188,...,241,1483,13,1483,159,433,1483,6,37,101221
4,456,1414,1239,1333,23,49,38,16,0,28,...,113,948,0,948,118,284,948,6,37,101222
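
The request above spells out all 46 variables in a single URL string. The Census API accepts at most 50 variables per query, so a longer list would have to be split across several requests. As a minimal sketch (the helper name `build_acs_urls` and its parameters are hypothetical, not part of this notebook), the URL can be assembled from a Python list and chunked when needed:

```python
from urllib.parse import quote

BASE_URL = "https://api.census.gov/data/2023/acs/acs5"
MAX_VARS = 50  # Census API limit on variables per request


def build_acs_urls(variables, for_clause="tract:*", in_clause="state:06 county:037"):
    """Return one request URL per chunk of at most MAX_VARS variables."""
    urls = []
    for start in range(0, len(variables), MAX_VARS):
        chunk = variables[start:start + MAX_VARS]
        url = (
            f"{BASE_URL}?get={','.join(chunk)}"
            f"&for={quote(for_clause, safe=':*')}"
            f"&in={quote(in_clause, safe=':*')}"
        )
        urls.append(url)
    return urls


# Example with three of the SNAP variables from the table
snap_vars = ["B22003_001E", "B22003_002E", "B22003_006E"]
urls = build_acs_urls(snap_vars)
print(urls[0])
```

If a list does need more than one request, the resulting tables would share the `state`, `county`, and `tract` columns and could be merged on them.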


The cell below calculates all the rates that will be used for analysis.

In [7]:
# The API returns every value as a string, so cast to float before calculating each rate

# == Demographics ==
census_snap['%_foreignborn'] = (census_snap['B05002_021E'].astype(float) / census_snap['B05002_013E'].astype(float)) * 100

# Numerator: males aged 65+ (B01001_020E to B01001_025E) plus females aged 65+ (B01001_044E to B01001_049E)
seniors_num = (census_snap[['B01001_020E', 'B01001_021E', 'B01001_022E', 'B01001_023E', 'B01001_024E', 'B01001_025E',
                            'B01001_044E', 'B01001_045E', 'B01001_046E', 'B01001_047E', 'B01001_048E', 'B01001_049E']]
               .astype(float).sum(axis=1))
# Denominator: all males plus all females
seniors_denom = census_snap['B01001_002E'].astype(float) + census_snap['B01001_026E'].astype(float)
census_snap['%_seniors'] = (seniors_num / seniors_denom) * 100

census_snap['%_hispanic'] = (census_snap['B03003_003E'].astype(float) / census_snap['B03003_001E'].astype(float)) * 100
census_snap['%_black'] = (census_snap['B02001_003E'].astype(float) / census_snap['B02001_001E'].astype(float)) * 100
census_snap['%_asian'] = (census_snap['B02001_005E'].astype(float) / census_snap['B02001_001E'].astype(float)) * 100

# == Socioeconomics ==
# Denominator for snap_ebne: sum of the four income groups below 1.50 FPL
# (C17002 counts people, whereas the B22003 numerator counts households)
poverty = (census_snap['C17002_002E'].astype(float) + census_snap['C17002_003E'].astype(float) +
           census_snap['C17002_004E'].astype(float) + census_snap['C17002_005E'].astype(float))

census_snap['%_employed'] = (census_snap['B23025_004E'].astype(float) / census_snap['B23025_002E'].astype(float)) * 100
census_snap['%_undergrad'] = (census_snap['B14001_008E'].astype(float) / census_snap['B14001_002E'].astype(float)) * 100

# == Household ==
census_snap['%_renters'] = (census_snap['B25003_003E'].astype(float) / census_snap['B25003_001E'].astype(float)) * 100

# Rent burdened: gross rent at 30% or more of household income (B25070_007E to B25070_010E)
rent_burd_num = census_snap[['B25070_007E', 'B25070_008E', 'B25070_009E', 'B25070_010E']].astype(float).sum(axis=1)
census_snap['%_rent_burd'] = (rent_burd_num / census_snap['B25070_001E'].astype(float)) * 100

census_snap['%_no_int'] = (census_snap['B28002_013E'].astype(float) / census_snap['B28002_001E'].astype(float)) * 100
census_snap['%_no_vehic'] = (census_snap['B25044_003E'].astype(float) / census_snap['B25044_001E'].astype(float)) * 100

# == SNAP ==
census_snap['snap_ebne'] = (census_snap['B22003_006E'].astype(float) / poverty) * 100
census_snap['snap_rate'] = (census_snap['B22003_002E'].astype(float) / census_snap['B22003_001E'].astype(float)) * 100

# For the two variables that do not have denominators, only renaming is needed
census_snap['median_income'] = census_snap['B19013_001E'].astype(float)
census_snap['avg_hh_size'] = census_snap['B25010_001E'].astype(float)

# View new variables
census_snap.head()

Unnamed: 0,B05002_021E,B05002_013E,B01001_002E,B01001_026E,B01001_020E,B01001_021E,B01001_022E,B01001_023E,B01001_024E,B01001_025E,...,%_employed,%_undergrad,%_renters,%_rent_burd,%_no_int,%_no_vehic,snap_ebne,snap_rate,median_income,avg_hh_size
0,347,1608,2035,2117,101,120,55,50,63,16,...,93.125811,32.312253,48.780488,52.5,8.023107,3.786906,23.300971,18.164313,84091.0,2.6
1,347,2122,2130,2068,49,66,149,54,32,115,...,90.264317,31.619537,22.103767,69.77492,1.350391,0.0,6.601467,11.371713,99583.0,2.98
2,600,1559,1585,1849,13,50,74,27,34,0,...,91.412742,21.700224,56.079587,43.232589,6.190125,0.884304,10.122699,23.876197,69676.0,2.53
3,761,2100,2046,1885,65,57,100,19,0,188,...,96.729958,26.835902,77.815239,68.717504,16.250843,0.876601,14.627415,29.197572,53798.0,2.59
4,456,1414,1239,1333,23,49,38,16,0,28,...,93.887147,34.913793,92.088608,55.32646,11.919831,0.0,17.744361,29.957806,45662.0,2.7
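
The rate calculations above divide raw counts directly, so a tract with a zero denominator yields `NaN` or `inf`, and the negative placeholder codes the Census API uses for unavailable estimates (for example `-666666666`) would pass through as if they were real values. A minimal sketch of a guarded alternative, assuming a hypothetical helper named `safe_rate` and assuming that any negative estimate should be treated as missing:

```python
import numpy as np
import pandas as pd


def safe_rate(df, numerator_cols, denominator_cols):
    """Percent rate from summed count columns; NaN where the rate is undefined."""
    num = df[numerator_cols].apply(pd.to_numeric, errors="coerce")
    den = df[denominator_cols].apply(pd.to_numeric, errors="coerce")
    # Negative values are treated as missing-data codes (assumption)
    num = num.where(num >= 0).sum(axis=1, min_count=len(numerator_cols))
    den = den.where(den >= 0).sum(axis=1, min_count=len(denominator_cols))
    # A zero denominator would give inf or NaN; mark it as missing instead
    den = den.replace(0, np.nan)
    return num / den * 100


# The first two rows reuse counts from the output above; the last two are made up
demo = pd.DataFrame({
    "B22003_002E": ["283", "160", "0", "-666666666"],
    "B22003_001E": ["1558", "1407", "0", "900"],
})
demo["snap_rate"] = safe_rate(demo, ["B22003_002E"], ["B22003_001E"])
print(demo)
```

The first two rows reproduce the `snap_rate` values shown above (about 18.16 and 11.37), while the zero-denominator row and the placeholder row come out as `NaN`.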


In [8]:
# View all the variable names 
print(list(census_snap.columns))

['B05002_021E', 'B05002_013E', 'B01001_002E', 'B01001_026E', 'B01001_020E', 'B01001_021E', 'B01001_022E', 'B01001_023E', 'B01001_024E', 'B01001_025E', 'B01001_044E', 'B01001_045E', 'B01001_046E', 'B01001_047E', 'B01001_048E', 'B01001_049E', 'B03003_003E', 'B03003_001E', 'B02001_003E', 'B02001_005E', 'B02001_001E', 'C17002_001E', 'C17002_002E', 'C17002_003E', 'C17002_004E', 'C17002_005E', 'B19013_001E', 'B23025_004E', 'B23025_002E', 'B14001_008E', 'B14001_002E', 'B25010_001E', 'B25003_003E', 'B25003_001E', 'B25070_007E', 'B25070_008E', 'B25070_009E', 'B25070_010E', 'B25070_001E', 'B28002_013E', 'B28002_001E', 'B25044_003E', 'B25044_001E', 'B22003_006E', 'B22003_002E', 'B22003_001E', 'state', 'county', 'tract', '%_foreignborn', '%_seniors', '%_hispanic', '%_black', '%_asian', '%_employed', '%_undergrad', '%_renters', '%_rent_burd', '%_no_int', '%_no_vehic', 'snap_ebne', 'snap_rate', 'median_income', 'avg_hh_size']


### 2) Load in Shapefiles for Los Angeles County

Join the American Community Survey variables with the tract shapefile for mapping in the next step.

Data obtained from https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html
& https://www.census.gov/cgi-bin/geo/shapefiles/index.php 

In [10]:
import geopandas as gpd

# Load LA County tract shapefile
tracts = gpd.read_file('/Users/markwoody/Desktop/UP 213/UP213Final/final/data/tl_2020_06037_tract20')
print(tracts.columns)
tracts.head(5)

# Create GEOID in both datasets
census_snap['GEOID'] = census_snap['state'] + census_snap['county'] + census_snap['tract']
tracts['GEOID'] = tracts['GEOID20']

# Set GEOID as index
tracts = tracts.set_index('GEOID')
census_snap = census_snap.set_index('GEOID')

# Join geometry from tracts into full census_snap
snap_tracts = census_snap.join(tracts[['geometry']], how='inner')
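# TIGER/Line geometries are in NAD83 (EPSG:4269): keep that CRS on the joined frame, then reproject to WGS84
snap_tracts = gpd.GeoDataFrame(snap_tracts, geometry='geometry', crs=tracts.crs).to_crs('EPSG:4326')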
snap_tracts.head(5)

Index(['STATEFP20', 'COUNTYFP20', 'TRACTCE20', 'GEOID20', 'NAME20',
       'NAMELSAD20', 'MTFCC20', 'FUNCSTAT20', 'ALAND20', 'AWATER20',
       'INTPTLAT20', 'INTPTLON20', 'geometry'],
      dtype='object')


GEOID,B05002_021E,B05002_013E,B01001_002E,B01001_026E,B01001_020E,B01001_021E,B01001_022E,B01001_023E,B01001_024E,B01001_025E,...,%_undergrad,%_renters,%_rent_burd,%_no_int,%_no_vehic,snap_ebne,snap_rate,median_income,avg_hh_size,geometry
6037101110,347,1608,2035,2117,101,120,55,50,63,16,...,32.312253,48.780488,52.5,8.023107,3.786906,23.300971,18.164313,84091.0,2.6,"POLYGON ((-118.30229 34.2587, -118.30091 34.25..."
6037101122,347,2122,2130,2068,49,66,149,54,32,115,...,31.619537,22.103767,69.77492,1.350391,0.0,6.601467,11.371713,99583.0,2.98,"POLYGON ((-118.30334 34.27371, -118.3033 34.27..."
6037101220,600,1559,1585,1849,13,50,74,27,34,0,...,21.700224,56.079587,43.232589,6.190125,0.884304,10.122699,23.876197,69676.0,2.53,"POLYGON ((-118.28592 34.25227, -118.28592 34.2..."
6037101221,761,2100,2046,1885,65,57,100,19,0,188,...,26.835902,77.815239,68.717504,16.250843,0.876601,14.627415,29.197572,53798.0,2.59,"POLYGON ((-118.29945 34.25598, -118.29792 34.2..."
6037101222,456,1414,1239,1333,23,49,38,16,0,28,...,34.913793,92.088608,55.32646,11.919831,0.0,17.744361,29.957806,45662.0,2.7,"POLYGON ((-118.29434 34.25233, -118.29318 34.2..."
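
An inner join silently drops any tract whose GEOID appears in only one of the two tables. Tract GEOIDs are 11-character strings (2-digit state, 3-digit county, 6-digit tract), so a lost leading zero would also prevent a match. A minimal sketch of a pre-join check on toy data, assuming a hypothetical helper `build_geoid` and one made-up tract code (`999999`):

```python
import pandas as pd


def build_geoid(df):
    """Concatenate zero-padded state (2), county (3) and tract (6) codes."""
    return (
        df["state"].astype(str).str.zfill(2)
        + df["county"].astype(str).str.zfill(3)
        + df["tract"].astype(str).str.zfill(6)
    )


# Toy stand-ins for the ACS table and the shapefile attribute table
acs = pd.DataFrame({"state": [6, 6, 6], "county": [37, 37, 37],
                    "tract": [101110, 101122, 999999]})
shapes = pd.DataFrame({"GEOID20": ["06037101110", "06037101122", "06037101220"]})

acs["GEOID"] = build_geoid(acs)

# An outer merge with indicator=True labels each row as both, left_only or right_only
check = acs.merge(shapes, left_on="GEOID", right_on="GEOID20",
                  how="outer", indicator=True)
print(check["_merge"].value_counts())
```

Rows labelled `left_only` or `right_only` are the ones an inner join would discard, which makes it easy to see how many tracts are lost before mapping.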


Save this dataset so we can use it in the next notebook for mapping.

In [12]:
snap_tracts.to_file("snap_tracts_exp.gpkg", driver="GPKG")

I am also interested in mapping at the ZIP Code Tabulation Area (ZCTA) level, which might be more meaningful for SNAP outreach. ZCTA codes are five digits, unlike the 11-digit tract GEOIDs, so the tract data must be linked to ZCTAs before mapping.

In [14]:
# Load the national ZCTA shapefile (covers all U.S. ZCTAs, not only LA County)
zcta = gpd.read_file('/Users/markwoody/Desktop/UP 213/UP213Final/final/data/tl_2024_us_zcta520.zip')
print(zcta.columns)
zcta.head(5)

Index(['ZCTA5CE20', 'GEOID20', 'GEOIDFQ20', 'CLASSFP20', 'MTFCC20',
       'FUNCSTAT20', 'ALAND20', 'AWATER20', 'INTPTLAT20', 'INTPTLON20',
       'geometry'],
      dtype='object')


Unnamed: 0,ZCTA5CE20,GEOID20,GEOIDFQ20,CLASSFP20,MTFCC20,FUNCSTAT20,ALAND20,AWATER20,INTPTLAT20,INTPTLON20,geometry
0,47236,47236,860Z200US47236,B5,G6350,S,1029063,0,39.1517426,-85.7252769,"POLYGON ((-85.7341 39.15597, -85.72794 39.1561..."
1,47870,47870,860Z200US47870,B5,G6350,S,8830,0,39.3701518,-87.4735141,"POLYGON ((-87.47414 39.37016, -87.47409 39.370..."
2,47851,47851,860Z200US47851,B5,G6350,S,53326,0,39.5735839,-87.2459559,"POLYGON ((-87.24769 39.5745, -87.24711 39.5744..."
3,47337,47337,860Z200US47337,B5,G6350,S,303089,0,39.8027537,-85.437285,"POLYGON ((-85.44357 39.80328, -85.44346 39.803..."
4,47435,47435,860Z200US47435,B5,G6350,S,13302,0,39.2657557,-86.2951577,"POLYGON ((-86.29592 39.26547, -86.29592 39.266..."


In [15]:
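# Tract GEOIDs (11 digits) never match ZCTA GEOID20 codes (5 digits), so an attribute join on
# GEOID returns no rows. Assign each tract to a ZCTA spatially instead.
zcta = zcta.to_crs(snap_tracts.crs)

# Represent each tract by an interior point and find the ZCTA that contains it
tract_pts = snap_tracts.copy()
tract_pts['geometry'] = tract_pts.geometry.representative_point()
tract_zcta = gpd.sjoin(tract_pts, zcta[['ZCTA5CE20', 'geometry']], how='inner', predicate='within')

# Approximation: unweighted mean of the tract-level variables within each ZCTA
rate_cols = list(snap_tracts.loc[:, '%_foreignborn':'avg_hh_size'].columns)
zcta_means = tract_zcta.groupby('ZCTA5CE20')[rate_cols].mean()

# Attach ZCTA geometry; the inner join keeps only ZCTAs that contain LA County tracts
snap_zcta = zcta.set_index('ZCTA5CE20')[['geometry']].join(zcta_means, how='inner')
snap_zcta.head(5)

snap_zcta.to_file("snap_zcta_exp.gpkg", driver="GPKG")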
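
When tract-level values are rolled up to a larger unit such as a ZCTA, averaging tract rates weights a small tract as heavily as a large one. One alternative is to sum the numerator and denominator counts within each unit and recompute the rate from the totals. A minimal sketch with made-up counts, assuming a hypothetical `zcta` column that records the ZCTA each tract was assigned to:

```python
import pandas as pd

# Made-up tract counts with a hypothetical tract-to-ZCTA assignment
tracts_demo = pd.DataFrame({
    "zcta": ["90001", "90001", "90002"],
    "B22003_002E": [100, 300, 50],     # households receiving SNAP
    "B22003_001E": [500, 3000, 500],   # all households
})

# Sum counts per ZCTA, then recompute the rate from the totals
zcta_counts = tracts_demo.groupby("zcta")[["B22003_002E", "B22003_001E"]].sum()
zcta_counts["snap_rate"] = zcta_counts["B22003_002E"] / zcta_counts["B22003_001E"] * 100
print(zcta_counts)
```

In this example the pooled rate for `90001` is about 11.4%, whereas the unweighted mean of its two tract rates (20% and 10%) would be 15%. Medians such as `median_income` cannot be recombined this way.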