# Final Notebook: Food Access in LA County
## Group Members: Madi Hamilton, Jessica Fay, Meaghan Woody, Branden Bohrnsen
### UP221 Winter 2024

### Description
**Research Question:** Are there geographic disparities and trends in food insecurity and coronary heart disease in Los Angeles County?

**Notebook purpose:** Prepare data for creating maps

**Data sources:**
1. USC Neighborhood Data for Social Change
2. U.S. Census Bureau - American Community Survey 2016-2020
3. U.S. Census Tracts 2020

### Import Libraries

In [2]:
import pandas as pd
import geopandas as gpd 
import plotly.express as px
import matplotlib.pyplot as plt
import numpy as np
import contextily as cx
import mapclassify

### Create merged dataset with variables and geometry

In [3]:
# Import merged dataset for USC and Census variables
food = pd.read_csv('finaldata_0303.csv')
food.info(verbose=True, show_counts=True)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4910 entries, 0 to 4909
Data columns (total 27 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   geoid20_x                          4910 non-null   object 
 1   CT20                               4910 non-null   object 
 2   FIPS_census                        4910 non-null   float64
 3   tract                              4910 non-null   float64
 4   ShapeSTArea                        4910 non-null   float64
 5   ShapeSTLength                      4908 non-null   float64
 6   geometry_x                         4906 non-null   object 
 7   % Hispanic or Latino               4910 non-null   int64  
 8   % Not a Citizen                    4910 non-null   int64  
 9   % Unemployed                       4910 non-null   int64  
 10  Population Density (Per Sq. Mile)  4910 non-null   int64  
 11  Median Household Income            4910 non-null   int64

In [6]:
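# Load the 2020 California census tract shapefile
tracts = gpd.read_file('tl_2020_06_tract.shp')
# Copy the tract code into a key column that matches the food data
tracts['geoid20_x'] = tracts['TRACTCE']
# List the county codes present in the shapefile
tracts['COUNTYFP'].unique()
# Filter to LA County census tracts (county FIPS code 037)
latracts = tracts.query("COUNTYFP == '037'")

# Drop rows where the Hispanic or Latino percentage exceeds 100 (out-of-range values)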
new = food[food['% Hispanic or Latino'] <= 100]
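
Before dropping rows, it can help to see how many fall outside the valid range. The sketch below is a minimal illustration on an invented toy table (`toy` and its values are hypothetical, not the project data); it uses `Series.between`, which also flags negative percentages, whereas the filter above only checks the upper bound.

```python
import pandas as pd

# Toy stand-in for `food`; the values are invented for illustration.
toy = pd.DataFrame({
    "geoid20_x": ["101110", "101122", "101220", "101221"],
    "% Hispanic or Latino": [35, 250, 100, 0],
})

# True where the percentage lies in the valid 0-100 range (inclusive).
in_range = toy["% Hispanic or Latino"].between(0, 100)

dropped = toy[~in_range]  # rows the filter would remove
kept = toy[in_range]      # rows that survive the filter

print(f"kept {len(kept)} rows, dropped {len(dropped)} rows")
```

Inspecting `dropped` before discarding it shows whether the out-of-range values are isolated data-entry errors or a sign of a larger problem upstream.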

In [7]:
# Merge the tract geometries with the filtered food data on the tract code (inner join by default)
tracts_census = latracts.merge(new, on="geoid20_x")
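
Because `merge` defaults to an inner join, tracts whose key appears in only one table are dropped silently. A minimal sketch of one way to check this, using invented toy tables (`left`, `right`, and their values are hypothetical): an outer merge with `indicator=True` labels each row by where its key was found.

```python
import pandas as pd

# Toy stand-ins for `latracts` and `new`; the values are invented.
left = pd.DataFrame({"geoid20_x": ["101110", "101122", "101220"]})
right = pd.DataFrame({
    "geoid20_x": ["101110", "101220", "980001"],
    "% Unemployed": [5, 8, 3],
})

# An outer merge keeps every key; the `_merge` column records its origin.
check = left.merge(right, on="geoid20_x", how="outer", indicator=True)
match_counts = check["_merge"].value_counts()
print(match_counts)
```

Rows labeled `left_only` or `right_only` are the ones an inner join would discard, so nonzero counts there are worth investigating (for example, tract codes that lost leading zeros when the CSV was read).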

In [8]:
# Export DataFrame to CSV
tracts_census.to_csv('merged_data_jf.csv', index=False)

### Create new variables for analysis

In [9]:
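# Create low access percent variable: low-access count as a rounded percentage of total population
# Note: this column is added to `food` only; `tracts_census` was merged earlier and does not include it
a = food['lowaccess_count']
b = food['denom_total_pop']
food['lowaccess_pct'] = round((a / b) * 100)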
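
If any tract has a total population of zero, the division above yields `inf` or `NaN`. The following is a minimal sketch of one way to guard against that, on an invented toy table (`toy` and its values are hypothetical); whether the real data contains zero-population tracts is an assumption to verify.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `food`; the values are invented for illustration.
toy = pd.DataFrame({
    "lowaccess_count": [120, 0, 45, 10],
    "denom_total_pop": [4000, 2500, 0, 300],
})

# Replace zero denominators with NaN so the percentage is missing, not inf.
denom = toy["denom_total_pop"].replace(0, np.nan)
toy["lowaccess_pct"] = (toy["lowaccess_count"] / denom * 100).round()
print(toy)
```

Leaving the percentage missing for zero-population tracts keeps them from distorting map classification later on.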

In [10]:
# Create Food Insecurity Index
# Build a Natural Breaks classifier with 5 classes
classifier = mapclassify.NaturalBreaks.make(k=5)

# Age score
tracts_census['age_pct_score'] = tracts_census[['Percent 65 years and over']].apply(classifier)
tracts_census[['Percent 65 years and over', 'age_pct_score']].head()

# Hispanic score
tracts_census['hisp_pct_score'] = tracts_census[['% Hispanic or Latino']].apply(classifier)
tracts_census[['% Hispanic or Latino', 'hisp_pct_score']].head()

# Unemployment score
tracts_census['emp_pct_score'] = tracts_census[['% Unemployed']].apply(classifier)
tracts_census[['% Unemployed', 'emp_pct_score']].head()

# Food Insecurity Index (stored as priority_index): sum of the age, Hispanic, and unemployment scores
tracts_census['priority_index'] = (
    tracts_census['age_pct_score']
    + tracts_census['hisp_pct_score']
    + tracts_census['emp_pct_score']
)
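
To illustrate how the index behaves, the sketch below scores three columns into five classes each and sums the scores. It is not the notebook's method: quantile bins from `pd.qcut` stand in for natural breaks so that the example runs with pandas alone, and the toy values and the `score` helper are hypothetical. It assumes class labels run from 0 to k-1, so with three components and k=5 the index ranges from 0 to 12.

```python
import pandas as pd

# Toy stand-in for `tracts_census`; the values are invented for illustration.
toy = pd.DataFrame({
    "Percent 65 years and over": [5, 9, 12, 15, 18, 22, 25, 30, 34, 40],
    "% Hispanic or Latino": [90, 80, 70, 60, 50, 40, 30, 20, 10, 5],
    "% Unemployed": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
})

def score(series, k=5):
    """Bin a column into k classes and return integer labels 0..k-1."""
    # Quantile bins are a stand-in for natural breaks (assumption of this sketch).
    return pd.qcut(series, q=k, labels=False)

# Score every column, then sum the three scores row by row.
scores = toy.apply(score)
toy["priority_index"] = scores.sum(axis=1)
print(toy[["priority_index"]])
```

Each component contributes equally to the sum, so a tract that scores high on one variable and low on another can land on the same index value as a tract with middling scores on all three.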