# Link2Feed Client Mapping Project: Analyzing client population to find new service opportunities

## Notebook 2 of 2: Mapping and Visual Analysis Notebook

### About the Data Set: 
This data set is pulled from the Link2Feed client intake software used by the Food Bank of Central and Eastern NC (FB of CENC). Whenever a client enrolls in the CSFP or TEFAP programs, their information is entered into the Link2Feed system so that the FB of CENC and its associated food pantries can track usage and confirm that clients qualify for the services they request. The data set is over 2.5 GB, covers October 2019 to the present, and has almost 200k rows and over 480 columns. Most of the data is demographic information.

From this data set, the FB of CENC wants to visualize where its clients live. These clients are mostly senior citizens who rely on some form of governmental assistance, have very low incomes, may have difficulty with transportation, and are generally more disadvantaged than the average citizen. Because this population is disadvantaged, it is in both the clients' and the Food Bank's best interest to ensure food is distributed as efficiently as possible. The goal of this project is to test the hypothesis below to determine where the Food Bank can encourage the addition of new pantries or add services to existing pantries. By mapping clusters of clients geographically, along with the associated food pantries and distribution centers, we may be able to identify more strategic locations for food distributors, which could result in more efficient service to a disadvantaged population.

### Hypothesis: 
Food Bank CSFP and TEFAP client populations will receive services in a way that saves them time and transportation costs when new food pantries are opened in locations that are closer to where they live and more accessible than the current pantry distribution.
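One way the hypothesis could be quantified is to compare each client's straight-line distance to the nearest pantry before and after a proposed pantry is added. The sketch below is illustrative and not part of either notebook: the function names, the Earth-radius constant, and the sample coordinates are all assumptions, and straight-line distance only approximates actual travel time.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversineMiles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dPhi = phi2 - phi1
    dLambda = math.radians(lon2 - lon1)
    a = math.sin(dPhi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dLambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearestPantryMiles(client, pantries):
    """Distance from one (lat, lon) client to the closest (lat, lon) pantry."""
    return min(haversineMiles(client[0], client[1], lat, lon) for lat, lon in pantries)


# hypothetical coordinates, for illustration only
client = (35.78, -78.64)
currentPantries = [(35.99, -78.90)]
proposedPantries = currentPantries + [(35.80, -78.60)]

before = nearestPantryMiles(client, currentPantries)
after = nearestPantryMiles(client, proposedPantries)
print(f"nearest pantry: {before:.1f} mi before, {after:.1f} mi after")
```

Averaging this distance over all clients for the current and the proposed pantry sets would give a rough measure of how much a new location improves access.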

### Methods: 
The project will be conducted using Python with the pandas, geopandas, geopy, and folium libraries. The data sources are client information from the Link2Feed system provided by the Food Bank, county boundary data made publicly available by the state of NC, and food pantry location data provided by the Food Bank. These three sources will be integrated into an interactive map that will be distributed to the Food Bank as a tool to aid in decision making. Finally, this project is broken into two separate notebooks. Notebook 1 imports, cleans, and combines the data to prepare for analysis. Notebook 2 (this notebook) creates the map and will be used as a presentation tool.

### Notes: 

## Part 1: Data Import

In [1]:
import geopandas as gpd
import pandas as pd
import numpy as np
import folium
import folium.plugins as plugins

In [2]:
#read in all data sets 

#client population data set
clients = pd.read_csv(r"C:\Users\htwal\Jupyter Projects\6a.food_bank_client_mapping\Processed Data\ClientMapping.csv")

#facility locations data set
agency = pd.read_csv(r"C:\Users\htwal\Jupyter Projects\6a.food_bank_client_mapping\Final Data\agencyDfFinal.csv")

#NC counties data set
county = gpd.read_file(r"C:\Users\htwal\Jupyter Projects\6a.food_bank_client_mapping\Raw Data\NCDOT_County_Boundaries2.geojson")


### County wrangling

In [3]:
counties = ['Brunswick', 'Carteret', 'Chatham', 'Columbus', 'Craven', 'Duplin', 'Durham', 'Edgecombe', 'Franklin',
            'Granville', 'Greene', 'Halifax', 'Harnett', 'Johnston', 'Jones', 'Lee', 'Lenoir', 'Moore', 'Nash',
            'New Hanover', 'Onslow', 'Orange', 'Pamlico', 'Pender', 'Person', 'Pitt', 'Richmond', 'Sampson', 
            'Scotland', 'Vance', 'Wake', 'Warren', 'Wayne', 'Wilson']

county = county[county['CountyName'].isin(counties)]
raleigh = ['Duplin', 'Franklin', 'Halifax', 'Harnett', 'Johnston', 'Nash', 'Sampson', 'Wake', 'Warren', 'Wayne']
durham = ['Chatham', 'Durham', 'Granville', 'Orange', 'Person', 'Vance']
newBern = ['Carteret', 'Craven', 'Jones', 'Onslow', 'Pamlico']
wilmington = ['Brunswick', 'Columbus', 'New Hanover', 'Pender']
greenville = ['Edgecombe', 'Greene', 'Lenoir', 'Pitt', 'Wilson']
sandhills = ['Lee', 'Moore', 'Richmond', 'Scotland']
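
The six regional lists are defined here but not used later in this notebook. If a per-region view is wanted, one option is to invert them into a county-to-region lookup. This is a sketch only: the region labels are assumed from the list variable names, and `countyToRegion` is a hypothetical name.

```python
# region lists copied from the cell above (labels are assumed from the variable names)
regions = {
    'Raleigh': ['Duplin', 'Franklin', 'Halifax', 'Harnett', 'Johnston', 'Nash', 'Sampson', 'Wake', 'Warren', 'Wayne'],
    'Durham': ['Chatham', 'Durham', 'Granville', 'Orange', 'Person', 'Vance'],
    'New Bern': ['Carteret', 'Craven', 'Jones', 'Onslow', 'Pamlico'],
    'Wilmington': ['Brunswick', 'Columbus', 'New Hanover', 'Pender'],
    'Greenville': ['Edgecombe', 'Greene', 'Lenoir', 'Pitt', 'Wilson'],
    'Sandhills': ['Lee', 'Moore', 'Richmond', 'Scotland'],
}

# invert to {county name: region label}
countyToRegion = {name: region for region, names in regions.items() for name in names}

print(len(countyToRegion))       # number of counties covered
print(countyToRegion['Wake'])    # region for a single county
```

With this lookup, something like `county['CountyName'].map(countyToRegion)` could add a region column to the county data frame for styling or filtering.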

### CSFP and TEFAP Client Populations with Food Pantry Locations

In [4]:
#split the clients data frame into three different dfs
csfpClients = clients[clients['Program Name'] == 'CSFP Visit']
tefapClients = clients[clients['Program Name'] == 'TEFAP Pantry Visit']
fpClients = clients[clients['Program Name'] == 'Food Pantry Visit']

In [5]:
agency.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 888 entries, 0 to 887
Data columns (total 8 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Unnamed: 0        886 non-null    object 
 1   Site Name Master  888 non-null    object 
 2   Lat               888 non-null    float64
 3   Lon               888 non-null    float64
 4   Address Master    436 non-null    object 
 5   CSFP              217 non-null    object 
 6   TEFAP             237 non-null    object 
 7   Score             888 non-null    float64
dtypes: float64(3), object(5)
memory usage: 55.6+ KB


In [6]:
agency['CSFP'] = agency['CSFP'].fillna('')
agency['TEFAP'] = agency['TEFAP'].fillna('')
agency['Services'] = agency['CSFP'] + ' ' + agency['TEFAP']
agency['Services'] = agency['Services'].str.strip()

In [7]:
agency['Services'].value_counts()

                   495
CSFP               156
TEFAP              137
CSFP TEFAP          55
TEFAP TEMP          39
CSFP TEFAP TEMP      6
Name: Services, dtype: int64

In [8]:
#concat strings for the pop up icon
agency['popup'] = agency['Site Name Master'] + ' / ' + agency['Services'] + ' / ' + agency['Score'].astype(str)

### Map Code

In [9]:
#generate dfs that contain each facility grouped by the services it offers
agencyTefap = agency[(agency['Services'] == 'TEFAP') | (agency['Services'] == 'TEFAP TEMP')]
agencyCsfp = agency[(agency['Services'] == 'CSFP')]
agencyNone = agency[(agency['Services'] == '')]
agencyCsfpTefap = agency[(agency['Services'] == 'CSFP TEFAP') | (agency['Services'] == 'CSFP TEFAP TEMP')]
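
The four filters above treat the `TEMP` suffix as part of the same group as the base service. A minimal sketch of that grouping as a single function follows; `serviceGroup` and the group labels are hypothetical names, not part of the notebook.

```python
def serviceGroup(services):
    """Map a Services string to one of the four layer groups used on the map."""
    tokens = services.split()
    if not tokens:
        return 'No Services'
    parts = set(tokens) - {'TEMP'}  # TEMP does not change the group
    if parts == {'CSFP', 'TEFAP'}:
        return 'CSFP TEFAP'
    if parts == {'CSFP'}:
        return 'CSFP'
    if parts == {'TEFAP'}:
        return 'TEFAP'
    return 'Other'  # any combination not covered by the four filters


for value in ['', 'CSFP', 'TEFAP', 'CSFP TEFAP', 'TEFAP TEMP', 'CSFP TEFAP TEMP']:
    print(repr(value), '->', serviceGroup(value))
```

Applied with `agency['Services'].apply(serviceGroup)`, this would produce one grouping column that could be used with `groupby` instead of four separate data frames.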

Color coding:
1. Beige - agency present only in the master agency dataset
2. Gray - agency present only in the distribution dataset
3. Orange - agency present in the master and distribution datasets
4. Red - agency present only in the Link2Feed dataset
5. Light Blue - agency present in the master and Link2Feed datasets
6. Dark Blue - agency present in the distribution and Link2Feed datasets only
7. Green - agency present in all three datasets
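
The map cell below colors markers by services rather than by this list. If the numbers 1 to 7 correspond to values in the `Score` column (an assumption; the notebook does not say so explicitly), the list could be expressed as a lookup like the following sketch. `scoreColors` and `scoreColor` are hypothetical names, and the color strings were chosen to match folium's named icon colors.

```python
import math

# assumed mapping from the color-coding list above
scoreColors = {
    1: 'beige',      # master agency dataset only
    2: 'gray',       # distribution dataset only
    3: 'orange',     # master and distribution
    4: 'red',        # Link2Feed only
    5: 'lightblue',  # master and Link2Feed
    6: 'darkblue',   # distribution and Link2Feed
    7: 'green',      # all three datasets
}


def scoreColor(score, default='lightgray'):
    """Return the color for a score, falling back to a default for missing or unknown values."""
    if score is None or (isinstance(score, float) and math.isnan(score)):
        return default
    return scoreColors.get(int(score), default)


print(scoreColor(7.0))
print(scoreColor(float('nan')))
```

The result could then be passed as the `color` argument of `folium.Icon` when each pantry marker is created.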

In [11]:
mp = folium.Map(
    location=[34.39109797893491, -78.13374537548596],
    tiles="cartodbpositron",
    zoom_start=8,
)

#plotting county geoJSON
fga = folium.FeatureGroup(name='Counties')
countyStyleFunction = lambda x:{'fillOpacity':0.1}
countypop = folium.features.GeoJsonPopup(fields=['CountyName'])
fga.add_child(folium.features.GeoJson(county, style_function=countyStyleFunction, popup=countypop))
    
#plotting the three different types of clients:
def zipperDot(df, col1, col2):
    '''Return a list of (lat, lon) tuples built from two columns of a data frame'''
    lat = df[col1].tolist()
    lon = df[col2].tolist()
    latLon = list(zip(lat, lon))
    return latLon

fgb = folium.FeatureGroup(name="CSFP visit", show=False)
csfp = list(zipperDot(csfpClients, 'Latitude', 'Longitude')) 
for lat, lon in csfp:
    fgb.add_child(folium.CircleMarker(location=[lat, lon], radius=1, color='blue'))

fgc = folium.FeatureGroup(name="TEFAP visit", show=False)
tefap = list(zipperDot(tefapClients, 'Latitude', 'Longitude')) 
for lat, lon in tefap:
    fgc.add_child(folium.CircleMarker(location=[lat, lon], radius=1, color='green'))
    
fgd = folium.FeatureGroup(name="Food Pantry visit", show=False)
fpv = list(zipperDot(fpClients, 'Latitude', 'Longitude')) 
for lat, lon in fpv:
    fgd.add_child(folium.CircleMarker(location=[lat, lon], radius=1, color='red'))
    
#add in pantries (marker color and icon are based on services offered; the score appears in the popup)
def zipperPantry(df, col1, col2, col3, col4):
    '''Return a list of (lat, lon, popup, score) tuples built from four columns of a data frame'''
    lat = df[col1].tolist()
    lon = df[col2].tolist()
    pop = df[col3].tolist()
    score = df[col4].tolist()
    info = list(zip(lat, lon, pop, score))
    return info

fge = folium.FeatureGroup(name="CSFP pantry", show=False)
csfpPantry = list(zipperPantry(agencyCsfp, 'Lat', 'Lon', 'popup', 'Score')) 
for lat, lon, pop, score in csfpPantry:
    fge.add_child(folium.Marker(location=[lat, lon], popup=pop, icon=folium.Icon(icon='fa-shopping-basket', prefix='fa', color='blue')))

fgf = folium.FeatureGroup(name="TEFAP pantry", show=False)
tefapPantry = list(zipperPantry(agencyTefap, 'Lat', 'Lon', 'popup', 'Score')) 
for lat, lon, pop, score in tefapPantry:
    fgf.add_child(folium.Marker(location=[lat, lon], popup=pop, icon=folium.Icon(icon='fa-shopping-basket', prefix='fa', color='green')))
    
fgg = folium.FeatureGroup(name="No Services", show=False)
nonePantry = list(zipperPantry(agencyNone, 'Lat', 'Lon', 'popup', 'Score')) 
for lat, lon, pop, score in nonePantry:
    fgg.add_child(folium.Marker(location=[lat, lon], popup=pop, icon=folium.Icon(icon='fa-shopping-basket', prefix='fa', color='red')))
    
fgh = folium.FeatureGroup(name="CSFP TEFAP pantry", show=False)
bothPantry = list(zipperPantry(agencyCsfpTefap, 'Lat', 'Lon', 'popup', 'Score')) 
for lat, lon, pop, score in bothPantry:
    fgh.add_child(folium.Marker(location=[lat, lon], popup=pop, icon=folium.Icon(icon='fa-diamond', prefix='fa', color='gray')))

#add plugins
folium.plugins.MeasureControl(primary_length_unit='miles').add_to(mp)

mp.add_child(fga)
mp.add_child(fgb)
mp.add_child(fgc)
mp.add_child(fgd)
mp.add_child(fge)
mp.add_child(fgf)
mp.add_child(fgg)
mp.add_child(fgh)
mp.add_child(folium.LayerControl())

mp.save('ballerMap.html')
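
The marker loops above pass every coordinate pair straight to folium, which is expected to reject a location containing NaN. If the client or agency data can contain missing or out-of-range coordinates, a filter along these lines could be applied before plotting. This is a sketch under that assumption; `validCoords` and the sample rows are hypothetical.

```python
import pandas as pd


def validCoords(df, latCol, lonCol):
    """Return only the rows whose latitude and longitude are present and within valid ranges."""
    out = df.dropna(subset=[latCol, lonCol])
    inRange = out[latCol].between(-90, 90) & out[lonCol].between(-180, 180)
    return out[inRange]


# made-up rows: one valid, one missing latitude, one out-of-range latitude
sample = pd.DataFrame({
    'Latitude': [35.78, None, 95.0],
    'Longitude': [-78.64, -78.00, -78.00],
})

clean = validCoords(sample, 'Latitude', 'Longitude')
print(len(sample), 'rows in,', len(clean), 'rows kept')
```

For example, `zipperDot(validCoords(csfpClients, 'Latitude', 'Longitude'), 'Latitude', 'Longitude')` would skip any rows that cannot be placed on the map.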