## **GROUP 9 EDA**
---
**AUTHORS**
- Jared Coffey (jcoffey7@u.rochester.edu)
- Kyle Chang (kchang27@u.rochester.edu)
- Junhan Yu (jyu55@u.rochester.edu)
---

**TABLE O' CONTENTS**
| Section | Description |
| ----------- | ----------- |
|  **1**  | **Exploratory Data Analysis** |
| 1.1 | Basic Imports |
| 1.2 | Gun Violence Data EDA |
| 1.3 | Food Insecurity Data EDA |
|  **2**  | **Distribution Modeling** |
| 2.1 | Fitting the Distributions |
| 2.2 | Comparing Distributions |
| **3** | **Sourcing** |

### SECTION 1: EXPLORATORY DATA ANALYSIS

#### SECTION 1.1: BASIC IMPORTS

In [14]:
# BASIC DS LIBS
import openpyxl  # Excel engine used by pd.read_excel for .xlsx files
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# FILE IO LIBS
import os
from os.path import join

# GEOSPATIAL LIBS
from osgeo import gdal
import geopandas as gpd
from geopandas import GeoDataFrame
from shapely.geometry import Point

# PLOT LIBS
import plotly.express as px
from plotly.offline import plot
import plotly.graph_objects as go
from plotly.subplots import make_subplots


In [48]:
# IMPORT DATA
hunger_data = pd.read_csv(r'data/foodlink_data.csv')
mealmap_data = pd.read_excel(r'data/MMG2022_2020-2019Data_ToShare.xlsx')
gun_data = pd.read_csv(r'data/Rochester_NY_Shooting_Victims.csv')

#### SECTION 1.2: GUN VIOLENCE DATA EDA

In [38]:
print("GUN VIOLENCE DATASET INFO")
#print(gun_data.info())
gun_data.head()
#print('SOME QUICK SUMMARY STATS (for numeric data):')
#print(gun_data.describe())

GUN VIOLENCE DATASET INFO


|  | X | Y | ID | Case_Number | Address | Occurred_Date | Occurred_Month | Occurred_Year | Crime_Type | Multiple_Shooting | Gender | Race | Ethnicity | Victim_Age | Victim_Age_Band | Latitude | Longitude | ObjectId |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | -77.61089 | 43.184163 | 7b9b3c9f85a442ce6a0aa2433ab8614f | 22-109240 | 442 Remington St | 2022/06/01 04:00:00+00 | 6 | 2022 | Shooting | No | M | WHITE | HISPANIC | 34 | 25-44 | 43.184163 | -77.61089 | 1 |
| 1 | -77.598893 | 43.181793 | 82e3549d4dc8a835dea7130df2dcc851 | 22-106950 | 904 Hudson Ave | 2022/05/29 04:00:00+00 | 5 | 2022 | Homicide | No | M | WHITE | HISPANIC | 42 | 25-44 | 43.181793 | -77.598893 | 2 |
| 2 | -77.630378 | 43.14297 | bb9b98f90af42194e3a5b12a1ba8405f | 22-106780 | 168 Bartlett St | 2022/05/29 04:00:00+00 | 5 | 2022 | Shooting | No | M | BLACK | NON HISPANIC | 60 | 45-Older | 43.14297 | -77.630378 | 3 |
| 3 | -77.580628 | 43.167216 | dd0ab9214578b6e0d35467f6bb7b0fc3 | 22-106380 | 720 N Goodman St | 2022/05/28 04:00:00+00 | 5 | 2022 | Shooting | No | M | BLACK | NON HISPANIC | 28 | 25-44 | 43.167216 | -77.580628 | 4 |
| 4 | -77.641972 | 43.162394 | 5aac53d1517de031545f178e919c35b8 | 22-104298 | 138 Murray St | 2022/05/26 04:00:00+00 | 5 | 2022 | Shooting | No | M | WHITE | HISPANIC | 15 | 15-24 | 43.162394 | -77.641972 | 5 |


In [29]:
# PRELIM QUESTIONS
print("MODAL DATA:")
print(gun_data['Occurred_Date'].mode(), '\n')
print(gun_data['Occurred_Month'].mode(),'\n')
print(gun_data['Occurred_Year'].mode(),'\n')
print(gun_data['Victim_Age_Band'].mode(),'\n')
print(gun_data['Gender'].mode(),'\n')
print(gun_data['Race'].mode(),'\n')
print(gun_data['Address'].mode(),'\n')

MODAL DATA:
0    2020/09/19 04:00:00+00
Name: Occurred_Date, dtype: object 

0    7
Name: Occurred_Month, dtype: int64 

0    2021
Name: Occurred_Year, dtype: int64 

0    15-24
Name: Victim_Age_Band, dtype: object 

0    M
Name: Gender, dtype: object 

0    BLACK
Name: Race, dtype: object 

0    278 Pennsylvania Ave
Name: Address, dtype: object 
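
`Series.mode()` reports only the single most frequent value, so it hides how close the runner-up is. A minimal sketch of getting the full frequency table with `value_counts()` follows. It assumes the five rows shown in the `gun_data.head()` output above as stand-in data; the name `sample` is hypothetical, and on the full dataset `gun_data` would be used instead.

```python
import pandas as pd

# Stand-in data: the five rows from gun_data.head() above (illustration only)
sample = pd.DataFrame({
    'Crime_Type': ['Shooting', 'Homicide', 'Shooting', 'Shooting', 'Shooting'],
    'Race': ['WHITE', 'WHITE', 'BLACK', 'BLACK', 'WHITE'],
    'Victim_Age_Band': ['25-44', '25-44', '45-Older', '25-44', '15-24'],
})

# Absolute counts per category, most frequent first
crime_counts = sample['Crime_Type'].value_counts()

# Share of rows per category (fractions that sum to 1)
age_band_share = sample['Victim_Age_Band'].value_counts(normalize=True)

print(crime_counts)
print(age_band_share)
```

The first entry of each result matches what `mode()` returns, but the remaining entries show how the other categories compare.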



In [47]:
# PLOTTING GUN VIOLENCE INCIDENTS
# Columns are referenced by name; plotly express resolves them against data_frame
gun_race = px.scatter_mapbox(
    data_frame=gun_data,
    lat='Latitude',
    lon='Longitude',
    color='Crime_Type',    # TODO: replace with our metric
    text='Crime_Type',     # TODO: replace with our metric
    hover_name='Case_Number',
    hover_data=['Occurred_Date', 'Address'],
    zoom=10,
    mapbox_style='open-street-map',
    title='Gun Violence Incidents by Crime Type',
    height=500,
    width=700,
)


gun_race.show()

#### SECTION 1.3: FOOD INSECURITY DATA EDA (FOODLINK)

In [39]:
# NAMING COLUMNS
hunger_data.columns = ['Zip Code', 'Latitude', 'Longitude', 'Food Insecurity']

# FIND ALL ZIPCODES IN THE ROCHESTER AREA
roc_zip = list(range(14604, 14624))
roc_zip.append(14626)
roc_zip.append(14627)
roc_zip.append(14642)

# REMAKE DATAFRAME INTO ONLY ROC DATA
hunger_data = hunger_data.loc[hunger_data['Zip Code'].isin(roc_zip)]

print("FOODLINK DATASET INFO")
hunger_data.info()  # info() prints its report directly and returns None

FOODLINK DATASET INFO
<class 'pandas.core.frame.DataFrame'>
Int64Index: 20 entries, 124 to 145
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Zip Code         20 non-null     int64  
 1   Latitude         20 non-null     float64
 2   Longitude        20 non-null     float64
 3   Food Insecurity  20 non-null     float64
dtypes: float64(3), int64(1)
memory usage: 800.0 bytes
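
`roc_zip` holds 23 ZIP codes (20 from the range plus 3 appended), while the filtered frame has only 20 rows, so some requested ZIP codes have no FoodLink row. A hedged sketch of listing them with a set difference follows. `present_zips` is a hypothetical placeholder standing in for `hunger_data['Zip Code']`; its values are made up for illustration and are not real FoodLink data.

```python
import pandas as pd

# Same ZIP list as built in the cell above
roc_zip = list(range(14604, 14624)) + [14626, 14627, 14642]

# Hypothetical placeholder for hunger_data['Zip Code'] (not real FoodLink values)
present_zips = pd.Series([14604, 14605, 14606, 14607], name='Zip Code')

# ZIP codes requested by the filter but absent from the data
missing_zips = sorted(set(roc_zip) - set(present_zips.tolist()))

print(f"{len(roc_zip)} requested, {present_zips.nunique()} present, "
      f"{len(missing_zips)} missing")
```

On the real data, replacing `present_zips` with `hunger_data['Zip Code']` would show which ZIP codes the map in the next cell cannot display.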


In [41]:
# PLOTTING FOOD INSECURITY RATE BY ZIP CODE
hunger_rate = px.scatter_mapbox(
    data_frame=hunger_data,
    lat='Latitude',
    lon='Longitude',
    color='Food Insecurity',
    size='Food Insecurity',
    hover_data=['Food Insecurity', 'Zip Code'],
    zoom=10,
    mapbox_style='open-street-map',
    title='Rochester Zip Codes by Food Insecurity Rate',
    height=500,
    width=700,
    color_continuous_scale='RdYlGn_r',
)
hunger_rate.show()

### SECTION 2: DISTRIBUTION MODELING

Reference for this section (fitting distributions with Python's `fitter` library): https://medium.com/the-researchers-guide/finding-the-best-distribution-that-fits-your-data-using-pythons-fitter-library-319a5a0972e9
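
The article linked above covers the `fitter` library. As a dependency-light illustration of the same fit-then-compare idea, the sketch below instead fits two candidate distributions (normal and exponential) by closed-form maximum likelihood with NumPy and ranks them by AIC. It is a toy under stated assumptions: the five ages come from the `gun_data.head()` output in Section 1.2, the candidate distributions and the AIC criterion are illustrative choices rather than this notebook's method, and the helper names are hypothetical.

```python
import numpy as np

# Victim_Age values from the five gun_data.head() rows (illustration only)
ages = np.array([34, 42, 60, 28, 15], dtype=float)

def normal_loglik(x):
    """Log-likelihood of x under a normal fitted by MLE (mean, ddof=0 std)."""
    mu, sigma = x.mean(), x.std()  # np.std defaults to ddof=0, the MLE
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu) ** 2 / (2 * sigma**2))

def expon_loglik(x):
    """Log-likelihood of x under an exponential fitted by MLE (rate = 1/mean)."""
    rate = 1.0 / x.mean()
    return np.sum(np.log(rate) - rate * x)

# AIC = 2k - 2 * log-likelihood, where k is the number of fitted parameters
aic = {
    'normal': 2 * 2 - 2 * normal_loglik(ages),
    'exponential': 2 * 1 - 2 * expon_loglik(ages),
}

best = min(aic, key=aic.get)  # lower AIC indicates the better trade-off
print(aic, best)
```

Five observations are far too few for a meaningful comparison; on the full dataset the input would be something like `gun_data['Victim_Age'].dropna()`.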

### SECTION 3: SOURCING
---
[1] Dwyer, M. (2018), “New food insecurity data show level of need in Rochester, other communities,” Foodlink Inc. Available at https://foodlinkny.org/new-food-insecurity-data-show-level-of-need-in-rochester-other-communities/.

[2] “Overall (all ages) Hunger & Poverty in the United States | Map the Meal Gap” (n.d.). Available at https://map.feedingamerica.org.

[3] “Rochester NY Shooting Victims | Rochester, NY Police Department Open Data Portal” (n.d.). Available at https://data-rpdny.opendata.arcgis.com/datasets/rochester-ny-shooting-victims/explore?location=43.180005%2C-77.596549%2C5.00.
