Analyzing 911 Calls and 
[Click here for live site!](https://m-sender.github.io/ServiceLearning)

# **Max Sender and Sam Traylor**

### Data set link: [Calls for service 2021](https://data.nola.gov/Public-Safety-and-Preparedness/Calls-for-Service-2021/3pha-hum9)

This data set is a collection of 9-1-1 calls made in 2021 in the New Orleans area. It contains basic information such as the type of incident, where it occurred, the police district, timing, and more.

## Questions

#### We find this data set very insightful, and it can answer many different questions. One route is to focus the analysis on emergency response. If we choose this route, another dataset that could be of use is [Police Zone Information](https://data.nola.gov/dataset/Police-Zones/fngt-zkj9), which lets us expand to more zone- and area-specific questions. Questions we can answer along this route include:

*   Average response time by incident?

*   Average response time by zone/area?

*   Average response time by incident in specific areas?
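
The three questions above all reduce to grouped means of a response-time column. The following is a minimal sketch of that idea, not the project's final analysis: it uses a small made-up table (the rows are invented for illustration) with the column names `TypeText`, `PoliceDistrict`, `TimeDispatch`, and `TimeArrive` from the calls data set, and the `response_min` column is a name introduced here.

```python
import pandas as pd

# Toy stand-in for the calls table; these rows are invented for illustration.
calls = pd.DataFrame({
    "TypeText": ["AREA CHECK", "AREA CHECK", "FIREWORKS", "FIREWORKS"],
    "PoliceDistrict": [3, 8, 3, 8],
    "TimeDispatch": pd.to_datetime([
        "2021-01-01 00:00:00", "2021-01-01 00:10:00",
        "2021-01-01 01:00:00", "2021-01-01 02:00:00",
    ]),
    "TimeArrive": pd.to_datetime([
        "2021-01-01 00:04:00", "2021-01-01 00:16:00",
        "2021-01-01 01:10:00", "2021-01-01 02:20:00",
    ]),
})

# Response time in minutes (hypothetical helper column).
calls["response_min"] = (
    calls["TimeArrive"] - calls["TimeDispatch"]
).dt.total_seconds() / 60

# Average response time by incident type.
by_incident = calls.groupby("TypeText")["response_min"].mean()

# Average response time by area (here: police district).
by_district = calls.groupby("PoliceDistrict")["response_min"].mean()

# Average response time by incident type within each area,
# reshaped so districts are rows and incident types are columns.
by_both = (
    calls.groupby(["PoliceDistrict", "TypeText"])["response_min"]
    .mean()
    .unstack()
)

print(by_incident, by_district, by_both, sep="\n\n")
```

On the real data, rows with a missing `TimeDispatch` or `TimeArrive` would produce missing response times, which `mean()` skips by default.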

#### Another route is to focus on the crime aspect of the data set. This route concentrates on questions about crime in specific areas rather than on emergency response.

*   Most frequent crimes in specific areas?

*   Based on the value counts of each type of crime in each area, can we generalize patterns such as violent crime happening more in one area, theft in another, etc.?

*   What are the most frequent crimes by time of day within a specific area?
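
The time-of-day question above needs call times grouped into windows first. One possible approach, sketched below on invented rows, is to bin the hour of `TimeCreate` with `pd.cut` and then count incident types per district and window. The window boundaries and labels are assumptions chosen for illustration, not a decision the project has made.

```python
import pandas as pd

# Toy stand-in for the calls table; these rows are invented for illustration.
calls = pd.DataFrame({
    "TypeText": [
        "FIREWORKS", "AREA CHECK", "FIREWORKS",
        "BURGLAR ALARM, SILENT", "FIREWORKS",
    ],
    "PoliceDistrict": [4, 4, 4, 6, 6],
    "TimeCreate": pd.to_datetime(
        [
            "01/01/2021 12:02:13 AM", "01/01/2021 02:15:00 PM",
            "01/01/2021 11:30:00 PM", "01/01/2021 08:45:00 AM",
            "01/01/2021 09:10:00 PM",
        ],
        format="%m/%d/%Y %I:%M:%S %p",
    ),
})

# Hypothetical six-hour windows: [0, 6), [6, 12), [12, 18), [18, 24).
bins = [0, 6, 12, 18, 24]
labels = ["late night", "morning", "afternoon", "evening"]
calls["time_of_day"] = pd.cut(
    calls["TimeCreate"].dt.hour, bins=bins, labels=labels, right=False
)

# Number of calls of each type per district and time window.
counts = calls.groupby(
    ["PoliceDistrict", "time_of_day", "TypeText"], observed=True
).size()

print(counts)
```

Sorting each district/window group by count would then give the most frequent incident types for that slice.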

#### There are more routes we can choose from, and more questions will come to mind upon further analysis of the datasets. A combination of multiple routes will most likely yield the most promising and insightful results.

## Collaboration plan:

We plan to collaborate via meetings over Zoom and store our data in a shared GitHub repository. Any challenges that need to be solved in a pair-programming setting will be handled using Live Share in VS Code.

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [3]:
df_Calls_untidy = pd.read_csv("../data/Calls_for_Service_2021.csv")
df_zones_untidy = pd.read_csv("../data/Police_Zones_data.csv")

In [4]:
df_Calls = df_Calls_untidy.drop(columns=['NOPD_Item','Type','InitialType','MapX','MapY','Disposition','Beat'])
df_Calls["TimeDispatch"] = pd.to_datetime(df_Calls["TimeDispatch"])
df_Calls["TimeArrive"] = pd.to_datetime(df_Calls["TimeArrive"])
df_Calls.head()

Unnamed: 0,TypeText,Priority,InitialTypeText,InitialPriority,TimeCreate,TimeDispatch,TimeArrive,TimeClosed,DispositionText,SelfInitiated,BLOCK_ADDRESS,Zip,PoliceDistrict,Location
0,AREA CHECK,1K,AREA CHECK,1K,01/01/2021 12:01:28 AM,2021-01-01 00:01:28,2021-01-01 00:01:28,01/01/2021 12:40:31 AM,Necessary Action Taken,Y,Vicksburg St & Brooks St,70124.0,3,POINT (-90.10764787 29.99729994)
1,"BURGLAR ALARM, SILENT",1A,"BURGLAR ALARM, SILENT",2E,01/01/2021 12:01:34 AM,2021-01-01 03:39:56,2021-01-01 03:43:58,01/01/2021 03:51:24 AM,Necessary Action Taken,N,036XX Baronne St,70115.0,6,POINT (-90.09455243 29.92938301)
2,AREA CHECK,1K,AREA CHECK,1K,01/01/2021 12:01:47 AM,2021-01-01 00:01:47,2021-01-01 00:01:47,01/01/2021 03:03:53 AM,Necessary Action Taken,Y,Decatur St & Iberville St,,8,POINT (-90.06636912 29.95282347)
3,FIREWORKS,1A,FIREWORKS,2J,01/01/2021 12:02:13 AM,NaT,NaT,01/01/2021 12:17:36 AM,VOID,N,055XX Sutton Pl,70131.0,4,POINT (-89.9964721 29.91905338)
4,DISCHARGING FIREARM,1A,DISCHARGING FIREARM,2D,01/01/2021 12:02:14 AM,2021-01-01 07:08:36,NaT,01/01/2021 07:08:48 AM,Necessary Action Taken,N,Lonely Oak Dr & Selma St,70126.0,7,POINT (-90.00138771 30.01667289)


**Columns Explained:**
* NOPD_Item: Unique identifier for each incident
* Type: Type of incident (ID)
* TypeText: Type of incident (text)
* Priority: Priority of incident (ID)
* InitialType: Initial type of incident (ID)
* InitialTypeText: Initial type of incident (text)
* InitialPriority: Initial priority of incident (ID)
* MapX: X map coordinate of incident
* MapY: Y map coordinate of incident
* TimeCreate: Time the call was created
* TimeDispatch: Time of dispatch
* TimeArrive: Time of arrival
* TimeClosed: Time of closure
* Disposition: Disposition of incident (ID)
* DispositionText: Disposition of incident (text)
* SelfInitiated: Self-initiated (Y or N)
* Beat: 
* BLOCK_ADDRESS: Block address of incident
* Zip: Zip code of incident
* PoliceDistrict: Police district of incident (ID)
* Location: Location of incident (longitude/latitude point)

Each entry in the dataset is a unique call to 911 dispatch with relevant information.
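
The `Location` column stores each point as text of the form `POINT (longitude latitude)`. For area-based questions it may help to split that text into numeric columns. A minimal sketch, using two values copied from the sample rows shown in this notebook and hypothetical output column names `lon` and `lat`:

```python
import pandas as pd

# Two Location values copied from the sample rows of the calls table.
locations = pd.Series(
    [
        "POINT (-90.10764787 29.99729994)",
        "POINT (-90.09455243 29.92938301)",
    ],
    name="Location",
)

# Capture the two numbers inside "POINT (x y)"; x is longitude, y is latitude.
pattern = r"POINT \((?P<lon>-?\d+(?:\.\d+)?) (?P<lat>-?\d+(?:\.\d+)?)\)"
coords = locations.str.extract(pattern).astype(float)

print(coords)
```

Rows whose `Location` does not match the pattern would come out as missing values, so they can be inspected or dropped before any spatial work.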


In [5]:
df_zones_untidy.head(5)
df_zones = df_zones_untidy.set_index("OBJECTID")
df_zones

OBJECTID,the_geom,Zone,District,Shape_Length,Shape_Area
1883,MULTIPOLYGON (((-90.066369220964 29.9528235013...,8C,8,6792.167688,2.012343e+06
1855,MULTIPOLYGON (((-90.096467011995 29.9414493437...,6K,6,8518.748545,4.324270e+06
1860,MULTIPOLYGON (((-90.100235367426 29.9209080675...,6P,6,11703.354343,7.651238e+06
1784,MULTIPOLYGON (((-90.108388478789 29.9349666704...,2I,2,14995.498757,1.318799e+07
1847,MULTIPOLYGON (((-90.065506929482 29.9391674081...,6C,6,13163.392602,5.747707e+06
...,...,...,...,...,...
1795,MULTIPOLYGON (((-90.111520545347 29.9527465406...,2U,2,13072.875152,9.234422e+06
1819,MULTIPOLYGON (((-90.058618074754 29.9913549365...,3X,3,19241.277584,1.299197e+07
1827,MULTIPOLYGON (((-90.040725013307 29.9469852425...,4G,4,14627.126145,1.251777e+07
1882,MULTIPOLYGON (((-90.058944234626 29.9502226780...,8B,8,10947.349649,6.273422e+06


* the_geom: Polygon defining the zone in question
* OBJECTID: Identifier for each zone record (used as the index here)
* Zone: The police zone
* District: The police district that contains the zone
* Shape_Length: The perimeter of the zone
* Shape_Area: The area inside the zone

In [23]:
# Using that same response time column, we could look at the means across different areas (using the police district or zip column of this dataset).
# Using the results of the last question, we could further break down the average response time across incident type column values AND area column values.
# Using zone information and response time, determine "holes" in the zones where response time is higher than the norm or where the area has an increase in crime due to the response times.
# Get the value counts of each different crime for each time of day (we could categorize into several-hour windows like afternoon, evening, night, late night).
# We could use measures of variance like the standard deviation from average response time, which would allow us to identify 'holes' wherever the response time is far higher than average.

# Response time = arrival time minus dispatch time (NaT wherever either timestamp is missing)
df_Calls["responseTime"] = df_Calls["TimeArrive"] - df_Calls["TimeDispatch"]

print("Maximum response time: ", df_Calls.responseTime.max())
print("Mean response time: ", df_Calls.responseTime.mean(), "\n")

# Group by a single column name so each group key is the district ID itself
calls_by_district = df_Calls.groupby("PoliceDistrict")
for district, calls in calls_by_district:
    print("Average response time in District", district, ": ", calls.responseTime.mean())

Maximum response time:  3 days 01:03:27
Mean response time:  0 days 00:07:38.175869587 

Average response time in District 0 :  0 days 00:01:53.575574600
Average response time in District 1 :  0 days 00:05:16.409412358
Average response time in District 2 :  0 days 00:05:30.057592679
Average response time in District 3 :  0 days 00:06:21.208254123
Average response time in District 4 :  0 days 00:04:48.211173576
Average response time in District 5 :  0 days 00:10:29.393924408
Average response time in District 6 :  0 days 00:06:54.651413444
Average response time in District 7 :  0 days 00:17:43.915321735
Average response time in District 8 :  0 days 00:05:38.026938803
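
As a rough first pass at the "holes" idea from the comments in the cell above, the district averages printed here can be compared against their own mean and standard deviation. The sketch below copies the nine district means from the output (truncated to whole seconds) and flags any district more than one standard deviation above the mean of the district averages. The one-standard-deviation cutoff is an arbitrary assumption, and this compares district-level averages only, not the spread of individual calls.

```python
import pandas as pd

# District averages copied from the printed output, truncated to whole seconds.
# The default index 0..8 matches the district numbers in that output.
district_means = (
    pd.to_timedelta(pd.Series([
        "00:01:53", "00:05:16", "00:05:30", "00:06:21", "00:04:48",
        "00:10:29", "00:06:54", "00:17:43", "00:05:38",
    ]))
    .dt.total_seconds()
    .rename_axis("PoliceDistrict")
)

# Assumed cutoff: one standard deviation above the mean of the district averages.
threshold = district_means.mean() + district_means.std()
flagged = district_means[district_means > threshold]

print("Threshold (seconds):", round(threshold, 1))
print("Districts above threshold:", flagged.index.tolist())
```

With these figures only District 7 exceeds the cutoff; District 5, the next slowest, falls below it.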
