# How to stay safe in New York?

This project looks at the complaints reported to the police in New York over one year (from July 2018 until June 2019).
The goal is to analyze the data to find out how to stay safe in a big city like New York.

## 1. Questions

The questions I want to answer are:

- In which district are there the most complaints?

- What are the characteristics of the people most at risk?

- At what time are there the most complaints?

- Is there any correlation between poverty and the number of complaints made?

## 2. Dataset used

The dataset I used comes from the Open Data website of the New York City Police Department (NYPD).

It is named "Incident Level Complaint Data - current year through most recent full quarter" and was downloaded from the following webpage: https://www1.nyc.gov/site/nypd/stats/crime-statistics/citywide-crime-stats.page

The data spans one year from July 2018 until June 2019.


For the last question, I supplemented the dataset with information about poverty by ethnic background (from https://statisticalatlas.com/place/New-York/New-York/Race-and-Ethnicity#figure/race-and-ethnicity) and with data on poverty as a function of ethnic background in New York City (from https://www1.nyc.gov/assets/opportunity/pdf/NYCPov-Brochure-2018-Digital.pdf).

## 3. Data analysis

After cleaning, the data was analyzed to answer each of the questions above.

### 3.1 In which district are there the most complaints?

Counting the number of complaints per district gives the following results:



![complaints_district](images/complaints_districts.png "Complaints by district")


Brooklyn, Manhattan and the Bronx have the most complaints, followed by Queens and Staten Island.
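The per-district count can be sketched in a few lines of standard-library Python. This is only an illustration, not necessarily how the chart was produced: the column name `BORO_NM` is an assumption about the downloaded CSV, and the sample rows are made up.

```python
from collections import Counter


def complaints_by_district(rows, column="BORO_NM"):
    """Count complaints per district, most frequent first.

    `rows` is an iterable of dicts (for example from csv.DictReader).
    The column name "BORO_NM" is an assumption about the CSV layout.
    """
    counts = Counter(row[column] for row in rows if row.get(column))
    return counts.most_common()


# Made-up rows for illustration only, not taken from the dataset.
sample_rows = [
    {"BORO_NM": "BROOKLYN"},
    {"BORO_NM": "MANHATTAN"},
    {"BORO_NM": "BROOKLYN"},
    {"BORO_NM": ""},  # missing district, skipped
]

ranking = complaints_by_district(sample_rows)
print(ranking)
```

Rows with an empty district are skipped here so that they do not show up as a separate category in the ranking.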

Judging by the number of complaints, the safest district to stay in New York is Staten Island. To confirm this, we can plot a heatmap showing the density of complaints in New York, as shown in the next image:



![complaints_heatmap](images/complaints_heatmap.png "Complaints by district - heatmap")
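A heatmap like this one boils down to counting complaints per small map cell. The sketch below shows that binning step only, assuming each complaint comes with a latitude and a longitude; the `cell_size` of 0.01 degrees and the sample points are made up for illustration.

```python
import math
from collections import Counter


def density_grid(points, cell_size=0.01):
    """Count points per grid cell.

    `points` is an iterable of (latitude, longitude) pairs.
    Each cell is `cell_size` degrees wide and is identified by
    the integer pair (floor(lat / cell_size), floor(lon / cell_size)).
    """
    grid = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        grid[cell] += 1
    return grid


# Made-up coordinates for illustration only.
sample_points = [
    (40.7128, -74.0060),
    (40.7130, -74.0055),  # falls in the same cell as the first point
    (40.5795, -74.1502),
]

grid = density_grid(sample_points)
print(grid.most_common(1))
```

The cell counts can then be handed to any plotting tool that colors cells by value.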

### 3.2 What are the characteristics of people the most at risk?

The complaints have been grouped by the sex of the victims and the result is shown below:

![complaints_victim_sex](images/complaints_victim_sex.png "Complaints by victim sex")

We can see that women are more often victims than men. What about the ethnic background of the victims? The result is shown in the graph below:


![complaints_victim_ethnic](images/complaints_victim_ethnic.png "Complaints by victim ethnic background")
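To compare groups of different sizes it helps to look at shares rather than raw counts. Here is a minimal sketch of that calculation; the values `"F"` and `"M"` are made-up sample data, and the same function would work for the ethnic background column.

```python
from collections import Counter


def shares(values):
    """Return each category's share of the total, in percent."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


# Made-up values for illustration only.
sample_sex = ["F", "M", "F", "F"]

print(shares(sample_sex))
```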

### 3.3 At what time are there more complaints?

We can look at the hours of the day with the largest number of complaints. The result is shown below:


![complaints_hour](images/complaints_hour.png "Complaints by hour")

We can see that most of the complaints happen between 15:00 and 18:00.
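Grouping by hour only needs the hour part of each complaint time. The sketch below assumes times are stored as `HH:MM:SS` strings, which is an assumption about the CSV; the sample times are made up.

```python
from collections import Counter
from datetime import datetime


def complaints_by_hour(times, fmt="%H:%M:%S"):
    """Return a list of 24 counts, one per hour of the day.

    `fmt` is the assumed format of the time strings.
    """
    counts = Counter(datetime.strptime(t, fmt).hour for t in times)
    return [counts.get(hour, 0) for hour in range(24)]


# Made-up times for illustration only.
sample_times = ["15:30:00", "15:45:00", "17:05:00", "02:10:00"]

by_hour = complaints_by_hour(sample_times)
busiest_hour = max(range(24), key=lambda hour: by_hour[hour])
print(busiest_hour)
```

Returning all 24 hours, including empty ones, keeps the x axis of a bar chart complete.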

We can also look at which days have the largest number of complaints. The result is shown below:



![complaints_day](images/complaints_day.png "Complaints by day")

The day of the week has no particular influence on the number of complaints. Friday is the day with the most complaints, but no day stands out with a big difference in the number of complaints.
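The weekday breakdown can be sketched the same way, this time from the complaint date. The date format `MM/DD/YYYY` is an assumption about the CSV, and the sample dates are made up.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def complaints_by_weekday(dates, fmt="%m/%d/%Y"):
    """Count complaints per day of the week, Monday first.

    `fmt` is the assumed format of the date strings.
    """
    counts = Counter(datetime.strptime(d, fmt).weekday() for d in dates)
    return {name: counts.get(i, 0) for i, name in enumerate(WEEKDAYS)}


# Made-up dates for illustration only.
sample_dates = ["07/06/2018", "07/13/2018", "07/09/2018"]

by_weekday = complaints_by_weekday(sample_dates)
print(by_weekday)
```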

### 3.4 Is there any correlation between poverty and the number of complaints made?

One of the questions I wanted to answer was whether poverty has an influence on the number of complaints.

To do so, I supplemented the dataset with information about poverty by ethnic background and with data on poverty as a function of ethnic background in New York City.


![complaints_poverty](images/complaints_poverty.png "Complaints vs poverty")


This graph shows, on the x axis, the number of people in poverty per 10,000 people for each ethnic background. On the y axis, it shows the number of suspects per 10,000 complaints for each ethnic background.

There seems to be a correlation if you only look at the Asian, White and Hispanic ethnic backgrounds. But this only holds if the Afro-American point is excluded. Therefore, taking all four groups together, there is no clear correlation between poverty and the number of suspects in complaints.

This makes sense, as many factors can be involved in people breaking the law. One might think of factors such as education, the environment where people live, gang culture, or personal reasons (psychological problems, for example).
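The effect of a single point on a correlation can be checked with a Pearson coefficient. The sketch below uses made-up numbers, not the values from the graph: three points lie on a line and a fourth does not, which is enough to bring the coefficient close to zero. The helper `per_10k` shows how a rate per 10,000 is obtained.

```python
import math


def per_10k(part, whole):
    """Express `part` as a rate per 10,000 of `whole`."""
    return 10_000 * part / whole


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers for illustration only, not the project's data.
poverty = [1, 2, 3, 4]
aligned = [2, 4, 6, 8]       # all four points on one line
with_outlier = [2, 4, 6, 1]  # the fourth point breaks the trend

r_aligned = pearson(poverty, aligned)
r_outlier = pearson(poverty, with_outlier)
print(r_aligned, r_outlier)
```

With only four data points, any such coefficient should be read with caution.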

### 3.5 Are people with an Afro-American background more likely to be victims?

To answer this question, we calculate the ratio of Afro-American people who are victims (compared to the total Afro-American population of New York) and compare it to the same ratio for the population that is not Afro-American.

The Afro-American and non-Afro-American populations of New York are as follows. <br>
Note: This data was obtained from the following census information for 2018:
https://www1.nyc.gov/site/planning/planning-level/nyc-population/current-future-populations.page

Total New York population: 8,398,748 <br>
Afro-American population: 2,049,295 <br>
Non-Afro-American population: 6,349,453 <br>


The number of victims (between July 2018 and June 2019) is as follows: <br>
Total victims: 155,464 <br>
Afro-American victims: 56,314 <br>
Non-Afro-American victims: 99,150 <br>


Therefore, the ratios for Afro-Americans and non-Afro-Americans are as follows:<br>
Afro-American victims ratio = 2.75% <br>
Non-Afro-American victims ratio = 1.56% <br>

The difference is quite large: during that time, the proportion of Afro-Americans who were victims was larger than for the rest of the population.<br>
This conclusion only holds for the period between July 2018 and June 2019. To see whether this is a trend, we would need to repeat the calculation for other years.
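The two ratios can be reproduced directly from the population and victim figures given in this section. The short script below does that and also divides one ratio by the other to show how much larger the first one is; the variable names are my own.

```python
# Figures taken from the section above.
total_population = 8_398_748
afro_american_population = 2_049_295
non_afro_american_population = total_population - afro_american_population

afro_american_victims = 56_314
non_afro_american_victims = 99_150

# Share of each population that appears as a victim, in percent.
afro_ratio = 100 * afro_american_victims / afro_american_population
non_afro_ratio = 100 * non_afro_american_victims / non_afro_american_population

# How many times larger the first ratio is than the second.
relative = afro_ratio / non_afro_ratio

print(f"Afro-American victims ratio: {afro_ratio:.2f}%")
print(f"Non-Afro-American victims ratio: {non_afro_ratio:.2f}%")
print(f"Ratio between the two: {relative:.2f}")
```

This prints 2.75%, 1.56% and a factor of about 1.76.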

## Literature

Below are some articles related to crime analysis in New York or Chicago:

https://nycdatascience.com/blog/student-works/crime-and-demographics-in-new-york-city/

https://www.datasciencecentral.com/profiles/blogs/7-sins-in-nyc

https://www.kaggle.com/djonafegnem/chicago-crime-data-analysis