# COMM2550 Spring 2023 
# **Crime and Light:**
## Data: Street Lights, Crime, Philly Districts (OpenDataPhilly)
**Annabel Sumardi and Michael Li**

---
## Making interpretations on the presence of street lights and crime in the Philadelphia datasets 
- Looking into whether light affects amount of crime

### The Data 
- Data is available on opendataphilly.org
    - *Street Light data*: https://opendataphilly.org/datasets/street-poles/ 
    - *Crime data*: https://opendataphilly.org/datasets/crime-incidents/
    - *Philly Police District data*: https://opendataphilly.org/datasets/police-districts/
    - *Philly Sunset data*: provided by Prof. O'Donnell
    - *Philly Neighborhood data*: https://github.com/blackmad/neighborhoods/blob/master/gn-philadelphia.geojson

* Related articles
    * https://urbanlabs.uchicago.edu/projects/crime-lights-study
    * https://popcenter.asu.edu/content/improving-street-lighting-reduce-crime-residential-areas-page-2 

 
**Research Question**: Does the presence of street light poles in a neighborhood have a correlation with crime rates in that area? What inferences or further questions can we make based on a correlational relationship?

**Analysis:**
- What parts of Philly are better lit? Why?
- What parts of Philly have more crime?
- Does the amount of light in certain Philly districts affect the amount of crime in that district?

**Hypothesis**: We hypothesize that the presence of light poles in a neighborhood has a negative correlation with crime rates. If we add more light poles into a neighborhood, the crime rates will go down. 

**Possible conclusions**: If the results show a negative correlation between light pole presence and crime rates, it can be concluded that increasing the number of light poles in a neighborhood may be an effective strategy for reducing crime. If there is no significant correlation between light pole presence and crime rates, it may be necessary to investigate other factors that may be contributing to crime in the area.
*Similarly, crime rates can be correlated to other data sets, so note there could be confounding variables even if we find a correlation.*

### Our Data Analysis, Explained and Interpreted
In the data_analysis notebook, we analyzed the correlation between brightness and crime. To do so, we used the Open Data Philly datasets on street pole locations and crime incidents.

**0. Before beginning our data analysis,** 
- We imported the Philly police district data in order to add district borders onto our map. This will help better visualize the data. 
- We later decided to look at the data with more granularity, so we eventually also plotted it by neighborhood, which divides the city into smaller sectors than the police districts do.

**1. First, we must aggregate the street pole data onto a map of brightness.** 
- Because there is a very large number of street lights, the map appears fully covered when the density of the lights is not taken into account. Thus, we decided to heatmap the street light presence.
    - This way, our data illustrates where street lights are most present through darker green shades, an example of the expressiveness principle. Because we associate darker colors with higher density, this choice of visualization matches what a viewer would most easily interpret.
    - *In the hexplot,* the districts surrounding Center City, spanning from the Chinatown area to the Fishtown area, seemed to have the highest number of street lights. This makes sense, as the downtown area in Philly is more populated and more visited, so street poles are more necessary there.
    - *In the heatmap,* the districts with the most street lights actually appear to be a bit further out than Center City. This could be because the districts are all different sizes, so the larger districts will naturally have more street lights since there is more area to cover.
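
One way to account for the different district sizes noted above is to compare poles per unit area instead of raw counts. The following is a minimal sketch in plain Python, not the code from our notebook: the function name `poles_per_square_km`, the dictionary layout, and the example numbers are all assumptions made for illustration.

```python
def poles_per_square_km(pole_counts, areas_sq_km):
    """Return poles per square kilometre for each district id.

    pole_counts: dict mapping district id -> number of street poles.
    areas_sq_km: dict mapping district id -> district area in square km.
    """
    density = {}
    for district, count in pole_counts.items():
        area = areas_sq_km[district]
        if area <= 0:
            raise ValueError(f"District {district!r} has a non-positive area")
        density[district] = count / area
    return density


# Made-up illustrative numbers, not values from the Philadelphia data.
example_counts = {"A": 1200, "B": 3000}
example_areas = {"A": 4.0, "B": 20.0}
example_density = poles_per_square_km(example_counts, example_areas)
# District B has more poles in total, but district A is more densely lit.
```

Plotting a density column like this instead of `pole_count` would make large and small districts directly comparable.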

<font color="ForestGreen"> **Used the following codeblock to plot out street pole heatmap** </font>

> <font color="ForestGreen"> fig, ax = plt.subplots(figsize=(10,10))  
> district_pole_gdf.plot(column='pole_count', cmap='BuGn', linewidth=1, ax=ax, edgecolor='black', legend=True)  
> ax.set_xlabel('Longitude')  
> ax.set_ylabel('Latitude')  
> ax.set_title('Street Poles Heatmap by District')  
> dplot=districts_gdf.apply(show_district_nums, axis=1)  
</font>

**2. Second, we heatmapped the Crime presence by district.** 
- We wanted to have a way to eventually aggregate our light and crime data, so sorting by district allowed us to do so. 
    - We first mapped the crimes onto the same map of Philly as the street lights, using a heatmap to best illustrate where crimes occurred most often.
    - We noted that the districts with the highest crime did not line up with the districts with the most street lights. If we find a correlation, this observation suggests it would be negative rather than positive.
    - *In the heatmap,* crime appeared to be highest in a couple of districts, one of which bordered the UPenn and Drexel area.
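
Because the heatmap counts night crimes, each incident first has to be labeled as day or night using the sunset data. The sketch below shows one possible way to do that in plain Python. It is an assumption about how such a filter could look, not the notebook's actual filter: the function `is_night_crime`, the per-date sunset dictionary, the fixed `MORNING_CUTOFF`, and the example sunset time are all hypothetical.

```python
from datetime import datetime, time

# Assumption: incidents before this morning cutoff still count as "night".
MORNING_CUTOFF = time(6, 0)


def is_night_crime(occurred_at, sunset_by_date):
    """Return True if the incident happened after sunset or before the cutoff.

    occurred_at: datetime of the incident.
    sunset_by_date: dict mapping a date to that day's sunset time.
    """
    if occurred_at.time() < MORNING_CUTOFF:
        return True
    sunset = sunset_by_date[occurred_at.date()]
    return occurred_at.time() >= sunset


# Hypothetical sunset time, used only to exercise the function.
example_sunsets = {datetime(2023, 1, 15).date(): time(17, 0)}
evening = datetime(2023, 1, 15, 21, 30)
afternoon = datetime(2023, 1, 15, 14, 0)
early_morning = datetime(2023, 1, 15, 2, 0)
```

Here `evening` and `early_morning` would be kept as night crimes, while `afternoon` would be dropped.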

<font color="ForestGreen"> **Used the following codeblock to plot out night crimes on a heatmap** </font>

> <font color="ForestGreen"> fig, ax = plt.subplots(figsize=(10,10))  
district_crime_gdf.plot(column='all_night_crimes', cmap='BuPu', linewidth=1, ax=ax, edgecolor='black', legend=True)  
ax.set_xlabel('Longitude')  
ax.set_ylabel('Latitude')  
ax.set_title('Number of Night Crimes per District', fontsize=16)  
districts_gdf.apply(show_district_nums, axis=1) </font>

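**3. Finally, we aggregated the data into a single table,** joining the per-district street pole counts and night crime counts into a multi-variable data table. Using the `corr()` method, we found a correlation of `0.53` between street lights and night crime. Note that this coefficient is positive: in raw counts, districts with more street poles also tend to record more night crimes, which is likely driven in part by district size, since larger districts have more of both. Our heatmaps, on the other hand, showed that the districts with the most crime were not the districts with the most lights, so we still lean toward the idea that, once district size is accounted for, more light could be associated with less crime, as we hypothesized. The raw coefficient alone does not establish this.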
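
For readers who want to see what the coefficient measures, here is a minimal sketch of the Pearson correlation, which is the default method of pandas `DataFrame.corr()`. The function name `pearson_r` is an assumption made for illustration. The point of the sketch is that the sign of the result carries the direction of the relationship: positive when the two counts rise together, negative when one falls as the other rises.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two lists of the same length with 2+ values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


# Toy inputs, not project data: one rising pair and one falling pair.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

With these toy inputs, `rising` is positive and `falling` is negative, which is the distinction that matters when reading the `0.53` above.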

<font color="ForestGreen"> **Used the following codeblock to find correlations** </font>

> <font color="ForestGreen"> combined_df[['pole_count', 'all_night_crimes']].corr() </font>

**4. At this point, we wanted to dig a little deeper into our data,** since we had already set up the baseline steps for the coding and heatmapping. We asked: could light affect different types of crime to different degrees?
- First, we created a filter which split the crime data into severe crimes, which we defined as 'Aggravated Assault No Firearm', 'Aggravated Assault Firearm', 'Rape', 'Other Sex Offenses (Not Commercialized)', 'Arson', and 'Offenses Against Family and Children'.
- Second, we created a filter which split the crime data into petty crimes, which we defined as 'Thefts', 'Theft from Vehicle', 'Robbery No Firearm', 'Burglary Non-Residential', 'Vandalism/Criminal Mischief', 'Disorderly Conduct', 'DRIVING UNDER THE INFLUENCE', 'Prostitution and Commercialized Vice', 'Public Drunkenness', 'Liquor Law Violations', and 'Gambling Violations'.
- Using these two filters, we followed the same steps we used in heatmapping all night crimes to create two new heatmaps, which illustrated where severe crimes and petty crimes were most present.
- Then, we followed the same correlation steps to compute the correlations between lights and severe crime and between lights and petty crime. Light and severe crime had a correlation of `.36`, and light and petty crime had a correlation of `.50`.
- The light and severe crime correlation was weaker than the light and petty crime correlation. Both coefficients are positive, so on their own they show how closely crime counts track pole counts across districts rather than a preventive effect. Still, we felt the difference made sense: petty crimes like pickpocketing may be more sensitive to lighting, since being visible makes it easier to spot and identify the offender.
- The weaker correlation between light and severe crime might be because a street lamp would probably not deter someone from committing a violent crime.
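
The two filters described above can be sketched as set membership checks. This is a minimal illustration rather than the notebook's code: the category strings are the ones listed in this section, while the names `SEVERE_CRIMES`, `PETTY_CRIMES`, and `classify_crime` are assumptions.

```python
# Category names as listed in this README.
SEVERE_CRIMES = {
    'Aggravated Assault No Firearm',
    'Aggravated Assault Firearm',
    'Rape',
    'Other Sex Offenses (Not Commercialized)',
    'Arson',
    'Offenses Against Family and Children',
}

PETTY_CRIMES = {
    'Thefts',
    'Theft from Vehicle',
    'Robbery No Firearm',
    'Burglary Non-Residential',
    'Vandalism/Criminal Mischief',
    'Disorderly Conduct',
    'DRIVING UNDER THE INFLUENCE',
    'Prostitution and Commercialized Vice',
    'Public Drunkenness',
    'Liquor Law Violations',
    'Gambling Violations',
}


def classify_crime(category):
    """Label a crime category as 'severe', 'petty', or 'other'."""
    if category in SEVERE_CRIMES:
        return 'severe'
    if category in PETTY_CRIMES:
        return 'petty'
    # Anything not in either list is left out of both heatmaps.
    return 'other'
```

Keeping the two lists as named sets makes it easy to check that no category was accidentally placed in both groups.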

**5. After our presentation, we decided the next step would be to increase the granularity of our analysis,** and we decided to accomplish this goal by mapping by neighborhood. 
- We found that Frankford has the highest number of severe crimes, with a count of `44`.
    - We then decided to zoom into this neighborhood to visualize how Frankford's severe crime and street light data mapped out.
        - Frankford is a neighborhood located in the Northeast section of Philadelphia that has experienced some of the highest rates of severe crime in the city. According to crime statistics from the Philadelphia Police Department, Frankford consistently ranks as one of the top neighborhoods in the city for incidents of assault and other severe crimes.
        - The visualization was intriguing, but it did not reveal much of a relationship between street poles and severe crime. This may be because we were looking at severe crimes, which we had already found to have a weaker correlation with street lights.
- Similarly, Upper Kensington has the highest number of petty crimes, with a count of `97`.
    - We zoomed into Upper Kensington with a visualization similar to the one for Frankford.
        - Upper Kensington is a neighborhood located in the Lower Northeast section of Philadelphia that has experienced some of the highest rates of petty crime in the city. Petty crimes are typically nonviolent offenses that involve theft or property damage.
        - According to crime statistics from the Philadelphia Police Department, Upper Kensington consistently ranks as one of the top neighborhoods in the city for incidents of petty crime. These types of crimes can have a significant impact on the quality of life for residents, as they create a sense of insecurity and vulnerability. 
- Both of these neighborhoods are located along the SEPTA Market-Frankford line, which could contribute to their higher instances of crime. This could be a confounding variable that makes a correlation appear where there may not be one. It is a reminder that any correlation we note does not imply causation.
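
Finding the neighborhood with the most crimes of a given type comes down to counting one neighborhood label per incident. The sketch below assumes each incident has already been tagged with a neighborhood name. The function `top_neighborhood` and the example labels are hypothetical and are not project data.

```python
from collections import Counter


def top_neighborhood(neighborhood_labels):
    """Return (name, count) for the most frequent neighborhood label.

    neighborhood_labels: iterable with one neighborhood name per incident.
    If several neighborhoods tie, the one seen first is returned.
    """
    counts = Counter(neighborhood_labels)
    if not counts:
        raise ValueError("No incidents to count")
    return counts.most_common(1)[0]


# Placeholder labels used only to show the call.
example_labels = ["North", "South", "North", "East", "North", "South"]
example_top = top_neighborhood(example_labels)
```

Running the same function once on the severe crime labels and once on the petty crime labels would give the two neighborhoods to zoom into.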

<font color="ForestGreen"> **Used the following codeblock to map Upper Kensington** </font>

> <font color="ForestGreen"> uk_pole_density = len(uk_sp_gdf) / uk_gdf.geometry.iloc[0].area  
    uk_crime_density = len(uk_petty_gdf) / uk_gdf.geometry.iloc[0].area  
    uk_pole_normalized_count = len(uk_sp_gdf) / uk_gdf.geometry.iloc[0].area * .000001  
    uk_crime_normalized_count = len(uk_petty_gdf) / uk_gdf.geometry.iloc[0].area * .0005  
    fig, ax = plt.subplots(figsize=(10, 10))  
    uk_gdf.plot(ax=ax, color='lightgrey', edgecolor='black', label='Upper Kensington')  
    uk_sp_gdf.plot(ax=ax, color='red', markersize=uk_pole_normalized_count, label='Street Poles')  
    uk_petty_gdf.plot(ax=ax, color='blue', markersize=uk_crime_normalized_count, label='Petty Crimes')  
    ax.set_title('Street Poles and Petty Crimes in Upper Kensington')      
    ax.legend() 
    plt.show()   </font>

--- 

If further analysis confirms a link between lighting and crime, this knowledge could contribute to city-wide solutions for preventing crime and decreasing overall crime rates. By increasing the presence of street lights, districts could potentially be made safer. One possible reason is that it is easier to commit a crime when one cannot be seen, so adding lights could counteract that advantage. Another is that [citizens feel safer when walking under light as opposed to down a dark alley](https://citymonitor.ai/infrastructure/architecture-design/what-makes-people-feel-safe-night-science-street-lights-4251). Whatever the reasoning, if increasing the number of lights helps decrease crime, then it would be a good idea to install more lights, especially in crime-heavy areas and especially if the change is within Philly's budget.

Overall, analyzing the correlation between brightness and crime using Open Data Philly datasets can provide valuable insights into how light levels can impact crime rates in a city. By using data to inform policy decisions and allocate resources more effectively, we can work towards creating a safer and more secure city for everyone. This was a really fun project for us to work on in order to test our Python skills on real data. 