---
format: 
  html:
    toc: true
    page-layout: full
execute:
    warning: false
    echo: true
    eval: true
---

## **Distribution of Assaults**

***

The **histogram** and **density** plot of assault counts below show a pronounced right skew: most areas record few assaults, while a small number report far higher counts. In other words, most neighborhoods experience relatively few assaults, but certain locations are disproportionately affected.

```{python}
#| code-fold: true

import matplotlib.pyplot as plt
import seaborn as sns

# Drop only the rows whose assault count is missing; a bare dropna()
# would also discard rows with gaps in unrelated columns.
Assault21_net = Assault21_net.dropna(subset=['countAssault'])

plt.figure(figsize=(10, 6))
sns.histplot(Assault21_net['countAssault'], bins=30, color="#777181", edgecolor="white")
plt.title("Distribution of Assaults", fontsize=18, fontweight='bold')
plt.suptitle("Chicago, IL -- 2021", fontsize=12, y=0.87)
plt.xlabel("Assault Incidents", fontsize=10)
plt.ylabel("Count", fontsize=10)
plt.xticks(rotation=0, ha='center', fontsize=8)
plt.yticks(fontsize=8)
plt.grid(axis='y', linestyle='-', alpha=0.1)

# Minimal styling: white background, no top/right spines, grey axes.
ax = plt.gca()
ax.set_facecolor('white')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_color('grey')
ax.spines['bottom'].set_color('grey')
plt.show()
```


![](../images/bar2.jpeg){width=75%}

***

```{python}
#| code-fold: true

plt.figure(figsize=(10, 6))
sns.kdeplot(data=Assault21_net, x='countAssault', fill=True, color='#777181', alpha=0.5, linewidth=0)
plt.title("Density Plot of Assaults")
plt.suptitle("Chicago, IL -- 2021", fontsize=12, y=0.87)
plt.xlabel("Assault Incidents")
plt.ylabel("Density")
plt.xticks(rotation=0, ha='center', fontsize=8)
plt.yticks(fontsize=8)

# Minimal styling: white background, no top/right spines, grey axes.
ax = plt.gca()
ax.set_facecolor('white')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_color('grey')
ax.spines['bottom'].set_color('grey')
plt.show()
```

![](../images/density.jpeg){width=75%}

***

To process and visualize the risk factors on the fishnet grid, we retrieved, prepared, and integrated several datasets. We first fetched data from each endpoint with the `client.get` method. Each endpoint corresponds to one urban risk factor: **graffiti, non-functional street lights, liquor retail stores**, and **ShotSpotter incidents**. Each response was stored as its own DataFrame, filtered to records from 2018, and then narrowed to the relevant categories, such as graffiti locations and street light outages.
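The fetch-and-filter step might look roughly like the sketch below. It assumes only that `client` exposes a `get` method returning a list of records, as a Socrata-style client does. The helper name `fetch_risk_factor`, the dataset identifier, and the `creation_date` column are hypothetical, and a stub client with made-up records stands in for the live API so the sketch runs offline.

```python
import pandas as pd


def fetch_risk_factor(client, dataset_id, date_col, year=2018, limit=50000):
    """Fetch one dataset via client.get and keep only rows from `year`.

    `dataset_id` and `date_col` are placeholders; the real values depend
    on the endpoint being queried.
    """
    records = client.get(dataset_id, limit=limit)  # list of dicts
    df = pd.DataFrame.from_records(records)
    # Unparseable dates become NaT and are excluded by the year filter.
    df[date_col] = pd.to_datetime(df[date_col], errors="coerce")
    return df[df[date_col].dt.year == year].reset_index(drop=True)


class StubClient:
    """Offline stand-in for the API client, returning made-up records."""

    def get(self, dataset_id, **kwargs):
        return [
            {"creation_date": "2018-03-01T00:00:00", "latitude": "41.88", "longitude": "-87.63"},
            {"creation_date": "2017-11-15T00:00:00", "latitude": "41.90", "longitude": "-87.65"},
            {"creation_date": "2018-07-20T00:00:00", "latitude": "41.85", "longitude": "-87.70"},
        ]


graffiti = fetch_risk_factor(StubClient(), "hypothetical-id", "creation_date")
print(len(graffiti))
```

Passing the client in as an argument keeps the filtering logic testable without network access; in practice the same function would be called once per endpoint.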

Next, we prepared the data for geospatial analysis. Each DataFrame was converted into a GeoDataFrame by building point geometries from its latitude and longitude columns. Each GeoDataFrame was then reprojected to the coordinate reference system (CRS) of the fishnet grid so that all layers align spatially in the subsequent analysis.
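A minimal sketch of this conversion, under stated assumptions: the coordinates are taken to be WGS 84 (`EPSG:4326`), the fishnet CRS is assumed to be `EPSG:3435` (Illinois State Plane East, US feet), and both the sample points and the one-cell `fishnet` are made up for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Made-up sample rows; real data would be the filtered DataFrames.
df = pd.DataFrame({"latitude": ["41.88", "41.85"], "longitude": ["-87.63", "-87.70"]})

# Stand-in for the fishnet grid; its CRS (EPSG:3435) is an assumption.
fishnet = gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)], crs="EPSG:3435")

# API responses often deliver coordinates as strings, so cast them to
# numbers and drop rows that have no usable location.
df[["latitude", "longitude"]] = df[["latitude", "longitude"]].apply(
    pd.to_numeric, errors="coerce"
)
df = df.dropna(subset=["latitude", "longitude"])

# Build point geometries (x = longitude, y = latitude), then reproject
# to whatever CRS the fishnet uses.
points = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["longitude"], df["latitude"]),
    crs="EPSG:4326",
).to_crs(fishnet.crs)
```

Reading the target CRS from `fishnet.crs` rather than hard-coding it keeps the point layers aligned with the grid even if the grid's projection changes.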

We added a `Legend` column to each GeoDataFrame to label its risk factor, with values such as "Graffiti," "StreetLightsOut," "LiquorRetail," and "ShotSpotter." All GeoDataFrames were then combined with `pd.concat` into a single dataset named `variable_net`, which holds every point of interest across the identified risk factors.
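The labelling and concatenation could be sketched as follows. The two small layers and their coordinates are made up, the CRS is an assumed `EPSG:3435`, and the result is wrapped in `gpd.GeoDataFrame` explicitly so the geometry column and CRS are kept regardless of what `pd.concat` returns.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

crs = "EPSG:3435"  # assumed CRS, shared by all layers

# Made-up stand-ins for two of the prepared risk-factor layers.
graffiti = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(1, 1)], crs=crs)
liquor = gpd.GeoDataFrame(geometry=[Point(2, 2)], crs=crs)

# Tag each layer with its risk factor and keep only the shared columns.
layers = {"Graffiti": graffiti, "LiquorRetail": liquor}
labelled = [
    gdf.assign(Legend=name)[["Legend", "geometry"]] for name, gdf in layers.items()
]

# Stack the layers into one dataset and retain the CRS.
variable_net = gpd.GeoDataFrame(
    pd.concat(labelled, ignore_index=True),
    geometry="geometry",
    crs=graffiti.crs,
)
print(variable_net["Legend"].value_counts())
```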

A spatial join (`sjoin`) then associated each risk-factor point with its corresponding fishnet grid cell. This step placed every point within the spatial context of the grid, allowing us to examine how each risk factor is distributed across the city. Finally, the concatenated DataFrame was converted into a GeoDataFrame that retains the appropriate CRS, enabling spatial operations and visualization of the risk factors within the grid.
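As a rough illustration of the join, the sketch below assigns made-up points to a made-up two-cell grid. The `uniqueID` column, the `EPSG:3435` CRS, and the per-cell count at the end are assumptions added for illustration; the `predicate` argument requires a reasonably recent GeoPandas (older releases called it `op`).

```python
import geopandas as gpd
from shapely.geometry import Point, box

crs = "EPSG:3435"  # assumed CRS shared by both layers

# Two adjacent unit cells standing in for the fishnet grid.
fishnet = gpd.GeoDataFrame(
    {"uniqueID": [0, 1]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs=crs,
)

# Made-up risk-factor points, already labelled with a Legend value.
variable_net = gpd.GeoDataFrame(
    {"Legend": ["Graffiti", "Graffiti", "ShotSpotter"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(1.2, 0.2)],
    crs=crs,
)

# Attach to each point the attributes of the cell it falls within.
joined = gpd.sjoin(variable_net, fishnet, how="inner", predicate="within")

# One possible follow-up: count points per cell and risk factor.
counts = joined.groupby(["uniqueID", "Legend"]).size().unstack(fill_value=0)
print(counts)
```

An inner join drops points that fall outside every cell, which is usually the desired behavior when the grid defines the study area.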
