In [0]:
import base64
import os

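def show_graph(filename):
    """Render a PNG from the project's graphs folder inline in the notebook."""
    folder_path = "/Volumes/final_project/graphs/graphs/"
    full_path = os.path.join(folder_path, filename)

    try:
        # Embed the image as a base64 data URI so no served file path is needed.
        with open(full_path, "rb") as f:
            data = base64.b64encode(f.read()).decode("utf-8")
        # displayHTML is provided by the Databricks notebook runtime.
        displayHTML(f'<img src="data:image/png;base64,{data}" style="max-width:100%; height:auto;">')
    except FileNotFoundError:
        print(f"Not found: {full_path}")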

# Data Analysis Report: Chicago Crime 2025

## 1. Problem Statement Recap
The objective of this analysis was to answer the core question: **How can law enforcement agencies optimize patrol resource allocation by predicting crime hotspots based on historical spatiotemporal patterns?**

Specifically, we aimed to determine if crime types vary significantly across districts and time of day, and if we can segment districts to prioritize specific intervention strategies.

## 2. Exploratory Data Analysis (EDA) Findings

### Temporal Trends
* **Seasonality:** As seen in the *Daily Crime Trends by Category* chart below, crime volume exhibits clear seasonal variability. Property crime (green line) consistently dominates the volume, showing an upward trend during warmer months (peaking around July-August). Violent crime (orange line) remains relatively stable but lower in volume.

In [0]:
show_graph("Crime Trends by Category (2025).png")
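
The daily series behind a chart like this can be produced by counting incidents per calendar date and category. The following is a minimal sketch in plain Python, assuming each record carries a timestamp string and a category label. The `records` sample and the `daily_counts_by_category` helper are illustrative and are not the notebook's actual code or schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records: (timestamp string, crime category).
# The field layout and category labels are assumptions for illustration.
records = [
    ("2025-07-04 13:15:00", "Property"),
    ("2025-07-04 22:40:00", "Violent"),
    ("2025-07-04 23:05:00", "Property"),
    ("2025-07-05 01:30:00", "Property"),
]

def daily_counts_by_category(rows):
    """Count incidents per (calendar date, category) pair."""
    counts = Counter()
    for timestamp, category in rows:
        day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date()
        counts[(day, category)] += 1
    return counts

daily = daily_counts_by_category(records)
for (day, category), n in sorted(daily.items()):
    print(day, category, n)
```

On the full dataset the same grouping would more likely be done with a DataFrame `groupBy`; the sketch only shows the shape of the aggregation.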

### Hourly Patterns
The *Crime Intensity Heatmap* reveals a distinct temporal "danger zone." Crime intensity (dark red) is highest between **12:00 PM and 11:00 PM**, with the greatest concentration on Friday and Saturday nights. Conversely, the early morning hours (3:00 AM – 6:00 AM) show the lowest activity across all days of the week.

In [0]:
show_graph("Crime Intensity Heatmap- Day of Week vs. Hour of Day.png")
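
A heatmap of this kind rests on a 7 x 24 grid of incident counts, one cell per weekday and hour. A minimal sketch, assuming timestamps arrive as `YYYY-MM-DD HH:MM:SS` strings; the `intensity_grid` helper and the sample timestamps are hypothetical:

```python
from datetime import datetime

def intensity_grid(timestamps):
    """Return a 7x24 grid of counts: rows are weekdays (0 = Monday), columns are hours."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in timestamps:
        moment = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        grid[moment.weekday()][moment.hour] += 1
    return grid

# Made-up timestamps for illustration only.
sample = [
    "2025-03-07 22:10:00",  # Friday
    "2025-03-07 22:45:00",  # Friday
    "2025-03-08 23:30:00",  # Saturday
    "2025-03-10 04:00:00",  # Monday
]
grid = intensity_grid(sample)
print(grid[4][22])  # incidents on Fridays during the 22:00 hour
```

Each row of `grid` can then be passed to a plotting library as one band of the heatmap.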

### Spatial Analysis
**High-Volume Districts:** The *Total Crime Volume by Police District* bar chart identifies Districts **001, 008, 011, and 018** as the highest-volume areas, each exceeding 12,000 incidents. In contrast, districts such as 020 and 031 report far fewer incidents.

In [0]:
show_graph("Crime Volume by Police District.png")

**Arrest Efficiency:** The scatter plot *District Analysis: Crime Volume vs. Arrest Rate* uncovers a critical disparity. There is no clear linear relationship between crime volume and arrest rate. Some districts with moderate crime volume achieve arrest rates near 26%, while several high-volume districts have arrest rates below 15%. This suggests that concentrating resources in high-crime areas does not automatically translate into higher arrest rates.

In [0]:
show_graph("District Analysis- Crime Volume vs. Arrest Rate (Size = Violent Crimes).png")
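
The relationship between volume and arrest rate can be quantified with a Pearson correlation coefficient over per-district totals. A minimal sketch, assuming each district is summarized as an (incidents, arrests) pair; the district labels and numbers below are made up for illustration and the helpers `arrest_rate` and `pearson` are not from the notebook:

```python
from math import sqrt

def arrest_rate(arrests, incidents):
    """Share of incidents that led to an arrest; 0.0 when there are no incidents."""
    return arrests / incidents if incidents else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-district totals: label -> (incidents, arrests).
districts = {
    "A": (12000, 1500),
    "B": (8000, 2000),
    "C": (5000, 600),
    "D": (10000, 1400),
}
volumes = [incidents for incidents, _ in districts.values()]
rates = [arrest_rate(arrests, incidents) for incidents, arrests in districts.values()]
r = pearson(volumes, rates)
print(f"Pearson r between volume and arrest rate: {r:.2f}")
```

A value of `r` near zero would be consistent with the absence of a linear relationship, although it would not rule out a non-linear one.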

## 3. Advanced Analytics & Modeling

### District Clustering (K-Means)
We applied K-Means clustering (k=3) to segment police districts into actionable risk profiles based on property crime, violent crime, and arrest rates. The visualization below reveals three distinct groups:
1.  **Cluster 0 (Blue - "Balanced/Lower Risk"):** These districts generally show lower to moderate levels of both property and violent crime.
2.  **Cluster 1 (Orange - "Violent Crime Hotspots"):** This cluster is characterized by a disproportionately high ratio of violent crimes relative to property crimes (points shifted higher on the Y-axis relative to X). These areas represent the highest risk to public safety.
3.  **Cluster 2 (Green - "High Intensity/Property Dominated"):** These districts experience the highest absolute volume of crime, particularly property crime (far right on X-axis). This likely includes downtown or high-traffic commercial zones.

In [0]:
show_graph("District Segmentation- K-Means Clustering (k=3).png")
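
For readers unfamiliar with the technique, the clustering step can be sketched as a dependency-free version of Lloyd's algorithm. This is not the library implementation the notebook used. The feature standardization, the deterministic farthest-first seeding, and the sample feature rows are all assumptions made so the example is reproducible.

```python
from math import dist
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to zero mean and unit variance so no feature dominates."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    spreads = [pstdev(col) or 1.0 for col in columns]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, means, spreads)) for row in rows]

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then add the point farthest from all seeds."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds

def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute centroids."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iterations):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(tuple(mean(axis) for axis in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical district features: (property crimes, violent crimes, arrest rate).
features = [
    (3000, 900, 0.22), (3200, 1000, 0.24), (2800, 850, 0.20),
    (4000, 3000, 0.14), (4200, 3300, 0.12), (3900, 3100, 0.15),
    (9000, 1500, 0.10), (9500, 1700, 0.11), (8800, 1400, 0.09),
]
labels, centroids = kmeans(standardize(features), k=3)
print(labels)
```

The integer labels produced here are arbitrary and need not match the Cluster 0/1/2 numbering in the report.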

### Hypothesis Testing: Weekend Effect
* **Hypothesis:** Daily crime volume is significantly higher on weekends.
* **Result:** The hypothesis was not supported. The average daily crime count is nearly identical: **2,818 on weekdays** versus **2,833 on weekends**, a difference of 15 incidents per day (about 0.5%).
* **Interpretation:** While the *total* volume doesn't change drastically, the *nature* and *timing* of crime likely shift (as hinted at by the Friday/Saturday night intensity in the heatmap), meaning patrols must adjust tactics rather than just volume.

In [0]:
show_graph("Distribution of Daily Crime Volume- Weekday (0) vs Weekend (1).png")
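
The report does not name the test behind this result. One distribution-free way to check a difference in daily means is a permutation test, sketched below. The `permutation_test` helper and the daily counts are assumptions for illustration and do not reproduce the notebook's method or data.

```python
import random

def permutation_test(group_a, group_b, rounds=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        # Count reshuffles whose mean gap is at least as large as the observed one.
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            extreme += 1
    return extreme / rounds

# Made-up daily counts for illustration only; not the notebook's data.
weekday_counts = [2790, 2835, 2810, 2850, 2805, 2822, 2814]
weekend_counts = [2840, 2825, 2860, 2808]
p_value = permutation_test(weekday_counts, weekend_counts)
print(f"p-value: {p_value:.3f}")
```

A large p-value would indicate that a gap of the observed size is easily produced by chance, which is the pattern the near-identical averages suggest.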

## 4. Actionable Insights & Recommendations

Based on the analysis, we propose the following data-driven strategies:

1.  **Targeted Resource Allocation (Cluster-Based):**
    * **Cluster 1 (Orange Districts):** Deploy specialized violence intervention units and de-escalation teams.
    * **Cluster 2 (Green Districts):** Increase visible foot and bike patrols to deter theft and property damage, especially during business hours and early evenings.

2.  **Shift Realignment:**
    * Resources should be maximized during the **12:00 PM – 11:00 PM** window.
    * Friday and Saturday nights require specific "surge" capacity for crowd control and alcohol-related offenses.

3.  **Arrest Rate Investigation:**
    * Conduct a deep-dive audit into the "outlier" districts that combine high crime volume with low arrest rates (below 15%, per the district analysis in Section 2).

## 5. Limitations & Assumptions

* **Reporting Bias:** The analysis relies solely on reported incidents, so crimes that go unreported are not reflected in any of the figures above.
* **Geospatial Filtering:** We assumed that coordinates falling outside the Chicago bounding box were errors and excluded them.
* **Weekend Definition:** We defined "Weekend" strictly as Saturday and Sunday.
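
The two preprocessing assumptions above can be sketched as simple predicates. The bounding-box limits below are rough approximations chosen for illustration, since the exact values used in the analysis are not stated; `CHICAGO_BOUNDS`, `in_chicago`, and `is_weekend` are hypothetical names.

```python
from datetime import date

# Approximate bounding box around Chicago. These limits are assumed,
# not taken from the analysis.
CHICAGO_BOUNDS = {"lat_min": 41.6, "lat_max": 42.1, "lon_min": -88.0, "lon_max": -87.5}

def in_chicago(lat, lon, bounds=CHICAGO_BOUNDS):
    """True when the coordinate falls inside the bounding box."""
    return (bounds["lat_min"] <= lat <= bounds["lat_max"]
            and bounds["lon_min"] <= lon <= bounds["lon_max"])

def is_weekend(day):
    """Saturday (5) and Sunday (6) count as the weekend; Friday does not."""
    return day.weekday() >= 5

print(in_chicago(41.8781, -87.6298))  # a downtown coordinate
print(is_weekend(date(2025, 3, 7)))   # a Friday
```

Under this definition, Friday night incidents are counted as weekday activity, which is worth keeping in mind when reading the weekend comparison.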