# 07_final_report.ipynb
## Final Report: NYC Crime vs. Climate Analysis

**Author:** Vinicius Rodrigues
**Date:** 2025-05-24


### 1. Introduction
This report presents an exploratory analysis of NYC crime data (NYPD) in relation to daily climate variables (NOAA). The goal is to uncover temporal, seasonal, and spatial patterns that can inform public safety strategies.

**Research questions:**
- How does average daily temperature impact crime volume?
- Is there a relationship between precipitation and specific crime types?
- Which boroughs and seasons show the highest concentrations of crime?


### 2. Methodology
1. **Data collection:** Raw CSVs from NYPD and NOAA; GeoJSON for neighborhoods.  
2. **Data preprocessing:** Date conversion, missing value handling, and normalization.  
3. **Temporal analysis:** Daily counts, weekly/monthly trends, and seasonality.  
4. **Spatial analysis:** Choropleth maps and heatmaps by borough.  
5. **Correlation analysis:** Pairwise Pearson correlation among `crime_count`, `temp_avg`, and `precipitation`.  
6. **Interactive dashboard:** Dynamic filters to explore relationships.
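
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration on a tiny synthetic table, not the project's actual pipeline: the `offense` column, the `"UNKNOWN"` placeholder, and the choice to count days without incidents as zero are all assumptions made for the example.

```python
import pandas as pd

# Synthetic stand-in for raw incident records (hypothetical columns and values).
raw = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01", "2024-01-02",
             "not a date", "2024-01-09", "2024-02-03"],
    "offense": ["ROBBERY", "ASSAULT", None, "BURGLARY", "ROBBERY", "ASSAULT"],
})

# Step 2: date conversion and missing-value handling.
raw["date"] = pd.to_datetime(raw["date"], errors="coerce")  # unparseable dates become NaT
clean = raw.dropna(subset=["date"]).copy()                  # drop rows without a usable date
clean["offense"] = clean["offense"].fillna("UNKNOWN")       # assumed placeholder label

# Step 3: daily counts, then weekly and monthly totals.
daily = clean.groupby("date").size().rename("crime_count")
daily = daily.asfreq("D", fill_value=0)   # assumption: a day with no rows had 0 incidents
weekly = daily.resample("W").sum()        # weeks ending on Sunday
monthly = daily.resample("MS").sum()      # labeled by month start

print(monthly)
```

Filling gaps with zero before resampling keeps quiet days in the series, so weekly and monthly averages are not biased upward by missing dates. Whether zero is the right fill depends on whether a missing date means "no incidents" or "no data".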

### 3. Key Findings



#### 3.1 Temporal Patterns
- **Summer** and **year-end** periods show peaks in street crimes.  
- Pearson correlation between `crime_count` and `temp_avg` is **0.12** (p < 0.05), indicating a mild positive relationship with warmer days.


```python
# Compute the Pearson correlation between daily crime counts and average temperature
import pandas as pd
from scipy.stats import pearsonr

# Load the cleaned datasets, parsing the date column as datetime
crime = pd.read_csv('../data/processed/crime_clean.csv', parse_dates=['date'])
weather = pd.read_csv('../data/processed/weather_clean.csv', parse_dates=['date'])

# Approximate the daily average temperature as the midpoint of the daily max and min
weather['temp_avg'] = (weather['temp_max'] + weather['temp_min']) / 2

# Count incidents per day, then join with the weather data on date (inner join)
merged = pd.merge(
    crime.groupby('date').size().reset_index(name='crime_count'),
    weather[['date', 'temp_avg']],
    on='date',
)

r, p = pearsonr(merged['crime_count'], merged['temp_avg'])
print(f"Pearson r={r:.2f}, p-value={p:.3f}")
```

```
Pearson r=0.58, p-value=0.000
```
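
The same idea extends to the three variables named in the methodology (`crime_count`, `temp_avg`, `precipitation`). The sketch below builds a pairwise Pearson matrix with `DataFrame.corr`. The data is randomly generated purely for illustration, so the coefficients it prints say nothing about the real NYPD or NOAA data; the coefficients used to simulate `crime_count` are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (assumption: made-up values, for illustration only).
rng = np.random.default_rng(0)
n = 365
temp_avg = 55 + 20 * np.sin(np.linspace(0, 2 * np.pi, n)) + rng.normal(0, 5, n)
precipitation = rng.gamma(shape=0.5, scale=0.4, size=n)
crime_count = 300 + 1.5 * temp_avg - 10 * precipitation + rng.normal(0, 20, n)

daily = pd.DataFrame({
    "crime_count": crime_count,
    "temp_avg": temp_avg,
    "precipitation": precipitation,
})

# Pairwise Pearson correlation between every pair of columns.
corr = daily.corr(method="pearson")
print(corr.round(2))
```

Passing `method="spearman"` instead gives a rank-based alternative, which may be worth comparing for skewed variables such as daily precipitation.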
