# Analysis of Paris Accident Data - Part 3

**Goal**: Derive insights from the enriched accident data to support strategic decision-making.

In this notebook, we:
- Identify the top arrondissements and high-risk streets
- Analyze temporal patterns (monthly and weekly)
- Examine weather, road, and transport conditions
- Visualize data to support actionable recommendations

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import calendar

# Load the enriched dataset
df = pd.read_csv('../data/accidents_parsed.csv', sep=';', parse_dates=['accident_date'])

# df.info() prints its summary directly and returns None, so it is not wrapped in print()
df.info()
df.head(3)

## Top Arrondissements and High-Risk Streets

We identify the 3 arrondissements with the most unique accidents (counted by distinct `accident_ID`), then find the 3 streets with the most accidents in each of them.

In [None]:
# Count unique accidents per arrondissement
accidents_per_arr = df.groupby('arrondissement')['accident_ID'].nunique().sort_values(ascending=False)
top_3_arr = accidents_per_arr.head(3)

print("Top 3 arrondissements with highest accident counts:")
display(top_3_arr)

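# Normalize street names so abbreviated and full spellings are counted together
def clean_street_name(address_series):
    """Upper-case addresses and expand common French street-type abbreviations."""
    return (address_series
            .str.upper()
            .str.replace(r'\bBD\b', 'BOULEVARD', regex=True)
            .str.replace(r'\bRTE\b', 'ROUTE', regex=True)
            .str.replace(r'\bAV\b', 'AVENUE', regex=True)
            .str.strip())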

df['clean_address'] = clean_street_name(df['address'])

# For each top arrondissement, find the top 3 streets by accident count
top_streets = {}
for arr in top_3_arr.index:
    mask = (df['arrondissement'] == arr)
    street_counts = df[mask].groupby('clean_address')['accident_ID'].nunique().sort_values(ascending=False)
    top_streets[arr] = street_counts.head(3)

print("High-risk streets in top arrondissements:")
for arr in top_streets:
    print(f"Arrondissement {arr}:")
    display(top_streets[arr])
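
The per-arrondissement ranking above can also be written without an explicit loop. The following is a minimal sketch on made-up toy data, not the notebook's actual results; the helper name `top_streets_per_arrondissement` and the `n_accidents` column are hypothetical, and the column names are assumed to match `df`.

```python
import pandas as pd

# Toy data standing in for df (values are invented for illustration only).
# Accident 1 appears twice to mimic several rows for the same accident.
toy = pd.DataFrame({
    "accident_ID": [1, 1, 2, 3, 4, 5, 6],
    "arrondissement": [12, 12, 12, 12, 16, 16, 16],
    "clean_address": [
        "AVENUE DAUMESNIL", "AVENUE DAUMESNIL", "AVENUE DAUMESNIL",
        "BOULEVARD DE BERCY", "AVENUE FOCH", "AVENUE FOCH", "ROUTE DE SURESNES",
    ],
})

def top_streets_per_arrondissement(frame, n=3):
    """Return the n streets with the most unique accidents in each arrondissement."""
    counts = (frame.groupby(["arrondissement", "clean_address"])["accident_ID"]
                   .nunique()
                   .rename("n_accidents")
                   .reset_index()
                   .sort_values(["arrondissement", "n_accidents"],
                                ascending=[True, False]))
    # After sorting, head(n) keeps the n highest-count streets of each group
    return counts.groupby("arrondissement").head(n).reset_index(drop=True)

result = top_streets_per_arrondissement(toy, n=1)
print(result)
```

To restrict the ranking to the top arrondissements only, the input can be filtered first, for example with `df[df['arrondissement'].isin(top_3_arr.index)]`.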

### Visualization: Accident Counts by Arrondissement

In [None]:
plt.figure(figsize=(10, 6))
sns.barplot(x=accidents_per_arr.index, y=accidents_per_arr.values, color='tomato')
plt.title('Number of Unique Accidents per Arrondissement')
plt.xlabel('Arrondissement')
plt.ylabel('Unique Accident Count')
plt.tight_layout()
plt.show()

## Temporal Analysis

We analyze the distribution of accidents by month and weekday to identify peak times.

In [None]:
# Extract month and weekday from accident_date
df['month'] = df['accident_date'].dt.month
df['weekday'] = df['accident_date'].dt.weekday

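# Reindex so every month (1-12) and weekday (0-6) has a bar, even with zero accidents;
# otherwise the positional tick labels below would drift out of step with the bars
monthly_accidents = df.groupby('month')['accident_ID'].nunique().reindex(range(1, 13), fill_value=0)
weekday_accidents = df.groupby('weekday')['accident_ID'].nunique().reindex(range(7), fill_value=0)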

plt.figure(figsize=(14, 6))

# Monthly Distribution
plt.subplot(1, 2, 1)
monthly_accidents.plot(kind='bar', color='#4ECDC4')
plt.title('Monthly Accident Distribution')
plt.xticks(ticks=range(12), labels=[calendar.month_abbr[m] for m in range(1, 13)], rotation=0)
plt.xlabel('Month')
plt.ylabel('Number of Accidents')

# Weekday Distribution
plt.subplot(1, 2, 2)
weekday_accidents.plot(kind='bar', color='#45B7D1')
plt.title('Accident Distribution by Day of Week')
plt.xticks(ticks=range(7), labels=['Mon','Tue','Wed','Thu','Fri','Sat','Sun'], rotation=0)
plt.xlabel('Day of Week')
plt.ylabel('Number of Accidents')

plt.tight_layout()
plt.show()
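
Besides reading the peaks off the charts, we can extract them programmatically. The sketch below uses made-up toy dates, and the helper `peak_periods` is hypothetical. It assumes that all rows sharing an `accident_ID` carry the same `accident_date`, so keeping one row per accident is equivalent to counting unique IDs.

```python
import calendar
import pandas as pd

# Toy data standing in for df (dates are invented for illustration only)
toy = pd.DataFrame({
    "accident_ID": [1, 1, 2, 3, 4, 5],
    "accident_date": pd.to_datetime([
        "2023-01-02", "2023-01-02", "2023-01-09",
        "2023-03-03", "2023-03-04", "2023-01-16",
    ]),
})

def peak_periods(frame):
    """Return (month abbreviation, weekday name) with the most unique accidents."""
    # One row per accident, assuming the date is constant within an accident
    per_accident = frame.drop_duplicates("accident_ID")
    monthly = (per_accident["accident_date"].dt.month.value_counts()
               .reindex(range(1, 13), fill_value=0))
    weekly = (per_accident["accident_date"].dt.weekday.value_counts()
              .reindex(range(7), fill_value=0))
    # idxmax returns the first label holding the maximum; ties go to the earlier period
    return (calendar.month_abbr[int(monthly.idxmax())],
            calendar.day_name[int(weekly.idxmax())])

peak_month, peak_day = peak_periods(toy)
print(peak_month, peak_day)
```

Because `idxmax` reports only a single label, it is worth checking the full counts for near-ties before treating one month or day as the peak.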

## Analysis of Weather, Road, and Transportation Modes

We examine weather conditions, road surface conditions, and the distribution of victim transport modes.

In [None]:
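# Weather and road surface conditions, counted once per accident.
# Assumption: df can hold several rows per accident (e.g. one per victim),
# which is why the earlier counts use nunique() on accident_ID.
accidents = df.drop_duplicates(subset='accident_ID')
weather_counts = accidents['weather_condition'].value_counts()
road_surface_counts = accidents['road_surface'].value_counts()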

print("Top Weather Conditions:")
display(weather_counts.head(5))

print("\nTop Road Surface Conditions:")
display(road_surface_counts.head(5))

# Transportation Mode Distribution
transport_modes = df['victim_transport_mode'].value_counts(dropna=True)

plt.figure(figsize=(8, 6))
transport_modes.head(5).plot(kind='bar', color='#45B7D1')
plt.title('Top 5 Victim Transport Modes in Accidents')
plt.ylabel('Count of Victims')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
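
To see how weather and road surface combine, rather than viewing each on its own, a cross-tabulation on one row per accident may help. This is a minimal sketch on toy data. The category labels are made up, the helper `weather_by_surface` is hypothetical, and it assumes weather and road surface are constant across all rows of the same accident.

```python
import pandas as pd

# Toy data standing in for df (labels are invented for illustration only)
toy = pd.DataFrame({
    "accident_ID": [1, 1, 2, 3, 4],
    "weather_condition": ["Rain", "Rain", "Clear", "Rain", "Clear"],
    "road_surface": ["Wet", "Wet", "Dry", "Wet", "Wet"],
})

def weather_by_surface(frame):
    """Count unique accidents for each weather / road surface combination."""
    # Keep one row per accident so multi-victim accidents are not over-counted
    per_accident = frame.drop_duplicates("accident_ID")
    return pd.crosstab(per_accident["weather_condition"],
                       per_accident["road_surface"])

table = weather_by_surface(toy)
print(table)
```

Rows with a missing value in either column are left out of the table, which matches the `dropna=True` behavior of the counts above.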

## Summary and Actionable Insights

Based on our analysis, key insights include:
- Accidents are concentrated in the top arrondissements and, within them, on specific high-risk streets
- Peaks by month and day of week could guide the timing of targeted interventions
- The most frequent weather and road surface conditions may call for infrastructure or policy changes

These insights support strategic decisions on where and when to deploy resources for maximum impact.