Where are the most dangerous intersections or road segments in Paris?

Are there spatial clusters of accidents with specific characteristics?

How does accident frequency correlate with road configuration?

Are accidents more frequent on particular days of the week or in particular months of the year?

Are there any trends in accident frequency over time?

How do weather conditions correlate with accident patterns?
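
The two temporal questions above could be approached by grouping on the `Date` column. The following is only a minimal sketch on a toy frame, not the notebook's own method. It assumes that `Date` parses with `pd.to_datetime` and that several rows can share one `Id accident` (one row per person involved), so accidents are counted as distinct ids. The helper name `accidents_by_period` and the toy values are hypothetical.

```python
import pandas as pd

# Toy stand-in for the real dataset (hypothetical values);
# only the two columns used by this sketch are included.
toy = pd.DataFrame({
    "Date": ["2021-01-04", "2021-01-04", "2021-01-05", "2021-02-06", "2021-02-08"],
    "Id accident": [1, 1, 2, 3, 4],
})

def accidents_by_period(frame):
    """Count distinct accidents per weekday name and per month number."""
    dates = pd.to_datetime(frame["Date"])
    # nunique() avoids counting the same accident once per person involved.
    by_weekday = frame.groupby(dates.dt.day_name())["Id accident"].nunique()
    by_month = frame.groupby(dates.dt.month)["Id accident"].nunique()
    return by_weekday, by_month

by_weekday, by_month = accidents_by_period(toy)
print(by_weekday)
print(by_month)
```

On the real data, the same function could be called with `df` once the format of `Date` has been checked.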

In [15]:
import pandas as pd
from tabulate import tabulate

In [16]:
df = pd.read_csv("../data/accidents-enriched.csv", encoding="utf-8", sep=",")

In [17]:
print(df.columns)

Index(['Date', 'Arrondissement', 'Mode', 'Catégorie', 'Gravité', 'Age',
       'Genre', 'Milieu', 'Adresse', 'Longitude', 'Latitude', 'Id accident',
       'PIM/BD PERIPHERIQUE', 'Tranche d'age', 'Blessés Légers',
       'Blessés hospitalisés', 'Tué', 'road_configuration', 'lighting',
       'weather', 'surface', 'vehicle_types', 'driver_sex'],
      dtype='object')
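
Before summarising any of these columns, it can help to check how much of each one is missing, since missing values end up as their own category in counts that keep them. The following is a minimal sketch on a toy frame with hypothetical values. In the notebook, the same expression would be applied to `df`.

```python
import pandas as pd

# Toy stand-in (hypothetical values); in the notebook this would be `df`.
toy = pd.DataFrame({
    "driver_sex": ["Masculin", None, "Feminin", "Masculin"],
    "weather": ["Normale", "Normale", None, None],
    "Gravité": ["Blessé léger"] * 4,
})

# Share of missing values per column, in percent, largest first.
missing_pct = (toy.isna().mean() * 100).sort_values(ascending=False)
print(missing_pct.round(1))
```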


In [18]:
# --- Basic Statistics on 'driver_sex' ---
print("--- Basic Statistics on 'driver_sex' ---")

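# 1. Value Counts
# Count each value; dropna=False keeps missing values as their own 'nan' row.
driver_sex_counts = df['driver_sex'].value_counts(dropna=False)
# Convert the series to a DataFrame so we can tabulate it nicely.
driver_sex_counts_df = driver_sex_counts.reset_index()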
driver_sex_counts_df.columns = ['Driver Sex', 'Count']
print("\nValue Counts of Driver Sex:")
print(tabulate(driver_sex_counts_df, headers='keys', tablefmt='pretty', showindex=False))

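# 2. Percentages
# Share of each value, missing values included, as a percentage of all rows.
driver_sex_percentages = df['driver_sex'].value_counts(normalize=True, dropna=False) * 100
driver_sex_percentages_df = driver_sex_percentages.reset_index()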
driver_sex_percentages_df.columns = ['Driver Sex', 'Percentage']
print("\nPercentages of Driver Sex:")
print(tabulate(driver_sex_percentages_df, headers='keys', tablefmt='pretty', showindex=False))

# --- Driver Sex and Accident Severity ---
print("\n--- Driver Sex and Accident Severity ---")

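# 3. Driver Sex Distribution by Accident Severity
# Counts of driver sex within each severity level, missing values included.
driver_sex_by_severity = df.groupby('Gravité')['driver_sex'].value_counts(dropna=False)
driver_sex_by_severity_df = driver_sex_by_severity.reset_index(name='Count')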
driver_sex_by_severity_df.columns = ['Accident Severity (Gravité)', 'Driver Sex', 'Count']
print("\nDriver Sex Distribution by Accident Severity:")
print(tabulate(driver_sex_by_severity_df, headers='keys', tablefmt='pretty', showindex=False))

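# 4. Percentages of Driver Sex by Accident Severity
# Assumed computation: shares within each severity level, missing values included.
driver_sex_by_severity_percentages = df.groupby('Gravité')['driver_sex'].value_counts(normalize=True, dropna=False) * 100
driver_sex_by_severity_percentages_df = driver_sex_by_severity_percentages.reset_index(name='Percentage')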
driver_sex_by_severity_percentages_df.columns = ['Accident Severity (Gravité)', 'Driver Sex', 'Percentage']
print("\nPercentages of Driver Sex by Accident Severity:")
print(tabulate(driver_sex_by_severity_percentages_df, headers='keys', tablefmt='pretty', showindex=False))

# --- Driver Sex and Mode of Transport ---
print("\n--- Driver Sex and Mode of Transport ---")

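# 5. Driver Sex Distribution by Mode of Transport
# Assumed computation: counts of driver sex within each mode, missing values included.
driver_sex_by_mode = df.groupby('Mode')['driver_sex'].value_counts(dropna=False)
driver_sex_by_mode_df = driver_sex_by_mode.reset_index(name='Count')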
driver_sex_by_mode_df.columns = ['Mode of Transport', 'Driver Sex', 'Count']
print("\nDriver Sex Distribution by Mode of Transport:")
print(tabulate(driver_sex_by_mode_df, headers='keys', tablefmt='pretty', showindex=False))

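# 6. Percentages of Driver Sex by Mode of Transport
# Assumed computation: shares within each mode, missing values included.
driver_sex_by_mode_percentages = df.groupby('Mode')['driver_sex'].value_counts(normalize=True, dropna=False) * 100
driver_sex_by_mode_percentages_df = driver_sex_by_mode_percentages.reset_index(name='Percentage')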
driver_sex_by_mode_percentages_df.columns = ['Mode of Transport', 'Driver Sex', 'Percentage']
print("\nPercentages of Driver Sex by Mode of Transport:")
print(tabulate(driver_sex_by_mode_percentages_df, headers='keys', tablefmt='pretty', showindex=False))

# --- Additional Statistics (Examples) ---
print("\n--- Additional Statistics (Examples) ---")

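# 7. Driver Sex Distribution by Lighting Conditions
# Assumed computation: counts of driver sex within each lighting condition.
driver_sex_by_lighting = df.groupby('lighting')['driver_sex'].value_counts(dropna=False)
driver_sex_by_lighting_df = driver_sex_by_lighting.reset_index(name='Count')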
driver_sex_by_lighting_df.columns = ['Lighting Condition', 'Driver Sex', 'Count']
print("\nDriver Sex Distribution by Lighting Conditions:")
print(tabulate(driver_sex_by_lighting_df, headers='keys', tablefmt='pretty', showindex=False))

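# 8. Driver Sex Distribution by Road Configuration
# Assumed computation: counts of driver sex within each road configuration.
driver_sex_by_road = df.groupby('road_configuration')['driver_sex'].value_counts(dropna=False)
driver_sex_by_road_df = driver_sex_by_road.reset_index(name='Count')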
driver_sex_by_road_df.columns = ['Road Configuration', 'Driver Sex', 'Count']
print("\nDriver Sex Distribution by Road Configuration:")
print(tabulate(driver_sex_by_road_df, headers='keys', tablefmt='pretty', showindex=False))

# --- Crosstab for Detailed Analysis ---
print("\n--- Crosstab for Detailed Analysis ---")

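# 9. Crosstab of Driver Sex and Accident Severity
# Assumed computation; by default pd.crosstab leaves out rows where either value is missing.
cross_severity_sex = pd.crosstab(df['Gravité'], df['driver_sex'])
cross_severity_sex_df = cross_severity_sex.reset_index()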
print("\nCrosstab of Accident Severity and Driver Sex:")
print(tabulate(cross_severity_sex_df, headers='keys', tablefmt='pretty', showindex=False))

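# 10. Crosstab of Driver Sex and Accident Severity (Percentages)
# Assumed computation: row percentages, so each severity level sums to 100.
cross_severity_sex_percentages = pd.crosstab(df['Gravité'], df['driver_sex'], normalize='index') * 100
cross_severity_sex_percentages_df = cross_severity_sex_percentages.reset_index()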
print("\nCrosstab of Accident Severity and Driver Sex (Percentages):")
print(tabulate(cross_severity_sex_percentages_df, headers='keys', tablefmt='pretty', showindex=False))

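# 11. Crosstab of Driver Sex and Mode
# Assumed computation; by default pd.crosstab leaves out rows where either value is missing.
cross_mode_sex = pd.crosstab(df['Mode'], df['driver_sex'])
cross_mode_sex_df = cross_mode_sex.reset_index()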
print("\nCrosstab of Mode and Driver Sex:")
print(tabulate(cross_mode_sex_df, headers='keys', tablefmt='pretty', showindex=False))

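# 12. Crosstab of Driver Sex and Mode (Percentages)
# Assumed computation: row percentages, so each mode sums to 100.
cross_mode_sex_percentages = pd.crosstab(df['Mode'], df['driver_sex'], normalize='index') * 100
cross_mode_sex_percentages_df = cross_mode_sex_percentages.reset_index()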
print("\nCrosstab of Mode and Driver Sex (Percentages):")
print(tabulate(cross_mode_sex_percentages_df, headers='keys', tablefmt='pretty', showindex=False))

--- Basic Statistics on 'driver_sex' ---

Value Counts of Driver Sex:
+------------+-------+
| Driver Sex | Count |
+------------+-------+
|  Masculin  | 27332 |
|    nan     | 7988  |
|  Feminin   | 5891  |
+------------+-------+

Percentages of Driver Sex:
+------------+--------------------+
| Driver Sex |     Percentage     |
+------------+--------------------+
|  Masculin  |  66.3220984688554  |
|    nan     | 19.38317439518575  |
|  Feminin   | 14.294727135958846 |
+------------+--------------------+

--- Driver Sex and Accident Severity ---

Driver Sex Distribution by Accident Severity:
+-----------------------------+------------+-------+
| Accident Severity (Gravité) | Driver Sex | Count |
+-----------------------------+------------+-------+
|     Blessé hospitalisé      |  Masculin  | 2114  |
|     Blessé hospitalisé      |    nan     |  519  |
|     Blessé hospitalisé      |  Feminin   |  284  |
|        Blessé léger         |  Masculin  | 25021 |
|        Blessé léger        