 <span style="color:#42a5f5; font-size:2em; font-weight:bold;">Joining incident and mobilisation data</span>

 <span style="font-weight:bold">This notebook explores the mobilisation and incident data and joins the two datasets.</span>

<span style="color:#e91e63; font-size:1em; font-weight:bold;"> 1. Importing the previously loaded and cleaned data</span>

In [1]:
# Import libraries
import pandas as pd
import warnings
# Suppress warnings for better readability
warnings.filterwarnings("ignore")

In [None]:
# Load the cleaned files
df_incidents = pd.read_csv("../../data/raw/Cleaned_data/InUSE/cleaned_data_incidents.csv", dtype={"IncidentNumber": str}, low_memory=False)
df_mobilisations = pd.read_csv("../../data/raw/Cleaned_data/InUSE/cleaned_data_mobilisations.csv", dtype={"IncidentNumber": str}, low_memory=False)
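The `dtype={"IncidentNumber": str}` argument matters because incident numbers can start with zeros, which numeric type inference would drop. A minimal sketch on a made-up in-memory CSV (the sample values are illustrative and not taken from the real files) shows the difference:

```python
import io
import pandas as pd

# Hypothetical two-row CSV standing in for the real cleaned files.
sample_csv = "IncidentNumber,CalYear\n000123,2016\n004605,2021\n"

# Without a dtype, pandas infers integers and the leading zeros are lost.
inferred = pd.read_csv(io.StringIO(sample_csv))

# With dtype=str, the identifiers are kept exactly as written.
as_text = pd.read_csv(io.StringIO(sample_csv), dtype={"IncidentNumber": str})

print(inferred["IncidentNumber"].tolist())
print(as_text["IncidentNumber"].tolist())
```

Reading both files with the same string dtype keeps the join key comparable on both sides.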


In [3]:
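# 🔧 Clean the IncidentNumber column
def clean_incident_number(value):
    """Return the incident number as a string, truncated at the first '.' or '-'."""
    if pd.isna(value):
        return None
    value = str(value)
    # Drop a float-style suffix such as ".0"
    if '.' in value:
        return value.split('.')[0]
    # Caution: this also drops the date suffix ("000001-01012016" -> "000001"),
    # so incidents from different dates can share the same cleaned key
    if '-' in value:
        return value.split('-')[0]
    return value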

df_incidents["IncidentNumber_clean"] = df_incidents["IncidentNumber"].apply(clean_incident_number)
df_mobilisations["IncidentNumber_clean"] = df_mobilisations["IncidentNumber"].apply(clean_incident_number)
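To see what `clean_incident_number` does to the join key, the sketch below applies the same logic to a few hand-written values (illustrative only). It assumes identifiers of the form `NNNNNN-DDMMYYYY`, the format visible in the previews at the end of this notebook, and suggests that two incidents from different dates can end up with the same cleaned key:

```python
import pandas as pd

def clean_incident_number(value):
    # Same logic as the notebook's helper, repeated so the sketch is self-contained.
    if pd.isna(value):
        return None
    value = str(value)
    if "." in value:
        return value.split(".")[0]
    if "-" in value:
        return value.split("-")[0]
    return value

# Illustrative values only: a float-like string, two dated identifiers, a missing value.
raw = pd.Series(["235138081.0", "004605-12012021", "004605-10012023", None])
cleaned = raw.apply(clean_incident_number)

print(cleaned.tolist())

# Number of rows whose cleaned key also appears on another row.
n_shared = int(cleaned.dropna().duplicated(keep=False).sum())
print(n_shared)
```

If the count of shared keys is non-zero on the real data, a join on `IncidentNumber_clean` will pair rows that belong to different incidents.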


In [4]:
# 📑 Columns to keep for the mobilisations
mobilisations_cols_to_keep = [
    "IncidentNumber_clean", "CalYear", "BoroughName", "WardName", "HourOfCall",
    "DateAndTimeMobilised", "DateAndTimeMobile", "DateAndTimeArrived",
    "TurnoutTimeSeconds", "TravelTimeSeconds", "AttendanceTimeSeconds",
    "DeployedFromStation_Name", "DeployedFromLocation"
]

df_mobilisations_reduced = df_mobilisations[mobilisations_cols_to_keep].copy()

In [5]:
# Full outer join with a merge indicator
# (note: this uses the full df_mobilisations, not df_mobilisations_reduced)
df_merge = df_incidents.merge(
    df_mobilisations,
    how="outer",
    on="IncidentNumber_clean",
    suffixes=("_incident", "_mobilisation"),
    indicator=True
)
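With `how="outer"` and `indicator=True`, pandas adds a `_merge` column whose value is `both`, `left_only`, or `right_only` for each output row. The toy frames below are made up for illustration. They also show that when a key repeats on both sides, every left row is paired with every right row, and that the `validate` argument of `merge` can flag non-unique keys:

```python
import pandas as pd
from pandas.errors import MergeError

# Made-up data: key "A" appears twice on each side, "B" only left, "C" only right.
left = pd.DataFrame({"key": ["A", "A", "B"], "incident": [1, 2, 3]})
right = pd.DataFrame({"key": ["A", "A", "C"], "station": ["x", "y", "z"]})

merged = left.merge(right, how="outer", on="key", indicator=True)

# 2 x 2 = 4 rows for "A", plus one row each for "B" and "C".
print(len(merged))
print(merged["_merge"].value_counts().to_dict())

# validate="one_to_many" raises MergeError when the left keys are not unique.
key_check_failed = False
try:
    left.merge(right, how="outer", on="key", validate="one_to_many")
except MergeError as exc:
    key_check_failed = True
    print("Key check failed:", exc)
```

Running such a check on the real frames before the full merge is one way to find out whether the cleaned key is unique per incident.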

In [13]:
# Split the merged rows by case:
df_jointure = df_merge[df_merge["_merge"] == "both"].copy()
df_left_only = df_merge[df_merge["_merge"] == "left_only"].copy()
df_right_only = df_merge[df_merge["_merge"] == "right_only"].copy()


In [12]:
# Save the results
import os

output_path = "../../data/processed/"
# Create the output directory if it does not exist yet
os.makedirs(output_path, exist_ok=True)
df_jointure.to_csv(f"{output_path}df_jointure_incidents_mobilisations.csv", index=False)
df_left_only.to_csv(f"{output_path}df_incidents_without_mobilisations.csv", index=False)
df_right_only.to_csv(f"{output_path}df_mobilisations_without_incidents.csv", index=False)

In [9]:
# Summaries
print(f"Successful join: {df_jointure.shape[0]} rows")
print(f"Incidents without mobilisation: {df_left_only.shape[0]} rows")
print(f"Mobilisations without incident: {df_right_only.shape[0]} rows")

Successful join: 9949296 rows
Incidents without mobilisation: 78266 rows
Mobilisations without incident: 2426 rows
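The same three counts can be read directly from the `_merge` column without building separate copies, which may help when the merged frame is large. A minimal sketch, using a small hypothetical frame in place of `df_merge`:

```python
import pandas as pd

# Hypothetical stand-in for df_merge: only the indicator column matters here.
df_demo = pd.DataFrame({
    "IncidentNumber_clean": ["1", "1", "2", "3"],
    "_merge": pd.Categorical(
        ["both", "both", "left_only", "right_only"],
        categories=["left_only", "right_only", "both"],
    ),
})

# One pass over the indicator column gives all three counts.
counts = df_demo["_merge"].value_counts().to_dict()

print(f"Matched rows: {counts['both']}")
print(f"Left-only rows: {counts['left_only']}")
print(f"Right-only rows: {counts['right_only']}")
```

Note that the `both` count is a number of merged rows, so an incident matched to several mobilisations is counted once per match.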


In [None]:
# Previews
print("\nJoin preview:")
display(df_jointure.head())

print("\nIncidents without mobilisation:")
display(df_left_only.head())

print("\nMobilisations without incident:")
display(df_right_only.head())



🧭 Join preview:


Unnamed: 0,IncidentNumber_incident,DateOfCall,CalYear_incident,TimeOfCall,HourOfCall_incident,IncidentGroup,StopCodeDescription,SpecialServiceType,PropertyCategory,PropertyType,...,DateAndTimeReturned,DeployedFromStation_Code,DeployedFromStation_Name,DeployedFromLocation,PumpOrder,PlusCode_Code,PlusCode_Description,DelayCodeId,DelayCode_Description,_merge
0,000001-01012016,2016-01-01,2016.0,00:03:17,0.0,False Alarm,False alarm - Good intent,,Dwelling,House - single occupancy,...,,E25,Plumstead,Home Station,1.0,Initial,Initial Mobilisation,,,both
1,000001-01012016,2016-01-01,2016.0,00:03:17,0.0,False Alarm,False alarm - Good intent,,Dwelling,House - single occupancy,...,,A24,Soho,Home Station,1.0,Initial,Initial Mobilisation,9.0,"Traffic, roadworks, etc",both
2,000001-01012016,2016-01-01,2016.0,00:03:17,0.0,False Alarm,False alarm - Good intent,,Dwelling,House - single occupancy,...,,G38,Heston,Home Station,2.0,Initial,Initial Mobilisation,7.0,Arrived but held up - Other reason,both
3,000001-01012016,2016-01-01,2016.0,00:03:17,0.0,False Alarm,False alarm - Good intent,,Dwelling,House - single occupancy,...,,G38,Heston,Home Station,1.0,Initial,Initial Mobilisation,7.0,Arrived but held up - Other reason,both
4,000001-01012016,2016-01-01,2016.0,00:03:17,0.0,False Alarm,False alarm - Good intent,,Dwelling,House - single occupancy,...,,G24,Southall,Home Station,3.0,Initial,Initial Mobilisation,,,both



📉 Incidents without mobilisation:


Unnamed: 0,IncidentNumber_incident,DateOfCall,CalYear_incident,TimeOfCall,HourOfCall_incident,IncidentGroup,StopCodeDescription,SpecialServiceType,PropertyCategory,PropertyType,...,DateAndTimeReturned,DeployedFromStation_Code,DeployedFromStation_Name,DeployedFromLocation,PumpOrder,PlusCode_Code,PlusCode_Description,DelayCodeId,DelayCode_Description,_merge
272257,004605-12012021,2021-01-12,2021.0,12:52:12,12.0,Special Service,Special Service,Lift Release,Dwelling,Purpose Built Flats/Maisonettes - 10 or more s...,...,,,,,,,,,,left_only
272258,004605-10012023,2023-01-10,2023.0,18:09:44,18.0,Special Service,Special Service,Lift Release,Non Residential,Shopping Centre,...,,,,,,,,,,left_only
410424,006932-17012018,2018-01-17,2018.0,15:54:10,15.0,Special Service,Special Service,Lift Release,Non Residential,Purpose built office,...,,,,,,,,,,left_only
2207215,037464-16032022,2022-03-16,2022.0,14:44:29,14.0,Fire,Primary Fire,,Road Vehicle,Lorry/HGV,...,,,,,,,,,,left_only
2207216,037464-05032025,2025-03-05,2025.0,12:04:11,12.0,Special Service,Special Service,Lift Release,Other Residential,Retirement/Old Persons Home,...,,,,,,,,,,left_only



📈 Mobilisations without incident:


Unnamed: 0,IncidentNumber_incident,DateOfCall,CalYear_incident,TimeOfCall,HourOfCall_incident,IncidentGroup,StopCodeDescription,SpecialServiceType,PropertyCategory,PropertyType,...,DateAndTimeReturned,DeployedFromStation_Code,DeployedFromStation_Name,DeployedFromLocation,PumpOrder,PlusCode_Code,PlusCode_Description,DelayCodeId,DelayCode_Description,_merge
3151292,,,,,,,,,,,...,,A35,Enfield,Home Station,1.0,Initial,Initial Mobilisation,,,right_only
3725945,,,,,,,,,,,...,,H37,Wallington,Home Station,1.0,Initial,Initial Mobilisation,,,right_only
3827852,,,,,,,,,,,...,,G27,North Kensington,Home Station,1.0,Initial,Initial Mobilisation,12.0,Not held up,right_only
3827853,,,,,,,,,,,...,,ESX,Essex,Other Station,1.0,Initial,Initial Mobilisation,,,right_only
3840219,,,,,,,,,,,...,,G31,Northolt,Home Station,1.0,Initial,Initial Mobilisation,7.0,Arrived but held up - Other reason,right_only
