# Generating Concrete Results
## Life Event Prediction - Business Outputs

This notebook produces results the business can act on:
- Propensity scores for every client
- Segmentation by life event
- Identification of high-value clients
- Recommended commercial actions

## 1. Imports and Configuration

In [19]:
import warnings
from datetime import datetime

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Silence library warnings to keep the notebook output readable.
warnings.filterwarnings("ignore")

print(" Imports réussis")

 Imports réussis


## 2. Loading the Data

In [20]:
print(" Chargement des données...")
clients_df = pd.read_csv("../data/clients_data.csv")
life_events_df = pd.read_csv("../data/life_events.csv")

print(f"    {len(clients_df):,} clients | {len(life_events_df):,} événements")

display(clients_df.head())

 Chargement des données...
    10,000 clients | 1,745 événements


| | client_id | age | genre | situation_familiale | nb_enfants | csp | region | anciennete_banque_mois | revenu_mensuel | epargne_totale | ... | nb_appels_conseiller_6mois | nb_visites_agence_6mois | recherche_pret_perso_recent | augmentation_epargne_recente | ouverture_compte_epargne_recent | consultation_assurance_vie | simulation_pret_immobilier | variation_revenus_recente | consultation_placements | consultation_pret_pro |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CLI_000001 | 51 | H | marie | 0 | ouvrier | IDF | 46 | 2340.87549 | 36185.721244 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | CLI_000002 | 30 | H | celibataire | 1 | employe | IDF | 76 | 3488.229435 | 27063.294343 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | CLI_000003 | 37 | H | divorce | 1 | etudiant | Autre | 56 | 831.133826 | 22003.780519 | ... | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | CLI_000004 | 46 | F | marie | 2 | cadre | IDF | 310 | 3990.968188 | 86641.514165 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | CLI_000005 | 46 | F | marie | 0 | cadre | IDF | 207 | 3431.919432 | 130216.176674 | ... | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |


## 3. Creating the Multiple Target Variables

In [21]:
print(" Création des variables cibles multiples...")

# List of distinct life events
life_events_list = life_events_df["moment_de_vie"].unique()

# Create one binary target column per life event
for event in life_events_list:
    target_clients = life_events_df[life_events_df["moment_de_vie"] == event][
        "client_id"
    ].unique()
    clients_df[f"target_{event}"] = (
        clients_df["client_id"].isin(target_clients).astype(int)
    )

print(f"    {len(life_events_list)} cibles créées:")
for event in life_events_list:
    count = clients_df[f"target_{event}"].sum()
    pct = count / len(clients_df) * 100
    print(f"      - {event}: {count:,} clients ({pct:.1f}%)")

 Création des variables cibles multiples...
    8 cibles créées:
      - achat_immobilier: 348 clients (3.5%)
      - deces_proche: 250 clients (2.5%)
      - mariage: 99 clients (1.0%)
      - changement_emploi: 618 clients (6.2%)
      - naissance: 192 clients (1.9%)
      - divorce: 128 clients (1.3%)
      - retraite: 27 clients (0.3%)
      - creation_entreprise: 83 clients (0.8%)


## 4. Feature Preprocessing

In [22]:
print(" Preprocessing des features...")

# Separate features from targets
target_columns = [f"target_{event}" for event in life_events_list]
feature_cols = [
    col for col in clients_df.columns if col not in ["client_id"] + target_columns
]

X = clients_df[feature_cols].copy()
client_ids = clients_df["client_id"].copy()

# Encode categorical columns as integer codes
categorical_cols = X.select_dtypes(include=["object"]).columns
le_dict = {}
for col in categorical_cols:
    le = LabelEncoder()
    X[col] = le.fit_transform(X[col])
    le_dict[col] = le

# Standardize features (zero mean, unit variance)
scaler = StandardScaler()
X_scaled = pd.DataFrame(scaler.fit_transform(X), columns=X.columns, index=X.index)

print(f"    Features preprocessées: {X_scaled.shape}")

 Preprocessing des features...
    Features preprocessées: (10000, 26)
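
The cell above fits `LabelEncoder` and `StandardScaler` on all 10,000 rows before any train/test split, so test rows contribute to the scaling statistics. One possible alternative, sketched below under stated assumptions, is to wrap preprocessing and model in a scikit-learn `Pipeline` so that encoders and scaler are fitted on training rows only. The toy frame, its values and the names `toy`, `preprocess` and `pipeline` are illustrative and not part of the notebook; `OrdinalEncoder` is swapped in for `LabelEncoder` because it is the encoder scikit-learn provides for feature columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Toy data (assumption): two columns named after the notebook's features.
toy = pd.DataFrame(
    {
        "csp": ["cadre", "ouvrier", "employe", "cadre"] * 10,
        "age": list(range(20, 60)),
        "target": [0, 1, 0, 0] * 10,
    }
)
X_toy, y_toy = toy[["csp", "age"]], toy["target"]
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.3, random_state=42, stratify=y_toy
)

preprocess = ColumnTransformer(
    [
        # Categories unseen during fit are mapped to -1 instead of raising.
        (
            "cat",
            OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
            ["csp"],
        ),
        ("num", StandardScaler(), ["age"]),
    ]
)
pipeline = Pipeline(
    [
        ("prep", preprocess),
        ("rf", RandomForestClassifier(n_estimators=20, random_state=42)),
    ]
)

# Encoder and scaler statistics come from the training rows only.
pipeline.fit(X_tr, y_tr)
proba = pipeline.predict_proba(X_te)[:, 1]
print(f"Scored {len(proba)} held-out rows")
```

Passing such a pipeline to `cross_val_score` would also refit the preprocessing inside each fold. Random forests do not depend on feature scale, so the scaler matters mainly if other model families are tried later.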


## 5. Training a Model for Each Life Event

In [23]:
print(" Entraînement des modèles pour chaque moment de vie...")
print("=" * 70)

results = {}

for event in life_events_list:
    print(f"\n Moment de vie: {event}")
    y = clients_df[f"target_{event}"]

    # Train/test split, stratified on the target
    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=0.3, random_state=42, stratify=y
    )

    # Train a random forest; class_weight="balanced" compensates for rare events
    rf_model = RandomForestClassifier(
        n_estimators=100,
        max_depth=10,
        random_state=42,
        class_weight="balanced",
        n_jobs=-1,
    )
    rf_model.fit(X_train, y_train)

    # Predicted probability of the positive class on the test set
    y_proba = rf_model.predict_proba(X_test)[:, 1]

    # Hold-out metric
    auc_score = roc_auc_score(y_test, y_proba)

    # 5-fold cross-validation on the training set
    cv_scores = cross_val_score(rf_model, X_train, y_train, cv=5, scoring="roc_auc")

    print(f"   - AUC: {auc_score:.3f}")
    print(f"   - CV AUC: {cv_scores.mean():.3f} (+/- {cv_scores.std():.3f})")

    # Store the results
    results[event] = {
        "model": rf_model,
        "auc": auc_score,
        "cv_auc_mean": cv_scores.mean(),
        "cv_auc_std": cv_scores.std(),
    }

print("\n Tous les modèles entraînés")

 Entraînement des modèles pour chaque moment de vie...

 Moment de vie: achat_immobilier
   - AUC: 0.931
   - CV AUC: 0.931 (+/- 0.021)

 Moment de vie: deces_proche
   - AUC: 0.691
   - CV AUC: 0.660 (+/- 0.017)

 Moment de vie: mariage
   - AUC: 0.943
   - CV AUC: 0.869 (+/- 0.065)

 Moment de vie: changement_emploi
   - AUC: 0.846
   - CV AUC: 0.814 (+/- 0.020)

 Moment de vie: naissance
   - AUC: 0.926
   - CV AUC: 0.920 (+/- 0.022)

 Moment de vie: divorce
   - AUC: 0.627
   - CV AUC: 0.603 (+/- 0.032)

 Moment de vie: retraite
   - AUC: 0.748
   - CV AUC: 0.857 (+/- 0.123)

 Moment de vie: creation_entreprise
   - AUC: 0.803
   - CV AUC: 0.861 (+/- 0.072)

 Tous les modèles entraînés
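
Several targets are very rare: `retraite` has 27 positive clients, so a 30% stratified test set holds roughly 8 of them, and its hold-out AUC (0.748) sits far from its CV mean (0.857 ± 0.123). A minimal sketch of one way to get steadier estimates is repeated stratified cross-validation, reporting average precision next to ROC AUC. It runs on synthetic data; `X_demo`, `y_demo` and the 2% positive rate are assumptions for illustration, not the notebook's data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate

# Synthetic stand-in for a rare life-event target (assumption).
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.98, 0.02], random_state=0
)

model = RandomForestClassifier(
    n_estimators=50, max_depth=10, class_weight="balanced", random_state=42
)

# 5 folds repeated 3 times gives 15 estimates instead of a single split.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)
cv_results = cross_validate(
    model, X_demo, y_demo, cv=cv, scoring=["roc_auc", "average_precision"]
)

for metric in ("roc_auc", "average_precision"):
    values = cv_results[f"test_{metric}"]
    print(f"{metric}: {values.mean():.3f} (+/- {values.std():.3f})")
```

Average precision is sensitive to how many of the top-ranked clients are true positives, which is closer to how a top-N contact list is used than ROC AUC alone.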


## 6. Generating the Propensity Scores

In [24]:
print(" Génération des scores de propension...")

propensity_df = clients_df[
    [
        "client_id",
        "age",
        "genre",
        "situation_familiale",
        "csp",
        "revenu_mensuel",
        "epargne_totale",
    ]
].copy()

# Compute the propensity score for each life event
for event in life_events_list:
    model = results[event]["model"]
    propensity_scores = model.predict_proba(X_scaled)[:, 1]
    propensity_df[f"propension_{event}"] = propensity_scores

# Add the most likely life event for each client
propensity_cols = [f"propension_{event}" for event in life_events_list]
propensity_df["moment_vie_principal"] = (
    propensity_df[propensity_cols]
    .idxmax(axis=1)
    .str.replace("propension_", "", regex=False)
)
propensity_df["score_max"] = propensity_df[propensity_cols].max(axis=1)

print(f"    Scores calculés pour {len(propensity_df):,} clients")

display(propensity_df.head(10))

 Génération des scores de propension...
    Scores calculés pour 10,000 clients


| | client_id | age | genre | situation_familiale | csp | revenu_mensuel | epargne_totale | propension_achat_immobilier | propension_deces_proche | propension_mariage | propension_changement_emploi | propension_naissance | propension_divorce | propension_retraite | propension_creation_entreprise | moment_vie_principal | score_max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CLI_000001 | 51 | H | marie | ouvrier | 2340.87549 | 36185.721244 | 0.093649 | 0.501749 | 0.031945 | 0.379751 | 0.022501 | 0.307542 | 0.0 | 0.132129 | deces_proche | 0.501749 |
| 1 | CLI_000002 | 30 | H | celibataire | employe | 3488.229435 | 27063.294343 | 0.092111 | 0.437648 | 0.106934 | 0.369225 | 0.062504 | 0.26632 | 0.0 | 0.106153 | deces_proche | 0.437648 |
| 2 | CLI_000003 | 37 | H | divorce | etudiant | 831.133826 | 22003.780519 | 0.083471 | 0.501326 | 0.120992 | 0.348568 | 0.08783 | 0.117107 | 0.009741 | 0.025869 | deces_proche | 0.501326 |
| 3 | CLI_000004 | 46 | F | marie | cadre | 3990.968188 | 86641.514165 | 0.044911 | 0.279838 | 0.010773 | 0.33268 | 0.043242 | 0.10808 | 0.0 | 0.01714 | changement_emploi | 0.33268 |
| 4 | CLI_000005 | 46 | F | marie | cadre | 3431.919432 | 130216.176674 | 0.727599 | 0.365448 | 0.043908 | 0.238503 | 0.018929 | 0.090853 | 0.0 | 0.108413 | achat_immobilier | 0.727599 |
| 5 | CLI_000006 | 24 | F | marie | profession_liberale | 9081.347399 | 76443.817896 | 0.358028 | 0.036791 | 0.07485 | 0.222602 | 0.127347 | 0.086834 | 0.0 | 0.012493 | achat_immobilier | 0.358028 |
| 6 | CLI_000007 | 49 | F | celibataire | retraite | 882.903908 | 7810.686283 | 0.048276 | 0.506125 | 0.008626 | 0.337572 | 0.003559 | 0.31279 | 0.0 | 0.014378 | deces_proche | 0.506125 |
| 7 | CLI_000008 | 39 | H | celibataire | cadre | 7885.947084 | 329502.867262 | 0.086937 | 0.377625 | 0.266202 | 0.084425 | 0.089019 | 0.083868 | 0.0 | 0.069014 | deces_proche | 0.377625 |
| 8 | CLI_000009 | 58 | F | celibataire | cadre | 3133.374615 | 9769.996336 | 0.011027 | 0.463969 | 0.006026 | 0.387886 | 0.006912 | 0.195813 | 0.0 | 0.019927 | deces_proche | 0.463969 |
| 9 | CLI_000010 | 40 | H | celibataire | ouvrier | 3456.768389 | 33380.119591 | 0.097881 | 0.140397 | 0.00915 | 0.889526 | 0.064944 | 0.183524 | 0.0 | 0.200308 | changement_emploi | 0.889526 |
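
These scores come from `model.predict_proba(X_scaled)` over all 10,000 clients, about 70% of whom were in each model's training set, so scores for those clients are likely to be optimistic. A minimal sketch of one alternative is out-of-fold scoring with scikit-learn's `cross_val_predict`, where every row is scored by a model that did not see it during fitting. The synthetic dataset and the names `X_demo`, `y_demo`, `oof_scores` and `in_sample_scores` are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic, imbalanced stand-in for one life-event target (assumption).
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.95, 0.05], random_state=42
)

model = RandomForestClassifier(
    n_estimators=50, max_depth=10, class_weight="balanced", random_state=42
)

# Out-of-fold: each row is scored by the fold model that excluded it.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
oof_scores = cross_val_predict(
    model, X_demo, y_demo, cv=cv, method="predict_proba"
)[:, 1]

# In-sample, for comparison: the model has already seen these rows.
in_sample_scores = model.fit(X_demo, y_demo).predict_proba(X_demo)[:, 1]

positives = y_demo == 1
print(f"Mean score of positives, in-sample:   {in_sample_scores[positives].mean():.3f}")
print(f"Mean score of positives, out-of-fold: {oof_scores[positives].mean():.3f}")
```

In a production setting the scored population would usually be clients whose life event has not yet been observed, which avoids the issue altogether.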


## 7. Client Segmentation

In [25]:
print(" Segmentation des clients...")

segments = []

for event in life_events_list:
    col_name = f"propension_{event}"

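    # Build three segments: Faible (low), Moyen (medium), Élevé (high).
    # include_lowest=True keeps scores of exactly 0 in "Faible"; without it
    # they fall outside the first bin, become NaN and drop out of the counts.
    propensity_df[f"segment_{event}"] = pd.cut(
        propensity_df[col_name],
        bins=[0, 0.3, 0.6, 1.0],
        labels=["Faible", "Moyen", "Élevé"],
        include_lowest=True,
    )

    # Count clients per segment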
    segment_counts = propensity_df[f"segment_{event}"].value_counts()

    segments.append(
        {
            "moment_de_vie": event,
            "clients_elevee": segment_counts.get("Élevé", 0),
            "clients_moyenne": segment_counts.get("Moyen", 0),
            "clients_faible": segment_counts.get("Faible", 0),
            "score_moyen": propensity_df[col_name].mean(),
            "score_median": propensity_df[col_name].median(),
        }
    )

segments_df = pd.DataFrame(segments)
print("\n Segments créés:")
display(segments_df)

 Segmentation des clients...

 Segments créés:


| | moment_de_vie | clients_elevee | clients_moyenne | clients_faible | score_moyen | score_median |
|---|---|---|---|---|---|---|
| 0 | achat_immobilier | 421 | 620 | 8955 | 0.119428 | 0.068694 |
| 1 | deces_proche | 95 | 4683 | 5219 | 0.280831 | 0.283947 |
| 2 | mariage | 63 | 102 | 8753 | 0.050436 | 0.025343 |
| 3 | changement_emploi | 544 | 4593 | 4860 | 0.284327 | 0.311524 |
| 4 | naissance | 186 | 321 | 9414 | 0.077325 | 0.04298 |
| 5 | divorce | 48 | 890 | 9061 | 0.164386 | 0.14678 |
| 6 | retraite | 16 | 5 | 2324 | 0.007869 | 0.0 |
| 7 | creation_entreprise | 58 | 115 | 8812 | 0.059959 | 0.037113 |
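
The counts recorded above do not sum to 10,000 clients: `retraite` totals 16 + 5 + 2,324 = 2,345 and `achat_immobilier` totals 9,996. This is the pattern `pd.cut` produces when the lowest bin edge is excluded. With the defaults `right=True` and `include_lowest=False`, the first interval is `(0, 0.3]`, so a score of exactly 0 is assigned to no segment. The `retraite` median of 0.0 shows that at least half of its scores are exactly 0. A small sketch on made-up scores (the `scores` series is an assumption for illustration):

```python
import pandas as pd

# Made-up scores (assumption), including exact zeros like the retraite column.
scores = pd.Series([0.0, 0.0, 0.15, 0.3, 0.45, 0.9])
bins = [0, 0.3, 0.6, 1.0]
labels = ["Faible", "Moyen", "Élevé"]

# Defaults: the first interval is (0, 0.3], so 0.0 is left unassigned (NaN).
default_cut = pd.cut(scores, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left: [0, 0.3].
fixed_cut = pd.cut(scores, bins=bins, labels=labels, include_lowest=True)

print("Unassigned with defaults:", default_cut.isna().sum())
print("Unassigned with include_lowest=True:", fixed_cut.isna().sum())
print(fixed_cut.value_counts())
```

With `include_lowest=True`, every score in `[0, 1]` lands in exactly one segment, so the three counts for each life event add up to the number of clients.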


## 8. Identifying High-Value Clients

In [26]:
top_n = 100
print(f" Identification des top {top_n} clients par moment de vie...\n")

high_value_clients = {}

for event in life_events_list:
    col_name = f"propension_{event}"

    # Top N clients
    top_clients = propensity_df.nlargest(top_n, col_name)[
        [
            "client_id",
            "age",
            "genre",
            "situation_familiale",
            "csp",
            "revenu_mensuel",
            "epargne_totale",
            col_name,
        ]
    ].copy()

    top_clients = top_clients.rename(columns={col_name: "score_propension"})
    top_clients["moment_de_vie"] = event
    top_clients["rang"] = range(1, len(top_clients) + 1)

    high_value_clients[event] = top_clients

    print(
        f"    {event}: {len(top_clients)} clients (score moyen: {top_clients['score_propension'].mean():.3f})"
    )

print("\n Exemple - Top 10 clients pour 'achat_immobilier':")
if "achat_immobilier" in high_value_clients:
    display(high_value_clients["achat_immobilier"].head(10))

 Identification des top 100 clients par moment de vie...

    achat_immobilier: 100 clients (score moyen: 0.920)
    deces_proche: 100 clients (score moyen: 0.624)
    mariage: 100 clients (score moyen: 0.630)
    changement_emploi: 100 clients (score moyen: 0.909)
    naissance: 100 clients (score moyen: 0.823)
    divorce: 100 clients (score moyen: 0.597)
    retraite: 100 clients (score moyen: 0.288)
    creation_entreprise: 100 clients (score moyen: 0.641)

 Exemple - Top 10 clients pour 'achat_immobilier':


| | client_id | age | genre | situation_familiale | csp | revenu_mensuel | epargne_totale | score_propension | moment_de_vie | rang |
|---|---|---|---|---|---|---|---|---|---|---|
| 4729 | CLI_004730 | 35 | H | marie | employe | 2788.865312 | 11228.595062 | 0.973792 | achat_immobilier | 1 |
| 4124 | CLI_004125 | 47 | F | celibataire | profession_liberale | 2308.494945 | 12229.748154 | 0.973316 | achat_immobilier | 2 |
| 4972 | CLI_004973 | 27 | F | divorce | ouvrier | 2045.905916 | 27284.051908 | 0.972858 | achat_immobilier | 3 |
| 5758 | CLI_005759 | 43 | H | marie | cadre | 2910.858596 | 12895.620467 | 0.971793 | achat_immobilier | 4 |
| 1320 | CLI_001321 | 53 | F | marie | cadre | 3641.165941 | 19830.445764 | 0.967636 | achat_immobilier | 5 |
| 6769 | CLI_006770 | 42 | F | celibataire | cadre | 4878.468235 | 27949.313179 | 0.964495 | achat_immobilier | 6 |
| 7241 | CLI_007242 | 44 | F | marie | cadre | 5037.544935 | 118760.101212 | 0.963887 | achat_immobilier | 7 |
| 2086 | CLI_002087 | 38 | H | celibataire | etudiant | 1202.955364 | 19737.141553 | 0.961189 | achat_immobilier | 8 |
| 7741 | CLI_007742 | 27 | F | divorce | retraite | 1477.200508 | 54043.769897 | 0.96 | achat_immobilier | 9 |
| 4855 | CLI_004856 | 38 | F | marie | retraite | 2073.955517 | 42361.804543 | 0.958965 | achat_immobilier | 10 |


## 9. Business Recommendations

In [27]:
print(" Génération des recommandations business...\n")

# Mapping from life event to recommended action
actions_map = {
    "achat_immobilier": "Campagne prêt immobilier - Simulation personnalisée",
    "mariage": "Offre compte joint + assurance vie couple",
    "naissance": "Plan épargne enfant + assurance scolaire",
    "divorce": "Accompagnement restructuration financière",
    "retraite": "Bilan retraite + produits de rente",
    "creation_entreprise": "Pack entrepreneur - Compte pro + financement",
    "deces_proche": "Accompagnement succession + conseil patrimonial",
    "demenagement": "Services domiciliation + offre multi-équipement",
    "changement_emploi": "Renégociation crédits + épargne salariale",
}

recommendations = []

for event in life_events_list:
    # Analyze the "Élevé" (high-propensity) segment
    high_prop_clients = propensity_df[propensity_df[f"segment_{event}"] == "Élevé"]

    if len(high_prop_clients) > 0:
        # Typical profile
        age_moyen = high_prop_clients["age"].mean()
        csp_principale = (
            high_prop_clients["csp"].mode()[0]
            if len(high_prop_clients["csp"].mode()) > 0
            else "N/A"
        )
        revenu_moyen = high_prop_clients["revenu_mensuel"].mean()
        epargne_moyenne = high_prop_clients["epargne_totale"].mean()

        # Recommendation record for this life event
        rec = {
            "moment_de_vie": event,
            "nb_clients_cibles": len(high_prop_clients),
            "age_moyen": round(age_moyen, 1),
            "csp_principale": csp_principale,
            "revenu_moyen": round(revenu_moyen, 0),
            "epargne_moyenne": round(epargne_moyenne, 0),
            "action_recommandee": actions_map.get(
                event, "Contact personnalisé conseiller"
            ),
            "canal_prioritaire": "Email personnalisé + appel conseiller",
        }

        recommendations.append(rec)

recommendations_df = pd.DataFrame(recommendations)
print(" Recommandations générées\n")
display(recommendations_df)

 Génération des recommandations business...

 Recommandations générées



| | moment_de_vie | nb_clients_cibles | age_moyen | csp_principale | revenu_moyen | epargne_moyenne | action_recommandee | canal_prioritaire |
|---|---|---|---|---|---|---|---|---|
| 0 | achat_immobilier | 421 | 42.0 | employe | 3055.0 | 35284.0 | Campagne prêt immobilier - Simulation personna... | Email personnalisé + appel conseiller |
| 1 | deces_proche | 95 | 48.8 | employe | 2220.0 | 26240.0 | Accompagnement succession + conseil patrimonial | Email personnalisé + appel conseiller |
| 2 | mariage | 63 | 36.2 | employe | 2782.0 | 28739.0 | Offre compte joint + assurance vie couple | Email personnalisé + appel conseiller |
| 3 | changement_emploi | 544 | 43.8 | employe | 2952.0 | 34648.0 | Renégociation crédits + épargne salariale | Email personnalisé + appel conseiller |
| 4 | naissance | 186 | 37.3 | employe | 3027.0 | 35296.0 | Plan épargne enfant + assurance scolaire | Email personnalisé + appel conseiller |
| 5 | divorce | 48 | 43.4 | employe | 2714.0 | 34576.0 | Accompagnement restructuration financière | Email personnalisé + appel conseiller |
| 6 | retraite | 16 | 56.5 | employe | 2134.0 | 45289.0 | Bilan retraite + produits de rente | Email personnalisé + appel conseiller |
| 7 | creation_entreprise | 58 | 41.8 | employe | 2944.0 | 34248.0 | Pack entrepreneur - Compte pro + financement | Email personnalisé + appel conseiller |


## 10. Model Performance

In [28]:
performance_data = []
for event in life_events_list:
    performance_data.append(
        {
            "moment_de_vie": event,
            "auc_score": results[event]["auc"],
            "cv_auc_mean": results[event]["cv_auc_mean"],
            "cv_auc_std": results[event]["cv_auc_std"],
        }
    )

performance_df = pd.DataFrame(performance_data)
print(" Performance des modèles:")
display(performance_df)

 Performance des modèles:


| | moment_de_vie | auc_score | cv_auc_mean | cv_auc_std |
|---|---|---|---|---|
| 0 | achat_immobilier | 0.930707 | 0.930754 | 0.021134 |
| 1 | deces_proche | 0.691191 | 0.660345 | 0.017201 |
| 2 | mariage | 0.942514 | 0.868867 | 0.06481 |
| 3 | changement_emploi | 0.845665 | 0.813691 | 0.020062 |
| 4 | naissance | 0.926293 | 0.920239 | 0.021829 |
| 5 | divorce | 0.626515 | 0.603264 | 0.03237 |
| 6 | retraite | 0.748099 | 0.856515 | 0.122549 |
| 7 | creation_entreprise | 0.802716 | 0.861044 | 0.072089 |
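
Two of the models above, `deces_proche` and `divorce`, stay below 0.70 on both the hold-out AUC and the CV mean, yet their segments feed the same recommendations as the stronger models. One possible safeguard, sketched here, is to gate downstream use on a minimum AUC. The values are copied from the performance table; the 0.70 threshold and the names `MIN_AUC`, `weakest_auc` and `weak_models` are illustrative assumptions, not a rule from the notebook.

```python
import pandas as pd

# Values copied from the performance table above.
performance = pd.DataFrame(
    {
        "moment_de_vie": [
            "achat_immobilier", "deces_proche", "mariage", "changement_emploi",
            "naissance", "divorce", "retraite", "creation_entreprise",
        ],
        "auc_score": [
            0.930707, 0.691191, 0.942514, 0.845665,
            0.926293, 0.626515, 0.748099, 0.802716,
        ],
        "cv_auc_mean": [
            0.930754, 0.660345, 0.868867, 0.813691,
            0.920239, 0.603264, 0.856515, 0.861044,
        ],
    }
)

MIN_AUC = 0.70  # assumption: illustrative cut-off, to be set with the business

# Judge each model by the less favourable of its two AUC estimates.
performance["weakest_auc"] = performance[["auc_score", "cv_auc_mean"]].min(axis=1)
weak_models = performance.loc[
    performance["weakest_auc"] < MIN_AUC, "moment_de_vie"
].tolist()

print("Models below threshold:", weak_models)
```

Life events in `weak_models` could then be excluded from automated campaigns or routed to manual review until their models improve.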


## 11. Saving the Results

In [29]:
print(" Sauvegarde des résultats...\n")

# 1. Propensity scores for all clients
propensity_df.to_csv("../data/resultats_scores_propension.csv", index=False)
print("    resultats_scores_propension.csv")

# 2. Segments
segments_df.to_csv("../data/resultats_segments.csv", index=False)
print("    resultats_segments.csv")

# 3. Top clients per life event (one file per event)
for event, df in high_value_clients.items():
    filename = f"../data/resultats_top_clients_{event.replace(' ', '_')}.csv"
    df.to_csv(filename, index=False)
print(f"    {len(high_value_clients)} fichiers top clients")

# 4. Business recommendations
recommendations_df.to_csv("../data/resultats_recommandations_business.csv", index=False)
print("    resultats_recommandations_business.csv")

# 5. Model performance
performance_df.to_csv("../data/resultats_performance_modeles.csv", index=False)
print("    resultats_performance_modeles.csv")

 Sauvegarde des résultats...

    resultats_scores_propension.csv
    resultats_segments.csv
    8 fichiers top clients
    resultats_recommandations_business.csv
    resultats_performance_modeles.csv


## 12. Consolidated Report

In [30]:
print("=" * 80)
print("RAPPORT DE RÉSULTATS - PRÉDICTION DES MOMENTS DE VIE")
print(f"Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("=" * 80)

# Overview
print("\n VUE D'ENSEMBLE")
print("-" * 80)
print(f"Nombre total de clients analysés: {len(propensity_df):,}")
print(f"Nombre de moments de vie modélisés: {len(life_events_list)}")
print(f"Score AUC moyen: {performance_df['auc_score'].mean():.3f}")

# Per-model performance
print("\n PERFORMANCE DES MODÈLES")
print("-" * 80)
for _, row in performance_df.iterrows():
    print(
        f"  • {row['moment_de_vie']}: AUC = {row['auc_score']:.3f} (CV: {row['cv_auc_mean']:.3f} ± {row['cv_auc_std']:.3f})"
    )

# Top 3 opportunities, ranked by number of high-propensity clients
print("\n TOP 3 OPPORTUNITÉS")
print("-" * 80)
top_opportunities = segments_df.nlargest(3, "clients_elevee")
for idx, (_, row) in enumerate(top_opportunities.iterrows(), 1):
    rec = recommendations_df[
        recommendations_df["moment_de_vie"] == row["moment_de_vie"]
    ].iloc[0]
    print(f"\n  {idx}. {row['moment_de_vie'].upper()}")
    print(f"     Clients cibles: {rec['nb_clients_cibles']:,}")
    print(
        f"     Profil: {rec['csp_principale']}, {rec['age_moyen']:.0f} ans, {rec['revenu_moyen']:,.0f}€/mois"
    )
    print(f"     Action: {rec['action_recommandee']}")

# Global statistics
print("\n STATISTIQUES GLOBALES")
print("-" * 80)
high_propensity_count = len(propensity_df[propensity_df["score_max"] > 0.6])
print(f"Clients à haute propension (>0.6): {high_propensity_count:,}")

print("\n ANALYSE TERMINÉE AVEC SUCCÈS!")
print("=" * 80)

RAPPORT DE RÉSULTATS - PRÉDICTION DES MOMENTS DE VIE
Date: 2025-11-18 10:56:09

 VUE D'ENSEMBLE
--------------------------------------------------------------------------------
Nombre total de clients analysés: 10,000
Nombre de moments de vie modélisés: 8
Score AUC moyen: 0.814

 PERFORMANCE DES MODÈLES
--------------------------------------------------------------------------------
  • achat_immobilier: AUC = 0.931 (CV: 0.931 ± 0.021)
  • deces_proche: AUC = 0.691 (CV: 0.660 ± 0.017)
  • mariage: AUC = 0.943 (CV: 0.869 ± 0.065)
  • changement_emploi: AUC = 0.846 (CV: 0.814 ± 0.020)
  • naissance: AUC = 0.926 (CV: 0.920 ± 0.022)
  • divorce: AUC = 0.627 (CV: 0.603 ± 0.032)
  • retraite: AUC = 0.748 (CV: 0.857 ± 0.123)
  • creation_entreprise: AUC = 0.803 (CV: 0.861 ± 0.072)

 TOP 3 OPPORTUNITÉS
--------------------------------------------------------------------------------

  1. CHANGEMENT_EMPLOI
     Clients cibles: 544
     Profil: employe, 44 ans, 2,952€/mois
     Action: Renégocia

## 13. Recommended Actions

### Immediate actions:
1. Contact the top 100 clients for each life event
2. Launch targeted campaigns on the high-propensity segments
3. Personalize the mobile app with contextual offers
4. Train advisors on the typical profiles identified
5. Set up monthly monitoring of the scores

### Future improvements:
- Refine the models with more historical data
- Test other algorithms (XGBoost, LightGBM)
- Integrate external data (real estate, employment)
- Develop a real-time dashboard
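
Comparing algorithms, as suggested in the list above, can reuse the same cross-validated AUC used earlier in the notebook. The sketch below is a hedged illustration on synthetic data. scikit-learn's `GradientBoostingClassifier` stands in for XGBoost or LightGBM to avoid extra dependencies, and `X_demo`, `y_demo`, `candidates` and `comparison` are hypothetical names.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, imbalanced stand-in for one life-event target (assumption).
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=42
)

candidates = {
    "random_forest": RandomForestClassifier(
        n_estimators=50, max_depth=10, class_weight="balanced", random_state=42
    ),
    # Stand-in for a boosted-tree library such as XGBoost or LightGBM.
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=42),
}

# The same folds are used for every candidate so the scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

comparison = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X_demo, y_demo, cv=cv, scoring="roc_auc")
    comparison[name] = (scores.mean(), scores.std())
    print(f"{name}: AUC {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Any estimator that follows the scikit-learn interface can be added to `candidates`; differences smaller than the fold-to-fold standard deviation are unlikely to be meaningful.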