# 📊 Notebook 4: In-Depth Model Evaluation

## Objective: Analyze performance and errors, and prepare for deployment

### This notebook covers:
1. Loading the trained models
2. Detailed performance analysis
3. Prediction error analysis
4. Learning curves and cross-validation
5. SHAP analysis for interpretability
6. Sector recommendations
7. Final report and deployment preparation
8. Exporting results for the API

## 1. 📦 Importing libraries

In [1]:
# Core libraries
import pandas as pd
import numpy as np
from pathlib import Path
import json
import pickle
import warnings
warnings.filterwarnings('ignore')

# Visualisation
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Machine Learning
from sklearn.model_selection import cross_val_score, learning_curve
from sklearn.metrics import (mean_squared_error, mean_absolute_error, r2_score,
                            accuracy_score, classification_report, confusion_matrix,
                            precision_recall_curve, roc_curve, auc,
                            precision_score, recall_score, f1_score)

# XGBoost
import xgboost as xgb

# Configuration
plt.style.use('seaborn-v0_8-whitegrid')
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', lambda x: '%.3f' % x)

print("✅ Libraries imported successfully!")

✅ Libraries imported successfully!


## 2. 📂 Loading the models and data

In [2]:
# Paths
DATA_PATH = Path("../data/processed")
MODELS_PATH = Path("../models")

# Load the data
df = pd.read_csv(DATA_PATH / "dataset_ml_complete.csv")

# Load the models (only unpickle files that come from a trusted source)
with open(MODELS_PATH / "xgb_regression_substitution.pkl", 'rb') as f:
    model_reg = pickle.load(f)

with open(MODELS_PATH / "xgb_classification_opportunity.pkl", 'rb') as f:
    model_clf = pickle.load(f)

with open(MODELS_PATH / "scaler.pkl", 'rb') as f:
    scaler = pickle.load(f)

# Load the metadata
with open(MODELS_PATH / "model_metadata.json", 'r', encoding='utf-8') as f:
    model_metadata = json.load(f)

# Load the sector analysis
sector_opportunities = pd.read_csv(MODELS_PATH / "sector_opportunities.csv", index_col=0)

print("✅ Models and data loaded successfully!")
print(f"\n📊 Dataset: {df.shape[0]} rows × {df.shape[1]} columns")
print(f"📅 Training date: {model_metadata['date_training']}")
print(f"\n📈 Regression model - R²: {model_metadata['regression_model']['metrics']['r2']:.4f}")
print(f"📈 Classification model - Accuracy: {model_metadata['classification_model']['metrics']['accuracy']:.4f}")

✅ Models and data loaded successfully!

📊 Dataset: 210 rows × 52 columns
📅 Training date: 2025-12-19 17:35:10

📈 Regression model - R²: 0.8320
📈 Classification model - Accuracy: 0.9841


In [3]:
# Recreate the target variables and prepare the data
df_eval = df.copy()

# Recreate the substitution score
df_eval['substitution_score'] = np.where(
    df_eval['imports_fcfa'] > 0,
    df_eval['production_fcfa'] / df_eval['imports_fcfa'],
    df_eval['production_fcfa']
)
df_eval['substitution_score_normalized'] = (
    (df_eval['substitution_score'] - df_eval['substitution_score'].min()) / 
    (df_eval['substitution_score'].max() - df_eval['substitution_score'].min()) * 100
)

# Recreate the opportunity classes
def classify_opportunity(row):
    score = 0
    ratio = row['production_fcfa'] / row['imports_fcfa'] if row['imports_fcfa'] > 0 else row['production_fcfa']
    demande = row['production_fcfa'] + row['imports_fcfa'] - row['exports_fcfa']
    autosuff = (row['production_fcfa'] / demande * 100) if demande > 0 else 0
    
    if ratio > 5: score += 3
    elif ratio > 1: score += 2
    elif ratio > 0.5: score += 1
    
    if autosuff > 80: score += 3
    elif autosuff > 50: score += 2
    elif autosuff > 30: score += 1
    
    if row['production_fcfa_growth'] > 10: score += 2
    elif row['production_fcfa_growth'] > 0: score += 1
    
    if score >= 6: return 2
    elif score >= 3: return 1
    else: return 0

df_eval['opportunity_class'] = df_eval.apply(classify_opportunity, axis=1)
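# Labels stay in French because they are exported to the API config
# (Faible = low, Moyenne = medium, Haute = high)
opportunity_labels = {0: 'Faible', 1: 'Moyenne', 2: 'Haute'}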
df_eval['opportunity_label'] = df_eval['opportunity_class'].map(opportunity_labels)

# Features
feature_cols = model_metadata['feature_columns']
train_years = model_metadata['train_years']
test_years = model_metadata['test_years']

# Split train/test
train_mask = df_eval['year'].isin(train_years)
test_mask = df_eval['year'].isin(test_years)

X_train = df_eval.loc[train_mask, feature_cols].values
X_test = df_eval.loc[test_mask, feature_cols].values
y_train_reg = df_eval.loc[train_mask, 'substitution_score_normalized'].values
y_test_reg = df_eval.loc[test_mask, 'substitution_score_normalized'].values
y_train_clf = df_eval.loc[train_mask, 'opportunity_class'].values
y_test_clf = df_eval.loc[test_mask, 'opportunity_class'].values

# Scaling (the scaler was fitted during training, so it is only applied here)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("✅ Data prepared for evaluation")
print(f"   Train: {len(X_train)} samples")
print(f"   Test: {len(X_test)} samples")

✅ Data prepared for evaluation
   Train: 147 samples
   Test: 63 samples
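
The row-wise scoring rule in `classify_opportunity` can also be written without `DataFrame.apply`. The sketch below is one possible vectorized form using the same thresholds; the function name `classify_opportunity_vectorized` and the toy rows are illustrative assumptions, not part of the original pipeline.

```python
import numpy as np
import pandas as pd

def classify_opportunity_vectorized(df: pd.DataFrame) -> pd.Series:
    """Vectorized sketch of the scoring rule (same thresholds as the row-wise version)."""
    prod = df['production_fcfa']
    imp = df['imports_fcfa']
    exp = df['exports_fcfa']
    growth = df['production_fcfa_growth'].to_numpy()

    # Production/import ratio, falling back to raw production when imports are 0
    ratio = np.where(imp > 0, prod / imp.where(imp > 0, 1), prod)

    # Self-sufficiency: production as a share of apparent demand, in percent
    demand = prod + imp - exp
    autosuff = np.where(demand > 0, prod / demand.where(demand > 0, 1) * 100, 0)

    # np.select picks the first matching condition, mirroring the if/elif chains
    score = (
        np.select([ratio > 5, ratio > 1, ratio > 0.5], [3, 2, 1], default=0)
        + np.select([autosuff > 80, autosuff > 50, autosuff > 30], [3, 2, 1], default=0)
        + np.select([growth > 10, growth > 0], [2, 1], default=0)
    )
    classes = np.select([score >= 6, score >= 3], [2, 1], default=0)
    return pd.Series(classes, index=df.index, name='opportunity_class')

# Toy rows (made-up values) covering the three classes
toy = pd.DataFrame({
    'production_fcfa': [100.0, 10.0, 5.0],
    'imports_fcfa': [10.0, 10.0, 50.0],
    'exports_fcfa': [20.0, 0.0, 0.0],
    'production_fcfa_growth': [12.0, 3.0, -1.0],
})
toy_classes = classify_opportunity_vectorized(toy)
print(toy_classes.tolist())
```

On a 210-row dataset the speed difference is negligible; the main benefit is that each threshold table is visible in one place.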


## 3. 📊 Detailed performance analysis

In [4]:
# 3.1 Regression model performance
y_pred_reg = model_reg.predict(X_test_scaled)
y_pred_train_reg = model_reg.predict(X_train_scaled)

# Detailed metrics
# Note: the min-max normalized target contains values at or near 0, so the
# MAPE below divides by ~1e-10 for those rows and explodes. Do not rely on it.
metrics_reg = {
    'Train': {
        'RMSE': np.sqrt(mean_squared_error(y_train_reg, y_pred_train_reg)),
        'MAE': mean_absolute_error(y_train_reg, y_pred_train_reg),
        'R²': r2_score(y_train_reg, y_pred_train_reg),
        'MAPE': np.mean(np.abs((y_train_reg - y_pred_train_reg) / (y_train_reg + 1e-10))) * 100
    },
    'Test': {
        'RMSE': np.sqrt(mean_squared_error(y_test_reg, y_pred_reg)),
        'MAE': mean_absolute_error(y_test_reg, y_pred_reg),
        'R²': r2_score(y_test_reg, y_pred_reg),
        'MAPE': np.mean(np.abs((y_test_reg - y_pred_reg) / (y_test_reg + 1e-10))) * 100
    }
}

print("📊 REGRESSION MODEL PERFORMANCE")
print("="*60)
print(f"\n{'Metric':<15} {'Train':>15} {'Test':>15} {'Gap':>15}")
print("-"*60)
for metric in ['RMSE', 'MAE', 'R²', 'MAPE']:
    train_val = metrics_reg['Train'][metric]
    test_val = metrics_reg['Test'][metric]
    ecart = abs(train_val - test_val)
    print(f"{metric:<15} {train_val:>15.4f} {test_val:>15.4f} {ecart:>15.4f}")

# Overfitting check (a ratio below 1 means test R² is higher than train R²)
overfit_ratio = metrics_reg['Train']['R²'] / metrics_reg['Test']['R²']
print(f"\n📈 Train/Test R² ratio: {overfit_ratio:.2f}")
if overfit_ratio > 1.2:
    print("   ⚠️ Slight tendency to overfit detected")
else:
    print("   ✅ No significant overfitting")

📊 REGRESSION MODEL PERFORMANCE

Metric                    Train            Test             Gap
------------------------------------------------------------
RMSE                     6.8975          3.6252          3.2723
MAE                      2.3108          1.8506          0.4602
R²                       0.7639          0.8320          0.0681
MAPE            22166740059.2157 17240798556.3375 4925941502.8782

📈 Train/Test R² ratio: 0.92
   ✅ No significant overfitting
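
The MAPE values in the billions come from dividing by targets that are at or near zero after min-max normalization. Two common workarounds are sketched below: a MAPE restricted to targets above a floor, and a symmetric MAPE. The function names and the `min_abs` threshold are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def masked_mape(y_true, y_pred, min_abs=1.0):
    """MAPE over samples whose |actual| is at least `min_abs`; NaN if none qualify."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = np.abs(y_true) >= min_abs
    if not mask.any():
        return float('nan')
    return float(np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask])) * 100)

def smape(y_true, y_pred):
    """Symmetric MAPE in percent, bounded in [0, 200]; 0/0 pairs count as zero error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2
    diff = np.abs(y_true - y_pred)
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom > 0)
    return float(np.mean(ratio) * 100)

# Made-up values: the first target is exactly 0, which would break a plain MAPE
y_true_demo = np.array([0.0, 50.0, 100.0])
y_pred_demo = np.array([1.0, 45.0, 110.0])
print(f"Masked MAPE: {masked_mape(y_true_demo, y_pred_demo):.2f}%")
print(f"sMAPE: {smape(y_true_demo, y_pred_demo):.2f}%")
```

Neither variant is a drop-in replacement: the masked version ignores the low-score sectors entirely, and sMAPE still penalizes small absolute errors heavily when both values are close to zero. RMSE and MAE remain the more trustworthy figures for this target.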


In [5]:
# 3.2 Visualizing regression performance
from scipy import stats

fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=(
        'Predictions vs Actual (Test)',
        'Residuals vs Predictions',
        'Error Distribution',
        'Q-Q Plot of Residuals'
    )
)

# 1. Scatter plot of predictions vs actual values
fig.add_trace(
    go.Scatter(x=y_test_reg, y=y_pred_reg, mode='markers',
               marker=dict(color='steelblue', size=8, opacity=0.7),
               name='Predictions'),
    row=1, col=1
)
fig.add_trace(
    go.Scatter(x=[0, max(y_test_reg)], y=[0, max(y_test_reg)], mode='lines',
               line=dict(color='red', dash='dash'), name='Perfect'),
    row=1, col=1
)

# 2. Residuals vs predictions
residuals = y_test_reg - y_pred_reg
fig.add_trace(
    go.Scatter(x=y_pred_reg, y=residuals, mode='markers',
               marker=dict(color='#e74c3c', size=8, opacity=0.7),
               name='Residuals'),
    row=1, col=2
)
fig.add_trace(
    go.Scatter(x=[min(y_pred_reg), max(y_pred_reg)], y=[0, 0], mode='lines',
               line=dict(color='black', dash='dash')),
    row=1, col=2
)

# 3. Error distribution
fig.add_trace(
    go.Histogram(x=residuals, nbinsx=20, marker_color='steelblue', name='Errors'),
    row=2, col=1
)

# 4. Approximate Q-Q plot (evenly spaced probabilities between 0.01 and 0.99)
sorted_residuals = np.sort(residuals)
theoretical_quantiles = np.linspace(0.01, 0.99, len(sorted_residuals))
theoretical_values = stats.norm.ppf(theoretical_quantiles, loc=np.mean(residuals), scale=np.std(residuals))
fig.add_trace(
    go.Scatter(x=theoretical_values, y=sorted_residuals, mode='markers',
               marker=dict(color='#27ae60', size=6), name='Q-Q'),
    row=2, col=2
)
fig.add_trace(
    go.Scatter(x=[min(theoretical_values), max(theoretical_values)], 
               y=[min(theoretical_values), max(theoretical_values)],
               mode='lines', line=dict(color='red', dash='dash')),
    row=2, col=2
)

fig.update_layout(
    title_text="📊 Performance Analysis - Regression Model",
    height=700, showlegend=False
)
fig.show()
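
The Q-Q panel above uses evenly spaced probabilities between 0.01 and 0.99, which is only an approximation. A more conventional choice is the plotting position (i - 0.5) / n. The sketch below computes those points with the standard library only; the helper name `normal_qq_points` and the sample residuals are made up for illustration.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(residuals):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    observed = sorted(residuals)
    n = len(observed)
    # Normal distribution fitted to the sample mean and sample standard deviation
    fitted = NormalDist(mu=mean(observed), sigma=stdev(observed))
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1)
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed

# Made-up residuals, not taken from the model
theo, obs = normal_qq_points([-2.1, -0.4, 0.0, 0.3, 0.5, 1.9])
for t, o in zip(theo, obs):
    print(f"theoretical={t:+.3f}  observed={o:+.3f}")
```

The two returned lists can be passed directly as `x` and `y` to a scatter trace. Points far from the diagonal in the tails would indicate heavy-tailed residuals, which the single error of about 21.9 points in the test set suggests is worth checking.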

In [6]:
# 3.3 Classification model performance
y_pred_clf = model_clf.predict(X_test_scaled)
y_pred_train_clf = model_clf.predict(X_train_scaled)

# Per-class metrics
print("📊 CLASSIFICATION MODEL PERFORMANCE")
print("="*60)

print("\n📋 Classification report (Test):")
print(classification_report(y_test_clf, y_pred_clf, 
                           target_names=['Faible', 'Moyenne', 'Haute']))

# Confusion matrix
cm = confusion_matrix(y_test_clf, y_pred_clf)

fig = go.Figure(data=go.Heatmap(
    z=cm,
    x=['Faible', 'Moyenne', 'Haute'],
    y=['Faible', 'Moyenne', 'Haute'],
    colorscale='Blues',
    text=cm,
    texttemplate="%{text}",
    textfont={"size": 20}
))

fig.update_layout(
    title="🎯 Confusion Matrix - Opportunity Classification",
    xaxis_title="Predicted Class",
    yaxis_title="Actual Class",
    height=400
)
fig.show()

# Overall metrics
print(f"\n📈 Overall metrics:")
print(f"   Accuracy: {accuracy_score(y_test_clf, y_pred_clf):.4f}")
print(f"   Precision (macro): {precision_score(y_test_clf, y_pred_clf, average='macro'):.4f}")
print(f"   Recall (macro): {recall_score(y_test_clf, y_pred_clf, average='macro'):.4f}")
print(f"   F1-Score (macro): {f1_score(y_test_clf, y_pred_clf, average='macro'):.4f}")

📊 CLASSIFICATION MODEL PERFORMANCE

📋 Classification report (Test):
              precision    recall  f1-score   support

      Faible       1.00      1.00      1.00         2
     Moyenne       0.97      1.00      0.98        29
       Haute       1.00      0.97      0.98        32

    accuracy                           0.98        63
   macro avg       0.99      0.99      0.99        63
weighted avg       0.98      0.98      0.98        63


📈 Overall metrics:
   Accuracy: 0.9841
   Precision (macro): 0.9889
   Recall (macro): 0.9896
   F1-Score (macro): 0.9891
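
The macro metrics can be checked by hand from a confusion matrix. The matrix below is not printed anywhere in the notebook; it is a reconstruction that is consistent with the supports, precisions and recalls in the report (one "Haute" sample predicted as "Moyenne"), so treat it as an assumption.

```python
import numpy as np

# Rows = actual class, columns = predicted class (Faible, Moyenne, Haute).
# Reconstructed from the classification report, not taken from the notebook output.
cm_demo = np.array([
    [2, 0, 0],
    [0, 29, 0],
    [0, 1, 31],
])

true_pos = np.diag(cm_demo)
precision = true_pos / cm_demo.sum(axis=0)   # column sums = predicted counts
recall = true_pos / cm_demo.sum(axis=1)      # row sums = actual counts (support)
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_pos.sum() / cm_demo.sum()

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision (macro): {precision.mean():.4f}")
print(f"Recall (macro): {recall.mean():.4f}")
print(f"F1-Score (macro): {f1.mean():.4f}")
```

With only 2 "Faible" samples in the test set, the perfect score for that class rests on very little evidence, and the macro averages give it the same weight as the two larger classes.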


## 4. 🔍 Prediction error analysis

In [7]:
# 4.1 Identify the cases with the largest errors
df_test = df_eval[test_mask].copy()
df_test['predicted_score'] = y_pred_reg
df_test['actual_score'] = y_test_reg
df_test['error'] = df_test['actual_score'] - df_test['predicted_score']
df_test['abs_error'] = np.abs(df_test['error'])
df_test['predicted_class'] = y_pred_clf
df_test['actual_class'] = y_test_clf
df_test['class_correct'] = df_test['predicted_class'] == df_test['actual_class']

# Largest regression errors
print("🔍 PREDICTION ERROR ANALYSIS")
print("="*60)

print("\n📊 Top 10 largest errors (Regression):")
print("-"*60)
top_errors = df_test.nlargest(10, 'abs_error')[['LIBELLES', 'year', 'actual_score', 'predicted_score', 'error']]
for i, (idx, row) in enumerate(top_errors.iterrows(), 1):
    print(f"{i:2d}. {row['LIBELLES'][:40]}... ({row['year']})")
    print(f"    Actual: {row['actual_score']:.2f} | Predicted: {row['predicted_score']:.2f} | Error: {row['error']:+.2f}")

# Classification errors
misclassified = df_test[~df_test['class_correct']]
print(f"\n📊 Classification errors: {len(misclassified)}/{len(df_test)} ({len(misclassified)/len(df_test)*100:.1f}%)")
if len(misclassified) > 0:
    print("\nMisclassified cases:")
    for idx, row in misclassified.iterrows():
        print(f"   • {row['LIBELLES'][:40]}... ({row['year']})")
        print(f"     Actual: {opportunity_labels[row['actual_class']]} → Predicted: {opportunity_labels[row['predicted_class']]}")

🔍 PREDICTION ERROR ANALYSIS

📊 Top 10 largest errors (Regression):
------------------------------------------------------------
 1. Perles fines ou de culture, pierres gemm... (2021)
    Actual: 52.96 | Predicted: 31.05 | Error: +21.91
 2. Matières textiles et ouvrages en ces mat... (2022)
    Actual: 0.09 | Predicted: 10.23 | Error: -10.14
 3. Matières textiles et ouvrages en ces mat... (2021)
    Actual: 0.09 | Predicted: 10.23 | Error: -10.14
 4. Perles fines ou de culture, pierres gemm... (2023)
    Actual: 34.62 | Predicted: 28.23 | Error: +6.40
 5. Perles fines ou de culture, pierres gemm... (2022)
    Actual: 34.11 | Predicted: 28.23 | Error: +5.89
 6. Objets d'art, de collection ou d'antiqui... (2021)
    Actual: 0.00 | Predicted: 1.09 | Error: -1.09
 7. Armes, munitions et leurs parties et acc... (2021)
    Actual: 0.01 | Predicted: 1.09 | Error: -1.07
 8. Armes, munitions et leurs parties et acc... (2023)
    Actual: 0.01 | Predicted: 1.09 | Error: -1.07
 9. Ouvra

In [8]:
# 4.2 Error analysis by sector and by year
error_by_sector = df_test.groupby('LIBELLES')['abs_error'].mean().sort_values(ascending=False)
error_by_year = df_test.groupby('year')['abs_error'].mean()

fig = make_subplots(rows=1, cols=2,
                    subplot_titles=('Mean error by sector (Top 10)', 
                                   'Mean error by year'))

# Error by sector
fig.add_trace(
    go.Bar(x=error_by_sector.head(10).values,
           y=[s[:30]+'...' for s in error_by_sector.head(10).index],
           orientation='h',
           marker_color='#e74c3c'),
    row=1, col=1
)

# Error by year
fig.add_trace(
    go.Bar(x=error_by_year.index.astype(str),
           y=error_by_year.values,
           marker_color='steelblue'),
    row=1, col=2
)

fig.update_layout(
    title_text="🔍 Error Distribution",
    height=400,
    showlegend=False
)
fig.show()

print("\n📊 Error statistics (absolute errors):")
print(f"   Mean error: {df_test['abs_error'].mean():.4f}")
print(f"   Median error: {df_test['abs_error'].median():.4f}")
print(f"   Std deviation: {df_test['abs_error'].std():.4f}")
print(f"   Max error: {df_test['abs_error'].max():.4f}")


📊 Error statistics (absolute errors):
   Mean error: 1.8506
   Median error: 1.0740
   Std deviation: 3.1423
   Max error: 21.9113


## 5. 📈 Cross-validation and model stability

In [9]:
# 5.1 Cross-validation on the full dataset
import copy
from sklearn.model_selection import cross_val_score, KFold

# Prepare all the data
X_all = df_eval[feature_cols].values
y_all_reg = df_eval['substitution_score_normalized'].values
y_all_clf = df_eval['opportunity_class'].values
# Fit a copy of the scaler so the one loaded from disk is not refitted in place.
# Caveat: scaling the full dataset before splitting lets every fold see
# statistics computed on its own validation rows.
X_all_scaled = copy.deepcopy(scaler).fit_transform(X_all)

# Cross-validation for the regression model
cv = KFold(n_splits=5, shuffle=True, random_state=42)
cv_scores_reg = cross_val_score(model_reg, X_all_scaled, y_all_reg, cv=cv, scoring='r2')
cv_scores_rmse = cross_val_score(model_reg, X_all_scaled, y_all_reg, cv=cv, 
                                  scoring='neg_root_mean_squared_error')

print("📈 CROSS-VALIDATION (5-Fold)")
print("="*60)
print(f"\n🔹 Regression model:")
print(f"   R² scores: {cv_scores_reg}")
print(f"   Mean R²: {cv_scores_reg.mean():.4f} (+/- {cv_scores_reg.std()*2:.4f})")
print(f"   Mean RMSE: {-cv_scores_rmse.mean():.4f} (+/- {cv_scores_rmse.std()*2:.4f})")

# Cross-validation for the classification model
cv_scores_clf = cross_val_score(model_clf, X_all_scaled, y_all_clf, cv=cv, scoring='accuracy')
cv_scores_f1 = cross_val_score(model_clf, X_all_scaled, y_all_clf, cv=cv, scoring='f1_macro')

print(f"\n🔹 Classification model:")
print(f"   Accuracy scores: {cv_scores_clf}")
print(f"   Mean accuracy: {cv_scores_clf.mean():.4f} (+/- {cv_scores_clf.std()*2:.4f})")
print(f"   Mean F1-macro: {cv_scores_f1.mean():.4f} (+/- {cv_scores_f1.std()*2:.4f})")

# Visualization
fig = make_subplots(rows=1, cols=2,
                    subplot_titles=('CV Scores - Regression (R²)', 
                                   'CV Scores - Classification (Accuracy)'))

fig.add_trace(
    go.Bar(x=[f'Fold {i+1}' for i in range(5)], y=cv_scores_reg,
           marker_color='steelblue'),
    row=1, col=1
)
fig.add_hline(y=cv_scores_reg.mean(), line_dash="dash", line_color="red", row=1, col=1)

fig.add_trace(
    go.Bar(x=[f'Fold {i+1}' for i in range(5)], y=cv_scores_clf,
           marker_color='#27ae60'),
    row=1, col=2
)
fig.add_hline(y=cv_scores_clf.mean(), line_dash="dash", line_color="red", row=1, col=2)

fig.update_layout(title_text="📈 Model Stability - Cross-Validation", 
                  height=400, showlegend=False)
fig.show()

📈 CROSS-VALIDATION (5-Fold)

🔹 Regression model:
   R² scores: [0.66171082 0.51083654 0.80309243 0.82184817 0.7256551 ]
   Mean R²: 0.7046 (+/- 0.2250)
   Mean RMSE: 6.8897 (+/- 5.4413)

🔹 Classification model:
   Accuracy scores: [0.97619048 0.92857143 0.97619048 1.         1.        ]
   Mean accuracy: 0.9762 (+/- 0.0522)
   Mean F1-macro: 0.9694 (+/- 0.0872)
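
Two aspects of this cross-validation may inflate the scores. The scaler is fitted on all rows before splitting, and a shuffled `KFold` can place different years of the same sector in both the training and validation folds. The sketch below shows one way to address both with a scikit-learn `Pipeline` and `GroupKFold`. It runs on synthetic data, uses `GradientBoostingRegressor` as a stand-in for the XGBoost model, and assumes a panel of 21 sectors over 10 years, which is a guess based on the 210 rows.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_sectors, n_years = 21, 10                          # hypothetical panel shape
groups = np.repeat(np.arange(n_sectors), n_years)    # one group id per sector
X_demo = rng.normal(size=(n_sectors * n_years, 5))   # synthetic features
y_demo = 3 * X_demo[:, 0] + rng.normal(scale=0.5, size=len(X_demo))

# The scaler is refitted inside each training fold, so no validation-fold
# statistics leak into preprocessing
pipeline = make_pipeline(StandardScaler(), GradientBoostingRegressor(random_state=42))

# GroupKFold keeps every year of a given sector in the same fold
cv_grouped = GroupKFold(n_splits=5)
scores = cross_val_score(pipeline, X_demo, y_demo,
                         cv=cv_grouped, groups=groups, scoring='r2')
print(f"Grouped CV R²: {scores.mean():.4f} (+/- {scores.std() * 2:.4f})")
```

In the notebook, `groups` would be derived from the `LIBELLES` column. Grouped scores are usually lower than shuffled ones, and the difference gives a rough idea of how much the model relies on having already seen a sector.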


## 6. 🎯 Feature Importance and Interpretability

In [10]:
# 6.1 Feature importance - comparing the two models
importance_reg = pd.DataFrame({
    'feature': feature_cols,
    'importance_reg': model_reg.feature_importances_
})

importance_clf = pd.DataFrame({
    'feature': feature_cols,
    'importance_clf': model_clf.feature_importances_
})

# Merge
importance_combined = importance_reg.merge(importance_clf, on='feature')
importance_combined['importance_avg'] = (importance_combined['importance_reg'] + 
                                          importance_combined['importance_clf']) / 2
importance_combined = importance_combined.sort_values('importance_avg', ascending=False)

# Top 15 features
top_15 = importance_combined.head(15)

fig = go.Figure()

fig.add_trace(go.Bar(
    name='Regression',
    y=top_15['feature'],
    x=top_15['importance_reg'],
    orientation='h',
    marker_color='steelblue'
))

fig.add_trace(go.Bar(
    name='Classification',
    y=top_15['feature'],
    x=top_15['importance_clf'],
    orientation='h',
    marker_color='#27ae60'
))

fig.update_layout(
    title="🎯 Top 15 Most Important Features",
    barmode='group',
    height=500,
    xaxis_title="Importance",
    yaxis_title="Feature",
    yaxis={'categoryorder': 'total ascending'}
)
fig.show()

print("📊 Top 10 Features (average of both models):")
for i, row in importance_combined.head(10).iterrows():
    print(f"   {row['importance_avg']:.4f} - {row['feature']}")

📊 Top 10 Features (average of both models):
   0.2191 - prix_unitaire_production
   0.1697 - consommation_fcfa
   0.1668 - production_fcfa_growth
   0.1273 - exports_fcfa
   0.0870 - taux_couverture
   0.0766 - balance_commerciale_fcfa
   0.0556 - ratio_prod_conso_fcfa
   0.0254 - imports_fcfa_growth
   0.0213 - intensite_export
   0.0101 - production_tonnes_growth
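
The `feature_importances_` attribute reflects how the trees use each feature during training, which can differ from how much each feature matters for predictions on unseen data. Permutation importance on the test set is a common cross-check. The sketch below uses scikit-learn's `permutation_importance` on synthetic data with a `GradientBoostingRegressor` stand-in; in the notebook the same call would take `model_reg`, `X_test_scaled` and `y_test_reg`.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 4))   # synthetic features
# Feature 0 dominates the target, feature 1 has a small effect, 2 and 3 are noise
y_demo = 5 * X_demo[:, 0] + 0.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# Shuffle each feature in turn on held-out data and measure the drop in score
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
for idx in ranking:
    print(f"feature_{idx}: {result.importances_mean[idx]:.4f} "
          f"(+/- {result.importances_std[idx]:.4f})")
```

Because several of the notebook's top features (for example `production_fcfa_growth` or `ratio_prod_conso_fcfa`) are closely related to the quantities used to build the targets, high importances may partly reflect how the targets were constructed rather than independent predictive signal.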


In [11]:
# 6.2 Economic interpretation of the important features
print("📊 ECONOMIC INTERPRETATION OF KEY FEATURES")
print("="*70)

interpretations = {
    'prix_unitaire_production': """
    💰 Unit Production Price (FCFA/tonne)
    → Indicator of the sector's value added
    → The higher it is, the more the sector processes raw materials
    → High unit price sectors = high-value substitution opportunities""",
    
    'production_fcfa_growth': """
    📈 Production Growth
    → Sector momentum over the year
    → Growing sectors = capacity to expand and replace imports
    → Positive signal for investment""",
    
    'consommation_fcfa': """
    🏠 Domestic Consumption
    → Local demand for the sector's products
    → High consumption = potential market for local production
    → Opportunity if production < consumption""",
    
    'exports_fcfa': """
    🌍 Exports
    → Competitiveness of the sector on international markets
    → Exporting sectors = proven ability to compete
    → Potential to redirect exports to the local market""",
    
    'taux_couverture': """
    ⚖️ Coverage Rate (Exports/Imports)
    → Trade balance of the sector
    → > 100% = surplus sector (strong potential)
    → < 100% = deficit sector (substitution needed)"""
}

for feature in importance_combined.head(5)['feature'].values:
    if feature in interpretations:
        print(interpretations[feature])
        print("-"*70)

📊 ECONOMIC INTERPRETATION OF KEY FEATURES

    💰 Unit Production Price (FCFA/tonne)
    → Indicator of the sector's value added
    → The higher it is, the more the sector processes raw materials
    → High unit price sectors = high-value substitution opportunities
----------------------------------------------------------------------

    🏠 Domestic Consumption
    → Local demand for the sector's products
    → High consumption = potential market for local production
    → Opportunity if production < consumption
----------------------------------------------------------------------

    📈 Production Growth
    → Sector momentum over the year
    → Growing sectors = capacity to expand and replace imports
    → Positive signal for investment
----------------------------------------------------------------------

    🌍 Exports
    → Competitiveness
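
As a worked example of the indicators discussed above, the sketch below computes a coverage rate and a self-sufficiency rate for the textiles sector, using the 2021-2023 averages that appear in the recommendations report later in this notebook. It assumes the coverage rate is defined as exports / imports × 100, as the interpretation text suggests; the dataset's own `taux_couverture` column may be computed differently.

```python
# Three-year averages for the textiles sector (values from the report table)
production, imports, exports = 286.23, 48.40, 237.83

# Coverage rate: share of imports offset by exports, in percent (assumed definition)
coverage_rate = exports / imports * 100

# Apparent demand and self-sufficiency, as defined in classify_opportunity
apparent_demand = production + imports - exports
self_sufficiency = production / apparent_demand * 100

print(f"Coverage rate: {coverage_rate:.1f}%")
print(f"Apparent demand: {apparent_demand:.1f}")
print(f"Self-sufficiency: {self_sufficiency:.1f}%")
```

Both rates are well above 100% for this sector, which matches its classification as a surplus sector.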

## 7. 🏆 Sector Recommendations

In [12]:
# 7.1 In-depth analysis by sector
# Compute recent (2021-2023) average statistics per sector
recent_data = df_eval[df_eval['year'].isin([2021, 2022, 2023])].copy()

sector_analysis = recent_data.groupby('LIBELLES').agg({
    'production_fcfa': 'mean',
    'imports_fcfa': 'mean',
    'exports_fcfa': 'mean',
    'consommation_fcfa': 'mean',
    'production_fcfa_growth': 'mean',
    'substitution_score_normalized': 'mean'
}).round(2)

# Compute additional metrics
sector_analysis['balance_commerciale'] = sector_analysis['exports_fcfa'] - sector_analysis['imports_fcfa']
sector_analysis['ratio_prod_import'] = sector_analysis['production_fcfa'] / sector_analysis['imports_fcfa'].replace(0, 1)
sector_analysis['potentiel_substitution'] = sector_analysis['imports_fcfa'] * (1 - sector_analysis['ratio_prod_import'].clip(0, 1))

# Classify the sectors
# Labels stay in French because they are exported to the recommendations report:
#   Fort potentiel - Secteur dominant = strong potential, dominant sector
#   Potentiel modéré - Équilibré      = moderate potential, balanced
#   À développer - Dépendant          = to be developed, import-dependent
#   Prioritaire - Très dépendant      = priority, highly import-dependent
def classify_sector(row):
    if row['ratio_prod_import'] > 5 and row['production_fcfa_growth'] > 0:
        return '🟢 Fort potentiel - Secteur dominant'
    elif row['ratio_prod_import'] > 1:
        return '🟡 Potentiel modéré - Équilibré'
    elif row['ratio_prod_import'] > 0.5:
        return '🟠 À développer - Dépendant'
    else:
        return '🔴 Prioritaire - Très dépendant'

sector_analysis['classification'] = sector_analysis.apply(classify_sector, axis=1)
sector_analysis = sector_analysis.sort_values('substitution_score_normalized', ascending=False)

print("🏆 SECTOR RECOMMENDATIONS")
print("="*80)
print("\n📊 Ranking of sectors by import substitution potential:\n")

for i, (sector, row) in enumerate(sector_analysis.head(10).iterrows(), 1):
    print(f"{i:2d}. {sector[:50]}...")
    print(f"    {row['classification']}")
    print(f"    Production: {row['production_fcfa']:.1f} bn | Imports: {row['imports_fcfa']:.1f} bn")
    print(f"    Prod/Import ratio: {row['ratio_prod_import']:.2f} | Growth: {row['production_fcfa_growth']:+.1f}%")
    print()

🏆 SECTOR RECOMMENDATIONS

📊 Ranking of sectors by import substitution potential:

 1. Perles fines ou de culture, pierres gemmes ou simi...
    🟢 Fort potentiel - Secteur dominant
    Production: 2138.1 bn | Imports: 0.8 bn
    Prod/Import ratio: 2672.62 | Growth: +1.4%

 2. Matières textiles et ouvrages en ces matières...
    🟢 Fort potentiel - Secteur dominant
    Production: 286.2 bn | Imports: 48.4 bn
    Prod/Import ratio: 5.91 | Growth: +7.3%

 3. Produit du règne végétal...
    🟡 Potentiel modéré - Équilibré
    Production: 343.1 bn | Imports: 165.4 bn
    Prod/Import ratio: 2.07 | Growth: +14.0%

 4. Graisses et huiles animales ou végétales...
    🟡 Potentiel modéré - Équilibré
    Production: 36.9 bn | Imports: 18.6 bn
    Prod/Import ratio: 1.98 | Growth: +9.3%

 5. Objets d'art, de collection ou d'antiquité...
    🟡 Potentiel modéré - Équilibré
    Production: 0.2 bn | Imports: 0.1 bn

In [13]:
# 7.2 Visualizing the recommendations
fig = px.scatter(
    sector_analysis.reset_index(),
    x='imports_fcfa',
    y='production_fcfa',
    size='substitution_score_normalized',
    color='production_fcfa_growth',
    hover_name='LIBELLES',
    color_continuous_scale='RdYlGn',
    title='🎯 Map of Substitution Opportunities',
    labels={
        'imports_fcfa': 'Imports (bn FCFA)',
        'production_fcfa': 'Production (bn FCFA)',
        'production_fcfa_growth': 'Growth (%)'
    }
)

# Add the parity line
max_val = max(sector_analysis['imports_fcfa'].max(), sector_analysis['production_fcfa'].max())
fig.add_trace(go.Scatter(
    x=[0, max_val], y=[0, max_val],
    mode='lines',
    line=dict(color='gray', dash='dash'),
    name='Parity Prod=Import'
))

fig.add_annotation(
    x=max_val*0.7, y=max_val*0.3,
    text="Import<br>dependence zone",
    showarrow=False,
    font=dict(size=10, color='red')
)

fig.add_annotation(
    x=max_val*0.3, y=max_val*0.7,
    text="Successful<br>substitution zone",
    showarrow=False,
    font=dict(size=10, color='green')
)

fig.update_layout(height=600)
fig.show()

In [14]:
# 7.3 Priority sectors for action
print("🎯 PRIORITY SECTORS FOR ACTION")
print("="*80)

# Identify sectors with large untapped substitution potential
# (with this dataset no sector passes both filters, so the first list is empty)
high_import_sectors = sector_analysis[sector_analysis['imports_fcfa'] > 50].copy()
high_import_sectors['gap'] = high_import_sectors['imports_fcfa'] - high_import_sectors['production_fcfa']
high_import_sectors = high_import_sectors[high_import_sectors['gap'] > 0].sort_values('gap', ascending=False)

print("\n🔴 SECTORS WITH A LARGE DEFICIT (Imports >> Production):")
print("-"*80)
for i, (sector, row) in enumerate(high_import_sectors.head(5).iterrows(), 1):
    print(f"\n{i}. {sector}")
    print(f"   💰 Gap to close: {row['gap']:.1f} bn FCFA")
    print(f"   📊 Imports: {row['imports_fcfa']:.1f} bn | Production: {row['production_fcfa']:.1f} bn")
    print(f"   📈 Current growth: {row['production_fcfa_growth']:+.1f}%")
    
    # Recommendation
    if row['production_fcfa_growth'] > 5:
        print(f"   ✅ RECOMMENDATION: Accelerate existing growth")
    elif row['production_fcfa_growth'] > 0:
        print(f"   ⚠️ RECOMMENDATION: Stimulate the sector further")
    else:
        print(f"   🚨 RECOMMENDATION: Urgent intervention needed")

# Champion sectors
print("\n\n🟢 CHAMPION SECTORS (Production >> Imports):")
print("-"*80)
champions = sector_analysis[sector_analysis['ratio_prod_import'] > 2].head(5)
for i, (sector, row) in enumerate(champions.iterrows(), 1):
    print(f"\n{i}. {sector}")
    print(f"   🏆 Prod/Import ratio: {row['ratio_prod_import']:.1f}x")
    print(f"   💡 Potential for additional exports or diversification")

🎯 PRIORITY SECTORS FOR ACTION

🔴 SECTORS WITH A LARGE DEFICIT (Imports >> Production):
--------------------------------------------------------------------------------


🟢 CHAMPION SECTORS (Production >> Imports):
--------------------------------------------------------------------------------

1. Perles fines ou de culture, pierres gemmes ou similaires, métaux précieux, plaques ou doubles de métaux précieux et ouvrages en ces matières; bijouterie de fantaisie; monnaies
   🏆 Prod/Import ratio: 2672.6x
   💡 Potential for additional exports or diversification

2. Matières textiles et ouvrages en ces matières
   🏆 Prod/Import ratio: 5.9x
   💡 Potential for additional exports or diversification

3. Produit du règne végétal
   🏆 Prod/Import ratio: 2.1x
   💡 Potential for additional exports or diversification

4. Objets d'art, de collection ou d'antiquité
   🏆 Prod/Import ratio: 2.9x
   💡 Potential for additional exp
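
The deficit list above is empty because the filter requires both imports above 50 and production strictly below imports, and no sector meets both conditions. If a watchlist is still wanted, one option is a softer rule based on the production/import ratio. The sketch below illustrates this with three rows taken from the recommendations report; the `MAX_RATIO` threshold is a purely illustrative assumption.

```python
import pandas as pd

# Three rows copied from the recommendations report (2021-2023 averages)
sectors = pd.DataFrame(
    {
        'production_fcfa': [343.13, 109.23, 38.77],
        'imports_fcfa': [165.40, 107.13, 36.80],
    },
    index=[
        'Produit du règne végétal',
        'Matières plastiques et ouvrages en ces matières',
        'Animaux vivants et produits du règne animal',
    ],
)

# Strict filter used in the notebook: high imports AND production below imports
strict = sectors[(sectors['imports_fcfa'] > 50)
                 & (sectors['imports_fcfa'] > sectors['production_fcfa'])]

# Hypothetical softer rule: production barely covers imports
MAX_RATIO = 1.1  # illustrative threshold, not taken from the notebook
ratio = sectors['production_fcfa'] / sectors['imports_fcfa']
watchlist = sectors[(sectors['imports_fcfa'] > 50) & (ratio < MAX_RATIO)]

print(f"Strict filter: {len(strict)} sector(s)")
print(f"Watchlist: {list(watchlist.index)}")
```

Whether a ratio-based rule is appropriate depends on how production and imports are measured in the source data, so the threshold should be chosen with a domain expert.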

## 8. 💾 Exporting results for the API and deployment

In [15]:
# 8.1 Prepare the data for the API
api_data = {
    'model_info': {
        'version': '1.0',
        'date_training': model_metadata['date_training'],
        'regression_r2': model_metadata['regression_model']['metrics']['r2'],
        'classification_accuracy': model_metadata['classification_model']['metrics']['accuracy']
    },
    'feature_columns': feature_cols,
    'opportunity_classes': opportunity_labels,
    'sectors': df_eval['LIBELLES'].unique().tolist()
}

# Save the API data
with open(MODELS_PATH / "api_config.json", 'w', encoding='utf-8') as f:
    json.dump(api_data, f, ensure_ascii=False, indent=2)

print("✅ API configuration saved: api_config.json")

✅ API configuration saved: api_config.json
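
One detail to keep in mind when the API reads this file back: JSON object keys are always strings, so the integer keys of `opportunity_labels` are written as `"0"`, `"1"` and `"2"`. The sketch below shows the round trip in memory and one way to restore integer keys on the consumer side; the variable names `loaded` and `restored` are illustrative.

```python
import json

opportunity_labels = {0: 'Faible', 1: 'Moyenne', 2: 'Haute'}

# Serialize and parse in memory, mirroring what json.dump / json.load do on disk
payload = json.dumps({'opportunity_classes': opportunity_labels}, ensure_ascii=False)
loaded = json.loads(payload)['opportunity_classes']

# Integer keys come back as strings, so a lookup with the model's integer
# prediction would fail unless the keys are converted
restored = {int(key): label for key, label in loaded.items()}

print(loaded)
print(restored)
```

An alternative is to store the classes as a list and use the predicted class as an index, which avoids the conversion step.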


In [16]:
# 8.2 Create the final recommendations report
recommendations_report = []

for sector, row in sector_analysis.iterrows():
    rec = {
        'secteur': sector,
        'production_mds_fcfa': round(row['production_fcfa'], 2),
        'imports_mds_fcfa': round(row['imports_fcfa'], 2),
        'exports_mds_fcfa': round(row['exports_fcfa'], 2),
        'balance_commerciale': round(row['balance_commerciale'], 2),
        'ratio_production_imports': round(row['ratio_prod_import'], 2),
        'croissance_production_pct': round(row['production_fcfa_growth'], 2),
        'score_substitution': round(row['substitution_score_normalized'], 2),
        'classification': row['classification'],
        'potentiel_substitution_mds': round(max(0, row['imports_fcfa'] - row['production_fcfa']), 2)
    }
    recommendations_report.append(rec)

# Save as JSON
with open(MODELS_PATH / "recommendations_report.json", 'w', encoding='utf-8') as f:
    json.dump(recommendations_report, f, ensure_ascii=False, indent=2)

# Save as CSV
recommendations_df = pd.DataFrame(recommendations_report)
recommendations_df.to_csv(MODELS_PATH / "recommendations_report.csv", index=False, encoding='utf-8')

print("✅ Recommendations report saved:")
print("   📁 recommendations_report.json")
print("   📁 recommendations_report.csv")

# Preview
recommendations_df.head(10)

✅ Recommendations report saved:
   📁 recommendations_report.json
   📁 recommendations_report.csv


| | secteur | production_mds_fcfa | imports_mds_fcfa | exports_mds_fcfa | balance_commerciale | ratio_production_imports | croissance_production_pct | score_substitution | classification | potentiel_substitution_mds |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Perles fines ou de culture, pierres gemmes ou ... | 2138.1 | 0.8 | 2137.3 | 2136.5 | 2672.62 | 1.35 | 40.57 | 🟢 Fort potentiel - Secteur dominant | 0 |
| 1 | Matières textiles et ouvrages en ces matières | 286.23 | 48.4 | 237.83 | 189.43 | 5.91 | 7.31 | 0.08 | 🟢 Fort potentiel - Secteur dominant | 0 |
| 2 | Produit du règne végétal | 343.13 | 165.4 | 177.73 | 12.33 | 2.07 | 13.98 | 0.03 | 🟡 Potentiel modéré - Équilibré | 0 |
| 3 | Graisses et huiles animales ou végétales | 36.9 | 18.6 | 18.3 | -0.3 | 1.98 | 9.26 | 0.03 | 🟡 Potentiel modéré - Équilibré | 0 |
| 4 | Objets d'art, de collection ou d'antiquité | 0.2 | 0.07 | 0.13 | 0.06 | 2.86 | 0.0 | 0.02 | 🟡 Potentiel modéré - Équilibré | 0 |
| 5 | Chaussures, coiffures, parapluie, cannes, etc. | 7.7 | 7.6 | 0.1 | -7.5 | 1.01 | 17.54 | 0.01 | 🟡 Potentiel modéré - Équilibré | 0 |
| 6 | Bois, charbons de bois et ouvrages en bois | 8.1 | 7.83 | 0.27 | -7.56 | 1.03 | 27.72 | 0.01 | 🟡 Potentiel modéré - Équilibré | 0 |
| 7 | Armes, munitions et leurs parties et accessoires | 1.4 | 1.37 | 0.03 | -1.34 | 1.02 | 0.79 | 0.01 | 🟡 Potentiel modéré - Équilibré | 0 |
| 8 | Animaux vivants et produits du règne animal | 38.77 | 36.8 | 1.97 | -34.83 | 1.05 | 20.17 | 0.01 | 🟡 Potentiel modéré - Équilibré | 0 |
| 9 | Matières plastiques et ouvrages en ces matières | 109.23 | 107.13 | 2.1 | -105.03 | 1.02 | 4.73 | 0.01 | 🟡 Potentiel modéré - Équilibré | 0 |
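
Every row in this preview has `potentiel_substitution_mds` equal to 0 because the report defines it as `max(0, imports_fcfa - production_fcfa)` and each listed sector produces more than it imports. The sketch below restates that formula in isolation; the function name `substitution_potential` and the second pair of values are illustrative assumptions, not rows from the dataset.

```python
def substitution_potential(imports_fcfa, production_fcfa):
    """Import volume not covered by domestic production, floored at zero."""
    return round(max(0, imports_fcfa - production_fcfa), 2)


# First row of the preview: production far exceeds imports, so nothing is left to substitute.
print(substitution_potential(0.8, 2138.1))

# Hypothetical sector that imports 100 and produces 60 (same unit as the report).
print(substitution_potential(100.0, 60.0))
```

A non-zero value therefore only appears for sectors whose imports exceed domestic production, which matches the `ratio_prod_import < 1` filter used for the total in the evaluation report.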


In [17]:
# 8.3 Create the full evaluation report
evaluation_report = {
    'date_evaluation': pd.Timestamp.now().strftime('%Y-%m-%d %H:%M:%S'),
    'model_performance': {
        'regression': {
            'train': metrics_reg['Train'],
            'test': metrics_reg['Test'],
            'cross_validation': {
                'r2_mean': float(cv_scores_reg.mean()),
                'r2_std': float(cv_scores_reg.std()),
                'rmse_mean': float(-cv_scores_rmse.mean())
            }
        },
        'classification': {
            'accuracy': float(accuracy_score(y_test_clf, y_pred_clf)),
            'f1_macro': float(f1_score(y_test_clf, y_pred_clf, average='macro')),
            'precision_macro': float(precision_score(y_test_clf, y_pred_clf, average='macro')),
            'recall_macro': float(recall_score(y_test_clf, y_pred_clf, average='macro')),
            'cross_validation': {
                'accuracy_mean': float(cv_scores_clf.mean()),
                'accuracy_std': float(cv_scores_clf.std())
            }
        }
    },
    'feature_importance': importance_combined.head(15).to_dict('records'),
    'error_analysis': {
        'mean_absolute_error': float(df_test['abs_error'].mean()),
        'median_error': float(df_test['abs_error'].median()),
        'max_error': float(df_test['abs_error'].max()),
        'misclassification_rate': float(len(misclassified) / len(df_test))
    },
    'recommendations_summary': {
        'high_potential_sectors': sector_analysis[
            sector_analysis['classification'].str.contains('Fort')
        ].index.tolist()[:5],
        'priority_sectors': high_import_sectors.head(5).index.tolist() if len(high_import_sectors) > 0 else [],
        'total_substitution_potential_mds_fcfa': float(
            sector_analysis[sector_analysis['ratio_prod_import'] < 1]['imports_fcfa'].sum() -
            sector_analysis[sector_analysis['ratio_prod_import'] < 1]['production_fcfa'].sum()
        )
    }
}

with open(MODELS_PATH / "evaluation_report.json", 'w', encoding='utf-8') as f:
    json.dump(evaluation_report, f, ensure_ascii=False, indent=2, default=str)

print("✅ Evaluation report saved: evaluation_report.json")

✅ Evaluation report saved: evaluation_report.json
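
One possible downstream use of this report is to compare the hold-out R² with the cross-validated mean, since a large gap can hint at an optimistic test split. The sketch below assumes the JSON layout written above and that `metrics_reg['Test']` carries an `R²` key, as the summary cell suggests; `r2_gap` and `example_report` are hypothetical names.

```python
def r2_gap(report):
    """Difference between hold-out R² and mean cross-validated R²."""
    regression = report["model_performance"]["regression"]
    return regression["test"]["R²"] - regression["cross_validation"]["r2_mean"]


# Minimal stand-in for evaluation_report.json holding only the two fields
# used here, filled with the values printed in this notebook's final summary.
example_report = {
    "model_performance": {
        "regression": {
            "test": {"R²": 0.8320},
            "cross_validation": {"r2_mean": 0.7046},
        }
    }
}

print(f"R² gap (test - CV): {r2_gap(example_report):.4f}")
```

What counts as an acceptable gap is a judgment call that this notebook does not specify.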


## 9. 📝 Summary and Conclusions

In [18]:
# Final summary
print("="*80)
print("📊 FINAL MODEL EVALUATION REPORT")
print("="*80)

print(f"""
✅ EVALUATION COMPLETED SUCCESSFULLY!

📈 MODEL PERFORMANCE
══════════════════════════════════════════════════════════════════════════════

🔹 Regression Model (Substitution Score)
   ─────────────────────────────────────────────
   • R² (Test):           {metrics_reg['Test']['R²']:.4f}
   • RMSE (Test):         {metrics_reg['Test']['RMSE']:.4f}
   • MAE (Test):          {metrics_reg['Test']['MAE']:.4f}
   • CV R² (5-fold):      {cv_scores_reg.mean():.4f} ± {cv_scores_reg.std()*2:.4f}

🔹 Classification Model (Opportunities)
   ─────────────────────────────────────────
   • Accuracy:            {accuracy_score(y_test_clf, y_pred_clf):.4f}
   • F1-Score (macro):    {f1_score(y_test_clf, y_pred_clf, average='macro'):.4f}
   • Misclassifications:  {len(misclassified)}/{len(df_test)} ({len(misclassified)/len(df_test)*100:.1f}%)
   • CV Accuracy:         {cv_scores_clf.mean():.4f} ± {cv_scores_clf.std()*2:.4f}

🎯 TOP 5 MOST IMPORTANT FEATURES
══════════════════════════════════════════════════════════════════════════════
""")
# Rank by position rather than by index label, which may not be 0..4 after sorting
for rank, (_, row) in enumerate(importance_combined.head(5).iterrows(), 1):
    print(f"   {rank}. {row['feature']}: {row['importance_avg']:.4f}")

print("""

🏆 SECTORS WITH HIGH SUBSTITUTION POTENTIAL
══════════════════════════════════════════════════════════════════════════════
""")
for i, (sector, row) in enumerate(sector_analysis.head(5).iterrows(), 1):
    print(f"   {i}. {sector[:50]}...")
    print(f"      Score: {row['substitution_score_normalized']:.1f} | {row['classification']}")

print("""

📁 FILES GENERATED FOR DEPLOYMENT
══════════════════════════════════════════════════════════════════════════════
   • api_config.json          - Configuration for the API
   • recommendations_report.json/csv - Sector recommendations
   • evaluation_report.json   - Full evaluation report

🚀 READY FOR DEPLOYMENT!
   The API and the Streamlit application can now use these models
   to provide real-time recommendations.
""")
print("="*80)

📊 FINAL MODEL EVALUATION REPORT

✅ EVALUATION COMPLETED SUCCESSFULLY!

📈 MODEL PERFORMANCE
══════════════════════════════════════════════════════════════════════════════

🔹 Regression Model (Substitution Score)
   ─────────────────────────────────────────────
   • R² (Test):           0.8320
   • RMSE (Test):         3.6252
   • MAE (Test):          1.8506
   • CV R² (5-fold):      0.7046 ± 0.2250

🔹 Classification Model (Opportunities)
   ─────────────────────────────────────────
   • Accuracy:            0.9841
   • F1-Score (macro):    0.9891
   • Misclassifications:  1/63 (1.6%)
   • CV Accuracy