# 🏢 At-Risk Building Prioritization - Montréal

**VILLE_IA Project** - Institut de la résilience et de l'innovation urbaine (IRIU)

## Innovative Approach: No Geomatics

This notebook shows how to identify priority buildings for energy retrofits and climate adaptation **without using any geomatics (GIS) tools**.

### Our Innovation
- ✅ Postal code intelligence
- ✅ Text analysis of addresses
- ✅ Multi-criteria machine learning
- ✅ Vulnerability proxies by borough
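
To make the first two bullets concrete, here is a minimal sketch of pulling a postal code prefix out of a free-text address with a regular expression. The helper name `extract_postal_prefix` and the sample address are hypothetical. The project's actual logic lives in `IntelligentMatcher`, which is not shown in this notebook and may work differently.

```python
import re
from typing import Optional

# Canadian postal codes follow the pattern "A1A 1A1"
# (letter-digit-letter, optional space, digit-letter-digit).
POSTAL_CODE_RE = re.compile(r"\b([A-Za-z]\d[A-Za-z])\s?(\d[A-Za-z]\d)\b")

def extract_postal_prefix(address, length: int = 2) -> Optional[str]:
    """Return the first `length` characters of the postal code in `address`, or None."""
    if not isinstance(address, str):
        return None  # missing values (None, NaN) carry no postal code
    match = POSTAL_CODE_RE.search(address)
    if match is None:
        return None
    return match.group(1).upper()[:length]

# Made-up address used purely for illustration
print(extract_postal_prefix("123 rue Exemple, Montréal, QC H3A 1B2"))
print(extract_postal_prefix("address without a postal code"))
```

The default `length=2` matches the two-character prefix format (`H3`) that the enrichment step reports further down.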

---

## 1. Setup and Imports

In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Plot configuration
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

print("✅ Imports successful")

## 2. Loading the Data

We load the available data on Montréal's municipal buildings.
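
Before going further, it can help to confirm that the file contains the columns the rest of this notebook relies on. The sketch below is only an illustration: `REQUIRED_COLUMNS` is collected from the column names used in later cells, `missing_columns` is a hypothetical helper, and the toy header stands in for the real CSV.

```python
from typing import Iterable, List

# Column names referenced by later cells of this notebook
REQUIRED_COLUMNS = [
    "buildingid", "buildingName", "address", "boroughName",
    "buildingArea", "builtArea", "buildingConstrYear",
]

def missing_columns(columns: Iterable[str],
                    required: Iterable[str] = REQUIRED_COLUMNS) -> List[str]:
    """Return the required column names that are absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]

# Made-up header standing in for the real CSV
toy_header = ["buildingid", "buildingName", "address"]
print(missing_columns(toy_header))
```

In this notebook the call would be `missing_columns(buildings.columns)`, and an empty list means every expected column is present.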

In [None]:
# Load the building data
buildings = pd.read_csv('data/batiments-municipaux.csv')

print(f"📊 Number of buildings: {len(buildings):,}")
print("\n📋 Available columns:")
print(buildings.columns.tolist())

# Preview
buildings.head()

## 3. Quick Exploration

In [None]:
# Descriptive statistics
print("📈 Descriptive statistics:")
print(buildings.describe())

# Missing values
print("\n❓ Missing values:")
missing = buildings.isnull().sum()
print(missing[missing > 0].sort_values(ascending=False))

In [None]:
# Distribution by borough
# rename_axis + reset_index(name=...) yields the same column names on every pandas version
fig = px.bar(
    buildings['boroughName'].value_counts().rename_axis('boroughName').reset_index(name='count'),
    x='boroughName',
    y='count',
    title='Number of Buildings per Borough',
    labels={'boroughName': 'Borough', 'count': 'Number of Buildings'}
)
fig.update_layout(xaxis_tickangle=-45, height=500)
fig.show()

## 4. Running the Full Pipeline

### Step 1: Intelligent Matching (No Geomatics)

In [None]:
# Import our intelligent matching class
import sys
sys.path.append('.')

# The module name starts with a digit, so a plain `import` statement cannot be used
from importlib import import_module
matching = import_module('02_intelligent_matching')
IntelligentMatcher = matching.IntelligentMatcher

# Create the matcher
matcher = IntelligentMatcher()

# Enrich with postal code intelligence
buildings_enriched = matcher.enrich_with_postal_code_intelligence(buildings.copy())

print("✅ Enrichment complete")
print("\n📍 Columns added:")
print("  - postal_prefix: Postal code prefix (e.g. H3)")
print("  - postal_flood_risk: Flood risk based on the postal code")
print("  - postal_heat_risk: Heat risk based on the postal code")

# Example
buildings_enriched[['buildingName', 'address', 'postal_prefix', 'postal_flood_risk', 'postal_heat_risk']].head(10)

### Visualizing Risks by Postal Code
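
The per-prefix scores aggregated in this section are produced by `IntelligentMatcher`, whose internals are not shown in this notebook. As a rough sketch of how a postal-prefix proxy can work, the lookup below maps a prefix to risk scores and falls back to a neutral default for unknown prefixes. The table name, the 0 to 1 scale, and every value in it are placeholders invented for illustration, not real risk data.

```python
# Hypothetical lookup table: every value is a placeholder, NOT real risk data
RISK_BY_PREFIX = {
    "H1": {"flood": 0.7, "heat": 0.4},
    "H2": {"flood": 0.2, "heat": 0.8},
}
DEFAULT_RISK = {"flood": 0.5, "heat": 0.5}  # assumed neutral fallback

def lookup_risk(prefix):
    """Return (flood_risk, heat_risk) for a postal prefix, or the default if unknown."""
    entry = RISK_BY_PREFIX.get(prefix, DEFAULT_RISK)
    return entry["flood"], entry["heat"]

print(lookup_risk("H2"))  # known prefix
print(lookup_risk("J4"))  # unknown prefix falls back to the default
```

An explicit default keeps buildings with a missing or unrecognized postal code in the analysis instead of dropping them.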

In [None]:
# Aggregate by postal code prefix
postal_risks = buildings_enriched.groupby('postal_prefix').agg({
    'postal_flood_risk': 'mean',
    'postal_heat_risk': 'mean',
    'buildingid': 'count'
}).reset_index()

postal_risks.columns = ['Postal Code', 'Flood Risk', 'Heat Risk', 'Number of Buildings']

# Chart
fig = go.Figure()

fig.add_trace(go.Bar(
    x=postal_risks['Postal Code'],
    y=postal_risks['Flood Risk'],
    name='Flood Risk',
    marker_color='lightblue'
))

fig.add_trace(go.Bar(
    x=postal_risks['Postal Code'],
    y=postal_risks['Heat Risk'],
    name='Heat Risk',
    marker_color='orange'
))

fig.update_layout(
    title='Climate Risks by Zone (Postal Code)',
    xaxis_title='Postal Code Prefix',
    yaxis_title='Risk Score',
    barmode='group',
    height=500
)

fig.show()

## 5. ML Prioritization Model

### Computing Priority Scores
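
The classification step later in this section bins scores with `pd.cut`. A small sketch with made-up scores shows how the bin edges behave: intervals are right-closed by default, so a score of exactly 40 lands in `Low`, and a score of exactly 0 falls outside the first bin unless `include_lowest=True` is passed.

```python
import pandas as pd

scores = pd.Series([0, 40, 40.1, 80, 100])  # made-up scores on a 0-100 scale
bins = [0, 40, 60, 80, 100]
labels = ['Low', 'Medium', 'High', 'Critical']

# Default behaviour: the first interval is (0, 40], so 0 is left unclassified (NaN)
default_cut = pd.cut(scores, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left as well
inclusive_cut = pd.cut(scores, bins=bins, labels=labels, include_lowest=True)

print(pd.DataFrame({
    'score': scores,
    'default': default_cut,
    'include_lowest': inclusive_cut,
}))
```

Scores below 0 or above 100 would still come out as NaN in both variants, which is worth checking if the scoring function is not guaranteed to stay within that range.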

In [None]:
# Import the prioritization model
ml_model = import_module('03_ml_prioritization_model')
BuildingRiskPrioritizer = ml_model.BuildingRiskPrioritizer

# Initialize the model
model = BuildingRiskPrioritizer()

# Build the feature matrix
features = model.create_feature_matrix(buildings_enriched)

print("\n✅ Features created:")
print(features.columns.tolist())
print("\n📊 Feature statistics:")
features.describe()

In [None]:
# Compute the priority scores
buildings_enriched['priority_score'] = model.calculate_priority_score(features)

# Classify by level (include_lowest=True keeps a score of exactly 0 in 'Low')
buildings_enriched['priority_level'] = pd.cut(
    buildings_enriched['priority_score'],
    bins=[0, 40, 60, 80, 100],
    labels=['Low', 'Medium', 'High', 'Critical'],
    include_lowest=True
)

# Add the individual features
for col in features.columns:
    buildings_enriched[f'score_{col}'] = features[col]

print("✅ Scores computed")
print("\n📊 Distribution of priority levels:")
print(buildings_enriched['priority_level'].value_counts())

## 6. Analyzing the Results

### Top 20 Priority Buildings

In [None]:
# Top 20
top_20 = buildings_enriched.nlargest(20, 'priority_score')[[
    'buildingName', 'address', 'boroughName', 
    'priority_score', 'priority_level',
    'score_energy_risk', 'score_climate_risk', 'score_social_vulnerability'
]]

print("🏆 TOP 20 PRIORITY BUILDINGS:")
top_20

### Visualizing the Multi-Dimensional Scores

In [None]:
# 3D scatter plot
fig = px.scatter_3d(
    buildings_enriched,
    x='score_energy_risk',
    y='score_climate_risk',
    z='score_social_vulnerability',
    color='priority_level',
    hover_data=['buildingName', 'boroughName'],
    title='Multi-Dimensional Risk Analysis',
    labels={
        'score_energy_risk': 'Energy Risk',
        'score_climate_risk': 'Climate Risk',
        'score_social_vulnerability': 'Social Vulnerability'
    },
    color_discrete_map={
        'Critical': '#ff4444',
        'High': '#ff8c00',
        'Medium': '#ffd700',
        'Low': '#90ee90'
    }
)

fig.update_layout(height=700)
fig.show()

### Analysis by Borough
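
The aggregation in this section renames its columns by position after the fact, which silently mislabels results if the `agg` dictionary is ever reordered. One alternative is pandas named aggregation, where each output column is declared together with its source column and function. The toy data and output column names below are made up for illustration.

```python
import pandas as pd

# Made-up data: two boroughs, three buildings
toy = pd.DataFrame({
    'boroughName': ['A', 'A', 'B'],
    'buildingid': [1, 2, 3],
    'priority_score': [50.0, 70.0, 30.0],
})

# Named aggregation: output_name=(source_column, function)
summary = (
    toy.groupby('boroughName')
       .agg(
           mean_score=('priority_score', 'mean'),
           max_score=('priority_score', 'max'),
           n_buildings=('buildingid', 'count'),
       )
       .sort_values('mean_score', ascending=False)
)

print(summary)
```

The result has flat, self-describing column names, so no separate renaming step is needed.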

In [None]:
# Statistics by borough
borough_analysis = buildings_enriched.groupby('boroughName').agg({
    'priority_score': ['mean', 'max', 'min'],
    'buildingid': 'count',
    'score_energy_risk': 'mean',
    'score_climate_risk': 'mean',
    'score_social_vulnerability': 'mean'
}).round(2)

# Flatten the MultiIndex columns; the order must match the agg dictionary above
borough_analysis.columns = [
    'Mean Score', 'Max Score', 'Min Score', 'Building Count',
    'Energy Risk', 'Climate Risk', 'Social Vulnerability'
]

borough_analysis = borough_analysis.sort_values('Mean Score', ascending=False)

print("📊 ANALYSIS BY BOROUGH:")
borough_analysis

In [None]:
# Heatmap
fig = px.imshow(
    borough_analysis[['Mean Score', 'Energy Risk', 'Climate Risk', 'Social Vulnerability']],
    labels=dict(x="Metrics", y="Borough", color="Score"),
    title="Heatmap: Risks by Borough",
    aspect="auto",
    color_continuous_scale="RdYlGn_r"
)

fig.update_layout(height=600)
fig.show()

## 7. Impact Estimation

### GHG Reduction Potential
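
The estimate computed in this section divides the floor area by 100, then multiplies by the energy risk score, the age risk score, and a fixed factor of 2.5. A worked example with made-up inputs makes the arithmetic explicit. The function name and input values are hypothetical, and the sketch assumes both scores lie on a 0 to 1 scale, which the notebook does not confirm.

```python
from typing import Optional

CONVERSION_FACTOR = 2.5  # same constant as in the notebook cell
DEFAULT_AREA = 1000      # fallback used when both area columns are missing

def estimate_ghg_reduction(building_area: Optional[float],
                           built_area: Optional[float],
                           energy_risk: float,
                           age_risk: float) -> float:
    """Mirror the notebook formula for a single building."""
    # Prefer buildingArea, then builtArea, then the default area
    if building_area is not None:
        area = building_area
    elif built_area is not None:
        area = built_area
    else:
        area = DEFAULT_AREA
    return area / 100 * energy_risk * age_risk * CONVERSION_FACTOR

# Made-up inputs: 2000 / 100 * 0.8 * 0.5 * 2.5
print(estimate_ghg_reduction(2000, None, 0.8, 0.5))

# Both areas missing: 1000 / 100 * 0.8 * 0.5 * 2.5
print(estimate_ghg_reduction(None, None, 0.8, 0.5))
```

Because the two scores are multiplied rather than added, a building with either score at zero gets an estimated reduction of zero regardless of its size.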

In [None]:
# Compute the GHG reduction potential
buildings_enriched['estimated_ges_reduction'] = (
    buildings_enriched['buildingArea'].fillna(
        buildings_enriched['builtArea'].fillna(1000)  # fall back to 1000 when both areas are missing
    ) / 100 *
    buildings_enriched['score_energy_risk'] *
    buildings_enriched['score_age_risk'] *
    2.5  # Conversion factor
)

# Statistics
total_potential = buildings_enriched['estimated_ges_reduction'].sum()
top_100_potential = buildings_enriched.nlargest(100, 'priority_score')['estimated_ges_reduction'].sum()
critical_potential = buildings_enriched[
    buildings_enriched['priority_level'] == 'Critical'
]['estimated_ges_reduction'].sum()

print("🌱 GHG REDUCTION POTENTIAL:")
print(f"\n  Total (all buildings): {total_potential:,.0f} tonnes CO₂/year")
print(f"  Top 100 buildings: {top_100_potential:,.0f} tonnes CO₂/year ({top_100_potential/total_potential*100:.1f}%)")
print(f"  Critical buildings: {critical_potential:,.0f} tonnes CO₂/year ({critical_potential/total_potential*100:.1f}%)")

In [None]:
# Cumulative impact curve
buildings_sorted = buildings_enriched.sort_values('priority_score', ascending=False).reset_index(drop=True)
buildings_sorted['cumulative_ges'] = buildings_sorted['estimated_ges_reduction'].cumsum()
buildings_sorted['cumulative_pct'] = buildings_sorted['cumulative_ges'] / total_potential * 100

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=list(range(1, len(buildings_sorted) + 1)),
    y=buildings_sorted['cumulative_pct'],
    mode='lines',
    name='Cumulative Impact',
    line=dict(color='green', width=3)
))

# Marker line for the first 100 buildings
fig.add_vline(x=100, line_dash="dash", line_color="red",
              annotation_text="Top 100", annotation_position="top right")

fig.update_layout(
    title='Cumulative Impact Curve: GHG Reduction',
    xaxis_title='Number of Buildings Retrofitted (in priority order)',
    yaxis_title='% of Total GHG Reduction Potential',
    height=500
)

fig.show()

print("\n💡 Insight: Retrofitting the 100 highest-priority buildings")
print(f"   captures {top_100_potential/total_potential*100:.1f}% of the total reduction potential.")
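
The curve above is a cumulative share: buildings are ranked by priority score, their estimated reductions are summed in that order, and each running total is divided by the overall total. A toy sketch with made-up numbers shows the mechanics:

```python
import pandas as pd

# Made-up data: four buildings with a score and an estimated reduction
toy = pd.DataFrame({
    'priority_score': [90, 40, 70, 10],
    'estimated_ges_reduction': [50.0, 10.0, 30.0, 10.0],
})

# Rank by priority, then accumulate the reductions in that order
ranked = toy.sort_values('priority_score', ascending=False).reset_index(drop=True)
total = ranked['estimated_ges_reduction'].sum()
ranked['cumulative_pct'] = ranked['estimated_ges_reduction'].cumsum() / total * 100

print(ranked)
```

In this toy case the two highest-priority buildings out of four account for 80% of the total, which is the kind of concentration the vertical "Top 100" line is meant to reveal on the real data.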

## 8. Analysis by Building Age

In [None]:
# Create age categories
current_year = 2024
buildings_enriched['building_age'] = current_year - buildings_enriched['buildingConstrYear']
# include_lowest=True keeps buildings with an age of exactly 0 in the first bin
buildings_enriched['age_category'] = pd.cut(
    buildings_enriched['building_age'],
    bins=[0, 20, 40, 60, 100, 200],
    labels=['< 20 years', '20-40 years', '40-60 years', '60-100 years', '> 100 years'],
    include_lowest=True
)

# Analysis by age (observed=False keeps empty age categories in the result)
age_analysis = buildings_enriched.groupby('age_category', observed=False).agg({
    'priority_score': 'mean',
    'buildingid': 'count',
    'estimated_ges_reduction': 'sum'
}).round(2)

age_analysis.columns = ['Mean Score', 'Count', 'Total GHG Potential']

print("🏗️ ANALYSIS BY BUILDING AGE:")
age_analysis

In [None]:
# Visualization
fig = go.Figure()

fig.add_trace(go.Bar(
    x=age_analysis.index,
    y=age_analysis['Mean Score'],
    name='Mean Score',
    marker_color='lightblue',
    yaxis='y'
))

fig.add_trace(go.Scatter(
    x=age_analysis.index,
    y=age_analysis['Total GHG Potential'],
    name='GHG Potential',
    marker_color='green',
    yaxis='y2',
    mode='lines+markers',
    line=dict(width=3)
))

fig.update_layout(
    title='Priority and Impact by Building Age',
    xaxis_title='Age Category',
    yaxis_title='Mean Priority Score',
    yaxis2=dict(
        title='Total GHG Potential (tonnes CO₂/year)',
        overlaying='y',
        side='right'
    ),
    height=500
)

fig.show()

## 9. Exporting the Results

In [None]:
# Save the full results
buildings_enriched.to_csv('output_buildings_prioritized.csv', index=False, encoding='utf-8-sig')
print("✅ Full results saved: output_buildings_prioritized.csv")

# Save the Top 100
top_100 = buildings_enriched.nlargest(100, 'priority_score')
top_100.to_csv('output_top_100_priorities.csv', index=False, encoding='utf-8-sig')
print("✅ Top 100 saved: output_top_100_priorities.csv")

# Summary report
print("\n" + "="*80)
print("FINAL SUMMARY")
print("="*80)
print(f"\n📊 Buildings analyzed: {len(buildings_enriched):,}")
print("\n🎯 Priority distribution:")
for level in ['Critical', 'High', 'Medium', 'Low']:
    count = len(buildings_enriched[buildings_enriched['priority_level'] == level])
    pct = count / len(buildings_enriched) * 100
    print(f"   {level:10s}: {count:4d} ({pct:5.1f}%)")

print(f"\n🌱 Total GHG reduction potential: {total_potential:,.0f} tonnes CO₂/year")
print(f"   Top 100 buildings: {top_100_potential:,.0f} tonnes CO₂/year")

print("\n🏆 Top 5 Buildings:")
for _, row in buildings_enriched.nlargest(5, 'priority_score').iterrows():
    # str() guards against missing (NaN) building names
    print(f"   {str(row['buildingName'])[:50]:50s} - Score: {row['priority_score']:.1f}")

print("\n" + "="*80)
print("✅ ANALYSIS COMPLETE")
print("="*80)
print("\nNext steps:")
print("  1. Launch the web dashboard: streamlit run 04_web_dashboard.py")
print("  2. Review the methodology: METHODOLOGY.md")
print("  3. Plan interventions based on the results")

## 10. Conclusion

### What we have demonstrated:

✅ **A working approach without geomatics**
- Coordinates replaced by postal code intelligence
- Effective geographic proxies
- Consistent, actionable results

✅ **Multi-criteria prioritization**
- 40% Energy / GHG impact
- 30% Climate risks
- 20% Social equity
- 10% Impact potential

✅ **An accessible, reproducible solution**
- No GIS software required
- Standard Python code
- Automated pipeline
- Transparent results
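
The 40/30/20/10 weighting listed above can be sketched as a weighted sum. This is only an illustration: the component names, the assumption that each component is already on a 0 to 100 scale, and the function itself are hypothetical, and `BuildingRiskPrioritizer.calculate_priority_score` may combine its features differently.

```python
# Weights taken from the list above; the component names are hypothetical
WEIGHTS = {
    'energy_ghg': 0.40,
    'climate_risk': 0.30,
    'social_equity': 0.20,
    'impact_potential': 0.10,
}

def weighted_priority(components: dict) -> float:
    """Weighted sum of component scores (each assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

# Made-up component scores: 0.4*80 + 0.3*60 + 0.2*50 + 0.1*100
example = {'energy_ghg': 80, 'climate_risk': 60, 'social_equity': 50, 'impact_potential': 100}
print(weighted_priority(example))
```

Because the weights sum to 1, the combined score stays on the same 0 to 100 scale as its inputs, which is what the 40/60/80 thresholds used for the priority levels rely on.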

### Next Steps for Municipalities:

1. **Short term (3 months)**: Audit the critical buildings
2. **Medium term (1 year)**: Retrofit the Top 20
3. **Long term (3-5 years)**: Systematic Top 100 program

---

**📧 Questions?** See METHODOLOGY.md or contact the VILLE_IA project

**🌐 Web Dashboard:** `streamlit run 04_web_dashboard.py`
