
# 🧭 Accidents: single notebook (Setup → Ingestion → Modeling → Analysis)

This notebook **centralizes everything** for the team:  
1) **Local PostgreSQL setup** (database, user, and schema creation)  
2) **Ingestion** from the public API (Opendatasoft) into **BRONZE**  
3) **Modeling** (SILVER / GOLD)  
4) **SQL analyses** (ready-to-use queries)  

> Assumption: everyone has **PostgreSQL installed locally** (port 5432).  
> We avoid Docker. DBeaver remains optional (as a client); **everything runs here**.



## 0) Python prerequisites (one-time)
Run the following cell if needed to install the dependencies in your environment.


In [None]:

# If needed, uncomment the following line:
# %pip install psycopg2-binary SQLAlchemy pandas python-dotenv requests tqdm nbformat



## 1) Configuration (environment variables)
- By default, the connection uses the `accidents` role (password `accidents`) on `localhost:5432` with the `accidents` database.
- To override these values, create a `.env` file next to this notebook.


In [None]:
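# Connection settings used by every cell below (overridable through `.env`).
import os
from pathlib import Path
from dotenv import load_dotenv

if Path('.env').exists():
    load_dotenv('.env')

PG_HOST = os.getenv('POSTGRES_HOST', 'localhost')
PG_PORT = int(os.getenv('POSTGRES_PORT', '5432'))
PG_DB   = os.getenv('POSTGRES_DB',   'accidents')
PG_USER = os.getenv('POSTGRES_USER', 'accidents')
PG_PASS = os.getenv('POSTGRES_PASSWORD', 'accidents')
PG_SU_USER = os.getenv('POSTGRES_SU_USER', 'postgres')
PG_SU_PASS = os.getenv('POSTGRES_SU_PASSWORD', 'postgres')

# The rest of this cell builds a minimal standalone notebook:
# create/connect DB → create bronze → insert from the given CSV path.
import nbformat as nbf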

nb = nbf.v4.new_notebook()
cells = []

cells.append(nbf.v4.new_markdown_cell(
"# 🟤 Bronze: single CSV ingestion (local PostgreSQL)\n\n"
"**Strict** scope:\n"
"1. Create/connect to the local database\n"
"2. Create the `bronze` schema and the raw storage table\n"
"3. Load **ONLY** the provided CSV file (single pass)\n\n"
"> Assumptions: PostgreSQL installed locally (port 5432). Docker is not required.\n"
"> Everything runs **here**; DBeaver is optional.\n"
))

cells.append(nbf.v4.new_markdown_cell("## 0) Dependencies (run once if needed)"))
cells.append(nbf.v4.new_code_cell(
"# Uncomment if needed:\n"
"# %pip install psycopg2-binary SQLAlchemy pandas python-dotenv pyarrow\n"
))

cells.append(nbf.v4.new_markdown_cell(
"## 1) Configuration\n"
"- Change `RAW_FILE` if needed. Defaults, in priority order:\n"
"  - `etl-road-safety\\data\\cleaned\\accidents_clean.csv` (Windows)\n"
"  - fallback: `./data/raw/accidents_clean.csv`\n"
"  - fallback: `/mnt/data/accidents_clean.csv` (for testing here)\n"
"- You can override the connection settings through a `.env` file.\n"
))
cells.append(nbf.v4.new_code_cell(
"import os\n"
"from pathlib import Path\n"
"from dotenv import load_dotenv\n\n"
"# Load .env if present (optional)\n"
"if Path('.env').exists():\n"
"    load_dotenv('.env')\n\n"
"# Local PostgreSQL connection (defaults)\n"
"PG_HOST = os.getenv('POSTGRES_HOST', 'localhost')\n"
"PG_PORT = int(os.getenv('POSTGRES_PORT', '5432'))\n"
"PG_DB   = os.getenv('POSTGRES_DB',   'accidents')\n"
"PG_USER = os.getenv('POSTGRES_USER', 'accidents')\n"
"PG_PASS = os.getenv('POSTGRES_PASSWORD', 'accidents')\n"
"PG_SU_USER = os.getenv('POSTGRES_SU_USER', 'postgres')\n"
"PG_SU_PASS = os.getenv('POSTGRES_SU_PASSWORD', 'postgres')\n\n"
"# Target CSV file (priority order)\n"
"candidates = [\n"
"    Path(r'etl-road-safety\\data\\cleaned\\accidents_clean.csv'),\n"
"    Path('./data/raw/accidents_clean.csv'),\n"
"    Path('/mnt/data/accidents_clean.csv'),\n"
"]\n"
"RAW_FILE = next((p for p in candidates if p.exists()), candidates[0])\n"
"print('🔧 DB config:', f'{PG_USER}@{PG_HOST}:{PG_PORT}/{PG_DB}')\n"
"print('📄 CSV file:', RAW_FILE)\n"
))

cells.append(nbf.v4.new_markdown_cell("## 2) Connection & bootstrap (idempotent)"))
cells.append(nbf.v4.new_code_cell(
"from sqlalchemy import create_engine, text\n"
"from sqlalchemy.engine import Engine\n\n"
"def mk_url(user, pwd, db):\n"
"    return f'postgresql+psycopg2://{user}:{pwd}@{PG_HOST}:{PG_PORT}/{db}'\n\n"
"def get_engine(user, pwd, db) -> Engine:\n"
"    return create_engine(mk_url(user, pwd, db), pool_pre_ping=True, future=True)\n\n"
"def run_sql(engine: Engine, sql: str, params: dict|None=None):\n"
"    with engine.begin() as conn:\n"
"        conn.execute(text(sql), params or {})\n\n"
"# Superuser → create the role and database if missing\n"
"su_engine = get_engine(PG_SU_USER, PG_SU_PASS, 'postgres')\n"
"run_sql(su_engine, f\"\"\"\n"
"DO $$\n"
"BEGIN\n"
"   IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = '{PG_USER}') THEN\n"
"      CREATE ROLE {PG_USER} LOGIN PASSWORD '{PG_PASS}';\n"
"   END IF;\n"
"END$$;\n"
"\"\"\")\n"
"# CREATE DATABASE cannot run inside a DO block or a transaction: check first, then use AUTOCOMMIT\n"
"with su_engine.connect().execution_options(isolation_level='AUTOCOMMIT') as conn:\n"
"    db_exists = conn.execute(text('SELECT 1 FROM pg_database WHERE datname = :db'), {'db': PG_DB}).scalar()\n"
"    if not db_exists:\n"
"        conn.execute(text(f'CREATE DATABASE {PG_DB} OWNER {PG_USER}'))\n\n"
"# Application connection and bronze schema\n"
"app_engine = get_engine(PG_USER, PG_PASS, PG_DB)\n"
"run_sql(app_engine, 'CREATE SCHEMA IF NOT EXISTS bronze AUTHORIZATION current_user;')\n"
"print('✅ Bootstrap OK: database, user, and bronze schema ready.')\n"
))

cells.append(nbf.v4.new_markdown_cell("## 3) BRONZE table (raw JSONB storage)"))
cells.append(nbf.v4.new_code_cell(
"bronze_sql = (\n"
"    'CREATE TABLE IF NOT EXISTS bronze.raw_ingest ('\n"
"    '  id_big BIGSERIAL PRIMARY KEY,'\n"
"    '  source_file TEXT,'\n"
"    '  payload_json JSONB NOT NULL,'\n"
"    '  ingested_at TIMESTAMPTZ NOT NULL DEFAULT NOW()'\n"
"    ');'\n"
"    'CREATE INDEX IF NOT EXISTS ix_raw_payload_gin ON bronze.raw_ingest USING GIN (payload_json);'\n"
")\n"
"run_sql(app_engine, bronze_sql)\n"
"print('✅ Table bronze.raw_ingest ready.')\n"
))

cells.append(nbf.v4.new_markdown_cell("## 4) Load the CSV → bronze.raw_ingest"))
cells.append(nbf.v4.new_code_cell(
"import pandas as pd\n"
"import json\n"
"from sqlalchemy import text\n"
"from pathlib import Path\n\n"
"assert Path(RAW_FILE).exists(), f'File not found: {RAW_FILE}'\n"
"# sep=None lets the python engine sniff the delimiter (';' or ',');\n"
"# reading a comma-separated file with sep=';' would not raise, it would yield a single column\n"
"df = pd.read_csv(RAW_FILE, sep=None, engine='python', dtype=str)\n"
"# Replace NaN with None so json.dumps emits null (NaN is not valid JSON)\n"
"df = df.astype(object).where(df.notna(), None)\n"
"records = df.to_dict(orient='records')\n"
"print('📦 Rows to insert:', len(records))\n\n"
"batch_size = 5000\n"
"total = 0\n"
"for i in range(0, len(records), batch_size):\n"
"    chunk = records[i:i+batch_size]\n"
"    values_sql = []\n"
"    params = {}\n"
"    for j, rec in enumerate(chunk):\n"
"        key = f'rec_{i+j}'\n"
"        params[key] = json.dumps(rec, ensure_ascii=False)\n"
"        params[f'src_{i+j}'] = str(Path(RAW_FILE))\n"
"        values_sql.append(f'(:src_{i+j}, CAST(:{key} AS JSONB))')\n"
"    sql = 'INSERT INTO bronze.raw_ingest (source_file, payload_json) VALUES ' + ','.join(values_sql) + ';'\n"
"    with app_engine.begin() as conn:\n"
"        conn.execute(text(sql), params)\n"
"    total += len(chunk)\n"
"print('✅ Rows inserted:', total)\n"
))

cells.append(nbf.v4.new_markdown_cell("## 5) Quick checks"))
cells.append(nbf.v4.new_code_cell(
"import pandas as pd\n"
"from sqlalchemy import text\n\n"
"with app_engine.connect() as conn:\n"
"    n = conn.execute(text('SELECT COUNT(*) FROM bronze.raw_ingest')).scalar_one()\n"
"    print('📊 Total rows in bronze.raw_ingest:', n)\n"
"    sample = pd.read_sql(text(\n"
"        'SELECT id_big, source_file, left(payload_json::text, 200) AS payload_preview, ingested_at '\n"
"        'FROM bronze.raw_ingest ORDER BY id_big DESC LIMIT 5'\n"
"    ), conn)\n"
"\n"
"sample\n"
))

nb['cells'] = cells

out_path = Path('/mnt/data/bronze_ingest_accidents.ipynb')
out_path.write_bytes(nbf.writes(nb).encode('utf-8'))
out_path.as_posix()



## 2) PostgreSQL connection (helpers)


In [None]:

from sqlalchemy import create_engine, text
from sqlalchemy.engine import Engine

def mk_url(user, pwd, db):
    return f"postgresql+psycopg2://{user}:{pwd}@{PG_HOST}:{PG_PORT}/{db}"

def get_engine(user, pwd, db) -> Engine:
    return create_engine(mk_url(user, pwd, db), pool_pre_ping=True, future=True)

def run_sql(engine: Engine, sql: str, params: dict|None=None):
    with engine.begin() as conn:
        conn.execute(text(sql), params or {})
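
One thing to watch with `mk_url`: the password is interpolated into the URL as-is, so a password containing `@`, `/` or `:` produces a URL that parses incorrectly. The sketch below shows one possible way to percent-encode the credentials with the standard library. The name `mk_url_safe` and its `host`/`port` parameters are illustrative assumptions, not part of the notebook.

```python
from urllib.parse import quote_plus

def mk_url_safe(user: str, pwd: str, db: str,
                host: str = "localhost", port: int = 5432) -> str:
    """Build a PostgreSQL URL with the user and password percent-encoded.

    Hypothetical variant of mk_url: special characters in the credentials
    are escaped so they cannot be mistaken for URL delimiters.
    """
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(pwd)}"
        f"@{host}:{port}/{db}"
    )

# '@' and '/' in the password are escaped as %40 and %2F
print(mk_url_safe("accidents", "p@ss/word", "accidents"))
```

With the default `accidents/accidents` credentials both functions return the same URL, so this only matters once someone overrides the password in `.env`.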



## 3) Database bootstrap (roles, database, schemas, privileges)

This cell is **idempotent**: it can be re-run without breaking the existing state.


In [None]:

# Connect as the superuser to the 'postgres' database first, to create the user and the database
su_engine = get_engine(PG_SU_USER, PG_SU_PASS, 'postgres')

# 3.1 Create the application role (shared local password)
run_sql(su_engine, f"""
DO $$
BEGIN
   IF NOT EXISTS (SELECT FROM pg_catalog.pg_roles WHERE rolname = '{PG_USER}') THEN
      CREATE ROLE {PG_USER} LOGIN PASSWORD '{PG_PASS}';
   END IF;
END$$;
""")

# 3.2 Create the database if missing, owned by the application role.
# CREATE DATABASE cannot run inside a DO block or a transaction,
# so check for existence first and use an AUTOCOMMIT connection.
with su_engine.connect().execution_options(isolation_level="AUTOCOMMIT") as conn:
    db_exists = conn.execute(
        text("SELECT 1 FROM pg_database WHERE datname = :db"), {"db": PG_DB}
    ).scalar()
    if not db_exists:
        conn.execute(text(f"CREATE DATABASE {PG_DB} OWNER {PG_USER}"))

# 3.3 Schemas & default privileges
app_engine = get_engine(PG_USER, PG_PASS, PG_DB)

run_sql(app_engine, """
CREATE SCHEMA IF NOT EXISTS bronze AUTHORIZATION current_user;
CREATE SCHEMA IF NOT EXISTS silver AUTHORIZATION current_user;
CREATE SCHEMA IF NOT EXISTS gold   AUTHORIZATION current_user;
""")

# Default privileges (read/write on bronze and silver, read-only on gold)
run_sql(app_engine, """
ALTER DEFAULT PRIVILEGES IN SCHEMA bronze GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO current_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA silver GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO current_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA gold   GRANT SELECT ON TABLES TO current_user;
""")

print("✅ Bootstrap OK: database, role, and schemas ready.")
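
Because the role and database names are interpolated directly into SQL text (identifiers cannot be passed as bind parameters), it may be worth validating them before running the bootstrap. The following is a minimal sketch under that assumption. `check_identifier` is a hypothetical helper, and the pattern deliberately accepts only lowercase unquoted identifiers of at most 63 characters.

```python
import re

# Lowercase letters, digits and underscores; must not start with a digit;
# PostgreSQL truncates identifiers beyond 63 bytes.
_IDENT_RE = re.compile(r"[a-z_][a-z0-9_]{0,62}")

def check_identifier(name: str) -> str:
    """Return `name` unchanged if it is a plain SQL identifier, else raise."""
    if not _IDENT_RE.fullmatch(name):
        raise ValueError(f"Unsafe SQL identifier: {name!r}")
    return name

print(check_identifier("accidents"))   # accepted as-is
try:
    check_identifier("accidents; DROP ROLE postgres")
except ValueError as exc:
    print("rejected:", exc)
```

Calling `check_identifier(PG_USER)` and `check_identifier(PG_DB)` before the bootstrap cell would reject a malformed `.env` value early instead of sending it to the server.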



## 4) BRONZE DDL (tables + indexes)
> **Adapt** the structure to your data dictionary. This version stores the key fields plus the raw JSON.


In [None]:

bronze_sql = '''
CREATE TABLE IF NOT EXISTS bronze.accidents (
  id_accident        TEXT PRIMARY KEY,
  date_heure         TIMESTAMPTZ,
  departement_code   TEXT,
  commune_code       TEXT,
  gravite            TEXT,
  type_route         TEXT,
  condition_meteo    TEXT,
  luminosite         TEXT,
  nb_vehicules       INT,
  payload_json       JSONB
);

CREATE INDEX IF NOT EXISTS ix_accidents_date     ON bronze.accidents(date_heure);
CREATE INDEX IF NOT EXISTS ix_accidents_commune  ON bronze.accidents(commune_code);
CREATE INDEX IF NOT EXISTS ix_accidents_gravite  ON bronze.accidents(gravite);
'''
run_sql(app_engine, bronze_sql)
print("✅ BRONZE DDL created/up to date.")



## 5) API ingestion → BRONZE

Source: Opendatasoft, *accidents-corporels-de-la-circulation-millesime*  
Pagination in **pages of 1000** (configurable). The raw records are saved as a JSONL file in `data/raw/` and then **upserted** into the database.
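
The offset-based pagination described above can be sketched as a small generator, shown here with a fake in-memory fetcher so it runs without network access. `iter_pages` and `fake_fetch` are hypothetical names for illustration; in the notebook, the role of `fetch` would be played by a function that calls the API with an offset and a limit.

```python
from typing import Callable, Iterator, Optional

def iter_pages(fetch: Callable[[int, int], list],
               page_size: int = 1000,
               max_pages: Optional[int] = None) -> Iterator[list]:
    """Yield successive pages until an empty page or max_pages is reached."""
    offset = 0
    pages = 0
    while max_pages is None or pages < max_pages:
        page = fetch(offset, page_size)
        if not page:          # an empty page means the source is exhausted
            return
        yield page
        offset += page_size
        pages += 1

# Fake data source standing in for the API: 2500 records in memory
DATA = list(range(2500))

def fake_fetch(offset: int, limit: int) -> list:
    return DATA[offset:offset + limit]

sizes = [len(p) for p in iter_pages(fake_fetch, page_size=1000)]
print(sizes)  # [1000, 1000, 500]
```

Separating the loop from the HTTP call makes the stop conditions (empty page, page cap) easy to check in isolation.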


In [None]:

import requests, time, json
import pandas as pd
from pathlib import Path
from tqdm import tqdm

BASE_URL = "https://public.opendatasoft.com/api/records/1.0/search/"
DATASET = "accidents-corporels-de-la-circulation-millesime"

RAW_DIR = Path("data/raw")
RAW_DIR.mkdir(parents=True, exist_ok=True)

def fetch_page(offset=0, limit=1000):
    params = {"dataset": DATASET, "rows": limit, "start": offset}
    r = requests.get(BASE_URL, params=params, timeout=60)
    r.raise_for_status()
    return r.json().get("records", [])

# 5.1 Fetch (configurable limit)
page_size = 1000
max_pages  = 5   # ← adjust (None = fetch everything; mind the run time)
records = []
offset = 0
pages = 0

while True:
    page = fetch_page(offset, page_size)
    if not page:
        break
    records.extend(page)
    offset += page_size
    pages += 1
    if max_pages and pages >= max_pages:
        break
    time.sleep(0.2)

print(f"📦 Fetched: {len(records)} records in {pages} pages.")

# 5.2 Raw save (one JSON record per line)
raw_path = RAW_DIR / "accidents_raw.jsonl"
with raw_path.open("w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
print(f"💾 Written: {raw_path.resolve()}")

# 5.3 Minimal normalization → DataFrame
rows = []
for rec in records:
    fid = rec.get("fields", {})

    rows.append({
        "id_accident": rec.get("recordid"),
        "date_heure": fid.get("date", None),
        "departement_code": fid.get("departement", None),
        "commune_code": fid.get("com_code", None) or fid.get("com", None),
        "gravite": fid.get("gravite", None) or fid.get("grav", None),
        "type_route": fid.get("catr", None) or fid.get("type_voie", None),
        "condition_meteo": fid.get("atm", None) or fid.get("conditions_atmospheriques", None),
        "luminosite": fid.get("lum", None) or fid.get("luminosite", None),
        # To verify: in the BAAC data dictionary, `nbv` is the number of traffic lanes,
        # not the number of vehicles; adapt this mapping to your dictionary.
        "nb_vehicules": fid.get("nbv", None),
        "payload_json": rec.get("fields", {}),
    })

df = pd.DataFrame(rows)
df['date_heure'] = pd.to_datetime(df['date_heure'], errors='coerce', utc=True)
print(df.head(3))
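
The normalization above picks the first available field with `a or b`. That pattern also skips legitimate falsy values such as `0`. A small helper along the following lines makes the fallback explicit; `first_present` is a hypothetical name, and treating the empty string as missing is an assumption to adapt.

```python
def first_present(fields: dict, *keys: str):
    """Return the first value that is neither None nor an empty string."""
    for key in keys:
        value = fields.get(key)
        if value is not None and value != "":
            return value
    return None

fields = {"com": "", "com_code": "75056", "nbv": 0}

print(first_present(fields, "com", "com_code"))   # 75056
print(first_present(fields, "nbv", "other"))      # 0 is kept, unlike with `or`
print(first_present(fields, "missing"))           # None
```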



### 5.4 BRONZE load (UPSERT)


In [None]:

# Note: `sqlalchemy.dialects.postgresql.insert` expects a `Table` object, so
# `insert(text("bronze.accidents"))` raises an error and `on_conflict_do_update`
# is not available that way. Rather than reflect the table, the next cell
# performs the UPSERT with raw SQL (`INSERT ... ON CONFLICT`).


In [None]:

def _to_param(col, val):
    # Convert pandas/NumPy values into types psycopg2 can adapt
    if col == "payload_json":
        return json.dumps(val, ensure_ascii=False)   # bound as text, cast to JSONB in SQL
    if val is None or (pd.api.types.is_scalar(val) and pd.isna(val)):
        return None                                   # NaN / NaT → NULL
    if col == "nb_vehicules":
        return int(float(val))
    if isinstance(val, pd.Timestamp):
        return val.to_pydatetime()
    return val

def upsert_bronze_batch(engine, df, batch_size=1000):
    # Batched UPSERT in raw SQL (INSERT ... ON CONFLICT is PostgreSQL-specific)
    cols = ["id_accident","date_heure","departement_code","commune_code",
            "gravite","type_route","condition_meteo","luminosite",
            "nb_vehicules","payload_json"]
    total = len(df)
    for i in tqdm(range(0, total, batch_size)):
        chunk = df.iloc[i:i+batch_size]
        # Build the VALUES list and the matching bind parameters
        values_sql = []
        params = {}
        for j, (_, r) in enumerate(chunk.iterrows()):
            placeholders = []
            for c in cols:
                key = f"{c}_{i+j}"
                params[key] = _to_param(c, r[c])
                placeholders.append(f"CAST(:{key} AS JSONB)" if c == "payload_json" else f":{key}")
            values_sql.append("(" + ",".join(placeholders) + ")")
        sql = f'''
        INSERT INTO bronze.accidents ({",".join(cols)})
        VALUES {",".join(values_sql)}
        ON CONFLICT (id_accident) DO UPDATE SET
            date_heure = EXCLUDED.date_heure,
            departement_code = EXCLUDED.departement_code,
            commune_code = EXCLUDED.commune_code,
            gravite = EXCLUDED.gravite,
            type_route = EXCLUDED.type_route,
            condition_meteo = EXCLUDED.condition_meteo,
            luminosite = EXCLUDED.luminosite,
            nb_vehicules = EXCLUDED.nb_vehicules,
            payload_json = EXCLUDED.payload_json;
        '''
        # Use the engine passed as argument, one transaction per batch
        with engine.begin() as conn:
            conn.execute(text(sql), params)

upsert_bronze_batch(app_engine, df, batch_size=1000)
print("✅ BRONZE loaded (UPSERT).")
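
To make the batched `VALUES` construction easier to inspect, here is a standalone sketch of the same idea: one named bind parameter per cell, with JSON columns serialized and cast in SQL. The function name `build_values_clause` and the `json_cols` argument are illustrative assumptions; no database is needed to run it.

```python
import json

def build_values_clause(rows, cols, json_cols=()):
    """Return (values_sql, params) for a multi-row INSERT with named binds."""
    params = {}
    tuples = []
    for i, row in enumerate(rows):
        placeholders = []
        for col in cols:
            key = f"{col}_{i}"
            if col in json_cols:
                # Serialize to text; the CAST turns it into JSONB server-side
                params[key] = json.dumps(row.get(col), ensure_ascii=False)
                placeholders.append(f"CAST(:{key} AS JSONB)")
            else:
                params[key] = row.get(col)
                placeholders.append(f":{key}")
        tuples.append("(" + ", ".join(placeholders) + ")")
    return ", ".join(tuples), params

clause, params = build_values_clause(
    [{"id": "a", "payload": {"k": 1}}, {"id": "b", "payload": {}}],
    cols=["id", "payload"],
    json_cols=("payload",),
)
print(clause)
print(params)
```

Using `CAST(:name AS JSONB)` rather than the `::jsonb` shorthand avoids any ambiguity with the `:name` bind syntax used by `text()`.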



## 6) SILVER: cleaned/typed views (example)


In [None]:

silver_sql = '''
CREATE OR REPLACE VIEW silver.accidents_clean AS
SELECT
  id_accident,
  date_heure,
  COALESCE(departement_code, payload_json->>'dep')      AS departement_code,
  COALESCE(commune_code,     payload_json->>'com')      AS commune_code,
  gravite,
  type_route,
  condition_meteo,
  luminosite,
  nb_vehicules
FROM bronze.accidents;
'''
run_sql(app_engine, silver_sql)
print("✅ SILVER views ready.")



## 7) GOLD: minimal dimensional model (example)
- **dim_temps**, **dim_lieu**, **dim_conditions**, **fact_accidents**.


In [None]:

gold_sql = '''
CREATE TABLE IF NOT EXISTS gold.dim_temps (
  id_temps SERIAL PRIMARY KEY,
  date_heure TIMESTAMPTZ UNIQUE,
  annee INT, mois INT, semaine INT, jour INT, heure INT
);

CREATE TABLE IF NOT EXISTS gold.dim_lieu (
  id_lieu SERIAL PRIMARY KEY,
  departement_code TEXT,
  commune_code TEXT,
  UNIQUE(departement_code, commune_code)
);

CREATE TABLE IF NOT EXISTS gold.dim_conditions (
  id_cond SERIAL PRIMARY KEY,
  gravite TEXT,
  type_route TEXT,
  condition_meteo TEXT,
  luminosite TEXT,
  UNIQUE(gravite, type_route, condition_meteo, luminosite)
);

CREATE TABLE IF NOT EXISTS gold.fact_accidents (
  id_accident TEXT PRIMARY KEY,
  id_temps INT REFERENCES gold.dim_temps(id_temps),
  id_lieu  INT REFERENCES gold.dim_lieu(id_lieu),
  id_cond  INT REFERENCES gold.dim_conditions(id_cond),
  nb_vehicules INT
);
'''
run_sql(app_engine, gold_sql)
print("✅ GOLD tables created.")



### 7.1 Populating GOLD (simple ETL)


In [None]:

# Time dimension
run_sql(app_engine, '''
INSERT INTO gold.dim_temps (date_heure, annee, mois, semaine, jour, heure)
SELECT DISTINCT date_heure,
       EXTRACT(YEAR  FROM date_heure)::INT,
       EXTRACT(MONTH FROM date_heure)::INT,
       EXTRACT(WEEK  FROM date_heure)::INT,
       EXTRACT(DAY   FROM date_heure)::INT,
       EXTRACT(HOUR  FROM date_heure)::INT
FROM silver.accidents_clean
WHERE date_heure IS NOT NULL
ON CONFLICT (date_heure) DO NOTHING;
''')

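# Location dimension
# NULLs never conflict in a UNIQUE constraint, so the NOT EXISTS filter
# keeps re-runs from duplicating rows that contain a NULL code.
run_sql(app_engine, '''
INSERT INTO gold.dim_lieu (departement_code, commune_code)
SELECT DISTINCT s.departement_code, s.commune_code
FROM silver.accidents_clean s
WHERE NOT EXISTS (
  SELECT 1 FROM gold.dim_lieu l
  WHERE l.departement_code IS NOT DISTINCT FROM s.departement_code
    AND l.commune_code     IS NOT DISTINCT FROM s.commune_code
)
ON CONFLICT (departement_code, commune_code) DO NOTHING;
''')

# Conditions dimension (same NULL-safe filter)
run_sql(app_engine, '''
INSERT INTO gold.dim_conditions (gravite, type_route, condition_meteo, luminosite)
SELECT DISTINCT s.gravite, s.type_route, s.condition_meteo, s.luminosite
FROM silver.accidents_clean s
WHERE NOT EXISTS (
  SELECT 1 FROM gold.dim_conditions c
  WHERE c.gravite         IS NOT DISTINCT FROM s.gravite
    AND c.type_route      IS NOT DISTINCT FROM s.type_route
    AND c.condition_meteo IS NOT DISTINCT FROM s.condition_meteo
    AND c.luminosite      IS NOT DISTINCT FROM s.luminosite
)
ON CONFLICT (gravite, type_route, condition_meteo, luminosite) DO NOTHING;
''')

# Fact table
# Joins use IS NOT DISTINCT FROM so accidents with a NULL attribute still match
# their dimension row (a plain "=" would silently drop them).
run_sql(app_engine, '''
INSERT INTO gold.fact_accidents (id_accident, id_temps, id_lieu, id_cond, nb_vehicules)
SELECT s.id_accident,
       t.id_temps,
       l.id_lieu,
       c.id_cond,
       s.nb_vehicules
FROM silver.accidents_clean s
JOIN gold.dim_temps t ON t.date_heure = s.date_heure
JOIN gold.dim_lieu  l ON l.departement_code IS NOT DISTINCT FROM s.departement_code
                     AND l.commune_code     IS NOT DISTINCT FROM s.commune_code
JOIN gold.dim_conditions c ON c.gravite         IS NOT DISTINCT FROM s.gravite
                          AND c.type_route      IS NOT DISTINCT FROM s.type_route
                          AND c.condition_meteo IS NOT DISTINCT FROM s.condition_meteo
                          AND c.luminosite      IS NOT DISTINCT FROM s.luminosite
ON CONFLICT (id_accident) DO NOTHING;
''')

print("✅ GOLD populated.")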
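
A caveat about the time dimension populated above: PostgreSQL's `EXTRACT(WEEK ...)` returns the ISO 8601 week number, while `EXTRACT(YEAR ...)` returns the calendar year. Around New Year the two can disagree, which matters when grouping by `(annee, semaine)`. The sketch below reproduces the effect with Python's `isocalendar()`. `week_key` is a hypothetical helper, and pairing the week with `EXTRACT(ISOYEAR ...)` is one possible remedy rather than a required change.

```python
from datetime import date

def week_key(d: date) -> tuple:
    """Return (ISO year, ISO week), the consistent pair for weekly grouping."""
    iso = d.isocalendar()
    return (iso[0], iso[1])

d = date(2021, 1, 1)
print("calendar year:", d.year)             # 2021
print("ISO (year, week):", week_key(d))     # (2020, 53)
```

Grouping by calendar year and ISO week would file 1 January 2021 under "2021, week 53", a week that otherwise holds no data for that year.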



## 8) Analyses (requested examples)


In [None]:

import pandas as pd

def query_df(sql):
    with app_engine.connect() as conn:
        return pd.read_sql(text(sql), conn)

# 8.1 Risky conditions vs. the national average
# The national average is joined in so each row can be compared against it.
q1 = '''
WITH national AS (
  SELECT AVG(nb_vehicules) AS avg_nb_national FROM gold.fact_accidents
)
SELECT c.gravite, c.type_route, c.condition_meteo, c.luminosite,
       COUNT(*) AS accidents, AVG(f.nb_vehicules) AS avg_nb,
       n.avg_nb_national
FROM gold.fact_accidents f
JOIN gold.dim_conditions c ON f.id_cond = c.id_cond
CROSS JOIN national n
GROUP BY 1,2,3,4, n.avg_nb_national
ORDER BY accidents DESC
LIMIT 20;
'''
df1 = query_df(q1); df1.head(10)


In [None]:

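# 8.2 Busy areas: more severe accidents, or simply more accidents?
# Check the severity codes against your dictionary: in the BAAC coding,
# grav 2 = killed, 3 = hospitalized, 4 = slightly injured.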
q2 = '''
SELECT l.departement_code, l.commune_code,
       COUNT(*) AS total_accidents,
       SUM(CASE WHEN c.gravite IN ('3','4','grave','mortel') THEN 1 ELSE 0 END) AS accidents_graves
FROM gold.fact_accidents f
JOIN gold.dim_lieu l ON f.id_lieu = l.id_lieu
JOIN gold.dim_conditions c ON f.id_cond = c.id_cond
GROUP BY 1,2
ORDER BY total_accidents DESC
LIMIT 20;
'''
df2 = query_df(q2); df2.head(10)


In [None]:

# 8.3 Detecting abnormal weeks (deviation from the mean)
q3 = '''
WITH weekly AS (
  SELECT t.annee, t.semaine, COUNT(*) AS nb
  FROM gold.fact_accidents f
  JOIN gold.dim_temps t ON f.id_temps = t.id_temps
  GROUP BY t.annee, t.semaine
),
stats AS (
  SELECT AVG(nb) AS mu, STDDEV_POP(nb) AS sigma FROM weekly
)
SELECT w.*, ROUND((w.nb - s.mu) / NULLIF(s.sigma,0), 2) AS zscore
FROM weekly w CROSS JOIN stats s
ORDER BY zscore DESC NULLS LAST
LIMIT 20;
'''
df3 = query_df(q3); df3.head(10)
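
For readers who want to check the z-score logic of query 8.3 outside the database, the same computation can be sketched in plain Python. `weekly_zscores` and the sample counts are made up for illustration. `pstdev` is the population standard deviation, matching `STDDEV_POP`, and a zero deviation yields `None`, mirroring `NULLIF(s.sigma, 0)`.

```python
from statistics import mean, pstdev

def weekly_zscores(counts: dict) -> dict:
    """Map each (year, week) key to its z-score, or None if sigma is zero."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)          # population std dev, like STDDEV_POP
    return {
        key: (None if sigma == 0 else round((nb - mu) / sigma, 2))
        for key, nb in counts.items()
    }

# Made-up weekly counts: week 4 is an obvious outlier
demo = {(2020, 1): 10, (2020, 2): 12, (2020, 3): 11, (2020, 4): 30}
scores = weekly_zscores(demo)
for key, z in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(key, z)
```

With only a handful of weeks a single outlier inflates the standard deviation itself, so z-scores stay modest; the approach becomes more informative over several years of data.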



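## 9) Indexing & partitioning (ideas)
- Typical indexes: `fact_accidents(id_temps)`, `fact_accidents(id_lieu)`, `dim_temps(date_heure)`. The last one is already covered by the `UNIQUE` constraint on `date_heure`, so an extra index there is optional.
- Partitioning by **year**: PostgreSQL partitions a table on its own columns, so the fact table would need its own `date_heure` column (a foreign key to `dim_temps` is not enough), and its primary key would have to include that column.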
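
As a rough illustration of the partitioning idea, the sketch below generates the DDL for yearly partitions of a range-partitioned fact table. The parent table name `gold.fact_accidents_part` is hypothetical (it assumes a table declared with `PARTITION BY RANGE (date_heure)`, which this notebook does not create), and `yearly_partition_ddl` is an illustrative helper that only builds strings.

```python
def yearly_partition_ddl(parent: str, year: int) -> str:
    """Build the CREATE TABLE statement for one yearly range partition.

    The upper bound is exclusive, so each partition covers
    [year-01-01, (year+1)-01-01).
    """
    child = f"{parent}_{year}"
    return (
        f"CREATE TABLE IF NOT EXISTS {child} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{year}-01-01') TO ('{year + 1}-01-01');"
    )

# Hypothetical parent table, assumed to be PARTITION BY RANGE (date_heure)
for y in (2019, 2020):
    print(yearly_partition_ddl("gold.fact_accidents_part", y))
```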


In [None]:

run_sql(app_engine, '''
CREATE INDEX IF NOT EXISTS ix_fact_time ON gold.fact_accidents(id_temps);
CREATE INDEX IF NOT EXISTS ix_fact_lieu ON gold.fact_accidents(id_lieu);
CREATE INDEX IF NOT EXISTS ix_dim_temps_date ON gold.dim_temps(date_heure);
''')
print("✅ Recommended indexes created (if missing).")



## 10) Summary
- **Everything** runs here (without Docker).
- Idempotent: you can re-run the cells without breaking the database.
- Adapt the **BRONZE/SILVER/GOLD** DDL to your actual data dictionary.
- You can open DBeaver for a graphical view, but **everything goes through this notebook**.
