# Physical Climate Risk Propagation Model - EU-27 SAM (Scenario Runs)
## End-to-end simulation, post-processing, and visualization

This notebook provides a **clean, reproducible workflow** to:
1. Load and prepare the Eurostat-based Social Accounting Matrix (SAM)
2. Initialize the IO Climate Risk Propagation Model
3. Define and run climate-related shock scenarios
4. Post-process model outputs into economic impact indicators
5. Visualize results in a dashboard-ready format

All heavy logic is delegated to the `src/io_climate` package.
This notebook acts as an **orchestrator and presentation layer**, not a model implementation.

This notebook is the *single entry point* to:
- load the latest EU-27 SAM (long format) from Databricks
- extract IO-style blocks (`Z`, `FD`, `X`, `A`, `globsec_of`, `node_labels`)
- run the **physical-risk propagation** model under supply/demand shocks
- inspect and visualize impacts (country / sector), including a Europe choropleth



## 0) Environment and imports

**Prerequisites**
- You are running on a Databricks cluster or a local environment with a working Spark session (`spark`) connected to the Databricks metastore.
- The project `src/` folder is importable (we add it to `sys.path` below).

If you are running locally, ensure Databricks Connect is configured and `spark` is available.


In [1]:
import os
import sys

import numpy as np
import pandas as pd

# --- Make project imports work from the repo root
PROJECT_ROOT = os.path.abspath(os.getcwd())
SRC_DIR = os.path.join(PROJECT_ROOT, "src")
if SRC_DIR not in sys.path:
    sys.path.insert(0, SRC_DIR)

# SRC_DIR is on sys.path, so import every package without the "src." prefix
from data_io.eurostat_sam import load_sam_latest_year, extract_model_inputs_from_sam
from data_io.sector_decoder import build_sector_decoder
from io_climate.model import IOClimateModel
from io_climate.postprocess import postprocess_results
from io_climate.viz import build_dashboard_bundle


## 1) Load SAM and extract model inputs

The SAM is expected in **long format** and must include:
- `geo_ava`, `ind_ava` (producer country/account)
- `geo_use`, `ind_use` (user country/account)
- `value` (flow)
- `share` (precomputed coefficient: cell value / total output of user column)

`extract_model_inputs_from_sam(...)` builds:
- `Z` (n×n) intermediate flows among production sectors (`P_*`)
- `FD` (n,) final demand from accounts `[HH, GOV, CF, WRL_REST]`
- `X` (n,) gross output = `row_sum(Z) + FD`
- `A` (n×n) technical coefficients (from `share` for the `P_* → P_*` block)
- `globsec_of` (n,) maps each node to its global sector id (same `P_*` across countries)
- `node_labels` list of `"CC::P_..."` strings, consistent with matrix ordering
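
To make the relationships between these blocks concrete, here is a minimal sketch on a toy 3-node economy. The numbers are invented for illustration, and dividing each column of `Z` by the user's output is an assumption based on the `share` definition above; the actual extraction lives in `extract_model_inputs_from_sam`.

```python
import numpy as np

# Toy 3-node economy (hypothetical numbers, not taken from the SAM)
Z_toy = np.array([
    [10.0, 5.0, 0.0],
    [2.0, 8.0, 4.0],
    [1.0, 3.0, 6.0],
])
FD_toy = np.array([35.0, 26.0, 40.0])

# Gross output: intermediate sales (row sums) plus final demand
X_toy = Z_toy.sum(axis=1) + FD_toy

# Technical coefficients: input from i per unit of output of user j
# (assumption: column-wise normalisation, mirroring the `share` column)
A_toy = Z_toy / X_toy[np.newaxis, :]

# The row accounting identity X = A @ X + FD holds by construction
gap = np.abs(X_toy - (A_toy @ X_toy + FD_toy)).max()
print("X:", X_toy)
print("max accounting gap:", gap)
```

The same identity is what the baseline accounting check later in the notebook verifies on the full matrices.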


In [2]:
# Load latest year from Databricks table
sam_df, latest_year = load_sam_latest_year(spark)
print("SAM year:", latest_year)
print("Rows:", sam_df.count())

# Extract model inputs (numpy arrays + labels)
Z, FD, X, A, globsec_of, node_labels = extract_model_inputs_from_sam(sam_df)

n = len(node_labels)
print("n nodes:", n)
print("Z shape:", Z.shape, "A shape:", A.shape, "FD shape:", FD.shape, "X shape:", X.shape)


SAM year: 2022
Rows: 2040159
n nodes: 1695
Z shape: (1695, 1695) A shape: (1695, 1695) FD shape: (1695,) X shape: (1695,)


**Sector name mapping table**

If available, you can load a sector mapping/decoder table to attach human-readable sector names to results. 
At this stage, the model uses **NACE codes** from `node_labels` (e.g. `IT::P_D35`). The mapping is **not** required to run the model.
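
As a small illustration of how the `"CC::P_..."` labels can be split and decoded, the sketch below uses a hypothetical `split_label` helper and a plain dictionary standing in for the decoder. The actual return type of `build_sector_decoder` is not shown in this notebook, so treating it as a dict is an assumption.

```python
# Hypothetical labels in the "CC::P_..." format used by node_labels
example_labels = ["IT::P_D35", "PT::P_D35", "IT::P_H49"]

def split_label(label):
    """Split 'CC::P_xxx' into (country, sector). Hypothetical helper."""
    country, sector = label.split("::", 1)
    return country, sector

# Stand-in for the decoder (assumption: a code -> name mapping)
example_decoder = {"P_D35": "Electricity, gas, steam"}

decoded = []
for lbl in example_labels:
    country, sector = split_label(lbl)
    # Fall back to the raw code when no readable name is available
    name = example_decoder.get(sector, sector)
    decoded.append((country, sector, name))
    print(country, sector, name)
```

Falling back to the raw code keeps results usable when a sector is missing from the mapping.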


In [7]:
sectors_in_sam = sorted({lbl.split("::")[1] for lbl in node_labels})
sector_decoder = build_sector_decoder(
    spark,
    sectors_in_sam=sectors_in_sam,
)
print("Sectors in SAM:", len(sectors_in_sam))

Sectors in SAM: 63


**Quick sanity checks**

These checks target the most common ingestion issues: the assertions catch shape and ordering mismatches, while the printed minima, totals, and accounting gap expose negative values and missing blocks.


In [8]:
assert Z.shape == (n, n)
assert A.shape == (n, n)
assert FD.shape == (n,)
assert X.shape == (n,)
assert len(globsec_of) == n

print("Min values — Z:", float(Z.min()), "A:", float(A.min()), "FD:", float(FD.min()), "X:", float(X.min()))
print("Totals — sum(Z):", float(Z.sum()), "sum(FD):", float(FD.sum()), "sum(X):", float(X.sum()))

# Accounting identity at baseline (row accounting)
row_gap = np.linalg.norm(X - (FD + Z.sum(axis=1)), ord=1)
print("Baseline accounting gap (L1):", float(row_gap))


Min values — Z: 0.0 A: 0.0 FD: -0.040412221578703625 X: 1.2000000000907844
Totals — sum(Z): 13941501.474966172 sum(FD): 14831340.773248244 sum(X): 28772842.248214405
Baseline accounting gap (L1): 0.0
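
The output above reports a slightly negative minimum for `FD` (about -0.04), which the shape assertions do not flag. A minimal sketch of a helper that counts negative entries and reports the worst one is given below; `summarize_negatives` is a hypothetical name and the input vector is a toy example, not SAM data.

```python
import numpy as np

def summarize_negatives(name, arr, tol=0.0):
    """Count entries below -tol and return (count, minimum). Hypothetical helper."""
    arr = np.asarray(arr, dtype=float)
    mask = arr < -tol
    n_neg = int(mask.sum())
    # Report 0.0 as the "worst" value when nothing is negative
    worst = float(arr[mask].min()) if n_neg else 0.0
    print(f"{name}: {n_neg} negative entries, min = {worst}")
    return n_neg, worst

# Toy vector standing in for FD (illustrative values only)
fd_example = np.array([12.5, -0.04, 3.0, 0.0])
n_neg, worst = summarize_negatives("FD", fd_example)
```

In the notebook, the same call could be applied to `Z`, `A`, `FD`, and `X` to decide whether small negatives should be clipped or investigated.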


## 2) Instantiate the model

`IOClimateModel` stores the **baseline economy** internally (`Z0`, `A0`, `L0`, `X0`, `FD0`), and enforces a fixed **global-technology** matrix `A_G` derived from baseline global-sector aggregation.

For scenario diagnostics, keep a copy of baseline output **outside** the model, to avoid any ambiguity.
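
The construction of `A_G` is internal to `IOClimateModel` and is not shown in this notebook. As one plausible reading of "global-sector aggregation", the sketch below pools the inputs each node buys from all suppliers in the same global sector and divides by the buyer's output. Every name and number here is an assumption for illustration, not the package's implementation.

```python
import numpy as np

# Toy setup: 4 nodes = 2 countries x 2 global sectors (hypothetical numbers)
globsec_toy = np.array([0, 1, 0, 1])  # global sector id of each node
Z_toy = np.array([
    [4.0, 2.0, 1.0, 0.0],
    [1.0, 3.0, 0.0, 2.0],
    [2.0, 0.0, 5.0, 1.0],
    [0.0, 1.0, 2.0, 4.0],
])
X_toy = np.array([20.0, 15.0, 25.0, 20.0])

n_nodes = len(globsec_toy)
n_sectors = int(globsec_toy.max()) + 1

# One-hot aggregation matrix G (n_sectors x n_nodes):
# G[s, i] = 1 if supplier node i belongs to global sector s
G = np.zeros((n_sectors, n_nodes))
G[globsec_toy, np.arange(n_nodes)] = 1.0

# Inputs of global sector s used by node j, pooled over supplier countries
Z_by_sector = G @ Z_toy

# Assumed "global technology": pooled sector-s input per unit of node j's output
A_G_toy = Z_by_sector / X_toy[np.newaxis, :]
print(A_G_toy)
```

Under this reading, a buyer's total requirement of a given product stays fixed while the split across supplier countries is free to change.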


In [9]:
model = IOClimateModel(
    Z=Z,
    FD=FD,
    X=X,
    globsec_of=globsec_of,
    A=A,
    node_labels=node_labels,
)

# Baseline copies for diagnostics
X_baseline = X.copy()
FD_baseline = FD.copy()

print("Model ready.")
print("Baseline total output:", float(X_baseline.sum()))
print("Baseline total final demand:", float(FD_baseline.sum()))
print("Global sectors:", model.S_glob)


Model ready.
Baseline total output: 28772842.248214405
Baseline total final demand: 14831340.773248244
Global sectors: 63


## 3) Define and run a scenario

You can define shocks either:
- by providing `sd` and `sp` vectors directly, or
- by specifying `(country_codes, sector_codes, supply_shock_pct, demand_shock_pct)`.

This notebook uses **scenario mode** for convenience.

Notes:
- Supply shock reduces capacity: `X_cap = X0 * (1 - sp)`
- Demand shock reduces initial final demand: `FD_post = FD0 * (1 - sd)`
- Outer iteration reduces demand *further* only where implied demand is lower than `FD_post` (elementwise monotone update).
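
The three notes above can be sketched on a toy 4-node economy as follows. The `shock_vector` helper, the conversion of percentages to fractions by dividing by 100, and all numbers are assumptions for illustration; the role of `gamma` in the real update is not covered here.

```python
import numpy as np

# Hypothetical labels and baseline vectors for a 4-node toy economy
labels = ["PT::P_D35", "PT::P_H49", "IT::P_D35", "IT::P_H49"]
X0 = np.array([100.0, 80.0, 300.0, 200.0])
FD0 = np.array([40.0, 30.0, 120.0, 90.0])

def shock_vector(labels, countries, sectors, pct):
    """Fractional shock on nodes matching both a country and a sector code."""
    vec = np.zeros(len(labels))
    for k, lbl in enumerate(labels):
        country, sector = lbl.split("::", 1)
        if country in countries and sector in sectors:
            vec[k] = pct / 100.0  # assumption: percent converted to a fraction
    return vec

sp = shock_vector(labels, {"PT"}, {"P_D35"}, 5.0)  # supply shock
sd = shock_vector(labels, {"IT"}, {"P_H49"}, 2.5)  # demand shock

X_cap = X0 * (1.0 - sp)      # capacity after the supply shock
FD_post = FD0 * (1.0 - sd)   # final demand after the demand shock

# Monotone outer update: demand is only ever revised downwards
FD_implied = np.array([39.0, 31.0, 120.0, 85.0])  # made-up implied demand
FD_post = np.minimum(FD_post, FD_implied)
print("X_cap:", X_cap)
print("FD_post:", FD_post)
```

Note that the second node keeps its demand of 30 even though implied demand is 31: the elementwise minimum never raises `FD_post`.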


In [10]:
scenario = dict(
    # Supply shock target: Portugal energy/utilities
    supply_country_codes=["PT"],   # Portugal
    supply_sector_codes=["P_D35"], # Electricity, gas, steam
    supply_shock_pct=5.0,

    # Demand shock target: Italy, several service sectors
    demand_country_codes=["IT"],            # Italy
    demand_sector_codes=["P_H49", "P_J61", "P_M69_70"],  # transport, telecom, professional services
    demand_shock_pct=2.5,

    gamma=0.5,
    max_iter=100,
    tol=1e-3,
    return_history=True,
)

results = model.run(**scenario)

print("Converged:", results["converged"])
print("Iterations:", results["iterations"])


Converged: True
Iterations: 3


## 4) Build node-level impact table

We compute losses (and gains) in terms of output, value added, and structural change at node level.
The post-processing step produces standardized tables:

- `df_nodes`: node-level impacts (country–sector)
- `df_country`: country aggregates
- `df_sector`: sector aggregates (decoded names if available)
- `df_links_weakened` / `df_links_strengthened`: largest linkage changes (ΔA by default)

Units:
- Output and value added are in the same unit as the SAM flows (typically EUR).
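
For orientation, the loss columns in these tables follow `loss_abs = X_baseline - X_final` and `loss_pct = 100 * loss_abs / X_baseline`. The sketch below reproduces that arithmetic and a country aggregation with pandas on toy values; it is an illustration under those definitions, not the code of `postprocess_results`.

```python
import numpy as np
import pandas as pd

# Toy inputs (hypothetical values); in the notebook these would be
# node_labels, X_baseline and results["X_supply_final"]
labels = ["AT::P_A01", "AT::P_A02", "PT::P_D35"]
X_base = np.array([100.0, 50.0, 200.0])
X_fin = np.array([95.0, 48.0, 180.0])

df = pd.DataFrame({"node": labels, "X_baseline": X_base, "X_final": X_fin})
df[["country", "sector"]] = df["node"].str.split("::", n=1, expand=True)

# Node-level losses: positive means output fell relative to baseline
df["loss_abs"] = df["X_baseline"] - df["X_final"]
df["loss_pct"] = 100.0 * df["loss_abs"] / df["X_baseline"]

# Country aggregates: sum the levels first, then recompute the percentage
level_cols = ["X_baseline", "X_final", "loss_abs"]
by_country = df.groupby("country", as_index=False)[level_cols].sum()
by_country["loss_pct"] = 100.0 * by_country["loss_abs"] / by_country["X_baseline"]
by_country = by_country.sort_values("loss_abs", ascending=False)
print(by_country)
```

Recomputing `loss_pct` after summing avoids averaging node-level percentages, which would ignore differences in node size.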


In [20]:
# Postprocess run outputs into standardized tables and KPIs
pp = postprocess_results(
    node_labels=node_labels,
    Z0=Z,
    X0=X_baseline,
    Z1=results["Z_final"],
    X1=results["X_supply_final"],
    FD_post=results.get("FD_post_final"),
    sector_name_map=sector_decoder,   # optional; comment out if not available
    linkage_metric="A",              # structural change via technical coefficients
    top_k_links=25,
)

df_nodes = pp.df_nodes
df_country = pp.df_country
df_sector = pp.df_sector

# Structural change (top linkages)
df_links_weakened = pp.df_links_weakened
df_links_strengthened = pp.df_links_strengthened

display(pp.meta)


{'X_baseline_total': 28772842.248214405,
 'X_final_total': 27968010.72781489,
 'X_loss_abs_total': 804831.5203995146,
 'X_loss_pct_total': 2.7971915789774333,
 'VA_baseline_total': 14831340.77324824,
 'VA_final_total': 14417911.695065305,
 'VA_loss_abs_total': 413429.0781829357,
 'VA_loss_pct_total': 2.7875367743464627,
 'FD_post_total': 14417908.343604557}

In [19]:
# Display the first 10 rows of each table
print("Node-level impacts (first 10 rows)")
display(df_nodes.head(10))

print("\nMost Impacted Countries")
display(df_country.head(10))

print("\nMost Impacted Sectors")
display(df_sector.head(10))

print("\nTop Weakened Linkages (most negative ΔA)")
display(df_links_weakened.head(10))

print("\nTop Strengthened Linkages (most positive ΔA)")
display(df_links_strengthened.head(10))

Node-level impacts (first 10 rows)


|  | node | country | sector | X_baseline | X_final | loss_abs | loss_pct | sector_name | VA_baseline | VA_final | VA_loss_abs | VA_loss_pct |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AT::P_A01 | AT | P_A01 | 10170.862001 | 9616.942118 | 553.919883 | 5.446145 | Crop and animal production, hunting and relate... | 4386.116991 | 4147.242703 | 238.874288 | 5.446145 |
| 1 | AT::P_A02 | AT | P_A02 | 2903.627 | 2694.477342 | 209.149658 | 7.203048 | Forestry and logging | 1247.327498 | 1157.481894 | 89.845603 | 7.203048 |
| 2 | AT::P_A03 | AT | P_A03 | 111.881 | 106.690129 | 5.190871 | 4.639636 | Fishing and aquaculture | 38.56856 | 36.779119 | 1.789441 | 4.639636 |
| 3 | AT::P_B | AT | P_B | 3452.811 | 3356.478828 | 96.332172 | 2.789964 | P_B | 1668.297402 | 1621.752471 | 46.54493 | 2.789966 |
| 4 | AT::P_C10-12 | AT | P_C10-12 | 28074.827001 | 27292.312293 | 782.514708 | 2.787247 | Manufacture of food products; beverages and to... | 7311.576528 | 7107.782115 | 203.794413 | 2.787284 |
| 5 | AT::P_C13-15 | AT | P_C13-15 | 2348.636 | 2283.633657 | 65.002343 | 2.767664 | Manufacture of textiles, wearing apparel, leat... | 884.34813 | 859.87198 | 24.476149 | 2.767705 |
| 6 | AT::P_C16 | AT | P_C16 | 10829.488001 | 10521.617289 | 307.870712 | 2.842893 | Manufacture of wood and of products of wood an... | 3554.409379 | 3453.360753 | 101.048626 | 2.842909 |
| 7 | AT::P_C17 | AT | P_C17 | 7605.928001 | 7379.416383 | 226.511618 | 2.978093 | Manufacture of paper and paper products | 2822.957457 | 2738.886931 | 84.070525 | 2.978101 |
| 8 | AT::P_C18 | AT | P_C18 | 1935.736 | 1827.851669 | 107.884332 | 5.573298 | Printing and reproduction of recorded media | 796.06621 | 751.699069 | 44.367141 | 5.573298 |
| 9 | AT::P_C19 | AT | P_C19 | 6834.943001 | 6637.090885 | 197.852116 | 2.894715 | Manufacture of coke and refined petroleum prod... | 1996.614514 | 1938.813044 | 57.80147 | 2.894974 |



Most Impacted Countries


|  | country | X_baseline | X_final | loss_abs | VA_baseline | VA_final | VA_loss_abs | loss_pct | VA_loss_pct |
|---|---|---|---|---|---|---|---|---|---|
| 5 | DE | 6905942.0 | 6719606.0 | 186336.5582 | 3557843.0 | 3462199.0 | 95643.628211 | 2.698206 | 2.688248 |
| 11 | FR | 4681131.0 | 4554784.0 | 126347.368 | 2380209.0 | 2316213.0 | 63996.053149 | 2.699078 | 2.688673 |
| 15 | IT | 3751974.0 | 3647679.0 | 104295.237501 | 1827599.0 | 1777140.0 | 50458.430065 | 2.779743 | 2.760914 |
| 9 | ES | 2442810.0 | 2376560.0 | 66249.340307 | 1324897.0 | 1289222.0 | 35675.644818 | 2.712014 | 2.69271 |
| 20 | NL | 1738425.0 | 1691643.0 | 46781.454547 | 960703.6 | 934905.5 | 25798.105804 | 2.691026 | 2.685335 |
| 21 | PL | 1310212.0 | 1274155.0 | 36056.436321 | 644390.7 | 626626.4 | 17764.295635 | 2.751955 | 2.756759 |
| 1 | BE | 1099704.0 | 1070049.0 | 29655.577793 | 544618.3 | 529970.6 | 14647.683489 | 2.696687 | 2.689532 |
| 24 | SE | 921562.6 | 896616.5 | 24946.117388 | 498610.3 | 485111.7 | 13498.597437 | 2.706937 | 2.707244 |
| 0 | AT | 811038.7 | 786896.0 | 24142.628736 | 402775.5 | 390926.1 | 11849.493899 | 2.976754 | 2.94196 |
| 22 | PT | 436781.6 | 416120.6 | 20660.93881 | 214106.3 | 203486.1 | 10620.291383 | 4.730268 | 4.960288 |



Most Impacted Sectors


|  | sector | sector_name | X_baseline | X_final | loss_abs | VA_baseline | VA_final | VA_loss_abs | loss_pct | VA_loss_pct |
|---|---|---|---|---|---|---|---|---|---|---|
| 26 | P_F | Construction | 2166042.0 | 2106849.0 | 59193.229549 | 830513.0 | 807798.1 | 22714.928884 | 2.732783 | 2.735048 |
| 43 | P_L | Real estate activities | 1951634.0 | 1897623.0 | 54011.314417 | 1502713.0 | 1460974.0 | 41739.462955 | 2.767492 | 2.777607 |
| 28 | P_G46 | Wholesale trade, except of motor vehicles and ... | 1541742.0 | 1498775.0 | 42967.006023 | 859233.8 | 835226.9 | 24006.845013 | 2.786912 | 2.793983 |
| 53 | P_O84 | Public administration and defence; compulsory ... | 1367827.0 | 1331293.0 | 36533.795422 | 1005180.0 | 978291.5 | 26888.389223 | 2.670937 | 2.674983 |
| 23 | P_D35 | Electricity, gas, steam and air conditioning s... | 1140940.0 | 1107852.0 | 33088.320336 | 438839.5 | 426197.6 | 12641.904405 | 2.900093 | 2.880758 |
| 4 | P_C10-12 | Manufacture of food products; beverages and to... | 1127953.0 | 1096613.0 | 31340.579405 | 267069.8 | 259654.6 | 7415.208301 | 2.778535 | 2.776505 |
| 55 | P_Q86 | Human health activities | 1168269.0 | 1136968.0 | 31301.048766 | 857622.7 | 834629.6 | 22993.051924 | 2.679268 | 2.681022 |
| 29 | P_G47 | Retail trade, except of motor vehicles and mot... | 1047942.0 | 1019132.0 | 28810.584437 | 616938.5 | 599932.0 | 17006.486043 | 2.749253 | 2.756593 |
| 44 | P_M69_70 | Legal and accounting activities; activities of... | 944264.2 | 917533.1 | 26731.104237 | 508080.0 | 493621.4 | 14458.635897 | 2.830893 | 2.84574 |
| 39 | P_J62_63 | Computer programming, consultancy, and informa... | 821342.5 | 798512.7 | 22829.769663 | 456123.6 | 443420.2 | 12703.424355 | 2.779568 | 2.785084 |



Top Weakened Linkages (most negative ΔA)


|  | i_node | j_node | i_label | j_label | baseline | final | delta | delta_rel |
|---|---|---|---|---|---|---|---|---|
| 0 | 1404 | 1404 | PT::P_D35 | PT::P_D35 | 0.631098 | 0.629731 | -0.001367 | -0.002166 |
| 1 | 1404 | 1440 | PT::P_D35 | PT::P_S94 | 0.034965 | 0.034784 | -0.000181 | -0.00517 |
| 2 | 1404 | 1439 | PT::P_D35 | PT::P_R93 | 0.028753 | 0.028598 | -0.000155 | -0.005395 |
| 3 | 1404 | 1410 | PT::P_D35 | PT::P_G47 | 0.032804 | 0.032653 | -0.000151 | -0.004616 |
| 4 | 1404 | 1442 | PT::P_D35 | PT::P_S96 | 0.030399 | 0.03026 | -0.000139 | -0.00457 |
| 5 | 1404 | 1406 | PT::P_D35 | PT::P_E37-39 | 0.046711 | 0.04658 | -0.000131 | -0.002809 |
| 6 | 1404 | 1402 | PT::P_D35 | PT::P_C31_32 | 0.023324 | 0.023195 | -0.000129 | -0.00553 |
| 7 | 1404 | 1438 | PT::P_D35 | PT::P_R90-92 | 0.019971 | 0.019847 | -0.000124 | -0.006214 |
| 8 | 1404 | 1434 | PT::P_D35 | PT::P_O84 | 0.018351 | 0.018233 | -0.000117 | -0.00639 |
| 9 | 1404 | 1412 | PT::P_D35 | PT::P_H50 | 0.021061 | 0.020945 | -0.000116 | -0.005518 |



Top Strengthened Linkages (most positive ΔA)


|  | i_node | j_node | i_label | j_label | baseline | final | delta | delta_rel |
|---|---|---|---|---|---|---|---|---|
| 0 | 1390 | 1413 | PT::P_C19 | PT::P_H51 | 0.226929 | 0.228122 | 0.001193 | 0.005258 |
| 1 | 575 | 1413 | ES::P_C19 | PT::P_H51 | 0.176486 | 0.177414 | 0.000928 | 0.005258 |
| 2 | 1431 | 1397 | PT::P_N78 | PT::P_C26 | 0.160289 | 0.161092 | 0.000803 | 0.005008 |
| 3 | 715 | 1404 | FR::P_D35 | PT::P_D35 | 5.6e-05 | 0.000859 | 0.000802 | 14.297044 |
| 4 | 337 | 1404 | DE::P_D35 | PT::P_D35 | 7.2e-05 | 0.00087 | 0.000798 | 11.044042 |
| 5 | 1386 | 1386 | PT::P_C13-15 | PT::P_C13-15 | 0.299468 | 0.300265 | 0.000797 | 0.002661 |
| 6 | 1407 | 1407 | PT::P_F | PT::P_F | 0.227395 | 0.228172 | 0.000777 | 0.003416 |
| 7 | 1422 | 1423 | PT::P_K65 | PT::P_K66 | 0.143022 | 0.143647 | 0.000625 | 0.00437 |
| 8 | 589 | 1404 | ES::P_D35 | PT::P_D35 | 0.032278 | 0.032896 | 0.000618 | 0.019153 |
| 9 | 1416 | 1413 | PT::P_I | PT::P_H51 | 0.11184 | 0.112428 | 0.000588 | 0.005258 |
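
In both linkage tables, `delta` is `final - baseline` and `delta_rel` is `delta / baseline`. A minimal sketch of how such a ranking could be produced from two coefficient matrices is shown below; `top_link_changes` is a hypothetical helper with toy inputs, not the routine used by `postprocess_results`.

```python
import numpy as np

def top_link_changes(A0, A1, k=3):
    """Return the k most negative and k most positive entries of A1 - A0."""
    delta = A1 - A0
    order = np.argsort(delta, axis=None)  # flat indices, ascending by delta

    def rows(flat_idx):
        out = []
        for f in flat_idx:
            i, j = np.unravel_index(f, delta.shape)
            base = A0[i, j]
            # Relative change is undefined when the baseline coefficient is zero
            rel = delta[i, j] / base if base != 0 else np.nan
            out.append((int(i), int(j), float(delta[i, j]), float(rel)))
        return out

    return rows(order[:k]), rows(order[::-1][:k])

# Toy coefficient matrices (hypothetical values)
A0 = np.array([[0.30, 0.10], [0.05, 0.20]])
A1 = np.array([[0.27, 0.10], [0.06, 0.22]])
weakened, strengthened = top_link_changes(A0, A1, k=1)
print("weakened:", weakened)
print("strengthened:", strengthened)
```

Very small baseline coefficients inflate `delta_rel`, as the `FR::P_D35` and `DE::P_D35` rows above show, so ranking by absolute `delta` is the more stable choice.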


## 5) Visualizations

This section provides:
- choropleth map (Europe) of absolute or percent losses by country
- bar charts of absolute losses by country and sector
- bar charts of the most strengthened and weakened trade linkages

If `matplotlib` is not installed in your environment, the bar charts will fall back to tables.


In [16]:
# Build and show dashboard visualizations
bundle = build_dashboard_bundle(
    pp,
    country_metric_for_map="loss_pct",
    top_k_countries=20,
    top_k_sectors=20,
    top_k_links=20,
    use_country_names=True,
)

bundle.figures["country_map"].show()
bundle.figures["top_countries"].show()
bundle.figures["top_sectors"].show()

# Optional linkage charts: show each one only if it was built
for key in ("links_strengthened", "links_weakened"):
    if key in bundle.figures:
        bundle.figures[key].show()


## 6) Diagnostics: demand adjustment loop

The model iterates on post-shock final demand `FD_post` until it matches the feasible implied demand.
Use this section to verify:
- total demand contraction
- how many outer iterations were needed


In [15]:
FD_post_final = results["FD_post_final"]
FD_implied_final = results["FD_implied_final"]

print("Total baseline FD:", float(FD_baseline.sum()))
print("Total final FD_post:", float(FD_post_final.sum()))
print("Total final implied FD:", float(FD_implied_final.sum()))
print("Max unmet FD (nodewise):", float(np.max(FD_post_final - FD_implied_final)))
print("Max slack FD (nodewise):", float(np.max(FD_implied_final - FD_post_final)))

# Optional: inspect the demand history (outer loop)
if "FD_post_history" in results:
    totals = [float(v.sum()) for v in results["FD_post_history"]]
    print("FD_post totals by outer iteration:", totals)


Total baseline FD: 14831340.773248244
Total final FD_post: 14417908.343604557
Total final implied FD: 14417908.412178027
Max unmet FD (nodewise): 0.0
Max slack FD (nodewise): 0.040412221578703625
FD_post totals by outer iteration: [14830127.342571039, 14435219.040870292, 14419305.635478921]
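
To read the history more easily, the totals printed above can be turned into step sizes and cumulative contraction. The sketch below copies those totals by hand; in the notebook they would come from `results["FD_post_history"]`. The variable names are illustrative, and this is not the model's own convergence criterion.

```python
# Totals printed above, one per outer iteration
fd_totals = [14830127.342571039, 14435219.040870292, 14419305.635478921]
fd_baseline_total = 14831340.773248244

# Relative change of total FD_post between consecutive outer iterations
rel_steps = [
    (curr - prev) / prev
    for prev, curr in zip(fd_totals[:-1], fd_totals[1:])
]

# Cumulative contraction relative to the baseline total, in percent
contraction_pct = [
    100.0 * (fd_baseline_total - t) / fd_baseline_total for t in fd_totals
]

for it, (total, pct) in enumerate(zip(fd_totals, contraction_pct), start=1):
    print(f"iter {it}: FD_post total = {total:.1f}, contraction = {pct:.3f}%")
print("relative steps:", rel_steps)
```

Shrinking step sizes indicate that the demand adjustment is settling rather than oscillating.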
