# Modelling infrastructure exposure and risk

This notebook forms the basis of "Hands-On 6" in the CCG course.

It uses the road network and flood dataset extracted in the previous tutorial.

1. Exposure - overlay sample flood extents with the network and estimate the flood depth each road is exposed to
2. Vulnerability - assume a depth-damage curve (fragility curve) for the roads and
   - show how the exposure is translated to damage
   - create a table with probability, flood depth, length exposed, fragility, cost/km, direct damage
3. Risk - show a risk calculation on the table and generate the result
4. Future risk - repeat with climate projections and compare with baseline

By the end of this tutorial you should be able to:
* Assess direct damage and indirect disruptions to infrastructure assets
* Apply the risk calculation to understand how to generate loss-probability curves
* Show how different flood hazards introduce uncertainty in risk estimations

In [None]:
# Imports from Python standard library
import os

# see https://docs.python.org/3/library/warnings.html
import warnings

# see https://docs.python.org/3/library/glob.html
from glob import glob

# Imports from other Python packages
import geopandas as gpd

# numpy is used by pandas and geopandas to store data in efficient arrays
# we use it in this notebook to help with trapezoidal integration
# see https://numpy.org/
import numpy as np
import pandas as pd

# seaborn helps produce more complex plots
# see https://seaborn.pydata.org/
import seaborn as sns
from pyproj import Geod

# tqdm lets us show progress bars (and تقدّم means "progress" in Arabic)
# see https://tqdm.github.io/
from tqdm.notebook import tqdm

Change this to point to your data folder as in the previous tutorial:

In [None]:
data_folder = "../data"

## 1. Exposure


List all the hazard files in the `flood_layer` folder:

In [None]:
hazard_files = sorted(glob(os.path.join(data_folder, "flood_layer/*.gpkg")))
hazard_files

In [None]:
def read_file_without_warnings(path, **kwd):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        data = gpd.read_file(path, **kwd)
    return data

Read in roads again, then do intersections against all hazard scenarios.

In [None]:
roads = read_file_without_warnings(
    os.path.join(data_folder, "GHA_OSM_roads.gpkg"), layer="edges"
)
roads.head(2)

In [None]:
for hazard_file in tqdm(hazard_files):
    # read file
    flood = read_file_without_warnings(hazard_file).rename(
        columns={"DN": "depth_m"}
    )
    flood = flood[flood.depth_m > 0]

    # run intersection
    intersections = gpd.overlay(roads, flood, how="intersection")
    # calculate intersection lengths
    geod = Geod(ellps="WGS84")
    intersections["flood_length_m"] = intersections.geometry.apply(
        geod.geometry_length
    )
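    # save file (creating the results folder first if it does not exist yet)
    os.makedirs(os.path.join(data_folder, "results"), exist_ok=True)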
    output_file = os.path.join(
        data_folder,
        "results",
        os.path.basename(hazard_file).replace(".gpkg", "_exposure.gpkg"),
    )
    if len(intersections):
        intersections.to_file(output_file, driver="GPKG")

List all the results just created:

In [None]:
intersection_files = sorted(
    glob(os.path.join(data_folder, "results/inunriver*.gpkg"))
)
intersection_files

Read and combine all the exposed lengths:

In [None]:
all_intersections = []

for intersection_file in tqdm(intersection_files):
    # split up the filename to pull out metadata
    hazard, rcp, gcm, epoch, rp, _ = os.path.basename(intersection_file).split(
        "_"
    )
    # strip only the leading padding zeros from the climate model name
    gcm = gcm.lstrip("0")
    rp = int(rp.replace("rp", ""))
    epoch = int(epoch)

    # read file
    intersections = read_file_without_warnings(intersection_file)
    # drop the full road length field (flood_length_m holds the exposed length)
    intersections.drop(columns="length_m", inplace=True)
    # add metadata about the hazard and scenario
    intersections["hazard"] = hazard
    intersections["rcp"] = rcp
    intersections["gcm"] = gcm
    intersections["epoch"] = epoch
    intersections["rp"] = rp

    all_intersections.append(intersections)

# group all together
all_intersections = pd.concat(all_intersections)
all_intersections
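
As a quick check of the filename parsing above, the following minimal sketch applies the same steps to a single name. The filename is illustrative (an assumption about the naming pattern, not a file guaranteed to exist in your data folder), and `parse_exposure_filename` is a hypothetical helper rather than part of the notebook. It uses `lstrip("0")` so that only the leading padding zeros are removed from the climate model name.

```python
import os


def parse_exposure_filename(path):
    """Split an exposure filename into its scenario metadata (hypothetical helper)."""
    name = os.path.basename(path)
    # the last element is the "exposure.gpkg" suffix, which we ignore
    hazard, rcp, gcm, epoch, rp, _ = name.split("_")
    return {
        "hazard": hazard,
        "rcp": rcp,
        "gcm": gcm.lstrip("0"),  # remove padding zeros only
        "epoch": int(epoch),
        "rp": int(rp.replace("rp", "")),
    }


# illustrative filename, assumed to follow the same pattern as the results files
example = parse_exposure_filename(
    "results/inunriver_rcp4p5_0000GFDL-ESM2M_2030_rp00100_exposure.gpkg"
)
print(example)
```

If a filename does not split into exactly six parts, the unpacking raises a `ValueError`, which is a useful early signal that an unexpected file has ended up in the results folder.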

Summarise the total length of roads exposed to flooding of 2m depth or greater, under different return periods and climate scenarios:


In [None]:
summary = (
    all_intersections[all_intersections.depth_m >= 2.0]
    .groupby(["hazard", "rcp", "gcm", "epoch", "rp"])
    # sum numeric columns only (geometry and text columns cannot be summed)
    .sum(numeric_only=True)
    .drop(columns=["depth_m"])
)
summary
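
To see what this filter-and-sum does, here is a minimal sketch on a toy table. The numbers are made up for illustration and only the `rp`, `depth_m` and `flood_length_m` column names are taken from the notebook.

```python
import pandas as pd

# toy exposure table: four exposed sections under two return periods
toy = pd.DataFrame(
    {
        "rp": [10, 10, 100, 100],
        "depth_m": [1, 2, 2, 3],
        "flood_length_m": [120.0, 80.0, 200.0, 50.0],
    }
)

# keep only sections flooded to 2m or deeper, then total the length per return period
deep = toy[toy.depth_m >= 2]
toy_summary = deep.groupby("rp")[["flood_length_m"]].sum()
print(toy_summary)
```

The 1m section under the 10-year flood is excluded, so only 80m counts for `rp=10`, while both sections count for `rp=100`.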

Plot exposure against return period, with separate plot areas for each Representative Concentration Pathway (RCP), and different colours for the different Global Climate Models (GCM): 

In [None]:
sns.lmplot(
    x="rp",
    y="flood_length_m",
    data=summary.reset_index(),
    hue="gcm",
    col="rcp",
    fit_reg=False,
)

## 2. Vulnerability

Set up fragility curve assumptions, where probability of damage (`pfail`) depends on whether a road is paved and the depth of flood it is exposed to. 

These assumptions are derived from Koks, E.E., Rozenberg, J., Zorn, C. et al. A global multi-hazard risk analysis of road and railway infrastructure assets. Nat Commun 10, 2677 (2019). https://doi.org/10.1038/s41467-019-10442-3, Figure S3, extrapolated to 2m and 3m depths. 

The analysis is likely to be highly sensitive to these assumptions. The approach is strongly limited by the availability and quality of fragility data, and by the assumption that fragility can be related to flood depth alone. Flood water velocity, for example, would be an important factor in a more detailed vulnerability assessment.

In [None]:
fragility = pd.DataFrame(
    {
        "paved": [True, True, True, False, False, False],
        "depth_m": ["1", "2", ">=3", "1", "2", ">=3"],
        "pfail": [0.1, 0.3, 0.5, 0.9, 1.0, 1.0],
    }
)
fragility

Set up cost assumptions. 

These are again taken from Koks et al. (2019), Table S8. Construction costs are assumed to be an estimate of the cost of full rehabilitation after flood damage.

Again the analysis is likely to be highly sensitive to these assumptions, which should be replaced by better estimates if available.

In [None]:
costs = pd.DataFrame(
    {
        "kind": ["paved_four_lane", "paved_two_lane", "unpaved"],
        "cost_usd_per_km": [3_800_000, 932_740, 22_780],
    }
)
costs

Set up assumptions about which roads are paved or unpaved, and number of lanes.

In [None]:
sorted(all_intersections.road_type.unique())

Assume all `tertiary` roads are unpaved, all others are paved.

In [None]:
all_intersections["paved"] = ~(all_intersections.road_type == "tertiary")

In [None]:
def kind(road_type):
    if road_type in ("trunk", "trunk_link", "motorway"):
        return "paved_four_lane"
    elif road_type in ("primary", "primary_link", "secondary"):
        return "paved_two_lane"
    else:
        return "unpaved"


all_intersections["kind"] = all_intersections.road_type.apply(kind)

In [None]:
all_intersections = all_intersections.merge(costs, on="kind")
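
The `paved` flag and the `kind` function above are defined separately, so a road type can end up flagged as paved while being costed as `unpaved`. The following is a minimal sketch of a consistency check. The list of road types is hypothetical (the real values are whatever `road_type.unique()` returns for your data), and `is_paved` is an illustrative stand-in for the `paved` column rule.

```python
def is_paved(road_type):
    # same rule as the notebook: everything except "tertiary" is paved
    return road_type != "tertiary"


def kind(road_type):
    # same mapping as the notebook
    if road_type in ("trunk", "trunk_link", "motorway"):
        return "paved_four_lane"
    elif road_type in ("primary", "primary_link", "secondary"):
        return "paved_two_lane"
    else:
        return "unpaved"


# hypothetical list of road types, for illustration only
road_types = [
    "motorway",
    "trunk",
    "primary",
    "secondary",
    "secondary_link",
    "tertiary",
    "tertiary_link",
]

# road types treated as paved for fragility but as unpaved for cost
inconsistent = [rt for rt in road_types if is_paved(rt) and kind(rt) == "unpaved"]
print(inconsistent)
```

If the check returns any road types for your data, it is worth deciding explicitly how they should be treated, since they would combine a paved fragility curve with an unpaved unit cost.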

Group all flood depths of 3m or greater into a single `>=3` category, so that the fragility curve can be used to estimate `pfail` for each exposed section. This discards any detail about depths above 3m.

In [None]:
all_intersections_coarse_depth = all_intersections.copy()
all_intersections_coarse_depth.depth_m = (
    all_intersections_coarse_depth.depth_m.apply(
        lambda d: str(d) if d < 3 else ">=3"
    )
)

In [None]:
all_intersections_coarse_depth = all_intersections_coarse_depth.merge(
    fragility, on=["depth_m", "paved"]
)
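
The depth grouping and fragility lookup can be illustrated on a toy table. This is a minimal sketch: the three road sections are made up, and it assumes that depths are stored as integers, so that `str(d)` produces `"1"` or `"2"` and matches the labels in the fragility table.

```python
import pandas as pd

# fragility assumptions, as defined earlier in the notebook
fragility = pd.DataFrame(
    {
        "paved": [True, True, True, False, False, False],
        "depth_m": ["1", "2", ">=3", "1", "2", ">=3"],
        "pfail": [0.1, 0.3, 0.5, 0.9, 1.0, 1.0],
    }
)

# toy exposed sections (illustrative values only)
toy = pd.DataFrame(
    {
        "id": ["a", "b", "c"],
        "paved": [True, False, True],
        "depth_m": [1, 2, 5],
    }
)

# convert numeric depths to the same labels used in the fragility table
toy["depth_m"] = toy.depth_m.apply(lambda d: str(d) if d < 3 else ">=3")

# look up pfail for each section by depth label and paved flag
toy = toy.merge(fragility, on=["depth_m", "paved"])
print(toy)
```

Note that the merge is an inner join, so any section whose depth label has no match in the fragility table is silently dropped. A floating-point depth such as `1.0` would become the label `"1.0"` and would be lost in this way, so it is worth comparing row counts before and after the merge.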

Finally, estimate the cost of rehabilitation for each exposed section, if it were damaged:

In [None]:
all_intersections_coarse_depth["damage_usd"] = (
    all_intersections_coarse_depth.flood_length_m
    * all_intersections_coarse_depth.cost_usd_per_km
    / 1000
)
all_intersections_coarse_depth.head(2)
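
As a worked example of the calculation above, consider a single exposed section. The 250m length is an illustrative value, the unit cost is the `paved_two_lane` figure from the cost table, and `rehabilitation_cost_usd` is a hypothetical helper name.

```python
def rehabilitation_cost_usd(flood_length_m, cost_usd_per_km):
    """Cost of fully rehabilitating an exposed length (hypothetical helper)."""
    # convert metres to kilometres before applying the per-km cost
    return flood_length_m * cost_usd_per_km / 1000


# 250m of paved two-lane road at 932,740 USD per km
damage_usd = rehabilitation_cost_usd(250.0, 932_740)
print(damage_usd)
```

This gives 233,185 USD, which is the cost if the section is damaged. The probability of damage (`pfail`) is applied separately in the risk calculation.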

In [None]:
all_intersections_coarse_depth.to_file(
    os.path.join(data_folder, "results/flood_exposure.gpkg"), driver="GPKG"
)

In [None]:
all_intersections_coarse_depth.drop(columns="geometry").to_csv(
    os.path.join(data_folder, "results/flood_exposure.csv"), index=False
)

In [None]:
summary = (
    all_intersections_coarse_depth.groupby(
        ["hazard", "rcp", "gcm", "epoch", "rp"]
    )
    # sum numeric columns only (geometry and text columns cannot be summed)
    .sum(numeric_only=True)
    .drop(columns=["paved", "cost_usd_per_km", "pfail"])
)
summary

## 3. Risk

Calculate expected annual damages for each road under historical hazard.

Start by selecting only historical intersections, and keeping only the road ID, return period, probability of damage, and cost of rehabilitation if damaged.

In [None]:
# take a copy so that adding columns below does not modify a view of the original
historical = all_intersections_coarse_depth[
    all_intersections_coarse_depth.rcp == "historical"
][["id", "rp", "pfail", "damage_usd"]].copy()

Calculate the expected damage for each length exposed (under a given return period):

In [None]:
historical["expected_damage_usd"] = historical.pfail * historical.damage_usd

Sum up the expected damage for each road, per return period:

In [None]:
historical = (
    historical.groupby(["id", "rp"])
    .sum()
    .drop(columns=["pfail", "damage_usd"])
    .reset_index()
)
historical.head(2)

Pivot the table to create columns for each return period - now there is one row per road.

In [None]:
# roads with no exposure under a given return period get zero expected damage
historical = historical.pivot(index="id", columns="rp").fillna(0)
historical.columns = [f"rp{rp}" for _, rp in historical.columns]
historical.head(2)
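
The reshaping step can be seen on a toy table. This is a minimal sketch with made-up values. Unlike the notebook cell, it passes `values=` to `pivot`, so the resulting columns are a single level and can be renamed directly.

```python
import pandas as pd

# toy long-format table: road "b" is only exposed under the 100-year flood
toy = pd.DataFrame(
    {
        "id": ["a", "a", "b"],
        "rp": [10, 100, 100],
        "expected_damage_usd": [5.0, 20.0, 7.0],
    }
)

# one row per road, one column per return period; missing combinations become 0
wide = toy.pivot(index="id", columns="rp", values="expected_damage_usd").fillna(0)
wide.columns = [f"rp{rp}" for rp in wide.columns]
print(wide)
```

Filling missing values with zero encodes the assumption that a road absent from a return period's exposure table suffers no damage in that event.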

Calculate expected annual damages by integrating the expected damage over annual exceedance probability (1 / return period), using the trapezoidal rule.

In [None]:
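def expected_annual_damages(row):
    # x values are annual exceedance probabilities (1 / return period), in
    # increasing order to match the damages for rp1000, rp100 and rp10
    return np.trapz([row.rp1000, row.rp100, row.rp10], x=[0.001, 0.01, 0.1])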


historical["ead_usd"] = historical.apply(expected_annual_damages, axis=1)
historical.head(2)
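
To make the integration concrete, here is a minimal sketch that works through the trapezoidal rule by hand for one road. The damage values are made up, and `trapezoid` is a hand-written stand-in used only for illustration, not the numpy function called in the notebook.

```python
def trapezoid(ys, xs):
    """Trapezoidal rule for points (xs[i], ys[i]), with xs in increasing order."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(len(xs) - 1)
    )


# annual exceedance probabilities for the 1000, 100 and 10 year return periods
probabilities = [0.001, 0.01, 0.1]
# illustrative expected damages (USD) for the same three events
expected_damages = [1000.0, 500.0, 100.0]

# first panel:  (0.01 - 0.001) * (1000 + 500) / 2 = 6.75
# second panel: (0.1 - 0.01) * (500 + 100) / 2 = 27.0
ead = trapezoid(expected_damages, probabilities)
print(ead)
```

The two panels add up to 33.75 USD per year. Only probabilities between 0.001 and 0.1 are covered, so events more frequent than 1-in-10 years or rarer than 1-in-1000 years do not contribute to this estimate.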

In [None]:
historical.to_csv(
    os.path.join(data_folder, "results/flood_risk_historical.csv")
)

## 4. Future risk

Calculate expected annual damages under each future scenario (for each global climate model and representative concentration pathway).

This follows the same method as for historical flooding above, with the added variables of climate model (GCM) and RCP.

In [None]:
future = all_intersections_coarse_depth[
    ["id", "rp", "rcp", "gcm", "pfail", "damage_usd"]
].copy()

Calculate the expected damage for each length exposed (under a given return period, GCM and RCP):

In [None]:
future["expected_damage_usd"] = future.pfail * future.damage_usd

Sum up the expected damage for each road, per return period, GCM and RCP:

In [None]:
future = (
    future.groupby(["id", "rp", "rcp", "gcm"])
    .sum()
    .drop(columns=["pfail", "damage_usd"])
    .reset_index()
)
future.head(2)

Pivot the table to create columns for each return period - now there is one row per road, gcm and rcp.

In [None]:
# combinations with no exposure get zero expected damage
future = future.pivot(index=["id", "rcp", "gcm"], columns="rp").fillna(0)
future.columns = [f"rp{rp}" for _, rp in future.columns]
future.head(2)

Calculate expected annual damages by integrating the expected damage over annual exceedance probability (1 / return period), as for the historical case.

In [None]:
future["ead_usd"] = future.apply(expected_annual_damages, axis=1)

In [None]:
future.to_csv(os.path.join(data_folder, "results/flood_risk.csv"))

Pick out an individual road by id, to spot check uncertainty:

In [None]:
future.loc["roade_10028"]

Summarise total expected annual (direct) damages, showing variation between climate models and representative concentration pathways.

In [None]:
summary = (
    future.reset_index()
    .drop(columns=["id", "rp10", "rp100", "rp1000"])
    .groupby(["rcp", "gcm"])
    .sum()
)
summary
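
One way to compare the future scenarios with the baseline is to express each total as a relative change from the historical value. The following is a minimal sketch on a toy table. The damage totals and model names are illustrative, not results, and it assumes the baseline rows are those labelled `historical` in the `rcp` column, as in the selection used for the historical risk calculation.

```python
import pandas as pd

# toy version of the summary table, flattened with reset_index()
toy_summary = pd.DataFrame(
    {
        "rcp": ["historical", "rcp4p5", "rcp8p5"],
        "gcm": ["WATCH", "GFDL-ESM2M", "GFDL-ESM2M"],
        "ead_usd": [100.0, 120.0, 150.0],
    }
)

# total expected annual damages under the historical baseline
baseline = toy_summary.loc[toy_summary.rcp == "historical", "ead_usd"].sum()

# relative change for each scenario, e.g. 0.5 means 50% higher than baseline
toy_summary["change_vs_baseline"] = toy_summary.ead_usd / baseline - 1
print(toy_summary)
```

Looking at the spread of `change_vs_baseline` across climate models within one RCP gives a simple indication of how much of the uncertainty comes from the choice of model.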

In [None]:
sns.lmplot(
    x="rcp", y="ead_usd", data=summary.reset_index(), hue="gcm", fit_reg=False
)