# Uncertainty in vulnerability functions
Uncertainty in vulnerability functions can be specified in the vulnerability configuration.

## Deterministic case
In the deterministic case, that is, when there is no uncertainty, the vulnerability function is simply a curve. The hazard indicator values $x_i$, $i \in [1 \dots n]$, are provided together with the corresponding impacts (damage/disruption) $y_i$.
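As a minimal sketch of the deterministic case, the impact at an arbitrary hazard indicator value can be read off the curve by linear interpolation between the pillar values. The numbers below are illustrative only and are not taken from any real vulnerability curve.

```python
import numpy as np

# Illustrative pillar values (assumed for this example only).
x = np.array([0.0, 1.0, 2.0, 4.0])  # hazard indicator, e.g. flood depth in m
y = np.array([0.0, 0.4, 0.6, 1.0])  # impact (fractional damage) at each x_i

# np.interp interpolates linearly between pillars and clamps to the end
# values for inputs outside [x[0], x[-1]].
impact_mid = np.interp(1.5, x, y)      # halfway between the pillars at 1.0 and 2.0
impact_beyond = np.interp(10.0, x, y)  # beyond the last pillar: clamped to y[-1]
print(impact_mid, impact_beyond)
```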

## General case
In the general case, the hazard indicator values are provided in exactly the same way, but instead of a single impact for each value $x_i$, a cumulative distribution function (CDF) of the impact is specified. In the deterministic case, the impact for a hazard indicator value $x$ that lies between two 'pillar values' is found by linear interpolation. In the general case, it is the *CDFs* that are interpolated.

The CDFs are specified in a non-parametric way: for each hazard indicator value $x_i$, the cumulative probabilities are provided at a number of impact pillar values $y_j$.

$F_i(y) = \mathbb{P}(Y \leq y|x_i)$

The CDF $F_i(y)$ is given at the points $y_j$, $j \in [1 \dots m]$:

$F_{ij} = \mathbb{P}(Y \leq y_j|x_i)$

The vulnerability function is then fully specified by three arrays:

$x = [x_1, x_2, \dots, x_n ]$  
$y = [y_1, y_2, \dots, y_m ]$  
$z = [[F_{11}, F_{12}, \dots, F_{1m}], [F_{21}, F_{22}, \dots, F_{2m}], \dots, [F_{n1}, F_{n2}, \dots, F_{nm}]]$
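The interpolation of CDFs between pillar values can be sketched with NumPy as follows. This is a minimal illustration and not the physrisk implementation: the function name `interpolate_cdf` and the numbers in `cdfs` are made up for the example, and hazard indicator values outside the pillar range are assumed to be clamped to the nearest pillar.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])              # hazard indicator pillar values x_i
y = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # impact pillar values y_j
# cdfs[i, j] = P(Y <= y_j | x_i); illustrative numbers only.
cdfs = np.array([
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [0.1, 0.6, 0.9, 1.0, 1.0],
    [0.0, 0.2, 0.5, 0.9, 1.0],
])


def interpolate_cdf(x_new, x, cdfs):
    """Linearly interpolate the CDF rows of the two pillars bracketing x_new."""
    x_new = np.clip(x_new, x[0], x[-1])  # assumption: clamp outside the pillar range
    hi = int(np.clip(np.searchsorted(x, x_new, side="right"), 1, len(x) - 1))
    lo = hi - 1
    w = (x_new - x[lo]) / (x[hi] - x[lo])  # weight of the upper pillar
    return (1.0 - w) * cdfs[lo] + w * cdfs[hi]


cdf_mid = interpolate_cdf(1.5, x, cdfs)  # halfway between rows 1 and 2
print(cdf_mid)
```

Because each row is non-decreasing in $y_j$ and the weights are non-negative, the interpolated row is itself a valid discretized CDF.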

Note that in this scheme the impact pillar values $y_j$ are shared by all CDFs and therefore set the granularity of the impacts. The granularity should be fine enough that the discretization error does not materially affect the calculation, that is, the uncertainty should be dominated by the CDFs themselves rather than by the choice of grid.
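The effect of the impact granularity can be illustrated with a small experiment. The sketch below assumes, purely for illustration, an impact that is uniformly distributed on $[0.2, 0.6]$ (true mean 0.4), samples its CDF on a coarse and a fine grid of impact pillar values, and recovers the mean from the probability bins using bin centres. The helper names are hypothetical.

```python
import numpy as np


def uniform_cdf(y, a=0.2, b=0.6):
    """CDF of an illustrative uniform impact distribution on [a, b]."""
    return np.clip((y - a) / (b - a), 0.0, 1.0)


def mean_from_discretized_cdf(y):
    """Mean impact implied by the CDF sampled at pillar values y, using bin centres."""
    cdf = uniform_cdf(y)
    probs = np.diff(cdf)            # probability mass in each impact bin
    centres = (y[:-1] + y[1:]) / 2  # representative impact of each bin
    return float(np.sum(probs * centres))


coarse = mean_from_discretized_cdf(np.linspace(0.0, 1.0, 3))  # pillars 0, 0.5, 1
fine = mean_from_discretized_cdf(np.linspace(0.0, 1.0, 101))  # pillars every 0.01
print(coarse, fine)  # compare with the true mean of 0.4
```

With only three pillar values the recovered mean is visibly biased, whereas the fine grid reproduces the true mean closely.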

## Generating CDFs
CDFs can be inferred directly from historical data or from ensembles of damage curves derived for different properties for which the precise building specification and situation are known. In some cases, however, only an estimate of the uncertainty around a mean impact is available, in the form of a standard deviation. In such cases a parametric model is used, whereby the CDFs are modelled as beta or truncated Gaussian distributions. In physrisk, discretized non-parametric CDFs are derived from these parametric distributions so that all cases can then be treated identically.
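To make the discretization step concrete, here is a minimal sketch for the truncated Gaussian case. It is not the physrisk implementation: the function name `truncated_gaussian_cdf` is hypothetical, and `mu` and `sigma` are assumed to be the parameters of the untruncated Gaussian, so the mean and standard deviation of the truncated distribution differ from them when significant mass lies outside $[0, 1]$.

```python
import math

import numpy as np


def truncated_gaussian_cdf(y, mu, sigma, lower=0.0, upper=1.0):
    """CDF on grid y of a Gaussian N(mu, sigma^2) truncated to [lower, upper]."""

    def phi(v):
        # Standard normal CDF.
        return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

    if sigma == 0.0:
        # Degenerate case: all probability mass at mu (a step function).
        return (y >= mu).astype(float)
    f_lo = phi((lower - mu) / sigma)
    f_hi = phi((upper - mu) / sigma)
    # Renormalize so that the CDF runs from 0 at `lower` to 1 at `upper`.
    cdf = np.array([(phi((v - mu) / sigma) - f_lo) / (f_hi - f_lo) for v in y])
    return np.clip(cdf, 0.0, 1.0)


impact = np.linspace(0.0, 1.0, 101)  # impact pillar values y_j
cdf = truncated_gaussian_cdf(impact, mu=0.494, sigma=0.11)
```

The resulting array `cdf` has the same non-parametric form as a CDF supplied directly, which is what allows parametric and non-parametric inputs to share the same downstream calculation.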

An example is given for the case where a beta distribution is used.

In [1]:
import numpy as np
from physrisk.vulnerability_models.cdf_based_vuln_function import CDFBasedVulnerabilityFunction

hazard_indicator = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 6.0])
impact_mean = np.array([0.0, 0.327, 0.494, 0.617, 0.721, 0.870, 0.931, 0.984, 1.0])
impact_stddev = np.array([0.0, 0.12, 0.11, 0.10, 0.10, 0.08, 0.06, 0.02, 0.0])
vuln = CDFBasedVulnerabilityFunction(hazard_indicator, impact_mean, impact_stddev, kind="beta")

In [2]:
import plotly.graph_objects as go

fig = go.Figure()
for i in [0, 2, 4, 6, 8]:
    fig.add_scatter(x=vuln.cdf[i, :], y=vuln.impact, name=f"Flood depth {hazard_indicator[i]}m")
# also interpolate for a depth of 0.6m and display
interp = vuln.interpolate_cdfs(np.array([0.6]))
fig.add_scatter(x=interp[0, :], y=vuln.impact, name="Interpolated depth 0.6m")
fig.update_xaxes(title="Cumulative probability", title_font={"size": 14})
fig.update_yaxes(title="Fractional damage", title_font={"size": 14})
fig.show()

Probability bins are an alternative way to visualize the probability distributions.

In [3]:
fig = go.Figure()
impact_bin_centres = (vuln.impact[0:-1] + vuln.impact[1:]) / 2
impact_bin_width = vuln.impact[1:] - vuln.impact[0:-1]
for i in [2, 4, 6]:
    probs = vuln.cdf[i, 1:] - vuln.cdf[i, 0:-1]
    fig.add_bar(x=impact_bin_centres, y=probs, width=impact_bin_width, name=f"Flood depth {hazard_indicator[i]}m", 
                opacity=0.7)
fig.update_xaxes(title="Fractional damage", title_font={"size": 14})
fig.update_yaxes(title="Probability", title_font={"size": 14})
fig.show()

In [4]:
# As a sanity check for the 1.0m flood depth case: does the standard deviation of the
# discretized distribution match the value specified above (0.11)? Note that the
# standard deviation is of the fractional damage and is therefore dimensionless.
i = 2
probs = vuln.cdf[i, 1:] - vuln.cdf[i, 0:-1]
mean = np.sum(probs * impact_bin_centres)
std = np.sqrt(np.sum(probs * impact_bin_centres * impact_bin_centres) - mean * mean)
print(f"Standard deviation is {std:0.3} for flood depth {hazard_indicator[i]}m")

Standard deviation is 0.114 for flood depth 1.0m
