# Calibration of the NCCS pipeline

# Introduction

This report will guide you through our calibration methodology, explaining our main requirements in calibrating the NCCS pipeline, discussing the technicalities of a parameter search, and describing how the chosen solution is implemented.

The model is a complex, multi-component system that integrates various data sources, impact functions, and a supply chain model to provide insights into potential disruptions and losses across different sectors and regions. However, with complexity come the challenges of accuracy and reliability. We need to calibrate and validate the model to ensure it agrees with observational data and to get a sense of the uncertainties in our model pipeline.

Our modelling pipeline has several components, and each step in the chain needs tuning to align with real-world observations and our own and expert knowledge. Given the right tools, we can calibrate both individual components to stand alone and recreate observations, and the whole pipeline at once. Calibrating individual components is faster and cheaper, and so we focus on those first. However, the components have to work well together without propagating biases for the model to be useful, so the last step of the calibration is a whole-pipeline calibration.

There are several technical challenges that we need to bear in mind while choosing a calibration methodology:
- Computation costs: the NCCS pipeline is very computationally expensive to run. We want a calibration methodology that minimises the number of times we have to evaluate the model.
- Few observations: we don't have observations for all components of the model. We would like a holistic calibration approach where components with observations can inform components without, and where expert judgement can easily be incorporated when observations are missing. We accept that when there is no data, we may need to show this uncertainty in our results.
- Uncertainty quantification: we don't just want a point estimate of the 'best' combination of parameters for tuning the model. We know there are large uncertainties here, that they interact with each other, and they are not the same in all components. We would like a solution that lets us sample from the uncertainty space to run the model with different possible parameter combinations.

We choose a _Bayesian optimisation_ approach for calibration. This approach allows efficient exploration of the high-dimensional parameter space while quantifying uncertainties in the calibrated model. The optimisation process iteratively samples the parameter space, evaluates the model against observations, and updates the probabilistic description of plausible parameter combinations until a required level of confidence is reached. It does this while trying to minimise the number of times the modelling chain must be run. The output quantifies and communicates the uncertainties inherent in the calibrated model.

We implement the calibration in stages, gradually building complexity and improving accuracy:

- Version 1: Calibration of direct asset loss impact functions against EM-DAT data. This is where observational data is most abundant and where we know calibrations for the CLIMADA model don't exist or are outdated. 
- Version 2: Introduction of regional vulnerability factors to the asset loss calibration, using ND-GAIN indices.
- Version 3: Incorporation of expert judgment and limited observations to calibrate business interruption components.
- Version 4: Whole-pipeline calibration to address inter-component biases and ensure consistency across the entire model chain.

The calibration is an ongoing process, and we're learning as we go. There are still a number of uncertainties and challenges to overcome, and these are also explained. Further input, especially theoretical, is very very welcome!

# Choosing a calibration approach

## What are we calibrating?

Literally everything in our model is uncertain. It would be unreasonable and unhelpful for us to start describing the uncertainties in our hazards, exposures, and even our methodologies. There's a balance to be struck, and we'll do that here by representing our model uncertainty as part of the impact functions that we're using. That means we want to calibrate impact functions so that the modelled asset losses and production losses agree with the observations that we have.

There are many impact functions in our current implementation of the model: about half of them relate hazard intensity to asset loss, and half relate sectoral asset loss to sectoral production loss/business interruption (I use the terms interchangeably here ... apologies). 

Each impact function is either an empirical relationship between two variables (e.g. the Schwierz relationship between observed wind speeds and asset losses), or a parametrised functional form (e.g. the Emanuel sigmoid relationship between wind speeds and asset losses). In the case of the former, we can adjust the function by stretching and scaling it. In the case of the latter we can change the parameters that define the function itself. In each of these cases we can define adjustment parameters that change the function, and thus the modelled outputs.

// If time: illustrations!

This means that our calibration is a parameter search: we want to adjust the impact functions to best recreate the observations that we have. That's good news, since most calibration is a parameter search of some form.

Note that we're not adjusting the supply chain itself at the moment. If we manage to collect observations that would help here (e.g. historical time series of flows of goods), we can look at tuning the supply chain model as well. The same principles apply here as with any other component of the pipeline.


## What is an optimisation?

An algorithmic optimisation lets us search a parameter space for the combination of values that best 'solves' some problem that we're interested in. This could be as simple as a linear regression, or as complex as tuning the hyperparameters of a neural network. In our case we're interested in tuning impact functions in CLIMADA, adjusting their size and shape until the modelled losses best resemble the observations that we have.

Every optimisation problem requires three elements (or some variation on these three elements):

- **The parameter space:** information about all the different variables whose possible values we want to explore, plus any constraints we want to place on their permitted values, or rules about which combinations of values can be used. In this case, these are the parameters that together define our impact function(s).
- **The objective function:** a function (often a statistical model) that takes these parameters as input and maps them to some output. For us, this is a wrapper around part or all of our modelling chain that maps the input parameters to various modelled impacts. In the example this uses the input parameters to create an impact function, and runs CLIMADA with this impact function, outputting losses. The output doesn't have to be a single value, it can be anything you like: for example, average annual loss from an event set, or a vector of losses from every event on every sector, or losses from a single location, or a combination of all these things. 
- **The cost function:** This takes two inputs: the output from the objective function, and some observational or training data which is considered to be the target value(s) for the objective function. The cost function tells the optimisation how well or poorly each combination of parameters performs compared to the observations. In our example, our observations are event losses from EM-DAT and the cost function calculates the difference between modelled and observed losses and returns a summary statistic (we discuss the choice of this statistic later).

The optimisation then explores the parameter space, attempting to find a combination of parameters that minimises the cost function (and in our case, in as few steps as possible). Most of the content below ends up being about this process of finding the combination of parameters that gives the minimum value of the cost function: how to model an approximation of the cost function and how to sample from the parameter space most efficiently.
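To make the three elements concrete, here is a minimal Python sketch. It is not the NCCS or CLIMADA code: the events, parameter ranges and function names (`PARAMETER_SPACE`, `objective_function`, `cost_function`) are invented for illustration, and the 'model' is a toy sigmoid damage calculation standing in for a real CLIMADA run.

```python
import math

# 1. Parameter space: parameter names and permitted ranges (illustrative values).
PARAMETER_SPACE = {
    "v_half": (30.0, 150.0),  # intensity at which damage reaches half of `scale`
    "scale": (0.1, 1.0),      # maximum damage fraction
}

# Hypothetical events: (hazard intensity, exposed asset value, observed loss).
EVENTS = [
    (40.0, 1.0e9, 2.0e7),
    (55.0, 5.0e8, 6.0e7),
    (70.0, 2.0e9, 7.0e8),
]


def objective_function(v_half, scale, v_thresh=25.0):
    """2. Toy stand-in for the model run: one modelled loss per event."""
    losses = []
    for intensity, exposure, _observed in EVENTS:
        x = max(intensity - v_thresh, 0.0) / (v_half - v_thresh)
        damage_fraction = scale * x**3 / (1.0 + x**3)
        losses.append(exposure * damage_fraction)
    return losses


def cost_function(modelled, observed):
    """3. Summary statistic: mean squared difference of log losses."""
    errors = [(math.log(m + 1.0) - math.log(o + 1.0)) ** 2
              for m, o in zip(modelled, observed)]
    return sum(errors) / len(errors)


observed = [loss for _, _, loss in EVENTS]
cost_a = cost_function(objective_function(v_half=75.0, scale=1.0), observed)
cost_b = cost_function(objective_function(v_half=140.0, scale=0.2), observed)
print(f"cost A = {cost_a:.3f}, cost B = {cost_b:.3f}")
```

The optimiser only ever sees the mapping from a point in the parameter space to a single cost value. Everything else (the model run, the observations, the choice of statistic) is hidden inside these two functions.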

An optimisation algorithm searches for the optimal parameter combination that minimises the cost function with an iterative process. For each iteration:

- The algorithm looks at the points it has already sampled from the parameter space and what it knows about the behaviour of the cost function
- It uses some modelling process (see below) to choose another point in the parameter space to sample that will add the most useful information about the location of the function's minimum (discussed below too)
- It runs the model (the objective function) for the new sampled parameters from the parameter space and calculates the value of the cost function
- If some stopping criterion is met (e.g. only small improvements in our knowledge of the cost function, a maximum number of iterations reached) it stops. Otherwise it starts the next iteration.
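As a rough illustration of these steps, here is a minimal Python sketch of the loop. The names (`run_optimisation`, `make_random_proposer`, `toy_cost`) are hypothetical. The 'modelling process' that chooses the next point is replaced by uniform random sampling, so this is a plain random search and not the Bayesian method described later. Only the loop structure and the two stopping criteria are the point here.

```python
import random


def run_optimisation(evaluate_cost, bounds, propose, max_iterations=50,
                     min_improvement=1e-3, patience=10):
    """Generic 'choose sample, evaluate, repeat' loop with two stopping criteria."""
    history = []              # (parameters, cost) for every point sampled so far
    best_cost = float("inf")
    stale = 0                 # iterations since the last meaningful improvement
    for _ in range(max_iterations):          # criterion 1: iteration budget
        params = propose(history, bounds)    # choose the next point to sample
        cost = evaluate_cost(**params)       # run the model and the cost function
        history.append((params, cost))
        if cost < best_cost - min_improvement:
            best_cost, stale = cost, 0
        else:
            stale += 1
        if stale >= patience:                # criterion 2: only small improvements
            break
    best_params, best_cost = min(history, key=lambda item: item[1])
    return best_params, best_cost, history


def make_random_proposer(seed=0):
    """Stand-in proposal rule: ignores the history and samples uniformly."""
    rng = random.Random(seed)

    def propose(history, bounds):
        return {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}

    return propose


def toy_cost(x, y):
    """Cheap toy cost with its minimum at x = 1, y = -2."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


bounds = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}
best_params, best_cost, history = run_optimisation(
    toy_cost, bounds, make_random_proposer())
print(len(history), best_params, best_cost)
```

Swapping `make_random_proposer` for a rule that actually uses `history` is where the different optimisation methods differ.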

This process can be represented as a flow chart:

<img src="images/calibration_optimisation_flow.png" width="500" alt="Flow chart representing the flow of an optimisation algorithm">

How the algorithm thinks about the form of the cost function and how it chooses its next sample from the parameter space are big and interesting problems with a lot of solutions. The solution we have chosen is Bayesian optimisation with Gaussian process priors. The 'Bayesian optimisation' is our method for selecting the next sample from the parameter space; the 'Gaussian process priors' refer to how we approximate the cost function using what we know.


## What does a Bayesian approach mean?

Statisticians often divide into two camps: Bayesians and frequentists. Both are good approaches to different problems (and, if you dig deep enough, can be mathematically equivalent). Frequentist methods treat model parameters as _fixed but unknown_ facts about the universe, estimating them based solely on observed data. In contrast, Bayesian methods consider parameters as _uncertain_: random variables with prior probabilistic distributions which are updated using observed data to produce posterior distributions.
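In symbols, the update from prior to posterior is Bayes' theorem. The notation is an assumption introduced here and is not used elsewhere in this report: $\theta$ stands for the parameters and $D$ for the observed data.

$$
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{\int p(D \mid \theta')\, p(\theta')\, \mathrm{d}\theta'} \;\propto\; p(D \mid \theta)\, p(\theta)
$$

Here $p(\theta)$ is the prior, $p(D \mid \theta)$ is the likelihood of the observations given the parameters, and $p(\theta \mid D)$ is the posterior.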

A Bayesian framework offers several advantages for our calibration problem:
- It naturally incorporates prior knowledge and expert judgment, particularly useful for components with limited observational data
- It provides a coherent framework for uncertainty quantification, essential for complex, multi-component models
- It allows for more intuitive interpretation of results, expressing parameter estimates as probability distributions rather than point estimates
- It facilitates the integration of diverse data sources and the propagation of uncertainties through the entire model chain, making it particularly well-suited for the hierarchical and iterative calibration strategy


## Why do we use Gaussian processes in optimisation?

A Gaussian process is a good way to approximate any unknown function defined over a continuous domain where we know its value at a finite set of points (e.g. the value of the cost function). In particular, the way a Gaussian process approximates a function is mathematically nice: as we get more information the approximation gets more and more accurate, it doesn't expect a particular statistical distribution or functional form (which we wouldn't expect in a complex problem like this), and it has predictable, cheap-to-calculate properties, including marginals and conditionals, which are important in Bayesian inference. These properties are important when we use the approximated function for operations such as choosing the next parameter combination to test.

In straightforward approaches to an optimisation with a Gaussian process, the optimising algorithm will gradually refine its understanding of the parameter space by sampling points from within it. The optimiser will typically look at the local gradient of the function and its Hessian (i.e. second derivatives) to choose the next location to sample. This works well, but there are more efficient ways to search if you really want to reduce the number of times you have to call your objective function (the model).

At the cost of more expensive calculations of where to sample next, a Bayesian process can make better inferences about the combination of parameters that, when sampled, are most likely to improve its knowledge about the cost function's minimum.
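To show what 'approximating a function from a finite set of points' looks like, here is a small NumPy sketch of a Gaussian process posterior in one dimension. It is a textbook zero-mean, unit-variance GP with a Matérn 5/2 kernel, written from scratch for illustration. The training values are invented, the function names are hypothetical, and this is not the implementation inside the optimisation package.

```python
import numpy as np


def matern52(a, b, length_scale=1.0):
    """Matérn 5/2 covariance between two sets of 1-D points."""
    r = np.abs(a[:, None] - b[None, :]) / length_scale
    return (1.0 + np.sqrt(5.0) * r + 5.0 * r**2 / 3.0) * np.exp(-np.sqrt(5.0) * r)


def gp_posterior(x_train, y_train, x_query, length_scale=1.0, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = matern52(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_qt = matern52(x_query, x_train, length_scale)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # Prior variance is 1 for this kernel; subtract the part explained by the data.
    explained = np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    std = np.sqrt(np.clip(1.0 - explained, 0.0, None))
    return mean, std


# Invented cost values at four sampled parameter values.
x_train = np.array([0.0, 1.0, 3.0, 4.0])
y_train = np.array([0.8, 0.3, 0.2, 0.9])

# Query an already-sampled point, a gap between samples, and a far-away point.
x_query = np.array([1.0, 2.0, 6.0])
mean, std = gp_posterior(x_train, y_train, x_query)
for x, m, s in zip(x_query, mean, std):
    print(f"x = {x:.1f}: mean = {m:.3f}, std = {s:.3f}")
```

The uncertainty is close to zero at a point that has already been sampled, larger in the gap between samples, and largest far away from all samples. This is the information the optimiser uses to decide where to sample next.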


## What is Bayesian optimisation?

Bayesian optimisation refers to a particular family of optimisation algorithms that considers the objective function itself as probabilistic, that is, each step of the optimisation isn't considering a single 'best guess', but instead looks at the whole space of possible approximating functions (in our case, the space of Gaussian processes defined on samples from our parameter space). Each function in the space has an associated plausibility quantifying how well it does (or doesn't) agree with the data we have.

A Bayesian optimisation algorithm consists of an iterative process that repeatedly (and very very strategically) samples the parameter space based on what it already knows about the cost function's behaviour (the prior distribution), runs our model for the sampled parameters and calculates the cost function, and then updates its statistical description of the plausibility of all approximating functions with the new information. This updated description is the 'posterior' distribution. Each step of the optimisation algorithm is mathematically a Bayesian update, the core process of all Bayesian inference.


## Bayesian optimisation is good with expensive-to-run models

We choose a Bayesian optimisation algorithm because it prioritises reaching conclusions in _as few steps as possible_ of the optimisation iteration. Given how expensive it is to run our modelling chain (or even a component of the chain) we want to minimise the number of times we have to run the model. This isn't always a requirement of optimisation routines: many (even most) statistical models are fairly simple and create objective functions that can be evaluated in microseconds. In these cases the model's computational cost isn't a huge factor in algorithm design. Thankfully, a lot of people care about exactly this problem at the moment because a lot of people are exploring hyperparameter spaces used to design machine learning models, which are usually quite expensive to train.

Bayesian optimisation does this by allowing our algorithm to consider more information (encoded as uncertainty) in its decision-making than other, simpler approaches. Each iteration of the optimisation algorithm is a bigger improvement on the previous step when compared with the stepwise improvements that other algorithms make (usually! It of course depends on you setting up your model and optimisation correctly). Note: a Bayesian approach isn't the only way to accomplish this! But it ends up being quite a neat one, in my opinion at least.

This comes at the cost of more computationally complex overhead to decide on the next sample, but when our model is expensive to run this is definitely worth it.


## The posterior distribution encodes uncertainty

Modelling the objective function as a probabilistic family of functions has other benefits. It means that the algorithm's final output is also a probabilistic distribution of functions, rather than a single parameter estimate (though it contains a 'best' parameter selection too). It tells us everything the algorithm has inferred about the cost function's form: for each point in the parameter space we get a quantification of how plausible that combination of parameters is, given the observations we have (according to the cost function).

This posterior distribution is exceedingly useful because we can sample it like any other statistical distribution. The samples can be used to explore the full range of plausible impact functions generated through the calibration, and therefore the uncertainty in our impacts.


## The algorithm

In this Bayesian formulation of the problem, each step of the iterative search proceeds as in the above outline:

- We want to approximate the cost function with a Gaussian process. The Bayesian approach assumes that there is a non-parametric statistical distribution of cost functions, each with a prior plausibility. If this is the first iteration, the priors are set by the user. Otherwise the posterior of the previous iteration becomes the prior for the next iteration.
- An acquisition function (see below) is used to choose a new point to sample from the parameter space, trying to maximise the additional information we gain from evaluating the objective function at this point.
- The model is run and the cost function calculated
- Our probabilistic model of plausible approximations of the objective function is updated with this new data point to generate a posterior distribution of plausible approximations of the objective function. This approximation will be more accurate and constrained than the prior.
- If some user-provided stopping criterion is met it stops. Otherwise it starts the next iteration.

The optimisation module underlying all our code uses this approach and is described in the documentation for the `bayesian-optimization` package: https://github.com/bayesian-optimization/BayesianOptimization.

A wrapper around the algorithm has been produced by Lukas Riedel and is in CLIMADA's `calibration` package. Many many thanks to Lukas!

For more information on Bayesian optimisation in general see https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86956663e1819cd-Paper.pdf or https://arxiv.org/pdf/1012.2599v1


## What needs tuning in a Bayesian optimisation?

This is where my knowledge starts to run thin.

- **The acquisition function** is another cost function. This one is used to choose the next combination of parameters to sample from the parameter space, given what we already know about the space from the samples we've taken. There are lots of ways to pick an acquisition function, some of which are detailed in this paper: https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86956663e1819cd-Paper.pdf. The choice is usually between maximising the probability of improvement, maximising the expected improvement, or minimising regret (optimising based on confidence bounds). This last approach is the one recommended by the Python package we use, and so the one we use. It allows you to set the balance between thoroughly exploring the parameter space and getting the local minimum exactly right. We use the out-of-the-box value for the parameter, but it could be worth balancing this towards more thorough exploration, since we don't need a precisely correct solution.
- **The covariance function** or the kernel is a key element of how the algorithm understands the behaviour of the space of Gaussian processes. We use the Matérn 5/2 kernel, which is recommended in the above paper and is the default in the package we're using. If the optimisation really struggles, we could look at other options.
- There are other parameters too (e.g. the alpha parameter, which needs to be adjusted when the objective function is very noisy), but it doesn't seem worth looking into them.
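The exploration/exploitation balance in a confidence-bound acquisition function can be illustrated with a few lines of Python. This is a generic sketch and not the package's implementation: the candidate points, their predicted means and standard deviations, and the name `kappa` for the balance parameter are all assumptions made for the example. Because we minimise a cost, the sketch uses a lower confidence bound.

```python
def lower_confidence_bound(mean, std, kappa):
    """Optimistic cost estimate: low when the predicted cost is low or uncertain."""
    return mean - kappa * std


# Hypothetical surrogate predictions of the cost at three candidate points.
candidates = {
    "near current best": {"mean": 0.20, "std": 0.02},
    "moderately explored": {"mean": 0.35, "std": 0.10},
    "unexplored corner": {"mean": 0.60, "std": 0.30},
}


def choose_next(candidates, kappa):
    """Pick the candidate with the smallest lower confidence bound."""
    return min(
        candidates,
        key=lambda name: lower_confidence_bound(
            candidates[name]["mean"], candidates[name]["std"], kappa),
    )


exploit_choice = choose_next(candidates, kappa=0.5)  # small kappa: trust the mean
explore_choice = choose_next(candidates, kappa=3.0)  # large kappa: chase uncertainty
print(exploit_choice, "|", explore_choice)
```

With a small `kappa` the rule refines the region that already looks best. With a large `kappa` it prefers the point where the surrogate is least certain, which is the 'more thorough exploration' setting mentioned above.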


## Summary: why Bayesian?

- It lets us use all available info without making further assumptions
- It works with probability distributions of how plausible different descriptions of the objective function are, given the samples it has made.
- ... this distribution is _nonparametric_, i.e. it can have an arbitrary and complex functional form, which we would require for a complex multidimensional parameter space like the one we're exploring
- ... and this posterior probability distribution can be sampled as a way of exploring model uncertainty
- when we are uncertain about a parameter and have few observations, we can use expert judgement as our prior
- when we are uncertain about a parameter and have few observations, that uncertainty is present in the final model
- uncertainty isn't defined parameter-by-parameter; rather, it is a holistic quantification defined across the entire parameter space. Furthermore, during a calibration uncertainty is able to propagate through the modelling chain, meaning there isn't a false sense of precision when calibrating later model components

Note: A Bayesian approach isn't the only way to solve these issues, but it is one good way to solve them.

# Uncertainty simulations

## Sampling the posterior

One of the lovely things about Bayesian posteriors (and also priors) is that they are 'generative'. That means that you can sample from them, as with parametric distributions. This allows us to think of parameters we've calibrated as uncertain, not just as point estimates.

Until now our NCCS simulations have been running using our 'best' point estimates for each parameter. But we can also run simulations with other choices of each parameter. In theory, we can run hundreds of supply chain simulations, each one sampling from the uncertainty distribution of the calibrated parameters. This will give a 'full' range of uncertainty in the modelled results by repeating our analyses with hundreds of plausible tunings of the model. In the outputs of these simulations the highest impact isn't just from the most intense event in the event set, it's from the most intense event in the simulation with the most extreme impact function sample.

CLIMADA's `unsequa` module provides a suite of tools to set up these simulations.
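The idea can be sketched without CLIMADA. In the toy example below everything is hypothetical: the event intensities, the exposure, the sigmoid damage function, and above all the 'posterior samples' of `v_half`, which are drawn from a clipped normal distribution as a stand-in for samples that would really come from the calibration output.

```python
import random
import statistics


def damage_fraction(intensity, v_half, v_thresh=25.0, scale=1.0):
    """Toy sigmoid impact function (damage as a fraction of exposed value)."""
    x = max(intensity - v_thresh, 0.0) / (v_half - v_thresh)
    return scale * x**3 / (1.0 + x**3)


def max_event_loss(event_intensities, exposure, v_half):
    """Largest single-event loss in the event set for one parameter choice."""
    return max(exposure * damage_fraction(i, v_half) for i in event_intensities)


rng = random.Random(42)
# ASSUMPTION: stand-in for posterior samples of v_half from a calibration.
posterior_samples = [min(max(rng.gauss(75.0, 10.0), 40.0), 150.0) for _ in range(200)]

event_intensities = [35.0, 50.0, 65.0, 80.0]
exposure = 1.0e9

# One run with the 'best' point estimate, then one run per posterior sample.
best_estimate = max_event_loss(event_intensities, exposure, v_half=75.0)
simulated = [max_event_loss(event_intensities, exposure, v) for v in posterior_samples]

cut_points = statistics.quantiles(simulated, n=20)  # 5%, 10%, ..., 95%
q05, q95 = cut_points[0], cut_points[-1]
print(f"best estimate: {best_estimate:.3e}")
print(f"5-95% range:   {q05:.3e} to {q95:.3e}")
print(f"largest loss:  {max(simulated):.3e}")
```

The largest loss over all simulations comes from the most intense event combined with the most damaging parameter sample (here the lowest `v_half`), which is the effect described above.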

## Storylines

In practice, however, we may find that these are far too computationally expensive. Currently a full modelling chain takes many many hours to run.

While we have a few ideas on how to speed it up (see below), and there are some neat ways to explore the parameter uncertainty space without recalculating all of the impacts, we may decide that we don't have the time or computational resources for a fully probabilistic exploration of the uncertainty space.

If this happens, we would fall back on the well-loved Storylines approach. In this case we run the modelling chain with our 'most likely' combination of parameters, and communicate this as our best guess. We then choose some other interesting combinations of parameters that we would like to explore. These are usually chosen with narrative interest, e.g. where asset losses are at the high/low end of the uncertainty range, or where extreme events are particularly damaging, or where business interruption has a stronger/weaker effect than we estimate. Or a combination of these things. We re-run the modelling chain for each of these and report on what we see.


## How to speed up simulations
(Most readers can skip this section.)

- Deployment on the CelsiusPro VM. This has a _lot_ of computational resources.
- Component-wise uncertainty assessment instead of running the whole modelling chain.
- Linear scaling of direct impacts is cheap: if we already have an impact calculated, we can just multiply the impacts by a constant instead of recalculating them (with some caveats).
- Collapsing the event set: many events will look similar to each other, and will have similar impacts. We can 'prune' our event sets by identifying pairs of events that have similar impacts, removing one of the events, and adding its frequency to the other. This makes a very very small change to the expected losses and speeds up calculations. This is especially worth doing for less impactful events, since it is less important to model the full range of possible weak impacts.
- Reducing IO costs (mostly the need to read in Exposures multiple times and to store an Impact object's `imp_mat` attribute. Both of these could be avoided after Samuel Juhel's refactor of the Supply Chain module, we hope).
- Caching: (still thinking about this one...) Every time we run a component of the model pipeline, we can save some of the outputs in a cache along with the parameters that were used to generate them. These can then be used in future calculations: either as precalculated samples from the parameter space that can help build a prior, or as precalculated model runs when we are exploring uncertainty in the calibrated model.
- Parallelisation: while the optimisation algorithm of choose sample, evaluate, repeat sounds very linear, there are ways to parallelise this. We can look into them if we need them.
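The event-set collapsing idea from the list above can be sketched as follows. This is a simplified illustration, not the NCCS implementation: each event is reduced to a single total impact and an annual frequency, whereas real events have spatial footprints, and the function names and the 5% tolerance are assumptions.

```python
def prune_events(events, rel_tol=0.05):
    """Merge events with similar impacts.

    `events` is a list of (impact, annual_frequency) tuples. Events whose impact
    is within `rel_tol` of a kept representative are removed and their frequency
    is added to the representative.
    """
    merged = []
    for impact, freq in sorted(events):
        if merged and impact <= merged[-1][0] * (1.0 + rel_tol):
            rep_impact, rep_freq = merged[-1]
            merged[-1] = (rep_impact, rep_freq + freq)  # drop event, keep its frequency
        else:
            merged.append((impact, freq))               # start a new representative
    return merged


def expected_annual_loss(events):
    """Frequency-weighted sum of event impacts."""
    return sum(impact * freq for impact, freq in events)


events = [
    (1.00e6, 0.020),
    (1.03e6, 0.010),
    (5.00e6, 0.005),
    (5.10e6, 0.005),
    (2.00e8, 0.001),
]
pruned = prune_events(events)
eal_before = expected_annual_loss(events)
eal_after = expected_annual_loss(pruned)
print(len(events), "->", len(pruned), "events")
print(f"relative change in expected annual loss: {(eal_after - eal_before) / eal_before:.4%}")
```

Because every removed event is within `rel_tol` of its representative, the expected annual loss can change by at most that relative amount, while the total event frequency is unchanged.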


# Implementation in the NCCS codebase

The calibration is being implemented in four Versions, each one improving on the previous. Version 1 starts with a simple calibration of impact functions for asset damages against observations, since this is the most uncertain part of the model with the most available observations. Version 2 introduces a more nuanced view of regional vulnerability to the asset loss calculation. Version 3 looks at the business interruption component of the model, and Version 4 calibrates the whole pipeline at once.

## Version 1: calibrate direct asset loss impact functions

Here we take a very pragmatic approach. The goal is to calibrate everything component-wise, and to calibrate only the components where there is enough data for a meaningful level of certainty in the outputs.

In practice, that means calibrating direct asset loss impact functions for each hazard to EM-DAT loss data. In this first version we don't have enough data to meaningfully improve on the HAZUS functions in the business interruption components.

Therefore, we are calibrating this part of the modelling pipeline:

<img src="images/calibration_component_asset_loss.png" width="800" alt="The NCCS modelling pipeline with the asset loss component highlighted">


Zooming in on the calibrated component, the detailed optimisation looks like this:

<img src="images/calibration_component_asset_loss_detailed.png" width="500" alt="A flow chart for calibration against asset loss">


### Choice of impact function and parameter space

For each of the impact functions we're working with, we assume that the impact function we're fitting is a sigmoid function. This is very common in risk modelling.

A typical sigmoid curve in CLIMADA looks like this (this is the (old) out-of-the-box impact function for asset damage from tropical cyclone winds):

<img src="images/calibration_emanuel_sigmoid.webp" width="500" alt="CLIMADA's default impact function sigmoid curve for TC asset losses">

A sigmoid function is defined (for us) by three variables, `v_thresh`, `v_half` and `scale`:

- `v_thresh` is the point on the x-axis where impacts start
- `v_half` is the point on the x-axis where impacts reach 50% of `scale`
- `scale` is the maximum value of an impact (usually 100%)

Adjusting `v_thresh` and `v_half` together is equivalent to a translation of the function.

Adjusting `v_thresh` is equivalent to stretching the function along the x-axis (with `v_half` held constant).

Adjusting `scale` is equivalent to a vertical scaling of the function.

The most basic restrictions on the parameters are that they must all be positive, and `v_half` must always be larger than `v_thresh`. In practice we use more restrictive priors to narrow the search space.
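As a minimal sketch of how the three parameters shape the curve, the function below uses the cubic sigmoid form associated with the Emanuel tropical cyclone function. The function name `sigmoid_impact` is hypothetical, the parameter values are purely illustrative, and other hazards may use a different functional form.

```python
def sigmoid_impact(intensity, v_thresh, v_half, scale=1.0):
    """Damage fraction for a hazard intensity, using an Emanuel-type sigmoid."""
    # Basic restrictions on the parameter space described above.
    if not 0.0 < v_thresh < v_half:
        raise ValueError("need 0 < v_thresh < v_half")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    x = max(intensity - v_thresh, 0.0) / (v_half - v_thresh)
    return scale * x**3 / (1.0 + x**3)


# Illustrative parameter values (wind speeds in m/s).
v_thresh, v_half, scale = 25.7, 74.7, 1.0
for wind_speed in (20.0, 25.7, 50.0, 74.7, 100.0, 150.0):
    fraction = sigmoid_impact(wind_speed, v_thresh, v_half, scale)
    print(f"{wind_speed:6.1f} m/s -> damage fraction {fraction:.3f}")
```

Below `v_thresh` the damage is zero, at `v_half` it is exactly half of `scale`, and for very high intensities it approaches `scale`.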


### Choice of cost function

The choice of the cost function here is important, because it determines what we want to get right. There are two common choices in risk modelling like this: either the mean square difference or the mean square log difference. Both of these are common error statistics. The first is a measure of how far off you are when you're trying to reproduce each observation. The second tells you how far off _the right order of magnitude_ you are when trying to reproduce each observation. This affects what our model most cares about getting right.

The mean square difference cares most about getting higher losses right, since a $1,000 error is equally important for a $10 k event as for a $10 bn event, and the largest absolute errors tend to come from the largest events.

The mean square log difference cares most about getting the order of magnitude of each event about right. Since it looks at logarithms of the values, a 50% error is equally important for a $10 k event as for a $10 bn event.

In the first calibrations here, we take the latter approach, hoping to get the order of magnitude about right. (Thinking about it, I think the former approach might be more justifiable: the main disruption to supply chains is likely to come from really big events, and maybe we want to focus on getting them right!)
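The difference between the two statistics is easy to see with two invented events. In this sketch the losses and function names are hypothetical. The log version assumes strictly positive losses, so zero losses would need extra handling.

```python
import math


def mean_square_error(modelled, observed):
    """Mean squared difference between modelled and observed losses."""
    return sum((m - o) ** 2 for m, o in zip(modelled, observed)) / len(observed)


def mean_square_log_error(modelled, observed):
    """Mean squared difference between log losses (requires positive values)."""
    return sum((math.log(m) - math.log(o)) ** 2
               for m, o in zip(modelled, observed)) / len(observed)


observed = [1.0e4, 1.0e10]         # a $10 k event and a $10 bn event
small_event_off = [2.0e4, 1.0e10]  # small event overestimated by a factor of 2
large_event_off = [1.0e4, 2.0e10]  # large event overestimated by a factor of 2

mse_small = mean_square_error(small_event_off, observed)
mse_large = mean_square_error(large_event_off, observed)
msle_small = mean_square_log_error(small_event_off, observed)
msle_large = mean_square_log_error(large_event_off, observed)

print(f"MSE:  small event off = {mse_small:.3e}, large event off = {mse_large:.3e}")
print(f"MSLE: small event off = {msle_small:.4f}, large event off = {msle_large:.4f}")
```

Under the mean square difference the factor-of-two error on the large event dominates completely, while under the mean square log difference the two errors count the same.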


### Uncertainty and generative models

As already said, the advantage of a Bayesian posterior distribution is that it can be sampled. This means that, after calibration, we can run model simulations with different plausible combinations of parameters and see how the losses vary, giving us an idea of the uncertainty in the model. For each Version of the calibration, we can do this and show where the model's uncertainties are, and how it compares to observations.
 

### V1 calibrated components

Each of these components is calibrated separately against the EM-DAT observations we have:

- Tropical cyclone asset loss against EM-DAT
- River flood asset loss against EM-DAT
- European windstorm asset loss against EM-DAT

There is some regional variation in the calibration. For both the tropical cyclone and river flood impact functions the globe is split into regions, and impact functions are fit independently for each region (and the European windstorm is already regional). This is improved on in Version 2.


### Still to be solved:

- Relative crop yield (need observation data)
- If we have other loss observations beyond EM-DAT we could extend the calibration to include them
- Initial work suggests that searching for three parameters is too much, and the optimisation can't solve this well. We get a better, narrower fit by applying stricter priors based on previous studies (expert judgement), including holding `v_thresh` constant. 
- Some form of validation: likely a cross-validation


### Notes

- While in this example the observations for validation are all single-event losses, the observations that we're validating against are allowed to be very heterogeneous. As long as a quantity can be calculated from the model output and evaluated with a cost function, it can be compared to an observation. That means that we could add summary stats, e.g. expected annual impacts, from other studies. The methodology is flexible.
- There are sufficiently many observations here that our choice of priors isn't too important.
- Wildfire is now no longer part of the NCCS study and isn't included in the modelling or calibration

## Version 2: add regional vulnerability to direct asset losses

Version 2 is similar to Version 1, but adds an extra degree of freedom to each calibrated impact function in the form of a vulnerability parameter. We propose to take this from the ND-GAIN data that is already used by EBP in their metrics, since it's trusted by our group and ensures a consistent idea of 'vulnerability' between components of the project.

The calibration will assign a vulnerability index to each country and allow the vulnerability curves to change based on vulnerability. (Exactly what we use will take a little experimentation: should it be the raw ND-GAIN index, the country rank, quantile information, or something else? And which part of the impact function should it change? We can run a few quick tests.)

There are two challenges from adding an extra degree of freedom:

- **Computational complexity:** as the parameter space grows in dimensionality, it takes longer for the algorithm to explore it and find the optimal parameter combinations
- **Parameter uncertainty:** for each additional parameter you add to a statistical model, there is a tradeoff between the explanatory power it provides and the increasing uncertainty in each fitted parameter, especially in models trained on small datasets like ours. Our hope is that vulnerability adds so much explanatory power that there is little additional uncertainty. It's a reasonable hope, since we know losses depend a lot on a country's vulnerability, and we'll be able to train the function on more data since we won't be fitting independent functions for different world regions. We'll be able to compare the V1 and V2 calibrations and decide where the vulnerability information improves the model.

### V2 calibration components
These are the same as the V1 components

### Still to be solved

- The best way to represent the effect of national vulnerability in a family of impact functions, discussed above.


## Version 3: add expert judgement (and any observations) for uncalibrated components

Until now, the calibration has focussed only on the asset loss calculations. We now look at the Business Interruption (BI) component. This section looks only at the calculation of BI from asset losses (adjusting the entire chain from hazard to BI is discussed below):

<img src="images/calibration_component_BI.png" width="800" alt="The NCCS modelling pipeline with the production loss component highlighted">

Currently we have very few observations of business interruption, meaning that this calibration won't change much. We can use some expert judgement to specify our priors (some confidence interval around the Hazus values), and that might be as far as we can go. If that is the case, this stage of the calibrations will mostly serve to quantify some uncertainty around business interruption.

The corrections will be implemented as small tweaks to the existing Hazus functions. These will solve two issues:
- The Hazus functions were created for the US and trained on US-only data. We may want to come up with a correction factor to account for regional differences in BI around the world. (Note the _may_ here: I argue against this below.)
- The Hazus functions were trained on different hazard data to the hazard we're using in NCCS. Two hazard datasets are likely to disagree, and there will be biases when one is compared to the other. We need to correct for this. This can either be done as an explicit bias correction (if we're able to compare our modelled losses to those in Hazus – we're looking into this) or the biases will be implicitly corrected later when these parameters are adjusted as part of the whole-pipeline calibration.


### V3 calibration components

- Business interruption as a function of (precalculated) asset loss: forestry, service, manufacturing, mining, energy (and any other) economic subsectors against any observations that we can find.


### Still to be solved

- What mathematical form do we want to use to describe perturbations to the (empirical) functions built by Hazus? Ideally we would apply some linear transformation of the production loss function (translating and scaling it), but this would require us to estimate two parameters. Given our lack of observations, we may need to reduce this complexity and provide a simple scaling. To further simplify things, we may want to reduce the number of sector-adjustments we train: perhaps we decide to train a single adjustment to use with manufacturing, energy and mining. Perhaps we decide to apply a single bias-correcting adjustment to all sectors. We'll only know when we've finished collecting our observation data and can see how well things agree.
- How do we choose priors for the adjustments? This will most likely be done with expert judgement. Without any additional information, a normal distribution around the Hazus values is reasonable. 
- How to describe the effect of regional vulnerability, if at all? We will have very little data for this, so I will argue that we should try to avoid a regional adjustment here: the theoretical justification is that regional vulnerabilities are already included earlier in the pipeline during the asset loss calculation. This means that we are already modelling higher asset losses in more vulnerable countries compared to less vulnerable countries, and this translates into higher business interruption. We will have to inspect the available and modelled data to see if it holds up.
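
To illustrate the choice between a two-parameter adjustment and a simple scaling, here is a minimal sketch. It assumes the adjustment acts on the output of the production loss function (`scale * f(x) + shift`, clipped to [0, 1]), which is only one possible reading of "translating and scaling". The base curve is an invented placeholder and does not contain Hazus values, and the prior widths are arbitrary.

```python
import random

# Placeholder curve: (asset loss fraction, production loss fraction) pairs.
# These numbers are invented for illustration and are NOT Hazus values.
BASE_CURVE = [(0.0, 0.0), (0.25, 0.1), (0.5, 0.4), (1.0, 1.0)]


def base_production_loss(asset_loss):
    """Piecewise-linear interpolation of the placeholder curve."""
    x = min(max(asset_loss, 0.0), 1.0)
    for (x0, y0), (x1, y1) in zip(BASE_CURVE, BASE_CURVE[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return BASE_CURVE[-1][1]


def adjusted_production_loss(asset_loss, scale=1.0, shift=0.0):
    """Apply the adjustment and keep the result a valid fraction."""
    y = scale * base_production_loss(asset_loss) + shift
    return min(max(y, 0.0), 1.0)


def sample_prior(rng, scale_sd=0.2, shift_sd=0.0):
    """Normal priors centred on 'no adjustment' (scale 1, shift 0).

    Setting shift_sd to 0 pins the shift at zero, which reduces the
    adjustment to the simple one-parameter scaling.
    """
    return rng.gauss(1.0, scale_sd), rng.gauss(0.0, shift_sd)


rng = random.Random(42)
for _ in range(3):
    scale, shift = sample_prior(rng)
    print(round(scale, 3), shift, round(adjusted_production_loss(0.5, scale, shift), 3))
```

Sharing one adjustment across sectors would then just mean passing the same `scale` (and `shift`) to each sector's base curve.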


### Notes

- We don't model business interruption for agriculture: we assume the asset loss of crops is equal to the production loss 

## Version 4: calibrate entire modelling pipeline

Version 4 is the big conceptual shift. Instead of calibrating individual components we calibrate multiple components at once:

<img src="images/calibration_whole_chain.png" width="800" alt="The NCCS modelling pipeline for a whole-pipeline calibration">

(This diagram doesn't include the right-hand side of the modelling chain because I'm not expecting to have observations for it. It would be easy, though not cheap, to extend this.)

While calibrating individual components is good, it doesn't guarantee that we have a good calibration when we combine them into a single pipeline. This is especially true in the NCCS chain, where the production loss impact curves are produced from Hazus data, which was calibrated against different hazard data to the CLIMADA asset loss data. That means that if, for example, the CLIMADA hazard has a low bias compared to the hazard used in the Hazus calibrations, that bias will propagate through the production loss chain and give a low bias in the modelled production loss, even though each component was calibrated.

The purpose of a whole-chain calibration is to fix these biases, adjusting all parameters simultaneously. (This assumes, however, that we have observations for the whole chain, i.e. relating hazard to business interruption, even if it's just return-period BI.)
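
The toy example below sketches the idea under heavy simplifying assumptions. The two-step linear chain, the parameter values, the 20% low bias, the cost function and the brute-force grid search are all invented for illustration; the grid simply stands in for whatever search algorithm is actually used. It shows a chain that is biased low end to end, then a joint adjustment of both parameters against end-to-end observations, with normal priors centred on the component-level values.

```python
import math

HAZARD = [10.0, 20.0, 30.0]   # invented hazard intensities for three events
A0, B0 = 2.0, 0.5             # invented component-level calibrated parameters
LOW_BIAS = 0.8                # invented: the chained model sits 20% below observations


def pipeline_bi(hazard, a, b):
    """Toy chain: asset loss = a * hazard, business interruption = b * asset loss."""
    return b * (a * hazard)


# Synthetic end-to-end observations relating hazard directly to BI.
OBSERVED_BI = [pipeline_bi(h, A0, B0) / LOW_BIAS for h in HAZARD]


def joint_cost(a, b, obs_sd=0.1, prior_rel_sd=0.1):
    """Squared log misfit to the observations plus normal prior penalties."""
    misfit = sum((math.log(pipeline_bi(h, a, b) / o) / obs_sd) ** 2
                 for h, o in zip(HAZARD, OBSERVED_BI))
    prior = ((a - A0) / (prior_rel_sd * A0)) ** 2 + ((b - B0) / (prior_rel_sd * B0)) ** 2
    return misfit + prior


# Brute-force search over multipliers of both parameters at once.
multipliers = [0.80 + 0.01 * i for i in range(61)]
a_best, b_best = min(((A0 * ma, B0 * mb) for ma in multipliers for mb in multipliers),
                     key=lambda ab: joint_cost(*ab))

print("modelled / observed before:", pipeline_bi(HAZARD[0], A0, B0) / OBSERVED_BI[0])
print("modelled / observed after: ", pipeline_bi(HAZARD[0], a_best, b_best) / OBSERVED_BI[0])
```

In this toy the end-to-end observations only constrain the product of the two parameters, so it is the priors that decide how the correction is shared between the asset loss step and the BI step.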

### Notes

- This is considerably more computationally complex because we can't run separate calibrations for each hazard or each sector. That means exploring a huge parameter space.
- We would probably use narrower priors in this calibration, given the computational complexity. The adjustments here are (hopefully) just to adjust for biases, rather than find completely different solutions to the ones found in the calibration of the individual components.
- There's a risk that calibrating the entire pipeline in one go gives too many degrees of freedom (tweaking both impact functions and BI functions simultaneously), and the calibration is unable to identify a narrow region of plausible parameter combinations that explain our observations. This is something we'll have to narrow down with expert judgement and manual comparisons with any observations. If the results are still too uncertain, we will take a more pragmatic approach and calibrate the model pipeline components separately, focussing more on the bias correction of impacts that are passed into the BI calculations.


# Conclusion

Calibration is hard. We will need to make a bunch of careful expert judgements to constrain our uncertainty, but by the end of this we should have a pipeline that will give us our calibrated best guess of supply chain risk, and uncertainty ranges around it. 