
Estimating Prediction Uncertainty


Conceptual Introduction

Drawing accurate conclusions from a calibrated model requires measures of how precise the currently estimated model parameters and predictions are. To infer parameter uncertainties, the profile likelihood approach is employed, see for example Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. The idea is to fix the parameter of interest at a series of values and to re-optimize the objective function with respect to all other parameters at each of these values.
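
Written out (a sketch in generic notation, not identifiers from the d2d code), the profile likelihood of a parameter theta_i is obtained by fixing theta_i and minimizing the objective function chi^2 over all remaining parameters:

$$\mathrm{PL}(\theta_i) \;=\; \min_{\theta_j,\; j \neq i} \; \chi^2(\theta) \qquad \text{with } \theta_i \text{ fixed at a given value.}$$

Scanning theta_i over a range of values and repeating this minimization yields the profile, whose crossings with a chi-square threshold delimit the confidence interval.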

Similarly, in order to obtain a prediction uncertainty for a specific experimental condition (e.g. observable, time point), prediction profiles can be introduced: instead of a parameter of interest, a prediction of interest is fixed to several values and all parameters are re-optimized under this constraint. Confidence intervals can then be derived by finding where the profiles cross a specific threshold.

Direct calculation of prediction profiles would require optimization under non-linear constraints, which is numerically challenging. Validation profiles can be used to circumvent this issue: by including an additional data point with varying values and some uncertainty in the objective function and re-optimizing without constraints for each data value, a confidence interval for a possible next measurement of such a data point can be derived. Since the optimization is unconstrained and prediction profiles can be inferred from validation profiles, prediction profiles can be computed efficiently in this indirect way. Such an approach is described in Likelihood based observability analysis and confidence intervals for predictions of dynamic models. This relation is visualized in the following figure:

In the figure above, each of the sampled validation data points (z, VPL) is mapped to a prediction profile data point (pred, PPL), yielding a prediction profile over a smaller range of values. Note that if the standard deviation of the newly included data point approaches zero, validation profiles should converge to prediction profiles.
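
In the same generic notation, with g(theta) denoting the model prediction of interest and sigma the standard deviation of the added validation data point, the two profiles sketched above can be written as

$$\mathrm{PPL}(z) \;=\; \min_{\theta:\, g(\theta) = z} \chi^2(\theta), \qquad \mathrm{VPL}(z) \;=\; \min_{\theta} \left[ \chi^2(\theta) + \frac{(z - g(\theta))^2}{\sigma^2} \right].$$

Each unconstrained re-optimization for a validation value z also returns a re-optimized prediction, which is exactly the mapping (z, VPL) → (pred, PPL) described above; for sigma → 0 the penalty term forces g(theta) towards z, which is why the validation profile converges to the prediction profile in that limit.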

How to sample validation/prediction profiles with d2d

Prediction and validation profile calculation is supported in d2d by the function VPL, which first and foremost is an efficient and robust validation profile sampling algorithm. Since prediction uncertainties are frequently of interest, modifications to calculate prediction profiles with VPL are implemented as well. This article presents a comprehensive introduction to how this function is used. Note that the sampling procedure is in essence the same as for parameters, so briefly review the article Adapting the PLE Settings if you are not familiar with the idea.

Initializing the main algorithm: InitVPL

First of all, all calculation results and configuration options can be found in the struct ar.vpl. In order to calculate validation profiles, it should be clarified which user input is mandatory to correctly initialize the calculation. Most of this comes down to specifying the experimental condition of the prediction, which consists of the model index m, the data index d, the index of the considered observable idpred and a time point tpred of interest. Since the validation profile algorithm temporarily includes a data point in the objective function, specifying its standard deviation sigma is recommended; if it is not specified, it will be simulated if possible. This user input has to be passed to the function

InitVPL(m,d,idpred,tpred,sigma,stepmode)

which initializes the profile calculation. The argument stepmode is optional and will be explained in a later section. InitVPL sets up the substructs

ar.vpl.general
ar.vpl.config

where information on the experimental condition is stored in the field 'general' and default algorithm configurations are set up in the field 'config'. Algorithm properties can now be easily adapted by assigning new values to the corresponding fields in ar.vpl.config. Furthermore, it is recommended to initialize the calculation from a local (or even global) optimum.
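
A minimal sketch of this initialization step (the indices and values are placeholders mirroring the example at the end of this article, not a recommendation):

arFit;                        % d2d fitting routine: start the profile from a (local) optimum
InitVPL(1, 1, 2, 70, 0.02);   % model 1, data set 1, observable 2, t = 70, sigma = 0.02
ar.vpl.config.maxsteps = 150; % example of adjusting a default setting after initialization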

Sampling the profile: VPL

If you initialized the algorithm correctly, simply calling

VPL

does the rest automatically. Use of this function creates two new fields, namely:

ar.vpl.results
ar.vpl.test

Some internal algorithm results are saved in the field 'test'. The field 'results' contains everything which is most likely important to the end user, such as the calculation results for the validation and prediction profiles. Note that the fields 'z' and 'chi2' pertain to the validation profile, while the fields 'pred' and 'ppl' pertain to the prediction profile.
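
As an illustration, the two profiles can be inspected directly from these fields (a sketch that assumes the result fields are plain vectors of sampled values and profile values; check the actual contents of ar.vpl.results in your session):

% Validation profile: sampled data values (z) vs. profile values (chi2)
plot(ar.vpl.results.z, ar.vpl.results.chi2, 'o-'); hold on;
% Prediction profile: re-optimized predictions (pred) vs. profile values (ppl)
plot(ar.vpl.results.pred, ar.vpl.results.ppl, 'x-');
legend('validation profile', 'prediction profile');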

How to calculate prediction profiles?

It was explained in the introduction that prediction profiles can be inferred from validation profiles. This means that by calculating the validation profile over a certain interval, you also obtain a corresponding prediction profile, although on a smaller interval, since each sampled validation data value is mapped to a re-optimized prediction that is pulled back towards the original fit.

  • By default, the validation profile sampling is continued only as long as a certain confidence threshold has not been crossed. This implies that the prediction profile is probably not sampled up to its respective confidence threshold. Thus, to sample the validation profile until the prediction profile crosses its threshold, set the flag ar.vpl.config.prediction = 1.

  • In order to efficiently calculate prediction profiles, it is strongly recommended to set the standard deviation to a value at least as small as the range of prediction uncertainty (which of course is not known before attempting the calculation). Since validation and prediction profiles should get more similar in theory for smaller sigma, decreasing this value will usually improve the result, unless it is too small and causes numerical instabilities.

  • If this does not work, specifying another step choice algorithm (which chooses by how much the data point is changed in each iteration) might solve the issue. Simultaneous use of stepmode = 1 and ar.vpl.config.prediction = 1 enables an alternative step choice algorithm which decreases the step size if the last observed difference of prediction profile values exceeds ar.vpl.config.chi2dif_max and increases it if the difference falls below ar.vpl.config.chi2dif_min, thus controlling the step size via the prediction profile rather than the validation profile. Note that stepmode has to be set when initializing with InitVPL (see the sketch after this list).

  • In order to compute prediction profiles for model predictions which are not equal to any of the specified model observables, the corresponding prediction must be specified as an observable with no data points and a fixed error when building the model. (Developer Note: It would be nice to do this without having to mess with the model structure.)
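
Putting the points above together, a prediction-profile-oriented setup could look like this (again with placeholder indices and an illustrative sigma value):

InitVPL(1, 1, 2, 70, 0.02, 1); % stepmode = 1: step choice controlled by the prediction profile
ar.vpl.config.prediction = 1;  % continue sampling until the prediction profile crosses its threshold
VPL;                           % run the sampling
vplPlot;                       % inspect the result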

Other useful algorithm options

As already mentioned, most algorithm options can be specified after InitVPL has been executed; a combined example of adjusting several of them follows after the list.

  • ar.vpl.config.chi2dif_max = 0.2 and ar.vpl.config.chi2dif_min = 0.05 set different properties depending on which step choice algorithm is employed. In the default setting (stepmode = 2), ar.vpl.config.chi2dif_max is the maximal allowed change of the validation profile value in one step. This can be checked dynamically while proposing data steps, without optimizing the objective function. If the observed change of the last step falls below ar.vpl.config.chi2dif_min, the next proposed step is increased further, disregarding the (theoretical) threshold set by ar.vpl.config.chi2dif_max.
  • Changing ar.vpl.config.maxsteps = 100 changes the number of steps which are sampled (in each direction, upward and downward, from the starting data point).
  • Setting ar.vpl.showCalculation = 0 suppresses the intermediate output during the calculation.
  • ar.vpl.config.firststep controls the first step size at the start of the sampling and is also the default step size attempted if the objective function change is negative. This is a safety measure, since the adaptive step choice cannot reasonably handle the case of a decreasing objective function.
  • ar.vpl.config.chi2max sets the threshold at which the sampling is stopped. By default, it is chosen to be a reasonable threshold for 95% confidence intervals.
  • ar.vpl.config.minstepsize and ar.vpl.config.maxstepsize set the minimal/maximal value allowed for any data step. ar.vpl.config.maxrange specifies a maximal range of data values which can be spanned before the calculation is terminated. All of these settings are defined in relation to sigma to set a reasonable data scale.
  • If there are discontinuities in the resulting validation profile, vplSmooth can be used to attempt to remove them.
  • A basic visualization of the resulting profile is given by vplPlot.
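
For example, a combined adjustment of these options might look like this (the values are purely illustrative, not recommended defaults):

ar.vpl.config.maxsteps    = 200;  % sample more steps in each direction
ar.vpl.config.chi2dif_max = 0.1;  % tighten the maximal allowed profile change per step
ar.vpl.config.maxrange    = 10;   % limit the spanned range of data values (defined relative to sigma)
VPL;                              % re-run the sampling with the adjusted settings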

A basic example

This piece of code could be used in the Becker model found in ...\Examples\Becker_Science2010 to calculate the prediction profile for observable Epo_mem_cpm at time point t = 70.

Setup; % Set up model and data
InitVPL(1,1,2,70,0.02); % Set sigma = 0.02 manually 
ar.vpl.config.prediction = 1; % Sample the prediction profile until threshold 
ar.vpl.config.chi2dif_max = 0.4; % Increase allowed VPL change per step
VPL; % Calculate the validation profile
vplPlot; % Visualize the result
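
After the run, one way to read an approximate confidence interval off the prediction profile is to look for the prediction values at which the profile crosses the threshold. This is a sketch that assumes ar.vpl.results.pred and ar.vpl.results.ppl are vectors and that ar.vpl.config.chi2max is the relevant threshold; verify these assumptions against your ar.vpl struct.

pred = ar.vpl.results.pred;              % sampled prediction values
ppl  = ar.vpl.results.ppl;               % corresponding prediction profile values
inCI = ppl <= ar.vpl.config.chi2max;     % points below the confidence threshold (assumed comparable)
fprintf('Approximate 95%% CI: [%g, %g]\n', min(pred(inCI)), max(pred(inCI)));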