Validation

We must always test the performance of a linear predictor, a procedure known as 'validation'. In leave-p-out cross-validation, instead of using a validation set distinct from $X$, we partition $X$ into a set containing $p$ elements, $X_{P}$, and a set containing $n - p$ elements, $X_{-P}$. We construct a predictor for each $X_{i} \in X_{P}$ based on $X_{-P}$, namely $\hat{X}_{i}$.
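The partitioning step can be sketched in a few lines of Python. This is only an illustration of the index bookkeeping; the helper name `leave_p_out_splits` is hypothetical and is not part of PyMimic.

```python
from itertools import combinations


def leave_p_out_splits(n, p):
    """Yield (held_out, retained) index tuples for leave-p-out cross-validation."""
    indices = range(n)
    for held_out in combinations(indices, p):
        held = set(held_out)
        # Every index that is not held out is used to build the predictor.
        retained = tuple(i for i in indices if i not in held)
        yield held_out, retained


# All ways of holding out p = 2 of n = 4 sample points.
splits = list(leave_p_out_splits(4, 2))
for held_out, retained in splits:
    print(held_out, retained)
```

The number of partitions is the binomial coefficient of n and p, which grows quickly with p, whereas p = 1 needs only n partitions.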

Commonly, $p = 1$. We denote the $j$-th leave-one-out linear predictor of $X_{j}$ by $\hat{X}_{-j}$. Similarly, we denote the $j$-th leave-one-out linear predictor of $Z_{j}$ by $\hat{Z}_{-j}$. The leave-one-out cross-validation linear predictor residual is


$$e(\hat{Z}_{- j}) = Z_{j} - \hat{Z}_{- j},$$

and the standardized leave-one-out cross-validation linear predictor residual is

$$d(\hat{Z}_{- j}) = \dfrac{Z_{j} - \hat{Z}_{- j}}{\sqrt{\operatorname{var}(\hat{Z}_{- j})}}.$$
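As a rough illustration, the sketch below computes leave-one-out residuals and their variances by brute force, refitting a zero-mean linear predictor once per held-out point. The squared-exponential covariance, the helper names `sq_exp` and `loo_brute_force`, and the toy data are all assumptions made for this example; none of them is PyMimic's API. Here the held-out target is the noisy observation itself, and the variance used for standardization is the mean squared error of the prediction.

```python
import numpy as np


def sq_exp(s, t, length=1.0):
    """Squared-exponential covariance (an assumed kernel, for illustration)."""
    return np.exp(-0.5 * ((s[:, None] - t[None, :]) / length) ** 2)


def loo_brute_force(t, x, noise_var):
    """Leave-one-out residuals and error variances, refitting n times."""
    n = len(x)
    K = sq_exp(t, t) + noise_var * np.eye(n)  # covariance of the observations
    residuals = np.empty(n)
    variances = np.empty(n)
    for j in range(n):
        keep = np.arange(n) != j
        K_mm = K[np.ix_(keep, keep)]      # covariance of the retained points
        k_j = K[keep, j]                  # covariance with the held-out point
        w = np.linalg.solve(K_mm, k_j)    # weights of the linear predictor
        residuals[j] = x[j] - w @ x[keep]
        variances[j] = K[j, j] - w @ k_j  # mean squared prediction error
    return residuals, variances


rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 8)                   # toy inputs (made up)
x = np.sin(t) + 0.1 * rng.standard_normal(8)   # toy noisy observations
res, var = loo_brute_force(t, x, noise_var=0.1**2)
d = res / np.sqrt(var)                         # standardized residuals
```

Refitting n times is wasteful for large samples; for this zero-mean setting the same quantities can be read off the inverse covariance matrix, which is what the brute-force loop is checked against in practice.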

The behaviour of the observed predictor residuals should be consistent with the assumptions we have made about their distribution.

A good summary statistic is provided by the leave-one-out cross-validation score

$$R^{2} = \dfrac{1}{n}\sum_{i}d(\hat{Z}_{- i})^{2}.$$
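The score can be evaluated directly from the residuals and their variances. The sketch below follows the formula as written above; the function name `loocv_score` and the three toy values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np


def loocv_score(residuals, variances):
    """Mean of the squared standardized leave-one-out residuals."""
    residuals = np.asarray(residuals, dtype=float)
    variances = np.asarray(variances, dtype=float)
    d = residuals / np.sqrt(variances)  # standardized residuals
    return float(np.mean(d**2))


# Toy values: each residual is exactly one standard deviation in size.
score = loocv_score([1.0, -2.0, 3.0], [1.0, 4.0, 9.0])
print(score)
```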

PyMimic and validation

The leave-one-out (LOO) residuals and their variances are stored as the attribute loocv belonging to the classes Blp and Blup. The module mim.plot contains the function diagnostic(), which plots (1) the standardized LOO residuals against the LOO predictions and (2) the LOO predictions against their true values.

Let us validate the predictor we used in fitting a curve to a noisy sample of the Branin function.

Generate the sample.

python

    >>> import numpy as np
    >>> import pymimic as mim
    >>> bounds = [[-5., 10.], [0., 15.]]
    >>> ttrain = mim.design(bounds)
    >>> # Branin values plus Gaussian noise of standard deviation 10
    >>> xtrain = mim.testfunc.branin(*ttrain.T) + 10.*np.random.randn(20)

Then create a BLUP object.

python

    >>> blup = mim.Blup(ttrain, xtrain, 10.**2.)
    >>> blup.opt()
       direc: array([[1., 0., 0.],
           [0., 1., 0.],
           [0., 0., 1.]])
         fun: 169.40632097688868
     message: 'Optimization terminated successfully.'
        nfev: 146
         nit: 3
      status: 0
     success: True
           x: array([1.57069249, 0.08700307, 0.01322379])

Now compute the LOO prediction residuals and their variances.

python

    >>> blup.loocv
    (array([ 51.2065913 , -25.12770775,  19.39411263, -26.15146519,
              4.46734306, -15.74786283,  23.69329757,   4.60761066,
             11.70811329, -17.17898489,   2.20727369, -31.45013478,
             31.868371  ,  11.49837525,  16.08973502, -11.11629167,
            -10.78144937, -18.14250451,  -3.22056033,   8.3367355 ]),
     array([999.76676916, 904.42982646, 888.29496818, 382.93648001,
            817.77977058, 279.85877825, 267.51340703, 712.60225493,
            267.13441485, 426.51943097, 426.51943097, 267.13441485,
            712.60225493, 267.51340703, 279.85877825, 817.77977058,
            382.93648001, 888.29496818, 904.42982646, 999.76676916]))
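The standardized residuals can be recovered from this tuple with plain NumPy. The sketch below copies the printed values and assumes, following the description of loocv above, that the first array holds the raw residuals and the second their variances; it does not call PyMimic.

```python
import numpy as np

# Values copied from the blup.loocv output shown above.
residuals = np.array([
    51.2065913, -25.12770775, 19.39411263, -26.15146519,
    4.46734306, -15.74786283, 23.69329757, 4.60761066,
    11.70811329, -17.17898489, 2.20727369, -31.45013478,
    31.868371, 11.49837525, 16.08973502, -11.11629167,
    -10.78144937, -18.14250451, -3.22056033, 8.3367355,
])
variances = np.array([
    999.76676916, 904.42982646, 888.29496818, 382.93648001,
    817.77977058, 279.85877825, 267.51340703, 712.60225493,
    267.13441485, 426.51943097, 426.51943097, 267.13441485,
    712.60225493, 267.51340703, 279.85877825, 817.77977058,
    382.93648001, 888.29496818, 904.42982646, 999.76676916,
])

# Divide each residual by its standard deviation.
d = residuals / np.sqrt(variances)
print(np.round(d, 2))
```

For the values printed above, all twenty standardized residuals have magnitude below 2.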

Also compute the LOOCV score.

python

    >>> blup.R2
    9992.090822466882

Now plot the results.

python

    >>> mim.plot.diagnostic(xtrain, *blup.loocv)

The result is shown in branin_blp.

Left: the standardized LOO residuals plotted against the LOO predictions. Right: the LOO predictions plotted against their true values.

The standardized LOO residuals are small and randomly distributed. When plotted against their true values, the predictions lie around the diagonal. So we say, in this case, that the predictor has passed validation, and that we may trust the fitted curve and its associated prediction interval.