Skip to content

Commit

Permalink
Docs on how to use the emulator on PROSAIL
Browse files Browse the repository at this point in the history
  • Loading branch information
Jose Gomez-Dans committed Oct 31, 2018
1 parent 04743e7 commit 2680433
Show file tree
Hide file tree
Showing 6 changed files with 706 additions and 126 deletions.
141 changes: 141 additions & 0 deletions docs/emulating_rt_model.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
Emulating a typical radiative transfer model
===============================================

This package was designed to emulate radiatie transfer models. The process entails the following steps:

1. Decide on what input parameters are required
2. Decide their ranges
3. Generate a training input parameter set
4. Run the model for each element of the input training set and store the outputs
5. Pass the input and output training pairs to the library, and let it fit the hyperparameters

We can show how this works with an example of the PROSPECT+SAIL model.

Setting the input parameter ranges
-----------------------------------

We can set the parameter names and their ranges simply by having lists with minimum and maximum values. This assumes a uniformly-distributed parameter distribution between those two boundaries, but other distributions are possible (we never had any reason to try them though!). We additionally set up the SRFs and other variables that need to be defined here... We train the model on 250 samples and test on (say) 100. 100 validation samples is probably too few, but for the sake of not waiting too much... ;-)

.. code-block:: python
from functools import partial
import numpy as np
import gp_emulator
import prosail
# Spectral band definition. Just a top hat, with start and
# end wavlengths as an example
b_min = np.array( [ 620., 841, 459, 545, 1230, 1628, 2105] )
b_max = np.array( [ 670., 876, 479, 565, 1250, 1652, 2155] )
wv = np.arange ( 400, 2501 )
passband = []
# Number of training and validation samples
n_train = 250
n_validate = 100
# Validation number is small, increase to a more realistic value
# if you want
# Define the parameter names and their ranges
# Note that we are working here in transformed coordinates...
# Define geometry. Each emulator is for one geometry
sza = 30.
vza = 0.
raa = 0. # in degrees
parameters = [ 'n', 'cab', 'car', 'cbrown', 'cw', 'cm', 'lai', 'ala', 'bsoil', 'psoil']
min_vals = [ 0.8 , 0.46301307, 0.95122942, 0. , 0.02829699,
0.03651617, 0.04978707, 0.44444444, 0. , 0.]
max_vals = [ 2.5 , 0.998002 , 1. , 1. , 0.80654144,
0.84366482, 0.99501248, 0.55555556, 2. , 1 ]
We then require a function for calling the RT model. In the case of PROSAIL, we can do that easily from Python, in other models available in e.g. Fortran, you could have a function that calls the external model


.. code-block:: python
def inverse_transform ( x ):
"""Inverse transform the PROSAIL parameters"""
x_out = x*1.
# Cab, posn 1
x_out[1] = -100.*np.log ( x[1] )
# Cab, posn 2
x_out[2] = -100.*np.log ( x[2] )
# Cw, posn 4
x_out[4] = (-1./50.)*np.log ( x[4] )
#Cm, posn 5
x_out[5] = (-1./100.)*np.log ( x[5] )
# LAI, posn 6
x_out[6] = -2.*np.log ( x[6] )
# ALA, posn 7
x_out[7] = 90.*x[7]
return x_out
def rt_model ( x, passband=None, do_trans=True ):
"""A coupled land surface/atmospheric model, predicting refl from
land surface parameters. Thisfunction provides estimates of refl for
a particular illumination geometry.
The underlying land surface reflectance spectra is simulated using
PROSAIL. The input parameter ``x`` is a vector with the following components:
* ``n``
* ``cab``
* ``car``
* ``cbrown``
* ``cw``
* ``cm``
* ``lai``
* ``ala``
* ``bsoil``
* ``psoil``
"""
x, sza, vza, raa = x
# Invert parameter LAI
if do_trans:
x = inverse_transform ( x )
################# surface refl with prosail #####################
surf_refl = prosail.run_prosail(x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7], 0.01, sza, vza, raa,
rsoil=x[8], psoil=x[9])
if passband is None:
return surf_refl
else:
return surf_refl[passband].mean()
Now we loop over all the bands, and prepare the emulators. we do this by using the `create_emulator_validation` function, that does everything you'd want to do... We just stuff the emulator, training and validation sets in one list for convenience.

.. code-block:: python
retval = []
for iband,bmin in enumerate ( b_min ):
# Looping over the bands....
print("Doing band %d" % (iband+1))
passband = np.nonzero( np.logical_and ( wv >= bmin, wv <= b_max[iband] ) )
# Define the SRF for the current band
# Define the simulator for convenience
simulator = partial (rt_model, passband=passband)
# Actually create the training and validation parameter sets, train the emulators
# and return all that
x = gp_emulator.create_emulator_validation (simulator, parameters, min_vals, max_vals,
n_train, n_validate, do_gradient=True,
n_tries=15, args=(30, 0, 0) )
retval.append (x)
A simple validation visualisation looks like this

.. figure:: prosail_emulator_new_300.png
:figwidth: 90%

Comparison between the simulated output and the corresponding emulator output for the validation dataset. Correlations (R2) are in all cases better than 0.99. Slope was between 0.97 and 1., whereas the bias term was smaller than 0.002.
2 changes: 2 additions & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ Welcome to gp_emulator's documentation!
:caption: Contents:

introduction.rst
quickstart.rst
emulating_rt_model.rst


Indices and tables
Expand Down
120 changes: 0 additions & 120 deletions docs/introduction.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,123 +28,3 @@ or just clone or download the source repository and invoke `setup.py` script: ::

python setup.py install
Quickstart
===========

Single output model emulation
----------------------------------

Assume that we have two arrays, ``X`` and ``y``. ``y`` is of size ``N``, and it stores the ``N`` expensive model outputs that have been produced by running the model on the ``N`` input sets of ``M`` input parameters in ``X``. We will try to emulate the model by learning from these two training sets: ::

gp = gp_emulator.GaussianProcess(inputs=X, targets=y)
Now, we need to actually do the training... ::

gp.learn_hyperparameters()

Once this process has been done, you're free to use the emulator to predict the model output for an arbitrary test vector ``x_test`` (size ``M``): ::

y_pred, y_sigma, y_grad = gp.predict (x_test, do_unc=True,
do_grad=True)
In this case, ``y_pred`` is the model prediction, ``y_sigma`` is the variance associated with the prediction (the uncertainty) and ``y_grad`` is an approximation to the Jacobian of the model around ``x_test``.

Let's see a more concrete example. We create a damped sine, add a bit of Gaussian noise, and then subsample a few points (10 in this case), fit the GP, and predict the function over the entire range. We also plot the uncertainty from this prediction.


.. plot::
:include-source:

import random
import numpy as np
import matplotlib.pyplot as plt

import gp_emulator

random.seed(111)
n_samples = 2000
x = np.linspace(0, 2, n_samples)
y = np.exp(-0.7*x)*np.sin(2*np.pi*x/0.9)
y += np.random.randn(n_samples)*0.02
plt.plot(x, y, '-', label="Original")
# Select a few random samples from x and y
isel = random.choices(range(n_samples), k=10)
x_train = np.atleast_2d(x[isel]).T
y_train = y[isel]
plt.plot(x_train[:,0], y_train, 'o', label="Samples")

gp = gp_emulator.GaussianProcess(x_train, y_train)
gp.learn_hyperparameters(n_tries=25)

y_pred, y_unc, _ = gp.predict(np.atleast_2d(x).T,
do_unc=True, do_deriv=False)
plt.plot(x, y_pred, '-', lw=2., label="Predicted")
plt.plot(x, np.exp(-0.7*x)*np.sin(2*np.pi*x/0.9), '-', label="True")
plt.fill_between(x, y_pred-1.96*y_unc,
y_pred+1.96*y_unc, color="0.8")
plt.legend(loc="best")


We can see that the GP is doing an excellent job in predicting the function, even in the presence of noise, and with a handful of sample points. In situations where there is extrapolation, this is indicated by an increase in the predictive uncertainty.



Multiple output emulators
--------------------------

In some cases, we can emulate multiple outputs from a model. For example, hyperspectral data used in EO can be emulated by employing the SVD trick and emulating the individual principal component weights. Again, we use ``X`` and ``y``. ``y`` is now of size ``N, P``, and it stores the ``N`` expensive model outputs (size ``P``) that have been produced by running the model on the ``N`` input sets of ``M`` input parameters in ``X``. We will try to emulate the model by learning from these two training sets, but we need to select a variance level for the initial PCA (in this case, 99%) ::

gp = gp_emulator.MultivariateEmulator (X=y, y=X, thresh=0.99)
Now, we're ready to use on a new point ``x_test`` as above: ::

y_pred, y_sigma, y_grad = gp.predict (x_test, do_unc=True,
do_grad=True)


A more concrete example: let's produce a signal that can be decomposed as a sum of scaled orthogonal basis functions...


.. plot::
:include-source:

import random
import numpy as np

from scipy.fftpack import dct

import matplotlib.pyplot as plt
import gp_emulator

random.seed(111)

n_validate = 250
n_train = 100
basis_functions = dct(np.eye(128), norm="ortho")[:, 1:4]

params=["w1", "w2", "w3"]
mins = [-1, -1, -1]
maxs = [1, 1, 1]


train_weights, dists = gp_emulator.create_training_set(params, mins, maxs,
n_train=n_train)
validation_weights = gp_emulator.create_validation_set(dists,
n_validate=n_validate)

training_set = (train_weights@basis_functions.T).T

training_set += np.random.randn(*training_set.shape)*0.0005
validation_set = (validation_weights@basis_functions.T).T
gp = gp_emulator.MultivariateEmulator (y=train_weights, X=training_set.T,
thresh=0.973, n_tries=25)
y_pred = np.array([gp.predict(validation_weights[i])[0]
for i in range(n_validate)])

fig, axs = plt.subplots(nrows=1, ncols=2,sharey=True,figsize=(12, 4))
axs[0].plot(validation_set[:, ::25])
axs[1].plot(10.*(y_pred.T - validation_set))
axs[0].set_title("Samples from validation dataset")
axs[1].set_title("10*Mismatch between validation simulator and emulator")

0 comments on commit 2680433

Please sign in to comment.