<div style="float:right">
    <table>
    <tr>
        <td> <img src="../figs/pangeos-small-1.png" alt="PANGEOS" style="width:200px;height:45px;"/> 
        <td> <img src="../figs/kcl_logo.png" alt="King's College London" style="width:54px;height:40px;"/> 
        <td> <img src="../figs/nceo_logo.png" alt="NCEO" style="width:200px;height:40px;"/> 
        <td> <img src="../figs/multiply_logo.png" alt="H2020 Multiply" style="width:40px;height:40px;"/>
    </tr>
    </table>
</div>
&nbsp;

# Biophysical parameter retrieval from Sentinel 2-like data

Author: J Gómez-Dans (`jose.gomez-dans@kcl.ac.uk`)

This notebook demonstrates the effect of different sources of uncertainty on biophysical parameter retrievals. Biophysical parameter retrieval is a so-called *inverse problem*: radiative transfer (RT) models allow us to predict optical (and/or thermal, SIF or microwave) observations from a description of the scene, but we are often interested in the opposite problem, that of retrieving the parameters of interest from EO observations.

In this notebook, we assume that the PROSAIL model is adequate for our goals. The assumption of a continuous canopy is often met for grasslands and mature cereal/grassy crops, but it's probably a stretch for forests. 
You should carefully evaluate the assumptions of your model prior to using it!

## Data modelling

We will try to model a realistic acquisition scenario, simulating some of the processes that affect the data gathered by a spaceborne sensor. Due to its wide availability and generally excellent performance, we will focus on Sentinel 2, but note that you could easily modify this to work with other sensors.

The procedure is as follows:
1. Simulate *true* surface reflectance using the PROSAIL model (use 9 S2 bands)
2. Propagate the true surface reflectance to top of the atmosphere, and add some (thermal) noise.
3. Perform an uncertain atmospheric correction to retrieve surface reflectance.
4. Invert the PROSAIL model taking into account the uncertainties, using different prior constraints.
5. Visualise spectra, uncertainties and parameters

### Data simulation

We will start by using the PROSAIL model (PROSPECT-D and SAIL) to simulate reflectance over the 400-2500 nm range. The resulting spectrum is then integrated over the Sentinel 2 land bands.
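
The band integration step can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the spectrum is a synthetic stand-in rather than a PROSAIL output, and the band centres and widths are approximate, illustrative values paired with a boxcar response instead of the real Sentinel 2 spectral response functions.

```python
import numpy as np

# 1 nm wavelength grid covering the 400-2500 nm range
wavelengths = np.arange(400, 2501)

# Synthetic stand-in spectrum (NOT a PROSAIL output): low in the visible,
# with a smooth "red edge" rise around 720 nm
reflectance = 0.05 + 0.4 / (1.0 + np.exp(-(wavelengths - 720.0) / 15.0))

# Illustrative band definitions as (centre, full width) in nm.
# Assumption: boxcar responses, not the real sensor response functions.
bands = {"B4": (665.0, 30.0), "B8A": (865.0, 20.0), "B11": (1610.0, 90.0)}


def band_average(wv, rho, centre, width):
    """Response-weighted mean of rho over a boxcar band."""
    srf = ((wv >= centre - width / 2) & (wv <= centre + width / 2)).astype(float)
    return np.sum(srf * rho) / np.sum(srf)


band_reflectance = {
    name: band_average(wavelengths, reflectance, centre, width)
    for name, (centre, width) in bands.items()
}
```

With real response functions, `srf` would simply be replaced by the tabulated response interpolated onto the same wavelength grid.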

### Propagate to top of the atmosphere

Assuming a Lambertian surface-atmosphere coupling for simplicity, we will use estimates of aerosol optical thickness (AOT) and total column water vapour (TCWV) to propagate the surface reflectance to the top of the atmosphere (TOA). This is done via a look-up table previously created using the 6S model. At this point the measurements have been propagated, but they carry no uncertainty, so at this stage we also add some noise, given as a percentage (identical for all bands).
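
As a rough sketch of this step, the standard Lambertian coupling expression relates surface and TOA reflectance through a path reflectance, a total transmittance and an atmospheric spherical albedo. The per-band coefficient values below are invented for illustration (in the notebook they would come from the 6S look-up table), and we assume that the noise percentage is relative to the TOA signal.

```python
import numpy as np


def surface_to_toa(rho_surface, path_reflectance, transmittance, spherical_albedo):
    """Lambertian coupling: surface reflectance -> TOA reflectance."""
    return path_reflectance + transmittance * rho_surface / (
        1.0 - spherical_albedo * rho_surface
    )


def toa_to_surface(rho_toa, path_reflectance, transmittance, spherical_albedo):
    """Inverse of surface_to_toa (the atmospheric correction step)."""
    y = (rho_toa - path_reflectance) / transmittance
    return y / (1.0 + spherical_albedo * y)


rng = np.random.default_rng(42)

# "True" surface reflectance in four illustrative bands
rho_surface = np.array([0.04, 0.06, 0.45, 0.25])

# Hypothetical atmospheric coefficients per band (NOT from a 6S look-up table)
path = np.array([0.08, 0.04, 0.01, 0.005])
trans = np.array([0.75, 0.85, 0.92, 0.95])
sph_albedo = np.array([0.15, 0.08, 0.04, 0.02])

rho_toa = surface_to_toa(rho_surface, path, trans, sph_albedo)

# Assumption: noise standard deviation is a fixed fraction of the TOA signal,
# with the same fraction used for every band
noise_fraction = 0.02
rho_toa_noisy = rho_toa + rng.normal(0.0, noise_fraction * rho_toa)
```

Applying `toa_to_surface` with the same coefficients to the noise-free `rho_toa` recovers `rho_surface` exactly, which is a useful sanity check before adding noise or perturbing the atmosphere.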

### Uncertain atmospheric correction

Atmospheric correction schemes work by first inferring atmospheric composition (mostly AOT and TCWV) and then removing its effect from the TOA signal. Following [Gorroño et al (2024)](https://doi.org/XXXXX), we will assume an imperfect estimate of atmospheric composition, with quantified errors in AOT and TCWV. Proper atmospheric correction schemes, such as [SIAC](https://doi.org/XXXX), do provide a per-pixel uncertainty. 

We will sample from the uncertain AOT and TCWV estimates and generate a set of corrected surface reflectances (an *ensemble*) that we can use to calculate uncertainty statistics.
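
The ensemble idea can be sketched as below. The function `toy_atmosphere` is a hypothetical stand-in for the look-up table query, with made-up dependencies on AOT and TCWV. The means and standard deviations are rounded from the values printed by the notebook run, and drawing the samples from independent Gaussians is our assumption.

```python
import numpy as np


def toy_atmosphere(aot, tcwv):
    """Hypothetical coefficients for 3 bands (stand-in for a 6S LUT query)."""
    aot = np.atleast_1d(aot)[:, None]    # shape (n_samples, 1)
    tcwv = np.atleast_1d(tcwv)[:, None]
    aerosol_scale = np.array([1.0, 0.5, 0.2])   # aerosol effect per band
    wv_absorption = np.array([0.0, 0.005, 0.02])  # water vapour effect per band
    path = 0.1 * aot * aerosol_scale
    trans = np.exp(-0.5 * aot * aerosol_scale) * np.exp(-wv_absorption * tcwv)
    sph_albedo = 0.2 * aot * aerosol_scale
    return path, trans, sph_albedo


def correct(rho_toa, aot, tcwv):
    """Lambertian atmospheric correction for each (AOT, TCWV) sample."""
    path, trans, sph_albedo = toy_atmosphere(aot, tcwv)
    y = (rho_toa - path) / trans
    return y / (1.0 + sph_albedo * y)


rng = np.random.default_rng(0)
n_samples = 500

# Assumption: independent Gaussian errors, clipped to stay physically plausible
aot_samples = np.clip(rng.normal(0.15, 0.045, n_samples), 0.01, None)
tcwv_samples = np.clip(rng.normal(5.05, 1.18, n_samples), 0.1, None)

rho_toa = np.array([0.06, 0.09, 0.35])  # illustrative TOA reflectances

ensemble = correct(rho_toa, aot_samples, tcwv_samples)  # (n_samples, n_bands)
mean_rho = ensemble.mean(axis=0)
cov_rho = np.cov(ensemble, rowvar=False)  # band-to-band covariance
```

The off-diagonal terms of `cov_rho` matter: an error in AOT moves all bands together, so the resulting uncertainty is correlated across bands rather than white.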

### Invert the RT model

The model inversion is done using a simple Metropolis-Hastings MCMC sampler. This is slow, but it is a flexible tool that lets you explore all the assumptions. The inversion sets up a log-likelihood in which we assume that the surface reflectance is corrupted by additive zero-mean Gaussian noise. The covariance of this noise is the sum of the covariance of the atmospheric correction ensemble and the (white) thermal noise.
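
A minimal random-walk Metropolis sketch of this setup is given below. Up to a constant, the Gaussian log-likelihood is $-\frac{1}{2}\mathbf{r}^\top\mathbf{C}^{-1}\mathbf{r}$, where $\mathbf{r}$ is the residual between observed and modelled reflectance and $\mathbf{C}$ is the sum of the two covariance terms. The linear `forward` function is a toy stand-in for the RT model, and all numbers (covariances, bounds, step sizes) are illustrative assumptions, not the notebook's settings.

```python
import numpy as np

rng = np.random.default_rng(1)


def forward(x):
    """Toy linear stand-in for the RT model (NOT PROSAIL): 2 params -> 3 bands."""
    jacobian = np.array([[0.02, 0.10], [0.05, 0.30], [0.08, 0.10]])
    return jacobian @ x


# Total covariance: atmospheric correction term plus white thermal noise
cov_atm = np.diag([2e-4, 1e-4, 1e-4])  # illustrative values
sigma_thermal = 0.005
cov = cov_atm + sigma_thermal**2 * np.eye(3)
cov_inv = np.linalg.inv(cov)

# Synthetic observation generated from known parameters
x_true = np.array([3.0, 0.5])
y_obs = forward(x_true) + rng.multivariate_normal(np.zeros(3), cov)

lower, upper = np.array([0.0, 0.0]), np.array([8.0, 1.0])


def log_posterior(x):
    """Gaussian log-likelihood inside the parameter bounds, -inf outside."""
    if np.any(x < lower) or np.any(x > upper):
        return -np.inf
    residual = y_obs - forward(x)
    return -0.5 * residual @ cov_inv @ residual


n_iter = 5000
step = np.array([0.2, 0.03])  # proposal standard deviations
chain = np.empty((n_iter, 2))
x = np.array([4.0, 0.5])
logp = log_posterior(x)
n_accept = 0
for i in range(n_iter):
    proposal = x + step * rng.normal(size=2)
    logp_new = log_posterior(proposal)
    # Symmetric proposal, so the acceptance ratio is the posterior ratio
    if np.log(rng.uniform()) < logp_new - logp:
        x, logp = proposal, logp_new
        n_accept += 1
    chain[i] = x

posterior = chain[n_iter // 5:]  # discard the first 20% as burn-in
```

The acceptance fraction `n_accept / n_iter` is worth checking: values very close to 0 or 1 suggest that the proposal step sizes need tuning.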

We can complement the log-likelihood with different prior assumptions, from the very naive to the more sophisticated. The naive assumption is just a uniform distribution for all parameters (that is, only the parameter boundaries), and the more realistic priors are derived from the archetype work of [Yin et al (2024)]() for wheat canopies. These provide a mean vector and an associated covariance matrix for a:
* generic wheat crop
* early season crop
* mid season crop
* late season crop
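
The two kinds of prior can be sketched as log-prior functions that are added to the log-likelihood. The example below uses only two parameters (LAI and Cab) with the boundaries from the example dictionary in this notebook. The mean vector and covariance matrix are invented for illustration and are not the archetype values.

```python
import numpy as np

# Parameter order: (LAI, Cab)
lower = np.array([0.1, 0.0])
upper = np.array([8.0, 120.0])


def log_prior_uniform(x):
    """Naive prior: flat inside the boundaries, impossible outside."""
    inside = np.all((x >= lower) & (x <= upper))
    return 0.0 if inside else -np.inf


# Made-up mean and covariance for illustration (NOT the archetype values)
prior_mean = np.array([3.0, 45.0])
prior_cov = np.array([[1.0, 3.0], [3.0, 100.0]])
prior_cov_inv = np.linalg.inv(prior_cov)


def log_prior_gaussian(x):
    """Multivariate Gaussian prior (up to a constant), truncated at the bounds."""
    if not np.isfinite(log_prior_uniform(x)):
        return -np.inf
    d = x - prior_mean
    return -0.5 * d @ prior_cov_inv @ d
```

Switching between a generic, early, mid or late season prior then amounts to swapping `prior_mean` and `prior_cov` while leaving the likelihood untouched.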

In [1]:
%load_ext autoreload
%autoreload 2
from pangeos_uq.param_retrieval_gui import create_prosail_gui

In [None]:
# Parameter boundaries as (min, max) pairs, passed to the GUI builder below
boundaries_example = {
    "N": (1.0, 3.0),
    "LAI": (0.1, 8.0),
    "ALA": (0.0, 90.0),
    "Cab": (0.0, 120.0),
    "Cw": (0.0, 0.06),
    "Cm": (0.0, 0.02),
    "Cbrown": (0.0, 1.0),
    "psoil": (0.0, 1.0),
    "rsoil": (0.0, 1.0),
    "sza": (0.0, 90.0),
    "vza": (0.0, 90.0),
    "raa": (0.0, 180.0),
    "AOT": (0.01, 2.0),
    "TCWV": (0.1, 10.0),
}

create_prosail_gui(boundaries_example)

atmospheric_parameters(LUT=<pangeos_uq.sixs_lut.LUTQuery object at 0x7aa108687bd0>, AOT=0.1513561248436208, TCWV=5.050000000000001, AOT_unc=0.04551179505629651, TCWV_unc=1.1800000000000002)


Sampling: 100%|██████████| 50000/50000 [03:24<00:00, 244.27it/s]


Saved posterior samples to 2024-09-23T22:12:39.620216_posterior_samples.npz


<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.