# Error fields
This notebook illustrates the error field computation using different techniques, including the so-called *Clever Poor Man's Error* (CPME) method.

In [None]:
using DIVAnd
using PyPlot
using Dates
using Statistics
using LinearAlgebra
include("../config.jl")

## Data reading
Download the file (if not already done) and read it.

In [None]:
varname = "Salinity"
filename = salinityprovencalfile
download_check(salinityprovencalfile, salinityprovencalfileURL)

In [None]:
obsval,obslon,obslat,obsdepth,obstime,obsid = loadobs(Float64, salinityprovencalfile, varname);

## Topography and grid definition
See the notebook on [bathymetry](../2-Preprocessing/06-topography.ipynb) for more explanations.

Define domain and resolution, create the domain.

In [None]:
dx = dy = 0.125/2.
lonr = 2.5:dx:12.
latr = 42.3:dy:44.6

mask,(pm,pn),(xi,yi) = DIVAnd_rectdom(lonr,latr);

Download the bathymetry file and load it.

In [None]:
bathname = gebco04file
download_check(gebco04file, gebco04fileURL)

In [None]:
bx,by,b = load_bath(bathname,true,lonr,latr);

Create a land-sea mask based on the bathymetry.

In [None]:
# Mark as sea the grid points where the bathymetry value `b` is at least 1
mask = falses(size(b,1),size(b,2))

for j = 1:size(b,2)
    for i = 1:size(b,1)
        mask[i,j] = b[i,j] >=1.0
    end
end
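
The same mask can also be written as a single broadcast comparison. The sketch below checks this on a small synthetic bathymetry; `b_demo` and its values are hypothetical, and it assumes (as the `>= 1.0` threshold suggests) that `b` is positive in the water.

```julia
# Hypothetical bathymetry values: positive in the water, negative or zero on land
b_demo = [-20.0   0.5    3.0;
           -5.0   1.0   50.0;
            0.0  10.0  200.0]

# Loop version, same logic as the cell above
mask_loop = falses(size(b_demo))
for j in axes(b_demo, 2), i in axes(b_demo, 1)
    mask_loop[i, j] = b_demo[i, j] >= 1.0
end

# Broadcast version: element-wise comparison returns a BitMatrix
mask_bc = b_demo .>= 1.0

@show mask_bc
@show mask_loop == mask_bc
```

Both forms give the same result; the broadcast form is shorter, the loop form makes the per-point test explicit.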

## Data selection for example

Cross-validation, error calculations, etc. assume independent data. Hence, do not use high-resolution vertical profiles with all their data, but restrict yourself to a specific, small depth range. Here we limit ourselves to August data at the surface:

In [None]:
sel = (obsdepth .< 1) .& (Dates.month.(obstime) .== 8)

obsval = obsval[sel]
obslon = obslon[sel]
obslat = obslat[sel]
obsdepth = obsdepth[sel]
obstime = obstime[sel]
obsid = obsid[sel];
@show (size(obsval))
checkobs((obslon,obslat,obsdepth,obstime),obsval,obsid)
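
To see how the selection mask is built, here is a minimal sketch on a handful of made-up observations. The arrays `depth_demo`, `time_demo` and `val_demo` are hypothetical and only serve to illustrate the element-wise `.<`, `.==` and `.&` operators.

```julia
using Dates

# Hypothetical observations: depth (m), time and salinity value
depth_demo = [0.5, 0.2, 10.0, 0.8, 0.1]
time_demo  = [DateTime(2001, 8, 3), DateTime(2001, 7, 15), DateTime(2005, 8, 20),
              DateTime(2010, 8, 1), DateTime(1999, 8, 31)]
val_demo   = [38.1, 37.9, 38.3, 38.0, 37.8]

# Keep observations shallower than 1 m AND taken in August (any year)
sel_demo = (depth_demo .< 1) .& (Dates.month.(time_demo) .== 8)

@show sel_demo
@show val_demo[sel_demo]
```

The second observation is rejected because it was taken in July, the third because it is too deep.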

### Analysis
The analysis parameters have been calibrated in the notebook [13-processing-parameter-optimization](13-processing-parameter-optimization).

⚠ If the statistical parameters are incorrectly estimated, the error fields are meaningless and only give an idea of the data coverage.

The analysis parameters are:

In [None]:
len = 0.3
epsilon2 = 1.0;

Analysis `fi` using mean data as background.    
Structure `s` is stored for later use in error calculation.

In [None]:
fi, s = DIVAndrun(mask,(pm,pn),(xi,yi),(obslon,obslat),obsval.-mean(obsval),len,epsilon2);

Create a simple plot of the analysis (the mean is added back to the anomalies).

In [None]:
pcolor(xi,yi,fi.+mean(obsval),vmin=37,vmax=38.5);
colorbar(orientation="horizontal")
contourf(bx,by,copy(b'), levels = [-1e5,0],colors = [[.5,.5,.5]])
aspectratio = 1/cos(mean([ylim()...]) * pi/180)
gca().set_aspect(aspectratio)
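
The aspect ratio used above compensates for the convergence of meridians: one degree of longitude is shorter than one degree of latitude by a factor equal to the cosine of the latitude. A minimal sketch, assuming the plot limits coincide with the latitude range of the grid (the names `latrange_demo`, `meanlat` and `aspect_demo` are illustrative):

```julia
using Statistics

latrange_demo = (42.3, 44.6)        # latitude limits of the grid defined above
meanlat = mean(latrange_demo)       # mean latitude in degrees
aspect_demo = 1 / cosd(meanlat)     # cosd takes its argument in degrees

@show meanlat
@show aspect_demo
```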

## Exact error and approximations

Details can be found in the publication:

Beckers, Jean-Marie; Barth, Alexander; Troupin, Charles; Alvera-Azcarate, A. (2014). Approximate and Efficient Methods to Assess Error Fields in Spatial Gridding with Data Interpolating Variational Analysis (DIVA). *Journal of Atmospheric & Oceanic Technology*, **31(2)**, 515-530.     
https://orbi.uliege.be/handle/2268/161069      
https://journals.ametsoc.org/doi/abs/10.1175/JTECH-D-13-00130.1

In the 2D case you can try to calculate the exact error expression. This requires the computationally expensive evaluation of `diag(s.P)`, where `s` is the structure returned by the analysis. This is only available with `DIVAndrun`.
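
To give a feeling for what the diagonal of the error covariance represents, the sketch below evaluates the analogous optimal interpolation expression, diag(P) = diag(B) - diag(B_go (B_oo + ε² I)⁻¹ B_goᵀ), on a toy 1D grid. It is not DIVAnd's implementation: the Gaussian covariance `gausscov`, the grid and the observation positions are all assumptions made for illustration.

```julia
using LinearAlgebra

# Assumed Gaussian covariance with unit background variance (not DIVAnd's kernel)
gausscov(x, y, L) = exp(-(x - y)^2 / L^2)

xgrid = collect(0.0:0.05:2.0)    # toy 1D analysis grid
xobs  = [0.4, 0.5, 1.5]          # hypothetical observation positions
L_demo, eps2_demo = 0.3, 1.0     # same values as `len` and `epsilon2` above

Bgo = [gausscov(x, y, L_demo) for x in xgrid, y in xobs]   # grid-to-observation covariance
Boo = [gausscov(x, y, L_demo) for x in xobs,  y in xobs]   # observation-to-observation covariance

# Row-wise sums give the diagonal of Bgo * inv(Boo + eps2*I) * Bgo'
errvar = 1 .- vec(sum((Bgo / (Boo + eps2_demo * I)) .* Bgo, dims = 2))

@show extrema(errvar)
```

The error variance is lowest near the observations and tends towards the background variance (here 1) far away from them.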

In [None]:
# plots the error field `exerr`
function ploterr(exerr; vmin=0, vmax=1.5, cmap="hot_r")
    pcolor(xi,yi,exerr,vmin=vmin, vmax=vmax, cmap=cmap);
    colorbar(orientation="horizontal")
    contourf(bx,by,copy(b'), levels = [-1e5,0],colors = [[.5,.5,.5]])
    plot(obslon, obslat, "k.", markersize=.5)
    ylim(extrema(yi))
    gca().set_aspect(1/cos(mean([ylim()...]) * pi/180))
end

In [None]:
exerr, = statevector_unpack(s.sv,diag(s.P),NaN)
ploterr(exerr)
title("Unscaled error");

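Relative error: scale the error by the background error variance `Berr`, which is estimated by running the analysis with a very large `epsilon2` (i.e. data with very high errors, which therefore hardly constrain the analysis).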

In [None]:
epsilon2huge=1E6
fib,sb = DIVAndrun(mask,(pm,pn),(xi,yi),(obslon,obslat),obsval,len,epsilon2huge);
Berr,= statevector_unpack(sb.sv,diag(sb.P));

ploterr(exerr./Berr)
title("Scaled error");
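
Why does a huge `epsilon2` yield the background variance? With very noisy observations, the analysis ignores the data and the error falls back to that of the background. The toy sketch below illustrates this with an optimal interpolation formula on a 1D grid. Everything in it is an assumption for illustration: the Gaussian correlation, the spatially varying background standard deviation `sigma`, and the observation noise variance taken directly as `eps2`.

```julia
using LinearAlgebra

# Hypothetical background standard deviation, varying in space
sigma(x) = 1 + 0.5x
# Assumed covariance: Gaussian correlation scaled by sigma at both points
cov_demo(x, y, L) = sigma(x) * sigma(y) * exp(-(x - y)^2 / L^2)

# Diagonal of the analysis error covariance on `xgrid` for observations at `xobs`
function errvar_demo(xgrid, xobs, L, eps2)
    Bgo = [cov_demo(x, y, L) for x in xgrid, y in xobs]
    Boo = [cov_demo(x, y, L) for x in xobs, y in xobs]
    return sigma.(xgrid) .^ 2 .- vec(sum((Bgo / (Boo + eps2 * I)) .* Bgo, dims = 2))
end

xgrid = collect(0.0:0.05:2.0)
xobs  = [0.4, 0.5, 1.5]

exerr_demo  = errvar_demo(xgrid, xobs, 0.3, 1.0)   # unscaled error variance
Berr_demo   = errvar_demo(xgrid, xobs, 0.3, 1e6)   # observations nearly ignored
relerr_demo = exerr_demo ./ Berr_demo              # relative error

@show maximum(abs.(Berr_demo .- sigma.(xgrid) .^ 2))
@show extrema(relerr_demo)
```

In this toy setting the run with `eps2 = 1e6` recovers the background variance almost exactly, and the scaled error is a relative quantity that stays between 0 and 1.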

In [None]:
cpme = DIVAnd_cpme(mask,(pm,pn),(xi,yi),(obslon,obslat),obsval,len,epsilon2)

ploterr(cpme)
title("Clever poor man's error");
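
The idea behind the poor man's error can be sketched with a toy optimal interpolation: analyse a data vector of ones and subtract the result from 1. This avoids the costly diagonal of the error covariance, but is too optimistic away from the data. The "clever" variant repeats the same analysis with a shortened correlation length. The sketch below assumes a Gaussian correlation, for which squaring the kernel is equivalent to dividing the length by √2; DIVAnd's kernel is not Gaussian, so the scaling used inside `DIVAnd_cpme` is presumably different. All function names here are hypothetical.

```julia
using LinearAlgebra

gausscorr(x, y, L) = exp(-(x - y)^2 / L^2)   # assumed Gaussian correlation, unit variance

# Analysis weights (maps observation values to the grid) and grid-to-observation covariance
function oi_gain(xgrid, xobs, L, eps2)
    Bgo = [gausscorr(x, y, L) for x in xgrid, y in xobs]
    Boo = [gausscorr(x, y, L) for x in xobs, y in xobs]
    return Bgo / (Boo + eps2 * I), Bgo
end

# Exact relative error variance
function exact_error(xgrid, xobs, L, eps2)
    K, Bgo = oi_gain(xgrid, xobs, L, eps2)
    return 1 .- vec(sum(K .* Bgo, dims = 2))
end

# Poor man's error: one minus the analysis of a data vector of ones
poorman_error(xgrid, xobs, L, eps2) =
    1 .- oi_gain(xgrid, xobs, L, eps2)[1] * ones(length(xobs))

# "Clever" variant: same analysis with the length divided by sqrt(2) (Gaussian assumption)
clever_error(xgrid, xobs, L, eps2) =
    max.(poorman_error(xgrid, xobs, L / sqrt(2), eps2), 0)

xgrid = collect(0.0:0.05:2.0)
xobs  = [0.4, 0.5, 1.5]          # hypothetical observation positions
L_demo, eps2_demo = 0.3, 1.0

err_exact  = exact_error(xgrid, xobs, L_demo, eps2_demo)
err_poor   = poorman_error(xgrid, xobs, L_demo, eps2_demo)
err_clever = clever_error(xgrid, xobs, L_demo, eps2_demo)

@show maximum(abs.(err_poor .- err_exact))
@show maximum(abs.(err_clever .- err_exact))
```

For a single isolated observation the clever variant reproduces the exact error under the Gaussian assumption; with clustered observations it is only an approximation, but in this toy case it stays much closer to the exact error than the plain poor man's error.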

Do you see any difference between the exact error and the clever poor man's error?

## Difference between error fields
We also overlay the data positions.

In [None]:
ploterr(cpme-exerr./Berr,vmin=-0.2, vmax=0.2, cmap="RdBu_r")
title("Error on error");

Another approximation to the error field: the almost exact error (AEXERR).

In [None]:
myerr,bjmb,fa,sa = DIVAnd_aexerr(mask,(pm,pn),(xi,yi),(obslon,obslat),obsval,len,epsilon2)
if myerr==0
    @error("No need to approximate error, use direct calculation")
else
    ploterr(myerr)
    title("Unscaled almost exact error")
end;

Scaled AEXERR error

In [None]:
if myerr==0
    @error("No need to approximate error, use direct calculation")
else
    ploterr(myerr./bjmb)
    title("Scaled almost exact error")
end;

# Exercise 
1. Modify the (L, $\epsilon^2$) parameters.
2. Re-run the analysis.
3. See how the error field behaves.

# Conclusion
In view of the uncertainties on statistical parameters (L, $\epsilon^2$), the *clever poor man's error* is generally a sufficient approximation for the error fields.     
This is the one implemented in the `DIVAndgo` high-level analysis function. 