DAPPER is a set of templates for benchmarking the performance of data assimilation (DA) methods. The tests provide experimental support and guidance for new developments in DA.
The typical set-up is a synthetic (twin) experiment, where you
- specify a
  - dynamic model*
  - observational model*
- use these to generate a synthetic
  - "truth"
  - and observations thereof*
- assess how different DA methods perform in estimating the truth,
  given the above starred (*) items.
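The twin-experiment recipe above can be sketched in a few lines of plain NumPy. This is only an illustration under stated assumptions, not DAPPER's API: the damped-rotation model, the noise levels, and the names `kalman_filter`, `climatology` and `rmse` are all hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(42)
K = 1000  # number of time steps

# Dynamic model (*): damped 2-D rotation with additive noise,
#   x[k] = M @ x[k-1] + q[k],  q[k] ~ N(0, Q)
angle = 0.1
M = 0.99 * np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])
Q = 0.01 * np.eye(2)

# Observational model (*): only the first component is observed,
#   y[k] = H @ x[k] + r[k],  r[k] ~ N(0, R)
H = np.array([[1.0, 0.0]])
R = 0.1 * np.eye(1)

# Generate the synthetic truth (xx) and observations thereof (yy, *)
xx = np.zeros((K + 1, 2))
yy = np.zeros((K + 1, 1))
xx[0] = rng.standard_normal(2)
for k in range(1, K + 1):
    xx[k] = M @ xx[k - 1] + rng.multivariate_normal(np.zeros(2), Q)
    yy[k] = H @ xx[k] + rng.multivariate_normal(np.zeros(1), R)


def kalman_filter(yy):
    """Kalman filter using only the starred items: M, Q, H, R and yy."""
    mu, P = np.zeros(2), np.eye(2)
    estimates = np.zeros((K + 1, 2))
    for k in range(1, K + 1):
        # Forecast step
        mu = M @ mu
        P = M @ P @ M.T + Q
        # Analysis step
        gain = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        mu = mu + gain @ (yy[k] - H @ mu)
        P = (np.eye(2) - gain @ H) @ P
        estimates[k] = mu
    return estimates


def climatology(yy):
    """Baseline that ignores the observations and returns the long-run mean (zero)."""
    return np.zeros((K + 1, 2))


def rmse(estimates):
    """Root-mean-square error against the truth, averaged over time and state."""
    return np.sqrt(np.mean((estimates - xx) ** 2))


for method in (climatology, kalman_filter):
    print(f"{method.__name__:>14}: RMSE = {rmse(method(yy)):.3f}")
```

The methods never see `xx`; it is used only for scoring, which is what makes the comparison a fair benchmark.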
DAPPER enables the numerical investigation of DA methods
through a variety of typical test cases and statistics. It
(a) reproduces numerical benchmark results reported in the literature, and
(b) facilitates comparative studies, thus promoting the
(a) reliability and
(b) relevance of the results.
For example, this figure is generated by `examples/basic_3.py`
and is a reproduction from this book on DA.
DAPPER is (c) open source, written in Python, and (d) focuses on readability; this promotes the (c) reproduction and (d) dissemination of the underlying science, and makes it easy to adapt and extend. It also comes with a battery of diagnostics and statistics, and live plotting (on-line with the assimilation) facilities, including pause/inspect options, as illustrated below.
In summary, it is well suited for teaching and fundamental DA research. Also see its drawbacks.
Works on Linux/Windows/Mac.
If you're not an admin or expert:
- Install Anaconda.
- Open the Anaconda terminal and run the following commands:

      conda create --yes --name my-env python=3.8
      conda activate my-env
      python -c 'import sys; print("Version:", sys.version.split()[0])'

  Ensure the output at the end gives a version of 3.8 or newer.

Keep using the same terminal for the commands below.
Do you just want to run a script that requires DAPPER? Then
- If the script comes with a `requirements.txt` file, then do
  `pip install -r path/to/requirements.txt`.
- If not, hopefully you know the version of DAPPER needed. Run
  `pip install DA-DAPPER==1.0.0` to get version `1.0.0` (as an example).
Do you want the DAPPER code available to play around with? Then
- Download and unzip (or `git clone`) DAPPER.
- Move the resulting folder wherever you like,
  and `cd` into it (ensure you're in the folder with a `setup.py` file).
- Run `pip install -e .[dev]`.
  You can omit `[dev]` if you don't need to do serious development.
You should now be able to run your script with
`python path/to/script.py`.
For example, if you are in the DAPPER dir,
python examples/basic_1.py
If you've closed the terminal (or shut down your computer), you first need to open the (anaconda) terminal and run this:
conda activate my-env
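Once the environment is active, one way to confirm that the install succeeded is to ask Python which version of the distribution it can see. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `DA-DAPPER` used in the `pip install` command above.

```python
from importlib import metadata


def installed_version(dist_name="DA-DAPPER"):
    """Return the installed version string of dist_name, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("DA-DAPPER not found; is the right conda environment active?")
else:
    print("DA-DAPPER version:", version)
```

If this reports that nothing is found even though you installed DAPPER, the most likely cause is that the terminal is not using the `my-env` environment.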
Read, run, and understand the scripts `examples/basic_{1,2,3}.py`.
Then, get familiar with the code.
The documentation provides more information, including the API reference.
Alternatively, DA-tutorials provides a Python-based introduction to DA.
Method | Literature reproduced |
---|---|
EnKF 1 | Sakov08, Hoteit15 |
EnKF-N | Bocquet12, Bocquet15 |
EnKS, EnRTS | Raanes2016 |
iEnKS / iEnKF / EnRML / ES-MDA 2 | Sakov12, Bocquet12, Bocquet14 |
LETKF, local & serial EAKF | Bocquet11 |
Sqrt. model noise methods | Raanes2014 |
Particle filter (bootstrap) 3 | Bocquet10 |
Optimal/implicit Particle filter 3 | Bocquet10 |
NETF | Tödter15, Wiljes16 |
Rank histogram filter (RHF) | Anderson10 |
4D-Var | |
3D-Var | |
Extended KF | |
Optimal interpolation | |
Climatology | |
1: Stochastic, DEnKF (i.e. half-update), ETKF (i.e. sym. sqrt.).
Serial forms are also available.
Tuned with inflation and "random, orthogonal rotations".
2: Also supports the bundle version,
and "EnKF-N"-type inflation.
3: Resampling: multinomial
(including systematic/universal and residual).
The particle filter is tuned with "effective-N monitoring",
"regularization/jittering" strength, and more.
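To make the particle-filter tuning terms in footnote 3 concrete, here is a minimal NumPy sketch of effective-N monitoring, systematic (universal) resampling, and jittering. It is a textbook-style illustration, not DAPPER's implementation; the function names, the `N/2` threshold, and the jitter strength `0.1` are assumptions chosen for the example.

```python
import numpy as np


def effective_N(w):
    """Effective sample size, 1 / sum(w**2), of normalized weights w."""
    return 1.0 / np.sum(w ** 2)


def systematic_resample(w, rng):
    """Indices selected by systematic (universal) resampling of weights w."""
    N = len(w)
    # A single uniform draw, shifted to N evenly spaced positions in [0, 1)
    positions = (rng.random() + np.arange(N)) / N
    cumulative = np.cumsum(w)
    cumulative[-1] = 1.0  # guard against round-off in the last entry
    return np.searchsorted(cumulative, positions)


rng = np.random.default_rng(0)
N = 100
particles = rng.standard_normal(N)

# Hypothetical, deliberately uneven weights (normalized to sum to one)
w = rng.random(N) ** 4
w /= w.sum()

# Effective-N monitoring: resample only when the weights have degenerated
if effective_N(w) < N / 2:
    idx = systematic_resample(w, rng)
    # Jittering (regularization): perturb the duplicated particles slightly
    particles = particles[idx] + 0.1 * rng.standard_normal(N)
    w = np.full(N, 1.0 / N)
```

Resampling only when the effective sample size drops below a threshold avoids throwing away diversity while the weights are still well spread.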
For a list of ready-made experiments with suitable,
tuned settings for a given method (e.g. the `iEnKS`), use GNU's `grep`:
cd dapper/mods
grep -r "iEnKS.*("
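Where GNU `grep` is not available (for instance in a plain Windows terminal), roughly the same search can be done from Python. This is a sketch using only the standard library; the helper name `find_method_settings` is hypothetical, and unlike `grep -r` it only looks inside `.py` files.

```python
import re
from pathlib import Path


def find_method_settings(method, root="dapper/mods"):
    """Yield (path, line number, line) for lines matching `<method>.*(` under root."""
    root = Path(root)
    if not root.is_dir():
        return
    # Same idea as the grep pattern: the method name, anything, then a literal "("
    pattern = re.compile(re.escape(method) + r".*\(")
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, lineno, line.strip()


# Run from the top of the DAPPER folder
for path, lineno, line in find_method_settings("iEnKS"):
    print(f"{path}:{lineno}: {line}")
```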
Model | Lin | TLM** | PDE? | Phys.dim. | State len | Lyap≥0 | Implementer |
---|---|---|---|---|---|---|---|
Linear Advect. (LA) | Yes | Yes | Yes | 1d | 1000 * | 51 | Evensen/Raanes |
DoublePendulum | No | Yes | No | 0d | 4 | 2 | Matplotlib/Raanes |
Ikeda | No | Yes | No | 0d | 2 | 1 | Raanes |
LotkaVolterra | No | Yes | No | 0d | 5 * | 1 | Wikipedia/Raanes |
Lorenz63 | No | Yes | "Yes" | 0d | 3 | 2 | Sakov |
Lorenz84 | No | Yes | No | 0d | 3 | 2 | Raanes |
Lorenz96 | No | Yes | No | 1d | 40 * | 13 | Raanes |
LorenzUV | No | Yes | No | 2x 1d | 256 + 8 * | ≈60 | Raanes |
Kuramoto-Sivashinsky | No | Yes | Yes | 1d | 128 * | 11 | Kassam/Raanes |
Quasi-Geost (QG) | No | No | Yes | 2d | 129²≈17k | ≈140 | Sakov |
- `*`: Flexible; set as necessary
- `**`: Tangent Linear Model included?
The models are found as subdirectories within `dapper/mods`.
A model should be defined in a file named `__init__.py`,
and illustrated by a file named `demo.py`.
Most other files within a model subdirectory
are named `authorYEAR.py` and define an `HMM` object,
which holds the settings of a specific twin experiment,
using that model,
as detailed in the corresponding author/year's paper.
At the bottom of each such file should be (in comments)
a list of suitable, tuned settings for various DA methods,
along with their expected, average `rmse.a` score for that experiment.
The complete list of included experiment files can be obtained with
GNU's `find`:
cd dapper/mods
find . -iname '[a-z]*[0-9]*.py'
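The same listing can be produced from Python, which may be convenient on systems without GNU `find`. The sketch below is an approximation using the standard library; `list_experiment_files` is a hypothetical helper name, and the case-insensitive match of `-iname` is emulated by lower-casing both the file name and the pattern.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def list_experiment_files(root="dapper/mods", pattern="[a-z]*[0-9]*.py"):
    """Return paths under root whose file name matches pattern, ignoring case."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if fnmatchcase(path.name.lower(), pattern.lower())
    )


# Run from the top of the DAPPER folder
for path in list_experiment_files():
    print(path)
```

The pattern requires a leading letter and at least one digit, so files such as `__init__.py` and `demo.py` are excluded while `authorYEAR.py`-style names are kept.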
Some of these files contain settings that have been used in several papers. As mentioned above, DAPPER reproduces literature results. You will also find results that were not reproduced by DAPPER.
DAPPER is aimed at research and teaching (see discussion up top). Examples of limitations:
- It is not suited for very big models (>60k unknowns).
- Time-dependent error covariances and changes in lengths of state/obs are not supported (although the Dyn and Obs models may otherwise be time-dependent).
- Non-uniform time sequences are not fully supported.
Also, DAPPER comes with no guarantees/support. Therefore, if you have an operational (real-world) application, such as WRF, you should look into one of the alternatives, sorted by approximate project size.
Name | Developers | Purpose (approximately) |
---|---|---|
DART | NCAR | Operational, general |
PDAF | AWI | Operational, general |
JEDI | JCSDA (NOAA, NASA, ++) | Operational, general (in development?) |
ERT | Statoil | Operational, history matching (Petroleum) |
OpenDA | TU Delft | Operational, general |
Verdandi | INRIA | Biophysical DA |
PyOSSE | Edinburgh, Reading | Earth-observation DA |
SANGOMA | Conglomerate* | Unify DA research |
EMPIRE | Reading (Met) | Research (high-dim) |
MIKE | DHI | Oceanographic. Commercial? |
OAK | Liège | Oceanographic |
Siroco | OMP | Oceanographic |
FilterPy | R. Labbe | Engineering, general intro to Kalman filter |
DASoftware | Yue Li, Stanford | Matlab, large-scale |
Pomp | U of Michigan | R, general state-estimation |
PyIT | CIPR | Real-world petroleum DA (?) |
EnKF-Matlab | Sakov | Matlab, personal publications and intro |
EnKF-C | Sakov | C, light-weight EnKF, off-line |
pyda | Hickman | Python, personal publications |
Datum | Raanes | Matlab, personal publications |
IEnKS code | Bocquet | Python, personal publications |
*: AWI/Liège/CNRS/NERSC/Reading/Delft

The `EnKF-Matlab` and `IEnKS` codes have been inspirational
in the development of DAPPER.
Patrick N. Raanes, Colin Grudzien, Maxime Tondeur, Remy Dubois
If you use this software in a publication, please cite as follows.
@misc{raanes2018dapper,
author = {Patrick N. Raanes and others},
title = {nansencenter/DAPPER: Version 0.8},
month = dec,
year = 2018,
doi = {10.5281/zenodo.2029296},
url = {https://doi.org/10.5281/zenodo.2029296}
}
DAPPER is developed and maintained at NORCE (Norwegian Research Centre) and the Nansen Environmental and Remote Sensing Center (NERSC).