figerhaowang/DAMBench

DAMBench: A Multi-Modal Benchmark for Deep Learning-based Atmospheric Data Assimilation

DAMBench is a minimalist, reproducible benchmark for evaluating learning-based data assimilation methods on reanalysis-like gridded data (e.g., ERA5). It standardizes data download, storage layout, preprocessing, training, and inference so you can compare models apples-to-apples.

Key Features

  • 📦 Turn-key environment via environment.yaml
  • ⬇️ Scripted data retrieval with configurable years, spatial resolution, and variable types (single-/multi-level)
  • 🗂️ Deterministic on-disk layout for fast I/O
  • 🧪 Reference training & inference entry points (train.py, inference.py)
  • 🧰 Drop-in model switch via --model flag (e.g., FNP)

1) Setup

Create the conda environment from the provided spec:

conda env create -f environment.yaml
conda activate da-bench

2) Download Data

Use download.py to fetch and organize the dataset.

python download.py

You can customize the download in the script:

  • Years (e.g., 1979–1980, 2000–2020)
  • Resolution (e.g., 240x121, 1440x721)
  • Data types:
    • Single-level: t2m, u10, v10, msl, …
    • Multi-level: z, t, q, u, v, with pressure levels (e.g., 50, 100, …, 1000 hPa)

Tip: Open download.py and edit the config block (years / resolution / variables) to match your use case.
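As a quick sanity check on the resolution setting, the grid sizes listed above can be converted to a grid spacing in degrees. This is a minimal sketch, assuming a global regular latitude-longitude grid whose longitudes wrap around and whose latitudes include both poles (typical for ERA5-style grids, but not confirmed for download.py). The helper name grid_spacing is hypothetical and is not part of the repository.

```python
def grid_spacing(resolution: str) -> tuple[float, float]:
    """Return (dlon, dlat) in degrees for a 'WIDTHxHEIGHT' resolution string.

    Assumption: the grid is global and regular, longitudes do not repeat
    the 360-degree meridian, and latitudes include both poles.
    """
    width, height = (int(n) for n in resolution.lower().split("x"))
    dlon = 360.0 / width          # longitudes cover [0, 360)
    dlat = 180.0 / (height - 1)   # latitudes cover [-90, 90], both ends included
    return dlon, dlat


if __name__ == "__main__":
    for res in ("240x121", "1440x721"):
        print(res, grid_spacing(res))
```

Under these assumptions, 240x121 corresponds to a 1.5° grid and 1440x721 to a 0.25° grid.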


3) Dataset Layout

After download, files are organized by year → day → variable → (level) → time step:

DATA_ROOT/
└── 2000/
    └── 2000-01-01/
        ├── msl/
        │   ├── T0.npy
        │   ├── T6.npy
        │   ├── T12.npy
        │   └── T18.npy
        └── q/
            ├── 50/
            │   ├── T0.npy
            │   ├── T6.npy
            │   ├── T12.npy
            │   └── T18.npy
            ├── 100/
            │   └── ...
            └── ...
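
Because the layout is deterministic, the path of any file can be built from its date, variable, optional pressure level, and hour. The following is a minimal sketch of that mapping; field_path and load_field are hypothetical helpers written for illustration, not functions from the repository, and load_field assumes NumPy is installed.

```python
from pathlib import Path
from typing import Optional


def field_path(root: str, day: str, variable: str, hour: int,
               level: Optional[int] = None) -> Path:
    """Build the path of one .npy file in the layout shown above.

    day is 'YYYY-MM-DD'; level is a pressure level in hPa for multi-level
    variables and None for single-level ones.
    """
    year = day[:4]
    path = Path(root) / year / day / variable
    if level is not None:
        path = path / str(level)
    return path / f"T{hour}.npy"


def load_field(root, day, variable, hour, level=None):
    """Load one grid with NumPy (imported lazily so the path logic has no dependencies)."""
    import numpy as np
    return np.load(field_path(root, day, variable, hour, level))


if __name__ == "__main__":
    print(field_path("DATA_ROOT", "2000-01-01", "msl", 6))
    print(field_path("DATA_ROOT", "2000-01-01", "q", 18, level=50))
```

Keeping the path logic separate from the loader makes it easy to check a directory tree for missing files without reading any arrays.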

4) Train

Train a model (example: FNP):

python train.py --model FNP

5) Inference / Evaluation

Run inference with the trained checkpoint:

python inference.py --model FNP

6) Notes & Conventions

  • Variables
    • Single-level: t2m, u10, v10, msl
    • Multi-level: z, t, q, u, v with pressure levels
  • Time steps: by default T0, T6, T12, T18 (6-hourly)
  • File format: each .npy holds a single 2D grid for that variable/level/time
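
The time-step convention can be made explicit in code. This is a minimal sketch that generates the default labels and maps a day plus a label to a timestamp; step_labels and step_to_datetime are hypothetical names, and the returned timestamps carry no time zone because the README does not state one.

```python
from datetime import datetime, timedelta


def step_labels(interval_hours: int = 6) -> list[str]:
    """Labels for one day's time steps, e.g. ['T0', 'T6', 'T12', 'T18']."""
    return [f"T{h}" for h in range(0, 24, interval_hours)]


def step_to_datetime(day: str, label: str) -> datetime:
    """Map a day ('YYYY-MM-DD') and a label such as 'T12' to a naive timestamp."""
    if not label.startswith("T"):
        raise ValueError(f"unexpected time-step label: {label!r}")
    hour = int(label[1:])
    return datetime.strptime(day, "%Y-%m-%d") + timedelta(hours=hour)


if __name__ == "__main__":
    for label in step_labels():
        print(label, step_to_datetime("2000-01-01", label))
```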

7) Visualization

We visualize the assimilation results of the baseline models as global maps of t850 RMSE. The following results show the RMSE between each baseline and the ground truth on 2024-09-16, a day on which a typhoon was present over the East China Sea. They also illustrate how extreme weather events affect the quality of data assimilation: the RMSE at the typhoon's location is noticeably higher than in the surrounding area.
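
One way to put a number on this observation is to compare the RMSE inside a box around the typhoon with the RMSE over the rest of the grid. The following is a minimal pure-Python sketch under stated assumptions: grids are nested lists indexed as [row][col], the RMSE is unweighted (no latitude weighting), and the function names and box indices are hypothetical. It is not the benchmark's own evaluation code.

```python
import math


def rmse(pred, truth, cells):
    """Unweighted RMSE of pred vs truth over an iterable of (row, col) cells."""
    cells = list(cells)
    sq = sum((pred[r][c] - truth[r][c]) ** 2 for r, c in cells)
    return math.sqrt(sq / len(cells))


def box_vs_rest(pred, truth, rows, cols):
    """Return (RMSE inside the box, RMSE outside it).

    rows and cols are range objects giving the box extent in grid indices.
    """
    n_rows, n_cols = len(truth), len(truth[0])
    inside = [(r, c) for r in rows for c in cols]
    inside_set = set(inside)
    outside = [(r, c) for r in range(n_rows) for c in range(n_cols)
               if (r, c) not in inside_set]
    return rmse(pred, truth, inside), rmse(pred, truth, outside)


if __name__ == "__main__":
    # Toy 4x4 example: larger errors in the central 2x2 box than elsewhere.
    truth = [[0.0] * 4 for _ in range(4)]
    pred = [[2.0 if r in (1, 2) and c in (1, 2) else 0.5 for c in range(4)]
            for r in range(4)]
    print(box_vs_rest(pred, truth, range(1, 3), range(1, 3)))
```

For real fields, the box would be chosen from the typhoon's position on the grid, and a latitude-weighted RMSE may be more appropriate for global comparisons.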



Shanghai AI Lab and Intern Discovery





8) More Baselines


| Model | SpecDiv ↓ | MSE ↓ | MAE ↓ | z500 RMSE ↓ | t850 RMSE ↓ | t2m RMSE ↓ | u10 RMSE ↓ | v10 RMSE ↓ | u500 RMSE ↓ | v500 RMSE ↓ | q700 RMSE ↓ (×10⁻⁴) | Imp ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Background | 0.153 | 2.88 | 8.61 | 45.455 | 0.7200 | 0.7790 | 0.9336 | 0.9645 | 1.7278 | 1.7535 | 6.7220 | |
| Adas | 0.116 | 2.31 | 7.65 | 30.100 | 0.6750 | 0.7350 | 0.8400 | 0.8600 | 1.4950 | 1.4900 | 6.5400 | |
| Multi-Adas | 0.060 | 2.20 | 7.30 | 27.800 | 0.6700 | 0.6900 | 0.7400 | 0.7400 | 1.4000 | 1.4200 | 6.3500 | 4.35% |
| ConvCNP | 0.125 | 2.49 | 7.98 | 31.253 | 0.6944 | 0.7662 | 0.8334 | 0.8553 | 1.5770 | 1.5876 | 6.5717 | |
| Multi-ConvCNP | 0.123 | 2.44 | 7.82 | 30.628 | 0.6805 | 0.7510 | 0.8170 | 0.8380 | 1.5750 | 1.5560 | 6.5400 | 2.01% |
| FNP | 0.063 | 2.30 | 7.54 | 28.500 | 0.6985 | 0.7100 | 0.7650 | 0.7650 | 1.4350 | 1.4600 | 6.4698 | |
| Multi-FNP | 0.059 | 2.16 | 7.09 | 26.790 | 0.6566 | 0.6674 | 0.7191 | 0.7191 | 1.3489 | 1.3724 | 6.0800 | 6.09% |
| VAE-VAR | 0.052 | 2.31 | 7.60 | 27.000 | 0.6970 | 0.7050 | 0.7560 | 0.7770 | 1.4500 | 1.4500 | 6.4700 | |
| Multi-VAE-VAR | 0.048 | 2.13 | 6.99 | 24.840 | 0.6412 | 0.6486 | 0.6955 | 0.7148 | 1.3340 | 1.3340 | 5.9500 | 7.79% |
| SDA | 0.117 | 2.65 | 8.02 | 38.000 | 0.7100 | 0.7500 | 0.8800 | 0.9100 | 1.6500 | 1.7000 | 6.6100 | |
| SLAM | 0.091 | 2.55 | 7.94 | 32.500 | 0.7020 | 0.7300 | 0.8000 | 0.7800 | 1.5000 | 1.4700 | 6.5000 | 3.77% |
| DBF | 1.48 | 2.79 | 8.42 | 40.25 | 0.713 | 0.765 | 0.923 | 0.943 | 1.632 | 1.645 | 6.654 | |
| Multi-DBF | 1.41 | 2.73 | 8.27 | 39.11 | 0.708 | 0.762 | 0.912 | 0.931 | 1.613 | 1.629 | 6.642 | |
| PhyDA | 0.031 | 2.28 | 7.53 | 26.866 | 0.668 | 0.703 | 0.755 | 0.764 | 1.425 | 1.445 | 6.469 | |
| Multi-PhyDA | 0.031 | 2.27 | 7.51 | 26.801 | 0.668 | 0.702 | 0.756 | 0.763 | 1.426 | 1.444 | 6.468 | |
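
The README does not define the Imp column. For four of the five rows that report it, the listed value matches the relative MSE reduction with respect to the row directly above (the single-modal counterpart, or SDA in the case of SLAM). The sketch below reproduces those four values under that assumed definition; relative_improvement is a hypothetical helper. The Multi-Adas entry (4.35%) does not reproduce from the rounded MSE values in the table, which give 4.76%, so this reading should be treated as an assumption.

```python
def relative_improvement(base_mse: float, multi_mse: float) -> float:
    """Percentage reduction of MSE relative to the baseline, rounded to 2 decimals.

    Assumed definition of the Imp column; not confirmed by the README.
    """
    return round(100.0 * (base_mse - multi_mse) / base_mse, 2)


# (baseline MSE, improved MSE) pairs copied from the MSE column of the table.
pairs = {
    "ConvCNP -> Multi-ConvCNP": (2.49, 2.44),
    "FNP -> Multi-FNP": (2.30, 2.16),
    "VAE-VAR -> Multi-VAE-VAR": (2.31, 2.13),
    "SDA -> SLAM": (2.65, 2.55),
}

if __name__ == "__main__":
    for name, (base, multi) in pairs.items():
        print(f"{name}: {relative_improvement(base, multi)}%")
```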
