by Climatoc-Lab
AI4RecWind provides tools for reconstructing missing windspeed observations in AEMET’s historical dataset using a U-Net with partial convolutions (CRAI model).
This implementation builds on the CRAI software developed by DKRZ and adapts it to reconstruct daily windspeed maps in Spain.
- 🧠 Model architecture: Based on a partial convolution U-Net (PCNN) with optional LSTM, GRU, and Attention mechanisms.
- 🗺️ Use case: Filling gaps in daily windspeed maps from AEMET’s station network.
You can set up the Python environment using Conda.
    conda env create -f environment.yml
    conda activate crai

Alternatively, to create the environment from the CUDA environment file:

    conda env create -f environment-cuda.yml
    conda activate crai

With the environment activated, install the climatereconstructionAI package:

    pip install model/climatereconstructionAI

Once the model is installed and the environment is ready, you can reconstruct missing values using a pre-trained model (typically saved as best.pth):

    bash run_eval_CRAI.sh

Before running the script, ensure run_eval_CRAI.sh is updated with:
- Paths to:
  - Input data (input_data/)
  - Masks (defining valid and missing pixels)
  - Model checkpoints
  - Output directory (where infilled data will be saved)
- Configuration details such as:
  - File names of inputs and masks
  - Output file name
  - Device selection: cuda (for GPU) or cpu
input_data/ should include:
- test/: test data to reconstruct
- masks/: observation masks (1 = valid, 0 = missing)
- steady_mask_reversed: inverted land/sea mask (1 = sea, 0 = land), required for evaluation (provided in the repository for Spain)
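The two mask conventions are easy to mix up, so here is a minimal pure-Python sketch on a toy 2x3 grid. It is not CRAI code: `observation_mask` and `invert_mask` are hypothetical helper names, missing values are assumed to be encoded as NaN, and real masks would be NetCDF arrays rather than nested lists.

```python
import math


def observation_mask(grid):
    """Return 1 where a value is present and 0 where it is missing (NaN)."""
    return [[0 if math.isnan(v) else 1 for v in row] for row in grid]


def invert_mask(mask):
    """Swap 0 and 1, e.g. land/sea (1 = land) becomes inverted (1 = sea)."""
    return [[1 - v for v in row] for row in mask]


nan = float("nan")

# Toy daily windspeed map; NaN marks grid points without an observation.
windspeed = [[3.2, nan, 5.1],
             [nan, 4.0, nan]]

# Toy steady land/sea mask: 1 = land, 0 = sea.
land_sea = [[1, 1, 0],
            [1, 0, 0]]

obs_mask = observation_mask(windspeed)  # 1 = valid, 0 = missing
sea_land = invert_mask(land_sea)        # 1 = sea, 0 = land

print(obs_mask)
print(sea_land)
```

The observation mask changes with every time step, while the land/sea mask (and its inverse) is fixed for the whole domain.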
The script uses evaluation_spain.inp, which defines:
- The variable to reconstruct
- Model hyperparameters
- Number of partitions (used to limit GPU memory usage: the less VRAM the GPU has, the more partitions are needed)
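To get a feel for how the number of partitions trades off against memory, the sketch below splits a sequence of daily maps into contiguous chunks. It assumes that partitions divide the data along the time axis into near-equal pieces. That is an assumption about the behaviour, not code taken from CRAI, and `partition_bounds` is a hypothetical helper.

```python
def partition_bounds(n_steps, n_partitions):
    """Split range(n_steps) into n_partitions contiguous, near-equal chunks."""
    if n_partitions < 1 or n_partitions > n_steps:
        raise ValueError("need 1 <= n_partitions <= n_steps")
    base, extra = divmod(n_steps, n_partitions)
    bounds, start = [], 0
    for i in range(n_partitions):
        # The first `extra` chunks take one additional time step.
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


# One year of daily maps in 4 partitions: print start, stop and chunk size.
for start, stop in partition_bounds(365, 4):
    print(start, stop, stop - start)
```

Under this assumption, doubling the number of partitions roughly halves how many maps are held on the device at once, at the cost of more passes.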
After evaluation, results are saved in the evaluation/ folder:
- name_output.nc: Final reconstruction, merging model output with original observations
- name_infilled.nc: Raw model prediction (infilled data only)
- name_gt.nc, name_image.nc, name_mask.nc: Supporting files (see evaluation/Output of CRAI evaluation.txt)
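The difference between the merged output and the raw prediction can be illustrated with a small compositing sketch: observed values are kept wherever the mask marks a valid point, and the model prediction is used elsewhere. This is the usual final step for infilling models and is an assumption about how the merged file is produced. `merge_with_observations` is a hypothetical helper working on nested lists instead of NetCDF arrays.

```python
def merge_with_observations(observed, infilled, mask):
    """Keep observed values where mask == 1, use the prediction where mask == 0."""
    return [
        [obs if m == 1 else pred for obs, pred, m in zip(obs_row, pred_row, mask_row)]
        for obs_row, pred_row, mask_row in zip(observed, infilled, mask)
    ]


nan = float("nan")

observed = [[3.0, nan],   # NaN where no station observation exists
            [nan, 4.0]]
infilled = [[2.5, 6.0],   # raw model prediction at every grid point
            [1.5, 3.5]]
mask = [[1, 0],           # 1 = valid, 0 = missing
        [0, 1]]

merged = merge_with_observations(observed, infilled, mask)
print(merged)
```

A conditional is used instead of `mask * observed + (1 - mask) * infilled` so that NaN placeholders at missing points cannot leak into the result.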
If you wish to train your own model from scratch:
- Place training and validation data in input_data/train/ and input_data/val/
- Validation files must have the same names as training files
- Include:
  - Gridded data files (complete datasets)
  - Observation masks for each timestamp in input_data/masks/ (reflecting which grid points are valid/missing at each time)
  - Steady land/sea mask: 1 = land, 0 = sea (provided in the repository for Spain)
  - Inverted land/sea mask (used in evaluation)
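Because validation files must share their names with the training files, a quick check before training can save a failed run. The following is a minimal sketch rather than part of the repository: `unmatched_validation_files` is a hypothetical helper, the `.nc` extension and the `ws_*.nc` file names are assumptions, and the example runs on a temporary directory tree instead of the real input_data/ folder.

```python
import tempfile
from pathlib import Path


def unmatched_validation_files(train_dir, val_dir, pattern="*.nc"):
    """Return training file names that have no same-named file in val_dir."""
    train_names = {p.name for p in Path(train_dir).glob(pattern)}
    val_names = {p.name for p in Path(val_dir).glob(pattern)}
    return sorted(train_names - val_names)


# Usage on a throwaway tree that mimics input_data/train and input_data/val.
with tempfile.TemporaryDirectory() as root:
    train = Path(root, "train")
    val = Path(root, "val")
    train.mkdir()
    val.mkdir()
    for name in ("ws_a.nc", "ws_b.nc"):
        (train / name).touch()
    (val / "ws_a.nc").touch()  # ws_b.nc is deliberately left out

    missing = unmatched_validation_files(train, val)
    print(missing)
```

An empty list means every training file has a validation counterpart with the same name.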
Navigate to the execution/ folder and run:
    bash run_train_CRAI.sh

Edit this script to specify:
- Paths to input data and masks
- Device to use (cpu or cuda)
- Output directories for logs and checkpoints
Training settings are defined in ws_crai_training.inp, including:
- Batch size
- Learning rate
- Model architecture (layers, attention, etc.)
- Variable names and masks used
- Logs are saved in the logs/ folder
- Model checkpoints are saved in snapshots/ckpt/, including:
  - best.pth: best-performing model based on validation loss
This software is based on the CRAI model developed by the Data Analysis Group led by Christopher Kadow at the Deutsches Klimarechenzentrum (DKRZ).
Kadow et al. (2020), Nature Geoscience
DOI: 10.1038/s41561-020-0582-5
Maintained by the Climate Informatics and Technology Group at DKRZ
- Previous contributing authors: Naoto Inoue, Christopher Kadow, Stephan Seitz
- Current contributing authors: Johannes Meuer, Maximilian Witte, Étienne Plésiat
CRAI is licensed under the terms of the BSD 3-Clause license.