Icíar Lloréns Jover, Michaël Defferrard, Gionata Ghiggi, Natalie Bolón Brun
The code in this repository provides a framework for medium-range weather prediction with deep learning, based on graph spherical convolutions.
[June 2020]: The results obtained with this code are detailed in the Master's thesis report and slides.
[September 2020]: Results have improved over the initial baseline thanks to:
- Introduction of residual connections in the architecture
- Inclusion of further consecutive steps in the loss, with different weighting schemes, to reduce the error of long-term predictions
| Model | Z500 (6h) | t850 (6h) | Z500 (120h) | t850 (120h) |
|---|---|---|---|---|
| Weyn et al | 103.17 | 1.0380 | 611.33 | 2.957 |
| Iciar June 2020 | 67.46 | 0.7172 | 861.7 | 3.432 |
| Ours Sep 2020 | 61.58 | 0.7110 | 680.024 | 2.901 |
- Results can be checked in plot_results.ipynb
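The idea of including several consecutive steps in the loss with different weighting schemes can be illustrated with a minimal, framework-agnostic sketch. The function names and the two schemes ("uniform" and "linear_decay") are assumptions made for illustration; they are not taken from the repository, whose actual weighting may differ.

```python
def step_weights(num_steps, scheme="uniform"):
    """Return one weight per lead step, normalised so that they sum to 1.

    Both schemes are illustrative assumptions, not the repository's own.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if scheme == "uniform":
        raw = [1.0] * num_steps
    elif scheme == "linear_decay":
        # Earlier lead steps count more than later ones.
        raw = [float(num_steps - i) for i in range(num_steps)]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(raw)
    return [w / total for w in raw]


def multi_step_loss(step_losses, weights):
    """Weighted sum of per-step losses (for example, one MSE per lead step)."""
    if len(step_losses) != len(weights):
        raise ValueError("need exactly one weight per lead step")
    return sum(w * loss for w, loss in zip(weights, step_losses))
```

With two lead steps and uniform weights, per-step losses of 2.0 and 4.0 combine to 3.0. A decaying scheme gives more importance to the short-term prediction, while a uniform or increasing one puts more pressure on the long-term error.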
Resources:
- Report and slides: Geometric deep learning for medium-range weather prediction
For a local installation, follow the instructions below.
- Clone this repository.

      git clone https://github.com/natbolon/weather_prediction.git
      cd weather_prediction

- Install the dependencies.

      conda env create -f environment.yml

- Create the data folders.

      mkdir -p data/equiangular/5.625deg/ data/healpix/5.625deg/

- Download the WeatherBench data into the data/equiangular/5.625deg/ folder by following the instructions in the WeatherBench repository.

- Interpolate the WeatherBench data onto the HEALPix grid. Modify the parameters in scripts/config_data_interpolation.yml as desired.

      python -m scripts.data_interpolation -c scripts/config_data_interpolation.yml
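As a rough sanity check on the two grids involved in the interpolation step, the following sketch compares their sizes. It relies only on the grid resolution and on the standard HEALPix pixel count of 12 * nside^2. The value nside = 16 in the usage note is an assumption chosen for illustration; the resolution actually used is set in the interpolation config.

```python
def equiangular_shape(resolution_deg):
    """Number of (latitude, longitude) points on a regular grid."""
    n_lat = round(180 / resolution_deg)
    n_lon = round(360 / resolution_deg)
    return n_lat, n_lon


def healpix_npix(nside):
    """Number of pixels of a HEALPix grid, which is 12 * nside**2."""
    if nside < 1:
        raise ValueError("nside must be a positive integer")
    return 12 * nside ** 2
```

For example, `equiangular_shape(5.625)` returns `(32, 64)`, i.e. 2048 grid points, and `healpix_npix(16)` returns `3072`.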
Attention:
- If deepsphere is not properly installed:

      conda activate weather_modelling
      pip install git+https://github.com/deepsphere/deepsphere-pytorch

- If an incompatibility error with PyYAML is raised, the following commands should solve the problem:

      conda activate weather_modelling
      pip install git+https://github.com/deepsphere/deepsphere-pytorch --ignore-installed PyYAML

- If the module SphereHealpix from pygsp cannot be found, install the development branch using:

      conda activate weather_modelling
      pip install git+https://github.com/Droxef/pygsp@new_sphere_graph
The model listed as "Ours Sep 2020" is trained using the module full_pipeline_multiple_steps.py. An example of how to use it can be found in the notebook Restarting_weights_per_epoch.ipynb.
The config file to be used is configs/config_residual_multiple_steps.json. You may want to modify the model name and/or the data paths if the data has been relocated.
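If you prefer not to edit the JSON file by hand, the model name or data paths could be overridden when the config is loaded. The sketch below assumes a flat JSON config; the helper `load_config` and the key names used in the usage note are hypothetical and should be replaced by the keys actually present in the file.

```python
import json


def load_config(path, **overrides):
    """Read a JSON config and replace selected top-level entries.

    Only keys that already exist in the file can be overridden, so a
    misspelled key raises an error instead of being silently added.
    """
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    unknown = [key for key in overrides if key not in config]
    if unknown:
        raise KeyError(f"keys not present in the config: {unknown}")
    config.update(overrides)
    return config
```

A call could look like `load_config("configs/config_residual_multiple_steps.json", model_name="my_run")`, where `model_name` stands in for whichever key holds the model name in the file.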
You can generate the model predictions using the notebook generate_evaluate_predictions.ipynb. The parameters to be modified are:
- model name (third cell)
- epochs to be evaluated (you can define a range or a single one)
To evaluate the performance of the model, you only need to run the notebook up to "Generate plots for evaluation". The second part generates the skill and climatology plots; you will usually want to generate them for a single epoch rather than for all of them.
In order to compare the performance of different models, or the same model at different epochs or simply a model against different baselines, you can use the notebook plot_results.ipynb. Depending on the purpose of the comparison, you may want to run a different section of the notebook. An explanation of each section and its use case can be found under the heading of the notebook.
full_pipeline_evaluation.py
Trains, tests, generates predictions and evaluates them for a model trained
with a loss function that includes 2 steps. All parameters, except the GPU configuration, are defined in a config
file such as the ones stored in the configs/ folder.
To use the mail notification at the end of the process, you need to provide a confMail.json file which
must have the following structure:
{
"password": "yourMailPassword",
"sender": "yourMail"
}
Attention: If you use Gmail with two-step verification enabled, you need to grant the application access and generate a dedicated password for it. Details on how to generate the password can be found here
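Before launching a long training run, it can be useful to check that confMail.json is present and complete. The sketch below validates the two keys shown above; the helper name `load_mail_config` is hypothetical and is not part of the repository's mail.py.

```python
import json

# The two keys required by the structure described above.
REQUIRED_KEYS = ("password", "sender")


def load_mail_config(path="confMail.json"):
    """Load the mail settings and fail early if a field is missing or empty."""
    with open(path, "r", encoding="utf-8") as f:
        conf = json.load(f)
    missing = [key for key in REQUIRED_KEYS if not conf.get(key)]
    if missing:
        raise ValueError(f"{path} is missing: {', '.join(missing)}")
    return conf
```

Failing early avoids finding out about a missing notification setup only at the end of the process.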
full_pipeline_multiple_steps.py
Trains and tests a model
with a loss function that includes multiple steps, which can be defined by the user. It saves the model after every epoch
but does not generate the predictions (to save time, since this can be done in parallel using the notebook
generate_evaluate_predictions.ipynb). The parameters are defined inside the main function, although it can be
adapted to use a config file as in full_pipeline_evaluation.py.
Note that the update function, which handles the weight update, is defined at the top of the file and should be adapted to the number of lead steps included in the loss function.
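One way to avoid rewriting the update logic for each number of lead steps is to loop over the steps. The plain-Python sketch below assumes that each prediction is fed back into the model to produce the next one; this autoregressive rollout is an assumption, and the repository's update function may be organised differently. The names `rollout_loss` and `mse` are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def rollout_loss(model, x, targets, weights):
    """Apply `model` once per lead step, feeding each prediction back in,
    and accumulate the weighted per-step losses."""
    if len(targets) != len(weights):
        raise ValueError("need exactly one weight per lead step")
    total = 0.0
    state = x
    for target, weight in zip(targets, weights):
        state = model(state)  # prediction for the next lead step
        total += weight * mse(state, target)
    return total
```

The number of lead steps is then given by the length of `targets` and `weights`, so changing it does not require touching the loop itself.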
architectures.py
Contains the PyTorch models used by both full_pipeline_multiple_steps.py and full_pipeline_evaluation.py.
Previous architectures used can be found in the folder modules/old_architectures/
plotting.py
Contains different functions to generate evaluation plots.
train_last_model.py
Contains code to train a model with 2-step-ahead prediction, such as the one used for the "Iciar June 2020" results.
The main notebooks to explore are:
- Restarting_weights_per_epoch.ipynb: Contains an example of how to use the functions that train the model that reported the best results mentioned earlier.
- generate_evaluate_predictions.ipynb: Generates values on the validation set using the weights of the desired saved model.
- plot_results.ipynb: Generates loss plots and comparison plots against different benchmark models.
- healpix_resampling.ipynb: Generates HEALPix data from equiangular data.
- generate_observations.ipynb: Generates ground-truth data for the evaluation of the models.
The content of this repository is released under the terms of the MIT license.
- Benchmark_time_training.txt
- LICENSE.txt
- README.md
- configs
  - config_bottleneck.json
  - config_original_less_channels.json
  - config_residual_level2.json
  - config_residual_level3.json
  - config_residual_level3_long_connections.json
  - config_residual_level4.json
  - config_residual_level4_nolong.json
  - config_residual_multiple_steps.json
  - config_train.json
- data
  - final_models_rmse.pkl
  - weatherbench_training.npy
- environment.yml
- modules
  - __init__.py
  - architectures.py
  - data.py
  - full_pipeline.py
  - full_pipeline_evaluation.py
  - full_pipeline_multiple_steps.py
  - layers.py
  - mail.py
  - old_architectures
    - __init__.py
    - healpix_models.py
    - models.py
    - other_architectures.py
  - plotting.py
  - test.py
  - utils.py
- notebooks
  - Iciar-report-notebooks
    - error_video.ipynb
    - test_dynamic_features.ipynb
    - test_static_features-nearest.ipynb
    - test_static_features.ipynb
    - test_temporal_dimension.ipynb
  - Restarting_weights_per_epoch.ipynb
  - full_pipeline.ipynb
  - generate_evaluate_predictions.ipynb
  - generate_observations.ipynb
  - healpix_resampling.ipynb
  - other_notebooks
    - benchmark_chunks.ipynb
    - benchmark_training_time.ipynb
    - effect_size_training_ds.ipynb
    - pytorch_weatherbench.ipynb
    - standardize_data.ipynb
  - plot_results.ipynb
  - running with variable weights multiple steps.ipynb
  - train_direct_prediction.ipynb
  - train_last_model_l1.ipynb
- notes.md
- scripts
  - config_data_interpolation.yml
  - config_data_preprocessing.yml
  - data_interpolation.py
  - scores_Weyn.py
- tips.md
- weather.yml