# Background information

[source: https://ecmwf-ai-weather-quest.readthedocs.io/en/latest/training_data.html]

Participants can use any observational, forecast or reanalysis dataset to develop their AI/ML sub-seasonal forecasting models. To facilitate initial model development, registered participants can download post-processed ERA5 data from ECMWF.

The following datasets are available at 1.5-degree resolution:

ERA5 data:
- Weekly-mean temperature (K)
- Weekly-mean mean sea level pressure (Pa)
- Weekly-accumulated total precipitation (mm week⁻¹)

**Important**: At the launch of the AI Weather Quest, post-processed training data is available from 1 January 1979 to 31 December 2024. The training data will be updated with a latency of three months.

# Retrieve Training Data Module

To download post-processed ERA5 data, you will need functions from the `retrieve_training_data.py` module. The key function within this module is:

`retrieve_annual_training_data`: Download annual files containing weekly statistics for temperature (tas), mean sea level pressure (mslp) or precipitation (pr).
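
Before requesting any files, it can help to check a request against the variables and years described above. The following is a minimal sketch, not part of `AI_WQ_package`: the names `TRAINING_VARIABLES`, `FIRST_YEAR`, `LAST_YEAR` and `check_request` are hypothetical, and the year bounds assume the launch-time coverage of 1979 to 2024.

```python
# Short names accepted by the download function, paired with the quantity
# and units listed in the dataset description above.
TRAINING_VARIABLES = {
    "tas": ("weekly-mean temperature", "K"),
    "mslp": ("weekly-mean mean sea level pressure", "Pa"),
    "pr": ("weekly-accumulated total precipitation", "mm week-1"),
}

# Years covered at launch (assumption: later updates may extend this range).
FIRST_YEAR, LAST_YEAR = 1979, 2024


def check_request(year, variable):
    """Hypothetical helper: raise ValueError for an undocumented year or variable."""
    if variable not in TRAINING_VARIABLES:
        raise ValueError(
            f"unknown variable {variable!r}; expected one of {sorted(TRAINING_VARIABLES)}"
        )
    if not FIRST_YEAR <= int(year) <= LAST_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    return int(year), variable
```

Calling `check_request(year, variable)` ahead of each download turns a mistyped variable name into an immediate, readable error rather than a failed request.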

In [16]:
from AI_WQ_package import retrieve_training_data
import os

In [37]:
def loop_download(years, variables, data_loc='/users/jk/22/6ixCast/data/ERA5/'):
    '''
    Wrapper around the AI_WQ_package function "retrieve_annual_training_data".

    years (array_like): years in the range [1979, 2024]
    variables (array_like): subset of "pr", "tas", "mslp"
    data_loc (str): base directory; each variable is saved to its own subdirectory
    '''

    for year in years:
        for variable in variables:
            # One call per (year, variable) pair; files go to <data_loc>/<variable>.
            retrieve_training_data.retrieve_annual_training_data(
                year,
                variable,
                'NegF8LfwK',  # password for the training-data download
                local_destination=os.path.join(data_loc, variable),
            )

In [None]:
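# range() excludes its stop value, so this call covers 1979-2023;
# use range(1979, 2025) to include 2024.
loop_download(range(1979, 2024), ['pr', 'tas'])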
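
Because Python's `range` leaves out its stop value, one way to avoid dropping the final year is to list the downloads explicitly with both end years included. The sketch below is only an illustration: `build_jobs` is a hypothetical helper, the `data/ERA5` base directory is an assumed example path, and no download is performed.

```python
import os


def build_jobs(first_year, last_year, variables, data_loc):
    """Return (year, variable, destination) tuples, with both end years included."""
    jobs = []
    for year in range(first_year, last_year + 1):
        for variable in variables:
            # Each variable gets its own subdirectory under data_loc.
            jobs.append((year, variable, os.path.join(data_loc, variable)))
    return jobs


jobs = build_jobs(1979, 2024, ["pr", "tas"], "data/ERA5")
print(len(jobs))          # 46 years x 2 variables
print(jobs[0], jobs[-1])  # first and last planned download
```

Each tuple could then be passed to `retrieve_annual_training_data` in the same way as in `loop_download`, which makes it easy to inspect what will be requested before anything is downloaded.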