# Download ERA5 data for METFUT

1. Understand the task: METFUT would like to train ML models on subsets of ERA5 data. For this, we should retrieve snapshots of individual variables on single model levels at 12-hourly resolution. The variable names/ids, model levels, and time ranges will be given to us.
2. Read through the [ERA5 documentation](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-complete?tab=form) to understand the data structure and to identify which data/access point we need.
3. Install the CDS API.
   * For this, we first need to register at [Copernicus Data Store](https://cds.climate.copernicus.eu/user/register?destination=%2F%23!%2Fhome).
   * Next, copy your API key and store it in the file ```$HOME/.cdsapirc```. The key is shown at the bottom of your personal profile when you are logged in to the CDS. Format:
```
url: https://cds.climate.copernicus.eu/api/v2
key: {uid}:{api-key}
```
   * Install the CDS API via ```pip install cdsapi```.
   * Read through the [instructions](https://cds.climate.copernicus.eu/api-how-to) on how to use the cdsapi (bottom half of the web page).
4. Browse through the [ERA5 data catalogue](https://apps.ecmwf.int/data-catalogues/era5/?class=ea) and select the fields you want to download.
5. After composing your search, click on "Show API request" and copy the commands into your notebook.
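
Before sending the first request, it can help to confirm that the credentials file from step 3 exists and is well formed. The following is a minimal sketch, assuming the two-line `url:`/`key:` format shown above. `read_cdsapirc` is a hypothetical helper name used for illustration; it is not part of the `cdsapi` package.

```python
from pathlib import Path


def read_cdsapirc(path=None):
    """Parse a .cdsapirc file into a dict and check that url and key are set.

    Hypothetical helper for illustration; not part of cdsapi.
    """
    path = Path(path) if path is not None else Path.home() / ".cdsapirc"
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; create it as described in step 3")

    config = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, because the URL and the key contain colons.
        name, _, value = line.partition(":")
        config[name.strip()] = value.strip()

    missing = [name for name in ("url", "key") if not config.get(name)]
    if missing:
        raise ValueError(f"{path} lacks entries: {', '.join(missing)}")
    return config
```

Calling `read_cdsapirc()` without arguments reads `$HOME/.cdsapirc`. The sketch only checks that both entries are present; it does not verify that the key is accepted by the CDS.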

Please make sure to limit the download to 5.625 degree resolution and 12-hourly samples, and select NetCDF as the output format.
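
These three constraints can be checked on a request dictionary before it is submitted. The following is a minimal sketch that uses the `grid`, `time`, and `format` keys as they appear in the customized requests of this notebook. `check_request` is a hypothetical helper, not part of `cdsapi`.

```python
def check_request(request):
    """Return a list of problems; an empty list means the request looks fine.

    Hypothetical helper; the expected values mirror the constraints stated above.
    """
    problems = []
    if request.get("grid") != "5.625/5.625":
        problems.append("grid should be '5.625/5.625'")
    if request.get("format") != "netcdf":
        problems.append("format should be 'netcdf'")

    # Convert each "HH:MM:SS" entry of the "/"-separated time list to minutes.
    times = [t for t in request.get("time", "").split("/") if t]
    minutes = sorted(int(t[0:2]) * 60 + int(t[3:5]) for t in times)
    if len(minutes) != 2 or minutes[1] - minutes[0] != 12 * 60:
        problems.append("time should hold two values 12 hours apart")
    return problems


request = {"grid": "5.625/5.625", "format": "netcdf", "time": "09:00:00/21:00:00"}
print(check_request(request))
```

An empty list means that all three constraints are met; otherwise each entry names the key that needs attention.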

In [1]:
# install modules
#!pip install cdsapi
#!pip install xarray

__Note:__ The following works with the ipynb kernel, but not with the METCLOUD kernel. The first attempt may raise an error telling you that you must first accept the conditions of use (follow the link at the bottom). After accepting them, the request should work.

In [2]:
# Example MARS request as composed through the web interface:
# one month of data with one time step per day
import cdsapi

c = cdsapi.Client()

c.retrieve("reanalysis-era5-complete", {
    "class": "ea",
    "date": "2023-01-01/to/2023-01-31",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "21:00:00",
    "type": "4v"
}, "output")

# If successful, you should see a new file named ```output``` in your current directory.


2024-05-21 20:27:47,310 INFO Welcome to the CDS
2024-05-21 20:27:47,317 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete
2024-05-21 20:27:47,673 INFO Request is queued
2024-05-21 20:28:08,829 INFO Request is completed
2024-05-21 20:28:08,831 INFO Downloading https://download-0017.copernicus-climate.eu/cache-compute-0017/cache/data5/adaptor.mars.external-1716316083.127147-21713-4-4b4684cc-af6b-4aac-8a6f-7ef2c6a4d503.grib to output (24.3M)
2024-05-21 20:28:23,167 INFO Download rate 1.7M/s   


Result(content_length=25502894,content_type=application/x-grib,location=https://download-0017.copernicus-climate.eu/cache-compute-0017/cache/data5/adaptor.mars.external-1716316083.127147-21713-4-4b4684cc-af6b-4aac-8a6f-7ef2c6a4d503.grib)
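
The result above reports the content type `application/x-grib`, so the file `output` is a GRIB file even though it has no extension. A quick way to confirm this without extra packages is to look at the first four bytes, which spell `GRIB` in a GRIB file. The following is a minimal sketch; `looks_like_grib` is a hypothetical helper name, and the check covers only the start of the file, not the validity of its content.

```python
def looks_like_grib(path):
    """Return True if the file starts with the 4-byte GRIB indicator b'GRIB'."""
    with open(path, "rb") as handle:
        return handle.read(4) == b"GRIB"
```

After the download, `looks_like_grib("output")` would be the corresponding call.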

## Customize requests 
Now, we need to modify the request to obtain what we really want:
* times 09:00 and 21:00
* 5.625 degree resolution
* NetCDF output

Check the [Guidelines for efficient MARS requests](https://confluence.ecmwf.int/display/UDOC/Guidelines+to+write+efficient+MARS+requests) for how to make these modifications.

Question: how to use "list" and "output = cost" in cdsapi?

See also [MARS user documentation](https://confluence.ecmwf.int/display/UDOC/MARS+user+documentation)

From the efficiency guide: the loop structure should be
```
date (outer loop)
   time
      step
         number (EPS only)
            level
               parameter (inner loop)
```
Try to issue the MARS request at the highest level possible.
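
One way to read the loop order above in terms of `cdsapi` requests is to loop only over dates in Python and to hand times, levels, and parameters to MARS as "/"-separated lists inside each request. The following is a minimal sketch under that reading. `build_requests` is a hypothetical helper, the split into monthly date ranges is an assumption made for illustration, and the fixed keys are copied from the example requests in this notebook.

```python
def build_requests(date_ranges, times, levels, params):
    """Build one request dict per date range (outer loop).

    Times, levels and parameters are passed as "/"-separated lists inside
    each request, so the inner loops are left to MARS.
    Hypothetical helper; the fixed keys are copied from the example request.
    """
    requests = []
    for start, end in date_ranges:
        requests.append({
            "class": "ea",
            "date": f"{start}/to/{end}",
            "expver": "1",
            "levelist": "/".join(str(level) for level in levels),
            "levtype": "ml",
            "grid": "5.625/5.625",
            "param": "/".join(str(param) for param in params),
            "step": "0",
            "stream": "oper",
            "time": "/".join(times),
            "type": "4v",
            "format": "netcdf",
        })
    return requests


# Example values only: two monthly chunks, one level, one parameter.
jobs = build_requests(
    date_ranges=[("2023-01-01", "2023-01-31"), ("2023-02-01", "2023-02-28")],
    times=["09:00:00", "21:00:00"],
    levels=[137],
    params=[130],
)
for job in jobs:
    print(job["date"], job["time"])
```

Each dictionary in `jobs` could then be passed to `c.retrieve("reanalysis-era5-complete", job, filename)` with a file name of your choice.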

In [3]:
c.retrieve("reanalysis-era5-complete", {
    "class": "ea",
    "date": "2023-01-01/to/2023-01-03",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "grid": "5.625/5.625",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "09:00:00/21:00:00",
    "type": "4v",
    "format": "netcdf"
}, "test.nc")

2024-05-21 20:28:23,313 INFO Welcome to the CDS
2024-05-21 20:28:23,313 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete


2024-05-21 20:28:23,441 INFO Request is queued
2024-05-21 20:29:38,911 INFO Request is completed
2024-05-21 20:29:38,913 INFO Downloading https://download-0008-clone.copernicus-climate.eu/cache-compute-0008/cache/data6/adaptor.mars.external-1716316176.4278162-24480-19-00e5f546-25c9-4d83-9d6b-b7f43e99545c.nc to test.nc (26.2K)
2024-05-21 20:29:39,372 INFO Download rate 57.3K/s


Result(content_length=26864,content_type=application/x-netcdf,location=https://download-0008-clone.copernicus-climate.eu/cache-compute-0008/cache/data6/adaptor.mars.external-1716316176.4278162-24480-19-00e5f546-25c9-4d83-9d6b-b7f43e99545c.nc)

In [4]:
# Now download the temperature field for January 1st, 2023.

c.retrieve("reanalysis-era5-complete", {
    "class": "ea",
    "date": "2023-01-01",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "grid": "5.625/5.625",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "09:00:00/21:00:00",
    "type": "4v",
    "format": "netcdf"
}, "01jan2023.nc")

2024-05-21 20:29:39,490 INFO Welcome to the CDS
2024-05-21 20:29:39,493 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete


2024-05-21 20:29:39,828 INFO Request is queued
2024-05-21 20:35:58,782 INFO Request is completed
2024-05-21 20:35:58,783 INFO Downloading https://download-0014-clone.copernicus-climate.eu/cache-compute-0014/cache/data1/adaptor.mars.external-1716316510.5129008-25619-11-11eb2f9b-82a6-497a-ab66-eaac422d3189.nc to 01jan2023.nc (9.7K)
2024-05-21 20:35:59,151 INFO Download rate 26.5K/s


Result(content_length=9948,content_type=application/x-netcdf,location=https://download-0014-clone.copernicus-climate.eu/cache-compute-0014/cache/data1/adaptor.mars.external-1716316510.5129008-25619-11-11eb2f9b-82a6-497a-ab66-eaac422d3189.nc)
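
As a plausibility check, the reported file sizes can be compared with the number of grid points expected at 5.625 degree resolution. The following is a minimal sketch with two assumptions that are not confirmed by the output above: the grid is a regular latitude/longitude grid that includes both poles, and the NetCDF file stores each value as a 16-bit packed integer.

```python
def grid_shape(step_deg):
    """Number of (latitude, longitude) points of a regular grid.

    Assumes latitudes run from 90 to -90 inclusive and longitudes from 0 up to,
    but not including, 360.
    """
    n_lat = round(180 / step_deg) + 1
    n_lon = round(360 / step_deg)
    return n_lat, n_lon


def payload_bytes(step_deg, n_fields, bytes_per_value=2):
    """Size of the data values alone, assuming 16-bit packed values."""
    n_lat, n_lon = grid_shape(step_deg)
    return n_lat * n_lon * n_fields * bytes_per_value


print(grid_shape(5.625))          # (33, 64)
print(payload_bytes(5.625, 2))    # 8448: one day, two times
print(payload_bytes(5.625, 6))    # 25344: three days, two times each
```

Under these assumptions the data values account for 8448 of the 9948 bytes of `01jan2023.nc` and for 25344 of the 26864 bytes of `test.nc`. The remainder of about 1500 bytes in both cases would then be header, coordinates, and attributes.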