# Download ERA5 data for METFUT

1. Understand the task: METFUT would like to train ML models on subsets of ERA5 data. For this, we should retrieve snapshots of individual variables on single model levels at 12-hourly resolution. The variable names/ids, model levels, and time ranges will be given to us.
2. Read through the [ERA5 documentation](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-complete?tab=form) to understand the data structure and identify which dataset and access point we need.
3. Install the CDS API.
   * For this, we first need to register at the [Copernicus Data Store](https://cds.climate.copernicus.eu/user/register?destination=%2F%23!%2Fhome).
   * Next, copy your API key and store it in the file ```$HOME/.cdsapirc```. You can find the key at the bottom of your personal profile page when you are logged in to the CDS. Format:
```
url: https://cds.climate.copernicus.eu/api/v2
key: {uid}:{api-key}
```
   * Install the CDS API via ```pip install cdsapi```.
   * Read through the [instructions](https://cds.climate.copernicus.eu/api-how-to) on how to use the cdsapi (bottom half of the web page).
4. Browse through the [ERA5 data catalogue](https://apps.ecmwf.int/data-catalogues/era5/?class=ea) and select the fields you want to download.
5. After composing your search, click on "Show API request" and copy the commands into your notebook.

Please make sure to limit the download to 5.625 degree resolution and 12-hourly samples! Select NetCDF as the output format.
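Before sending the first request, it can help to confirm that `$HOME/.cdsapirc` contains the two fields shown in step 3. The following is a minimal sketch, not part of cdsapi: `parse_cdsapirc` and `missing_fields` are hypothetical helper names, and the parser only assumes the simple `name: value` layout shown above.

```python
from pathlib import Path


def parse_cdsapirc(text):
    """Parse 'name: value' lines into a dict (hypothetical helper, not cdsapi's parser)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only: the URL and the key contain colons too.
        name, sep, value = line.partition(":")
        if sep:
            config[name.strip()] = value.strip()
    return config


def missing_fields(config, required=("url", "key")):
    """Return the required field names that are absent or empty."""
    return [name for name in required if not config.get(name)]


if __name__ == "__main__":
    rc_file = Path.home() / ".cdsapirc"
    if rc_file.exists():
        missing = missing_fields(parse_cdsapirc(rc_file.read_text()))
        print("missing fields:", missing or "none")
    else:
        print(f"{rc_file} not found")
```

This only checks that both fields are present; whether the key is valid is only known once the CDS accepts a request.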

In [11]:
# install modules
#!pip install cdsapi
#!pip install xarray

__Note:__ The following works with the ipynb kernel, but not with the METCLOUD kernel. The first attempt might raise an error telling you that you must first accept the terms of use (follow the link at the bottom). After that, it should work.

In [12]:
# Example MARS request as composed through the web interface
# (covers 1 month; the customized request further below is shortened to 3 days)
import cdsapi

c = cdsapi.Client()

# c.retrieve("reanalysis-era5-complete", {
#     "class": "ea",
#     "date": "2023-01-01/to/2023-01-31",
#     "expver": "1",
#     "levelist": "137",
#     "levtype": "ml",
#     "param": "130",
#     "step": "0",
#     "stream": "oper",
#     "time": "21:00:00",
#     "type": "4v"
# }, "output")


## Customize requests 
Now we need to modify the request to obtain what we really want:
* times 09:00 and 21:00
* 5.625 degree resolution
* NetCDF output

See the [Guidelines for efficient MARS requests](https://confluence.ecmwf.int/display/UDOC/Guidelines+to+write+efficient+MARS+requests) for how to make these modifications.

Question: how to use "list" and "output = cost" in cdsapi?

See also [MARS user documentation](https://confluence.ecmwf.int/display/UDOC/MARS+user+documentation)

From the efficiency guide: the loop structure should be
```
date (outer loop)
   time
      step
         number (EPS only)
            level
               parameter (inner loop)
```
Try to issue the MARS request at the highest level possible.
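One way to follow this loop order for longer time ranges is to loop over dates only and put all times, levels, and parameters into each request. The sketch below assumes that one calendar month per request is a reasonable chunk size (an assumption, not something stated in this notebook). `monthly_date_ranges` and `build_requests` are hypothetical helper names; the fixed keys are copied from the request in this notebook.

```python
from datetime import date, timedelta

# Fixed keys copied from the request in this notebook.
BASE_REQUEST = {
    "class": "ea",
    "expver": "1",
    "levtype": "ml",
    "grid": "5.625/5.625",
    "step": "0",
    "stream": "oper",
    "type": "4v",
    "format": "netcdf",
}


def monthly_date_ranges(start, end):
    """Split [start, end] into 'YYYY-MM-DD/to/YYYY-MM-DD' strings, one per calendar month."""
    ranges = []
    current = start
    while current <= end:
        # First day of the following month.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        chunk_end = min(end, next_month - timedelta(days=1))
        ranges.append(f"{current.isoformat()}/to/{chunk_end.isoformat()}")
        current = next_month
    return ranges


def build_requests(start, end, params, levels, times=("09:00:00", "21:00:00")):
    """Return one request dict per month (date is the outer loop).

    Times, levels, and params are joined with "/" so that each request
    retrieves them together instead of looping over them separately.
    """
    requests = []
    for date_range in monthly_date_ranges(start, end):
        request = dict(BASE_REQUEST)
        request.update({
            "date": date_range,
            "time": "/".join(times),
            "levelist": "/".join(str(level) for level in levels),
            "param": "/".join(str(param) for param in params),
        })
        requests.append(request)
    return requests
```

Each returned dict could then be passed to `c.retrieve("reanalysis-era5-complete", request, target)`, with a separate target file name per month.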

In [13]:
c.retrieve("reanalysis-era5-complete", {
    "class": "ea",
    "date": "2023-01-01/to/2023-01-03",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "grid": "5.625/5.625",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "09:00:00/21:00:00",
    "type": "4v",
    "format": "netcdf"
}, "test.nc")

2024-04-25 11:52:11,183 INFO Welcome to the CDS
2024-04-25 11:52:11,184 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete
2024-04-25 11:52:11,302 INFO Request is completed
2024-04-25 11:52:11,304 INFO Downloading https://download-0020.copernicus-climate.eu/cache-compute-0020/cache/data6/adaptor.mars.external-1714035110.1544075-23233-5-bf6524c3-9bdc-4900-82a4-bbf4f80543a3.nc to test.nc (26.2K)
2024-04-25 11:52:11,968 INFO Download rate 39.6K/s


Result(content_length=26860,content_type=application/x-netcdf,location=https://download-0020.copernicus-climate.eu/cache-compute-0020/cache/data6/adaptor.mars.external-1714035110.1544075-23233-5-bf6524c3-9bdc-4900-82a4-bbf4f80543a3.nc)

In [14]:
# ncdump -t -v latitude,longitude,time test.nc
# shows that this worked fine.

In [15]:
import xarray

In [16]:
ds = xarray.open_dataset("test.nc")

In [17]:
ds
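To confirm that the grid and time settings were applied, the dataset dimensions can be compared with what the request should produce. This is a rough sketch with hypothetical helper names (`expected_dims`, `dim_mismatches`). It assumes a regular latitude-longitude grid that includes both poles and uses the dimension names `time`, `latitude`, and `longitude` from the `ncdump` call above.

```python
def expected_dims(n_days, times_per_day=2, grid_deg=5.625):
    """Expected dimension sizes for a regular lat-lon grid (assumption: both poles included)."""
    return {
        "time": n_days * times_per_day,
        "latitude": round(180 / grid_deg) + 1,  # 90N to 90S inclusive
        "longitude": round(360 / grid_deg),     # 0E up to, but excluding, 360E
    }


def dim_mismatches(sizes, expected):
    """Return {name: (found, expected)} for every dimension whose size differs."""
    return {
        name: (sizes.get(name), size)
        for name, size in expected.items()
        if sizes.get(name) != size
    }
```

For the 3-day test file, `dim_mismatches(ds.sizes, expected_dims(3))` should return an empty dict if these assumptions hold.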

# Homework

- Download ERA5 surface temperature data for 1 January 2023 at 09:00 and 21:00.
- Convert the data to the ICON grid (or another grid).
- Plot a map using cartopy.
- Maybe show how accurate the weather forecast was?