# Download ERA5 data for METFUT

1. Understand the task: METFUT would like to train ML models on subsets of ERA5 data. For this, we should retrieve snapshots of individual variables on single model levels at 12-hourly resolution. The variable names/ids, model levels, and time ranges will be given to us.
2. Read through the [ERA5 documentation](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-complete?tab=form) to understand the data structure and to identify which data/access point we need.
3. Install the CDS API.
   * For this, we first need to register at [Copernicus Data Store](https://cds.climate.copernicus.eu/user/register?destination=%2F%23!%2Fhome).
   * Next, copy your API key and store it in the file ```$HOME/.cdsapirc```. You find the key at the bottom of your personal profile page when you are logged in to the CDS. Format:
```
url: https://cds.climate.copernicus.eu/api/v2
key: {uid}:{api-key}
```
   * Install the CDS API via ```pip install cdsapi```.
   * Read through the [instructions](https://cds.climate.copernicus.eu/api-how-to) on how to use the cdsapi (bottom half of the web page).
4. Browse through the [ERA5 data catalogue](https://apps.ecmwf.int/data-catalogues/era5/?class=ea) and select the fields you want to download.
5. After composing your search, click on "Show API request" and copy the commands into your notebook.
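
Before sending the first request, it can help to verify that the credentials file from step 3 exists and contains both entries. The following is a minimal sketch; the helper names `parse_cdsapirc` and `check_cdsapirc` are made up for this notebook and are not part of the cdsapi package.

```python
from pathlib import Path


def parse_cdsapirc(text):
    """Parse the 'name: value' lines of a .cdsapirc file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only: the key itself contains a colon.
        name, _, value = line.partition(":")
        settings[name.strip()] = value.strip()
    return settings


def check_cdsapirc(path=Path.home() / ".cdsapirc"):
    """Return a list of problems found in the credentials file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    settings = parse_cdsapirc(path.read_text())
    return [f"missing entry: {key}" for key in ("url", "key") if key not in settings]
```

Calling `check_cdsapirc()` without arguments inspects `$HOME/.cdsapirc`; an empty list means that both `url` and `key` were found.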

Please make sure to limit the download to 5.625 degree resolution and 12-hourly samples! Select NetCDF as the output format.

In [11]:
# install modules
!pip install cdsapi
!pip install xarray

Defaulting to user installation because normal site-packages is not writeable
[notice] A new release of pip is available: 23.2.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
Defaulting to user installation because normal site-packages is not writeable
Collecting xarray
  Obtaining dependency information for xarray from https://files.pythonhosted.org/packages/f7/fe/c4d15ac730b2bcdd530e4bc6491958c53237eb573dba4eec3ad31ff0519a/xarray-2024.3.0-py3-none-any.whl.metadata
  Using cached xarray-2024.3.0-py3-none-any.whl.metadata (11 kB)
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/opt/tljh/user/lib/python3.10/site-packages/numpy-1.25.2.dist-info/METADATA'

__Note:__ The following works with the ipynb kernel, but not with the METCLOUD kernel. The first attempt might raise an error telling you that you must first accept the terms of use (follow the link at the bottom). After you have accepted them, it should work.

In [3]:
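# Exemplary MARS request as composed through the web interface.
# Note: the date range below still covers the full month of January 2023;
# the customized request further down shortens it to 3 days.
import cdsapi

# By default, the client reads the URL and API key from $HOME/.cdsapirc
c = cdsapi.Client()

# The request is wrapped in a string literal, so it is not sent when the cell runs.
# Remove the triple quotes to execute it.
"""
c.retrieve("reanalysis-era5-complete", {
    "class": "ea",
    "date": "2023-01-01/to/2023-01-31",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "21:00:00",
    "type": "4v"
}, "output")

# If successful, you will see a new file named "output" in your current directory.
"""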

2024-04-17 08:56:25,545 INFO Welcome to the CDS
2024-04-17 08:56:25,547 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete
2024-04-17 08:56:25,597 INFO Request is queued
2024-04-17 08:56:58,009 INFO Request is running
2024-04-17 08:57:15,145 INFO Request is completed
2024-04-17 08:57:15,147 INFO Downloading https://download-0010-clone.copernicus-climate.eu/cache-compute-0010/cache/data9/adaptor.mars.external-1713344217.7449307-12310-16-70d5e74c-505c-4317-a14c-df5268f779bc.grib to output (24.3M)
2024-04-17 08:57:22,256 INFO Download rate 3.4M/s   


Result(content_length=25504320,content_type=application/x-grib,location=https://download-0010-clone.copernicus-climate.eu/cache-compute-0010/cache/data9/adaptor.mars.external-1713344217.7449307-12310-16-70d5e74c-505c-4317-a14c-df5268f779bc.grib)

## Customize requests 
Now, we need to modify the request to obtain what we really want:
* time 09:00 and 21:00
* 5.625 degree resolution
* NetCDF output

Check the [Guidelines for efficient MARS requests](https://confluence.ecmwf.int/display/UDOC/Guidelines+to+write+efficient+MARS+requests) for how to make these modifications.
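
The three modifications can be sketched as a small helper that takes the request dictionary copied from the web interface and returns a customized copy. The function name `customize_request` is hypothetical; the keys `time`, `grid` and `format` are the ones used in the request cell of this notebook.

```python
def customize_request(request, times=("09:00:00", "21:00:00"), grid_deg=5.625):
    """Return a copy of a MARS request with time, grid and output format set."""
    custom = dict(request)  # leave the original request untouched
    custom["time"] = "/".join(times)           # MARS lists are "/"-separated
    custom["grid"] = f"{grid_deg}/{grid_deg}"  # regular lat/lon grid spacing
    custom["format"] = "netcdf"
    return custom


# Request as copied from the web interface (values taken from this notebook)
base_request = {
    "class": "ea",
    "date": "2023-01-01/to/2023-01-03",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "21:00:00",
    "type": "4v",
}

print(customize_request(base_request))
```

Returning a copy rather than modifying the dictionary in place keeps the original web-interface request available for comparison.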

Question: how to use "list" and "output = cost" in cdsapi?

See also [MARS user documentation](https://confluence.ecmwf.int/display/UDOC/MARS+user+documentation)

From the efficiency guide: the loop structure should be
```
date (outer loop)
   time
      step
         number (EPS only)
            level
               parameter (inner loop)
```
Try to issue the MARS request at the highest level possible.
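
A possible way to follow this loop order is to loop over dates only and to hand all times, levels and parameters to MARS within a single request. The sketch below builds one request dictionary per month without sending anything. The function name `monthly_requests` is hypothetical, and using one month per request is an assumption, not something prescribed above.

```python
import calendar


def monthly_requests(year, months, params, levels, times=("09:00:00", "21:00:00")):
    """Build one MARS request dict per month (date is the outermost loop)."""
    requests = []
    for month in months:
        last_day = calendar.monthrange(year, month)[1]
        requests.append({
            "class": "ea",
            "date": f"{year}-{month:02d}-01/to/{year}-{month:02d}-{last_day:02d}",
            "expver": "1",
            # times, levels and parameters are combined inside the request
            # instead of being looped over in Python
            "levelist": "/".join(str(level) for level in levels),
            "levtype": "ml",
            "param": "/".join(str(param) for param in params),
            "step": "0",
            "stream": "oper",
            "time": "/".join(times),
            "type": "4v",
        })
    return requests


for request in monthly_requests(2023, [1, 2], params=[130], levels=[137]):
    print(request["date"])
```

Each dictionary could then be passed to `c.retrieve("reanalysis-era5-complete", request, filename)`, after adding the `grid` and `format` entries described above. How the result is split into files per variable and level is left open here.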

In [6]:
c.retrieve("reanalysis-era5-complete", {
    "class": "ea",
    "date": "2023-01-01/to/2023-01-03",
    "expver": "1",
    "levelist": "137",
    "levtype": "ml",
    "grid": "5.625/5.625",
    "param": "130",
    "step": "0",
    "stream": "oper",
    "time": "09:00:00/21:00:00",
    "type": "4v",
    "format": "netcdf"
}, "test.nc")

2024-04-17 09:17:33,383 INFO Welcome to the CDS
2024-04-17 09:17:33,385 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete
2024-04-17 09:17:33,459 INFO Request is queued
2024-04-17 09:17:46,838 INFO Request is running
2024-04-17 09:17:54,470 INFO Request is completed
2024-04-17 09:17:54,472 INFO Downloading https://download-0014-clone.copernicus-climate.eu/cache-compute-0014/cache/data7/adaptor.mars.external-1713345467.772383-14742-15-2ee82435-64a1-498a-8d39-be02fc3de4da.nc to test.nc (26.2K)
2024-04-17 09:17:54,679 INFO Download rate 128.2K/s


Result(content_length=26864,content_type=application/x-netcdf,location=https://download-0014-clone.copernicus-climate.eu/cache-compute-0014/cache/data7/adaptor.mars.external-1713345467.772383-14742-15-2ee82435-64a1-498a-8d39-be02fc3de4da.nc)

In [14]:
# ncdump -t -v latitude,longitude,time test.nc
# shows that this worked fine.
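
As a complement to the `ncdump` check, the expected dimensions of the file can be worked out from the request itself. This is a small sketch; the function name `expected_shape` is hypothetical, and it assumes that the regular latitude grid includes both poles.

```python
from datetime import date


def expected_shape(first_day, last_day, times_per_day, grid_deg):
    """Return the expected (time, latitude, longitude) sizes for a request."""
    n_time = ((last_day - first_day).days + 1) * times_per_day
    n_lat = round(180 / grid_deg) + 1  # assumption: both poles are grid points
    n_lon = round(360 / grid_deg)
    return n_time, n_lat, n_lon


shape = expected_shape(date(2023, 1, 1), date(2023, 1, 3), 2, 5.625)
print(shape)  # (6, 33, 64)
```

For the test request this gives 6 × 33 × 64 = 12672 values. If the values are stored as 2-byte integers (an assumption about the NetCDF packing), this amounts to 25344 bytes, which is of the same order as the 26864 bytes reported in the download log.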

In [12]:
import xarray

ModuleNotFoundError: No module named 'xarray'
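
Because the `pip install xarray` call above ended with an error, the import fails here. A quick way to see which modules the current kernel can actually import, without triggering an exception, is sketched below. The helper name `missing_modules` is made up for this notebook; `importlib.util.find_spec` is part of the Python standard library.

```python
import importlib.util


def missing_modules(names):
    """Return those top-level module names that cannot be found by the kernel."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_modules(["cdsapi", "xarray"]))
```

An empty list means that all modules are available to the running kernel.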