<img src='./img/logoline_12000.png' align='right' width='100%'></img>

<br>

# The Copernicus Climate Data Store (CDS) - Introduction and data access example

This notebook introduces the Copernicus Climate Data Store (CDS), gives an overview of the types of data that can be accessed, and provides practical examples of how to access and retrieve data from the CDS.

### Outline
* [1 - About](#about)
* [2 - C3S data overview](#c3s_data_overview)
* [3 - C3S data retrieval](#c3s_data_retrieval)
 * [3.1 - Access data manually via the CDS web interface](#access_manual)
 * [3.2 - Access data in a programmatic way with the CDS API](#access_programmatic)
* [4 - Example data requests](#example_requests)
 * [4.1 - Climate reanalysis](#climate_reanalysis)
 * [4.2 - Seasonal forecasts](#seasonal_forecasts)

### How to access the notebook
* via [nbviewer](https://nbviewer.org/github/ecmwf-projects/copernicus-training/blob/master/100_climate_data_store_intro.ipynb): view a static version of the notebook
* via [Binder](https://mybinder.org/v2/gh/ecmwf-projects/copernicus-training/HEAD?urlpath=lab/tree/100_climate_data_store_intro.ipynb): run, execute and modify the notebook

<hr>

## <a id='about'></a>1. About

The [Copernicus Climate Data Store (CDS)](https://cds.climate.copernicus.eu/) is the data access portal of the [Copernicus Climate Change Service (C3S)](https://climate.copernicus.eu/) and offers access to `data` and `applications` about the Earth's past, present and future climate.

<img src='./img/cds_landing_page.png' align='left' width='80%'></img>

<hr>

## <a id='c3s_data_overview'></a>2. C3S data overview

The Copernicus Climate Change Service offers a variety of climate data products. The most popular ones fall into three categories:

#### Climate reanalysis
Climate reanalysis combines model data with observations from across the world into a globally complete and consistent dataset. Get an overview of climate reanalysis data offered by the CDS [here](./101_c3s_data_intro.ipynb#climate_reanalysis).

#### Seasonal Forecasts
Seasonal forecasts provide a long-range outlook of changes in the Earth system over periods of a few weeks or months. Get an overview of seasonal forecasts offered by the CDS [here](./101_c3s_data_intro.ipynb#seasonal_forecasts).

#### Climate projections
Climate projections are simulations of the future climate performed using models that represent physical processes in the atmosphere, ocean, cryosphere, biosphere and land, as well as interactions between them. Climate projections describe the future evolution of the planet's climate system at global and regional scales. Get an overview of climate projections offered by the CDS [here](./101_c3s_data_intro.ipynb#climate_projections).



<br>

A complete overview of all the climate data available from the CDS can be found on the CDS web interface under [Datasets](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset).

<hr>

## <a id='c3s_data_retrieval'></a>3. Data retrieval

There are two ways to access data from the Copernicus Climate Data Store (CDS):
* [manually](#access_manual) via the CDS web interface, or
* [programmatically](#access_programmatic) with the CDS API

### <a id='access_manual'></a>3.1 Access data manually via the CDS web interface

The `CDS web interface` allows you to manually `browse`, `select` and `download` data products offered by the CDS. First, under [Datasets](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset), you can browse and select the data product you are interested in. In a second step, you can then specify details of the data download form you wish to submit.

#### Filter and select a data product

As a first step, you can `browse` and `filter` the data products you are interested in. The [Datasets](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset) interface lets you filter data by categories, e.g. `Product type`, `Variable domain` and `Spatial / Temporal coverage`, and also offers a free text search. From the resulting list of data products, you can select the dataset you are interested in.

Once you have selected a dataset, you are redirected to a data description section, which provides an overview of the chosen dataset as well as the option to specify the data you would like to download and to submit the download form.

<br>

<img src='./img/cds_web_interface_1.png' align='left' width='60%'></img>

<br>

#### Submit the *Download form*

The `Data description` section (see 1) provides an overview of the data product, including a list of the available variables. Under the tab `Download data`, the `Download form` opens (see 2), which allows you to manually filter the data product based on:
* `Product type`
* `Variable`
* `Year / Month / Time`
* `Geographical area`
* `Format`

At the end of the `Download form`, you get three options: `Show API request`, `Show Toolbox request` and `Submit Form`. If you want to download the data manually, the data request is executed as soon as you click the `Submit Form` button. You will need `Show API request` if you want to request data programmatically. See [Section 3.2](#access_programmatic) for further information.


<div class="alert alert-block alert-success">
<b>NOTE</b>: <br>
    Under the tab <code>Your requests</code> in the main menu, you can monitor the status of your data requests.</div>

<br>

<img src='./img/cds_data_description_download_form.png' align='left' width='60%'></img>

### <a id='access_programmatic'></a>3.2 Access data programmatically with the CDS API

The `Climate Data Store Application Program Interface (CDS API)` is a Python library which allows you to access data from the CDS programmatically. The library is available for both Python 2.7.x and Python 3. To use the CDS API, follow the steps below:

#### Install the CDS API key

* [Self-register](https://cds.climate.copernicus.eu/#!/home) at the CDS registration page (if you do not have an account yet)
* [Login](https://cds.climate.copernicus.eu/user/login) to the CDS portal and go to the [api-how-to page](https://cds.climate.copernicus.eu/api-how-to)
* Copy the CDS API key displayed in the black terminal window into a file under `$HOME/.cdsapirc`

**Note:** You will find your CDS API key displayed in the black terminal box under the section `Install the CDS API key`. If you do not see a URL or key appear in the black terminal box, please refresh your browser tab.
  

<img src='./img/cds_api_key.png' align='left' width='60%'></img>

The code below creates the file in your current working directory. If this is not your home directory, move the file to `$HOME/.cdsapirc` afterwards. Make sure to replace the `################` with your personal `CDS API key`.


In [101]:
%%writefile ./.cdsapirc

url: https://cds.climate.copernicus.eu/api/v2
key: ##############################

Overwriting ./.cdsapirc
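
Outside a notebook, the same two-line file can be written with plain Python. The following is a minimal sketch rather than part of the original material: the helper name `write_cdsapirc` and the placeholder key are hypothetical, and the demo writes into a temporary directory so that no existing configuration is overwritten.

```python
import tempfile
from pathlib import Path


def write_cdsapirc(path, url, key):
    """Write a CDS API configuration file with one `url` and one `key` line."""
    path = Path(path)
    path.write_text(f"url: {url}\nkey: {key}\n")
    return path


# Demo: write into a throwaway directory instead of $HOME.
demo_dir = Path(tempfile.mkdtemp())
rc_file = write_cdsapirc(
    demo_dir / ".cdsapirc",
    url="https://cds.climate.copernicus.eu/api/v2",
    key="REPLACE-WITH-YOUR-CDS-API-KEY",  # placeholder, not a real key
)
print(rc_file.read_text())
```

To create the file in the location named in the steps above, pass `Path.home() / ".cdsapirc"` as the path.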


<br>

<div class="alert alert-block alert-success">
<b>NOTE</b>: <br>
    As an alternative to storing your CDS API key in the <code>.cdsapirc</code> file, you can pass the <code>url</code> and your <code>CDS API key</code> as keyword arguments when you create the <code>cdsapi.Client()</code>:<br>
    <br>
<code>cdsapi.Client(url='https://cds.climate.copernicus.eu/api/v2', key='################')</code></div>
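
If you pass the credentials as keyword arguments, you may prefer not to hard-code the key in a notebook. The sketch below reads both values from environment variables instead. The variable names `MY_CDS_URL` and `MY_CDS_KEY` and both helper functions are hypothetical and chosen only for this illustration; `cdsapi` is imported lazily so that the first helper can be tried without the package installed.

```python
import os


def client_kwargs_from_env(env=None):
    """Collect `url` and `key` for cdsapi.Client() from environment variables.

    MY_CDS_URL and MY_CDS_KEY are hypothetical names used for this sketch.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("MY_CDS_URL", "MY_CDS_KEY") if name not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {"url": env["MY_CDS_URL"], "key": env["MY_CDS_KEY"]}


def make_client(env=None):
    """Create a client from the environment (requires the cdsapi package)."""
    import cdsapi  # imported here so the helper above works without cdsapi

    return cdsapi.Client(**client_kwargs_from_env(env))


# Demo with a plain dictionary standing in for os.environ.
demo_env = {
    "MY_CDS_URL": "https://cds.climate.copernicus.eu/api/v2",
    "MY_CDS_KEY": "REPLACE-WITH-YOUR-CDS-API-KEY",  # placeholder
}
print(client_kwargs_from_env(demo_env))
```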

<br>


#### Install the CDS API client

The next step is to install the `CDS API client`. You can do this with the package management system `pip`.

In [None]:
!pip install cdsapi

#### Use the CDS API client for data access

Once the `CDS API` is installed, it can be used to request data from the Climate Data Store.

Below, you see the principle of a `data retrieval` request. You always have to import `cdsapi` and define a `cdsapi.Client()` before you can execute an `API request`. You can use the [web interface](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset) to browse through the datasets. At the end of the `Download form`, there is the option to choose `Show API request`. If you click this button, the `API request` appears (see example below), which you can copy and paste into your coding workflow.

<br>

<div><img src='./img/cdsapi_request.png' align='left' width='30%'></img></div>
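
In text form, a request consists of three parts: the dataset name, a dictionary of selection keywords and a target file. The sketch below spells these out with values borrowed from the ERA5 monthly example further down; the target file name and the two helper functions are hypothetical. The call to the CDS is wrapped in a function that is not executed here, since it needs the `cdsapi` package and a configured API key.

```python
def describe_request(dataset, request, target):
    """Summarise the three parts of a CDS API request as one line of text."""
    return f"{dataset} -> {target} ({len(request)} request keys)"


def retrieve(dataset, request, target):
    """Submit the request (requires cdsapi and a configured CDS API key)."""
    import cdsapi  # imported lazily so the rest of the sketch runs without it

    client = cdsapi.Client()
    return client.retrieve(dataset, request, target)


# 1. Dataset name, 2. request keywords, 3. target file
dataset = "reanalysis-era5-single-levels-monthly-means"
request = {
    "product_type": "monthly_averaged_reanalysis",
    "variable": "2m_temperature",
    "year": "2020",
    "month": "01",
    "time": "00:00",
    "format": "netcdf",
}
target = "./data/era5_monthly_t2m_2020_01.nc"  # hypothetical file name

print(describe_request(dataset, request, target))
# To actually download the data: retrieve(dataset, request, target)
```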






<br>

<div class="alert alert-block alert-success">
<b>NOTE</b>: <br>
    By default, ECMWF data is stored on a grid with longitudes from 0 to 360 degrees. You can have it reprojected to a regular geographic latitude-longitude grid by setting the keyword arguments <code>area</code> and <code>grid</code>. By default, data is retrieved in <code>GRIB</code> format. If you wish to retrieve the data in <code>netCDF</code>, you have to specify this with the keyword argument <code>format</code>.</div>
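
To make the two longitude conventions in the note concrete, the following sketch converts between a 0 to 360 degree range and a -180 to 180 degree range. It is plain arithmetic and not part of the CDS API; the function names `to_180` and `to_360` are hypothetical.

```python
def to_180(lon):
    """Map a longitude from the 0..360 convention to the -180..180 convention."""
    return ((lon + 180) % 360) - 180


def to_360(lon):
    """Map a longitude from the -180..180 convention to the 0..360 convention."""
    return lon % 360


for lon in (0, 90, 270, 350):
    print(lon, "->", to_180(lon))
```

With this formula, 350 degrees east becomes -10, i.e. 10 degrees west, and a longitude of exactly 180 maps to -180.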

<br>

See [below](#example_requests) for some example `CDS API requests`.

<hr>

## <a id='example_requests'></a>4. Example data requests

Below, you find a list of CDS API requests that have been used to retrieve the datasets used throughout the learning modules:

* [Climate reanalysis](#climate_reanalysis)
  * [ERA5 monthly averaged data on single levels from 1979 to present](#era5_monthly)
  * [ERA5 hourly data on single levels from 1979 to present](#era5_hourly)
  * [ERA5-Land hourly data from 1950 to present](#era5-land_hourly)
* [Seasonal forecasts](#seasonal_forecasts)
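
Most of the requests below write their output into a `./data/` folder. As a precaution, and assuming the folder is not created automatically, the following sketch makes sure the parent directory of a target file exists before a request is submitted. The helper name `ensure_parent_dir` is hypothetical, and the demo uses a temporary directory.

```python
import tempfile
from pathlib import Path


def ensure_parent_dir(target):
    """Create the parent directory of `target` if needed and return the path."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


# Demo in a throwaway location; in the notebook you would pass e.g.
# './data/era5_monthly_t2m.nc' instead.
demo_root = Path(tempfile.mkdtemp())
target = ensure_parent_dir(demo_root / "data" / "era5_monthly_t2m.nc")
print(target.parent.is_dir())
```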

### <a id='climate_reanalysis'></a>4.1 Climate reanalysis

#### <a id='era5_monthly'></a>Example: **ERA5 monthly averaged data on single levels from 1979 to present**

> Data used in [111_c3s_climatologies_trends](./111_c3s_climatologies_trends.ipynb)

CDS API name: `reanalysis-era5-single-levels-monthly-means`

> - Product type: `monthly_averaged_reanalysis`
> - Variable: `2m_temperature`
> - Year: `[1979 to 2020]`
> - Month: `[01 to 12]`
> - Time: `00:00` (default)
> - Geographical area: `Whole available region` 
> - Format: `netcdf`

In [None]:
import cdsapi
c = cdsapi.Client()
c.retrieve(
    'reanalysis-era5-single-levels-monthly-means',
    {
        'product_type': 'monthly_averaged_reanalysis',
        'variable': '2m_temperature',
        'year': [
            '1979', '1980', '1981',
            '1982', '1983', '1984',
            '1985', '1986', '1987',
            '1988', '1989', '1990',
            '1991', '1992', '1993',
            '1994', '1995', '1996',
            '1997', '1998', '1999',
            '2000', '2001', '2002',
            '2003', '2004', '2005',
            '2006', '2007', '2008',
            '2009', '2010', '2011',
            '2012', '2013', '2014',
            '2015', '2016', '2017',
            '2018', '2019', '2020'
        ],
        'month': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
        ],
        'time': '00:00',
        'format': 'netcdf',
    },
    './data/era5_monthly_t2m.nc')
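
The long `year` and `month` lists do not have to be typed out by hand. As a small sketch, they can be generated with list comprehensions and placed into the same request dictionary; the variable names are chosen for this illustration only.

```python
# Years 1979 to 2020 (inclusive) and months 01 to 12 as strings
years = [str(year) for year in range(1979, 2021)]
months = [f"{month:02d}" for month in range(1, 13)]

request = {
    "product_type": "monthly_averaged_reanalysis",
    "variable": "2m_temperature",
    "year": years,
    "month": months,
    "time": "00:00",
    "format": "netcdf",
}

print(len(years), years[0], years[-1])
print(months)
```

The resulting dictionary holds the same values as the request above and can be passed to `c.retrieve()` in its place.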

<hr>

#### <a id='era5_monthly_seas5'></a>Example: **ERA5 monthly averaged data on single levels from 1979 to present**

> Data used in [122_c3s_bias_correction](./122_c3s_seasonal_forecast_bias_correction.ipynb)

CDS API name: `reanalysis-era5-single-levels-monthly-means`

> - Product type: `monthly_averaged_reanalysis`
> - Variable: `sea_surface_temperature`
> - Year: `[1993 to 2016]`
> - Month: `['01', '02', '09', '10', '11', '12']`
> - Time: `00:00` (default)
> - Geographical area: `Whole available region` 
> - Format: `grib`

In [None]:
import cdsapi
c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-single-levels-monthly-means',
    {
        'product_type': 'monthly_averaged_reanalysis',
        'variable': 'sea_surface_temperature',
        'year': [
            '1993', '1994', '1995',
            '1996', '1997', '1998',
            '1999', '2000', '2001',
            '2002', '2003', '2004',
            '2005', '2006', '2007',
            '2008', '2009', '2010',
            '2011', '2012', '2013',
            '2014', '2015', '2016',
        ],
        'month': [
            '01', '02', '09',
            '10', '11', '12',
        ],
        'time': '00:00',
        'format': 'grib',
    },
    './data/era5_monthly_1993-2016_sep-feb_sst.grib')

<hr>

#### <a id='era5_hourly'></a>Example: **ERA5 hourly data on single levels from 1979 to present**

> Data used in [112_c3s_climate_extremes](./112_c3s_climate_extremes.ipynb)

CDS API name: `reanalysis-era5-single-levels`

> - Product type: `reanalysis`
> - Variable: `2m_temperature`
> - Year: `[1979 to 2020]`
> - Month: `09`
> - Day: `[1 to 30]`
> - Time: `[00:00 to 23:00]`
> - Area: `[51, 3, 50, 4]` 
> - Format: `netcdf`

**Note:** the request above makes use of the keyword `area`, which enables you to retrieve only a geographical subset. The bounding box is specified as `[N, W, S, E]`. When this keyword is set, the data is automatically projected to a grid with longitudes in the range [-180, 180].

In [None]:
import cdsapi
c = cdsapi.Client()
c.retrieve(
    'reanalysis-era5-single-levels',
    {
        'product_type': 'reanalysis',
        'format': 'netcdf',
        'variable': '2m_temperature',
        'year': [
            '1979', '1980', '1981',
            '1982', '1983', '1984',
            '1985', '1986', '1987',
            '1988', '1989', '1990',
            '1991', '1992', '1993',
            '1994', '1995', '1996',
            '1997', '1998', '1999',
            '2000', '2001', '2002',
            '2003', '2004', '2005',
            '2006', '2007', '2008',
            '2009', '2010', '2011',
            '2012', '2013', '2014',
            '2015', '2016', '2017',
            '2018', '2019', '2020',
        ],
        'month': '09',
        'day': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
        ],
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00', '13:00', '14:00',
            '15:00', '16:00', '17:00',
            '18:00', '19:00', '20:00',
            '21:00', '22:00', '23:00',
        ],
        'area': [
            51, 3, 50, 4, # North, West, South, East
        ],
    },
    './data/era5_t2m_hourly_northern_france_sep.nc')
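
Because the order `[N, W, S, E]` is easy to mix up, a small helper with named arguments can build the `area` list and catch swapped values. This is a sketch under simple assumptions: the function `make_area` is hypothetical, and it does not handle boxes that cross the antimeridian.

```python
def make_area(north, west, south, east):
    """Return a bounding box in the [N, W, S, E] order used by `area`."""
    if not -90 <= south < north <= 90:
        raise ValueError("expected -90 <= south < north <= 90")
    if west >= east:
        raise ValueError("expected west < east")
    return [north, west, south, east]


# Bounding boxes used in the ERA5 hourly and ERA5-Land examples
print(make_area(north=51, west=3, south=50, east=4))
print(make_area(north=60, west=-10, south=35, east=30))
```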

<hr>

#### <a id='era5-land_hourly'></a>Example: **ERA5-Land hourly data from 1950 to present**

> Data used in [113_c3s_climate_indices](./113_c3s_climate_indices.ipynb)

CDS API name: `reanalysis-era5-land`

> - Variable: `['10m_u_component_of_wind', '10m_v_component_of_wind','2m_temperature']`
> - Year: `[1981 to 2020]`
> - Month: `12`
> - Day: `15`
> - Time: `12:00`
> - Area: `[60, -10, 35, 30]` # North, West, South, East
> - Format: `netcdf`

**Note:** the request above makes use of the keyword `area`, which enables you to retrieve only a geographical subset. The bounding box is specified as `[N, W, S, E]`. When this keyword is set, the data is automatically projected to a grid with longitudes in the range [-180, 180].

In [None]:
import cdsapi
c = cdsapi.Client()
c.retrieve(
    'reanalysis-era5-land',
    {
        'variable': [
            '10m_u_component_of_wind', '10m_v_component_of_wind', '2m_temperature',
        ],
        'year': [
            '1981', '1982', '1983',
            '1984', '1985', '1986',
            '1987', '1988', '1989',
            '1990', '1991', '1992',
            '1993', '1994', '1995',
            '1996', '1997', '1998',
            '1999', '2000', '2001',
            '2002', '2003', '2004',
            '2005', '2006', '2007',
            '2008', '2009', '2010',
            '2011', '2012', '2013',
            '2014', '2015', '2016',
            '2017', '2018', '2019',
            '2020',
        ],
        'month': '12',
        'day': '15',
        'time': '12:00',
        'format': 'netcdf',
        'area': [
            60, -10, 35,
            30,
        ],
    },
    './data/era5-land_eur_1981_2020.nc')
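
Before submitting a request, it can help to judge how much data it covers. The sketch below multiplies the number of selected values per keyword to get a rough count of variable and time-step combinations. The helper `count_fields` is hypothetical and says nothing about the resulting file size, which also depends on the selected area.

```python
from math import prod


def count_fields(request, keys=("variable", "year", "month", "day", "time")):
    """Multiply the number of selected values for each of the given keys."""
    sizes = []
    for key in keys:
        value = request.get(key)
        if value is None:
            continue  # keyword not used in this request
        sizes.append(len(value) if isinstance(value, (list, tuple)) else 1)
    return prod(sizes)


era5_land = {
    "variable": ["10m_u_component_of_wind", "10m_v_component_of_wind", "2m_temperature"],
    "year": [str(year) for year in range(1981, 2021)],
    "month": "12",
    "day": "15",
    "time": "12:00",
}
era5_hourly = {
    "variable": "2m_temperature",
    "year": [str(year) for year in range(1979, 2021)],
    "month": "09",
    "day": [f"{day:02d}" for day in range(1, 31)],
    "time": [f"{hour:02d}:00" for hour in range(24)],
}

print(count_fields(era5_land))    # 3 variables x 40 years
print(count_fields(era5_hourly))  # 42 years x 30 days x 24 hours
```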

<hr>

### <a id='seasonal_forecasts'></a>4.2 Seasonal forecasts

#### <a id='seas5_monthly_hindcast'></a>Example: **Seasonal forecast monthly statistics on single levels - Retrospective forecasts (Hindcasts)**

> Data used in 
> * [121_c3s_seasonal_forecast_anomalies](./121_c3s_seasonal_forecasts_anomalies.ipynb)
> * [122_c3s_seasonal_forecast_bias_correction](./122_c3s_seasonal_forecasts_bias_correction.ipynb)

CDS API name: `seasonal-monthly-single-levels`

> - Originating centre: `ecmwf`
> - System: `5`
> - Product type: `monthly_mean`
> - Variable: `['total_precipitation', 'sea_surface_temperature']`
> - Year: `[1993 to 2016]` # Hindcast data
> - Month: `09`
> - Leadtime month: `['1', '2', '3', '4', '5', '6']`
> - Geographical area: `Whole available region` 
> - Format: `grib`

In [None]:
import cdsapi

c = cdsapi.Client()

c.retrieve(
    'seasonal-monthly-single-levels',
    {
        'format': 'grib',
        'originating_centre': 'ecmwf',
        'system': '5',
        'variable': ['total_precipitation','sea_surface_temperature'],
        'product_type': 'monthly_mean',
        'year': [
            '1993', '1994', '1995',
            '1996', '1997', '1998',
            '1999', '2000', '2001',
            '2002', '2003', '2004',
            '2005', '2006', '2007',
            '2008', '2009', '2010',
            '2011', '2012', '2013',
            '2014', '2015', '2016',
        ],
        'month': '09',
        'leadtime_month': [
            '1', '2', '3',
            '4', '5', '6',
        ],
    },
    './data/ecmwf_seas5_1993-2016_09_hindcast_monthly.grib')
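
The `leadtime_month` values can be translated into calendar months. The sketch below assumes that lead time 1 corresponds to the start month itself, which is consistent with the September to February selection in the ERA5 sea surface temperature request above but is not stated explicitly in this notebook. The function name `valid_months` is hypothetical.

```python
def valid_months(start_month, lead_times):
    """Return the calendar month (1-12) for each lead time.

    Assumption: lead time 1 is the start month itself.
    """
    return [(start_month + lead - 2) % 12 + 1 for lead in lead_times]


months = valid_months(9, range(1, 7))
print(months)

# Same months as zero-padded strings, sorted as in a CDS API request
era5_months = sorted(f"{month:02d}" for month in months)
print(era5_months)
```

Under this assumption, a September start with lead times 1 to 6 covers September through February.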

<br>

#### <a id='seas5_monthly_hindcast_mean'></a>Example: **Seasonal forecast monthly statistics on single levels - Hindcast climate mean**

> Data used in [121_c3s_seasonal_forecast_anomalies](./121_c3s_seasonal_forecasts_anomalies.ipynb)

CDS API name: `seasonal-monthly-single-levels`

> - Originating centre: `ecmwf`
> - System: `5`
> - Product type: `hindcast_climate_mean`
> - Variable: `'total_precipitation'`
> - Year: `2021`
> - Month: `09`
> - Leadtime month: `['1', '2', '3', '4', '5', '6']`
> - Geographical area: `Whole available region` 
> - Format: `grib`

In [None]:
import cdsapi

c = cdsapi.Client()

c.retrieve(
    'seasonal-monthly-single-levels',
    {
        'format': 'grib',
        'originating_centre': 'ecmwf',
        'system': '5',
        'variable': 'total_precipitation',
        'product_type': 'hindcast_climate_mean',
        'month': '09',
        'leadtime_month': [
            '1', '2', '3',
            '4', '5', '6',
        ],
        'year': '2021',
    },
    './data/ecmwf_seas5_hincast_climate_mean_tp.grib')

<br>

#### <a id='seas5_monthly_forecast'></a>Example: **Seasonal forecast monthly statistics on single levels - Forecasts**

> Data used in 
> * [121_c3s_seasonal_forecast_anomalies](./121_c3s_seasonal_forecasts_anomalies.ipynb)
> * [122_c3s_seasonal_forecast_bias_correction](./122_c3s_seasonal_forecasts_bias_correction.ipynb)

CDS API name: `seasonal-monthly-single-levels`

> - Originating centre: `ecmwf`
> - System: `5`
> - Product type: `monthly_mean`
> - Variable: `['total_precipitation', 'sea_surface_temperature']`
> - Year: `2021`
> - Month: `09`
> - Leadtime month: `['1', '2', '3', '4', '5', '6']`
> - Geographical area: `Whole available region` 
> - Format: `grib`

In [None]:
import cdsapi

c = cdsapi.Client()

c.retrieve(
    'seasonal-monthly-single-levels',
    {
        'format': 'grib',
        'originating_centre': 'ecmwf',
        'system': '5',
        'variable': ['total_precipitation', 'sea_surface_temperature'],
        'product_type': 'monthly_mean',
        'year': '2021',
        'month': '09',
        'leadtime_month': [
            '1', '2', '3',
            '4', '5', '6',
        ],
    },
    './data/ecmwf_seas5_2021_09_forecast_monthly.grib')
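
The hindcast and forecast requests for `seasonal-monthly-single-levels` differ only in the `year` keyword. As a sketch, both can be derived from one shared dictionary so that the remaining keywords stay in sync; the variable names are chosen for this illustration.

```python
# Keywords shared by the hindcast and the forecast request
base = {
    "format": "grib",
    "originating_centre": "ecmwf",
    "system": "5",
    "variable": ["total_precipitation", "sea_surface_temperature"],
    "product_type": "monthly_mean",
    "month": "09",
    "leadtime_month": ["1", "2", "3", "4", "5", "6"],
}

# Hindcast: 1993 to 2016, forecast: 2021
hindcast = {**base, "year": [str(year) for year in range(1993, 2017)]}
forecast = {**base, "year": "2021"}

print(len(hindcast["year"]), hindcast["year"][0], hindcast["year"][-1])
print(forecast["year"])
```

Each dictionary can then be passed to `c.retrieve()` together with the dataset name and the respective target file.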

<br>

#### <a id='seas5_monthly_anomalies'></a>Example: **Seasonal forecast anomalies on single levels**

> Data used in [121_c3s_seasonal_forecast_anomalies](./121_c3s_seasonal_forecasts_anomalies.ipynb)

CDS API name: `seasonal-postprocessed-single-levels`

> - Originating centre: `ecmwf`
> - System: `5`
> - Product type: `ensemble_mean`
> - Variable: `['total_precipitation_anomalous_rate_of_accumulation']`
> - Year: `2021`
> - Month: `09`
> - Leadtime month: `['1', '2', '3', '4', '5', '6']`
> - Geographical area: `Whole available region` 
> - Format: `grib`

In [None]:
import cdsapi

c = cdsapi.Client()

c.retrieve(
    'seasonal-postprocessed-single-levels',
    {
        'format': 'grib',
        'originating_centre': 'ecmwf',
        'system': '5',
        'variable': 'total_precipitation_anomalous_rate_of_accumulation',
        'product_type': 'ensemble_mean',
        'year': '2021',
        'month': '09',
        'leadtime_month': [
            '1', '2', '3',
            '4', '5', '6',
        ],
    },
    './ecmwf_seas5_anomalies_2021_09_tp.grib')

<hr>

<p><img src='./img/copernicus_logo.png' align='right' alt='Logo EU Copernicus' width='20%'></img></p>
<br><br><br><br><br>
<span style='float:right'><p style="text-align:right;">This project is licensed under <a href="./LICENSE">APACHE License 2.0</a>. | <a href="https://github.com/ecmwf-projects/copernicus-training">View on GitHub</a></p></span>