# ESDP1 – Homework 1
## ERA5 Data Access (Pelin Su Kaplan)

In [None]:
import cdsapi
import xarray as xr
import matplotlib.pyplot as plt
import numpy as np

# Step 1: CDS API configuration

In this step, we initialize the CDS API client (`cdsapi.Client`) that will be used to request ERA5 data.
The client automatically reads the API key stored in the `~/.cdsapirc` file and establishes a connection
to the Copernicus Climate Data Store (CDS) service.

This step only creates the client and confirms that the credentials file can be read.
No data is downloaded here.
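
To make the credentials step concrete, here is a minimal sketch of a pre-flight check on the credentials file. It assumes the file holds one `url: ...` line and one `key: ...` line. The helper names `read_cdsapirc` and `missing_settings` are hypothetical and not part of `cdsapi`, and the demo writes a throwaway file with placeholder values instead of touching the real `~/.cdsapirc`.

```python
import os
import tempfile
from pathlib import Path


def read_cdsapirc(path):
    """Parse 'name: value' lines from a .cdsapirc-style file into a dict."""
    settings = {}
    for line in Path(path).read_text().splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a 'name: value' shape
            settings[name.strip()] = value.strip()
    return settings


def missing_settings(path=None):
    """Return the required entries ('url', 'key') that are absent or empty."""
    path = Path(path) if path is not None else Path.home() / ".cdsapirc"
    if not path.is_file():
        return ["url", "key"]
    settings = read_cdsapirc(path)
    return [name for name in ("url", "key") if not settings.get(name)]


# Demo on a temporary file with placeholder values (not real credentials)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "cdsapirc_demo")
    with open(demo_path, "w") as fh:
        fh.write("url: https://example.invalid/api\nkey: placeholder-key\n")
    demo_settings = read_cdsapirc(demo_path)
    demo_missing = missing_settings(demo_path)
    absent_missing = missing_settings(os.path.join(tmp, "does_not_exist"))

print(demo_settings, demo_missing, absent_missing)
```

Calling `missing_settings()` with no argument checks `~/.cdsapirc`. An empty list means both entries are present; it says nothing about whether the key is valid.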

In [4]:
import cdsapi

# Initialize the CDS API client using the credentials stored in ~/.cdsapirc
client = cdsapi.Client()

# Display the client object to confirm it was created
client

<ecmwf.datastores.legacy_client.LegacyClient at 0x10f3ac770>

# Step 2: ERA5 download request

In this step, I define and submit a download request for a small ERA5 subset using the CDS API client.

For this homework, I request:
- Dataset: **ERA5 single levels** (`reanalysis-era5-single-levels`)
- Variables: 10 m u-component of wind, 10 m v-component of wind, 2 m temperature
- Region: a box over Central Europe (45°N to 60°N, 10°W to 20°E)
- Time period: 1 to 5 September 2024, at 00, 06, 12, and 18 UTC
- Output format: NetCDF

The request is passed as a Python dictionary to `client.retrieve(...)`, which sends it to the CDS
server and saves the downloaded file to disk.
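
Before submitting, the size of the request can be estimated with simple arithmetic. The sketch below assumes the 0.25° regular latitude-longitude grid on which ERA5 single-level fields are distributed, and uses the bounding box 60°N, 10°W, 45°N, 20°E with five days of four time steps each. `check_area` and `grid_points` are hypothetical helpers, not CDS API functions.

```python
def check_area(north, west, south, east):
    """Validate a [north, west, south, east] box in degrees and return it as a list."""
    if not (-90 <= south < north <= 90):
        raise ValueError("expected -90 <= south < north <= 90")
    if not (west < east):
        raise ValueError("expected west < east")
    return [north, west, south, east]


def grid_points(start, stop, step=0.25):
    """Count grid points from start to stop (inclusive) at the given spacing."""
    return round(abs(stop - start) / step) + 1


north, west, south, east = check_area(60, -10, 45, 20)

n_lat = grid_points(south, north)   # latitudes from 45 to 60
n_lon = grid_points(west, east)     # longitudes from -10 to 20
n_time = 5 * 4                      # 5 days x 4 hours per day
n_vars = 3                          # u10, v10, t2m
n_values = n_vars * n_time * n_lat * n_lon

print(n_lat, n_lon, n_time, n_values)  # 61 121 20 442860
```

These counts are consistent with the coordinate byte sizes reported in Step 4 (160 B, 488 B, and 968 B at 8 bytes per value).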

In [None]:
# Define the ERA5 dataset name on the CDS
dataset = "reanalysis-era5-single-levels"

# Define the download request as a Python dictionary
request = {
    "product_type": "reanalysis",
    "variable": [
        "10m_u_component_of_wind",
        "10m_v_component_of_wind",
        "2m_temperature",
    ],
    "year": "2024",
    "month": "09",
    # A small subset of days to keep the download manageable
    "day": [
        "01", "02", "03", "04", "05",
    ],
    # Four synoptic hours per day
    "time": [
        "00:00", "06:00", "12:00", "18:00",
    ],
    # Area is given as [north, west, south, east] in degrees
    # Here: a rough box over Central Europe
    "area": [
        60,   -10,   # north, west
        45,    20,   # south, east
    ],
    "format": "netcdf",  # NetCDF is convenient for xarray
}

# Define the output filename
target_file = "era5_central_europe_sept2024_small.nc"

# Send the request to CDS and download the data
client.retrieve(dataset, request, target_file)

# Step 3: Open dataset with xarray

In [6]:
import xarray as xr

# Path to the downloaded ERA5 NetCDF file
data_file = "era5_central_europe_sept2024_small.nc"

# Open the dataset with xarray
ds = xr.open_dataset(data_file)

# Display the dataset object to see its basic structure
ds

# Step 4: Inspect dataset

In this step, we examine the structure of the ERA5 dataset using xarray.
We check the dimensions, coordinates, data variables, and global attributes to
understand what the data contains before plotting or further analysis.
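
The inspection cell that follows lists structure and metadata only. If per-variable statistics are also wanted, they could be computed along these lines. This is a minimal sketch: `summarize` is a hypothetical helper, and the dataset is a synthetic stand-in that reuses the ERA5 dimension and variable names with random placeholder values, so it runs without the downloaded file.

```python
import numpy as np
import xarray as xr


def summarize(ds):
    """Return {variable name: (min, mean, max)} for every data variable."""
    return {
        name: (float(da.min()), float(da.mean()), float(da.max()))
        for name, da in ds.data_vars.items()
    }


# Synthetic stand-in with the same dimension names as the ERA5 file.
# The values are random placeholders, not ERA5 data.
rng = np.random.default_rng(0)
dims = ("valid_time", "latitude", "longitude")
shape = (20, 61, 121)
demo = xr.Dataset(
    {
        "u10": (dims, rng.normal(0.0, 5.0, shape)),
        "v10": (dims, rng.normal(0.0, 5.0, shape)),
        "t2m": (dims, 280.0 + 10.0 * rng.random(shape)),
    },
    coords={
        # 6-hourly steps from 1 to 5 September 2024 (20 values)
        "valid_time": np.arange(
            np.datetime64("2024-09-01T00"),
            np.datetime64("2024-09-06T00"),
            np.timedelta64(6, "h"),
        ).astype("datetime64[ns]"),
        "latitude": np.linspace(60.0, 45.0, 61),
        "longitude": np.linspace(-10.0, 20.0, 121),
    },
)

stats = summarize(demo)
for name, (vmin, vmean, vmax) in stats.items():
    print(f"{name}: min={vmin:.2f} mean={vmean:.2f} max={vmax:.2f}")
```

With the real file, `summarize(ds)` would be applied to the dataset opened in Step 3, where `t2m` is in kelvin and the wind components are in m/s.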

In [7]:
# Inspect the dataset by showing its dimensions, coordinates, variables, and metadata
ds.dims, ds.coords, list(ds.data_vars), ds.attrs

 Coordinates:
     number      int64 8B ...
   * valid_time  (valid_time) datetime64[ns] 160B 2024-09-01 ... 2024-09-05T18...
   * latitude    (latitude) float64 488B 60.0 59.75 59.5 ... 45.5 45.25 45.0
   * longitude   (longitude) float64 968B -10.0 -9.75 -9.5 ... 19.5 19.75 20.0
     expver      (valid_time) <U4 320B ...,
 ['u10', 'v10', 't2m'],
 {'GRIB_centre': 'ecmf',
  'GRIB_centreDescription': 'European Centre for Medium-Range Weather Forecasts',
  'GRIB_subCentre': np.int64(0),
  'Conventions': 'CF-1.7',
  'institution': 'European Centre for Medium-Range Weather Forecasts',
  'history': '2025-12-01T21:19 GRIB to CDM+CF via cfgrib-0.9.15.1/ecCodes-2.42.0 with {"source": "tmpjrdim979/data.grib", "filter_by_keys": {"stream": ["oper"], "stepType": ["instant"]}, "encode_cf": ["parameter", "time", "geography", "vertical"]}'})