# Working with NetCDF Data in Python Using xarray and CDS API

## Overview

In this exercise, you will learn how to:

- Programmatically download **ERA5 reanalysis** climate data over a **long time period (15 years)**
- Focus on a **small geographic region** to keep data size manageable
- Load and inspect the downloaded NetCDF dataset using **xarray**
- Extract a **time series** of temperature at a specific geographic point
- Calculate and visualize the **temperature anomaly** relative to the long-term climatology

---

## Why This Task?

Working with climate and atmospheric data often involves large datasets spanning many years and wide geographic areas.  
To handle and analyze such data efficiently, you need to:

- Request programmatically only the data you actually need.
- Use Python tools like **xarray** to explore, subset, and analyze the data.
- Compute anomalies, which highlight deviations from average conditions and are widely used in climate studies.

---

## Task Details and Steps

### 1. Find and Request the Dataset

- Visit the [Copernicus Climate Data Store (CDS)](https://cds.climate.copernicus.eu/).
- Search for the dataset named **"ERA5 hourly data on single levels"**, which has the dataset ID **reanalysis-era5-single-levels**.
- This dataset provides ERA5 reanalysis data on single vertical levels, suitable for near-surface temperature and other variables.
- Make sure to choose **NetCDF** as the output format (`"format": "netcdf"`) and `"product_type": "reanalysis"`.

> **Tip:**  
> If you fill in the download form in the CDS web interface, the platform shows the equivalent **API request** snippet.  
> You can copy this snippet into your Python scripts and adapt it to automate the download.

---

### 2. Download Data Programmatically

- Use the CDS API to download **2m temperature data** over 15 years for a very small region.
- ERA5 is an hourly product, so request a single hour per day (e.g., 12:00 UTC). This yields one value per day and keeps the file small.
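
The request described above can be sketched as a small helper that builds the parameter dictionary. This is only an illustration: `build_request` and `count_time_steps` are hypothetical names, the bounding box is an arbitrary example, and the keys simply mirror the template in this exercise. Compare them with the snippet that the CDS web form generates for your account before relying on them.

```python
from datetime import date


def build_request(variable, year_start, year_end, area, hour="12:00"):
    """Build a request dictionary for one variable at one hour per day (hypothetical helper)."""
    years = [str(y) for y in range(year_start, year_end + 1)]  # inclusive range
    return {
        "product_type": "reanalysis",
        "format": "netcdf",
        "variable": variable,
        "year": years,
        "month": [f"{m:02d}" for m in range(1, 13)],
        "day": [f"{d:02d}" for d in range(1, 32)],
        "time": [hour],
        "area": area,  # [North, West, South, East]
    }


def count_time_steps(year_start, year_end):
    """Count calendar days in the inclusive year range (one time step per day)."""
    return (date(year_end + 1, 1, 1) - date(year_start, 1, 1)).days


# Illustrative values only: an arbitrary 1 x 1 degree box and the 2007-2021 example range.
request = build_request("2m_temperature", 2007, 2021, [48.0, 11.0, 47.0, 12.0])
print(len(request["year"]), "years,", count_time_steps(2007, 2021), "daily time steps")
```

Counting the time steps before submitting a request is a quick way to judge how large the resulting file will be.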

---

### 3. Load and Explore the Dataset

- Open the downloaded NetCDF file with **xarray**.
- Inspect metadata, variables, and dimensions.
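
The inspection calls can be tried without a download by building a tiny synthetic dataset. The sketch below assumes a variable named `t2m` and coordinates named `time`, `latitude`, and `longitude`; a real file may use different names, which is exactly what this inspection step is meant to reveal.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for a downloaded file: 3 days on a 2 x 2 grid.
# The names "t2m", "time", "latitude", and "longitude" are assumptions.
time = pd.date_range("2007-01-01", periods=3, freq="D")
ds = xr.Dataset(
    data_vars={
        "t2m": (
            ("time", "latitude", "longitude"),
            280.0 + np.zeros((3, 2, 2)),
            {"units": "K"},
        ),
    },
    coords={"time": time, "latitude": [48.0, 47.0], "longitude": [11.0, 12.0]},
)

print(ds)                   # full summary: dimensions, coordinates, variables
print(list(ds.data_vars))   # names of the data variables
print(dict(ds.sizes))       # length of each dimension
print(ds["t2m"].attrs)      # per-variable metadata such as units
```

With a real file, replace the synthetic `ds` with `xr.open_dataset(output_file)` and run the same four calls.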

---

### 4. Extract a Time Series at a Specific Location

- Select a point within your region and extract the temperature time series.
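
Point extraction relies on `.sel(..., method="nearest")`, which picks the grid cell closest to the requested coordinates. The following sketch uses a synthetic 3 x 3 grid with assumed coordinate names so that the result is easy to check by hand.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic grid with 0.25 degree spacing; latitudes are stored north to south.
time = pd.date_range("2007-01-01", periods=4, freq="D")
lats = [48.0, 47.75, 47.5]
lons = [11.0, 11.25, 11.5]

# Each value encodes its position in the array, so the selected cell can be verified.
values = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)
da = xr.DataArray(
    values,
    coords={"time": time, "latitude": lats, "longitude": lons},
    dims=("time", "latitude", "longitude"),
    name="t2m",
)

# The requested point is not on the grid; the nearest grid cell is returned instead.
ts = da.sel(latitude=47.7, longitude=11.3, method="nearest")
print(float(ts.latitude), float(ts.longitude))
print(ts.dims)
```

Printing the coordinates of the result is a useful habit, because the returned cell can differ noticeably from the requested point on a coarse grid.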

---

### 5. Calculate and Plot the Temperature Anomaly

- Compute the climatological mean (the average for each day of the year, taken across all years).
- Calculate the anomaly by subtracting the climatology.
- Visualize the anomaly time series.
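
The anomaly calculation can be checked on synthetic data where the answer is known in advance. In the sketch below, the series is a seasonal cycle plus a constant offset per year (-1, 0, +1), so the anomaly should recover exactly that offset. All values are invented for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series over 3 years: seasonal cycle plus a per-year offset.
time = pd.date_range("2007-01-01", "2009-12-31", freq="D")
day_of_year = time.dayofyear.to_numpy()
seasonal = 10.0 * np.sin(2 * np.pi * day_of_year / 365.25)
yearly_offset = (time.year.to_numpy() - 2008).astype(float)  # -1, 0, +1

ts = xr.DataArray(
    283.0 + seasonal + yearly_offset,
    coords={"time": time},
    dims="time",
    name="t2m",
)

# Climatology: mean over all years for each day of the year.
climatology = ts.groupby("time.dayofyear").mean("time")

# Anomaly: subtract the matching day-of-year mean from every time step.
anomaly = ts.groupby("time.dayofyear") - climatology

print(climatology.sizes["dayofyear"])   # includes day 366 from the leap year
print(np.allclose(anomaly.values, yearly_offset))
```

Calling `anomaly.plot()` on the result draws the anomaly time series. Note that day 366 occurs only in leap years, so its climatological value is based on fewer samples than the other days.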

---

## Expected Outcomes

- A local NetCDF file with ERA5 data.
- Familiarity with CDS API requests and metadata inspection.
- Experience in time series extraction and anomaly calculation.

---

## Tips for Success

- Make sure your `.cdsapirc` file is correctly configured with your API credentials.
- Start with small spatial/temporal subsets for quick testing.
- Explore xarray’s powerful time grouping and selection tools.
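
For the first tip, the credentials file is a short text file with a `url:` line and a `key:` line. The sketch below writes such a file to a temporary location using placeholder values; `write_cdsapirc` is a hypothetical helper, and the real URL and key must be copied from your CDS profile page into `.cdsapirc` in your home directory.

```python
import tempfile
from pathlib import Path


def write_cdsapirc(path, url, key):
    """Write a two-line credentials file (hypothetical helper)."""
    path = Path(path)
    path.write_text(f"url: {url}\nkey: {key}\n")
    return path


# Placeholders only: these values will not authenticate against the CDS.
target = Path(tempfile.gettempdir()) / "example.cdsapirc"
example = write_cdsapirc(target, "<your-cds-api-url>", "<your-api-key>")
print(example.read_text())
```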

---

Ready to get started? Complete the provided code template step by step!


In [None]:
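# Install the required packages with pip (in a notebook cell, prefix the command with "!"):
# pip install cdsapi xarray netCDF4 matplotlib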

## Create credentials file

The `cdsapi` client reads its `url` and `key` from a `.cdsapirc` file in your home directory.

### Note: you can also pass credentials directly to `cdsapi.Client`

```python
import cdsapi

c = cdsapi.Client(
    url="https://cds.climate.copernicus.eu/api/v2",
    key="12345:abcdefg-...."
)
```

In [None]:
import cdsapi
import xarray as xr
import matplotlib.pyplot as plt

# --- Part 1: Initialize CDS API client ---
c = cdsapi.Client()

# TODO: Define the variable name (e.g., "2m_temperature")
variable = "___"

# TODO: Define the years range (15 years, e.g., 2007 to 2021)
year_start = "___"
year_end = "___"

# TODO: Define the spatial bounding box [North, West, South, East]
area = [___, ___, ___, ___]  # small region for manageable data size

# TODO: Prepare request parameters dictionary
request_params = {
    "product_type": "reanalysis",
    "format": "netcdf",
    "variable": variable,
    "year": [...],
    "month": [f"{m:02d}" for m in range(1, 13)],
    "day": [f"{d:02d}" for d in range(1, 32)],
    "time": ["12:00"],
    "area": area
}

# TODO: Define the output filename
output_file = f"era5_{variable}_{year_start}_{year_end}_smallregion.nc"

# TODO: Uncomment to download the data (may take some time and requires API access)
# c.retrieve("reanalysis-era5-single-levels", request_params, output_file)

# --- Part 2: Load the dataset ---
# TODO: Load the downloaded NetCDF file using xarray
# ds = xr.open_dataset(output_file)

# TODO: Print dataset summary and inspect variables
# print(ds)
# print(ds.variables)
# print(ds.coords)

# --- Part 3: Extract time series at a specific point ---
# TODO: Choose latitude and longitude within bounding box
# lat_sel = ___
# lon_sel = ___

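# TODO: Extract temperature time series at the selected point
# Note: inside the file the variable is stored under its short name (typically "t2m"
# for 2m temperature), not under the request name in `variable`; check ds.data_vars.
# ts = ds["t2m"].sel(latitude=lat_sel, longitude=lon_sel, method="nearest")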

# TODO: Plot the time series
# ts.plot()
# plt.title(f"Time series of {variable} at lat={lat_sel}, lon={lon_sel}")
# plt.show()

# --- Part 4: Calculate anomaly ---
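# TODO: Compute climatological mean grouped by day of year
# Note: some CDS files may name the time coordinate "valid_time" instead of "time";
# if yours does, rename it first: ts = ts.rename({"valid_time": "time"})
# climatology = ts.groupby("time.dayofyear").mean("time")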

# TODO: Calculate anomaly by subtracting climatology from time series
# anomaly = ts.groupby("time.dayofyear") - climatology

# TODO: Plot anomaly time series
# anomaly.plot()
# plt.title(f"Anomaly of {variable} at lat={lat_sel}, lon={lon_sel}")
# plt.show()
