# Data Collection and Integration of ERA5-Land Dataset

In this notebook, we extend our data collection process to include historical weather data from the ERA5-Land dataset, provided by the Copernicus Climate Data Store (CDS). This gives us data going back to 1981 and therefore a much longer training history for our predictive models.

We will:

- Set up the CDS API and install the necessary libraries.
- Download data for the variables: `temperature_2m_max`, `temperature_2m_min`, `rain_sum`, `snowfall_sum`.
- Process and save the data in the same format and structure as our existing datasets.
- Integrate the new data with our existing data cleaning pipeline.
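
To make the processing step more concrete, here is a minimal sketch of how daily maximum and minimum temperatures could be derived from hourly values. It assumes the hourly 2 m temperature has already been extracted into a pandas Series in kelvin (the unit ERA5-Land uses for temperature). The function name `daily_temperature_extremes` and the synthetic input values are illustrative only and are not part of the existing pipeline.

```python
import pandas as pd


def daily_temperature_extremes(hourly_kelvin: pd.Series) -> pd.DataFrame:
    """Aggregate hourly temperatures (K) into daily max/min columns (degrees C)."""
    celsius = hourly_kelvin - 273.15
    # Group the hourly values by calendar day and take the extremes.
    daily = celsius.resample("D").agg(["max", "min"])
    daily.columns = ["temperature_2m_max", "temperature_2m_min"]
    return daily


# Synthetic two-day hourly series, used only to exercise the function.
index = pd.date_range("1981-01-01", periods=48, freq=pd.Timedelta(hours=1))
values = [270.0 + (h % 24) * 0.5 + 2.0 * (h // 24) for h in range(48)]
hourly = pd.Series(values, index=index)

daily = daily_temperature_extremes(hourly)
print(daily)
```

Precipitation and snowfall need more care than temperature, because the way those fields are accumulated over time has to be taken into account before summing to daily totals.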


## 2. Set Up the CDS API Personal Access Token

To access data programmatically from the CDS, we need to set up the CDS API key.

**Steps:**

1. **Register or Log In to the CDS:**
   - If you do not have an account, [register](https://cds.climate.copernicus.eu/user/register).
   - If you already have an account, [log in](https://cds.climate.copernicus.eu/user/login).

2. **Agree to Terms of Use:**
   - Navigate to the [ERA5-Land dataset page](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land).
   - Scroll to the bottom and click "Accept terms".

3. **Retrieve Your API Key:**
   - Go to your [API key page](https://cds.climate.copernicus.eu/api-how-to).
   - Copy the code snippet containing your unique API key.

4. **Create the `.cdsapirc` File:**
   - Create a file named `.cdsapirc` in your home directory.
   - Paste the following content into it (replace with your actual key):

     ```
     url: https://cds.climate.copernicus.eu/api/v2
     key: your-uid:your-api-key
     ```
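
Before making any requests, it can help to confirm that the file is in place and contains both entries. The following is a small sketch that reads `~/.cdsapirc` and reports whether the `url` and `key` fields are present. The helper names `parse_cdsapirc` and `report_config` are hypothetical, and the check only inspects the file layout shown above; it does not validate the key against the CDS.

```python
from pathlib import Path


def parse_cdsapirc(text: str) -> dict:
    """Parse 'name: value' lines into a dict, splitting on the first colon only."""
    config = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():
            config[name.strip()] = value.strip()
    return config


def report_config(path: Path) -> str:
    """Return a short status message describing the config file at `path`."""
    if not path.exists():
        return f"{path} not found"
    missing = {"url", "key"} - parse_cdsapirc(path.read_text(encoding="utf-8")).keys()
    if missing:
        return f"{path} is missing: {', '.join(sorted(missing))}"
    return f"{path} looks complete"


print(report_config(Path.home() / ".cdsapirc"))
```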

## 3. Install the CDS API Client

The CDS API client is a Python-based library that allows us to access the CDS programmatically.

**Installation:**

- Run the following command in your terminal or use a code cell to install via `pip`:

  ```bash
  pip install cdsapi
  ```

In [1]:
# Install the CDS API client
%pip install cdsapi

Note: you may need to restart the kernel to use updated packages.


## 4. Import Required Libraries

In [2]:
import cdsapi
import pandas as pd
import xarray as xr
import os
import numpy as np
from datetime import datetime

## 5. Define the List of Resorts and Their Coordinates

We need to specify the latitude and longitude of each resort to download data specific to their locations.

In [3]:
# Dictionary of resorts with their coordinates
resorts = {
    'french_alps/chamonix': {'lat': 45.9237, 'lon': 6.8694},
    'french_alps/val_d_isere_tignes': {'lat': 45.4469, 'lon': 6.9790},
    'french_alps/les_trois_vallees': {'lat': 45.3781, 'lon': 6.6374},
    'austrian_alps/st_anton': {'lat': 47.1287, 'lon': 10.2643},
    'austrian_alps/kitzbuhel': {'lat': 47.4467, 'lon': 12.3929},
    'austrian_alps/solden': {'lat': 46.9690, 'lon': 11.0106},
    'swiss_alps/zermatt': {'lat': 46.0207, 'lon': 7.7491},
    'swiss_alps/st_moritz': {'lat': 46.4907, 'lon': 9.8355},
    'swiss_alps/verbier': {'lat': 46.0965, 'lon': 7.2269},
    'italian_alps/cortina_d_ampezzo': {'lat': 46.5405, 'lon': 12.1357},
    'italian_alps/val_gardena': {'lat': 46.5719, 'lon': 11.7173},
    'italian_alps/sestriere': {'lat': 44.9555, 'lon': 6.8835},
    'slovenian_alps/kranjska_gora': {'lat': 46.4847, 'lon': 13.7836},
    'slovenian_alps/mariborsko_pohorje': {'lat': 46.5152, 'lon': 15.5931},
    'slovenian_alps/krvavec': {'lat': 46.2971, 'lon': 14.5375},
}
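
CDS requests can be limited to a geographic sub-region through an `area` entry given as `[north, west, south, east]`. As a rough sketch, each resort's point coordinates could be turned into a small box like this. The helper name `bounding_box` and the default `margin` of 0.1° are assumptions (the margin is chosen to match the 0.1° ERA5-Land grid spacing), and only two of the resorts from the dictionary above are repeated here to keep the example self-contained.

```python
def bounding_box(lat: float, lon: float, margin: float = 0.1) -> list:
    """Return [north, west, south, east] around a point, rounded to 2 decimals."""
    return [
        round(lat + margin, 2),  # north
        round(lon - margin, 2),  # west
        round(lat - margin, 2),  # south
        round(lon + margin, 2),  # east
    ]


# Subset of the resorts defined above, repeated so the snippet runs on its own.
sample_resorts = {
    'french_alps/chamonix': {'lat': 45.9237, 'lon': 6.8694},
    'austrian_alps/st_anton': {'lat': 47.1287, 'lon': 10.2643},
}

areas = {
    name: bounding_box(coords['lat'], coords['lon'])
    for name, coords in sample_resorts.items()
}

for name, area in areas.items():
    print(name, area)
```

A box this small typically covers only a handful of grid cells, which keeps downloads light; the cell nearest to the resort can then be selected during processing.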


### 6.1 Function to Download ERA5-Land Data

We define a function `download_era5_land_data` that uses the CDS API to download the required variables for a given resort.
