# Downloading ERA5 data 

This notebook shows how to use the module 'era5_from_ee' to download ERA5 data from Google Earth Engine (GEE). The example is closely tied to the creation of a fire propagation database: after the individual burned area propagation files have been created, ERA5 data is downloaded to match the shape, time and geographical position of the burned area images. The code can nevertheless be used to download ERA5 data for any other shape.

ERA5 reanalysis data is provided at hourly resolution. Accordingly, one file is downloaded for every hour of a fire propagation interval. The length of this interval can be adapted manually in the code.
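
To make the hourly granularity concrete, the following minimal sketch lists the hourly timestamps covered by one propagation interval. The helper `hourly_timestamps` and the example date are hypothetical and not part of 'era5_from_ee'. The sketch also assumes that the interval starts at the given hour and excludes the hour at which it ends, which may differ from how the module counts.

```python
from datetime import datetime, timedelta


def hourly_timestamps(start, n_hours):
    """Return n_hours consecutive hourly timestamps beginning at `start`."""
    return [start + timedelta(hours=h) for h in range(n_hours)]


# Hypothetical 12-hour propagation interval starting at midnight.
stamps = hourly_timestamps(datetime(2023, 7, 1, 0), 12)
print(len(stamps), "hourly files expected, from", stamps[0], "to", stamps[-1])
```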

ERA5 includes 50 different meteorological variables, and most tasks need only a few of them. To avoid unnecessarily large downloads, the required variables are selected in Google Earth Engine before they are downloaded together.

### 1. Importing the necessary libraries

Most of the required libraries can be installed as usual (e.g., from conda-forge into a dedicated conda environment). However, geopandas, rasterio and rioxarray all depend on GDAL, which can sometimes lead to conflicts. These can be avoided by installing the packages in a specific order. In this example, we used Python 3.11.9 and installed the packages in the following order:
1. conda install -c conda-forge gdal
2. conda install -c conda-forge geopandas rasterio
3. conda install -c conda-forge rioxarray
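
After installation, a quick check that the packages can be located helps catch a broken environment early. The following is a minimal sketch using only the standard library. The helper `find_missing_packages` is hypothetical and not part of 'era5_from_ee'. Note that GDAL's Python bindings are imported under the name `osgeo`.

```python
from importlib.util import find_spec


def find_missing_packages(names):
    """Return the names from `names` that cannot be located in this environment."""
    # find_spec only checks that a module can be found; it does not import it,
    # so a package with a broken binary dependency may still pass this check.
    return [name for name in names if find_spec(name) is None]


# GDAL is imported as `osgeo`; the other names match their conda package names.
required = ["osgeo", "geopandas", "rasterio", "rioxarray"]
missing = find_missing_packages(required)
print("Missing packages:", missing if missing else "none")
```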

In [None]:
# import necessary libraries
import pandas as pd
import geopandas as geo
import datetime

### 2. Importing the module for downloading ERA5 data from Google Earth Engine

In [None]:
from era5_from_ee import retrieve_era5_from_ee

Read the geopandas dataframe that contains the shapes for which the ERA5 data should be extracted.
This dataframe needs to be reprojected to EPSG:4326, since this is the CRS of ERA5 in Google Earth Engine.

If this code is run directly after creating the fire propagation database, gdf_poly already exists as a variable and can be used as is.

In [None]:
gdf_poly = geo.read_file(".../shapes_of_burned_areas.shp")

gdf_poly_4326 = gdf_poly.to_crs(4326)

# The date is needed to select the correct data from the GEE collection.
# This line ensures that the date column holds datetime objects.
gdf_poly_4326["date"] = pd.to_datetime(gdf_poly_4326["date"])

To download ERA5 data from GEE, we wrote the 'retrieve_era5_from_ee' class.
A new instance of this class needs to be initialized. 

Initialization requires two arguments:
1. A geopandas dataframe with the shapes for which you want to retrieve the ERA5 data, and
2. Your Google Earth Engine project name. This is needed for authentication purposes.

In [None]:
# initialize the retrieve_era5_from_ee class
get_era5 = retrieve_era5_from_ee(burned_area_poly=gdf_poly_4326, proj_ee = 'your GEE project name')

Using the Google Earth Engine API with Python requires authentication.
'authenticate_ee()' takes care of this. For it to work properly, the correct project name needs to be provided when initializing the new instance of the 'retrieve_era5_from_ee' class.

In [None]:
# authenticate with Google Earth Engine
get_era5.authenticate_ee()

After the files have been transferred from server to client (i.e., the data has been downloaded from GEE), they are saved as raster files in the specified output path.

In this example, only the ERA5 variables relevant to fire-related tasks are downloaded. If other variables are needed, the comp_names parameter can be adjusted accordingly.

In [None]:
# set path to save the raster files
out_path = ".../ERA5"

# get start and end date and transform them to the correct type
# end_date is set a given number of days after the start day and bounds the period for which ERA5 data can be retrieved.
# Since wildfire propagation is given in 12-hour intervals, a single day is used in this example.
end_date = str(gdf_poly_4326.iloc[0].date + datetime.timedelta(days=1))[0:10]
start_date = str(gdf_poly_4326.iloc[0].date)[0:10]

# download the ERA5 data and save it in the output path.
# Here, the data is only downloaded for the first shape of the geodataframe.
# To download data for every shape of the geodataframe, the function 'download_era5' needs to be called for each individual shape.
get_era5.download_era5(sel_polygon = gdf_poly_4326.iloc[0], 
                       start_date = start_date, 
                       start_hour = 0, 
                       end_date = end_date, 
                       len_fire_sequence = 12, 
                       out_directory = out_path,
                       comp_names = ['u_component_of_wind_10m','v_component_of_wind_10m','dewpoint_temperature_2m','temperature_2m','surface_pressure', 'total_precipitation'])
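
The comments above mention that downloading data for every shape requires one call per shape. A minimal sketch of such a loop could look as follows. The helpers `date_window` and `download_for_all_shapes` are hypothetical and not part of 'era5_from_ee'. The sketch assumes that every row has a `date` attribute holding a datetime object and that a failed download raises an exception. Which exceptions 'download_era5' actually raises is not known here, so all of them are caught.

```python
from datetime import timedelta


def date_window(start, days=1):
    """Return (start_date, end_date) as 'YYYY-MM-DD' strings, `days` apart."""
    end = start + timedelta(days=days)
    return start.strftime("%Y-%m-%d"), end.strftime("%Y-%m-%d")


def download_for_all_shapes(rows, download, out_directory, comp_names,
                            start_hour=0, len_fire_sequence=12):
    """Call `download` once per shape and return the indices that failed.

    `rows` is an iterable of (index, row) pairs, e.g. from `iterrows()`.
    `download` is a callable that accepts the keyword arguments used below.
    """
    failed = []
    for idx, row in rows:
        start_date, end_date = date_window(row.date)
        try:
            download(sel_polygon=row,
                     start_date=start_date,
                     start_hour=start_hour,
                     end_date=end_date,
                     len_fire_sequence=len_fire_sequence,
                     out_directory=out_directory,
                     comp_names=comp_names)
        except Exception as err:  # assumption: failures surface as exceptions
            print(f"Download failed for shape {idx}: {err}")
            failed.append(idx)
    return failed
```

In this notebook, the function could be called with `gdf_poly_4326.iterrows()` as `rows`, `get_era5.download_era5` as `download`, `out_path` as `out_directory`, and the same list of variable names that is passed to `comp_names` in the cell above. The returned list shows which shapes need to be retried.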