# Downloading ERA5 data from a Python Jupyter Notebook

To download ERA5 reanalysis data from the Copernicus Climate Data Store (CDS), follow the setup steps at https://github.com/joaohenry23/Download_ERA5_with_python?tab=readme-ov-file. Those setup steps are not repeated in this file.

This file contains the code to download ERA5 reanalysis data from the Copernicus portal.

## 1. Download ERA5 data

In the code cell below, ERA5 data is downloaded using the c.retrieve() function from the cdsapi library. Several parameters are defined and can be adjusted to one's specific needs. In this example, data is downloaded for the area $[45, -10, 35, 5]$ (North, West, South, East), which corresponds to the Iberian Peninsula. The grid resolution is set to $1^\circ \times 1^\circ$. Relative humidity, geopotential (the ERA5 variable from which geopotential height is derived), and temperature are retrieved at pressure levels of 850 hPa, 700 hPa, 500 hPa, and 300 hPa. The period runs from 1960 to 2023, for all days and all months, at 12:00 UTC. Each combination of variable and pressure level is requested separately and saved to its own file. Please bear in mind that the code cell might take a long time to run (it took me more than 10 hours with my laptop).
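
To get a feel for how much data each file will hold, the area, grid, and period described above can be turned into a rough count of grid points and daily time steps. This is only a sketch: it assumes that both edges of the area are included in the returned grid and that every requested day is present in the file.

```python
from datetime import date

# Values copied from the request used in this notebook
north, west, south, east = 45, -10, 35, 5
grid_step = 1.0

# Assumption: both edges of the area are included in the returned grid
n_lat = int((north - south) / grid_step) + 1
n_lon = int((east - west) / grid_step) + 1

# One time step per day (12:00) from 1960-01-01 to 2023-12-31
n_days = (date(2023, 12, 31) - date(1960, 1, 1)).days + 1

# Rows in the data frame obtained from a single variable/level file
rows_per_file = n_lat * n_lon * n_days
print(n_lat, n_lon, n_days, rows_per_file)
```

Under these assumptions the count is 11 latitudes, 16 longitudes, and 23376 days, so about 4.1 million rows per file once the data is flattened into a table.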

In [None]:
import os
import cdsapi

c = cdsapi.Client()

# Define variables and pressure levels to be retrieved
# (CDS variable names use underscores instead of spaces)
variables = ['relative_humidity', 'geopotential', 'temperature']  # Select variables
pressure_levels = ['850', '700', '500', '300']  # Select pressure levels (hPa)

# Define the output directory (used as a folder, despite the .nc suffix)
output_dir = 'C:/Users/elsac/Documents/ERA5/Data/Raw.nc'

# Create the folder if it does not exist
os.makedirs(output_dir, exist_ok=True)
    
for i in variables:
    for j in pressure_levels:
        output_file_path = os.path.join(output_dir, f'{i}_{j}.nc')
        c.retrieve(
            'reanalysis-era5-pressure-levels',
            {
                'product_type': 'reanalysis',
                'variable': [i],
                'pressure_level': [j],
                'year': [str(year) for year in range(1960, 2024)],     # '1960' to '2023'
                'month': [f'{month:02d}' for month in range(1, 13)],   # '01' to '12'
                'day': [f'{day:02d}' for day in range(1, 32)],         # '01' to '31'
                'time': '12:00',
                'format': 'netcdf',                 # Supported format: grib and netcdf. Default: grib
                'area'          : [45, -10, 35, 5], # North, West, South, East.          Default: global
                'grid'          : [1.0, 1.0],       # Latitude/longitude grid.           Default: 0.25 x 0.25
            },
            output_file_path) 
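
Because the full download can take many hours, it may help to skip files that are already on disk when the cell has to be run again. The sketch below is one possible way to do that. pending_requests is a hypothetical helper written for this notebook, not part of cdsapi, and it only detects missing or empty files, so a download that was cut off halfway would not be caught.

```python
import os

def pending_requests(variables, pressure_levels, output_dir):
    """Return (variable, level, path) tuples whose output file is missing or empty."""
    pending = []
    for variable in variables:
        for level in pressure_levels:
            # Same naming scheme as the download loop: <variable>_<level>.nc
            path = os.path.join(output_dir, f'{variable}_{level}.nc')
            if not os.path.exists(path) or os.path.getsize(path) == 0:
                pending.append((variable, level, path))
    return pending
```

The download loop could then iterate over pending_requests(variables, pressure_levels, output_dir) and call c.retrieve() only for the combinations it returns.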

## 2. Create a function to convert .nc to .csv

In the code cell below, a function is created that converts a .nc file to a data frame and saves it as a .csv file. This makes the data easier to handle.

In [1]:
# Function to convert an nc file to a DataFrame and save it as a CSV
def nc_to_csv(file_path, output_dir):

    # Load libraries (xarray also needs a netCDF backend such as netCDF4 installed)
    import os
    import pandas as pd
    import xarray as xr
    
    # Load nc file
    ds = xr.open_dataset(file_path)
    
    # Convert to data frame
    df = ds.to_dataframe()
    df = df.reset_index()
    
    # Convert time to date and time columns
    date = df['time'].dt.strftime('%m/%d/%Y')
    time =  df['time'].dt.strftime('%H:%M:%S')
    
    df['Date'] = date
    df['Time'] = time
    
    # Ensure the output directory exists
    os.makedirs(output_dir, exist_ok=True)

    # Create the full file path for the output CSV file
    file_name = os.path.splitext(os.path.basename(file_path))[0] + '.csv'
    output_file_path = os.path.join(output_dir, file_name)

    # Save DataFrame to CSV
    df.to_csv(output_file_path, index=True, header=True)
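
Since the download step writes one file per variable and pressure level, the conversion will usually be applied to a whole folder. The sketch below shows one way to do that. convert_all is a hypothetical helper, and the converter is passed in as an argument so that the loop does not depend on any particular library.

```python
import glob
import os

def convert_all(input_dir, output_dir, converter):
    """Call converter(file_path, output_dir) for every .nc file in input_dir."""
    converted = []
    # Sort the matches so the files are processed in a predictable order
    for file_path in sorted(glob.glob(os.path.join(input_dir, '*.nc'))):
        converter(file_path, output_dir)
        converted.append(file_path)
    return converted
```

With the function defined above, a call such as convert_all(raw_dir, csv_dir, nc_to_csv) would convert every downloaded file, where raw_dir and csv_dir stand for the folder with the .nc files and the folder for the .csv files.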