# Downloading ensemble weather forecasts

The atmosphere can be viewed as a chaotic system in which the future state depends sensitively on the initial conditions: a slight change in the initial conditions can lead to a significant change in the forecast. Because estimates of the current state are inaccurate and numerical models are imperfect, forecast errors and uncertainty grow with increasing forecast lead time. Ensemble forecasting aims to capture this uncertainty by generating an ensemble of several possible scenarios, each with the same probability of occurrence. ([Learn more about ensemble prediction](https://www.youtube.com/watch?v=NLhRUun2iso))
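
To make the idea of sensitivity to initial conditions concrete, here is a minimal sketch using the logistic map, a standard toy chaotic system. It is not a weather model and is not part of the iRONS toolbox; the function names, the perturbation size and the number of members are illustrative assumptions.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x), a toy chaotic system."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        trajectory.append(x)
    return trajectory

# Five "ensemble members" whose initial states differ by tiny perturbations
initial_states = [0.2 + k * 1e-6 for k in range(5)]
ensemble = [logistic_trajectory(x0, steps=40) for x0 in initial_states]

def spread(step):
    """Range (max - min) of the ensemble at a given step."""
    values = [member[step] for member in ensemble]
    return max(values) - min(values)

# The spread starts tiny and grows as the members diverge
for step in (0, 10, 20, 40):
    print(step, spread(step))
```

The members start almost identical, yet their spread grows with the number of steps, which is the behaviour an ensemble forecast is designed to quantify.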

In this Notebook we will learn how to download ensemble weather forecasts and hindcasts from the [ECMWF public dataset]. The Notebook downloads 25 members (hindcasts: 1981-2016) or 50 members (forecasts: 2017-2019) of the ECMWF seasonal forecast from the server.
New forecasts are published online on the 1st day of every month.

The system used to generate the seasonal forecast ensemble is the [SEAS5](https://www.ecmwf.int/en/newsletter/154/meteorology/ecmwfs-new-long-range-forecasting-system-seas5).

C3S Seasonal Catalogue: http://apps.ecmwf.int/data-catalogues/c3s-seasonal/?class=c3

The files are in NetCDF4 format (https://apps.ecmwf.int/datasets/).

<left> <img src="../../util/images/uncertainty.1.jpg" width = "400px"></left>

## 1. Create an account on the Copernicus Climate Data Store
First of all, you need to register on the [Copernicus Climate Data Store](https://cds.climate.copernicus.eu).

Once you have created an account, copy your user ID (UID) and API key. You can find them in your user profile.

In the folder containing this Notebook you will find a file called ".cdsapirc". Copy and paste this file into your "home" folder. On Windows this is "C:/Users/{your username on Windows}/".

Open the copied file with a text editor. You should see this:

> url: https://cds.climate.copernicus.eu/api/v2

> key: UID:APIkey
    
Now edit this text, replacing UID with your own UID number and APIkey with your own API key (make sure that the two are separated by a colon).

You can also find these instructions at this [link](https://cds.climate.copernicus.eu/api-how-to).
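
As a quick sanity check, the sketch below looks for ".cdsapirc" in the home folder and verifies that it contains a `url` line and a `key` line in the `UID:APIkey` form described above. The helper names `parse_cdsapirc` and `looks_configured` are hypothetical; they are not part of `cdsapi`.

```python
from pathlib import Path

def parse_cdsapirc(text):
    """Parse the 'name: value' lines of a .cdsapirc file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ':' not in line:
            continue
        name, value = line.split(':', 1)  # split on the first colon only
        settings[name.strip()] = value.strip()
    return settings

def looks_configured(settings):
    """True if url and key exist and the key is no longer the placeholder."""
    key = settings.get('key', '')
    return bool(settings.get('url')) and ':' in key and key != 'UID:APIkey'

rc_file = Path.home() / '.cdsapirc'
if rc_file.exists():
    print('Configured:', looks_configured(parse_cdsapirc(rc_file.read_text())))
else:
    print('No .cdsapirc found in', Path.home())
```

This only checks the layout of the file. It does not verify that the UID and API key are valid.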

## 2. Install the CDS API client library
Use this command to install the library:

> pip install cdsapi

Use [this link](../A%20-%20Knowledge%20transfer/0.b.%20How%20to%20install%20libraries.ipynb) to learn how to install a library.
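
To confirm that the installation worked, we can ask Python whether the modules can be found, without importing them. This is a minimal sketch based on the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util

def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None

# Libraries used later in this Notebook
for name in ['cdsapi', 'netCDF4']:
    print(name, 'is installed' if is_installed(name) else 'is MISSING')
```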

## 3. Import libraries
First, we need to import the necessary libraries and tools. **Only if iRONS is run locally**: one required library, [netCDF4](https://pypi.org/project/netCDF4/), is not available on Anaconda by default, so you must install it first. Help on how to install such libraries is given here: [How to install libraries](../A%20-%20Knowledge%20transfer/0.b.%20How%20to%20install%20libraries.ipynb). If iRONS is run on the cloud, e.g. on [Binder](https://mybinder.org/) or [Microsoft Azure Notebooks](https://notebooks.azure.com/), the libraries do not need to be installed before importing them.
Once all the necessary libraries are installed locally, or if we are running iRONS on the cloud, we can import them with the following code:

In [None]:
import os
import sys

import numpy as np
import pandas as pd
import cdsapi
from netCDF4 import Dataset # to extract data from NetCDF files (format of the downloaded ECMWF files)

server = cdsapi.Client() # CDS API client; reads the credentials stored in .cdsapirc

In [1]:
import numba
print(numba.__version__)

0.39.0


**Tools from the iRONS toolbox**

In [None]:
sys.path.append('../../Toolbox')
from Weather_forecast.Download_forecast import data_retrieval_request
from Data_management.Read_data import read_netcdf_data

## 4. Define the data and file parameters

In [None]:
# Originating centre of the ensemble weather forecast
originating_centre = 'ECMWF'
system = '5'
# Weather variables to download
weather_variables = ['2m_temperature','evaporation','total_precipitation']
# Initial dates of the forecast
years = [2014] # np.arange(1981,2019)
months = [11] # np.arange(1,13)
days = [1]
# Forecast leadtime
leadtime = 5160 # hours (215 days, i.e. approximately 7 months)
time_step = 24 # hours
leadtime_hours = [str(x) for x in np.arange(0,leadtime+time_step,time_step)] 
# Spatial coordinates
grid_resolution = '0.2/0.05' # The first number is east-west resolution (longitude) and the second is north-south (latitude)
coordinates = '51.10/-3.5/51.05/-3.3' # This defines a rectangular area bounded by N/W/S/E (in degrees)

# Format of the file to download 
file_format = 'netcdf'
# Folder and file name ending
folder_path = 'Inputs//'+originating_centre+' forecasts '+file_format
file_name_end = '_1d_7m_'+originating_centre+'_Temp_Evap_Rain.nc'
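
To see what these parameters produce, the sketch below rebuilds the list of lead times and the name of the file for the forecast initialised on 1 November 2014. It uses the built-in `range` instead of `np.arange` so that it runs without NumPy; the values are the ones defined in the cell above.

```python
leadtime = 5160   # hours, as in the cell above
time_step = 24    # hours

# range() excludes its stop value, so one extra step is added to include 5160
leadtime_hours = [str(h) for h in range(0, leadtime + time_step, time_step)]

print(len(leadtime_hours))                    # number of daily steps, including hour 0
print(leadtime_hours[0], leadtime_hours[-1])  # first and last lead time
print(leadtime / 24, 'days')                  # forecast horizon in days

# Name of the file for the forecast initialised on 1 November 2014
year, month, day = 2014, 11, 1
file_name_end = '_1d_7m_ECMWF_Temp_Evap_Rain.nc'
file_name = str(year) + str(month).zfill(2) + str(day).zfill(2) + file_name_end
print(file_name)
```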

## 5. Download the forecast file
Here we call the `data_retrieval_request` function to send the download request (the files are stored in the Inputs folder).
**Comment**: downloading the forecast may take a long time because, as you will see, the request is queued.

In [None]:
data_retrieval_request(originating_centre,system,weather_variables,
                           years, months, days, leadtime_hours,
                           grid_resolution, coordinates,
                           file_format,folder_path,file_name_end)
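
For readers curious about what such a request might look like, here is a minimal sketch of a direct call to the CDS API. It is not the actual implementation of `data_retrieval_request`, which is not shown in this Notebook. The dataset name `seasonal-original-single-levels` and the request key names are assumptions to be checked against the CDS catalogue, and `build_request` and `download` are hypothetical helpers.

```python
def build_request(originating_centre, system, variables, year, month, day,
                  leadtime_hours, grid, area, file_format):
    """Assemble a request dictionary (the key names are assumptions)."""
    return {
        'originating_centre': originating_centre.lower(),
        'system': system,
        'variable': variables,
        'year': str(year),
        'month': str(month).zfill(2),
        'day': str(day).zfill(2),
        'leadtime_hour': leadtime_hours,
        'grid': grid,   # east-west/north-south resolution
        'area': area,   # N/W/S/E
        'format': file_format,
    }

def download(request, target, dataset='seasonal-original-single-levels'):
    """Send the request; needs a valid .cdsapirc file and network access."""
    import cdsapi  # imported here so the rest of the sketch runs without it
    cdsapi.Client().retrieve(dataset, request, target)

request = build_request('ECMWF', '5', ['2m_temperature'], 2014, 11, 1,
                        ['0', '24', '48'], '0.2/0.05', '51.10/-3.5/51.05/-3.3',
                        'netcdf')
print(request)
# download(request, 'example.nc')  # uncomment to submit the request
```

Keeping the request construction separate from the network call makes it possible to inspect the request before it is queued on the server.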

## 6. Save a copy of the forecast file in CSV format
The downloaded forecast files are in [NetCDF](https://confluence.ecmwf.int/display/CKB/What+are+NetCDF+files+and+how+can+I+read+them) (Network Common Data Form) format. This file format supports the creation, access, and sharing of array-oriented scientific data.
Here we will extract temperature, evaporation and rainfall data using their corresponding short names, 't2m' (temperature), 'e' (evaporation) and 'tp' (rainfall), and save the forecast ensemble for each of these weather variables in CSV format (the files are stored in the Inputs folder).

**Comment:** You can find a complete list of weather variables with their corresponding short names in this [link](https://apps.ecmwf.int/codes/grib/param-db/?filter=netcdf).

**Comment:** A CSV is a comma-separated values file, which allows data to be saved in a tabular format. CSV files can be opened with almost any spreadsheet program, such as Microsoft Excel, or with a text editor.
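
As an illustration of the layout we are aiming for, the sketch below writes and reads back a tiny CSV table with one "Date" column and one column per ensemble member. The values are made up, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Two forecast days and three hypothetical ensemble members (made-up values)
rows = [
    ['Date', '0', '1', '2'],
    ['01/11/2014', 9.8, 10.1, 9.5],
    ['02/11/2014', 9.2, 10.4, 9.9],
]

buffer = io.StringIO()  # in-memory stand-in for a file on disk
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())

# Reading the table back: every value comes back as text
buffer.seek(0)
table = list(csv.reader(buffer))
header, first_day = table[0], table[1]
print(header)
print(first_day)
```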

### 6.1 Extract temperature data (temperature at 2 m above the surface: 't2m')
The original data are in kelvin (K).

In [None]:
for year in years:
    for month in months:
        for day in days:
            file_name = str(year)+str(month).zfill(2)+str(day).zfill(2)+file_name_end
            dates_fore,Temp_fore = read_netcdf_data(folder_path,file_name,'t2m')
            # Spatially averaged data and converted into degC
            Temp_fore_ens = Temp_fore.mean(3).mean(2)-273.15
            Temp_fore_df = pd.DataFrame(Temp_fore_ens)
            Temp_fore_df.insert(0,'Date',dates_fore.strftime('%d/%m/%Y'))
            Temp_fore_df.to_csv('Inputs//'+originating_centre+' forecasts csv'+'//'+
                                str(year)+str(month).zfill(2)+str(day).zfill(2)+
                                '_1d_7m_'+originating_centre+'_Temp.csv',index = None)
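
The two operations applied to each member and time step, spatial averaging and conversion from kelvin to degrees Celsius, can be sketched in plain Python. The grid values are made up and the helper names are hypothetical. On a complete rectangular grid, averaging over longitude and then over latitude, as `.mean(3).mean(2)` does if the last two axes are the spatial ones, gives the same result as averaging over all cells at once.

```python
from statistics import fmean

def spatial_mean(grid):
    """Average a 2-D latitude x longitude grid given as a list of rows."""
    return fmean(value for row in grid for value in row)

def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15

# Made-up 2 x 2 grid of 2 m temperatures (K) for one member at one time step
grid_kelvin = [[283.15, 284.15],
               [285.15, 286.15]]

mean_celsius = kelvin_to_celsius(spatial_mean(grid_kelvin))
print(round(mean_celsius, 2))
```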

### 6.2 Evaporation

In [None]:
for year in years:
    for month in months:
        for day in days:
            file_name = str(year)+str(month).zfill(2)+str(day).zfill(2)+file_name_end
            dates_fore,Evap_fore = read_netcdf_data(folder_path,file_name,'e')
            # Spatially averaged data, converted from m to mm (sign flipped so that evaporation is positive)
            Evap_fore_ens = -Evap_fore.mean(3).mean(2)*1000
            Evap_fore_df = pd.DataFrame(Evap_fore_ens)
            Evap_fore_df.insert(0,'Date',dates_fore.strftime('%d/%m/%Y'))
            Evap_fore_df.to_csv('Inputs//'+originating_centre+' forecasts csv'+'//'+
                                str(year)+str(month).zfill(2)+str(day).zfill(2)+
                                '_1d_7m_'+originating_centre+'_Evap.csv',index = None)
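
The conversion in the cell above does two things at once: it multiplies by 1000 to go from metres to millimetres and it changes the sign. The sketch below isolates that step. It assumes that the downloaded evaporation values are negative (water leaving the surface), which is what the sign flip suggests; `evaporation_to_mm` is a hypothetical helper.

```python
def evaporation_to_mm(evap_m):
    """Convert evaporation from m (negative values) to mm (positive values)."""
    return -evap_m * 1000

# A made-up value of -0.002 m becomes a positive depth in mm
print(evaporation_to_mm(-0.002))
```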

### 6.3 Rainfall

In [None]:
for year in years:
    for month in months:
        for day in days:
            file_name = str(year)+str(month).zfill(2)+str(day).zfill(2)+file_name_end
            dates_fore,Rain_fore = read_netcdf_data(folder_path,file_name,'tp')
            # Spatially averaged data, converted from m to mm
            Rain_fore_ens = Rain_fore.mean(3).mean(2)*1000
            Rain_fore_df = pd.DataFrame(Rain_fore_ens)
            Rain_fore_df.insert(0,'Date',dates_fore.strftime('%d/%m/%Y'))
            Rain_fore_df.to_csv('Inputs//'+originating_centre+' forecasts csv'+'//'+
                                str(year)+str(month).zfill(2)+str(day).zfill(2)+
                                '_1d_7m_'+originating_centre+'_Rain.csv',index = None)
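
The three cells above build their output paths in the same way, so the pattern could be factored into a small helper. This is only a sketch: `csv_output_path` is a hypothetical function, not part of the iRONS toolbox, and it uses `os.path.join` rather than the hard-coded `//` separator.

```python
import os

def csv_output_path(originating_centre, year, month, day, label, root='Inputs'):
    """Build the path of a CSV file for one variable and one forecast date."""
    date_stamp = f'{year}{month:02d}{day:02d}'
    folder = os.path.join(root, f'{originating_centre} forecasts csv')
    file_name = f'{date_stamp}_1d_7m_{originating_centre}_{label}.csv'
    return os.path.join(folder, file_name)

path = csv_output_path('ECMWF', 2014, 11, 1, 'Rain')
print(path)
# os.makedirs(os.path.dirname(path), exist_ok=True)  # create the folder if it is missing
```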

#### Let's go to the next Notebook to read and bias-correct the downloaded data: [Bias correction of weather forecasts](1.b.%20Bias%20correction%20of%20weather%20forecasts.ipynb)