# Prediction of crop yields and futures prices in Europe using climate data

Team MiECa:
- Mikhail Sarafanov
- Egor Turukhanov
- Juan Camilo Diosa E.

## Processing of climate data

Data used: [E-OBS daily gridded meteorological data for Europe from 1950 to present](https://cds.climate.copernicus.eu/cdsapp#!/dataset/insitu-gridded-observations-europe?tab=overview)

The hydrometeorological data is preprocessed in two stages:
* Transformation of the source data of the reanalysis grid and creation of features
* Combining the hydrometeorological information with yield data and adding further predictors

### Transformation of the source data of the reanalysis grid. Creating features

The source archives, downloaded from the Copernicus Climate Data Store linked above, contain netCDF files. The files hold daily fields for the following parameters:
- Mean daily air temperature, ℃ 
- Minimum daily air temperature, ℃
- Maximum daily air temperature, ℃
- Pressure, hPa
- Precipitation, mm

From these source parameters, the following indicators were calculated for the first half of each year:
- 1) Precip_amount - total precipitation for the first half of the year, mm
- 2) Precip_days - the number of days with precipitation in the first half of the year, days
- 3) Pressure_mean - mean pressure for the first half of the year, hPa
- 4) Temperature_max - the highest daily maximum air temperature in the first half of the year, ℃
- 5) Temperature_min - the lowest daily minimum air temperature in the first half of the year, ℃
- 6) Temperature_SAT - the sum of active temperatures above 10 ℃, ℃
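As a rough illustration of how indicators of this kind can be derived, the sketch below collapses daily fields into one value per grid cell with NumPy. The function name `half_year_indicators`, the array layout `(days, lat, lon)` and the wet-day threshold of 0 mm are assumptions made for this example and are not taken from the project code.

```python
import numpy as np


def half_year_indicators(precip, pressure, t_max, t_min, wet_day_threshold=0.0):
    """Collapse daily fields of shape (days, lat, lon) into per-cell indicators.

    The names and the wet-day threshold are illustrative assumptions.
    NaN-aware reductions are used so that missing days are ignored.
    """
    return {
        # Total precipitation over the period, mm
        "Precip_amount": np.nansum(precip, axis=0),
        # Number of days with precipitation above the threshold
        "Precip_days": np.sum(precip > wet_day_threshold, axis=0),
        # Mean pressure over the period, hPa
        "Pressure_mean": np.nanmean(pressure, axis=0),
        # Highest daily maximum and lowest daily minimum temperature
        "Temperature_max": np.nanmax(t_max, axis=0),
        "Temperature_min": np.nanmin(t_min, axis=0),
    }


# Toy input: 3 days on a 1 x 2 grid
precip = np.array([[[0.0, 2.0]], [[1.5, 0.0]], [[0.5, 0.0]]])
pressure = np.array([[[1000.0, 1010.0]], [[1010.0, 1020.0]], [[1020.0, 1030.0]]])
t_max = np.array([[[5.0, 12.0]], [[15.0, 8.0]], [[10.0, 20.0]]])
t_min = np.array([[[-3.0, 1.0]], [[2.0, -4.0]], [[0.0, 3.0]]])

result = half_year_indicators(precip, pressure, t_max, t_min)
for name, field in result.items():
    print(name, field)
```

In a real run the input arrays would be the daily slices for January to June of one year, and the function would be applied once per year.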

The algorithm for calculating the sum of active temperatures above 10 ℃ is illustrated below:
![Data_preparation.png](https://raw.githubusercontent.com/Dreamlone/ITMO_Masters_degree/master/Images/img_1.png)
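A minimal sketch of the sum of active temperatures, assuming the common definition: for each cell, add up the daily mean temperatures of all days on which the mean exceeds the threshold. The function name `sum_of_active_temperatures` and the array layout `(days, lat, lon)` are hypothetical, and the project's own implementation may differ in detail (for example in how missing values are treated).

```python
import numpy as np


def sum_of_active_temperatures(t_mean, threshold=10.0):
    """Sum daily mean temperatures over days warmer than `threshold`.

    t_mean has shape (days, lat, lon), in degrees Celsius.
    Days at or below the threshold, and missing (NaN) days, contribute 0.
    """
    active = np.where(t_mean > threshold, t_mean, 0.0)
    return active.sum(axis=0)


# Toy input: 4 days on a 1 x 2 grid
t_mean = np.array([
    [[8.0, 12.0]],
    [[11.0, 9.5]],
    [[14.0, 15.0]],
    [[10.0, 10.5]],
])

sat = sum_of_active_temperatures(t_mean)
print(sat)
```

For the first cell only the days with 11 ℃ and 14 ℃ count, because 10 ℃ does not exceed the threshold; for the second cell the days with 12 ℃, 15 ℃ and 10.5 ℃ count.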

These transformations produce a field of values for each indicator and each year. The example below shows the sum of active temperatures above 10 ℃ for the first half of 1950.

![SAT_10.png](https://raw.githubusercontent.com/Dreamlone/ITMO_Masters_degree/master/Images/img_3.png)

The steps above are implemented by the following algorithm:

#### TransformNetCDF

To speed up processing, the calculations are parallelized with the Dask library.

```python
# transform_grid and save_netCDF are helpers from this project's
# TransformNetCDF code (their import is not shown in the source).

# Number of days with precipitation in the first half of the year
transformed_matrix, timesteps = transform_grid(path='../Reanalysis_grid_Europe',
                                               matrix_name="rr_ens_mean_0.1deg_reg_v20.0e.nc",
                                               calculate_days=True)
save_netCDF(transformed_matrix, timesteps, '../Reanalysis_grid_Europe/Processed_grid/Precip_days.nc',
            countries_tif='../Reanalysis_grid_Europe/Country_bounds.tif',
            land_tif='../Reanalysis_grid_Europe/LandMatrix.tif')

# Total precipitation in the first half of the year
transformed_matrix, timesteps = transform_grid(path='../Reanalysis_grid_Europe',
                                               matrix_name="rr_ens_mean_0.1deg_reg_v20.0e.nc")
save_netCDF(transformed_matrix, timesteps, '../Reanalysis_grid_Europe/Processed_grid/Precip_amount.nc',
            countries_tif='../Reanalysis_grid_Europe/Country_bounds.tif',
            land_tif='../Reanalysis_grid_Europe/LandMatrix.tif')

# Mean pressure field for the first half of the year
transformed_matrix, timesteps = transform_grid(path='../Reanalysis_grid_Europe',
                                               matrix_name="pp_ens_mean_0.1deg_reg_v20.0e.nc")
save_netCDF(transformed_matrix, timesteps, '../Reanalysis_grid_Europe/Processed_grid/Pressure_mean.nc',
            countries_tif='../Reanalysis_grid_Europe/Country_bounds.tif',
            land_tif='../Reanalysis_grid_Europe/LandMatrix.tif')

# Maximum air temperature observed in the first half of the year
transformed_matrix, timesteps = transform_grid(path='../Reanalysis_grid_Europe',
                                               matrix_name="tx_ens_mean_0.1deg_reg_v20.0e.nc")
save_netCDF(transformed_matrix, timesteps, '../Reanalysis_grid_Europe/Processed_grid/Temperature_max.nc',
            countries_tif='../Reanalysis_grid_Europe/Country_bounds.tif',
            land_tif='../Reanalysis_grid_Europe/LandMatrix.tif')

# Sum of active temperatures above 10 degrees Celsius
transformed_matrix, timesteps = transform_grid(path='../Reanalysis_grid_Europe',
                                               matrix_name="tg_ens_mean_0.1deg_reg_v20.0e.nc")
save_netCDF(transformed_matrix, timesteps, '../Reanalysis_grid_Europe/Processed_grid/Temperature_SAT.nc',
            countries_tif='../Reanalysis_grid_Europe/Country_bounds.tif',
            land_tif='../Reanalysis_grid_Europe/LandMatrix.tif')

# Minimum air temperature observed in the first half of the year
transformed_matrix, timesteps = transform_grid(path='../Reanalysis_grid_Europe',
                                               matrix_name="tn_ens_mean_0.1deg_reg_v20.0e.nc")
save_netCDF(transformed_matrix, timesteps, '../Reanalysis_grid_Europe/Processed_grid/Temperature_min.nc',
            countries_tif='../Reanalysis_grid_Europe/Country_bounds.tif',
            land_tif='../Reanalysis_grid_Europe/LandMatrix.tif')
```

The algorithm produces netCDF files containing the parameter fields. Each file also includes a matrix of landscape types and a vector of timestamps.

### Combining hydrometeorological information with yield data

The climate parameters obtained are aggregated by country: the values are averaged over the grid cells of each country. The aggregation relies on a specially prepared country matrix with the same resolution as the reanalysis grid.

![Countries.png](https://raw.githubusercontent.com/Dreamlone/ITMO_Masters_degree/master/Images/img_2.png)
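The aggregation by country can be sketched as a zonal mean: for every country code in the country matrix, average the field values of the cells carrying that code. The function name `mean_by_country`, the use of integer country codes and the value `0` for cells outside any country are assumptions made for this example.

```python
import numpy as np


def mean_by_country(field, country_ids, nodata=0):
    """Average a 2-D field over each country label.

    field and country_ids must have the same shape.
    Cells labelled `nodata` and cells with NaN values are skipped.
    """
    means = {}
    for cid in np.unique(country_ids):
        if cid == nodata:
            continue
        cells = field[country_ids == cid]
        valid = cells[~np.isnan(cells)]
        if valid.size:
            means[int(cid)] = float(valid.mean())
    return means


# Toy input: a 2 x 3 field and a matching country matrix
field = np.array([[1.0, 3.0, np.nan],
                  [5.0, 7.0, 9.0]])
country_ids = np.array([[1, 1, 0],
                        [2, 2, 2]])

country_means = mean_by_country(field, country_ids)
print(country_means)
```

Because the country matrix shares the resolution of the reanalysis grid, a plain boolean mask is enough and no resampling is needed.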

The following algorithm is used to create the raster image of countries:

#### Rasterizer

The input is a vector layer with national borders and a matrix whose resolution and spatial reference are to be copied. The output is a raster image in GeoTIFF format.
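To illustrate the idea of rasterization, the sketch below burns polygon outlines into a grid using plain NumPy and the even-odd ray-casting rule. It is a simplified stand-in and not the project's Rasterizer: the function name `rasterize_polygon` and the toy coordinates are hypothetical, the polygons are simple vertex lists rather than a real vector layer, and nothing is written to GeoTIFF.

```python
import numpy as np


def rasterize_polygon(vertices, xs, ys, value, out=None):
    """Write `value` into every cell whose center lies inside the polygon.

    vertices: list of (x, y) tuples describing a closed polygon.
    xs, ys: 1-D arrays of cell-center coordinates of the reference grid.
    Uses the even-odd rule with a horizontal ray cast to the right.
    """
    gx, gy = np.meshgrid(xs, ys)  # shape (len(ys), len(xs))
    inside = np.zeros(gx.shape, dtype=bool)
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if y1 == y2:
            continue  # a horizontal edge cannot cross a horizontal ray
        # Does the edge span the row of this cell center?
        crosses = (y1 > gy) != (y2 > gy)
        # x coordinate where the edge meets that row
        x_at_y = x1 + (gy - y1) * (x2 - x1) / (y2 - y1)
        # Toggle the inside flag for each crossing to the right of the center
        inside ^= crosses & (gx < x_at_y)
    if out is None:
        out = np.zeros(gx.shape, dtype=np.int32)
    out[inside] = value
    return out


# Reference grid: 4 x 3 cells of size 1, described by their centers
xs = np.array([0.5, 1.5, 2.5, 3.5])
ys = np.array([0.5, 1.5, 2.5])

# Two toy "countries" with codes 1 and 2
raster = rasterize_polygon([(0, 0), (2, 0), (2, 2), (0, 2)], xs, ys, value=1)
raster = rasterize_polygon([(2, 0), (4, 0), (4, 3), (2, 3)], xs, ys, value=2, out=raster)
print(raster)
```

Taking the cell centers from the reference grid is what makes the output line up with the reanalysis data cell for cell. In practice a geospatial library would normally handle the projection, the polygon topology and the GeoTIFF output.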

Naturally, not all of a country's territory is used as agricultural land.