# Background

## Problem description

Advances in environmental data collection and standardization have produced datasets of long environmental time series that sample many biotopes worldwide, such as [FluxNet2015](https://fluxnet.org/data/fluxnet2015-dataset/).
This dataset includes variables measured locally at half-hourly resolution, such as temperature, precipitation, air pressure, and several variables related to gas fluxes.
Other datasets, such as [MODIS](https://modis.gsfc.nasa.gov/data/), provide variables collected through remote sensing. Examples are the Leaf Area Index (LAI) and the fraction of Absorbed Photosynthetically Active Radiation (fAPAR).

One variable of particular interest provided by FluxNet2015 is Gross Primary Production (GPP), the gross carbon flux entering the ecosystem via photosynthesis.
In other words, it estimates the rate at which vegetation captures CO2 from the atmosphere (a capture that is offset by carbon re-emitted through other processes such as respiration). It is therefore an important metric in the context of global warming and the goal of reaching net zero by 2050.

The questions I propose to examine are:
- How well can GPP be modeled from local and/or remotely sensed data using machine learning techniques?
- How well do models generalize over time and across vegetation types?
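
The second question amounts to choosing how data are held out for testing. The following is a minimal sketch, not the method used in this project: it assumes each observation is a dictionary with hypothetical `igbp` and `year` fields, and shows one chronological split and one leave-one-vegetation-type-out split in plain Python.

```python
from collections import defaultdict


def split_by_year(records, first_test_year):
    """Temporal split: train on years before first_test_year, test on the rest."""
    train = [r for r in records if r["year"] < first_test_year]
    test = [r for r in records if r["year"] >= first_test_year]
    return train, test


def leave_one_type_out(records):
    """Yield (held_out_type, train, test), holding out one vegetation type at a time."""
    by_type = defaultdict(list)
    for r in records:
        by_type[r["igbp"]].append(r)
    for held_out in sorted(by_type):
        train = [r for r in records if r["igbp"] != held_out]
        yield held_out, train, by_type[held_out]


# Hypothetical toy records; site names, field names and years are made up.
records = [
    {"site": "A", "igbp": "GRA", "year": 2010},
    {"site": "A", "igbp": "GRA", "year": 2013},
    {"site": "B", "igbp": "CRO", "year": 2011},
    {"site": "C", "igbp": "ENF", "year": 2012},
    {"site": "C", "igbp": "ENF", "year": 2014},
]

train, test = split_by_year(records, first_test_year=2012)
folds = list(leave_one_type_out(records))
```

A model trained on `train` and scored on `test` never sees the held-out years or vegetation type, so the score reflects generalization rather than interpolation.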

## Definitions of common variables

Many parameters influence or characterize plant ecosystems.
This section describes those found in the dataset.

### Meteorological and chemical

| Variable  | Description |
|-----------|---------------------------------------------------------------------------|
| P   | Precipitation             |
| TA  | Air temperature           |
| WS  | Wind speed                |
| PA  | Air pressure              |
| VPD | Vapor pressure deficit    |
| CO2 | CO2 concentration         |

### Fluxes

#### CO2 fluxes

| Variable  | Description |
|-----------|---------------------------------------------------------------------------|
| GPP       | Gross primary production (incoming flux of CO2 in the ecosystem)          |
| RECO      | Ecosystem respiration (outgoing flux of CO2 from the ecosystem)           |
| NEE       | Net ecosystem exchange (net CO2 flux)                                     |
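
The three CO2 fluxes are linked by a simple balance. The sketch below assumes the sign convention in which a positive NEE is a net release of CO2 to the atmosphere, so that NEE = RECO - GPP; the function names are illustrative, and the convention should be checked against the dataset documentation before use.

```python
def net_ecosystem_exchange(gpp, reco):
    """NEE under the assumed convention NEE = RECO - GPP (positive = net release)."""
    return reco - gpp


def gpp_from_nee(nee, reco):
    """Invert the same balance to recover GPP from NEE and RECO."""
    return reco - nee


# Illustrative values only: an ecosystem taking up more CO2 than it respires.
nee = net_ecosystem_exchange(gpp=10.0, reco=4.0)  # -6.0, i.e. a net sink
```

Under this convention a negative NEE means the ecosystem is a net carbon sink over the period considered.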

#### Radiative fluxes

| Variable          | Description |
|-----------|---------------------------------------------------------------------------|
| SW_IN_F_MDS       | Incoming shortwave radiation                                       |
| LW_IN_F_MDS       | Incoming longwave radiation                                        |
| NETRAD            | Net radiation                                                      |
| USTAR             | Friction velocity (u*-threshold method)                            |
| SW_OUT            | Outgoing shortwave radiation                                       |

#### Heat fluxes

| Variable          | Description |
|-----------|---------------------------------------------------------------------------|
| LE_F_MDS          | Latent heat flux                                                    |
| H_F_MDS           | Sensible heat flux                                                  |

### Remotely sensed variables

#### Leaf Area Index (LAI)

The Leaf Area Index ([LAI](https://en.wikipedia.org/wiki/Leaf_area_index)) expresses the leaf area per unit ground surface area of a canopy and is commonly used as an indicator of the growth rate of plants.
LAI is defined as the one-sided green leaf area per unit ground surface area in broadleaf canopies. In conifers, it is defined as half of the total needle surface area per unit ground surface area.
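
The two definitions can be written as one small function. This is a sketch for illustration only: the function and parameter names are hypothetical, and remote sensing products estimate LAI indirectly rather than from measured leaf areas.

```python
def leaf_area_index(leaf_area_m2, ground_area_m2, conifer=False):
    """Return LAI (dimensionless, m2 of leaf per m2 of ground).

    Broadleaf canopies: leaf_area_m2 is the one-sided green leaf area.
    Conifers: leaf_area_m2 is the total needle surface area, of which half is used.
    """
    if ground_area_m2 <= 0:
        raise ValueError("ground_area_m2 must be positive")
    effective_area = 0.5 * leaf_area_m2 if conifer else leaf_area_m2
    return effective_area / ground_area_m2


broadleaf_lai = leaf_area_index(30.0, 10.0)               # 3.0
conifer_lai = leaf_area_index(30.0, 10.0, conifer=True)   # 1.5
```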

#### Fraction of absorbed photosynthetically active radiation (fAPAR)

The Fraction of Absorbed Photosynthetically Active Radiation ([fAPAR](https://en.wikipedia.org/wiki/Fraction_of_absorbed_photosynthetically_active_radiation)) is the fraction of the incoming solar radiation in the photosynthetically active radiation spectral region that is absorbed by a photosynthetic organism, typically describing the light absorption across an integrated plant canopy.
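
As a ratio, fAPAR relates absorbed to incoming photosynthetically active radiation (PAR). The sketch below illustrates that relation with hypothetical function names; both radiation values are assumed to be in the same units, and the numbers are made up.

```python
def fapar(absorbed_par, incoming_par):
    """Fraction of incoming PAR that is absorbed by the canopy (between 0 and 1)."""
    if incoming_par <= 0:
        raise ValueError("incoming_par must be positive")
    return absorbed_par / incoming_par


def apar(fapar_value, incoming_par):
    """Absorbed PAR recovered from fAPAR and incoming PAR."""
    return fapar_value * incoming_par


fraction = fapar(absorbed_par=300.0, incoming_par=400.0)  # 0.75
absorbed = apar(fraction, incoming_par=400.0)             # 300.0
```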

### IGBP classification

The IGBP classification, established by the International Geosphere Biosphere Programme (IGBP), defines 17 basic types of land cover: 11 natural vegetation classes, 3 developed and mosaicked land classes, and 3 non-vegetated land classes.
Examples are grassland, cropland, and evergreen needleleaf forest.
The complete list is available [here](https://fluxnet.org/data/badm-data-templates/igbp-classification/).
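
For analysis by vegetation type, the 17 classes can be stored as a lookup table. The sketch below uses the three-letter codes as I understand them from the linked list; the group keys and the helper `igbp_group` are hypothetical names, and the codes should be checked against that list before use.

```python
# IGBP land cover classes grouped as 11 + 3 + 3 (codes assumed from the linked list).
IGBP_CLASSES = {
    "natural_vegetation": {
        "ENF": "Evergreen needleleaf forests",
        "EBF": "Evergreen broadleaf forests",
        "DNF": "Deciduous needleleaf forests",
        "DBF": "Deciduous broadleaf forests",
        "MF": "Mixed forests",
        "CSH": "Closed shrublands",
        "OSH": "Open shrublands",
        "WSA": "Woody savannas",
        "SAV": "Savannas",
        "GRA": "Grasslands",
        "WET": "Permanent wetlands",
    },
    "developed_and_mosaicked": {
        "CRO": "Croplands",
        "URB": "Urban and built-up lands",
        "CVM": "Cropland/natural vegetation mosaics",
    },
    "non_vegetated": {
        "SNO": "Snow and ice",
        "BSV": "Barren sparse vegetation",
        "WAT": "Water bodies",
    },
}


def igbp_group(code):
    """Return the group name for an IGBP code, or raise KeyError if unknown."""
    for group, classes in IGBP_CLASSES.items():
        if code in classes:
            return group
    raise KeyError(code)
```

For example, `igbp_group("GRA")` returns `"natural_vegetation"`, which makes it easy to filter sites by group when testing generalization across vegetation types.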