# Sentinel 5 Data Acquisition
Ref: [Spatio-temporal data on the air pollutant nitrogen dioxide derived from Sentinel satellite for France](https://www.sciencedirect.com/science/article/pii/S2352340919314453)


## Non-Technical Summary
We can retrieve the following data from Sentinel 5

| Product type | Parameter                                                         |
| ------------ | ----------------------------------------------------------------- |
| L2__O3____   | Ozone (O3) total column                                           |
| L2__NO2___   | Nitrogen Dioxide (NO2), tropospheric, stratospheric, slant column |
| L2__SO2___   | Sulfur Dioxide (SO2) total column                                 |
| L2__CO____   | Carbon Monoxide (CO) total column                                 |
| L2__CH4___   | Methane (CH4) total column                                        |
| L2__HCHO__   | Formaldehyde (HCHO) tropospheric, slant column                    |
| L2__AER_AI   | UV Aerosol Index                                                  |
| L2__CLOUD_   | Cloud fraction, albedo, top pressure                              |

Spatial Resolution 0.01 degrees ~1km

Data Availability:  2019,2020 (and probably 2021) **daily**. (Weekly-Monthly analysis is easy, disk space might be an issue)

## Technical Summary
Use repo: [s5p tools](https://github.com/bilelomrani1/s5p-tools) . It retrieves L2 data from s5p and converts to L3 using harp.

### Step 1 - Setup
We recommend setting up a virtual environment with [conda](https://docs.conda.io/projects/conda/en/latest/) for easy management of external dependencies. 

1. Clone the [s5p-tools](https://github.com/bilelomrani1/s5p-tools) repository locally.
    ```bash
    git clone https://github.com/bilelomrani1/s5p-tools.git
    ```


2. Create a virtual environment named `s5p` with Python 3.7 and `s5p-tools` dependencies.

    ```bash
    conda create -n s5p python=3.7
    conda activate s5p
    conda install -y -c conda-forge jupyterlab cartopy dask harp seaborn
    conda install -c conda-forge --file s5p-tools/requirements.txt
    pip install sentinelsat
    ```

### Step 2 - Download s5p data
We use the default script provided by `s5p-tools` to download and process data from Sentinel-5P. Here, we query the tropospheric column of NO2 over Greece (create .geojson from [geojson.io](geojson.io) for the area of interest) between 01/06/2020 and 08/06/2020. 


```bash
python s5p-tools/s5p-request.py L2__NO2___ --aoi geojson/greece.geojson --date 20200601 20200608
```

This way we download the data we are interested in from sentinelsat API for the area we are interested in. Level 2 products are downloaded and then processed into L3 product. 

### Step 3 - Analysis/Dataset creation

In [2]:
import xarray as xr
import numpy as np
import pandas as pd
from itertools import product, cycle

# Ignore Runtime warnings
import warnings
warnings.filterwarnings('ignore', category=RuntimeWarning)

# Propagate attributes during computation
xr.set_options(keep_attrs=True);

#### Importing the dataset

In [6]:
DS = xr.open_dataset('processed/processed__AER_AI/AER_AI1-8-2020__1-8-2020.nc')
DS

This array contains information about AER_AI as obtained by Sentinel 5 for date  1-8-2020. We can iterate through those data in the following way

In [8]:
VARIABLE = 'absorbing_aerosol_index'
DS[VARIABLE]

#### Prefecture Granularity Analysis
Assuming that we have  the data for the whole area of interest (i.e. whole Greece). We can use data from [geodata.gr](https://geodata.gov.gr/dataset/oria-nomon-okkhe) for prefecture-granularity analysis.

In [9]:
import cartopy.io.shapereader as shpreader
import shapely.vectorized

reader = shpreader.Reader('shp/prefecture.shp')
records = list(reader.records())

pd.DataFrame([entry.attributes for entry in records])

Unnamed: 0,PARENT,ESYE_ID,NAME_GR,NAME_ENG,pop,EDRA,shape_leng,shape_area
0,61000000,61000000,?. ???????,N. PIERIAS,126412.0,????????,273982.7,1521683000.0
1,97000000,97000000,?. ??????? ???????,N. DYTIKIS ATTIKIS,149794.0,????????,226024.53,1004945000.0
2,33000000,33000000,?. ?????????,N. IOANNINON,161027.0,????????,459160.75,4999037000.0
3,59000000,59000000,?. ??????,N. PELLAS,143957.0,??????,323924.97,2505800000.0
4,5000000,5000000,?. ??????????,N. EVRYTANIAS,19518.0,?????????,267851.3,1870664000.0
5,7000000,7000000,?. ???????,N. FOKIDAS,37866.0,???????,357885.44,2126379000.0
6,1000000,1000000,?. ????????????????,N. ETOLOAKARNANIAS,219092.0,?????????,905135.5,5425151000.0
7,42000000,42000000,?. ???????,N. LARISAS,282156.0,??????,479120.03,5385277000.0
8,16000000,16000000,?. ????????,N. LAKONIAS,92811.0,??????,622035.7,3636384000.0
9,91000000,91000000,?. ?????????,N. IRAKLIOU,291225.0,????????,349691.72,2634600000.0


We can now iterate through the data Array and group the measurements by Prefecture. (Relatively easy using shapely)

Example: 
```python
mask = shapely.vectorized.contains(region.geometry, x, y).reshape((DS.latitude.shape[0], DS.longitude.shape[0]))
```

#### Notes (TLDR) 
* Available metrics (from Sentinel 5, more updates on ERA5 soon):
```
O3, NO2, SO2, CO, CH4, HCHO, AER_AI, CLOUD
```
* Daily Metrics available (AFAIK)

* Image size on avg 100-400MB per metric. For 1-year weekly analysis (for 8 metrics) 8*52 = 416 Images ~100-200GB of Images. (Processing time per image is trivial, DISK space and download speed is current bottleneck)
