# Sentinel-5P <img align="right" src="../../resources/easi_logo.jpg">

#### Index
- [Overview](#Overview)
- [Setup (dask, imports, query)](#Setup)
- [Product definition (measurements, flags)](#Product-definition)
- [Quality layer (mask)](#Quality-layer)
- [Scaling and nodata](#Scaling-and-nodata)
- [Visualisation](#Visualisation)
- [Appendix](#Appendix)

## Overview

https://sentinel.esa.int/web/sentinel/missions/sentinel-5p

The Copernicus Sentinel-5 Precursor mission is the first Copernicus mission dedicated to monitoring our atmosphere. The main objective of the Sentinel-5P mission is to perform atmospheric measurements with high spatio-temporal resolution, to be used for air quality, ozone & UV radiation, and climate monitoring & forecasting.

The satellite was launched on 13 October 2017. Its local time of ascending node crossing is 13:30, which was chosen to facilitate a loose formation with NASA's Suomi-NPP spacecraft.

#### Data source and documentation

Products of interest: CH4, CO and NO2, required at 7 km x 7 km pixel resolution with Australia coverage.

- https://sentinel.esa.int/web/sentinel/missions/sentinel-5p/data-products
- https://sentinels.copernicus.eu/web/sentinel/technical-guides/sentinel-5p/products-algorithms

| Name | Product | Information |
|------|---------|-------------|
| Methane (CH4) total column (HARP) | `cophub_s5p_ch4_harp` | - https://sentinels.copernicus.eu/documents/247904/2474726/Sentinel-5P-Level-2-Product-User-Manual-Methane<br>- https://stcorp.github.io/harp/doc/html/ingestions/S5P_L2_CH4.html |
| Carbon Monoxide (CO) total column (HARP) | `cophub_s5p_co_harp` | - https://sentinels.copernicus.eu/documents/247904/2474726/Sentinel-5P-Level-2-Product-User-Manual-Carbon-Monoxide<br>- https://stcorp.github.io/harp/doc/html/ingestions/S5P_L2_CO.html |
| Nitrogen Dioxide (NO2), total and tropospheric columns (HARP) | `cophub_s5p_no2_harp` | - https://sentinels.copernicus.eu/documents/247904/2474726/Sentinel-5P-Level-2-Product-User-Manual-Nitrogen-Dioxide<br>- https://stcorp.github.io/harp/doc/html/ingestions/S5P_L2_NO2.html |
| Other | | |
| - Cloud fraction, albedo, top pressure | | https://sentinel.esa.int/documents/247904/2474726/Sentinel-5P-Level-2-Product-User-Manual-NPP-Cloud-product |
| - UV Aerosol Index | | https://sentinels.copernicus.eu/documents/247904/2474726/Sentinel-5P-Level-2-Product-User-Manual-Aerosol-Index-product |
| - Aerosol Layer Height (mid-level pressure) | | https://sentinels.copernicus.eu/documents/247904/2474726/Sentinel-5P-Level-2-Product-User-Manual-Aerosol-Layer-Height |

HARP, https://stcorp.github.io/harp/doc/html/index.html
- HARP filters and converts data: remove unneeded parameters/measurements, add derived physical parameters, perform unit conversion, regrid dimensions

Level-2 products are:
- geolocated total columns of ozone, sulfur dioxide, nitrogen dioxide, carbon monoxide, formaldehyde and methane
- geolocated tropospheric columns of ozone
- geolocated vertical profiles of ozone
- geolocated cloud and aerosol information (e.g. absorbing aerosol index and aerosol layer height)

#### Reference datasets

__Google Earth Engine__

https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_CH4

The original Sentinel 5P Level 2 (L2) data is binned by time, not by latitude/longitude. To make it possible to ingest the data into Earth Engine, each Sentinel 5P L2 product is converted to L3, keeping a single grid per orbit (that is, no aggregation across products is performed).
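As a rough illustration of what moving from time-ordered L2 pixels to an L3 grid involves, the sketch below averages pixel-centre values into regular latitude/longitude cells with NumPy. It is a simplification under stated assumptions: HARP's `bin_spatial` can weight by pixel footprint rather than using centre points only, and the function name `bin_to_grid` and the sample values are hypothetical.

```python
import numpy as np


def bin_to_grid(lat, lon, values, lat_edges, lon_edges):
    """Average scattered pixel values into a regular lat/lon grid (centre-point binning)."""
    bins = [lat_edges, lon_edges]
    # Sum of values and number of pixels falling in each grid cell
    sums, _, _ = np.histogram2d(lat, lon, bins=bins, weights=values)
    counts, _, _ = np.histogram2d(lat, lon, bins=bins)
    # Cells with no pixels become NaN
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)


# Hypothetical swath pixels (centre coordinates and CH4 values)
lat = np.array([-44.5, -44.4, -43.5])
lon = np.array([110.5, 110.6, 111.5])
ch4 = np.array([1800.0, 1900.0, 1850.0])

grid = bin_to_grid(lat, lon, ch4,
                   lat_edges=np.array([-45.0, -44.0, -43.0]),
                   lon_edges=np.array([110.0, 111.0, 112.0]))
print(grid)
```

Two pixels share the first cell and are averaged; cells that no pixel falls into are left as NaN.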

Source products spanning the antimeridian are ingested as two Earth Engine assets, with suffixes _1 and _2.

The conversion to L3 is done by the harpconvert tool using the bin_spatial operation. The source data is filtered to remove pixels with QA values less than:
- 80% for AER_AI
- 75% for the tropospheric_NO2_column_number_density band of NO2
- 50% for all other datasets except for O3 and SO2
- The O3_TCL product is ingested directly (without running harpconvert).
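The filtering and gridding steps described above can be written as a HARP operations string. The helper below is a hypothetical sketch that only builds that string: the function name, the Australian bounds and the 0.01 degree step are assumptions for illustration, and the sequence of operations follows the general pattern of the `harpconvert` example published in the Earth Engine catalogue.

```python
def harp_operations(validity_variable, min_validity,
                    lat_min, lat_max, lon_min, lon_max, step):
    """Build a HARP operations string: QA filter, then spatial binning."""
    # bin_spatial takes the number of cell EDGES, i.e. number of cells + 1
    n_lat_edges = round((lat_max - lat_min) / step) + 1
    n_lon_edges = round((lon_max - lon_min) / step) + 1
    return ";".join([
        f"{validity_variable}>={min_validity}",   # drop low-quality pixels
        "derive(datetime_stop {time})",           # stop time per sample, needed for binning
        f"bin_spatial({n_lat_edges},{lat_min},{step},"
        f"{n_lon_edges},{lon_min},{step})",       # regrid to a regular lat/lon grid
    ])


# Assumed example: CH4 over Australia on a 0.01 degree grid
operations = harp_operations(
    "CH4_column_volume_mixing_ratio_dry_air_validity", 50,
    lat_min=-45, lat_max=-10, lon_min=110, lon_max=154, step=0.01,
)
print(operations)
```

The resulting string could then be passed to `harpconvert -a` or to the `operations` argument of `harp.import_product`.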

__OpenSearch / GES-DISC__

https://cmr.earthdata.nasa.gov/opensearch > Granule search

| OpenSearch Short Name | Description |
|------------|-------------|
| ```S5P_L2__AER_AI_HiR``` | S5P Aerosol Index L2 5.5km x 3.5km |
| ```S5P_L2__AER_LH_HiR``` | S5P Aerosol Layer Height L2 5.5km x 3.5km |
| ```S5P_L2__CH4____HiR``` | S5P Methane CH4 L2 5.5km x 7km |
| ```S5P_L2__CLOUD__HiR``` | S5P Cloud L2 5.5km x 3.5km |
| ```S5P_L2__CO_____HiR``` | S5P Carbon Monoxide CO Column L2 5.5km x 7km |
| ```S5P_L2__NO2____HiR``` | S5P Tropospheric NO2 L2 5.5km x 3.5km |
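A granule search for one of these short names can be expressed as a query URL. The sketch below only builds the URL and makes no request. The `granules.atom` path and the parameter names (`shortName`, `boundingBox`, `startTime`, `endTime`, `numberOfResults`) are assumptions that should be checked against the service's OpenSearch description document, and `granule_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed granule search endpoint; verify against the OpenSearch description document
CMR_GRANULES = "https://cmr.earthdata.nasa.gov/opensearch/granules.atom"


def granule_search_url(short_name, bbox, start, end, max_results=10):
    """Build a granule search URL. bbox is (west, south, east, north) in degrees."""
    params = {
        "shortName": short_name,
        "boundingBox": ",".join(str(v) for v in bbox),
        "startTime": start,
        "endTime": end,
        "numberOfResults": max_results,
    }
    return f"{CMR_GRANULES}?{urlencode(params)}"


# Methane granules over Australia for one day
url = granule_search_url(
    "S5P_L2__CH4____HiR",
    bbox=(110, -45, 154, -10),
    start="2019-01-01T00:00:00Z",
    end="2019-01-02T00:00:00Z",
)
print(url)
```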

#### EASI pipeline

| Task | Summary |
|------|---------|
| Source | Copernicus Science Hub<br>- https://s5phub.copernicus.eu/dhus/<br>- User: s5pguest \| Pass: s5pguest |
| Download | - Modified version of https://github.com/sentinelsat/sentinelsat<br>- Same data-pipeline code as for the [CopHub Sentinel-2 product](Sentinel-2.ipynb) |
| Preprocess | - Uses HARP (mostly following https://github.com/bilelomrani1/s5p-tools)<br>- Intention is to be reasonably equivalent to the GEE products |
| Format | Convert to COGs |
| Prepare | EO3 metadata taken from S5P netcdf file |
| TODO | - Decide whether (GEE/)HARP parameters and variables are fit for purpose<br>- Convert from HARP netcdf to Zarr |

## Setup

#### Dask

In [None]:
from dask.distributed import Client

# Connect to an existing Dask scheduler; replace the address with that of your own cluster
client = Client("tcp://10.0.98.79:45973")
client

#### Imports

In [None]:
# Data tools
import numpy as np
import xarray as xr
import pandas as pd
from datetime import datetime

# Datacube
import datacube
from datacube.utils import masking  # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/masking.py
from odc.algo import enum_to_bool   # https://github.com/opendatacube/odc-tools/blob/develop/libs/algo/odc/algo/_masking.py
from odc.algo import xr_reproject   # https://github.com/opendatacube/odc-tools/blob/develop/libs/algo/odc/algo/_warp.py
from datacube.utils.geometry import GeoBox, box  # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/geometry/_base.py
from datacube.utils.rio import configure_s3_access

# Holoviews, Datashader and Bokeh
import hvplot.pandas
import hvplot.xarray
import holoviews as hv
import panel as pn
import colorcet as cc
import cartopy.crs as ccrs
from datashader import reductions
from holoviews import opts
# import geoviews as gv
# from holoviews.operation.datashader import rasterize
hv.extension('bokeh', logo=False)

# Python
import sys, os, re

# Optional EASI tools
sys.path.append(os.path.expanduser('~/hub-notebooks/scripts'))
import notebook_utils

#### ODC develop credentials

This is a development ODC database while we test and demo this product.

In [None]:
CONF = """
[datacube]
db_hostname: v2-db-easihub-csiro-eks.cluster-ro-cvaedcg0qvwd.ap-southeast-2.rds.amazonaws.com
db_database: user_dev_odc
db_username: user
db_password: secretpassword
"""
from datacube.config import read_config, LocalConfig
dc = datacube.Datacube(config=LocalConfig(read_config(CONF)), env='datacube')

# dc = datacube.Datacube()  # Uncomment when ODC develop credentials are no longer required

#### Example query

In [None]:
# Australia
min_longitude, max_longitude = (110, 154)
min_latitude, max_latitude = (-45, -10)
min_date = '2019-01-01'
max_date = '2021-03-31'
product = 'cophub_s5p_ch4_harp'

native_crs = 'epsg:4326'

query = {
    'product': product,                     # Product name
    'x': (min_longitude, max_longitude),    # "x" axis bounds
    'y': (min_latitude, max_latitude),      # "y" axis bounds
    'time': (min_date, max_date),           # Any parsable date strings
    'output_crs': native_crs,               # EPSG code
    'resolution': (0.01, 0.01),             # Target resolution
    'group_by': 'solar_day',                # Scene ordering
    'dask_chunks': {'latitude': 2048, 'longitude': 2048},  # Dask chunks
}
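Before loading, it can help to estimate how large each time step will be for the chosen bounds, resolution and chunk size. The sketch below is back-of-envelope arithmetic only: it assumes 4-byte (float32) pixels, and the grid that ODC actually builds may differ by a pixel because of how it snaps to the resolution. `grid_shape` is a hypothetical helper.

```python
import math


def grid_shape(lat_bounds, lon_bounds, resolution):
    """Approximate (rows, cols) of the output grid for the given bounds and (y, x) resolution."""
    rows = round((lat_bounds[1] - lat_bounds[0]) / abs(resolution[0]))
    cols = round((lon_bounds[1] - lon_bounds[0]) / abs(resolution[1]))
    return rows, cols


# Values taken from the Australia query
rows, cols = grid_shape((-45, -10), (110, 154), (0.01, 0.01))

bytes_per_step = rows * cols * 4   # assumes float32 pixels
chunks_per_step = math.ceil(rows / 2048) * math.ceil(cols / 2048)

print(f"{rows} x {cols} pixels, about {bytes_per_step / 1e6:.1f} MB "
      f"and {chunks_per_step} dask chunks per variable per time step")
```

Multiplying by the number of time steps and variables gives a rough upper bound on what `persist()` would hold in cluster memory.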

In [None]:
# Optional. Some products require AWS S3 credentials to be supplied

# S3 credentials - required for s2_l2a
# configure_s3_access(aws_unsigned=True, requester_pays=False, client=client)
# print("Configured unsigned S3 data access")

In [None]:
# Load data
data = dc.load(**query)

notebook_utils.heading(notebook_utils.xarray_object_size(data))
display(data)

# Calculate valid (not nodata) masks for each layer
valid_mask = masking.valid_data_mask(data)
notebook_utils.heading('Valid data masks for each variable')
display(valid_mask)

## Product definition

Display the measurement definitions for the selected product.

Use `list_measurements` to show the details for a product, and `masking.describe_variable_flags` to show the flag definitions.

In [None]:
# Measurement definitions for the selected product
measurement_info = dc.list_measurements().loc[query['product']]
notebook_utils.heading(f'Measurement table for product: {query["product"]}')
notebook_utils.display_table(measurement_info)

# Separate lists of measurement names and flag names
measurement_names = measurement_info[ pd.isnull(measurement_info.flags_definition)].index
flag_names        = measurement_info[pd.notnull(measurement_info.flags_definition)].index

notebook_utils.heading('Selected Measurement and Flag names')
notebook_utils.display_table(pd.DataFrame({
    'group': ['Measurement names', 'Flag names'],
    'names': [', '.join(measurement_names), ', '.join(flag_names)]
}))

# Flag definitions
for flag in flag_names:
    notebook_utils.heading(f'Flag definition table for flag name: {flag}')
    notebook_utils.display_table(masking.describe_variable_flags(data[flag]))

## Quality layer

S5P HARP products have been pre-processed with the following quality filters:

| product | filter |
|---------|--------|
| cophub_s5p_ch4_harp | CH4_column_volume_mixing_ratio_dry_air_validity >= 50 |
| cophub_s5p_co_harp | CO_column_number_density_validity >= 50 |
| cophub_s5p_no2_harp | tropospheric_NO2_column_number_density_validity >= 75<br>tropospheric_NO2_column_number_density >= 0 |
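For reference, the effect of these filters can be reproduced on plain arrays. The sketch below is illustrative only and is not the pipeline code: the thresholds come from the table above, while `quality_filter` and the sample values are hypothetical.

```python
import numpy as np

# Minimum validity (QA) value per product, from the table above
MIN_VALIDITY = {
    "cophub_s5p_ch4_harp": 50,
    "cophub_s5p_co_harp": 50,
    "cophub_s5p_no2_harp": 75,
}


def quality_filter(product, values, validity):
    """Return values with pixels that fail the product's quality filter set to NaN."""
    keep = validity >= MIN_VALIDITY[product]
    if product == "cophub_s5p_no2_harp":
        # The NO2 product additionally drops negative column densities
        keep &= (values >= 0)
    return np.where(keep, values, np.nan)


# Hypothetical NO2 pixels and their validity values
values = np.array([1.2e-5, -3.0e-6, 4.0e-5, 2.5e-5])
validity = np.array([90, 95, 60, 75])

filtered = quality_filter("cophub_s5p_no2_harp", values, validity)
print(filtered)
```

The second pixel is rejected for being negative and the third for its validity of 60, which is below the NO2 threshold of 75.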

## Scaling and nodata

S5P HARP products have been preprocessed to floating-point values. They have presumably already been scaled to physical units, although the documentation is not explicit on this point.

The nodata value was set to -9999 for each band. It is used by `valid_mask = masking.valid_data_mask(data)`, computed just after `dc.load()` above, to flag which pixels hold valid data.
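The idea behind a valid-data mask can be shown on a small array. This is a simplified NumPy illustration of the concept, not datacube's implementation, and the sample values are hypothetical.

```python
import numpy as np

NODATA = -9999  # nodata value used for each band

# Hypothetical 2 x 2 band with two nodata pixels
band = np.array([[1850.0, NODATA],
                 [NODATA, 1875.5]])

valid = band != NODATA                   # True where the pixel holds real data
masked = np.where(valid, band, np.nan)   # replace nodata with NaN
mean_valid = np.nanmean(masked)          # statistics now ignore nodata pixels

print(valid)
print(mean_valid)
```

Without the mask, the -9999 values would be included in any mean, histogram or colour scale computed from the band.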

In [None]:
# Select a layer, apply the valid-data mask, then persist in dask

layer_name = 'ch4_column_volume_mixing_ratio_dry_air'  # cophub_s5p_ch4_harp
# layer_name = 'co_column_number_density'                # cophub_s5p_co_harp
# layer_name = 'tropospheric_no2_column_number_density'    # cophub_s5p_no2_harp

# Apply the valid (not nodata) mask; the quality filter was already applied in preprocessing
layer = data[layer_name].where(valid_mask[layer_name])
layer = layer.persist()

## Visualisation

In [None]:
# Generate a plot

options = {
    'title': f'{query["product"]}: {layer_name}',
    'width': 700,
    'height': 450,
    'aspect': 'equal',
    'cmap': cc.rainbow,
    'clim': (1800, 2000),     # CH4 .. 1750 = pre-industrial
    # 'clim': (0, 0.05),      # CO
    # 'clim': (0, 1e-4),      # NO2
    'colorbar': True,
    'tools': ['hover'],
}

# Set the Dataset CRS
plot_crs = native_crs
if plot_crs == 'epsg:4326':
    plot_crs = ccrs.PlateCarree()


# Native data and coastline overlay:
# - Comment out `crs`, `projection`, `coastline` to plot in native_crs coords
# TODO: Update the axis labels to 'longitude', 'latitude' if `coastline` is used
    
layer_plot = layer.hvplot.image(
    x = 'longitude', y = 'latitude',         # Dataset x,y dimension names
    rasterize = True,                        # Use Datashader
    aggregator = reductions.mean(),          # Datashader selects mean value
    precompute = True,                       # Datashader precomputes what it can
    crs = plot_crs,                          # Dataset crs
    projection = ccrs.PlateCarree(),         # Output projection (use ccrs.PlateCarree() when coastline=True)
    coastline='10m',                         # Coastline = '10m'/'50m'/'110m'
).options(opts.Image(**options)).hist(bin_range = options['clim'])

# display(layer_plot)
# Optional: Change the default time slider to a dropdown list, https://stackoverflow.com/a/54912917
fig = pn.panel(layer_plot, widgets={'time': pn.widgets.Select})  # widget_location='top_left'
display(fig)

## Appendix