<span style="color:red; font-family:Helvetica Neue, Helvetica, Arial, sans-serif; font-size:2em;">An Exception was encountered at '<a href="#papermill-error-cell">In [15]</a>'.</span>

In [1]:
from edc import check_compatibility
check_compatibility("user-2022.02", dependencies=["GEODB", "SH"])



---------

The following environment variables are available:

* `GEODB_AUTH_AUD`, `GEODB_AUTH_CLIENT_ID`, `GEODB_AUTH_DOMAIN`, `GEODB_API_SERVER_URL`, `GEODB_AUTH_CLIENT_SECRET`, `GEODB_API_SERVER_PORT`
* `SH_CLIENT_ID`, `SH_INSTANCE_ID`, `SH_CLIENT_NAME`, `SH_CLIENT_SECRET`


# Data Access

This notebook explains how to access the data collected by NASA, ESA and JAXA. A large amount of data is available through the EuroDataCube in both raster and vector format, and the sections below are organised by data format. Having such a range of data in one notebook makes it easy to combine datasets into new layers or analyses. The datasets from the different organisations are served through different APIs ([NASA](https://nasa-impact.github.io/veda-documentation/), [ESA](https://docs.sentinel-hub.com/api/latest/)); keep this in mind when mixing data sources.

Both vector data and raster layers are available, and each is accessed in a different way. Below is a list of the global indicators [available](https://collections.eurodatacube.com) from the [EO Dashboard](https://eodashboard.org/explore) and the [RACE COVID Dashboard](https://race.esa.int/):


Economic
- Activity (cars/containers)
- Nightlights
- Recovery Proxy Maps
- Slowdown Proxy Maps
- Airports
- Border crossings
- Commercial activity
- Crude Oil Storage
- Goods Production
- Mobility Data
- Number of Trucks
- Oil Storage Volume
- Population
- Ports and Shipping
- Vessel Density
- Parking Activity
- World Settlement Footprint

Agriculture
- Aboveground Biomass
- Activity Indicator 
- Crop Conditions
- Global NDVI
- Harvesting Activity
- Harvesting Evolution
- Planting Activity
- Productive Area
- Productive Area Change
- Regional Cropland
- Soil Moisture
- Solar Induced Chlorophyll Fluorescence
- Agricultural Workers

Health
- Covid-19 Cases
- Covid-19 Vaccinations

Atmosphere
- Air Quality
- Greenhouse Gases

Water
- Ocean Primary Productivity
- Precipitation Anomaly
- Sea Ice Concentration
- Sea Ice Thickness
- Water Quality Regional Maps
- Water Quality Time Series

All the available raster data can be explored in the [EuroDataCube collection archive](https://collections.eurodatacube.com/), where you can easily search for data. The vector data, stored in the GeoDB, is listed below.

### How do I access the data?

The vector data is stored in the GeoDB; you can access it using the following steps:

### 1. Accessing vector data with [GeoDB](https://xcube-geodb.readthedocs.io/en/latest/)

This first cell checks that the appropriate permissions are in place.

In [2]:
from edc import check_compatibility
from xcube_geodb.core.geodb import GeoDBClient
geodb = GeoDBClient()
geodb.whoami

'geodb_418dfeac-15f0-4606-9edb-fd9eb722bf04'

Each indicator has a code that can be used to query the data. We need to define the indicator of interest and the database we are going to query. A full list of the indicators available through the GeoDB is shown below. JSON files describing each indicator are available [here](https://github.com/eurodatacube/eodash/tree/master/app/public/data/internal).

 | Name                                                    | Collection Code        | Available Locations |
  | ------------------------------------------------------- | ---------------------- | ------------ |
  | Import/production sites: status of metallic ores        |   E1                   | Port of Genoa, Gdynia, Gdansk, Gijon, Genova, Hamburg, Dunkirk, Dunkirque, Ghent |
  | Productive area                                         |  E10a1_tri             | Brandenburg        |
  | Activity Indicator                                      |  E10a2_tri             | Brandenburg        |
  | Productive area change                                  |  E10a3_tri             | Brandenburg        |
  | Number of berry trucks in 2018-2019                     |   E10a5                | Laguna de las Madres        |
  | Regional Harvesting Evolution                           |   E10a6                | Regions of Spain        |
  | Harvesting activity: cumulative harvested area          |   E10a8                | Regions of Spain        |
  | National Harvesting Evolution                           |   E10a10               | Europe        | 
  | Commercial centres: volume of activity                  |   E11                  | Warsaw, Brussels, Athens, Milan, Rome, Bucharest       |
  | Border crossing points: volume of activity              |   E12b                 | GB Border        |
  | Airports: throughput                                    | E13b, E13b_tri         | European Airports      |
  | Airports: airplane traffic                              | E13d                   | European Airports        |
  | Ports and Shipping - Major Harbours                     | E13c_tri               | Hamburg, Ghent, Gdynia, Dunkirk, Genoa, Suez |
  | Maritime Traffic                                        | E13e,f,g,h,i,l,m,n     | Gioia Tauro, Genoa        |
  | Changes in commercial fluxes                            |   E13n                 | Gijon        |
  | Ports and Shipping - Major Harbours                     |   E200                 | Hamburg, Ghent, Gdynia, Dunkirk, Genoa, Suez   |
  | Finished goods production: output inventory level       |   E8                   | Swindon, Cassino, Ghent, Mioveni, Russelsheim, Leipzig, Craiova, Nosovice, Martorell, Barcelona, Emden, Ingolstadt, Kvasiny        |
  | Activity (cars/containers)                              |   E9_tri               | Beijing, Singapore, Palm Springs, Los Angeles, Arcadia, Nagoya        |
  | Air Quality (tropomi NO2)                               |   N1, N1_tri           | European Cities & Major World Cities        |
  | CAMS Air Quality (PM 2.5)                               |   N1a                  | European Cities        |
  | CAMS Air Quality (NO2)                                  |   N1b                  | European Cities        |
  | CAMS Air Quality (PM10)                                 |   N1c                  | European Cities        |
  | CAMS Air Quality (O3)                                   |   N1d                  | European Cities        |
  | Greenhouse Gas                                          |   N2_tri               | Major World Cities        |
  | Water Quality Time Series                               |   N3, N3b_tri          | Barcelona, Marseilles, Venice Lagoon & Major World Ports       |


Please feel free to change the value of `geodb_collection` and use other indicators.

In [3]:
geodb_database = "eodash"
geodb_collection = "E13c_tri"

To get an overview of the data, we can look at the first rows of the dataset.

The data contains the values measured during each measurement period. The "indicator_value" column categorises each measurement as "low", "medium" or "high".

In [4]:
data = geodb.get_collection(collection=geodb_collection, database=geodb_database)
data.head()

Unnamed: 0,id,created_at,modified_at,geometry,aoi,country,region,city,site_name,description,...,reference_value,rule,indicator_value,sub_aoi,y_axis,indicator_name,color_code,data_provider,aoi_id,update_frequency
0,449,2021-12-14T14:05:59.974126+00:00,2022-02-24T11:16:06.935296+00:00,POINT (18.51089 54.53786),"54.537859,18.510887",PL,/,Gdynia,Port of Gdynia/Gdansk,Ports and Shipping - Major Harbours,...,5,X is the Measurement value. If X<(ref_value-30...,Low,MULTIPOLYGON(((18.56179820666602 54.5171532394...,Number of ships in Port,Changes in Ships traffic within the Port,RED,PLES,PL1,Weekly
1,450,2021-12-14T14:05:59.974126+00:00,2022-02-24T11:16:06.935296+00:00,POINT (2.28537 51.03614),"51.036138,2.285374",FR,/,Dunkirk,Port of Dunkirk,Ports and Shipping - Major Harbours,...,7,X is the Measurement value. If X<(ref_value-30...,Low,MULTIPOLYGON(((2.1499827079701284 51.034478942...,Number of ships in Port,Changes in Ships traffic within the Port,RED,PLES,FR3,Weekly
2,453,2021-12-14T14:05:59.974126+00:00,2022-02-24T11:16:06.935296+00:00,POINT (32.31492 30.93955),"30.939554,32.314923",EG,/,Suez,Suez Canal,Ports and Shipping - Major Harbours,...,28,X is the Measurement value. If X<(ref_value-30...,Low,MULTIPOLYGON(((32.2245676834261 31.36554628042...,Number of ships in Port,Changes in Ships traffic within the Port,RED,PLES,EG1,Weekly
3,463,2022-01-18T10:23:23.995343+00:00,2022-02-24T11:16:06.935296+00:00,POINT (8.88585 44.40814),"44.408142,8.885851",IT,/,Genoa,Port of Genoa and surrounding industrial areas,Ports and Shipping - Major Harbours,...,14,X is the Measurement value. If X<(ref_value-30...,Low,MULTIPOLYGON(((8.917566076705773 44.4026124064...,Number of ships in Port,Changes in Ships traffic within the Port,RED,PLES,IT3,Weekly
4,467,2022-01-26T22:11:17.719758+00:00,2022-02-24T11:16:06.935296+00:00,POINT (2.28537 51.03614),"51.036138,2.285374",FR,/,Dunkirk,Port of Dunkirk,Ports and Shipping - Major Harbours,...,7,X is the Measurement value. If X<(ref_value-30...,Low,MULTIPOLYGON(((2.1499827079701284 51.034478942...,Number of ships in Port,Changes in Ships traffic within the Port,RED,PLES,FR3,Weekly
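
As a rough illustration of how the `indicator_value` categories could be summarised per site, the sketch below tallies them with the standard library only. The rows are copied from the preview above, and the helper name `count_indicator_values` is hypothetical. With the real data, equivalent rows could presumably be obtained with `data.to_dict("records")`.

```python
from collections import Counter, defaultdict

# Rows copied from the preview above (only the columns needed here).
sample_rows = [
    {"city": "Gdynia", "indicator_value": "Low"},
    {"city": "Dunkirk", "indicator_value": "Low"},
    {"city": "Suez", "indicator_value": "Low"},
    {"city": "Genoa", "indicator_value": "Low"},
    {"city": "Dunkirk", "indicator_value": "Low"},
]


def count_indicator_values(rows):
    """Return {city: Counter of indicator values} for an iterable of row dicts."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["city"]][row["indicator_value"]] += 1
    return dict(counts)


summary = count_indicator_values(sample_rows)
for city, values in sorted(summary.items()):
    print(city, dict(values))
```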


A valid header line for a CSV uses the strings in bold and looks like this:

`AOI,Country,Region,City,Site Name,Description,Method,EO Sensor,Input Data,Indicator code,Time,Measurement Value,Reference Description,Reference time,Reference value,Rule,Indicator Value,Sub-AOI,Y axis,Indicator Name,Color code,Data Provider,AOI_ID,Update Frequency`
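
One possible way to check a CSV file against this header before uploading it is sketched below. It uses only the standard library; the function name `check_header` and the sample inputs are illustrative assumptions, not part of any EuroDataCube tooling.

```python
import csv
import io

# Expected column names, copied from the header line above.
EXPECTED_HEADER = (
    "AOI,Country,Region,City,Site Name,Description,Method,EO Sensor,Input Data,"
    "Indicator code,Time,Measurement Value,Reference Description,Reference time,"
    "Reference value,Rule,Indicator Value,Sub-AOI,Y axis,Indicator Name,Color code,"
    "Data Provider,AOI_ID,Update Frequency"
).split(",")


def check_header(csv_text):
    """Return (missing, unexpected) column names for the first line of csv_text."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    missing = [name for name in EXPECTED_HEADER if name not in header]
    unexpected = [name for name in header if name not in EXPECTED_HEADER]
    return missing, unexpected


good = ",".join(EXPECTED_HEADER) + "\n"
bad = "AOI,Country,Town\n"  # hypothetical malformed header
print(check_header(good))
print(check_header(bad)[1])
```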

### 2. Accessing Raster Data [(SentinelHub)](https://docs.sentinel-hub.com/api/latest/reference/#tag/process)

### BYOC Datasets 

These datasets are loaded into Sentinel Hub and are accessed using their specific collection ID.

First the permissions:

In [5]:
import os
from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session

# Your client credentials
client_id = os.environ['SH_CLIENT_ID']
client_secret = os.environ['SH_CLIENT_SECRET']

# Create a session
client = BackendApplicationClient(client_id=client_id)
oauth = OAuth2Session(client=client)

# Get token for the session
token = oauth.fetch_token(token_url='https://services.sentinel-hub.com/oauth/token',
                          client_id=client_id, client_secret=client_secret)

# All requests using this session will have an access token automatically added
resp = oauth.get("https://services.sentinel-hub.com/oauth/tokeninfo")
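
Access tokens expire, so a long-running notebook may need to fetch a new one. The following is a minimal sketch of a freshness check, assuming the token dictionary carries an `expires_at` Unix timestamp (oauthlib normally adds one when the server response includes `expires_in`). The helper name `token_is_fresh` and the example dictionaries are hypothetical.

```python
import time


def token_is_fresh(token, margin_seconds=60, now=None):
    """Return True if the token is still valid for more than margin_seconds.

    Assumes the token dict has an 'expires_at' Unix timestamp.
    """
    expires_at = token.get("expires_at")
    if expires_at is None:
        return False  # unknown expiry: treat as stale to be safe
    current = time.time() if now is None else now
    return expires_at - current > margin_seconds


# Hypothetical token dicts, for illustration only.
print(token_is_fresh({"access_token": "abc", "expires_at": 1000.0}, now=900.0))
print(token_is_fresh({"access_token": "abc", "expires_at": 1000.0}, now=990.0))
```

If the check fails, calling `oauth.fetch_token` again as in the cell above would obtain a new token.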


As an example, here we access the population density information for a specific country (Austria).
We use the [NUTS Tool](https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts) to help.

In [6]:
import requests
# First let us get the area information of the country from the administrative zones
# You can find all the different NUTS levels, resolution here: https://gisco-services.ec.europa.eu/distribution/v2/nuts/nuts-2021-files.html

response = requests.get(
    "https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_10M_2021_4326_LEVL_0.geojson"
)

data = response.json()

# Now let's find the geometry information for one country
match = [x for x in data["features"] if x["properties"]["CNTR_CODE"] == 'AT'][0]
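
Indexing with `[0]` raises an `IndexError` when no feature matches the country code. A slightly more defensive lookup could look like the sketch below. The helper name `find_country_feature` and the tiny stand-in FeatureCollection are assumptions for illustration; only the `CNTR_CODE` property name comes from the cell above.

```python
def find_country_feature(feature_collection, country_code):
    """Return the first feature whose CNTR_CODE matches, or None."""
    return next(
        (
            feature
            for feature in feature_collection.get("features", [])
            if feature.get("properties", {}).get("CNTR_CODE") == country_code
        ),
        None,
    )


# Tiny stand-in for the NUTS GeoJSON; geometries are placeholders.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"CNTR_CODE": "AT"}, "geometry": None},
        {"type": "Feature", "properties": {"CNTR_CODE": "FR"}, "geometry": None},
    ],
}
print(find_country_feature(sample, "AT") is not None)
print(find_country_feature(sample, "XX"))
```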

Here we see how to access specifically loaded datasets; to learn how to access the default datasets, look further down in this notebook. You have access to a number of [datasets](https://collections.eurodatacube.com/). We demonstrate the syntax using population data:
- Population density - collection id: `a9743257-ef21-4fd3-999d-15c5c7fcacbd`

In [7]:

response = requests.post('https://shservices.mundiwebservices.com/api/v1/process',
  headers={"Authorization" : "Bearer %s"%(token['access_token'])},
  json={
    "input": {
        "bounds": {
            "geometry": match["geometry"]
        },
        "data": [{
            "type": "byoc-a9743257-ef21-4fd3-999d-15c5c7fcacbd"
        }]
    },
    "output": {
        "width": 800,
        "height": 400,
    },
    "evalscript": """
    //VERSION=3
    function setup() {
      return {
        input: [{
          bands: ["populationDensity", "dataMask"], // this sets which bands to use
        }],
        output: { // this defines the output image type
          bands: 4,
          sampleType: "UINT8"
        }
      };
    }

    function evaluatePixel(sample) {
      var arr = colorBlend(
          sample.populationDensity,
          [1, 5, 25, 250, 1000, 10000],
          [[255,242,209],[255,218,166],[250,184,85],[253,141,60],[240,59,32],[189,0,38]]
      ); 
      if (sample.dataMask==1)  arr.push(255);
      else arr.push(0);
      return arr;
    }
    """
})
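
To reuse the same request shape for other BYOC collections, the JSON body could be assembled by a small helper. This is only a sketch mirroring the structure of the request above; the function name `build_byoc_payload`, the square geometry and the one-line evalscript are placeholders.

```python
def build_byoc_payload(collection_id, geometry, evalscript, width=800, height=400):
    """Assemble a JSON body shaped like the request above for a BYOC collection."""
    return {
        "input": {
            "bounds": {"geometry": geometry},
            "data": [{"type": f"byoc-{collection_id}"}],
        },
        "output": {"width": width, "height": height},
        "evalscript": evalscript,
    }


# Placeholder geometry and evalscript, for illustration only.
square = {
    "type": "Polygon",
    "coordinates": [
        [[13.0, 47.0], [14.0, 47.0], [14.0, 48.0], [13.0, 48.0], [13.0, 47.0]]
    ],
}
payload = build_byoc_payload(
    "a9743257-ef21-4fd3-999d-15c5c7fcacbd", square, "//VERSION=3"
)
print(payload["input"]["data"][0]["type"])
```

The resulting dictionary would be passed as the `json=` argument of `requests.post`, together with the same `Authorization` header.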

#### Satellite Imagery 
Now that we have the necessary token we can access the data through the processing API.  
As described in the documentation we can access multiple datasets, but for this challenge we consider the following the most relevant:  
- [S1GRD](https://docs.sentinel-hub.com/api/latest/data/sentinel-1-grd/)
- [S2L1C](https://docs.sentinel-hub.com/api/latest/data/sentinel-2-l1c/#available-bands-and-data)
- [S2L2A](https://docs.sentinel-hub.com/api/latest/data/sentinel-2-l2a/#available-bands-and-data)
- [S3OLCI](https://docs.sentinel-hub.com/api/latest/data/sentinel-3-olci-l1b/#available-bands-and-data)
- [S3SLSTR](https://docs.sentinel-hub.com/api/latest/data/sentinel-3-slstr-l1b/#available-bands-and-data)
- [S5PL2](https://docs.sentinel-hub.com/api/latest/data/sentinel-5p-l2/#available-bands-and-data)  

Also have a look at the linked references in the list, as they show the available bands for each dataset.
The identifiers in the list can then be used in the data type definition of the request.  
The example that follows is taken from the API documentation: we select a bounding box, the bands to be used, and a function that defines how each pixel is evaluated.   

In [8]:
import requests

available_datasets = ['S2L1C', 'S2L2A']
responses = {}

for ds in available_datasets:
    responses[ds] = requests.post('https://services.sentinel-hub.com/api/v1/process',
      headers={"Authorization" : "Bearer %s"%(token['access_token'])},
      json={
        "input": {
            "bounds": {
                "bbox": [ 13.45, 45.4, 13.55,45.5 ]
            },
            "data": [{
                "type": ds
            }]
        },
        "evalscript": """
        //VERSION=3

        function setup() {
          return {
            input: ["B02", "B03", "B04"],
            output: {
              bands: 3
            }
          };
        }

        function evaluatePixel(
          sample,
          scenes,
          inputMetadata,
          customData,
          outputMetadata
        ) {
          return [2.5 * sample.B04, 2.5 * sample.B03, 2.5 * sample.B02];
        }
        """
    })
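
A swapped or out-of-range bounding box is an easy mistake to make. The sketch below validates a box before it is sent, assuming the coordinates are given as `[min_lon, min_lat, max_lon, max_lat]` in degrees, which appears to be the order used in the request above. The helper name `validate_bbox` is hypothetical.

```python
def validate_bbox(bbox):
    """Check a [min_lon, min_lat, max_lon, max_lat] box given in degrees.

    Assumes longitude/latitude order (WGS84), as in the request above.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four numbers")
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon < max_lon <= 180):
        raise ValueError("longitudes must satisfy -180 <= min < max <= 180")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min < max <= 90")
    return list(bbox)


print(validate_bbox([13.45, 45.4, 13.55, 45.5]))
```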

### 3. Accessing Raster Data [(NASA API)](https://nasa-impact.github.io/veda-documentation/)

Data is accessed using a [STAC endpoint](https://staging-stac.delta-backend.xyz/docs) 

In [9]:
!pip install pystac

Collecting pystac
  Downloading pystac-1.4.0-py3-none-any.whl (137 kB)
Installing collected packages: pystac
Successfully installed pystac-1.4.0

In [10]:
from pystac import Catalog
import concurrent.futures
import datetime as dt
from ipyleaflet import basemaps, Map, GeoJSON
import json
import requests as re
import matplotlib.pyplot as plt
import pprint
import time

STAC_ENDPOINT_URL = "https://staging-stac.delta-backend.xyz"

In [11]:
stac_api_url = 'https://staging-stac.delta-backend.xyz/'
catalog = Catalog.from_file(stac_api_url)

Below we list the available datasets.

In [12]:
for root, subcatalogs, items in catalog.walk():
    # subcatalogs holds any catalogs or collections owned by root
    for cat in subcatalogs:
        print(cat.id)

nightlights-hd-1band
social-vulnerability-index-housing-nopop
grdi-v1-built
MO_NPP_npp_vgpm
nightlights-hd-monthly
HLSS30.002
HLSL30.002
social-vulnerability-index-household
grdi-v1-raster
grdi-shdi-raster
facebook_population_density
grdi-vnl-slope-raster
social-vulnerability-index-socioeconomic
social-vulnerability-index-socioeconomic-nopop
grdi-filled-missing-values-count
grdi-vnl-raster
grdi-cdr-raster
blue-tarp-planetscope
IS2SITMOGR4-cog
social-vulnerability-index-household-nopop
social-vulnerability-index-minority
social-vulnerability-index-overall-nopop
OMSO2PCA-COG
OMI_trno2-COG
social-vulnerability-index-overall
no2-monthly-diff
no2-monthly
social-vulnerability-index-housing
nightlights-500m-daily
blue-tarp-detection
social-vulnerability-index-minority-nopop
nceo_africa_2017
geoglam
grdi-imr-raster


Select an indicator of interest and load its metadata as JSON.

In [13]:
re.get(f"{STAC_ENDPOINT_URL}/collections/no2-monthly").json()

{'id': 'no2-monthly',
 'type': 'Collection',
 'links': [{'rel': 'items',
   'type': 'application/geo+json',
   'href': 'https://staging-stac.delta-backend.xyz/collections/no2-monthly/items'},
  {'rel': 'parent',
   'type': 'application/json',
   'href': 'https://staging-stac.delta-backend.xyz/'},
  {'rel': 'root',
   'type': 'application/json',
   'href': 'https://staging-stac.delta-backend.xyz/'},
  {'rel': 'self',
   'type': 'application/json',
   'href': 'https://staging-stac.delta-backend.xyz/collections/no2-monthly'}],
 'title': 'NO₂',
 'extent': {'spatial': {'bbox': [[-180, -90, 180, 90]]},
  'temporal': {'interval': [['2016-01-01T00:00:00Z',
     '2022-01-01T00:00:00Z']]}},
 'license': 'MIT',
 'description': 'Darker colors indicate higher nitrogen dioxide (NO₂) levels and more activity. Lighter colors indicate lower levels of NO₂ and less activity. Missing pixels indicate areas of no data most likely associated with cloud cover or snow.',
 'item_assets': {'cog_default': {'type':
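
Rather than hard-coding the items URL, it can be read from the `links` array of the collection document. The sketch below shows one way to do this; the helper name `find_link` is hypothetical, and the dictionary is a subset of the output printed above.

```python
def find_link(collection, rel):
    """Return the href of the first link with the given rel, or None."""
    for link in collection.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# Subset of the collection document printed above.
collection = {
    "id": "no2-monthly",
    "links": [
        {
            "rel": "items",
            "type": "application/geo+json",
            "href": "https://staging-stac.delta-backend.xyz/collections/no2-monthly/items",
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://staging-stac.delta-backend.xyz/collections/no2-monthly",
        },
    ],
}
print(find_link(collection, "items"))
```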

Find the periodicity of the data.

In [14]:
pprint.pprint({
    k:v for k,v in re.get(f"{STAC_ENDPOINT_URL}/collections/no2-monthly").json().items()
    if k in ["dashboard:is_periodic", "dashboard:time_density", "summaries"]
})

{'dashboard:is_periodic': True, 'dashboard:time_density': 'month'}
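
With a periodic collection and a monthly time density, the expected time steps can be enumerated from the temporal extent reported earlier (2016-01-01 to 2022-01-01). The sketch below assumes one step per calendar month, stamped on the first day of the month; that is an assumption about how the items are dated, not something the metadata above confirms.

```python
import datetime as dt


def monthly_steps(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield dt.date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Temporal extent reported for the no2-monthly collection.
steps = list(monthly_steps(dt.date(2016, 1, 1), dt.date(2022, 1, 1)))
print(len(steps), steps[0], steps[-1])
```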


Inspect one of the monthly measurements.

<span id="papermill-error-cell" style="color:red; font-family:Helvetica Neue, Helvetica, Arial, sans-serif; font-size:2em;">Execution using papermill encountered an exception here and stopped:</span>

In [15]:
items = re.get(f"{STAC_ENDPOINT_URL}/collections/no2-monthly/items?limit=100").json()["features"]
items[0]

IndexError: list index out of range
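
The `IndexError` suggests that the request returned an empty `features` list; this is an assumption, since the response body is not shown. A guarded access avoids the crash and makes the empty case explicit. The helper name `first_feature` and the sample payloads below are hypothetical.

```python
def first_feature(feature_collection):
    """Return the first feature of a GeoJSON-like dict, or None if there is none."""
    features = feature_collection.get("features") or []
    return features[0] if features else None


# Illustrative payloads; the empty one mimics the failing response.
print(first_feature({"type": "FeatureCollection", "features": []}))
print(first_feature({"type": "FeatureCollection", "features": [{"id": "example"}]}))
```

In the cell above, this would amount to keeping the parsed JSON in a variable, calling `first_feature` on it, and checking for `None` before inspecting the item.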