## Run this notebook

You can launch this notebook on mybinder by clicking the button below.

<a href="https://mybinder.org/v2/gh/NASA-IMPACT/veda-docs/HEAD?labpath=example-notebooks/nceo-biomass-statistics.ipynb">
<img src="https://mybinder.org/badge_logo.svg" alt="Binder" title="A cute binder" width="150"/> 
</a>

## Approach

   1. 

## About the Data

[NCEO Aboveground Woody Biomass 2017](https://ceos.org/gst/africa-biomass.html) is a map of aboveground woody biomass (AGB) for the year 2017 at 100 m spatial resolution, developed using a combination of LiDAR, Synthetic Aperture Radar (SAR) and optical data. AGB plays a key role in the study of the Earth's carbon cycle and its response to climate change. It is expressed as dry matter in Mg ha⁻¹, and estimating it from Earth Observation measurements is an effective method for regional-scale studies.

**Important note**: Users of this dataset should keep in mind that the map is a continental-scale dataset, generated using a combination of different remote sensing data types, with a single method for the whole study area. Therefore, users should understand that accuracy may vary for different regions and vegetation types.
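The 100 m resolution is nominal. The item metadata shown later in this notebook reports an unprojected grid (`proj:epsg` 4326) with a pixel size of about 0.000898 degrees, which corresponds to 100 m only at the equator. The sketch below shows the conversion under a simple spherical approximation; the WGS84 equatorial radius and the helper names are assumptions made for illustration, not part of the dataset documentation.

```python
import math

# Assumption: WGS84 equatorial radius in metres, used as a spherical approximation.
WGS84_EQUATORIAL_RADIUS_M = 6378137.0
METRES_PER_DEGREE = 2 * math.pi * WGS84_EQUATORIAL_RADIUS_M / 360


def metres_to_degrees_at_equator(metres: float) -> float:
    """Convert a ground distance along the equator to degrees of arc."""
    return metres / METRES_PER_DEGREE


def pixel_width_m(pixel_deg: float, latitude_deg: float) -> float:
    """Approximate east-west ground width of a pixel at a given latitude."""
    return pixel_deg * METRES_PER_DEGREE * math.cos(math.radians(latitude_deg))


pixel_deg = metres_to_degrees_at_equator(100.0)
print(f"100 m at the equator is about {pixel_deg:.10f} degrees")
print(f"pixel width at 35 degrees S: {pixel_width_m(pixel_deg, -35.0):.1f} m")
```

Pixels narrow in the east-west direction away from the equator, which matters if you convert per-hectare densities into totals by counting pixels.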

## The Case Study - 

TBD

## Querying the STAC API

In [1]:
from pystac_client import Client

In [2]:
# Provide STAC API endpoint
STAC_API_URL = "https://staging-stac.delta-backend.com/"

# Declare collection of interest - NCEO Biomass
collection = "nceo_africa_2017"

Now let's check how many total items are available. 

In [3]:
search = Client.open(STAC_API_URL).search(collections=[collection])
items = list(search.items())
print(f"Found {len(items)} items")

Found 1 items


This makes sense as there is only one item available, a map for 2017. 

In [4]:
# Explore one item to see what it contains
items[0]

ID: AGB_map_2017v0m_COG
Bounding Box: [-18.273529509559307, -35.054059016911935, 51.86423292864056, 37.73103856358817]
proj:bbox: [-18.273529509559307, -35.054059016911935, 51.86423292864056, 37.73103856358817]
proj:epsg: 4326.0
proj:shape: [81024.0, 78077.0]
end_datetime: 2017-12-31T23:59:59+00:00
proj:geometry: {'type': 'Polygon', 'coordinates': [[[-18.273529509559307, -35.054059016911935], [51.86423292864056, -35.054059016911935], [51.86423292864056, 37.73103856358817], [-18.273529509559307, 37.73103856358817], [-18.273529509559307, -35.054059016911935]]]}
proj:transform: [0.0008983152841195214, 0.0, -18.273529509559307, 0.0, -0.0008983152841195214, 37.73103856358817, 0.0, 0.0, 1.0]
start_datetime: 2017-01-01T00:00:00+00:00
stac_extensions: ['https://stac-extensions.github.io/projection/v1.0.0/schema.json', 'https://stac-extensions.github.io/raster/v1.1.0/schema.json']

href: s3://nasa-maap-data-store/file-staging/nasa-map/nceo-africa-2017/AGB_map_2017v0m_COG.tif
Title: Default COG Layer
Description: Cloud optimized default layer to display on map
Media type: image/tiff; application=geotiff; profile=cloud-optimized
Roles: ['data', 'layer']
Owner:
raster:bands: [{'scale': 1.0, 'nodata': 'inf', 'offset': 0.0, 'sampling': 'area', 'data_type': 'uint16', 'histogram': {'max': 429.0, 'min': 0.0, 'count': 11.0, 'buckets': [405348.0, 44948.0, 18365.0, 6377.0, 3675.0, 3388.0, 3785.0, 9453.0, 13108.0, 1186.0]}, 'statistics': {'mean': 37.58407913145342, 'stddev': 81.36678677343947, 'maximum': 429.0, 'minimum': 0.0, 'valid_percent': 50.42436439336373}}]

Rel: collection
Target: https://staging-stac.delta-backend.com/collections/nceo_africa_2017
Media Type: application/json

Rel: parent
Target: https://staging-stac.delta-backend.com/collections/nceo_africa_2017
Media Type: application/json

Rel: root
Target:
Media Type: application/json

Rel: self
Target: https://staging-stac.delta-backend.com/collections/nceo_africa_2017/items/AGB_map_2017v0m_COG
Media Type: application/geo+json


Take a look at the item's asset metadata. The `raster:bands` statistics show that the observed values range from a `min` of `0` to a `max` of `429` Mg ha⁻¹.
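The same metadata also carries a histogram, which gives a quick feel for how the values are distributed before any pixels are read. The sketch below works on values copied from the `raster:bands` output above. It assumes the buckets are equal-width between `min` and `max`, which the metadata does not state explicitly, and `bucket_shares` is a hypothetical helper name.

```python
# Histogram values copied from the `raster:bands` field of the item shown above.
histogram = {
    "min": 0.0,
    "max": 429.0,
    "buckets": [405348.0, 44948.0, 18365.0, 6377.0, 3675.0,
                3388.0, 3785.0, 9453.0, 13108.0, 1186.0],
}


def bucket_shares(hist):
    """Return (lower_edge, upper_edge, share_of_pixels) for each bucket.

    Assumption: buckets are equal-width between `min` and `max`.
    """
    lo, hi = hist["min"], hist["max"]
    counts = hist["buckets"]
    n = len(counts)
    total = sum(counts)
    edges = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return [(edges[i], edges[i + 1], counts[i] / total) for i in range(n)]


shares = bucket_shares(histogram)
for lower, upper, share in shares:
    print(f"{lower:6.1f} to {upper:6.1f} Mg/ha: {share:6.1%}")
```

Under that assumption, the lowest bucket (0 to 42.9 Mg ha⁻¹) holds roughly 80% of the counted pixels, which is consistent with a mean of about 37.6 sitting far below the maximum of 429.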

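In [29]:
# proj:epsg is stored as a float (4326.0) in the item metadata; cast it to an integer.
# A pystac Item is not subscriptable, so the value has to be read and written
# through the item's `properties` dictionary rather than by reassigning items[0].
for item in items:
    item.properties["proj:epsg"] = int(item.properties["proj:epsg"])

items[0].properties["proj:epsg"]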

## Reading and accessing the data

Now that we've explored the dataset through the STAC API, let's read and access the dataset itself. 

In [5]:
# Workaround for configuring S3 access; it is planned to move upstream into stackstac itself

import boto3
import stackstac
import rasterio as rio
import rioxarray  # registers the .rio accessor used for clipping below

gdal_env = stackstac.DEFAULT_GDAL_ENV.updated(
    always=dict(
        AWS_NO_SIGN_REQUEST=True,
        session=rio.session.AWSSession(boto3.Session()),
    )
)

In [23]:
# Reuse the items fetched earlier rather than querying the API a second time
da = stackstac.stack(items, gdal_env=gdal_env)
da

| | Array | Chunk |
|---|---|---|
| Bytes | 47.13 GiB | 8.00 MiB |
| Shape | (1, 1, 81025, 78078) | (1, 1, 1024, 1024) |
| Dask graph | 6160 chunks in 3 graph layers | |
| Data type | float64 numpy.ndarray | |
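The figures in this summary follow directly from the array shape, the chunk shape, and the data type, so they are easy to sanity-check. The short calculation below is a worked check using only the numbers printed above. Note that the array is `float64` (8 bytes per value) even though the band metadata lists the source `data_type` as `uint16` (2 bytes), so the in-memory size is four times what the raw values would need.

```python
import math

# Values read from the array summary above.
shape = (1, 1, 81025, 78078)        # (time, band, y, x)
chunk_shape = (1, 1, 1024, 1024)
bytes_per_value = 8                 # float64

total_gib = math.prod(shape) * bytes_per_value / 2**30
chunk_mib = math.prod(chunk_shape) * bytes_per_value / 2**20

# Each dimension is split into ceil(size / chunk size) pieces.
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunk_shape))

print(f"full array: {total_gib:.2f} GiB")
print(f"one chunk:  {chunk_mib:.2f} MiB")
print(f"chunks:     {n_chunks}")
```

Nothing has been downloaded at this point: the array is lazy, and only the chunks touched by a later computation are read.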


In [7]:
# Create an AOI for our study area

# Guinea (rectangular bounding box, longitude/latitude)
guinea_aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [-15.519958756713947, 12.732440363049193],
                [-15.519958756713947, 6.771426493209475],
                [-7.078554695621165, 6.771426493209475],
                [-7.078554695621165, 12.732440363049193],
                [-15.519958756713947, 12.732440363049193],
            ]
        ],
    },
}

# TODO: replace with admin 1 or admin 2 boundaries
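Before clipping, it can be worth checking that the AOI is a valid closed ring and that it actually falls inside the extent of the item. The sketch below does this with plain Python; `polygon_bounds` and `bbox_contains` are hypothetical helper names, and the coordinates are repeated from the cell above and from the item's bounding box so the example runs on its own.

```python
# Same rectangle as the AOI above, repeated so the sketch is self-contained.
aoi_geometry = {
    "type": "Polygon",
    "coordinates": [[
        [-15.519958756713947, 12.732440363049193],
        [-15.519958756713947, 6.771426493209475],
        [-7.078554695621165, 6.771426493209475],
        [-7.078554695621165, 12.732440363049193],
        [-15.519958756713947, 12.732440363049193],
    ]],
}

# Bounding box of the item as reported by the STAC API (west, south, east, north).
item_bbox = (-18.273529509559307, -35.054059016911935,
             51.86423292864056, 37.73103856358817)


def polygon_bounds(geometry):
    """Return (west, south, east, north) of a GeoJSON Polygon's exterior ring."""
    ring = geometry["coordinates"][0]
    if ring[0] != ring[-1]:
        raise ValueError("GeoJSON rings must start and end at the same point")
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)


def bbox_contains(outer, inner):
    """True if the `inner` bounding box lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


aoi_bounds = polygon_bounds(aoi_geometry)
print("AOI bounds:", aoi_bounds)
print("AOI inside item extent:", bbox_contains(item_bbox, aoi_bounds))
```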

In [8]:
# Subset to bounding box of Guinea
subset = da.rio.clip([guinea_aoi["geometry"]])
subset

| | Array | Chunk |
|---|---|---|
| Bytes | 475.76 MiB | 8.00 MiB |
| Shape | (1, 1, 6636, 9397) | (1, 1, 1024, 1024) |
| Dask graph | 77 chunks in 7 graph layers | |
| Data type | float64 numpy.ndarray | |
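The shape of the clipped array can be cross-checked against the AOI: dividing the width and height of the bounding box by the pixel size from `proj:transform` should give approximately the number of columns and rows. This is a rough check that assumes a rectangular AOI and ignores how partially covered edge pixels are handled, so expect agreement to within a pixel rather than an exact guarantee.

```python
# Pixel size in degrees, taken from the item's proj:transform.
pixel_deg = 0.0008983152841195214

# AOI bounds (west, south, east, north), taken from the polygon defined earlier.
west, south, east, north = (-15.519958756713947, 6.771426493209475,
                            -7.078554695621165, 12.732440363049193)

expected_cols = round((east - west) / pixel_deg)
expected_rows = round((north - south) / pixel_deg)

# float64 values take 8 bytes each.
expected_mib = expected_rows * expected_cols * 8 / 2**20

print(f"expected shape: ({expected_rows}, {expected_cols})")
print(f"expected size:  {expected_mib:.2f} MiB")
```

For this AOI the estimate agrees with the `(6636, 9397)` shape and the 475.76 MiB reported in the summary above.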


In [19]:
# Select the band of interest; this dataset has only one band, the default
data_band = subset.sel(band="cog_default")
data_band

| | Array | Chunk |
|---|---|---|
| Bytes | 475.76 MiB | 8.00 MiB |
| Shape | (1, 6636, 9397) | (1, 1024, 1024) |
| Dask graph | 77 chunks in 8 graph layers | |
| Data type | float64 numpy.ndarray | |


In [30]:
#import hvplot.xarray
#data_band.hvplot(x="x", y="y", coastline=True, cmap="viridis")
