## Run this notebook

You can launch this notebook in VEDA JupyterHub by clicking the link below.

[Launch in VEDA JupyterHub (requires access)](https://hub.openveda.cloud/hub/user-redirect/git-pull?repo=https://github.com/NASA-IMPACT/veda-docs&urlpath=lab/tree/veda-docs/notebooks/quickstarts/visualize-zarr.ipynb&branch=main) 

<details><summary>Learn more</summary>
    
### Inside the Hub

This notebook was written on the VEDA JupyterHub and is designed to run on a JupyterHub associated with an AWS IAM role that has been granted access to the VEDA data store through its bucket policy. The instance used provided 16 GB of RAM.

See [VEDA Analytics JupyterHub Access](https://nasa-impact.github.io/veda-docs/veda-jh-access.html) for information about how to gain access.

### Outside the Hub

The data is in a protected bucket. To request access, email aimee@developmentseed.org or alexandra@developmentseed.org with your affiliation, your interest in or expected use of the dataset, and an AWS IAM role or user Amazon Resource Name (ARN). The team will help you configure the Cognito client.

You should then run:

```
%run -i 'cognito_login.py'
```
    
</details>

## Approach

   1. Use `intake` to open a STAC collection with `xarray` and `dask`
   2. Plot the data using `hvplot`

## About the data

This is the Gridded Daily OCO-2 Carbon Dioxide assimilated dataset. More information can be found at: [OCO-2 GEOS Level 3 daily, 0.5x0.625 assimilated CO2 V10r (OCO2_GEOS_L3CO2_DAY)](https://catalog.data.gov/dataset/oco-2-geos-level-3-daily-0-5x0-625-assimilated-co2-v10r-oco2-geos-l3co2-day-at-ges-disc-72b15)

The data has been converted to zarr format and published to the development version of the VEDA STAC Catalog.

In [1]:
import intake
import hvplot.xarray  # noqa

## Declare your collection of interest

You can discover available collections in the following ways:

* Programmatically: see example in the `list-collections.ipynb` notebook
* JSON API: https://staging-stac.delta-backend.com/collections
* STAC Browser: http://veda-staging-stac-browser.s3-website-us-west-2.amazonaws.com
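
As a rough illustration of the JSON API route, the sketch below builds the `/collections` URL and pulls collection IDs out of the response. The helper names (`collections_url`, `collection_ids`, `fetch_collection_ids`) are hypothetical and not part of this notebook. It assumes the endpoint returns the standard STAC API payload with a top-level `collections` list, and it does not follow pagination links, so it may return only the first page.

```python
import json
from urllib.request import urlopen

STAC_API_URL = "https://staging-stac.delta-backend.com/"


def collections_url(api_url):
    """Join the API root and the collections path without a double slash."""
    return f"{api_url.rstrip('/')}/collections"


def collection_ids(payload):
    """Extract collection IDs from a parsed STAC API /collections response."""
    return [c["id"] for c in payload.get("collections", [])]


def fetch_collection_ids(api_url, timeout=30):
    """Fetch the first page of collections and return their IDs (needs network access)."""
    with urlopen(collections_url(api_url), timeout=timeout) as resp:
        return collection_ids(json.load(resp))


print(collections_url(STAC_API_URL))
```

Calling `fetch_collection_ids(STAC_API_URL)` should then return a list of IDs that can be used as `collection_id`, provided the staging API is reachable from your environment.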

In [2]:
STAC_API_URL = "https://staging-stac.delta-backend.com/"
collection_id = "oco2-geos-l3-daily"

## Get STAC collection

Use `intake` to get the entire STAC collection.

In [3]:
collection = intake.open_stac_collection(f"{STAC_API_URL.rstrip('/')}/collections/{collection_id}")
collection

oco2-geos-l3-daily:
  args:
    stac_obj: https://staging-stac.delta-backend.com/collections/oco2-geos-l3-daily
  description: ''
  driver: intake_stac.catalog.StacCollection
  metadata:
    assets:
      zarr:
        description: Zarr array store with one or several arrays (variables)
        href: s3://veda-data-store-staging/EIS/zarr/OCO2_GEOS_L3CO2_day.zarr
        roles:
        - data
        - zarr
        title: Zarr Array Store
        type: application/vnd+zarr
        xarray:open_kwargs:
          chunks: {}
          consolidated: true
          engine: zarr
    cube:dimensions:
      lat:
        axis: y
        description: latitude
        extent:
        - -90.0
        - 90.0
        reference_system: 4326
        type: spatial
      lon:
        axis: x
        description: longitude
        extent:
        - -180.0
        - 179.375
        reference_system: 4326
        type: spatial
      time:
        description: time
        extent:
        - '2015-01-01T12:00
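
The collection metadata printed above already contains what is needed to open the asset by hand: the `href` of the `zarr` asset and its `xarray:open_kwargs`. The sketch below copies that subset into a plain dictionary and extracts both. The helper `asset_open_args` is hypothetical, and the commented `xarray.open_dataset` call is an assumed manual equivalent, not a description of what `intake` does internally.

```python
# Subset of the collection metadata shown in the output above, as a plain dict.
collection_metadata = {
    "assets": {
        "zarr": {
            "href": "s3://veda-data-store-staging/EIS/zarr/OCO2_GEOS_L3CO2_day.zarr",
            "roles": ["data", "zarr"],
            "type": "application/vnd+zarr",
            "xarray:open_kwargs": {
                "chunks": {},
                "consolidated": True,
                "engine": "zarr",
            },
        }
    }
}


def asset_open_args(metadata, asset_key="zarr"):
    """Return (href, open_kwargs) for one asset of a STAC collection."""
    asset = metadata["assets"][asset_key]
    # Copy the kwargs so callers can modify them without touching the metadata.
    return asset["href"], dict(asset.get("xarray:open_kwargs", {}))


href, open_kwargs = asset_open_args(collection_metadata)
print(href)
print(open_kwargs)

# Assumption: with xarray, zarr and s3fs installed and access to the bucket,
# the dataset could then be opened along these lines:
# ds = xarray.open_dataset(href, **open_kwargs)
```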

## Read from zarr to xarray

Intake lets you go straight from the asset to an xarray dataset backed by a dask array.

In [4]:
source = collection.get_asset("zarr")

ds = source.to_dask()
ds

| | Array | Chunk |
|---|---|---|
| Bytes | 3.87 GiB | 7.63 MiB |
| Shape | (2500, 361, 576) | (100, 100, 100) |
| Dask graph | 600 chunks in 2 graph layers | 600 chunks in 2 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |

In `xarray` you can inspect just one data variable using dot notation:

In [5]:
ds.XCO2

| | Array | Chunk |
|---|---|---|
| Bytes | 3.87 GiB | 7.63 MiB |
| Shape | (2500, 361, 576) | (100, 100, 100) |
| Dask graph | 600 chunks in 2 graph layers | 600 chunks in 2 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |

## Plot data

We can plot the XCO2 variable as an interactive map (with date slider) using `hvplot`.

In [6]:
ds.XCO2.hvplot(
    x="lon",
    y="lat",
    groupby="time",
    coastline=True,
    rasterize=True,
    aggregator="mean",
    widget_location="bottom",
    frame_width=600,
)