# New STAC collection

This notebook is a starting point for data providers who want to add a new dataset to the STAC API.

Additional resources: https://github.com/NASA-IMPACT/delta-backend/issues/29/

## Run this notebook

This notebook is designed to run on a VEDA JupyterHub instance, either https://nasa-veda.2i2c.cloud or https://daskhub.veda.smce.nasa.gov/

We'll start by installing and then importing the packages we need.

In [1]:
!pip install -U pystac

Collecting pystac
  Using cached pystac-1.7.3-py3-none-any.whl (150 kB)
Installing collected packages: pystac
  Attempting uninstall: pystac
    Found existing installation: pystac 1.6.1
    Uninstalling pystac-1.6.1:
      Successfully uninstalled pystac-1.6.1
Successfully installed pystac-1.7.3


In [2]:
from datetime import datetime, timezone
import pystac

## Create `pystac.Collection`

In this section we will create a `pystac.Collection` object. This is the part of the notebook that you should update for your own dataset.

### Declare constants

Start by declaring some string and boolean fields.

In [3]:
COLLECTION_ID = "no2-monthly-diff"
TITLE = "NO₂ (Diff)"
DESCRIPTION = (
    "This layer shows changes in nitrogen dioxide (NO₂) levels. Redder colors "
    "indicate increases in NO₂. Bluer colors indicate lower levels of NO₂. "
    "Missing pixels indicate areas of no data most likely associated with "
    "cloud cover or snow."
)
DASHBOARD__IS_PERIODIC = True
DASHBOARD__TIME_DENSITY = "month"
LICENSE = "CC0-1.0"

### Extents

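The extent records the temporal range of the data (a start time and, optionally, an end time) and its spatial footprint as a bounding box in `[west, south, east, north]` order. An end time of `None` leaves the interval open-ended.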

In [4]:
# Time must be in UTC
demo_time = datetime.now(tz=timezone.utc)

extent = pystac.Extent(
    pystac.SpatialExtent([[-180.0, -90.0, 180.0, 90.0]]),
    pystac.TemporalExtent([[demo_time, None]]),
)
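
For a dataset with a known start and end, the interval can be bounded on both sides instead of using the current time. Below is a minimal sketch using only the standard library. `utc_datetime` is a hypothetical helper, the dates are placeholders, and it assumes that inputs without a timezone are already expressed in UTC.

```python
from datetime import datetime, timezone


def utc_datetime(iso_string: str) -> datetime:
    """Parse an ISO 8601 date or datetime string into a UTC-aware datetime."""
    parsed = datetime.fromisoformat(iso_string)
    if parsed.tzinfo is None:
        # Assumption: naive inputs are already in UTC, so we only attach the tzinfo.
        return parsed.replace(tzinfo=timezone.utc)
    # Inputs with an explicit offset are converted to UTC.
    return parsed.astimezone(timezone.utc)


# Placeholder date range, for illustration only.
start = utc_datetime("2015-01-01")
end = utc_datetime("2022-12-31T23:59:59")

# Same nested-list shape that pystac.TemporalExtent receives in the cell above.
bounded_interval = [[start, end]]
```

Passing `bounded_interval` to `pystac.TemporalExtent` in place of `[[demo_time, None]]` should then give a closed temporal extent.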

### Providers

We know that the data host, processor, and producer is "VEDA", but you can include other providers that fill other roles in the data creation pipeline.

In [5]:
providers = [
    pystac.Provider(
        name="VEDA",
        roles=[pystac.ProviderRole.PRODUCER, pystac.ProviderRole.PROCESSOR, pystac.ProviderRole.HOST],
        url="https://github.com/nasa-impact/veda-data-pipelines",
    )
]

### Put it together

Now combine your constants, the extent, and the providers to create a `pystac.Collection`.

In [6]:
collection = pystac.Collection(
    id=COLLECTION_ID,
    title=TITLE,
    description=DESCRIPTION,
    extra_fields={
        "dashboard:is_periodic": DASHBOARD__IS_PERIODIC,
        "dashboard:time_density": DASHBOARD__TIME_DENSITY,
    },
    license=LICENSE,
    extent=extent,
    providers=providers,
)

### Try it out!

Now that you have a collection, you can try it out to make sure that it looks how you expect and that it passes validation checks.

In [7]:
collection.validate()

['https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json']

In [8]:
collection.to_dict()

{'type': 'Collection',
 'id': 'no2-monthly-diff',
 'stac_version': '1.0.0',
 'description': 'This layer shows changes in nitrogen dioxide (NO₂) levels. Redder colors indicate increases in NO₂. Bluer colors indicate lower levels of NO₂. Missing pixels indicate areas of no data most likely associated with cloud cover or snow.',
 'links': [],
 'dashboard:is_periodic': True,
 'dashboard:time_density': 'month',
 'title': 'NO₂ (Diff)',
 'extent': {'spatial': {'bbox': [[-180.0, -90.0, 180.0, 90.0]]},
  'temporal': {'interval': [['2023-06-09T20:44:57.091389Z', None]]}},
 'license': 'CC0-1.0',
 'providers': [{'name': 'VEDA',
   'roles': [<ProviderRole.PRODUCER: 'producer'>,
    <ProviderRole.PROCESSOR: 'processor'>,
    <ProviderRole.HOST: 'host'>],
   'url': 'https://github.com/nasa-impact/veda-data-pipelines'}]}

## Save it

In [9]:
collection.save_object(include_self_link=False, dest_href=f"{COLLECTION_ID}/collection.json")