## Authenticate with VEDA backend

In [48]:
%pip install cognito-client --quiet

Note: you may need to restart the kernel to use updated packages.


Running the following cell prompts for your `CognitoClient` `username` and `password`. If you do not already have these credentials, please reach out to the VEDA Data Services team to have an account set up for you. The first time you log in with new credentials using the `CognitoClient` in this notebook, you will be prompted to set a new password.

In [49]:
from cognito_client import CognitoClient

client = CognitoClient(
    client_id="o8c93cebc17upumgstlbqm44f",
    user_pool_id="us-west-2_9mMSsMcxw",
    identity_pool_id="us-west-2:40f39c19-ab88-4d0b-85a3-3bad4eacbfc0",
)
_ = client.login()

TOKEN = client.access_token
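
Amazon Cognito issues access tokens as JSON Web Tokens (JWTs), which carry an expiry time, so a long notebook session can outlive the token stored in `TOKEN`. The sketch below is not part of the `cognito-client` package; it reads the `exp` claim from a token without verifying its signature, and the helper names `jwt_expiry` and `seconds_remaining` are hypothetical.

```python
import base64
import json
import time


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) of a JWT, without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])


def seconds_remaining(token: str, now=None) -> float:
    """Return how many seconds are left before the token expires."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now


# Hypothetical usage in this notebook, after logging in:
# if seconds_remaining(TOKEN) <= 0:
#     _ = client.login()
#     TOKEN = client.access_token
```

If the remaining time is zero or negative, logging in again and re-reading `client.access_token` should give a fresh token.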

In [50]:
import os

import rio_cogeo
import rasterio
import boto3
import requests

## Define item metadata

In [42]:
API = "https://ig9v64uky8.execute-api.us-west-2.amazonaws.com/staging/"

LOCAL_FILE_PATH = "spec_prob_mosaic_2022-10-03_day.tif"

TARGET_FILENAME = "spec_prob_mosaic_2022-10-03_day.tif"
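
The filename carries the observation date, which is also used for the temporal extent in the dataset definition later in this notebook. As a minimal sketch, assuming every filename contains exactly one `YYYY-MM-DD` date and that an item covers that whole UTC day, the extent can be derived rather than typed by hand. The helper name `temporal_extent_from_filename` is hypothetical.

```python
import re
from datetime import datetime, time, timezone


def temporal_extent_from_filename(filename: str) -> dict:
    """Build a one-day temporal extent from the YYYY-MM-DD date in a filename."""
    match = re.search(r"\d{4}-\d{2}-\d{2}", filename)
    if match is None:
        raise ValueError(f"No YYYY-MM-DD date found in {filename!r}")
    day = datetime.strptime(match.group(0), "%Y-%m-%d").date()
    # Assumption: the item covers the full UTC day, first second to last second.
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(day, time(23, 59, 59), tzinfo=timezone.utc)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {"startdate": start.strftime(fmt), "enddate": end.strftime(fmt)}


print(temporal_extent_from_filename("spec_prob_mosaic_2022-10-03_day.tif"))
# {'startdate': '2022-10-03T00:00:00Z', 'enddate': '2022-10-03T23:59:59Z'}
```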

## Validate data format

The following code tests whether the file you plan to upload is a Cloud Optimized GeoTIFF (COG), a format that enables more efficient workflows in cloud environments. If validation finds that the file is not a COG, the code converts it into one.

In [43]:
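# cog_validate returns a tuple: (is_valid, errors, warnings)
file_is_a_cog, cog_errors, cog_warnings = rio_cogeo.cog_validate(LOCAL_FILE_PATH)
if not file_is_a_cog:
    print("File is not a COG - converting")
    # Write the COG to a temporary path, then swap it in for the original file
    tmp_path = LOCAL_FILE_PATH + ".tmp"
    profile = rio_cogeo.cog_profiles.get("deflate")
    rio_cogeo.cog_translate(LOCAL_FILE_PATH, tmp_path, profile, in_memory=True)
    os.replace(tmp_path, LOCAL_FILE_PATH)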

RasterioIOError: spec_prob_mosaic_2022-10-03_day.tif: No such file or directory
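
The `RasterioIOError` above is raised when `LOCAL_FILE_PATH` does not point to an existing file, typically because the notebook's working directory is not the one that holds the data. A small guard run before validation can make that failure easier to read. This is a sketch using only the standard library, and `require_local_file` is a hypothetical helper name.

```python
from pathlib import Path


def require_local_file(path: str) -> Path:
    """Return the path as a Path object, or fail early with a descriptive error."""
    candidate = Path(path)
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate.resolve()} does not exist; "
            "check LOCAL_FILE_PATH and the current working directory"
        )
    return candidate


# Hypothetical usage before validating the file:
# require_local_file(LOCAL_FILE_PATH)
```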

## Upload file to S3

The following code uploads your COG to the `veda-data-store-staging` bucket on `S3`. The object is stored under the `disturbance-probability-percentile/` prefix and named after `TARGET_FILENAME`, which carries the date values provided earlier in this notebook.

In [44]:
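s3 = boto3.client("s3")
BUCKET = "veda-data-store-staging"
# The object key is the path inside the bucket; it must not include the bucket name
KEY = f"disturbance-probability-percentile/{TARGET_FILENAME}"
S3_FILE_LOCATION = f"s3://{BUCKET}/{KEY}"

# The upload is disabled here; change False to True to actually upload the file
if False:
    s3.upload_file(LOCAL_FILE_PATH, BUCKET, KEY)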
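
boto3's `upload_file` takes the bucket and the object key as separate arguments (`Filename`, `Bucket`, `Key`), while the dataset definition refers to the file by a single `s3://` URI. The sketch below shows how the two forms relate, using only the standard library; the helper names `s3_location` and `split_s3_uri` are hypothetical.

```python
from urllib.parse import urlparse


def s3_location(bucket: str, prefix: str, filename: str) -> tuple:
    """Return (key, uri) for an object stored under `prefix` in `bucket`."""
    key = f"{prefix.strip('/')}/{filename}"
    return key, f"s3://{bucket}/{key}"


def split_s3_uri(uri: str) -> tuple:
    """Split an s3:// URI back into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


key, uri = s3_location(
    "veda-data-store-staging",
    "disturbance-probability-percentile",
    "spec_prob_mosaic_2022-10-03_day.tif",
)
print(key)  # disturbance-probability-percentile/spec_prob_mosaic_2022-10-03_day.tif
print(uri)  # s3://veda-data-store-staging/disturbance-probability-percentile/spec_prob_mosaic_2022-10-03_day.tif
```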

## Construct dataset definition

In [53]:
dataset = {
    "collection": "disturbance-probability-percentile",
    "title": "Near Real-time Disturbance probability map (%)",
    "data_type": "cog",
    "spatial_extent": {"bbox": [[-84.132, 25.224, -79.853, 30.728]]},
    "temporal_extent": {
        "startdate": "2022-10-03T00:00:00Z",
        "enddate": "2022-10-03T23:59:59Z",
    },
    "license": "CC0-1.0",
    "providers": [
        {
            "name": "Su Ye",
            "description": "Su Ye was a Postdoc associate of GERS lab who developed the core disturbance detection algorithms and built the NRT platform.",
            "roles": [
                "producer",
            ],
            "url": "https://gers.users.earthengine.app/view/nrt-conus",
        },
        {
            "name": "Zhe Zhu",
            "description": "Zhe Zhu is an Assistant Professor in the Department of Natural Resources & the Environment at the University of Connecticut. Zhe led the development of the core disturbance detection algorithms.",
            "roles": [
                "producer",
            ],
            "url": "https://gers.users.earthengine.app/view/nrt-conus",
        },
        {
            "name": "Global Environmental Remote Sensing Laboratory",
            "description": "The Global Environmental Remote Sensing Laboratory (GERS Lab) at the University of Connecticut uses quantitative remote sensing to understand how the world is changing.",
            "roles": [
                "host",
            ],
            "url": "https://gerslab.uconn.edu/",
        },
    ],
    "description": "The UCONN GERS lab developed a near real-time platform ('CONUS Disturbance Watcher') for detecting land disturbances from the Harmonized Landsat Sentinel-2 (HLS) dataset. The platform first applies Stochastic Continuous Change Detection (S-CCD) to update spectral change magnitudes and other disturbance features based on the latest images, and then predicts disturbance probability using models pre-trained on historical disturbance datasets.",
    "is_periodic": False,
    "time_density": None,
    "sample_files": [S3_FILE_LOCATION],
    "discovery_items": [
        {
            "discovery": "s3",
            "prefix": "disturbance-probability-percentile/",
            "bucket": "veda-data-store-staging",
            "filename_regex": f"(.*){TARGET_FILENAME}$",
        }
    ],
}
import json

print(json.dumps(dataset, indent=2))

{
  "collection": "disturbance-probability-percentile",
  "title": "Near Real-time Disturbance probability map (%)",
  "data_type": "cog",
  "spatial_extent": {
    "bbox": [
      [
        -84.132,
        25.224,
        -79.853,
        30.728
      ]
    ]
  },
  "temporal_extent": {
    "startdate": "2022-10-03T00:00:00Z",
    "enddate": "2022-10-03T23:59:59Z"
  },
  "license": "CC0-1.0",
  "providers": [
    {
      "name": "Su Ye",
      "description": "Su Ye was a Postdoc associate of GERS lab who developed the core disturbance detection algorithms and built the NRT platform.",
      "roles": [
        "producer"
      ],
      "url": "https://gers.users.earthengine.app/view/nrt-conus"
    },
    {
      "name": "Zhe Zhu",
      "description": "Zhe Zhu is an Assistant Professor in the Department of Natural Resources & the Environment at the University of Connecticut. Zhe led the development of the core disturbance detection algorithms.",
      "roles": [
        "producer"
    

## Validate dataset definition

The following code block validates the dataset definition above. A successful response confirms that the definition is ready to be published on the VEDA Platform.

In [54]:
auth_header = f"Bearer {TOKEN}"
headers = {
    "Authorization": auth_header,
    "content-type": "application/json",
    "accept": "application/json",
}
response = requests.post((API + "dataset/validate"), json=dataset, headers=headers)
response.raise_for_status()
print(response.text)

HTTPError: 500 Server Error: Internal Server Error for url: https://ig9v64uky8.execute-api.us-west-2.amazonaws.com/staging/dataset/validate
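
A `500` response does not say which field, if any, is at fault, so it can help to rule out simple mistakes locally before calling the API. The sketch below is not the service's validation logic. It assumes each bounding box is ordered `[west, south, east, north]` in degrees and that dates use the `YYYY-MM-DDTHH:MM:SSZ` form shown in the dataset definition; `check_dataset` is a hypothetical helper name.

```python
from datetime import datetime


def check_dataset(dataset: dict) -> list:
    """Return a list of human-readable problems found in a dataset definition."""
    problems = []

    for index, bbox in enumerate(dataset["spatial_extent"]["bbox"]):
        if len(bbox) != 4:
            problems.append(f"bbox {index}: expected 4 values, got {len(bbox)}")
            continue
        west, south, east, north = bbox
        # Assumption: the box does not cross the antimeridian, so west <= east.
        if not (-180 <= west <= east <= 180):
            problems.append(f"bbox {index}: invalid longitudes {west}, {east}")
        if not (-90 <= south <= north <= 90):
            problems.append(f"bbox {index}: invalid latitudes {south}, {north}")

    fmt = "%Y-%m-%dT%H:%M:%SZ"
    extent = dataset["temporal_extent"]
    try:
        start = datetime.strptime(extent["startdate"], fmt)
        end = datetime.strptime(extent["enddate"], fmt)
    except ValueError as error:
        problems.append(f"temporal_extent: {error}")
    else:
        if start > end:
            problems.append("temporal_extent: startdate is after enddate")

    return problems


# Hypothetical usage before posting to the validate endpoint:
# for problem in check_dataset(dataset):
#     print(problem)
```

An empty list only means these local checks passed; the API may still reject the definition for other reasons.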

## Publish to STAC

The final code block below will initiate a workflow and publish the dataset to the VEDA Platform. 

In [55]:
response = requests.post((API + "dataset/publish"), json=dataset, headers=headers)
response.raise_for_status()
print(response.text)

HTTPError: 500 Server Error: Internal Server Error for url: https://ig9v64uky8.execute-api.us-west-2.amazonaws.com/staging/dataset/publish

Congratulations! You have now uploaded a COG dataset to the [VEDA Dashboard](https://www.earthdata.nasa.gov/dashboard/). To verify that ingestion succeeded, explore the data catalog, where the uploaded data should now be available for viewing and exploration.