# Notebook to Publish Items and Collections

This notebook publishes the collections in `ingestion-data/staging/collections` (testing mode) or `ingestion-data/production/collections`, excluding:
- 'hls-l30-002-ej-reprocessed'
- 'hls-s30-002-ej-reprocessed'
- 'ls8-covid-19-example-data'
- 'landsat-c2l2-sr-antarctic-glaciers-pine-island'
- 'landsat-c2l2-sr-lakes-aral-sea'
- 'landsat-c2l2-sr-lakes-tonle-sap'
- 'landsat-c2l2-sr-lakes-lake-balaton'
- 'landsat-c2l2-sr-lakes-vanern'
- 'landsat-c2l2-sr-antarctic-glaciers-thwaites'
- 'landsat-c2l2-sr-lakes-lake-biwa'
- 'combined_CMIP6_daily_GISS-E2-1-G_tas_kerchunk_DEMO'

In [None]:
import glob
import json
import requests
from cognito_client import CognitoClient

Set `testing_mode` to `True` when testing and to `False` otherwise. When `testing_mode` is `True`, the notebook reads collections from the `staging` directory and runs against the `test` endpoints.

In [None]:
testing_mode = True

The following cell retrieves the collection JSON files from the `collections` directory, drops the excluded collections, and saves each remaining file path and collection ID to a list.

In [None]:
excluded_collections = [
    "hls-l30-002-ej-reprocessed",
    "hls-s30-002-ej-reprocessed",
    "ls8-covid-19-example-data",
    "landsat-c2l2-sr-antarctic-glaciers-pine-island",
    "landsat-c2l2-sr-lakes-aral-sea",
    "landsat-c2l2-sr-lakes-tonle-sap",
    "landsat-c2l2-sr-lakes-lake-balaton",
    "landsat-c2l2-sr-lakes-vanern",
    "landsat-c2l2-sr-antarctic-glaciers-thwaites",
    "landsat-c2l2-sr-lakes-lake-biwa",
    "combined_CMIP6_daily_GISS-E2-1-G_tas_kerchunk_DEMO",
]

local_collections_path = (
    "../ingestion-data/staging/collections/*.json"
    if testing_mode
    else "../ingestion-data/production/collections/*.json"
)

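json_file_paths = glob.glob(local_collections_path)
# Keep only the files whose path contains none of the excluded collection IDs.
filtered_list = [
    path
    for path in json_file_paths
    if all(excluded not in path for excluded in excluded_collections)
]


def load_json(file_path):
    # Read with a context manager so the file handle is closed after parsing.
    with open(file_path, "r", encoding="utf-8") as file:
        return json.load(file)


file_paths_and_collection_ids = [
    {"filePath": file_path, "collectionId": data["id"]}
    for file_path in filtered_list
    if "id" in (data := load_json(file_path))
]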
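
The cell above excludes a file when an excluded ID appears anywhere in its path, so an ID that happens to be a substring of another file name would exclude both. A minimal sketch of an alternative that compares the `id` field inside each file instead; `load_collection_ids` is a hypothetical helper, not part of this notebook, and the collection IDs in the demonstration are made up:

```python
import json
import tempfile
from pathlib import Path


def load_collection_ids(paths, excluded_ids):
    """Return [{"filePath": ..., "collectionId": ...}] for files whose "id" is not excluded."""
    entries = []
    for path in paths:
        # The context manager closes each file as soon as it has been parsed.
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        # Skip anything that is not a JSON object or has no "id" field.
        if not isinstance(data, dict) or "id" not in data:
            continue
        # Exact comparison on the id, not a substring test on the path.
        if data["id"] in excluded_ids:
            continue
        entries.append({"filePath": str(path), "collectionId": data["id"]})
    return entries


# Demonstration with throwaway files (hypothetical collection IDs).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("keep-me", "skip-me"):
        Path(tmp, f"{name}.json").write_text(json.dumps({"id": name}), encoding="utf-8")
    filter_demo = load_collection_ids(sorted(Path(tmp).glob("*.json")), {"skip-me"})
    filter_demo_ids = [entry["collectionId"] for entry in filter_demo]

print(filter_demo_ids)
```

Whether exact matching is preferable depends on how the files are named; the substring test in the notebook is simpler when file names and IDs are known not to collide.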

Have your Cognito `username` and `password` ready, and replace each `CHANGE ME` value below. The following cell sets up the Cognito client and logs in to retrieve the token used to access the STAC Ingestor API.

In [None]:
test_endpoint = "https://test.openveda.cloud"
test_client_id = "CHANGE ME"
test_user_pool_id = "CHANGE ME"
test_identity_pool_id = "CHANGE ME"

mcp_prod_endpoint = "https://openveda.cloud"
mcp_prod_client_id = "CHANGE ME"
mcp_prod_user_pool_id = "CHANGE ME"
mcp_prod_identity_pool_id = "CHANGE ME"

if testing_mode:
    STAC_INGESTOR_API = f"{test_endpoint}/api/ingest/"
    VEDA_STAC_API = f"{test_endpoint}/api/stac/"
else:
    STAC_INGESTOR_API = f"{mcp_prod_endpoint}/api/ingest/"
    VEDA_STAC_API = f"{mcp_prod_endpoint}/api/stac/"

client = CognitoClient(
    client_id=test_client_id if testing_mode else mcp_prod_client_id,
    user_pool_id=test_user_pool_id if testing_mode else mcp_prod_user_pool_id,
    identity_pool_id=test_identity_pool_id
    if testing_mode
    else mcp_prod_identity_pool_id,
)
_ = client.login()

The following cell sets up headers for requests.

In [None]:
TOKEN = client.access_token

authorization_header = f"Bearer {TOKEN}"
headers = {
    "Authorization": authorization_header,
    "content-type": "application/json",
    "accept": "application/json",
}
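
If `client.access_token` were empty, the cell above would still build a header, and the problem would likely surface only when a request is rejected. A minimal sketch of a guard, assuming a hypothetical helper `build_headers` that is not part of this notebook or of `cognito_client`:

```python
def build_headers(token):
    """Return request headers for the ingestor, or raise if no token is available."""
    # Hypothetical helper: a missing token would otherwise be formatted into
    # the header as the literal text "Bearer None".
    if not token:
        raise ValueError("No access token available; log in before building headers.")
    return {
        "Authorization": f"Bearer {token}",
        "content-type": "application/json",
        "accept": "application/json",
    }


# Placeholder value for illustration, not a real token.
example_headers = build_headers("example-token")
print(example_headers["Authorization"])
```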

The following cell defines the function that will post the collection.

In [None]:
def post_collection(collection, collection_id):
    collection_url = f"{VEDA_STAC_API}collections/{collection_id}"
    ingest_url = f"{STAC_INGESTOR_API}collections"

    try:
        response = requests.post(ingest_url, json=collection, headers=headers)
        # Raise requests.HTTPError for 4xx/5xx responses; it is handled below.
        response.raise_for_status()
        if response.status_code == 201:
            print(
                f"Request was successful. Find the updated collection at {collection_url}"
            )
        else:
            print(
                f"Updating {collection_id} failed. Request failed with status code: {response.status_code}"
            )
    except requests.RequestException as e:
        print(
            f"Updating {collection_id} failed. An error occurred during the request: {e}"
        )
    except Exception as e:
        print(
            f"An unexpected error occurred while trying to update {collection_id}: {e}"
        )

If `testing_mode` is enabled, only the first collection in the list is published:

In [None]:
# Slicing keeps at most the first entry and does not fail on an empty list.
test_file_paths_and_collection_ids = file_paths_and_collection_ids[:1]
print(test_file_paths_and_collection_ids)
print(VEDA_STAC_API)


file_paths_and_collection_ids = (
    test_file_paths_and_collection_ids
    if testing_mode
    else file_paths_and_collection_ids
)
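
When the glob pattern matches no files there is nothing to publish, and it can help to stop with a clear message before any request is made. A minimal sketch, assuming a hypothetical helper `select_targets` (not part of this notebook) and made-up entries:

```python
def select_targets(entries, testing_mode):
    """Return only the first entry in testing mode, otherwise all entries."""
    # Fail early with an explicit message when no collection files were found.
    if not entries:
        raise ValueError("No collection files were found; check the collections path.")
    return entries[:1] if testing_mode else list(entries)


# Made-up entries in the same shape as file_paths_and_collection_ids.
target_demo = [
    {"filePath": "a.json", "collectionId": "a"},
    {"filePath": "b.json", "collectionId": "b"},
]
print(select_targets(target_demo, testing_mode=True))
```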

The following cell publishes each collection in the list to the ingestion API's `api/ingest/collections` endpoint.

In [None]:
for item in file_paths_and_collection_ids:
    collection_id = item["collectionId"]
    file_path = item["filePath"]

    try:
        with open(file_path, "r", encoding="utf-8") as file:
            collection = json.load(file)

        # Publish the collection to the ingestion API's `api/ingest/collections` endpoint
        post_collection(collection, collection_id)

    except requests.RequestException as e:
        print(f"An error occurred for collectionId {collection_id}: {e}")
    except Exception as e:
        print(f"An unexpected error occurred for collectionId {collection_id}: {e}")
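
With many collections, the printed messages can be hard to scan for failures. A minimal sketch of collecting the outcomes into a summary, assuming a hypothetical `publish_all` helper and a publish function that returns `True` on success; `post_collection` as written only prints and returns `None`, so it would need to be adapted to return a boolean first:

```python
def publish_all(entries, publish):
    """Call publish(entry) for each entry and group collection IDs by outcome."""
    results = {"succeeded": [], "failed": []}
    for entry in entries:
        try:
            ok = bool(publish(entry))
        except Exception as error:
            # Keep going so that one failure does not stop the whole run.
            print(f"{entry['collectionId']}: {error}")
            ok = False
        results["succeeded" if ok else "failed"].append(entry["collectionId"])
    return results


def fake_publish(entry):
    # Stand-in publisher for illustration only: pretends that "b" fails.
    return entry["collectionId"] != "b"


summary_entries = [{"collectionId": "a"}, {"collectionId": "b"}]
summary = publish_all(summary_entries, fake_publish)
print(summary)
```

Passing the publish function as an argument keeps the loop testable without network access.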