# Publishing Product to the Open Science Catalog
## Purpose
This tutorial explains how to publish your product to the Open Science Catalog (OSC), which is the last step in the publishing pipeline. You fill in a few fields describing your data product, then run the rest of the notebook to generate an appropriate `product` entry in the OSC.

We will do the following:
- Define our descriptive fields, such as id, title, description, extent, and more.
- Determine the relevant pre-existing metadata objects in the OSC, such as Project, Variables, Themes and EO Missions
- Generate a valid product JSON object containing all this information (later stored as `collection.json`)
- Store this JSON as a valid STAC object in the `open-science-catalog-metadata-staging` repository
- Update relevant pre-existing metadata objects to link to our new object
- Explain how to use Git to create a Pull Request with our new OSC entry

## Prerequisites
This notebook assumes that you have already prepared your Item Catalog / Data Package as a self-contained STAC catalog in some other, persistent repository. You should have a link to a `catalog.json` file stored remotely. 

If you haven't, please refer to the tutorials and guides on how you should create your Item Catalog.

# Importing dependencies

In [None]:
from datetime import datetime, timedelta
import pystac
import json

# Describing our Product
Please make the appropriate edits to accurately describe your product here. All these cells should be adjusted for your product.

## General Metadata

In [None]:
PRODUCT_ID: str = "my-product-id"
PRODUCT_TITLE: str = "My Product Title"
PRODUCT_DESCRIPTION: str = """A detailed description of my dataset"""

KEYWORDS: list[str] = ["Keyword1", "Keyword2"]
REGION: str = "The region of the data"  # e.g. Antarctica, Europe, America
PRODUCT_STATUS = "ongoing"  # planned | ongoing | completed

In [None]:
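from datetime import timezone

time_format = "%Y-%m-%dT%H:%M:%SZ"  # write your own temporal extent in this format
now_utc = datetime.now(timezone.utc)  # the trailing "Z" denotes UTC, so take the current time in UTC
TEMPORAL_EXTENT: list[str] = [
    datetime.strftime(now_utc - timedelta(weeks=52), time_format),
    datetime.strftime(now_utc, time_format),
]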

SPATIAL_EXTENT: list[float] = [-180.0, -90.0, 180.0, 90.0]
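
Before moving on, it can help to sanity-check the extents defined above. The following is a minimal sketch, not part of the OSC tooling: `check_extents` and `TIME_FORMAT` are hypothetical names introduced here, and the checks assume a 2D bounding box ordered west, south, east, north with both temporal bounds given.

```python
from datetime import datetime

# Same timestamp layout as `time_format` above (name is an assumption of this sketch)
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def check_extents(bbox: list[float], interval: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems: list[str] = []

    if len(bbox) != 4:
        problems.append("bbox needs four values: west, south, east, north")
    else:
        west, south, east, north = bbox
        # west > east is allowed: it denotes a box crossing the antimeridian
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            problems.append("longitudes must lie within [-180, 180]")
        if not (-90 <= south <= north <= 90):
            problems.append("latitudes must satisfy -90 <= south <= north <= 90")

    if len(interval) != 2:
        problems.append("interval needs two values: start, end")
    else:
        # Open-ended (null) bounds are not handled in this sketch
        try:
            start, end = (datetime.strptime(t, TIME_FORMAT) for t in interval)
        except ValueError:
            problems.append(f"timestamps must match {TIME_FORMAT}")
        else:
            if start > end:
                problems.append("start must not be later than end")
    return problems


print(check_extents([-180.0, -90.0, 180.0, 90.0],
                    ["2023-01-01T00:00:00Z", "2024-01-01T00:00:00Z"]))
```

In the notebook you would call `check_extents(SPATIAL_EXTENT, TEMPORAL_EXTENT)` and fix anything it reports.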

In [None]:
# link to pre-existing Item Collection root catalog.json
ITEM_COLLECTION: str = "https://raw.githubusercontent.com/anders0204/supraglacial-lakes-item-catalog/refs/heads/main/catalog.json"

## Pre-existing OSC collections
Visit the [`open-science-catalog-metadata-staging`](https://github.com/ESA-EarthCODE/open-science-catalog-metadata-staging) repository on GitHub to find links to the existing entries.

**Remember to use the _raw_ file links!**
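
If you copied a link from the GitHub file viewer (a `/blob/` URL), it points at an HTML page rather than the JSON file itself. The sketch below shows one way to convert such a link into its raw form. `to_raw_url` is a hypothetical helper introduced for illustration, and it assumes the standard `github.com/<owner>/<repo>/blob/<ref>/<path>` layout.

```python
from urllib.parse import urlparse


def to_raw_url(url: str) -> str:
    """Convert a github.com 'blob' link into a raw.githubusercontent.com link."""
    parsed = urlparse(url)
    if parsed.netloc == "raw.githubusercontent.com":
        return url  # already a raw link, nothing to do

    parts = parsed.path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<ref>/<path...>
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"Not a GitHub file link: {url}")

    owner, repo, _blob, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


print(to_raw_url(
    "https://github.com/ESA-EarthCODE/open-science-catalog-metadata-staging"
    "/blob/main/variables/h2o/catalog.json"
))
```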

### Project
If the project associated with the product already exists in the OSC, provide a link to its `collection.json` file on the OSC GitHub.

If not, leave this variable as `None` and we will generate a Project entry based on the metadata for the product. You can change this file manually later to add contacts, websites, and more.

In [None]:
PROJECT: str | None = None

### Variables

In [None]:
VARIABLES: list[str] = [
    # river ice
    "https://raw.githubusercontent.com/ESA-EarthCODE/open-science-catalog-metadata-staging/refs/heads/main/variables/river-ice/catalog.json",
    # h2o
    "https://raw.githubusercontent.com/ESA-EarthCODE/open-science-catalog-metadata-staging/refs/heads/main/variables/h2o/catalog.json"
]

### Themes

In [None]:
THEMES: list[str] = [
    # land
    "https://raw.githubusercontent.com/ESA-EarthCODE/open-science-catalog-metadata/refs/heads/main/themes/land/catalog.json",
]

### EO Missions

In [None]:
EO_MISSIONS: list[str] = [
    # cryosat
    "https://raw.githubusercontent.com/ESA-EarthCODE/open-science-catalog-metadata/refs/heads/main/eo-missions/cryosat/catalog.json",
]

# Generating the product JSON
Here we will generate the product manually as a Python dictionary. This part is not intended to be edited. Simply run through the cells to generate a product based on the values you defined above.

## Creating Base

In [None]:
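from datetime import timezone

# time_format ends in "Z" (UTC), so take the current time in UTC rather than local time
time_now = datetime.strftime(datetime.now(timezone.utc), time_format)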

product = {
    "type": "Collection",
    "id": PRODUCT_ID,
    "stac_version": "1.0.0",
    "description": PRODUCT_DESCRIPTION,
    "updated": time_now,
    "title": PRODUCT_TITLE,
    "license": "proprietary",  # STAC spells this field "license"
    "keywords": KEYWORDS,
    "extent": {
        "spatial": {
            "bbox": [
                SPATIAL_EXTENT
            ]
        },
        "temporal": {
            "interval": [
                TEMPORAL_EXTENT
            ]
        }
    },
    "stac_extensions": [
        "https://stac-extensions.github.io/osc/v1.0.0/schema.json",
        "https://stac-extensions.github.io/themes/v1.0.0/schema.json",
        "https://stac-extensions.github.io/cf/v0.2.0/schema.json"
    ],
    "osc:project": PRODUCT_TITLE,
    "osc:status": PRODUCT_STATUS,
    "osc:region": REGION,
    "osc:type": "product",
    "created": time_now,
    "version": "1.0",
    
}

## Adding Links

In [None]:
root_link = {
    "rel": "root",
    "href": "../../catalog.json",
    "type": "application/json",
    "title": "Open Science Catalog"
}
parent_link = {
  "rel": "parent",
  "href": "../catalog.json",
  "type": "application/json",
  "title": "Products"
}

child_link = {
  "rel": "child",
  "href": ITEM_COLLECTION,
  "type": "application/json",
  "title": "Items"
}

In [None]:
# Variables
variables_stac = []
variable_links = []
for file_name in VARIABLES:
    stac_catalog = pystac.Catalog.from_file(file_name)
    variables_stac.append(stac_catalog)

    variable_links.append(
        {
          "rel": "related",
          "href": f"../../variables/{stac_catalog.id}/catalog.json",
          "type": "application/json",
          "title": f"Variable: {stac_catalog.title}"
        })

product["osc:variables"] = [var.id for var in variables_stac]

# Themes
themes_stac = []
theme_links = []
for file_name in THEMES:
    stac_catalog = pystac.Catalog.from_file(file_name)
    themes_stac.append(stac_catalog)

    theme_links.append(
        {
          "rel": "related",
          "href": f"../../themes/{stac_catalog.id}/catalog.json",
          "type": "application/json",
          "title": f"Theme: {stac_catalog.title}"
        })

theme_ids = [{"id": theme.id} for theme in themes_stac]

product["themes"] = [
    {
      "scheme": "https://github.com/stac-extensions/osc#theme",
      "concepts": theme_ids
    }
  ]

# EO missions
eo_stac = []
eo_links = []
for file_name in EO_MISSIONS:
    stac_catalog = pystac.Catalog.from_file(file_name)
    eo_stac.append(stac_catalog)

    eo_links.append(
        {
          "rel": "related",
          "href": f"../../eo-missions/{stac_catalog.id}/catalog.json",
          "type": "application/json",
          "title": f"EO Mission: {stac_catalog.title}"
        })

product["osc:missions"] = [eo.id for eo in eo_stac]
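
The three loops above build the same kind of link and differ only in the folder name and the title prefix. As an illustration of that shared shape, here is a small sketch. `related_link` is a hypothetical helper that the notebook does not use, and the ids and titles in the usage lines are placeholders rather than values read from the catalog.

```python
def related_link(folder: str, entry_id: str, title: str, label: str) -> dict:
    """Build a "related" link from a product to another OSC entry.

    `folder` is the top-level OSC folder (e.g. "variables", "themes",
    "eo-missions") and `label` is the prefix shown in the link title.
    """
    return {
        "rel": "related",
        "href": f"../../{folder}/{entry_id}/catalog.json",
        "type": "application/json",
        "title": f"{label}: {title}",
    }


# Placeholder ids and titles, for illustration only
example_links = [
    related_link("variables", "example-variable", "Example Variable", "Variable"),
    related_link("themes", "example-theme", "Example Theme", "Theme"),
    related_link("eo-missions", "example-mission", "Example Mission", "EO Mission"),
]
for link in example_links:
    print(link["href"], "|", link["title"])
```

Keeping the folder name as an explicit argument makes it easy to check that the relative path matches the folder used in the raw URLs.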

## Creating a project link

In [None]:
if isinstance(PROJECT, str):
    # Load the existing project. The "related" link from the product to the
    # project is added in the "Finishing linking to our product" section.
    project_stac = pystac.Collection.from_file(PROJECT)
    product["osc:project"] = project_stac.title
    project = project_stac.to_dict()

In [None]:
if PROJECT is None:
    project = {
        "type": "Collection",
        "id": PRODUCT_ID,
        "stac_version": "1.0.0",
        "description": PRODUCT_DESCRIPTION,
        "updated": time_now,
        "title": PRODUCT_TITLE,
        "license": "proprietary",  # STAC spells this field "license"
        "keywords": KEYWORDS,
        "extent": {
            "spatial": {"bbox": [SPATIAL_EXTENT]},
            "temporal": {"interval": [TEMPORAL_EXTENT]},
        },
        "stac_extensions": [
            "https://stac-extensions.github.io/osc/v1.0.0/schema.json",
            "https://stac-extensions.github.io/themes/v1.0.0/schema.json",
            "https://stac-extensions.github.io/contacts/v0.1.1/schema.json",
        ],
        "osc:status": PRODUCT_STATUS,
        "themes": [
            {
                "scheme": "https://github.com/stac-extensions/osc#theme",
                "concepts": theme_ids,  # already a list of {"id": ...} dicts
            }
        ],
        "osc:type": "project",
        "contacts": [  # Add all affiliations and contact points
            {
                "name": "Your Name",
                "emails": [{"value": "your.email@institution.org"}],
                "roles": ["technical_officer"],
            },
            {
                "name": "Name of an affiliated institution, organisation, etc.",
                "roles": ["consortium_member"],
            },
            {
                "name": "Name of another institution, organisation, etc.",
                "roles": ["consortium_member"],
            },
        ],
    }

    project_links = [
        {
            "rel": "root",
            "href": "../../catalog.json",
            "type": "application/json",
            "title": "Open Science Catalog",
        },
        {
            "rel": "via",  # Add all relevant websites, documentation, etc., with "via" links
            "href": "https://www.<my-project-website>.org/",
            "title": "Website",
        },
        {
            "rel": "child",
            "href": f"../../products/{PRODUCT_ID}/collection.json",
            "type": "application/json",
            "title": PRODUCT_TITLE,
        },
        {
            "rel": "parent",
            "href": "../catalog.json",
            "type": "application/json",
            "title": "Projects",
        },
        {
            "rel": "self",
            "href": f"https://esa-earthcode.github.io/open-science-catalog-metadata/projects/{PRODUCT_ID}/collection.json",
            "type": "application/json",
        },
    ]

    for links in (variable_links, eo_links, theme_links):
        for link in links:
            project_links.append(link)

    project["links"] = project_links

### Finishing linking to our product

In [None]:
product["links"] = [link for links in (variable_links, eo_links, theme_links) for link in links] + [root_link, parent_link, child_link]

In [None]:
link_to_project = {
    "rel": "related",
    "href": f"../../projects/{project['id']}/collection.json",
    "type": "application/json",
    "title": f"Project: {project['title']}"
}
product["links"].append(link_to_project)

### **Done!**
We can now inspect the results:

In [None]:
print(json.dumps(product, indent=2))

In [None]:
print(json.dumps(project, indent=2))
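
Besides reading the printed JSON, a quick structural check can catch misspelled or missing keys. The STAC Collection specification requires the fields `type`, `stac_version`, `id`, `description`, `license`, `extent`, and `links`. The sketch below only tests for their presence. `missing_collection_fields` is a hypothetical helper, and the example dictionary is illustrative rather than a real OSC entry.

```python
# Fields required by the STAC Collection specification
REQUIRED_COLLECTION_FIELDS = (
    "type", "stac_version", "id", "description", "license", "extent", "links",
)


def missing_collection_fields(collection: dict) -> list[str]:
    """Return the required STAC Collection fields absent from `collection`."""
    return [key for key in REQUIRED_COLLECTION_FIELDS if key not in collection]


# Illustrative entry that deliberately lacks the "license" field
example = {
    "type": "Collection",
    "stac_version": "1.0.0",
    "id": "example-product",
    "description": "An illustrative entry",
    "extent": {
        "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
        "temporal": {"interval": [["2023-01-01T00:00:00Z", "2024-01-01T00:00:00Z"]]},
    },
    "links": [],
}
print(missing_collection_fields(example))  # ['license']
```

In the notebook, `missing_collection_fields(product)` and `missing_collection_fields(project)` should both return an empty list. For full schema validation, `pystac.Collection.from_dict(product).validate()` is an option, assuming the optional `jsonschema` dependency is installed and the schemas are reachable over the network.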

# Saving dictionary as JSON
Now that the product is represented as a Python dictionary, storing it as JSON is straightforward. The only thing to keep in mind is where to save it.

To add a product to the Open Science Catalog, you should store the product under the `products/` folder in your local fork of the [`open-science-catalog-metadata-staging`](https://github.com/ESA-EarthCODE/open-science-catalog-metadata-staging) repository.

Change the following `OSC_ROOT` to your local path and run the cells to save the product file.

In [None]:
from pathlib import Path

OSC_ROOT = Path("<local-path>/open-science-catalog-metadata-staging/")

In [None]:
def save_json(obj: dict, location: Path) -> None:
    """Write `obj` as JSON to `location`, creating parent folders if needed."""
    location.parent.mkdir(parents=True, exist_ok=True)
    with open(location, "w") as f:
        json.dump(obj, f, indent=2)  # indent keeps the file readable in a diff

## Saving Product

In [None]:
product_path = OSC_ROOT / "products" / PRODUCT_ID / "collection.json"
save_json(product, product_path)

## Saving Project

In [None]:
project_path = OSC_ROOT / "projects" / project["id"] / "collection.json"
save_json(project, project_path)
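
After saving, it is worth reading a file back to confirm that it landed in the expected folder and still parses as JSON. The sketch below runs that round trip in a temporary directory so it does not touch your fork. The `save_json` here is a stand-in with the same signature as the notebook's function, and the example entry is a placeholder.

```python
import json
import tempfile
from pathlib import Path


def save_json(obj: dict, location: Path) -> None:
    """Stand-in for the notebook's helper: write `obj` as JSON to `location`."""
    location.parent.mkdir(parents=True, exist_ok=True)
    with open(location, "w") as f:
        json.dump(obj, f, indent=2)


example = {"type": "Collection", "id": "example-product", "links": []}

with tempfile.TemporaryDirectory() as tmp:
    # Mirrors the layout <OSC_ROOT>/products/<id>/collection.json
    path = Path(tmp) / "products" / example["id"] / "collection.json"
    save_json(example, path)
    reloaded = json.loads(path.read_text())
    relative_path = path.relative_to(tmp).as_posix()

print(relative_path)
print(reloaded == example)
```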

::: important
Before making a pull request, make sure that you add a link to your new product in all the associated metadata catalogs for Variables, Themes and EO Missions.
:::
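
As a rough sketch of what that update could look like, the function below appends a link to a catalog dictionary if it is not already present. `add_product_link` is a hypothetical helper. The `"child"` relation and the relative path pattern are assumptions, so compare them with the links already present in the catalog you are editing before committing.

```python
def add_product_link(catalog: dict, product_id: str, product_title: str) -> bool:
    """Append a link to the product unless one with the same href exists.

    Returns True if a link was added, False if it was already there.
    """
    # Assumed layout: <folder>/<entry-id>/catalog.json -> products/<id>/collection.json
    href = f"../../products/{product_id}/collection.json"
    links = catalog.setdefault("links", [])
    if any(link.get("href") == href for link in links):
        return False
    links.append(
        {
            "rel": "child",  # assumption: verify against existing entries
            "href": href,
            "type": "application/json",
            "title": product_title,
        }
    )
    return True


# Placeholder catalog, for illustration only
variable_catalog = {"type": "Catalog", "id": "example-variable", "links": []}
print(add_product_link(variable_catalog, "my-product-id", "My Product Title"))
print(add_product_link(variable_catalog, "my-product-id", "My Product Title"))
```

The second call returns `False`, so running the cell twice does not duplicate the link. To apply this to real files, load each catalog with `json.load`, call the function, and write the result back to the same path in your fork.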