# EarthCODE 101


## About EarthCODE

EarthCODE ([earthcode.esa.int](https://earthcode.esa.int)) is ESA’s strategic initiative for FAIR and open Earth Observation science. It started as a simple repository for datasets from ESA-funded projects and has since grown into a comprehensive environment that supports the full open science lifecycle, from data and workflow development to publication and community engagement.

![](images/EarthCODE%20Demo%20Session.pptx%20(1).png)

## Open Science Catalog


You can explore the Open Science Catalog from any web browser by navigating to:

👉 [https://opensciencedata.esa.int/](https://opensciencedata.esa.int/)

Upon entering the portal, you will see a welcome page that introduces the catalog and its functionalities. The catalog organizes resources into six thematic research domains, allowing users to easily browse and discover relevant projects, products, workflows, and experiments.

From the landing page, you can also access:

- Search tools for locating products
- Catalog to browse available products
- Metrics to explore data availability statistics
- API capabilities for programmatic access

![OSC-main-page](https://github.com/EOEPCA/open-science-catalog-metadata/assets/120453810/a97e40c1-0f69-4204-9aef-95030c5a8455)

## Working with Platforms

EarthCODE partners with a growing ecosystem of platforms to provide FAIR and Open Earth Observation science tools and infrastructure!

![](images/EarthCODE%20Demo%20Session.pptx%20(2).png)

You can use the EarthCODE integrated platforms on a self-sponsored basis if your team already has access to them. Alternatively, you can apply for sponsorship from the [Network of Resources](https://nor-discover.org/) for access and resources on your selected platform. Please refer to the [NoR Tutorial Page](../../Training%20and%20Resources/NoR.md) for more details about making a NoR application.

![](images/EarthCODE%20Demo%20Session.pptx%20(3).png)


__NOTE__: You can work locally and/or on custom hardware and upload metadata to the OSC or PRR separately.

## Community effort

EarthCODE’s [Discourse](https://discourse-earthcode.eox.at/) is a community where EarthCODE users can discuss FAIR and open science, share insights, and explore the diverse tools and solutions offered by the platform.

This forum serves as a shared space where scientists, researchers, and practitioners come together to advance Earth science through open discussion and knowledge sharing. 


![](images/EarthCODE%20Demo%20Session.pptx%20(4).png)

# Publishing Data

A detailed description, with examples, of how to publish data to the OSC is available [here](https://esa-earthcode.github.io/documentation/Technical%20Documentation/Data/Contributing%20to%20the%20EarthCODE%20Catalog#how-to-publish-results). In summary, you will need to:

1. **Prepare your Product Package (Research Experiment)**, by uploading **dataset files**, **code** and **documentation** to appropriate, accessible locations.

2. **Generate a Self-Contained STAC Collection**
   - Use tools like [`stactools`](https://stactools.readthedocs.io/en/stable/), [`rio-stac`](https://github.com/developmentseed/rio-stac), or [`PySTAC`](https://pystac.readthedocs.io/en/stable/) to generate a STAC collection.
   - Generate the relevant STAC Items.
   - Host the resulting JSON files (Collection + Items) in a **public GitHub repository** (or institutional equivalent).

   > Make sure the Collection uses **relative paths** for its links and points to remote asset URLs!


3. **Describe Your Research in the Open Science Catalog**
   - Create entries that describe your **dataset, workflow, and experiment**.
   - Link them to relevant **projects, variables, themes, and EO missions**.
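
The note in step 2 (relative links, remote asset URLs) can be checked mechanically before you publish. The following is only a minimal sketch: it assumes the STAC JSON has already been loaded into a Python dict, and the helper names `is_absolute_url` and `check_self_contained` are hypothetical, not part of any STAC library. The `example.org` URLs are placeholders.

```python
from urllib.parse import urlparse

# Link relations that tie a Collection and its Items together.
STRUCTURAL_RELS = {"root", "parent", "child", "item", "collection"}


def is_absolute_url(href):
    """Return True for hrefs such as https://host/path (scheme and host present)."""
    parsed = urlparse(href)
    return bool(parsed.scheme and parsed.netloc)


def check_self_contained(stac_object):
    """List problems found in one STAC Collection or Item given as a dict."""
    problems = []
    # Structural links should be relative so the folder can be moved as a whole.
    for link in stac_object.get("links", []):
        if link.get("rel") in STRUCTURAL_RELS and is_absolute_url(link["href"]):
            problems.append(f"link rel={link['rel']} is absolute: {link['href']}")
    # Assets should point at the remote location where the data is hosted.
    for name, asset in stac_object.get("assets", {}).items():
        if not is_absolute_url(asset["href"]):
            problems.append(f"asset {name!r} is not a remote URL: {asset['href']}")
    return problems


# Placeholder Item with one absolute structural link and one local asset.
example_item = {
    "links": [
        {"rel": "parent", "href": "../collection.json"},
        {"rel": "root", "href": "https://example.org/collection.json"},
    ],
    "assets": {
        "data": {"href": "https://example.org/files/scene.tif"},
        "thumbnail": {"href": "./thumbnail.png"},
    },
}

for problem in check_self_contained(example_item):
    print(problem)
```

An empty list means the object passes this particular check; it says nothing about whether the object is otherwise valid STAC.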


## PRR

The ESA Project Results Repository (PRR) hosts value-added products from several ESA EOP-S projects and provides persistent storage. It is one option for storing your datasets and the metadata that describes them.


### General information required


```python
from pystac import Collection

# General information for the Collection: fill in the empty id, title and description.
collection = Collection.from_dict(
    {
        "type": "Collection",
        "id": "",
        "stac_version": "1.1.0",
        "title": "",
        "description": "",
        "extent": {
            "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
            "temporal": {
                "interval": [["1982-01-01T00:00:00Z", "2022-12-31T23:59:59Z"]]
            },
        },
        "license": "CC-BY-4.0",
        "links": [],
    }
)

collection
```
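
The template above leaves `id`, `title` and `description` empty. A small helper can list what is still missing before you continue. This is only a sketch: the field list mirrors the template and is an assumption, not an official EarthCODE checklist, and `missing_general_info` is a hypothetical name.

```python
# Fields taken from the template above (an assumption, not an official checklist).
REQUIRED_FIELDS = ("id", "title", "description", "license")


def missing_general_info(collection_dict):
    """Return the names of general fields that are absent or blank."""
    missing = [
        field
        for field in REQUIRED_FIELDS
        if not str(collection_dict.get(field) or "").strip()
    ]
    extent = collection_dict.get("extent", {})
    if not extent.get("spatial", {}).get("bbox"):
        missing.append("extent.spatial.bbox")
    if not extent.get("temporal", {}).get("interval"):
        missing.append("extent.temporal.interval")
    return missing


template = {
    "type": "Collection",
    "id": "",
    "stac_version": "1.1.0",
    "title": "",
    "description": "",
    "extent": {
        "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
        "temporal": {"interval": [["1982-01-01T00:00:00Z", "2022-12-31T23:59:59Z"]]},
    },
    "license": "CC-BY-4.0",
    "links": [],
}

print(missing_general_info(template))  # ['id', 'title', 'description']
```

With a PySTAC object, the same check can be run on `collection.to_dict()`.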

### Data Structure

The STAC structure helps organize and describe your data in a consistent and machine-readable way. It's important to decide on the most suitable structure for your data up front, because this increases usability. The right structure varies by project and data type.

Once the data structure is specified, you can describe it using STAC Items.

One way to create an Item Catalog is to copy an existing catalog and edit it manually in a text editor to fit your data. If you're new to STAC and only have a few data assets, this approach can work, but it is prone to errors.

Manually editing STAC Items can be tedious, and extracting all the required metadata correctly can be challenging. For most Item Catalogs, we recommend using automated tools, for example:

- The stactools CLI, which provides a simple command-line interface for generating STAC Items. With the stactools-datacube package, the generated Items can even follow the STAC datacube extension.
- A combination of PySTAC to create the Catalog and rio-stac to automatically generate valid STAC Items with all required metadata.

Typically, this workflow starts by defining individual STAC objects (a Catalog and its Items). Once created, these objects are linked together using STAC relationships.

## Using rio_stac


```python
import pystac
from rio_stac import create_stac_item
```


```python
# Remote GeoTIFF files to describe (hosted on Zenodo)
filenames = [
    "https://zenodo.org/records/7568049/files/extent_S1B_EW_GRDH_1SDH_20171111T205337_20171111T205438_008239_00E91A_F8D1.tif",
    "https://zenodo.org/records/7568049/files/extent_S1B_EW_GRDH_1SDH_20190224T203744_20190224T203844_015093_01C356_B9C1.tif",
    "https://zenodo.org/records/7568049/files/extent_S1B_EW_GRDH_1SDH_20170620T205332_20170620T205433_006139_00AC89_6857.tif",
    "https://zenodo.org/records/7568049/files/extent_S1B_EW_GRDH_1SDH_20180923T202118_20180923T202218_012847_017B82_7DD5.tif",
    "https://zenodo.org/records/7568049/files/extent_S1B_EW_GRDH_1SDH_20181108T203747_20181108T203847_013518_01903B_D463.tif",
]
```

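```python
# Create a STAC Item from the first file
item = create_stac_item(
    source=filenames[0],
    id="item_1",
    asset_name="data",  # EarthCODE standard asset name
    # include the eo, proj and raster extension metadata
    with_eo=True,
    with_proj=True,
    with_raster=True,
)

item  # display the generated Item
```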

```python
# Link the Item to the Collection defined earlier
collection.add_item(item)
```
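
The cells above register only the first file. To cover the remaining files, the same call can be repeated in a loop. The sketch below is one possible way to do that: `item_id_from_url` and `build_items` are hypothetical helpers, and using the file name as the Item id is an assumption, not an EarthCODE requirement. The item factory is passed in as an argument so the helper can be tried without network access.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def item_id_from_url(url):
    """Use the file name without its extension as the Item id."""
    return PurePosixPath(urlparse(url).path).stem


def build_items(urls, create_item):
    """Call create_item once per URL with the same options used above."""
    return [
        create_item(
            source=url,
            id=item_id_from_url(url),
            asset_name="data",
            with_eo=True,
            with_proj=True,
            with_raster=True,
        )
        for url in urls
    ]


# Intended use in this notebook (reads every remote file, so it needs network access):
#   items = build_items(filenames[1:], create_stac_item)
#   collection.add_items(items)
```

Run this before saving the collection so that the saved files include every Item.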

```python
# Write the Collection and its Items to disk as a self-contained catalog
collection.normalize_and_save(
    root_href="../../data/example_collection/",
    catalog_type=pystac.CatalogType.SELF_CONTAINED,
)
```

## Using PySTAC

## OSC

Data can be ingested into the catalog in different ways, depending on **where the products are originally stored** and on **the number of products to be ingested** (and therefore their total size).

All Themes, Variables, EO Missions, Projects, Products, Workflows, and Experiments are stored in a metadata repository hosted on GitHub and can be managed with Git and the [GitHub API](https://docs.github.com/en/rest). Each metadata update is handled via a [Pull Request (PR)](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests). The Pull Request lets reviewers see the proposed changes in advance, check their validity (via an automated validation script), and provide reviews as comments. If appropriate, the changes are merged into the main branch of the repository. When a Pull Request is merged, the updated STAC catalog is deployed as a static catalog.
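
Before opening a Pull Request, a quick local sanity check can catch malformed files early. The sketch below is not the repository's validation script. It only checks that each JSON file parses and carries the `type`, `id` and `stac_version` fields; the function name `precheck_metadata` and the folder name in the usage comment are hypothetical.

```python
import json
from pathlib import Path

# Top-level keys expected in STAC Catalogs, Collections and Items.
REQUIRED_KEYS = ("type", "id", "stac_version")


def precheck_metadata(folder):
    """Return (path, problem) pairs for the JSON files found under folder."""
    problems = []
    for path in sorted(Path(folder).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as error:
            problems.append((str(path), f"invalid JSON: {error}"))
            continue
        if not isinstance(data, dict):
            problems.append((str(path), "top-level value is not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            if not data.get(key):
                problems.append((str(path), f"missing or empty {key!r}"))
    return problems


# Example use on a local clone (the folder name is a placeholder):
#   for path, problem in precheck_metadata("my-metadata-folder"):
#       print(path, "->", problem)
```

The automated validation on the Pull Request remains the authoritative check; this only shortens the feedback loop.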

![ingest-data-scheme](https://github.com/EOEPCA/open-science-catalog-metadata/assets/120453810/5d6297e7-5d66-4564-9538-bb6eaeb92598)

At the moment, the Open Science Catalog supports ingestion of new products either directly via **GitHub** or indirectly, using a [GUI editor](https://workspace.earthcode-staging.earthcode.eox.at/osc-editor).


**Requirements:**
- The Product should be the result of an ESA-funded project. Check whether the Project's page already exists in the ESA Open Science Catalog: [https://opensciencedata.esa.int/](https://opensciencedata.esa.int/). If not, **create a Project page first.**
- **Complete metadata** should be available (to correctly describe the Product).
- The Product should be stored in an approved, **stable data repository** (e.g. ESA PRR; CEDA Data Archive: [https://catalogue.ceda.ac.uk/](https://catalogue.ceda.ac.uk/); Zenodo repository: [https://zenodo.org/](https://zenodo.org/), etc.).
- If the product you would like to ingest is stored elsewhere, see other data ingestion scenarios described in the section TBD.
- The data should be provided in formats readable by GDAL and the rasterio library.
- Appropriate documentation should be available.

# Publishing workflows

Workflows describe the code, environment and input data needed to produce results. It is strongly recommended that these be published alongside the final data products.

- What code do you have?
- What is the license?
- What environment does it run in?
- Can you show how it runs?