# Demo - Ingesting data into the product catalogue
This demo shows how to ingest project-generated data into the APEx product catalogue. The process consists of three steps:

**1. Generating Data with openEO**<br/>We will generate Sentinel-2 (S2) imagery over Gran Canaria using openEO. This data will be used to simulate project-generated data that can be ingested into the APEx catalogue.

**2. Ingesting the Data into the APEx STAC Catalogue**<br/>After generating the data, we will show how to ingest it into the APEx STAC catalogue. This will allow the data to be stored in a standardized format for discovery and access.

**3. Visualizing the Result in the STAC Browser**<br />Once the data is ingested, we will use the STAC browser to visualize the product and verify its availability in the catalogue.

In [1]:
!pip install openeo owslib requests_oauthlib
# Remove cached openEO refresh tokens (if any) to force a fresh login
!rm -f /workspace/.local/share/openeo-python-client/refresh-tokens.json



# 1. Generating Data with openEO
In this section, we will use openEO to generate some Earth observation data, which we will later ingest into the catalogue. Specifically, we will generate a Sentinel-2 image over the Gran Canaria region. The result of this step is a `gran_canaria.tiff` file that is downloaded locally.

In [2]:
import openeo

bounds = (-28.30438068296276, -16.171874999999993, -27.68352808378777, -15.468750000000012)
resolution = 38.21851414258813  # pixel size in units of the target projection (EPSG:3857)

con = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()
cube = con.load_collection(
    "SENTINEL2_L2A",
    bands=["B04"],
    temporal_extent="2019-08-19",
    spatial_extent={
        "west": bounds[1],
        # abs() flips the negative latitudes in `bounds`; it is unclear whether
        # GlobalMercator returns wrongly signed latitude values
        "south": abs(bounds[2]),
        "east": bounds[3],
        "north": abs(bounds[0]),
    },
)
# Force the block size of the output GeoTIFF
cube.result_node().update_arguments(featureflags={"tilesize": 256})
result = cube.resample_spatial(resolution=resolution, projection="EPSG:3857")
job = result.execute_batch(
    "gran_canaria.tiff",
    title="APEx - Demo - Gran Canaria",
    filename_prefix="gran_canaria",
)

Authenticated using device code flow.
0:00:00 Job 'j-241209296c304e40aab094e29220d448': send 'start'
0:00:15 Job 'j-241209296c304e40aab094e29220d448': queued (progress 0%)
0:00:20 Job 'j-241209296c304e40aab094e29220d448': queued (progress 0%)
0:00:27 Job 'j-241209296c304e40aab094e29220d448': queued (progress 0%)
0:00:35 Job 'j-241209296c304e40aab094e29220d448': queued (progress 0%)
0:00:45 Job 'j-241209296c304e40aab094e29220d448': queued (progress 0%)
0:00:57 Job 'j-241209296c304e40aab094e29220d448': queued (progress 0%)
0:01:13 Job 'j-241209296c304e40aab094e29220d448': running (progress N/A)
0:01:32 Job 'j-241209296c304e40aab094e29220d448': running (progress N/A)
0:01:56 Job 'j-241209296c304e40aab094e29220d448': running (progress N/A)
0:02:26 Job 'j-241209296c304e40aab094e29220d448': running (progress N/A)
0:03:03 Job 'j-241209296c304e40aab094e29220d448': running (progress N/A)
0:03:50 Job 'j-241209296c304e40aab094e29220d448': running (progress N/A)
0:04:49 Job 'j-241209296c304e40aab0

In [3]:
job
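
The `abs()` calls in the cell above work for this area of interest, but they would silently break for locations south of the equator. The sketch below is one possible alternative, assuming the tuple is ordered as `(-north, west, -south, east)` with sign-flipped latitudes, which the cell above leaves as an open question. The helper name `bounds_to_spatial_extent` is hypothetical and not part of openEO.

```python
def bounds_to_spatial_extent(bounds):
    """Convert a (-north, west, -south, east) tuple into an openEO-style extent dict.

    Assumption: both latitudes arrive with a flipped sign, as in the demo cell.
    """
    neg_north, west, neg_south, east = bounds
    # Undo the sign flip instead of taking abs(), so southern latitudes stay negative
    south, north = sorted((-neg_south, -neg_north))
    if not west < east:
        raise ValueError(f"west ({west}) must be smaller than east ({east})")
    return {"west": west, "south": south, "east": east, "north": north}


bounds = (-28.30438068296276, -16.171874999999993, -27.68352808378777, -15.468750000000012)
extent = bounds_to_spatial_extent(bounds)
print(extent)
```

The resulting dictionary could be passed as `spatial_extent` to `load_collection`, and it yields the same values as the `abs()` variant for Gran Canaria.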

# 2. Ingesting the Data into the APEx STAC Catalogue
In this section, we turn the openEO result into STAC metadata, clean up that metadata, authenticate with the project's STAC catalogue, and finally upload the collection and its items.

In [4]:
import datetime

In [5]:
CATALOGUE_URL = "https://catalogue.demo.apex.esa.int"
STAC_METADATA_ID = "gran_canaria_demo"

## 2.1. Generating the STAC metadata
The results of an openEO batch job expose their metadata as a STAC collection. We retrieve this metadata, assign our own identifier to it, and load it into a `pystac.Collection`.

In [6]:
import pystac

In [7]:
stac_metadata_dict = job.get_results().get_metadata()
stac_metadata_dict["id"] = STAC_METADATA_ID
# Remove collection-level assets (if present); we rely on the item links instead
stac_metadata_dict.pop("assets", None)
collection = pystac.Collection.from_dict(stac_metadata_dict)
collection
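
The dictionary preparation above can also be written as a small function that leaves the original metadata untouched, which makes it easier to rerun the cell. This is a minimal sketch using plain dictionaries; `prepare_collection_dict` and the trimmed `sample` metadata are hypothetical stand-ins, not the real output of `job.get_results().get_metadata()`.

```python
import copy


def prepare_collection_dict(metadata, collection_id):
    """Return a copy of the metadata with a new id and without collection-level assets."""
    prepared = copy.deepcopy(metadata)
    prepared["id"] = collection_id
    # pop() with a default does not fail when the "assets" key is absent
    prepared.pop("assets", None)
    return prepared


# Hypothetical, heavily trimmed stand-in for the openEO result metadata
sample = {
    "type": "Collection",
    "id": "j-0000",
    "assets": {"gran_canaria.tif": {"href": "https://example.com/signed-url"}},
    "links": [{"rel": "item", "href": "https://example.com/item.json"}],
}

prepared = prepare_collection_dict(sample, "gran_canaria_demo")
print(prepared["id"], "assets" in prepared)
```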

## 2.2. Convert online openEO collection to local collection
This step removes the links that point to the openEO API, which in some cases requires authentication.

There is an [open issue](https://github.com/Open-EO/openeo-python-client/issues/184) to integrate this into the openEO Python client.

In [8]:
# Remove collection and canonical links, which point back to the openEO API
collection.remove_links(rel="collection")
collection.remove_links(rel="canonical")

items = list(collection.get_stac_objects(rel=pystac.RelType.ITEM))
for item in items:
    item.remove_links(rel="collection")
    item.remove_links(rel="canonical")

# Make the collection resolvable from a local directory
collection.set_self_href("/tmp/collection.json")
collection.normalize_hrefs("/tmp/", skip_unresolved=True)

collection.license = "CC-BY-4.0"

def asset_transform(name, asset):
    asset.href = "/tmp/" + name
    return asset

# Optional: asset hrefs can be transformed in this step as well
# c2 = collection.map_assets(asset_transform)

collection.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)
collection
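
To make the effect of the link clean-up concrete, the sketch below applies the same idea to a plain STAC dictionary instead of a `pystac` object. The function `strip_links` and the sample links are hypothetical and serve only as an illustration.

```python
def strip_links(stac_dict, rels=("collection", "canonical")):
    """Return a copy of a STAC dict without links whose `rel` is listed in `rels`."""
    cleaned = dict(stac_dict)
    cleaned["links"] = [
        link for link in stac_dict.get("links", []) if link.get("rel") not in rels
    ]
    return cleaned


# Hypothetical item metadata with links that point back to an API
item_dict = {
    "id": "gran_canaria_2019-08-19Z.tif",
    "links": [
        {"rel": "collection", "href": "https://example.com/jobs/j-0000/results"},
        {"rel": "canonical", "href": "https://example.com/jobs/j-0000/results/signed"},
        {"rel": "self", "href": "/tmp/gran_canaria_2019-08-19Z.tif/item.json"},
    ],
}

cleaned = strip_links(item_dict)
print([link["rel"] for link in cleaned["links"]])
```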

## 2.3. Authenticate with STAC API
In this step, we will authenticate with the STAC catalogue for our project. The code leverages an external library, called [`requests_oauthlib`](https://requests-oauthlib.readthedocs.io/en/latest/index.html), to handle the authentication process and retrieve the access token. Obtaining this token is a necessary prerequisite for performing the data ingestion step.

> After opening the URL in your web browser and logging in, copy the URL you are redirected to and paste it into the input field provided by the next cell.

In [19]:
from requests_oauthlib import OAuth2Session
import webbrowser

In [20]:
SERVER_URL = "https://auth.apex.esa.int/"
REALM = "apex"
CLIENT_ID = "demo-catalogue-dev-api"
REDIRECT_URI = f"{CATALOGUE_URL}/callback" 

BASE_URL = f"{SERVER_URL}/realms/{REALM}/protocol/openid-connect"
AUTHORIZATION_URL = f"{BASE_URL}/auth"
TOKEN_URL = f"{BASE_URL}/token"

# Initialize OAuth2 session
oauth = OAuth2Session(
    client_id=CLIENT_ID,
    redirect_uri=REDIRECT_URI
)

# Step 1: Generate the authorization URL
authorization_url, state = oauth.authorization_url(AUTHORIZATION_URL)
print(f"Visit this URL to authorize the application: {authorization_url}")
# Step 2: Paste the redirect URL after login
redirect_response = input("Paste the full redirect URL here: ").strip()

# Step 3: Exchange the authorization code for an access token
try:
    token = oauth.fetch_token(
        TOKEN_URL,
        authorization_response=redirect_response,
    )
except Exception as e:
    # Re-raise so that later cells do not run without a valid token
    print(f"Error fetching token: {e}")
    raise

access_token = token["access_token"]
access_token

Visit this URL to authorize the application: https://auth.apex.esa.int//realms/apex/protocol/openid-connect/auth?response_type=code&client_id=demo-catalogue-dev-api&redirect_uri=https%3A%2F%2Fcatalogue.demo.apex.esa.int%2Fcallback&state=plQHbFFL4TpynvLEmM7RMpp7UJuB5o


Paste the full redirect URL here:  https://catalogue.demo.apex.esa.int/callback?state=plQHbFFL4TpynvLEmM7RMpp7UJuB5o&session_state=a13c63ff-049c-44d0-bd4d-46566729a8be&iss=https%3A%2F%2Fauth.apex.esa.int%2Frealms%2Fapex&code=db520cdb-18ea-4573-9e4b-fa14e0f56ebf.a13c63ff-049c-44d0-bd4d-46566729a8be.7f328923-c069-40b9-a30f-739450ad7703


'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJTaEs1c1JIR0lLYlJlRXF6azZlenZxamdiZWdvVFJnQnRqQUxMSmtZdmZBIn0.eyJleHAiOjE3MzM3MzgzNTQsImlhdCI6MTczMzczODA1NCwiYXV0aF90aW1lIjoxNzMzNzM4MDUxLCJqdGkiOiI2YjM2MzNiOC04MDQ5LTRlMDAtOTE2My1hODA2Y2EzMTY3ZWMiLCJpc3MiOiJodHRwczovL2F1dGguYXBleC5lc2EuaW50L3JlYWxtcy9hcGV4Iiwic3ViIjoiNTI5MDMzZTQtZmQxYS00MzBiLWIzODMtYTM1YTkwM2JjNDg3IiwidHlwIjoiQmVhcmVyIiwiYXpwIjoiZGVtby1jYXRhbG9ndWUtZGV2LWFwaSIsInNpZCI6ImExM2M2M2ZmLTA0OWMtNDRkMC1iZDRkLTQ2NTY2NzI5YThiZSIsImFjciI6IjEiLCJhbGxvd2VkLW9yaWdpbnMiOlsiaHR0cHM6Ly9jYXRhbG9ndWUuZGVtby5hcGV4LmVzYS5pbnQiXSwicmVhbG1fYWNjZXNzIjp7InJvbGVzIjpbInN0YWMtYWRtaW4iXX0sInJlc291cmNlX2FjY2VzcyI6eyJkZW1vLWNhdGFsb2d1ZS1kZXYtYXBpIjp7InJvbGVzIjpbInN0YWMtYWRtaW4iXX19LCJzY29wZSI6InByb2ZpbGUgZW1haWwiLCJlbWFpbF92ZXJpZmllZCI6dHJ1ZSwibmFtZSI6IkJyYW0gSmFuc3NlbiIsInByZWZlcnJlZF91c2VybmFtZSI6ImJyYW0uamFuc3NlbkB2aXRvLmJlIiwiZ2l2ZW5fbmFtZSI6IkJyYW0iLCJmYW1pbHlfbmFtZSI6IkphbnNzZW4iLCJlbWFpbCI6ImJyYW0uamFuc3NlbkB2aXRvLmJlIn0.RvHI1zHGMCDRjQoX7v6
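
Access tokens obtained this way are often short-lived, so it can help to check how long a token stays valid before starting the upload. The sketch below decodes the payload of a JWT with the standard library only. It does not verify the signature, so it is suitable for inspection but not for trust decisions. The helper names and the demo token are hypothetical.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the (unverified) payload segment of a JWT into a dict."""
    payload_b64 = token.split(".")[1]
    # JWT segments omit base64 padding, so restore it before decoding
    padding = "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def token_lifetime_seconds(token):
    """Return the number of seconds between the `iat` and `exp` claims."""
    claims = decode_jwt_payload(token)
    return claims["exp"] - claims["iat"]


def _encode_segment(obj):
    """Helper for the demo only: base64url-encode a JSON object without padding."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical unsigned token, built here purely for demonstration
demo_token = ".".join(
    [_encode_segment({"alg": "none"}), _encode_segment({"iat": 1000, "exp": 1300}), ""]
)
lifetime = token_lifetime_seconds(demo_token)
print(lifetime)
```

In the notebook, `token_lifetime_seconds(access_token)` would report the lifetime of the token retrieved above, provided it carries both claims.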

## 2.4. Upload to STAC API
The collection metadata has been cleaned and is now ready to be added to the STAC API. In this step, we are creating the full collection, but it’s also possible to add the generated item to a new collection if preferred.

Additionally, it’s recommended to enhance the metadata quality by adding extra metadata properties where applicable.
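
As an illustration of such extra properties, the sketch below adds `keywords` and `providers`, two optional fields defined by the STAC Collection specification, to a collection dictionary. The helper `enrich_collection` and all values are hypothetical examples and should be replaced with metadata that describes your project.

```python
def enrich_collection(coll_dict, keywords, providers):
    """Return a copy of a collection dict with merged keywords and appended providers."""
    enriched = dict(coll_dict)
    # Merge keywords without duplicates and keep the order stable
    enriched["keywords"] = sorted(set(coll_dict.get("keywords", [])) | set(keywords))
    enriched["providers"] = list(coll_dict.get("providers", [])) + list(providers)
    return enriched


# Hypothetical minimal collection and example values
base = {"type": "Collection", "id": "gran_canaria_demo", "keywords": ["sentinel-2"]}
enriched = enrich_collection(
    base,
    keywords=["gran canaria", "sentinel-2"],
    providers=[{"name": "Example Project", "roles": ["producer"], "url": "https://example.com"}],
)
print(enriched["keywords"])
```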

> **Important Note**<br/>
> Currently, the GeoTIFF file is stored on the openEO backend and can be accessed through a signed URL. However, this URL will eventually expire. For more permanent cataloguing, the data file should be moved to a more stable location. To do so, download the TIFF file, upload it to your preferred online storage location, and update the asset links in the STAC metadata accordingly.

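Once the file has been copied to permanent storage, the asset links in the item metadata need to point to the new location. The following sketch rewrites asset `href` values on a plain item dictionary. The function `rewrite_asset_hrefs`, the storage URL, and the sample signed URL are hypothetical, and the sketch assumes that the file keeps its original name.

```python
import posixpath
from urllib.parse import urlsplit


def rewrite_asset_hrefs(item_dict, base_url):
    """Return a copy of an item dict whose asset hrefs point below `base_url`."""
    rewritten = dict(item_dict)
    rewritten["assets"] = {}
    for key, asset in item_dict.get("assets", {}).items():
        # Keep only the file name; this drops the host, path and signed query string
        filename = posixpath.basename(urlsplit(asset["href"]).path)
        rewritten["assets"][key] = {**asset, "href": base_url.rstrip("/") + "/" + filename}
    return rewritten


# Hypothetical item with a signed, expiring URL
item_dict = {
    "id": "gran_canaria_2019-08-19Z.tif",
    "assets": {
        "gran_canaria_2019-08-19Z.tif": {
            "href": "https://openeo.example.com/results/abc/gran_canaria_2019-08-19Z.tif?expires=123",
            "type": "image/tiff; application=geotiff",
        }
    },
}

moved = rewrite_asset_hrefs(item_dict, "https://storage.example.com/apex-demo/")
print(moved["assets"]["gran_canaria_2019-08-19Z.tif"]["href"])
```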

In [35]:
import requests
from owslib.ogcapi.records import Records
from owslib.util import Authentication

class BearerAuth(requests.auth.AuthBase):
    def __init__(self, token):
        self.token = token
    def __call__(self, r):
        r.headers["authorization"] = "Bearer " + self.token
        return r

In [36]:
auth = BearerAuth(access_token)
r = Records(CATALOGUE_URL,auth=Authentication(auth_delegate=auth))

In [37]:
coll_dict = collection.to_dict()

default_auth = {
    "_auth": {
        "read": ["anonymous"],
        "write": ["stac-openeo-admin", "stac-openeo-editor"]
    }
}

coll_dict.update(default_auth)

response = requests.post(f"{CATALOGUE_URL}/collections", auth=auth,json=coll_dict)
response

<Response [201]>

In [38]:
for item in collection.get_all_items():
    item_dict = item.to_dict()
    print(item)
    item_dict["collection"] = collection.id
    r.collection_item_create(collection.id, item_dict)

<Item id=gran_canaria_2019-08-19Z.tif>


# 3. Visualizing the Result in the STAC Browser

Now that the data has been ingested into the STAC catalogue, we can navigate to the [Demo Browser](https://browser.demo.apex.esa.int/?.language=en) to see the result.

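# 4. Cleanup of the collection

To remove the demo collection from the catalogue again, we send a `DELETE` request to the collection's endpoint.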

In [34]:
requests.delete(f"{CATALOGUE_URL}/collections/" + collection.id, auth=auth)

<Response [204]>