## User Story: Load Data

https://eodc.atlassian.net/browse/SDO2024-307

Loading a user collection is straightforward. For this notebook to work, you need the results of a previous job already saved
in your workspace so that they can be loaded here.
If you don't have any yet, work through the save result demo notebook first.

In [None]:
import openeo

# Connect to the openEO backend and authenticate with EGI Check-In

connection = openeo.connect("https://openeo-dev.eodc.eu/openeo/1.1.0")
connection = connection.authenticate_oidc(provider_id="egi")

To load a user collection, we first need its collection id. We can find it by listing the STAC collections in the workspace with
the list_stac_collections(WORKSPACE_NAME) function, which returns a list of tuples, each consisting of the collection_id and the path to the STAC collection JSON.

In [None]:
from eodc.workspace import CephAdapter, EODC_CEPH_URL

# Set these variables to your own.

WORKSPACE_NAME = ""

S3_ENDPOINT = EODC_CEPH_URL
S3_ACCESS_KEY = ""
S3_SECRET_KEY = ""

adapter: CephAdapter = CephAdapter(S3_ENDPOINT, S3_ACCESS_KEY, S3_SECRET_KEY)

collections = adapter.list_stac_collections(WORKSPACE_NAME)

collections

Now you can either set your collection id manually or take it from the list_stac_collections return value. Here, collections[0][0] is the id of the first collection in the list.

In [None]:
# collection_id = ""

collection_id = collections[0][0]

collection_id
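
Indexing with collections[0][0] picks whichever collection happens to be listed first. As a minimal sketch, assuming the list really has the (collection_id, path) tuple shape described above, a small helper can look up a collection by part of its id instead. The helper name find_collection_id and the sample values are hypothetical and not part of the eodc library.

```python
from typing import List, Tuple


def find_collection_id(collections: List[Tuple[str, str]], needle: str) -> str:
    """Return the id of the single collection whose id contains `needle`."""
    matches = [cid for cid, _path in collections if needle in cid]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one collection matching {needle!r}, found {len(matches)}: {matches}"
        )
    return matches[0]


# Hypothetical sample data in the (collection_id, path) shape described above.
sample_collections = [
    ("job-a", "job-a/STAC/job-a_collection.json"),
    ("job-b", "job-b/STAC/job-b_collection.json"),
]

print(find_collection_id(sample_collections, "job-b"))
```

Raising on zero or multiple matches keeps a wrong collection from being loaded silently.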

You can also check the available STAC items of your given collection beforehand by using the get_stac_items function of the workspace adapter.

In [None]:
items = adapter.get_stac_items(workspace_name=WORKSPACE_NAME, collection_id=collection_id)

items

The job below defaults to the same spatial_extent and temporal_extent as the save result notebook. Feel free to change them to fit your use case.

If you want to check the full extent of your user collection, you can use the following code:

In [None]:
collection = adapter.get_collection(WORKSPACE_NAME, collection_id)  # Fetch the STAC collection once and reuse it
print(collection.extent.spatial.bboxes)  # Maximum bounding box around all items in the STAC collection
print(collection.extent.temporal.intervals)  # Maximum temporal extent of all items in the STAC collection
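
Once the full extent is known, a requested spatial_extent can be checked against it before a job is submitted. The following is a minimal sketch, assuming a 2D STAC bounding box in [west, south, east, north] order. The helper name clip_spatial_extent and the sample bounding box are hypothetical.

```python
from typing import Dict, Sequence


def clip_spatial_extent(requested: Dict[str, float], bbox: Sequence[float]) -> Dict[str, float]:
    """Intersect a west/south/east/north extent with a 2D STAC bbox."""
    if len(bbox) != 4:
        # 3D STAC bboxes have six values in a different order; not handled here.
        raise ValueError(f"expected a 2D bbox with 4 values, got {len(bbox)}")
    west, south, east, north = bbox
    clipped = {
        "west": max(requested["west"], west),
        "south": max(requested["south"], south),
        "east": min(requested["east"], east),
        "north": min(requested["north"], north),
    }
    if clipped["west"] >= clipped["east"] or clipped["south"] >= clipped["north"]:
        raise ValueError("requested extent does not overlap the collection bbox")
    return clipped


# The extent used in this notebook and a hypothetical collection bbox.
requested_extent = {"west": 15, "east": 48, "south": 17, "north": 49}
sample_bbox = [10.0, 45.0, 20.0, 50.0]

print(clip_spatial_extent(requested_extent, sample_bbox))
```

With real data, the bbox argument would be one entry of the extent.spatial.bboxes list printed above.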

Now we can start a job that passes the collection_id to the load_collection node and sets the workspace property to the workspace name.

This loads the results of a previous job, which we can then transform and use as we like.

In [None]:
import uuid

process_graph = {
    "id": str(uuid.uuid4()),
    "process_graph": {
        "load1": {
            "process_id": "load_collection",
            "arguments": {
                "id": collection_id,
                "properties": {"workspace": WORKSPACE_NAME},
                "spatial_extent": {
                    "west": 15,
                    "east": 48,
                    "south": 17,
                    "north": 49,
                },
                "temporal_extent": [
                    "2019-01-01T00:00:00Z",
                    "2024-01-08T00:00:00Z",
                ],
                "bands": ["raster-result"],
            },
        },
        "save2": {
            "process_id": "save_result",
            "arguments": {"data": {"from_node": "load1"}, "format": "GTIFF"},
            "result": True,
        },
    },
}


job = connection.create_job(process_graph=process_graph, title="load-from-workspace-job")
job.start_job()

In [None]:
job  # Execute this cell to check on your job's progress

Once the job has finished, you can check the results!
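
Instead of re-running the status cell by hand, the job can be polled until it reaches a terminal state. This is a minimal sketch, assuming the job object exposes status() as the openEO Python client's batch jobs do, and that the backend reports the standard openEO states (finished, error, canceled). The helper name wait_until_done and its defaults are hypothetical.

```python
import time


def wait_until_done(job, poll_seconds: float = 10.0, timeout_seconds: float = 3600.0) -> str:
    """Poll job.status() until the job reaches a terminal state and return that state."""
    terminal_states = {"finished", "error", "canceled"}
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = job.status()
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Usage with the job created above (left commented out because it needs a live backend):
# if wait_until_done(job) == "finished":
#     job.get_results().download_files("output")  # "output" is a hypothetical target folder
```

The openEO Python client also provides its own waiting helpers on batch jobs, which may be preferable in practice. The sketch only makes the polling logic explicit.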