# How to load a data collection?

This notebook walks through loading a data collection with the openEO Python client: connecting to a backend, authenticating your account for secure access, loading and filtering a collection, and downloading the result.

In [1]:
# import necessary packages

import openeo

import matplotlib.pyplot as plt
import numpy as np

In [2]:
# connect to the backend
connection = openeo.connect(url="openeo-staging.creo.vito.be")

An established connection does not by itself mean that you are authenticated. To check, inspect the connection object: its representation shows the backend URL and the authentication method currently in use.

In [3]:
# check your connection
connection

<Connection to 'https://openeo-staging.creo.vito.be/openeo/1.1/' with NullAuth>

If the connection reports `NullAuth`, you are not authenticated yet. In that case, authenticate with the `authenticate_oidc()` method; afterwards the connection should report `OidcBearerAuth` instead.

In [4]:
# authenticate, then recheck the connection
connection.authenticate_oidc()

Authenticated using refresh token.


<Connection to 'https://openeo-staging.creo.vito.be/openeo/1.1/' with OidcBearerAuth>

Once authenticated, you can load the data collection. `load_collection` also lets you filter the collection up front so that only the data you need is processed, for example by selecting bands, a time period, a geographic area, and a maximum cloud cover.

In [5]:
# load collection, restricted to the RGB bands, May 2022, and a small bounding box
cube = connection.load_collection(
    "SENTINEL2_L2A",
    bands=["B04", "B03", "B02"],
    temporal_extent=("2022-05-01", "2022-05-30"),
    spatial_extent={
        "west": 3.202609,
        "south": 51.189474,
        "east": 3.254708,
        "north": 51.204641,
        "crs": "EPSG:4326",
    },
    max_cloud_cover=80,
)
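A swapped coordinate in the bounding box is an easy mistake to make. As a minimal sketch, the spatial extent can be checked locally before it is sent to the backend. `validate_bbox` is a hypothetical helper written for this notebook, not part of the openEO client, and it assumes the box does not cross the antimeridian.

```python
def validate_bbox(extent):
    """Check a west/south/east/north extent dict (hypothetical helper, not openEO API)."""
    required = ("west", "south", "east", "north")
    missing = [key for key in required if key not in extent]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    # Assumption: the box does not cross the antimeridian, so west < east.
    if extent["west"] >= extent["east"]:
        raise ValueError("west must be smaller than east")
    if extent["south"] >= extent["north"]:
        raise ValueError("south must be smaller than north")
    return extent


# Same bounding box as in the load_collection call above
spatial_extent = validate_bbox({
    "west": 3.202609,
    "south": 51.189474,
    "east": 3.254708,
    "north": 51.204641,
    "crs": "EPSG:4326",
})
```

The returned dictionary is unchanged, so it can be passed directly as the `spatial_extent` argument.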

In [6]:
# Because GeoTIFF does not support a temporal dimension, we first eliminate it by taking the temporal maximum value for each pixel

cube = cube.max_time()
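To illustrate what a per-pixel temporal maximum does, here is a toy example in plain Python. It only mimics the idea on nested lists; `temporal_max` and the sample values are made up for illustration and say nothing about how the backend implements the reduction.

```python
def temporal_max(frames):
    """Per-pixel maximum over a list of equally sized 2D grids (toy stand-in)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [max(frame[r][c] for frame in frames) for c in range(cols)]
        for r in range(rows)
    ]


# Two "observations" of the same 2x2 area at different dates
frames = [
    [[1, 5], [3, 2]],
    [[4, 0], [3, 9]],
]
composite = temporal_max(frames)
print(composite)  # [[4, 5], [3, 9]]
```

The time dimension disappears and a single 2D grid remains, which is the shape a GeoTIFF can store.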

The final step is to download the processed data. This can be done in two ways: synchronously or through a batch job. A synchronous download processes the request right away and returns the result as soon as it is ready, which works well for small requests like this one. A batch job runs asynchronously on the backend, and its results are downloaded once the job has finished, which is better suited to larger or longer-running requests. The choice depends on the size of the dataset and the user's preference. In this example we use the synchronous method.

In [7]:
# download the RGB image
cube.download("RGB.tiff")
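For larger requests, the batch-job route can be sketched as follows. `save_result` is a hypothetical wrapper, and the sketch assumes the cube offers `download` and `execute_batch` as in the openEO Python client and that the backend accepts the `"GTiff"` output format.

```python
def save_result(cube, path, use_batch=False):
    """Write a cube to `path`, synchronously or via a batch job (hypothetical helper)."""
    if use_batch:
        # Assumption: execute_batch starts the job, waits for it to finish,
        # and then downloads the result to `path`.
        return cube.execute_batch(outputfile=path, out_format="GTiff")
    # Synchronous request: suitable for small extents such as the one above.
    return cube.download(path, format="GTiff")


# Usage with the cube from this notebook:
# save_result(cube, "RGB.tiff", use_batch=True)
```

Keeping both paths behind one function makes it easy to switch to a batch job when a synchronous request becomes too large or times out.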