This notebook can be run on the Copernicus Dataspace JupyterHub by first running the following package installation cell.

**Note** You should select one of the kernels with GDAL installed, e.g. "Geo science".

In [1]:
try:
    import eomaji
except ModuleNotFoundError:
    !pip install eomaji@git+https://github.com/DHI/EOMAJI-OpenEO-toolbox.git

# Data Mining Sharpener Workflow

In [2]:
from pathlib import Path
import openeo
import rasterio
import xarray
from eomaji.workflows.decision_tree_sharpener import run_decision_tree_sharpener
from eomaji.workflows.prepare_data_cubes import prepare_data_cubes
from eomaji.utils.general_utils import read_area_date_info

## 1. Set up the OpenEO connection

In [3]:
connection = openeo.connect("https://openeo.dataspace.copernicus.eu")
connection.authenticate_oidc()

2025-06-20 08:11:05,362 [INFO] Loaded openEO client config from sources: []
2025-06-20 08:11:05,878 [INFO] Found OIDC providers: ['CDSE']
2025-06-20 08:11:05,881 [INFO] No OIDC provider given, but only one available: 'CDSE'. Using that one.
2025-06-20 08:11:06,197 [INFO] Created user dir for 'openeo-python-client': /root/.config/openeo-python-client
2025-06-20 08:11:06,199 [INFO] Using default client_id 'sh-b1c3a958-52d4-40fe-a333-153595d1c71e' from OIDC provider 'CDSE' info.
2025-06-20 08:11:06,202 [INFO] Created user dir for 'openeo-python-client': /root/.local/share/openeo-python-client
2025-06-20 08:11:06,204 [INFO] Trying device code flow.


2025-06-20 08:11:12,505 [INFO] [  6.1s] not authorized yet: authorization_pending
2025-06-20 08:11:18,676 [INFO] [ 12.3s] not authorized yet: authorization_pending
2025-06-20 08:11:24,793 [INFO] [ 18.4s] not authorized yet: authorization_pending
2025-06-20 08:11:31,294 [INFO] [ 24.9s] Authorized successfully.
2025-06-20 08:11:31,301 [INFO] Obtained tokens: ['access_token', 'id_token', 'refresh_token']
2025-06-20 08:11:31,304 [INFO] Storing refresh token for issuer 'https://identity.dataspace.copernicus.eu/auth/realms/CDSE' (client 'sh-b1c3a958-52d4-40fe-a333-153595d1c71e')


Authenticated using device code flow.


<Connection to 'https://openeo.dataspace.copernicus.eu/openeo/1.2/' with OidcBearerAuth>


## 2. Define AOI and date
Either read them from the information dumped by the [prepare_data.ipynb](notebooks/prepare_data.ipynb) notebook, or define them yourself.

In [4]:
date_dir = "./data"
date, bbox = read_area_date_info(
    dir=date_dir
)
# Alternatively, define them manually (this needs `import datetime`):
# date = datetime.date(2023, 6, 25)
# bbox = [6.153142, 45.045924, 6.433234, 45.251259]
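
If no dumped information is available, the two values can be defined by hand. The sketch below assumes the bbox order is [min lon, min lat, max lon, max lat] in degrees, which is what the commented example values suggest. The `validate_bbox` helper is hypothetical and not part of eomaji.

```python
import datetime


def validate_bbox(bbox):
    """Hypothetical helper: check a [west, south, east, north] bbox in degrees."""
    west, south, east, north = bbox
    if not (-180 <= west < east <= 180):
        raise ValueError(f"Invalid longitudes: west={west}, east={east}")
    if not (-90 <= south < north <= 90):
        raise ValueError(f"Invalid latitudes: south={south}, north={north}")
    return [float(v) for v in bbox]


# Same example values as in the commented lines of the cell above
date = datetime.date(2023, 6, 25)
bbox = validate_bbox([6.153142, 45.045924, 6.433234, 45.251259])
```

Running this cell replaces the `date` and `bbox` read from disk, so use it instead of `read_area_date_info`, not in addition to it.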

## 3. Download Sentinel 2 and Sentinel 3 data for AOI and date
**Note** This function first checks whether the data already exists, based on date and bbox.

In [5]:
s2_path, s3_path, worldcover_path, dem_s2_path, dem_s3_path, acq_time = prepare_data_cubes(
    connection=connection,
    bbox=bbox,
    date=date,
    sentinel2_search_range=3,
    out_dir=date_dir,
)

2025-06-20 08:12:51,809 [INFO] Cached Sentinel 2 data cube found. Skipping download.


2025-06-20 08:12:51,901 [INFO] Cached DEM data cube found. Skipping download.
2025-06-20 08:12:51,902 [INFO] Cached DEM cube found. Skipping download.
2025-06-20 08:12:51,944 [INFO] Cached Worldcover cube found. Skipping download.
2025-06-20 08:12:51,945 [INFO] Data cubes prepared and saved.
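
The log above shows cached cubes being reused instead of downloaded again. How eomaji derives its cache file names is not shown in this notebook; the sketch below only illustrates one way a check based on date and bbox could work. The `cache_path` helper, the file naming scheme, and the `.nc` suffix are all hypothetical.

```python
import datetime
import hashlib
from pathlib import Path


def cache_path(out_dir, product, date, bbox):
    """Hypothetical: build a deterministic file path from product, date and bbox."""
    # Round the coordinates so the same AOI always yields the same key
    bbox_key = ",".join(f"{v:.6f}" for v in bbox)
    digest = hashlib.sha1(bbox_key.encode("utf-8")).hexdigest()[:8]
    return Path(out_dir) / f"{product}_{date.isoformat()}_{digest}.nc"


example_path = cache_path(
    "./data",
    "s2",
    datetime.date(2023, 6, 25),
    [6.153142, 45.045924, 6.433234, 45.251259],
)
if example_path.exists():
    print("Cached cube found. Skipping download.")
else:
    print(f"No cache at {example_path}; a download would be triggered here.")
```

Because the path depends only on the product, date, and bbox, rerunning the notebook with the same inputs finds the same file, while changing either the date or the bbox leads to a new download.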


Load and filter Sentinel 2 data

In [6]:
s2_cube = xarray.open_dataset(s2_path)
# These are the Sentinel 2 bands to use for the sharpening
band_names = ["B02", "B03", "B04", "B05", "B07", "B08", "B8A", "B11", "B12"]
s2_array = s2_cube[band_names].to_array(dim="band").rio.write_crs(rasterio.crs.CRS.from_string(s2_cube.crs.spatial_ref).to_string())

Extract the Land Surface Temperature band from Sentinel 3 and the cloud mask

In [7]:
s3_cube = xarray.open_dataset(s3_path)
lst_array = s3_cube.LST.rio.write_crs(rasterio.crs.CRS.from_string(s3_cube.crs.spatial_ref).to_string())
mask_array = ((s3_cube.confidence_in < 16384).astype(float).rio.write_crs(rasterio.crs.CRS.from_string(s3_cube.crs.spatial_ref).to_string()))
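
The threshold 16384 equals 2**14, so the comparison `confidence_in < 16384` keeps pixels in which neither bit 14 nor any higher bit is set. Assuming bit 14 of `confidence_in` is the cloud summary flag (an assumption here; check the Sentinel-3 SLSTR product documentation), the sketch below compares the threshold test with an explicit bitwise test on a few made-up values.

```python
CLOUD_BIT = 1 << 14  # 16384, assumed to be the cloud summary flag


def is_clear_threshold(value):
    """Same test as in the notebook: no bit at position 14 or above is set."""
    return value < CLOUD_BIT


def is_clear_bitwise(value):
    """Alternative test: only bit 14 itself is inspected."""
    return (value & CLOUD_BIT) == 0


# Made-up flag values for illustration only
samples = [8, 1032, 16384, 16392, 32768, 32776]
for value in samples:
    print(value, is_clear_threshold(value), is_clear_bitwise(value))
```

The two tests agree unless a bit above 14 is set while bit 14 is not (32768 and 32776 in the samples); in that case the threshold test used in this notebook also masks the pixel out.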

## 4. Run the Data Mining Sharpener 

In [8]:
sharpened_data = run_decision_tree_sharpener(
    high_res_dataarray=s2_array,
    low_res_dataarray=lst_array,
    low_res_mask=mask_array,
    mask_values=[1],
    cv_homogeneity_threshold=0,
    moving_window_size=30,
    disaggregating_temperature=True,
    n_jobs=3,
    n_estimators=30,
    max_samples=0.8,
    max_features=0.8,
)

2025-06-20 08:12:55,420 [INFO] Downloaded high-resolution file to /tmp/tmpkta2vsp8.tiff
2025-06-20 08:12:55,432 [INFO] Downloaded low-resolution file to /tmp/tmpdv9kusv0.tiff
2025-06-20 08:12:55,438 [INFO] Downloaded low-resolution mask to /tmp/tmpkferili9.tiff
2025-06-20 08:12:58,067 [INFO] Temporary files /tmp/tmpkta2vsp8.tiff and /tmp/tmpdv9kusv0.tiff removed.


0
1


In [10]:
sharpened_data

## 5. Save the data or continue with further aggregation

In [11]:
sharpened_data.band_data.rio.to_raster(Path(s3_path).parent/"sharpened_LST.tif")