# Running the ComCrop ONNX Model on an OpenEO Backend

This notebook guides you through running the ComCrop ONNX model using OpenEO.
It allows you to interactively select the Area of Interest (AOI) on a map.

**Instructions:**
1. Ensure `openeo`, `rasterio`, `matplotlib`, `ipyleaflet` and `ipywidgets` are installed.
2. Run the 'Imports' cell.
3. Run the 'Define Area of Interest (AOI)' cell. A map will appear.
4. Use the rectangle drawing tool (on the left side of the map) to draw your desired AOI.
5. The coordinates and approximate size (in 10m pixels) will be displayed below the map.
   - **Constraint:** The maximum size allowed is 600x600 pixels. If your selection is too large, an error message will appear, and you'll need to draw a smaller rectangle.
6. Once a valid AOI is selected, proceed to the subsequent cells to connect to OpenEO and run the processing job.

In [1]:
# --- Import Libraries ---
import openeo
import rasterio
from rasterio.plot import show  # used by the 'Visualise Results' cell
import matplotlib.pyplot as plt


## Define Area of Interest (AOI)

Use the map below to draw a rectangle defining your processing area. The maximum size is approximately 600x600 pixels at 10m resolution.
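
The size constraint can be checked directly from a projected extent. The following is a minimal sketch, assuming the extent is given in metres (for example, UTM coordinates) and that a partially covered pixel counts as a whole pixel. The helper names `aoi_size_in_pixels` and `validate_aoi` are hypothetical and not part of this notebook.

```python
import math

MAX_PIXELS = 600   # per-side limit stated in the instructions
PIXEL_SIZE_M = 10  # 10 m pixel grid


def aoi_size_in_pixels(extent, pixel_size=PIXEL_SIZE_M):
    """Return (width, height) of a projected extent in whole pixels."""
    width_m = extent["east"] - extent["west"]
    height_m = extent["north"] - extent["south"]
    if width_m <= 0 or height_m <= 0:
        raise ValueError("Extent must satisfy west < east and south < north.")
    # Assumption: round up, so a partially covered pixel counts as one pixel.
    return math.ceil(width_m / pixel_size), math.ceil(height_m / pixel_size)


def validate_aoi(extent, max_pixels=MAX_PIXELS, pixel_size=PIXEL_SIZE_M):
    """Raise ValueError if the extent exceeds the per-side pixel limit."""
    width, height = aoi_size_in_pixels(extent, pixel_size)
    if width > max_pixels or height > max_pixels:
        raise ValueError(
            f"AOI is {width}x{height} px; the maximum is {max_pixels}x{max_pixels} px."
        )
    return width, height


# Extent values taken from the AOI cell in this notebook (EPSG:32618, metres).
example_extent = {"west": 613251, "south": 493326, "east": 613291, "north": 493404}
print(validate_aoi(example_extent))  # (4, 8)
```

The check only makes sense for a metric CRS; an extent in degrees would need to be reprojected first.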

In [None]:
CRS = "EPSG:32618"
RESOLUTION = 10  # metres; important: the resolution is implicitly tied to the CRS, so a UTM-based CRS is required here

SPATIAL_EXTENT = {
    "west": 613251,
    "south": 493326,
    "east": 613291,
    "north": 493404,
    "crs": CRS
}
TEMPORAL_EXTENT = ["2023-03-24", "2023-03-24"]
SAR_TEMPORAL_EXTENT = ["2023-01-01", "2023-12-31"]

connection = openeo.connect("https://openeo.dataspace.copernicus.eu/")
connection.authenticate_oidc()

udf_lanlot = openeo.UDF.from_file(
    'C:/Git_projects/WAC/production/prediction/udf_lat_lon.py',
    context={
        "west": SPATIAL_EXTENT['west'],
        "south": SPATIAL_EXTENT['south'],
        "east": SPATIAL_EXTENT['east'],
        "north": SPATIAL_EXTENT['north'],
        "crs": SPATIAL_EXTENT['crs'],
    },
)




Authenticated using refresh token.


In [None]:
# TODO: ask why no cloud masking is applied

# S2 pipeline
sentinel2 = connection.load_collection(
    "SENTINEL2_L2A",
    temporal_extent=TEMPORAL_EXTENT,
    spatial_extent=SPATIAL_EXTENT,
    bands=["B02", "B03", "B04", "B05", "B06", "B07", "B08", "B11", "B12"]
).resample_spatial(resolution=RESOLUTION, projection=CRS)


# S1 pipeline
sentinel1 = connection.load_collection(
    "SENTINEL1_GLOBAL_MOSAICS",
    spatial_extent=SPATIAL_EXTENT,
    temporal_extent=SAR_TEMPORAL_EXTENT,
    bands=["VV", "VH"]
).resample_spatial(resolution=RESOLUTION, projection=CRS)
sentinel1 = sentinel1.reduce_dimension(dimension="t", reducer="mean")
sentinel1 = sentinel1.apply(lambda x: 10 * x.log(base=10))


# Lat/lon pipeline
latlon = (
    sentinel2.apply(process=udf_lanlot)
    .rename_labels("bands", ["lon", "lat"])
    .resample_spatial(resolution=RESOLUTION, projection=CRS)
)

# DEM pipeline
dem = connection.load_collection(
    "COPERNICUS_30",
    spatial_extent=SPATIAL_EXTENT
).resample_spatial(resolution=RESOLUTION, projection=CRS, method="bilinear")


if dem.metadata.has_temporal_dimension():
    dem = dem.reduce_dimension(dimension="t", reducer="mean")

    
merged_datacube = (
    sentinel2
    .merge_cubes(sentinel1)
    .merge_cubes(dem)
    .merge_cubes(latlon)
)
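
The Sentinel-1 step in the cell above averages the backscatter over time and only then converts it to decibels with `10 * log10(x)`. The sketch below illustrates that conversion on plain numbers and shows why the order matters. It assumes the input values are linear backscatter; `linear_to_db` is a hypothetical helper, not part of the pipeline.

```python
import math


def linear_to_db(value):
    """Convert a linear backscatter value to decibels: 10 * log10(value)."""
    if value <= 0:
        # The logarithm is undefined for non-positive input; NaN is one way to flag it.
        return float("nan")
    return 10 * math.log10(value)


# Two illustrative (made-up) linear backscatter samples for one pixel.
samples = [0.1, 0.001]

# Order used in the cell above: temporal mean in linear units, then dB.
mean_then_db = linear_to_db(sum(samples) / len(samples))

# Alternative order: convert each sample to dB, then average.
db_then_mean = sum(linear_to_db(s) for s in samples) / len(samples)

print(f"mean, then dB: {mean_then_db:.2f} dB")
print(f"dB, then mean: {db_then_mean:.2f} dB")
```

The two orders give different results because the logarithm is not linear, so whichever order the model was trained with should also be used at prediction time.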

In [24]:

merged_datacube.execute_batch('test.nc')




0:00:00 Job 'j-25050615564144b7b9d8ece12e642cda': send 'start'
0:00:15 Job 'j-25050615564144b7b9d8ece12e642cda': created (progress 0%)
0:00:20 Job 'j-25050615564144b7b9d8ece12e642cda': created (progress 0%)
0:00:26 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:00:34 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:00:44 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:00:57 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:01:12 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:01:31 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:01:55 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:02:25 Job 'j-25050615564144b7b9d8ece12e642cda': running (progress N/A)
0:03:02 Job 'j-25050615564144b7b9d8ece12e642cda': finished (progress 100%)


In [34]:
import xarray as xr
import matplotlib.pyplot as plt

ds = xr.open_dataset('test.nc')

ds

In [None]:




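# --- Apply Normalisation UDF ---
# `merged_datacube` is the cube built above. `normalise_bands_udf` is not defined
# in this notebook and must be loaded first (e.g. with openeo.UDF.from_file).
print("Applying normalisation UDF...")
cubemerged = merged_datacube.apply(process=normalise_bands_udf)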

# After normalisation, we need to explicitly update band labels to include NDVI
# This ensures the dimension labels match the actual 15 bands output by the UDF
print("Updating band labels to include NDVI...")
cubemerged = cubemerged.rename_labels(
    'bands', ["B02", "B03", "B04", "B05", "B06", "B07", "B08", "B11", "B12", "NDVI", "VV", "VH", "DEM", "lon", "lat"]
)
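# Note: dimension_labels() only adds a node to the process graph; the label values
# are not evaluated client-side, so this prints a DataCube repr, not the labels.
print(f"Final band labels: {cubemerged.dimension_labels('bands')}")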

Applying normalisation UDF...
Updating band labels to include NDVI...
Final band labels: DataCube(<PGNode 'dimension_labels' at 0x1e7f58dfa10>)
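
The band bookkeeping behind the `rename_labels` call can be checked without a backend. The sketch below rebuilds the 15 labels from the 9 Sentinel-2 bands, the inserted NDVI band and the remaining 5 bands, and shows the standard NDVI formula. That the normalisation UDF derives NDVI from B08 (near infrared) and B04 (red) is an assumption, since the UDF itself is not shown here; `ndvi` and `labels_with_ndvi` are hypothetical helpers.

```python
S2_BANDS = ["B02", "B03", "B04", "B05", "B06", "B07", "B08", "B11", "B12"]
EXTRA_BANDS = ["VV", "VH", "DEM", "lon", "lat"]


def ndvi(nir, red):
    """Standard NDVI: (NIR - red) / (NIR + red)."""
    denominator = nir + red
    if denominator == 0:
        # Assumption: return 0.0 when both reflectances are zero.
        return 0.0
    return (nir - red) / denominator


def labels_with_ndvi(s2_bands, extra_bands):
    """Place NDVI directly after the Sentinel-2 bands, as in the rename_labels call."""
    return list(s2_bands) + ["NDVI"] + list(extra_bands)


labels = labels_with_ndvi(S2_BANDS, EXTRA_BANDS)
print(len(labels))  # 15
print(labels)
print(f"NDVI for NIR=0.5, red=0.1: {ndvi(0.5, 0.1):.3f}")
```

If the UDF emits its bands in a different order, the label list passed to `rename_labels` has to be adjusted to match.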


In [12]:
cubemerged.execute_batch(job_options=job_options)

0:00:00 Job 'j-2504301413224cc1b02c923748ed24ef': send 'start'
0:01:24 Job 'j-2504301413224cc1b02c923748ed24ef': created (progress 0%)
0:01:29 Job 'j-2504301413224cc1b02c923748ed24ef': created (progress 0%)
0:01:36 Job 'j-2504301413224cc1b02c923748ed24ef': created (progress 0%)
0:01:44 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:01:54 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:02:06 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:02:21 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:02:41 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:03:05 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:03:35 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:04:12 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:04:59 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:05:57 Job 'j-2504301413224cc1b02c923748ed24ef': queued (progress 0%)
0:06:58 Job

In [10]:
# --- Run Model UDF with Neighborhood Processing ---

print("Proceeding to apply_neighborhood...")

# Use smaller patches with overlap to prevent grid bounds issues
print("Running model UDF with neighborhood processing...")
patch_size = 48  # Reduced from 64 to avoid grid bounds issues
overlap = 8     # Add overlap to handle edge cases
print(f"Using patch size {patch_size}x{patch_size} with {overlap} pixel overlap")
print("Monitor batch job @ https://openeo.dataspace.copernicus.eu/")

result = cubemerged.apply_neighborhood(
    process=model_udf,
    size=[
        {'dimension': 'x', 'value': patch_size, 'unit': 'px'},
        {'dimension': 'y', 'value': patch_size, 'unit': 'px'}
    ],
    overlap=[
        {'dimension': 'x', 'value': overlap, 'unit': 'px'},
        {'dimension': 'y', 'value': overlap, 'unit': 'px'}
    ]
)

# Set up the output file path
target_file = f"{model_filename}~jn~prediction.tif"

# Create and start a batch job
main_job = result.execute_batch(target_file, 
    job_options=job_options, 
    title=f"{model_filename}~jn~ONNX Prediction"
)
print(f"Main batch job {main_job.job_id} finished. Output saved to {target_file}")


Proceeding to apply_neighborhood...
Running model UDF with neighborhood processing...
Using patch size 48x48 with 8 pixel overlap
Monitor batch job @ https://openeo.dataspace.copernicus.eu/


NameError: name 'model_udf' is not defined
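
The patch arithmetic used in the `apply_neighborhood` call can be worked through by hand. The sketch below assumes the usual openEO behaviour in which the overlap is added on both sides of each patch, so the UDF receives `size + 2 * overlap` pixels per axis. How a backend pads patches at the AOI edge is not covered here. `window_size` and `patch_count` are hypothetical helpers.

```python
import math


def window_size(size, overlap):
    """Pixels per axis seen by the UDF: the core patch plus overlap on both sides."""
    return size + 2 * overlap


def patch_count(length_px, size):
    """Number of core patches needed to cover an axis of length_px pixels."""
    return math.ceil(length_px / size)


patch_size = 48  # values from the cell above
overlap = 8

print(window_size(patch_size, overlap))  # 64
print(patch_count(600, patch_size))      # 13
```

With these settings each UDF call would see a 64x64 window, and a 600-pixel axis (the maximum AOI side) would be covered by 13 patches.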

In [None]:
# --- Visualise Results ---
if main_job.status() == 'finished':
    print("Job finished; downloading results...")
    # Download results to the directory where the notebook is running
    results = main_job.get_results()
    results.download_files()
    print(f"Results downloaded. Expecting output file: {target_file}")
    
    
    # Now try to open and display the file
    print(f"Opening result file: {target_file}")
    with rasterio.open(target_file) as dataset:
        print("Dataset properties:")
        print(f"  Driver: {dataset.driver}")
        print(f"  CRS: {dataset.crs}")
        print(f"  Count: {dataset.count}")
        print(f"  Width: {dataset.width}, Height: {dataset.height}")
        print(f"  Bounds: {dataset.bounds}")

        # Display the first band
        fig, ax = plt.subplots(1, 1, figsize=(10, 10))
        show(dataset.read(1), ax=ax, cmap='viridis', title=target_file)
        plt.show()
else:
    print(f"Cannot show results: job status is '{main_job.status()}'.")
    print("Please check the job logs above or on the OpenEO platform for details.")