### Create WorldCereal Croptype UDP

This notebook is a step-by-step guide to creating a new `worldcereal_crop_type` UDP (user-defined process) from the `worldcereal-classification` version installed in the environment that runs this notebook.

First, set the postprocessing parameters. These values are replaced by parameters in the UDP, so they can be left unchanged.

In [6]:
from worldcereal.job import PostprocessParameters

### OPTIONAL PARAMETERS

# Choose whether you want to store the cropland mask as separate output file
save_mask = True

# Choose whether or not you want to spatially clean the classification results
postprocess_result = True

# Choose the postprocessing method you want to use ["smooth_probabilities", "majority_vote"]
# ("smooth_probabilities" will do limited spatial cleaning,
# while "majority_vote" will do more aggressive spatial cleaning, depending on the value of kernel_size)
postprocess_method = "majority_vote"

# Additional parameter for the majority vote method
# (the higher the value, the more aggressive the spatial cleaning,
# should be an odd number, not larger than 25, default = 5)
kernel_size = 5

# Do you want to save the intermediate results? (before applying the postprocessing)
save_intermediate = True

# Do you want to save all class probabilities in the final product? (default is False)
keep_class_probs = True

postprocess_parameters = PostprocessParameters(
    enable=postprocess_result,
    method=postprocess_method,
    kernel_size=kernel_size,
    save_intermediate=save_intermediate,
    keep_class_probs=keep_class_probs,
)
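
The comment on `kernel_size` above states two constraints: the value should be odd and not larger than 25. A minimal sketch of checking these before building the parameters could look as follows. `check_kernel_size` is a hypothetical helper written for this notebook, not part of `worldcereal-classification`, and the lower bound of 1 is an assumption.

```python
def check_kernel_size(kernel_size: int, max_size: int = 25) -> int:
    """Return kernel_size if it is an odd integer in [1, max_size], else raise.

    Hypothetical helper: the limits mirror the comment in the cell above,
    they are not read from the worldcereal package.
    """
    if isinstance(kernel_size, bool) or not isinstance(kernel_size, int):
        raise ValueError(f"kernel_size must be an integer, got {kernel_size!r}")
    if kernel_size < 1 or kernel_size > max_size:
        raise ValueError(f"kernel_size must be between 1 and {max_size}, got {kernel_size}")
    if kernel_size % 2 == 0:
        raise ValueError(f"kernel_size must be odd, got {kernel_size}")
    return kernel_size


checked_kernel_size = check_kernel_size(5)
```

Calling the helper with an even value such as 4, or with a value above 25, raises a `ValueError` before any job is submitted.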

Next, define all cropland/croptype parameters and the spatiotemporal extent. Since these will also be parametrized in the UDP, they can be left unchanged as well.

In [7]:
from worldcereal.job import CropLandParameters, CropTypeParameters
from openeo_gfmap import TemporalContext, BoundingBoxExtent

# Initializes default parameters
cropland_parameters = CropLandParameters()
croptype_parameters = CropTypeParameters()

model_url = "https://artifactory.vgt.vito.be/artifactory/auxdata-public/worldcereal/models/PhaseII/downstream/tests/be_multiclass-test_custommodel.onnx"

# Customize the parameters
cropland_parameters.feature_parameters.compile_presto = True

croptype_parameters.classifier_parameters.classifier_url = model_url
croptype_parameters.save_mask = save_mask
croptype_parameters.feature_parameters.compile_presto = True

# Get processing period and area
# temporal_extent = TemporalContext(start_date='2020-12-01', end_date='2021-11-30') 
# spatial_extent = BoundingBoxExtent(west=549260.0538727192, south=5643096.65598935, east=550221.062129418, north=5643965.825801395, epsg=32631) 


temporal_extent = TemporalContext(start_date="2023-10-01", end_date="2024-09-30") 
spatial_extent = BoundingBoxExtent(west=664000, south=5611134, east=684000, north=5631134, epsg=32631) 
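
Before submitting a test job it can help to know how large the test area is. The sketch below recomputes the size of the bounding box from the numbers used above. It assumes only that the coordinates are in metres, which holds for the UTM zone EPSG:32631.

```python
# Bounds copied from the BoundingBoxExtent above (EPSG:32631, units are metres).
west, south, east, north = 664000, 5611134, 684000, 5631134

width_km = (east - west) / 1000
height_km = (north - south) / 1000
area_km2 = width_km * height_km

print(f"{width_km:.1f} km x {height_km:.1f} km = {area_km2:.0f} km2")
# prints: 20.0 km x 20.0 km = 400 km2
```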



Next, we call the `create_inference_process_graph` function from the `worldcereal-classification` repository. This function will return an openEO process graph, which will form the basis for the UDP.

In [8]:
from worldcereal.job import WorldCerealProductType, create_inference_process_graph
from openeo_gfmap import BackendContext, Backend

pg = create_inference_process_graph(
    spatial_extent=spatial_extent,
    temporal_extent=temporal_extent,
    product_type=WorldCerealProductType.CROPTYPE,
    cropland_parameters=cropland_parameters,
    croptype_parameters=croptype_parameters,
    postprocess_parameters=postprocess_parameters,
    backend_context=BackendContext(Backend.CDSE)
)

Authenticated using refresh token.


2025-07-23 11:31:43,280 - openeo_gfmap.utils - INFO - Selected orbit state: ASCENDING. Reason: Orbit has more cumulative intersected area. 13.033136612383506 > 7.983154394462191
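
To get a quick overview of what the returned process graph contains, its flat representation can be summarized per process. The sketch below uses a small toy graph so that it runs on its own. `summarize_flat_graph` is a hypothetical helper, and the toy node names are illustrative rather than taken from the real WorldCereal graph.

```python
from collections import Counter


def summarize_flat_graph(flat_graph: dict) -> Counter:
    """Count how often each openEO process occurs in a flat process graph."""
    return Counter(node["process_id"] for node in flat_graph.values())


# Toy flat graph for illustration only (not the WorldCereal graph).
toy_graph = {
    "loadcollection1": {
        "process_id": "load_collection",
        "arguments": {"id": "SENTINEL2_L2A"},
    },
    "applyneighborhood1": {
        "process_id": "apply_neighborhood",
        "arguments": {"data": {"from_node": "loadcollection1"}},
    },
    "saveresult1": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "applyneighborhood1"}, "format": "GTiff"},
        "result": True,
    },
}

summary = summarize_flat_graph(toy_graph)
for process_id, count in sorted(summary.items()):
    print(f"{process_id}: {count}")
```

With the real graph, the same helper can be applied to `pg.flat_graph()`.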


OPTIONAL: to make sure the process graph works (for example, after changing the `worldcereal-classification` codebase), you can send it to the CDSE backend and check that it runs successfully and produces meaningful results.

In [9]:
# ONNX_DEPS_URL = "https://artifactory.vgt.vito.be/artifactory/auxdata-public/openeo/onnx_dependencies_1.16.3.zip"
ONNX_DEPS_URL = "https://s3.waw3-1.cloudferro.com/swift/v1/project_dependencies/onnx_deps_python311.zip"
TORCH_DEPS_URL = "https://s3.waw3-1.cloudferro.com/swift/v1/project_dependencies/torch_deps_python311.zip"

job_options = {
    "driver-memory": "4g",
    "executor-memory": "2g",
    "executor-memoryOverhead": "1g",
    "python-memory": "3g",
    "soft-errors": "true",
    "udf-dependency-archives": [
        f"{ONNX_DEPS_URL}#onnx_deps",
        f"{TORCH_DEPS_URL}#feature_deps",
    ],
    "image-name": "python311",
}


pg.execute_batch(
    title="Test worldcereal croptype process graph",
    job_options=job_options,
)

0:00:00 Job 'j-2507230931534eb5a46172271a92d8ce': send 'start'
0:00:14 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:00:19 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:00:26 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:00:34 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:00:44 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:00:57 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:01:13 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:01:32 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:01:57 Job 'j-2507230931534eb5a46172271a92d8ce': queued (progress 0%)
0:02:28 Job 'j-2507230931534eb5a46172271a92d8ce': running (progress N/A)
0:03:07 Job 'j-2507230931534eb5a46172271a92d8ce': running (progress N/A)
0:03:55 Job 'j-2507230931534eb5a46172271a92d8ce': running (progress N/A)
0:04:54 Job 'j-2507230931534eb5a46172271a92d8ce': running (progress N/A)
0:05:5
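
Each entry in `udf-dependency-archives` above has the form `<url>#<name>`. The part after `#` appears to be the folder name under which the archive is made available to the UDF; treat that reading as an assumption and check the backend documentation. The sketch below builds and splits such entries with two hypothetical helpers, so that a malformed entry is caught locally instead of inside a batch job.

```python
def archive_entry(url: str, name: str) -> str:
    """Build one 'udf-dependency-archives' entry of the form '<url>#<name>'."""
    if "#" in url or "#" in name or not url or not name:
        raise ValueError("url and name must be non-empty and must not contain '#'")
    return f"{url}#{name}"


def split_archive_entry(entry: str) -> tuple[str, str]:
    """Split '<url>#<name>' back into (url, name)."""
    url, sep, name = entry.rpartition("#")
    if not sep or not url or not name:
        raise ValueError(f"expected '<url>#<name>', got {entry!r}")
    return url, name


# URL copied from the cell above.
ONNX_DEPS_URL = "https://s3.waw3-1.cloudferro.com/swift/v1/project_dependencies/onnx_deps_python311.zip"

entry = archive_entry(ONNX_DEPS_URL, "onnx_deps")
url, name = split_archive_entry(entry)
```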

If your process graph works, you can save it to a local JSON file.

In [None]:
import json

path_to_json = "./process_graph.json"

with open(path_to_json, "w") as f:
    json.dump(pg.flat_graph(), f, indent=2)
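
As a sanity check, the saved file can be read back and compared with the graph in memory. The sketch below does this round trip with a toy graph and a temporary directory so that it runs without a backend; the toy graph is illustrative, and the check for a `result` node reflects the openEO convention that a process graph marks its result node with `"result": true`.

```python
import json
import tempfile
from pathlib import Path

# Toy flat graph for illustration only; with the real graph use pg.flat_graph().
toy_graph = {
    "loadcollection1": {
        "process_id": "load_collection",
        "arguments": {"id": "SENTINEL2_L2A"},
    },
    "saveresult1": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "loadcollection1"}, "format": "GTiff"},
        "result": True,
    },
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "process_graph.json"
    with open(path, "w") as f:
        json.dump(toy_graph, f, indent=2)
    with open(path) as f:
        reloaded = json.load(f)

# Nodes flagged as the result of the graph.
result_nodes = [name for name, node in reloaded.items() if node.get("result")]
print("round trip ok:", reloaded == toy_graph, "| result nodes:", result_nodes)
```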

The JSON file above can then be used to create the UDP. From this point onwards the work is manual (due to dependencies on GFMap); this might change in the future.

It is recommended to start from a previous version of the UDP, e.g.: https://github.com/WorldCereal/worldcereal-classification/blob/worldcereal_crop_type_v1.1.1/src/worldcereal/udp/worldcereal_crop_type.json.

The following input parameters need to be parametrized. The UDP example above shows how to do this.

- spatial_extent
- temporal_extent
- orbit_state
- postprocess_method
- postprocess_kernel_size
- model_url
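
In openEO, parametrizing a value means replacing the hard-coded argument in the process graph with a `{"from_parameter": "<name>"}` reference and declaring that name in the `parameters` list of the UDP. The sketch below shows the idea on a toy graph. `parametrize` is a hypothetical helper, the toy node is illustrative, and the descriptions are placeholders; the real UDP linked above remains the reference for the exact structure.

```python
import copy


def parametrize(flat_graph: dict, node_id: str, argument: str, parameter_name: str) -> dict:
    """Return a copy of flat_graph where one argument refers to a UDP parameter."""
    graph = copy.deepcopy(flat_graph)
    graph[node_id]["arguments"][argument] = {"from_parameter": parameter_name}
    return graph


# Toy flat graph with hard-coded extents (illustration only).
toy_graph = {
    "loadcollection1": {
        "process_id": "load_collection",
        "arguments": {
            "id": "SENTINEL2_L2A",
            "spatial_extent": {
                "west": 664000, "south": 5611134,
                "east": 684000, "north": 5631134,
                "crs": "EPSG:32631",
            },
            "temporal_extent": ["2023-10-01", "2024-09-30"],
        },
        "result": True,
    }
}

graph = parametrize(toy_graph, "loadcollection1", "spatial_extent", "spatial_extent")
graph = parametrize(graph, "loadcollection1", "temporal_extent", "temporal_extent")

# Minimal UDP skeleton: every referenced parameter is declared with a schema.
udp = {
    "id": "worldcereal_crop_type",
    "process_graph": graph,
    "parameters": [
        {
            "name": "spatial_extent",
            "description": "Bounding box of the area to process (placeholder text).",
            "schema": {"type": "object", "subtype": "bounding-box"},
        },
        {
            "name": "temporal_extent",
            "description": "Start and end date of the processing period (placeholder text).",
            "schema": {"type": "array", "subtype": "temporal-interval"},
        },
    ],
}
```

The helper works on a deep copy, so the original graph keeps its hard-coded values and can still be used for a direct test run.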

Once the UDP is finished, you can push it to a feature branch on GitHub. Using the raw GitHub URL of your new UDP, you can then test whether it behaves as expected with the code snippet below:

In [10]:
import openeo

c = openeo.connect('openeo.dataspace.copernicus.eu').authenticate_oidc()

# Adjust this to the namespace of your newly created UDP
namespace = 'https://raw.githubusercontent.com/WorldCereal/worldcereal-classification/refs/tags/worldcereal_crop_type_v1.1.0/src/worldcereal/udp/worldcereal_crop_type.json'

model_url = "https://artifactory.vgt.vito.be/artifactory/auxdata-public/worldcereal/models/PhaseII/downstream/tests/be_multiclass-test_custommodel.onnx"
# temporal_extent = ["2020-12-01", "2021-11-30"]
# spatial_extent = {"west": 549260.0538727192, "south": 5643096.65598935, "east": 550221.062129418, "north": 5643965.825801395, "crs": "EPSG:32631"}

temporal_extent = ["2023-10-01", "2024-09-30"]
spatial_extent = {"west": 664000, "south": 5611134, "east": 684000, "north": 5631134, "crs": "EPSG:32631"}




orbit_state = "ASCENDING"  # optional parameter, default is "DESCENDING"
postprocess_method = "majority_vote"  # optional parameter, default is "smooth_probabilities"
postprocess_kernel_size = 5  # optional parameter, default is 5


cube = c.datacube_from_process(
    process_id='worldcereal_crop_type',
    namespace=namespace,
    spatial_extent=spatial_extent,
    temporal_extent=temporal_extent,
    model_url=model_url,
    orbit_state=orbit_state,
    postprocess_method=postprocess_method,
    postprocess_kernel_size=postprocess_kernel_size
)

job = cube.execute_batch(
    title="Test worldcereal_crop_type UDP",
)

Authenticated using refresh token.
0:00:00 Job 'j-25072310144848918102642ca288ab48': send 'start'
0:00:13 Job 'j-25072310144848918102642ca288ab48': created (progress 0%)
0:00:19 Job 'j-25072310144848918102642ca288ab48': created (progress 0%)
0:00:25 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:00:33 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:00:44 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:00:56 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:01:11 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:01:31 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:01:55 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:02:25 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:03:03 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:03:49 Job 'j-25072310144848918102642ca288ab48': running (progress N/A)
0:04:48 Job 'j-2507231014484
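
A mismatch between the arguments passed to `datacube_from_process` and the parameters declared in the UDP is easier to spot before a job is submitted. The sketch below compares a set of arguments with a UDP definition. `check_arguments` is a hypothetical helper, and the toy definition only mirrors the six parameter names listed earlier; which of them are optional in the real UDP is an assumption here, guided by the comments in the cell above.

```python
def check_arguments(udp: dict, arguments: dict) -> tuple[list, list]:
    """Return (unknown, missing) argument names relative to the UDP parameters."""
    declared = {p["name"]: p for p in udp.get("parameters", [])}
    unknown = sorted(set(arguments) - set(declared))
    missing = sorted(
        name
        for name, param in declared.items()
        if not param.get("optional", False) and name not in arguments
    )
    return unknown, missing


# Toy UDP definition (assumption: only the two extents are required).
toy_udp = {
    "id": "worldcereal_crop_type",
    "parameters": [
        {"name": "spatial_extent"},
        {"name": "temporal_extent"},
        {"name": "model_url", "optional": True},
        {"name": "orbit_state", "optional": True},
        {"name": "postprocess_method", "optional": True},
        {"name": "postprocess_kernel_size", "optional": True},
    ],
}

unknown, missing = check_arguments(
    toy_udp,
    {"temporal_extent": ["2023-10-01", "2024-09-30"], "kernel_size": 5},
)
print("unknown:", unknown, "| missing:", missing)
# prints: unknown: ['kernel_size'] | missing: ['spatial_extent']
```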