# WEED inference
In this notebook we showcase how EO processing is coupled with ONNX ML inference within WEED.

We first lazily load a data cube that contains every enabled training feature as a band.
Next, we read from the model, which is stored on an openEO-accessible storage site, which features it was trained on.

It is important that users add this information to the stored models. The `onnx_model_utilities` module provides code showing how this can be done: it reads the relevant information from a JSON file and writes it into the ONNX metadata. This approach will change as the project continues, since model training will also be streamlined within WEED.
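
As a minimal sketch of the idea, and not the actual `onnx_model_utilities` implementation: ONNX stores model metadata as string key/value pairs (`metadata_props`), so feature lists read from a JSON file have to be serialized to strings before they can be attached. The helper names and the sample input feature names below are hypothetical; only the keys `input_features` and `output_features` are taken from this notebook.

```python
import json


def encode_feature_metadata(info: dict) -> dict:
    """Serialize the feature lists into string key/value pairs."""
    return {key: json.dumps(info[key]) for key in ("input_features", "output_features")}


def decode_feature_metadata(props: dict) -> dict:
    """Inverse step: turn the stored strings back into Python lists."""
    return {key: json.loads(value) for key, value in props.items()}


# Hypothetical content of the JSON file written at training time.
training_info = json.loads(
    '{"input_features": ["B02_median", "B03_median"],'
    ' "output_features": ["predicted_label", "prob_class_30000"]}'
)

props = encode_feature_metadata(training_info)
restored = decode_feature_metadata(props)

# With the onnx package installed, the pairs could be attached roughly like this:
#   model = onnx.load("model_1.onnx")
#   for key, value in props.items():
#       entry = model.metadata_props.add()
#       entry.key, entry.value = key, value
#   onnx.save(model, "model_1.onnx")
```

Encoding the lists as JSON strings keeps the round trip lossless, so the inference side can recover exactly the band names the model was trained with.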



In [1]:
import os
import sys
from pathlib import Path

import openeo

# make the local eo_processing checkout importable; adjust this path to your own clone
sys.path.append(os.path.abspath('C:/Git_projects/eo_processing/src'))

from eo_processing.utils.helper import init_connection, getUDFpath
from eo_processing.utils.onnx_model_utilities import get_training_features_from_model
from eo_processing.openeo.processing import generate_master_feature_cube
from eo_processing.config import get_job_options, get_collection_options, get_standard_processing_options

### Connect to openEO processing backend

In [2]:
backend = 'cdse' 
# establish the connection to the selected backend
connection = init_connection(backend)

job_options = get_job_options(provider=backend)
collection_options = get_collection_options(provider=backend)

# reuse the standard processing options for feature generation
processing_options = get_standard_processing_options(provider=backend, task='feature_generation')

Authenticated using refresh token.


### Specify space & time context

In [3]:
# the time context is given by start and end date
start = '2021-01-01'
end = '2021-02-01'   # the end is always exclusive
AOI = {'east': 4832000, 'south': 2818000, 'west': 4831000, 'north': 2819000, 'crs': 'EPSG:3035'}
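
As a quick sanity check on the space and time context, the sketch below derives the extent of the bounding box and the raster shape it yields. The 10 m pixel size is an assumption (it is not set in this cell), and the day count reflects the exclusive end date noted above.

```python
from datetime import date

AOI = {'east': 4832000, 'south': 2818000, 'west': 4831000, 'north': 2819000, 'crs': 'EPSG:3035'}
PIXEL_SIZE_M = 10  # assumption: 10 m output resolution

# EPSG:3035 coordinates are expressed in metres
width_m = AOI['east'] - AOI['west']
height_m = AOI['north'] - AOI['south']
shape = (height_m // PIXEL_SIZE_M, width_m // PIXEL_SIZE_M)  # (rows, columns)

# the end date is exclusive, so the interval covers January 2021 only
start, end = date(2021, 1, 1), date(2021, 2, 1)
n_days = (end - start).days

print(shape, n_days)
```

A 1 km by 1 km box at this assumed pixel size gives 100 by 100 pixels, which agrees with the `proj:shape` reported in the job results metadata further down.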

Given the URL of the model folder and the model file names, we run inference with multiple models. A dedicated `save_result` option then stores every output band as a separate asset.
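
To make the band naming explicit: each output band is prefixed with the stem of its model file, so bands from different models do not collide when the cubes are merged. A small sketch, in which the output feature names are illustrative values (in the notebook they are read from the model metadata):

```python
from pathlib import Path

model_names = ['model_1.onnx', 'model_2.onnx']
# illustrative output features; the real ones come from the ONNX metadata
output_features = ['predicted_label', 'prob_class_30000']

band_names = []
for name in model_names:
    prefix = Path(name).stem + '_'  # e.g. 'model_1_'
    band_names.extend(prefix + feature for feature in output_features)

print(band_names)
```

These band names reappear in the asset file names of the job results, for example `openEO_model_1_predicted_label.tif`.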

In [4]:
# base URL of the model folder and the model files to run
MODEL_URL = "https://s3.waw3-1.cloudferro.com/swift/v1/weed/catboost_models/"
MODELS_NAMES = ['model_1.onnx', 'model_2.onnx']

for i, modelname in enumerate(MODELS_NAMES):
    model_url = MODEL_URL + modelname
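    # prefix for the output bands, so bands of different models stay distinguishable
    model_str = Path(modelname).stem + '_'

    # read the input (training) features and the output band names from the model metadata
    metadata = get_training_features_from_model(model_url)

    input_bands = metadata['input_features']
    output_bands = [model_str + band for band in metadata['output_features']]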

    data_cube = generate_master_feature_cube(connection,
                                   AOI,
                                   start,
                                   end,
                                   **collection_options,
                                   **processing_options)

    # keep only the bands this model was trained on
    data_cube = data_cube.filter_bands(input_bands)


    # pass the model URL to the UDF as context information
    udf = openeo.UDF.from_file(
        getUDFpath('udf_catboost_inference.py'),
        context={"model_url": model_url},
    )

    # apply the inference UDF to the data cube
    catboost_classification = data_cube.apply(process=udf)

    # rename the output bands and merge them with the results of the previous models
    bands = catboost_classification.rename_labels(dimension="bands", target=output_bands)
    if i == 0:
        output = bands
    else:
        output = output.merge_cubes(bands)

# save each band as a separate GeoTIFF file
save_result_options = {"separate_asset_per_band": True}
save_result = output.save_result(format="GTiff", options=save_result_options)

# create and run the job
job = connection.create_job(
    process_graph=save_result.flat_graph(),
    title="multimodal_inference",
    additional=job_options,
)

job.start_and_wait()



0:00:00 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': send 'start'
0:00:16 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:00:21 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:00:28 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:00:36 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:00:46 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:00:58 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:01:14 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:01:33 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': created (progress 0%)
0:01:57 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': running (progress N/A)
0:02:27 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': running (progress N/A)
0:03:04 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': running (progress N/A)
0:03:51 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': running (progress N/A)
0:04:49 Job 'j-241122d727bf4b6cbcfd8c4be88852aa': running (progress N

### Check on the results

In [None]:
results = job.get_results()
results.get_metadata()

{'assets': {'openEO_model_1_predicted_label.tif': {'eo:bands': [{'name': 'model_1_predicted_label'}],
   'href': 'https://openeo.dataspace.copernicus.eu/openeo/1.2/jobs/j-241122f5c3d64f7da4fb2ea5a908f546/results/assets/YTQyMWY4NDktMmFlNi00MmQzLTkzZjAtYjQzYWEyNTY3ZjFl/6bf420c294a9fd09e34d80649c9f8a66/openEO_model_1_predicted_label.tif?expires=1732887589',
   'proj:bbox': [4831000, 2818000, 4832000, 2819000],
   'proj:epsg': 3035,
   'proj:shape': [100, 100],
   'raster:bands': [{'name': 'model_1_predicted_label',
     'statistics': {'maximum': 100000.0,
      'mean': 99720.0,
      'minimum': 30000.0,
      'stddev': 4418.3254746566,
      'valid_percent': 100.0}}],
   'roles': ['data'],
   'title': 'openEO_model_1_predicted_label.tif',
   'type': 'image/tiff; application=geotiff'},
  'openEO_model_1_prob_class_100000.tif': {'eo:bands': [{'name': 'model_1_prob_class_100000'}],
   'href': 'https://openeo.dataspace.copernicus.eu/openeo/1.2/jobs/j-241122f5c3d64f7da4fb2ea5a908f546/results/a

In [14]:
results.download_files("./out")

[WindowsPath('out/openEO_model_1_predicted_label.tif'),
 WindowsPath('out/openEO_model_1_prob_class_100000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_110000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_30000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_40000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_50000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_60000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_70000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_80000.tif'),
 WindowsPath('out/openEO_model_1_prob_class_90000.tif'),
 WindowsPath('out/openEO_model_2_predicted_label.tif'),
 WindowsPath('out/openEO_model_2_prob_class_100000.tif'),
 WindowsPath('out/openEO_model_2_prob_class_110000.tif'),
 WindowsPath('out/openEO_model_2_prob_class_30000.tif'),
 WindowsPath('out/openEO_model_2_prob_class_40000.tif'),
 WindowsPath('out/openEO_model_2_prob_class_50000.tif'),
 WindowsPath('out/openEO_model_2_prob_class_60000.tif'),
 WindowsPath('out/openEO_mode