ARGANS Waterlines from Sentinel-2 algorithm #468

Open
mileonai wants to merge 53 commits into main from argans_waterlines

@mileonai

As part of the ESA-funded FastTrack project, we are aiming to onboard Argans's waterlines algorithm into the APEx Catalogue.

  • new provider added
  • new service record and code added

@algorithm-services-catalogue

algorithm-services-catalogue Bot commented Apr 15, 2026

🔍 Catalogue's Preview Site Deployed

Your changes have been deployed to the preview site:

🔗 Preview URL: https://esa-apex.github.io/apex-algorithms-catalogue-web/pr-preview/pr-468/

This preview will be updated automatically when you push new changes to your PR.

@mileonai
Author

@JanssenBrm I am unsure how to get past the unit test failures related to branch naming. If I use the actual branch name, I get the following error: Links should not point to ephemeral feature branches: found 'argans_waterlines' in 'https://raw.githubusercontent.com/ESA-APEx/apex_algorithms/refs/heads/argans_waterlines/algorithm_catalog/argans/waterlines/openeo_udp/waterlines.json'. If I change the branch to main, the tests fail as well, saying that the URL doesn't exist.

@mileonai mileonai requested a review from JanssenBrm April 16, 2026 08:52
The input must remain a DataCube because the workflow relies on
`raster_to_vector()` before applying the vector-based waterline UDF.
"""
cube = cube.raster_to_vector()
Author

@mileonai mileonai Apr 20, 2026

I have tested the steps used to build the process graph locally as follows. Everything works and produces the expected results (an example of a successful job: cdse-j-260421124051497da99717d1a929472b):

bbox = {
    "west": 18.640943488750203,
    "south": 54.57342845938183,
    "east": 18.863083414279657,
    "north": 54.71972351389667,
    "crs": "EPSG:4326",
}
time_range = ["2025-01-01", "2025-05-31"]

mask_cube = build_water_land_mask_cube(
    con=con,
    bbox=bbox,
    time_range=time_range,
    max_cloud_coverage=1,
    iterations=2,
    ndwi_threshold=0.01,
)

waterlines = create_waterlines(mask_cube)

# Export result
result = waterlines.save_result(format="GeoJSON")
job = result.create_job(title="waterlines_py_PL")
job.start_and_wait()

However, when I try to execute the UDP from the openEO Web Editor with the same spatial extent, temporal extent, and other parameters, all my runs fail, indicating that the produced vector cube is empty (example of a failing job: cdse-j-2604211221524d169c4d4c082193cea6).

@EmileSonneveld

Collaborator

Hi, there are a few differences in the parameters of those two jobs. Can you provide an example that uses the same parameters for both?
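
One quick way to spot such differences is to diff the two parameter sets key by key. This is only a minimal sketch, assuming the parameters of each job are available as plain dictionaries. `diff_params` is a hypothetical helper, and the `udp_params` values are made up for illustration; only the `python_params` values come from the Python snippet in this thread.

```python
_MISSING = object()  # sentinel for keys that exist in only one of the two jobs


def diff_params(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key, _MISSING), b.get(key, _MISSING)
        if left != right:
            diff[key] = (
                None if left is _MISSING else left,
                None if right is _MISSING else right,
            )
    return diff


# Values used in the Python API run.
python_params = {"max_cloud_coverage": 1, "iterations": 2, "ndwi_threshold": 0.01}
# Hypothetical values, for illustration only (not taken from the real UDP job).
udp_params = {"max_cloud_coverage": 10, "iterations": 2, "ndwi_threshold": 0.01}

differences = diff_params(python_params, udp_params)
print(differences)
```

An empty dictionary would mean the two jobs were submitted with identical parameters.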

Collaborator

@EmileSonneveld EmileSonneveld Apr 21, 2026

I re-ran the UDP with more memory, and the job finished successfully:

  "driver-memory": "5G",
  "executor-memory": "5G",

Those are high values and need to be tweaked. There is a feature that lets the UDP itself specify the memory settings it needs.
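
For reference, passing these options from the Python client could look roughly like the sketch below. It assumes `result` is the object returned by `save_result(...)` and relies on the `job_options` argument of `create_job`. `run_with_memory` is a hypothetical helper name, not something from this PR.

```python
# Memory settings from the successful re-run; probably higher than necessary.
JOB_OPTIONS = {"driver-memory": "5G", "executor-memory": "5G"}


def run_with_memory(result, title: str, job_options: dict = JOB_OPTIONS):
    """Create a batch job with explicit job options and wait for it to finish."""
    # Copy the dict so the caller's (or the default) options are never mutated.
    job = result.create_job(title=title, job_options=dict(job_options))
    job.start_and_wait()
    return job
```

Usage would be something like `run_with_memory(result, "waterlines_py_PL")`, lowering the values step by step until the job starts failing again.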

Author

Hi Emile, I updated the comment with job IDs that have the same spatial and temporal extents (see the two attached screenshots).

I have since also pushed a commit so that jobs run via the UDP no longer fail, but they produce empty outputs.

Collaborator

This might be a more minimal way to make the UDP work out-of-the-box:

  "default_job_options": {
    "python-memory": "4g"
  },
  "parameters": ...

Running a test ATM: j-26042114324240f2be3f326f1dc0a736
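
If the UDP JSON is generated by a script, the option could be added at that point. A minimal sketch, assuming the UDP is handled as a plain dictionary before it is written out. `with_default_job_options` is a hypothetical helper, and the `udp` stub below is a placeholder, not the real process definition.

```python
import json


def with_default_job_options(udp: dict, options: dict) -> dict:
    """Return a copy of the UDP with `options` merged into `default_job_options`."""
    updated = dict(udp)
    updated["default_job_options"] = {**udp.get("default_job_options", {}), **options}
    return updated


# Stub UDP for illustration only; the real one carries parameters and a process graph.
udp = {"id": "waterlines", "parameters": [], "process_graph": {}}
patched = with_default_job_options(udp, {"python-memory": "4g"})
print(json.dumps(patched, indent=2))
```

Existing entries in `default_job_options` are kept, and the original dictionary is left untouched.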

Author

The expected result is a GeoJSON with LineString geometries. The following GeoJSON was produced when executing the job with exactly the same input parameters via the Python API (only a small part is copied here):

"type": "FeatureCollection",
"name": "vectorcube",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::3857" } },
"features": [
{ "type": "Feature", "properties": { "time": "2025-02-10T00:00:00Z~band0", "type": "waterline_segment", "sea_direction_8": "NE", "sea_azimuth_deg": 56.309932473328729 }, "geometry": { "type": "LineString", "coordinates": [ [ 2074615.562549961265177, 7292501.433900333940983 ], [ 2074668.986923298798501, 7292465.817651441320777 ] ] } },
{ "type": "Feature", "properties": { "time": "2025-02-10T00:00:00Z~band0", "type": "waterline_segment", "sea_direction_8": "E", "sea_azimuth_deg": 1.0416266759978612 }, "geometry": { "type": "LineString", "coordinates": [ [ 2074668.986923298798501, 7292465.817651441320777 ], [ 2074686.795047744410113, 7291486.37080692127347 ] ] } },
....
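
To compare the two outputs quickly, the features can be counted by geometry type. This is a minimal sketch, assuming the result has been loaded into a dictionary (for example with `json.load`). `geometry_type_counts` is a hypothetical helper, and the sample reuses the two features above with rounded coordinates and trimmed properties.

```python
from collections import Counter


def geometry_type_counts(feature_collection: dict) -> Counter:
    """Count features per geometry type; an empty Counter means an empty result."""
    return Counter(
        (feature.get("geometry") or {}).get("type", "missing")
        for feature in feature_collection.get("features", [])
    )


sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"sea_direction_8": "NE"},
         "geometry": {"type": "LineString",
                      "coordinates": [[2074615.56, 7292501.43], [2074668.99, 7292465.82]]}},
        {"type": "Feature", "properties": {"sea_direction_8": "E"},
         "geometry": {"type": "LineString",
                      "coordinates": [[2074668.99, 7292465.82], [2074686.80, 7291486.37]]}},
    ],
}

counts = geometry_type_counts(sample)
# An empty vector cube, as produced by the failing UDP runs, yields no counts at all.
is_empty = not geometry_type_counts({"type": "FeatureCollection", "features": []})
print(counts, is_empty)
```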

Collaborator

Ok, I can check further tomorrow...

Author

@EmileSonneveld have you had a chance to investigate this issue further?

Collaborator

I did not see the cause directly. I made a ticket here to be sure to not lose track of it:
Open-EO/openeo-geopyspark-driver#1655
Is there a deadline for this project?

Author

End of the month, so it is quite urgent.
