Skip to content

feat(c/sedona-geos): Adds ST_Normalize(geometry) in c/sedona-geos using GEOS#802

Merged
paleolimbot merged 1 commit into
apache:mainfrom
oglego:feat/implement-st-normalize
May 5, 2026
Merged

feat(c/sedona-geos): Adds ST_Normalize(geometry) in c/sedona-geos using GEOS#802
paleolimbot merged 1 commit into
apache:mainfrom
oglego:feat/implement-st-normalize

Conversation

@oglego
Copy link
Copy Markdown
Contributor

@oglego oglego commented Apr 30, 2026

Pull Request: Implement ST_Normalize

Description

This PR introduces the ST_Normalize function, leveraging the GEOS library. ST_Normalize returns a geometry in its normalized (canonical) form, ensuring consistent representation across different processing steps.

Key Changes

  • GEOS Integration: Implemented the core normalization logic using GEOS.
  • Rust Unit Tests
  • Python Integration Tests
  • PostGIS Compatibility: Specific integration tests added to document and verify behavior regarding M and ZM coordinates, matching PostGIS expectations.

Tracking

This work is part of the following development tracks:

feat(c/sedona-geos): remove boilerplate created from st_area

add first iteration of ST_Normalize kernel and tests

feat(c/sedona-geos): align ST_Normalize implementation with
existing GEOS UDFs

feat(c/sedona-geos): update st_normalize to use Clone trait

feat(c/sedona-geos): add python integration tests

feat(c/sedona-geos): add additional tests for ST_Normalize

feat(c/sedona-geos): add additional tests for st_normalize

feat(c/sedona-geos): fix accidental overwrite on numgeometries

feat(c/sedona-geos): add documentation for st_normalize
@github-actions github-actions Bot requested a review from zhangfengcdt April 30, 2026 12:22
Copy link
Copy Markdown
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

It's worth seeing if the slightly more compact version of the test works (in theory you should be able to put WKT in the expected slot but there are some quirks).

Comment on lines +2373 to +2387
if isinstance(eng, PostGIS) and expected is not None:
# Normalize expected WKT to PostGIS's compact ST_AsText formatting.
expected = expected.replace(", ", ",")
expected = expected.replace(" (", "(")
expected = expected.replace(r"ZM(", r"ZM (")
expected = expected.replace(r"M(", r"M (")
expected = expected.replace(r"Z(", r"Z (")

if isinstance(eng, SedonaDB) and expected is not None:
expected = expected.replace(", ", ",")
expected = expected.replace(" (", "(")

eng.assert_query_result(
f"SELECT ST_AsText(ST_Normalize({geom_or_null(geom)}))", expected
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work:

Suggested change
if isinstance(eng, PostGIS) and expected is not None:
# Normalize expected WKT to PostGIS's compact ST_AsText formatting.
expected = expected.replace(", ", ",")
expected = expected.replace(" (", "(")
expected = expected.replace(r"ZM(", r"ZM (")
expected = expected.replace(r"M(", r"M (")
expected = expected.replace(r"Z(", r"Z (")
if isinstance(eng, SedonaDB) and expected is not None:
expected = expected.replace(", ", ",")
expected = expected.replace(" (", "(")
eng.assert_query_result(
f"SELECT ST_AsText(ST_Normalize({geom_or_null(geom)}))", expected
)
eng.assert_query_result(
f"SELECT ST_Normalize({geom_or_null(geom)})", expected
)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing this, I really appreciate it! I tried the compact version, but it led to formatting-only failures in the Python harness, for example one of the diffs I ran into is:

E AssertionError: Expected:
E [('POLYGON Z ((0 0 5,0 1 5,1 1 5,1 0 5,0 0 5))',)]
E Got:
E [('POLYGON Z ((0 0 5, 0 1 5, 1 1 5, 1 0 5, 0 0 5))',)]

Without using ST_AsText, the failures seem to come from string formatting in the test harness rather than from ST_Normalize behavior itself.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! I will open a follow up to see if we can improve this at some point since at least in theory the testing module is supposed to deal with that 😬

Copy link
Copy Markdown
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Comment on lines +2373 to +2387
if isinstance(eng, PostGIS) and expected is not None:
# Normalize expected WKT to PostGIS's compact ST_AsText formatting.
expected = expected.replace(", ", ",")
expected = expected.replace(" (", "(")
expected = expected.replace(r"ZM(", r"ZM (")
expected = expected.replace(r"M(", r"M (")
expected = expected.replace(r"Z(", r"Z (")

if isinstance(eng, SedonaDB) and expected is not None:
expected = expected.replace(", ", ",")
expected = expected.replace(" (", "(")

eng.assert_query_result(
f"SELECT ST_AsText(ST_Normalize({geom_or_null(geom)}))", expected
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! I will open a follow up to see if we can improve this at some point since at least in theory the testing module is supposed to deal with that 😬

@paleolimbot paleolimbot merged commit 712672b into apache:main May 5, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants