Contributor

@Abeeujah Abeeujah commented Oct 18, 2025

This pull request follows up on the discussion in #224, starting with the first unimplemented function in the list. If I read correctly, integration tests are already available, so I didn't write any. If that's not the case, please let me know and I will add them.

Edit: Going through the issue discussion again, I realized I had mistaken the integration test system being set up for the actual tests being available. I have just pushed the integration tests for this function.

None,
Some("POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))"),
Some("POLYGON ((0 0, 1 1, 0 1, 1 0, 0 0))"),
Some("LINESTRING (0 0, 1 1)"),
Collaborator

Suggested change
- Some("LINESTRING (0 0, 1 1)"),
+ Some("LINESTRING (0 0, 1 1)"),
+ Some("Polygon((0 0, 2 0, 1 1, 2 2, 0 2, 1 1, 0 0))"),

Let's add a self-intersecting ring case, both here and in the Python integration tests.

Contributor Author

Yeah, I've added this test case
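
To illustrate why the ring cases in this thread are invalid, here is a minimal pure-Python sketch that checks a closed ring for self-intersection by testing every pair of non-adjacent segments. The function names are hypothetical, and this is not how GEOS implements its validity check (GEOS covers many more rules and is far more efficient); it only shows the idea behind a "self-intersecting ring".

```python
from itertools import combinations

Point = tuple[float, float]


def orientation(a: Point, b: Point, c: Point) -> int:
    """Return 1 if a->b->c turns left, -1 if it turns right, 0 if collinear."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def within_box(a: Point, b: Point, p: Point) -> bool:
    """True if p lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_intersect(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segments p1-p2 and q1-q2 share at least one point."""
    o1, o2 = orientation(p1, p2, q1), orientation(p1, p2, q2)
    o3, o4 = orientation(q1, q2, p1), orientation(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other segment.
    return ((o1 == 0 and within_box(p1, p2, q1))
            or (o2 == 0 and within_box(p1, p2, q2))
            or (o3 == 0 and within_box(q1, q2, p1))
            or (o4 == 0 and within_box(q1, q2, p2)))


def ring_self_intersects(ring: list[Point]) -> bool:
    """Check a closed ring (first point == last point) for self-intersection."""
    segments = list(zip(ring, ring[1:]))
    n = len(segments)
    for i, j in combinations(range(n), 2):
        # Neighbouring segments always share an endpoint, so skip them.
        if j == i + 1 or (i == 0 and j == n - 1):
            continue
        if segments_intersect(*segments[i], *segments[j]):
            return True
    return False


# Rings taken from the WKT strings quoted in this thread.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
bowtie = [(0, 0), (1, 1), (0, 1), (1, 0), (0, 0)]
figure_eight = [(0, 0), (2, 0), (1, 1), (2, 2), (0, 2), (1, 1), (0, 0)]
```

The square passes the check, the bow-tie fails because two edges cross, and the suggested figure-eight ring fails because it passes through (1 1) twice.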

&f,
"geos",
"st_isvalid",
ArrayScalar(Polygon(10), Polygon(10)),
Collaborator

@petern48 petern48 Oct 18, 2025

Suggested change
- ArrayScalar(Polygon(10), Polygon(10)),
+ Polygon(10),

These act as the arguments to the function. Since st_isvalid takes only one geometry, the value here is just a single Polygon. You can see st_centroid above as an example. This should fix the CI failure.

Contributor Author

I guess I should pay more attention. Is there a way I could catch subtle errors like these before pushing? My current checks are that the tests pass and the pre-commit hook doesn't fail.

Collaborator

Yep! For benches, you can run cargo bench -- <name pattern> from the correct directory.

For this case, you need to be in c/sedona-geos/ and run cargo bench -- st_isvalid. This is documented in the contributors guide (though tbh, the docs could be better, since they don't explain how to interpret the results).

Personally, I don't run these myself locally unless I'm actually working on optimizing performance. They generally work as long as you have the right number of args and arg types (I've never seen them fail for other reasons). It also takes over a minute just for one function.

&f,
"geos",
"st_isvalid",
ArrayScalar(Polygon(10), Polygon(500)),
Collaborator

Suggested change
- ArrayScalar(Polygon(10), Polygon(500)),
+ Polygon(500),

Contributor Author

Thank you!

Member

@paleolimbot paleolimbot left a comment

Thank you! Just a few minor comments 🙂

Comment on lines 53 to 57

[dependency-groups]
dev = [
"ruff>=0.14.1",
]
Member

Can you remove this change from this PR? I agree it's useful for most users to have ruff installed, but it doesn't necessarily have to be in your environment (pre-commit run -a will run it for you)

Contributor Author

Oh, yeah, I've removed it. I'm currently using the command you suggested!

Comment on lines +214 to +218
@pytest.mark.parametrize("eng", [SedonaDB, PostGIS])
@pytest.mark.parametrize(
("geom", "expected"),
[
(None, None),
Member

In this particular case it might be nice to add a comment to the parameter combinations where this returns False to indicate why the example is invalid (I think they are mostly self-intersecting rings).

Contributor Author

Yeah, done that now.
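
As a rough sketch of the suggestion above, a parameter table can carry a short comment next to each case that is expected to return False. The list name and helper below are hypothetical; the actual test in this PR uses pytest.mark.parametrize with ("geom", "expected"). The expected values for the two invalid polygons are my reading of the standard ring validity rules, not output copied from the PR.

```python
# Hypothetical stand-in for the ("geom", "expected") parameter list.
IS_VALID_CASES = [
    (None, None),
    ("POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))", True),
    # Invalid: bow-tie, two edges of the ring cross each other
    ("POLYGON ((0 0, 1 1, 0 1, 1 0, 0 0))", False),
    # Invalid: the ring passes through (1 1) twice
    ("POLYGON ((0 0, 2 0, 1 1, 2 2, 0 2, 1 1, 0 0))", False),
    ("LINESTRING (0 0, 1 1)", True),
    ("LINESTRING EMPTY", True),
]


def invalid_cases(cases):
    """Return the WKT strings that are expected to be invalid."""
    return [wkt for wkt, expected in cases if expected is False]
```

Keeping the reason directly above each False case means a reader does not have to plot the coordinates to see what the case exercises.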

Comment on lines 221 to 222
("LINESTRING (0 0, 1 1)", True),
("LINESTRING EMPTY", True),
Member

Suggested change
- ("LINESTRING (0 0, 1 1)", True),
- ("LINESTRING EMPTY", True),
+ ("LINESTRING (0 0, 1 1)", True),
+ ("LINESTRING (0 0, 1 1, 1 0, 0 1)", False),
+ ("LINESTRING EMPTY", True),

...I believe this will also be flagged as invalid because the linestring intersects itself (but check!). If this does indeed return false, adding a multilinestring version would also be a good idea.

Contributor Author

This is considered valid because, unlike polygons, self-intersection does not invalidate a LineString in most standard spatial models (such as OGC SFA or ISO 19107), which I believe the GEOS library bases its implementation on.

A LINESTRING is generally considered valid if it meets two main conditions, and the geometry you provided satisfies both:

  1. A LineString must have at least two distinct points, or be empty. LINESTRING (0 0, 1 1, 1 0, 0 1) has four distinct points.
  2. No two consecutive points can be identical (no "spikes"). All four points in your LineString are distinct, so no consecutive pair coincides.
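
The first condition above can be sketched in a few lines of plain Python. This is only an illustration with a hypothetical function name, not the GEOS implementation, and it deliberately encodes just the "empty, or at least two distinct points" rule. Note that it never looks at whether segments cross: self-intersection affects whether a LineString is simple, which is a separate property from validity.

```python
def linestring_has_enough_points(coords):
    """Return True if the coordinate list is empty or has >= 2 distinct points."""
    return len(coords) == 0 or len(set(coords)) >= 2


# The self-crossing LineString from the suggestion above.
crossing = [(0, 0), (1, 1), (1, 0), (0, 1)]
# A degenerate LineString whose two points coincide.
degenerate = [(0, 0), (0, 0)]
```

Under this rule the self-crossing LineString passes, while a LineString whose points all coincide does not.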

@Abeeujah Abeeujah requested a review from paleolimbot October 19, 2025 11:16
Collaborator

@petern48 petern48 left a comment

I learned a lot more about what is and isn't considered an invalid geometry in this PR. Another great PR. Thanks!

Comment on lines +247 to +252
# Inner ring shares part of an edge with the outer ring (a single-point touch would be valid)
(
"POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0), (1 10, 1 9, 2 9, 2 10, 1 10))",
False,
),
# Overlapping polygons in a multipolygon
Collaborator

Super helpful comments! I didn't know about these invalid geometry cases until now.

Member

@paleolimbot paleolimbot left a comment

Thank you!

@paleolimbot paleolimbot merged commit 6ba858b into apache:main Oct 20, 2025
12 checks passed
@Abeeujah Abeeujah deleted the st-isvalid branch November 12, 2025 06:08