-
Notifications
You must be signed in to change notification settings - Fork 30
Implement ST_IsValid using geos library #229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| None, | ||
| Some("POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))"), | ||
| Some("POLYGON ((0 0, 1 1, 0 1, 1 0, 0 0))"), | ||
| Some("LINESTRING (0 0, 1 1)"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Some("LINESTRING (0 0, 1 1)"), | |
| Some("LINESTRING (0 0, 1 1)"), | |
| Some("Polygon((0 0, 2 0, 1 1, 2 2, 0 2, 1 1, 0 0))"), |
Let's add self self-intersecting ring case. To here and the Python integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I've added this test case
| &f, | ||
| "geos", | ||
| "st_isvalid", | ||
| ArrayScalar(Polygon(10), Polygon(10)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ArrayScalar(Polygon(10), Polygon(10)), | |
| Polygon(10), |
These are like arguments to the function, since is_valid only takes one geometry, the value here is just a single Polygon. You can see st_centroid above as an example. This should fix the CI failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I should pay more attention, Is there like a way I could catch subtle errors like these before pushing? My current checks include, tests passing and pre-commit hook not failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! For benches, you can try running you can run cargo bench -- <name pattern> while being in the correct directory.
For this case, you need to be in cd c/sedona-geos/ and run cargo bench -- st_isvalid. This is documented here in the contributors guide (though tbh, they could be better since it doesn't explain how to interpret them).
Personally, I don't run these myself locally unless I'm actually working on optimizing performance. They generally work as long as you have the right number of args and arg types (I've never seen them fail for other reasons). It also takes over a minute just for one function.
| &f, | ||
| "geos", | ||
| "st_isvalid", | ||
| ArrayScalar(Polygon(10), Polygon(500)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ArrayScalar(Polygon(10), Polygon(500)), | |
| Polygon(500), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
paleolimbot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Just a few minor comments 🙂
python/sedonadb/pyproject.toml
Outdated
|
|
||
| [dependency-groups] | ||
| dev = [ | ||
| "ruff>=0.14.1", | ||
| ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove this change from this PR? I agree it's useful for most users to have ruff installed, but it doesn't necessarily have to be in your environment (pre-commit run -a will run it for you)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah, I've removed it, Currently using the command you suggested!
| @pytest.mark.parametrize("eng", [SedonaDB, PostGIS]) | ||
| @pytest.mark.parametrize( | ||
| ("geom", "expected"), | ||
| [ | ||
| (None, None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this particular case it might be nice to add a comment to the parameter combinations where this returns False to indicate why the example is invalid (I think they are mostly self-intersecting rings).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, Done that now
| ("LINESTRING (0 0, 1 1)", True), | ||
| ("LINESTRING EMPTY", True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ("LINESTRING (0 0, 1 1)", True), | |
| ("LINESTRING EMPTY", True), | |
| ("LINESTRING (0 0, 1 1)", True), | |
| ("LINESTRING (0 0, 1 1, 1 0, 0 1)", False), | |
| ("LINESTRING EMPTY", True), |
...I believe this will also be flagged as invalid because the linestring intersects itself (but check!). If this does indeed return false, adding a multilinestring version would also be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is considered valid, cause unlike polygons, self-intersection does not invalidate a LineString in most standard spatial models (like OGC SFA or ISO 19107) which I believe the geos library based their implementation upon
A LINESTRING is generally considered valid if it meets two main conditions, and the geometry you provided adheres to both:
- A LineString must have at least two distinct points, or be empty.
LINESTRING (0 0, 1 1, 1 0, 0 1)has four distinct points. - No two consecutive points can be identical (no "spikes"). The points in your LineString are distinct and non-consecutive.
petern48
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I learned a lot more about what is and isn't considered an invalid geometry in this PR. Another great PR. Thanks!
| # Inner ring touches the outer ring at a point | ||
| ( | ||
| "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0), (1 10, 1 9, 2 9, 2 10, 1 10))", | ||
| False, | ||
| ), | ||
| # Overlapping polygons in a multipolygon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super helpful comments! I didn't know about these invalid geometry cases until now.
paleolimbot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
This pull request references the discussions had in #224 Starting from the first, UnImplemented in the list, If I read correctly, there are integration tests available already, so I didn't write any integration tests, but if otherwise, please mention and I would implement the integration tests.
Edit: Going through the issue discussions again, I found out I misinterpreted the Integration tests system being set up, for the actual tests being available, I just pushed the integration tests for this function