Skip to content

Conversation

@petern48
Copy link
Collaborator

@petern48 petern48 commented Sep 10, 2025

This PR fixes some bugs in the SpatialFilter used for pruning. The code now follows the same logic as used in the original Apache Sedona code here. Essentially, it uses a Covers SpatialFilter instead of a CoveredBy, and updates the code accordingly.

For example, st_covered_by function was modified in the following way

  • (old logic) "if (column, literal), use a CoveredBy filter, else if (literal, column, use a Intersects filter."
  • (new logic) "if (column, literal), use an Intersects filter, else if (literal column, use a Covers filter."

closes #49

@petern48
Copy link
Collaborator Author

petern48 commented Sep 10, 2025

Hi, it's me 🤠. I originally picked this up thinking this would be a simple one-liner fix. However, when I ran all of the predicates against geopandas output (the integration test), I saw there were more buggy predicate functions (e.g st_covers, st_contains, st_coveredby, st_within) than the original ones I expected.

Using new integration tests, I specifically found that the (column, literal) logic for coveredby was buggy (e.g CI failure). I honestly didn't think too hard about why that logic is buggy, but I figured it's easier and apparently more correct to switch to the Covers filter logic the original Apache Sedona code uses.

You're welcome to propose a different direction, if you can think of one.

@petern48 petern48 marked this pull request as ready for review September 11, 2025 00:47
Copy link
Member

@Kontinuation Kontinuation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the fix is correct.

Only contains/covers can be pushed down, not within/covered_by.

@petern48 petern48 changed the title bug: Fix SpatialFilter for st_covers and st_contains for correct pruning bug: Fix SpatialFilter for correct pruning when reading from geoparquet Sep 11, 2025
@jiayuasu jiayuasu merged commit ea07640 into apache:main Sep 11, 2025
11 checks passed
@petern48 petern48 deleted the fix_spatial_filter branch October 12, 2025 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reading GeoParquet with ST_Within() or ST_Contains() predicate produces incorrect results

3 participants