[GH-908] Add ST_GeoHashNeighbors and ST_GeoHashNeighbor functions #2628
Force-pushed from 38ad3fa to 1f78010.
Pull request overview
Adds geohash-neighboring functionality across Sedona’s common library and exposes it through Spark, Flink, Snowflake, and Python APIs, with accompanying docs and tests.
Changes:
- Introduces `geohashNeighbors`/`geohashNeighbor` in the common module and wires them into Spark/Flink/Snowflake/Python function catalogs.
- Extends Spark’s inferred expression typing to support `Array[String]` round-tripping.
- Adds unit/functional tests and API documentation for the two new functions.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| spark/common/src/test/scala/org/apache/sedona/sql/functionTestScala.scala | Adds Spark SQL tests for the new geohash neighbor functions (including null handling). |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/st_functions.scala | Exposes new functions in the Spark Scala API (st_functions). |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/InferredExpression.scala | Adds inferred typing/serialization support for Array[String] results. |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Functions.scala | Adds Spark expression wrappers for the new functions. |
| spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala | Registers the new Spark SQL functions in the catalog. |
| snowflake/src/main/java/org/apache/sedona/snowflake/snowsql/UDFsV2.java | Adds Snowflake V2 UDF bindings for the new functions. |
| snowflake/src/main/java/org/apache/sedona/snowflake/snowsql/UDFs.java | Adds Snowflake UDF bindings for the new functions. |
| snowflake-tester/src/test/java/org/apache/sedona/snowflake/snowsql/TestFunctionsV2.java | Adds Snowflake V2 tests for the new functions. |
| snowflake-tester/src/test/java/org/apache/sedona/snowflake/snowsql/TestFunctions.java | Adds Snowflake tests for the new functions. |
| python/sedona/spark/sql/st_functions.py | Adds Python API wrappers + docstrings for the new functions. |
| flink/src/test/java/org/apache/sedona/flink/FunctionTest.java | Adds Flink SQL tests for neighbors + null handling. |
| flink/src/main/java/org/apache/sedona/flink/expressions/Functions.java | Adds Flink ScalarFunctions for the new geohash APIs. |
| flink/src/main/java/org/apache/sedona/flink/Catalog.java | Registers new functions in Flink’s catalog. |
| docs/api/sql/Function.md | Documents the new SQL functions (Spark). |
| docs/api/snowflake/vector-data/Function.md | Documents the new Snowflake functions. |
| docs/api/flink/Function.md | Documents the new Flink SQL functions. |
| common/src/test/java/org/apache/sedona/common/FunctionsTest.java | Adds common-module unit tests for neighbors and edge cases. |
| common/src/main/java/org/apache/sedona/common/utils/PointGeoHashEncoder.java | Refactors to reuse shared geohash constants. |
| common/src/main/java/org/apache/sedona/common/utils/GeoHashUtils.java | Introduces shared base32/bit constants + decode lookup table. |
| common/src/main/java/org/apache/sedona/common/utils/GeoHashNeighbor.java | Implements the neighbor computation algorithm. |
| common/src/main/java/org/apache/sedona/common/utils/GeoHashDecoder.java | Refactors to reuse shared geohash constants. |
| common/src/main/java/org/apache/sedona/common/Functions.java | Exposes the new neighbor functions from the common API. |
Resolved review threads:
- common/src/main/java/org/apache/sedona/common/utils/GeoHashNeighbor.java
- ...k/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/InferredExpression.scala (outdated)
- ...k/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/InferredExpression.scala (outdated)
- snowflake-tester/src/test/java/org/apache/sedona/snowflake/snowsql/TestFunctionsV2.java (outdated)
- snowflake-tester/src/test/java/org/apache/sedona/snowflake/snowsql/TestFunctions.java (outdated)
- spark/common/src/test/scala/org/apache/sedona/sql/functionTestScala.scala
Implements #908: Adds two geohash neighbor functions across all platforms:
- ST_GeoHashNeighbors(geohash): returns an array of the 8 neighboring geohash cells
- ST_GeoHashNeighbor(geohash, direction): returns the single neighbor in the given direction (n, ne, e, se, s, sw, w, nw, case-insensitive)

Algorithm ported from geohash-java (Apache 2.0 license).
Platforms: Spark SQL, Flink SQL, Snowflake, Python.
Added Array[String] support to InferredExpression.
Tests: common unit tests, Spark SQL tests, Flink tests.
Docs: Spark, Flink, Snowflake API documentation.
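To make the behavior described above concrete, here is a minimal Python sketch of geohash neighbor lookup. It is not the geohash-java port used in this PR: it swaps in a simpler technique that decodes the hash into integer row/column cell indices, steps one cell, and re-encodes. The names `geohash_neighbor` and `geohash_neighbors` are hypothetical, and two behaviors are assumptions of the sketch rather than confirmed Sedona semantics: the result order (n, ne, e, se, s, sw, w, nw) and returning `None` for a step past a pole.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
DECODE = {ch: i for i, ch in enumerate(BASE32)}

# (latitude step, longitude step) for each direction.
# The iteration order here is an assumption, not Sedona's documented order.
DIRECTIONS = {
    "n": (1, 0), "ne": (1, 1), "e": (0, 1), "se": (-1, 1),
    "s": (-1, 0), "sw": (-1, -1), "w": (0, -1), "nw": (1, -1),
}


def _to_cell(geohash):
    """Split the interleaved bits into integer (lat, lon) cell indices."""
    lat = lon = lat_bits = lon_bits = 0
    is_lon = True  # geohash bit streams start with a longitude bit
    for ch in geohash.lower():
        value = DECODE[ch]  # raises KeyError on an invalid character
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if is_lon:
                lon, lon_bits = (lon << 1) | bit, lon_bits + 1
            else:
                lat, lat_bits = (lat << 1) | bit, lat_bits + 1
            is_lon = not is_lon
    return lat, lon, lat_bits, lon_bits


def _from_cell(lat, lon, lat_bits, lon_bits):
    """Re-interleave the cell indices and emit base32 characters."""
    bits = []
    for i in range(lat_bits + lon_bits):
        if i % 2 == 0:  # even positions carry longitude bits
            lon_bits -= 1
            bits.append((lon >> lon_bits) & 1)
        else:
            lat_bits -= 1
            bits.append((lat >> lat_bits) & 1)
    chars = []
    for start in range(0, len(bits), 5):
        value = 0
        for bit in bits[start:start + 5]:
            value = (value << 1) | bit
        chars.append(BASE32[value])
    return "".join(chars)


def geohash_neighbor(geohash, direction):
    """Return the adjacent cell in `direction` (case-insensitive)."""
    if geohash is None or direction is None:
        return None
    d_lat, d_lon = DIRECTIONS[direction.lower()]  # KeyError if unknown
    lat, lon, lat_bits, lon_bits = _to_cell(geohash)
    new_lat = lat + d_lat
    if not 0 <= new_lat < (1 << lat_bits):
        return None  # sketch-only choice: no neighbor beyond a pole
    new_lon = (lon + d_lon) % (1 << lon_bits)  # wrap across the antimeridian
    return _from_cell(new_lat, new_lon, lat_bits, lon_bits)


def geohash_neighbors(geohash):
    """Return the 8 surrounding cells in DIRECTIONS order."""
    if geohash is None:
        return None
    return [geohash_neighbor(geohash, d) for d in DIRECTIONS]


print(geohash_neighbor("s", "E"))  # "t"
```

Working on integer cell indices keeps every neighbor at the same precision as the input and avoids floating-point boundary issues; the actual implementation in `GeoHashNeighbor.java` may handle poles and invalid input differently.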
Force-pushed from 1f78010 to a2179e5.
…eoHashNeighbor
- Add ST_GeoHashNeighbors and ST_GeoHashNeighbor tests to dataFrameAPITestScala
- Add Python SQL tests in test_function.py (neighbors array + neighbor direction)
- Add Python DataFrame API tests in test_dataframe_api.py (test_configurations + wrong_type_configurations)
Recent Python/uv versions no longer bundle setuptools by default, causing 'ModuleNotFoundError: No module named pkg_resources' in CI.
- Pin uv version to 0.9.22 in python.yml, pyflink.yml, docs.yml, python-wheel.yml to avoid GitHub API lookup failures (ECONNREFUSED)
- Add setuptools install after second uv sync in python.yml (the 'Run basic tests without rasterio' step)
setuptools (providing pkg_resources) is needed by keplergl and other dependencies at runtime. Adding it to [dependency-groups] dev ensures uv sync always includes it, eliminating the need for manual 'uv pip install setuptools' calls that could be undone by subsequent uv sync operations.
setuptools 82.0.0 removed pkg_resources entirely. keplergl still imports from pkg_resources at module level, causing ModuleNotFoundError during pytest collection. Pin setuptools>=69,<82 in dev deps and remove the now-unnecessary 'uv pip install setuptools' lines from CI workflows.
Pull request overview
Copilot reviewed 30 out of 31 changed files in this pull request and generated 4 comments.
Resolved review threads:
- common/src/main/java/org/apache/sedona/common/utils/GeoHashUtils.java (outdated)
- ...k/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/InferredExpression.scala (outdated)
- spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/st_functions.scala
Force-pushed from af0f466 to bc46f6a.
Did you read the Contributor Guide?
Is this PR related to a ticket?
Closes Geohash neighbours geohashes util #908
What changes were proposed in this PR?
Algorithm ported from geohash-java (Apache 2.0 license).
How was this patch tested?
Did this PR include necessary documentation updates?