Contributor

@uros-db uros-db commented Nov 3, 2025

What changes were proposed in this pull request?

This PR adds rudimentary WKB read/write ST geospatial functions to the PySpark API, and registers the new Python functions in both PySpark and PySpark Connect.

Note that a similar framework was already implemented in Catalyst, as part of: #52784.

Why are the changes needed?

Establish a minimal ST function framework in the PySpark API, laying the foundation for expanding geospatial function support in the near future.

Does this PR introduce any user-facing change?

Yes, this PR introduces 3 new Python functions: st_asbinary, st_geogfromwkb, st_geomfromwkb.
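
For illustration, a minimal sketch of how the new functions might be used. The WKB encoder below relies only on the Python standard library. The helper `roundtrip_with_spark` is hypothetical: it assumes a PySpark build that includes this PR, and that `st_geomfromwkb` and `st_asbinary` each accept a single column or column name, which the PR description does not spell out.

```python
import struct


def point_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (byte order 1, geometry type 1)."""
    return struct.pack("<BIdd", 1, 1, x, y)


def roundtrip_with_spark(spark, wkb: bytes) -> bytes:
    """Hypothetical helper: parse WKB into a geometry, then serialize it back.

    Assumption: each function takes one column (or column name) argument,
    like most functions in pyspark.sql.functions.
    """
    from pyspark.sql import functions as sf  # imported lazily; needs PySpark

    df = spark.createDataFrame([(wkb,)], ["wkb"])
    row = df.select(sf.st_asbinary(sf.st_geomfromwkb("wkb")).alias("out")).head()
    return bytes(row["out"])


# Build the WKB for POINT(1 2); this part runs without Spark.
wkb = point_wkb(1.0, 2.0)
print(wkb.hex())
```

With an active `SparkSession` named `spark`, calling `roundtrip_with_spark(spark, wkb)` would be expected to return the same bytes, under the assumptions above. Swapping in `st_geogfromwkb` would exercise the geography variant in the same way.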

How was this patch tested?

Added new PySpark unit test suites:

  • test_functions

Was this patch authored or co-authored using generative AI tooling?

No.

@cloud-fan
Contributor

thanks, merging to master/4.1!

@cloud-fan cloud-fan closed this in 24110b6 Nov 3, 2025
cloud-fan pushed a commit that referenced this pull request Nov 3, 2025
…tions in PySpark

### What changes were proposed in this pull request?
This PR adds rudimentary WKB read/write ST geospatial functions in PySpark API, and registers the new Python functions in both PySpark and PySpark Connect.

Note that a similar framework was already implemented in Catalyst, as part of: #52784.

### Why are the changes needed?
Establish a minimal ST function framework in PySpark API, setting the foundations for expanding geospatial function support in the near future.

### Does this PR introduce _any_ user-facing change?
Yes, this PR introduces 3 new Python functions: `st_asbinary`, `st_geogfromwkb`, `st_geomfromwkb`.

### How was this patch tested?
Added new PySpark unit test suites:
- `test_functions`

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #52849 from uros-db/geo-python-functions.

Authored-by: Uros Bojanic <uros.bojanic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 24110b6)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
dongjoon-hyun pushed a commit that referenced this pull request Nov 6, 2025
…ST functions

### What changes were proposed in this pull request?
Re-enable Scala/Python parity check for ST geospatial functions in `test_function_parity`.

### Why are the changes needed?
The test was temporarily disabled in #52803, but the corresponding functions have been subsequently added on PySpark side as part of #52849.

### Does this PR introduce _any_ user-facing change?
Yes, casting `GEOGRAPHY(<srid>)` to `GEOGRAPHY(ANY)` is now allowed.

### How was this patch tested?
Existing tests suffice:
- `test_functions`

Closes #52907 from uros-db/geo-function_parity.

Authored-by: Uros Bojanic <uros.bojanic@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Nov 6, 2025
…ST functions

(cherry picked from commit b284a2c)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
…ST functions

huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025
…tions in PySpark

huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025
…ST functions
