@szehon-ho
Member

@szehon-ho szehon-ho commented Nov 26, 2025

What changes were proposed in this pull request?

Add Geometry toWKB and fromWKB methods. These serialize and deserialize Geometry objects to and from the WKB (Well-Known Binary) standard format.

We implement our own reader/writer to avoid a compile-time dependency on JTS in Spark.
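
For readers unfamiliar with the format, here is a minimal sketch of a dependency-free WKB round trip for a single 2D point. It is only an illustration: the class and method names (`WkbPointSketch`, `toWkb`, `fromWkb`, `toHex`) are hypothetical and are not the API added in this PR, which covers more geometry types than this sketch does. A WKB point is a 1-byte byte-order flag (0 = big-endian, 1 = little-endian), a 4-byte geometry type code (1 = Point), and two 8-byte doubles.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the reader/writer added in this PR.
final class WkbPointSketch {
  private static final int WKB_POINT = 1;   // WKB geometry type code for Point
  private static final int POINT_SIZE = 21; // 1 (flag) + 4 (type) + 8 (x) + 8 (y)

  /** Encodes a 2D point as little-endian WKB. */
  static byte[] toWkb(double x, double y) {
    ByteBuffer buf = ByteBuffer.allocate(POINT_SIZE).order(ByteOrder.LITTLE_ENDIAN);
    buf.put((byte) 1);      // byte-order flag: 1 = little-endian
    buf.putInt(WKB_POINT);  // geometry type
    buf.putDouble(x);
    buf.putDouble(y);
    return buf.array();
  }

  /** Decodes a 2D WKB point in either byte order and returns {x, y}. */
  static double[] fromWkb(byte[] wkb) {
    if (wkb.length != POINT_SIZE) {
      throw new IllegalArgumentException("Expected " + POINT_SIZE + " bytes, got " + wkb.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(wkb);
    byte flag = buf.get();
    if (flag != 0 && flag != 1) {
      throw new IllegalArgumentException("Invalid byte-order flag: " + flag);
    }
    // The flag decides how the remaining integers and doubles are read.
    buf.order(flag == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
    int type = buf.getInt();
    if (type != WKB_POINT) {
      throw new IllegalArgumentException("Not a WKB Point, type code: " + type);
    }
    double x = buf.getDouble();
    double y = buf.getDouble();
    return new double[] {x, y};
  }

  /** Renders bytes as an uppercase hex string, the form commonly used in tests. */
  static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }
}
```

With this sketch, `toHex(toWkb(1.0, 2.0))` yields `0101000000000000000000F03F0000000000000040`, and passing those bytes to `fromWkb` returns the original coordinates.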

Why are the changes needed?

To read geometries stored as WKB in external data sources such as Iceberg, and to provide utilities for users who want to convert to and from WKB.

Does this PR introduce any user-facing change?

No, the feature is not enabled yet.

How was this patch tested?

Unit tests

Was this patch authored or co-authored using generative AI tooling?

Yes, Claude 4.5 Sonnet.

@github-actions github-actions bot added the SQL label Nov 26, 2025
Member

@dongjoon-hyun dongjoon-hyun left a comment

This is for Apache Spark 4.2.0 only, right?

To be clear, let me hold this PR for a little while, @szehon-ho.

@szehon-ho
Member Author

Yes, it is for 4.2; the geo type seems to be disabled in 4.1.

@github-actions github-actions bot added the BUILD label Nov 27, 2025
files="src/main/java/org/apache/spark/examples/JavaLogQuery.java"/>
<suppress checks="LineLength"
files="src/main/java/org/apache/hive/service/*"/>
<suppress checks="LineLength"
Member Author

This makes it easier to see WKB hex strings on one line.
