[SEDONA-511] Fix reading/writing geoparquet metadata for snake_case or camelCase column names #1270

Kontinuation · 2024-03-06T08:13:30Z

Did you read the Contributor Guide?

Yes, I have read Contributor Rules and Contributor Development Guide

Is this PR related to a JIRA ticket?

Yes, the URL of the associated JIRA ticket is https://issues.apache.org/jira/browse/SEDONA-511. The PR name follows the format [SEDONA-XXX] my subject.

What changes were proposed in this PR?

The current GeoParquet implementation converts keys in the geo metadata between camelCase and snake_case during parsing and serialization. Geometry column names get converted along with the other keys, so the metadata can end up inconsistent with the schema of the Parquet file. This patch resolves the issue by skipping the style conversion for column names. Geometry column names such as geom_column or geomColumn should now work correctly.
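To illustrate the kind of inconsistency described above, here is a minimal Python sketch. It is not the Sedona code (the actual change is in the Scala data source); the helper names `to_camel`, `camelize_all`, and `camelize_geo_metadata` are hypothetical, and the sample metadata only assumes the `primary_column` and `columns` fields from the GeoParquet specification. It shows one direction only (snake_case to camelCase); the same reasoning would apply in reverse to a column named `geomColumn`.

```python
import re


def to_camel(key: str) -> str:
    """Convert a snake_case key to camelCase (primary_column -> primaryColumn)."""
    return re.sub(r"_([a-z0-9])", lambda m: m.group(1).upper(), key)


def camelize_all(node):
    """Naive conversion: rewrites every dict key, geometry column names included."""
    if isinstance(node, dict):
        return {to_camel(k): camelize_all(v) for k, v in node.items()}
    if isinstance(node, list):
        return [camelize_all(v) for v in node]
    return node


def camelize_geo_metadata(geo: dict) -> dict:
    """Hypothetical fix: convert metadata keys but keep column names verbatim."""
    out = {}
    for key, value in geo.items():
        if key == "columns":
            # Keys of this map are column names from the file schema, not
            # metadata field names, so only the per-column metadata is converted.
            out["columns"] = {name: camelize_all(meta) for name, meta in value.items()}
        else:
            out[to_camel(key)] = camelize_all(value)
    return out


# Sample "geo" metadata with a snake_case geometry column (illustrative only).
geo = {
    "version": "1.0.0",
    "primary_column": "geom_column",
    "columns": {
        "geom_column": {"encoding": "WKB", "geometry_types": ["Point"]},
    },
}

naive = camelize_all(geo)
fixed = camelize_geo_metadata(geo)

# Naive conversion renames the column key, so it no longer matches either the
# schema or the primary column value, which is a string value and is untouched.
print(list(naive["columns"]), naive["primaryColumn"])
# Skipping column names keeps the key aligned with the schema.
print(list(fixed["columns"]), fixed["primaryColumn"])
```

In the naive result the key under `columns` becomes `geomColumn` while the primary column value stays `geom_column`, which is the mismatch this PR is meant to avoid.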

How was this patch tested?

Added tests for the geoparquet and geoparquet.metadata data sources.

Did this PR include necessary documentation updates?

No. This PR does not affect any public API, so the docs do not need to change.

Kontinuation added 2 commits March 6, 2024 15:55

Fix geoparquet metadata for snake_case and camelCase geometry column names

78c1e23

Apply the change to Spark 3.4 and 3.5

98728cb

Kontinuation marked this pull request as ready for review March 6, 2024 08:13

jiayuasu added bug sedona-spark labels Mar 6, 2024

jiayuasu added this to the sedona-1.6.0 milestone Mar 6, 2024

Fix binary compatibility issue for Spark 3.0.x

fa55691

jiayuasu approved these changes Mar 6, 2024

jiayuasu merged commit fea229f into apache:master Mar 6, 2024
48 checks passed
