Add Apache Sedona to known libraries #150

Merged
merged 2 commits into from
Dec 2, 2022

jiayuasu
Contributor

@jiayuasu jiayuasu commented Dec 2, 2022

Dear maintainers,

Thank you all for the great work on GeoParquet!

This PR adds Apache Sedona to the list of "known libraries that can read and write GeoParquet files".

Apache Sedona 1.3.0 implements basic GeoParquet read/write functionality:

  1. read the WKB column in GeoParquet: https://sedona.apache.org/tutorial/sql/#load-geoparquet
  2. write a table that has Geometry type to GeoParquet (with the column written in WKB format): https://sedona.apache.org/tutorial/sql/#save-geoparquet

Please feel free to let me know if there is anything I need to fix :-)

@kylebarron
Collaborator

If geometry column name is different

var df = sparkSession.read.format("geoparquet").option("fieldGeometry", "new_geometry").load(geoparquetdatalocation1)

I don't know anything about Sedona, but just wondering: are you able to read that from the Parquet file metadata instead of having the user specify it manually?

@jiayuasu
Contributor Author

jiayuasu commented Dec 2, 2022

@kylebarron We do read the metadata, but a binary-type column in a Parquet file could hold something other than WKB, so we leave this to the user:

Unless the column is explicitly named geometry, the user needs to tell us which column is the WKB column.

@cholmes
Member

cholmes commented Dec 2, 2022

Thanks @jiayuasu! Will merge it in.

Following up on Kyle's question: the GeoParquet spec stores the names of all the geometry columns in the file metadata as JSON, so it seems like you could just read them from there instead of having the user specify one. You can see an example of the metadata at https://github.com/opengeospatial/geoparquet/blob/main/examples/example_metadata.json
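
For illustration, here is a minimal sketch of how a reader could look up the geometry columns from the Parquet footer rather than asking the user. It is not Sedona's implementation. It assumes the parquet-hadoop and Jackson classes that ship with a typical Spark distribution are on the classpath, and the `GeoMetadata`, `GeoColumns`, `parseGeoColumns`, and `readGeoColumns` names are hypothetical. It relies only on the `geo` footer key and its `primary_column` and `columns` fields as described in the GeoParquet spec.

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Sedona or GeoParquet tooling.
object GeoMetadata {
  // Geometry column names declared in the "geo" metadata JSON.
  final case class GeoColumns(primary: String, all: Seq[String])

  // Parse the JSON string stored under the "geo" footer key.
  def parseGeoColumns(geoJson: String): GeoColumns = {
    val root = new ObjectMapper().readTree(geoJson)
    // path() returns a "missing" node instead of null when a field is absent,
    // so a malformed document yields an empty name and an empty list.
    val primary = root.path("primary_column").asText()
    val all = root.path("columns").fieldNames().asScala.toList
    GeoColumns(primary, all)
  }

  // Read the footer of one Parquet file; None means there is no "geo" key,
  // i.e. the file is plain Parquet and the caller must fall back to user input.
  def readGeoColumns(file: String,
                     conf: Configuration = new Configuration()): Option[GeoColumns] = {
    val reader = ParquetFileReader.open(HadoopInputFile.fromPath(new Path(file), conf))
    try {
      val keyValues = reader.getFooter.getFileMetaData.getKeyValueMetaData
      Option(keyValues.get("geo")).map(parseGeoColumns)
    } finally reader.close()
  }
}
```

With something like this, the `primary` name could be used as the default for the `fieldGeometry` option quoted earlier in this thread, keeping the manual option only as a fallback for files without `geo` metadata.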

For our linting
@cholmes cholmes merged commit f6983ab into opengeospatial:main Dec 2, 2022
@jiayuasu
Contributor Author

jiayuasu commented Dec 3, 2022

@cholmes Got it! You made a good point. I believe Sedona should improve the reader/writer to leverage the metadata. We will make it happen. Thanks!
