doc(geography): document the DuckDB GEOGRAPHY boundary design #168
Open
estebanzimanyi wants to merge 1 commit into
Adds doc/geography-boundary.md as the canonical write-up of how MobilityDuck represents geodetic geography values across the MEOS<->DuckDB columnar boundary.

Covers:

- The closed-algebra property in MEOS and why it doesn't survive the columnar boundary without a dedicated LogicalType.
- The GEOGRAPHY LogicalType registration: a BLOB alias carrying MEOS-WKB with the geodetic flag preserved in the type tag, with no dependence on a DuckDB upstream change or on a third-party duckdb-geography extension.
- The I/O surface (ST_GeogFromText, ST_AsText, ST_AsBinary, ST_GeogFromBinary), all thin shims over existing MEOS exports.
- The operation surface (length, area, eIntersects, etc.): every call delegates to a MEOS function that takes geodetic input and returns the correct type; DuckDB never sees a non-geodetic representation of a geodetic value during a computation.
- The complete inter-type cast matrix (GEOMETRY / GEOGRAPHY / TGEOGPOINT / TGEOMPOINT), mirroring the MobilityDB-on-Postgres surface.
- TemporalParquet round-trip preservation via the footer JSON's base_type / geodetic / srid fields.
- Pitfalls a binding implementation must avoid (using ST_GeomFromText to construct a GEOGRAPHY value, reusing DuckDB Spatial Cartesian functions on a GEOGRAPHY blob, stripping the geodetic flag in Parquet output, etc.).
- Current state of the implementation and the bounded pending work (~430 LoC, single PR) to register the LogicalType, the I/O UDFs, the casts, and the tests.

README updated with a single-paragraph pointer in the parity-gaps neighbourhood so adopters land here when looking for geography semantics on the DuckDB side.
Adds `doc/geography-boundary.md` as the canonical write-up of how MobilityDuck represents geodetic geography values across the MEOS ↔ DuckDB columnar boundary.

## Why this doc exists
Two layers of context are regularly conflated when geodetic results come out of a query:

- In MEOS, `geog_in`, `geog_area`, `eIntersects(geog, geog)`, `tgeog_length`, and `tgeog_speed` all take geodetic inputs, compute on the WGS-84 spheroid, and return properly typed geodetic results without leaving the MEOS C runtime.
- In DuckDB, the `spatial` extension exposes one logical type, `GEOMETRY`, that has no geodetic bit. The flag is at risk of being lost the moment a MEOS geography result becomes a DuckDB column value.

The doc explains the boundary-layer solution: register a `GEOGRAPHY` LogicalType in MobilityDuck (a BLOB alias whose payload is MEOS-WKB with the geodetic flag in the type tag), so the columnar engine carries the type information verbatim while every operation stays inside MEOS.

## What the doc covers
- A `GEOGRAPHY` LogicalType registration sketch (~10 LoC at registration time).
- The I/O surface: `ST_GeogFromText`, `ST_AsText`, `ST_AsBinary`, `ST_GeogFromBinary`, all thin shims over existing MEOS exports.
- The operation surface: `length`, `area`, `eIntersects`, etc.; every call delegates to a MEOS function that takes geodetic input and returns the correct type.
- The complete inter-type cast matrix (`GEOMETRY` / `GEOGRAPHY` / `TGEOGPOINT` / `TGEOMPOINT`), mirroring the MobilityDB-on-Postgres surface.
- Pitfalls a binding implementation must avoid (using `ST_GeomFromText` to construct a `GEOGRAPHY` value, reusing DuckDB Spatial Cartesian functions on a `GEOGRAPHY` BLOB, stripping the geodetic flag in Parquet output, etc.).

## Where it's linked
- `README.md`, near the parity-gaps pointer, so adopters land here when they look for geography semantics on the DuckDB side.
- `doc/multi-duckdb-version.md` and Discussion #913 (Temporal Data Lake RFC).

This is the doc the user asked for: "document all the DuckDB geography issue and solution properly so it can be widely available and findable in the documentation". The implementation (the ~430 LoC PR registering `GEOGRAPHY` + UDFs + casts + tests) is the natural next step.
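
To make the "Cartesian functions on a `GEOGRAPHY` BLOB" pitfall concrete, here is a minimal Python sketch of why losing the geodetic flag changes results. It is an illustration only and is not taken from MobilityDuck or MEOS: the helper names `planar_length` and `great_circle_length_m` are hypothetical, and the geodetic side uses a spherical haversine approximation with a mean Earth radius, whereas MEOS computes on the WGS-84 spheroid.

```python
import math

# Assumption: mean Earth radius for a spherical approximation.
# MEOS itself computes on the WGS-84 spheroid, so its results differ slightly.
EARTH_MEAN_RADIUS_M = 6_371_008.8


def planar_length(lon1, lat1, lon2, lat2):
    """Cartesian length that treats lon/lat degrees as plane coordinates."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def great_circle_length_m(lon1, lat1, lon2, lat2):
    """Haversine great-circle length in metres on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(a))


# One degree of longitude along latitude 60 N.
cartesian = planar_length(0.0, 60.0, 1.0, 60.0)          # unit: degrees
geodetic = great_circle_length_m(0.0, 60.0, 1.0, 60.0)   # unit: metres

print(f"Cartesian reading: {cartesian} (degrees, not a distance)")
print(f"Geodetic reading:  {geodetic:.0f} m")
```

The same two coordinates yield a unitless `1.0` under the Cartesian reading and roughly 55.6 km under the geodetic one, which is the kind of silent discrepancy a dedicated `GEOGRAPHY` type is meant to prevent.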