[GH-3073] Add geography support to SedonaFlink measurement and output functions #3074
Pull request overview
Adds Geography parity for SedonaFlink’s measurement and output scalar functions by introducing Geography-typed eval overloads in the existing Flink UDF wrappers, delegating to org.apache.sedona.common.geography.Functions. This enables geodesic measurement/formatting on geography columns while keeping the same function names shared with geometry.
Changes:
- Added Geography overloads for ST_Area, ST_Length, ST_Distance, ST_Buffer (3 overloads), ST_Centroid, ST_Envelope (with splitAtAntiMeridian), ST_NPoints, ST_NumGeometries, ST_GeometryType, ST_AsText, and ST_AsEWKT.
- Added an end-to-end Flink Table API test suite (GeographyFunctionTest) covering the new Geography function paths and verifying geometry overload resolution still works.
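As a rough illustration of what a geodesic measurement in metres means for lon/lat input, the sketch below computes a great-circle distance with the haversine formula. It assumes a spherical Earth with a mean radius of 6,371,008.8 m. It is not the algorithm used by org.apache.sedona.common.geography.Functions, and the class and method names (GeodesicSketch, haversineMeters) are hypothetical.

```java
// Illustrative only: great-circle (haversine) distance on a sphere.
// Assumption: mean Earth radius; Sedona's geography functions may use a different model.
final class GeodesicSketch {
    static final double MEAN_EARTH_RADIUS_M = 6_371_008.8;

    // Distance in metres between two lon/lat points given in degrees.
    static double haversineMeters(double lon1, double lat1, double lon2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;
        // Clamp to 1.0 to guard against rounding just above the asin domain.
        return 2 * MEAN_EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // One degree of longitude at the equator versus at 60 degrees north.
        System.out.println(haversineMeters(0, 0, 1, 0));
        System.out.println(haversineMeters(0, 60, 1, 60));
    }
}
```

The same one-degree step in longitude covers roughly half the distance at 60° latitude, which is why a planar computation on raw degrees cannot stand in for a geodesic one.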
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| flink/src/main/java/org/apache/sedona/flink/expressions/Functions.java | Adds Geography eval overloads using RAW + GeographyTypeSerializer, delegating to common.geography.Functions. |
| flink/src/test/java/org/apache/sedona/flink/GeographyFunctionTest.java | Adds Table API integration tests for Geography measurement/output functions and shared-name overload resolution. |
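The idea of one function name serving two argument types can be sketched in plain Java. The example below uses reflection to pick an eval overload by the runtime class of the argument. This is only an analogy for the type-based resolution the PR relies on in Flink, not Flink's actual type inference, and Geom, Geog, StAsText, and call are hypothetical stand-ins rather than Sedona or Flink classes.

```java
import java.lang.reflect.Method;

// Analogy only: select an eval overload by argument type using plain reflection.
final class OverloadSketch {
    // Hypothetical stand-ins for a geometry type and a geography type.
    static final class Geom { final String wkt; Geom(String wkt) { this.wkt = wkt; } }
    static final class Geog { final String wkt; Geog(String wkt) { this.wkt = wkt; } }

    // One "function" exposing two eval overloads under the same name.
    static final class StAsText {
        public String eval(Geom g) { return "geometry:" + g.wkt; }
        public String eval(Geog g) { return "geography:" + g.wkt; }
    }

    // Finds the single-argument eval method whose parameter type accepts arg.
    static String call(Object fn, Object arg) {
        for (Method m : fn.getClass().getMethods()) {
            if (m.getName().equals("eval")
                    && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].isInstance(arg)) {
                try {
                    return (String) m.invoke(fn, arg);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException(
                "no eval overload for " + arg.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        StAsText fn = new StAsText();
        System.out.println(call(fn, new Geom("POINT (0 0)")));
        System.out.println(call(fn, new Geog("POINT (0 0)")));
    }
}
```

Because selection depends only on the argument type, adding the second overload does not require registering a second function name.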
    @Test
    public void testBuffer() throws Exception {
      String wkt = "POINT (0 0)";
      Object out = eval(wkt, call(Functions.ST_Buffer.class.getSimpleName(), $("geog"), lit(1000.0)));
      Geography expected =
          org.apache.sedona.common.geography.Functions.buffer(
              Constructors.geogFromWKT(wkt, 4326), 1000.0);
      assertEquals(expected.toEWKT(), ((Geography) out).toEWKT());
    }
Added testBufferWithParameters (the (Geography, radius, String) overload, asserted end-to-end against the common reference) and testBufferUseSpheroidThrows (the (Geography, radius, boolean) overload, asserting the clear IllegalArgumentException — which also confirms Flink resolves the boolean argument to this overload rather than the String one). Commit 779e766.
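An exception thrown inside a UDF may reach a test wrapped in one or more outer exceptions. Assuming that is the case here, a helper along the following lines could locate the IllegalArgumentException in the cause chain. This is a sketch of one possible assertion approach, not the code in commit 779e766, and CauseChainSketch and findCause are hypothetical names.

```java
// Sketch: walk a Throwable's cause chain looking for a given exception type.
final class CauseChainSketch {
    // Returns the first throwable in the chain that is an instance of type, or null.
    static <T extends Throwable> T findCause(Throwable thrown, Class<T> type) {
        Throwable t = thrown;
        // The depth limit guards against unusually long or cyclic cause chains.
        for (int depth = 0; t != null && depth < 64; depth++) {
            if (type.isInstance(t)) {
                return type.cast(t);
            }
            t = t.getCause();
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated wrapping; the messages here are made up for the example.
        Throwable wrapped = new RuntimeException("job failed",
                new IllegalStateException("udf failed",
                        new IllegalArgumentException("useSpheroid rejected")));
        IllegalArgumentException cause =
                findCause(wrapped, IllegalArgumentException.class);
        System.out.println(cause.getMessage());
    }
}
```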
    @Test
    public void testEnvelope() throws Exception {
      String wkt = "LINESTRING (0 0, 2 3)";
      Object out =
          eval(wkt, call(Functions.ST_Envelope.class.getSimpleName(), $("geog"), lit(false)));
      Geography expected =
          org.apache.sedona.common.geography.Functions.getEnvelope(
              Constructors.geogFromWKT(wkt, 4326), false);
      assertEquals(expected.toEWKT(), ((Geography) out).toEWKT());
    }
Added testEnvelopeSplitAtAntiMeridian using an antimeridian-crossing input (LINESTRING (170 10, -170 20)) with splitAtAntiMeridian=true, asserting the result matches the common reference and is a MULTIPOLYGON (the split path). Commit 779e766.
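To show why an antimeridian-crossing input yields two polygons, the sketch below splits a lon/lat bounding box at ±180° for a single two-point segment. It assumes the segment takes the shorter way around the globe and ignores the latitude bulge of geodesic edges. It is not Sedona's getEnvelope implementation, and AntimeridianSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split a two-point segment's bounding box at the antimeridian.
final class AntimeridianSketch {
    // The shorter path between two longitudes crosses ±180 when they differ by more than 180.
    static boolean crossesAntimeridian(double lon1, double lon2) {
        return Math.abs(lon2 - lon1) > 180.0;
    }

    // Returns boxes as {minLon, minLat, maxLon, maxLat}; two boxes when the segment crosses.
    static List<double[]> splitEnvelope(double lon1, double lat1, double lon2, double lat2) {
        double minLat = Math.min(lat1, lat2);
        double maxLat = Math.max(lat1, lat2);
        double lowLon = Math.min(lon1, lon2);
        double highLon = Math.max(lon1, lon2);
        List<double[]> boxes = new ArrayList<>();
        if (crossesAntimeridian(lon1, lon2)) {
            // Eastern-hemisphere part: from the larger longitude up to +180.
            boxes.add(new double[] {highLon, minLat, 180.0, maxLat});
            // Western-hemisphere part: from -180 up to the smaller longitude.
            boxes.add(new double[] {-180.0, minLat, lowLon, maxLat});
        } else {
            boxes.add(new double[] {lowLon, minLat, highLon, maxLat});
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Same endpoints as LINESTRING (170 10, -170 20).
        for (double[] box : splitEnvelope(170, 10, -170, 20)) {
            System.out.println(Arrays.toString(box));
        }
    }
}
```

For the endpoints (170 10) and (-170 20), the sketch produces one box from 170° to 180° and another from -180° to -170°, which matches the two-part shape the test expects from the split path.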
Did you read the Contributor Guide?
Is this PR related to a ticket?
[GH-XXX] my subject. Closes SedonaFlink: add geography support to measurement and output functions #3073.
What changes were proposed in this PR?
Part of the geography parity effort for SedonaFlink (#3054), building on the type serializer (#3058) and constructors (#3061).
This adds Geography support to the SedonaFlink measurement and output functions, so geography columns can be measured and formatted, not just constructed. Each of the following gains a Geography eval overload in flink/.../expressions/Functions.java, delegating to org.apache.sedona.common.geography.Functions:
- ST_Area (geodesic, m²)
- ST_Length (geodesic, m)
- ST_Distance (geodesic, m)
- ST_Buffer (geog, radius[, useSpheroid | parameters]; 3 overloads)
- ST_Centroid
- ST_Envelope (geog, splitAtAntiMeridian)
- ST_NPoints
- ST_NumGeometries
- ST_GeometryType
- ST_AsText
- ST_AsEWKT

Each overload uses @DataTypeHint(value = "RAW", rawSerializer = GeographyTypeSerializer.class, bridgedTo = Geography.class). Flink resolves geometry vs geography by the RAW bridgedTo type, the same mechanism ST_AsText already uses to support Box2D/Box3D, so a single registered function name serves both types and no new Catalog entries are required.
This mirrors Spark, where these functions accept either Geometry or Geography. Geography predicates (ST_Contains, ST_Intersects, ST_Within, ST_Equals, ST_DWithin) are tracked as a follow-up.
How was this patch tested?
Added GeographyFunctionTest (12 tests) exercising each function end-to-end through the Flink Table API, asserting against the common.geography.Functions reference values, plus testGeometryStillWorks confirming the geometry overload still resolves on the shared function. No regressions: FunctionTest (205 geometry tests), GeographyConstructorTest, GeographyTypeSerializerTest, and ModuleTest all pass.
Did this PR include necessary documentation updates?