[GH-2983] Box3D SQL parser keyword + Geometry→Box3D cast resolution #3016
Merged
…tion

Adds the parser keyword and analyzer rule that let `CAST(... AS box3d)` resolve, mirroring the Box2D parser/cast support landed in apache#2927.

- `SedonaSqlAstBuilder.visitPrimitiveDataType` recognises `BOX3D` alongside the existing `GEOMETRY` and `BOX2D` keywords, returning `Box3DUDT`. Applied uniformly to the spark-3.4 / 3.5 / 4.0 / 4.1 parser variants.
- `Box3DCastResolutionRule` rewrites `Cast(geom, Box3DUDT)` → `ST_Box3D(geom)` during analysis (before `CheckAnalysis`), since Spark's stock `Cast.canCast` refuses arbitrary UDT-to-UDT casts. Registered from `SedonaSqlExtensions.injectResolutionRule`.
- Only the forward direction (`CAST(geom AS box3d)`) is wired. The inverse cast (`CAST(box3d AS geometry)`) is deferred until Box3D has an `ST_GeomFromBox3D` counterpart driven by a concrete consumer. Box2D shipped both directions because `ST_GeomFromBox2D` already existed, which is not the case here.

Tests:

- `Box3DCastResolutionRuleSuite` (rule unit test): covers the forward rewrite, leaves the inverse Cast untouched (out of scope), and leaves unrelated casts alone.
- `Box3DCastSuite` per spark-3.4 / 3.5 / 4.0 / 4.1: DataFrame `.cast(Box3DUDT)` plus SQL `CAST(... AS box3d)` end-to-end, gated on the parser-extension probe (matches `Box2DCastSuite`'s pattern across the same four matrix cells).
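The forward-only rewrite, and the three cases the rule unit test covers, can be sketched with a toy expression tree. This is a minimal, self-contained model and not Sedona's code: `Expr`, `Column`, `Cast`, `StBox3D`, and the `DataType` case objects are hypothetical stand-ins for Catalyst's expression classes, `GeometryUDT`, `Box3DUDT`, and `ST_Box3D`, and the sketch assumes the real rule decides on the child's type and the cast target in roughly this way.

```scala
// Toy model of the Geometry -> Box3D cast rewrite. Every name below is a
// hypothetical stand-in for a Catalyst/Sedona class, not the real API.
object Box3DCastSketch {
  sealed trait DataType
  case object GeometryType extends DataType // stands in for GeometryUDT
  case object Box3DType extends DataType    // stands in for Box3DUDT
  case object StringType extends DataType   // an unrelated cast target

  sealed trait Expr { def dataType: DataType }
  case class Column(name: String, dataType: DataType) extends Expr
  case class Cast(child: Expr, dataType: DataType) extends Expr
  case class StBox3D(child: Expr) extends Expr { // stands in for ST_Box3D
    val dataType: DataType = Box3DType
  }

  // Walk the tree bottom-up. Only Cast(geometry, box3d) is rewritten;
  // the inverse cast and unrelated casts are returned unchanged.
  def resolve(expr: Expr): Expr = expr match {
    case Cast(child, target) =>
      val resolvedChild = resolve(child)
      if (resolvedChild.dataType == GeometryType && target == Box3DType)
        StBox3D(resolvedChild)
      else
        Cast(resolvedChild, target)
    case StBox3D(child) => StBox3D(resolve(child))
    case column: Column => column
  }

  def main(args: Array[String]): Unit = {
    println(resolve(Cast(Column("geom", GeometryType), Box3DType))) // forward: rewritten
    println(resolve(Cast(Column("box", Box3DType), GeometryType)))  // inverse: untouched
    println(resolve(Cast(Column("geom", GeometryType), StringType))) // unrelated: untouched
  }
}
```

In this model the inverse cast survives as a plain `Cast` node, which corresponds to the commit message's point that the reverse direction is out of scope for the rule.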
20e9cba to 231c335
Pull request overview
Adds end-to-end Spark SQL support for `CAST(... AS box3d)` by (1) teaching Sedona’s Spark SQL parser to recognize `BOX3D` as a primitive type and (2) injecting an analyzer resolution rule that rewrites Geometry→Box3D Catalyst `Cast` nodes into `ST_Box3D(...)` (mirroring the existing Box2D behavior). This fits into Sedona’s Spark SQL type system by making Box3D usable via standard SQL casting syntax across the Spark 3.4/3.5/4.0/4.1 build matrix.
Changes:
- Extend `SedonaSqlAstBuilder.visitPrimitiveDataType` to parse `BOX3D` and produce `Box3DUDT` in Spark 3.4/3.5/4.0/4.1 variants.
- Add `Box3DCastResolutionRule` and register it via `SedonaSqlExtensions.injectResolutionRule` to resolve `CAST(geom AS box3d)` by rewriting to `ST_Box3D(geom)`.
- Add unit + integration test coverage for the rule and for DataFrame/SQL cast behavior across Spark versions (with SQL tests gated by the parser-extension probe).
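The keyword handling in the first change can be pictured as a lookup that runs before the stock type parser. The following is a minimal sketch, assuming the match is case-insensitive and that unrecognised names are handed back to Spark's default parser; `SedonaType`, `parseSedonaType`, and the case objects are hypothetical names for illustration, not Sedona's API.

```scala
import java.util.Locale

// Toy model of the type-keyword lookup. All names are hypothetical
// stand-ins; the real logic lives in visitPrimitiveDataType.
object Box3DKeywordSketch {
  sealed trait SedonaType
  case object GeometryT extends SedonaType // stands in for GeometryUDT
  case object Box2DT extends SedonaType    // stands in for Box2DUDT
  case object Box3DT extends SedonaType    // stands in for Box3DUDT

  // Some(type) for a Sedona keyword; None means "defer to the default parser".
  def parseSedonaType(typeName: String): Option[SedonaType] =
    typeName.toUpperCase(Locale.ROOT) match {
      case "GEOMETRY" => Some(GeometryT)
      case "BOX2D"    => Some(Box2DT)
      case "BOX3D"    => Some(Box3DT)
      case _          => None
    }

  def main(args: Array[String]): Unit = {
    println(parseSedonaType("box3d"))  // Sedona keyword, any letter case
    println(parseSedonaType("BIGINT")) // not a Sedona keyword
  }
}
```

Returning `None` for everything else keeps the extension additive in this model: only the three Sedona keywords are intercepted, and all other type names take the normal path.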
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/optimization/Box3DCastResolutionRule.scala | New analyzer rule to rewrite Geometry→Box3D casts to ST_Box3D during analysis. |
| spark/common/src/main/scala/org/apache/sedona/sql/SedonaSqlExtensions.scala | Registers the new Box3D cast resolution rule in Sedona’s SparkSession extensions. |
| spark/common/src/test/scala/org/apache/sedona/sql/Box3DCastResolutionRuleSuite.scala | Unit tests verifying the rewrite (and that the inverse cast remains untouched). |
| spark/spark-3.4/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 3.4 parser extension. |
| spark/spark-3.5/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 3.5 parser extension. |
| spark/spark-4.0/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 4.0 parser extension. |
| spark/spark-4.1/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 4.1 parser extension. |
| spark/spark-3.4/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 3.4 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
| spark/spark-3.5/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 3.5 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
| spark/spark-4.0/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 4.0 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
| spark/spark-4.1/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 4.1 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
Did you read the Contributor Guide?
Is this PR related to a ticket?
What changes were proposed in this PR?
Adds the parser keyword and analyzer rule that let `CAST(... AS box3d)` resolve, mirroring the Box2D parser/cast support landed in #2927.

- `SedonaSqlAstBuilder.visitPrimitiveDataType` recognises `BOX3D` alongside the existing `GEOMETRY` and `BOX2D` keywords, returning `Box3DUDT`. Applied uniformly to the spark-3.4 / 3.5 / 4.0 / 4.1 parser variants.
- `Box3DCastResolutionRule` rewrites `Cast(geom, Box3DUDT)` → `ST_Box3D(geom)` during analysis (before `CheckAnalysis`), since Spark's stock `Cast.canCast` refuses arbitrary UDT-to-UDT casts. Registered from `SedonaSqlExtensions.injectResolutionRule`.
- Only the forward direction (`CAST(geom AS box3d)`) is wired. The inverse cast (`CAST(box3d AS geometry)`) is deferred until Box3D has an `ST_GeomFromBox3D` counterpart driven by a concrete consumer. Box2D shipped both directions because `ST_GeomFromBox2D` already existed, which is not the case here. The issue tracks this rationale.

How was this patch tested?

- `Box3DCastResolutionRuleSuite` (rule unit test in `spark/common`): verifies the forward rewrite, that the inverse `Cast(box3d, Geometry)` is left untouched (out of scope), and that unrelated casts are not affected.
- `Box3DCastSuite` added per spark-3.4 / 3.5 / 4.0 / 4.1: DataFrame `.cast(Box3DUDT)` plus SQL `CAST(... AS box3d)` end-to-end, gated on the parser-extension probe (matches the existing `Box2DCastSuite` pattern across the same four matrix cells).

Did this PR include necessary documentation updates?