[GH-2983] Box3D SQL parser keyword + Geometry→Box3D cast resolution #3016

Merged
jiayuasu merged 1 commit into apache:master from jiayuasu:feature/box3d-cast-resolution on Jun 1, 2026

@jiayuasu
Member

Did you read the Contributor Guide?

Is this PR related to a ticket?

  • Yes — closes #2983; follow-up to the Box3D Phase 1 epic (#2973).

What changes were proposed in this PR?

Adds the parser keyword and analyzer rule that let CAST(... AS box3d) resolve, mirroring the Box2D parser/cast support landed in #2927.

  • SedonaSqlAstBuilder.visitPrimitiveDataType recognises BOX3D alongside the existing GEOMETRY and BOX2D keywords, returning Box3DUDT. Applied uniformly to the spark-3.4 / 3.5 / 4.0 / 4.1 parser variants.
  • Box3DCastResolutionRule rewrites Cast(geom, Box3DUDT) → ST_Box3D(geom) during analysis (before CheckAnalysis), since Spark's stock Cast.canCast refuses arbitrary UDT-to-UDT casts. Registered from SedonaSqlExtensions.injectResolutionRule.
  • Only the forward direction (CAST(geom AS box3d)) is wired. The inverse cast (CAST(box3d AS geometry)) is deferred until Box3D has an ST_GeomFromBox3D counterpart driven by a concrete consumer — Box2D shipped both directions because ST_GeomFromBox2D already existed, which is not the case here. The issue tracks this rationale.
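
For readers unfamiliar with analyzer rewrites, the forward-only behaviour described above can be illustrated with a small toy model. This is a sketch only: `DataType`, `Expr`, `Column`, `Cast`, `StBox3D`, and `rewrite` below are hypothetical stand-ins written for this example, not Spark Catalyst or Sedona classes, and the real `Box3DCastResolutionRule` may be structured differently.

```scala
// Toy stand-ins for Catalyst types; none of these are Spark or Sedona classes.
object CastRewriteSketch {
  sealed trait DataType
  case object GeometryType extends DataType
  case object Box3DType extends DataType
  case object StringType extends DataType

  sealed trait Expr { def dataType: DataType }
  case class Column(name: String, dataType: DataType) extends Expr
  case class Cast(child: Expr, dataType: DataType) extends Expr
  // Models a call to ST_Box3D, which always yields a Box3D value.
  case class StBox3D(child: Expr) extends Expr { val dataType: DataType = Box3DType }

  // Bottom-up rewrite: children are rewritten first, then the node itself.
  def rewrite(expr: Expr): Expr = expr match {
    case Cast(child, target) =>
      val newChild = rewrite(child)
      // Only the forward direction (Geometry -> Box3D) is replaced.
      if (newChild.dataType == GeometryType && target == Box3DType) StBox3D(newChild)
      else Cast(newChild, target) // inverse and unrelated casts stay as Cast nodes
    case StBox3D(child) => StBox3D(rewrite(child))
    case leaf: Column => leaf
  }
}
```

In this model, `rewrite(Cast(Column("geom", GeometryType), Box3DType))` produces an `StBox3D` node, while a `Cast` from a Box3D column to `GeometryType` is returned unchanged, which corresponds to the three cases the rule suite is described as covering.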

How was this patch tested?

  • Box3DCastResolutionRuleSuite (rule unit test in spark/common) — verifies the forward rewrite, that the inverse Cast(box3d, Geometry) is left untouched (out of scope), and that unrelated casts are not affected.
  • Box3DCastSuite added per spark-3.4 / 3.5 / 4.0 / 4.1 — DataFrame .cast(Box3DUDT) plus SQL CAST(... AS box3d) end-to-end, gated on the parser-extension probe (matches the existing Box2DCastSuite pattern across the same four matrix cells).
  • All tests pass locally on spark-3.5 (3 rule + 6 cast).

Did this PR include necessary documentation updates?

…tion

Adds the parser keyword and analyzer rule that let `CAST(... AS box3d)`
resolve, mirroring the Box2D parser/cast support landed in apache#2927.

- `SedonaSqlAstBuilder.visitPrimitiveDataType` recognises `BOX3D`
  alongside the existing `GEOMETRY` and `BOX2D` keywords, returning
  `Box3DUDT`. Applied uniformly to the spark-3.4 / 3.5 / 4.0 / 4.1
  parser variants.
- `Box3DCastResolutionRule` rewrites `Cast(geom, Box3DUDT)` →
  `ST_Box3D(geom)` during analysis (before `CheckAnalysis`), since
  Spark's stock `Cast.canCast` refuses arbitrary UDT-to-UDT casts.
  Registered from `SedonaSqlExtensions.injectResolutionRule`.
- Only the forward direction (`CAST(geom AS box3d)`) is wired. The
  inverse cast (`CAST(box3d AS geometry)`) is deferred until Box3D
  has an `ST_GeomFromBox3D` counterpart driven by a concrete consumer
  — Box2D shipped both directions because `ST_GeomFromBox2D` already
  existed, which is not the case here.
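
The keyword handling in the first bullet can be pictured as a lookup that recognises a few extra type names and otherwise defers to the default parser. The sketch below is a toy model under stated assumptions: `resolveKeyword` and the three case objects are hypothetical names for illustration (they are not the real Sedona UDT classes), and case-insensitive matching is assumed from the fact that the description writes both `BOX3D` and `box3d`.

```scala
import java.util.Locale

// Toy model of keyword-to-type resolution; not the actual SedonaSqlAstBuilder.
object TypeKeywordSketch {
  sealed trait DataType
  case object GeometryStandIn extends DataType
  case object Box2DStandIn extends DataType
  case object Box3DStandIn extends DataType

  // Returns Some(type) for a recognised keyword, or None to signal
  // "fall through to the default parser" (an assumption of this sketch).
  def resolveKeyword(keyword: String): Option[DataType] =
    keyword.toUpperCase(Locale.ROOT) match {
      case "GEOMETRY" => Some(GeometryStandIn)
      case "BOX2D"    => Some(Box2DStandIn)
      case "BOX3D"    => Some(Box3DStandIn) // the keyword this change adds
      case _          => None
    }
}
```

Returning `None` for unknown keywords keeps the extension additive: built-in types such as `INT` are left to the stock parser in this model.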

Tests:
- `Box3DCastResolutionRuleSuite` (rule unit test) — covers the
  forward rewrite, leaves the inverse Cast untouched (out of scope),
  and leaves unrelated casts alone.
- `Box3DCastSuite` per spark-3.4 / 3.5 / 4.0 / 4.1 — DataFrame
  `.cast(Box3DUDT)` plus SQL `CAST(... AS box3d)` end-to-end,
  gated on the parser-extension probe (matches `Box2DCastSuite`'s
  pattern across the same four matrix cells).
@jiayuasu jiayuasu force-pushed the feature/box3d-cast-resolution branch from 20e9cba to 231c335 Compare May 31, 2026 19:51
@jiayuasu jiayuasu requested a review from Copilot June 1, 2026 06:01
Contributor

Copilot AI left a comment

Pull request overview

Adds end-to-end Spark SQL support for CAST(... AS box3d) by (1) teaching Sedona’s Spark SQL parser to recognize BOX3D as a primitive type and (2) injecting an analyzer resolution rule that rewrites Geometry→Box3D Catalyst Cast nodes into ST_Box3D(...) (mirroring the existing Box2D behavior). This fits into Sedona’s Spark SQL type system by making Box3D usable via standard SQL casting syntax across the Spark 3.4/3.5/4.0/4.1 build matrix.

Changes:

  • Extend SedonaSqlAstBuilder.visitPrimitiveDataType to parse BOX3D and produce Box3DUDT in Spark 3.4/3.5/4.0/4.1 variants.
  • Add Box3DCastResolutionRule and register it via SedonaSqlExtensions.injectResolutionRule to resolve CAST(geom AS box3d) by rewriting to ST_Box3D(geom).
  • Add unit + integration test coverage for the rule and for DataFrame/SQL cast behavior across Spark versions (with SQL tests gated by the parser-extension probe).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/optimization/Box3DCastResolutionRule.scala | New analyzer rule to rewrite Geometry→Box3D casts to ST_Box3D during analysis. |
| spark/common/src/main/scala/org/apache/sedona/sql/SedonaSqlExtensions.scala | Registers the new Box3D cast resolution rule in Sedona’s SparkSession extensions. |
| spark/common/src/test/scala/org/apache/sedona/sql/Box3DCastResolutionRuleSuite.scala | Unit tests verifying the rewrite (and that the inverse cast remains untouched). |
| spark/spark-3.4/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 3.4 parser extension. |
| spark/spark-3.5/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 3.5 parser extension. |
| spark/spark-4.0/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 4.0 parser extension. |
| spark/spark-4.1/src/main/scala/org/apache/sedona/sql/parser/SedonaSqlAstBuilder.scala | Adds BOX3D keyword parsing to the Spark 4.1 parser extension. |
| spark/spark-3.4/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 3.4 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
| spark/spark-3.5/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 3.5 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
| spark/spark-4.0/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 4.0 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |
| spark/spark-4.1/src/test/scala/org/apache/sedona/sql/Box3DCastSuite.scala | Spark 4.1 integration tests for DataFrame .cast(Box3DUDT) and SQL CAST(... AS box3d). |

@jiayuasu jiayuasu added this to the sedona-1.9.1 milestone Jun 1, 2026
@jiayuasu jiayuasu linked an issue Jun 1, 2026 that may be closed by this pull request
@jiayuasu jiayuasu merged commit 6fed0da into apache:master Jun 1, 2026
42 checks passed
Development

Successfully merging this pull request may close these issues.

Box3D SQL parser keyword + cast resolution rule (CAST AS box3d)