[GLUTEN-11622][VL] Fallback TimestampNTZ to Spark #11609
Conversation
Force-pushed afc463d to 62c6a55
Run Gluten Clickhouse CI on x86
@rui-mo can you please review this PR? Thanks in advance!
Thank you for the review! This PR fixes the SparkArrowUtil Arrow type mapping so that the row-to-columnar (RTC) transitions at fallback boundaries don't crash with UnsupportedOperationException. The flow is:
@acvictor Thank you for the additional information. I'm still a bit unclear: in a plan like
I added debug logs to Gluten and used the test example; op1 here would be ColumnarCollectLimitExec. The actual runtime plan is: VeloxColumnarToRowExec

Debug logs added to Transitions.scala confirm the sequence:

1. ColumnarCollectLimitExec appears despite the FallbackByTimestampNTZ validator because CollectLimitTransformerRule is registered as a post-transform rule, which runs after validation.
2. That rule sees the vanilla CollectLimitExec with a columnar child and unconditionally replaces it with ColumnarCollectLimitExec, bypassing the validator entirely.
3. InsertTransitions then sees a convention mismatch (VanillaBatch → VeloxBatch) and inserts a RowToVeloxColumnarExec, which throws in SparkArrowUtil.toArrowSchema because there is no case for TimestampNTZType.

The validator alone cannot prevent this: post-transform rules like CollectLimitTransformerRule can reintroduce Gluten native operators after validation has already run.
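To illustrate the bypass, here is a hedged sketch of what a post-transform rule of this shape looks like. The names and constructor arguments are simplified for illustration and are not copied from the actual Gluten sources:

```scala
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.{CollectLimitExec, SparkPlan}

// Illustrative sketch only; the real CollectLimitTransformerRule in
// Gluten is more involved and ColumnarCollectLimitExec's signature
// may differ.
case class CollectLimitTransformerRule() extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
    // Runs AFTER the validation phase, so a fallback validator like
    // FallbackByTimestampNTZ never sees the operator this rule
    // introduces, and no schema check is performed here.
    case limit: CollectLimitExec if limit.child.supportsColumnar =>
      ColumnarCollectLimitExec(limit.limit, limit.child)
  }
}
```

Because the replacement is unconditional, any guard has to live either in the rule itself or in the transition code that the replacement forces to run.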
@acvictor Thanks for providing the detailed query plan. For this patch, to make the
> Is this because the type is treated during conversion simply as an Arrow timestamp without a timezone?

Yes, that's right! I was able to get all OSS Delta tests to pass with this change, as well as the Spark and Gluten UTs, and have not found any issue so far.
What changes are proposed in this pull request?
TimestampNTZ is not fully supported in Velox. This PR adds a FallbackByTimestampNTZ validator that unconditionally falls back any operator whose input or output schema contains TimestampNTZType.
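A minimal sketch of what such a validator can check. The object and method names here are illustrative, not necessarily the exact Gluten API:

```scala
import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.types.{DataType, TimestampNTZType}

// Illustrative fallback check: reject any operator whose input or
// output schema mentions TimestampNTZType (including nested fields),
// so the operator stays on vanilla Spark.
object FallbackByTimestampNTZ {
  private def hasNtz(dt: DataType): Boolean =
    dt.existsRecursively(_ == TimestampNTZType)

  def mustFallback(plan: SparkPlan): Boolean =
    plan.output.exists(a => hasNtz(a.dataType)) ||
      plan.children.exists(_.output.exists(a => hasNtz(a.dataType)))
}
```

Checking both the operator's own output and its children's outputs covers operators that drop or add NTZ columns at either boundary.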
An Arrow type mapping for TimestampNTZType is also added in SparkArrowUtil. It is required because RowToVeloxColumnarExec transitions are inserted after the validation phase at row-to-columnar boundaries, and these call SparkArrowUtil.toArrowSchema, which must be able to handle every type present in the schema. Without this mapping, the transition crashes with UnsupportedOperationException even though the operator itself had correctly fallen back.
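The mapping mirrors what vanilla Spark's ArrowUtils does for TimestampNTZType: an Arrow timestamp with microsecond precision and a null timezone. A hedged sketch of the relevant cases in a Catalyst-to-Arrow conversion (not the literal SparkArrowUtil source; the concrete zone id for TimestampType is elided):

```scala
import org.apache.arrow.vector.types.TimeUnit
import org.apache.arrow.vector.types.pojo.ArrowType
import org.apache.spark.sql.types.{DataType, TimestampNTZType, TimestampType}

// Sketch of the type conversion; only the timestamp cases are shown.
def toArrowType(dt: DataType, sessionZoneId: String): ArrowType = dt match {
  // TimestampNTZ carries no zone, so the Arrow timezone is null.
  case TimestampNTZType =>
    new ArrowType.Timestamp(TimeUnit.MICROSECOND, null)
  // Session-local timestamps keep a concrete zone id.
  case TimestampType =>
    new ArrowType.Timestamp(TimeUnit.MICROSECOND, sessionZoneId)
  case other =>
    throw new UnsupportedOperationException(s"Unsupported type: $other")
}
```

With this case present, toArrowSchema can describe a schema containing NTZ columns even when the transition is only carrying data for an operator that already fell back.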
How was this patch tested?
Added UTs in VeloxParquetDataTypeValidationSuite and DeltaSuite to verify that TimestampNTZ scans fall back to Spark and return correct results. Also added tests for Delta tables with NTZ columns, NTZ partition columns, and filters on NTZ columns, where we originally saw the UnsupportedOperationException being thrown.
Was this patch authored or co-authored using generative AI tooling?
No
Related issue: #11622