Spark 4.1: Support time type by Benjamin0313 · Pull Request #16665 · apache/iceberg

Benjamin0313 · 2026-06-02T15:01:25Z

What

Adds support for the Iceberg time type in the Spark 4.1 module, mapping it to Spark's
TimeType (introduced in SPARK-51162).

Previously, projecting or writing a time column from Spark threw
UnsupportedOperationException: Spark does not support time fields from TypeToSparkType.
This revisits #9006, which was closed in 2019 — before Spark had a native time type.

TimeType only exists in Spark 4.1, so this targets spark/v4.1 only (not 3.5 / 4.0).

How

Type conversion

TypeToSparkType: Iceberg time → Spark TimeType() (microsecond precision)
SparkTypeToType: Spark TimeType → Iceberg time

Value conversion — Iceberg stores time as microseconds-from-midnight; Spark 4.1 stores
nanoseconds-from-midnight (SPARK-52460). Conversion happens at the read/write boundary
(×1000 on read, ÷1000 on write):

Parquet — SparkParquetReaders (TimeReader), SparkParquetWriters (TimeMicrosWriter)
ORC — SparkOrcValueReaders#times, SparkOrcValueWriters#times (via LongColumnVector)
Avro — SparkPlannedAvroReader / SparkAvroWriter (time-micros logical type)
Row-level: SparkValueConverter, InternalRowWrapper
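
The unit conversion at the read/write boundary can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code in this PR: the class `TimeUnitConversion` and its method names are hypothetical, and using `Math.multiplyExact` and `Math.floorDiv` is just one way to guard against overflow and make the truncation explicit.

```java
import java.time.LocalTime;

/** Hypothetical helper illustrating the micros <-> nanos conversion; not the PR's actual code. */
final class TimeUnitConversion {
  private static final long NANOS_PER_MICRO = 1_000L;

  private TimeUnitConversion() {}

  /** Read path: Iceberg microseconds-from-midnight to Spark nanoseconds-from-midnight. */
  static long microsToNanos(long micros) {
    // multiplyExact throws ArithmeticException instead of silently overflowing
    return Math.multiplyExact(micros, NANOS_PER_MICRO);
  }

  /** Write path: Spark nanoseconds to Iceberg microseconds; sub-microsecond digits are dropped. */
  static long nanosToMicros(long nanos) {
    return Math.floorDiv(nanos, NANOS_PER_MICRO);
  }

  /** Convenience view of an Iceberg time value as a java.time.LocalTime. */
  static LocalTime toLocalTime(long micros) {
    return LocalTime.ofNanoOfDay(microsToNanos(micros));
  }
}
```

For example, `TimeUnitConversion.toLocalTime(45_296_789_012L)` corresponds to 12:34:56.789012. Because Iceberg's time type has microsecond precision, any nanosecond digits on the Spark side are lost on write, which matches the microsecond-precision TimeType mapping described above.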

Vectorized reads are intentionally not supported in this PR. Spark 4.1's ColumnarBatch
cannot expose TimeType values (ColumnarBatchRow#get throws
Datatype not supported TimeType(6)), and exposing time through the shared arrow module's
accessor would require an engine-wide change affecting Flink and others. SparkBatch therefore
falls back to row-based reads when a time column is projected (both Parquet and ORC). This can be
lifted in a follow-up once Spark's vectorized time support matures.
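
The fallback decision can be pictured roughly as follows. This is a simplified sketch, not SparkBatch's actual logic: `VectorizationCheck`, its `TypeId` enum, and `supportsVectorizedRead` are hypothetical names, and the projection is modeled as a flat list of type IDs, whereas a real check would presumably also need to look inside nested struct, list, and map fields.

```java
import java.util.List;

/** Hypothetical sketch of the row-based fallback decision; not SparkBatch's actual code. */
final class VectorizationCheck {
  /** Simplified stand-in for the types of the projected columns. */
  enum TypeId { LONG, STRING, TIMESTAMP, TIME }

  private VectorizationCheck() {}

  /** Returns false as soon as any projected column is a time column. */
  static boolean supportsVectorizedRead(List<TypeId> projectedTypes) {
    // A single time column disables batch reads for the whole projection,
    // so the scan falls back to row-based readers.
    return projectedTypes.stream().noneMatch(type -> type == TypeId.TIME);
  }
}
```

Under this sketch, a projection that omits the time column would still be eligible for vectorized reads, which is consistent with the fallback applying only "when a time column is projected".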

Testing

Enabled the existing supportsTime() hook in TestSparkParquetReader, TestSparkAvroReader,
and TestSparkRecordOrcReaderWriter, exercising schema + value round-trips via testTypeSchema.
Re-enabled TestInternalRowWrapper#testTime.
Added time handling to test helpers (GenericsHelpers#assertEqualsSafe/assertEqualsUnsafe,
RandomData).
TestSparkOrcReader keeps supportsTime() == false because it also exercises the vectorized
path, which is not supported here.
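
The property these value round-trips rely on can be checked in isolation. The sketch below is an assumption-laden illustration, not one of the tests in this PR: `TimeRoundTripCheck` is a hypothetical class that draws seeded random microsecond values within a day, in the spirit of a random-data helper, and verifies that micros to nanos to micros is lossless.

```java
import java.util.Random;

/** Hypothetical round-trip check for time values; not part of the PR's test suite. */
final class TimeRoundTripCheck {
  static final long MICROS_PER_DAY = 86_400_000_000L;

  private TimeRoundTripCheck() {}

  /** True if converting micros to nanos and back yields the original value. */
  static boolean roundTrips(long micros) {
    long nanos = Math.multiplyExact(micros, 1_000L); // read path
    return Math.floorDiv(nanos, 1_000L) == micros;   // write path
  }

  /** Checks seeded random values plus the two boundary values of a day. */
  static boolean allRoundTrip(long seed, int samples) {
    Random random = new Random(seed);
    for (int i = 0; i < samples; i++) {
      // floorMod keeps the value in [0, MICROS_PER_DAY) even for negative longs
      long micros = Math.floorMod(random.nextLong(), MICROS_PER_DAY);
      if (!roundTrips(micros)) {
        return false;
      }
    }
    return roundTrips(0L) && roundTrips(MICROS_PER_DAY - 1);
  }
}
```

The reverse direction (nanos to micros to nanos) is not lossless in general, since sub-microsecond digits are truncated on write.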

AI assistance

This change was implemented with the help of an AI coding assistant (Claude). I reviewed and
understand the implementation end-to-end and verified it locally (spotlessApply and the Spark 4.1
module tests pass). I'd especially welcome scrutiny on:

Deferring vectorized reads (the SparkBatch row-based fallback for time columns).
The time value paths in SparkValueConverter and InternalRowWrapper.

Closes #16663

Map Iceberg's time type to Spark 4.1's TimeType (added in SPARK-51162) for row-based reads and writes across Parquet, ORC, and Avro. Iceberg stores time as microseconds from midnight while Spark stores it as nanoseconds, so values are converted on the boundary (x1000 on read, /1000 on write).

Vectorized reads are intentionally left unsupported for now: Spark 4.1's ColumnarBatch (ColumnarBatchRow#get) does not support TimeType, and exposing time through the shared Arrow accessor would require an engine-wide change. SparkBatch therefore falls back to row-based reads when a time column is projected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions Bot added the spark label Jun 2, 2026

pvary mentioned this pull request Jun 4, 2026 in: Data: Add TCK coverage for reader default values #16638 (Open)

Benjamin0313 wants to merge 1 commit into apache:main from Benjamin0313:spark-4.1-time-type-support
