feat: Added IcebergArrowInputSourceReader by Shekharrajak · Pull Request #19510 · apache/druid

Shekharrajak · 2026-05-23T09:49:54Z

Fixes #19498

Description

Added IcebergArrowInputSourceReader, which reads through the iceberg-arrow vectorized API.
Added retrieveTable() to IcebergCatalog. It returns the live Table object so the Arrow reader can drive scan planning.
Added two optional JSON properties on the input spec: useArrowReader and arrowBatchSize. useArrowReader=false keeps the existing behaviour byte-for-byte; useArrowReader=true switches to the Arrow path. arrowBatchSize defaults to 1024.

Release note

Apache Iceberg ingestion now supports an opt-in vectorized reader path backed by iceberg-arrow. Enable it by setting "useArrowReader": true on the iceberg input source. The Arrow path automatically applies V2 delete files (positional and equality), handles schema evolution, and pushes column projection and predicates into the scan planner. In the reader benchmark posted on this PR it was roughly 2x–3x faster than the existing path.
Future Iceberg spec features (V3 deletion vectors, row lineage, V4+) become available on Iceberg version bumps with no Druid code changes. The default remains the existing path, and both paths coexist.


Shekharrajak · 2026-05-23T09:50:25Z

Benchmark:

| numRows | numCols | icebergArrowInputSourceReader ms/op | icebergArrowInputSourceReader rows/s | icebergInputSourceReader ms/op | icebergInputSourceReader rows/s | Speedup (vs baseline) |
|---------|---------|-------------------------------------|--------------------------------------|--------------------------------|---------------------------------|-----------------------|
| 100,000 | 5       | 17.70                               | 5,651,074                            | 51.35                          | 1,947,542                       | 2.90x                 |
| 100,000 | 15      | 44.11                               | 2,266,881                            | 95.81                          | 1,043,680                       | 2.17x                 |
| 500,000 | 5       | 74.92                               | 6,673,450                            | 187.73                         | 2,663,420                       | 2.51x                 |
| 500,000 | 15      | 197.97                              | 2,525,637                            | 420.94                         | 1,187,817                       | 2.13x                 |

Summary:
  Best speedup:    2.90x  (numRows=100,000, numCols=5)
  Worst speedup:   2.13x  (numRows=500,000, numCols=15)
  Geomean speedup: 2.41x
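
For readers who want to check the summary figures, here is a minimal sketch that recomputes the per-row speedups and the geometric mean from the ms/op columns of the table above. The class and method names are made up for this illustration and are not part of the PR's benchmark harness.

```java
import java.util.Locale;

final class SpeedupSummary {
    // ms/op pairs copied from the benchmark table: {Arrow reader, existing reader}.
    static final double[][] MS_PER_OP = {
        {17.70, 51.35},
        {44.11, 95.81},
        {74.92, 187.73},
        {197.97, 420.94},
    };

    // Speedup is baseline time divided by Arrow time.
    static double speedup(double arrowMs, double baselineMs) {
        return baselineMs / arrowMs;
    }

    // Geometric mean computed through logs to avoid overflow on long inputs.
    static double geomeanSpeedup(double[][] rows) {
        double logSum = 0.0;
        for (double[] row : rows) {
            logSum += Math.log(speedup(row[0], row[1]));
        }
        return Math.exp(logSum / rows.length);
    }

    public static void main(String[] args) {
        for (double[] row : MS_PER_OP) {
            System.out.println(String.format(Locale.ROOT, "%.2fx", speedup(row[0], row[1])));
        }
        System.out.println(String.format(Locale.ROOT, "geomean %.2fx", geomeanSpeedup(MS_PER_OP)));
    }
}
```

The geometric mean is the usual choice for averaging ratios, since it treats a 2x gain and a 0.5x loss symmetrically.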

Shekharrajak · 2026-05-23T09:51:06Z

I am looking into a benchmark module that can be used throughout the Arrow and DataFusion integration work to quickly check improvements.


Shekharrajak · 2026-05-23T12:23:16Z

+  private final boolean useArrowReader;
+
+  @JsonProperty
+  private final int arrowBatchSize;


With a batch of 1024 values in a contiguous buffer, the JIT can emit SIMD instructions that process several values (on the order of 8-16) per instruction instead of one row at a time.

Decompression amortization: Parquet stores data in compressed pages, so a batch read decompresses a page once and then consumes many rows from it.
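
To make the batching argument concrete, here is a rough sketch of batch-at-a-time processing over a contiguous primitive array. It uses plain `long[]` arrays as a stand-in for Arrow vectors and does not touch the iceberg-arrow API; the class and method names are hypothetical. Whether the JIT actually vectorizes the inner loop depends on the JVM and hardware.

```java
final class BatchSum {
    // 1024 matches the arrowBatchSize default mentioned in this PR.
    static final int BATCH_SIZE = 1024;

    // Sums one contiguous batch: a simple counted loop over a primitive array,
    // which is the loop shape a JIT compiler may be able to auto-vectorize.
    static long sumBatch(long[] values, int offset, int length) {
        long sum = 0;
        for (int i = offset; i < offset + length; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Walks the column batch by batch instead of handling one row object per value.
    static long sumColumn(long[] column) {
        long total = 0;
        for (int start = 0; start < column.length; start += BATCH_SIZE) {
            int length = Math.min(BATCH_SIZE, column.length - start);
            total += sumBatch(column, start, length);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] column = new long[100_000];
        for (int i = 0; i < column.length; i++) {
            column[i] = i;
        }
        System.out.println(sumColumn(column));
    }
}
```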

Shekharrajak · 2026-05-23T12:50:25Z

Ingestion time at 100k rows × 5 columns

Read

Arrow implementation: 17 ms
Current implementation: 53 ms

Index + Persist (= total − read)

~330 ms
Similar on both implementations

Total time

Arrow implementation: 347 ms
Current implementation: 410 ms

Index + persist does substantially more work per row than read. So even though Arrow makes the read ~3x faster, that gain is diluted once it is amortized over the much larger indexing cost: end to end, the total drops from 410 ms to 347 ms, about 15%.
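
The dilution effect can be estimated with Amdahl's law. The sketch below plugs in the read and total timings quoted above; it is a back-of-the-envelope model, not part of the benchmark, and the class name is invented for this example.

```java
final class AmdahlEstimate {
    // Amdahl's law: overall speedup when a fraction of the work gets faster by stageSpeedup.
    static double overallSpeedup(double fraction, double stageSpeedup) {
        return 1.0 / ((1.0 - fraction) + fraction / stageSpeedup);
    }

    public static void main(String[] args) {
        double readFraction = 53.0 / 410.0; // read share of the current implementation's total
        double readSpeedup = 53.0 / 17.0;   // current read time over Arrow read time
        System.out.println(overallSpeedup(readFraction, readSpeedup));
    }
}
```

With these inputs the model predicts about 1.10x overall, while the measured totals (410 ms vs 347 ms) work out to about 1.18x, so the model is only a rough guide. Either way the end-to-end gain is far below the ~3x seen on the read stage alone.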

Benchmark added https://github.com/Shekharrajak/druid/pull/1/changes


Shekharrajak · 2026-05-23T13:07:52Z

This looks like a network error: https://github.com/apache/druid/actions/runs/26333466916/job/77523245546?pr=19510. Please trigger the CI check again.

FrankChen021

Severity	Findings
P0	0
P1	2
P2	1
P3	0
Total	3

Reviewed 7 of 7 changed files.

This is an automated review by Codex GPT-5.5

FrankChen021 · 2026-05-24T08:04:06Z

      File temporaryDirectory
  )
  {
+    if (useArrowReader) {


[P1] Arrow path bypasses residual FAIL handling

The useArrowReader branch returns an IcebergArrowInputSourceReader before retrieveIcebergDatafiles() runs, so it skips the residual detection that enforces residualFilterMode=FAIL. IcebergArrowInputSourceReaderTest documents that a non-partition equality filter returns all rows from the file, so a spec with useArrowReader=true and residualFilterMode=FAIL will silently ingest residual rows instead of rejecting the ingestion. Please run the same residual check for the Arrow scan or otherwise honor the fail mode before reading.
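
One possible shape for such a check, as a sketch only: before handing the scan to the Arrow reader, inspect each planned task's residual and reject the ingestion when the mode is FAIL. The enum, class, and method below are hypothetical stand-ins (including the IGNORE value), and residuals are modelled as strings where "true" means nothing is left to filter. The real code would presumably inspect Iceberg residual expressions instead.

```java
import java.util.List;

final class ResidualGuard {
    // Stand-in for Druid's residual filter mode; only FAIL is named in the review.
    enum ResidualFilterMode { IGNORE, FAIL }

    // Throws if FAIL mode is set and any scan task still carries a residual filter.
    static void checkResiduals(ResidualFilterMode mode, List<String> residuals) {
        if (mode != ResidualFilterMode.FAIL) {
            return;
        }
        for (String residual : residuals) {
            if (!"true".equals(residual)) {
                throw new IllegalStateException(
                    "Residual filter remains after partition pruning: " + residual);
            }
        }
    }
}
```

Running the check before constructing the reader keeps the failure behaviour the same regardless of which read path is selected.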

FrankChen021 · 2026-05-24T08:04:06Z

+    // Push column projection into the scan planner — only requested columns are read from disk.
+    final List<String> configuredDims = schema.getDimensionsSpec().getDimensionNames();
+    final String timestampColumn = schema.getTimestampSpec().getTimestampColumn();
+    if (!configuredDims.isEmpty()) {


[P1] Projection omits required non-dimension columns

When fixed dimensions are configured, this projection selects only those dimension names plus the timestamp column. Druid carries required transform inputs and aggregator source fields through InputRowSchema.getColumnsFilter(), not DimensionsSpec alone. For example, dimensions=[name] with an aggregator over value will scan only ts/name, so value is missing from the event and downstream metrics/transforms read null. Please drive projection from the full columnsFilter/required fields, or avoid pruning when those required columns cannot be represented safely.
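
A sketch of what driving the projection from the full set of required columns could look like. The columns filter is modelled here as a plain `Predicate<String>` rather than Druid's actual type, and the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class ArrowProjection {
    // Keeps the timestamp column plus every table column the filter accepts,
    // so aggregator inputs and transform inputs survive the projection.
    static Set<String> projectedColumns(
        List<String> tableColumns,
        String timestampColumn,
        Predicate<String> columnsFilter
    ) {
        Set<String> projected = new LinkedHashSet<>();
        projected.add(timestampColumn);
        for (String column : tableColumns) {
            if (columnsFilter.test(column)) {
                projected.add(column);
            }
        }
        return projected;
    }
}
```

For the example in the finding (dimensions=[name], aggregator over value), a filter that accepts both name and value yields the projection ts, name, value, so the metric no longer reads null.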

FrankChen021 · 2026-05-24T08:04:06Z

      File temporaryDirectory
  )
  {
+    if (useArrowReader) {


[P2] Parallel ingestion bypasses the Arrow reader

This special case only affects direct reader() calls. The input source still reports itself as splittable, and Druid's parallel task runners call createSplits() and then withSplit(); withSplit() returns the delegate input source, so subtasks read raw files through the old delegate path instead of IcebergArrowInputSourceReader. With useArrowReader=true, behavior therefore changes based on maxNumConcurrentSubTasks and can lose the Arrow path's Iceberg semantics. Please make the Arrow mode non-splittable or return split-aware Iceberg input sources that preserve useArrowReader.
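
The simpler of the two suggested fixes, making the Arrow mode non-splittable, might look roughly like this. The class is a hypothetical stand-in for the input source, not the PR's code.

```java
final class SplittableSketch {
    private final boolean useArrowReader;

    SplittableSketch(boolean useArrowReader) {
        this.useArrowReader = useArrowReader;
    }

    // Report non-splittable in Arrow mode so parallel runners do not fan out
    // to the delegate input source and bypass the Arrow reader.
    boolean isSplittable() {
        return !useArrowReader;
    }
}
```

This trades parallelism for consistent semantics; the split-aware alternative would keep parallelism but needs each split to carry useArrowReader through withSplit().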

Shekharrajak added 5 commits May 23, 2026 00:12

b13384c deps: add arrow 15.0.2 and iceberg-arrow to pom dependency management

090cc09 feat: add retrieveTable() to IcebergCatalog for direct Table object access

973c702 feat: add IcebergArrowInputSourceReader using iceberg-arrow vectorized API

d38b770 feat: wire useArrowReader + arrowBatchSize into IcebergInputSource

ca75a0b test: add IcebergArrowInputSourceReaderTest; fix IcebergInputSourceTest constructor arity

github-actions Bot added the Area - Dependencies label May 23, 2026

9da5049 style: fix checkstyle and forbidden-apis violations (imports, argument line breaking, DateTimes.utc, Maps.newHashMapWithExpectedSize)

Shekharrajak force-pushed the feature/iceberg-arrow-reader branch from 6297e2f to 9da5049 Compare May 23, 2026 10:24

Shekharrajak mentioned this pull request May 23, 2026

iceberg arrow reader harness Shekharrajak/druid#1

Open

Shekharrajak mentioned this pull request May 23, 2026

[Proposal] Native, vectorised, zero-copy execution path for Druid #19456

Open

fix: switch arrow-memory-netty to arrow-memory-unsafe to fix CI Arrow…

8808ec4

…Allocation init failure

FrankChen021 reviewed May 24, 2026


Shekharrajak wants to merge 7 commits into apache:master from Shekharrajak:feature/iceberg-arrow-reader
