perf: Add Comet config for native Iceberg reader's data file concurrency [iceberg] by mbutrovich · Pull Request #3584 · apache/datafusion-comet

mbutrovich · 2026-02-24T21:27:49Z

Which issue does this PR close?

N/A.

Rationale for this change

The native Iceberg scan currently hardcodes data_file_concurrency_limit to DataFusion's target_partitions(), so users have no control over per-task file read parallelism. For tables with many scan tasks, raising the limit overlaps I/O latency across data files and improves throughput.

We tested this in #3551 and saw a good speedup with a value of 4 when reading from an object store.

What changes are included in this PR?

Adds the spark.comet.scan.icebergNative.dataFileConcurrencyLimit config, which controls iceberg-rust's ArrowReaderBuilder::with_data_file_concurrency_limit. The value is serialized through protobuf from the JVM to native code. It defaults to 1 to preserve deterministic output ordering in tests that lack an explicit ORDER BY.
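To illustrate the trade-off behind the default, here is a minimal Python sketch. It is an analogy only, not the iceberg-rust or Comet implementation. The function name `read_files`, the simulated latencies, and the use of an asyncio semaphore as the concurrency limit are all assumptions made for illustration. It shows that a limit of 1 completes reads in submission order, while a higher limit overlaps the waits and lets faster reads finish first.

```python
import asyncio
import time


async def read_files(latencies, limit):
    """Simulate reading data files with at most `limit` reads in flight.

    Hypothetical helper: returns the file indices in completion order
    and the total elapsed wall-clock time in seconds.
    """
    sem = asyncio.Semaphore(limit)
    completed = []

    async def read_one(idx, latency):
        async with sem:
            # Stand-in for object-store I/O latency (assumed values).
            await asyncio.sleep(latency)
            completed.append(idx)

    start = time.perf_counter()
    await asyncio.gather(*(read_one(i, lat) for i, lat in enumerate(latencies)))
    return completed, time.perf_counter() - start


# Four simulated data files with different read latencies (seconds).
latencies = [0.20, 0.05, 0.15, 0.10]

serial_order, serial_secs = asyncio.run(read_files(latencies, limit=1))
overlap_order, overlap_secs = asyncio.run(read_files(latencies, limit=4))

print("limit=1:", serial_order, f"{serial_secs:.2f}s")
print("limit=4:", overlap_order, f"{overlap_secs:.2f}s")
```

With a limit of 1 the completion order is the submission order, [0, 1, 2, 3]. With a limit of 4 the shortest read finishes first, giving [1, 3, 2, 0], and the total time is close to the slowest single read rather than the sum of all reads. That reordering is why a query without an explicit ORDER BY can return rows in a different order once the limit is raised.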

How are these changes tested?

Existing tests.

andygrove

LGTM. Thanks @mbutrovich

Extract changes from reader_perf branch.

ed09459

andygrove approved these changes Feb 24, 2026

mbutrovich merged commit d0aa1ff into apache:main Feb 25, 2026
213 of 215 checks passed

mbutrovich deleted the iceberg-data-file-concurrency-limit branch February 25, 2026 00:44

mbutrovich merged 1 commit into apache:main from mbutrovich:iceberg-data-file-concurrency-limit
