feat: Add V2 scan support for native_iceberg_compat
#3272
Closed
Which issue does this PR close?
This PR adds support for the `native_iceberg_compat` scan implementation with V2 data sources (`BatchScanExec` with `ParquetScan`).

Rationale for this change
Previously, V2 Parquet scans always used the legacy `BatchReader` (the `native_comet` approach) regardless of the `COMET_NATIVE_SCAN_IMPL` setting. This was inconsistent with V1 scans, which respect the scan impl configuration.

This change enables V2 scans to use the DataFusion-based `NativeBatchReader` when `native_iceberg_compat` or `auto` is specified, which is important for deprecating the legacy mutable buffer code.

What changes are included in this PR?
New Files
- `CometNativeParquetPartitionReaderFactory`: A V2 partition reader factory that uses `NativeBatchReader` (DataFusion-based Parquet reader)
- `CometNativeParquetScan`: A V2 scan trait that creates the new reader factory

Behavior
- `auto` or `native_iceberg_compat`: Use `CometNativeParquetScan` (DataFusion-based reader)
- `native_comet`: Use existing `CometParquetScan` (legacy JNI-based `BatchReader`)
- `native_datafusion`: Fall back to Spark (not yet supported for V2)

How are these changes tested?
- `CometScanRuleSuite` with tests for V2 scan behavior
- `ParquetReadV2Suite` to verify V2 scans work with the new implementation
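
For reviewers, the mapping from scan impl setting to V2 scan described under Behavior can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class `V2ScanDispatchSketch`, the enum `V2ScanChoice`, and the method `chooseV2Scan` are hypothetical names, and throwing on an unrecognized value is an assumption rather than documented behavior.

```java
import java.util.Locale;

public class V2ScanDispatchSketch {

    /** Hypothetical labels for the three outcomes listed under Behavior. */
    enum V2ScanChoice {
        COMET_NATIVE_PARQUET_SCAN, // CometNativeParquetScan (DataFusion-based reader)
        COMET_PARQUET_SCAN,        // existing CometParquetScan (legacy BatchReader)
        SPARK_FALLBACK             // leave the V2 scan to Spark
    }

    /** Maps a scan impl setting value to the V2 scan that would be used. */
    static V2ScanChoice chooseV2Scan(String scanImpl) {
        switch (scanImpl.trim().toLowerCase(Locale.ROOT)) {
            case "auto":
            case "native_iceberg_compat":
                return V2ScanChoice.COMET_NATIVE_PARQUET_SCAN;
            case "native_comet":
                return V2ScanChoice.COMET_PARQUET_SCAN;
            case "native_datafusion":
                // Not yet supported for V2, so fall back to Spark.
                return V2ScanChoice.SPARK_FALLBACK;
            default:
                // Assumption: the sketch rejects values it does not recognize.
                throw new IllegalArgumentException("Unknown scan impl: " + scanImpl);
        }
    }

    public static void main(String[] args) {
        String[] impls = {"auto", "native_iceberg_compat", "native_comet", "native_datafusion"};
        for (String impl : impls) {
            System.out.println(impl + " -> " + chooseV2Scan(impl));
        }
    }
}
```

Keeping the decision in one function makes the `native_datafusion` fallback explicit, so it is not simply an unhandled case.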