Propagate schema.name-mapping.default from table metadata into FileScanTask #2518

@prakharjain09

### Is your feature request related to a problem or challenge?

`FileScanTask.name_mapping` is always set to `None` during scan planning,
even when the table sets the `schema.name-mapping.default` property. The
code still carries a TODO for this:

https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/scan/context.rs#L140

// TODO: Extract name_mapping from table metadata property "schema.name-mapping.default"
name_mapping: None,

### Impact

Readers rely on the name mapping to resolve field IDs for Parquet files that
either lack field IDs or have conflicting ones. The existing
record-batch-transformer and projection code already handles
`Some(NameMapping)` correctly, but the value never reaches it from the planner.
As a result, **scans of tables that depend on `schema.name-mapping.default`
(e.g. tables migrated from Hive, files written without Iceberg field-ID
metadata) silently fall back to position-based ID assignment**.
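
To make the failure mode concrete, here is a minimal sketch contrasting name-based and position-based ID resolution. It is not the iceberg-rust implementation: both helper functions are hypothetical, the mapping is flattened to a plain `HashMap`, and the assumption that the positional fallback hands out IDs `1..=n` in file column order is made only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper: resolve each file column to a field ID by name.
/// Columns missing from the mapping resolve to `None`.
fn ids_from_name_mapping(columns: &[&str], mapping: &HashMap<&str, i32>) -> Vec<Option<i32>> {
    columns.iter().map(|c| mapping.get(c).copied()).collect()
}

/// Hypothetical positional fallback (an assumption for this sketch):
/// the i-th column in the file receives field ID i + 1.
fn ids_from_position(columns: &[&str]) -> Vec<i32> {
    (1..=columns.len() as i32).collect()
}

/// Table schema for the example: field 1 is "id", field 2 is "name".
fn example_mapping() -> HashMap<&'static str, i32> {
    HashMap::from([("id", 1), ("name", 2)])
}
```

With a file whose columns are stored as `["name", "id"]`, `ids_from_name_mapping` yields `[Some(2), Some(1)]`, while `ids_from_position` yields `[1, 2]`, which would pair each column with the wrong table field.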

### Describe the solution you'd like

Parse the `schema.name-mapping.default` property once in
`TableScanBuilder::build`, store the resulting `Arc<NameMapping>` on
`PlanContext`, and thread it through `ManifestFileContext` and
`ManifestEntryContext` so it lands on `FileScanTask`. Malformed JSON should
surface as `ErrorKind::DataInvalid` from `build()` rather than being silently
dropped.
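
The shape of the proposed change could look roughly like the sketch below. Everything in it is a simplified stand-in rather than the crate's real API: the structs carry only the `name_mapping` field, `build_plan_context` stands in for the relevant part of `TableScanBuilder::build`, the three conversion methods are hypothetical, and `parse_name_mapping` is a placeholder check where the real code would deserialize the JSON into `NameMapping`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

const DEFAULT_NAME_MAPPING_PROPERTY: &str = "schema.name-mapping.default";

// Stand-in types; the real ones live in the iceberg crate and hold more fields.
#[derive(Debug, PartialEq)]
struct NameMapping { raw: String }

#[derive(Debug, PartialEq)]
enum ErrorKind { DataInvalid }

#[derive(Debug)]
struct Error { kind: ErrorKind, message: String }

/// Placeholder parser (assumption): accepts anything shaped like a JSON array.
/// Real code would deserialize the property value into `NameMapping`.
fn parse_name_mapping(json: &str) -> Result<NameMapping, Error> {
    let trimmed = json.trim();
    if trimmed.starts_with('[') && trimmed.ends_with(']') {
        Ok(NameMapping { raw: trimmed.to_string() })
    } else {
        Err(Error {
            kind: ErrorKind::DataInvalid,
            message: format!("invalid {DEFAULT_NAME_MAPPING_PROPERTY}: {trimmed}"),
        })
    }
}

#[derive(Debug)]
struct PlanContext { name_mapping: Option<Arc<NameMapping>> }
struct ManifestFileContext { name_mapping: Option<Arc<NameMapping>> }
struct ManifestEntryContext { name_mapping: Option<Arc<NameMapping>> }
struct FileScanTask { name_mapping: Option<Arc<NameMapping>> }

/// Parse the property once; an absent property is fine, a malformed one is an error.
fn build_plan_context(properties: &HashMap<String, String>) -> Result<PlanContext, Error> {
    let name_mapping = properties
        .get(DEFAULT_NAME_MAPPING_PROPERTY)
        .map(|json| parse_name_mapping(json))
        .transpose()? // Option<Result<..>> -> Result<Option<..>>, then propagate the error
        .map(Arc::new);
    Ok(PlanContext { name_mapping })
}

impl PlanContext {
    /// Hypothetical hand-off: each manifest file shares the same Arc.
    fn manifest_file_context(&self) -> ManifestFileContext {
        ManifestFileContext { name_mapping: self.name_mapping.clone() }
    }
}

impl ManifestFileContext {
    /// Hypothetical hand-off: each manifest entry shares the same Arc.
    fn manifest_entry_context(&self) -> ManifestEntryContext {
        ManifestEntryContext { name_mapping: self.name_mapping.clone() }
    }
}

impl ManifestEntryContext {
    /// The mapping finally lands on the task instead of a hard-coded `None`.
    fn into_file_scan_task(self) -> FileScanTask {
        FileScanTask { name_mapping: self.name_mapping }
    }
}
```

Cloning the `Arc` at each hop keeps the per-file cost to a reference-count increment, so the JSON is parsed once per scan rather than once per manifest entry.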

### Willingness to contribute

I would be willing to contribute to this feature with guidance from the Iceberg Rust community.

Labels: enhancement
