Projection Pushdown with Newly Added Columns Fails for Old Batches #3423

@loserwang1024

Description

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.9.0 (latest release)

Please describe the bug 🐞

Our system supports schema evolution: when a new column is added to the table schema, existing data files are not rewritten. Instead, the query engine handles the missing column at read time by injecting NULL or default values (client-side compatibility). This works correctly in most scenarios.

However, when projection pushdown is applied and the query explicitly selects the newly added column, the scan operator may try to read that column directly from storage. Old data files do not contain it, so the read fails because the column vector is missing from the batch.
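
To make the expected behavior concrete, here is a minimal sketch of how a reader could resolve a pushed-down projection against the schema an old batch was written with. It is not Fluss code: `ProjectionRemapSketch`, `remapProjection`, `projectRow`, and the integer column ids are all hypothetical names chosen for illustration, and it assumes NULL (rather than a default value) is injected for a missing column.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names are Fluss APIs. */
public class ProjectionRemapSketch {

    /** Marks a projected column that the batch's write-time schema does not contain. */
    static final int MISSING = -1;

    /**
     * Maps each projected column id to its position in the schema the batch was
     * written with, or MISSING if the column was added after the batch was written.
     */
    static int[] remapProjection(List<Integer> batchColumnIds, int[] projectedColumnIds) {
        int[] positions = new int[projectedColumnIds.length];
        for (int i = 0; i < projectedColumnIds.length; i++) {
            // List.indexOf returns -1 when the id is absent, which equals MISSING.
            positions[i] = batchColumnIds.indexOf(projectedColumnIds[i]);
        }
        return positions;
    }

    /** Builds the projected row, substituting null for columns missing from the old batch. */
    static Object[] projectRow(Object[] storedRow, int[] positions) {
        Object[] out = new Object[positions.length];
        for (int i = 0; i < positions.length; i++) {
            out[i] = positions[i] == MISSING ? null : storedRow[positions[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        // Old batch written with column ids 0 (id) and 1 (name); column 2 was added later.
        List<Integer> oldBatchColumnIds = Arrays.asList(0, 1);
        int[] projection = {2, 0}; // e.g. SELECT new_col, id
        int[] positions = remapProjection(oldBatchColumnIds, projection);
        Object[] row = projectRow(new Object[] {42, "alice"}, positions);
        System.out.println(Arrays.toString(row)); // prints [null, 42]
    }
}
```

The point of the sketch is that the projection is resolved per batch against that batch's own schema, so a column absent from old data is filled in rather than read from storage.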

Solution

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
