[C++] ReadRel is translated to a source node that emits unexpected fields #32516

@asfimport

Description

Currently, executing a Substrait plan whose RelRoot contains a ReadRel produces extra, unexpected fields in the output, namely __fragment_index et al. Right now they are always included by default. There are a few things to be done:

  • ReadRel's base_schema could be converted into a ScanOptions.dataset_schema to limit the fields read. (Also see ARROW-15585: these fields should be used for projection pushdown.)
  • The scanner always adds these extra fields; maybe they should be opt-in instead.
  • There's no way to manually insert a Project to "fix" things because, as implemented, it can only add new columns, not drop existing ones.
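
To make the intended behavior concrete, here is a minimal sketch of the selection step, written against the standard library only rather than the Arrow API. `SelectBaseSchemaColumns` is a hypothetical helper, not an existing Arrow function: given the column names the source node emits and the names declared in ReadRel's base_schema, it returns the indices of the columns that should survive. Only `__fragment_index` is named in this issue; any other augmented field names used with it are assumptions.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not part of Arrow): returns the indices of the emitted
// columns that are declared in base_schema, in base_schema order. Columns the
// scanner added on its own (e.g. __fragment_index) are not in base_schema and
// are therefore dropped.
std::vector<int> SelectBaseSchemaColumns(
    const std::vector<std::string>& emitted,
    const std::vector<std::string>& base_schema) {
  std::vector<int> keep;
  for (const auto& name : base_schema) {
    auto it = std::find(emitted.begin(), emitted.end(), name);
    if (it != emitted.end()) {
      keep.push_back(static_cast<int>(it - emitted.begin()));
    }
  }
  return keep;
}
```

For example, with emitted columns `{"a", "b", "__fragment_index"}` and a base_schema of `{"a", "b"}`, the helper keeps indices 0 and 1. In Arrow itself the equivalent effect would presumably come from restricting the scan's schema or projection, not from post-filtering like this.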

Reporter: David Li / @lidavidm

Note: This issue was originally created as ARROW-17229. Please see the migration documentation for further details.
