Beam transforms reorder fields #18916

@kennknowles

Description

Currently, transforms such as Select, DropFields, RenameFields, and AddFields can produce schemas whose fields are in an unexpected order. The cause is that FieldAccessDescriptor stores top-level fields and nested fields separately, so the relative order between them is lost. To fix this we should refactor FieldAccessDescriptor: instead of storing the two kinds separately, it should store a single ordered list, where each item in the list may optionally have a nested FieldAccessDescriptor.
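To make the proposed shape concrete, here is a minimal sketch of a descriptor that keeps one ordered list of entries, each optionally carrying a nested descriptor. The class `OrderedFieldAccess` and its methods `withField`, `withNested`, and `topLevelOrder` are hypothetical names invented for illustration; this is not Beam's actual FieldAccessDescriptor API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch only; not the real Beam FieldAccessDescriptor. */
final class OrderedFieldAccess {

  /** One accessed field. `nested` is present only when sub-fields are selected. */
  static final class Entry {
    final String name;
    final Optional<OrderedFieldAccess> nested;

    Entry(String name, Optional<OrderedFieldAccess> nested) {
      this.name = name;
      this.nested = nested;
    }
  }

  // A single list: top-level and nested accesses share one ordering.
  private final List<Entry> entries = new ArrayList<>();

  /** Appends a plain top-level field access. */
  OrderedFieldAccess withField(String name) {
    entries.add(new Entry(name, Optional.empty()));
    return this;
  }

  /** Appends a field access that also selects sub-fields. */
  OrderedFieldAccess withNested(String name, OrderedFieldAccess nested) {
    entries.add(new Entry(name, Optional.of(nested)));
    return this;
  }

  /** Returns the top-level field names in the order they were added. */
  List<String> topLevelOrder() {
    List<String> names = new ArrayList<>();
    for (Entry e : entries) {
      names.add(e.name);
    }
    return names;
  }
}
```

Because nested and plain accesses live in the same list, a descriptor built as `withNested("a", ...)` followed by `withField("c")` records that `a` comes before `c`, which is the information the current split representation cannot express.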

Expected behavior from the transforms:

   DropFields: preserves the order of the original schema.

   RenameFields: preserves the order of the original schema.

   AddFields: adds fields in the order specified. If multiple nested fields under the same top-level field are added, the first reference to that top-level field wins (e.g. adding "a.b", "c", "a.d" results in adding a before c).

   Select: selects fields in the order specified.
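The "first reference wins" rule for AddFields can be sketched as follows. This is an illustration under assumptions, not Beam code: the class `AddFieldsOrder` and the method `group` are hypothetical, field paths are assumed to be dot-separated strings, and only one level of nesting is split off.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper illustrating the expected AddFields ordering. */
final class AddFieldsOrder {

  /**
   * Groups dot-separated field paths by their top-level field.
   * A LinkedHashMap keeps keys in first-insertion order, so a top-level
   * field's position is fixed by the first path that mentions it.
   */
  static Map<String, List<String>> group(List<String> paths) {
    Map<String, List<String>> grouped = new LinkedHashMap<>();
    for (String path : paths) {
      int dot = path.indexOf('.');
      String top = dot < 0 ? path : path.substring(0, dot);
      // Later references to an existing key do not change its position.
      List<String> nested = grouped.computeIfAbsent(top, k -> new ArrayList<>());
      if (dot >= 0) {
        nested.add(path.substring(dot + 1));
      }
    }
    return grouped;
  }
}
```

Calling `AddFieldsOrder.group` with the paths "a.b", "c", "a.d" yields the top-level order a, c, with b and d grouped under a in the order given, matching the example in the list above.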

Imported from Jira BEAM-7301. Original Jira may contain additional context.
Reported by: reuvenlax.
Subtask of issue #18898
