[Avro] Structs inside nested lists with custom fields are not found in lookup #18744

@EmilyMatt

Describe the bug

Currently, the field name used in the lookup is hardcoded as "element", which is the Parquet convention; Arrow uses "item" by convention. However, when the actual read happens, the constructed name uses the list's field name. As a result, for "normal" list arrays (where the item field is named "item"), the lookup will never match.
This also goes against Avro's behaviour, where lists should not have a name at all, and it might cause issues in the future as the lookup evolves (using refs and such).
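
To make the mismatch concrete, here is a minimal sketch in plain Python. It is a hypothetical model of the lookup, not the reader's actual code: `build_lookup`, `read_path`, and the dotted-path key format are all assumptions made for illustration. The lookup side hardcodes "element" for the list child, while the read side builds its path from the list's own field name ("item").

```python
# Hypothetical model of the lookup described above; not the reader's real code.
SCHEMA = {
    "type": "record",
    "name": "root",
    "fields": [{
        "name": "items",
        "type": {
            "type": "array",
            "items": {
                "type": "record",
                "name": "item_record",
                "fields": [
                    {"name": "id", "type": "long"},
                    {"name": "name", "type": "string"},
                ],
            },
        },
    }],
}


def build_lookup(avro_type, prefix="", list_child="element"):
    """Map dotted field paths to the field's index in its parent record."""
    lookup = {}
    if not isinstance(avro_type, dict):
        return lookup  # primitive type name such as "long"
    if avro_type.get("type") == "record":
        for idx, field in enumerate(avro_type["fields"]):
            path = f"{prefix}.{field['name']}" if prefix else field["name"]
            lookup[path] = idx
            lookup.update(build_lookup(field["type"], path, list_child))
    elif avro_type.get("type") == "array":
        # The list child segment is hardcoded (assumed: "element").
        child_prefix = f"{prefix}.{list_child}"
        lookup.update(build_lookup(avro_type["items"], child_prefix, list_child))
    return lookup


def read_path(*segments):
    """Path as constructed at read time, using the list's actual field name."""
    return ".".join(segments)


lookup = build_lookup(SCHEMA)
wanted = read_path("items", "item", "id")  # Arrow-style child name "item"
print(sorted(lookup))
print(wanted in lookup)
```

In this model the lookup only contains keys under `items.element`, so a read-time path that goes through `items.item` can never be found.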

To Reproduce

Create a reader with the following schema:

{
  "type": "record",
  "name": "root",
  "fields": [
    {
      "name": "items",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "item_record",
          "fields": [
            {
              "name": "id",
              "type": "long"
            },
            {
              "name": "name",
              "type": "string"
            }
          ]
        }
      }
    }
  ]
}

Reading will fail because the schema lookup uses "element" as the list's child name.

Expected behavior

Lists should not contribute a sub-name to the lookup at all, and the schema above should be read successfully.
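
One way to picture the expected behaviour is a path builder that skips the list child's name entirely, so the result is the same whether the child is called "item", "element", or anything else. This is only a sketch: the tuple-based field representation and the names `type_paths` and `make_root` are hypothetical stand-ins, not Arrow's or the reader's API.

```python
# Hypothetical stand-ins for Arrow-style types. A type is either a primitive
# name (str), ("struct", [(field_name, type), ...]) or
# ("list", (child_name, child_type)).
def type_paths(dtype, prefix=""):
    """Return dotted paths of all named fields reachable from dtype."""
    paths = []
    if isinstance(dtype, tuple) and dtype[0] == "struct":
        for name, child_type in dtype[1]:
            path = f"{prefix}.{name}" if prefix else name
            paths.append(path)
            paths.extend(type_paths(child_type, path))
    elif isinstance(dtype, tuple) and dtype[0] == "list":
        # The list child's name is deliberately ignored: descend into its
        # type while keeping the list's own path as the prefix.
        _child_name, child_type = dtype[1]
        paths.extend(type_paths(child_type, prefix))
    return paths


def make_root(list_child_name):
    """Build the root type from the reproduction with a chosen child name."""
    item = ("struct", [("id", "int64"), ("name", "utf8")])
    return ("struct", [("items", ("list", (list_child_name, item)))])


print(type_paths(make_root("item")))
print(type_paths(make_root("element")))
```

Both calls produce the same paths, which is the property the lookup would need in order to match regardless of the list's child naming convention.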

Additional context

No response

Labels: bug (Something isn't working)
