Substrait literal conversion should preserve nested nullability #22095

@neilconway

Description

Is your feature request related to a problem or challenge?

DataFusion’s Substrait consumer currently normalizes some nested literal fields to nullable when converting Substrait literals into ScalarValues.

This is intentional today because VirtualTable / Values consumption converts the schema and literal rows separately. DataFusion later materializes the values into Arrow arrays and builds a RecordBatch using the schema. Arrow requires nested field types to match exactly, including nullability, so a mismatch like this can fail even when the value itself is valid:

  schema:  List(Field { name: "item", data_type: Int32, nullable: false })
  literal: List(Field { name: "item", data_type: Int32, nullable: true })
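
To make the failure mode concrete, here is a minimal sketch of an exact-match type check. `DataType`, `Field`, `check_column_type`, and `list_of_int32` are hypothetical, simplified stand-ins written for this issue, not the real arrow-rs types or API; the only point is that structural equality includes the nested `nullable` flag.

```rust
// Hypothetical, simplified stand-ins for Arrow's `DataType` and `Field`.
// These are NOT the real arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Builds `List(Field { name: "item", data_type: Int32, nullable })`.
fn list_of_int32(item_nullable: bool) -> DataType {
    DataType::List(Box::new(Field {
        name: "item".to_string(),
        data_type: DataType::Int32,
        nullable: item_nullable,
    }))
}

/// Strict check modeled on the exact-match requirement described above:
/// the column type must equal the schema type, nested nullability included.
fn check_column_type(schema: &DataType, column: &DataType) -> Result<(), String> {
    if schema == column {
        Ok(())
    } else {
        Err(format!(
            "column type {column:?} does not match schema type {schema:?}"
        ))
    }
}
```

Under these assumptions, `check_column_type(&list_of_int32(false), &list_of_int32(true))` returns an error even though every element of the list could be a valid non-null `Int32`.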

Describe the solution you'd like

Add an expected-type-aware Substrait literal conversion path.

For VirtualTable / Values, the consumer has access to the NamedStruct schema. When converting each literal row, pass the expected top-level Field into literal conversion, and recursively use the expected nested fields for Struct, List, and Map literals.

The existing schema-less literal conversion path should remain available for ordinary expression literals where no expected field/type is available.
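
A rough sketch of how the two paths could share one recursive function. Everything here is hypothetical: `Literal`, `DataType`, `Field`, `field`, and `literal_type` are simplified stand-ins rather than the actual Substrait, Arrow, or DataFusion types, the function only computes the resulting type instead of building a `ScalarValue`, and the `c0`, `c1`, ... struct field names on the schema-less path are an arbitrary choice for illustration. Map is omitted for brevity; it would presumably recurse into the expected key and value fields in the same way.

```rust
// Hypothetical, simplified stand-ins; NOT the real Substrait/Arrow/DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    List(Box<Field>),
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

fn field(name: &str, data_type: DataType, nullable: bool) -> Field {
    Field { name: name.to_string(), data_type, nullable }
}

enum Literal {
    Int32(i32),
    List(Vec<Literal>),
    Struct(Vec<Literal>),
}

/// Computes the type a converted literal would carry.
/// `expected` is `None` on the schema-less path (ordinary expression literals)
/// and `Some` when a schema is available (VirtualTable / Values).
fn literal_type(lit: &Literal, expected: Option<&DataType>) -> Result<DataType, String> {
    match (lit, expected) {
        (Literal::Int32(_), None) | (Literal::Int32(_), Some(DataType::Int32)) => {
            Ok(DataType::Int32)
        }
        // Schema-less: infer from the first element, normalize the item to nullable.
        (Literal::List(items), None) => {
            let first = items
                .first()
                .ok_or_else(|| "cannot infer the type of an empty list".to_string())?;
            Ok(DataType::List(Box::new(field("item", literal_type(first, None)?, true))))
        }
        // Expected-type-aware: validate each element, then reuse the expected item field.
        (Literal::List(items), Some(DataType::List(item))) => {
            for element in items {
                literal_type(element, Some(&item.data_type))?;
            }
            Ok(DataType::List(item.clone()))
        }
        // Schema-less: placeholder names, every child normalized to nullable.
        (Literal::Struct(values), None) => {
            let mut fields = Vec::new();
            for (i, value) in values.iter().enumerate() {
                fields.push(field(&format!("c{i}"), literal_type(value, None)?, true));
            }
            Ok(DataType::Struct(fields))
        }
        // Expected-type-aware: recurse with each expected child field.
        (Literal::Struct(values), Some(DataType::Struct(fields))) => {
            if values.len() != fields.len() {
                return Err(format!(
                    "struct literal has {} values but {} fields were expected",
                    values.len(),
                    fields.len()
                ));
            }
            for (value, expected_field) in values.iter().zip(fields) {
                literal_type(value, Some(&expected_field.data_type))?;
            }
            Ok(DataType::Struct(fields.clone()))
        }
        (_, Some(other)) => Err(format!("literal does not match expected type {other:?}")),
    }
}
```

In this sketch, passing `Some(&expected)` for a list whose item field is non-nullable yields exactly the expected type, while passing `None` yields the nullable-normalized type the consumer produces today. A side effect of threading the expected type through is that an empty list literal can be typed on the schema-aware path, whereas the schema-less path here has nothing to infer from and returns an error.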

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement