Skip to content

Conversation

@Blizzara
Copy link
Contributor

@Blizzara Blizzara commented Jun 26, 2024

Similar to https://github.com/apache/datafusion/pull/10874/files

Which issue does this PR close?

Closes #.

Rationale for this change

Arrow requires schema to match exactly to literals. Substrait contains nullability separate in literals and types, and the nullability of a literal may vary
row-by-row.

The only reliable way I can think of is to either not use nullability at all (this PR), or alternatively pass the expected type into from_substrait_literal so that it can use the expected nullability.

What changes are included in this PR?

Ignore nullability when consuming Substrait struct type and literal, always erring on the side of nullable = true

Are these changes tested?

Tested by existing unit tests

Are there any user-facing changes?

Similar to https://github.com/apache/datafusion/pull/10874/files

Arrow requires schema to match exactly to literals.
Substrait contains nullability separate in literals and types,
and the nullability of a literal may vary
row-by-row.
The only reliable way is to either not use nullability at all (this PR),
or alternatively pass the expected type into from_substrait_literal
so that it can use the expected nullability.
@Blizzara Blizzara marked this pull request as ready for review June 26, 2024 20:39
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good step forward even though it seems less than idea

@waynexia do you have any thoughts or concerns about this approach?

@alamb alamb merged commit e52b5e5 into apache:main Jun 30, 2024
@alamb
Copy link
Contributor

alamb commented Jun 30, 2024

Thanks again @Blizzara

@waynexia
Copy link
Member

waynexia commented Jul 1, 2024

Don't use the nullability field seems to be a good workaround we can get at present. This (as well as #10874) didn't make things dangerous as we already ignore this field in some places, but they made the logic consistent. So this approach looks good to me, the downside of losing nullability information in substrait seems acceptable.

Thanks @Blizzara and @alamb for pushing this forward ❤️

@Blizzara Blizzara deleted the avo/substrait-struct-ignore-nullability branch July 2, 2024 15:41
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
Similar to https://github.com/apache/datafusion/pull/10874/files

Arrow requires schema to match exactly to literals.
Substrait contains nullability separate in literals and types,
and the nullability of a literal may vary
row-by-row.
The only reliable way is to either not use nullability at all (this PR),
or alternatively pass the expected type into from_substrait_literal
so that it can use the expected nullability.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants