Fix extension type metadata propagation through casts #22162
Draft
paleolimbot wants to merge 8 commits into
Which issue does this PR close?
Rationale for this change
The logical Expr::Cast and Expr::TryCast have a FieldRef target that was added in #18136 so that logical casts can express a cast to an extension type. In combination with a SQL type planner (#20676) and an optimizer rule, this enabled casts to and from extension types with custom semantics to actually occur.
The ability to do this was reverted by #20836 (which removed the original test), and I am not sure that ability ever made it into a release.
What changes are included in this PR?
This PR strips specific metadata keys (extension name and extension metadata) when propagating metadata from the source of a cast to the target, because keeping them may result in an invalid destination field that consumers could reject. It also propagates all metadata from the (logical) cast target field, e.g., so that a cast to an extension type represented by the cast target field will have a to_field() that communicates the extension type.
For the physical cast, this PR strips the extension name and metadata keys from the source field for the default cast (i.e., where the target field of the physical cast is just a DataType). This is needed so that the logical and physical behaviour agree. The physical cast's target field comes from the logical cast's target field, so the extra metadata added by the logical cast field is already there.
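The metadata rule described above can be illustrated with a minimal sketch. This is not the code in this PR: cast_output_metadata is a hypothetical helper that works on plain HashMaps rather than Arrow fields, and letting target entries overwrite source entries on a key collision is an assumption. The two key names are Arrow's reserved extension type keys.

```rust
use std::collections::HashMap;

// Arrow's reserved field metadata keys for extension types.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";
const EXTENSION_METADATA_KEY: &str = "ARROW:extension:metadata";

/// Hypothetical helper (not a DataFusion API): compute the metadata of a
/// cast's output field from the source field metadata and the cast target
/// field metadata.
fn cast_output_metadata(
    source: &HashMap<String, String>,
    target: &HashMap<String, String>,
) -> HashMap<String, String> {
    // Keep every source entry except the extension type keys, since the
    // source extension type no longer describes the cast result.
    let mut out: HashMap<String, String> = source
        .iter()
        .filter(|(k, _)| {
            k.as_str() != EXTENSION_NAME_KEY && k.as_str() != EXTENSION_METADATA_KEY
        })
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();

    // Propagate all target metadata, including extension keys if present.
    // Assumption: target entries win when a key exists on both sides.
    out.extend(target.iter().map(|(k, v)| (k.clone(), v.clone())));
    out
}
```

With an empty target map (the default cast to a plain DataType), only the non-extension source metadata survives; with a target that carries an extension name, the output reports the target's extension type rather than the source's.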
In the planner, I re-added the behaviour where a cast to an extension type is rejected with an error. Casting to an extension type can be implemented with an optimizer rule, planner, or by the mechanism I have in the works in #21071.
I would prefer to strip all metadata across a cast (as we do for scalar function calls), but every released version of DataFusion currently propagates it, so this workaround is perhaps less disruptive.
Are these changes tested?
Yes
Are there any user-facing changes?
In practice it was not common to create an Expr::Cast with field metadata internally, so I don't think users will see metadata changes from the inclusion of metadata from the target field. I would be surprised if stripping the extension name/metadata from the source were disruptive (keeping it was more likely to have caused errors).