Preserve logical cast field semantics during physical lowering and schema rewrite#20836
Open
kosiew wants to merge 12 commits intoapache:mainfrom
Open
Preserve logical cast field semantics during physical lowering and schema rewrite#20836kosiew wants to merge 12 commits intoapache:mainfrom
kosiew wants to merge 12 commits intoapache:mainfrom
Conversation
Update cast handling in planner.rs to retain logical target FieldRef metadata during the cast lowering process. Introduced a new field-aware helper in cast.rs for this purpose, which ensures that metadata, name, and nullability intent are kept intact, even for same-type casts. Updated planner tests to verify: - Preservation of metadata and nullability in lowered casts - CastExpr production for same-type casts with different field semantics - Regression checks for standard non-metadata casts - Enhanced CAST behavior to retain extension metadata while ensuring TRY_CAST still rejects it.
Change `cast_with_target_field_and_options` to crate-internal by modifying its visibility. Remove public re-export of the planner-only helper while retaining planner access via the internal module path. Make the `cast` module available within the crate for internal usage.
Make the cast module private again by changing its visibility to module only. Provide crate-internal access to the helper function from expressions::mod, and update the planner to use this internal entry point instead of direct module exposure.
Simplify is_valid_cast logic to compute a single decision. Return a single not_impl_err! for invalid cast cases. Construct CastExpr::new_with_target_field(...) once on the successful path, enhancing code clarity and reducing duplication.
Rename the test from `test_cast_to_extension_type` to `test_cast_preserves_extension_metadata` to better reflect its purpose. Replace the arrow.uuid fixture with a neutral extension name "datafusion.test.int64_extension". Update the assertion to check for the new extension name, ensuring that the test remains focused on verifying that cast lowering preserves target field metadata while surfacing extension metadata in the TryCast error path.
Eliminate unnecessary DataType clone and simplify the cast-validity branches into a single boolean expression. Ensure that the behavior and public surface remain unchanged while improving code efficiency.
Restore no-op cast canonicalization in cast_with_target_field_and_options(...). Update CastExpr::nullable() to align with runtime nullability, while maintaining logical target field semantics via return_field(). Adjust tests to ensure type-changing casts preserve logical target field semantics, and validate cast nullability against runtime semantics.
…uct casting validation
Refine cast handling in both cast.rs and schema_rewriter.rs. Explicit same-type casts now preserve CastExpr semantics, while default type-only casts are elided. Update planner tests to properly distinguish between the two and ensure consistent unified behavior across adapters. Added low-level tests for preserved and elided same-type cases for better coverage.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
Physical lowering in
planner.rswas dropping logical cast field semantics by forwarding only the targetDataTypewhen building physical cast expressions. That meant metadata-bearing target fields were rejected or reduced to type-only behavior, and planner-produced casts could diverge from adapter-produced casts.This patch fixes that gap by preserving the logical target
Fieldthrough lowering and related rewrite paths. It also aligns cast behavior across the planner and schema rewriter, so field metadata and logical nullability intent remain available in the produced physical expression.What changes are included in this PR?
This PR updates physical cast construction to be field-aware end to end.
Expr::Castinplanner.rswith field-aware lowering using the logical targetFieldRef.cast_with_target_field_and_optionsso physical cast construction can preserve target field metadata and nullability semantics.CastExprwhen the target field carries distinct field semantics such as name, metadata, or nullability.CastExpr::nullable()so runtime nullability follows the child expression, while logical target field nullability remains exposed throughreturn_field().CastExprinstances instead ofCastColumnExpr, including name-based physical column resolution when indexes differ.Are these changes tested?
Yes.
This PR adds and updates unit tests in the affected areas to cover:
CastExprAre there any user-facing changes?
There are no intended user-facing API changes, but physical cast expressions now preserve logical field metadata and nullability intent more faithfully during lowering. This should improve correctness and consistency for queries that depend on field metadata or logical cast semantics.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.