[Variant] When possible, cast perfectly shredded children #9862
Open
AdamGS wants to merge 2 commits into
Contributor
Author
OK, it seems some tests are not fully deterministic, and I missed that.
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
klion26
reviewed
May 25, 2026
}
None
match from_type {
Member
This seems to overlap with can_cast_types. Can we simplify it?
#[test]
fn test_perfect_shredding_list_cast_gate_uses_variant_element_semantics() {
    let int64_item = Arc::new(Field::new("item", DataType::Int64, true));
Member
Do we need to add more tests to cover the logic of can_use_perfect_shredding_arrow_cast here?
    variant_array: &VariantArray,
    as_field: &Field,
    cast_options: &CastOptions,
) -> Result<Option<ArrayRef>> {
Member
Why do we need the Result in the return value? Is it OK to use Option<ArrayRef> here?
// Use Arrow's vectorized cast when it cleanly matches the shredded representation. If not,
// fall back to row-wise extraction to preserve the existing variant-specific semantics.
Ok(cast_with_options(target_array.as_ref(), as_field.data_type(), cast_options).ok())
Member
Does this mean that cast_with_options returned an error if the return value here is None? Do we need to add some tests to cover this?
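As a rough illustration of the pattern being discussed, the sketch below uses only the standard library to show how `Result::ok()` turns a batch-level cast error into `None`, which a caller can treat as a signal to fall back to row-wise extraction. The functions `vectorized_cast`, `row_wise_extract`, `try_fast_path`, and `cast_or_fall_back` are hypothetical stand-ins and are not part of arrow-rs or this PR; string parsing stands in for Arrow's `cast_with_options`.

```rust
use std::num::ParseIntError;

/// Hypothetical stand-in for a vectorized cast: the whole batch fails
/// on the first value that cannot be converted.
fn vectorized_cast(values: &[&str]) -> Result<Vec<i64>, ParseIntError> {
    values.iter().map(|v| v.parse::<i64>()).collect()
}

/// Hypothetical row-wise fallback: a value that cannot be converted
/// becomes None instead of failing the batch.
fn row_wise_extract(values: &[&str]) -> Vec<Option<i64>> {
    values.iter().map(|v| v.parse::<i64>().ok()).collect()
}

/// Same shape as the line under review: `.ok()` discards the error,
/// so None here means "the fast path did not apply".
fn try_fast_path(values: &[&str]) -> Option<Vec<i64>> {
    vectorized_cast(values).ok()
}

/// Use the fast path when it succeeds, otherwise fall back row by row.
fn cast_or_fall_back(values: &[&str]) -> Vec<Option<i64>> {
    match try_fast_path(values) {
        Some(fast) => fast.into_iter().map(Some).collect(),
        None => row_wise_extract(values),
    }
}
```

In this toy model, a `None` from `try_fast_path` only ever comes from an error in the batch-level cast, so a test that feeds one unconvertible value would exercise the fallback branch.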
Which issue does this PR close?
variant_get to cast kernel #8982
Rationale for this change
When we can cast the perfectly shredded array directly, doing so is a significant performance boost. We do that only for a subset of types where Arrow's casting matches the existing cast behavior in parquet-variant-compute.
One issue here is that the casting behavior is slightly different, but that seems aligned with #8982 and other work.
What changes are included in this PR?
If an array is perfectly shredded AND can be cast according to the existing cast semantics, we cast it instead of falling back to the row builder.
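The two-part condition above can be sketched as a small decision function. This is a toy model under stated assumptions, not the PR's code: `ToyType`, `ToyColumn`, `Path`, `casts_match_variant_semantics`, and `choose_path` are hypothetical names, "perfectly shredded" is modeled as "no row carries an untyped fallback value", and the allow-list of type pairs is illustrative only.

```rust
/// Hypothetical, simplified set of types for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyType {
    Int32,
    Int64,
    Utf8,
}

/// Which extraction strategy the gate selects.
#[derive(Debug, PartialEq)]
enum Path {
    VectorizedCast,
    RowBuilder,
}

/// Toy shredded column: the type of the typed values plus the number
/// of rows that still carry an untyped fallback value (assumption).
struct ToyColumn {
    typed_type: ToyType,
    fallback_value_count: usize,
}

/// Illustrative allow-list of casts assumed to behave the same as the
/// row-wise variant cast. The real list in the PR may differ.
fn casts_match_variant_semantics(from: ToyType, to: ToyType) -> bool {
    matches!(
        (from, to),
        (ToyType::Int32, ToyType::Int32)
            | (ToyType::Int32, ToyType::Int64)
            | (ToyType::Int64, ToyType::Int64)
    )
}

/// Both conditions must hold to take the fast path; anything else
/// goes through the row builder.
fn choose_path(col: &ToyColumn, target: ToyType) -> Path {
    let perfectly_shredded = col.fallback_value_count == 0;
    if perfectly_shredded && casts_match_variant_semantics(col.typed_type, target) {
        Path::VectorizedCast
    } else {
        Path::RowBuilder
    }
}
```

Keeping the gate conservative means any type pair not known to match falls through to the row builder, so existing behavior is preserved by default.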
Are these changes tested?
In addition to the existing tests, this PR adds a test to make sure it works.
Are there any user-facing changes?
None.