fix: cardinality returns incorrect results for ragged nested arrays #23271
Open
lyne7-sc wants to merge 4 commits into
nuno-faria
approved these changes
Jul 1, 2026
nuno-faria
left a comment
Contributor
Thanks @lyne7-sc, LGTM.
For example, array_dims([[1], [2, 3]]) currently returns [2, 1].
Yeah, array_dims does not make sense without fixed lengths. Maybe it could return an error in the future, like array_add:
> select array_add([1, 2], [3]);
Execution error: array_add requires both list inputs to have the same length per row, got 2 and 1 at row 0
Contributor
Author
Thanks @nuno-faria for the review!
That makes sense.
Which issue does this PR close?
cardinality returns incorrect results for ragged nested arrays #23270.
Rationale for this change
cardinality used compute_array_dims and multiplied the inferred dimensions to compute nested array cardinality. That only works for rectangular nested arrays, where every nested list has the same shape.
For ragged nested arrays, compute_array_dims follows the first nested value shape and can return dimensions that do not describe the actual number of leaf elements. As a result, cardinality can return incorrect results for valid nested arrays.
Examples:
What changes are included in this PR?
This PR changes list cardinality computation to recursively count actual leaf elements instead of multiplying inferred dimensions.
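To illustrate the difference between the two approaches, here is a minimal Python model that uses plain nested lists in place of Arrow list arrays. It is a sketch only, not the Rust code in this PR: the function names `dims_from_first_child`, `cardinality_by_dims`, and `count_leaves` are hypothetical, and null handling is left out.

```python
from math import prod


def dims_from_first_child(value):
    """Infer dimensions by descending only into the first element at each level.

    Hypothetical model of the first-child behavior described above.
    """
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        if not value:
            break  # an empty list has no first child to descend into
        value = value[0]
    return dims


def cardinality_by_dims(value):
    """Old approach (modeled): multiply the inferred dimensions."""
    return prod(dims_from_first_child(value))


def count_leaves(value):
    """New approach (modeled): recursively count the actual leaf elements."""
    if not isinstance(value, list):
        return 1
    return sum(count_leaves(child) for child in value)


ragged = [[1], [2, 3]]
print(dims_from_first_child(ragged))  # [2, 1]
print(cardinality_by_dims(ragged))    # 2
print(count_leaves(ragged))           # 3
```

For rectangular input such as `[[1, 2], [3, 4]]` both functions in this model agree (4), so the two approaches only diverge on ragged input.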
Are these changes tested?
Yes, added sqllogictest coverage.
Are there any user-facing changes?
Bug fix only.
Additional discussion
The root cause is that cardinality() used compute_array_dims() and multiplied the returned dimensions. compute_array_dims() itself follows the first child array when computing nested dimensions. This behavior may also need a separate discussion for ragged nested arrays, because such arrays do not have a single rectangular dimension vector. For example, array_dims([[1], [2, 3]]) currently returns [2, 1].
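One possible direction for that discussion is a dimension computation that rejects ragged input instead of following the first child. The Python sketch below models this on plain nested lists; `strict_dims` is a hypothetical name, the error message is made up for illustration, and nothing here reflects a decision taken in this PR.

```python
def strict_dims(value):
    """Return the dimension vector of a rectangular nested list.

    Raises ValueError when sibling lists disagree on shape (ragged input).
    Hypothetical sketch; not part of this PR.
    """
    if not isinstance(value, list):
        return []  # a leaf contributes no dimensions
    child_dims = [strict_dims(child) for child in value]
    # Every child must report the same shape as the first one.
    if any(d != child_dims[0] for d in child_dims):
        raise ValueError(f"ragged nested array: child shapes {child_dims}")
    return [len(value)] + (child_dims[0] if child_dims else [])


print(strict_dims([[1, 2], [3, 4]]))  # [2, 2]

try:
    strict_dims([[1], [2, 3]])
except ValueError as err:
    print(f"error: {err}")
```

The trade-off is that every child has to be visited, whereas following only the first child is cheaper but silently reports a shape that does not hold for the whole array.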