Merged
…umns dictionary uniqueness when allowing dimension selector cursor, fixes a bug with unnest on realtime segments with empty rows incorrectly specifying index 0 as the row dictionary value
…r-capabilities-more
abhishekagarwal87 approved these changes on Jul 11, 2024
Closed
sreemanamala pushed a commit to sreemanamala/druid that referenced this pull request on Aug 6, 2024
changes:
* fixes a bug with unnest storage adapter not preserving underlying columns dictionary uniqueness when allowing dimension selector cursor
* fixes a bug with unnest on realtime segments with empty rows incorrectly specifying index 0 as the row dictionary value
Description
Fixes of #16690, but tries instead to fix the attempted behavior of coercing `[]` into `[null]` for `UnnestDimensionCursor`, to be less disruptive while we determine what the correct behavior for unnest on `[]` for a multi-value string actually is. For regular arrays, `[]` is skipped per the standard, but for MVDs it looks like the `UnnestDimensionCursor` was trying to match group-by behavior. I don't know whether matching group-by behavior is correct, but the way it was doing it was totally wrong if `null` wasn't id 0 in the dictionary (or if the dictionary had no nulls at all).

This PR fixes that by using the `idLookup` of the underlying dimension selector to look up the id of the null value. If that id is present, it is used; if not, a synthetic `nullId` is created at position 0 and all of the real dictionary ids are offset by 1, as we have done in some other places for other reasons (e.g. adapters for default value mode when segments were written in SQL compatible null handling mode).

This PR has: