New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-23.2: inverted: fix intersection of SpanExpressions in some cases #118360
release-23.2: inverted: fix intersection of SpanExpressions in some cases #118360
Conversation
This commit fixes the logic for intersecting two SpanExpressions in some cases. In particular, consider the following example: - left expr is `ARRAY['str1'] <@ t.links` which translates to a SpanExpression in which only FactoredUnionSpans is set to the span containing `str1` - right expr is `ARRAY ['str1'] && t.links OR ...` which translates to a SpanExpression in which FactoredUnionSpans is set to exactly same span containing `str1` plus some other stuff in children expressions. In other words, we have `a AND (a OR b)`. When intersecting SpanExpressions, we intersect their FactoredUnionSpans, and then we update FactoredUnionSpans of expressions to subtract the shared ones. Previously, in the example above this would result in making the left expression empty, so it would be pruned. However, that is incorrect - we have the equivalent of `left AND right`, so we must ensure that `left` expression is satisfied, and previously this wouldn't be the case. The fix is that if one of the children expressions became empty, then intersection of two children is also empty, so instead of promoting non-empty child we now nil non-empty child out. In the example above, previously we would get `a OR b`, and now we'll correctly get `a`. Release note (bug fix): Previously, in some cases CockroachDB could incorrectly evaluate queries that scanned an inverted index and had a `WHERE` filter in which two sides of the `AND` expression had "similar" expressions (e.g. `ARRAY['str1'] <@ col AND (ARRAY['str1'] && col OR ...)`), and this is now fixed. The bug has been present since pre-22.2 version.
e7d6618
to
778652a
Compare
Thanks for opening a backport. Please check the backport criteria before merging:
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
Also, please add a brief release justification to the body of your PR to justify this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @rafiss, and @yuzefovich)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @blathers-crl[bot], @rafiss, and @yuzefovich)
pkg/sql/inverted/testdata/expression
line 645 at r1 (raw file):
---- span expression ├── tight: true, unique: false
Why is this tight if we don't have spans for b AND c
?
Previously, mgartner (Marcus Gartner) wrote…
|
Previously, rytaft (Rebecca Taft) wrote…
My confusion is how do the spans from only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @rafiss and @yuzefovich)
pkg/sql/inverted/testdata/expression
line 645 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
My confusion is how do the spans from only
a
fully describea OR (b AND c)
?
Oh. a AND (a OR (b AND c))
is just a
. I misread the comment above:
Backport 1/1 commits from #118256 on behalf of @yuzefovich.
/cc @cockroachdb/release
This commit fixes the logic for intersecting two SpanExpressions in some cases. In particular, consider the following example:
ARRAY['str1'] <@ t.links
which translates to a SpanExpression in which only FactoredUnionSpans is set to the span containingstr1
ARRAY ['str1'] && t.links OR ...
which translates to a SpanExpression in which FactoredUnionSpans is set to exactly same span containingstr1
plus some other stuff in children expressions.In other words, we have
a AND (a OR b)
.When intersecting SpanExpressions, we intersect their FactoredUnionSpans, and then we update FactoredUnionSpans of expressions to subtract the shared ones. Previously, in the example above this would result in making the left expression empty, so it would be pruned. However, that is incorrect - we have the equivalent of
left AND right
, so we must ensure thatleft
expression is satisfied, and previously this wouldn't be the case. The fix is that if one of the children expressions became empty, then intersection of two children is also empty, so instead of promoting non-empty child we now nil non-empty child out. In the example above, previously we would geta OR b
, and now we'll correctly geta
.Fixes: #117979.
Release note (bug fix): Previously, in some cases CockroachDB could incorrectly evaluate queries that scanned an inverted index and had a
WHERE
filter in which two sides of theAND
expression had "similar" expressions (e.g.ARRAY['str1'] <@ col AND (ARRAY['str1'] && col OR ...)
), and this is now fixed. The bug has been present since pre-22.2 version.Release justification: bug fix.