release-23.2: opt: fix inverted index constrained scans for equality filters #112791
This commit fixes a bug introduced in #101178 that allows the optimizer to generate inverted index scans on columns that are not filtered by the query. For example, an inverted index over the column `j1` could be scanned for a filter involving a different column, like `j2 = '5'`. The bug is caused by a simple omission of code that must check that the column in the filter is an indexed column.
Fixes #111963
There is no release note because this bug is not present in any releases.
Release note: None
This commit makes `randgen` more likely to generate single-column indexes. It is motivated by the bug #111963, which surprisingly lived on the master branch for six months without being detected. It's not entirely clear why TLP or other randomized tests did not catch the bug, which has such a simple reproduction.
One theory is that indexes tend to be multi-column and constrained scans on multi-column inverted indexes are not commonly planned for randomly generated queries because the set of requirements to generate the scan is very specific: the query must hold each prefix column constant, e.g. `a=1 AND b=2 AND j='5'::JSON`. The likelihood of randomly generating such an expression may be so low that the bug was not caught.
By making 50% of indexes single-column, this bug may have been more likely to be caught because only the inverted index column needs to be constrained by an equality filter.
Release note: None
b7c629a to 1776e14
b930c08 to 99f5f57
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
Reviewed 3 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @rharding6373)
Backport 2/2 commits from #112654 on behalf of @mgartner.
/cc @cockroachdb/release
opt: fix inverted index constrained scans for equality filters
This commit fixes a bug introduced in #101178 that allows the optimizer
to generate inverted index scans on columns that are not filtered by
the query. For example, an inverted index over the column `j1` could be
scanned for a filter involving a different column, like `j2 = '5'`. The
bug is caused by a simple omission of code that must check that the
column in the filter is an indexed column.
Fixes #111963
There is no release note because this bug is not present in any
releases.
Release note: None
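A minimal Python sketch of the kind of check the commit message describes as missing. It is illustrative only and is not the optimizer's code: `EqualityFilter`, `InvertedIndex`, and `can_constrain_scan` are hypothetical names, and the model is reduced to a single-column inverted index and a single `column = constant` filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EqualityFilter:
    """Hypothetical model of a filter such as j2 = '5'."""
    column: str
    value: str


@dataclass(frozen=True)
class InvertedIndex:
    """Hypothetical model of a single-column inverted index."""
    name: str
    column: str


def can_constrain_scan(index: InvertedIndex, flt: EqualityFilter) -> bool:
    """Return True only if the filter references the indexed column.

    Skipping this comparison is the shape of the bug described above:
    an index over j1 would be considered for a filter on j2.
    """
    return flt.column == index.column


j1_index = InvertedIndex(name="j1_idx", column="j1")

# A filter on a different column must not produce a constrained scan.
print(can_constrain_scan(j1_index, EqualityFilter("j2", "5")))  # False

# A filter on the indexed column may.
print(can_constrain_scan(j1_index, EqualityFilter("j1", "5")))  # True
```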
randgen: generate single-column indexes more often
This commit makes `randgen` more likely to generate single-column
indexes. It is motivated by the bug #111963, which surprisingly lived on
the master branch for six months without being detected. It's not
entirely clear why TLP or other randomized tests did not catch the bug,
which has such a simple reproduction.
One theory is that indexes tend to be multi-column and constrained scans
on multi-column inverted indexes are not commonly planned for randomly
generated queries because the set of requirements to generate the scan
is very specific: the query must hold each prefix column constant, e.g.
`a=1 AND b=2 AND j='5'::JSON`. The likelihood of randomly generating
such an expression may be so low that the bug was not caught.
By making 10% of indexes single-column, this bug may have been more
likely to be caught because only the inverted index column needs to be
constrained by an equality filter.
Release note: None
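The theory above can be sketched in Python under simplifying assumptions. This is not `randgen`'s implementation: `pick_index_width`, `scan_is_constrainable`, and `DEMO_SINGLE_COL_PROB` are hypothetical, and the probability is left as a parameter with an arbitrary demo value. The sketch shows a generator biased toward single-column indexes, plus the requirement that every index column (prefix columns and the inverted column) be held constant by an equality filter.

```python
import random

# Arbitrary value for this demo only; not randgen's actual setting.
DEMO_SINGLE_COL_PROB = 0.3


def pick_index_width(rng: random.Random, num_table_cols: int,
                     single_col_prob: float) -> int:
    """Pick how many columns a randomly generated index should have.

    With probability single_col_prob the index is forced to one column;
    otherwise the width is uniform over 1..num_table_cols.
    """
    if num_table_cols < 1:
        raise ValueError("table must have at least one column")
    if rng.random() < single_col_prob:
        return 1
    return rng.randint(1, num_table_cols)


def scan_is_constrainable(index_cols, equality_cols) -> bool:
    """Simplified rule from the commit message: every index column,
    prefix columns included, must appear in an equality filter."""
    return all(col in equality_cols for col in index_cols)


rng = random.Random(0)
widths = [pick_index_width(rng, 4, DEMO_SINGLE_COL_PROB) for _ in range(1000)]
print("fraction single-column:", widths.count(1) / len(widths))

# Multi-column index (a, b, j): a filter on j alone is not enough.
print(scan_is_constrainable(["a", "b", "j"], {"j"}))            # False
print(scan_is_constrainable(["a", "b", "j"], {"a", "b", "j"}))  # True

# Single-column index (j): one equality filter suffices.
print(scan_is_constrainable(["j"], {"j"}))                      # True
```

A random query needs only one matching equality filter to exercise a single-column inverted index, versus one per column for a multi-column index, which is why biasing the generator raises the chance of hitting this code path.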
Release justification: Fixes a regression that causes incorrect
results for queries involving inverted JSON indexes.