-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql, opt: support hint to disallow full scan, update coster to avoid full scans #71317
Conversation
I'm going to add a few more tests, so hold off on reviewing for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I love to see this.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @RaduBerinde)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed just the first commit for now.
Reviewed 8 of 8 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, @RaduBerinde, and @rytaft)
-- commits, line 5 at r1:
I brought this up when disallow_full_table_scans
was added and AFAIK this was intentional. @arulajmani Do you remember why we made the decision to have disallow_full_table_scans
prevent full scans on partial indexes?
docs/generated/sql/bnf/stmt_block.bnf, line 2804 at r1 (raw file):
| 'NO_INDEX_JOIN' | 'NO_ZIGZAG_JOIN' | 'NO_FULL_SCAN'
nit: seems like this change should not be in the first commit?
pkg/sql/opt/exec/execbuilder/relational.go, line 673 at r1 (raw file):
// Save if we planned a full table/index scan on the builder so that the // planner can be made aware later. We only do this for non-virtual tables. if !tab.IsVirtualTable() && isUnfiltered {
nit: assignment to a variable seems unnecessary
f6811dd
to
fed1e6b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! The other commits are RFAL now too.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @mgartner, @michae2, and @RaduBerinde)
Previously, mgartner (Marcus Gartner) wrote…
I brought this up when
disallow_full_table_scans
was added and AFAIK this was intentional. @arulajmani Do you remember why we made the decision to havedisallow_full_table_scans
prevent full scans on partial indexes?
Maybe it was intentional, but it seems very suboptimal to me -- why should a full scan on INDEX (b) WHERE b > 0
fail while a constrained scan of INDEX (b)
succeeds?
docs/generated/sql/bnf/stmt_block.bnf, line 2804 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: seems like this change should not be in the first commit?
Oops good catch. Fixed
pkg/sql/opt/exec/execbuilder/relational.go, line 673 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: assignment to a variable seems unnecessary
The variable is used again in a later commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I think many people will be really happy with this!
Reviewed 5 of 8 files at r1, 15 of 15 files at r6, 8 of 8 files at r7, 12 of 12 files at r8, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @arulajmani, @mgartner, and @RaduBerinde)
pkg/sql/opt/exec/execbuilder/relational.go, line 665 at r8 (raw file):
Quoted 4 lines of code…
// Normally a full scan of a partial index would be allowed with the // NO_FULL_SCAN hint (isUnfiltered is false for partial indexes), but if the // user has explicitly forced the partial index *and* used NO_FULL_SCAN, we // disallow the full index scan.
Just clarifying: is this because forcing the partial index without a matching predicate could lead to a full scan of the partial index plus a constrained scan of another index to include the rows not in the partial index? And that wouldn't be detected as a full scan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some tests with UPDATE
and DELETE
? (we can have a filter like k IN (...)
where the list is significantly larger than the table size - this kind of query would otherwise do a full table scan. There are some existing tests with hints in execbuilder/testdata/update and delete
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @arulajmani and @mgartner)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 15 of 15 files at r6, 8 of 8 files at r7, 12 of 12 files at r8, all commit messages.
Reviewable status:complete! 3 of 0 LGTMs obtained (waiting on @arulajmani)
-- commits, line 5 at r1:
I agree with you on this, just wanted to bring it to your attention. I searched for anything we wrote down about this decision but didn't find anything. My guess is that we just used the same logic from the `sql.log.slow_query.experimental_full_table_scans.enabled] setting added here.
Tthe docs say this about disallow_full_table_scans
:
If set to on, all queries that have planned a full table or full secondary index scan will return an error message. This setting does not apply to internal queries, which may plan full table or index scans without checking the session variable.
Maybe we should clarify full non-partial secondary index scan
?
pkg/sql/opt/exec/execbuilder/relational.go, line 673 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
The variable is used again in a later commit
Oops, sorry.
Release note (sql change): Fixed an oversight in which a full scan of a partial index could be rejected due to the disallow_full_table_scans setting. Full scans of partial indexes will no longer be rejected if disallow_full_table_scans is true, since a full scan of a partial index must be a constrained scan of the table.
Updated the coster so that full table scans and full scans of non-partial indexes are given a "huge cost" if disallow_full_table_scans is true and stats are unavailable or the estimated row count is greater than large_full_scan_rows. Since disallow_full_table_scans and large_full_scan_rows can now affect the chosen plan, the optimizer now tracks these settings in the memo to ensure cached plans are not stale. Release note (sql change): The optimizer has been updated so that if disallow_full_table_scans is true, it will never plan a full table scan with an estimated row count greater than large_full_scan_rows. If no alternative plan is possible, an error will be returned, just as it was before. However, cases where an alternative plan is possible will no longer produce an error, since the alternative plan will be chosen. As a result, users should see fewer errors due to disallow_full_table_scans. A side effect of this change is that if disallow_full_table_scans is set along with statement-level hints such as an index hint, the optimizer will try to avoid a full scan while also respecting the index hint. If this is not possible, the optimizer will return an error and might not log the attempted full table scan or update the sql.guardrails.full_scan_rejected.count metric. If no index hint is used, the full scan will be logged and the metric updated.
Release note (sql change): Added support for a new index hint, NO_FULL_SCAN, which will prevent the optimizer from planning a full scan for the specified table. The hint can be used in the same way as other existing index hints. For example, SELECT * FROM table_name@{NO_FULL_SCAN};. Note that a full scan of a partial index may still be planned, unless NO_FULL_SCAN is forced in combination with a specific partial index via FORCE_INDEX=index_name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Could you add some tests with UPDATE and DELETE?
Done.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @arulajmani, @mgartner, @michae2, and @stbof)
Previously, mgartner (Marcus Gartner) wrote…
I agree with you on this, just wanted to bring it to your attention. I searched for anything we wrote down about this decision but didn't find anything. My guess is that we just used the same logic from the `sql.log.slow_query.experimental_full_table_scans.enabled] setting added here.
Tthe docs say this about
disallow_full_table_scans
:If set to on, all queries that have planned a full table or full secondary index scan will return an error message. This setting does not apply to internal queries, which may plan full table or index scans without checking the session variable.
Maybe we should clarify
full non-partial secondary index scan
?
Good call. I assume this will get picked up from the release note, but cc @stbof for awareness that the docs on disallow_full_table_scans
will need an update.
pkg/sql/opt/exec/execbuilder/relational.go, line 665 at r8 (raw file):
Previously, michae2 (Michael Erickson) wrote…
// Normally a full scan of a partial index would be allowed with the // NO_FULL_SCAN hint (isUnfiltered is false for partial indexes), but if the // user has explicitly forced the partial index *and* used NO_FULL_SCAN, we // disallow the full index scan.
Just clarifying: is this because forcing the partial index without a matching predicate could lead to a full scan of the partial index plus a constrained scan of another index to include the rows not in the partial index? And that wouldn't be detected as a full scan?
Forcing the partial index without a matching predicate would fail with an error "index "index_name" is a partial index that does not contain all the rows needed to execute this query".
I just added this exception because I've seen examples from customer queries where CRDB was performing a full scan of the partial index when it really could have constrained it further. For example, if we have a partial index index (a) where a > 0
and a predicate a = 1
, we should be able to use the partial index and constrain it to a: [/1 - /1]
. This exception will allow us to hint more precisely in cases like this.
It also just seems intuitive that if you're using NO_FULL_SCAN
with a specific index, you really want to avoid a full scan of that index.
Does this help clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @arulajmani, @mgartner, @michae2, and @stbof)
pkg/sql/opt/exec/execbuilder/relational.go, line 665 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Forcing the partial index without a matching predicate would fail with an error "index "index_name" is a partial index that does not contain all the rows needed to execute this query".
I just added this exception because I've seen examples from customer queries where CRDB was performing a full scan of the partial index when it really could have constrained it further. For example, if we have a partial index
index (a) where a > 0
and a predicatea = 1
, we should be able to use the partial index and constrain it toa: [/1 - /1]
. This exception will allow us to hint more precisely in cases like this.It also just seems intuitive that if you're using
NO_FULL_SCAN
with a specific index, you really want to avoid a full scan of that index.Does this help clarify?
Yes, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @arulajmani, @mgartner, @michae2, and @stbof)
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from ed0af7e to blathers/backport-release-21.2-71317: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
Thanks for doing this! I'm excited to check it out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 2 stale)
@arulajmani Do you remember why we made the decision to have disallow_full_table_scans prevent full scans on partial indexes?
Just catching up on my mentions. I can't remember exactly how we ended up on this behaviour, but I think we were concerned about an unbounded partial index scan. This was probably the wrong way of looking at things now that I see the example Becca illustrated above so happy for things to change here!
Adding |
There is a docs ticket for this: cockroachdb/docs#12279. Actually looking into this more. This PR has several release notes and we had a period of time when some doc issues were not created. Actually cockroachdb/docs#11936 is the doc ticket for |
sql: partial index scans are exempt from
disallow_full_table_scans
Release note (sql change): Fixed an oversight in which a full scan of a
partial index could be rejected due to the
disallow_full_table_scans
setting. Full scans of partial indexes will no longer be rejected if
disallow_full_table_scans
is true, since a full scan of a partial indexmust be a constrained scan of the table.
opt: update coster to avoid full scans with
disallow_full_table_scans
Updated the coster so that full table scans and full scans of non-partial
indexes are given a "huge cost" if
disallow_full_table_scans
is true andstats are unavailable or the estimated row count is greater than
large_full_scan_rows
. Since disallow_full_table_scans andlarge_full_scan_rows
can now affect the chosen plan, the optimizer now tracksthese settings in the memo to ensure cached plans are not stale.
Fixes #70795
Release note (sql change): The optimizer has been updated so that if
disallow_full_table_scans
is true, it will never plan a full table scan withan estimated row count greater than
large_full_scan_rows
. If no alternativeplan is possible, an error will be returned, just as it was before. However,
cases where an alternative plan is possible will no longer produce an error,
since the alternative plan will be chosen. As a result, users should see
fewer errors due to
disallow_full_table_scans
.A side effect of this change is that if
disallow_full_table_scans
is set alongwith statement-level hints such as an index hint, the optimizer will try to
avoid a full scan while also respecting the index hint. If this is not
possible, the optimizer will return an error and might not log the attempted
full table scan or update the
sql.guardrails.full_scan_rejected.count
metric.If no index hint is used, the full scan will be logged and the metric updated.
opt,sql: support hint to disallow full scan
Release note (sql change): Added support for a new index hint,
NO_FULL_SCAN
, which will prevent the optimizer from planning afull scan for the specified table. The hint can be used in the
same way as other existing index hints. For example,
SELECT * FROM table_name@{NO_FULL_SCAN};
. Note that a full scanof a partial index may still be planned, unless
NO_FULL_SCAN
isforced in combination with a specific partial index via
FORCE_INDEX=index_name
.