release-21.2: sql, opt: support hint to disallow full scan, update coster to avoid full scans #71437

Merged: 3 commits, Nov 15, 2021

@rytaft (Collaborator) commented Oct 12, 2021

Backport 3/3 commits from #71317.

/cc @cockroachdb/release

Release justification: low risk, high benefit change to existing functionality


sql: partial index scans are exempt from disallow_full_table_scans

Release note (sql change): Fixed an oversight in which a full scan of a
partial index could be rejected due to the disallow_full_table_scans
setting. Full scans of partial indexes will no longer be rejected if
disallow_full_table_scans is true, since a full scan of a partial index
must be a constrained scan of the table.

opt: update coster to avoid full scans with disallow_full_table_scans

Updated the coster so that full table scans and full scans of non-partial
indexes are given a "huge cost" if disallow_full_table_scans is true and
stats are unavailable or the estimated row count is greater than
large_full_scan_rows. Since disallow_full_table_scans and
large_full_scan_rows can now affect the chosen plan, the optimizer now tracks
these settings in the memo to ensure cached plans are not stale.
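
The costing rule and the memo check described above can be sketched roughly as follows. This is a minimal illustration, not the actual coster code: the `scanInfo`, `settings`, and `memo` types, the `hugeCost` value, and the function names are all hypothetical and chosen only to mirror the wording of the commit message.

```go
package main

import "fmt"

// hugeCost is a hypothetical stand-in for the "huge cost" mentioned in the
// commit message; the real constant and its value are not shown in this PR text.
const hugeCost = 1e100

// settings mirrors the two session settings named in the commit message.
type settings struct {
	disallowFullTableScans bool
	largeFullScanRows      float64
}

// scanInfo describes one candidate scan. All fields are illustrative.
type scanInfo struct {
	fullScan       bool    // the scan has no constraint
	partialIndex   bool    // the scan reads a partial index
	statsAvailable bool    // table statistics exist
	estimatedRows  float64 // estimated row count from statistics
	baseCost       float64 // cost before any penalty
}

// scanCost adds the penalty only for full scans of tables or non-partial
// indexes, and only when stats are missing or the estimate exceeds the limit.
func scanCost(s scanInfo, cfg settings) float64 {
	if cfg.disallowFullTableScans && s.fullScan && !s.partialIndex {
		if !s.statsAvailable || s.estimatedRows > cfg.largeFullScanRows {
			return s.baseCost + hugeCost
		}
	}
	return s.baseCost
}

// memo remembers the settings a cached plan was built with (assumed shape).
type memo struct {
	cfg settings
}

// isStale reports whether the cached plan was built under different settings.
func (m memo) isStale(current settings) bool {
	return m.cfg != current
}

func main() {
	cfg := settings{disallowFullTableScans: true, largeFullScanRows: 1000}
	big := scanInfo{fullScan: true, statsAvailable: true, estimatedRows: 5000, baseCost: 10}
	partial := big
	partial.partialIndex = true

	fmt.Println("large full scan penalized:", scanCost(big, cfg) >= hugeCost)
	fmt.Println("partial index scan penalized:", scanCost(partial, cfg) >= hugeCost)

	m := memo{cfg: cfg}
	changed := settings{disallowFullTableScans: true, largeFullScanRows: 50}
	fmt.Println("memo stale after setting change:", m.isStale(changed))
}
```

Comparing the stored settings against the current ones is what keeps a plan cached under one value of large_full_scan_rows from being reused under another.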

Fixes #70795

Release note (sql change): The optimizer has been updated so that if
disallow_full_table_scans is true, it will never plan a full table scan with
an estimated row count greater than large_full_scan_rows. If no alternative
plan is possible, an error will be returned, just as it was before. However,
cases where an alternative plan is possible will no longer produce an error,
since the alternative plan will be chosen. As a result, users should see
fewer errors due to disallow_full_table_scans.

A side effect of this change is that if disallow_full_table_scans is set along
with statement-level hints such as an index hint, the optimizer will try to
avoid a full scan while also respecting the index hint. If this is not
possible, the optimizer will return an error and might not log the attempted
full table scan or update the sql.guardrails.full_scan_rejected.count metric.
If no index hint is used, the full scan will be logged and the metric updated.

opt,sql: support hint to disallow full scan

Release note (sql change): Added support for a new index hint,
NO_FULL_SCAN, which will prevent the optimizer from planning a
full scan for the specified table. The hint can be used in the
same way as other existing index hints. For example,
SELECT * FROM table_name@{NO_FULL_SCAN};. Note that a full scan
of a partial index may still be planned, unless NO_FULL_SCAN is
forced in combination with a specific partial index via
FORCE_INDEX=index_name.

@rytaft rytaft requested review from mgartner, RaduBerinde, michae2 and a team October 12, 2021 01:24
@rytaft rytaft requested a review from a team as a code owner October 12, 2021 01:24
@blathers-crl (bot) commented Oct 12, 2021

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity (Member)

This change is Reviewable

@rytaft (Collaborator, Author) commented Oct 12, 2021

Will wait for 21.2.1

@michae2 (Collaborator) left a comment

:lgtm:

Reviewed 7 of 7 files at r1, 3 of 7 files at r2, 15 of 15 files at r4, 14 of 14 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)
