Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: better logging about the optimizer as well as some cancellation checks #85041

Merged
merged 2 commits into from Jul 27, 2022

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jul 26, 2022

sql: add logging about the optimizer into the trace

This commit derives a separate tracing span on the main query path that
covers the whole optimizer as well as adds a few logging messages around
the optbuilder and the optimization. It also audits some of the already
present logs to be written down before performing possibly long
operations.

Release note: None

xform: check for context cancellation

reviously, the optimizer didn't check the context cancellation ever
which could lead to surprising behavior - e.g. if the statement timeout
occurs in the middle of the optimization, we would still finish the
whole optimization phase only to realize that we were canceled at the
execution time.

This situation is now improved by periodically checking for the context
cancellation every time a group is optimized by optimizeGroup. It's
still possible for the optimizer to continue running long after
cancellation if there is a long running procedure at a deeper level.
For example, a long-running, individual invocation of an exploration
rule will not halt due to context cancellation. However,
the cancellation improvement in this commit should halt long-running
optimization in most cases.

Fixes: #70245.
Addresses: #70314.

Release note: None

@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label Jul 26, 2022
@yuzefovich yuzefovich requested review from mgartner, rytaft and a team July 26, 2022 00:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the opt-tracing branch 2 times, most recently from 5bc8e7b to 6e3f944 Compare July 26, 2022 17:18
@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label Jul 26, 2022
@yuzefovich
Copy link
Member Author

Rebased on top of master, RFAL.

@yuzefovich yuzefovich force-pushed the opt-tracing branch 2 times, most recently from e02ed2c to 7b5b2ed Compare July 26, 2022 21:19
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for doing this! :lgtm:

Reviewed 16 of 18 files at r1, 7 of 7 files at r3, 5 of 5 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)


-- commits line 20 at r4:
nit: make this more specific, like:

This situation is now improved by periodically checking for the context cancellation every time a group is optimized by optimizeGroup. It's still possible for the optimizer to continue running long after cancellation if there is a long running procedure at a deeper level. For example, a long-running, individual invocation of an exploration rule will not halt due to context cancellation. However, the cancellation improvement in this commit should halt long-running optimization in most cases.

This commit derives a separate tracing span on the main query path that
covers the whole optimizer as well as adds a few logging messages around
the optbuilder and the optimization. It also audits some of the already
present logs to be written down before performing possibly long
operations.

Release note: None
@yuzefovich
Copy link
Member Author

Fixed the commit message and moved some changes from the second commit into the first one so that it compiles.

TFTR!

bors r+

Previously, the optimizer didn't check the context cancellation ever
which could lead to surprising behavior - e.g. if the statement timeout
occurs in the middle of the optimization, we would still finish the
whole optimization phase only to realize that we were canceled at the
execution time.

This situation is now improved by periodically checking for the context
cancellation every time a group is optimized by `optimizeGroup`. It's
still possible for the optimizer to continue running long after
cancellation if there is a long running procedure at a deeper level.
For example, a long-running, individual invocation of an exploration
rule will not halt due to context cancellation. However,
the cancellation improvement in this commit should halt long-running
optimization in most cases.

Release note: None
@craig
Copy link
Contributor

craig bot commented Jul 27, 2022

Canceled.

@yuzefovich
Copy link
Member Author

Forgot to include the new logic test file, now fixed.

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 27, 2022

Build succeeded:

@craig craig bot merged commit 658bf2b into cockroachdb:master Jul 27, 2022
@yuzefovich yuzefovich deleted the opt-tracing branch July 27, 2022 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest/sqlsmith: query timed out during the optimization but didn't cancel
3 participants