Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add metric for queries run with optimizer #26981

Merged
merged 1 commit into from Jul 18, 2018

Conversation

justinj
Copy link
Contributor

@justinj justinj commented Jun 26, 2018

Suggested by @knz.

Release note (sql change): A new metric sql.optimizer.count has been
added which tracks the number of queries with with the experimental
cost-based optimizer.

@justinj justinj requested review from knz, rytaft, mrtracy and a team June 26, 2018 14:47
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz
Copy link
Contributor

knz commented Jun 27, 2018

This is nice. Thanks!

While you're at it, perhaps print out the value of the metric at the end of logic tests with t.Log()?


Reviewed 6 of 6 files at r1.
Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/executor_statement_metrics.go, line 128 at r1 (raw file):

	if optUsed {
		m.SQLOptCount.Inc(1)

Move this below automaticRetryCount == 0 - we only count once, upon the first execution attempt.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jun 27, 2018

Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/executor_statement_metrics.go, line 128 at r1 (raw file):

Previously, knz (kena) wrote…

Move this below automaticRetryCount == 0 - we only count once, upon the first execution attempt.

Oh hm. I realize this last comment is based on an undocumented assumption. I think it would be good that you satisfy yourself that:

  1. this method (recordStatementSummary) is indeed called again on every attempt, with an increasing value of automaticRetryCount every time (perhaps you'll want to add a comment above the function and under the conditional below to explain this fully to future readers)
  2. you indeed only want to count unique issues of the queries with opt enabled, not executions

I also realize as I am writing this that the call to RecordStatement() below may not make sense if there are automatic retries -- what does it record exactly for every attempt after the first? Please file an issue with that question and cc me and david on it. thanks


Comments from Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jun 27, 2018

I ran into enough roadblocks printing out the value that I don't think it's worth the time right now.


Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/executor_statement_metrics.go, line 128 at r1 (raw file):

Previously, knz (kena) wrote…

Oh hm. I realize this last comment is based on an undocumented assumption. I think it would be good that you satisfy yourself that:

  1. this method (recordStatementSummary) is indeed called again on every attempt, with an increasing value of automaticRetryCount every time (perhaps you'll want to add a comment above the function and under the conditional below to explain this fully to future readers)
  2. you indeed only want to count unique issues of the queries with opt enabled, not executions

I also realize as I am writing this that the call to RecordStatement() below may not make sense if there are automatic retries -- what does it record exactly for every attempt after the first? Please file an issue with that question and cc me and david on it. thanks

done, and opened #27015


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jun 28, 2018

:lgtm: thank you


Reviewed 1 of 1 files at r2.
Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jun 28, 2018

Oy I completely forgot something: we also want this to go to the reg server.
It's relatively simple, you do this as follows:

  1. add an "opt" flag in roachpb/app_stats.proto StatementStatisticsKey
  2. add it also to stmtKey in sql/app_stats.go
  3. add the parameter to (*appStats) recordStatement() in app_stats.go
  4. propagate the new field in stmtKey to StatementStatisticsKey in (*sqlStats) getStmtStats() in the same file
  5. invent a character for (stmtKey) flags() to report the flag. Feel free to choose anything that's not valid at the start of a SQL query nor a potential comment character, if you lack inspiration you can take *.

@rytaft
Copy link
Collaborator

rytaft commented Jun 29, 2018

This is great, but it seems like we might also want to track which statements are actually planned by the optimizer, without falling back to the 2.0 planner. Is there a way to do that?


Reviewed 5 of 6 files at r1, 1 of 1 files at r2.
Review status: :shipit: complete! 1 of 0 LGTMs obtained


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jun 30, 2018

This is great, but it seems like we might also want to track which statements are actually planned by the optimizer, without falling back to the 2.0 planner. Is there a way to do that?

If we plug the statement stat key like I recommend in my previous comment, the opt and non-opt statements will be counted separately (and reported to our reg server!) That's exactly what we need for PM.

@justinj
Copy link
Contributor Author

justinj commented Jun 30, 2018

@rytaft optMode gets mutated in this function, so its value at this point reflects whether the optimizer was used or not, which I think measures what you described, unless I'm misunderstanding?

@rytaft
Copy link
Collaborator

rytaft commented Jun 30, 2018

Oops sorry I was confused. Yep, this is what I was talking about.

It might also be interesting to track the number of queries that fell back, vs. those that were just planned by the 2.0 planner in the first place. Maybe we can already figure that out from settings? Are the cluster/session settings also tracked in the same place?


Comments from Reviewable

@justinj
Copy link
Contributor Author

justinj commented Jul 16, 2018

Sorry, forgot about this! I think you're right that that's important @rytaft - I added a sql.optimizer.fallback.count metric, PTAL.

@rytaft
Copy link
Collaborator

rytaft commented Jul 16, 2018

Nice! :lgtm:

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: but please file my request above in a separate issue, and assign the issue to me and the project "Telemetry". Thank you.

Reviewed 1 of 6 files at r1, 5 of 5 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

@knz
Copy link
Contributor

knz commented Jul 17, 2018

this one: #26981 (comment)

@justinj
Copy link
Contributor Author

justinj commented Jul 17, 2018

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 17, 2018

Build failed

Release note (sql change): A new metric sql.optimizer.count has been
added which tracks the number of queries with with the experimental
cost-based optimizer.
@justinj
Copy link
Contributor Author

justinj commented Jul 17, 2018

looks like a flake

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 17, 2018

Build failed

@justinj
Copy link
Contributor Author

justinj commented Jul 18, 2018

flake

bors r+

craig bot pushed a commit that referenced this pull request Jul 18, 2018
26981: sql: add metric for queries run with optimizer r=justinj a=justinj

Suggested by @knz.

Release note (sql change): A new metric sql.optimizer.count has been
added which tracks the number of queries with with the experimental
cost-based optimizer.

27651: sql: move lease acquisition out of tableState r=vivekmenezes a=vivekmenezes

This change removes lease acquisition from the tableState
struct and makes lease acquisition a part of LeaseStore.

Future work: Move acquireFreshest and release code paths
out of tableState.

The eventual goal is to have tableState be a simple cache of
table descriptors with the leasing code separated out and
only responsible for the expiration time of the latest descriptor
in the cache.

related to #23510

Release note: None

Co-authored-by: Justin Jaffray <justin@cockroachlabs.com>
Co-authored-by: Vivek Menezes <vivek@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Jul 18, 2018

Build succeeded

@craig craig bot merged commit 4d8fb20 into cockroachdb:master Jul 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants