sql, server: fix errors seen from combined statements endpoint #74831
Conversation
Force-pushed 48d54d0 to 90f510d
Not sure if we want to include the third commit in this fix as it wasn't discussed too much, so I can remove it or create a separate PR for it.
Force-pushed 7dfdb14 to 429aca9
Was the response to the browser endpoint where we hit the size issues or the response to the internal gRPC call to collect stats from each node?
Reviewed 15 of 21 files at r1, 4 of 6 files at r2, 6 of 6 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @xinhaoz)
pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 85 at r3 (raw file):
// SQLStatsAggregationInterval is the cluster setting that controls the aggregation
// interval for stats when we flush to disk
var SQLStatsAggregationInterval = settings.RegisterDurationSetting(
nit: should we add some comment or check that this value is always bigger than sql.stats.flush.interval?
pkg/cli/zip_test.go, line 93 at r3 (raw file):
'statement_statistics', 'transaction_statistics', 'cluster_transaction_statistics',
you added this twice
I tried this out and it works nicely. I think it strikes a good balance. Do we have any clusters with high traffic and large node counts to validate this on? Once upon a time there were long running clusters where such bugs would have been obvious. I hope we can bring them back one day. Curious about the methodology we have today.
I think it'd be nice to see some execbuilder
logic tests that show query plans for queries made by the various requests which demonstrate that they don't end up doing any full table scans. You can add a new file or two to sql/opt/exec/execbuilder/testdata
which have the queries as they'd be issued from the http endpoint and have it rewrite the plans. That'll provide confidence and prevent regressions.
When I looked at the queries I was a little surprised to see no limits. I wonder if that may prove to also be a key part of the problem here.
The thing that leaves me most anxious is a race condition where either we see the stats in memory and then also see them on disk, or we don't see them in either place because they've just been flushed out of memory but our disk read was at a timestamp before the flush. Solving that problem definitely shouldn't happen here, but it's something we should track. Solving it is a little tricky. We'd need to make sure to ship a timestamp in the RPCs, and we'd need to hold on to some in-memory state after we've flushed it to serve. The follower reads make this problem much, much more common.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @maryliag, and @xinhaoz)
pkg/sql/crdb_internal.go, line 4986 at r3 (raw file):
}

var crdbInternalClusterStmtStatsTable = virtualSchemaTable{
I think this deserves commentary noting that it will only contain data that has not yet been flushed to the persistent system tables. Same for the other new table.
pkg/sql/crdb_internal.go, line 5086 at r3 (raw file):
}

var crdbInternalStmtStatsView = virtualSchemaView{
This deserves commentary explaining what it is and why it exists. Same for the other view.
pkg/sql/logictest/testdata/logic_test/crdb_internal, line 646 at r1 (raw file):
I think you can lose this. I suspect it was added by the rewrite.
Code quote:
/Table/62/1/1 /Table/62/1/2
/Table/62/1/2 /Table/62/1/3
/Table/62/1/3 /Table/62/2/1
/Table/62/2/1 /Table/62/2/2
/Table/62/2/2 /Table/62/2/3
/Table/62/2/3 /Table/62/3/1
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @maryliag, and @xinhaoz)
pkg/server/combined_statement_stats.go, line 109 at r1 (raw file):
}

followerReadClause := "AS OF SYSTEM TIME follower_read_timestamp()"
quick refactor suggestion: since the pattern of checking the testing knobs and returning the string representation of the AOST clause comes up so frequently, we can refactor this logic into a method defined on sqlstats.TestingKnobs.
func (t *TestingKnobs) GetAOSTClause() string {
	if t != nil {
		return t.AOSTClause
	}
	return "AS OF SYSTEM TIME follower_read_timestamp()"
}
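A self-contained sketch of this suggestion (the `TestingKnobs` and `AOSTClause` names are taken from the comment above, not the actual cockroach package): the nil check on the receiver is what lets call sites drop their own `if knobs != nil` boilerplate.

```go
package main

import "fmt"

// TestingKnobs models the sqlstats testing knobs from the review
// comment; only the field relevant to the AOST clause is shown.
type TestingKnobs struct {
	AOSTClause string
}

// GetAOSTClause is safe to call on a nil receiver, which is the
// production case where no knobs are installed.
func (t *TestingKnobs) GetAOSTClause() string {
	if t != nil {
		return t.AOSTClause
	}
	return "AS OF SYSTEM TIME follower_read_timestamp()"
}

func main() {
	var knobs *TestingKnobs // nil in production, non-nil in tests
	fmt.Println(knobs.GetAOSTClause())

	testKnobs := &TestingKnobs{AOSTClause: "AS OF SYSTEM TIME '-1us'"}
	fmt.Println(testKnobs.GetAOSTClause())
}
```

Calling a method on a nil pointer receiver is legal in Go as long as the method body guards its dereferences, which is what makes this pattern work.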
pkg/server/tenant_status.go, line 466 at r1 (raw file):
return getCombinedStatementStats(ctx, req, t.sqlServer.pgServer.SQLServer.GetSQLStatsProvider(), t.sqlServer.internalExecutor, t.sqlServer.execCfg.SQLStatsTestingKnobs)
minor style nit: this function call is getting quite long; it would improve readability a bit to vertically align the arguments:
return getCombinedStatementStats(
ctx,
req,
t.sqlServer.pgServer.SQLServer.GetSQLStatsProvider(),
t.sqlServer.internalExecutor,
t.sqlServer.execCfg.SQLStatsTestingKnobs,
)
pkg/server/serverpb/status.proto, line 1204 at r3 (raw file):
int64 end = 4 [(gogoproto.nullable) = true];
bool exclude_statements = 5;
bool exclude_transactions = 6;
I don't think I'm a very big fan of bolting boolean flags onto an already big endpoint. Is there any reason this approach is favoured over a new split-endpoint approach?
If we really need to go this route, I would prefer an "inclusive" flag over an "exclusive" flag. Or even better, perhaps define an enum here:
enum FetchMode {
StmtAndTxnStats = 0;
StmtStatsOnly = 1;
TxnStatsOnly = 2;
}
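To illustrate why the enum is preferable to two booleans, here is a sketch in Go (the method names are hypothetical, not part of the PR): the enum makes every request state meaningful, whereas two "exclude" flags permit a nonsensical state where both kinds of stats are excluded.

```go
package main

import "fmt"

// FetchMode mirrors the enum proposed in the review comment above.
type FetchMode int

const (
	StmtAndTxnStats FetchMode = iota // proto zero value: fetch both
	StmtStatsOnly
	TxnStatsOnly
)

// includeStatements and includeTransactions are hypothetical helpers
// showing how the server could interpret the mode; every FetchMode
// value includes at least one kind of stats.
func (m FetchMode) includeStatements() bool   { return m != TxnStatsOnly }
func (m FetchMode) includeTransactions() bool { return m != StmtStatsOnly }

func main() {
	for _, m := range []FetchMode{StmtAndTxnStats, StmtStatsOnly, TxnStatsOnly} {
		fmt.Println(m, m.includeStatements(), m.includeTransactions())
	}
}
```

Note that making `StmtAndTxnStats` the zero value also preserves backwards compatibility: old clients that omit the field get today's behavior.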
pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 85 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
nit: should we add some comment or check that this value is always bigger than sql.stats.flush.interval?
Hmm, it's a bit difficult to write a validation function directly for this cluster setting. The closest thing we have is probably a SetOnChange callback that we can register to validate the input, and perhaps log a warning message if we detect any anomaly.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @maryliag, and @xinhaoz)
pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 85 at r3 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm, it's a bit difficult to write a validation function directly for this cluster setting. The closest thing we have is probably a SetOnChange callback that we can register to validate the input, and perhaps log a warning message if we detect any anomaly.
My concern was whether, if that value is smaller, it could cause any issues.
One solution could be that here (https://github.com/cockroachdb/cockroach/pull/74831/files#diff-070ff15c495b929697bdcf0d193a22d0c5e1cbafa596cf6dbf4731bbbf8e7144R255) we compare the values and, if the aggregation interval is smaller, return the flush value instead?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @maryliag, and @xinhaoz)
pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 85 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
My concern was whether, if that value is smaller, it could cause any issues.
One solution could be that here (https://github.com/cockroachdb/cockroach/pull/74831/files#diff-070ff15c495b929697bdcf0d193a22d0c5e1cbafa596cf6dbf4731bbbf8e7144R255) we compare the values and, if the aggregation interval is smaller, return the flush value instead?
Hmm, good point. We could potentially override that if the cluster setting doesn't make sense at all. Perhaps we go one step further and just directly take this into our own hands? If we see the agg interval setting doesn't make sense, we just go ahead and set it to the flush interval. Though we should probably:
- log a message saying that we are overriding the user's setting
- be explicit in the release note that we will have this behavior.
pkg/server/serverpb/status.proto, line 1204 at r3 (raw file):
Previously, Azhng (Archer Zhang) wrote…
I threw this in to help mitigate the response size issues in a way that wouldn't require any changes to the frontend (and potentially further considerations or modifications). We can still go with a split endpoint for stmts and txns in the future, or if we'd rather not expand this EP more and wait for the former to be done, I can remove this commit from the PR.
nit: For the note in commit 2,
Release note (sql change): The default sql stats flush interval
is now 20 minutes...
I think this should be 10 minutes, not 20?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @maryliag, and @xinhaoz)
pkg/server/serverpb/status.proto, line 1204 at r3 (raw file):
Previously, xinhaoz (Xin Hao Zhang) wrote…
I threw this in to help mitigate the response size issues in a way that wouldn't require any changes to the frontend (and potentially further considerations or modifications). We can still go with a split endpoint for stmts and txns in the future, or if we'd rather not expand this EP more and wait for the former to be done, I can remove this commit from the PR.
Let's split the last commit into its own PR since it's less urgent for the 21.2.5 release. We can iterate on the API design a bit more.
Force-pushed 1559fea to 38c69e5
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @kevin-v-ngo, @maryliag, and @xinhaoz)
pkg/server/combined_statement_stats.go, line 140 at r9 (raw file):
app_name, aggregated_ts, jsonb_set(
I think you might want to rebase against @THardy98's PR or there's gonna be a merge conflict.
pkg/server/combined_statement_stats.go, line 242 at r9 (raw file):
func collectCombinedTransactions(
	ctx context.Context,
	ie *sql.InternalExecutor,
	whereClause *string,
	qargs []interface{},
Any particular reason it's a string pointer here? In Go, a string is just a struct with a length and a pointer to the underlying bytes, so the overhead of passing the string is not that high.
Force-pushed 38c69e5 to 3d52725
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @kevin-v-ngo, @maryliag, and @xinhaoz)
pkg/server/combined_statement_stats.go, line 139 at r6 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Let's open an issue to track a new endpoint that prettifies the statement just for the Statement Detail page.
Ah right, rebased and filed: #75466
pkg/server/combined_statement_stats.go, line 242 at r9 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Any particular reason it's a string pointer here? In Golang string is just a struct with length and pointer to underlying bytes, so the overhead of passing the string is not that high.
TIL! Removed the pointer for the clause string.
Hmm, seems like CI is complaining about a malformed SQL statement.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @kevin-v-ngo, @maryliag, and @xinhaoz)
pkg/sql/crdb_internal.go, line 5305 at r11 (raw file):
max(metadata),
crdb_internal.merge_transaction_stats(array_agg(statistics)),
aggregation_interval
small nit: misaligned indentation
Force-pushed 3d52725 to 617ed5e
This is looking great! Added a few nits and comments.
Reviewed 7 of 42 files at r10, 29 of 35 files at r11, 2 of 2 files at r12, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @kevin-v-ngo, @maryliag, and @xinhaoz)
pkg/sql/sqlstats/test_utils.go, line 38 at r12 (raw file):
// GetAOSTClause returns the appropriate AS OF SYSTEM TIME clause to be
//used when reading from statements and transactions system tables.
small nit: space after //
pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 85 at r12 (raw file):
// SQLStatsAggregationInterval is the cluster setting that controls the aggregation
// interval for stats when we flush to disk
small nit: period
pkg/ui/workspaces/cluster-ui/src/statementsPage/statementsPage.tsx, line 285 at r12 (raw file):
componentDidUpdate = (): void => {
  this.updateQueryParams();
If you remove the updateQueryParams from here, it won't properly update the query params when a tab is changed under the SQL Activity page. That is the main reason we need to keep it here.
pkg/ui/workspaces/cluster-ui/src/statementsPage/statementsPage.tsx, line 285 at r12 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
Oh, these changes weren't meant to be included here, so I'll remove them from this PR.
Force-pushed 617ed5e to 400e1dd
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @kevin-v-ngo, @maryliag, and @xinhaoz)
Reviewed 1 of 35 files at r11, 3 of 3 files at r13, all commit messages.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @ajwerner and @xinhaoz)
Force-pushed 400e1dd to 5d51963
TFTR!!!
Build succeeded.
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 3fe4e08 to blathers/backport-release-21.2-74831: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
You may need to manually resolve merge conflicts with the backport tool.
Backport to branch 21.2.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Commit 1 sql, server: use a view to join persisted and in-mem sql stats
Partially addresses: #71245
Previously, we created virtual tables to join in-memory and
persisted disk statement and transaction statistics. This
proved to be inefficient as requests to the virtual
tables led to full scans of the underlying system tables.
This commit utilizes virtual views to combine in-memory and disk
stats. The view queries in-memory stats from the virtual tables
crdb_internal.cluster_{statement, transaction}_statistics,
and combines them with the results from the system tables.
This allows us to push down query filters into the system
tables, leveraging their existing indexes.
Release note: None
Commit 2 sql: decrease stats flush interval to every 10 mins
Previously, we flushed in-memory sql stats collected by each
node on an hourly interval. We have found that this hourly
interval might be too conservative, and the size of the
returned cluster-wide stats after an hour can also be
quite large, sometimes exceeding the gRPC max message
size.
This commit lowers the flush interval to every 10 minutes.
Since we want to continue to aggregate stats on an hourly
interval, we introduce a new cluster setting
sql.stats.aggregation.interval
to control the aggregation interval separately from the flush frequency.
Release note (sql change): The default sql stats flush interval
is now 10 minutes. A new cluster setting
sql.stats.aggregation.interval
controls the aggregation interval of sql stats, with a default value of 1 hour.
Commit 3 server: allow statements EP to optionally exclude stmt or txns stats
Closes: #71829
Previously, the /statements endpoint returned cluster-wide
in-memory stats, containing both statements and transactions stats.
In the past, we've observed the Statements endpoint response being
too large for gRPC. Because this endpoint is being used by virtual
tables that power our combined stats api,
cluster_{statement,transactions}_stats, we might continue to surpass
the gRPC message size in the new api. However, each virtual table
only uses roughly half the response size (either statement or txn
stats).
This commit allows the virtual tables to exclude statement or txn
stats from the Statements endpoint response by introducing new
request parameters to /statements. This reduces the response size
in the stats virtual tables.
Release note: None