-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: introduce crdb_internal.statement_statistics virtual table #68715
Conversation
d7f32e2
to
3d3e3ad
Compare
3d3e3ad
to
2429c2b
Compare
cc: @kevin-v-ngo . Can you take a look at this new schema and let me know if you have any questions or concerns? This is what I hope will be replacing the current |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - left a few comments.
app_name STRING NOT NULL, | ||
metadata JSONB NOT NULL, | ||
statistics JSONB NOT NULL, | ||
plan JSONB NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the plan for now? Since we have a plan_hash which doesn't correlate to to this plan, it might cause confusion. Users might assume a change in the hash indicates a change in the plan which wouldn't always be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This view will also be used to serve the DB Console. So this column kind of has to here in order for that to happen. I think what we can do here is rename the current virtual table to something like crdb_internal.<some_name_suggests_it_should_not_be_used_by_end_user
and we create a crdb_internal
view on top of it to expose the schema that we want user to query.
Do you have any thoughts on the name we should use? I'm thinking something like crdb_internal.private_statement_statistics
or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline and renamed the column to sampled_plan
.
@@ -1042,6 +1042,23 @@ CREATE TABLE crdb_internal.session_variables ( | |||
value STRING NOT NULL, | |||
hidden BOOL NOT NULL | |||
) {} {} | |||
CREATE TABLE crdb_internal.statement_statistics ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Seems like the exact same schema as system.statement_statistics right?
What's your rationale for making this a table vs a view?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a table here is necessary because we are exposing cluster-wide in-memory data. This data is not accessible via SQL anywhere in the system and that is the reason why we need to create a virtual table to populate the data ourselves in Golang instead of SQL.
fingerprint_id BYTES NOT NULL, | ||
plan_hash INT8 NOT NULL, | ||
app_name STRING NOT NULL, | ||
metadata JSONB NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future, we may want to consider a view which would normalize the JSONB COLUMNS (statistics and metadata) so users wouldn't have to do this on their own.
It would make it easier to consume as well for those who rely on the DDL to understand what fields are available to them. Currently, you need to query the JSONB columns to know what's available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I think this is another point that suggest us to use switch this to a internal virtual table and exposes a user friendly virtual view on top of it.
2429c2b
to
89f8768
Compare
89f8768
to
977b7c7
Compare
Rebased to remove the merged commit. |
metadata JSONB NOT NULL, | ||
statistics JSONB NOT NULL, | ||
sampled_plan JSONB NOT NULL | ||
) CREATE TABLE crdb_internal.statement_statistics ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you duplicate the table definition here intentionally?
// If unsafe flag is set to true, then the new entry is fetched/created without | ||
// locking onto the mutex. This should only be used when caller is absolutely | ||
// sure that the Container is not being used concurrently by multiple goroutines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than calling it unsafe, what if we called it doNotLock
or alreadyLocked
. It might be the case that it's not explicitly locked above this call but it might be implicitly locked. It's not actually "unsafe" to not lock it, it just means that that locking is being handled at a different level.
Alternatively, much of the rest of the cockroach codebase would split this into two methods, one called getStatsForStmtWithKey
and another called getStatsForStmtWithKeyLocked
where the first is:
func (s *Container) getStatsForStmtWithKey(
key stmtKey, stmtFingerprintID roachpb.StmtFingerprintID, createIfNonexistent bool, unsafe bool,
) (stats *stmtStats, created, throttled bool) {
s.mu.Lock()
defer s.mu.Unlock()
return s.getStatsForStmtWithKey(key, stmtFingerprintID, createIfNonexistent)
}
s.mu.apps = make(map[string]*ssmemstorage.Container) | ||
|
||
for i := range statistics { | ||
container := s.unsafeGetStatsContainerForAppName(statistics[i].Key.KeyData.App) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if you sorted the statistics
slice by App
and then made a form of ssmemstorage
that took a slice of either these things or of some of its fields and then we could stop talking about unsafe
in any exported APIs.
// * it bypasses the mutex locking when it's inserting new data since | ||
// it SHOULD NOT be used concurrently by multiple goroutines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean SHOULD NOT be used concurrently
here. What should not be used concurrently?
// Perform RPC fanout. | ||
stats, err := p.extendedEvalCtx.SQLStatusServer.Statements(ctx, &serverpb.StatementsRequest{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any plans to do either streaming or memory accounting here?
b02ef9b
to
f763a8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @kevin-v-ngo)
pkg/sql/crdb_internal.go, line 4984 at r3 (raw file):
Previously, ajwerner wrote…
Do we have any plans to do either streaming or memory accounting here?
Yes: #69032. We would be introducing memory accounting for the RPC endpoint.
For streaming, do you mean the gRPC streaming RPC?
pkg/sql/logictest/testdata/logic_test/create_statements, line 1055 at r3 (raw file):
Previously, ajwerner wrote…
did you duplicate the table definition here intentionally?
This is an intentional duplication of the system table definition (with some book-keeping columns removed. e.g. node_id)
pkg/sql/sqlstats/sslocal/temp_sql_stats.go, line 28 at r3 (raw file):
Previously, ajwerner wrote…
What do you mean
SHOULD NOT be used concurrently
here. What should not be used concurrently?
Hmm this comment actually doesn't make sense. Removed.
pkg/sql/sqlstats/sslocal/temp_sql_stats.go, line 37 at r3 (raw file):
Previously, ajwerner wrote…
what if you sorted the
statistics
slice byApp
and then made a form ofssmemstorage
that took a slice of either these things or of some of its fields and then we could stop talking aboutunsafe
in any exported APIs.
Updated. Now this constructor will sort the slice by App
field and creates ssmemstorage.Container
to ingest that slice.
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 348 at r3 (raw file):
Previously, ajwerner wrote…
Rather than calling it unsafe, what if we called it
doNotLock
oralreadyLocked
. It might be the case that it's not explicitly locked above this call but it might be implicitly locked. It's not actually "unsafe" to not lock it, it just means that that locking is being handled at a different level.Alternatively, much of the rest of the cockroach codebase would split this into two methods, one called
getStatsForStmtWithKey
and another calledgetStatsForStmtWithKeyLocked
where the first is:func (s *Container) getStatsForStmtWithKey( key stmtKey, stmtFingerprintID roachpb.StmtFingerprintID, createIfNonexistent bool, unsafe bool, ) (stats *stmtStats, created, throttled bool) { s.mu.Lock() defer s.mu.Unlock() return s.getStatsForStmtWithKey(key, stmtFingerprintID, createIfNonexistent) }
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 15 files at r2, 4 of 17 files at r3, 2 of 4 files at r4, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, and @kevin-v-ngo)
pkg/sql/logictest/testdata/logic_test/create_statements, line 1055 at r3 (raw file):
Previously, Azhng (Archer Zhang) wrote…
This is an intentional duplication of the system table definition (with some book-keeping columns removed. e.g. node_id)
I don't understand the need for the duplication (and it's exactly the same, none of the two definitions have node_id as you mentioned)
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 348 at r3 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Done.
The comment on the function needs to be updated. You're not passing any unsafe flag on this function anymore
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 210 at r4 (raw file):
} // NewTempContainerFromExistingData creates a new Container by ingests a slice
nit: by ingesting
something is missing on this comment, creates by ingesting the slice into what?
6379927
to
d9209ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @kevin-v-ngo)
pkg/sql/logictest/testdata/logic_test/create_statements, line 1055 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
I don't understand the need for the duplication (and it's exactly the same, none of the two definitions have node_id as you mentioned)
Oh wait you meant the duplicate in the test file? This is because this is selecting from two different columns from the crdb_internal.create_statements
. (create_statements
and create_nofks
).
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 348 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
The comment on the function needs to be updated. You're not passing any unsafe flag on this function anymore
Done.
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 210 at r4 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
nit: by ingesting
something is missing on this comment, creates by ingesting the slice into what?
Done. The comment says it creates a new Container by ingesting the slice. Do you want to word it differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please get the approval from @ajwerner before merging
Reviewed 1 of 17 files at r3, 1 of 4 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @kevin-v-ngo)
d9209ba
to
8ba65a5
Compare
8ba65a5
to
6348748
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits left from me
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @kevin-v-ngo, and @maryliag)
pkg/sql/crdb_internal.go, line 4984 at r3 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Yes: #69032. We would be introducing memory accounting for the RPC endpoint.
For streaming, do you mean the gRPC streaming RPC?
I'm not exactly sure, I suppose that that's an approach. It might not be a good one. Some form of pagination in unary RPCs might also be useful.
The more I think about it, the more I think accounting matters more. Each node shouldn't have more than some number of MBs. Maybe we do something like pre-reserve the memory before sending the RPC and have the server only send up to that many bytes or perhaps fail. Anyway, good you have a tracking issue.
pkg/sql/sqlstats/sslocal/temp_sql_stats.go, line 39 at r6 (raw file):
Quoted 4 lines of code…
tmp := s[i] s[i] = s[j] s[j] = tmp
no need for this in go. This should be:
func (s stmtResponseList) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 384 at r6 (raw file):
} func (s *Container) getStatsForStmtWithKeyNotLocked(
I think this should be called Locked
as in it implies that the data structure is locked. You'll find this idiom all over the kvserver
code base and those methods are often used also during construction. Calling this NotLocked
I find very confusing.
This commit introduces crdb_internal.statement_statistics virtual table that exposes both cluster-wide in-memory statement statistics as well as persited statement statistics. This new virtual table will be used to replace crdb_internal.node_statement_statistics virtual table, which only surface node-local in-memory statement statsitics. Release note (sql change): introduced new crdb_internal.statement_statistics virtual table that surfaces both cluster-wide in-memory statement statistics as well as persited statement statistics.
6348748
to
7c27f3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kevin-v-ngo and @maryliag)
pkg/sql/crdb_internal.go, line 4984 at r3 (raw file):
Previously, ajwerner wrote…
I'm not exactly sure, I suppose that that's an approach. It might not be a good one. Some form of pagination in unary RPCs might also be useful.
The more I think about it, the more I think accounting matters more. Each node shouldn't have more than some number of MBs. Maybe we do something like pre-reserve the memory before sending the RPC and have the server only send up to that many bytes or perhaps fail. Anyway, good you have a tracking issue.
I see. Sounds good,
pkg/sql/sqlstats/sslocal/temp_sql_stats.go, line 39 at r6 (raw file):
Previously, ajwerner wrote…
tmp := s[i] s[i] = s[j] s[j] = tmp
no need for this in go. This should be:
func (s stmtResponseList) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
Done.
pkg/sql/sqlstats/ssmemstorage/ss_mem_storage.go, line 384 at r6 (raw file):
Previously, ajwerner wrote…
I think this should be called
Locked
as in it implies that the data structure is locked. You'll find this idiom all over thekvserver
code base and those methods are often used also during construction. Calling thisNotLocked
I find very confusing.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @kevin-v-ngo and @maryliag)
TFTR! bors r=maryliag,ajwerner |
Build failed (retrying...): |
Build succeeded: |
This commit introduces crdb_internal.transaction_statistics virtual table that exposes both cluster-wide in-memory transaction statistics as well as persited transaction statistics. This new virtual table will be used to replace crdb_internal.node_transaction_statistics virtual table, which only surface node-local in-memory transaction statsitics. Follow up to cockroachdb#68715 Release note (sql change): introduced new crdb_internal.transaction_statistics virtual table that surfaces both cluster-wide in-memory transaction statistics as well as persisted transaction statistics.
This commit introduces crdb_internal.transaction_statistics virtual table that exposes both cluster-wide in-memory transaction statistics as well as persited transaction statistics. This new virtual table will be used to replace crdb_internal.node_transaction_statistics virtual table, which only surface node-local in-memory transaction statsitics. Follow up to cockroachdb#68715 Release note (sql change): introduced new crdb_internal.transaction_statistics virtual table that surfaces both cluster-wide in-memory transaction statistics as well as persisted transaction statistics. Release justification: Category 4
69049: sql: implement crdb_internal.transaction_statistics r=ajwerner,maryliag a=Azhng Depends on #68715 and #69022 # First Commit roachpb,sql,server: add TransactionFingerprintID to roachpb.CollectedTransactionStatistics Previously, roachpb.CollectedTransactionStatistics does not contain TransactionFingerprintID. This results in SQLStatusServer's /_status/statements endpoint not able to return transaction fingerprint id in the response. This commit adds transaction fingerprint id into CollectedTransactionStatistics and simplifies the API in multiple layers. Release note: None # Second Commit sql: introduce crdb_internal.transaction_statistics virtual table This commit introduces crdb_internal.transaction_statistics virtual table that exposes both cluster-wide in-memory transaction statistics as well as persited transaction statistics. This new virtual table will be used to replace crdb_internal.node_transaction_statistics virtual table, which only surface node-local in-memory transaction statsitics. Follow up to #68715 Release justification: Category 4 Release note (sql change): introduced new crdb_internal.transaction_statistics virtual table that surfaces both cluster-wide in-memory transaction statistics as well as persisted transaction statistics. 69151: sql/resolver: wrong error when db does not exist for virtual schemas r=ajwerner a=fqazi Fixes: #68060 Previously, when a database did not exist under a virtual schema the code would fall through do a normal look up. Before a recent refactor of the some of the internals, we accidentally changed behavior, so that an undefined relation error was returned. To address this, this patch adds a check inside the resolver layer to return the correct error instead of falling through. Release note: None 69212: workload: fix pgx error cast in kv95 and schemachange r=RichardJCai a=rafiss fixes #69189 #69100 is failing but i'm not sure if this is the cause. (cc @ajwerner) This was done incorrectly after the recent upgrade to pgx4. `pgconn.PgError` does not implement `error`, but `*pgconn.PgError` does. Release justification: test only change Release note: None 69223: changefeedccl: Rework webhook sink flushing implementation. r=miretskiy a=miretskiy Stop relying on wait group to implement flush logic in webhook sink. The wait group does not respect context cancellation. Because of that, it is possible that the caller blocks, waiting for Flush to complete, while immediately after blocking, the context is cancelled. When this happens the go routines running responsible for processing the messages may terminate prior to decrementing wait group counts. Instead of relying on waitgroup, use synchronization provided by the channels themselves, and introduce a new type of worker request (flush) which correctly flushes and waits for flush to complete, while respecting context cancellation. Fixes #69175 Release Notes: None Co-authored-by: Azhng <archer.xn@gmail.com> Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com> Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
Depends on #68675
Related to: #64743
This commit introduces crdb_internal.statement_statistics virtual
table that exposes both cluster-wide in-memory statement statistics
as well as persited statement statistics. This new virtual table
will be used to replace crdb_internal.node_statement_statistics
virtual table, which only surface node-local in-memory statement
statsitics.
Release note (sql change): introduced new crdb_internal.statement_statistics
virtual table that surfaces both cluster-wide in-memory statement statistics
as well as persited statement statistics.