-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
contention: record 40001 errors as contention events #111685
Conversation
6bfb6b3
to
e45af52
Compare
e45af52
to
e8fa6ea
Compare
e8fa6ea
to
6be84bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz)
pkg/sql/conn_executor_exec.go
line 3189 at r2 (raw file):
// If we have a known conflicting txn meta for serialization conflict, we // should add it to our contention registry so we can resolve the event. var retryErr *kvpb.TransactionRetryWithProtoRefreshError
recordTransactionFinish
is already a pretty large. Can this be refactored into it's own function to make it easier to read?
pkg/sql/crdb_internal.go
line 7454 at r1 (raw file):
table_name STRING NOT NULL, index_name STRING contention_type STRING
nit: align and remove spaces after string
pkg/sql/crdb_internal.go
line 7545 at r1 (raw file):
tree.NewDString(tableName), // table_name tree.NewDString(indexName), // index_name tree.NewDString(resp.Events[i].ContentionType.String()),
During upgrade scenario could this be nil?
pkg/sql/contention/registry.go
line 266 at r1 (raw file):
// non lock wait related contention to index contention surfaces. c := event.BlockingEvent r.globalLock.Lock()
Do we need the global lock to add to the event store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @j82w)
pkg/sql/crdb_internal.go
line 7454 at r1 (raw file):
Previously, j82w (Jake) wrote…
nit: align and remove spaces after string
Hmm, it seems it's actually the above 4 lines that aren't aligned with the rest of the fields due to them using tab spsacing. I'll convert the spacing there (and also remove the extra spacing after the string on this line).
pkg/sql/crdb_internal.go
line 7545 at r1 (raw file):
Previously, j82w (Jake) wrote…
During upgrade scenario could this be nil?
It will default to LOCK_WAIT
for the events from nodes that haven't upgraded yet, which is accurate since there won't be any serialization conflict events there.
pkg/sql/contention/registry.go
line 266 at r1 (raw file):
Previously, j82w (Jake) wrote…
Do we need the global lock to add to the event store?
I can double check what this lock is doing. I'd assume so since we weren't unlocking before the event store before.
6be84bc
to
c64a6ae
Compare
Previously, xinhaoz (Xin Hao Zhang) wrote…
nvm, I left the comment on an old revision. It's fixed in the latest. |
c64a6ae
to
02a81ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I have also now put the new behaviour behind a cluster setting. This is reflected in the commit message and testing. It's undocumented right now but I'm asssuming we want to make it public so users can turn it on themselves?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @j82w)
pkg/sql/conn_executor_exec.go
line 3189 at r2 (raw file):
Previously, j82w (Jake) wrote…
recordTransactionFinish
is already a pretty large. Can this be refactored into it's own function to make it easier to read?
Done.
pkg/sql/crdb_internal.go
line 7454 at r1 (raw file):
Previously, j82w (Jake) wrote…
nvm, I left the comment on an old revision. It's fixed in the latest.
Yeah, I converted the 4 lines above it to use spaces instead of tabs 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @xinhaoz)
pkg/sql/contention/event_store.go
line 248 at r7 (raw file):
// duration exceeds the threshold. threshold := DurationThreshold.Get(&s.st.SV) if e.ContentionType != contentionpb.ContentionType_SERIALIZATION_CONFLICT &&
Why have the threshold be ignored for serialization conflict?
pkg/sql/contention/registry.go
line 266 at r1 (raw file):
Previously, xinhaoz (Xin Hao Zhang) wrote…
I can double check what this lock is doing. I'd assume so since we weren't unlocking before the event store before.
Any update on this?
02a81ff
to
d739af5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @j82w)
pkg/sql/contention/event_store.go
line 248 at r7 (raw file):
Previously, j82w (Jake) wrote…
Why have the threshold be ignored for serialization conflict?
We don't record how long the stmt 'waited' for a serialiation conflict since that doesn't always apply in this scenario. So all serialization_conflict events will have a 0 duration currently and we record them all. For now, I wouldn't expect there to be too many of these events since we only have enoug info to record the event in the scenarios where the conflicting txn hadn't commited at the time of conflict.
pkg/sql/contention/registry.go
line 266 at r1 (raw file):
Previously, j82w (Jake) wrote…
Any update on this?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained
This commit adds the `contention_type` field to `ExtendedContentionEvent` and also adds it as a new column in `crdb_internal.transaction_contention_events`. Note that this field does not impact the following surfaces: - crdb_internal.cluster_contention_events - crdb_internal.cluster_contended_tables - crdb_internal.cluster_contended_indexes - crdb_internal.cluster_contended_keys Release note (sql change): New column `contention_type` in crdb_internal.transaction_contention_events table indicating the type of txn contention encountered. Current values are - LOCK_WAIT Part of: cockroachdb#111649
This commit writes contention events to the contention registry's event store when a 40001 error is returned at the end of a transaction. The criteria for writing the event to the store is: - If a `TransactionRetryWithProtoRefreshError` error was encountered - If there exists conflicting txn information on the error Enabling this feature is controlled by the cluster setting `sql.contention.record_serialization_conflicts.enabled` which is off by default. `SERIALIZATION_CONFLICT` is also added to the list of contention types in the contention protobufs. These contention events will have their txn ids mapped to txn fingerprints and will be surfaced in the table `crdb_internal.transaction_contention_events`. Note that all `SERIALIZATION_CONFLICT` type contention events will have a `contentino_duration` value of 0. Release note (cli change): New contention_type in table `SERIALIZATION_CONFLICT` `crdb_internal.transaction_contention_events`. These contention events specify that contention occurred between 2 transactions due to serialization conflicts. Closes: cockroachdb#111649
d739af5
to
cba5fd1
Compare
bors r+ |
Build succeeded: |
Commit 1 contention: add contention_type field to ExtendedContentionEvent
This commit adds the
contention_type
field toExtendedContentionEvent
andalso adds it as a new column in
crdb_internal.transaction_contention_events
.Note that this field does not impact the following surfaces:
Release note (sql change): New column
contention_type
incrdb_internal.transaction_contention_events table indicating
the type of txn contention encountered. Current values are
Part of: #111649
Commit 2 sql, contention: record 40001 errors as contention events
This commit writes contention events to the contention registry's
event store when a 40001 error is returned at the end of a transaction.
The criteria for writing the event to the store is:
TransactionRetryWithProtoRefreshError
error was encounteredEnabling this feature is controlled by the cluster setting
sql.contention.record_serialization_conflicts.enabled
which is off bydefault.
SERIALIZATION_CONFLICT
is also added to the list of contention typesin the contention protobufs. These contention events will have their
txn ids mapped to txn fingerprints and will be surfaced in the table
crdb_internal.transaction_contention_events
.Note that all
SERIALIZATION_CONFLICT
type contention events willhave a
contentino_duration
value of 0.Release note (cli change): New contention_type in table
crdb_internal.transaction_contention_events
,SERIALIZATION_CONFLICT
.These contention events specify that contention occurred between
2 transactions due to serialization conflicts.
Closes: #111649