storage: improve high contention performance under deadlock scenarios #16256
Here's the performance improvement running the ledger example for 1m with contention:
These are stats computed from 60
Didn't get through the "meat" of this, sorry about that. Mostly nits right now, will try to take another pass. Glad to see such an improvement, though! Oh, and looks like there's a merge conflict.

Reviewed 8 of 9 files at r1.

pkg/storage/push_txn_queue.go, line 115 at r1 (raw file):
don't you need to grab all the locks before iterating through the dependents?

pkg/storage/push_txn_queue.go, line 247 at r1 (raw file):
I think this should be TODO()

pkg/storage/push_txn_queue_test.go, line 340 at r1 (raw file):
isn't this duplicative with the below?

pkg/storage/store.go, line 2632 at r1 (raw file):
nit: this method does more harm than good - i'd rather it was just inlined here.

pkg/storage/store.go, line 2640 at r1 (raw file):
This comment is incorrect; the copy happens unconditionally.

Comments from Reviewable
The commit message says "under deadlock scenarios" - is the improvement limited to just deadlocks or does it help with other contention scenarios too? (The code looks like the latter to me.) If it's just deadlocks, I'm not sure this complexity is worth it. There's getting to be a lot of moving parts with fairly complex interactions.

Reviewed 9 of 9 files at r1.

pkg/base/config.go, line 300 at r1 (raw file):
Changing the default retry options for the whole system should at least be done in its own commit and probably its own PR (would be nice to do this after the benchmark reporting stuff is in). We also need to stress all the tests for this change to make sure it doesn't cause tests to start missing deadlines.

pkg/storage/push_txn_queue.go, line 38 at r1 (raw file):
This is awfully short; it hardly seems worth waiting at all if this will be the limit.

pkg/storage/push_txn_queue.go, line 411 at r1 (raw file):
There should be comments explaining what each of these channels is used for.

pkg/storage/push_txn_queue.go, line 580 at r1 (raw file):
I think this sentence would be clearer if phrased positively: "This returns control to the pusher periodically so it can bail out if the pusher itself has already been aborted" (right?)

I'm getting confused by the term "pusher" in this file. This method runs on the range that holds the pusher's transaction record, right? But in this comment, we're using "pusher" to refer to the process running on behalf of the pusher transaction on some other range, which is calling QueryTxn. (If the pusher's txn record is here, what could cause an abort there? Is that referring to context cancellation? Isn't that propagated across the RPC?)

Why is the setting of

pkg/storage/push_txn_queue.go, line 638 at r1 (raw file):
Closing the channel (instead of just guaranteeing that exactly one value is written to it) is slightly unorthodox. I see why you're doing it (you may read from the channel once in the select loop and once in the defer), but it feels wrong to me to do it this way. Maybe set the channel to nil after reading from it in the select loop and then only read it in the defer if it is non-nil?

pkg/storage/push_txn_queue_test.go, line 538 at r1 (raw file):
We've generally found 10ms sleeps to be too short to be reliable when tests are run under stress.

Comments from Reviewable
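The suggestion on line 638 (set the channel to nil after reading it in the select loop, then read it in the defer only if it is non-nil) can be sketched roughly as follows. This is a minimal illustration and not the code under review: `waitForResult`, `done`, and `cancel` are hypothetical names, and the sketch assumes the producer writes exactly one value to a buffered channel.

```go
package main

import "fmt"

// waitForResult is a hypothetical stand-in for the loop under review.
// It assumes exactly one value is ever written to done, and it reads that
// value exactly once: either in the select loop or in the deferred cleanup.
func waitForResult(done chan int, cancel <-chan struct{}) (result int) {
	defer func() {
		// Read in the defer only if the select loop has not already
		// consumed the value (signalled by done being nil).
		if done != nil {
			result = <-done
		}
	}()
	for {
		select {
		case v := <-done:
			// Mark the value as consumed so the defer skips its read.
			done = nil
			return v
		case <-cancel:
			// Bail out early; the defer drains the single pending value.
			return 0
		}
	}
}

func main() {
	done := make(chan int, 1)
	done <- 42
	fmt.Println(waitForResult(done, make(chan struct{})))
}
```

Compared with closing the channel, this keeps the "exactly one value" contract explicit, at the cost of the defer blocking until that value arrives.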
Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.

pkg/roachpb/api.proto, line 543 at r1 (raw file):
During a rolling upgrade what will the behavior of the system be when an upgraded node sends a

Comments from Reviewable
Reiterating what @bdarnell said, seems like this PR helps high contention scenarios in general, not just deadlock scenarios, right? If yes, the added complexity seems worth it.

Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.

Comments from Reviewable
It will help in cases where there is non-trivial contention (i.e. more than two txns in play over same subset of keys), even if there is no deadlock. This is true if there are retries due to timestamps being advanced for SERIALIZABLE transactions. This is probably the normal case where we would see significant contention.

My $0.02: I think it's worth doing this even if it were just degenerate deadlock cases.

Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.

Comments from Reviewable
Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.

pkg/base/config.go, line 300 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I agree. It was an error to have changed this to 10ms in the first place. That was done peremptorily in a98969d. I sent PR #16357 to correct this separately.

pkg/roachpb/api.proto, line 543 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The query will busy loop. I don't see this as worth fixing, as this requires unusually high contention to be a problem in practice, and the window of opportunity is limited to a rolling restart. It'll potentially cause a performance blip.

pkg/storage/push_txn_queue.go, line 38 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Changed to

pkg/storage/push_txn_queue.go, line 115 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
There's no need for strict consistency with the set of dependents. We just need to avoid thread safety issues.

pkg/storage/push_txn_queue.go, line 247 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Changed the method signature to accept a context and updated the call sites.

pkg/storage/push_txn_queue.go, line 411 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

pkg/storage/push_txn_queue.go, line 580 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I've updated the comment for clarity and added a comment to

Yes, this method runs on the range that holds the pusher's txn record. I've removed the use of the word "pusher" in the comment and instead refer to the "txns" which might be waiting on the pusher we're trying to query. If the txn we're querying is in the queue, then it's definitely still pending; if it's not in the queue, then it might have been aborted or committed already. So, we ultimately have no option but to eventually bail waiting for an update. The caller could have been aborted by a HIGH priority concurrent txn, it could have failed to heartbeat from its coordinator, etc.

pkg/storage/push_txn_queue.go, line 638 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

pkg/storage/push_txn_queue_test.go, line 340 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Good point. Removed.

pkg/storage/push_txn_queue_test.go, line 538 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This is expected to be flaky in the opposite way. In other words, it may fail to discover an error if the machine is slow. But since we run these things so often, I expect it would fail many times a day if the condition it's testing were broken. Keeping it "fast" at

pkg/storage/store.go, line 2632 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I find this less noisy and there is already a

pkg/storage/store.go, line 2640 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Done.

Comments from Reviewable
Reviewed 3 of 7 files at r2.

pkg/storage/push_txn_queue.go, line 247 at r1 (raw file):
Previously, spencerkimball (Spencer Kimball) wrote…
you're still using context.Background() rather than the context that was passed in.

Comments from Reviewable
Review status: 6 of 10 files reviewed at latest revision, 9 unresolved discussions.

pkg/storage/push_txn_queue.go, line 247 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Oops. Fixed.

Comments from Reviewable
Reviewed 3 of 7 files at r2. Comments from Reviewable |
Reviewed 6 of 7 files at r2, 1 of 1 files at r3.

pkg/storage/push_txn_queue.go, line 580 at r1 (raw file):
Previously, spencerkimball (Spencer Kimball) wrote…
Got it. I'm less concerned about the arbitrariness (and shortness) of

pkg/storage/push_txn_queue.go, line 646 at r3 (raw file):
We tend to avoid

Comments from Reviewable
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.

pkg/storage/push_txn_queue.go, line 646 at r3 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

Comments from Reviewable
During contention, a transaction which encounters another transaction's intents will move to "push" the transaction. This sends the "pusher" into the `pushTxnQueue` to wait for the transaction which is blocking it (the "pushee") to complete. Previously, the waiting pusher would periodically (using an overly aggressive backoff / retry loop) query _itself_ in order to determine whether its priority or status had been updated, as well as whether any transactions were waiting on the pusher. The set of waiting txns is transitive, so with sufficient repeated queries, even a large dependency cycle would be revealed. This change modifies the backoff / retry loop to instead immediately send a query which waits at the range containing the pusher's txn until either the pusher's priority or status changes, or the set of transactions waiting on it changes. This allows any changes to quickly propagate, but avoids unnecessary queries if there's nothing to communicate.