upgrade: use high priority txn's to update the cluster version #113996
Conversation
Force-pushed from a6cc40a to 829243e
Thanks @fqazi!
In addition to the test that you're adding here, there are two other things that would give us more confidence in the efficacy and safety of this change:
- we demonstrate that it fixes the reproduction presented in https://github.com/cockroachlabs/support/issues/2685#issuecomment-1788125904.
- we determine all of the keys that the upgrade txn that we're marking as high priority could read from and write to. The code itself makes it look like we're only reading from the version key and then writing to it. But are there any other reads or writes that are performed on behalf of the txn (now at high priority) due to the use of an internal executor?
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @fqazi)
pkg/sql/set_cluster_setting.go
line 621 at r1 (raw file):
return db.Txn(ctx, func(ctx context.Context, txn isql.Txn) error {
	// Run the version bump inside the upgrade as high priority, since
	// lease manager ends up locking the version row inside the settings
s/locking/reading/
"with high priority"
pkg/sql/set_cluster_setting.go
line 624 at r1 (raw file):
// table when refreshing leases. On larger clusters or with a large number
// of descriptors its possible for normal transactions to be starved
// (#95227)
nit: punctuation.
pkg/sql/set_cluster_setting.go
line 624 at r1 (raw file):
// table when refreshing leases. On larger clusters or with a large number
// of descriptors its possible for normal transactions to be starved
// (#95227)
Let's also add a paragraph about why this is safe (see my comment above).
Assuming that we're only reading from the version key and then writing to it, changing all of these upgrade transactions from normal priority to high priority should not alter how they interact with each other — two transactions at the same priority interact the same with each other, regardless of what that priority is.
So the only effects this will have are:
- the transaction will now abort any other transactions that write to the version key with normal priority (there aren't any, right?)
- the transaction will now block any other transaction that reads from the version key. That's kind of the point.
- Depending on whether the internal executor is hiding any other reads or writes, this may interact differently with other txns for those.
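The interaction rules in the bullets above can be sketched with a toy push model. Everything here (the `txn` type, `pushOutcome`, the names) is hypothetical illustration, not CockroachDB's actual lock table or txn-push machinery:

```go
package main

import "fmt"

type priority int

const (
	normalPri priority = iota
	highPri
)

type txn struct {
	name string
	pri  priority
}

// pushOutcome models, in a highly simplified way, what happens when a
// requesting txn encounters an intent held by another txn: a strictly
// higher-priority pusher aborts the holder, otherwise the pusher waits.
func pushOutcome(pusher, holder txn) string {
	if pusher.pri > holder.pri {
		return "abort holder"
	}
	return "pusher waits"
}

func main() {
	versionBump := txn{"version-bump", highPri}
	leaseRenewal := txn{"lease-renewal", highPri}
	normalWriter := txn{"normal-writer", normalPri}

	// Two txns at the same priority interact the same regardless of level.
	fmt.Println(pushOutcome(versionBump, leaseRenewal))
	// The high-priority bump now pushes out normal-priority txns...
	fmt.Println(pushOutcome(versionBump, normalWriter))
	// ...and normal-priority txns can no longer push the bump.
	fmt.Println(pushOutcome(normalWriter, versionBump))
}
```

This is only meant to make the "same priority interacts the same" observation concrete; the real behavior also depends on read vs. write intents and deadlock detection.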
pkg/sql/set_cluster_setting.go
line 660 at r1 (raw file):
ctx, "update-setting", txn.KV(), sessiondata.RootUserSessionDataOverride,
	`UPSERT INTO system.settings (name, value, "lastUpdated", "valueType") VALUES ($1, $2, now(), $3)`,
Could this use of SQL ever result in a lease acquisition? Does that risk a deadlock, if the lease acquisition gets stuck on the version upgrade's write?
pkg/upgrade/upgrades/version_starvation_test.go
line 32 at r1 (raw file):
// the cluster version is done with a high priority txn and cannot
// be pushed out. Previously, this would be normal priority and
// get pushed by the leasing code. Note: This test just confirms
"by the leasing code, leading to starvation when leases were acquired with sufficiently high frequency"
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @nvanbenschoten)
pkg/sql/set_cluster_setting.go
line 660 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could this use of SQL ever result in a lease acquisition? Does that risk a deadlock, if the lease acquisition gets stuck on the version upgrade's write?
Oh, good point, that is a potential danger here, if a lease renewal is needed on the system.settings table. Specifically, if a synchronous renewal is required, then we can deadlock.
Force-pushed from 829243e to a38c253
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @nvanbenschoten)
pkg/sql/set_cluster_setting.go
line 624 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's also add a paragraph about why this is safe (see my comment above).
Assuming that we're only reading from the version key and then writing to it, changing all of these upgrade transactions from normal priority to high priority should not alter how they interact with each other — two transactions at the same priority interact the same with each other, regardless of what that priority is.
So the only effects this will have are:
- the transaction will now abort any other transactions that write to the version key with normal priority (there aren't any, right?)
- the transaction will now block any other transaction that reads from the version key. That's kind of the point.
- Depending on whether the internal executor is hiding any other reads or writes, this may interact differently with other txns for those.
Clarified why this should be safe:
// Run the version bump inside the upgrade as high priority, since
// lease manager ends up reading the version row (with high priority)
// inside the settings table when refreshing leases. On larger clusters
// or with a large number of descriptors its possible for normal
// transactions to be starved (#95227).
// This is safe because we expect this transaction to only do the following:
// 1) We expect this transaction to only read and write to the
// version key in the system.settings table.
// 2) Reads from the system.sql_instances table to confirm all SQL servers
// have been upgraded in multi-tenant environments.
// 3) Other transactions will use a normal priority and get pushed out by
// this one.
// 4) There is a potential danger to conflict with lease renewal of the
// settings table and the upgrade, but this cannot occur in practice,
// because the lease will always be acquired before we try to write to
// the settings table.
If we want to be paranoid we could disable using leased descriptors in the planner.
pkg/sql/set_cluster_setting.go
line 660 at r1 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Oh good point that is a potential danger here, if a lease renewal is needed on the system.settings table. Specifically a synchronous renewal was required then we can deadlock.
Turns out this is safe; with some experimentation I was able to confirm this is a non-issue, since we will always get the lease on the settings table before writing.
- I looked into (2) and the only other operation is a read to the sqlinstances table.
I'm going to run (1) locally to reproduce the scenario and confirm the fix works
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @nvanbenschoten)
@nvanbenschoten Actually, disregard my previous comment: there is a deadlock here; my test was flawed. Let me see if I can work around it by avoiding leasing during the settings update.
Force-pushed from 46b2a42 to 9ac22a2
Reviewed 6 of 6 files at r3, 3 of 3 files at r4, 5 of 5 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @fqazi)
-- commits
line 29 at r5:
"simulate"?
pkg/sql/set_cluster_setting.go
line 623 at r4 (raw file):
// lease manager ends up reading the version row (with high priority)
// inside the settings table when refreshing leases. On larger clusters
// or with a large number of descriptors its possible for normal
nit: "it's"
pkg/sql/set_cluster_setting.go
line 626 at r4 (raw file):
// transactions to be starved (#95227).
// This is safe because we expected this transaction only do the following:
// 1) We expect this transaction to only read and write to the
How does the use of `WithSkipDescriptorCache` change this behavior? Are there new keys that we read from and write to?
pkg/sql/set_cluster_setting.go
line 630 at r4 (raw file):
// 2) Reads from the system.sql_instances table to confirm all SQL servers
// have been upgraded in multi-tenant environments.
// 3) Other transactions will use a normal priority and get pushed out by
Which other transactions is this referring to? Other transactions that write to the version key? Do such transactions exist?
pkg/sql/set_cluster_setting.go
line 634 at r4 (raw file):
// 4) There is a potential danger to conflict with lease renewal of the
// settings table and the upgrade, but to avoid this, we intentionally
// force the internal executor to avoid leased descriptors.
nit: indentation looks off here.
pkg/sql/set_cluster_setting.go
line 686 at r4 (raw file):
	return err
}, isql.WithSkipDescriptorCache())
Could we give this a comment? Specifically, why does this help? If we never perform leasing then this transaction will never get stuck waiting on a leasing transaction. Does it have to perform other reads and writes to compensate? If so, how do those operations interact with other transactions?
pkg/sql/catalog/lease/kv_writer.go
line 71 at r5 (raw file):
) (settingswatcher.VersionGuard, error) {
	targetVersion := clusterversion.TODO_Delete_V23_1_SystemRbrCleanup
	if w.knobs.ForceVersionGuardReads {
Would it be cleaner to push this knob one level deeper and consult it directly in `SettingsWatcher.MakeVersionGuard`? So we would have logic in that function that looks like: `if activeVersion.IsActive(maxGate) || forceVersionGuardReads {`
Force-pushed from 9ac22a2 to 0c5aa88
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @nvanbenschoten)
pkg/sql/set_cluster_setting.go
line 626 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
How does the use of `WithSkipDescriptorCache` change this behavior? Are there new keys that we read from and write to?
// 1) We expect this transaction to only read and write to the
// version key in the system.settings table. It will also do high
// priority reads from the system.namespace and system.descriptor tables
// because the WithSkipDescriptorCache below (this will push out schema
// changes during the upgrade).
pkg/sql/set_cluster_setting.go
line 630 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Which other transactions is this referring to? Other transactions that write to the version key? Do such transactions exist?
I was thinking of other transactions getting pushed if they involve the system database descriptor:
// 3) Other transactions will use a normal priority and get pushed out by
// this one, if they involve schema changes on the system database
// descriptor (highly unlikely).
pkg/sql/set_cluster_setting.go
line 686 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could we give this a comment? Specifically, why does this help? If we never perform leasing then this transaction will never get stuck waiting on a leasing transaction. Does it have to perform other reads and writes to compensate? If so, how do those operations interact with other transactions?
I clarified this above, but I'll add another comment stating why this is needed here:
// We will intentionally skip the leasing layer to update the
// system.settings table to avoid a deadlock with the leasing layer.
// The leasing layer will read the version key in system.settings with
// a high priority before any renewal. The update code above will need a
// lease on system.settings and potentially system.sqlinstances. If any of
// the required leases expire, we could deadlock with the renewal being
// blocked on the version key. To get around this, we will force the
// transaction to directly make round trips to read descriptors instead
// of using the leasing cache.
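The deadlock this comment avoids can be sketched as a cycle in a wait-for graph. The `hasCycle` helper and the txn names below are illustrative only, not CockroachDB's deadlock detector:

```go
package main

import "fmt"

// hasCycle reports whether the wait-for graph contains a cycle, using a
// depth-first search. Nodes are txn names; an edge means "waits on".
func hasCycle(waitsOn map[string][]string) bool {
	const (
		unvisited = iota
		inStack
		done
	)
	state := map[string]int{}
	var visit func(n string) bool
	visit = func(n string) bool {
		state[n] = inStack
		for _, m := range waitsOn[n] {
			switch state[m] {
			case inStack:
				return true
			case unvisited:
				if visit(m) {
					return true
				}
			}
		}
		state[n] = done
		return false
	}
	for n := range waitsOn {
		if state[n] == unvisited && visit(n) {
			return true
		}
	}
	return false
}

func main() {
	// The scenario from the discussion: the version bump holds a write intent
	// on the version key but needs a lease on system.settings, while a lease
	// renewal reads the version key (blocked on the bump) before granting leases.
	withLeasing := map[string][]string{
		"version-bump":  {"lease-renewal"}, // needs a lease renewed
		"lease-renewal": {"version-bump"},  // blocked reading the version key
	}
	// Skipping leased descriptors means the bump reads descriptors directly
	// from KV and never waits on the renewal, breaking the cycle.
	withoutLeasing := map[string][]string{
		"version-bump":  {},
		"lease-renewal": {"version-bump"},
	}
	fmt.Println(hasCycle(withLeasing), hasCycle(withoutLeasing))
}
```

The second graph shows why forcing direct KV reads of descriptors removes the bump's only outgoing wait edge.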
Reviewed 6 of 6 files at r6, 9 of 9 files at r7, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @fqazi)
pkg/sql/set_cluster_setting.go
line 686 at r4 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
I clarified this above, but I'll add another comment stating why this is needed here:
// We will intentionally skip the leasing layer to update the
// system.settings table to avoid a deadlock with the leasing layer.
// The leasing layer will read the version key in system.settings with
// a high priority before any renewal. The update code above will need a
// lease on system.settings and potentially system.sqlinstances. If any of
// the required leases expire, we could deadlock with the renewal being
// blocked on the version key. To get around this, we will force the
// transaction to directly make round trips to read descriptors instead
// of using the leasing cache.
Thanks for spelling this out. The high-priority read on the system.sqlinstances and system.descriptors tables makes me vaguely concerned that this may run into trouble with #104728. We'll want to make sure that this change is minimizing risk. Expanding the footprint of this transaction while making it high priority is working against this goal.
This makes me wonder how difficult it would be to have the upgrade transaction read and write the settings key directly, without going through SQL. The leasing transaction is already doing this with `GetClusterVersionFromStorage`. Would it be difficult to do the same here?
pkg/server/settingswatcher/version_guard_test.go
line 145 at r7 (raw file):
watcher := settingswatcher.New(nil, s.Codec(), settings, nil, nil, nil)
require.NoError(t, kvDB.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
	guard, err := watcher.MakeVersionGuard(ctx, txn, maxVersion, false /*force versin guard reads*/)
nit: the comment here is off.
Mostly just questions and some comment nits.
The change makes sense to me but I don't think I can vouch for it being side effect free so I'll leave an approval to Nathan.
Will the session based leasing work fix this issue more reliably? If so, can we add a note or link to either a commit message or comment?
pkg/sql/isql/options.go
Outdated
@@ -50,6 +50,12 @@ func WithSessionData(sd *sessiondata.SessionData) Option {
	return (*sessionDataOption)(sd)
}

// WithSkipDescriptorCache forces the internal executor to avoid all leasing
Could you add a short note about when or why this option might be used and any warnings that users should consider if they do opt to use this option?
pkg/sql/set_cluster_setting.go
Outdated
// Run the version bump inside the upgrade as high priority, since
// lease manager ends up reading the version row (with high priority)
// inside the settings table when refreshing leases. On larger clusters
// or with a large number of descriptors it's possible for normal
// transactions to be starved (#95227).
Could we add some reference numbers to "large" here? Doesn't have to be exact, just within the ballpark as large is quite ambiguous.
The wording here feels a bit difficult to digest, in particular, "normal transactions" made me think about non-system transactions but I think the intended meaning is non-high priority transactions.
What would you think about (With appropriate line wrapping and appropriate reference numbers):
Suggested change:

- // Run the version bump inside the upgrade as high priority, since
- // lease manager ends up reading the version row (with high priority)
- // inside the settings table when refreshing leases. On larger clusters
- // or with a large number of descriptors it's possible for normal
- // transactions to be starved (#95227).
+ // On clusters with a large number of descriptors (> 10k) or nodes (>1k), non-priority transactions reading the version row can be starved. This is due to the lease manager reading the version row at high priority (#95227).
+ // Run the version bump inside the upgrade as a high priority txn to avoid being bullied out by the lease manager.
// TestLeasingClusterVersionStarvation validates that setting
// the cluster version is done with a high priority txn and cannot
// be pushed out. Previously, this would be normal priority and
// get pushed by the leasing by the leasing code, leading to starvation
Suggested change:

- // get pushed by the leasing by the leasing code, leading to starvation
+ // get pushed by the leasing, leading to starvation
		return
	}
	resumeBump <- struct{}{}
	for retry := retry.Start(retry.Options{}); retry.Next(); {
I'm not sure which is the correct question to ask here: why is this in a retry loop, or why is there a `Commit` call after an infinite loop?
Are we expecting the select-for-update query to receive the txn retry error?
pkg/sql/set_cluster_setting.go
Outdated
// inside the settings table when refreshing leases. On larger clusters | ||
// or with a large number of descriptors it's possible for normal | ||
// transactions to be starved (#95227). | ||
// This is safe because we expected this transaction only do the following: |
I think this is a me issue but I don't see how these items say using a high priority txn is safe? I also don't understand how it may be unsafe 😅
Force-pushed from 0c5aa88 to 0dc1f23
They would eliminate one of the causes; unfortunately, that migration will run into this, so it will temporarily invoke the same wrath 👀
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chrisseto, @dt, and @nvanbenschoten)
pkg/sql/set_cluster_setting.go
line 686 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Thanks for spelling this out. The high-priority read on the system.sqlinstances and system.descriptors tables makes me vaguely concerned that this may run into trouble with #104728. We'll want to make sure that this change is minimizing risk. Expanding the footprint of this transaction while making it high priority is working against this goal.
This makes me wonder how difficult it would be to have the upgrade transaction read and write the settings key directly, without going through SQL. The leasing transaction is already doing this with `GetClusterVersionFromStorage`. Would it be difficult to do the same here?
We can directly do the version bump using KV calls. The system.sqlinstances read comes from the upgrade manager hook and is deeply nested, so we can't easily remove that (it would only impact serverless).
pkg/sql/set_cluster_setting.go
line 624 at r7 (raw file):
Previously, chrisseto (Chris Seto) wrote…
Could we add some reference numbers to "large" here? Doesn't have to be exact, just within the ballpark as large is quite ambiguous.
The wording here feels a bit difficult to digest, in particular, "normal transactions" made me think about non-system transactions but I think the intended meaning is non-high priority transactions.
What would you think about (With appropriate line wrapping and appropriate reference numbers):
// On clusters with a large number of descriptors (> 10k) or nodes (>1k), non-priority transactions reading the version row can be starved. This is due to the lease manager reading the version row at high priority (#95227).
// Run the version bump inside the upgrade as a high priority txn to avoid being bullied out by the lease manager.
// On complex clusters with a large number of descriptors (> 500) and
// multi-region nodes (> 9), normal priority transactions reading/updating
// the version row can be starved. This is due to the lease manager reading
// the version row at high priority when refreshing leases (#95227); with
// a complex cluster this traffic will be continuous.
pkg/sql/set_cluster_setting.go
line 625 at r7 (raw file):
Previously, chrisseto (Chris Seto) wrote…
I think this is a me issue but I don't see how these items say using a high priority txn is safe? I also don't understand how it may be unsafe 😅
Clarified this with:
// This is safe from deadlocks / starvation because we expect this
// transaction to only do the following:
pkg/sql/isql/options.go
line 53 at r3 (raw file):
Previously, chrisseto (Chris Seto) wrote…
Could you add a short not about when or why this option might be used and any warnings that users should consider if they do opt to use this option?
Updated to:
WithSkipDescriptorCache forces the internal executor to avoid all leasing infrastructure, which will force descriptors to be read from KV. This can incur network latencies when accessing descriptors.
pkg/upgrade/upgrades/version_starvation_test.go
line 34 at r7 (raw file):
Previously, chrisseto (Chris Seto) wrote…
// get pushed by the leasing, leading to starvation
Done.
pkg/upgrade/upgrades/version_starvation_test.go
line 100 at r7 (raw file):
Previously, chrisseto (Chris Seto) wrote…
I'm not sure which is the correct question to ask here: why is this in a retry loop, or why is there a `Commit` call after an infinite loop? Are we expecting the select-for-update query to receive the txn retry error?
Ooh good catch, yes the commit isn't needed. We expect the select to get pushed and hit an error eventually.
Do we know when this issue will be fixed? We have a high-visibility customer who is severely impacted by this. No one seems to be assigned to this, so I'm cc'ing everyone who has commented on it:
Force-pushed from 0dc1f23 to 3dea8ef
Force-pushed from 6f497bf to 7455a4b
unfortunately that migration will run into this
Oof, that's rough.
though I'll leave the final sign off to @nvanbenschoten
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt and @nvanbenschoten)
Reviewed 15 of 15 files at r11, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt and @fqazi)
pkg/sql/set_cluster_setting.go
line 625 at r11 (raw file):
	return err
}
return db.KV().Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
Could you add a comment about why this goes out of its way to use raw KV?
pkg/sql/set_cluster_setting.go
line 633 at r11 (raw file):
// Run the version bump inside the upgrade as a high priority txn to avoid
// being starved out by the lease manager.
// Run the version bump inside the upgrade as high priority, since
nit: some redundancy here. "Run the version bump inside the upgrade as" twice.
pkg/sql/set_cluster_setting.go
line 682 at r11 (raw file):
}
// Encode the setting value to write out the updated version.
var tuple []byte
This all looks correct to me, but the logic feels easy to get wrong. It also feels like it could drift from SQL over time. Do we feel like we have sufficient testing around this upgrade code to give us confidence that we're writing SQL-decodable bytes?
pkg/sql/set_cluster_setting.go
line 682 at r11 (raw file):
}
// Encode the setting value to write out the updated version.
var tuple []byte
On ^ note, consider pulling this into a separate function and unit testing it.
pkg/sql/set_cluster_setting.go
line 707 at r11 (raw file):
newValue.SetTuple(tuple)
if row.Value != nil {
	if err := txn.CPut(ctx, row.Key, newValue, row.Value.TagAndDataBytes()); err != nil {
This is all in a serializable transaction, so is there a need for the CPut? I might be missing something.
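The conditional-put semantics under discussion can be shown with a toy in-memory stand-in. `cPut` below is an illustrative model of the compare-and-write behavior, not the real `kv.Txn.CPut`:

```go
package main

import (
	"errors"
	"fmt"
)

// cPut models a conditional put: write value only if the current value
// matches expected; expected == "" means the key must not exist yet.
func cPut(store map[string]string, key, value, expected string) error {
	cur, ok := store[key]
	if (!ok && expected != "") || (ok && cur != expected) {
		return errors.New("condition failed")
	}
	store[key] = value
	return nil
}

func main() {
	store := map[string]string{"version": "23.1"}
	// Expected value matches what was read earlier in the txn: succeeds.
	fmt.Println(cPut(store, "version", "23.2", "23.1"))
	// A stale expected value fails, guarding against a concurrent rewrite.
	fmt.Println(cPut(store, "version", "23.2", "22.2"))
}
```

In a serializable transaction the read already guarantees the value is stable, which is exactly why the reviewer asks whether the condition adds anything beyond a defensive check.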
Force-pushed from dfb2d0c to ccf31fb
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @nvanbenschoten)
pkg/sql/set_cluster_setting.go
line 682 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This all looks correct to me, but the logic feels easy to get wrong. It also feels like it could drift from SQL over time. Do we feel like we have sufficient testing around this upgrade code to give us confidence that we're writing SQL-decodable bytes?
Done
Added unit tests as well.
thank you for the hard work on pushing this along @fqazi. I think we can go ahead and merge this once the remaining comments are addressed.
I do wonder if the backport would benefit from another look from someone else who has worked in this code recently, just to provide additional confidence that we aren't missing anything. @ajstorm or @JeffSwenson would either of you two (or both!) like to provide a review?
Reviewed 8 of 8 files at r12, 4 of 4 files at r13, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @fqazi)
-- commits
line 4 at r12:
"minimimal"
-- commits
line 5 at r12:
"s/writoe/write to/"
pkg/server/settingswatcher/setting_encoder.go
line 27 at r12 (raw file):
// EncodeSettingKey encodes a key for the system.settings table, which
// can be used for direct KV operations.
func EncodeSettingKey(codec keys.SQLCodec, setting string) []byte {
Should we call this from `GetClusterVersionFromStorage`?
pkg/server/settingswatcher/setting_encoder.go
line 29 at r12 (raw file):
func EncodeSettingKey(codec keys.SQLCodec, setting string) []byte {
	indexPrefix := codec.IndexPrefix(keys.SettingsTableID, uint32(1))
	return encoding.EncodeUvarintAscending(encoding.EncodeStringAscending(indexPrefix, "version"), uint64(0))
Was this intending to pass `setting`, instead of `"version"`?
pkg/sql/set_cluster_setting.go
line 682 at r11 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Done
Added unit tests as well.
This turned out nicely.
pkg/sql/set_cluster_setting.go
line 636 at r13 (raw file):
// 1) We expect this transaction to only read and write to the
// version key in the system.settings table. To achieve the smallest
// possible txn and extra to other tables we are going to
"and extra to"
Something sounds off here.
pkg/server/settingswatcher/setting_encoder_test.go
line 31 at r12 (raw file):
)

func TestSettingsEncoder(t *testing.T) {
Nice test!
To be able to have minimal high-priority txns for bumping the cluster version, we need the ability to write to the settings table using KV calls. This patch adds the functions EncodeSettingKey and EncodeSettingValue to allow directly generating keys/values for Put/Get calls.

Release note: None
Previously, it was possible for the leasing subsystem to starve out attempts to set the cluster version during upgrades, since the leasing subsystem uses high priority txns for renewals. To address this, this patch makes the logic to set the cluster version high priority so it can't be pushed out by lease renewals.

Fixes: cockroachdb#113908

Release note (bug fix): Addressed a bug that could cause cluster version finalization to get starved out by descriptor lease renewals on larger clusters.
Force-pushed from ccf31fb to 07f3628
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @dt and @nvanbenschoten)
pkg/server/settingswatcher/setting_encoder.go
line 27 at r12 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we call this from `GetClusterVersionFromStorage`?
I'm tempted to leave these as general in case we run into similar use cases for other settings.
pkg/server/settingswatcher/setting_encoder.go
line 29 at r12 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Was this intending to pass `setting`, instead of `"version"`?
Yes, I was trying to generalize this code.
pkg/sql/set_cluster_setting.go
line 636 at r13 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"and extra to"
Something sounds off here.
Ugh..I meant to write:
To achieve the smallest
// possible txn and avoid extra operations on other keys, we are going to
// use KV call with EncodeSettingKey/EncodeSettingValue functions
// instead of using the internal executor.
@nvanbenschoten @chrisseto TFTR!

bors r+
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.

error creating merge commit from 07f3628 to blathers/backport-release-23.1-113996: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.

error creating merge commit from 07f3628 to blathers/backport-release-23.2-113996: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.x failed. See errors above.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Previously, it was possible for the leasing subsystem to starve out attempts to set the cluster version during upgrades, since the leasing subsystem uses high priority txns for renewals. To address this, this patch makes the logic to set the cluster version high priority so it can't be pushed out by lease renewals.
Fixes: #113908
Release note (bug fix): Addressed a bug that could cause cluster version finalization to get starved out by descriptor lease renewals on larger clusters.