Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
126336: sql: override StmtTimeout to 0 for background jobs r=annrpom a=annrpom

For background jobs and schema changes, our Internal
Executor inherits from cluster settings. If a statement
timeout is set cluster-wide, then background jobs will
respect this -- which does not make much sense; thus,
this patch overrides the StmtTimeout to 0s for background
jobs.

Fixes: #126261
Release note (bug fix): Background jobs no longer respect
a statement timeout.

126419: ui: fix query fetching store ids for db/tables r=xinhaoz a=xinhaoz

These queries were updated in #118904 to make them more performant. In that change we mistakenly removed the null check before reading the `rows` field, causing pages using this request to crash if this query fails.

Epic: none
Fixes: #126348

Release note (bug fix): Database overview and db details pages should not crash if the range information is not available.

126575: kv: add support for Lease.MinExpiration r=nvanbenschoten a=nvanbenschoten

Closes #125235.

This commit begins using the new `Lease.MinExpiration` field, which was introduced in 7781cfa. `Lease.MinExpiration` is assigned to epoch-based leases (and soon, leader leases) during upgrades from expiration-based lease. It is a simple, direct way to ensure that the expiration of the lease does not regress during such promotions.

We previously solved this issue in 6dd54b4 by manually heartbeating the node liveness record when we detected an expiration regression. This was backwards compatible, so it was a better backport target, but it was also complex, as demonstrated by the need for b2ac52a. In the future, we will be able to remove the manual heartbeat logic, as it is no longer necessary once MinTimestamp is supported by all clusters.

Epic: None

Co-authored-by: Annie Pompa <annie@cockroachlabs.com>
Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
  • Loading branch information
4 people committed Jul 8, 2024
4 parents eed9f5a + 178e453 + a907afb + ba758c0 commit b5ab804
Show file tree
Hide file tree
Showing 19 changed files with 285 additions and 45 deletions.
2 changes: 1 addition & 1 deletion docs/generated/settings/settings-for-tenants.txt
Original file line number Diff line number Diff line change
Expand Up @@ -352,4 +352,4 @@ trace.snapshot.rate duration 0s if non-zero, interval at which background trace
trace.span_registry.enabled boolean true if set, ongoing traces can be seen at https://<ui>/#/debug/tracez application
trace.zipkin.collector string the address of a Zipkin instance to receive traces, as <host>:<port>. If no port is specified, 9411 will be used. application
ui.display_timezone enumeration etc/utc the timezone used to format timestamps in the ui [etc/utc = 0, america/new_york = 1] application
version version 1000024.1-upgrading-to-1000024.2-step-010 set the active cluster version in the format '<major>.<minor>' application
version version 1000024.1-upgrading-to-1000024.2-step-012 set the active cluster version in the format '<major>.<minor>' application
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -307,6 +307,6 @@
<tr><td><div id="setting-trace-span-registry-enabled" class="anchored"><code>trace.span_registry.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>if set, ongoing traces can be seen at https://&lt;ui&gt;/#/debug/tracez</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-trace-zipkin-collector" class="anchored"><code>trace.zipkin.collector</code></div></td><td>string</td><td><code></code></td><td>the address of a Zipkin instance to receive traces, as &lt;host&gt;:&lt;port&gt;. If no port is specified, 9411 will be used.</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-ui-display-timezone" class="anchored"><code>ui.display_timezone</code></div></td><td>enumeration</td><td><code>etc/utc</code></td><td>the timezone used to format timestamps in the ui [etc/utc = 0, america/new_york = 1]</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-version" class="anchored"><code>version</code></div></td><td>version</td><td><code>1000024.1-upgrading-to-1000024.2-step-010</code></td><td>set the active cluster version in the format &#39;&lt;major&gt;.&lt;minor&gt;&#39;</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-version" class="anchored"><code>version</code></div></td><td>version</td><td><code>1000024.1-upgrading-to-1000024.2-step-012</code></td><td>set the active cluster version in the format &#39;&lt;major&gt;.&lt;minor&gt;&#39;</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
</tbody>
</table>
5 changes: 5 additions & 0 deletions pkg/clusterversion/cockroach_versions.go
Original file line number Diff line number Diff line change
Expand Up @@ -264,6 +264,10 @@ const (
// the `system.tenant_settings` row for the `version` setting.
V24_2_DeleteTenantSettingsVersion

// V24_2_LeaseMinTimestamp is the earlier version which supports the lease
// minimum timestamp field.
V24_2_LeaseMinTimestamp

// *************************************************
// Step (1) Add new versions above this comment.
// Do not add new versions to a patch release.
Expand Down Expand Up @@ -327,6 +331,7 @@ var versionTable = [numKeys]roachpb.Version{
V24_2_TenantSystemTables: {Major: 24, Minor: 1, Internal: 6},
V24_2_TenantRates: {Major: 24, Minor: 1, Internal: 8},
V24_2_DeleteTenantSettingsVersion: {Major: 24, Minor: 1, Internal: 10},
V24_2_LeaseMinTimestamp: {Major: 24, Minor: 1, Internal: 12},

// *************************************************
// Step (2): Add new versions above this comment.
Expand Down
19 changes: 9 additions & 10 deletions pkg/kv/kvserver/client_lease_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1668,10 +1668,8 @@ func TestLeaseRequestBumpsEpoch(t *testing.T) {

// TestLeaseRequestFromExpirationToEpochDoesNotRegressExpiration tests that a
// promotion from an expiration-based lease to an epoch-based lease does not
// permit the expiration time of the lease to regress. This is enforced by
// detecting cases where the leaseholder's liveness record's expiration trails
// its expiration-based lease's expiration and synchronously heartbeating the
// leaseholder's liveness record before promoting the lease.
// permit the expiration time of the lease to regress. This is enforced using
// the min_expiration field on the epoch-based lease.
func TestLeaseRequestFromExpirationToEpochDoesNotRegressExpiration(t *testing.T) {
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
Expand Down Expand Up @@ -1736,13 +1734,14 @@ func TestLeaseRequestFromExpirationToEpochDoesNotRegressExpiration(t *testing.T)
})

// Once the lease has been promoted to an epoch-based lease, the effective
// expiration (maintained indirectly in the liveness record) must be greater
// than that in the preceding expiration-based lease. If this were to regress,
// a non-cooperative lease failover to a third lease held by a different node
// expiration (maintained indirectly in the liveness record and directly in
// the lease through its min_expiration field) must be greater than or equal
// to that in the preceding expiration-based lease. If this were to regress, a
// non-cooperative lease failover to a third lease held by a different node
// could overlap in MVCC time with the first lease (i.e. its start time could
// precede expLease.Expiration), violating the lease disjointness property.
//
// If we disable the `expToEpochPromo` branch in replica_range_lease.go, this
// assertion fails.
require.True(t, expLease.Expiration().Less(epochLease.Expiration()))
// If we disable the `expPromo` and branch in leases/build.go, this assertion
// fails.
require.True(t, expLease.Expiration().LessEq(epochLease.Expiration()))
}
8 changes: 7 additions & 1 deletion pkg/kv/kvserver/kvserverpb/lease_status.go
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,13 @@ func (st LeaseStatus) Expiration() hlc.Timestamp {
case roachpb.LeaseExpiration:
return st.Lease.GetExpiration()
case roachpb.LeaseEpoch:
return st.Liveness.Expiration.ToTimestamp()
exp := st.Lease.MinExpiration
// The expiration of the liveness record is inherited by the lease iff the
// lease epoch matches the liveness epoch.
if st.Lease.Epoch == st.Liveness.Epoch {
exp.Forward(st.Liveness.Expiration.ToTimestamp())
}
return exp
default:
panic("unexpected")
}
Expand Down
5 changes: 4 additions & 1 deletion pkg/kv/kvserver/kvserverpb/lease_status.proto
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,10 @@ enum LeaseState {

// LeaseStatus holds the lease state, the current clock time at which the
// state is accurate, the request time at which the status is accurate, the
// lease iself, and optionally the liveness if the lease is epoch-based.
// lease itself, and optionally the liveness if the lease is epoch-based.
//
// LeaseStatus is not persisted anywhere. However, it is part of the
// serverpb.RangeInfo proto.
message LeaseStatus {
// Lease which this status describes.
roachpb.Lease lease = 1 [(gogoproto.nullable) = false];
Expand Down
53 changes: 39 additions & 14 deletions pkg/kv/kvserver/leases/build.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ type Settings struct {
UseExpirationLeases bool
TransferExpirationLeases bool
ExpToEpochEquiv bool
MinExpirationSupported bool
RangeLeaseDuration time.Duration
}

Expand Down Expand Up @@ -93,12 +94,6 @@ func (i BuildInput) Remote() bool { return !i.PrevLocal() && !i.NextLocal() }

// PrevLeaseExpiration returns the expiration time of the previous lease.
func (i BuildInput) PrevLeaseExpiration() hlc.Timestamp {
if i.PrevLease.Type() == roachpb.LeaseEpoch && i.PrevLeaseNodeLiveness.Epoch != i.PrevLease.Epoch {
// For epoch-based leases, we only consider the liveness expiration if it
// matches the lease epoch. If the liveness epoch is greater than the lease
// epoch, we consider the lease expired.
return hlc.Timestamp{}
}
return kvserverpb.LeaseStatus{
Lease: i.PrevLease,
Liveness: i.PrevLeaseNodeLiveness,
Expand Down Expand Up @@ -220,6 +215,8 @@ func Build(st Settings, nl NodeLiveness, i BuildInput) (Output, error) {
}
nextLease.Epoch = l.Epoch
nextLeaseLiveness = &l.Liveness

nextLease.MinExpiration = leaseMinTimestamp(st, i, nextLeaseType)
default:
panic("unexpected")
}
Expand All @@ -235,7 +232,7 @@ func Build(st Settings, nl NodeLiveness, i BuildInput) (Output, error) {
// Construct the output and determine whether any node liveness manipulation
// is necessary before the lease can be requested.
o := Output{NextLease: nextLease}
o.NodeLivenessManipulation = nodeLivenessManipulation(i, nextLease, nextLeaseLiveness)
o.NodeLivenessManipulation = nodeLivenessManipulation(st, i, nextLease, nextLeaseLiveness)

// Validate the output.
if err := o.validate(i); err != nil {
Expand Down Expand Up @@ -345,6 +342,25 @@ func leaseEpoch(
return l, nil
}

func leaseMinTimestamp(st Settings, i BuildInput, nextType roachpb.LeaseType) hlc.Timestamp {
if nextType != roachpb.LeaseEpoch {
panic("leaseMinTimestamp called for non-epoch lease")
}
if !st.MinExpirationSupported {
return hlc.Timestamp{}
}
// If we are promoting an expiration-based lease to an epoch-based lease, we
// must make sure the expiration does not regress. Do so by assigning a
// minimum expiration time to the new lease, which sets a lower bound for the
// lease's expiration, independent of the expiration stored indirectly in the
// liveness record.
expPromo := i.Extension() && i.PrevLease.Type() == roachpb.LeaseExpiration
if expPromo {
return *i.PrevLease.Expiration
}
return hlc.Timestamp{}
}

func leaseDeprecatedStartStasis(i BuildInput, nextExpiration *hlc.Timestamp) *hlc.Timestamp {
if i.Transfer() {
// We don't set StartStasis for lease transfers. It's not clear why this was
Expand Down Expand Up @@ -381,7 +397,7 @@ func leaseSequence(st Settings, i BuildInput, nextLease roachpb.Lease) roachpb.L
}

func nodeLivenessManipulation(
i BuildInput, nextLease roachpb.Lease, nextLeaseLiveness *livenesspb.Liveness,
st Settings, i BuildInput, nextLease roachpb.Lease, nextLeaseLiveness *livenesspb.Liveness,
) NodeLivenessManipulation {
// If we are promoting an expiration-based lease to an epoch-based lease, we
// must make sure the expiration does not regress. We do this here because the
Expand All @@ -390,12 +406,21 @@ func nodeLivenessManipulation(
// manually heartbeat our liveness record if necessary. This is expected to
// work because the liveness record interval and the expiration-based lease
// interval are the same.
expToEpochPromo := i.Extension() &&
i.PrevLease.Type() == roachpb.LeaseExpiration && nextLease.Type() == roachpb.LeaseEpoch
if expToEpochPromo && nextLeaseLiveness.Expiration.ToTimestamp().Less(i.PrevLeaseExpiration()) {
return NodeLivenessManipulation{
Heartbeat: nextLeaseLiveness,
HeartbeatMinExpiration: i.PrevLeaseExpiration(),
//
// We only need to perform this check if the minimum expiration field is not
// supported by the current cluster version. Otherwise, that field will be
// used to enforce the minimum expiration time.
//
// TODO(nvanbenschoten): remove this logic when we no longer support clusters
// that do not support the minimum expiration field.
if !st.MinExpirationSupported {
expToEpochPromo := i.Extension() &&
i.PrevLease.Type() == roachpb.LeaseExpiration && nextLease.Type() == roachpb.LeaseEpoch
if expToEpochPromo && nextLeaseLiveness.Expiration.ToTimestamp().Less(i.PrevLeaseExpiration()) {
return NodeLivenessManipulation{
Heartbeat: nextLeaseLiveness,
HeartbeatMinExpiration: i.PrevLeaseExpiration(),
}
}
}

Expand Down
29 changes: 28 additions & 1 deletion pkg/kv/kvserver/leases/build_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,7 @@ func defaultSettings() Settings {
UseExpirationLeases: false,
TransferExpirationLeases: true,
ExpToEpochEquiv: true,
MinExpirationSupported: true,
RangeLeaseDuration: 20,
}
}
Expand Down Expand Up @@ -440,6 +441,7 @@ func TestBuild(t *testing.T) {
Epoch: 3,
Sequence: 7, // sequence not changed
AcquisitionType: roachpb.LeaseAcquisitionType_Request,
MinExpiration: ts30,
},
},
},
Expand All @@ -448,6 +450,7 @@ func TestBuild(t *testing.T) {
st: func() Settings {
st := defaultSettings()
st.ExpToEpochEquiv = false
st.MinExpirationSupported = false
return st
}(),
input: expirationInput,
Expand All @@ -463,7 +466,31 @@ func TestBuild(t *testing.T) {
},
},
{
name: "promote expiration to epoch, needs liveness heartbeat",
name: "promote expiration to epoch, needs min_expiration",
input: func() BuildInput {
i := expirationInput
i.PrevLease.Expiration = &ts40
return i
}(),
expOutput: Output{
NextLease: roachpb.Lease{
Replica: repl1,
Start: cts10,
ProposedTS: cts20,
Epoch: 3,
Sequence: 7, // sequence not changed
AcquisitionType: roachpb.LeaseAcquisitionType_Request,
MinExpiration: ts40,
},
},
},
{
name: "promote expiration to epoch, needs liveness heartbeat, pre-24.2",
st: func() Settings {
st := defaultSettings()
st.MinExpirationSupported = false
return st
}(),
input: func() BuildInput {
i := expirationInput
i.PrevLease.Expiration = &ts40
Expand Down
17 changes: 8 additions & 9 deletions pkg/kv/kvserver/leases/status.go
Original file line number Diff line number Diff line change
Expand Up @@ -113,10 +113,9 @@ func Status(ctx context.Context, nl NodeLiveness, i StatusInput) kvserverpb.Leas
RequestTime: i.RequestTs,
MinValidObservedTimestamp: i.MinValidObservedTs,
}
var expiration hlc.Timestamp
if lease.Type() == roachpb.LeaseExpiration {
expiration = lease.GetExpiration()
} else {
if lease.Type() == roachpb.LeaseEpoch {
// For epoch-based leases, retrieve the node liveness record associated with
// the lease.
l, ok := nl.GetLiveness(lease.Replica.NodeID)
status.Liveness = l.Liveness
if !ok || status.Liveness.Epoch < lease.Epoch {
Expand All @@ -138,12 +137,12 @@ func Status(ctx context.Context, nl NodeLiveness, i StatusInput) kvserverpb.Leas
status.ErrInfo = msg.String()
return status
}
if status.Liveness.Epoch > lease.Epoch {
status.State = kvserverpb.LeaseState_EXPIRED
return status
}
expiration = status.Liveness.Expiration.ToTimestamp()
// If the epoch matches (l.Epoch == lease.Epoch), status.Expiration uses the
// liveness record's expiration to determine the expiration of the lease. If
// not, the lease is likely expired, but its minimum expiration may still be
// in the future (also consulted in status.Expiration), so we check below.
}
expiration := status.Expiration()
maxOffset := i.MaxOffset
stasis := expiration.Add(-int64(maxOffset), 0)
ownedLocally := lease.OwnedBy(i.LocalStoreID)
Expand Down
10 changes: 9 additions & 1 deletion pkg/kv/kvserver/leases/status_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,9 @@ func TestStatus(t *testing.T) {
Epoch: 5,
ProposedTS: expLease.ProposedTS,
}
epoLeaseMinExp1, epoLeaseMinExp2 := epoLease, epoLease
epoLeaseMinExp1.MinExpiration = ts[3].ToTimestamp()
epoLeaseMinExp2.MinExpiration = ts[5].ToTimestamp()

oldLiveness := livenesspb.Liveness{
NodeID: 1, Epoch: 4, Expiration: hlc.LegacyTimestamp{WallTime: ts[1].WallTime},
Expand Down Expand Up @@ -87,6 +90,9 @@ func TestStatus(t *testing.T) {
{lease: epoLease, now: ts[5], liveness: curLiveness, want: kvserverpb.LeaseState_EXPIRED},
{lease: epoLease, now: ts[4], liveness: curLiveness, want: kvserverpb.LeaseState_EXPIRED},
{lease: epoLease, now: ts[2], liveness: expLiveness, want: kvserverpb.LeaseState_EXPIRED},
{lease: epoLeaseMinExp1, now: ts[5], liveness: curLiveness, want: kvserverpb.LeaseState_EXPIRED},
{lease: epoLeaseMinExp1, now: ts[4], liveness: curLiveness, want: kvserverpb.LeaseState_EXPIRED},
{lease: epoLeaseMinExp2, now: ts[5], liveness: curLiveness, want: kvserverpb.LeaseState_EXPIRED},
// Epoch-based lease, PROSCRIBED.
{lease: epoLease, now: ts[3], liveness: curLiveness, minProposedTS: ts[1],
want: kvserverpb.LeaseState_PROSCRIBED},
Expand All @@ -97,7 +103,9 @@ func TestStatus(t *testing.T) {
want: kvserverpb.LeaseState_UNUSABLE},
// Epoch-based lease, VALID.
{lease: epoLease, now: ts[2], liveness: curLiveness, want: kvserverpb.LeaseState_VALID},

{lease: epoLeaseMinExp1, now: ts[2], liveness: expLiveness, want: kvserverpb.LeaseState_VALID},
{lease: epoLeaseMinExp2, now: ts[2], liveness: expLiveness, want: kvserverpb.LeaseState_VALID},
{lease: epoLeaseMinExp2, now: ts[4], liveness: curLiveness, want: kvserverpb.LeaseState_VALID},
// Epoch-based lease, ERROR.
{lease: epoLease, now: ts[2], want: kvserverpb.LeaseState_ERROR,
wantErr: "liveness record not found"},
Expand Down
1 change: 1 addition & 0 deletions pkg/kv/kvserver/replica_range_lease.go
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,7 @@ func (p *pendingLeaseRequest) InitOrJoinRequest(
UseExpirationLeases: p.repl.shouldUseExpirationLeaseRLocked(),
TransferExpirationLeases: TransferExpirationLeasesFirstEnabled.Get(&p.repl.store.ClusterSettings().SV),
ExpToEpochEquiv: p.repl.store.ClusterSettings().Version.IsActive(ctx, clusterversion.V24_1Start),
MinExpirationSupported: p.repl.store.ClusterSettings().Version.IsActive(ctx, clusterversion.V24_2_LeaseMinTimestamp),
RangeLeaseDuration: p.repl.store.cfg.RangeLeaseDuration,
}
nl := p.repl.store.cfg.NodeLiveness
Expand Down
4 changes: 4 additions & 0 deletions pkg/server/server_sql.go
Original file line number Diff line number Diff line change
Expand Up @@ -1229,6 +1229,10 @@ func newSQLServer(ctx context.Context, cfg sqlServerArgs) (*SQLServer, error) {

// Job internal operations use the node principal.
sd.UserProto = username.NodeUserName().EncodeProto()

// The following should not apply to SQL operations performed by the jobs
// subsystem.
sd.StmtTimeout = 0
})
jobRegistry.SetInternalDB(jobsInternalDB)

Expand Down
1 change: 1 addition & 0 deletions pkg/sql/tests/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ go_test(
"hash_sharded_test.go",
"impure_builtin_test.go",
"inverted_index_test.go",
"job_statement_timeout_test.go",
"kv_test.go",
"main_test.go",
"max_open_txns_test.go",
Expand Down
Loading

0 comments on commit b5ab804

Please sign in to comment.