120507: sql: default sql.stats.statement_fingerprint.format_mask to use special flags r=xinhaoz a=xinhaoz

Please note only the latest commit should be reviewed.

------------------------
By default `sql.stats.statement_fingerprint.format_mask` is now
set to `FmtCollapseLists|FmtConstantsAsUnderscores` to reduce
statement fingerprint cardinality due to long constant lists and
variations in constant formatting. Note that the default fmt flag
for statement fingerprint generation is `FmtHideConstants`. Any
flags set with `sql.stats.statement_fingerprint.format_mask` will be
OR'd with `FmtHideConstants`.
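
As a rough illustration of the OR behaviour described above, the sketch below combines a mask with a base flag. The `FmtFlags` type, the bit values, and the `effectiveFlags` helper are assumptions made for this example; the real constants are defined by the SQL formatter and may use different values.

```go
package main

import "fmt"

// FmtFlags is a hypothetical stand-in for the formatter's bit-flag type.
type FmtFlags uint32

// Illustrative bit values only. The real flags live in the SQL formatter
// and their actual bit positions are not reproduced here.
const (
	FmtHideConstants FmtFlags = 1 << iota
	FmtCollapseLists
	FmtConstantsAsUnderscores
)

// effectiveFlags (hypothetical helper) ORs the cluster-setting mask into
// the base flag, so FmtHideConstants is always part of the result.
func effectiveFlags(mask FmtFlags) FmtFlags {
	return FmtHideConstants | mask
}

func main() {
	// The new default mask named in the commit message.
	defaultMask := FmtCollapseLists | FmtConstantsAsUnderscores
	flags := effectiveFlags(defaultMask)

	fmt.Println(flags&FmtHideConstants != 0)          // base flag is kept
	fmt.Println(flags&FmtCollapseLists != 0)          // mask bit is added
	fmt.Println(flags&FmtConstantsAsUnderscores != 0) // mask bit is added
}
```

Because the mask is OR'd in, setting it to zero would leave only `FmtHideConstants` in this sketch; the mask can add flags but cannot remove the base one.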

Closes: #120409

Release note (sql change): Users will see the following changes
in their generated statement fingerprints from SQL stats:
- lists with only literals/placeholders and similar subexpressions are
shortened to their first item followed by "__more__"
- constants and placeholders are all replaced with the same character,
an underscore `_`

For example:
```
SELECT * FROM foo WHERE f IN (1, $1, 1+2) ->
SELECT * FROM foo WHERE f IN (_, __more__)
```
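
The collapsing rule in the release note can be imitated with a toy function. This is only a sketch: `collapseConstList` is a hypothetical helper that assumes every item has already been classified as a literal, placeholder, or similar subexpression, and the single-item and empty-list outputs are assumptions not stated in the release note. The real fingerprinting works on the parsed statement, not on strings.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseConstList (hypothetical) renders a list whose items are all
// constant-like. The first item becomes "_" and any remaining items are
// summarised as "__more__".
func collapseConstList(items []string) string {
	switch len(items) {
	case 0:
		return "()" // assumption: empty lists are left empty
	case 1:
		return "(_)" // assumption: nothing to collapse for one item
	default:
		return "(_, __more__)"
	}
}

func main() {
	items := []string{"1", "$1", "1+2"}
	before := "SELECT * FROM foo WHERE f IN (" + strings.Join(items, ", ") + ")"
	after := "SELECT * FROM foo WHERE f IN " + collapseConstList(items)
	fmt.Println(before)
	fmt.Println(after)
}
```

The practical effect is that statements differing only in the length or content of such a list map to one fingerprint, which is the cardinality reduction the change is after.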

120596: kvcoord: add observability for DistSender circuit breakers r=erikgrinaker a=erikgrinaker

**util/circuit: add `error` parameter for `EventHandler.OnReset`**

This allows the `OnReset` handler to determine whether the circuit breaker was tripped when reset. This can happen e.g. if an async probe succeeds before the breaker has been tripped.
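
A minimal sketch of why the extra parameter is useful, under the assumption that the handler receives the breaker's previous error and that a nil value means the breaker had not tripped. The `EventHandler` interface here is a hypothetical, trimmed-down version; the real interface in `util/circuit` has more methods and a different parameter list.

```go
package main

import (
	"errors"
	"fmt"
)

// errTripped is a sample trip error used by the example.
var errTripped = errors.New("probe timed out")

// EventHandler is a hypothetical, reduced handler interface.
type EventHandler interface {
	// OnReset is called when the breaker resets. prev is assumed to be
	// the error the breaker was tripped with, or nil if it never tripped.
	OnReset(prev error)
}

// trippedGauge tracks how many breakers are currently tripped.
type trippedGauge struct{ tripped int }

// OnReset decrements the gauge only if the breaker had actually tripped.
func (g *trippedGauge) OnReset(prev error) {
	if prev == nil {
		return // reset of an untripped breaker, e.g. an early probe success
	}
	g.tripped--
}

func main() {
	g := &trippedGauge{tripped: 1}
	var h EventHandler = g

	h.OnReset(nil) // not tripped: gauge is left alone
	fmt.Println(g.tripped)

	h.OnReset(errTripped) // tripped: gauge is decremented
	fmt.Println(g.tripped)
}
```

Without the previous error, a handler like this could not tell the two resets apart and would risk driving a "currently tripped" gauge negative.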

**kvcoord: add DistSender circuit breaker metrics**

This patch adds an initial set of DistSender circuit breaker metrics:

* distsender.circuit_breaker.replicas.count
* distsender.circuit_breaker.replicas.tripped
* distsender.circuit_breaker.replicas.tripped_events
* distsender.circuit_breaker.replicas.probes.running
* distsender.circuit_breaker.replicas.probes.success
* distsender.circuit_breaker.replicas.probes.failure
* distsender.circuit_breaker.replicas.requests.cancelled
* distsender.circuit_breaker.replicas.requests.rejected
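
The list mixes gauges (values that go up and down, such as `probes.running`) with counters (cumulative values, such as `probes.success`). The sketch below shows that distinction with plain atomics. `breakerMetrics` and `runProbe` are hypothetical names for this example, and counting one success or failure per `runProbe` call is a simplification, not a description of how DistSender drives its probes.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errProbe is a sample probe failure used by the example.
var errProbe = errors.New("probe failed")

// breakerMetrics is a hypothetical, simplified mirror of three of the
// metrics listed above.
type breakerMetrics struct {
	probesRunning atomic.Int64 // gauge: rises and falls
	probesSuccess atomic.Int64 // counter: only ever increases
	probesFailure atomic.Int64 // counter: only ever increases
}

// runProbe wraps a probe function and records its lifecycle.
func (m *breakerMetrics) runProbe(probe func() error) error {
	m.probesRunning.Add(1)
	defer m.probesRunning.Add(-1)

	err := probe()
	if err != nil {
		m.probesFailure.Add(1)
	} else {
		m.probesSuccess.Add(1)
	}
	return err
}

func main() {
	var m breakerMetrics
	_ = m.runProbe(func() error { return nil })
	_ = m.runProbe(func() error { return errProbe })

	fmt.Println("running:", m.probesRunning.Load())
	fmt.Println("success:", m.probesSuccess.Load())
	fmt.Println("failure:", m.probesFailure.Load())
}
```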

**kvcoord: annotate DistSender circuit breaker async contexts**

**kvcoord: improve DistSender circuit breaker errors**

**kvcoord: add logging/tracing for DistSender circuit breakers**

By default, only the trip/reset events are logged. Enabling vmodule logging yields:

```
I240316 13:07:33.815949 113 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:899  [T1,Vsystem,n1] 243  launching circuit breaker probe for r68/(n3,s3):3 (tripped=false stall=1.318s error=0s)
I240316 13:07:33.816084 4941 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:811  [T1,Vsystem,n1] 244  sending probe to r68/(n3,s3):3: LeaseInfo [/Table/Max]
I240316 13:07:34.816556 4941 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:813  [T1,Vsystem,n1] 249  probe result from r68/(n3,s3):3: br=<nil> err=ba: LeaseInfo [/Table/Max] RPC error: grpc: context deadline exceeded [code 4/DeadlineExceeded]
E240316 13:07:34.816788 4941 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:873  [T1,Vsystem,n1] 250  r68/(n3,s3):3 circuit breaker tripped: probe timed out: context deadline exceeded (stalled for 2.32s, erroring for 0s)
E240316 13:07:38.023324 5287 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:519  [T1,Vsystem,n1] 535  request rejected by tripped circuit breaker for r68/(n3,s3):3: probe timed out: context deadline exceeded
I240316 13:07:38.024899 9324 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:813  [T1,Vsystem,n1] 550  probe result from r68/(n3,s3):3: br=<nil> err=ba: LeaseInfo [/Table/Max] RPC error: grpc: context canceled [code 1/Canceled]
I240316 13:07:38.025341 9324 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:890  [T1,Vsystem,n1] 556  r68/(n3,s3):3 circuit breaker reset
I240316 13:07:38.025374 9324 kv/kvclient/kvcoord/dist_sender_circuit_breaker.go:906  [T1,Vsystem,n1] 557  stopping circuit breaker probe for r68/(n3,s3):3 (tripped=false lastRequest=1.344s)
```

Resolves #119916.
Epic: none
Release note: None

**kvcoord: use duration helpers in DistSender circuit breakers**

**kvcoord: change DistSender circuit breaker string identifier**

Changes e.g. `r68/(n3,s3):3` to `r68/3:(n3,s3)`.
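
To make the reordering concrete, here is a small sketch that prints both forms. It reads the identifier as range ID, node ID, store ID, and replica ID; that reading, the `replicaDesc` struct, and both method names are assumptions for illustration rather than the code changed by this commit.

```go
package main

import "fmt"

// replicaDesc is a hypothetical struct holding the parts of the identifier.
type replicaDesc struct {
	rangeID   int
	nodeID    int
	storeID   int
	replicaID int
}

// oldString renders the previous form, e.g. r68/(n3,s3):3.
func (r replicaDesc) oldString() string {
	return fmt.Sprintf("r%d/(n%d,s%d):%d", r.rangeID, r.nodeID, r.storeID, r.replicaID)
}

// newString renders the new form, e.g. r68/3:(n3,s3), which puts the
// replica ID directly after the range ID.
func (r replicaDesc) newString() string {
	return fmt.Sprintf("r%d/%d:(n%d,s%d)", r.rangeID, r.replicaID, r.nodeID, r.storeID)
}

func main() {
	r := replicaDesc{rangeID: 68, nodeID: 3, storeID: 3, replicaID: 3}
	fmt.Println(r.oldString())
	fmt.Println(r.newString())
}
```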

120908: go.mod: bump Pebble to 10ebcdd794ec r=itsbilal a=aadityasondhi

Changes:

 * [`10ebcdd7`](cockroachdb/pebble@10ebcdd7) metamorphic: track IngestAndExcise in keymgr and resolve singledel conflicts
 * [`b3c1664a`](cockroachdb/pebble@b3c1664a) db: fix nil map error when ingest-splitting during flushable ingests
 * [`fd5dc141`](cockroachdb/pebble@fd5dc141) metamorphic: re-enable ingest split and ingestAndExcise in TestMeta
 * [`4335ae09`](cockroachdb/pebble@4335ae09) keyspan: simplify Filter
 * [`30f455fa`](cockroachdb/pebble@30f455fa) keyspan: simplify and reimplement Truncate
 * [`e2f53d2d`](cockroachdb/pebble@e2f53d2d) wal: fix recordQueue bug due to forgetting to mod when indexing
 * [`8cdabcc9`](cockroachdb/pebble@8cdabcc9) metamorphic: add support for external ingestions in replicateOp
 * [`e5c9f635`](cockroachdb/pebble@e5c9f635) db: remove strictWALTail option
 * [`3c9893d6`](cockroachdb/pebble@3c9893d6) ingest: fix ingestion metric for flushableIngest
 * [`8a097e8a`](cockroachdb/pebble@8a097e8a) ingest,compaction: use excise when flushing flushableIngest
 * [`dc7ccb2b`](cockroachdb/pebble@dc7ccb2b) vfs/vfstest: adjust WithOpenFileTracking to not wrap typed nils
 * [`c268820f`](cockroachdb/pebble@c268820f) meta: temporarily disable downloadOp
 * [`2aa3786c`](cockroachdb/pebble@2aa3786c) sstable: split block.go
 * [`702f8cc3`](cockroachdb/pebble@702f8cc3) base: add UserKeyBounds

Release note: none.
Epic: none.

Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
Co-authored-by: Aaditya Sondhi <20070511+aadityasondhi@users.noreply.github.com>
4 people committed Mar 22, 2024
4 parents 01c8fa0 + bf24702 + 4c6b02a + 8d2d89b commit a2e6ec5
Showing 32 changed files with 357 additions and 139 deletions.
6 changes: 3 additions & 3 deletions DEPS.bzl
@@ -1693,10 +1693,10 @@ def go_deps():
patches = [
"@com_github_cockroachdb_cockroach//build/patches:com_github_cockroachdb_pebble.patch",
],
sha256 = "e258498e380ea2266386054a1277c98dc58f58bbe5df29f5f91d3c163a9e5231",
strip_prefix = "github.com/cockroachdb/pebble@v0.0.0-20240320172852-7b8b3d5a8211",
sha256 = "f91c1724434bde2eb5d1c5fa5678f485056e963faa23c3974ac6bc9899aa8b18",
strip_prefix = "github.com/cockroachdb/pebble@v0.0.0-20240322192100-10ebcdd794ec",
urls = [
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/pebble/com_github_cockroachdb_pebble-v0.0.0-20240320172852-7b8b3d5a8211.zip",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/pebble/com_github_cockroachdb_pebble-v0.0.0-20240322192100-10ebcdd794ec.zip",
],
)
go_repository(
2 changes: 1 addition & 1 deletion build/bazelutil/distdir_files.bzl
@@ -330,7 +330,7 @@ DISTDIR_FILES = {
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/gostdlib/com_github_cockroachdb_gostdlib-v1.19.0.zip": "c4d516bcfe8c07b6fc09b8a9a07a95065b36c2855627cb3514e40c98f872b69e",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/logtags/com_github_cockroachdb_logtags-v0.0.0-20230118201751-21c54148d20b.zip": "ca7776f47e5fecb4c495490a679036bfc29d95bd7625290cfdb9abb0baf97476",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/metamorphic/com_github_cockroachdb_metamorphic-v0.0.0-20231108215700-4ba948b56895.zip": "28c8cf42192951b69378cf537be5a9a43f2aeb35542908cc4fe5f689505853ea",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/pebble/com_github_cockroachdb_pebble-v0.0.0-20240320172852-7b8b3d5a8211.zip": "e258498e380ea2266386054a1277c98dc58f58bbe5df29f5f91d3c163a9e5231",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/pebble/com_github_cockroachdb_pebble-v0.0.0-20240322192100-10ebcdd794ec.zip": "f91c1724434bde2eb5d1c5fa5678f485056e963faa23c3974ac6bc9899aa8b18",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/redact/com_github_cockroachdb_redact-v1.1.5.zip": "11b30528eb0dafc8bc1a5ba39d81277c257cbe6946a7564402f588357c164560",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/returncheck/com_github_cockroachdb_returncheck-v0.0.0-20200612231554-92cdbca611dd.zip": "ce92ba4352deec995b1f2eecf16eba7f5d51f5aa245a1c362dfe24c83d31f82b",
"https://storage.googleapis.com/cockroach-godeps/gomod/github.com/cockroachdb/stress/com_github_cockroachdb_stress-v0.0.0-20220803192808-1806698b1b7b.zip": "3fda531795c600daf25532a4f98be2a1335cd1e5e182c72789bca79f5f69fcc1",
8 changes: 8 additions & 0 deletions docs/generated/metrics/metrics.html
@@ -827,6 +827,14 @@
<tr><td>APPLICATION</td><td>distsender.batches.async.sent</td><td>Number of partial batches sent asynchronously</td><td>Partial Batches</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.batches.async.throttled</td><td>Number of partial batches not sent asynchronously due to throttling</td><td>Partial Batches</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.batches.partial</td><td>Number of partial batches processed after being divided on range boundaries</td><td>Partial Batches</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.count</td><td>Number of replicas currently tracked by DistSender circuit breakers</td><td>Replicas</td><td>GAUGE</td><td>COUNT</td><td>AVG</td><td>NONE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.probes.failure</td><td>Cumulative number of failed DistSender replica circuit breaker probes</td><td>Probes</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.probes.running</td><td>Number of currently running DistSender replica circuit breaker probes</td><td>Probes</td><td>GAUGE</td><td>COUNT</td><td>AVG</td><td>NONE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.probes.success</td><td>Cumulative number of successful DistSender replica circuit breaker probes</td><td>Probes</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.requests.cancelled</td><td>Cumulative number of requests cancelled when DistSender replica circuit breakers trip</td><td>Requests</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.requests.rejected</td><td>Cumulative number of requests rejected by tripped DistSender replica circuit breakers</td><td>Requests</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.tripped</td><td>Number of DistSender replica circuit breakers currently tripped</td><td>Replicas</td><td>GAUGE</td><td>COUNT</td><td>AVG</td><td>NONE</td></tr>
<tr><td>APPLICATION</td><td>distsender.circuit_breaker.replicas.tripped_events</td><td>Cumulative number of DistSender replica circuit breakers tripped over time</td><td>Replicas</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.errors.inleasetransferbackoffs</td><td>Number of times backed off due to NotLeaseHolderErrors during lease transfer</td><td>Errors</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.errors.notleaseholder</td><td>Number of NotLeaseHolderErrors encountered from replica-addressed RPCs</td><td>Errors</td><td>COUNTER</td><td>COUNT</td><td>AVG</td><td>NON_NEGATIVE_DERIVATIVE</td></tr>
<tr><td>APPLICATION</td><td>distsender.rangefeed.catchup_ranges</td><td>Number of ranges in catchup mode<br/><br/>This counts the number of ranges with an active rangefeed that are performing catchup scan.<br/></td><td>Ranges</td><td>GAUGE</td><td>COUNT</td><td>AVG</td><td>NONE</td></tr>
2 changes: 1 addition & 1 deletion go.mod
@@ -124,7 +124,7 @@ require (
github.com/cockroachdb/go-test-teamcity v0.0.0-20191211140407-cff980ad0a55
github.com/cockroachdb/gostdlib v1.19.0
github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b
github.com/cockroachdb/pebble v0.0.0-20240320172852-7b8b3d5a8211
github.com/cockroachdb/pebble v0.0.0-20240322192100-10ebcdd794ec
github.com/cockroachdb/redact v1.1.5
github.com/cockroachdb/returncheck v0.0.0-20200612231554-92cdbca611dd
github.com/cockroachdb/stress v0.0.0-20220803192808-1806698b1b7b
4 changes: 2 additions & 2 deletions go.sum
@@ -507,8 +507,8 @@ github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b h1:r6VH0faHjZe
github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b/go.mod h1:Vz9DsVWQQhf3vs21MhPMZpMGSht7O/2vFW2xusFUVOs=
github.com/cockroachdb/metamorphic v0.0.0-20231108215700-4ba948b56895 h1:XANOgPYtvELQ/h4IrmPAohXqe2pWA8Bwhejr3VQoZsA=
github.com/cockroachdb/metamorphic v0.0.0-20231108215700-4ba948b56895/go.mod h1:aPd7gM9ov9M8v32Yy5NJrDyOcD8z642dqs+F0CeNXfA=
github.com/cockroachdb/pebble v0.0.0-20240320172852-7b8b3d5a8211 h1:P4IriHxRJeIGtpzJSbWtw+FzM07k09Hpi8f7f/Lo3yE=
github.com/cockroachdb/pebble v0.0.0-20240320172852-7b8b3d5a8211/go.mod h1:4vn8KzcL6D2yW6hZAabweFFHVYSIL6z9BKTAEBvAmS4=
github.com/cockroachdb/pebble v0.0.0-20240322192100-10ebcdd794ec h1:pvg/EIMk3MMADtSNi8i9akNZupSpdLygaHXGIAXyVOw=
github.com/cockroachdb/pebble v0.0.0-20240322192100-10ebcdd794ec/go.mod h1:4vn8KzcL6D2yW6hZAabweFFHVYSIL6z9BKTAEBvAmS4=
github.com/cockroachdb/redact v1.1.3/go.mod h1:BVNblN9mBWFyMyqK1k3AAiSxhvhfK2oOZZ2lK+dpvRg=
github.com/cockroachdb/redact v1.1.5 h1:u1PMllDkdFfPWaNGMyLD1+so+aq3uUItthCFqzwPJ30=
github.com/cockroachdb/redact v1.1.5/go.mod h1:BVNblN9mBWFyMyqK1k3AAiSxhvhfK2oOZZ2lK+dpvRg=
12 changes: 6 additions & 6 deletions pkg/ccl/logictestccl/testdata/logic_test/crdb_internal_tenant
@@ -1,4 +1,4 @@
# LogicTest: 3node-tenant
# LogicTest: 3node-tenant

query II
SELECT count(distinct(node_id)), count(*) FROM crdb_internal.node_runtime_info
@@ -543,11 +543,11 @@ SELECT key, max_retries, failure_count
WHERE application_name = 'test_max_retry'
ORDER BY key
----
CREATE SEQUENCE s 0 0
DROP SEQUENCE s 0 0
RESET application_name 0 0
SELECT IF(nextval('_') < _, crdb_internal.force_retry('_'::INTERVAL), _) 0 1
SELECT IF(nextval(_) < _, crdb_internal.force_retry(_), _) 2 1
CREATE SEQUENCE s 0 0
DROP SEQUENCE s 0 0
RESET application_name 0 0
SELECT IF(nextval(_) < _, crdb_internal.force_retry(_), _) 2 1
SELECT IF(nextval(_) < _, crdb_internal.force_retry(_::INTERVAL), _) 0 1

query T
SELECT crdb_internal.cluster_name()
4 changes: 2 additions & 2 deletions pkg/ccl/serverccl/statusccl/tenant_status_test.go
@@ -427,7 +427,7 @@ func TestTenantCannotSeeNonTenantStats(t *testing.T) {
{stmt: `CREATE TABLE posts_t (id INT8 PRIMARY KEY, body STRING)`},
{
stmt: `INSERT INTO posts_t VALUES (1, 'foo')`,
fingerprint: `INSERT INTO posts_t VALUES (_, '_')`,
fingerprint: `INSERT INTO posts_t VALUES (_, __more__)`,
},
{stmt: `SELECT * FROM posts_t`},
}
@@ -446,7 +446,7 @@ func TestTenantCannotSeeNonTenantStats(t *testing.T) {
{stmt: `CREATE TABLE posts_nt (id INT8 PRIMARY KEY, body STRING)`},
{
stmt: `INSERT INTO posts_nt VALUES (1, 'foo')`,
fingerprint: `INSERT INTO posts_nt VALUES (_, '_')`,
fingerprint: `INSERT INTO posts_nt VALUES (_, __more__)`,
},
{stmt: `SELECT * FROM posts_nt`},
}
2 changes: 1 addition & 1 deletion pkg/cli/interactive_tests/test_demo_workload.tcl
@@ -18,7 +18,7 @@ for {set i 0} {$i < 10} {incr i} {
set timeout 1
send "select key from crdb_internal.node_statement_statistics order by count desc limit 1;\r"
expect {
"SELECT city, id FROM vehicles WHERE city = \$1" {
"SELECT city, id FROM vehicles WHERE city = _" {
set workloadRunning 1
break
}
2 changes: 1 addition & 1 deletion pkg/cli/interactive_tests/test_explain_analyze_debug.tcl
@@ -95,7 +95,7 @@ eexpect root@
send "PREPARE p AS SELECT * FROM t WHERE k = \$1;\r"
eexpect root@

send "SELECT crdb_internal.request_statement_bundle('SELECT * FROM t WHERE k = \$1', 0::FLOAT, 0::INTERVAL, 0::INTERVAL);\r"
send "SELECT crdb_internal.request_statement_bundle('SELECT * FROM t WHERE k = _', 0::FLOAT, 0::INTERVAL, 0::INTERVAL);\r"
eexpect root@

send "EXECUTE p(1);\r"
80 changes: 79 additions & 1 deletion pkg/kv/kvclient/kvcoord/dist_sender.go
@@ -237,6 +237,55 @@ This counts the number of ranges with an active rangefeed that are performing ca
Measurement: "Ranges",
Unit: metric.Unit_COUNT,
}

metaDistSenderCircuitBreakerReplicasCount = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.count",
Help: `Number of replicas currently tracked by DistSender circuit breakers`,
Measurement: "Replicas",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasTripped = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.tripped",
Help: `Number of DistSender replica circuit breakers currently tripped`,
Measurement: "Replicas",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasTrippedEvents = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.tripped_events",
Help: `Cumulative number of DistSender replica circuit breakers tripped over time`,
Measurement: "Replicas",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasProbesRunning = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.probes.running",
Help: `Number of currently running DistSender replica circuit breaker probes`,
Measurement: "Probes",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasProbesSuccess = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.probes.success",
Help: `Cumulative number of successful DistSender replica circuit breaker probes`,
Measurement: "Probes",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasProbesFailure = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.probes.failure",
Help: `Cumulative number of failed DistSender replica circuit breaker probes`,
Measurement: "Probes",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasRequestsCancelled = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.requests.cancelled",
Help: `Cumulative number of requests cancelled when DistSender replica circuit breakers trip`,
Measurement: "Requests",
Unit: metric.Unit_COUNT,
}
metaDistSenderCircuitBreakerReplicasRequestsRejected = metric.Metadata{
Name: "distsender.circuit_breaker.replicas.requests.rejected",
Help: `Cumulative number of requests rejected by tripped DistSender replica circuit breakers`,
Measurement: "Requests",
Unit: metric.Unit_COUNT,
}
)

// CanSendToFollower is used by the DistSender to determine if it needs to look
@@ -325,9 +374,24 @@ type DistSenderMetrics struct {
SlowReplicaRPCs *metric.Counter
MethodCounts [kvpb.NumMethods]*metric.Counter
ErrCounts [kvpb.NumErrors]*metric.Counter
CircuitBreaker DistSenderCircuitBreakerMetrics
DistSenderRangeFeedMetrics
}

// DistSenderCircuitBreakerMetrics is the set of circuit breaker metrics.
type DistSenderCircuitBreakerMetrics struct {
Replicas *metric.Gauge
ReplicasTripped *metric.Gauge
ReplicasTrippedEvents *metric.Counter
ReplicasProbesRunning *metric.Gauge
ReplicasProbesSuccess *metric.Counter
ReplicasProbesFailure *metric.Counter
ReplicasRequestsCancelled *metric.Counter
ReplicasRequestsRejected *metric.Counter
}

func (DistSenderCircuitBreakerMetrics) MetricStruct() {}

// DistSenderRangeFeedMetrics is a set of rangefeed specific metrics.
type DistSenderRangeFeedMetrics struct {
RangefeedRanges *metric.Gauge
@@ -356,6 +420,7 @@ func makeDistSenderMetrics() DistSenderMetrics {
RangeLookups: metric.NewCounter(metaDistSenderRangeLookups),
SlowRPCs: metric.NewGauge(metaDistSenderSlowRPCs),
SlowReplicaRPCs: metric.NewCounter(metaDistSenderSlowReplicaRPCs),
CircuitBreaker: makeDistSenderCircuitBreakerMetrics(),
DistSenderRangeFeedMetrics: makeDistSenderRangeFeedMetrics(),
}
for i := range m.MethodCounts {
@@ -375,6 +440,19 @@ func makeDistSenderMetrics() DistSenderMetrics {
return m
}

func makeDistSenderCircuitBreakerMetrics() DistSenderCircuitBreakerMetrics {
return DistSenderCircuitBreakerMetrics{
Replicas: metric.NewGauge(metaDistSenderCircuitBreakerReplicasCount),
ReplicasTripped: metric.NewGauge(metaDistSenderCircuitBreakerReplicasTripped),
ReplicasTrippedEvents: metric.NewCounter(metaDistSenderCircuitBreakerReplicasTrippedEvents),
ReplicasProbesRunning: metric.NewGauge(metaDistSenderCircuitBreakerReplicasProbesRunning),
ReplicasProbesSuccess: metric.NewCounter(metaDistSenderCircuitBreakerReplicasProbesSuccess),
ReplicasProbesFailure: metric.NewCounter(metaDistSenderCircuitBreakerReplicasProbesFailure),
ReplicasRequestsCancelled: metric.NewCounter(metaDistSenderCircuitBreakerReplicasRequestsCancelled),
ReplicasRequestsRejected: metric.NewCounter(metaDistSenderCircuitBreakerReplicasRequestsRejected),
}
}

// rangeFeedErrorCounters are various error related counters for rangefeed.
type rangeFeedErrorCounters struct {
RangefeedRestartRanges *metric.Counter
@@ -731,7 +809,7 @@ func NewDistSender(cfg DistSenderConfig) *DistSender {
// the stopper stops. This can only error if the server is shutting down, so
// ignore the returned error.
ds.circuitBreakers = NewDistSenderCircuitBreakers(
ds.stopper, ds.st, ds.transportFactory, ds.metrics)
ds.AmbientContext, ds.stopper, ds.st, ds.transportFactory, ds.metrics)
_ = ds.circuitBreakers.Start()

if cfg.TestingKnobs.LatencyFunc != nil {