obs/ash: add periodic top-N workload summary logging #165093

trunk-io[bot] merged 1 commit into cockroachdb:master
Conversation
😎 Merged successfully.
@@ -59,6 +62,16 @@ var BufferSize = settings.RegisterIntSetting(
	settings.PositiveInt,
)
Do we want cluster settings for the two constants below?

I could see an argument for logSummaryTopN being a setting, but I think a 60-second periodic log wouldn't need to be modified... would it?

Yeah, I think cluster settings for both are desirable. In general, a big structured log per minute feels too high. Maybe set it to 10m by default and let customers adjust if they want more.

What if we had a smaller structured log per minute? @dhartunian

Or rather, 10 (log lines per message) as defined feels rather small. Is this okay?

Since it's easily configurable, I don't feel super strongly. We can do 10 lines per minute if we feel it would help with diagnosing issues.
pkg/obs/ash/sampler.go
Outdated
fmt.Fprintf(&buf, "\n count=%-5d type=%-9s event=%-20s workload=%s",
	e.count, e.key.WorkEventType, e.key.WorkEvent, e.key.WorkloadID)
}
log.Ops.Infof(ctx, "%s", buf.String())
This should be a structured log with a protobuf in eventpb so that it's output as JSON and easily machine-parsed.
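For context, the shape of such a machine-parseable line can be sketched with a plain Go struct marshaled to JSON. The type and field names below are illustrative stand-ins, not the actual eventpb message added in this PR:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ashWorkloadSummary is a hypothetical stand-in for the eventpb
// message; the real definition lives in the PR's eventpb package.
type ashWorkloadSummary struct {
	EventType     string `json:"EventType"`
	WorkEventType string `json:"WorkEventType"`
	WorkEvent     string `json:"WorkEvent"`
	WorkloadID    string `json:"WorkloadID"`
	SampleCount   int    `json:"SampleCount"`
}

// encodeSummary renders the event as the JSON payload a structured
// log line would carry, so downstream tooling can parse it without
// scraping printf-formatted text.
func encodeSummary(ev ashWorkloadSummary) (string, error) {
	b, err := json.Marshal(ev)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := encodeSummary(ashWorkloadSummary{
		EventType:     "ash_workload_summary",
		WorkEventType: "IO",
		WorkEvent:     "BatchEval",
		WorkloadID:    "000000000000002a",
		SampleCount:   200,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The key difference from the printf-style code above is that fields become named JSON keys instead of positionally formatted text.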
pkg/obs/ash/sampler.go
Outdated
pendingSamples []pendingSample
// tickSamples tracks per-workload sample counts since the last
// periodic log summary was emitted.
tickSamples map[workloadKey]int
Why do you need this? Why not loop through the ring buffer with a timestamp cutoff?
It's a bit more efficient to keep the summary data updated while doing the ticks, instead of constructing it when the log summary interval happens. But I think that efficiency gain is too small to warrant adding more state to the sampler and making the code a bit harder to follow (as @angles-n-daemons pointed out with the setting/resetting of these fields in different functions).

Changed to loop through the ring buffer with a timestamp cutoff.
Can you also show example output in the commit message? That would be helpful to see as context.
angles-n-daemons
left a comment
couple small comments, nothing blocking
pkg/obs/ash/sampler.go
Outdated
log.Ops.Infof(ctx, "%s", buf.String())

// Reset counters.
for k := range s.tickSamples {
What's the reason for maybeLogSamples resetting tickSamples and totalSamples? It feels like these should be initialized each time takeSample is called, because that's where they're going to be set.

The current code should work, but it feels a little fragile, and prone to correctness bugs if the ordering or frequency of sample vs. log operations were changed.
Ack, I made this code simpler per this and David's suggestion to just scan the ring buffer when doing the log summary.
Nice, that's great, thanks!
Force-pushed 8a730d5 to 87e7250
Force-pushed 87e7250 to 5b404b6
Force-pushed 5b404b6 to 16bbaec
angles-n-daemons
left a comment
looks good, one last comment
// maybeLogSummary emits a top-N workload summary as structured events
// to the OPS log if enough time has elapsed since the last report.
// It scans the ring buffer for samples newer than lastLogTime and
// aggregates them by workload key.
Can we add a comment stating this is coupled to table sampling? It wasn't obvious to me at first, but has subtle implications on how the various settings interact.
Force-pushed 16bbaec to d9c30a0
pkg/obs/ash/sampler.go
Outdated
var LogTopN = settings.RegisterIntSetting(
	settings.SystemVisible,
	"obs.ash.log_top_n",
	"maximum number of entries in periodic ASH workload summary",
It would be helpful to document here what it means for stuff to appear "at the top".

Ack, added.
docs/generated/eventlog.md
Outdated

| Field | Description | Sensitive |
|--|--|--|
| `WindowDurationNanos` | The duration of the reporting window in nanoseconds. | no |
I would make this millis. Do we allow sampling at intervals below a second? This is just adding zeros to the string for no huge value. Plus, millis are more human-readable. I'd also be fine with seconds if that's our lowest supported interval.

The sampler can run reliably at 100ms and 500ms from what I've seen with my benchmarks. Changed to milliseconds.
The ASH sampler holds samples only in memory, which are lost on node
restart. Add periodic structured log summaries so operators have durable
evidence of workload patterns.
Two new cluster settings control this behavior:
obs.ash.log_interval (default 10m) — how often the summary is emitted.
obs.ash.log_top_n (default 10) — max entries per summary.
Every log_interval, the sampler scans the ring buffer for samples
collected since the last report, groups them by (WorkEventType,
WorkEvent, WorkloadID), and emits one structured event per top-N entry
on the OPS channel. If there are zero samples in the window, no log is
emitted.
Example log output (one line per top-N entry, sorted by SampleCount descending):
I260309 19:25:00.123456 42 1@obs/ash/sampler.go:308 [n1] 500
={"Timestamp":1773084300123456000,"EventType":"ash_workload_summary",
"WindowDurationNanos":600000000000,"WorkEventType":"IO",
"WorkEvent":"BatchEval","WorkloadID":"000000000000002a",
"SampleCount":200}
Resolves: cockroachdb#164382
Release note (ops change): Added periodic ASH workload summary logging
to the OPS channel. Two new cluster settings, `obs.ash.log_interval`
(default 10m) and `obs.ash.log_top_n` (default 10), control how often
and how many entries are emitted. Each summary reports the most
frequently sampled workloads grouped by event type, event name, and
workload ID, providing durable visibility into workload patterns that
previously existed only in memory.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
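The grouping and top-N selection step described in the commit message above can be sketched as follows. The key and entry types are illustrative stand-ins for the PR's actual structs:

```go
package main

import (
	"fmt"
	"sort"
)

// workloadKey mirrors the (WorkEventType, WorkEvent, WorkloadID)
// grouping key from the commit message; names are illustrative.
type workloadKey struct {
	WorkEventType, WorkEvent, WorkloadID string
}

type summaryEntry struct {
	key   workloadKey
	count int
}

// topN returns the n most frequently sampled workload keys, sorted by
// count descending, matching the "one structured event per top-N
// entry, sorted by SampleCount descending" behavior described above.
func topN(counts map[workloadKey]int, n int) []summaryEntry {
	entries := make([]summaryEntry, 0, len(counts))
	for k, c := range counts {
		entries = append(entries, summaryEntry{k, c})
	}
	sort.Slice(entries, func(i, j int) bool {
		return entries[i].count > entries[j].count
	})
	if len(entries) > n {
		entries = entries[:n]
	}
	return entries
}

func main() {
	counts := map[workloadKey]int{
		{"IO", "BatchEval", "2a"}: 200,
		{"CPU", "SQLExec", "2b"}:  120,
		{"IO", "RangeFeed", "2c"}: 15,
	}
	// Prints count=200 event=BatchEval, then count=120 event=SQLExec.
	for _, e := range topN(counts, 2) {
		fmt.Printf("count=%d event=%s\n", e.count, e.key.WorkEvent)
	}
}
```

If there are zero samples in the window, counts is empty and topN returns an empty slice, which matches the "no log is emitted" behavior.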
Force-pushed d9c30a0 to 55f1256
TFTRs!

/trunk merge