Add INFO COMMANDSTATS tracking with per-command success rates #1702
Merged
Mathos1432 merged 6 commits into main on Apr 22, 2026
0460835 to 60dc7f0
Implements Redis-compatible INFO COMMANDSTATS support, gated behind --commandstats-monitor config flag. Tracks per-command calls, failed_calls, and rejected_calls using lightweight counter increments (no Stopwatch).

Key design decisions:
- Array-indexed by RespCommand enum for O(1) access
- Per-session counters (single-writer, no locking on hot path)
- On-demand aggregation from active sessions + disposed session history
- No latency tracking (usec=0) to avoid Stopwatch overhead (~4.4x perf hit)
- Monitor dispose handles case where Start() was never called (EmbeddedServer)

Benchmark results (100 PINGs, .NET 10):
Disabled: 1.687 us | Enabled: 1.820 us | Overhead: ~7.8%
SET/GET overhead: 0-4% (within noise for real workloads)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
60dc7f0 to f26aea3
…latency references

When MetricsSamplingFrequency > 0, globalCommandStats already includes history from disposed sessions. PopulateCommandStatsInfo was adding historyCommandStats on top, double-counting disposed session stats.

Also removed stale 'latency' references from doc comments and help text since per-command latency tracking was removed to avoid Stopwatch overhead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
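The double-counting fix described in this commit can be illustrated with a minimal Python sketch. It is not the Garnet code: the function names and the integer totals are hypothetical, and the on-demand branch follows the "active sessions + disposed session history" description from the first commit message.

```python
def populate_calls_buggy(sampling_frequency, global_calls, history_calls, active_calls):
    """Pre-fix behaviour as described: history is always added on top."""
    if sampling_frequency > 0:
        return global_calls + history_calls  # history is counted twice here
    return history_calls + sum(active_calls)


def populate_calls_fixed(sampling_frequency, global_calls, history_calls, active_calls):
    """Post-fix behaviour as described: trust the global total when sampling is on."""
    if sampling_frequency > 0:
        # The periodic sampler has already folded disposed-session history
        # into the global total, so nothing else is added.
        return global_calls
    # On-demand aggregation: disposed-session history plus live sessions.
    return history_calls + sum(active_calls)


# Hypothetical scenario: one disposed session with 5 calls, one live session with 3.
history = 5
active = [3]
sampled_global = 8  # assumed to already include the 5 historical calls

buggy_total = populate_calls_buggy(1, sampled_global, history, active)
fixed_total = populate_calls_fixed(1, sampled_global, history, active)
on_demand_total = populate_calls_fixed(0, 0, history, active)
```

With these made-up inputs the pre-fix path over-reports by exactly the historical count, while both post-fix paths agree on the same total.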
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
vazois reviewed Apr 20, 2026
vazois reviewed Apr 20, 2026
vazois reviewed Apr 20, 2026
vazois reviewed Apr 20, 2026
vazois approved these changes Apr 20, 2026
…mes, fix tests

- Remove bufferFlushedDuringCommand field and pointer-based error detection; rely solely on commandErrorWritten flag set by WriteError/AbortWithErrorMessage
- Use RespCommandsInfo.GetRespCommandName() for canonical Redis command names instead of enum ToString with string replacement
- Switch most tests to metricsSamplingFreq: 0 (on-demand aggregation) to remove Thread.Sleep delays; keep one test with periodic sampling for coverage
- Fix CommandStatsFailedCallsTest to use SETRANGE with invalid offset (goes through AbortWithErrorMessage) instead of WRONGTYPE which bypasses the flag

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GarnetLatencyMetricsSession dereferences monitor.monitor_iterations, but StoreWrapper only created the monitor when MetricsSamplingFrequency > 0 or CommandStatsMonitor was enabled. With --latency-monitor alone and no sampling frequency, this would cause a NullReferenceException at session creation.

Add LatencyMonitor to the condition so the monitor is always created when latency tracking is enabled.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
vazois approved these changes Apr 22, 2026
vazois approved these changes Apr 22, 2026
Why is this change being made?
Garnet currently has no way to track per-command usage statistics. Redis provides INFO COMMANDSTATS which reports how many times each command was called, how many failed, and how many were rejected. This is critical for production observability — operators need to understand command usage patterns, identify failing commands, and monitor success rates.
What changed?
Adds Redis-compatible INFO COMMANDSTATS support, gated behind --commandstats-monitor (disabled by default, zero overhead when off). When enabled, tracks per-command calls, failed_calls, and rejected_calls.
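To show how an operator might consume this output, here is a small Python sketch that parses lines in the Redis `cmdstat_<name>:key=value,...` format and derives a success rate. The sample text is illustrative, not captured from a Garnet server, and the success-rate formula `(calls - failed_calls) / calls` is an assumption; the PR does not define it.

```python
def parse_commandstats(info_text):
    """Parse 'cmdstat_<name>:k=v,k=v' lines into {name: {field: number}}."""
    stats = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the section header and blank lines
        name, _, fields = line[len("cmdstat_"):].partition(":")
        parsed = {}
        for pair in fields.split(","):
            key, _, value = pair.partition("=")
            parsed[key] = float(value) if "." in value else int(value)
        stats[name] = parsed
    return stats


def success_rate(entry):
    """Assumed definition: share of executed calls that did not fail."""
    if entry["calls"] == 0:
        return None  # no calls recorded, so the rate is undefined
    return (entry["calls"] - entry["failed_calls"]) / entry["calls"]


# Illustrative sample in Redis INFO COMMANDSTATS format (hypothetical values).
SAMPLE = """# Commandstats
cmdstat_ping:calls=100,usec=0,usec_per_call=0.00,rejected_calls=0,failed_calls=0
cmdstat_setrange:calls=4,usec=0,usec_per_call=0.00,rejected_calls=0,failed_calls=1
"""

stats = parse_commandstats(SAMPLE)
rates = {name: success_rate(entry) for name, entry in stats.items()}
```

Because latency is not tracked, `usec` and `usec_per_call` parse to zero for every command; only the three counters carry information.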
Key design decisions:
- Counter-only tracking: no latency (usec) on the hot path. Reports usec=0,usec_per_call=0.00 for Redis format compatibility. Latency can be added later via sampling.
- Per-session single-writer: each session owns its own CommandStats array indexed by RespCommand enum (O(1) access). No locking on the hot path; aggregation happens on-demand when INFO COMMANDSTATS is requested.
- Error detection: the original design was a dual strategy, a commandErrorWritten flag set by WriteError/AbortWithErrorMessage plus a check for the RESP error prefix (-) guarded by a bufferFlushedDuringCommand flag. A later commit in this PR removed bufferFlushedDuringCommand and the pointer-based check, so detection relies solely on commandErrorWritten.
- Excluded from default INFO: only appears when explicitly requested via INFO COMMANDSTATS.
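The per-session, single-writer design can be sketched in Python as follows. This is an illustration under stated assumptions, not the Garnet implementation: `Command`, `SessionStats`, and `StatsRegistry` are hypothetical names, and treating a rejected command as not counting toward `calls` follows Redis semantics rather than anything stated in the PR.

```python
from enum import IntEnum


class Command(IntEnum):  # hypothetical stand-in for the RespCommand enum
    PING = 0
    GET = 1
    SET = 2


CALLS, FAILED, REJECTED = 0, 1, 2


def empty_table():
    # One [calls, failed_calls, rejected_calls] row per command.
    return [[0, 0, 0] for _ in Command]


class SessionStats:
    """Counters owned by one session; only that session writes to them."""

    def __init__(self):
        self.counters = empty_table()

    def record(self, cmd, failed=False, rejected=False):
        row = self.counters[cmd]  # O(1) lookup by enum value
        if rejected:
            row[REJECTED] += 1  # assumption: rejected commands are not calls
            return
        row[CALLS] += 1
        if failed:
            row[FAILED] += 1


class StatsRegistry:
    """Aggregates on demand from live sessions plus disposed-session history."""

    def __init__(self):
        self.active = []
        self.history = empty_table()

    def open_session(self):
        session = SessionStats()
        self.active.append(session)
        return session

    def close_session(self, session):
        # Fold the session's counters into history so they are not lost.
        self.active.remove(session)
        self._add_into(self.history, session.counters)

    def aggregate(self):
        total = empty_table()
        self._add_into(total, self.history)
        for session in self.active:
            self._add_into(total, session.counters)
        return total

    @staticmethod
    def _add_into(target, source):
        for t_row, s_row in zip(target, source):
            for i, value in enumerate(s_row):
                t_row[i] += value


registry = StatsRegistry()
a = registry.open_session()
b = registry.open_session()
a.record(Command.PING)
a.record(Command.SET, failed=True)
b.record(Command.PING)
b.record(Command.GET, rejected=True)
registry.close_session(a)  # a's counts move into history
totals = registry.aggregate()
```

The hot path in `record` touches only the session's own list, which is why no lock is needed there; the summation cost is paid only when a reader asks for the aggregate.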
Bug fix: GarnetServerMonitor.Dispose() deadlocked when Start() was never called (the ManualResetEvent was never signaled). This affected EmbeddedRespServer used by BDN benchmarks. Fixed by tracking a started flag.
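The shape of this fix can be shown with a minimal Python analogue. It is a sketch only: `Monitor` and its fields are hypothetical, with `threading.Event` standing in for the ManualResetEvent mentioned above.

```python
import threading


class Monitor:
    """Background monitor whose dispose() is safe even if start() never ran."""

    def __init__(self):
        self._started = False
        self._stop = threading.Event()
        self._done = threading.Event()  # set by the worker when it exits

    def start(self):
        self._started = True
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        try:
            # Wake up periodically until asked to stop; sampling would go here.
            while not self._stop.wait(timeout=0.01):
                pass
        finally:
            self._done.set()

    def dispose(self):
        self._stop.set()
        if self._started:
            # Only wait when a worker exists to signal completion. Without
            # this guard, an unstarted monitor would block here forever.
            self._done.wait()


never_started = Monitor()
never_started.dispose()  # returns immediately

running = Monitor()
running.start()
running.dispose()  # waits for the worker to finish
```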
How was this validated?
Unit tests (17 passing)
Performance benchmarks (BenchmarkDotNet)
PING (100 commands/op, .NET 10, Server GC)
PING is ~17ns/command, so ~1.3ns tracking overhead is proportionally large. This is the worst case.
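The per-command figures follow from the PING numbers quoted in the first commit message (1.687 us disabled, 1.820 us enabled, 100 commands per operation). A short Python check of the arithmetic, using those rounded values:

```python
COMMANDS_PER_OP = 100
disabled_us = 1.687  # per operation, tracking disabled
enabled_us = 1.820   # per operation, tracking enabled

# Convert microseconds per operation to nanoseconds per command.
per_command_ns = disabled_us * 1000 / COMMANDS_PER_OP
overhead_ns = (enabled_us - disabled_us) * 1000 / COMMANDS_PER_OP
overhead_pct = (enabled_us - disabled_us) / disabled_us * 100

print(f"baseline: {per_command_ns:.1f} ns/command")
print(f"tracking overhead: {overhead_ns:.2f} ns/command")
print(f"relative overhead: {overhead_pct:.1f}%")
```

The rounded inputs give a baseline just under 17 ns and an overhead of roughly 1.3 ns per command, a little under 8% and in line with the reported ~7.8%.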
SET (100 commands/op, .NET 10, Server GC)
GET (100 commands/op, .NET 10, Server GC)
Summary: For real-world commands (SET/GET), overhead is 0-4% and within measurement noise for most configurations. The feature is disabled by default, so there is no overhead unless it is enabled.