obs(redis): expose unsupported-command names via bounded metric by bootjp · Pull Request #594 · bootjp/elastickv

bootjp · 2026-04-23T13:51:06Z

Summary

Add a new Prometheus counter elastickv_redis_unsupported_commands_total{command} that records the actual name of Redis commands the adapter rejects at the route-miss path (adapter/redis.go). Today those are all collapsed into command="unknown", which leaves a large unnamed bar in Grafana's "Errors by Command".
The new metric is additive: the existing elastickv_redis_requests_total{command="unknown",outcome="error"} and elastickv_redis_errors_total{command="unknown"} counters are unchanged, so existing dashboards and alerts keep working.

Cardinality guarantee

Hard cap of 32 distinct command-name label values (maxUnsupportedCommandLabels = 32), guarded by a sync.Mutex on an in-memory seen-set in RedisMetrics.
Once the cap is reached, any further novel name folds into command="other". Already-seen names still record with their real label even after the cap is full.
Label values are uppercased, trimmed, and truncated to 64 chars (maxUnsupportedCommandLabelLen = 64) before use, so a pathological client cannot pollute label values via protocol abuse (e.g. sending a huge first argument).

Wiring

monitoring.RedisRequestReport gains an Unsupported bool field. The adapter sets it to true only at the route-miss path via a new helper observeRedisUnsupported; all other call sites are unchanged.
RedisMetrics.ObserveRedisRequest still runs the existing "unknown"-bucket path for unsupported commands, then additionally increments the new bounded counter when Unsupported is set.

Test plan

Summary by CodeRabbit

New Features
- Added dedicated monitoring for unsupported Redis commands with a new metric tracking command names and frequency, enabling clearer visibility into command rejections.
Tests
- Expanded test coverage for unsupported command tracking, including label cardinality management and concurrent operation validation.

Add elastickv_redis_unsupported_commands_total{command} counter so operators can see which commands clients are probing instead of a single opaque "unknown" bar in Grafana. Cardinality is capped at 32 distinct observed names (guarded by a sync.Mutex); novel names observed after the cap are folded into command="other". Label values are uppercased, trimmed, and truncated to 64 chars so a pathological client cannot pollute labels via protocol abuse. Wiring is additive: RedisRequestReport gains an Unsupported bool that the adapter sets only at the route-miss path. The existing elastickv_redis_requests_total{command="unknown",outcome="error"} and elastickv_redis_errors_total{command="unknown"} counters are unchanged so existing dashboards and alerts keep working.

coderabbitai · 2026-04-23T13:51:20Z

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 52 minutes and 25 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 52 minutes and 25 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ddfac6ea-4660-4704-8830-fd9fa4926fb2

📥 Commits

Reviewing files that changed from the base of the PR and between 91e16e9 and 214d71d.

📒 Files selected for processing (3)

adapter/redis.go
adapter/redis_metrics_test.go
monitoring/redis.go

📝 Walkthrough

Walkthrough

A new monitoring capability distinguishes unsupported Redis commands from other errors through a dedicated observer path. The Redis adapter now routes unsupported commands to a specialized handler that integrates with the monitoring layer to emit a bounded-cardinality Prometheus counter with normalized, truncated command labels.

Changes

Cohort / File(s)	Summary
Core Implementation `monitoring/redis.go`, `adapter/redis.go`	Added `Unsupported` field to `RedisRequestReport`, implemented `observeRedisUnsupported` method, and introduced a new Prometheus counter `elastickv_redis_unsupported_commands_total` with command-label deduplication and truncation logic; labels are bounded by mapping excess commands to an `"other"` bucket.
Test Coverage `adapter/redis_metrics_test.go`	Extended Redis metrics tests to validate unsupported command tracking, including label cardinality bounds, command name normalization and truncation, counter aggregation, and concurrent-call correctness.

Sequence Diagram

sequenceDiagram
    participant Client
    participant RedisServer
    participant Monitoring
    participant Prometheus

    Client->>RedisServer: Send unsupported command
    RedisServer->>RedisServer: Detect unsupported command
    RedisServer->>Monitoring: observeRedisUnsupported(command, duration)
    Monitoring->>Monitoring: Normalize & uppercase command
    Monitoring->>Monitoring: Truncate label to max length
    Monitoring->>Monitoring: Check command cardinality bound
    alt Within cardinality limit
        Monitoring->>Monitoring: Add to tracked commands set
    else Exceeds cardinality limit
        Monitoring->>Monitoring: Map to "other" bucket
    end
    Monitoring->>Prometheus: Emit elastickv_redis_unsupported_commands_total
    Prometheus->>Prometheus: Increment counter with command label

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Hops with glee!
Unsupported commands now dance in their own spotlight,
No longer lost in the error shadows,
Bounded labels keep the chaos at bay,
Prometheus counts them all with precision! 📊✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: introducing a new Prometheus metric to expose unsupported Redis command names with bounded cardinality.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch obs/redis-unsupported-command-names

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@monitoring/redis.go`:
- Around line 217-237: The truncation in RedisMetrics.observeUnsupportedCommand
uses byte slicing which can split multi-byte UTF-8 sequences and cause
Prometheus to panic; fix by first ensuring name is valid UTF-8 (e.g., replace
invalid sequences or convert bytes to a valid string using utf8/bytes.Runes) and
then truncate on rune boundaries to maxUnsupportedCommandLabelLen (counting
runes, not bytes), preserve the existing redisCommandUnknown fallback and the
logic around m.unsupportedNames and m.unsupportedCommands.WithLabelValues;
update the truncation at the place where name is shortened (originally name =
name[:maxUnsupportedCommandLabelLen]) to perform UTF-8-safe rune truncation
before labeling.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 372c888a-215c-4d3f-928f-54941d65bc28

📥 Commits

Reviewing files that changed from the base of the PR and between 306eb0b and 91e16e9.

📒 Files selected for processing (3)

adapter/redis.go
adapter/redis_metrics_test.go
monitoring/redis.go

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 91e16e9a5d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-23T13:59:10Z

+	if len(name) > maxUnsupportedCommandLabelLen {
+		name = name[:maxUnsupportedCommandLabelLen]


Preserve UTF-8 when truncating unsupported command labels

Truncating name by raw bytes can split a multibyte rune, producing invalid UTF-8 label values; prometheus.(*CounterVec).WithLabelValues then panics during unsupported-command observation. A client can trigger this by sending an unsupported command name longer than 64 bytes containing non-ASCII characters (for example, 63 ASCII bytes plus é), which makes the Redis request path panic instead of returning an error metric. Please truncate by rune boundary (or validate/re-encode) before calling WithLabelValues.

Useful? React with 👍 / 👎.

gemini-code-assist

Code Review

This pull request introduces a new metric, elastickv_redis_unsupported_commands_total, to track unsupported Redis commands with bounded cardinality and label length. The implementation includes a mechanism to collapse excess command names into an "other" bucket and truncate long command names. Feedback was provided to address a potential Denial of Service (DoS) vulnerability by truncating raw input before performing expensive string operations and to ensure that string truncation does not result in invalid UTF-8 sequences.

gemini-code-assist · 2026-04-23T14:07:11Z

+func (m *RedisMetrics) observeUnsupportedCommand(raw string) {
+	name := strings.ToUpper(strings.TrimSpace(raw))
+	if name == "" {
+		name = redisCommandUnknown
+	}
+	if len(name) > maxUnsupportedCommandLabelLen {
+		name = name[:maxUnsupportedCommandLabelLen]
+	}


The current truncation logic uses byte-slicing, which can cut through multi-byte UTF-8 characters and produce invalid strings, potentially causing Prometheus scraping to fail. Additionally, performing operations on the full input string before truncation presents a DoS vector. This change introduces a defensive limit on the input size to prevent unbounded processing, aligning with repository standards for handling external input to prevent Denial of Service attacks.

func (m *RedisMetrics) observeUnsupportedCommand(raw string) { // Defensive truncation to avoid expensive string operations on pathological input. if len(raw) > 256 { raw = raw[:256] } name := strings.ToUpper(strings.TrimSpace(raw)) if name == "" { name = redisCommandUnknown } if len(name) > maxUnsupportedCommandLabelLen { name = name[:maxUnsupportedCommandLabelLen] // Ensure the truncated string is valid UTF-8. for len(name) > 0 && !utf8.ValidString(name) { name = name[:len(name)-1] } } }

References

When handling external requests or data that can grow from external input, apply fixed bounds or size limits to prevent Denial of Service (DoS) attacks and unbounded memory growth.

gemini-code-assist · 2026-04-23T14:07:11Z


 import (
 	"strings"
+	"sync"


Add unicode/utf8 to the imports to support safe string truncation in observeUnsupportedCommand.

Suggested change

"sync"

"sync"

"unicode/utf8"

…d panic The 64-byte byte-slice truncation in RedisMetrics.observeUnsupportedCommand could split a multibyte UTF-8 rune, producing an invalid UTF-8 label value. prometheus.(*CounterVec).WithLabelValues panics on invalid UTF-8, so a client sending an unsupported command name >64 bytes containing non-ASCII bytes could crash the goroutine handling the request. Fix: - Truncate on UTF-8 rune boundaries while preserving the 64-byte cap. - Add an ingress guard that folds commands whose raw bytes are already invalid UTF-8 into an "invalid_utf8" sentinel, since strings.ToUpper silently rewrites invalid bytes to U+FFFD and would mask the problem. Tests cover a long multibyte-terminated name (no panic, valid UTF-8, byte length <= 64), a purely invalid-UTF-8 input (no panic, sentinel label), and extend the existing length-cap test to assert UTF-8 validity.

bootjp · 2026-04-23T14:15:18Z

Addressed the CodeRabbit + codex P1 in 8539857.

Bug (attack vector)

RedisMetrics.observeUnsupportedCommand truncated to 64 bytes with name = name[:maxUnsupportedCommandLabelLen], a raw byte slice. If the name contained multibyte UTF-8 runes (e.g. 63 ASCII bytes + é), the cut could land mid-rune and produce invalid UTF-8. prometheus.(*CounterVec).WithLabelValues then panics — not errors — so a Redis client could crash the handling goroutine by sending an unsupported command name >64 bytes containing any non-ASCII byte.

Fix

Truncate on UTF-8 rune boundaries, preserving the 64-byte cap (semantics unchanged for ASCII, which is the expected path).
Add an ingress guard: if the raw command bytes are not valid UTF-8, fold to a fixed "invalid_utf8" sentinel. The guard runs before strings.ToUpper, because ToUpper silently rewrites invalid bytes to U+FFFD and would mask the problem.

Behaviour

Before: 63 ASCII + é... crashed the goroutine via prometheus panic. Raw binary blob as a command name did the same.
After: no panic in either case. Valid-UTF-8 inputs are truncated on a rune boundary with byte length <= 64; invalid-UTF-8 inputs are folded to invalid_utf8.

Tests (`adapter/redis_metrics_test.go`)

TestRedisMetricsUnsupportedCommandTruncatesOnRuneBoundary — 63 'A' + many 'é', asserts NotPanics, len <= 64, utf8.ValidString.
TestRedisMetricsUnsupportedCommandRejectsInvalidUTF8 — raw 0xff 0xfe 0xfd 0xc3 0x28, asserts sentinel label emitted.
TestRedisMetricsUnsupportedCommandTruncatesLongName — now also asserts utf8.ValidString on the label.

go test -race ./monitoring/... ./adapter/... passes for the redis-metrics tests (pre-existing Test_grpc_transaction flake on this branch is unrelated — same failure occurs with our change reverted). golangci-lint run ./monitoring/... ./adapter/... reports 0 issues.

/gemini review
@codex review

gemini-code-assist

Code Review

This pull request introduces a new metric, "elastickv_redis_unsupported_commands_total", to track Redis commands rejected as unsupported. The implementation includes robust safeguards such as cardinality capping, length truncation on rune boundaries, and UTF-8 validation to ensure stability and prevent metric explosion. The review feedback suggests optimizing performance under high load by switching to a "sync.RWMutex" for a fast-path check of previously seen commands and simplifying the truncation logic.

gemini-code-assist · 2026-04-23T14:18:12Z

+	errorsTotal         *prometheus.CounterVec
+	unsupportedCommands *prometheus.CounterVec
+
+	unsupportedMu    sync.Mutex


To reduce lock contention on the error path, consider using a sync.RWMutex instead of a sync.Mutex. While this is an error path, it is externally triggerable and can be flooded by a malicious or misbehaving client. Once the cardinality cap is reached or the set of common unsupported commands is established, most requests will only need a read lock to verify the command name exists in the map.

Suggested change

unsupportedMu sync.Mutex

unsupportedMu sync.RWMutex

gemini-code-assist · 2026-04-23T14:18:12Z

+		b := 0
+		cut := len(name)
+		for i, r := range name {
+			size := utf8.RuneLen(r)
+			if size < 0 {
+				size = 1
+			}
+			if b+size > maxUnsupportedCommandLabelLen {
+				cut = i
+				break
+			}
+			b += size
+		}


The truncation logic can be simplified. The variable b is redundant because the range loop index i already provides the byte offset of the current rune. Additionally, since the string has been validated as UTF-8 (or is a known safe sentinel), utf8.RuneLen will return a positive value, making the negative check unnecessary.

cut := len(name) for i, r := range name { if i+utf8.RuneLen(r) > maxUnsupportedCommandLabelLen { cut = i break } } name = name[:cut]

gemini-code-assist · 2026-04-23T14:18:12Z

+	label := redisUnsupportedCommandOther
+	m.unsupportedMu.Lock()
+	if _, seen := m.unsupportedNames[name]; seen {
+		label = name
+	} else if len(m.unsupportedNames) < maxUnsupportedCommandLabels {
+		m.unsupportedNames[name] = struct{}{}
+		label = name
+	}
+	m.unsupportedMu.Unlock()


Leverage the RWMutex to implement a fast-path check for already-seen commands. This avoids acquiring a write lock for every unsupported command request once the command name is known to the monitoring layer.

label := redisUnsupportedCommandOther m.unsupportedMu.RLock() _, seen := m.unsupportedNames[name] m.unsupportedMu.RUnlock() if seen { label = name } else { m.unsupportedMu.Lock() if _, seen := m.unsupportedNames[name]; seen { label = name } else if len(m.unsupportedNames) < maxUnsupportedCommandLabels { m.unsupportedNames[name] = struct{}{} label = name } m.unsupportedMu.Unlock() }

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 85398572e3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-23T14:27:09Z

 				r.traceCommandError(conn, name, cmd.Args[1:], "unsupported")
 				conn.WriteError("ERR unsupported command '" + string(cmd.Args[0]) + "'")
-				r.observeRedisError(name, time.Since(start))
+				r.observeRedisUnsupported(name, time.Since(start))


Pass raw command text to unsupported-command observer

Run forwards the already-normalized name to observeRedisUnsupported, but name was built with strings.ToUpper(string(cmd.Args[0])) before routing. For malformed/binary command names, strings.ToUpper rewrites invalid UTF-8 into replacement runes, so observeUnsupportedCommand in monitoring/redis.go never sees invalid UTF-8 and cannot use the invalid_utf8 sentinel path introduced in this commit. Under hostile traffic this consumes real-label slots with synthetic �... values and can crowd out genuine unsupported command names.

Useful? React with 👍 / 👎.

The Redis adapter's Run() loop uppercased the command name with strings.ToUpper before passing it to the unsupported-command observer. ToUpper silently rewrites invalid UTF-8 bytes into the U+FFFD replacement character, so by the time the observer runs its utf8.ValidString check the value is already synthetically valid and the invalid_utf8 sentinel path is unreachable. A hostile client sending binary garbage as a command name could therefore burn through the bounded distinct-name slots with replacement-character garbage, displacing legitimate unsupported-command labels. Pass the raw bytes of cmd.Args[0] to observeRedisUnsupported and document the contract on both sides (adapter and monitoring). The routing path still uses the existing uppercased name; only the metric path takes the raw form so utf8.ValidString runs against the original bytes. Adds TestRedisMetricsUnsupportedCommandRawInvalidUTF8Bytes which feeds three 0xff bytes through the observer and asserts the sentinel label is incremented.

…595) ## Summary Adds a regression-preventing end-to-end test that pins the observability contract around `elastickv_redis_errors_total` / `elastickv_redis_requests_total{outcome="error"}`: commands whose Redis protocol semantics report a missing key via a **null / 0 / "none"** reply must be counted as `outcome="success"`, not as errors. ### Audit Starting from `grep 'WriteError(err.Error())' adapter/*.go`, I classified every call site where `err` could plausibly be `store.ErrKeyNotFound`: - **(a) Redis spec: null/0/none — FIX if leaked** GET, HGET, HMGET, HLEN, HEXISTS, HGETALL, LLEN, LINDEX, LRANGE, LPOP, RPOP, SCARD, SMEMBERS, SISMEMBER, ZCARD, ZSCORE, ZRANGE, XLEN, XRANGE, EXISTS, TYPE, TTL, PTTL. **Result:** already correct in the current codebase. Every handler either routes through `keyTypeAt` / `logicalExistsAt` (which return `redisTypeNone` / `false` for a missing key, not an error) or through helpers like `loadHashAt` / `loadSetAt` / `loadZSetAt` / `loadStreamAt` / `listPopClaim` / `hashFieldFastLookup`, all of which internalise `store.ErrKeyNotFound` and translate it to the caller-visible empty reply. No production-code change was needed. - **(b) Redis spec: ERR reply — LEAVE AS-IS** RENAME (`cmdRename` in `redis_lua_context.go:1339`), LSET (`cmdLSet` in `redis_lua_context.go:2131`). Both return `errors.New("ERR no such key")` — not `store.ErrKeyNotFound` — so the Lua compatibility layer at `redis_lua.go:857` sends the error to the client, the metric correctly records `outcome="error"`, and the wire-format matches real Redis. RPOPLPUSH / LPOS / etc. route through the same Lua layer and inherit the same semantics. - **(c) Not implemented — no call site** DUMP, OBJECT, RESTORE, COPY: no handler exists yet, nothing to audit. ### What changed `adapter/redis_metrics_missing_key_test.go`: a single new test (`TestRedisMetrics_MissingKeyNotCountedAsError`) with three subtests that spin up a 3-node cluster, share a `monitoring.RedisRequestObserver` across every node (so `proxyToLeader`-induced routing is captured too), and: 1. Issue the 21 missing-key commands listed above and assert that `elastickv_redis_errors_total` does not move (with a per-command diagnostic dump on failure). GET is additionally checked against `elastickv_redis_requests_total{outcome="success"}` to guard against a regression that silently drops the success sample. 2. Issue RENAME on a missing key and assert the errors counter increments by exactly 1. 3. Issue LSET on a missing key and assert the errors counter increments by exactly 1. The `monitoring/redis.go` / `RedisRequestReport` struct is left untouched per the coordination note in the task brief (PR #594 owns that file). ## Test plan - [x] `go test -race -count=1 -run '^TestRedisMetrics_MissingKeyNotCountedAsError$' ./adapter/` passes. - [x] `go test -race -count=1 ./adapter/` passes (394 → 395 tests, +1 new with 3 subtests). - [x] `golangci-lint run ./adapter/...` is clean. - [x] No production-code file modified; behaviour change is zero.  ## Summary by CodeRabbit ## Tests * Added test suite validating Redis metrics behavior for missing key operations, ensuring commands like GET, HGET, HLEN, LLEN, and ZCARD don't incorrectly count as errors in metrics while verifying actual errors are properly tracked.

…path Three review items addressed at once because they all touch observeUnsupportedCommand in monitoring/redis.go: 1. Security: cap the raw input length to 256 bytes BEFORE any O(n) string operations (utf8.ValidString, strings.ToUpper, rune iteration). A hostile client sending a multi-megabyte first argument would otherwise burn CPU on every unsupported-command observation. Cutting mid-rune is safe here: the UTF-8 validity check below rejects the result (producing the invalid_utf8 sentinel), or the rune-aware truncation further down trims to a clean boundary. 2. Simpler rune truncation: replace the manual byte accumulator with the range-loop byte index. Since the upstream UTF-8 validity check guarantees utf8.RuneLen never returns -1, the redundant helper and its negative-size guard can go. 3. RWMutex fast path: switch unsupportedMu from sync.Mutex to sync.RWMutex. Observations for already-admitted names take a read lock and return without contending on a write lock; only novel names acquire the write lock, with a double-check to avoid races between RUnlock and Lock. New tests: - TestRedisMetricsUnsupportedCommandRejectsPathologicalLength: 1 MiB input must not panic and must still emit a 64-byte-capped valid-UTF-8 label. - TestRedisMetricsUnsupportedCommandRWMutexFastPath: 32 workers x 500 iterations hitting a mix of pre-seeded and novel names, exercised under go test -race.

bootjp · 2026-04-23T15:03:53Z

Addressed the five review items from the prior round.

db4fa30 — fix(obs): preserve raw command bytes to reach invalid_utf8 sentinel
Codex P2. The adapter's Run() loop was handing the already-strings.ToUpper'd name to observeRedisUnsupported. strings.ToUpper silently rewrites invalid UTF-8 bytes to U+FFFD, so by the time observeUnsupportedCommand ran utf8.ValidString the value was synthetically "valid" and the invalid_utf8 sentinel path was unreachable — a hostile client sending binary garbage could burn through the 32 distinct-name slots with replacement-char strings. Now the adapter passes raw string(cmd.Args[0]); ToUpper/TrimSpace happen inside the observer, AFTER the UTF-8 check. Contract documented on both sides. Added TestRedisMetricsUnsupportedCommandRawInvalidUTF8Bytes.

cc41633 — perf(obs): cap raw input, simplify rune truncation, add RWMutex fast path
Three items bundled because they all touch observeUnsupportedCommand:

Gemini HIGH (security): early if len(raw) > 256 { raw = raw[:256] } before any O(n) string op. Cutting mid-rune is fine; the downstream utf8.ValidString or rune-aware truncation handles it. Guards against multi-megabyte first-argument CPU abuse.
Gemini Medium (simplify): rune-truncation loop now uses the range-loop byte index i and utf8.RuneLen(r) directly; the manual accumulator and negative-size guard are gone since the upstream UTF-8 validity check makes them dead code.
Gemini Medium (lock): unsupportedMu is now sync.RWMutex. Already-admitted names take the read lock and return; only novel names acquire the write lock with a double-check to avoid the RUnlock→Lock race.

unicode/utf8 was already imported (Item 5 no-op).

New tests: TestRedisMetricsUnsupportedCommandRejectsPathologicalLength (1 MiB input, no panic, label still 64-byte-capped valid UTF-8) and TestRedisMetricsUnsupportedCommandRWMutexFastPath (32 workers x 500 iters, mixed fast/slow paths, runs clean under -race).

Local verification:

go test -race -timeout 60s -run TestRedisMetrics ./adapter/... — pass.
go test -race ./monitoring/... — pass.
golangci-lint run ./monitoring/... ./adapter/... — 0 issues.

/gemini review
@codex review

gemini-code-assist

Code Review

This pull request introduces a new metric, elastickv_redis_unsupported_commands_total, to track the specific names of unsupported Redis commands while maintaining bounded cardinality. The implementation includes robust defensive measures against malformed or malicious input, such as raw byte length capping, UTF-8 validation to prevent Prometheus panics, and rune-aware truncation. It also utilizes a thread-safe mechanism with a read-lock fast path for recording previously seen command names. Comprehensive tests have been added to verify cardinality limits, invalid UTF-8 handling, and concurrent performance. I have no feedback to provide.

chatgpt-codex-connector · 2026-04-23T15:13:48Z

Codex Review: Didn't find any major issues. Keep them coming!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Codex flagged that the "Unsupported Commands by Name" (Redis summary) and "FSM Apply Sync Mode" (Pebble internals) panels query metrics not yet on main, so during an incident operators could mistake the empty panels for a scrape failure. Append a sentence to each panel's description calling out the dependency and that the panel will populate automatically once PR #594 / PR #592 merge.

…ctor Redis/Cluster dashboards (#597) ## Summary - Add **Elastickv Pebble Internals** dashboard covering block cache hit rate/capacity, L0 pressure, compactions, memtables, FSM apply sync mode, and store write conflicts. - Add **Unsupported Commands by Name** panel to the Redis dashboard (driven by PR #594 `elastickv_redis_unsupported_commands_total`) and annotate the legacy **Errors by Command** panel so operators know the `unknown` bucket is broken down by real name in the new panel. - Fold the standalone Redis hot-path (PR #560) dashboard into a collapsed row at the bottom of the Redis summary dashboard, delete the standalone file, and retitle the summary to **Elastickv Redis** so there is a single Redis-facing dashboard. - Rename the misnamed `elastickv-cluster-summary.json` (which was already a pure DynamoDB dashboard) to `elastickv-dynamodb.json` / **Elastickv DynamoDB** with `uid=elastickv-dynamodb`, and retitle `elastickv-cluster-overview.json` to **Elastickv Cluster** now that it is the sole cluster landing page. ## Before / After file list | State | File | | --- | --- | | Added | `monitoring/grafana/dashboards/elastickv-pebble-internals.json` | | Renamed | `elastickv-cluster-summary.json` to `elastickv-dynamodb.json` | | Retitled | `elastickv-cluster-overview.json` (title now `Elastickv Cluster`) | | Modified | `elastickv-redis-summary.json` (title `Elastickv Redis`; added Unsupported Commands panel and a collapsed Hot Path row) | | Deleted | `monitoring/grafana/dashboards/elastickv-redis-hotpath.json` | | Unchanged | `monitoring/grafana/dashboards/elastickv-raft-status.json` | ## What each dashboard now covers - **Elastickv Cluster** - leader identity, cluster-wide latency/error posture, per-node Raft health, and an embedded Redis row. - **Elastickv DynamoDB** - per-operation throughput/latency and per-table error/volume breakdown for the DynamoDB-compatible API. - **Elastickv Raft Status** - unchanged: membership, leader changes, failed proposals, node state, index drift, backlog, leader contact. - **Elastickv Redis** - per-command throughput/latency/errors, unsupported-commands breakdown, secondary-write and Lua diagnostics, and a collapsible `Hot Path (legacy PR #560)` row with the 9 GET-fast-path panels preserved verbatim. - **Elastickv Pebble Internals** - block cache hit rate and size-vs-capacity, hits/misses rate, L0 sublevels/files with write-stall thresholds, compaction debt/progress/rate, memtable count/size/zombies, FSM apply sync mode, store write conflicts by prefix. ## Dead-metric warnings Two metrics referenced by panels in this PR are **not yet present in `monitoring/*.go` on `main`**: - `elastickv_redis_unsupported_commands_total` (PR #594) - the spec asked for a panel targeting this and this PR adds one; it will stay empty until #594 lands. - `elastickv_pebble_fsm_apply_sync_mode` (PR #592) - same situation; panel is in place and will populate once #592 lands. Both are noted here so reviewers can decide whether to sequence this PR after the upstream ones. ## Test plan - [x] `python3 -c "import json; json.load(open('<file>'))"` passes on every modified dashboard file. - [x] Every dashboard exposes `schemaVersion`, `panels`, `time`, `title`, `uid`. - [x] Panel IDs are unique within each dashboard (including nested hot-path panels). - [x] No overlapping gridPos cells in the modified Redis dashboard (checked with a programmatic sweep). - [x] `go build ./...` succeeds after updating the four source comments / doc page that referenced the deleted hotpath JSON by path. - [x] `go test ./...` passes, including `./monitoring/...`. - [ ] Import each of the five JSON files into Grafana (10.x, schemaVersion 41) via UI; every panel renders without errors and the Hot Path row expands/collapses cleanly.

coderabbitai Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread monitoring/redis.go

chatgpt-codex-connector Bot reviewed Apr 23, 2026

View reviewed changes

bootjp mentioned this pull request Apr 23, 2026

obs(redis): treat missing keys per Redis spec (nil/0) not as errors #595

Merged

4 tasks

gemini-code-assist Bot reviewed Apr 23, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 23, 2026

View reviewed changes

gemini-code-assist Bot reviewed Apr 23, 2026

View reviewed changes

bootjp enabled auto-merge April 23, 2026 16:06

Merge branch 'main' into obs/redis-unsupported-command-names

214d71d

bootjp merged commit 830c3e5 into main Apr 23, 2026
7 checks passed

bootjp deleted the obs/redis-unsupported-command-names branch April 23, 2026 16:13

bootjp mentioned this pull request Apr 23, 2026

obs(grafana): add Pebble internals + unsupported-commands panel; refactor Redis/Cluster dashboards #597

Merged

7 tasks

		if len(name) > maxUnsupportedCommandLabelLen {
		name = name[:maxUnsupportedCommandLabelLen]

Conversation

bootjp commented Apr 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Cardinality guarantee

Wiring

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

bootjp commented Apr 23, 2026

Bug (attack vector)

Fix

Behaviour

Tests (adapter/redis_metrics_test.go)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

bootjp commented Apr 23, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

chatgpt-codex-connector Bot commented Apr 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

bootjp commented Apr 23, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 23, 2026 •

edited

Loading

Tests (`adapter/redis_metrics_test.go`)