feat(main): keyviz wiring follow-up (--keyvizHistoryColumns + Phase-2 read TODO) by bootjp · Pull Request #651 · bootjp/elastickv

bootjp · 2026-04-26T09:23:07Z

Summary

Follow-up to PR #647, which was merged at the round-1 commit before round-2 review fixes propagated. This PR carries the orphaned round-2 changes:

New --keyvizHistoryColumns flag (defaults to keyviz.DefaultHistoryColumns = 1440, i.e. 24h at 60s) so operators can shorten the ring buffer for high-cardinality clusters without rebuilding.
startKeyVizFlusher early-returns when the sampler is nil instead of spawning a goroutine that just parks on ctx.Done — the goroutine was harmless but had no signal.
TODO on observeMutation documenting the Phase-2 read-sampling milestone (design §5.1, §10) so future readers don't think the missing read path is a regression. Until that wiring lands the matrix's Reads/ReadBytes series stay zero.

These items came out of Claude bot's round-2 review of #647 but landed after the merge button was pressed.

Test plan

go build ., go vet ., golangci-lint run ./... clean.
go test -race -count=1 -run 'TestBuildKeyVizSampler|TestSeedKeyVizRoutes|TestStartKeyVizFlusher' . clean.

Round-2 review fixes for PR #647 (Claude bot): - Add --keyvizHistoryColumns flag so operators can shorten the ring buffer for high-cardinality clusters; defaults to keyviz.DefaultHistoryColumns (1440 = 24h at 60s). - startKeyVizFlusher now early-returns when the sampler is nil instead of spawning a goroutine that just parks on ctx.Done. - Document the Phase-2 read-sampling milestone on observeMutation so future readers don't think the missing read path is a bug — the design (§5.1, §10) calls for read attribution from the node that serves the read (LeaseRead / LinearizableRead / follower reads), and that wiring is intentionally deferred to the next slice. Until then the Reads/ReadBytes series stay zero.

coderabbitai · 2026-04-26T09:23:13Z

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 5 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 14 minutes and 5 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0cc4e5b3-a2f7-4547-9caa-f1e9135a3beb

📥 Commits

Reviewing files that changed from the base of the PR and between 41c7f0f and 8f5d0e0.

📒 Files selected for processing (5)

keyviz/sampler.go
keyviz/sampler_test.go
kv/sharded_coordinator.go
main.go
main_keyviz_test.go

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/keyviz-main-wiring-followup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

bootjp · 2026-04-26T09:24:32Z

@claude review

claude · 2026-04-26T09:24:47Z

Claude finished @bootjp's task in 1m 59s —— View job

Code Review — PR #651

Todo list:

Gather context and read PR description
Diff changed files against origin/main
Read main.go changes
Read kv/sharded_coordinator.go changes
Verify flag wiring, defaults, and usage consistency
Cross-check design doc references (§5.1, §10)
Post final review

Summary

Clean, minimal follow-up. All three stated changes are correct. One minor test isolation gap worth fixing.

`main.go` — `--keyvizHistoryColumns` flag (+8/-2)

Flag registration and wiring (main.go:138, main.go:1380) — correct. The flag defaults to keyviz.DefaultHistoryColumns (1440), the value flows directly into MemSamplerOptions.HistoryColumns inside buildKeyVizSampler, and the flag description accurately describes the unit. No issues here.

startKeyVizFlusher nil guard — the early return on s == nil (before eg.Go) is a clean improvement. The prior code would spawn a goroutine that blocked on ctx.Done with no productive work; the updated comment text is accurate.

Minor: withFlags helper in main_keyviz_test.go doesn't cover the new flag (main_keyviz_test.go:94). The helper saves/restores keyvizEnabled, keyvizStep, keyvizMaxTrackedRoutes, and keyvizMaxMemberRoutesPerSlot — but keyvizHistoryColumns is now a fifth package-level flag var and is not covered. This won't cause any of the current tests to fail, but if a future test modifies *keyvizHistoryColumns and calls withFlags, the value won't be restored, leaking state across tests. Adding it now while the change is fresh keeps the helper consistent with the flag surface: Fix this →

Minor: no test pins the HistoryColumns forwarding path. The three tests cited in the PR test plan exercise buildKeyVizSampler (nil vs non-nil), seedKeyVizRoutes, and startKeyVizFlusher lifecycle — but none asserts that the value of *keyvizHistoryColumns actually reaches MemSamplerOptions.HistoryColumns. A one-line assertion inside TestBuildKeyVizSamplerHonorsEnabledFlag (e.g., set the flag to a non-default value and call s.opts.HistoryColumns == expected, or check through a method if the field is unexported) would lock down the wiring. Not a blocker, but it is the only new wiring in this PR.

`kv/sharded_coordinator.go` — Phase-2 TODO comment (+7)

The comment accurately describes what is deferred and why — the design (§5.1, §10 of docs/admin_ui_key_visualizer_design.md) calls for read attribution from the node that serves the read; that wiring is intentionally out of scope for this writes-first slice. The referenced sections match the actual design doc content. The comment will prevent future readers from filing a "reads are always zero, is this a bug?" question. Good documentation.

One minor style note: the TODO token is TODO(keyviz Phase 2) which is non-standard for Go (TODO(username) or TODO(issue#) being the common idioms). This is a project-level convention question and not a linting failure — just worth knowing if you plan to grep for open TODOs later.

Overall

Approve. The flag wiring and nil-guard are correct, the TODO documents a real design gap without over-promising, and the tests cited in the PR description pass. The two "minor" items above are non-blocking but worth a follow-up commit or issue if you want the helper to stay tidy.

gemini-code-assist

Code Review

This pull request introduces the keyvizHistoryColumns configuration flag to manage the KeyViz ring buffer size and optimizes the flusher by avoiding goroutine creation when the sampler is disabled. It also adds documentation regarding future read sampling support. A review comment suggests implementing an upper bound for the history columns flag to mitigate potential out-of-memory risks from large pre-allocations.

gemini-code-assist · 2026-04-26T09:28:18Z

 	keyvizStep                   = flag.Duration("keyvizStep", keyviz.DefaultStep, "Flush interval / matrix-column resolution for the keyviz sampler")
 	keyvizMaxTrackedRoutes       = flag.Int("keyvizMaxTrackedRoutes", keyviz.DefaultMaxTrackedRoutes, "Maximum routes tracked individually before excess routes coarsen into virtual buckets")
 	keyvizMaxMemberRoutesPerSlot = flag.Int("keyvizMaxMemberRoutesPerSlot", keyviz.DefaultMaxMemberRoutesPerSlot, "Maximum members listed on a virtual bucket; excess routes still drive the bucket counters")
+	keyvizHistoryColumns         = flag.Int("keyvizHistoryColumns", keyviz.DefaultHistoryColumns, "Maximum matrix columns retained in the keyviz ring buffer (each column = one Step)")


While this flag allows operators to adjust the ring buffer size, setting an excessively large value could lead to significant memory allocation and potential OOM issues at startup, as the buffer is pre-allocated. Consider adding a reasonable upper bound (e.g., 100,000 columns) in buildKeyVizSampler or NewMemSampler to prevent accidental misconfiguration from crashing the node.

References

To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests or configuration changes.

Round-1 review fix for PR #651 (Gemini medium): newRingBuffer pre-allocates a slice of capacity HistoryColumns at construction. A misconfiguration like --keyvizHistoryColumns=100000000 (operator typo) would reserve gigabytes upfront and likely OOM the node before the heatmap returned its first column. Add MaxHistoryColumns = 100_000 (~70 days at 60s Step) and clamp opts.HistoryColumns in NewMemSampler so excessive values silently land at the cap instead of being trusted as-is. The cap lives in the keyviz package (next to the existing Default* constants) because the risk is a property of the data structure, not the caller. Operators wanting longer retention should use the Phase 3 persistence path (per-Raft-group `!admin|keyviz|*` namespace) — not a giant in-memory ring. Test TestHistoryColumnsClampedAboveMax confirms both above-cap input clamps to MaxHistoryColumns and exactly-at-cap input is preserved.

bootjp · 2026-04-26T09:33:05Z

@claude review

claude · 2026-04-26T09:33:16Z

Claude finished @bootjp's task in 1m 31s —— View job

Code Review — PR #651

Todo list:

Gather context and read PR description
Read changed files (keyviz/sampler.go, keyviz/sampler_test.go, kv/sharded_coordinator.go, main.go)
Check if OOM concern (gemini review) was addressed by MaxHistoryColumns clamp commit
Verify test isolation gap (withFlags helper coverage)
Post final review

Overall verdict: Approve ✓

All three stated goals are correctly implemented. The gemini-raised OOM concern is fully addressed by the second commit. One pre-existing test isolation gap remains.

`keyviz/sampler.go` — `MaxHistoryColumns` upper bound

New constant + clamp (sampler.go:85-92, sampler.go:315-317):

const MaxHistoryColumns = 100_000

if opts.HistoryColumns > MaxHistoryColumns {
    opts.HistoryColumns = MaxHistoryColumns
}

This directly addresses the gemini review concern. The clamp sits between the <= 0 default and the first allocation call, so both under- and over-range inputs are handled in the same place. The constant's doc comment explains the design rationale (Phase 3 persistence as the intended path for long retention). Clean.

TestHistoryColumnsClampedAboveMax (sampler_test.go:878-895) — covers 10× MaxHistoryColumns (clamped) and MaxHistoryColumns exactly (preserved). Both branches exercised, behaviour locked down.

`main.go` — `--keyvizHistoryColumns` flag

Flag registration (main.go:138):

keyvizHistoryColumns = flag.Int("keyvizHistoryColumns", keyviz.DefaultHistoryColumns, "Maximum matrix columns retained in the keyviz ring buffer (each column = one Step)")

Default matches keyviz.DefaultHistoryColumns (1440). Description is accurate. Flag name follows the existing keyviz* naming convention.

Wiring into buildKeyVizSampler (main.go:1380): HistoryColumns: *keyvizHistoryColumns is the only new line here. The MaxHistoryColumns guard in NewMemSampler means the coordinator path doesn't need a defensive clamp.

startKeyVizFlusher nil guard (main.go:1420-1422): Early return on s == nil is correct. The removed goroutine was allocating a stack and a goroutine for the sole purpose of waiting on ctx.Done — harmless, but worthless. The updated comment accurately describes why it was removed.

`kv/sharded_coordinator.go` — Phase-2 TODO

The comment is accurate and self-contained — it names the design document, the specific sections (§5.1, §10), and explains the zero Reads/ReadBytes series without leaving it ambiguous whether that is a bug or an intent. One minor style note (carry-over from the previous round):

TODO(keyviz Phase 2) is non-standard Go TODO style — the common idiom is TODO(username) or TODO(issue#). This won't cause a linting failure, but if you grep for open TODOs later a standardised tag is easier to filter. Low priority, not a blocker.

Remaining gap: `withFlags` helper doesn't cover `keyvizHistoryColumns`

main_keyviz_test.go:94–117 saves and restores keyvizEnabled, keyvizStep, keyvizMaxTrackedRoutes, and keyvizMaxMemberRoutesPerSlot — but not the new keyvizHistoryColumns. No current test in main_keyviz_test.go sets *keyvizHistoryColumns, so no existing test is broken by this. The risk is that a future test modifying the flag through withFlags will leak state across parallel tests. Fixing it now while the context is fresh is preferable to debugging a subtle test flake later: Fix this →

The end-to-end forwarding path (*keyvizHistoryColumns → buildKeyVizSampler → MemSamplerOptions.HistoryColumns) is also not directly tested via main_keyviz_test.go — a one-liner assertion inside TestBuildKeyVizSamplerHonorsEnabledFlag (set the flag to a non-default value, verify s.opts.HistoryColumns matches) would lock down this wiring. Not blocking, but consistent with the project convention that new flag wiring has a forwarding test.

| Branch: feat/keyviz-main-wiring-followup

Round-2 review fix for PR #651 (Claude bot minor): withFlags helper saved/restored every keyviz flag except the new --keyvizHistoryColumns, so a future test mutating the flag through the helper would leak state across parallel tests. Add the slot, update both call sites, and pin the flag → buildKeyVizSampler → MemSamplerOptions.HistoryColumns forwarding by asserting s.HistoryColumns() in TestBuildKeyVizSamplerHonorsEnabledFlag. Also adds a HistoryColumns() accessor on *MemSampler paralleling Step() so wiring tests don't need to reach into the unexported opts struct.

bootjp · 2026-04-26T09:36:53Z

@claude review

claude · 2026-04-26T09:37:08Z

Claude finished @bootjp's task in 2m 4s —— View job

Code Review — PR #651

Todo list:

Gather context and read PR description
Read all changed files
Verify withFlags helper coverage for keyvizHistoryColumns
Check MaxHistoryColumns clamp implementation and tests
Check forwarding path test coverage
Post final review

Overall verdict: Approve ✓

All three issues raised in the previous two review rounds have been closed out. The PR is clean.

Round-2 gaps — now closed

withFlags helper coverage (main_keyviz_test.go:98–124) — fixed. The helper now accepts historyColumns int as a 5th parameter, saves prevHistoryColumns := *keyvizHistoryColumns, sets the flag, and restores it in the defer. Both call sites pass an explicit value (16 in the enabled branch, default in others). No state will leak across parallel tests.

Forwarding path assertion (main_keyviz_test.go:30) — fixed. The test now calls:

require.Equal(t, 16, s.HistoryColumns())

This pins the wiring *keyvizHistoryColumns → buildKeyVizSampler → MemSamplerOptions.HistoryColumns end-to-end. The HistoryColumns() accessor (sampler.go:781) is nil-safe and exposes the post-default, post-clamp value, so the assertion correctly reflects what the ring buffer actually uses.

`keyviz/sampler.go` — `MaxHistoryColumns` clamp (commit `5bddeba`)

const MaxHistoryColumns = 100_000

if opts.HistoryColumns > MaxHistoryColumns {
    opts.HistoryColumns = MaxHistoryColumns
}

Correct placement: sits after the <= 0 defaulting block and before the first allocation. The doc comment explains the 70-day upper bound and why longer retention belongs to the Phase 3 persistence path. The HistoryColumns() accessor mirrors the nil-safe pattern of the existing Step() accessor.

TestHistoryColumnsClampedAboveMax (sampler_test.go:878–895) exercises both branches: 10× MaxHistoryColumns (clamped to cap) and MaxHistoryColumns exactly (preserved). Both critical boundary conditions covered.

`kv/sharded_coordinator.go` — Phase-2 TODO

The comment accurately describes what is deferred, cites the design sections (§5.1, §10), and makes explicit that zero Reads/ReadBytes is intentional, not a regression. One carry-over style note:

TODO(keyviz Phase 2) is non-standard Go idiom (conventional form is TODO(username) or TODO(issue#)). Not a linter failure, but worth standardising if you ever grep open TODOs by tag. Low priority.

`main.go` — nil guard and flag wiring

startKeyVizFlusher early-return on s == nil is correct and the comment accurately explains why the prior goroutine was dropped. Flag default matches keyviz.DefaultHistoryColumns (1440) and the description text is accurate.

All items from the previous two rounds are resolved. No blocking issues remain.
| Branch: feat/keyviz-main-wiring-followup

## Summary Phase 2-B of the Key Visualizer design (`docs/admin_ui_key_visualizer_design.md`): integrate the heatmap into the existing `web/admin/` SPA rather than building a parallel dashboard. Doc + implementation in one PR (doc commit first, per CLAUDE.md design-doc-first workflow). - New page at `/keyviz` — canvas heatmap polling `/admin/api/v1/keyviz/matrix` with series picker (writes / reads / write_bytes / read_bytes), row-budget input (clamped at 1024), and off / 5 s / 30 s auto-refresh. - Cold cells (value 0) render as the page background, not a faint blue — spotting actually-cold routes stays the dominant visual signal. - Row-detail flyout on hover: bucket_id, start, end, aggregate, route_count, route_ids (with truncation indicator). `start` / `end` decode through a printable-or-hex preview so binary keys do not render as mojibake. - Dependency-free: hand-rolled five-stop colour ramp in `lib/colorRamp.ts`. No d3, no ECharts. Bundle grew from ~155 kB to ~208 kB (raw), 64 kB gzipped. Backend is unchanged. The handler at `/admin/api/v1/keyviz/matrix` and its sampler wiring already shipped under Phase 2-A (PRs #639 / #645 / #646 / #647 / #651 / #660 / #661 / #672). ## Design `docs/design/2026_04_27_proposed_keyviz_spa_integration.md` — proposed status, lands with this PR. Parent design §12 phase table is split into 2-A (server, shipped) / 2-B (SPA, this PR) / 2-C (cluster fan-out, open). ## Five-lens self-review 1. **Data loss** — n/a; SPA is read-only against an existing handler. 2. **Concurrency / distributed** — n/a; single browser tab polling a single handler. Sampler concurrency was already covered by Phase 2-A tests. 3. **Performance** — Canvas + per-cell `fillRect` runs under §10 budget at 1024 × 500. Auto-refresh defaults off; 5-second cadence is the lower bound (sampler flush is 1 s). 4. **Data consistency** — SPA renders whatever the handler returns; consistency guarantees come from the existing leader-issued counters in the sampler. 5. **Test coverage** — `tsc -b --noEmit` clean; `vite build` clean; `go build ./internal/admin/...` clean (embed glob unaffected); `go test ./internal/admin/...` clean. Manual verification documented in the design doc §5. ## Test plan - [x] `npm run lint` (`tsc -b --noEmit`) — clean - [x] `npm run build` (Vite) — clean, output goes to `internal/admin/dist` - [x] `go build ./internal/admin/...` — clean - [x] `go test ./internal/admin/...` — clean - [ ] Manual: `make run` + `make client`, navigate to `/keyviz`, see hot routes light up red within ~5 s of write traffic - [ ] Manual: series picker swaps the displayed counter; row-budget input clamps at 1024; auto-refresh polls without flicker ## Out of scope - **Cluster fan-out** — handler is currently node-local. Phase 2-C will add a cross-node admin RPC; this PR will pick up the aggregate view automatically once that ships. - **Drill-down per-route sparkline** — Phase 3. - **Routes / Raft Groups correlation** — Phase 1 SPA pages not yet built; correlation lands when those pages do. - **`localStorage` for series / rows / refresh** — punt to follow-up.

gemini-code-assist Bot reviewed Apr 26, 2026

View reviewed changes

bootjp enabled auto-merge April 26, 2026 09:47

bootjp merged commit cc7108f into main Apr 26, 2026
8 checks passed

bootjp deleted the feat/keyviz-main-wiring-followup branch April 26, 2026 09:49

bootjp mentioned this pull request Apr 26, 2026

feat(admin): KeyViz heatmap page in web/admin SPA (Phase 2-B) #680

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(main): keyviz wiring follow-up (--keyvizHistoryColumns + Phase-2 read TODO)#651

feat(main): keyviz wiring follow-up (--keyvizHistoryColumns + Phase-2 read TODO)#651
bootjp merged 3 commits intomainfrom
feat/keyviz-main-wiring-followup

bootjp commented Apr 26, 2026

Uh oh!

coderabbitai Bot commented Apr 26, 2026 •

edited

Loading

Rate limit exceeded

Uh oh!

bootjp commented Apr 26, 2026

Uh oh!

claude Bot commented Apr 26, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 26, 2026

Uh oh!

bootjp commented Apr 26, 2026

Uh oh!

claude Bot commented Apr 26, 2026 •

edited

Loading

Uh oh!

bootjp commented Apr 26, 2026

Uh oh!

claude Bot commented Apr 26, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bootjp commented Apr 26, 2026

Summary

Test plan

Uh oh!

coderabbitai Bot commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Uh oh!

bootjp commented Apr 26, 2026

Uh oh!

claude Bot commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — PR #651

Summary

main.go — --keyvizHistoryColumns flag (+8/-2)

kv/sharded_coordinator.go — Phase-2 TODO comment (+7)

Overall

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

bootjp commented Apr 26, 2026

Uh oh!

claude Bot commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — PR #651

Overall verdict: Approve ✓

keyviz/sampler.go — MaxHistoryColumns upper bound

main.go — --keyvizHistoryColumns flag

kv/sharded_coordinator.go — Phase-2 TODO

Remaining gap: withFlags helper doesn't cover keyvizHistoryColumns

Uh oh!

bootjp commented Apr 26, 2026

Uh oh!

claude Bot commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — PR #651

Overall verdict: Approve ✓

Round-2 gaps — now closed

keyviz/sampler.go — MaxHistoryColumns clamp (commit 5bddeba)

kv/sharded_coordinator.go — Phase-2 TODO

main.go — nil guard and flag wiring

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Apr 26, 2026 •

edited

Loading

claude Bot commented Apr 26, 2026 •

edited

Loading

`main.go` — `--keyvizHistoryColumns` flag (+8/-2)

`kv/sharded_coordinator.go` — Phase-2 TODO comment (+7)

claude Bot commented Apr 26, 2026 •

edited

Loading

`keyviz/sampler.go` — `MaxHistoryColumns` upper bound

`main.go` — `--keyvizHistoryColumns` flag

`kv/sharded_coordinator.go` — Phase-2 TODO

Remaining gap: `withFlags` helper doesn't cover `keyvizHistoryColumns`

claude Bot commented Apr 26, 2026 •

edited

Loading

`keyviz/sampler.go` — `MaxHistoryColumns` clamp (commit `5bddeba`)

`kv/sharded_coordinator.go` — Phase-2 TODO

`main.go` — nil guard and flag wiring