Conversation
Mirror the existing ADMIN_* env-driven plumbing for the KeyViz heatmap sampler so operators can flip --keyvizEnabled and the --keyvizFanoutNodes peer list from deploy.env without editing the docker run command. The previous workflow required either editing the script per-deploy or running the binary outside the rolling-update path.

What's added:
- KEYVIZ_ENABLED master switch (validated as true|false at the top of the local script alongside ADMIN_ENABLED).
- KEYVIZ_FANOUT_NODES comma-separated host:port list. Pre-quoted via printf %q for the same reason ADMIN_FULL_ACCESS_KEYS is: commas survive an unquoted env pass but pre-quoting keeps the pattern uniform.
- build_keyviz_flags helper (nameref-output, mirrors build_admin_flags). Empty array when KEYVIZ_ENABLED != "true", so existing deploys see no behaviour change.
- KEYVIZ_* env vars threaded through the SSH env passthrough so the heredoc-resident build_keyviz_flags reads them on the remote.
- env.example block documenting the variables and pointing at the Phase 2-C fan-out auth follow-up (docs/design/2026_04_27_proposed_keyviz_cluster_fanout.md).

What's not added:
- KEYVIZ_STEP, KEYVIZ_MAX_TRACKED_ROUTES, KEYVIZ_HISTORY_COLUMNS, KEYVIZ_FANOUT_TIMEOUT -- these have sensible binary-side defaults and operators have not asked for env-driven overrides yet. Adding them is a one-line _flags+= edit per knob if needed.
- Inter-node fan-out auth -- Phase 2-C does not yet ship a pre-shared bearer token, so a multi-node KEYVIZ_FANOUT_NODES setup surfaces 5/N peer 401s in the SPA degraded banner. The env.example comment documents this.

Verified locally: deployed to a 5-node cluster with KEYVIZ_ENABLED=true and KEYVIZ_FANOUT_NODES set across all five admin listeners; --keyviz* flags appeared on every container's docker inspect, and the admin keyviz handler returned a matrix with the fan-out block populated.

Self-review (CLAUDE.md 5 lenses):
1. Data loss -- None. Build/runbook plumbing only.
2. Concurrency -- None.
3. Performance -- None on the deploy path. Per-node sampler cost is covered by the sampler's own design doc.
4. Data consistency -- None.
5. Test coverage -- bash -n syntax-check passes. Existing rolling-update.sh has no automated tests in this repo; the verification path is "deploy to a real cluster and inspect docker run args."
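The true|false validation described above could look roughly like the following. This is a minimal sketch, not the code in scripts/rolling-update.sh: the function name validate_bool_vars, the error text, and the choice to treat an unset variable as false are all assumptions made for illustration. Only the variable names come from the commit message.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: reject boolean-style env vars that are not exactly
# "true" or "false". validate_bool_vars is an illustrative name, not the
# helper used in scripts/rolling-update.sh.
validate_bool_vars() {
  local name value
  for name in "$@"; do
    # Indirect expansion: read the variable whose name is stored in $name.
    # Assumption: an unset or empty variable counts as "false".
    value="${!name:-false}"
    case "$value" in
      true|false) ;;
      *)
        echo "error: $name must be true or false (got '$value')" >&2
        return 1
        ;;
    esac
  done
}

# Usage: check the switches before any remote work starts.
KEYVIZ_ENABLED=true
validate_bool_vars KEYVIZ_ENABLED && echo "KEYVIZ_ENABLED ok"
```

Failing early on the local side means a typo such as KEYVIZ_ENABLED=yes is reported before any node is touched, rather than silently producing an empty flag list on the remote.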
@claude review
Claude finished @bootjp's task in 3m 3s.
Code Review
Overall: Clean, backward-compatible PR. The env-driven plumbing faithfully mirrors the
Code Review
This pull request introduces support for the KeyViz heatmap sampler in the rolling update scripts. It adds new environment variables, KEYVIZ_ENABLED and KEYVIZ_FANOUT_NODES, to the example configuration and the update script, ensuring the boolean flag is validated and the settings are correctly passed to the container as CLI flags. Feedback was provided to refactor the build_keyviz_flags function to use local variables and quoted namerefs for better consistency and safety.
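A nameref-output helper that follows the review feedback (environment values snapshotted into locals, nameref target quoted) might be sketched as below. This is an assumption-laden illustration rather than the function as merged: the exact flag spelling passed to the binary (a bare --keyvizEnabled and a space-separated --keyvizFanoutNodes value) and the local variable names are guesses, and `local -n` requires bash 4.3 or newer.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a nameref-output flag builder (bash 4.3+).
# The caller's array must not itself be named _flags, or the nameref
# would refer to itself.
build_keyviz_flags() {
  local -n _flags="$1"                          # quoted nameref to the caller's array
  local enabled="${KEYVIZ_ENABLED:-false}"      # snapshot env values into locals
  local fanout_nodes="${KEYVIZ_FANOUT_NODES:-}"

  _flags=()
  # Anything other than "true" leaves the array empty, so existing
  # deploys see no change in their docker run arguments.
  if [[ "$enabled" != "true" ]]; then
    return 0
  fi

  _flags+=(--keyvizEnabled)
  if [[ -n "$fanout_nodes" ]]; then
    # Assumed flag form; the real script may use --flag=value instead.
    _flags+=(--keyvizFanoutNodes "$fanout_nodes")
  fi
}

# Usage with hypothetical peer addresses.
keyviz_flags=()
KEYVIZ_ENABLED=true
KEYVIZ_FANOUT_NODES="10.0.0.1:8080,10.0.0.2:8080"
build_keyviz_flags keyviz_flags
printf '%s\n' "${keyviz_flags[@]}"
```

Writing into a caller-supplied array keeps each flag as a separate word, so a value never needs to be re-split when it is later expanded as "${keyviz_flags[@]}" on the command line.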
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Address claude bot review on 6c7ae79.

Findings 1 and 2 already landed in 94ade0d (gemini's nameref-quote + locals-snapshot suggestion accepted by the author). This commit picks up the two remaining items:
- Stale comment at scripts/rolling-update.sh:1233 still said "two boolean flags (ADMIN_ENABLED, ADMIN_ALLOW_*)" -- the validation loop above also covers KEYVIZ_ENABLED now. Rewrite to "boolean flags (ADMIN_ENABLED, ADMIN_ALLOW_*, KEYVIZ_ENABLED)".
- env.example fan-out note was both ambiguously phrased and outdated. PR #692 ("forward session cookie on KeyViz fan-out so peers do not 401") shipped on main while this PR was in review, so the prior "Phase 2-C does not yet ship inter-node auth" note is no longer accurate. Rewrite to describe the as-shipped behaviour: the aggregator forwards the operator's session cookie, so peers with --adminEnabled accept the call as long as the same signing key and role allow-lists are configured cluster-wide; peers without --adminEnabled expose an unauthenticated endpoint and respond unconditionally.

bash -n scripts/rolling-update.sh passes.
@claude review
Claude finished @bootjp's task in 3m 1s.
Code Review
Overall: The PR is clean and well-structured. The four nits from the previous Claude review (nameref quoting, local-var snapshotting, stale boolean-flags comment, ambiguous 401 wording in env.example) have all been addressed in the latest commit. One new edge case is worth noting:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 76f290618f
# --adminEnabled expose an unauthenticated keyviz endpoint and
# respond unconditionally.
Fix incorrect auth guidance for non-admin fanout peers
This comment says peers without --adminEnabled expose an unauthenticated keyviz endpoint, but that does not match the current server behavior. startAdminFromFlags returns immediately when admin is disabled (main_admin.go), so those peers do not expose /admin/api/v1/keyviz/matrix at all, and the route is documented as auth-protected when admin is running (internal/admin/server.go). Following this guidance can lead operators to include non-admin nodes in KEYVIZ_FANOUT_NODES and get avoidable fan-out failures (connection refused or degraded peer status).
Summary
Mirror the existing ADMIN_* env-driven plumbing for the KeyViz heatmap sampler so operators can flip --keyvizEnabled and the --keyvizFanoutNodes peer list from deploy.env without editing the script per-deploy.

What's added
- KEYVIZ_ENABLED master switch (validated as true|false at the top of the local script alongside ADMIN_ENABLED).
- KEYVIZ_FANOUT_NODES comma-separated host:port list. Pre-quoted via printf %q for the same reason ADMIN_FULL_ACCESS_KEYS is: commas survive an unquoted env pass but pre-quoting keeps the pattern uniform.
- build_keyviz_flags helper (nameref-output, mirrors build_admin_flags). Empty array when KEYVIZ_ENABLED != "true", so existing deploys see no behaviour change.
- KEYVIZ_* env vars threaded through the SSH env passthrough so the heredoc-resident build_keyviz_flags reads them on the remote.
- env.example block documenting the variables and pointing at the Phase 2-C fan-out auth follow-up (docs/design/2026_04_27_proposed_keyviz_cluster_fanout.md).

What's not added
- KEYVIZ_STEP, KEYVIZ_MAX_TRACKED_ROUTES, KEYVIZ_HISTORY_COLUMNS, KEYVIZ_FANOUT_TIMEOUT: these have sensible binary-side defaults and operators have not asked for env-driven overrides yet. Adding them is a one-line _flags+= edit per knob if needed.
- Inter-node fan-out auth: Phase 2-C does not yet ship a pre-shared bearer token, so a multi-node KEYVIZ_FANOUT_NODES setup surfaces 5/N peer 401s in the SPA degraded banner. The env.example comment documents this.

Verification
Deployed to a live 5-node cluster (192.168.0.210-214) with KEYVIZ_ENABLED=true and KEYVIZ_FANOUT_NODES set across all five admin listeners; --keyviz* flags appeared on every container's docker inspect, and the admin keyviz handler returned a matrix with the fanout block populated.

(The 5/6 401s are the expected fan-out-auth gap, surfaced by the PR #687 FanoutBanner in the SPA. Tracked separately.)

Self-review (CLAUDE.md 5 lenses)
1. Data loss -- None. Build/runbook plumbing only.
2. Concurrency -- None.
3. Performance -- None on the deploy path. Per-node sampler cost is covered by the sampler's own design doc.
4. Data consistency -- None.
5. Test coverage -- bash -n syntax-check passes. Existing rolling-update.sh has no automated tests in this repo; the verification path is "deploy to a real cluster and inspect docker run args."

Test plan
- bash -n scripts/rolling-update.sh: syntax OK

@claude review
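The printf %q pre-quoting mentioned under "What's added" can be illustrated with a small round-trip check. This sketch is not taken from the script: a local `bash -c` stands in for the shell that the SSH session would start on the remote, and the roundtrip function, the VALUE variable, and the peer addresses are hypothetical names chosen for the example.

```bash
#!/usr/bin/env bash
# Sketch: show that a value pre-quoted with printf %q survives being
# re-parsed by a second shell. roundtrip is an illustrative helper only.
roundtrip() {
  local quoted
  quoted="$(printf '%q' "$1")"
  # The inner shell re-parses the quoted text as an assignment, much as a
  # remote shell would re-parse an env assignment embedded in a command line.
  bash -c "VALUE=${quoted}; printf '%s' \"\$VALUE\""
}

nodes="10.0.0.1:8080,10.0.0.2:8080"   # hypothetical peer list
if [[ "$(roundtrip "$nodes")" == "$nodes" ]]; then
  echo "round-trip ok"
fi
```

A comma-separated host:port list would pass through unquoted as well, which matches the note that the quoting is there to keep the pattern uniform. The same helper also protects values containing spaces, semicolons, or dollar signs, which is where unquoted passthrough would break.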