admin: CreateTable / DeleteTable write endpoints (P1, leader-only) by bootjp · Pull Request #634 · bootjp/elastickv

bootjp · 2026-04-25T13:53:44Z

Stacked on #633 (the read-only chunk). Writes are limited to the leader node for now; follower-side AdminForward RPC (design Section 3.3 acceptance criteria 1-6) ships in a follow-up PR. Mergeable on its own — followers respond 503 leader_unavailable + Retry-After: 1.

Summary

POST /admin/api/v1/dynamo/tables and DELETE /admin/api/v1/dynamo/tables/{name} both go through the existing protect chain (BodyLimit → SessionAuth → Audit → CSRF). The handler also enforces RoleFull so a read-only key cannot create or delete even with a valid CSRF token.
Adapter side: AdminCreateTable / AdminDeleteTable take an AdminPrincipal and re-validate the role at the adapter layer even when a higher tier already enforced it. This preserves the design's "adapter side is the source of truth for authz" invariant (Section 3.2). Two sentinel errors (ErrAdminNotLeader, ErrAdminForbidden) signal the structured failure modes.
Bridge in main_admin.go translates adapter errors to admin sentinels (ErrTablesNotLeader → 503 + Retry-After: 1, ErrTablesForbidden → 403, ResourceInUse → 409, ResourceNotFound → 404, ValidationException → 400). Raw adapter error text is never surfaced to clients; everything else falls through to a generic 500 with the original message logged at error level.
Strict JSON decoding (DisallowUnknownFields); each validation message is plain English so the SPA can render it directly.
Two summary structs (adapter.AdminCreateTableInput / admin.CreateTableRequest) stay independent so neither package imports the other; the bridge keeps them in sync and any drift breaks the build there.

Test plan

go build ./...
go vet ./...
golangci-lint run (admin, adapter, root: 0 issues)
go test ./internal/admin/ -count=1 (49 tests pass — 14 new write-handler unit tests, 4 new server-level integration tests)
go test ./adapter/ -count=1 -run 'TestDynamoDB_Admin' (14 tests pass — 9 new write-path tests including duplicate rejection, role enforcement at adapter, validation errors, delete of a missing table → ResourceNotFound, etc.)
Manual smoke against a running node:
- curl -X POST .../dynamo/tables with full-role cookies + CSRF header → 201 + JSON summary
- same against a follower → 503 + Retry-After: 1
- DELETE on a non-existent table → 404 not_found

Stacked roadmap

#633 (admin: read-only DynamoDB tables endpoints, P1): read-only GET /tables + GET /tables/{name} (in review)
THIS PR — POST + DELETE (leader-only)
AdminForward RPC + follower-leader forwarding (Section 3.3 acceptance criteria 1-6)
S3 read-only endpoints
S3 write endpoints
SPA (React + Vite, embed.FS)

coderabbitai · 2026-04-25T13:53:51Z

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 36 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 47 minutes and 36 seconds.

📥 Commits

Reviewing files that changed from the base of the PR and between 8e3bb37 and 5783c2a.

📒 Files selected for processing (11)

adapter/dynamodb_admin.go
adapter/dynamodb_admin_test.go
internal/admin/dynamo_handler.go
internal/admin/dynamo_handler_test.go
internal/admin/role_store.go
internal/admin/router.go
internal/admin/server.go
internal/admin/server_test.go
main.go
main_admin.go
main_admin_test.go

bootjp · 2026-04-25T13:53:57Z

@claude review

gemini-code-assist

Code Review

This pull request introduces a SigV4-bypass administrative interface for DynamoDB-compatible tables, enabling listing, description, creation, and deletion through the admin dashboard. The implementation includes a new adapter layer for core logic, an HTTP handler with pagination and validation, and wiring into the admin server. Review feedback points out a redundant table name validation in the adapter and recommends refactoring the JSON response utility to encode the body before flushing headers, ensuring response reliability.

gemini-code-assist · 2026-04-25T13:59:56Z

+	if strings.TrimSpace(legacy.TableName) == "" {
+		return nil, newDynamoAPIError(http.StatusBadRequest, dynamoErrValidation, "missing table name")
+	}


This check for an empty table name is redundant. buildLegacyCreateTableInput (called at line 200) already performs this validation and returns a structured dynamoErrValidation error if the name is missing or whitespace-only. Removing this improves code maintainability by eliminating unreachable logic.

References

Remove dead or unreachable code paths, such as asynchronous handlers for tasks that are always performed synchronously, to improve code clarity and maintainability.

gemini-code-assist · 2026-04-25T13:59:57Z

+	w.WriteHeader(status)
+	if err := json.NewEncoder(w).Encode(body); err != nil {
+		if logger == nil {
+			logger = slog.Default()
+		}
+		logger.LogAttrs(ctx, slog.LevelWarn, "admin response encode failed",
+			slog.String("error", err.Error()),
+		)
+	}


Calling w.WriteHeader(status) before encoding the JSON body can lead to partial or malformed responses if json.NewEncoder(w).Encode(body) fails (e.g., due to a broken connection or an unexpected encoding error). Since the status code and headers are already flushed, the client might misinterpret a truncated response as a successful one. It is safer to encode the body into a buffer or use json.Marshal first, and only write the headers and status code if encoding succeeds.

func writeAdminJSONStatus(w http.ResponseWriter, ctx context.Context, logger *slog.Logger, status int, body any) {
	b, err := json.Marshal(body)
	if err != nil {
		if logger == nil {
			logger = slog.Default()
		}
		logger.LogAttrs(ctx, slog.LevelWarn, "admin response encode failed",
			slog.String("error", err.Error()),
		)
		writeJSONError(w, http.StatusInternalServerError, "internal_error", "failed to encode response")
		return
	}
	w.Header().Set("Content-Type", "application/json; charset=utf-8")
	w.Header().Set("Cache-Control", "no-store")
	w.WriteHeader(status)
	_, _ = w.Write(b)
}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 14ae1e9bb0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-25T14:04:46Z

+	dec := json.NewDecoder(body)
+	dec.DisallowUnknownFields()
+	var out CreateTableRequest
+	if err := dec.Decode(&out); err != nil {


Reject trailing JSON after create-table payload

decodeCreateTableRequest decodes exactly one JSON value and returns success immediately, so bodies like {"table_name":...}{"extra":1} are accepted even though they are not a single valid request document. This weakens the stated strict-decoding contract and can let malformed or concatenated payloads create/delete tables unexpectedly instead of returning 400 invalid_body; add an EOF check with a second decode/token read after the first decode succeeds.
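
One way to enforce the single-document rule is sketched below. It assumes the standard library's encoding/json rather than the goccy/go-json decoder this PR uses, and `createTableRequest` and `decodeSingle` are hypothetical names for illustration. After the first decode succeeds, a second token read must return io.EOF; anything else counts as trailing data.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// createTableRequest is a cut-down, hypothetical request shape.
type createTableRequest struct {
	TableName string `json:"table_name"`
}

// decodeSingle accepts exactly one JSON object with known fields only.
func decodeSingle(raw []byte) (createTableRequest, error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()

	var out createTableRequest
	if err := dec.Decode(&out); err != nil {
		return createTableRequest{}, err
	}
	// The next token read must hit end-of-input. A second value, a stray
	// delimiter, or garbage all fail this check.
	if _, err := dec.Token(); !errors.Is(err, io.EOF) {
		return createTableRequest{}, errors.New("request body has trailing data after the JSON object")
	}
	return out, nil
}

func main() {
	_, err := decodeSingle([]byte(`{"table_name":"users"}{"extra":1}`))
	fmt.Println(err)
}
```

Trailing whitespace is still accepted, since the token reader skips it before reporting end-of-input.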

chatgpt-codex-connector · 2026-04-25T14:12:02Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

bootjp · 2026-04-25T14:15:15Z

@claude review (addressed 2 Gemini medium items + 1 Codex P2 item)

bootjp · 2026-04-25T14:23:16Z

@claude review (addressed Codex P2: slash-in-name)

bootjp · 2026-04-25T14:35:53Z

@claude review (pulled in, bottom-up, the Codex P2 raised on PR #635: canonicalise GSI projection_type in validateGSI)

bootjp · 2026-04-25T14:47:01Z

@claude review (rebased to pick up the changes from #633)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 070fa3f0ca

chatgpt-codex-connector · 2026-04-25T14:58:44Z

+	if dec.More() {
+		return CreateTableRequest{}, errors.New("request body has trailing data after the JSON object")


Reject NUL-delimited trailing request content

decodeCreateTableRequest relies on dec.More() to enforce a single JSON document, but with goccy/go-json a body like {"table_name":...}\x00{"extra":1} is accepted: the decoder treats the NUL byte as end-of-input, More() is false, and the trailing bytes are silently ignored. That means malformed/concatenated payloads can still pass invalid_body checks on POST /admin/api/v1/dynamo/tables, violating the strict-body contract for this new write path.

decodeCreateTableRequest used dec.More() to enforce a single JSON document per request, but goccy/go-json treats a raw NUL as end-of-input: a body like `{"table_name":...}\x00{"extra":1}` parsed cleanly, dec.More() returned false, and the trailing content was silently dropped. Codex P2 on PR #634 flagged this as a payload-smuggling vector.

Fix: read the body once, scan for NUL before decoding. JSON has no need for raw NUL (control characters must be \u-escaped per RFC 8259), so any NUL is a strong signal of either tooling misconfiguration or deliberate smuggling. Reject with 400.

The existing dec.More() check stays: it catches the well-formed trailing-token cases (a second `{...}` or trailing `42`) that would otherwise pass NUL-validation but still violate the strict-body contract.

Tests: extend TestDynamoHandler_CreateTable_RejectsBadJSON with two NUL vectors: the trailing `{"extra":1}` from the Codex report, and a bare trailing NUL with no extra payload.

bootjp · 2026-04-25T15:05:11Z

@claude review (addressed Codex P2: NUL-byte payload smuggling; after the body is read, a detected NUL byte is rejected with 400)

Two related findings on PR #635 / forward_server.go's handleDelete:

- NUL-byte smuggling: handleDelete decoded with goccy/go-json but skipped the explicit NUL scan that decodeCreateTableRequest applies. Same vector as the #634 fix: `{"name":"users"}\x00{"extra":1}` passes dec.More() because goccy treats NUL as end-of-input. Add the same pre-decode NUL rejection.
- Slash-in-name divergence: the HTTP handleDelete and handleDescribe both reject `/` in the table name with 404, but the forwarded delete just passed body.Name straight through to AdminDeleteTable. A forwarded call could therefore act on slash-bearing tables that a leader-direct call would 404. Reject symmetrically before invoking the source.

Tests: two new ForwardServer cases (NUL payload + slash name). Both confirm the source is never invoked when the precondition fails (defence in depth: an asymmetric stub source would still make the test green if we only checked the response code).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 84bf2580e0

chatgpt-codex-connector · 2026-04-25T15:18:35Z

+	if err != nil {
+		writeJSONError(w, http.StatusBadRequest, "invalid_body", err.Error())
+		return


Return 413 for oversized create-table bodies

POST /admin/api/v1/dynamo/tables currently maps every decode failure to 400 invalid_body, so when BodyLimit trips (http.MaxBytesError) the client still gets a 400 instead of the canonical 413 payload_too_large used elsewhere in admin handlers. This breaks the middleware contract documented in internal/admin/middleware.go and makes oversized requests indistinguishable from malformed JSON for callers and retry logic.

decodeCreateTableRequest used to surface every read/parse failure as the same generic "invalid_body" string, so handleCreate mapped all of them to 400, including the BodyLimit/MaxBytesReader overflow that should produce 413 payload_too_large. The middleware contract in internal/admin/middleware.go promises 413 on oversized bodies (WriteMaxBytesError lives in that file exactly for this purpose). Codex P2 on PR #634 flagged the write path as the only handler that broke that contract: callers and retry logic could not distinguish "body too big" from "body malformed", and oversize requests would be retried as if a caller-side fix was possible.

Fix: introduce errCreateBodyTooLarge as a sentinel returned only when io.ReadAll trips MaxBytesReader. handleCreate matches the sentinel via errors.Is and routes to WriteMaxBytesError, which emits the canonical 413 + payload_too_large body. All other decode paths still produce 400 invalid_body unchanged.

Test: TestDynamoHandler_CreateTable_OversizedBodyReturns413 wraps the request body in MaxBytesReader (mirroring what the real BodyLimit middleware does) and confirms the response is 413 with a payload_too_large code. Also asserts the stub source is not touched on rejection.

bootjp · 2026-04-25T15:23:50Z

@claude review (addressed Codex P2: oversized bodies now return 413 payload_too_large instead of 400)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 41a00638ee

chatgpt-codex-connector · 2026-04-25T15:39:51Z

+		}
+		return &admin.ValidationError{Message: msg}
+	default:
+		return err //nolint:wrapcheck // forwarded so the handler logs but does not surface it.


Map leader-churn dispatch errors to not-leader response

The bridge currently only maps adapter.ErrAdminNotLeader, but AdminCreateTable/AdminDeleteTable can still return raw coordinator leadership-loss errors after the initial leader check (e.g., leadership changes between isVerifiedDynamoLeader and Dispatch in adapter/dynamodb_admin.go). Those errors fall through this default branch and become a generic 500, so transient leader churn is reported as server failure instead of 503 leader_unavailable with retry semantics.

translateAdminTablesError previously only matched adapter.ErrAdminNotLeader, which AdminCreateTable raises from its up-front isVerifiedDynamoLeader guard. After that guard, the kv coordinator can still drop leadership mid-Dispatch: the resulting error (kv.ErrLeaderNotFound, adapter.ErrNotLeader, adapter.ErrLeaderNotFound, or any of the wire-level "not leader" phrases) fell through to the default branch and surfaced as a generic 500 with no Retry-After. Codex P2 on PR #634 flagged this as a regression in the write path's retry contract.

Fix: introduce isLeaderChurnError that mirrors kv's closed list of leader-churn sentinels (the typed sentinels plus the wire-level phrases for errors that have crossed a gRPC boundary and lost their type). Hook it into translateAdminTablesError so any matching error maps to admin.ErrTablesNotLeader, which the handler already turns into 503 leader_unavailable + Retry-After: 1.

Tests pin all four typed sentinels, all four wire phrases, and prove an unrelated "team leader" error still falls through to the default branch. The focused substring match avoids swallowing random sentences that happen to mention leadership.

bootjp · 2026-04-25T17:10:27Z

@claude review

claude · 2026-04-25T17:10:44Z

Claude finished @bootjp's task in 2m 46s

Code Review

Overall Assessment

Solid, well-structured PR. All findings from the previous review rounds (Gemini mediums + Codex P2 items) have been addressed. The code is clean, the test coverage is thorough, and the layered security model is correctly implemented. Three minor comment-staleness issues below — none block merge.

Previously Raised Issues — Status

| Issue | Reviewer | Status |
| --- | --- | --- |
| JSON encode-before-write race | Gemini | ✅ Fixed: `writeAdminJSONStatus` marshals to buffer first (`dynamo_handler.go:622-654`) |
| Redundant empty-name check in `AdminCreateTable` | Gemini | ✅ Removed: comment at `dynamodb_admin.go:212-213` confirms dead code was cleaned up |
| 413 for oversized body | Codex P2 | ✅ Fixed: `errCreateBodyTooLarge` + `IsMaxBytesError` check (`dynamo_handler.go:296-298`) |
| NUL-byte payload smuggling | Codex P2 | ✅ Fixed: `bytes.IndexByte(raw, 0)` guard before decoding (`dynamo_handler.go:395-397`) |
| Trailing JSON after valid object | Codex P2 | ✅ Fixed: `dec.More()` check after decode (`dynamo_handler.go:409-411`) |
| Leader-churn dispatch errors map to 503 | Codex P2 | ✅ Fixed: `isLeaderChurnError` in `translateAdminTablesError` (`main_admin.go:247-254`) |
| `strings.Contains` → `HasSuffix` in churn matcher | Codex P2 | ✅ Fixed: `main_admin.go:286-289` |
| Structured errors ordered before churn check | Codex P2 | ✅ Fixed: `IsAdminTableAlreadyExists`/`IsAdminValidation` before `isLeaderChurnError` (`main_admin.go:237-246`) |
| GSI `projection_type` canonicalisation | Codex P2 | ✅ Fixed: `validateGSI` uppercases in-place (`dynamo_handler.go:500-507`) |
| Outdated `DynamoHandler` struct comment | Previous Claude | ✅ Fixed: now describes GET+POST on `/tables`, GET+DELETE on `/tables/{name}` (`dynamo_handler.go:157-163`) |
| Whitespace table name propagating to adapter | Previous Claude | ✅ Fixed: `in.TableName = strings.TrimSpace(in.TableName)` at validation boundary (`dynamo_handler.go:431`) |

New Findings

1. Stale route table in buildAPIMux godoc (internal/admin/server.go:136-137)

The layout comment still lists only GET routes for dynamo:

// Layout:
//   POST   /admin/api/v1/auth/login
//   POST   /admin/api/v1/auth/logout
//   GET    /admin/api/v1/cluster
//   GET    /admin/api/v1/dynamo/tables              (auth required, read-only)
//   GET    /admin/api/v1/dynamo/tables/{name}       (auth required, read-only)

Now that this PR ships POST /tables and DELETE /tables/{name}, the layout comment is missing two lines and the (read-only) annotation on both existing lines is wrong.

2. Stale Tables field comment (internal/admin/server.go:35)

// Tables is the read-only DynamoDB admin source. Optional: a nil
// value disables /admin/api/v1/dynamo/tables{,/{name}} ...

TablesSource now declares AdminCreateTable and AdminDeleteTable, so calling it "read-only" is misleading.

3. Slightly misleading "Read-only endpoints" comment (internal/admin/server.go:197-201)

// Read-only endpoints share the protect chain so a missing
// session or CSRF token still 401s/403s the same way as a write.
// The Audit middleware is a no-op for GET (it only logs state-
// changing methods) so we get the consistent guards without the
// noise of an audit line per dashboard poll.
var dynamoChain http.Handler

This dynamoChain now serves both reads and writes. The "Audit is a no-op for GET" rationale is still correct (writes produce audit entries, reads do not), but the opening "Read-only endpoints" label is misleading. A one-word fix ("Dynamo endpoints…" or "Read and write endpoints…") suffices. Minor, with no correctness impact.

What Looks Good

Adapter isolation: internal/admin has zero adapter imports; the dynamoTablesBridge in main_admin.go owns all cross-package translation. Struct divergence breaks the build at the bridge.
Defence-in-depth authz: Role is checked at the handler level (lines 289, 325) and at the adapter level (AdminCreateTable/AdminDeleteTable), correctly anticipating the future AdminForward RPC path where the adapter cannot trust that the caller already enforced role.
Error vocabulary isolation: translateAdminTablesError maps adapter internals to admin sentinels without leaking raw error text. Typed-sentinel checks before substring churn-matching prevents the false-503 class of bugs. The new TestTranslateAdminTablesError_LeaderPhraseInMiddleOfMessage test pins the HasSuffix behaviour precisely.
JSON safety: Marshal to buffer first, then write headers + body — the Gemini medium issue is correctly resolved and is covered by both unit tests and integration tests.
Request safety: NUL-byte detection, trailing-JSON rejection, and 413 for oversized bodies are all tested with dedicated cases in TestDynamoHandler_CreateTable_RejectsBadJSON and TestDynamoHandler_CreateTable_OversizedBodyReturns413.
Empty body contract: 204 No Content on DELETE verified with require.Empty(t, rec.Body.Bytes()) — SPA can skip body parsing on success.
Whitespace-name trimming correctly added at the handler validation boundary, with two new tests (TrimsWhitespaceFromTableName, WhitespaceOnlyNameRejected) that confirm trimming doesn't weaken the empty-name guard.
Server-level integration: Four new integration tests (TestServer_DynamoCreateTable_FullRoleHappyPath, _ReadOnlyRoleRejected, _MissingCSRFRejected, TestServer_DynamoDeleteTable_FullRoleHappyPath) close the end-to-end middleware stack under a real HTTP server.

Summary: Three stale comments, all trivial one-line fixes. No correctness, security, or behaviour issues found. LGTM with those addressed at your discretion.

handleCreate / handleDelete previously trusted whatever role was embedded in the session JWT. Tokens are valid for one hour, and the role is captured at login time, so an access key downgraded or revoked mid-session would keep mutating tables for up to 60 minutes despite the operator's intent. Codex P1 on PR #635 flagged the gap; the leader-side ForwardServer already does this re-evaluation, the HTTP path now does it too so leader-direct and forwarded write paths share the same authorisation contract.

Pieces:

- Extract RoleStore + MapRoleStore into role_store.go (shared between the HTTP handler and the existing forward server).
- DynamoHandler gains a roles RoleStore field plus WithRoleStore option. Nil keeps the previous trust-the-JWT behaviour for single-tenant deployments where the role map never changes.
- principalForWrite centralises the check: pull principal from context (fail closed if missing), enforce RoleFull on the JWT role, then, when a RoleStore is configured, re-look-up the access key against the live map and reject if absent or downgraded. The downstream call uses the live role.
- ServerDeps.Roles is now plumbed into the dynamo handler via MapRoleStore so production builds pick up the live check automatically.

Tests: three new cases under the live-role-revalidation banner: downgrade (full → read_only), removal (access key absent), and the same revocation flow on the delete path. All three confirm the source layer is never touched on rejection.

Two changes on the AdminForward leader-side dispatcher:

- Codex P2 on PR #635: forwarded 503 leader_unavailable lost the Retry-After: 1 header that the leader-direct HTTP path emits. Add retry_after_seconds to AdminForwardResponse so the follower's bridge can rebuild the same HTTP header (the bridge side wires it in the next PR; the proto + leader populate it here). Direct HTTP path is unaffected: the proto field is only consumed during forwarding.
- Drop the duplicate RoleStore/MapRoleStore declarations from forward_server.go. They now live in role_store.go (added on PR #634 for HTTP-side role revalidation) so both surfaces share one definition.

Tests: extend TestForwardServer_CreateTable_LeaderSteppedDownReturns503 to pin the retry_after_seconds=1 hint.

bootjp · 2026-04-25T17:20:20Z

@claude review

claude · 2026-04-25T17:20:34Z

Claude finished @bootjp's task in 4m 19s

Code Review

Overall Assessment

Very clean PR. The most recent commit (c2bfcd0) wires live role revalidation (Codex P1 from PR #635) correctly. All substantive findings from prior review rounds are now resolved. Three comment-staleness items from the previous Claude round (@claude review at 17:10 UTC) remain in internal/admin/server.go — otherwise LGTM.

Previously Raised Issues — Status

| Issue | Reviewer | Status |
| --- | --- | --- |
| JSON encode-before-write race | Gemini | ✅ Fixed: `writeAdminJSONStatus` marshals to buffer first (`dynamo_handler.go:678-710`) |
| Redundant empty-name check in `AdminCreateTable` | Gemini | ✅ Removed: comment at `dynamodb_admin.go:212-213` confirms dead code cleaned up |
| 413 for oversized body | Codex P2 | ✅ Fixed: `errCreateBodyTooLarge` + `IsMaxBytesError` check (`dynamo_handler.go:306-310`) |
| NUL-byte payload smuggling | Codex P2 | ✅ Fixed: `bytes.IndexByte(raw, 0)` guard before decoding (`dynamo_handler.go:451-453`) |
| Trailing JSON after valid object | Codex P2 | ✅ Fixed: `dec.More()` check after decode (`dynamo_handler.go:465-467`) |
| Leader-churn dispatch errors map to 503 | Codex P2 | ✅ Fixed: `isLeaderChurnError` in `translateAdminTablesError` (`main_admin.go:247-254`) |
| `strings.Contains` → `HasSuffix` in churn matcher | Codex P2 | ✅ Fixed: `main_admin.go:286-289` |
| Structured errors ordered before churn check | Codex P2 | ✅ Fixed: `IsAdminTableAlreadyExists`/`IsAdminValidation` before `isLeaderChurnError` (`main_admin.go:237-246`) |
| GSI `projection_type` canonicalisation | Codex P2 | ✅ Fixed: `validateGSI` uppercases in-place (`dynamo_handler.go:556-563`) |
| Outdated `DynamoHandler` struct comment | Claude (prev) | ✅ Fixed: now describes GET+POST on `/tables`, GET+DELETE on `/tables/{name}` (`dynamo_handler.go:157-163`) |
| Whitespace table name propagating to adapter | Claude (prev) | ✅ Fixed: `in.TableName = strings.TrimSpace(in.TableName)` (`dynamo_handler.go:487`) |
| Live role revalidation on writes | Codex P1 (#635) | ✅ Fixed: `principalForWrite` re-checks against `RoleStore` (`dynamo_handler.go:356-390`); covered by `TestDynamoHandler_CreateTable_LiveRoleRevocation` and `_LiveRoleAccessKeyRemoved` |
| Stale route table in `buildAPIMux` godoc | Claude (prev) | ❌ Still present, see below |
| Stale `Tables` field comment | Claude (prev) | ❌ Still present, see below |
| Stale "Read-only endpoints" comment | Claude (prev) | ❌ Still present, see below |

Remaining Findings (all carry-overs from the 17:10 review)

1. Stale route table in buildAPIMux godoc (internal/admin/server.go:142-148)

// Layout:
//
//	POST   /admin/api/v1/auth/login                 (no auth, rate-limited)
//	POST   /admin/api/v1/auth/logout                (auth required)
//	GET    /admin/api/v1/cluster                    (auth required)
//	GET    /admin/api/v1/dynamo/tables              (auth required, read-only)
//	GET    /admin/api/v1/dynamo/tables/{name}       (auth required, read-only)

Missing POST /admin/api/v1/dynamo/tables (auth required, full role) and DELETE /admin/api/v1/dynamo/tables/{name} (auth required, full role); the (auth required, read-only) annotation on both GET lines is wrong.

2. Stale Tables field comment (internal/admin/server.go:35-39)

// Tables is the read-only DynamoDB admin source. Optional: a nil
// value disables /admin/api/v1/dynamo/tables{,/{name}} ...

TablesSource now includes AdminCreateTable and AdminDeleteTable, so "read-only" is wrong.

3. Stale "Read-only endpoints" comment (internal/admin/server.go:207-208)

// Read-only endpoints share the protect chain so a missing
// session or CSRF token still 401s/403s the same way as a write.

dynamoChain now serves both reads and writes; "Read-only endpoints" is misleading. The rationale (Audit is a no-op for GET) is still valid. A small wording fix suffices: // Dynamo endpoints (reads and writes) share the protect chain….

What Looks Good

Adapter isolation intact: internal/admin has zero adapter imports; all cross-package translation lives in dynamoTablesBridge in main_admin.go. Struct drift breaks the build at the bridge — exactly as intended.
Defence-in-depth authz: Role enforced at the handler (principalForWrite:367) and at the adapter (AdminCreateTable:202, AdminDeleteTable:249). The adapter re-check is the one the future AdminForward RPC path will rely on, so the invariant is already correct for both the current direct path and the forwarded path.
Live role revalidation (new in c2bfcd0): principalForWrite re-evaluates the access key against MapRoleStore(deps.Roles) before touching the source; three unit tests and one integration test pin the revocation, downgrade, and key-removal cases. The principal that reaches the source carries the live role, not the JWT-frozen one.
Error vocabulary isolation: translateAdminTablesError maps adapter internals to admin sentinels without leaking raw text. Typed-sentinel checks before isLeaderChurnError prevent the false-503 class. TestTranslateAdminTablesError_LeaderPhraseInMiddleOfMessage directly pins the HasSuffix behavior.
JSON safety: Marshal-to-buffer pattern in writeAdminJSONStatus correctly prevents partial JSON on encode failure. Covered by both unit and integration tests.
Request safety: NUL-byte detection, trailing-JSON rejection, and 413 for oversized bodies all tested with dedicated cases. Correct ordering: NUL check → JSON decode → dec.More() → field validation.
204 No Content contract: TestDynamoHandler_DeleteTable_HappyPath asserts require.Empty(t, rec.Body.Bytes()) — SPA can skip body parsing on success.
Middleware ordering: CSRF checked before handler body is read; Audit runs before CSRF so CSRF rejections still appear in the audit log. TestServer_DynamoCreateTable_MissingCSRFRejected verifies the gate and checks that the source was not touched.
isLeaderChurnError comment is correctly updated to explain why HasSuffix is required and where the canonical phrase list lives.
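
The marshal-to-buffer pattern credited under "JSON safety" above can be illustrated with a short sketch. It is not the body of writeAdminJSONStatus; `writeJSONStatus` and `render` are hypothetical names, and the plain-text 500 fallback is an assumption made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeJSONStatus marshals v completely before touching the response, so
// an encode failure can still produce a clean 500 instead of a partial body.
func writeJSONStatus(w http.ResponseWriter, status int, v any) {
	buf, err := json.Marshal(v)
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_, _ = w.Write(buf)
}

// render runs writeJSONStatus against an in-memory recorder and returns
// the status code and body that a client would receive.
func render(status int, v any) (int, string) {
	rec := httptest.NewRecorder()
	writeJSONStatus(rec, status, v)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(render(http.StatusCreated, map[string]string{"name": "users"}))
	// A channel cannot be marshalled, so the caller sees a 500 rather
	// than a 200 header followed by truncated JSON.
	fmt.Println(render(http.StatusOK, make(chan int)))
}
```

Encoding straight into the ResponseWriter would commit the status line on the first write, which is the failure mode the buffer avoids.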

Summary: Three stale comment-only fixes remain from the previous round (server.go:35, server.go:142-148, server.go:207). No correctness, security, or behaviour issues found. LGTM once those are addressed.

Three doc-only fixes from Claude's review on PR #634. The behavioural changes from this PR's commit series moved past their original comments:

- ServerDeps.Tables description still said "read-only DynamoDB admin source" but TablesSource now exposes AdminCreateTable and AdminDeleteTable too.
- buildAPIMux godoc layout omitted POST /dynamo/tables and DELETE /dynamo/tables/{name}, and the "(auth required, read-only)" annotation on the GET lines was wrong.
- The dynamoChain wiring comment said "Read-only endpoints share the protect chain..." but the chain now serves both reads and writes; the rationale (Audit no-ops for GET) is unchanged.

No behavioural change; doc-only.

bootjp · 2026-04-25T17:32:19Z

@claude review

claude · 2026-04-25T17:32:30Z

Claude finished @bootjp's task in 2m 27s

Code Review

Overall Assessment

All findings from every prior review round are resolved. No new correctness, security, or behavioural issues found. LGTM.

Previously Raised Issues — Status

Issue	Reviewer	Status
JSON encode-before-write race	Gemini	✅ Fixed — marshal-to-buffer in `writeAdminJSONStatus` (`dynamo_handler.go:679-710`)
Redundant empty-name check in `AdminCreateTable`	Gemini	✅ Removed — comment at `dynamodb_admin.go:212-213` confirms dead code cleaned up
413 for oversized body	Codex P2	✅ Fixed — `errCreateBodyTooLarge` + `IsMaxBytesError` check (`dynamo_handler.go:307-310`)
NUL-byte payload smuggling	Codex P2	✅ Fixed — `bytes.IndexByte(raw, 0)` guard (`dynamo_handler.go:451-453`)
Trailing JSON after valid object	Codex P2	✅ Fixed — `dec.More()` check (`dynamo_handler.go:465-467`)
Leader-churn dispatch errors map to 503	Codex P2	✅ Fixed — `isLeaderChurnError` in `translateAdminTablesError` (`main_admin.go:247-254`)
`strings.Contains` → `HasSuffix` in churn matcher	Codex P2	✅ Fixed — `main_admin.go:286-289`
Structured errors ordered before churn check	Codex P2	✅ Fixed — `IsAdminTableAlreadyExists`/`IsAdminValidation` before `isLeaderChurnError` (`main_admin.go:237-246`)
GSI `projection_type` canonicalisation	Codex P2	✅ Fixed — `validateGSI` uppercases in-place (`dynamo_handler.go:556-563`)
Outdated `DynamoHandler` struct comment	Claude (prev)	✅ Fixed — now describes GET+POST on `/tables`, GET+DELETE on `/tables/{name}` (`dynamo_handler.go:157-163`)
Whitespace table name propagating to adapter	Claude (prev)	✅ Fixed — `strings.TrimSpace` at validation boundary (`dynamo_handler.go:487`)
Live role revalidation on writes	Codex P1 (#635)	✅ Fixed — `principalForWrite` re-checks against `RoleStore` (`dynamo_handler.go:356-390`)
Stale route table in `buildAPIMux` godoc	Claude (prev)	✅ Fixed — now lists POST `/tables` and DELETE `/tables/{name}` (`server.go:144-150`)
Stale `Tables` field comment	Claude (prev)	✅ Fixed — "read-only" removed; now covers list, describe, create, delete (`server.go:35-40`)
Stale "Read-only endpoints" comment	Claude (prev)	✅ Fixed — "Dynamo endpoints (reads and writes) share the protect chain" (`server.go:210-214`)

New Findings

None. After reading all 11 changed files, the code is clean and no new issues were identified.

A few implementation choices that are worth acknowledging as correct:

Auth-before-parse ordering in handleCreate (dynamo_handler.go:301-312): principalForWrite runs before decodeCreateTableRequest. This means a read-only caller gets 403 rather than 413 even on an oversized body, but this is correct — the BodyLimit middleware has already capped the body at the connection level before the handler is entered, so the auth check doesn't bypass the size cap.
strings.EqualFold in buildLegacyGSI (dynamodb_admin.go:379): The adapter uses case-insensitive "INCLUDE" matching when populating NonKeyAttributes. Since validateGSI in the handler already canonicalises to uppercase before the request reaches the adapter, this is always uppercase in practice — the EqualFold is a defence-in-depth guard for direct callers of the adapter method.
NonKeyAttributes copy (main_admin.go:206): append([]string(nil), g.Projection.NonKeyAttributes...) correctly avoids slice aliasing between the admin DTO and the adapter input. Clean.
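
The aliasing that the NonKeyAttributes copy avoids is easy to demonstrate in isolation. The sketch below uses a hypothetical helper name, `cloneStrings`, around the same `append([]string(nil), ...)` idiom quoted above.

```go
package main

import "fmt"

// cloneStrings returns a copy backed by its own array, so later writes to
// the input are not visible through the result. A nil or empty input
// yields nil.
func cloneStrings(in []string) []string {
	return append([]string(nil), in...)
}

func main() {
	dto := []string{"email", "created_at"}
	aliased := dto              // shares the backing array with dto
	copied := cloneStrings(dto) // independent backing array

	dto[0] = "mutated"
	fmt.Println(aliased[0], copied[0]) // prints "mutated email"
}
```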

What Looks Good

Bridge isolation: internal/admin has zero adapter imports; all cross-package translation lives in dynamoTablesBridge in main_admin.go. Struct divergence breaks the build at the bridge.
Defence-in-depth authz: Role checked at the handler (principalForWrite:356) and at the adapter (AdminCreateTable:202, AdminDeleteTable:249). Both re-checks are correct for the future AdminForward RPC path.
Live role revalidation: principalForWrite re-evaluates against MapRoleStore(deps.Roles). Three unit tests + one integration test pin the revocation, downgrade, and key-removal cases.
Error vocabulary isolation: translateAdminTablesError maps adapter internals without leaking raw text. TestTranslateAdminTablesError_LeaderPhraseInMiddleOfMessage directly pins the HasSuffix false-positive protection.
Request safety: NUL-byte detection, trailing-JSON rejection, and 413 for oversized bodies all tested with dedicated cases and in the correct order: NUL → decode → dec.More() → field validation.
Test coverage: 49 handler unit tests + 14 adapter integration tests, covering all major failure modes (413, NUL smuggling, trailing JSON, 503 + Retry-After, 409, 404, 403, role enforcement at adapter). Server-level integration tests close the end-to-end middleware stack.
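
The HasSuffix false-positive protection mentioned under "Error vocabulary isolation" can be shown with a small sketch. The phrase list, `textErr`, and `isLeaderChurn` are hypothetical stand-ins; the canonical phrase set and the real isLeaderChurnError live in the repository and are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// textErr is a minimal error type used only to build example messages.
type textErr string

func (e textErr) Error() string { return string(e) }

// leaderChurnSuffixes is an assumed phrase list for illustration.
var leaderChurnSuffixes = []string{"not leader", "leader not found"}

// isLeaderChurn matches only when the message ends with a canonical
// phrase, so the same words in the middle of a message do not count.
func isLeaderChurn(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	for _, suffix := range leaderChurnSuffixes {
		if strings.HasSuffix(msg, suffix) {
			return true
		}
	}
	return false
}

func main() {
	wrapped := fmt.Errorf("dispatch failed: %w", textErr("not leader"))
	userText := textErr(`validation: attribute "not leader" is reserved for internal use`)
	fmt.Println(isLeaderChurn(wrapped), isLeaderChurn(userText)) // prints "true false"
}
```

A strings.Contains matcher would return true for both messages, which is the false-503 class the review describes.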

Summary: All 15 tracked findings from prior rounds are resolved. No new issues. LGTM — ready to merge.

Stacked on #634. Lays down the proto contract and the leader-side dispatcher for the follower-to-leader forwarding path mandated by design Section 3.3. The follower-side client + bridge wiring lands in a separate PR to keep this one reviewable.

## Summary

- New `proto/admin_forward.proto` with `service AdminForward` carrying an `AdminPrincipal` + opaque JSON payload + the operation enum. Operation-agnostic so adding a new admin endpoint does not require a wire-format bump.
- `internal/admin/forward_server.go` is the leader-side gRPC handler. `validatePrincipal` re-evaluates the inbound access-key against the leader's own `RoleStore` and demotes a follower-claimed full role when the leader sees read-only. This is the design's *adapter side is the source of truth for authz* invariant for forwarded calls (Section 3.3.1).
- `forwardErrorResponse` mirrors the HTTP handler's `writeTablesError` mapping (403 / 503 / 404 / 409 / 400 / 500) so a forwarded call is indistinguishable from a leader-direct call to the SPA. Raw adapter error text never leaves the leader; everything is encoded into the structured response payload.
- `forwarded_from` is echoed into the leader's audit log line on every successful forwarded operation (acceptance criterion 6).

## What is NOT in this PR

- Follower-side AdminForward client + bridge wiring (acceptance criterion 2). The follower's HTTP handler still returns `503 leader_unavailable` + `Retry-After: 1` from #634; this PR is a no-op on the request path until the client wiring lands.
- Election-period 503 handling (criterion 3): comes with the client wiring.
- Rolling-upgrade compatibility flag `admin.leader_forward_v2` (criterion 5): depends on a Raft-level cluster version bump that is independently substantial.
- gRPC server registration in `main.go`: wired in the next PR alongside the client.

## Acceptance criteria coverage

| # | Criterion | This PR |
|---|---|---|
| 1 | Leader direct write | ✓ existing (PR #634) |
| 2 | Follower forwards transparently | ⏳ next PR (client) |
| 3 | Election period 503 + retry | ⏳ next PR (client) |
| 4 | Leader demotes stale full role | ✓ test added |
| 5 | Rolling-upgrade compat flag | ⏳ deferred |
| 6 | `forwarded_from` in audit log | ✓ test added |

## Test plan

- [x] `go build ./...`
- [x] `go vet ./...`
- [x] `golangci-lint run` (admin package: 0 issues)
- [x] `go test ./internal/admin/ -run TestForwardServer -count=1` (11 leader-side tests pass: criterion 1, 4, 6, plus error-mapping coverage)
- [ ] Wire follower side and exercise an end-to-end follower → leader call against a live cluster
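
The demotion behaviour in criterion 4 can be sketched as follows. This is an illustration under assumptions, not the code in forward_server.go: `Role`, `Principal`, `roleStore`, `revalidate`, and `canWrite` are hypothetical names, and the store is modelled as a plain map from access key to role.

```go
package main

import "fmt"

// Role and its two values are hypothetical stand-ins for the admin roles.
type Role string

const (
	RoleReadOnly Role = "read_only"
	RoleFull     Role = "full"
)

// Principal models the identity a follower attaches to a forwarded call.
type Principal struct {
	AccessKey string
	Role      Role
}

// roleStore is an assumed map-backed lookup from access key to live role.
type roleStore map[string]Role

// revalidate ignores the role the follower claimed and returns the
// principal as the leader's own store sees it. An unknown key is rejected.
func revalidate(store roleStore, claimed Principal) (Principal, bool) {
	live, ok := store[claimed.AccessKey]
	if !ok {
		return Principal{}, false
	}
	return Principal{AccessKey: claimed.AccessKey, Role: live}, true
}

// canWrite reports whether the principal may create or delete tables.
func canWrite(p Principal) bool { return p.Role == RoleFull }

func main() {
	store := roleStore{"AKIA_RO": RoleReadOnly, "AKIA_FULL": RoleFull}

	// The follower still believes this key holds the full role.
	stale := Principal{AccessKey: "AKIA_RO", Role: RoleFull}
	live, ok := revalidate(store, stale)
	fmt.Println(ok, live.Role, canWrite(live))
}
```

Because the returned principal carries the live role, any later authorization check on the leader sees the demoted value rather than the follower's claim.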

…artial) (#644)

Phase 1 + 2 of Task #26: the follower-side `LeaderForwarder` client and its integration into the dynamo HTTP handler. Builds on the AdminForward leader-side dispatcher landed via #635.

## Summary

- New `LeaderForwarder` interface (`internal/admin/forward_client.go`) decouples the dynamo HTTP handler from `pb.AdminForwardClient`. The handler stays proto-free; the bridge in `main_admin.go` (next phase) plugs in the gRPC-backed implementation.
- `gRPCForwardClient` translates a `CreateTableRequest` / table-name into an `AdminForwardRequest`, dials via a `GRPCConnFactory` (production wraps `kv.GRPCConnCache`), and re-shapes the response into `ForwardResult` (status, payload, content-type).
- `forwarded_from = nodeID` is populated so the leader's audit log carries the trace (criterion 6, leader-side already shipped in #635).
- Defensive: `status_code == 0` upgrades to `502 Bad Gateway`; missing `ContentType` fills the JSON default. Both surface transport bugs rather than producing silently-malformed SPA responses.
- `ErrLeaderUnavailable` sentinel signals the "no leader known" case so the handler can map to 503 + `Retry-After: 1` (criterion 3).
- `DynamoHandler` gains a `forwarder` field and `WithLeaderForwarder` option. When set, `handleCreate` / `handleDelete` catch `ErrTablesNotLeader` from the source and forward to the leader transparently; the SPA cannot tell forwarded from leader-direct.
- `writeForwardResult` re-emits the leader's structured response verbatim (status + payload + content-type), so a forwarded `409 Conflict` from the leader stays `409` on the wire, with no re-classification.
- `writeForwardFailure` maps `ErrLeaderUnavailable` (election in flight) and gRPC transport errors to 503 + `Retry-After: 1`. `ErrLeaderUnavailable` is intentionally NOT logged at error level (elections are routine); transport errors are logged at LevelError so operators can investigate.

## What is NOT in this PR

- gRPC server registration in `main.go` (production wiring of the `ForwardServer` from #635): comes in the next phase.
- The bridge that wraps `kv.GRPCConnCache` and supplies `LeaderAddressResolver`: same phase.
- Election-period retry-loop on the client side (criterion 3 partial: this PR returns 503 + Retry-After; the SPA / client retries the request; criterion 3 fully needs the production bridge to dial actual leader-discovery).

## Test plan

- [x] `go build ./...`
- [x] `go vet ./...`
- [x] `golangci-lint run` (admin package: 0 issues)
- [x] `go test ./internal/admin/ -count=1 -race`
  - 8 forward-client unit tests: constructor input validation, both Forward operations including principal/payload/op-enum/`forwarded_from` round-trip, `ErrLeaderUnavailable`, dial/RPC errors propagated with `cockroachdb/errors` wrapping, zero status code upgrade, missing content type fallback
  - 9 handler integration tests: transparent forward for create + delete, no-forwarder fallback to 503, `ErrLeaderUnavailable` → 503 + Retry-After, transport error → 503 + log, leader 409 pass-through, leader 503 + Retry-After preserved, role check short-circuits before forward, body validation short-circuits before forward
- [ ] Wire production bridge + register `pb.RegisterAdminForwardServer` in `main.go` and exercise an end-to-end follower → leader call against a real cluster (next PR).

## Acceptance criteria coverage

| # | Criterion | This PR |
|---|---|---|
| 1 | Leader direct write | ✓ (in main since #634) |
| 2 | Follower forwards transparently | ✓ wiring done; needs main.go gRPC registration to take effect |
| 3 | Election-period 503 + retry | ✓ partial: handler returns 503 + Retry-After; full transparency needs the bridge |
| 4 | Leader demotes stale full role | ✓ (in main since #635) |
| 5 | Rolling-upgrade compat flag | ⏳ deferred (cluster-version bump) |
| 6 | `forwarded_from` in audit log | ✓ (in main since #635) |
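
The two defensive defaults listed in the summary (zero status code and missing content type) could be expressed as a small normalisation step. The sketch below is an assumption about shape only: this `ForwardResult` is a local stand-in with guessed field names, and `normalize` and `defaultContentType` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// ForwardResult is a local stand-in for the result the follower hands
// back to its HTTP handler; the field names are assumptions.
type ForwardResult struct {
	StatusCode  int
	Payload     []byte
	ContentType string
}

// defaultContentType is an assumed JSON default for the example.
const defaultContentType = "application/json; charset=utf-8"

// normalize fills in defaults so a malformed leader reply surfaces as a
// visible 502 with a JSON content type instead of a silent zero status.
func normalize(res ForwardResult) ForwardResult {
	if res.StatusCode == 0 {
		res.StatusCode = http.StatusBadGateway
	}
	if res.ContentType == "" {
		res.ContentType = defaultContentType
	}
	return res
}

func main() {
	broken := normalize(ForwardResult{})
	fmt.Println(broken.StatusCode, broken.ContentType)

	// A well-formed leader reply passes through unchanged.
	conflict := normalize(ForwardResult{StatusCode: http.StatusConflict, ContentType: "application/json"})
	fmt.Println(conflict.StatusCode, conflict.ContentType)
}
```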

…de bridge (#648)

Phase 3 of Task #26 / AdminForward design 3.3. Closes the wiring loop opened by #635 (leader-side dispatcher) and #644 (follower-side client + handler integration): the admin HTTP DynamoHandler now hands `ErrTablesNotLeader` writes off to the in-process `LeaderForwarder`, and the leader-side gRPC `AdminForward` service is registered on every node so the forwarded RPC actually lands somewhere.

## Summary

- **`admin.ServerDeps.Forwarder`** (3315bf4): new optional field on the high-level admin server composition surface. `admin.NewServer` calls `WithLeaderForwarder` when set; nil leaves the existing 503 + Retry-After:1 fallback untouched. Single-node / leader-only deployments are unaffected.
- **`main_admin_forward.go`** (f3e529c): production bridge between `kv.GRPCConnCache` and the admin layer's typed `PBAdminForwardClient` (via `pb.NewAdminForwardClient`). Includes `buildLeaderForwarder`, `roleStoreFromFlags`, `adminForwardServerDeps`, and a `readyForRegistration` gate.
- **Runtime wiring** (19f6b70): `runtimeServerRunner.start()` now creates `dynamoServer` BEFORE `startRaftServers`, so the leader-side `ForwardServer` can use it as its `TablesSource`. `startRaftServers` accepts the new `adminForwardServerDeps` bundle and delegates to `registerAdminForwardServer`. `startServers` constructs a single `*kv.GRPCConnCache` and forwards it (with the `Coordinator`) to `startAdminFromFlags`, which in turn builds the production `LeaderForwarder` via `buildAdminLeaderForwarder` and passes it through `ServerDeps.Forwarder`.

## What is NOT in this PR

- Rolling-upgrade compatibility flag (criterion 5): still deferred behind a cluster-version bump.
- Election-period retry loop on the follower's bridge: the handler returns 503 + Retry-After:1, the SPA / curl re-issues, and the bridge dials again on the next attempt. Inline retry inside the handler would just hide the latency from operators.

## Test plan

- [x] `go build ./...`
- [x] `go vet ./...`
- [x] `golangci-lint run` (main + admin packages: 0 issues)
- [x] `go test ./internal/admin/ -count=1 -race`: full admin suite passes (existing 21 forward tests + 1 new `TestServer_ServerDepsForwarderIsWired`)
- [x] `go test . -count=1 -race`: main package passes (4 new bridge / role-store / readyForRegistration tests)
- [ ] `adapter` package times out at 240s on this branch AND on `main`: pre-existing flake unrelated to this PR (verified by checking out main and running the same suite)
- [ ] End-to-end smoke against a 3-node cluster: needs the local Jepsen runner; blocked on the next PR or a manual run

## Acceptance criteria coverage

| # | Criterion | This PR |
|---|---|---|
| 1 | Leader direct write | ✓ (in main since #634) |
| 2 | Follower forwards transparently | ✓ wiring complete; both leader-side gRPC service AND follower-side bridge are now plumbed |
| 3 | Election-period 503 + retry | ✓ handler still returns 503 + Retry-After:1 when no leader is known; the SPA / client retries and the next attempt re-dials |
| 4 | Leader demotes stale full role | ✓ (in main since #635) |
| 5 | Rolling-upgrade compat flag | ⏳ deferred (cluster-version bump) |
| 6 | `forwarded_from` in audit log | ✓ (in main since #635; bridge now stamps `--raftId` onto every forwarded request) |

## Self-review (5 lenses)

1. **Data loss**: No FSM / Raft / Pebble path changes; admin writes still go through the same `AdminCreateTable` / `AdminDeleteTable` Raft proposal once they reach the leader.
2. **Concurrency**: `runner.start()` reorder confirmed safe: `startDynamoDBServer` only depends on `coordinate` + `shardStore` (already constructed); raft TCP listeners are independent of the dynamo HTTP listener. Worst case during the 100 ms-or-so reorder window is the same "no leader yet" 503 the old order would have produced if a request landed before raft converged.
3. **Performance**: One additional `&kv.GRPCConnCache{}` allocation per process; one closure for the resolver. No hot-path changes.
4. **Data consistency**: Forwarded requests re-validate the principal at the leader (criterion 4, already shipped); commit-ts ordering is unchanged because the leader's `AdminCreateTable` path does its own `HLC.Next`.
5. **Test coverage**: 4 new main-package tests + 1 new admin-package server-level test. `TestServer_ServerDepsForwarderIsWired` is the regression test for "future refactor drops the forwarder before it reaches the dynamo handler", exactly the failure mode the wiring change in this PR introduces the risk of.

## Summary by CodeRabbit

* **New Features**
  * Added leader-forwarding capability for admin requests sent to follower nodes
  * Implemented role-based access control for admin API operations
  * Enhanced DynamoDB request routing to automatically direct operations to leader nodes
* **Tests**
  * Added comprehensive integration and unit tests validating leader-forwarding paths

Three findings from Claude review on PR #670:

1. (P1) translateAdminQueuesError did not catch leader-churn errors. AdminDeleteQueue passes the upfront isVerifiedSQSLeader check, then dispatches deleteQueueWithRetry which can hit a kv coordinator that just lost leadership. The resulting kv.ErrLeaderNotFound / adapter.ErrNotLeader / wrapped "not leader" suffixes were not in the translator's switch; they fell to default and the admin handler rendered a generic 500 instead of the spec'd 503 + Retry-After: 1. Added the `case isLeaderChurnError(err)` arm mirroring translateAdminTablesError's identical fix from PR #634.
2. (P2) No tests for translateAdminQueuesError. Mirrored the three Dynamo equivalents in main_admin_test.go:
   - TestTranslateAdminQueuesError_LeaderChurn covers every kv sentinel + canonical wrapped-suffix variant.
   - TestTranslateAdminQueuesError_LeaderPhraseInMiddleOfMessage pins the HasSuffix matcher behaviour against false positives on user-supplied error messages mid-string.
   - TestTranslateAdminQueuesError_UnrelatedErrorPassesThrough confirms the detector does not swallow innocent "leader" mentions outside the canonical phrase set.
3. (Low / polish) SqsList.tsx subtitle leaked the Go file path "adapter/sqs_admin.go" and the internal milestone name "Phase 3.A" to end users; DynamoList / S3List don't do this. Replaced with operator-facing prose describing what the page does.

Verified:

- go test -run TestTranslateAdminQueuesError . passes
- go build ./... clean
- go test -race ./internal/admin/... + go test -race -run TestSQS ./adapter/ pass
- golangci-lint run ./adapter/... ./internal/admin/... ./... (0 issues)
- cd web/admin && npm run lint (tsc --strict) clean
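
The shape of the fix in finding 1 (a leader-churn arm placed after the typed checks and before the default) might look roughly like this. Everything here is a hypothetical sketch: the sentinels, `isLeaderChurn`, `statusFor`, and `wrap` are illustrative names and do not reproduce translateAdminQueuesError.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// Hypothetical sentinels standing in for the adapter / kv error values.
var (
	errForbidden = errors.New("forbidden")
	errNotFound  = errors.New("queue not found")
	errNotLeader = errors.New("not leader")
	errOther     = errors.New("disk full")
)

// isLeaderChurn matches the sentinel or a message ending in "not leader".
func isLeaderChurn(err error) bool {
	return errors.Is(err, errNotLeader) || strings.HasSuffix(err.Error(), "not leader")
}

// statusFor checks typed errors first, then leader churn, and only then
// falls back to a generic 500.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errForbidden):
		return http.StatusForbidden
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	case isLeaderChurn(err):
		return http.StatusServiceUnavailable
	default:
		return http.StatusInternalServerError
	}
}

// wrap adds context while keeping the cause reachable via errors.Is.
func wrap(msg string, err error) error { return fmt.Errorf("%s: %w", msg, err) }

func main() {
	fmt.Println(statusFor(wrap("delete queue", errNotLeader))) // 503
	fmt.Println(statusFor(wrap("delete queue", errOther)))     // 500
}
```

Without the churn arm, the first call would fall through to the default and return 500, which is the behaviour the finding describes.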

Per `docs/design/README.md`'s lifecycle convention, the admin dashboard design doc is now "partial":

- **P1** (DynamoDB CRUD + AdminForward): shipped via #634, #635, #644, #648
- **P2** (S3 buckets list/create/delete/ACL + DescribeTable): shipped via #658, with #669 + #673 in flight
- **P3** (React SPA + embed): shipped via #649, #650
- **P4** (TLS / role / CSRF / operator docs): TLS, role, CSRF are already live in P1; operator docs in #674

Independent of the in-flight slice 2 PRs (#669/#673) and the docs PR (#674): this rename only reflects what is already on main today, plus an "Implementation status" table mapping each phase to the PR it landed in.

## What this PR changes

- `git mv` the design doc from `2026_04_24_proposed_admin_dashboard.md` to `2026_04_24_partial_admin_dashboard.md` so its history follows
- Add an "Implementation status" header table indexing each phase to the PRs that landed it
- List the outstanding open items so future readers know what is still owed against the original proposal:
  - AdminForward acceptance criterion 5 (rolling-upgrade compat flag): deferred
  - S3 object browser: explicitly out of scope per Section 2 Non-goals
  - TLS cert hot-reload: restart-to-rotate is the documented model

When the rolling-upgrade flag lands, the doc gets renamed once more to `2026_04_24_implemented_admin_dashboard.md` per the README's lifecycle convention.

Catches the three locations the prior round (7c63a7b) missed: adapter/sqs_admin_test.go (TestAdminQueueSummary_CreatedAtUsesMillisNotHLC, TestMetaAttributesForAdmin_IncludesQueueArnAndLastModified) and main_admin_test.go (TestTranslateAdminQueuesError_LeaderChurn). Same pattern: the why-rationale stays, only the "fifth-round Claude review on PR #670" / "seventh-round Claude review on PR #670" / "Codex P2 + Claude P1 on PR #670" attributions are dropped. Pre-existing PR #634 references at main_admin_test.go:373 / :397 were added by a prior merged PR and are out of scope here.

gemini-code-assist Bot reviewed Apr 25, 2026

bootjp mentioned this pull request Apr 25, 2026

admin: AdminForward RPC + leader-side dispatcher (P1, partial) #635

Merged

5 tasks

chatgpt-codex-connector Bot reviewed Apr 25, 2026

bootjp force-pushed the feat/admin-dynamo-tables-write branch from 14ae1e9 to 70e16e1 Compare April 25, 2026 14:11

bootjp force-pushed the feat/admin-dynamo-tables-write branch from 70e16e1 to 3b0d05d Compare April 25, 2026 14:20

bootjp force-pushed the feat/admin-dynamo-tables-write branch from 1c14fc1 to 070fa3f Compare April 25, 2026 14:46

chatgpt-codex-connector Bot reviewed Apr 25, 2026

bootjp force-pushed the feat/admin-dynamo-tables-write branch from 84bf258 to dcac6e4 Compare April 25, 2026 15:12

chatgpt-codex-connector Bot reviewed Apr 25, 2026

bootjp added 2 commits April 26, 2026 02:27

Merge branch 'main' into feat/admin-dynamo-tables-write

17c0fcd

Merge branch 'main' into feat/admin-dynamo-tables-write

5783c2a

bootjp merged commit 7d9a607 into main Apr 25, 2026
8 checks passed

bootjp deleted the feat/admin-dynamo-tables-write branch April 25, 2026 19:06

bootjp mentioned this pull request Apr 25, 2026

admin: read-only DynamoDB tables endpoints (P1) #633

Closed

7 tasks

bootjp mentioned this pull request Apr 25, 2026

admin: AdminForward follower-side client + handler integration (P1, partial) #644

Merged

5 tasks

bootjp mentioned this pull request Apr 26, 2026

admin: AdminForward Phase 3 — register gRPC server + wire follower-side bridge #648

Merged

7 tasks

claude Bot mentioned this pull request Apr 26, 2026

feat(sqs/admin): SigV4-bypass admin entrypoints + SPA queues pages #670

Merged

7 tasks

bootjp mentioned this pull request Apr 26, 2026

docs(design): promote admin dashboard from proposed to partial #675

Merged

	// Reject any bytes that follow the first JSON object.
	if dec.More() {
		return CreateTableRequest{}, errors.New("request body has trailing data after the JSON object")
	}
