Skip to content

admin: AdminForward integration for S3 admin ops (P2 slice 2b)#673

Merged
bootjp merged 3 commits intofeat/admin-s3-writesfrom
feat/admin-s3-forward
Apr 26, 2026
Merged

admin: AdminForward integration for S3 admin ops (P2 slice 2b)#673
bootjp merged 3 commits intofeat/admin-s3-writesfrom
feat/admin-s3-forward

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented Apr 26, 2026

P2 slice 2b: a follower-side S3 admin write (POST /buckets, PUT /buckets/{name}/acl, DELETE /buckets/{name}) now hands off to the leader transparently, completing the same end-to-end forwarding contract Dynamo writes received in #644 + #648.

Stacked on #669 (P2 slice 2a). Once #669 merges, this rebases cleanly onto main.

Summary

  • Proto — three new ADMIN_OP enum values (CREATE_BUCKET / DELETE_BUCKET / PUT_BUCKET_ACL) appended after the Dynamo block so existing wire integers stay stable. Regenerated with the pinned protoc 29.3 / protoc-gen-go 1.36.11 / protoc-gen-go-grpc 1.6.1.
  • Leader-sideForwardServer.WithBucketsSource lets deployments wire the S3 dispatcher optionally; Dynamo-only builds keep it nil and the new operations return 501. Three new dispatch arms mirror the leader-direct HTTP path's payload contract (NUL-byte rejection, 64 KiB limit, DisallowUnknownFields, trailing-token rejection, slash-in-name rejection). forwardBucketsErrorResponse mirrors forwardErrorResponse: ErrBucketsForbidden / NotLeader / NotFound / AlreadyExists / NotEmpty + *ValidationError each map to the same HTTP status writeBucketsError produces, so forwarded and leader-direct responses are byte-identical from the SPA's view.
  • Follower-sideLeaderForwarder interface gains ForwardCreateBucket / ForwardDeleteBucket / ForwardPutBucketAcl. ForwardPutBucketAcl carries both the bucket name (URL path) and the new ACL (request body) in one JSON payload — same approach ForwardDeleteTable takes for the table name.
  • Handler integrationS3Handler.WithLeaderForwarder + tryForwardCreateBucket / tryForwardPutBucketAcl / tryForwardDeleteBucket gated on ErrBucketsNotLeader && forwarder != nil. writeForwardResult / writeForwardFailure mirror the Dynamo handler's pattern (nosniff + Cache-Control:no-store + Retry-After:1 on 503).
  • WiringruntimeServerRunner.start() now creates *adapter.S3Server before startRaftServers so the leader-side ForwardServer registration sees both adapters. adminForwardServerDeps.buckets is wired through registerAdminForwardServer. ServerDeps.Forwarder now plumbs into S3Handler too.

What is NOT in this PR

  • Rolling-upgrade compatibility flag (criterion 5) — still deferred behind a cluster-version bump.

Test plan

Acceptance criteria coverage (Section 3.3.2 + 4.1)

# Criterion Status
1 Leader direct write ✓ for S3 (#669 / slice 2a)
2 Follower forwards transparently ✅ for S3 (this PR)
3 Election-period 503 + retry ✅ for S3 (this PR — same fallback as Dynamo)
4 Leader demotes stale full role ✅ for S3 (existing validatePrincipal runs before bucket dispatch)
5 Rolling-upgrade compat flag ⏳ deferred (cluster-version bump)
6 forwarded_from in audit log ✅ for S3 (this PR — leader-side audit log carries it)

Self-review (5 lenses)

  1. Data loss: No FSM / Raft / Pebble path changes. The leader-side dispatcher reuses the same BucketsSource interface that the leader-direct HTTP path uses; the only new code path is the JSON envelope + principal re-validation.
  2. Concurrency: New methods on gRPCForwardClient reuse the existing forward() helper, so connection-cache reuse and ErrLeaderUnavailable signalling behave identically across resource types. The runtimeServerRunner.start() reorder is safe — each adapter listens on its own address and the raft TCP listeners are independent.
  3. Performance: One additional gRPC round-trip on follower-side admin writes; identical to Dynamo's path.
  4. Data consistency: Leader-side dispatcher applies the same strict body decode the leader-direct HTTP path does (NUL-byte rejection, trailing-token rejection, slash-in-name rejection). forwardBucketsErrorResponse produces byte-identical envelopes to writeBucketsError so forwarded and leader-direct responses are indistinguishable.
  5. Test coverage: 19 new tests — 9 forward-server (happy paths, 501 no-source, bad-JSON, already-exists, not-empty, slash-in-name, missing-acl, payload-too-large sweep), 4 forward-client (3 happy-path payload shapes + ErrLeaderUnavailable for all three), 6 handler-integration (forwarded create/delete/put-acl happy + leader-unavailable + transport-error + 3-axis non-NotLeader gate sweep).

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 26, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 55c6909f-52ca-421c-a18b-656a628f3029

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/admin-s3-forward

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Claude finished @bootjp's task in 4m 3s —— View job


PR Review: admin: AdminForward integration for S3 admin ops (P2 slice 2b)

Todo

  • Read changed files and gather context
  • Review proto changes
  • Review forward_server.go changes
  • Review forward_client.go changes
  • Review s3_handler.go changes
  • Review wiring (main.go, main_admin_forward.go, server.go)
  • Review test coverage
  • Post final review summary

Summary

The PR correctly implements follower→leader forwarding for the three S3 admin write operations, faithfully mirrors the Dynamo forwarding contract, and includes a solid test suite (19 new tests). The wiring change in main.go is correct and well-justified. A few issues worth addressing before landing:


Issues Found

1. logUnexpectedSourceError logs bucket name under the "table" slog key — misleading for operators

File: internal/admin/forward_server.go:511

func (s *ForwardServer) logUnexpectedSourceError(ctx context.Context, op, table, forwardedFrom string, err error) {
    // ...
    s.logger.LogAttrs(ctx, slog.LevelError, "admin_forward_"+op+"_failed",
        slog.String("table", table),  // ← "table" key, bucket name value

When called for bucket operations (create_bucket, delete_bucket, put_bucket_acl), the slog field key is "table" but the value is a bucket name. Any log query on table= would find spurious bucket-operation entries, and queries for bucket= would miss the error logs entirely. A simple fix: rename the parameter to resource and use that as the log key.

Fix this →

2. handleCreateBucket skips the whitespace-padding validation that validateCreateBucketRequest performs

File: internal/admin/forward_server.go:264

The HTTP handleCreate path calls validateCreateBucketRequest(body) which includes two checks:

  1. strings.TrimSpace(in.BucketName) == "" — catches empty/whitespace-only names
  2. in.BucketName != strings.TrimSpace(in.BucketName) — rejects "bucket " or " bucket"

The forward handler only performs check (1). A bucket name like "bucket " would be accepted by the forward server, then rejected by the adapter's validateS3BucketName (spaces aren't in isValidS3BucketRune), but the error message would be the generic "bucket name contains unsupported characters" instead of the clearer "bucket_name must not have leading or trailing whitespace". This is a validation parity gap between the forwarded and leader-direct paths.

Fix: call validateCreateBucketRequest(body) after decoding in handleCreateBucket, exactly as handleCreate delegates to decodeCreateTableRequest for the full validation.

Fix this →

3. TestForwardServer_CreateBucket_NoBucketsSourceReturns501 does not cover DELETE and PUT_ACL

File: internal/admin/forward_server_test.go:477

The nil-source 501 path exists in all three bucket handlers, but only CREATE_BUCKET is tested. DELETE_BUCKET and PUT_BUCKET_ACL have identical if s.buckets == nil guards that are untested. A table-driven test covering all three operations (like TestForwardServer_BucketOps_PayloadTooLargeReturns413 already does for payload size) would lock these down.

Fix this →


Observations (non-blocking)

Proto version change (v7.34.0v5.29.3): This is a correction. protoc v5.29.3 is the same as libprotoc 29.3 specified in CLAUDE.md. The base branch had wrong generated files; this PR fixes them to the pinned toolchain. ✓

RetryAfterSeconds dropped in ForwardResult conversion (forward_client.go:214-218): resp.GetRetryAfterSeconds() is not carried into ForwardResult; the follower infers Retry-After: 1 solely from status code 503. Since the leader only ever sets RetryAfterSeconds = 1 today, this works correctly — but if the value diverges, the follower silently ignores it. This is a pre-existing pattern from the Dynamo handler (not introduced here), but worth a follow-up issue.

writeForwardResult / writeForwardFailure duplication (s3_handler.go:544-580, dynamo_handler.go:470-514): The two methods are near-identical. Extracting to package-level helpers would reduce the drift surface as the forwarding contract evolves. Low priority; the current mirroring is explicit and documented.

Startup order change in main.go: Correct and well-explained. Moving startS3Server before startRaftServers is required for the adminForwardServerDeps.buckets field to be non-nil when registerAdminForwardServer runs. ✓


Test coverage assessment

The 19 new tests cover the happy paths and the major error arms well. The TestForwardServer_BucketOps_PayloadTooLargeReturns413 table-driven sweep across all three operations is particularly good. The gaps noted above (issue #3, plus no forwarded-path test for ErrBucketsNotLeader racing mid-leader-dispatch) are the only missing cases.


Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces S3 bucket administrative operation forwarding, enabling follower nodes to delegate bucket creation, deletion, and ACL modifications to the leader via gRPC. The implementation includes updates to the gRPC client and server, Protobuf definitions, and the S3 HTTP handler. Feedback was provided to align the validation and decoding logic in the handleCreateBucket handler with the existing HTTP path to ensure consistency.

Comment on lines +249 to +267
if bytes.IndexByte(payload, 0) >= 0 {
return rejectForward(http.StatusBadRequest, "invalid_body",
"create-bucket payload contains a NUL byte")
}
dec := json.NewDecoder(bytes.NewReader(payload))
dec.DisallowUnknownFields()
var body CreateBucketRequest
if err := dec.Decode(&body); err != nil {
return rejectForward(http.StatusBadRequest, "invalid_body",
"create-bucket payload is not valid JSON")
}
if dec.More() {
return rejectForward(http.StatusBadRequest, "invalid_body",
"create-bucket payload has trailing data")
}
if strings.TrimSpace(body.BucketName) == "" {
return rejectForward(http.StatusBadRequest, "invalid_body",
"bucket_name is required")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The validation and decoding logic in handleCreateBucket diverges from the leader-direct HTTP path (handleCreate in s3_handler.go). Reusing the existing validation helper ensures consistency. Additionally, ensure the request body size is limited using http.MaxBytesReader to prevent Denial of Service attacks from oversized payloads, as required by repository security rules.

Suggested change
if bytes.IndexByte(payload, 0) >= 0 {
return rejectForward(http.StatusBadRequest, "invalid_body",
"create-bucket payload contains a NUL byte")
}
dec := json.NewDecoder(bytes.NewReader(payload))
dec.DisallowUnknownFields()
var body CreateBucketRequest
if err := dec.Decode(&body); err != nil {
return rejectForward(http.StatusBadRequest, "invalid_body",
"create-bucket payload is not valid JSON")
}
if dec.More() {
return rejectForward(http.StatusBadRequest, "invalid_body",
"create-bucket payload has trailing data")
}
if strings.TrimSpace(body.BucketName) == "" {
return rejectForward(http.StatusBadRequest, "invalid_body",
"bucket_name is required")
}
if bytes.IndexByte(payload, 0) >= 0 {
return rejectForward(http.StatusBadRequest, "invalid_body", "request body contains a NUL byte")
}
dec := json.NewDecoder(bytes.NewReader(payload))
dec.DisallowUnknownFields()
var body CreateBucketRequest
if err := dec.Decode(&body); err != nil {
return rejectForward(http.StatusBadRequest, "invalid_body", "request body is not valid JSON")
}
if dec.More() {
return rejectForward(http.StatusBadRequest, "invalid_body", "request body has trailing data after the first JSON value")
}
if err := validateCreateBucketRequest(body); err != nil {
return rejectForward(http.StatusBadRequest, "invalid_body", err.Error())
}
References
  1. When handling HTTP requests, always limit the size of the request body (e.g., using http.MaxBytesReader) to prevent Denial of Service attacks from oversized payloads.

bootjp added a commit that referenced this pull request Apr 26, 2026
Three findings from claude + Gemini review on 70213e0:

1) **Issue 1 — `logUnexpectedSourceError` slog key was "table"**
   When called for bucket operations the field key was `"table"` but
   the value was a bucket name. Log queries on `table=` would find
   spurious bucket-error entries; queries on `bucket=` would miss
   the audit lines entirely. Renamed the parameter and the slog
   key to `resource` so the same forensic query works for both
   resource families.

2) **Gemini security-high + Claude Issue 2 — validation divergence**
   `handleCreateBucket` only checked `strings.TrimSpace(name) == ""`
   while the HTTP path's `validateCreateBucketRequest` also rejects
   whitespace-padded names like `" bucket "`. The forward path
   would have accepted them, then hit the adapter's
   `validateS3BucketName` with a less actionable error message —
   different SPA behaviour depending on whether the request was
   leader-direct or follower-forwarded. Fix: call
   `validateCreateBucketRequest(body)` in `handleCreateBucket`
   exactly like `decodeCreateTableRequest` is shared between the
   table-side handlers.

3) **Issue 3 — only CREATE_BUCKET tested for nil-BucketsSource → 501**
   `DELETE_BUCKET` and `PUT_BUCKET_ACL` had identical
   `if s.buckets == nil` guards but no coverage. Replaced
   `TestForwardServer_CreateBucket_NoBucketsSourceReturns501` with
   a table-driven `TestForwardServer_BucketOps_NoBucketsSourceReturns501`
   sweeping all three operations. A future op added without the
   nil guard fails CI immediately.

Plus a new `TestForwardServer_CreateBucket_RejectsWhitespacePaddedName`
that pins the validation-parity fix from #2.

Rebased onto the latest `feat/admin-s3-writes` (which now carries
the slice 2a review fixes) so the stack stays clean.
@bootjp bootjp force-pushed the feat/admin-s3-forward branch from 70213e0 to e5ef0a9 Compare April 26, 2026 12:21
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

@claude review

e5ef0a9 (rebased onto the updated #669 branch) addresses all three findings:

  1. Issue 1: logUnexpectedSourceError parameter + slog key renamed from tableresource so log queries work uniformly across Dynamo + S3.
  2. Gemini security-high + Claude Issue 2: handleCreateBucket now calls validateCreateBucketRequest(body) for parity with the HTTP path. Whitespace-padded bucket names produce the same 400 leading or trailing whitespace message as a leader-direct call. New TestForwardServer_CreateBucket_RejectsWhitespacePaddedName pins the fix.
  3. Issue 3: replaced the single-op TestForwardServer_CreateBucket_NoBucketsSourceReturns501 with a table-driven TestForwardServer_BucketOps_NoBucketsSourceReturns501 sweeping CREATE / DELETE / PUT_ACL.

Build / vet / lint / admin tests all clean.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Claude finished @bootjp's task in 1m 52s —— View job


Follow-up Review: Verifying fixes from e5ef0a9

Todo

  • Read current state of changed files
  • Verify Issue 1: logUnexpectedSourceError tableresource rename
  • Verify Issue 2: handleCreateBucket validation parity + TestForwardServer_CreateBucket_RejectsWhitespacePaddedName
  • Verify Issue 3: Table-driven TestForwardServer_BucketOps_NoBucketsSourceReturns501
  • Inspect code for any remaining concerns

All three findings from the previous review are correctly addressed in e5ef0a9.

Issue 1 — slog key tableresource

logUnexpectedSourceError (line 518) now takes a resource parameter and emits slog.String("resource", resource). The comment at lines 514–517 explains the rationale for future readers. Both Dynamo and S3 call-sites pass the right value.

Issue 2 — validateCreateBucketRequest parity + new test ✅

handleCreateBucket (line 271) calls validateCreateBucketRequest(body) immediately after the trailing-token check, so whitespace-padded names get the same "bucket_name must not have leading or trailing whitespace" 400 that the leader-direct path returns. TestForwardServer_CreateBucket_RejectsWhitespacePaddedName (line 520) pins this with a Contains("leading or trailing whitespace") assertion. The parity is tight — the test exercises exactly the second branch of validateCreateBucketRequest.

Note: handleDeleteBucket and handlePutBucketAcl don't apply the same whitespace check on the bucket name, but that's intentional and correct — for those two operations the name is extracted from the URL path parameter on the follower side, which the HTTP router already percent-decodes/normalises; whitespace padding in a path segment isn't a realistic source of ambiguity there.

Issue 3 — table-driven 501 sweep ✅

TestForwardServer_BucketOps_NoBucketsSourceReturns501 (line 477) is now a three-case table covering CREATE_BUCKET, DELETE_BUCKET, and PUT_BUCKET_ACL. Each case asserts StatusNotImplemented + payload contains "not_implemented". The comment at lines 483–484 calls out the original single-op gap.


Additional pass on the diff

Nothing new to flag. A few minor observations (all non-blocking, no action needed):

  • handleDeleteBucket and handlePutBucketAcl both call logUnexpectedSourceError with the corrected resource parameter (lines 329, 388). ✓
  • The validateCreateBucketRequest helper is package-internal (func validateCreateBucketRequest) so the forward_server can call it directly without any cross-package coupling. ✓
  • The comment on logUnexpectedSourceError (lines 504–517) now describes both Dynamo and S3 usage and explicitly records the "table""resource" rationale — this is a good pattern for a shared helper that serves multiple resource families.

LGTM. The PR correctly closes all three review findings and the code is in good shape to land.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70213e047c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread main_admin_forward.go
Comment on lines +95 to +96
if deps.buckets != nil {
srv = srv.WithBucketsSource(deps.buckets)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Register AdminForward for S3-only deployments

This S3 wiring block is unreachable whenever Dynamo is disabled because readyForRegistration() still requires tables != nil; in a cluster started with --dynamoAddr empty but S3 enabled, leaders never register AdminForward, so follower-side S3 writes now attempt forwarding and consistently fail with gRPC Unimplemented (surfacing as 503) instead of transparently reaching the leader. Allow registration when roles exist and at least one source (tables or buckets) is present, and keep unsupported ops returning 501.

Useful? React with 👍 / 👎.

Comment thread internal/admin/forward_server.go Outdated
Comment on lines +264 to +266
if strings.TrimSpace(body.BucketName) == "" {
return rejectForward(http.StatusBadRequest, "invalid_body",
"bucket_name is required")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce create-bucket validation parity on forwarded calls

The forwarded create-bucket handler only checks TrimSpace(bucket_name) == "" and skips validateCreateBucketRequest, so requests with leading/trailing whitespace can take a different path than leader-direct HTTP (which rejects them as invalid_body before hitting the source). This breaks the stated follower/leader response parity and can become a functional divergence if a BucketsSource implementation is less strict than the current adapter.

Useful? React with 👍 / 👎.

bootjp added a commit that referenced this pull request Apr 26, 2026
Two findings from claude review #675:

1. **P2 was marked ✅ shipped but #669 / #673 are still in flight.**
   PR description correctly noted slices 2a/2b haven't merged, but
   the table cell read "✅ shipped". Future readers reading the
   table alone would conclude main has the S3 write paths today,
   which is wrong. Fixed: P2 is now 🟡 partial with the in-flight
   PRs explicitly called out, mirroring the P4 row's shape.

2. **P1 row claimed criteria 1–6 but criterion 5 is in the
   outstanding items as deferred.** The two cells contradicted each
   other. Updated P1 to "criteria 1–4 + 6 (criterion 5 deferred —
   see outstanding items)" so the table is consistent with the
   outstanding-items section below.

The cosmetic notes (Last-updated field, verbose Status header) are
left for a future cleanup; they are not factually wrong, only
non-conforming to the README convention. Calling them out as a
follow-up rather than baking the precedent into this doc.
@bootjp bootjp force-pushed the feat/admin-s3-writes branch from ddc3b9b to 5a67834 Compare April 26, 2026 12:26
bootjp added a commit that referenced this pull request Apr 26, 2026
P4 deliverable from docs/design/2026_04_24_proposed_admin_dashboard.md
Section 8: a single self-contained operator-facing reference for
the admin HTTP listener.

Covers:
- Quick-start invocation for a loopback dev cluster
- Required + optional flag reference, with explanations of why each
  guard exists (TLS hard-error, rolling-update caveats, etc.)
- TLS topologies (loopback / TLS / discouraged plaintext-non-loopback)
- Role model + how live role re-validation works on every state-
  changing request
- The full /admin/api/v1/* surface (auth + cluster + dynamo +
  s3, including the slice 2 write paths and the AdminForward
  forwarding contract)
- forwarded_from audit log shape and why it carries the follower's
  node ID
- Troubleshooting guide for the common failure modes operators
  hit during initial bring-up (missing credentials, TLS hard-error,
  401 ambiguity, 503 leader_unavailable, bucket_not_empty,
  blank-screen / placeholder bundle)
- Cross-references to the design doc and architecture overview

The Section 8 P4 plan also called out "TLS, read-only role, CSRF" as
deliverables; those are already implemented (see config.go's
validateTLS / validateAccessKeyRoles, the role gates in
DynamoHandler.principalForWrite + S3Handler.principalForWrite, and
the CSRFDoubleSubmit middleware. This doc stitches them into a
single reference operators can land on without reading code.

Stacked on #669 (P2 slice 2a) + #673 (P2 slice 2b) so the API-
surface table can describe S3 write endpoints as shipped. Once both
land in main, this rebases cleanly.
EOF
)
bootjp added 2 commits April 26, 2026 21:30
Slice 2b of P2 (docs/design/2026_04_24_proposed_admin_dashboard.md
Section 3.3.2 + 4.1): a follower-side S3 admin write
(POST /buckets, PUT /buckets/{name}/acl, DELETE /buckets/{name})
now hands off to the leader transparently, completing the same
end-to-end forwarding contract Dynamo writes received in #644 +

Stacked on #669 (P2 slice 2a). Once #669 merges, this rebases
cleanly onto main.

Proto (proto/admin_forward.proto):
- Three new ADMIN_OP enum values (CREATE_BUCKET / DELETE_BUCKET /
  PUT_BUCKET_ACL) appended after the Dynamo block so existing
  wire-format integers stay stable. Regenerated with the pinned
  protoc 29.3 / protoc-gen-go 1.36.11 / protoc-gen-go-grpc 1.6.1.

Leader-side ForwardServer (internal/admin/forward_server.go):
- WithBucketsSource lets deployments wire the S3 dispatcher
  optionally — Dynamo-only builds keep the receiver nil and the
  three new operations return 501 NotImplemented.
- Three new dispatch arms: handleCreateBucket / handleDeleteBucket /
  handlePutBucketAcl. Each one mirrors the leader-direct HTTP
  path's payload contract (NUL-byte rejection, 64 KiB limit,
  DisallowUnknownFields, trailing-token rejection, slash-in-name
  rejection) so a hostile follower cannot smuggle a payload past
  validations the leader-direct path enforces.
- forwardBucketsErrorResponse mirrors forwardErrorResponse on the
  Dynamo side: ErrBucketsForbidden / NotLeader / NotFound /
  AlreadyExists / NotEmpty + ValidationError each map to the same
  HTTP status the leader-direct writeBucketsError produces, so
  forwarded and leader-direct responses are byte-for-byte
  indistinguishable from the SPA's view.
- isStructuredSourceError extended to recognise the bucket
  sentinels so they are NOT logged at LevelError on the leader.
- Forward's switch was extracted into dispatchForward to keep the
  parent function under cyclop's 10-branch ceiling as the operation
  enum grew.

Follower-side LeaderForwarder (internal/admin/forward_client.go):
- Interface gains ForwardCreateBucket / ForwardDeleteBucket /
  ForwardPutBucketAcl. PutBucketAcl carries both the bucket name
  (from the URL path) and the new ACL (from the request body) in
  one JSON payload — same approach handleDeleteBucket takes for
  the bucket name.
- gRPCForwardClient methods reuse the existing forward() helper
  for transport, so connection-cache reuse and ErrLeaderUnavailable
  signalling behave identically across resource types.

Handler integration (internal/admin/s3_handler.go):
- New `forwarder LeaderForwarder` field + WithLeaderForwarder method.
- handleCreate / handlePutACL / handleDelete now consult tryForward*
  helpers when the source returned ErrBucketsNotLeader; the helpers
  are gated on `errors.Is(err, ErrBucketsNotLeader) && forwarder != nil`
  so a leader-direct rejection (already-exists, not-found, etc.) is
  never re-applied at the leader.
- writeForwardResult / writeForwardFailure mirror the Dynamo handler's
  pattern: nosniff + Cache-Control:no-store + Retry-After:1 on 503.
  ErrLeaderUnavailable does NOT log at LevelError (elections are
  routine); transport errors do log so operators can investigate.

Wiring (main.go + main_admin_forward.go):
- adminForwardServerDeps gains a `buckets` field; readyForRegistration
  still requires only TablesSource + RoleStore (so cluster-only or
  Dynamo-only builds keep registering Dynamo forwarding without S3).
- runtimeServerRunner.start() now creates *adapter.S3Server BEFORE
  startRaftServers (in addition to dynamoServer) so the leader-side
  ForwardServer registration sees both adapters. The reorder is safe:
  each adapter listens on its own address and the raft TCP listeners
  are independent.
- ServerDeps.Forwarder now plumbs through buildS3HandlerForDeps too,
  so the follower's S3Handler picks up the same LeaderForwarder
  instance the Dynamo handler does.

Tests:
- 9 forward-server tests covering the three new bucket operations:
  happy path / no-BucketsSource→501 / bad-JSON 400 / already-exists
  409 / not-empty 409 / slash-in-name 400 / missing-acl 400 /
  payload-too-large 413 (sweep over all three ops).
- 4 forward-client tests covering ForwardCreateBucket /
  ForwardDeleteBucket / ForwardPutBucketAcl happy-path payload
  shapes + ErrLeaderUnavailable on no-leader.
- stubLeaderForwarder gains 3 bucket-side forward methods so
  existing dynamo tests still satisfy the LeaderForwarder
  interface, and the new stub fields let bucket-handler tests
  verify the forward arguments.
- 6 handler integration tests on S3Handler.tryForward*: forwarded
  create / delete / put-acl happy paths (replay leader's status
  + payload + content-type), forwarder ErrLeaderUnavailable → 503
  + Retry-After, transport-error → 503 + no leakage, and a 3-axis
  gate sweep proving the forwarder is NOT invoked on
  AlreadyExists / Forbidden / generic source errors.

Closes design 3.3.2 acceptance criteria 2 (transparent forwarding)
+ 6 (forwarded_from in audit log) for S3 admin writes; criterion 3
(election-period 503 + retry) is also live for S3 because the
existing tryForward helpers reuse the same fallback paths.
Three findings from claude + Gemini review on 70213e0:

1) **Issue 1 — `logUnexpectedSourceError` slog key was "table"**
   When called for bucket operations the field key was `"table"` but
   the value was a bucket name. Log queries on `table=` would find
   spurious bucket-error entries; queries on `bucket=` would miss
   the audit lines entirely. Renamed the parameter and the slog
   key to `resource` so the same forensic query works for both
   resource families.

2) **Gemini security-high + Claude Issue 2 — validation divergence**
   `handleCreateBucket` only checked `strings.TrimSpace(name) == ""`
   while the HTTP path's `validateCreateBucketRequest` also rejects
   whitespace-padded names like `" bucket "`. The forward path
   would have accepted them, then hit the adapter's
   `validateS3BucketName` with a less actionable error message —
   different SPA behaviour depending on whether the request was
   leader-direct or follower-forwarded. Fix: call
   `validateCreateBucketRequest(body)` in `handleCreateBucket`
   exactly like `decodeCreateTableRequest` is shared between the
   table-side handlers.

3) **Issue 3 — only CREATE_BUCKET tested for nil-BucketsSource → 501**
   `DELETE_BUCKET` and `PUT_BUCKET_ACL` had identical
   `if s.buckets == nil` guards but no coverage. Replaced
   `TestForwardServer_CreateBucket_NoBucketsSourceReturns501` with
   a table-driven `TestForwardServer_BucketOps_NoBucketsSourceReturns501`
   sweeping all three operations. A future op added without the
   nil guard fails CI immediately.

Plus a new `TestForwardServer_CreateBucket_RejectsWhitespacePaddedName`
that pins the validation-parity fix from #2.

Rebased onto the latest `feat/admin-s3-writes` (which now carries
the slice 2a review fixes) so the stack stays clean.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

Rebased onto updated feat/admin-s3-writes (which now has #669 review fixes + a main rebase to absorb #661 keyviz). No code change in this push, just absorbing the new base.

@bootjp bootjp force-pushed the feat/admin-s3-forward branch from e5ef0a9 to cafac05 Compare April 26, 2026 12:32
bootjp added a commit that referenced this pull request Apr 26, 2026
P4 deliverable from docs/design/2026_04_24_proposed_admin_dashboard.md
Section 8: a single self-contained operator-facing reference for
the admin HTTP listener.

Covers:
- Quick-start invocation for a loopback dev cluster
- Required + optional flag reference, with explanations of why each
  guard exists (TLS hard-error, rolling-update caveats, etc.)
- TLS topologies (loopback / TLS / discouraged plaintext-non-loopback)
- Role model + how live role re-validation works on every state-
  changing request
- The full /admin/api/v1/* surface (auth + cluster + dynamo +
  s3, including the slice 2 write paths and the AdminForward
  forwarding contract)
- forwarded_from audit log shape and why it carries the follower's
  node ID
- Troubleshooting guide for the common failure modes operators
  hit during initial bring-up (missing credentials, TLS hard-error,
  401 ambiguity, 503 leader_unavailable, bucket_not_empty,
  blank-screen / placeholder bundle)
- Cross-references to the design doc and architecture overview

The Section 8 P4 plan also called out "TLS, read-only role, CSRF" as
deliverables; those are already implemented (see config.go's
validateTLS / validateAccessKeyRoles, the role gates in
DynamoHandler.principalForWrite + S3Handler.principalForWrite, and
the CSRFDoubleSubmit middleware. This doc stitches them into a
single reference operators can land on without reading code.

Stacked on #669 (P2 slice 2a) + #673 (P2 slice 2b) so the API-
surface table can describe S3 write endpoints as shipped. Once both
land in main, this rebases cleanly.
EOF
)
bootjp added a commit that referenced this pull request Apr 26, 2026
Latest claude review on #675 flagged that lines 3 and 6 still
described P2 as fully shipped:

  Line 3: "P1, **P2**, and P3 of the implementation plan have shipped"
  Line 6: "after P1 + **P2** + P3 landed"

But the status table I added in 7617daf correctly marks P2 as
🟡 partial (read-only slice 1 in #658 landed; write paths #669
and #673 still in flight). The header prose contradicting the
table is exactly the kind of inconsistency that triggers the
rename to `_partial_` in the first place — readers skimming the
top of the doc would conclude P2 was done.

Rewrote the Status line and the parenthetical on Last updated to
match the table, calling out P1 + P3 as fully shipped, P2 as the
read-only slice only, and P4 as TLS/role/CSRF live with the
operator doc landing now. Pointed readers at the per-phase status
table for the breakdown.

No other changes; the table itself was already correct.
Codex on PR #673 caught that `readyForRegistration()` predicate
gated registration on `tables != nil`, which meant a cluster
started with `--dynamoAddr` empty but S3 enabled never registered
the AdminForward gRPC service at all. Followers attempting to
forward S3 admin writes (CREATE_BUCKET / DELETE_BUCKET /
PUT_BUCKET_ACL) hit gRPC `Unimplemented` and surfaced as 503 to
the SPA, instead of transparently reaching the leader. The S3
write path was advertised as "follower-forwarded" by P2 slice 2b
but actually only worked when Dynamo was also configured.

Fix is symmetric: relax the predicate to require RoleStore plus
**at least one** source (TablesSource or BucketsSource) and have
the dispatcher reject ops for the missing surface with the same
501 it already returns when the inverse source is nil.

Per CLAUDE.md "test the bug first" convention, add the
regression test before the fix:

  TestForwardServer_DynamoOps_NoTablesSourceReturns501
    — symmetric with the existing
    TestForwardServer_BucketOps_NoBucketsSourceReturns501.
    With nil source, CREATE_TABLE used to slip past the source
    nil-deref because validation rejected the payload first;
    DELETE_TABLE panicked on `s.source.AdminDeleteTable`. Both
    now return 501.

Plus three readyForRegistration assertions:
  - buckets-only deployment registers (the regression case)
  - missing-roles + buckets-only does NOT register
  - tables-only / both-sources continue to register

While in there, the per-op nil-source guards I added tipped the
linter's dupl detector. Refactored the dispatch path to keep
forward_server tidy:

1. **`checkOpAvailability`**: pulls the per-op nil-source switch
   out of `dispatchForward` so the latter stays under the
   cyclomatic ceiling. One place to update when a future op
   joins the enum, instead of five.

2. **`decodeNamedPayload`**: the {"name": "..."} JSON shape is
   shared between Dynamo handleDelete and S3 handleDeleteBucket
   (NUL guard, DisallowUnknownFields, dec.More(), empty-name
   reject, slash reject — every guard had a paired comment
   citing the original review that introduced it). Centralising
   the helper keeps that audit trail in one place; the opLabel
   parameter ("delete" / "delete-bucket") preserves the
   path-specific wording in error responses.

3. **`auditDeleteSuccess`**: handleDelete and handleDeleteBucket
   each emitted a 7-line `admin_audit` block that differed only
   in the operation label and the `table` vs `bucket` field
   name. Folded into one helper so a future delete-shaped op
   automatically picks up the same field set.

4. **`notImplementedForwardResponse`**: deduplicates the 501
   construction across the five Dynamo + S3 op cases.

No behaviour change to any of the existing successful paths;
all forward_server tests pass unchanged. Tests added: the
S3-only registration assertion + the new symmetric
DynamoOps_NoTablesSourceReturns501 sweep.

go test -race ./internal/admin/ . — passes.
golangci-lint run ./internal/admin/... . — 0 issues.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

@claude review

Addressed the Codex P1 finding on main_admin_forward.go:96:

  • readyForRegistration now allows S3-only deployments. Previously gated on tables != nil, so a cluster started with --dynamoAddr empty but S3 enabled never registered the AdminForward gRPC service at all — follower S3 writes hit Unimplemented / 503. Predicate is now roles != nil && (tables != nil || buckets != nil). Symmetric three-line assertion added in TestAdminForwardServerDeps_ReadyForRegistration (S3-only registers, missing roles still rejected, both-sources still registers).
  • Dispatch handles either source being nil. Pulled the source-availability check into checkOpAvailability and added TestForwardServer_DynamoOps_NoTablesSourceReturns501 — the symmetric counterpart of the existing bucket version. With nil source, CREATE_TABLE / DELETE_TABLE now return 501 cleanly instead of panicking on the source deref.
  • Refactored to keep lint quiet. decodeNamedPayload factors the {name} JSON shape shared by handleDelete + handleDeleteBucket (NUL guard, DisallowUnknownFields, dec.More(), name validation). auditDeleteSuccess factors the audit-log emission. notImplementedForwardResponse factors the 501 construction.

The gemini security-high finding on forward_server.go:273 (handleCreateBucket validation parity) — current code already calls validateCreateBucketRequest after the NUL / DisallowUnknownFields / dec.More() guards (added in cafac05). The suggestion's code matches what's there.

Latest commit: f3e9278.

go test -race ./internal/admin/ . — passes.
golangci-lint run — 0 issues.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Claude finished @bootjp's task in 2m 12s —— View job


Follow-up Review: Verifying Codex P1 fix (f3e9278)

Todo

  • Read changed files and gather context
  • Verify readyForRegistration S3-only predicate fix
  • Verify checkOpAvailability nil-source panic guard + TestForwardServer_DynamoOps_NoTablesSourceReturns501
  • Verify refactors: decodeNamedPayload, auditDeleteSuccess, notImplementedForwardResponse
  • Inspect for any new concerns introduced by the refactor
  • Post final review summary

All three changes from the trigger comment are correctly implemented. Details:

Codex P1 — readyForRegistration S3-only predicate ✅

main_admin_forward.go:755: predicate changed from d.tables != nil && d.roles != nil to d.roles != nil && (d.tables != nil || d.buckets != nil). The new dummyBucketsSource stub in main_admin_forward_test.go is minimal and correct — it panics on every method so the gate test cannot accidentally let real dispatch slip through. TestAdminForwardServerDeps_ReadyForRegistration now covers all six meaningful combinations: empty bundle, tables-only-no-roles, buckets-only-no-roles, roles-only (no sources), Dynamo-only, S3-only, and full bundle. Message strings on each require call make failures self-describing. ✓

nil-source panic guard via checkOpAvailability

forward_server.go:94–113: the availability check is centralised — Dynamo ops gate on s.source != nil, bucket ops gate on s.buckets != nil, ADMIN_OP_UNSPECIFIED falls through to ok=true so the existing 400 branch in dispatchForward stays canonical. The no-default case in checkOpAvailability is intentional and safe: a future op that is added to the enum but not yet to this switch will pass the gate, hit dispatchForward's default branch, and return a 400 "unknown admin operation" — the fail-safe direction.

NewForwardServer(nil, roles, nil).WithBucketsSource(...) is now the S3-only construction pattern (explicit in TestForwardServer_DynamoOps_NoTablesSourceReturns501). This works correctly because checkOpAvailability gates the Dynamo ops before any dereference of s.source. Minor doc gap: the NewForwardServer comment says deployments without S3 pass nil via WithBucketsSource but doesn't say the inverse (pass nil source for S3-only builds). Non-blocking — the tests are the living documentation.

TestForwardServer_DynamoOps_NoTablesSourceReturns501 sweeps both CREATE_TABLE and DELETE_TABLE with source=nil and asserts 501 + "not_implemented". Symmetric with the existing bucket 501 sweep. ✓

Refactors ✅

  • decodeNamedPayload: correctly extracted from handleDelete; used by both handleDelete (opLabel="delete") and handleDeleteBucket (opLabel="delete-bucket"). All four guards — 64 KiB cap, NUL byte, DisallowUnknownFields+trailing-token, empty+slash name — are present. The return signature (string, *pb.AdminForwardResponse, error) is unusual but the comment explains why, and callers guard with rejection != nil || err != nil which is the defensive right thing.
  • auditDeleteSuccess: the resourceField parameter ("table" vs "bucket") is passed as the slog key name, which matches the audit schema.
  • notImplementedForwardResponse: the surface argument ("DynamoDB" vs "S3") ends up in the user-visible message, which is correct.

handlePutBucketAcl inline decode (not using decodeNamedPayload) is correct — it needs both name and acl fields, so the helper that only decodes {"name": "..."} can't be reused directly. The inline code applies the same guard sequence (size, NUL, DisallowUnknownFields, trailing, empty name, slash-in-name) so there's no parity gap.


LGTM. The Codex P1 finding is fully addressed; the refactors are clean and well-covered. No new concerns.

bootjp added a commit that referenced this pull request Apr 26, 2026
P4 deliverable from docs/design/2026_04_24_proposed_admin_dashboard.md
Section 8: a single self-contained operator-facing reference for
the admin HTTP listener.

Covers:
- Quick-start invocation for a loopback dev cluster
- Required + optional flag reference, with explanations of why each
  guard exists (TLS hard-error, rolling-update caveats, etc.)
- TLS topologies (loopback / TLS / discouraged plaintext-non-loopback)
- Role model + how live role re-validation works on every state-
  changing request
- The full /admin/api/v1/* surface (auth + cluster + dynamo +
  s3, including the slice 2 write paths and the AdminForward
  forwarding contract)
- forwarded_from audit log shape and why it carries the follower's
  node ID
- Troubleshooting guide for the common failure modes operators
  hit during initial bring-up (missing credentials, TLS hard-error,
  401 ambiguity, 503 leader_unavailable, bucket_not_empty,
  blank-screen / placeholder bundle)
- Cross-references to the design doc and architecture overview

The Section 8 P4 plan also called out "TLS, read-only role, CSRF" as
deliverables; those are already implemented (see config.go's
validateTLS / validateAccessKeyRoles, the role gates in
DynamoHandler.principalForWrite + S3Handler.principalForWrite, and
the CSRFDoubleSubmit middleware. This doc stitches them into a
single reference operators can land on without reading code.

Stacked on #669 (P2 slice 2a) + #673 (P2 slice 2b) so the API-
surface table can describe S3 write endpoints as shipped. Once both
land in main, this rebases cleanly.
EOF
)
bootjp added a commit that referenced this pull request Apr 26, 2026
Per `docs/design/README.md`'s lifecycle convention, the admin dashboard
design doc is now "partial":

- **P1** (DynamoDB CRUD + AdminForward) — shipped via #634, #635, #644,
#648
- **P2** (S3 buckets list/create/delete/ACL + DescribeTable) — shipped
via #658, with #669 + #673 in flight
- **P3** (React SPA + embed) — shipped via #649, #650
- **P4** (TLS / role / CSRF / operator docs) — TLS, role, CSRF are
already live in P1; operator docs in #674

Independent of the in-flight slice 2 PRs (#669/#673) and the docs PR
(#674) — this rename only reflects what is already on main today, plus
an "Implementation status" table mapping each phase to the PR it landed
in.

## What this PR changes

- `git mv` the design doc from `2026_04_24_proposed_admin_dashboard.md`
to `2026_04_24_partial_admin_dashboard.md` so its history follows
- Add an "Implementation status" header table indexing each phase to the
PRs that landed it
- List the outstanding open items so future readers know what is still
owed against the original proposal:
- AdminForward acceptance criterion 5 (rolling-upgrade compat flag) —
deferred
  - S3 object browser — explicitly out of scope per Section 2 Non-goals
  - TLS cert hot-reload — restart-to-rotate is the documented model

When the rolling-upgrade flag lands, the doc gets renamed once more to
`2026_04_24_implemented_admin_dashboard.md` per the README's lifecycle
convention.
@bootjp bootjp merged commit c8165ae into feat/admin-s3-writes Apr 26, 2026
5 checks passed
@bootjp bootjp deleted the feat/admin-s3-forward branch April 26, 2026 13:41
bootjp added a commit that referenced this pull request Apr 26, 2026
P4 deliverable from `docs/design/2026_04_24_proposed_admin_dashboard.md`
Section 8: a single self-contained operator-facing reference for the
admin HTTP listener.

**Stacked on #669 (slice 2a) + #673 (slice 2b)** so the API-surface
table can describe S3 write endpoints as shipped. Rebases cleanly onto
main once those land.

## Sections

- **Quick start** — minimal loopback-dev invocation
- **Configuration reference** — required vs optional flags, with the
rationale for each guard
- **Hard-error startup conditions** — the explicit cases where the
process refuses to start (missing creds, partial TLS, non-loopback
without TLS, role-list overlap)
- **TLS setup** — loopback / TLS / discouraged plaintext-non-loopback
topologies
- **Roles** — read-only vs full + how live role re-validation works on
every state-changing request
- **API surface** — full `/admin/api/v1/*` table including the slice 2
S3 write endpoints
- **Follower → leader forwarding** — what the SPA sees, the
rolling-update caveat, and the election-period 503 + Retry-After
contract
- **Audit log** — `admin_audit` slog shape and the `forwarded_from`
field
- **Troubleshooting** — the common bring-up failures (missing creds, TLS
hard-error, 401 ambiguity, 503 leader_unavailable, bucket_not_empty,
placeholder bundle blank-screen)

## What is NOT in this PR

Section 8's P4 plan also called out "TLS, read-only role, CSRF" as
deliverables; those are already implemented in main (see `config.go`'s
`validateTLS` / `validateAccessKeyRoles`, the role gates in
`DynamoHandler.principalForWrite` + `S3Handler.principalForWrite`, and
the `CSRFDoubleSubmit` middleware). This doc stitches them into a single
reference operators can land on without reading code.

## Test plan

- [x] Markdown renders cleanly (manual check)
- [x] Cross-references match real file paths (design doc, architecture
overview, proto file)
- [x] Flag names match `main.go`'s flag definitions verbatim
- [x] The `bucket_not_empty` 409 response shape matches
`S3Handler.writeBucketsError`
bootjp added a commit that referenced this pull request Apr 26, 2026
Three findings from claude + Gemini review on 70213e0:

1) **Issue 1 — `logUnexpectedSourceError` slog key was "table"**
   When called for bucket operations the field key was `"table"` but
   the value was a bucket name. Log queries on `table=` would find
   spurious bucket-error entries; queries on `bucket=` would miss
   the audit lines entirely. Renamed the parameter and the slog
   key to `resource` so the same forensic query works for both
   resource families.

2) **Gemini security-high + Claude Issue 2 — validation divergence**
   `handleCreateBucket` only checked `strings.TrimSpace(name) == ""`
   while the HTTP path's `validateCreateBucketRequest` also rejects
   whitespace-padded names like `" bucket "`. The forward path
   would have accepted them, then hit the adapter's
   `validateS3BucketName` with a less actionable error message —
   different SPA behaviour depending on whether the request was
   leader-direct or follower-forwarded. Fix: call
   `validateCreateBucketRequest(body)` in `handleCreateBucket`
   exactly like `decodeCreateTableRequest` is shared between the
   table-side handlers.

3) **Issue 3 — only CREATE_BUCKET tested for nil-BucketsSource → 501**
   `DELETE_BUCKET` and `PUT_BUCKET_ACL` had identical
   `if s.buckets == nil` guards but no coverage. Replaced
   `TestForwardServer_CreateBucket_NoBucketsSourceReturns501` with
   a table-driven `TestForwardServer_BucketOps_NoBucketsSourceReturns501`
   sweeping all three operations. A future op added without the
   nil guard fails CI immediately.

Plus a new `TestForwardServer_CreateBucket_RejectsWhitespacePaddedName`
that pins the validation-parity fix from #2.

Rebased onto the latest `feat/admin-s3-writes` (which now carries
the slice 2a review fixes) so the stack stays clean.
bootjp added a commit that referenced this pull request Apr 26, 2026
Codex on PR #673 caught that `readyForRegistration()` predicate
gated registration on `tables != nil`, which meant a cluster
started with `--dynamoAddr` empty but S3 enabled never registered
the AdminForward gRPC service at all. Followers attempting to
forward S3 admin writes (CREATE_BUCKET / DELETE_BUCKET /
PUT_BUCKET_ACL) hit gRPC `Unimplemented` and surfaced as 503 to
the SPA, instead of transparently reaching the leader. The S3
write path was advertised as "follower-forwarded" by P2 slice 2b
but actually only worked when Dynamo was also configured.

Fix is symmetric: relax the predicate to require RoleStore plus
**at least one** source (TablesSource or BucketsSource) and have
the dispatcher reject ops for the missing surface with the same
501 it already returns when the inverse source is nil.

Per CLAUDE.md "test the bug first" convention, add the
regression test before the fix:

  TestForwardServer_DynamoOps_NoTablesSourceReturns501
    — symmetric with the existing
    TestForwardServer_BucketOps_NoBucketsSourceReturns501.
    With nil source, CREATE_TABLE used to slip past the source
    nil-deref because validation rejected the payload first;
    DELETE_TABLE panicked on `s.source.AdminDeleteTable`. Both
    now return 501.

Plus three readyForRegistration assertions:
  - buckets-only deployment registers (the regression case)
  - missing-roles + buckets-only does NOT register
  - tables-only / both-sources continue to register

While in there, the per-op nil-source guards I added tipped the
linter's dupl detector. Refactored the dispatch path to keep
forward_server tidy:

1. **`checkOpAvailability`**: pulls the per-op nil-source switch
   out of `dispatchForward` so the latter stays under the
   cyclomatic ceiling. One place to update when a future op
   joins the enum, instead of five.

2. **`decodeNamedPayload`**: the {"name": "..."} JSON shape is
   shared between Dynamo handleDelete and S3 handleDeleteBucket
   (NUL guard, DisallowUnknownFields, dec.More(), empty-name
   reject, slash reject — every guard had a paired comment
   citing the original review that introduced it). Centralising
   the helper keeps that audit trail in one place; the opLabel
   parameter ("delete" / "delete-bucket") preserves the
   path-specific wording in error responses.

3. **`auditDeleteSuccess`**: handleDelete and handleDeleteBucket
   each emitted a 7-line `admin_audit` block that differed only
   in the operation label and the `table` vs `bucket` field
   name. Folded into one helper so a future delete-shaped op
   automatically picks up the same field set.

4. **`notImplementedForwardResponse`**: deduplicates the 501
   construction across the five Dynamo + S3 op cases.

No behaviour change to any of the existing successful paths;
all forward_server tests pass unchanged. Tests added: the
S3-only registration assertion + the new symmetric
DynamoOps_NoTablesSourceReturns501 sweep.

go test -race ./internal/admin/ . — passes.
golangci-lint run ./internal/admin/... . — 0 issues.
bootjp added a commit that referenced this pull request Apr 26, 2026
P4 deliverable from docs/design/2026_04_24_proposed_admin_dashboard.md
Section 8: a single self-contained operator-facing reference for
the admin HTTP listener.

Covers:
- Quick-start invocation for a loopback dev cluster
- Required + optional flag reference, with explanations of why each
  guard exists (TLS hard-error, rolling-update caveats, etc.)
- TLS topologies (loopback / TLS / discouraged plaintext-non-loopback)
- Role model + how live role re-validation works on every state-
  changing request
- The full /admin/api/v1/* surface (auth + cluster + dynamo +
  s3, including the slice 2 write paths and the AdminForward
  forwarding contract)
- forwarded_from audit log shape and why it carries the follower's
  node ID
- Troubleshooting guide for the common failure modes operators
  hit during initial bring-up (missing credentials, TLS hard-error,
  401 ambiguity, 503 leader_unavailable, bucket_not_empty,
  blank-screen / placeholder bundle)
- Cross-references to the design doc and architecture overview

The Section 8 P4 plan also called out "TLS, read-only role, CSRF" as
deliverables; those are already implemented (see config.go's
validateTLS / validateAccessKeyRoles, the role gates in
DynamoHandler.principalForWrite + S3Handler.principalForWrite, and
the CSRFDoubleSubmit middleware. This doc stitches them into a
single reference operators can land on without reading code.

Stacked on #669 (P2 slice 2a) + #673 (P2 slice 2b) so the API-
surface table can describe S3 write endpoints as shipped. Once both
land in main, this rebases cleanly.
EOF
)
bootjp added a commit that referenced this pull request Apr 27, 2026
## Summary

State-refresh of `docs/design/2026_04_24_partial_admin_dashboard.md` now
that all four phases of the original P1–P4 plan have shipped.

| Phase | Was | Now |
|---|---|---|
| P1 | shipped | shipped (no change) |
| P2 | partial — read-only #658 only | **shipped** (+ #669 writes + #673
forwarding) |
| P3 | shipped | shipped (no change) |
| P4 | mostly | **shipped** (+ operator doc #674, deployment runbook
#669, script wiring #669+#678) |

Also added the **AdminDeleteBucket TOCTOU** to Outstanding open items —
coderabbitai flagged it during PR #669 review (pre-existing race,
`s3.go:deleteBucket` inherits the same shape; recorded as a code comment
on `AdminDeleteBucket` for the future fix).

Doc stays at `_partial_` rather than promoting to `_implemented_`
because two outstanding items remain: AdminForward acceptance criterion
5 (rolling-upgrade flag, explicitly deferred at design time) and the new
TOCTOU. Updated the closing rename trigger to cover both.

## Test plan

- [x] No design changes — this is a state refresh
- [x] Status table entries cross-checked against the merge commits on
main
- [x] AdminDeleteBucket TOCTOU description matches the code comment in
adapter/s3_admin.go and the operator-side workaround in
docs/admin_deployment.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant