admin: read-only S3 bucket endpoints (P2 slice 1) #658
Slice 1 of P2 (docs/design/2026_04_24_proposed_admin_dashboard.md
Section 4.1): GET /admin/api/v1/s3/buckets and
GET /admin/api/v1/s3/buckets/{name} become reachable so the SPA's S3List
and S3Detail pages stop hitting 404. Write paths (POST / PUT / DELETE /
ACL change) ship in slice 2 together with AdminForward integration; for
now they reply 405.
Adapter:
- AdminListBuckets / AdminDescribeBucket on *adapter.S3Server bypass
SigV4 the same way AdminListTables / AdminDescribeTable do on the
Dynamo side. Both share loadBucketMetaAt + the metadata-prefix scan,
so a SigV4 listBuckets and the admin dashboard cannot drift.
- AdminBucketSummary projects only the fields the dashboard needs;
CreatedAt is left as the raw HLC and formatted in the bridge.
Admin package:
- BucketsSource interface + BucketSummary DTO with ISO-8601 CreatedAt.
- S3Handler with handleList (paginated) + handleDescribe (404 for
missing). Sub-paths under /buckets/{name}/ (the future /acl) return
404 here so a SPA bug pointed at this build cannot accidentally hit
the describe path with a "{name}/acl" string.
- ServerDeps.Buckets wires the source into NewServer; nil silently
disables the routes (mirrors the Tables nil contract).
- Shared list_pagination.go centralises base64url cursor + limit
parsing and clamping so the Dynamo and S3 handlers cannot diverge
on validation rules. Drops parseDynamoListLimit /
decodeDynamoNextToken in favour of the shared helpers.
- buildAPIMux's route dispatch refactored into apiRouteTable so adding
S3 (and future SQS / queues) does not push the function past
cyclop's 10-branch ceiling.
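For readers skimming the list above, here is a minimal sketch of what shared cursor and limit helpers of this kind could look like. The default of 100, the cap of 1000, the unpadded base64url alphabet, and the function names `encodeNextToken` / `decodeNextToken` are assumptions for illustration; the PR only states that parsing and clamping are centralised in list_pagination.go.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
)

// Assumed values; the PR does not state the real default or cap.
const (
	defaultListLimit = 100
	maxListLimit     = 1000
)

var (
	errInvalidLimit = errors.New("limit must be a positive integer")
	errInvalidToken = errors.New("next_token is not valid base64url")
)

// parseListLimit turns the raw ?limit= value into a page size.
// An empty value selects the default; oversize values are clamped silently.
func parseListLimit(raw string) (int, error) {
	if raw == "" {
		return defaultListLimit, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return 0, errInvalidLimit
	}
	if n > maxListLimit {
		n = maxListLimit
	}
	return n, nil
}

// encodeNextToken wraps the last returned name in an opaque cursor.
func encodeNextToken(lastName string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(lastName))
}

// decodeNextToken reverses encodeNextToken; an empty token means "first page".
func decodeNextToken(raw string) (string, error) {
	if raw == "" {
		return "", nil
	}
	b, err := base64.RawURLEncoding.DecodeString(raw)
	if err != nil {
		return "", errInvalidToken
	}
	return string(b), nil
}

func main() {
	limit, _ := parseListLimit("5000") // clamped to maxListLimit
	name, _ := decodeNextToken(encodeNextToken("my-bucket"))
	fmt.Println(limit, name)
}
```

Keeping both resource handlers on one pair of helpers like this is what makes "invalid limit" and "invalid next_token" behave identically for Dynamo and S3.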
Production wiring:
- main_s3.go's startS3Server now returns *adapter.S3Server so the
admin listener can hold a reference. (nil, nil) is the explicit
"S3 disabled" signal — empty --s3Address is a valid configuration,
not an error.
- main.go's runtimeServerRunner stores s3Server alongside dynamoServer.
- main_admin.go's startAdminFromFlags accepts *adapter.S3Server,
builds bucketsBridge → admin.BucketsSource, and threads it through
startAdminServer / buildAdminHTTPServer / ServerDeps.Buckets.
Tests:
- 13 handler tests covering empty + populated list, pagination cursor
round-trip, invalid limit / next_token shapes, oversize-limit
silent clamp, list / describe storage failures, ErrBucketsForbidden
→ 403, missing bucket → 404, non-GET methods → 405,
/buckets/{name}/sub → 404, FormatBucketCreatedAt second-precision
RFC3339 + zero-HLC empty string.
- 4 adapter-level tests for AdminListBuckets / AdminDescribeBucket
against an in-memory MVCCStore: empty list returns non-nil
[]AdminBucketSummary{}, populated list reflects buckets created via
the SigV4 path, describe returns the populated summary with ACL
preserved, missing bucket returns (nil, false, nil).
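The FormatBucketCreatedAt contract pinned by the handler tests (second-precision RFC3339, empty string for a zero HLC) can be sketched as follows. The HLC layout used here, physical Unix milliseconds in the high bits above 16 logical bits, is an assumption for illustration; the PR does not spell out the real bit layout.

```go
package main

import (
	"fmt"
	"time"
)

// hlcLogicalBits is an assumed width for the logical counter.
const hlcLogicalBits = 16

// formatBucketCreatedAt renders the physical part of an HLC timestamp as
// RFC3339 in UTC. A zero HLC means "unknown" and yields an empty string.
func formatBucketCreatedAt(hlc uint64) string {
	if hlc == 0 {
		return ""
	}
	physicalMillis := int64(hlc >> hlcLogicalBits)
	// time.RFC3339 has no fractional part, so milliseconds are dropped.
	return time.UnixMilli(physicalMillis).UTC().Format(time.RFC3339)
}

func main() {
	ms := uint64(time.Date(2026, 4, 24, 12, 30, 45, 0, time.UTC).UnixMilli())
	hlc := ms<<hlcLogicalBits | 7 // logical counter 7 is ignored by the formatter
	fmt.Println(formatBucketCreatedAt(hlc)) // 2026-04-24T12:30:45Z
	fmt.Printf("%q\n", formatBucketCreatedAt(0))
}
```

If the shift constant drifted from the store's real layout, every timestamp would be rendered at the wrong time, which is why a cross-package consistency test is worth having.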
P2 slice 2 (next PR) will add AdminCreateBucket / AdminPutBucketAcl /
AdminDeleteBucket on the adapter, the corresponding write methods on
BucketsSource, and forwarding integration so a follower can hand S3
admin writes to the leader transparently.
|
@claude review |
|
📝 Walkthrough
This PR adds S3 bucket administration endpoints to the admin API by introducing an adapter layer for bucket metadata introspection, an HTTP handler with pagination support, dedicated pagination utilities, and integration into the existing admin server infrastructure.
Sequence Diagram
sequenceDiagram
actor Client
participant Handler as S3Handler
participant Pagination as Pagination Logic
participant Bridge as bucketsBridge
participant Adapter as S3Server (adapter)
participant Store as MVCC Store
Client->>Handler: GET /admin/api/v1/s3/buckets?limit=10&next_token=...
Handler->>Pagination: parseListPaginationParams(req)
Pagination-->>Handler: (limit, startAfter) or error
Handler->>Bridge: AdminListBuckets(ctx)
Bridge->>Adapter: AdminListBuckets(ctx)
Adapter->>Store: metadata prefix scan @ pinned timestamp
Store-->>Adapter: [bucket metadata]
Adapter-->>Bridge: [AdminBucketSummary]
Bridge-->>Handler: [BucketSummary]
Handler->>Handler: paginateBuckets(results, startAfter, limit)
Handler-->>Client: { buckets: [...], next_token?: "..." }
Client->>Handler: GET /admin/api/v1/s3/buckets/my-bucket
Handler->>Bridge: AdminDescribeBucket(ctx, "my-bucket")
Bridge->>Adapter: AdminDescribeBucket(ctx, "my-bucket")
Adapter->>Store: load bucket metadata @ pinned timestamp
Store-->>Adapter: bucket metadata or not found
Adapter-->>Bridge: AdminBucketSummary or (nil, false, nil)
Bridge-->>Handler: BucketSummary or error
Handler-->>Client: { name: "my-bucket", ... } or 404
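The `paginateBuckets(results, startAfter, limit)` step in the diagram can be sketched as an in-memory slice over an already sorted list. This is an illustrative version, not the handler's code: it assumes the adapter returns names in lexicographic order, that `limit` has already been validated as positive, and it returns the raw last name where the real handler would hand back an opaque cursor.

```go
package main

import (
	"fmt"
	"sort"
)

// BucketSummary is trimmed to the one field pagination needs.
type BucketSummary struct {
	Name string
}

// paginateBuckets returns up to limit entries whose names sort strictly
// after startAfter, plus the name to resume from ("" on the last page).
func paginateBuckets(all []BucketSummary, startAfter string, limit int) ([]BucketSummary, string) {
	start := 0
	if startAfter != "" {
		// First index whose name is strictly greater than the cursor.
		start = sort.Search(len(all), func(i int) bool {
			return all[i].Name > startAfter
		})
	}
	end := start + limit
	if end >= len(all) {
		return all[start:], "" // last page: no cursor
	}
	return all[start:end], all[end-1].Name
}

func main() {
	all := []BucketSummary{{"alpha"}, {"beta"}, {"gamma"}}
	page, next := paginateBuckets(all, "", 2)
	fmt.Println(len(page), next)
	page, next = paginateBuckets(all, next, 2)
	fmt.Println(len(page), next == "")
}
```

Because the cursor is "strictly after", a bucket deleted between two requests does not break the walk: the search simply lands on the next surviving name.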
|
Claude finished @bootjp's task in 3m 49s —— View job Code Review — admin: read-only S3 bucket endpoints (P2 slice 1)
Overall this is a clean, well-structured slice. The layering (adapter → bridge → BucketsSource → S3Handler) mirrors the Dynamo side faithfully, the shared Issue 1 —
|
Code Review
This pull request introduces read-only S3 admin endpoints to the dashboard, allowing users to list and describe S3 buckets. It includes the implementation of the AdminListBuckets and AdminDescribeBucket methods in the S3 adapter, a new S3Handler in the admin package, and shared pagination logic used by both DynamoDB and S3 admin handlers. Feedback suggests ensuring that bucket listing handles potential ErrBucketsForbidden errors for consistency and addressing the use of a hard-coded limit in metadata scans to prevent incomplete results while maintaining memory safety.
    kvs, err := s.store.ScanAt(ctx,
        []byte(s3keys.BucketMetaPrefix),
        prefixScanEnd([]byte(s3keys.BucketMetaPrefix)),
        s3MaxKeys, readTS)
The AdminListBuckets method uses s3MaxKeys as a hard limit for the metadata scan, which may cause the admin dashboard to show incomplete results. While this should be updated to retrieve more entries to satisfy the BucketsSource interface, ensure that a fixed upper bound is still applied to the total number of buckets retrieved to prevent unbounded memory growth and potential OOM issues from collections that grow based on external requests.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests, such as pending configuration changes. Reject new requests when the bound is reached.
    if err != nil {
        h.logger.LogAttrs(r.Context(), slog.LevelError, "admin s3 list buckets failed",
            slog.String("error", err.Error()),
        )
        writeJSONError(w, http.StatusInternalServerError, "s3_list_failed",
            "failed to list buckets; see server logs")
        return
    }
The handleList method does not handle ErrBucketsForbidden, which is inconsistent with the implementation in handleDescribe. If the underlying source returns this error (e.g., due to future role-based access control integration), the user will receive a 500 Internal Server Error instead of a 403 Forbidden. It is better to handle this sentinel error consistently across all admin endpoints.
    - if err != nil {
    -     h.logger.LogAttrs(r.Context(), slog.LevelError, "admin s3 list buckets failed",
    -         slog.String("error", err.Error()),
    -     )
    -     writeJSONError(w, http.StatusInternalServerError, "s3_list_failed",
    -         "failed to list buckets; see server logs")
    -     return
    - }
    + buckets, err := h.source.AdminListBuckets(r.Context())
    + if err != nil {
    +     if errors.Is(err, ErrBucketsForbidden) {
    +         writeJSONError(w, http.StatusForbidden, "forbidden",
    +             "this endpoint requires a full-access role")
    +         return
    +     }
    +     h.logger.LogAttrs(r.Context(), slog.LevelError, "admin s3 list buckets failed",
    +         slog.String("error", err.Error()),
    +     )
    +     writeJSONError(w, http.StatusInternalServerError, "s3_list_failed",
    +         "failed to list buckets; see server logs")
    +     return
    + }
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab0b074a3d
    kvs, err := s.store.ScanAt(ctx,
        []byte(s3keys.BucketMetaPrefix),
        prefixScanEnd([]byte(s3keys.BucketMetaPrefix)),
        s3MaxKeys, readTS)
Stop truncating AdminListBuckets at 1000 entries
AdminListBuckets is documented and used as the full source for admin pagination, but this scan is hard-limited to s3MaxKeys (1000). In clusters with more than 1000 buckets, /admin/api/v1/s3/buckets will silently omit everything after the first 1000, and next_token will stop advancing once the in-memory page reaches that cap, making the remaining buckets unreachable.
Actionable comments posted: 2
🧹 Nitpick comments (2)
internal/admin/dynamo_handler.go (1)
712-717: Stale doc-comment fragment above paginateDynamoTableNames.
The comment block now has two function headers stacked together: a paragraph describing the removed parseDynamoListLimit followed (after a blank //) by the doc for paginateDynamoTableNames. This makes paginateDynamoTableNames's godoc start with text about a function that no longer exists in this file. If the historical rationale is worth preserving, consider moving it to a package-level comment near parseListLimit in list_pagination.go (where the policy actually lives) and let paginateDynamoTableNames keep a clean one-paragraph doc.
♻️ Proposed cleanup
    -// parseDynamoListLimit translates the ?limit= query parameter into a
    -// concrete page size. The shared parseListLimit lives in
    -// list_pagination.go; this comment is preserved here only because
    -// it documents the historical rationale for the default / clamp
    -// policy that the shared helper inherited.
    -//
     // paginateDynamoTableNames slices `names` (already lex-sorted by the
     // adapter) into a single page starting strictly after `startAfter`.
     // The second return is the opaque cursor the client should pass back
     // for the next call, or "" if this is the last page.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/admin/dynamo_handler.go` around lines 712 - 717, The doc comment above paginateDynamoTableNames includes a stale fragment describing parseDynamoListLimit; remove that orphan paragraph so paginateDynamoTableNames has a single clean godoc paragraph, and if the historical rationale for the default/clamp policy should be preserved move it into a package-level comment near parseListLimit in list_pagination.go (or update parseListLimit's comment) instead of leaving it above paginateDynamoTableNames.
internal/admin/s3_handler.go (1)
119-127: handleList doesn't map ErrBucketsForbidden — handleDescribe does. Mirror the mapping for consistency.
handleDescribe (lines 147-151) maps ErrBucketsForbidden to 403, but handleList returns 500 for the same sentinel. The PR notes describe ErrBucketsForbidden as "wired" and the test TestS3Handler_DescribeBucket_ForbiddenReturns403 pins describe — asymmetry here means slice 2's role gate, when it grows a list-side check, will silently surface 500s with a misleading s3_list_failed error code until someone notices. Better to wire both paths now while the code is small.
♻️ Proposed fix
     buckets, err := h.source.AdminListBuckets(r.Context())
     if err != nil {
    +    if errors.Is(err, ErrBucketsForbidden) {
    +        writeJSONError(w, http.StatusForbidden, "forbidden",
    +            "this endpoint requires a full-access role")
    +        return
    +    }
         h.logger.LogAttrs(r.Context(), slog.LevelError, "admin s3 list buckets failed",
             slog.String("error", err.Error()),
         )
         writeJSONError(w, http.StatusInternalServerError, "s3_list_failed",
             "failed to list buckets; see server logs")
         return
     }
A test paralleling TestS3Handler_DescribeBucket_ForbiddenReturns403 for the list path would also lock this in.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/admin/s3_handler.go` around lines 119 - 127, handleList currently treats ErrBucketsForbidden from h.source.AdminListBuckets as a 500 but handleDescribe maps the same sentinel to 403; update handleList to mirror handleDescribe by detecting the sentinel ErrBucketsForbidden returned from h.source.AdminListBuckets and return writeJSONError(w, http.StatusForbidden, "forbidden", "access to buckets is forbidden") (and log at error level with the same message) instead of the 500 path; ensure the error check uses the same sentinel comparison (ErrBucketsForbidden) as in handleDescribe and add a unit test analogous to TestS3Handler_DescribeBucket_ForbiddenReturns403 that asserts a 403 is returned from handleList when the source returns ErrBucketsForbidden.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@adapter/s3_admin.go`:
- Around line 38-48: AdminListBuckets currently calls ScanAt once with s3MaxKeys
(1000) and returns that page, dropping any further keys; fix by looping the
ScanAt calls until exhaustion: initialize the scan start to
[]byte(s3keys.BucketMetaPrefix), call s.store.ScanAt(ctx, start,
prefixScanEnd([]byte(s3keys.BucketMetaPrefix)), s3MaxKeys, readTS) in a loop,
append each call's kvs into the accumulated result slice, and when len(kvs) <
s3MaxKeys break; otherwise advance start to the lexicographic successor of the
last returned key (use the last kvs element's Key and increment it to avoid
rereading) and continue; return the accumulated AdminBucketSummary list at the
end and keep the existing readTS/readPin handling.
In `@internal/admin/s3_handler.go`:
- Around line 87-94: Update the misleading comment to state that subpaths
containing "/" are rejected with 404 (not 405) so it matches the implemented
behavior in the strings.Contains(name, "/") branch and the
TestS3Handler_DescribeBucket_SubpathReturns404 test; also change the
writeJSONError call to include a non-empty message (e.g. "bucket does not exist"
or "not found") instead of the empty string so it follows the convention used
elsewhere (see writeJSONError and existing message on line with "bucket does not
exist").
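The exhaustive ScanAt loop requested for `adapter/s3_admin.go` above can be sketched against a toy in-memory store. Everything named here (`memStore`, its `ScanAt` signature without a context or read timestamp, the page size of 3, and this `prefixScanEnd` implementation) is a hypothetical stand-in; only the control flow is the point: scan a page, stop when the page is short, otherwise restart from the successor of the last key.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

const scanPage = 3 // tiny page for the demo; the PR keeps 1000 per call

type kvPair struct{ Key, Value []byte }

// memStore is a toy ordered store; keys must be sorted.
type memStore struct{ keys []string }

// ScanAt returns up to limit pairs with start <= key < end (nil end = unbounded).
func (m *memStore) ScanAt(start, end []byte, limit int) []kvPair {
	var out []kvPair
	for i := sort.SearchStrings(m.keys, string(start)); i < len(m.keys) && len(out) < limit; i++ {
		k := []byte(m.keys[i])
		if end != nil && bytes.Compare(k, end) >= 0 {
			break
		}
		out = append(out, kvPair{Key: k})
	}
	return out
}

// prefixScanEnd returns the smallest key greater than every key with prefix.
func prefixScanEnd(prefix []byte) []byte {
	end := append([]byte(nil), prefix...)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return end[:i+1]
		}
	}
	return nil // prefix is all 0xff: no upper bound
}

// scanAllByPrefix pages through the prefix until it is exhausted.
func scanAllByPrefix(s *memStore, prefix []byte) [][]byte {
	keys := [][]byte{}
	start := append([]byte(nil), prefix...)
	end := prefixScanEnd(prefix)
	for {
		page := s.ScanAt(start, end, scanPage)
		for _, kv := range page {
			if !bytes.HasPrefix(kv.Key, prefix) {
				return keys // defensive halt on an unexpected key
			}
			keys = append(keys, kv.Key)
		}
		if len(page) < scanPage {
			return keys // short page: nothing left
		}
		// Lexicographic successor of the last key avoids re-reading it.
		last := page[len(page)-1].Key
		start = append(append([]byte(nil), last...), 0x00)
	}
}

func main() {
	s := &memStore{keys: []string{"a/x", "bucket/1", "bucket/2", "bucket/3",
		"bucket/4", "bucket/5", "bucket/6", "bucket/7", "c/y"}}
	fmt.Println(len(scanAllByPrefix(s, []byte("bucket/"))))
}
```

When the number of matching keys is an exact multiple of the page size, the loop costs one extra empty scan before it terminates, which is the usual price of not having a "has more" flag from the store.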
📒 Files selected for processing (13)
adapter/s3_admin.go
adapter/s3_admin_test.go
internal/admin/buckets_source.go
internal/admin/dynamo_handler.go
internal/admin/list_pagination.go
internal/admin/s3_handler.go
internal/admin/s3_handler_test.go
internal/admin/server.go
main.go
main_admin.go
main_admin_test.go
main_s3.go
main_s3_test.go
Six findings from Codex / Gemini / Claude on PR #658:

1) Codex P1 + Gemini medium + Claude Issue 1: AdminListBuckets silently truncated at s3MaxKeys (1000) buckets — pages 2+ of the admin paginated list were unreachable, and the BucketsSource doc said "every bucket". Replaced the single ScanAt with a paginated loop mirroring scanAllByPrefixAt on the Dynamo side. The per-call ScanAt page stays at 1000 (named `adminBucketScanPage` so the intent is explicit), but the accumulator continues until the metadata prefix is exhausted. The defensive prefix-mismatch halt keeps the loop safe under a hypothetical mid-scan layout change. New test `TestS3Server_AdminListBuckets_PaginatesPastSinglePage` exercises 1100 buckets to lock the regression.

2) Gemini medium: handleList didn't translate ErrBucketsForbidden to 403, while handleDescribe did. Added the same mapping so both read paths surface a future role gate identically.

3) Claude Issue 2: dead `page == nil` branch in paginateBuckets — a slice expression on a non-nil slice is itself non-nil even when length is zero, so the guard could not fire. Dropped the dead branch (paginateDynamoTableNames already does the same).

4) Claude Issue 3: orphaned comment block above paginateDynamoTableNames referenced parseDynamoListLimit which this PR removed. Cleaned up.

5) Claude Issue 4: hlcPhysicalShift drift risk vs kv.hlcLogicalBits. Exported `kv.HLCLogicalBits` and added `TestAdminHLCPhysicalShiftMatchesKVLogicalBits` in the main package (which already imports both kv and admin) so a future layout change in one without the other fails CI immediately rather than silently rendering every CreatedAt at the wrong hour.

6) Claude Issue 5: bridge nolint:wrapcheck lost the call-site frame in the error chain. Replaced both bucketsBridge sites with errors.Wrap / errors.Wrapf so an operator debugging a 500 sees the bridge in the stack trace.
The internal helper `appendAdminBucketSummaries` was extracted to keep AdminListBuckets under the cyclomatic ceiling — same shape the inner-loop body had before, just moved to its own function so the scan-loop reads as a control-flow skeleton rather than a mixed-concerns block.
|
@claude review 2636173 addresses all 6 findings from the previous review pass:
Build / vet / lint / main + admin + adapter targeted tests all pass. |
|
Claude finished @bootjp's task in 3m 2s —— View job Code Review — follow-up pass on commit 2636173
All 6 tracked fixes verified and correct. Details below, followed by two residual nits that weren't in the tracked list. Issue 1 — AdminListBuckets pagination loop ✅
The loop structure is correct:
One observation (not a defect): Issue 2 — ErrBucketsForbidden in handleList ✅
Residual gap: Issue 3 — Dead
|
Two residual nits from claude review #2 on 2636173:

1) Missing regression test for the new handleList ErrBucketsForbidden → 403 mapping. Added TestS3Handler_ListBuckets_ForbiddenReturns403 mirroring TestS3Handler_DescribeBucket_ForbiddenReturns403 — both read endpoints are now symmetric in coverage, locking down the slice-2 role-gate contract.

2) Stale comment said "Reject it with 405" but the code (and the pinned TestS3Handler_DescribeBucket_SubpathReturns404 test) returns 404 — the comment was inherited from an earlier draft when 405 was the intended status. Updated the comment to match the code, and replaced the empty error-message string with "no admin S3 handler is registered for this path" so the response shape matches the convention used elsewhere in the file (CodeRabbit minor + claude residual nit).
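The forbidden-to-403 mapping that the new list-path test pins can be exercised in isolation with `net/http/httptest`. The sketch below is not the repository's handler or test: `stubSource`, the string-slice return type, and the plain-text error bodies are simplifications made up for the example; only the `errors.Is` branch ordering reflects what the review asked for.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ErrBucketsForbidden stands in for the sentinel the admin package exports.
var ErrBucketsForbidden = errors.New("buckets: forbidden")

type bucketsSource interface {
	AdminListBuckets(ctx context.Context) ([]string, error)
}

// stubSource returns a fixed error, like a test double would.
type stubSource struct{ err error }

func (s stubSource) AdminListBuckets(context.Context) ([]string, error) {
	return nil, s.err
}

// handleList maps the sentinel to 403 before falling back to a generic 500.
func handleList(src bucketsSource) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		_, err := src.AdminListBuckets(r.Context())
		switch {
		case errors.Is(err, ErrBucketsForbidden):
			http.Error(w, "forbidden", http.StatusForbidden)
		case err != nil:
			http.Error(w, "s3_list_failed", http.StatusInternalServerError)
		default:
			w.WriteHeader(http.StatusOK)
		}
	}
}

// statusFor runs one GET through the handler and reports the status code.
func statusFor(err error) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/admin/api/v1/s3/buckets", nil)
	handleList(stubSource{err: err})(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(ErrBucketsForbidden))                          // 403
	fmt.Println(statusFor(fmt.Errorf("bridge: %w", ErrBucketsForbidden))) // 403
	fmt.Println(statusFor(errors.New("storage failure")))                // 500
	fmt.Println(statusFor(nil))                                          // 200
}
```

Using `errors.Is` rather than `==` matters once the bridge wraps errors: a wrapped sentinel still maps to 403.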
|
@claude review 1bbd1da addresses both residual nits from your previous review:
|
|
Claude finished @bootjp's task in 1m 23s —— View job Code Review — follow-up pass on commit 1bbd1da
Both residual nits verified and clean. Nothing new. Fix 1 — Stale 405 comment ( The comment now correctly says "404", references Fix 2 — Missing list-path forbidden test (
The docstring accurately describes the future role-gate context and the regression it's pinning. No gap between test intent and implementation. Overall verdict: All 6 originally tracked issues and both residual nits are resolved. The PR is clean — logic, tests, comments, and error contracts are all consistent. Ready to merge. |
Replaces PR #659, which conflicted heavily after main moved (PR #649 squashed; PR #658 added S3 admin endpoints; the Approximate counters implementation now lives directly in adapter/sqs_catalog.go).

This PR:

Backend (adapter/sqs_admin.go + internal/admin/sqs_handler.go):
- SQSServer.AdminListQueues / AdminDescribeQueue / AdminDeleteQueue are SigV4-bypass entrypoints, mirroring the AdminListTables / AdminListBuckets pattern.
- AdminDescribeQueue uses the existing scanApproxCounters from sqs_catalog.go (already on main) so the admin path returns the same Visible / NotVisible / Delayed numbers as GetQueueAttributes("All") would, taken at one snapshot read TS.
- sqsQueuesBridge in main_admin.go re-shapes adapter.AdminQueueSummary into admin.QueueSummary, keeping internal/admin free of the heavy adapter dependency tree — same pattern as dynamoTablesBridge / s3BucketsBridge.
- admin.QueuesSource is opt-in; deployments that don't run --sqsAddress leave /admin/api/v1/sqs/* off the wire and the SPA renders a soft "endpoint pending" notice on the 404.
- Role re-evaluation against the live RoleStore on DELETE so a downgraded key cannot keep mutating with a still-valid JWT.
- apiRouteTable.dispatch refactored: resourceHandlerFor extracted so the dispatcher stays under cyclop=10 as new resources land (Dynamo, S3, SQS, future).

Frontend (web/admin/src/pages/SqsList.tsx, SqsDetail.tsx):
- /sqs queue list with refresh + per-row link to detail.
- /sqs/:name detail showing FIFO badge, counters card (Visible / In-flight / Delayed), raw attributes table, and a Delete confirmation Modal gated by RequireFullAccess.
- api/client.ts gains listQueues / describeQueue / deleteQueue with the same AbortSignal pattern used for cluster / dynamo / s3 reads.
- Layout nav adds an SQS tab between DynamoDB and S3.

Out of scope (recorded in the SQS partial design doc §16.2):
- PurgeQueue from the SPA. Underlying purgeQueueWithRetry is on main; the admin entrypoint is a trivial follow-up.
- Send / Peek / CreateQueue from the SPA. Each needs its own adapter entrypoint and form UX; deferred to keep this PR focused.

Verified with go build ./..., go test -race ./internal/admin/..., go test -race -run TestSQS ./adapter/, go test -run TestStartAdmin ., golangci-lint run ./adapter/... ./internal/admin/... ./... (0 issues, no //nolint), and cd web/admin && npm run build.
Latest claude review on #675 flagged that lines 3 and 6 still described P2 as fully shipped:

  Line 3: "P1, **P2**, and P3 of the implementation plan have shipped"
  Line 6: "after P1 + **P2** + P3 landed"

But the status table I added in 7617daf correctly marks P2 as 🟡 partial (read-only slice 1 in #658 landed; write paths #669 and #673 still in flight). The header prose contradicting the table is exactly the kind of inconsistency that triggers the rename to `_partial_` in the first place — readers skimming the top of the doc would conclude P2 was done.

Rewrote the Status line and the parenthetical on Last updated to match the table, calling out P1 + P3 as fully shipped, P2 as the read-only slice only, and P4 as TLS/role/CSRF live with the operator doc landing now. Pointed readers at the per-phase status table for the breakdown.

No other changes; the table itself was already correct.
Per `docs/design/README.md`'s lifecycle convention, the admin dashboard design doc is now "partial":

- **P1** (DynamoDB CRUD + AdminForward) — shipped via #634, #635, #644, #648
- **P2** (S3 buckets list/create/delete/ACL + DescribeTable) — shipped via #658, with #669 + #673 in flight
- **P3** (React SPA + embed) — shipped via #649, #650
- **P4** (TLS / role / CSRF / operator docs) — TLS, role, CSRF are already live in P1; operator docs in #674

Independent of the in-flight slice 2 PRs (#669/#673) and the docs PR (#674) — this rename only reflects what is already on main today, plus an "Implementation status" table mapping each phase to the PR it landed in.

## What this PR changes

- `git mv` the design doc from `2026_04_24_proposed_admin_dashboard.md` to `2026_04_24_partial_admin_dashboard.md` so its history follows
- Add an "Implementation status" header table indexing each phase to the PRs that landed it
- List the outstanding open items so future readers know what is still owed against the original proposal:
  - AdminForward acceptance criterion 5 (rolling-upgrade compat flag) — deferred
  - S3 object browser — explicitly out of scope per Section 2 Non-goals
  - TLS cert hot-reload — restart-to-rotate is the documented model

When the rolling-upgrade flag lands, the doc gets renamed once more to `2026_04_24_implemented_admin_dashboard.md` per the README's lifecycle convention.
P2 slice 2b: a follower-side S3 admin write (POST /buckets, PUT
/buckets/{name}/acl, DELETE /buckets/{name}) now hands off to the leader
transparently, completing the same end-to-end forwarding contract Dynamo
writes received in #644 + #648.
**Stacked on #669 (P2 slice 2a).** Once #669 merges, this rebases
cleanly onto main.
## Summary
- **Proto** — three new `ADMIN_OP` enum values (`CREATE_BUCKET` /
`DELETE_BUCKET` / `PUT_BUCKET_ACL`) appended after the Dynamo block so
existing wire integers stay stable. Regenerated with the pinned protoc
29.3 / protoc-gen-go 1.36.11 / protoc-gen-go-grpc 1.6.1.
- **Leader-side** — `ForwardServer.WithBucketsSource` lets deployments
wire the S3 dispatcher optionally; Dynamo-only builds keep it nil and
the new operations return 501. Three new dispatch arms mirror the
leader-direct HTTP path's payload contract (NUL-byte rejection, 64 KiB
limit, `DisallowUnknownFields`, trailing-token rejection, slash-in-name
rejection). `forwardBucketsErrorResponse` mirrors
`forwardErrorResponse`: `ErrBucketsForbidden` / `NotLeader` / `NotFound`
/ `AlreadyExists` / `NotEmpty` + `*ValidationError` each map to the same
HTTP status `writeBucketsError` produces, so forwarded and leader-direct
responses are byte-identical from the SPA's view.
- **Follower-side** — `LeaderForwarder` interface gains
`ForwardCreateBucket` / `ForwardDeleteBucket` / `ForwardPutBucketAcl`.
`ForwardPutBucketAcl` carries both the bucket name (URL path) and the
new ACL (request body) in one JSON payload — same approach
`ForwardDeleteTable` takes for the table name.
- **Handler integration** — `S3Handler.WithLeaderForwarder` +
`tryForwardCreateBucket` / `tryForwardPutBucketAcl` /
`tryForwardDeleteBucket` gated on `ErrBucketsNotLeader && forwarder !=
nil`. `writeForwardResult` / `writeForwardFailure` mirror the Dynamo
handler's pattern (nosniff + Cache-Control:no-store + Retry-After:1 on
503).
- **Wiring** — `runtimeServerRunner.start()` now creates
`*adapter.S3Server` before `startRaftServers` so the leader-side
`ForwardServer` registration sees both adapters.
`adminForwardServerDeps.buckets` is wired through
`registerAdminForwardServer`. `ServerDeps.Forwarder` now plumbs into
`S3Handler` too.
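The strict payload contract listed under **Leader-side** (64 KiB limit, `DisallowUnknownFields`, trailing-token rejection, NUL-byte and slash-in-name rejection) can be sketched with the standard `encoding/json` decoder. The payload struct, its field names, and the decision to check NUL bytes on the decoded fields are assumptions made for this example, not the PR's actual wire format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

const maxPayloadBytes = 64 << 10 // 64 KiB

// createBucketPayload is a hypothetical forwarded-request body.
type createBucketPayload struct {
	Name string `json:"name"`
	ACL  string `json:"acl"`
}

var errInvalidPayload = errors.New("invalid payload")

// decodeCreateBucket applies every strict check before accepting a payload.
func decodeCreateBucket(raw []byte) (createBucketPayload, error) {
	var zero, p createBucketPayload
	if len(raw) > maxPayloadBytes {
		return zero, fmt.Errorf("%w: larger than 64 KiB", errInvalidPayload)
	}
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields() // unknown keys are an error, not ignored
	if err := dec.Decode(&p); err != nil {
		return zero, fmt.Errorf("%w: %v", errInvalidPayload, err)
	}
	// After one complete value, only whitespace may remain: Token must hit EOF.
	if _, err := dec.Token(); !errors.Is(err, io.EOF) {
		return zero, fmt.Errorf("%w: trailing data after JSON value", errInvalidPayload)
	}
	if strings.ContainsRune(p.Name, 0) || strings.ContainsRune(p.ACL, 0) {
		return zero, fmt.Errorf("%w: NUL byte in field", errInvalidPayload)
	}
	if p.Name == "" || strings.Contains(p.Name, "/") {
		return zero, fmt.Errorf("%w: bad bucket name", errInvalidPayload)
	}
	return p, nil
}

func main() {
	for _, body := range []string{
		`{"name":"photos","acl":"private"}`,
		`{"name":"photos","acl":"private","extra":1}`,
		`{"name":"photos","acl":"private"} {}`,
		`{"name":"photos/acl","acl":"private"}`,
	} {
		_, err := decodeCreateBucket([]byte(body))
		fmt.Println(err == nil)
	}
}
```

Running the same decoder on both the leader-direct and forwarded paths is one way to keep the two surfaces from accepting different inputs.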
## What is NOT in this PR
- Rolling-upgrade compatibility flag (criterion 5) — still deferred
behind a cluster-version bump.
## Test plan
- [x] `go build ./...`
- [x] `go vet ./...`
- [x] `golangci-lint run` (admin + main + adapter + proto: 0 issues)
- [x] `go test ./internal/admin/ -count=1 -race` — full admin suite
passes (19 new tests across forward_server / forward_client / s3_handler
integration)
- [x] `go test . -count=1 -race` — main package passes
- [x] Targeted adapter tests pass
- [ ] Full `./adapter/` times out due to a pre-existing flake (verified
earlier on PRs #648 / #658 / #669 — unrelated to this branch)
- [ ] End-to-end smoke against a 3-node cluster — manual exercise after
#669 + this land
## Acceptance criteria coverage (Section 3.3.2 + 4.1)
| # | Criterion | Status |
|---|---|---|
| 1 | Leader direct write | ✓ for S3 (#669 / slice 2a) |
| 2 | Follower forwards transparently | ✅ for S3 (this PR) |
| 3 | Election-period 503 + retry | ✅ for S3 (this PR — same fallback as Dynamo) |
| 4 | Leader demotes stale full role | ✅ for S3 (existing `validatePrincipal` runs before bucket dispatch) |
| 5 | Rolling-upgrade compat flag | ⏳ deferred (cluster-version bump) |
| 6 | `forwarded_from` in audit log | ✅ for S3 (this PR — leader-side audit log carries it) |
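Criterion 3 relies on the 503 response carrying the headers named in the summary (nosniff, `Cache-Control: no-store`, `Retry-After: 1`). A minimal sketch of such a failure writer follows; the function name, the JSON body, and the `leader_unavailable` error code are invented for the example and should not be read as the handler's actual output.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeLeaderUnavailable emits a retryable 503 with hardened headers.
func writeLeaderUnavailable(w http.ResponseWriter) {
	h := w.Header()
	h.Set("Content-Type", "application/json; charset=utf-8")
	h.Set("X-Content-Type-Options", "nosniff") // no MIME sniffing of the error body
	h.Set("Cache-Control", "no-store")         // never cache a transient failure
	h.Set("Retry-After", "1")                  // ask the client to retry in one second
	w.WriteHeader(http.StatusServiceUnavailable)
	_, _ = w.Write([]byte(`{"error":"leader_unavailable"}`))
}

// recordLeaderUnavailable captures the response for inspection.
func recordLeaderUnavailable() (int, http.Header) {
	rec := httptest.NewRecorder()
	writeLeaderUnavailable(rec)
	return rec.Code, rec.Header()
}

func main() {
	code, hdr := recordLeaderUnavailable()
	fmt.Println(code, hdr.Get("Retry-After"), hdr.Get("Cache-Control"))
}
```

Headers have to be set before `WriteHeader`; anything added afterwards is silently dropped by `net/http`.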
## Self-review (5 lenses)
1. **Data loss**: No FSM / Raft / Pebble path changes. The leader-side
dispatcher reuses the same `BucketsSource` interface that the
leader-direct HTTP path uses; the only new code path is the JSON
envelope + principal re-validation.
2. **Concurrency**: New methods on `gRPCForwardClient` reuse the
existing `forward()` helper, so connection-cache reuse and
`ErrLeaderUnavailable` signalling behave identically across resource
types. The `runtimeServerRunner.start()` reorder is safe — each adapter
listens on its own address and the raft TCP listeners are independent.
3. **Performance**: One additional gRPC round-trip on follower-side
admin writes; identical to Dynamo's path.
4. **Data consistency**: Leader-side dispatcher applies the same strict
body decode the leader-direct HTTP path does (NUL-byte rejection,
trailing-token rejection, slash-in-name rejection).
`forwardBucketsErrorResponse` produces byte-identical envelopes to
`writeBucketsError` so forwarded and leader-direct responses are
indistinguishable.
5. **Test coverage**: 19 new tests — 9 forward-server (happy paths, 501
no-source, bad-JSON, already-exists, not-empty, slash-in-name,
missing-acl, payload-too-large sweep), 4 forward-client (3 happy-path
payload shapes + ErrLeaderUnavailable for all three), 6
handler-integration (forwarded create/delete/put-acl happy +
leader-unavailable + transport-error + 3-axis non-NotLeader gate sweep).
)

## Summary

**Replaces #659**, which has unresolvable conflicts now that main has moved on (PR #649 squashed into main; PR #658 added the S3 admin endpoints; the Approximate counters implementation now lives directly in `adapter/sqs_catalog.go` via `scanApproxCounters`). Rather than a multi-day rebase, this PR re-applies the unique SQS admin code on a fresh branch off current main.

What survived from #659:

- `adapter/sqs_admin.go`: `SQSServer.AdminListQueues` / `AdminDescribeQueue` / `AdminDeleteQueue` (SigV4-bypass entrypoints, same shape as `AdminListTables` / `AdminListBuckets`).
- `internal/admin/sqs_handler.go`: HTTP handler for `/admin/api/v1/sqs/queues{,/{name}}` with role re-evaluation on DELETE.
- `web/admin/src/pages/SqsList.tsx` / `SqsDetail.tsx`: SPA pages for the queues view + delete confirmation.

What changed during the re-apply:

- `AdminQueueCounters` is now `int64` (matches `sqsApproxCounters` from main; bridge does no width conversion).
- `AdminDescribeQueue` calls main's `scanApproxCounters` instead of the duplicate `computeApproxCounters` from the old branch: same numeric output, single implementation.
- Dropped the `CountersTruncated` field; main's counter type doesn't expose truncation. SPA's "truncated" pill came out with it.
- `apiRouteTable.dispatch` refactored to extract `resourceHandlerFor` so the dispatcher stays under cyclop=10 as new resources land.

## Backend

- Re-evaluates the principal's role against the live `MapRoleStore` on every `DELETE` so a downgraded key cannot keep mutating with a still-valid JWT (Codex P1 pattern from earlier admin PRs).
- `admin.QueuesSource` is **opt-in**: deployments without `--sqsAddress` leave `/admin/api/v1/sqs/*` off the wire; the SPA renders a soft "endpoint pending" notice on the 404, mirroring the Tables / Buckets `nil` contract.
- The bridge in `main_admin.go` (`sqsQueuesBridge`, `convertAdminQueueSummary`, `translateAdminQueuesError`) keeps `internal/admin` free of the heavy adapter dependency tree, same architectural pattern as Dynamo and S3.

## Frontend

- **/sqs** queue list with refresh + per-row link to detail.
- **/sqs/:name** detail showing FIFO badge, counters card (Visible / In-flight / Delayed), raw attributes table, and a Delete confirmation `Modal` gated by `RequireFullAccess`.
- `api/client.ts` gains `listQueues` / `describeQueue` / `deleteQueue` with the same `AbortSignal` pattern used for `cluster` / `dynamo` / `s3` reads.
- Layout nav adds an SQS tab between DynamoDB and S3.

## Out of scope (recorded in `docs/design/2026_04_24_proposed_sqs_compatible_adapter.md` Section 14, deferred per the SQS partial doc §16.2)

- **PurgeQueue from the SPA**: the underlying `purgeQueueWithRetry` adapter method is on main; the admin entrypoint is a trivial follow-up.
- **Send / Peek / CreateQueue from the SPA**: each needs its own SigV4-bypass adapter entrypoint and form UX; deferred to keep this PR focused.

## Test plan

- [x] `go build ./...`: clean
- [x] `go test -race ./internal/admin/...`: passes
- [x] `go test -race -run TestSQS ./adapter/`: passes
- [x] `go test -run TestStartAdmin .`: passes
- [x] `golangci-lint run ./adapter/... ./internal/admin/... ./...`: `0 issues.`, no `//nolint`
- [x] `cd web/admin && npm run build`: 49 modules, 199 KB JS / 61 KB gzip + 14.7 KB CSS
- [ ] Manual smoke (after PR lands): start a node with `--sqsAddress :4566 --adminEnabled --adminAllowInsecureDevCookie`, create a queue, send a few messages, hit `/admin/sqs/<name>` → counters match `GetQueueAttributes("All")`, Delete dialog returns to list.

## Self-review (5 lenses)

1. **Data loss**: Delete reuses the existing `deleteQueueWithRetry` OCC path; counters are read-only. No new write paths.
2. **Concurrency**: Per-request leader check on Delete; counters scan uses one snapshot read TS.
3. **Performance**: Counters bounded by main's existing `sqsApproxCounterScanLimit`; admin reads are cheap point lookups + one bounded scan.
4. **Data consistency**: `AdminDescribeQueue` and SigV4 `GetQueueAttributes` both call `scanApproxCounters` at a fresh `nextTxnReadTS`, so a single point in time produces the same counters via either surface.
5. **Test coverage**: Existing admin / SQS race suites stay green via the new `nil` Queues argument added to `startAdminServer` call sites; the new bridge is exercised by the cross-package build itself.

## Stacking

This PR is **independent**: branched from current `main` (which has the merged versions of #649 / #658 / #650 / counter implementation). Closing #659 in favour of this clean rewrite.
P2 slice 2a of [docs/design/2026_04_24_proposed_admin_dashboard.md](https://github.com/bootjp/elastickv/blob/main/docs/design/2026_04_24_proposed_admin_dashboard.md). Ships the S3 admin write endpoints so the SPA's S3 modals stop receiving 405. Slice 2b will plumb AdminForward so a follower can hand these writes off to the leader transparently.

## Summary

- **`*adapter.S3Server.AdminCreateBucket` / `AdminPutBucketAcl` / `AdminDeleteBucket`**: SigV4-bypass write methods with three in-method guards: principal must be `AdminRoleFull`, the local node must be the verified S3 leader, and bucket-name / ACL must pass the existing validators. `AdminCreateBucket` reuses the same atomic bucket-meta + ACL + generation-key txn the SigV4 path does, with no new code path through the storage layer. `AdminDeleteBucket` rejects non-empty buckets (the dashboard cannot force recursive delete by design).
- **`internal/admin` write surface**: `BucketsSource` gains the three write methods + `CreateBucketRequest` / `PutBucketACLRequest` types with the documented JSON shapes. `S3Handler.serveCollection` + `servePerBucket` route POST/PUT/DELETE through dedicated handlers with `principalForWrite` re-validating the role on every request against the live `MapRoleStore`.
- **Strict body decoder**: `decodeAdminS3JSONBody` is generic over the request type, applies `DisallowUnknownFields`, rejects NUL bytes, rejects trailing tokens, and caps at 64 KiB (matches design 4.4). Used by both POST and PUT.
- **`writeBucketsError`** translates the source-side sentinels into the design's HTTP statuses: 403 forbidden / 503 + Retry-After:1 leader_unavailable / 404 not_found / 409 already_exists / 409 bucket_not_empty / 400 invalid_request via `*ValidationError`.
- **Bridge**: `bucketsBridge` gains write methods running through `translateAdminBucketsError`, mirroring `translateAdminTablesError` on the Dynamo side. Leader-churn sentinels from the kv coordinator route to `admin.ErrBucketsNotLeader` so the SPA's retry contract stays intact.

## What is NOT in this PR

- AdminForward integration for S3 admin writes: slice 2b.
- Rolling-upgrade compatibility flag (criterion 5): still deferred behind a cluster-version bump.

## Test plan

- [x] `go build ./...`
- [x] `go vet ./...`
- [x] `golangci-lint run` (admin + main + adapter packages: 0 issues)
- [x] `go test ./internal/admin/ -count=1 -race`: 19 new handler tests pass
- [x] `go test . -count=1 -race`: main package passes
- [x] `go test -run "TestS3Server_Admin" ./adapter/ -count=1 -race`: 10 new adapter tests pass
- [ ] Full `go test ./adapter/` times out at 120s due to a pre-existing flake (verified earlier on PRs #648 / #658 against `main`; unrelated to this branch)
- [ ] End-to-end smoke against a 3-node cluster: slice 2b first, then a manual exercise

## Acceptance criteria coverage (Section 4.1)

| Endpoint | This PR |
|---|---|
| `GET /admin/api/v1/s3/buckets` | ✓ (#658) |
| `GET /admin/api/v1/s3/buckets/{name}` | ✓ (#658) |
| `POST /admin/api/v1/s3/buckets` | ✅ |
| `PUT /admin/api/v1/s3/buckets/{name}/acl` | ✅ |
| `DELETE /admin/api/v1/s3/buckets/{name}` | ✅ |

## Self-review (5 lenses)

1. **Data loss**: `AdminCreateBucket` reuses `s.coordinator.Dispatch` with the same `OperationGroup` shape as the SigV4 path: bucket meta + generation key in one txn. No new FSM / Pebble / Raft path. `AdminDeleteBucket`'s "must be empty" guard is a SnapshotAt scan + size check identical to the SigV4 path.
2. **Concurrency**: Writes go through `retryS3Mutation` which already handles transient mid-dispatch leader churn. The leader check is `isVerifiedS3Leader`, the same primitive the SigV4 path uses. Role gate is re-evaluated against the live `MapRoleStore` on every request, so a key downgrade picked up between login and write is enforced immediately.
3. **Performance**: One additional load-bucket-meta read on PutACL / Delete. No hot-path changes; admin writes are operator-rate, not data-plane-rate.
4. **Data consistency**: `AdminCreateBucket` writes (BucketMetaKey, BucketGenerationKey) atomically. `AdminPutBucketAcl` mutates only `meta.Acl` and re-encodes the entire BucketMeta; generation is preserved so existing object references stay valid. `AdminDeleteBucket` removes only BucketMetaKey (BucketGenerationKey is left behind, matching the SigV4 path's behaviour; a future re-create gets a fresh generation).
5. **Test coverage**: 29 new tests (19 admin-package + 10 adapter-level) covering happy paths, role gates, leader checks, validation rejections, all four sentinel error mappings, and the cross-method missing-principal 401. The existing `TestS3Handler_DescribeBucket_SubpathReturns404` was superseded by two more precise tests now that `/acl` is a real sub-resource.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * S3 bucket admin endpoints: create, update ACL, and delete with enforced write authorization, input validation, leader-forwarding, and clear HTTP error mappings.
* **Documentation**
  * New admin docs covering dashboard config, TLS/role semantics, audit logging, and troubleshooting for admin operations.
* **Tests**
  * Extensive end-to-end and unit tests for bucket lifecycle, forwarding, auth/validation, and error scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

State-refresh of `docs/design/2026_04_24_partial_admin_dashboard.md` now that all four phases of the original P1–P4 plan have shipped.

| Phase | Was | Now |
|---|---|---|
| P1 | shipped | shipped (no change) |
| P2 | partial (read-only #658 only) | **shipped** (+ #669 writes + #673 forwarding) |
| P3 | shipped | shipped (no change) |
| P4 | mostly | **shipped** (+ operator doc #674, deployment runbook #669, script wiring #669+#678) |

Also added the **AdminDeleteBucket TOCTOU** to Outstanding open items. coderabbitai flagged it during PR #669 review (pre-existing race, `s3.go:deleteBucket` inherits the same shape; recorded as a code comment on `AdminDeleteBucket` for the future fix).

Doc stays at `_partial_` rather than promoting to `_implemented_` because two outstanding items remain: AdminForward acceptance criterion 5 (rolling-upgrade flag, explicitly deferred at design time) and the new TOCTOU. Updated the closing rename trigger to cover both.

## Test plan

- [x] No design changes: this is a state refresh
- [x] Status table entries cross-checked against the merge commits on main
- [x] AdminDeleteBucket TOCTOU description matches the code comment in adapter/s3_admin.go and the operator-side workaround in docs/admin_deployment.md
P2 slice 1 of docs/design/2026_04_24_proposed_admin_dashboard.md. Read-only S3 admin endpoints land first so the SPA's S3List and S3Detail pages stop hitting 404. Write paths (POST/PUT/DELETE/ACL) ship in slice 2 together with AdminForward integration; until then they reply 405.
Summary
- `*adapter.S3Server.AdminListBuckets` / `AdminDescribeBucket`: SigV4-bypass read methods. Share `loadBucketMetaAt` + the metadata-prefix scan with the SigV4 path, so a SigV4 `listBuckets` and the admin dashboard cannot drift.
- `internal/admin/buckets_source.go`: `BucketsSource` interface + `BucketSummary` DTO + sentinel errors (`ErrBucketsForbidden` / `ErrBucketsNotLeader` / `ErrBucketsNotFound` / `ErrBucketsAlreadyExists`). `ErrBucketsForbidden` is wired in this slice; the others are reserved for slice 2 to keep the error vocabulary additive.
- `internal/admin/s3_handler.go`: `S3Handler` with `handleList` (paginated) + `handleDescribe` (404 for missing). Sub-paths under `/buckets/{name}/` (the future `/acl`) return 404 here so a SPA bug pointed at this build cannot accidentally hit the describe path with a `"{name}/acl"` string.
- `list_pagination.go`: centralises base64url cursor + limit parsing/clamping so the Dynamo and S3 handlers cannot diverge on validation rules. Drops the now-redundant `parseDynamoListLimit` / `decodeDynamoNextToken` / `encodeDynamoNextToken` from `dynamo_handler.go` in favour of `parseListPaginationParams` / `decodeListNextToken` / `encodeListNextToken`.
- `apiRouteTable` in `server.go`: bundles the precomposed middleware chains and dispatches by URL prefix. `buildAPIMux`'s body went from 13 cyclomatic branches to 6, leaving headroom for the next resource family (SQS queues, etc.) to land without another refactor.
- `main_s3.go`'s `startS3Server` now returns `*adapter.S3Server`; `runtimeServerRunner` stores it; `startAdminFromFlags` accepts it and threads `bucketsBridge` → `admin.BucketsSource` → `ServerDeps.Buckets`.

What is NOT in this PR
- `AdminCreateBucket` / `AdminPutBucketAcl` / `AdminDeleteBucket` on the adapter and their HTTP/AdminForward counterparts. Slice 2.
- `RoleStore` plumb-through for S3. Slice 2 (read-only is fine for any authenticated session today; the write paths will need it).
- `forwarded_from` audit trail for S3 admin writes. Slice 2; needs the `pb.AdminOperation` enum extended.
- `proto/admin_forward.proto`. Slice 2.

Test plan
- `go build ./...`
- `go vet ./...`
- `golangci-lint run` (admin + main + adapter packages: 0 issues)
- `go test ./internal/admin/ -count=1 -race`: 13 new handler tests pass
- `go test . -count=1 -race`: main package passes
- `go test -run "TestS3Server_Admin" ./adapter/ -count=1 -race`: 4 new adapter tests pass
- `go test ./adapter/` times out at 120s due to a pre-existing flake (verified earlier on PR #648, "admin: AdminForward Phase 3 - register gRPC server + wire follower-side bridge", against `main`; unrelated to this branch)

Acceptance criteria coverage (Section 4.1)
- `GET /admin/api/v1/s3/buckets`
- `POST /admin/api/v1/s3/buckets`
- `GET /admin/api/v1/s3/buckets/{name}`
- `PUT /admin/api/v1/s3/buckets/{name}/acl`
- `DELETE /admin/api/v1/s3/buckets/{name}`

Self-review (5 lenses)
- `AdminListBuckets` / `AdminDescribeBucket` are read-only and use the same `ScanAt` / `loadBucketMetaAt` as the SigV4 path.
- `BucketsSource` interface; the bridge holds only the immutable `*S3Server`.
- `pinReadTS` is the same pattern the SigV4 path uses for snapshot stability.
- `GET /admin/api/v1/s3/buckets`, identical to the SigV4 listBuckets. Pagination caps at 1000 buckets per page (silent clamp on oversize). No hot-path changes.

Summary by CodeRabbit
Release Notes