Skip to content

feat(rbac): RPC permission declaration map + contract test (PR 2b of 4)#303

Merged
mcharles-square merged 4 commits into
mainfrom
feat/rbac-pr2b-rpc-permissions-infra
May 26, 2026
Merged

feat(rbac): RPC permission declaration map + contract test (PR 2b of 4)#303
mcharles-square merged 4 commits into
mainfrom
feat/rbac-pr2b-rpc-permissions-infra

Conversation

@mcharles-square
Copy link
Copy Markdown
Collaborator

@mcharles-square mcharles-square commented May 22, 2026

Stacked on #301 (PR 2a — resolver + middleware). Targets feat/rbac-pr2a-resolver as base; will retarget to main once 2a merges.

Second slice of PR 2 from the granular RBAC plan. Pure infrastructure — no behavior change. The maps exist, the contract test classifies every registered procedure, but no handler reads ProcedurePermissions yet.

What landed

server/internal/handlers/middleware/rpc_permissions.go

Two exported maps:

  • ProcedurePermissions{procedure → catalog key} — gated procedures. Empty in this PR. Populated as handlers swap from RequireAdmin to RequirePermission in subsequent PRs.
  • ProceduresPendingMigration{procedure → note} — every authenticated procedure on the production Connect mux that has not yet migrated. Each entry carries a brief note about the current gate:
    • middleware.RequireAdmin — buildings (5), sites (6) — 11 procedures
    • Inline info.Role check — fleetnodeadmin (8), serverlog (1), apikey (3, via local helper) — 12 procedures
    • Inline requireAdminFromContext — curtailment writes (4) — 4 procedures
    • Domain-layer checkCanManageUser — auth user-management (3) — 3 procedures
    • SUPER_ADMIN role string check — auth ListUsers (1) — 1 procedure
    • Authenticated self-read/write — auth UpdatePassword / UpdateUsername / VerifyCredentials / GetUserAuditInfo / Logout (5) — 5 procedures
    • Ungated — fleetmanagement (9), command/minercommand (15), deviceset (17), collection (16), pools (5), schedule (7), errors (4), telemetry (2), activity (3), networkinfo (2), pairing (2), foremanimport (2), onboarding GetFleetOnboardingStatus (1), curtailment reads (3) — 88 procedures

Total: 110 entries. Shrinking this map to zero is the exit criterion for removing middleware/admin.go.

server/internal/handlers/middleware/rpc_permissions_test.go

Contract test that uses reflection on the generated *ServiceHandler interfaces to enumerate every procedure registered on the production mux, then asserts each appears in exactly one of:

  • interceptors.UnauthenticatedProcedures
  • interceptors.FleetNodeAuthenticatedProcedures
  • ProcedurePermissions
  • ProceduresPendingMigration

Adding a new RPC without classifying it fails the test loudly. Double-classification is also flagged.

A companion test (TestRPCContract_ProcedurePermissionsKeysAreInCatalog) ensures every gate key references a real catalog permission — a no-op today because ProcedurePermissions is empty; becomes load-bearing as PR 2c lands gates.

What this PR does NOT do

  • Does not swap any handler from RequireAdmin to RequirePermission — that's PR 2c (existing gates) and PR 2d (new gates).
  • Does not change runtime authz behavior. The auth interceptor doesn't consult these maps yet; the middleware doesn't enforce them at request time.
  • Does not delete middleware/admin.go — that lands once ProceduresPendingMigration is empty.
  • Does not rename user_organization.role_id or add the raising trigger — PR 2d.

Test plan

  • go build ./... clean
  • go vet ./... clean
  • golangci-lint run ./internal/handlers/middleware/... clean
  • go test ./internal/handlers/middleware/... green (contract test + the existing RequirePermission unit tests from 2a)
  • Reflection-based discovery enumerates 110 authenticated procedures; the maps cover them with no gaps and no overlap
  • CI run on the stack confirms no regression

Rollback

Reverting this PR removes two files. No schema change, no data migration, no runtime behavior change to reverse.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

🔐 Codex Security Review

Note: This is an automated security-focused code review generated by Codex.
It should be used as a supplementary check alongside human review.
False positives are possible - use your judgment.

Scope summary

  • Reviewed pull request diff only (cf6ff018f14bbcbaa8e67972132b3f01d4247300...6f8b5f3f779d438558383ce6f34e6b9e0a7334ca, exact PR three-dot diff)
  • Model: gpt-5.5

💡 Click "edited" above to see previous reviews for this PR.


Review Summary

Overall Risk: MEDIUM

Findings

[MEDIUM] Permission map does not prove handlers enforce permissions

  • Category: Auth
  • Location: server/internal/handlers/middleware/rpc_permissions_test.go:162
  • Description: ProcedurePermissions is treated as the “migrated to RequirePermission” bucket, but the only assertion is that the named permission key exists in the catalog. A procedure can be moved from ProceduresPendingMigration to ProcedurePermissions without adding a RequirePermission call to its handler, and this contract test will still pass.
  • Impact: Sensitive RPCs can be incorrectly marked as RBAC-protected while remaining authenticated-only or ungated, including miner commands, pool updates, pairing, or fleet management actions.
  • Recommendation: Either enforce ProcedurePermissions centrally in an interceptor, or add a test/static check that verifies each mapped procedure’s handler actually calls RequirePermission with the declared key.

[MEDIUM] Pending-migration bucket allows new unauthorised RPCs to pass CI

  • Category: Auth
  • Location: server/internal/handlers/middleware/rpc_permissions_test.go:131
  • Description: The contract accepts ProceduresPendingMigration as a valid classification, and there is no guard preventing new procedures from being added there. The comments say adding entries is a regression, but CI does not enforce that.
  • Impact: Future high-risk RPCs can ship with only a note like "ungated" and still satisfy the authorization contract, weakening the migration safety net.
  • Recommendation: Freeze the pending list as an explicit baseline and fail on additions, or require all new RPCs to enter ProcedurePermissions with runtime enforcement from the start.

Notes

The reviewed diff only adds the RPC permission catalog and contract tests; I did not find command injection, SQL injection, protobuf wire-format, plugin, infrastructure, frontend, or pool-hijack changes in scope.

I could not run go test ./internal/handlers/middleware because the sandbox is read-only and Go could not create its module cache.


Generated by Codex Security Review |
Triggered by: @mcharles-square |
Review workflow run

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c427ca79d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread server/internal/handlers/middleware/rpc_permissions_test.go Outdated
Base automatically changed from feat/rbac-pr2a-resolver to main May 22, 2026 19:35
Infrastructure for the per-procedure authz migration. No behavior
change in this PR — the maps exist, the contract test classifies
every registered procedure, but no handler reads ProcedurePermissions
yet. PR 2c (existing-gate swap) and PR 2d (new-gate + migration) move
procedures from ProceduresPendingGate into ProcedurePermissions as
their handlers migrate to RequirePermission.

**rpc_permissions.go** — two maps:

- ProcedurePermissions{procedure → catalog key}: gated procedures.
  Empty in this PR. Populated as handlers swap to RequirePermission.
- ProceduresPendingGate{procedure → note}: every authenticated
  procedure on the production Connect mux that has not yet migrated.
  Each entry carries a brief note about the current gate (legacy
  RequireAdmin, inline role check, or "ungated"). 110 entries today;
  shrinking to zero is the exit criterion for removing
  middleware/admin.go.

**rpc_permissions_test.go** — the contract test:

- TestRPCContract_EveryRegisteredProcedureIsClassified uses reflection
  on each generated *ServiceHandler interface to enumerate every
  procedure registered on the production mux, then asserts each
  appears in exactly one of UnauthenticatedProcedures,
  FleetNodeAuthenticatedProcedures, ProcedurePermissions, or
  ProceduresPendingGate. Adding a new RPC without classifying it
  fails the test loudly; double-classification is also flagged.
- TestRPCContract_ProcedurePermissionsKeysAreInCatalog ensures every
  gate key references a real catalog permission — a no-op today
  because ProcedurePermissions is empty; becomes load-bearing as
  PR 2c lands gates.

The registered-services list in the test mirrors the handler
registrations in cmd/fleetd/main.go. Adding a new service to
production requires adding it to that list; the build catches removal
(unused import).

Build, vet, golangci-lint, and the middleware test package all green.
…with main.go

Code review of PR 2b found that several entries in ProceduresPendingMigration
mislabel the actual current gate on the handler, which would mislead PR 2c's
migration into trusting a label that isn't backed by a real check.

- AuthService.ListUsers: labeled "SUPER_ADMIN role check"; the handler has
  no role check, so any authenticated org member can enumerate users.
- Curtailment Start/Stop/Preview gates are conditional on override fields,
  not blanket admin checks. UpdateCurtailmentEvent has no gate at all.
- FleetNodeAdmin Pair/Unpair/ListFleetNodeDevices/DiscoverOnFleetNode
  fall through the embedded UnimplementedFleetNodeAdminServiceHandler and
  return Unimplemented without reaching any role check.

Relabel each affected entry with the real state so the migration can
target the right gap.

Also:
- Rename ProceduresPendingGate -> ProceduresPendingMigration. The name
  is more honest: many entries are flat-out ungated, not "pending" any
  current gate.
- Add a guard so the classification test fails loudly if reflection
  finds zero procedures (e.g., connect-go interface rename).
- Add a sync test that parses cmd/fleetd/main.go for v1connect handler
  mounts and asserts registeredServices stays in lockstep. Without this,
  adding a new service to main.go but forgetting to list it here is a
  silent false-negative.
- Drop in-comment plan-unit references ("PR 2b", "follow-up PRs").
@mcharles-square mcharles-square force-pushed the feat/rbac-pr2b-rpc-permissions-infra branch from 4c427ca to 6bec15e Compare May 22, 2026 19:47
The previous `\w+v1connect` pattern locked the sync test to v1 services
only. A future v2+ service mounted in main.go would silently escape the
check, defeating the purpose of the guard.

Reported by Codex on PR review (P2).
Copilot AI review requested due to automatic review settings May 22, 2026 19:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds RBAC “procedure classification” infrastructure for the Connect RPC surface by introducing exported procedure→permission/pending maps and a reflection-based contract test that enforces complete, non-overlapping classification of every RPC mounted in cmd/fleetd/main.go.

Changes:

  • Introduces ProcedurePermissions and ProceduresPendingMigration as the canonical registry for authenticated procedures (gated vs pending migration).
  • Adds a contract test that reflects over generated *ServiceHandler interfaces to enumerate mounted procedures and asserts every one is classified exactly once.
  • Adds a companion test ensuring any permission keys referenced by ProcedurePermissions exist in the authz catalog.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
server/internal/handlers/middleware/rpc_permissions.go Adds exported procedure classification maps (gated vs pending migration) using generated Connect procedure constants.
server/internal/handlers/middleware/rpc_permissions_test.go Adds reflection + main.go-regex contract tests to ensure all mounted RPCs are classified and permission keys are valid.

Comment thread server/internal/handlers/middleware/rpc_permissions.go
The previous version compared package shortnames only, so a second
service interface added to an already-mounted vNconnect package
(e.g. authv1connect.NewMultiOrgAuthServiceHandler alongside
NewAuthServiceHandler) would silently pass the sync check even if
registeredServices forgot to list it.

Capture both the package shortname and the ServiceHandler short name
from each mount, then compare against
`<pkg>.<ServiceShortName>` derived from each registered handler's
reflect.Type. A missing service now fails the test loudly regardless
of whether its package was already imported.

Reported by Codex on PR review (MEDIUM).
@mcharles-square
Copy link
Copy Markdown
Collaborator Author

Re: the Codex security review:

[HIGH] RBAC Contract Blesses Ungated Pool-Hijack-Capable RPCs

This is deferred to PR 2d by design, not addressed here. PR 2b is infrastructure-only — it adds the ProcedurePermissions / ProceduresPendingMigration maps and the contract test, but does not change any handler's runtime gate. The "ungated" classification in ProceduresPendingMigration is an accurate description of the current state of command/*, pools/*, etc., not a sign-off on shipping them ungated. The PR description calls this out:

Shrinking this map to zero is the exit criterion for removing middleware/admin.go.

The follow-up stack:

  • PR 2c (feat(authz): swap existing-gate handlers to RequirePermission (PR 2c) #305) — existing-gate swap: handlers that already had RequireAdmin or an inline role check move to RequirePermission with catalog keys.
  • PR 2d — new gates on currently-ungated procedures, including MinerCommandService.UpdateMiningPools, the pool CRUD endpoints, fleetmanagement writes, and the rest of command/*. Plus the legacy user_organization.role_id rename + raising trigger, and the deletion of middleware/admin.go.

Reviewers should treat any growth in ProceduresPendingMigration as a red flag (the file comment says so). The contract test enforces that every new RPC must be classified into one of the four maps before it can ship.

[MEDIUM] Mux Coverage Test Can Miss New Services In Existing Connect Packages

Addressed in commit 6f8b5f3. mainConnectMountRe now captures (pkg, ServiceShortName) and the sync test compares at the <pkg>.<ServiceShortName> constructor granularity — adding a second service to an already-mounted v1connect package now fails the check unless registeredServices lists it.

@mcharles-square mcharles-square merged commit 71c7b34 into main May 26, 2026
68 checks passed
@mcharles-square mcharles-square deleted the feat/rbac-pr2b-rpc-permissions-infra branch May 26, 2026 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants