feat(bootstrap): gateway-auth enrollment-token API for Z2LS (Option C) #40
The existing HMAC v1 API at POST /api/v1/enrollment_tokens is unusable
in the current Launch-provisioned topology because the per-tenant ZTLP
gateway is started with --http-inject-headers, which strips ALL inbound
X-ZTLP-* headers as a defense against admin-auth spoofing. The HMAC API
headers (X-ZTLP-Zone, X-ZTLP-Client, X-ZTLP-Timestamp, X-ZTLP-Signature)
share that prefix, so they get nuked before reaching Rails. Verified
end-to-end against five freshly-provisioned tenants
(hermes-sandbox.ztlp, hermes-lab.ztlp, hermes-probe.ztlp,
hermes-trial.ztlp, hermes-try5.ztlp) — every signed request 401s with
audit reason 'missing_header'.
Diagnosis: docs/findings/2026-05-23-v1-api-header-collision.md (left
untracked; not part of this PR).
This change side-steps the collision entirely by adding a new endpoint
that reuses the same gateway-auth path the Bootstrap UI already uses.
Z2LS becomes 'an admin-equivalent client over ZTLP' rather than a
separately-credentialed system — the trust boundary is the ZTLP device
identity, not a shared HMAC secret.
Changes:
* routes.rb: add namespace :admin under namespace :api with
POST /api/admin/enrollment_tokens
* app/controllers/api/admin/enrollment_tokens_controller.rb (new):
JSON-only controller protected by trusted_gateway_admin, skipping
forgery_protection (justified in-file: the device-identity check
is strictly stronger than CSRF). Returns the same response shape
as the HMAC v1 controller. Cookie-session admins are explicitly
NOT allowed — gateway-auth-only by design.
* test/controllers/api/admin/enrollment_tokens_controller_test.rb
(new): full coverage — auth (no headers, cookie-only, corrupted
sig), happy path, max_uses/expires_in defaults + overrides,
metadata storage, audit log row, validation (missing /
malformed / oversized computer_name).
* script/z2ls_gateway_auth_token_request.rb (new): Z2LS reference
client. Plain HTTP POST through a local tunnel forward port; no
HMAC, no CSRF, no shared secrets.
* docs/z2ls_gateway_auth_runbook.md (new): customer-facing runbook
for the gateway-auth path.
* docs/api_v1_ztlp_secured.md: 'preferred path' banner pointing
callers to the new runbook; HMAC contract kept as historical.
* docs/z2ls_enrollment_runbook.md: top-of-file note explaining the
HMAC path is broken in Launch topology and pointing at the new
runbook.
Manual validation followup:
Hermes will retest end-to-end via the hermes-try5.ztlp sandbox
tunnel after merge & redeploy of the bootstrap image. The expected
smoke test (verified-loadable in the production image at PR time):
curl -X POST http://127.0.0.1:18084/api/admin/enrollment_tokens \
-H 'Content-Type: application/json' \
-d '{"computer_name":"smoke-001"}'
with the tunnel up from a ZTLP-enrolled admin device should return
201 + a valid ztlp://enroll/?... URI.
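For callers that prefer Ruby over curl, the same smoke test might look roughly like the sketch below. It is not the reference client shipped in script/z2ls_gateway_auth_token_request.rb; the helper names `build_token_request` and `request_token` and the timeout values are assumptions for illustration, and the endpoint is simply the forward port from the curl example.

```ruby
require "json"
require "net/http"
require "uri"

# Local tunnel forward port, copied from the curl smoke test above.
DEFAULT_ENDPOINT = "http://127.0.0.1:18084/api/admin/enrollment_tokens"

# Builds the POST request without sending it, so it can be inspected offline.
# (Hypothetical helper, not part of the PR.)
def build_token_request(computer_name, endpoint: DEFAULT_ENDPOINT)
  uri = URI.parse(endpoint)
  request = Net::HTTP::Post.new(uri.request_uri, "Content-Type" => "application/json")
  request.body = JSON.generate("computer_name" => computer_name)
  [uri, request]
end

# Sends the request through the tunnel. No HMAC headers and no CSRF token are
# attached; authentication is expected to come from the gateway.
# The timeouts are arbitrary illustrative values.
def request_token(computer_name, endpoint: DEFAULT_ENDPOINT)
  uri, request = build_token_request(computer_name, endpoint: endpoint)
  Net::HTTP.start(uri.host, uri.port, open_timeout: 5, read_timeout: 10) do |http|
    http.request(request)
  end
end
```

Calling `request_token("smoke-001")` only makes sense with the tunnel up from a ZTLP-enrolled admin device; `build_token_request` can be exercised without any network access.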
Test results:
Ruby syntax check (host): all 4 files parse OK
Controller load test (production image, Rails 7.1):
Api::Admin::EnrollmentTokensController.action_methods => [:create]
api_admin_enrollment_tokens_path => /api/admin/enrollment_tokens
Full Rails test suite was NOT run on the host (mocha gem missing
from production image bundle; test gems aren't installed). CI is
expected to run the new test/controllers/api/admin/* file with the
full test-group bundle.
Things deliberately NOT touched:
* The existing HMAC v1 controller and routes are left intact as a
historical/secondary path.
* The Rust gateway code in proto/ is unchanged — this is a
Bootstrap-Rails-only PR. A future PR could narrow the gateway's
X-ZTLP-* strip-list from prefix to exact-name allowlist so the
HMAC path also works again, but Option C makes that optional
rather than required.
Co-authored-by: Steve Price <steve@techrockstars.com>
📝 Walkthrough
This PR adds a new gateway-authenticated API endpoint (
Changes
Gateway-authenticated enrollment token endpoint
🧹 Nitpick comments (1)
bootstrap/docs/z2ls_gateway_auth_runbook.md (1)
49-75: ⚡ Quick win: Add language identifier to the fenced code block.
The architecture diagram starting at line 49 uses a fenced code block without a language specifier. Adding a language identifier (e.g., `text` or `ascii-art`) improves rendering consistency across Markdown processors.
📝 Proposed fix
    -```
    +```text
     Z2LS host (ZTLP-enrolled admin device for zone "acme.ztlp")
     │
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bootstrap/docs/z2ls_gateway_auth_runbook.md` around lines 49 - 75, The fenced code block in bootstrap/docs/z2ls_gateway_auth_runbook.md that begins with the diagram line "Z2LS host (ZTLP-enrolled admin device for zone \"acme.ztlp\")" lacks a language identifier; update the opening fence from ``` to ```text (or ```ascii-art) so Markdown renderers treat it as plain text and preserve formatting for the diagram and lines like "ZTLp connect..." and "POST /api/admin/enrollment_tokens".
…t floors (#41)
What:
Coordinated version bump across all four component manifests to match the v0.30.3 git tag cut from a5993ee (PR #40, Z2LS gateway-auth enrollment API). Floors in release_test.exs / version_pin_test.rs ratcheted from 0.29.4 → 0.30.3 so a future v0.30.4 cut without bumping these files will fail CI loudly.
Why:
v0.30.0 through v0.30.2 produced Docker image tags and a git tag but did not have a coordinated four-component manifest bump. That's the exact PR #13/#14 drift class: runtime services on v0.30.2 containers report Application.spec(:ztlp_ns, :vsn) == '0.30.0' (or some other stale value depending on when they were last compiled). The release-version-pinning skill prescribes ratcheting the floor + bumping the manifests in one PR after the tag so the next tag cut exercises the floor.
Files:
proto/Cargo.toml 0.30.0 → 0.30.3
ns/mix.exs 0.30.0 → 0.30.3
relay/mix.exs 0.30.0 → 0.30.3
gateway/mix.exs 0.30.0 → 0.30.3
proto/tests/version_pin_test.rs floor 0.29.4 → 0.30.3
ns/test/ztlp_ns/release_test.exs floor 0.29.4 → 0.30.3
relay/test/ztlp_relay/release_test.exs floor 0.29.4 → 0.30.3
gateway/test/ztlp_gateway/release_test.exs floor 0.29.4 → 0.30.3
.gitignore + .ssh/ (defense against accidental key commits)
Tests:
TDD: RED → bumped manifests → GREEN.
RED (manifests still at 0.30.0, floors ratcheted to 0.30.3):
ns: mix.exs version 0.30.0 is older than the v0.30.3 Z2LS gateway-auth tag
relay: mix.exs version 0.30.0 is older than the v0.30.3 Z2LS gateway-auth tag
gateway: mix.exs version 0.30.0 is older than the v0.30.3 Z2LS gateway-auth tag
(proto deferred to CI: local cargo 1.75 doesn't grok Cargo.lock v4)
GREEN (after bump, full release_test.exs per component):
ns: 15 tests, 0 failures
relay: 15 tests, 0 failures
gateway: 15 tests, 0 failures
The runtime-vs-declared drift test (Application.spec/2 == mix.exs) also passes in GREEN, confirming the OTP .app cache was recompiled correctly after the bump.
Validation:
- Full relay test suite running in background to confirm no collateral damage from the bump.
- CI on the PR will exercise proto/tests/version_pin_test.rs (the floor guard there is the inverse direction: it fails if Cargo.toml drops below the floor).
Follow-up:
- Cut v0.30.4 from THIS commit so the tag and four-component manifests agree (the v0.30.3 tag is the 'tag points at pre-bump commit' case per the release-version-pinning skill).
- Rebuild + redeploy ztlp-node and ztlp-bootstrap images tagged v0.30.4 so the on-disk SaaS state catches up to the source.
- Bootstrap currently has no version manifest (Rails app, not a packaged artifact). Adding a VERSION file + initializer + test is listed in docs/plans/2026-05-24-z2ls-via-gateway-admin-auth.md as a follow-up, kept out of this PR to keep the scope tight.
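The floor ratchet described in this commit can be pictured with a small sketch. The real guards live in the Elixir release_test.exs files and the Rust version_pin_test.rs; the Ruby below only illustrates the comparison, and the names `FLOOR`, `below_floor?`, and `floor_violations` are hypothetical.

```ruby
require "rubygems" # Gem::Version ships with RubyGems

# Floor after the ratchet described above (0.29.4 → 0.30.3).
FLOOR = Gem::Version.new("0.30.3")

# True when a manifest still declares a version older than the floor.
def below_floor?(manifest_version, floor = FLOOR)
  Gem::Version.new(manifest_version) < floor
end

# Returns the manifest paths that would fail the floor guard.
def floor_violations(manifests, floor = FLOOR)
  manifests.select { |_path, version| below_floor?(version, floor) }.keys
end

# RED state: floors ratcheted, manifests not yet bumped.
red = {
  "proto/Cargo.toml" => "0.30.0",
  "ns/mix.exs" => "0.30.0",
  "relay/mix.exs" => "0.30.0",
  "gateway/mix.exs" => "0.30.0"
}

# GREEN state: every manifest bumped to the tagged version.
green = red.transform_values { "0.30.3" }
```

With these inputs `floor_violations(red)` lists all four manifests and `floor_violations(green)` is empty, which mirrors the RED → GREEN sequence in the Tests section.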
Why this exists
The existing HMAC v1 API at `POST /api/v1/enrollment_tokens` is unusable in the current Launch-provisioned topology because the per-tenant ZTLP gateway is started with `--http-inject-headers`, which strips ALL inbound `X-ZTLP-*` headers as a defense against admin-auth spoofing. The HMAC API headers (`X-ZTLP-Zone`, `X-ZTLP-Client`, `X-ZTLP-Timestamp`, `X-ZTLP-Signature`) share that prefix, so they get nuked before reaching Rails.
Verified end-to-end against five freshly-provisioned tenants today (`hermes-sandbox`, `hermes-lab`, `hermes-probe`, `hermes-trial`, `hermes-try5`, all `*.ztlp`): every signed request 401s with audit reason `missing_header`, regardless of how the headers are sent (curl, Python urllib, all case variations).
Per Steve's direction this PR implements Option C: skip HMAC entirely and use the same gateway-auth path the Bootstrap UI already uses. Z2LS becomes "an admin-equivalent client over ZTLP" rather than a separately-credentialed system; the trust boundary is the ZTLP device identity, not a shared HMAC secret.
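The collision can be modelled in a few lines of Ruby. This is only an illustration of the prefix-strip behaviour described above, not the gateway's actual code (which is not part of Bootstrap); the helper names and the sample header values are made up.

```ruby
# Headers the HMAC v1 API requires, as listed in this PR.
REQUIRED_HMAC_HEADERS = %w[
  X-ZTLP-Zone X-ZTLP-Client X-ZTLP-Timestamp X-ZTLP-Signature
].freeze

# Models the described gateway behaviour: drop every inbound header whose
# name starts with the X-ZTLP- prefix, case-insensitively.
def strip_ztlp_prefixed(headers)
  headers.reject { |name, _value| name.downcase.start_with?("x-ztlp-") }
end

# Lists which required HMAC headers are absent from a header set.
def missing_hmac_headers(headers)
  present = headers.keys.map(&:downcase)
  REQUIRED_HMAC_HEADERS.reject { |name| present.include?(name.downcase) }
end

# A signed HMAC v1 request as the client would send it (sample values).
inbound = {
  "Content-Type" => "application/json",
  "X-ZTLP-Zone" => "acme.ztlp",
  "X-ZTLP-Client" => "z2ls",
  "X-ZTLP-Timestamp" => "1716460800",
  "x-ztlp-signature" => "deadbeef" # lower case on purpose: casing does not help
}

# What would reach Rails after the gateway's prefix strip.
reaching_rails = strip_ztlp_prefixed(inbound)
```

In this model `missing_hmac_headers(reaching_rails)` returns all four names, consistent with the `missing_header` audit reason seen on every tenant.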
Why CSRF is safe to skip on this endpoint
In-file justification, summarized:
- `require_gateway_auth!` confirms `trusted_gateway_admin` succeeded.
- `trusted_gateway_admin` verifies the request carries a valid gateway HMAC signature (`Ztlp::HeaderVerifier.verify_request`) over `X-ZTLP-Authenticated`, `X-ZTLP-Admin-Email`, `X-ZTLP-Timestamp`, `X-ZTLP-Signature`.
Files changed
- `bootstrap/config/routes.rb`: `namespace :admin` under `namespace :api` with `POST enrollment_tokens`
- `bootstrap/app/controllers/api/admin/enrollment_tokens_controller.rb`
- `bootstrap/test/controllers/api/admin/enrollment_tokens_controller_test.rb`
- `bootstrap/script/z2ls_gateway_auth_token_request.rb`
- `bootstrap/docs/z2ls_gateway_auth_runbook.md`
- `bootstrap/docs/api_v1_ztlp_secured.md`
- `bootstrap/docs/z2ls_enrollment_runbook.md`
Test results
`mocha/minitest` gem is in `group :test` of the Gemfile but not bundled into the production image we have locally. CI is expected to run `bin/rails test test/controllers/api/admin/enrollment_tokens_controller_test.rb` with the full test-group bundle.
Manual validation followup
I'll retest end-to-end via the `hermes-try5.ztlp` sandbox tunnel after merge & redeploy.
Expected: `201 Created` with a `ztlp://enroll/?...` URI in the response body. If the gateway-auth headers are reaching Rails (which they are: passwordless dashboard sign-in works), this should succeed without any further changes.
Things deliberately NOT touched
- `/api/admin/`; existing v1 callers continue to work (if any survive in a different topology).
- `proto/` is unchanged. This is a Bootstrap-Rails-only PR. A future PR could narrow the gateway's `X-ZTLP-*` strip-list from prefix to exact-name allowlist so the HMAC path also works again, but Option C makes that optional rather than required.
Threat-model notes for reviewers
- `require_gateway_auth!` `before_action` returns 401 if `trusted_gateway_admin` returns nil.
- `trusted_gateway_admin` ALREADY rejects when `ZTLP_TRUST_GATEWAY_AUTH` is unset/false, when `ZTLP_GATEWAY_HEADER_SECRET` is empty, or when the HMAC verification fails. So the new endpoint inherits all those checks for free.
- `skip_forgery_protection` is the only CSRF bypass; `null_session` would have been wrong here because we want the existing session, we just don't want CSRF.
- `trusted_gateway_admin` call short-circuits before the chain reaches `session[:admin_user_id]`-based lookup). This keeps the trust model crisp: only the gateway can authenticate Z2LS-style callers on this surface.
Related diagnostic notes (untracked, NOT in this PR)
- `/home/trs/projects/ztlp/docs/findings/2026-05-23-v1-api-header-collision.md`: full diagnosis of the HMAC blocker that motivated Option C
- `/home/trs/projects/ztlp/hermes-sandbox-zone.md`: sandbox zone facts for Hermes-side retesting
Both are session notes; intentionally not committed.
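For reviewers who want a concrete picture of the gateway-auth checks listed in the threat-model notes, here is a rough stand-alone sketch. It is not the code of `trusted_gateway_admin` or `Ztlp::HeaderVerifier`: the canonical string layout, the SHA-256 digest, the expected `"true"` values, the 300-second skew window, and every method name below are assumptions. Only the header names, the two environment variables, and the three rejection conditions come from this PR.

```ruby
require "openssl"

# Headers covered by the signature (names from the PR; ordering is assumed).
SIGNED_HEADERS = %w[X-ZTLP-Authenticated X-ZTLP-Admin-Email X-ZTLP-Timestamp].freeze
REQUIRED_HEADERS = (SIGNED_HEADERS + ["X-ZTLP-Signature"]).freeze
MAX_SKEW_SECONDS = 300 # assumed replay window, not taken from the source

# Assumed canonical form: signed header values joined by newlines.
def canonical_string(headers)
  SIGNED_HEADERS.map { |name| headers.fetch(name, "") }.join("\n")
end

def sign(headers, secret)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, canonical_string(headers))
end

# Comparison that does not exit early on the first differing byte.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Returns the admin email when every check passes, nil otherwise.
def gateway_admin_email(headers, env:, now: Time.now.to_i)
  return nil unless env["ZTLP_TRUST_GATEWAY_AUTH"] == "true" # unset/false rejects
  secret = env["ZTLP_GATEWAY_HEADER_SECRET"].to_s
  return nil if secret.empty?                                # empty secret rejects
  return nil unless REQUIRED_HEADERS.all? { |name| !headers[name].to_s.empty? }
  return nil unless headers["X-ZTLP-Authenticated"] == "true"
  return nil if (now - headers["X-ZTLP-Timestamp"].to_i).abs > MAX_SKEW_SECONDS
  return nil unless secure_compare(sign(headers, secret), headers["X-ZTLP-Signature"])
  headers["X-ZTLP-Admin-Email"]
end
```

A `before_action` in the spirit of `require_gateway_auth!` would then respond with 401 whenever this returns nil, without ever consulting the cookie session.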
Summary by CodeRabbit
New Features
`POST /api/admin/enrollment_tokens`) as the preferred integration path.
Documentation