feat(m7): exposure modes, hub auth, and device pairing#523
Merged
Adds the configuration foundation for daemon exposure mode:
- ExposureMode enum (Local, TailscaleServe, TailscaleFunnel,
CloudflareTunnel) with JsonStringEnumConverter<ExposureMode>
using JsonNamingPolicy.KebabCaseLower for STJ contexts
- DaemonConfig sealed record (Host, Port, ExposureMode) with
BindFromConfiguration factory that parses kebab-case enum values
("tailscale-serve" → TailscaleServe) matching the config wire format
- DaemonConfig registered as singleton in daemon DI from "Daemon" section
- 17 unit tests covering IConfiguration binding, STJ serialization
round-trips, defaults, missing section, and parse error cases
IMPLEMENTATION_PLAN.md M7.A1 done-when items all checked.
openspec/changes/exposure-modes tasks 1.1, 1.2, 1.4 checked.
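The kebab-case wire format described above can be illustrated with a minimal sketch. It assumes .NET 8 or later (where `JsonNamingPolicy.KebabCaseLower` and the generic `JsonStringEnumConverter<TEnum>` are available); the enum is redeclared here only to keep the example self-contained and is not copied from the repository.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Serializer options with a kebab-case enum converter (assumes .NET 8+).
var options = new JsonSerializerOptions
{
    Converters = { new JsonStringEnumConverter<ExposureMode>(JsonNamingPolicy.KebabCaseLower) }
};

// Enum member -> wire value.
string json = JsonSerializer.Serialize(ExposureMode.TailscaleServe, options);
Console.WriteLine(json); // prints "tailscale-serve" (including the JSON quotes)

// Wire value -> enum member.
ExposureMode parsed = JsonSerializer.Deserialize<ExposureMode>("\"cloudflare-tunnel\"", options);
Console.WriteLine(parsed);

// Redeclared for illustration; member names are taken from the commit message.
public enum ExposureMode { Local, TailscaleServe, TailscaleFunnel, CloudflareTunnel }
```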
…on bind address (M7.A2)
- Add Daemon section to netclaw-config.v1.schema.json with Host, Port, ExposureMode (string enum: local/tailscale-serve/tailscale-funnel/cloudflare-tunnel), all with defaults. Section is optional; additionalProperties: false.
- Reorder Program.cs startup so ConfigureConfigServices runs before UseUrls, enabling DaemonConfig to be read from netclaw.json before the WebHost URL is bound. Replaces hardcoded "http://127.0.0.1:5199" with $"http://{host}:{port}".
- Add three ConfigSchemaDoctorCheckTests: valid Daemon section accepted, invalid enum rejected (Error), missing Daemon section accepted (Pass).
- DaemonApi.ResolveEndpoint() (Daemon:Endpoint override key) is unaffected.
OpenSpec: exposure-modes tasks 1.3, 2.1, 2.2
…te checks (M7.A3)
Validates tunnel prerequisites at daemon startup based on ExposureMode config. Tailscale modes require tailscaled; Cloudflare mode requires cloudflared. Local mode skips all checks. Throws from StartAsync to abort startup on failure.
…lth (M7.A4)
Adds ExposureModeDoctorCheck to the doctor framework:
- Warning when ExposureMode=local with a non-loopback bind address
- Error when a non-local mode is declared but the required tunnel process (tailscaled/cloudflared) is not running
- Pass for local+loopback or non-local+healthy tunnel
Injects Func<string,bool> processDetector via internal constructor for unit testability (same pattern as ExposureModeValidationService). 12 unit tests cover pass, warning, and error cases.
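The pass/warning/error rules above, together with the injected process detector, might look roughly like the following sketch. `ExposureCheck`, `CheckResult`, and `Evaluate` are hypothetical names chosen for illustration; only the rules and the `Func<string,bool>` detector pattern come from the commit message. The sketch recognises only IP literals as loopback, which is a simplifying assumption.

```csharp
#nullable enable
using System;
using System.Net;

// Stand-in detector for the example: pretend only tailscaled is running.
Func<string, bool> detector = name => name == "tailscaled";

Console.WriteLine(ExposureCheck.Evaluate(ExposureMode.Local, "127.0.0.1", detector));            // Pass
Console.WriteLine(ExposureCheck.Evaluate(ExposureMode.Local, "0.0.0.0", detector));              // Warning
Console.WriteLine(ExposureCheck.Evaluate(ExposureMode.TailscaleServe, "127.0.0.1", detector));   // Pass
Console.WriteLine(ExposureCheck.Evaluate(ExposureMode.CloudflareTunnel, "127.0.0.1", detector)); // Error

public enum ExposureMode { Local, TailscaleServe, TailscaleFunnel, CloudflareTunnel }
public enum CheckResult { Pass, Warning, Error }

// Hypothetical helper, not the repository's ExposureModeDoctorCheck.
public static class ExposureCheck
{
    // Maps each mode to the tunnel process it depends on (null for Local).
    public static string? RequiredProcess(ExposureMode mode) => mode switch
    {
        ExposureMode.Local => null,
        ExposureMode.TailscaleServe or ExposureMode.TailscaleFunnel => "tailscaled",
        ExposureMode.CloudflareTunnel => "cloudflared",
        _ => throw new ArgumentOutOfRangeException(nameof(mode), mode, "Unknown exposure mode")
    };

    public static CheckResult Evaluate(ExposureMode mode, string host, Func<string, bool> isProcessRunning)
    {
        string? required = RequiredProcess(mode);
        if (required is null)
        {
            // Local mode: warn when the bind address is not a loopback IP literal.
            bool loopback = IPAddress.TryParse(host, out var ip) && IPAddress.IsLoopback(ip);
            return loopback ? CheckResult.Pass : CheckResult.Warning;
        }

        // Non-local mode: the tunnel process must be running.
        return isProcessRunning(required) ? CheckResult.Pass : CheckResult.Error;
    }
}
```

Passing the detector as a delegate lets tests substitute a lambda for real process enumeration, which is the testability benefit the commit message refers to.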
Adds the network exposure mode step to the onboarding wizard, inserted after security posture and before Slack. Supports four modes (Local, TailscaleServe, TailscaleFunnel, CloudflareTunnel) with a two-sub-step flow for non-local modes: a mode selection list followed by an informational notice (tailscale-serve) or a high-risk warning with explicit confirmation (tailscale-funnel, cloudflare-tunnel). ContributeConfig omits the Daemon section for Local (schema default); writes kebab-case ExposureMode wire values for non-default modes. 35 unit tests cover config contribution, wire format, sub-step navigation, and risk flags.
Mark tasks 1.3, 2.1, 2.2, 3.1-3.5, 4.1-4.5, 5.1-5.7, 7.1-7.5 as complete in openspec/changes/exposure-modes/tasks.md. These were implemented in M7.A2-A5 (iterations 2-5) but never checked off. Also includes review-after-iter-05 artifacts: Fix-it tasks R1.2 and R1.3 in IMPLEMENTATION_PLAN.md, and parking lot entry for ExposureMode duplication finding.
…1.2)
Replace silent fallback arms (`_ => "local"`, `_ => ToString()`) with `_ => throw new ArgumentOutOfRangeException(...)` in all three ToWireValue-style switches:
- ExposureModeExtensions.ToWireValue() in WizardConfigBuilder.cs
- ExposureModeDoctorCheck.ToWireValue()
- ExposureModeValidationService.StartAsync() modeWireValue switch
Also adds the missing Local => "local" arm to ExposureModeDoctorCheck.ToWireValue to make the switch exhaustive before the throw arm. Satisfies CLAUDE.md "No silent fallbacks" rule for all three sites.
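As a rough illustration of the "no silent fallbacks" pattern, the sketch below lists every known mode explicitly and throws for anything else. The `WireValues` class is a hypothetical stand-in written as a plain static helper; the repository's actual extension method may differ in shape.

```csharp
using System;

Console.WriteLine(WireValues.ToWireValue(ExposureMode.TailscaleFunnel)); // prints "tailscale-funnel"

try
{
    // An out-of-range value now fails loudly instead of being reported as "local".
    WireValues.ToWireValue((ExposureMode)99);
}
catch (ArgumentOutOfRangeException ex)
{
    Console.WriteLine($"rejected: {ex.ParamName}"); // prints "rejected: mode"
}

public enum ExposureMode { Local, TailscaleServe, TailscaleFunnel, CloudflareTunnel }

// Hypothetical helper for illustration only.
public static class WireValues
{
    public static string ToWireValue(ExposureMode mode) => mode switch
    {
        ExposureMode.Local => "local",
        ExposureMode.TailscaleServe => "tailscale-serve",
        ExposureMode.TailscaleFunnel => "tailscale-funnel",
        ExposureMode.CloudflareTunnel => "cloudflare-tunnel",
        // No silent fallback: unknown values are a programming error.
        _ => throw new ArgumentOutOfRangeException(nameof(mode), mode, "Unknown exposure mode")
    };
}
```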
DaemonConfig.BindFromConfiguration was called twice — once in RunDaemonAsync to bind the WebHost URL, and again inside ConfigureDaemonServices for the DI singleton. If configuration changed between the two calls the bind address and the DI instance would silently diverge. Pass the single DaemonConfig instance into ConfigureDaemonServices instead of re-parsing, eliminating the divergence risk.
ConfigWatcherService now detects when the Daemon section (bind address or exposure mode) changes in netclaw.json and skips the coordinated restart, logging a warning that a manual daemon restart is required instead. Network binding and exposure mode cannot be changed safely via hot-reload: they affect the process bind address before sessions are even established.
Changes:
- DaemonConfig.ParseExposureMode promoted to public (was internal)
- ConfigWatcherService accepts DaemonConfig as constructor dependency
- ApplyReloadAsync reads and compares Daemon section before triggering restart
- ReadDaemonConfigFromFile helper parses Daemon section from JSON
- 4 new unit tests covering changed/unchanged Daemon section paths
- SPEC-006 updated with implementation status for exposure mode config
- SPEC-011 updated to reference DaemonConfig instead of hardcoded URL
- IMPLEMENTATION_PLAN.md and openspec tasks.md checkboxes updated
…apper (M7.B1)
Adds the identity contract between ASP.NET Core authentication schemes and the Netclaw session pipeline:
- NetclawClaimTypes: string constants for netclaw:principal, netclaw:transport, netclaw:device-id used by all auth schemes
- ConnectionIdentity: record holding PrincipalClassification, TransportAuthenticity, and SenderId, derived from ClaimsPrincipal by the mapper
- ClaimsPrincipalMapper: singleton service; parses Netclaw claim types from a ClaimsPrincipal, falls back to UntrustedExternal/Unknown per-claim when claims are absent or unrecognised (null principal → full fallback)
- 5 unit tests covering null principal, loopback claims, bearer claims, missing claims, and unrecognised claim values
OpenSpec: hub-auth-framework tasks 1.1–1.3, 6.1
Adds LoopbackAuthenticationHandler that trusts same-machine connections by checking HttpContext.Connection.RemoteIpAddress against 127.0.0.1 and ::1. Loopback connections receive Operator/LocalProcess/local claims; non-loopback and null IPs return NoResult to defer to other schemes. Registered as the default ASP.NET Core authentication scheme in daemon Program.cs via AddAuthentication(SchemeName).AddScheme<...>. 6 unit tests cover: loopback IPv4, loopback IPv6, three non-loopback IPs, and null remote IP.
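The loopback trust decision can be sketched without the ASP.NET Core handler plumbing. In the example below, `LoopbackTrust.TryAuthenticate` is a hypothetical helper: returning `null` stands in for the handler's NoResult, and the assignment of the Operator/LocalProcess/local values to the three `netclaw:*` claim types is an assumption based on the names mentioned in this PR.

```csharp
#nullable enable
using System;
using System.Net;
using System.Security.Claims;

ClaimsPrincipal? local = LoopbackTrust.TryAuthenticate(IPAddress.Parse("127.0.0.1"));
Console.WriteLine(local?.FindFirst("netclaw:transport")?.Value);                        // LocalProcess
Console.WriteLine(LoopbackTrust.TryAuthenticate(IPAddress.Parse("::1")) is not null);   // True
Console.WriteLine(LoopbackTrust.TryAuthenticate(IPAddress.Parse("10.0.0.5")) is null);  // True
Console.WriteLine(LoopbackTrust.TryAuthenticate(null) is null);                         // True

// Hypothetical helper; null means "no result, defer to other schemes".
public static class LoopbackTrust
{
    public static ClaimsPrincipal? TryAuthenticate(IPAddress? remote)
    {
        // Trust only the exact IPv4 and IPv6 loopback addresses.
        bool isLoopback = remote is not null
            && (remote.Equals(IPAddress.Loopback) || remote.Equals(IPAddress.IPv6Loopback));
        if (!isLoopback)
        {
            return null;
        }

        // Claim values are assumptions for illustration.
        var identity = new ClaimsIdentity(new[]
        {
            new Claim("netclaw:principal", "Operator"),
            new Claim("netclaw:transport", "LocalProcess"),
            new Claim("netclaw:device-id", "local"),
        }, authenticationType: "Loopback");

        return new ClaimsPrincipal(identity);
    }
}
```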
…M7.B3)
- Add `AddAuthorization()` alongside existing `AddAuthentication()` in daemon DI
- Insert `UseAuthentication()` + `UseAuthorization()` in the middleware pipeline before hub mapping, so auth runs on every SignalR connection
- Add `[Authorize]` to `SessionHub` so non-loopback connections are rejected before any hub method executes
- Add integration tests via TestServer: non-loopback (null RemoteIpAddress) gets 401; loopback (RemoteIpAddress = 127.0.0.1) passes authorization
Wire ClaimsPrincipal → ConnectionIdentity → ChannelInput so that authenticated connection identity flows into every SessionHub message.
- Register ClaimsPrincipalMapper singleton in daemon DI
- Inject ClaimsPrincipalMapper into SessionRegistry; derive SenderId/Principal/TransportAuthenticity from ClaimsPrincipal instead of hardcoded Operator/LocalProcess values
- Add ClaimsPrincipal? parameter to all public SessionRegistry methods; SessionHub now passes Context.User to every call
- Add HubConnectionBuilderExtensions.ConfigureAccessToken extension: no-op for null factory (loopback), sets AccessTokenProvider otherwise
- DaemonClient accepts optional Func<Task<string?>> accessTokenProvider; existing loopback callers unchanged (task 5.1 verified)
- Two new unit tests verify ChannelInput populated from claims and that null principal falls back to UntrustedExternal/Unknown
- Sync hub-auth-framework tasks.md: mark 3.1-3.3 (M7.B3), 4.1-4.4, 5.1-5.2, 6.3-6.5 done
…7.C1)
Add the foundational device pairing infrastructure:
- PairedDevice record (Name, TokenHash, Salt, CreatedAt, LastUsedAt) in Netclaw.Configuration
- IRemoteAuthSchemeRegistration marker interface for startup validation discovery
- DevicesPath property added to NetclawPaths (~/.netclaw/config/devices.json)
- DeviceRegistry: file-backed registry with SHA256(token||salt) verification, SemaphoreSlim locking, TimeProvider-based timestamps
- DeviceTokenAuthenticationHandler: ASP.NET Core auth scheme that reads Authorization: Bearer header, verifies against registry, grants Operator/Verified claims with device name as SenderId, and updates LastUsedAt on success
- DevicePairingSchemeRegistration: IRemoteAuthSchemeRegistration marker registered in DI alongside the bearer token scheme
- Registered DeviceRegistry, bearer token scheme, and scheme registration in Program.cs alongside the existing loopback scheme
Tests (20 new):
- DeviceRegistryTests: add, list, remove, lookup-by-hash, update-last-used, file round-trip
- DeviceTokenAuthenticationHandlerTests: valid token → Operator/Verified claims, LastUsedAt updated; invalid token → Fail; missing/non-Bearer header → NoResult
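The SHA256(token||salt) check can be sketched as follows. `TokenHasher` is a hypothetical helper; the UTF-8 token encoding, the 16-byte salt, and the constant-time comparison are assumptions for illustration rather than details confirmed by this PR.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

byte[] salt = RandomNumberGenerator.GetBytes(16); // salt length is an assumption
string token = "example-token";
byte[] stored = TokenHasher.Hash(token, salt);    // what a registry would persist

Console.WriteLine(TokenHasher.Verify(token, salt, stored));         // True
Console.WriteLine(TokenHasher.Verify("wrong-token", salt, stored)); // False

// Hypothetical helper, not the repository's DeviceRegistry.
public static class TokenHasher
{
    // Computes SHA-256 over the token bytes followed by the salt bytes.
    public static byte[] Hash(string token, byte[] salt)
    {
        byte[] tokenBytes = Encoding.UTF8.GetBytes(token);
        byte[] input = new byte[tokenBytes.Length + salt.Length];
        tokenBytes.CopyTo(input, 0);
        salt.CopyTo(input, tokenBytes.Length);
        return SHA256.HashData(input);
    }

    // Compares in constant time so timing does not reveal how many bytes matched.
    public static bool Verify(string token, byte[] salt, byte[] expectedHash) =>
        CryptographicOperations.FixedTimeEquals(Hash(token, salt), expectedHash);
}
```

Because only the hash and salt are stored, a leaked devices file would not directly reveal the bearer tokens it was derived from.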
Adds PairingCodeService — generates 8-char XXXX-XXXX codes from a 32-char unambiguous alphabet (no 0/O/1/I), stores one pending code at a time with 5-minute TTL, single-use consumption. Adds POST /api/pair/exchange — unauthenticated, rate-limited endpoint that validates the pending code, generates a 32-byte base64url bearer token, hashes it with a random salt, stores the new PairedDevice in DeviceRegistry, and returns the raw token to the caller. Rate limited via ASP.NET Core FixedWindowLimiter: 5 req/min per IP with 429 rejection (brute-force defense for the 8-char code space). 16 new unit tests cover code generation, expiry, single-use, case insensitivity, and replacement semantics.
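A minimal sketch of generating a code in the XXXX-XXXX shape is shown below. `PairingCodes` is a hypothetical helper, and the exact alphabet (digits 2-9 plus A-Z without I and O) is an assumption that satisfies the "32 characters, no 0/O/1/I" description. Eight symbols from a 32-symbol alphabet give 32^8 = 2^40 (about 1.1 × 10^12) possible codes, which is why the endpoint also needs rate limiting.

```csharp
using System;
using System.Security.Cryptography;

string code = PairingCodes.Generate();
Console.WriteLine(code);                                  // a random code shaped like XXXX-XXXX
Console.WriteLine(PairingCodes.Normalize(" abcd-efgh ")); // ABCD-EFGH

// Hypothetical helper, not the repository's PairingCodeService.
public static class PairingCodes
{
    // 32 symbols: digits 2-9 plus A-Z without I and O (assumed alphabet).
    public const string Alphabet = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ";

    public static string Generate()
    {
        Span<char> buffer = stackalloc char[9];
        for (int i = 0; i < buffer.Length; i++)
        {
            // Position 4 is the separator; every other position is a random symbol
            // drawn from a cryptographically secure source.
            buffer[i] = i == 4 ? '-' : Alphabet[RandomNumberGenerator.GetInt32(Alphabet.Length)];
        }
        return new string(buffer);
    }

    // Case-insensitive comparison can be done by normalizing user input first.
    public static string Normalize(string input) => input.Trim().ToUpperInvariant();
}
```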
… (R3.1)
Previously AddAuthentication("Loopback") made Loopback the sole default scheme,
so [Authorize] on SessionHub never invoked DeviceTokenAuthenticationHandler —
remote clients with valid bearer tokens always received 401.
Add an "AuthSelector" PolicyScheme as the default. Its ForwardDefaultSelector
routes to DeviceBearer when an Authorization: Bearer header is present, otherwise
to Loopback. This matches the production intent: local connections use the
loopback trust boundary; paired remote devices authenticate via bearer token.
Also extends SessionHubAuthorizationTests with two new integration tests:
- Remote connection with valid bearer token passes [Authorize]
- Remote connection with invalid bearer token receives 401
(Existing loopback tests updated to match the new multi-scheme setup.)
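The routing rule can be sketched as a pure function. In ASP.NET Core the equivalent logic would be assigned to `PolicySchemeOptions.ForwardDefaultSelector`, which receives an `HttpContext`; here the header value is passed directly so the example runs without a web host. The `AuthSelector.SelectScheme` helper is hypothetical; the scheme names are the ones named in this commit.

```csharp
#nullable enable
using System;

Console.WriteLine(AuthSelector.SelectScheme("Bearer abc123"));  // DeviceBearer
Console.WriteLine(AuthSelector.SelectScheme(null));             // Loopback
Console.WriteLine(AuthSelector.SelectScheme("Basic dXNlcg==")); // Loopback

// Hypothetical helper mirroring the selector's decision.
public static class AuthSelector
{
    // Routes to the bearer scheme only when a Bearer credential is present;
    // everything else falls through to the loopback scheme.
    public static string SelectScheme(string? authorizationHeader) =>
        authorizationHeader is not null
        && authorizationHeader.StartsWith("Bearer ", StringComparison.OrdinalIgnoreCase)
            ? "DeviceBearer"
            : "Loopback";
}
```

With this rule, a remote client without a bearer header still ends up in the loopback handler, which returns no result for non-loopback addresses, so the request is rejected by `[Authorize]`.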
Documents that POST /api/pair/exchange is intentionally unauthenticated. Without this, a future FallbackPolicy would silently break pairing flows.
Bare catch in DeviceRegistry.VerifyToken swallowed all exceptions, including unexpected ones like CryptographicException. Only FormatException is expected (malformed base64url or hex-encoded input). Broader exceptions now propagate so failures are visible rather than silently masked as auth mismatches.
Tasks 10.1 (DeviceRegistry), 10.2 (PairingCodeService), 10.3 (DeviceTokenAuthenticationHandler) were implemented in iterations 14-15 but their OpenSpec checkboxes were left unchecked.
- Add GeneratePairingCode() hub method to SessionHub; requires LocalProcess
transport claim; logs code to stdout for Docker container log access
- Add GET /api/pair/devices and DELETE /api/pair/devices/{name} REST endpoints
- Add PairingCodeResultDto and PairedDeviceInfoDto to Netclaw.Configuration
- Add netclaw daemon pair, daemon devices list/revoke CLI subcommands
- Add netclaw pair <endpoint> command: prompts for code + device name, POSTs
to exchange endpoint, stores DeviceToken in secrets.json and Daemon:Endpoint
in netclaw.json on success
- Add ListPairedDevicesAsync() and RevokePairedDeviceAsync() to DaemonApi
- Add integration tests for device endpoint auth (401/200/204/404)
- Add unit test for config write on successful pair
- sync tasks.md 5.1-5.5, 6.1-6.4 and IMPLEMENTATION_PLAN.md M7.C3 checkboxes
Diagnostics found ~50% of iteration logs missing required sections (Status, commit hashes, testing-strategy citation, follow-up dispositions). Apply 4 additive process improvements:
- ralph-loop.md step 4b: elevate testing-strategy.md to mandatory pre-code citation with explicit audit format
- ralph-loop.md step 9: add pre-commit log compliance checklist (9a) and post-commit hash capture step (9c)
- ralph-loop.md template: strengthen follow-up disposition enforcement
- ralph-output-adversarial-review.md: add log compliance and follow-up disposition checks to must-check section
Also adds CLEANUP task CL.1 (rename PairCommandConfigTests) to IMPLEMENTATION_PLAN.md from adversarial review finding.
Consolidate duplicated logic and improve hot-path efficiency across the M7 (Daemon Exposure and Hub Auth) implementation.
Deduplication:
- Move ToWireValue to public ExposureModeExtensions in Netclaw.Configuration; delete 3 private copies across Cli/Daemon
- Delete ExposureModeDoctorCheck.ParseMode; call existing public DaemonConfig.ParseExposureMode instead
- Replace ConfigWatcherService hand-rolled JSON parsing with DaemonConfig.BindFromConfiguration via ConfigurationBuilder
- Extract auth scheme registration to NetclawAuthExtensions; share between Program.cs and both integration test files
- Extract MakeDevice test helper to shared DeviceTestHelpers; replace 4 copies across test classes
- Extract IsHelpToken to CliArgsParser; delegate from Program.cs and PairCommand
Efficiency (DeviceRegistry):
- Add LookupAndUpdateLastUsedAsync: single lock, single file read, one conditional write (was 2 lock acquisitions + 2 file reads)
- Add in-memory device cache invalidated on writes
- Remove TOCTOU File.Exists check; catch FileNotFoundException
- Skip write in UpdateLastUsedAsync when no device matched
- Move Directory.CreateDirectory to constructor (was per-write)
Resolves PARK item "Extract shared ExposureMode parse/wire-value utility"; removed from BACKLOG_PARKING_LOT.md.
Net: -113 lines across 20 files. All 1,810 tests pass.
…retsRoundTripTests (CL.1)
The old class name and XML doc comment falsely claimed to test PairCommand, but the test never called PairCommand.RunAsync(). The test exercises ConfigFileHelper's secrets encryption round-trip and config write/read. Renamed class, file, and test method to accurately reflect what is being tested; removed misleading "exchange" language from the method name that implied HTTP was involved.
….C4)
- Add DaemonClientFactory: reads DeviceToken from secrets.json, creates bearer token provider for non-loopback DaemonClient connections
- Update both DI registrations in Program.cs to use DaemonClientFactory
- Add 401 detection in DaemonClient.ConnectAsync with 'netclaw pair' suggestion
- Extend ExposureModeValidationService: non-local mode fails startup when no IRemoteAuthSchemeRegistration is registered AND no devices are paired
- Tests: 9 DaemonClientFactory unit tests (loopback/non-loopback detection, token provider presence), 3 new ExposureModeValidationService tests (no-auth+no-devices fails, escape hatches for scheme and devices)
Adds a pairing lifecycle section to scripts/smoke/check.sh that exercises the full pairing flow inside the smoke sandbox: generate pairing code via `netclaw daemon pair`, exchange via `curl POST /api/pair/exchange`, verify device in `netclaw daemon devices`, authenticate /api/pair/devices with the bearer token, revoke the device, and assert HTTP 401 on the revoked token. Smoke test runs after existing session/stats/reminder tests, before teardown. Token extraction uses sed (no jq in sandbox image). OpenSpec: device-pairing tasks 9.1, 9.2 ✓
…ntegration tests
- Extract GetRequiredProcessName() into ExposureModeExtensions to eliminate duplicated process-name mapping between ExposureModeDoctorCheck and ExposureModeValidationService
- Add PairingExchangeEndpointTests covering 200/400/401 HTTP responses, code expiry, single-use enforcement, and token-based authentication
- Resolve PARK items from after-action review; sync OpenSpec task checkboxes
… system
- Add PairingExchangeGuard: fail2ban-style per-IP lockout after 10 failed exchange attempts (15-minute block with Retry-After header)
- Return 404 when no pairing code is pending (hides endpoint from scanners on internet-exposed deployments)
- Record failures on invalid code attempts for lockout tracking
- Document device pairing in README: exposure modes, pairing flow, security properties, device management commands, CLI reference
- Update netclaw-operations skill (v1.3.0): add Device Pairing section, exposure-mode doctor check, pairing CLI commands
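The per-IP lockout can be sketched as a small in-memory counter. `ExchangeGuard` below is a hypothetical stand-in, not the repository's PairingExchangeGuard; the thresholds (10 failures, 15-minute block) come from the commit message, while the data structure and the choice to pass `now` explicitly are assumptions made to keep the example deterministic.

```csharp
using System;
using System.Collections.Generic;

var guard = new ExchangeGuard(maxFailures: 10, blockDuration: TimeSpan.FromMinutes(15));
var now = DateTimeOffset.UtcNow;
const string ip = "203.0.113.7"; // documentation-range address

for (int i = 0; i < 10; i++)
{
    guard.RecordFailure(ip, now);
}

Console.WriteLine(guard.IsBlocked(ip, now, out TimeSpan retryAfter)); // True
Console.WriteLine(retryAfter.TotalMinutes);                           // 15
Console.WriteLine(guard.IsBlocked(ip, now.AddMinutes(16), out _));    // False

// Hypothetical fail2ban-style guard for illustration.
public sealed class ExchangeGuard
{
    private readonly Dictionary<string, (int Failures, DateTimeOffset? BlockedUntil)> _state = new();
    private readonly object _gate = new();
    private readonly int _maxFailures;
    private readonly TimeSpan _blockDuration;

    public ExchangeGuard(int maxFailures, TimeSpan blockDuration)
    {
        _maxFailures = maxFailures;
        _blockDuration = blockDuration;
    }

    // Counts a failed attempt; starts a block once the threshold is reached.
    public void RecordFailure(string ip, DateTimeOffset now)
    {
        lock (_gate)
        {
            _state.TryGetValue(ip, out var entry);
            int failures = entry.Failures + 1;
            if (failures >= _maxFailures)
            {
                _state[ip] = (0, now + _blockDuration);
            }
            else
            {
                _state[ip] = (failures, entry.BlockedUntil);
            }
        }
    }

    // Reports whether the IP is blocked and how long a Retry-After should say.
    public bool IsBlocked(string ip, DateTimeOffset now, out TimeSpan retryAfter)
    {
        lock (_gate)
        {
            if (_state.TryGetValue(ip, out var entry) && entry.BlockedUntil is { } until && until > now)
            {
                retryAfter = until - now;
                return true;
            }
            retryAfter = TimeSpan.Zero;
            return false;
        }
    }
}
```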
Lock down host-network daemon surfaces, enforce usable remote auth at startup, and attach device tokens to remote CLI HTTP calls. Also clarify that audience selection and exposure mode are separate controls.
Explain that audience and exposure mode are separate controls, update reachability wording, and document the authenticated host-network requirement for remote daemon access.
Script fake chat responses per call so the compaction regression test no longer mutates shared fake state mid-turn. This removes the race between post-compaction tool-loop resumption and test-side response changes.
Summary
Implements Milestone 7 in full — three phases building the security and network exposure foundation for remote device access:
- `ExposureMode` enum, `DaemonConfig` binding, `ExposureModeValidationService` (startup gate), `ExposureModeDoctorCheck`, init wizard step, and config-watcher hot-reload exclusion for the `Daemon` section
- … `PolicyScheme` selector routing to `LoopbackAuthenticationHandler` or `DeviceTokenAuthenticationHandler`), `ConnectionIdentity` value object, `ClaimsPrincipalMapper`, identity propagation into `MessageSource`
- `DeviceRegistry` (file-backed, salted SHA-256 token hashes), `PairingCodeService` (single-use 5-minute codes), `POST /api/pair/exchange` endpoint (rate-limited, anonymous), CLI `pair`/`devices`/`devices revoke` commands, CLI token attachment for remote SignalR connections, startup validation requiring at least one paired device or remote auth scheme for non-local exposure
Also includes:
- … `ExposureMode` process-name mapping duplication, added exchange endpoint integration tests, synced OpenSpec task checkboxes
Test plan
- … `dotnet test`)
- `PairingExchangeEndpointTests` cover 200/400/401 HTTP responses, code expiry, single-use, and token auth
- `PairingCodeEndpointTests` cover device list/revoke endpoints with auth enforcement
- `ExposureModeValidationServiceTests` cover all tunnel process combinations and remote auth guard
- `ExposureModeDoctorCheckTests` cover loopback warnings, missing process errors, unknown mode errors
- … `scripts/smoke/check.sh`) exercises full pairing lifecycle: generate code → exchange → authenticate → revoke → verify 401