fix(main): wire pair and run subcommands (fixes #21)#28
Merged
Conversation
The hermes-node binary previously loaded the config, printed one
line, and exited 0 — the WSS client never started. The supervisor,
dispatcher, pinger, and handlers were all implemented and tested
in isolation but had no entry point.
This commit wires them up. Two subcommands:
hermes-node pair --server <wss-url> --token <token> [--config <path>]
Write a fresh config.toml with the supplied values, mode 0600.
hermes-node run [--config <path>]
Long-lived background service. Connects, serves exec/read/write
calls, reconnects on drops with exponential backoff.
Config schema gains a [node].token field (required). The
config.Load() call on Unix verifies the file is mode 0600 — a
chmod slip during a manual edit won't silently expose the token.
config.Save() is the pair-side counterpart, refuses to overwrite
an existing config so a re-pair is explicit.
The run subcommand splits signal handling (production) from the
ctx-driven run loop (testable). runRunWithSignalCtx wires the OS
signal handler; runRun takes a ctx directly so unit tests can
drive the supervisor with a deadline-bounded ctx and assert on
the connection and shutdown paths without subprocess gymnastics.
End-to-end coverage: TestRun_ConnectsToServer stands up an
httptest WebSocket server, seeds a config pointing at it, runs
hermes-node run in a goroutine, waits for the supervisor's dial
to land, and asserts on a clean shutdown when the ctx is
cancelled.
Co-authored-by: Kate <kate@local>
Quinn's review of fix/21-wire-main-run-loop surfaced three warnings (real bugs / DX issues) and several nice-to-haves. This commit applies the ones worth shipping before the PR goes up. W1 — close previous exec Session on reconnect Setup now closes the previous *Session before opening a new one via a closure-captured prevSession + prevSessionMu. Without this, a flaky-network operator leaked one bash PID per reconnect (Go does not reap subprocesses when a *Session reference is dropped). The leak is bounded only by process shutdown, so on a container with a tight ulimit it would bound the node's uptime. W2 — defaultConfigPath returns an error The silent "return config.toml" fallback on os.UserHomeDir failure could mislead the operator with a "file not found" error whose path was a lie. Now matches defaultLogPath's error-return shape; run() surfaces the error for the run/pair subcommands and lets version/help work without it. W3 + S5 — pair stdout message The default [node].name (the config filename) would silently fail auth per PROTOCOL.md §3.4. The default empty [node].allowed_paths would silently reject every call. The pair message now spells both out with the protocol citations. S1 — delete dead driveFakeServer The helper was defined and never called; the file's preamble promised a synthetic-exec round-trip test that the code never delivered. Removed. S2 — OnError captures debug.Stack The panic-recovery path in dispatch.go pays the cost of recovering, then OnError only captured the panic value. The hook now also captures runtime/debug.Stack so the audit log and stderr both carry the trace, not just "handler panic: index out of range". dispatch.go's own docstring pointed callers at this idiom. S6 — TestSave_CreatesFile0600 unit test Coverage was previously only via the integration test (TestRun_PairSubcommand_WritesConfig). The unit test localises the mode guarantee to the Save() function itself. No changes to file ownership, no secret material added. All three checkboxes (tests, lint, build) green; 3 consecutive go test ./... runs stable. Co-authored-by: Kate <kate@local>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
hermes-nodebinary was a stub — it loaded the config, printed one line, and exited 0. The supervisor, dispatcher, pinger, and handlers were all implemented and tested in isolation but had no entry point. This PR wires them up.Fixes #21.
What changed
Two new subcommands:
hermes-node pair --server <wss-url> --token <token> [--config <path>]— write a freshconfig.tomlwith the supplied values, mode0600. Refuses to overwrite an existing config so a re-pair is explicit.hermes-node run [--config <path>]— long-lived background service. Connects, servesexec/read/writecalls, reconnects on drops with exponential backoff.Config schema: gains a required
[node].tokenfield.config.Load()on Unix verifies the file is mode0600— a chmod slip during a manual edit won't silently expose the token.config.Save()is the pair-side counterpart.Test seam:
runRun(ctx, ...)takes actxdirectly so unit tests can drive the supervisor with a deadline-bounded context and assert on connection + shutdown paths without subprocess gymnastics. Production usessignal.NotifyContext.End-to-end coverage
TestRun_ConnectsToServerstands up anhttptestWebSocket server, seeds a config pointing at it, runshermes-node runin a goroutine, waits for the supervisor's dial to land, and asserts on a clean shutdown when the ctx is cancelled.Quinn's review
Quinn reviewed the first cut. This branch addresses all 3 warnings (W1-W3) and the suggestions worth shipping (S1, S2, S5 folded into W3, S6):
*exec.Sessionon reconnect (bash PID leak).Setupnow closes the previous session before opening a new one via a closure-capturedprevSession+prevSessionMu.defaultConfigPathreturns an error. The silent "returnconfig.toml" fallback onos.UserHomeDirfailure was misleading;run()now surfaces it.pairstdout message now warns the operator that the default[node].name(the config filename) will silently fail auth per PROTOCOL.md §3.4, and that the default empty[node].allowed_pathswill silently reject every call per §3.5.driveFakeServerhelper.OnErrornow capturesruntime/debug.Stack()so audit log and stderr both carry the trace.TestSave_CreatesFile0600unit test (was only integration-tested).Skipped per Quinn: S3, S4, S7 (not blockers).
Verification
go test -race ./...— 6/6 packages, stable across 3 consecutive runsgo vet ./...cleango build ./...cleanpairwrites correct TOML with0600,0644rejected,rundials + backs off + clean SIGTERM, audit log captures reconnect attemptsDiff stat