Fix headless PTY background launches on macOS by mmkzer0 · Pull Request #26 · aannoo/hcom

mmkzer0 · 2026-04-22T19:34:04Z

Fix headless PTY background launches on macOS

...oh man, this was an odyssey.

Summary

Fixes the macOS --headless --go regressions on the PTY-backed launch path.

Validated end result:

headless Codex works again
headless OpenCode works again
bare headless Claude without a prompt/task now fails fast with a clear error
prompt/task-based headless Claude still works

Motivation

Headless launches on the PTY path had regressed in several overlapping ways, which made this investigation a bit deceptive at first:

background PTY launches could drop instance identity before hcom pty even started
non-TTY stdin handling on macOS could prematurely tear down the PTY loop
invalid/non-useful stdin could stay in the poll set and cause hot wakeups
the non-blocking inject listener could keep waking spuriously on macOS after WouldBlock
bare hcom claude --headless without a prompt/task looked like “headless PTY is broken” but is actually a separate unsupported detached-Claude path

User-facing symptoms included:

Codex/OpenCode ending up in launch_failed
missing delivery.start / notify.registered
PTY spin / high CPU
Claude launching and exiting immediately in unsupported bare headless mode

What changed

Launcher / background PTY env

Propagate HCOM_INSTANCE_NAME into the background PTY runner env
Apply tool_extra_env(tool) on background PTY launches too

PTY loop hardening

Do not treat stdin EOF as terminal disconnect when stdin is not a TTY
Fully remove stdin from the poll set when it should no longer be polled
Drop stdin polling on POLLNVAL
Ignore stdin POLLHUP for non-TTY headless PTY sessions
Add one-iteration listener backoff after inject-listener WouldBlock

PTY child identity

Mark PTY child processes with HCOM_LAUNCHED=1

Claude safety guard

Reject bare hcom claude --headless without a prompt/task
Keep prompt/task-based headless Claude behavior intact

Notes / implementation details

The main headless Codex/OpenCode fix was restoring HCOM_INSTANCE_NAME on the background PTY path so delivery can initialize for the correct instance.
The PTY poll loop changes are specifically aimed at non-interactive stdin on macOS (/dev/null, closed pipes, invalid fds).
The inject listener still stays non-blocking; the fix is not to change socket mode, but to back off for one iteration after WouldBlock.
The HCOM_LAUNCHED=1 PTY-child env fix ensures downstream hooks/tool integrations still recognize PTY-managed child sessions as hcom-launched.
Bare headless Claude is intentionally not “fixed” into a PTY-backed live session model in this PR. The safe change here is explicit early validation.

Validation

Automated

cargo test
Added regression coverage for:
- background runner env propagation
- Claude headless validation
- PTY child launch env
- inject listener accept helper behavior / timing resilience

Manual

./target/release/hcom codex --headless --go
- launches successfully
- transitions to listening
- round-trip message delivery works
./target/release/hcom opencode --headless --go
- launches successfully
- transitions to listening
- round-trip message delivery works
./target/release/hcom claude --headless --go
- now fails fast with a clear prompt/task requirement
./target/release/hcom claude --headless --go 'Ping!'
- launches successfully
- later message delivery round-trip works

Active-binary PTY re-check

Rebuilt release binary
Launched fresh PTY-backed headless Codex from ./target/release/hcom
Observed normal delivery.start / notify.registered
Observed no new recent proxy.poll_spin_trace / POLLNVAL log entries
Focused process check showed normal CPU for PTY wrapper and child

Final pre-PR hardening

Updated PTY stdin handling so non-TTY POLLHUP disables stdin polling instead of breaking the PTY loop
Made the inject accept regression test resilient to non-blocking localhost timing
Reran focused PTY tests and full cargo test successfully

Deferred follow-up

Not included in this PR:

true PTY-backed headless Claude as a live background session model
deeper investigation of Claude initial-prompt swallowing on detached launches
any broader resume/fork cleanup around tool marker vars beyond the validated fixes here
(those are deferred follow-up items rather than blockers for restoring stable headless Codex/OpenCode PTY launches)

Commit stack

fix(pty): don't treat stdin EOF as terminal disconnect when stdin is not a TTY
fix(pty): mark PTY child processes as hcom-launched
fix(pty): fully remove stdin from poll set when not a TTY
fix(headless): restore stable PTY background launches
fix(pty): tighten headless stdin and inject test handling

…not a TTY When running hcom in headless/background mode, the launcher spawns the PTY process with stdin=/dev/null. On /dev/null, poll() immediately returns POLLIN (always readable), and the subsequent read returns Ok(0) (EOF). The existing code treated any stdin EOF as 'terminal gone' and broke the poll loop, which then killed the child agent with SIGTERM and marked it as launch_failed. However, the poll-timeout path already had an isatty() guard for this exact scenario — but when stdin=/dev/null, poll() never times out because /dev/null is always readable, so the isatty check was bypassed. Fix: guard the Ok(0) => break path with an isatty check. Only break on stdin EOF if stdin is actually a TTY. If stdin is /dev/null or a pipe, ignore EOF and continue polling. Fixes: background/headless agent launches being killed immediately.

Set HCOM_LAUNCHED=1 in hcom pty child environments so downstream hooks and tool integrations correctly recognize PTY-managed sessions as hcom launches.

The previous fix used PollFlags::empty() for stdin when poll_stdin=false, but on macOS poll() may still return immediately for a readable fd even with events=0 (e.g. /dev/null in headless mode). This caused a tight busy-wait loop consuming 99.9% CPU. Fix: dynamically build poll_fds and completely omit stdin when we're no longer polling it. Adjust inject_listener and client indices accordingly. Also wraps stdin handling in 'if poll_stdin' to avoid accidentally checking inject listener revents when stdin is absent.

- propagate HCOM_INSTANCE_NAME into background PTY runner env - apply tool-specific PTY env on background launches - back off inject listener after WouldBlock on macOS - drop stdin from PTY polling on POLLNVAL - reject bare Claude headless launches without a prompt/task - add regression tests for launcher env and PTY helpers

- ignore stdin POLLHUP for non-TTY headless PTY sessions - make inject accept regression test resilient to non-blocking timing

Copilot

Pull request overview

Restores stable macOS headless/background launches on the PTY-backed path by fixing env propagation, hardening the PTY poll loop against non-interactive stdin edge cases, and adding an explicit guard for unsupported “bare” headless Claude launches.

Changes:

Propagate instance/tool-specific environment into background PTY runner launches and mark PTY-spawned children as hcom-launched.
Harden the PTY event loop for macOS headless stdin/poll quirks and add inject-listener backoff behavior.
Fail fast for hcom claude --headless when no prompt/task is provided, with regression tests.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`src/pty/mod.rs`	Adds child env injection support and updates poll loop behavior for stdin + inject listener handling.
`src/pty/inject.rs`	Changes `accept()` to return whether a connection was accepted; adds focused tests.
`src/main.rs`	Supplies PTY child env (`HCOM_LAUNCHED=1`) and adds a regression test.
`src/launcher.rs`	Ensures background PTY runner env includes `HCOM_INSTANCE_NAME` and `tool_extra_env`; adds tests.
`src/commands/launch.rs`	Adds validation to reject bare headless Claude without prompt/task; adds tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-22T19:39:33Z

+            // Only include stdin in poll set while we're actively polling it.
+            // When stdin is a non-TTY (e.g. /dev/null in headless mode), we stop
+            // polling it to avoid busy-waiting — but we must fully remove it from
+            // the poll set, not just pass empty events, because some platforms
+            // (macOS) may still return immediately for a readable fd even with
+            // events=0.
+            if poll_stdin {
+                poll_fds.push(PollFd::new(stdin_borrowed, PollFlags::POLLIN));
+            }


poll_stdin can become false specifically when poll reports POLLNVAL (invalid/closed stdin fd). In that case, the next loop iteration still constructs stdin_borrowed via unsafe { BorrowedFd::borrow_raw(stdin_raw) } (a few lines above this hunk) even though stdin may be closed, which is undefined behavior. Suggestion: only create/borrow stdin fd inside the if poll_stdin { ... } block (or switch the isatty checks to libc::isatty(stdin_raw) so no BorrowedFd is needed for potentially-invalid fds).

Copilot · 2026-04-22T19:39:34Z

+            // Include the inject listener unless we're in backoff (macOS spurious POLLIN).
+            // Reset backoff here so it applies for exactly one iteration.
+            let include_listener = !listener_backoff;
+            listener_backoff = false;
+            let inject_listener_idx: Option<usize> = if include_listener {
+                let idx = poll_fds.len();
+                poll_fds.push(PollFd::new(inject_listener_fd, PollFlags::POLLIN));
+                Some(idx)
+            } else {
+                None
+            };


The one-iteration listener_backoff strategy can introduce large accept latency: when the listener is excluded, poll() can block for up to poll_timeout (10s) if the PTY is idle, so new inject connections (including delivery thread injections) may not be accepted until the timeout expires. Consider bounding this by using a much shorter poll timeout during backoff, or keeping the listener in the poll set and applying a small sleep/yield when accept() returns WouldBlock.

aannoo · 2026-04-23T01:55:23Z

nice

@mafo

…istener backoff (#27) * fix(status): unify hook detection with verify_* and silence expected dev_root warnings hcom status used its own naive detection (substring match on settings.json, file-existence for the opencode plugin) while hcom hooks used the canonical verify_* functions. After hcom hooks remove, status kept reporting Claude/Gemini as installed because permission strings like "Bash(hcom ...)" still contained "hcom". Route all four tool checks through the same verify_* functions so status and hooks never disagree. router.read_dev_root_from_kv logged WARN on every expected "not configured" state (fresh HCOM_DIR where the db doesn't exist yet, unset dev_root key returning QueryReturnedNoRows). These fired on every hcom invocation and every hook dispatch. Treat them silently; keep WARN for real errors (permission denied, corruption, etc.). * fix(launch): normalize claude + --headless + prompt to print-headless Before: `hcom claude --headless --hcom-prompt 'task'` (or a positional task) passed validation but launched as a detached plain `claude` — the spec never saw `-p`/`--print`, so `add_background_defaults` didn't fire and the child never got `--output-format stream-json --verbose`. Thread initial_prompt into prepare_launch_execution. When tool == "claude" && background && a prompt is present, inject `-p` into the merged args before calling add_background_defaults so the child always runs through claude's print mode. Claude-specific — --headless semantics for codex/gemini/opencode unchanged. Addresses finding 1 of the PR #26 post-merge review. * fix(pty): cap poll timeout during inject-listener backoff The macOS kqueue spurious-POLLIN workaround drops the inject listener from the poll set for one iteration after WouldBlock. The poll timeout stayed at 10s (or 5s with debug) for that iteration, so an inject connection arriving while the listener was backed off could wait the full timeout before the listener re-entered the poll set. Cap the poll timeout at 100ms whenever the listener is excluded so the next iteration (with the listener re-included) runs quickly. Addresses finding 2 of the PR #26 post-merge review. * fix(scripts): pass -p explicitly to bundled claude --headless launches The launcher now normalizes claude + --headless + prompt to inject -p, but adding it explicitly in the bundled scripts makes intent clear and is belt-and-suspenders. Guard the non-claude cases (fatcow/debate can launch any tool) so -p only appears when the tool is claude. * fix(relay): enforce claude headless validation on remote launch path The local CLI short-circuits into dispatch_remote before local validate/prepare fires when --device is set, so the remote handler has to enforce the Claude invariant itself. Before this commit, bare remote `hcom claude --headless --device X` would prepare as background=true pty=false with no -p and fall straight through launcher::launch, preserving the unsupported detached plain-claude path on the remote. Call validate_claude_headless_launch after prepare_remote_launch so remote and local paths reject the same inputs. Bare claude --headless returns the same prompt/task error; claude --headless --hcom-prompt is normalized through the -p injection in prepare_launch_execution. Addresses @mafo's blocker in the PR #27 review thread. --------- Co-authored-by: aannoo <aannoo@users.noreply.github.com>

PR #26 landed `--headless` for claude via the detached print-mode path (-p --output-format stream-json --verbose, one-shot, exits after the prompt), and deferred a true PTY-backed live headless session. This commit adds that path. User-facing - New `--pty` launch flag. For claude, combines with `--headless` to launch the interactive TUI in a PTY wrapper inside a detached runner — same shape other tools already use, giving claude a live background session that accepts hcom inject. `--pty` alone (no --headless) keeps today's foreground claude-pty behavior. - `hcom claude --headless --pty` with no prompt is now a valid launch; the session sits idle waiting for hcom messages. - `--pty` added to `SHARED_LAUNCH_FLAGS` so it shows up in help for both fresh launches and resume/fork. Plumbing - `HcomLaunchFlags.pty` parsed in `extract_launch_flags`; forwarded into `prepare_launch_execution` and the dispatch-remote payload. - `prepare_launch_execution` now skips the -p injection + background defaults when `--pty` is set — the PTY wrapper hosts the TUI, and forcing print mode there would end the session on first reply. - `validate_claude_headless_launch` gets a `use_pty` parameter. Bare `claude --headless --pty` is allowed; the prompt-required invariant still applies to the NativePrint path. - `RemoteLaunchRequest.pty` carries the flag across the relay. Launcher - Drop the `bail!("Claude PTY does not support headless/background mode")` guard in `launch()`. - Route `LaunchTool::ClaudePty` through `launch_pty_or_background` the way gemini/codex/opencode do, so background mode spawns the PTY wrapper in a detached runner. - Introduce `LaunchBackend` enum (`InteractiveVisible` / `HeadlessPty` / `NativePrint`). Resolved once in `launch()` from the already- prepared (tool, background, pty) triple. Dispatch branches on the enum where it clarifies intent (claude's NativePrint-vs-visible split); the other tools' paths already encode the split via LaunchTool and use `launch_pty_or_background` directly. Tests - Unit: `--pty --headless` claude does not inject -p and keeps use_pty true; interactive claude-pty unchanged. - Unit: validator allows no-prompt `--pty --headless` and still rejects no-prompt non-PTY headless. - Unit: `LaunchBackend::resolve` matrix for each (tool, background, pty) combination. Live - `env -u HCOM_INSTANCE_NAME hcom claude --headless --pty --hcom-prompt 'respond via hcom send then stop'` → round-trip reply. - `env -u HCOM_INSTANCE_NAME hcom claude --headless --pty` (no prompt) → reaches listening, replies to `hcom send`.

@mafo

…#28) * feat: PTY-headless claude via --pty, plus LaunchBackend enum PR #26 landed `--headless` for claude via the detached print-mode path (-p --output-format stream-json --verbose, one-shot, exits after the prompt), and deferred a true PTY-backed live headless session. This commit adds that path. User-facing - New `--pty` launch flag. For claude, combines with `--headless` to launch the interactive TUI in a PTY wrapper inside a detached runner — same shape other tools already use, giving claude a live background session that accepts hcom inject. `--pty` alone (no --headless) keeps today's foreground claude-pty behavior. - `hcom claude --headless --pty` with no prompt is now a valid launch; the session sits idle waiting for hcom messages. - `--pty` added to `SHARED_LAUNCH_FLAGS` so it shows up in help for both fresh launches and resume/fork. Plumbing - `HcomLaunchFlags.pty` parsed in `extract_launch_flags`; forwarded into `prepare_launch_execution` and the dispatch-remote payload. - `prepare_launch_execution` now skips the -p injection + background defaults when `--pty` is set — the PTY wrapper hosts the TUI, and forcing print mode there would end the session on first reply. - `validate_claude_headless_launch` gets a `use_pty` parameter. Bare `claude --headless --pty` is allowed; the prompt-required invariant still applies to the NativePrint path. - `RemoteLaunchRequest.pty` carries the flag across the relay. Launcher - Drop the `bail!("Claude PTY does not support headless/background mode")` guard in `launch()`. - Route `LaunchTool::ClaudePty` through `launch_pty_or_background` the way gemini/codex/opencode do, so background mode spawns the PTY wrapper in a detached runner. - Introduce `LaunchBackend` enum (`InteractiveVisible` / `HeadlessPty` / `NativePrint`). Resolved once in `launch()` from the already- prepared (tool, background, pty) triple. Dispatch branches on the enum where it clarifies intent (claude's NativePrint-vs-visible split); the other tools' paths already encode the split via LaunchTool and use `launch_pty_or_background` directly. Tests - Unit: `--pty --headless` claude does not inject -p and keeps use_pty true; interactive claude-pty unchanged. - Unit: validator allows no-prompt `--pty --headless` and still rejects no-prompt non-PTY headless. - Unit: `LaunchBackend::resolve` matrix for each (tool, background, pty) combination. Live - `env -u HCOM_INSTANCE_NAME hcom claude --headless --pty --hcom-prompt 'respond via hcom send then stop'` → round-trip reply. - `env -u HCOM_INSTANCE_NAME hcom claude --headless --pty` (no prompt) → reaches listening, replies to `hcom send`. * fix(launch): reject --pty + -p/--print conflict `hcom claude --headless --pty -p 'task'` (or bare `--pty -p`) routed through ClaudePty and spawned the PTY wrapper, but -p is claude's one-shot print mode — it answers and exits, tearing down the live session the moment the first reply lands. Silent misbehavior. Reject the combination explicitly in validate_claude_headless_launch rather than stripping -p, so the user notices: --pty means live TUI session, -p means print mode. Shared validator runs on both local and remote paths, so the remote path inherits the check. Addresses @mafo's edge-case finding on PR #28. --------- Co-authored-by: aannoo <aannoo@users.noreply.github.com>

mmkzer0 added 5 commits April 22, 2026 19:24

fix(pty): mark PTY child processes as hcom-launched

f2edfdf

Set HCOM_LAUNCHED=1 in hcom pty child environments so downstream hooks and tool integrations correctly recognize PTY-managed sessions as hcom launches.

fix(pty): tighten headless stdin and inject test handling

3120943

- ignore stdin POLLHUP for non-TTY headless PTY sessions - make inject accept regression test resilient to non-blocking timing

Copilot AI review requested due to automatic review settings April 22, 2026 19:34

Copilot started reviewing on behalf of mmkzer0 April 22, 2026 19:34 View session

Copilot AI reviewed Apr 22, 2026

View reviewed changes

aannoo merged commit 3716c1c into aannoo:main Apr 23, 2026
13 of 14 checks passed

This was referenced Apr 23, 2026

Normalize claude + --headless + prompt to print-headless; bound PTY listener backoff #27

Merged

Full headless UX: PTY-headless claude, --pty flag, LaunchBackend enum #28

Merged

mmkzer0 deleted the fix/headless-pty-eof branch April 23, 2026 08:57

chenrui333 mentioned this pull request Apr 24, 2026

hcom 0.7.13 chenrui333/homebrew-tap#6334

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix headless PTY background launches on macOS#26

Fix headless PTY background launches on macOS#26
aannoo merged 5 commits intoaannoo:mainfrom
mmkzer0:fix/headless-pty-eof

mmkzer0 commented Apr 22, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

Uh oh!

aannoo commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

mmkzer0 commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fix headless PTY background launches on macOS

Summary

Motivation

What changed

Launcher / background PTY env

PTY loop hardening

PTY child identity

Claude safety guard

Notes / implementation details

Validation

Automated

Manual

Active-binary PTY re-check

Final pre-PR hardening

Deferred follow-up

Commit stack

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

aannoo commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mmkzer0 commented Apr 22, 2026 •

edited

Loading