fix: detect early PTY exit in daemon sessions (#320) #341

Merged: Wirasm merged 2 commits into main from kild/320-daemon-pty-exit on Feb 11, 2026

@Wirasm Wirasm commented Feb 11, 2026

Summary

When using kild open --daemon or kild open --resume --daemon, the daemon PTY session could exit immediately after spawning. kild open still reported success (session set to Active), but a later kild attach failed with the confusing error "session not running", with no diagnostic information about why.

  • Added a 200ms post-creation health check that polls the daemon after PTY spawn. If the PTY has already exited, the failure is reported immediately, together with the exit code and scrollback output
  • Added exit_code field to DaemonSession and SessionInfo so the exit code is stored and surfaced via IPC
  • Added get_session_info() and read_scrollback() daemon client functions for the health check
  • Applied the same fix to both create_session and open_session daemon paths
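
The health check described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the `DaemonClient` trait, `SessionStatus`, `HealthCheck`, `FixedClient`, and `check_after_spawn` are hypothetical names standing in for the real `get_session_info()` and `read_scrollback()` client functions, whose signatures are not shown here.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical, simplified view of what the daemon reports for a session.
#[derive(Debug, Clone, PartialEq)]
pub enum SessionStatus {
    Running,
    Stopped { exit_code: Option<i32> },
}

/// Hypothetical stand-in for the daemon client functions added in this PR.
pub trait DaemonClient {
    fn session_status(&self, id: &str) -> Result<SessionStatus, String>;
    fn scrollback(&self, id: &str) -> Result<String, String>;
}

/// Outcome of the post-creation health check.
#[derive(Debug, PartialEq)]
pub enum HealthCheck {
    /// The PTY is still running after the grace period.
    Healthy,
    /// The daemon could not be reached; the caller proceeds as before.
    Skipped,
    /// The PTY exited within the grace period.
    ExitedEarly { exit_code: Option<i32>, scrollback: String },
}

/// Wait for the grace period, then ask the daemon whether the PTY survived.
pub fn check_after_spawn<C: DaemonClient>(client: &C, id: &str, grace: Duration) -> HealthCheck {
    thread::sleep(grace);
    match client.session_status(id) {
        // An unreachable daemon must not turn a working session into a failure.
        Err(_) => HealthCheck::Skipped,
        Ok(SessionStatus::Running) => HealthCheck::Healthy,
        Ok(SessionStatus::Stopped { exit_code }) => HealthCheck::ExitedEarly {
            exit_code,
            // Missing scrollback degrades to an empty string.
            scrollback: client.scrollback(id).unwrap_or_default(),
        },
    }
}

/// Fixed-response client for trying the function out without a daemon.
pub struct FixedClient {
    pub status: Result<SessionStatus, String>,
    pub scrollback: Result<String, String>,
}

impl DaemonClient for FixedClient {
    fn session_status(&self, _id: &str) -> Result<SessionStatus, String> {
        self.status.clone()
    }
    fn scrollback(&self, _id: &str) -> Result<String, String> {
        self.scrollback.clone()
    }
}
```

Calling `check_after_spawn(&client, id, Duration::from_millis(200))` right after session creation would let the caller turn `ExitedEarly` into an error and clean up the stopped session, while `Skipped` keeps the previous behavior when the daemon is unreachable.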

Root Cause

No post-creation health check existed in the daemon code path. create_pty_session() returned as soon as the daemon spawned the PTY, and the exit code was logged but discarded. Fast-failing processes (bad resume session, missing binary, env issues) would exit within milliseconds, but kild-core saved the session as Active and returned success.

Changes

| File | Change |
| --- | --- |
| crates/kild-daemon/src/session/state.rs | Added exit_code field, getter, setter to DaemonSession |
| crates/kild-daemon/src/types.rs | Added exit_code to SessionInfo wire type |
| crates/kild-daemon/src/session/manager.rs | Store exit_code in handle_pty_exit before set_stopped |
| crates/kild-core/src/daemon/client.rs | Added get_session_info() and read_scrollback() functions |
| crates/kild-core/src/sessions/errors.rs | Added DaemonPtyExitedEarly error variant with diagnostics |
| crates/kild-core/src/sessions/handler.rs | Added 200ms health check in both daemon code paths |
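
The ordering in the manager.rs change (store the exit code, then transition to Stopped) can be illustrated with a minimal sketch. The struct layout, the `Option<i32>` type, the `SessionState` enum, and the `set_exit_code` setter name are assumptions made for illustration; only `DaemonSession`, `exit_code`, `handle_pty_exit`, and `set_stopped` are names taken from this PR.

```rust
/// Hypothetical, reduced session state; the real type likely has more states.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SessionState {
    Running,
    Stopped,
}

/// Reduced stand-in for DaemonSession with the new exit_code field.
#[derive(Debug)]
pub struct DaemonSession {
    state: SessionState,
    exit_code: Option<i32>,
}

impl DaemonSession {
    pub fn new() -> Self {
        Self { state: SessionState::Running, exit_code: None }
    }

    pub fn state(&self) -> SessionState {
        self.state
    }

    /// Getter: None while running, or when no exit code was recorded.
    pub fn exit_code(&self) -> Option<i32> {
        self.exit_code
    }

    /// Setter name is an assumption; the PR only says a setter was added.
    pub fn set_exit_code(&mut self, code: Option<i32>) {
        self.exit_code = code;
    }

    pub fn set_stopped(&mut self) {
        self.state = SessionState::Stopped;
    }
}

/// Record the exit code first, so that anyone who observes the Stopped
/// state can also read the code that caused it.
pub fn handle_pty_exit(session: &mut DaemonSession, exit_code: Option<i32>) {
    session.set_exit_code(exit_code);
    session.set_stopped();
}
```

Storing the code before the state change means a client that polls and sees Stopped never finds the exit code still unset because of ordering alone.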

Testing

  • cargo fmt --check — clean
  • cargo clippy --all -- -D warnings — 0 warnings
  • cargo test --all — all tests pass
  • cargo build --all — clean build

Validation

cargo fmt --check && cargo clippy --all -- -D warnings && cargo test --all && cargo build --all

Issue

Fixes #320

When a daemon PTY process exits immediately after spawn (bad resume
session, missing binary, env issue), kild now detects it within 200ms
instead of letting the user discover it later via `kild attach`.

Changes:
- Add exit_code field to DaemonSession and SessionInfo wire type
- Store exit_code in handle_pty_exit before transitioning to Stopped
- Add get_session_info() and read_scrollback() daemon client functions
- Add DaemonPtyExitedEarly error variant with exit code and scrollback
- Add post-creation health check in both create_session and open_session
  daemon paths (200ms grace period + daemon status poll)
- Clean up stopped daemon session on early exit detection

Fixes #320

Wirasm commented Feb 11, 2026

Self Code Review

Summary

Clean implementation that directly addresses the root cause. No critical issues found.

Findings

Strengths

  • Root cause correctly addressed: the 200ms health check catches fast-failing PTY processes before the user sees a confusing "session not running" error
  • Exit code plumbing is clean: field, getter/setter, wire type, storage in handle_pty_exit — all follow existing patterns
  • DaemonPtyExitedEarly error provides excellent diagnostics (exit code + last 20 lines of scrollback)
  • Backward-compatible wire protocol change (skip_serializing_if)
  • Both create_session and open_session daemon paths covered
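
As a rough illustration of the diagnostics mentioned above, the sketch below builds an error message from an exit code and the last 20 lines of scrollback. The variant name `DaemonPtyExitedEarly` comes from this PR; the `SessionError` enum name, its field names, the message wording, and the `tail_lines` helper are assumptions.

```rust
use std::fmt;

/// Number of scrollback lines kept for the error message (20, per the review above).
pub const SCROLLBACK_TAIL_LINES: usize = 20;

/// Return the last `n` lines of `text`, joined with newlines.
pub fn tail_lines(text: &str, n: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    let start = lines.len().saturating_sub(n);
    lines[start..].join("\n")
}

/// Hypothetical error enum holding only the variant discussed here.
#[derive(Debug)]
pub enum SessionError {
    DaemonPtyExitedEarly {
        exit_code: Option<i32>,
        scrollback_tail: String,
    },
}

impl fmt::Display for SessionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SessionError::DaemonPtyExitedEarly { exit_code, scrollback_tail } => {
                match exit_code {
                    Some(code) => write!(f, "daemon PTY exited early (exit code {code})")?,
                    None => write!(f, "daemon PTY exited early (exit code unknown)")?,
                }
                // Empty scrollback is fine: the exit code alone is still useful.
                if !scrollback_tail.is_empty() {
                    write!(f, "\nlast output:\n{scrollback_tail}")?;
                }
                Ok(())
            }
        }
    }
}

impl std::error::Error for SessionError {}
```

Trimming to a fixed tail keeps the error readable even when the failed process printed a lot of output before exiting.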

Edge Cases Handled

  • Daemon unreachable during health check → proceeds without check (no regression)
  • Empty scrollback → empty string in error message (exit code alone still useful)
  • Base64 decode failure → empty vec via unwrap_or_default()
  • Stopped daemon session → cleanup via destroy_daemon_session(force=true)

Security

No security concerns identified.

Checklist

  • Fix addresses root cause from investigation
  • Code follows codebase patterns
  • Tests cover the change
  • No obvious bugs introduced

@Wirasm Wirasm merged commit dee5ff2 into main Feb 11, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

Daemon PTY sessions exit immediately after open/resume