Drain ClaudeSession stderr into the logger (progress on #921) #922
Merged
ClaudeSession._spawn sets stderr=subprocess.PIPE but nothing ever reads it. Once the pipe buffer fills, the claude subprocess blocks on its next stderr write and eventually dies silently. Symptom: on boot, the first set_status prompt finds the session dead with no diagnostic, and both workers crash-loop on "Claude session died during prompt". Start a daemon reader thread from _spawn that reads stderr line-by-line and forwards each line through the repo's logger at INFO with a ClaudeSession[pid=N] stderr: prefix. The thread exits on stderr EOF (subprocess gone) or on OSError/ValueError (pipe closed), so no manual cleanup is needed. This is primarily an instrumentation fix — claude's own complaint will now land in ~/log/fido.log instead of being swallowed. It also removes the pipe-buffer-fills-and-claude-blocks failure mode as a side effect, which may be the root cause of #921 on its own.
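The reader thread described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `spawn_with_drain`, `_drain_stderr`, and the logger name `fido.claude_session` are hypothetical names, and the real `_spawn` presumably passes other arguments to the subprocess.

```python
import logging
import subprocess
import threading

# Hypothetical logger name; the PR only says "the repo's logger".
log = logging.getLogger("fido.claude_session")


def _drain_stderr(proc: subprocess.Popen) -> None:
    """Forward each stderr line of proc to the logger until EOF or pipe close."""
    prefix = f"ClaudeSession[pid={proc.pid}] stderr: "
    try:
        # Iteration ends on EOF, i.e. once the subprocess has closed its stderr.
        for line in proc.stderr:
            log.info("%s%s", prefix, line.rstrip("\n"))
    except (OSError, ValueError):
        # The pipe was closed underneath the reader; nothing left to forward.
        pass


def spawn_with_drain(argv: list[str]) -> tuple[subprocess.Popen, threading.Thread]:
    """Start argv with piped stdio and a daemon thread that drains stderr."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",  # tolerate undecodable bytes instead of raising
    )
    reader = threading.Thread(
        target=_drain_stderr,
        args=(proc,),
        daemon=True,  # never keeps the interpreter alive on shutdown
        name=f"claude-stderr-{proc.pid}",
    )
    reader.start()
    return proc, reader
```

Because the thread is a daemon and its loop ends on EOF, the caller does not need to join it; the thread object is returned here only to make the sketch easy to test.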
rhencke approved these changes Apr 24, 2026
FidoCanCode added a commit that referenced this pull request Apr 24, 2026
Fixes #921.

Three separate modules (`gh_status.py`, `status.py`, `worker.py`) still had their own pre-rename copies of the filesystem walk to `sub/`. Only `config.py` was updated when the package moved from `kennel/` to `src/fido/` (#920). The other three kept pointing at the non-existent `/workspace/src/sub/`, which is why ClaudeSession launched claude with `--system-prompt-file /workspace/src/sub/persona.md` and claude exited silently with:

```
ClaudeSession[pid=54] stderr: Error: System prompt file not found: /workspace/src/sub/persona.md
```

That error was hidden behind an unread stderr pipe until #922 landed the stderr drain and made it visible.

This PR fixes the actual crash by moving all four callers onto a single `fido.config.default_sub_dir()` helper; inline `parents[N] / "sub"` is now a review-time smell.

`./fido ci` green.

Co-authored-by: Fido Can Code <190991155+FidoCanCode@users.noreply.github.com>
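A single shared helper of the kind the commit message describes might look like the sketch below. The commit names `fido.config.default_sub_dir()` but does not show its body. The `anchor` parameter and the assumed layout (`<repo>/src/fido/config.py` next to `<repo>/sub/`) are inferred from the paths in the message and are not confirmed.

```python
from __future__ import annotations

from pathlib import Path


def default_sub_dir(anchor: Path | None = None) -> Path:
    """Return the repo-level sub/ directory.

    Assumed layout (an inference, not taken from the repo):
        <repo>/src/fido/config.py
        <repo>/sub/
    `anchor` is a hypothetical parameter added here for testability; it
    stands in for the path of the module that lives in src/fido/.
    """
    module_file = Path(anchor) if anchor is not None else Path(__file__)
    # parents[0] is src/fido, parents[1] is src, parents[2] is the repo root.
    return module_file.resolve().parents[2] / "sub"
```

Under the same assumed layout, a module at `/workspace/src/fido/worker.py` that still computes `parents[1] / "sub"` ends up at `/workspace/src/sub`, which matches the stale path in the error above. Keeping the parent count in one function means a future package move only has to be corrected in one place.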
This was referenced Apr 24, 2026
Progress on #921.

`ClaudeSession._spawn` sets `stderr=subprocess.PIPE` but nothing ever reads it. Once the pipe buffer fills, the claude subprocess blocks on its next stderr write and eventually dies silently. Symptom: on boot, the first `set_status` prompt finds the session dead with no diagnostic, and both workers crash-loop on `Claude session died during prompt`.

This PR starts a daemon reader thread from `_spawn` that reads stderr line-by-line and forwards each line through the repo's logger at INFO with a `ClaudeSession[pid=N] stderr:` prefix. The thread exits on stderr EOF (subprocess gone) or on OSError/ValueError (pipe closed), so no manual cleanup is needed.

Primarily an instrumentation fix: claude's own complaint will now land in `~/log/fido.log` instead of being swallowed. It also removes the pipe-buffer-fills-and-claude-blocks failure mode as a side effect, which may be the root cause of #921 on its own.

The shared-state-contention theory (host claude-code + container claude writing to the same `~/.claude.json`) was ruled out by a manual test: user killed the host claude-code session, started fido from a plain shell, and got the identical crash.

`./fido ci` green.
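The pipe-buffer failure mode mentioned above can be reproduced in isolation. The sketch below is a standalone illustration and does not use fido's code: a throwaway Python child writes 1 MiB to stderr, which is assumed to exceed the default OS pipe buffer (commonly 64 KiB on Linux). The function names are hypothetical.

```python
import subprocess
import sys

# Child program: write 1 MiB to stderr, then exit.
CHILD = (
    "import sys\n"
    "sys.stderr.write('x' * (1 << 20))\n"
    "sys.stderr.flush()\n"
)


def child_blocks_on_unread_stderr(timeout: float = 2.0) -> bool:
    """Return True if the child is still running after `timeout` seconds
    when nobody reads its stderr pipe."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stderr=subprocess.PIPE)
    try:
        proc.wait(timeout=timeout)
        return False
    except subprocess.TimeoutExpired:
        return True
    finally:
        # Clean up whether or not the child was stuck.
        proc.kill()
        proc.stderr.close()
        proc.wait()


def child_finishes_when_drained(timeout: float = 30.0) -> bool:
    """Return True if the same child exits cleanly once stderr is read."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stderr=subprocess.PIPE)
    _, err = proc.communicate(timeout=timeout)  # reads stderr to EOF
    return proc.returncode == 0 and len(err) == 1 << 20
```

The first function shows the child stuck in its write with no reader attached. The second shows the same child exiting normally once something drains the pipe, which is the role the daemon reader thread plays.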