Fix control_response request_id check — read from nested response (closes #975) by FidoCanCode · Pull Request #977 · FidoCanCode/home

FidoCanCode · 2026-04-25T16:11:24Z

Fixes #975.

Real bug

Probed the actual claude-code subprocess (claude --version → 2.1.120). Its control_response shape is:

{"type": "control_response",
 "response": {"subtype": "success", "request_id": "..."}}

request_id is nested under response. Fido's _send_control_set_model checked obj.get("request_id") at the top level, which always returned None, so the predicate never matched. Every switch_model call that actually sent a control_request hung until the subprocess was killed.
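A minimal sketch of the mismatch, assuming only the response shape quoted above. The id value "req-1" is a placeholder, not something observed in the logs.

```python
import json

# Shape reported above for claude-code 2.1.120; "req-1" is a placeholder id.
line = '{"type": "control_response", "response": {"subtype": "success", "request_id": "req-1"}}'
obj = json.loads(line)

# Old lookup: request_id is not a top-level key, so this is None.
top_level = obj.get("request_id")

# Fixed lookup: read it from the nested "response" object.
nested = (obj.get("response") or {}).get("request_id")

print(top_level)  # None
print(nested)     # req-1
```

Because None never equals the expected id, a wait loop built on the top-level lookup has no way to terminate on a well-formed response.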

Proof

$ grep -c 'switch_model: now on model=' ~/log/fido.log
0
$ grep -c 'switch_model:.*(control_request)' ~/log/fido.log
9

Zero successful returns. Nine attempts. Every one hung. Concrete trace:

15:40:23 [home] switch_model: opus → haiku
(no [home] log lines for 5 minutes)
15:45:23 [home] worker started (fresh thread post-fido-restart)

The session was wedged the entire window. Same shape across every switch_model log line in the file.

Why fido seemed to work anyway

Many switch_model calls are no-ops (target == current model) and return at line 1102 before sending the control_request — so sessions that didn't need an actual model change kept working. The 15:40:30 webhook test that succeeded landed on a freshly-booted confusio session that was already on opus from spawn — no switch needed. Bug only fires on real model changes.
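The early return described above can be illustrated roughly as follows. SessionSketch, its sent list, and the ("set_model", target) record are hypothetical stand-ins for illustration; they are not Fido's real class or the real control_request payload.

```python
class SessionSketch:
    """Hypothetical stand-in for a session; not Fido's real class."""

    def __init__(self, model):
        self.model = model
        self.sent = []  # records would-be control requests

    def switch_model(self, target):
        # No-op path: already on the target model, so nothing is sent
        # and nothing has to be awaited.
        if target == self.model:
            return False
        # Real-change path: a request goes out and a matching
        # control_response must be awaited. This is where the hang occurred.
        self.sent.append(("set_model", target))
        self.model = target
        return True


session = SessionSketch("opus")
print(session.switch_model("opus"))   # False, nothing sent
print(session.switch_model("haiku"))  # True, one request recorded
```

Only the second call reaches the code path that waits on a control_response, which is consistent with the bug staying hidden in sessions that never changed model.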

Fix

claude.py:1023-1028:

if obj.get("type") == "control_response":
    response = obj.get("response") or {}
    if response.get("request_id") == request_id:
        return
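For context, the fixed predicate might sit inside a read loop along these lines. This is a sketch under assumptions: wait_for_control_response and its line-iterable input are hypothetical, and the real _send_control_set_model may read from the subprocess quite differently.

```python
import json
from typing import Iterable


def wait_for_control_response(lines: Iterable[str], request_id: str) -> bool:
    """Return True once a control_response for request_id is seen (hypothetical helper)."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip non-JSON output
        if not isinstance(obj, dict):
            continue
        if obj.get("type") == "control_response":
            # "or {}" covers a missing or null response field.
            response = obj.get("response") or {}
            if isinstance(response, dict) and response.get("request_id") == request_id:
                return True
    return False  # stream ended without a matching response


stream = [
    '{"type": "assistant"}',
    '{"type": "control_response", "response": {"subtype": "success", "request_id": "req-1"}}',
]
print(wait_for_control_response(stream, "req-1"))  # True
```

Returning False when the stream ends is a choice made for this sketch so that it terminates; the PR does not say how the real code behaves at end of stream.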

Test

Existing test fixtures had the wrong shape too — they put request_id at the top level, matching fido's broken expectation. Both helpers (TestClaudeSessionSwitchModel._make_response_line and TestClaudeSessionSendControlSetModel._make_response_line) updated to emit the real nested shape.

New regression test test_ignores_top_level_request_id_in_control_response explicitly emits a malformed top-level placement first, then the correct nested one — asserts the malformed one is skipped and the wait completes only on the nested response. This catches any future regression that re-introduces top-level lookup.
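The regression test's idea could be sketched as below. The helper names and lines_consumed_until_match are hypothetical and only loosely modeled on _make_response_line; the actual test in the PR may be structured differently.

```python
import json


def make_response_line(request_id):
    """Emit the real nested shape (hypothetical helper)."""
    return json.dumps({
        "type": "control_response",
        "response": {"subtype": "success", "request_id": request_id},
    })


def make_top_level_line(request_id):
    """Emit the malformed shape the old fixtures used."""
    return json.dumps({"type": "control_response", "request_id": request_id})


def lines_consumed_until_match(lines, request_id):
    """Stand-in for the wait loop: how many lines were read before a match."""
    for count, raw in enumerate(lines, start=1):
        obj = json.loads(raw)
        if obj.get("type") == "control_response":
            response = obj.get("response") or {}
            if response.get("request_id") == request_id:
                return count
    return None


def test_ignores_top_level_request_id_sketch():
    lines = [make_top_level_line("req-1"), make_response_line("req-1")]
    # The malformed first line must be skipped; the wait completes on line 2.
    assert lines_consumed_until_match(lines, "req-1") == 2
    # With only the malformed shape there is never a match.
    assert lines_consumed_until_match([make_top_level_line("req-1")], "req-1") is None


test_ignores_top_level_request_id_sketch()
```

Emitting the malformed line first is what makes the test sensitive: a predicate that still read the top level would match on line 1 instead of line 2.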

All 264 claude tests pass.

Related

Bug: switch_model hangs when called with _in_turn=True after preempt-cancelled drain #975 (this issue)
Use control_request set_model for mid-session model switch instead of session restart (closes #852) #971 (landed control_request set_model with the broken predicate)
Bug: webhook handler's first prompt aborted by leftover cancel flag from worker preempt #973 / #974 (cancel-leak fix; uncovered the hang underneath)
Lint: ban silent .get(literal) and 'k in t and t[k]' patterns at non-dynamic call sites #976 (silent .get linter; would have caught this at write time)

…oses #975)

Real claude-code (verified against 2.1.120) emits control_response with request_id nested under response, not at the top level:

{"type": "control_response", "response": {"subtype": "success", "request_id": "..."}}

Fido's _send_control_set_model checked obj.get("request_id") at the top level. That always returned None, the predicate never matched, and every switch_model call hung until the subprocess was killed.

Verified by direct probing of the claude subprocess and by grepping fido.log: switch_model: now on model= (the post-success log line) appears zero times across the entire log; switch_model: ... (control_request) appears 9 times. Every one of those 9 calls hung. The 5-minute home gap at 15:40:23-15:45:23 in fido.log shows the hang concretely.

Plus updates the existing test fixtures, which had the wrong shape too; tests passed against the broken code because both used top-level request_id. Adds a regression test that explicitly emits a malformed top-level shape and asserts it is NOT matched, then a correct nested shape and asserts it is.

FidoCanCode requested a review from rhencke April 25, 2026 16:11

rhencke approved these changes Apr 25, 2026

FidoCanCode merged commit 95a5552 into main Apr 25, 2026
1 check passed

FidoCanCode deleted the fix-control-response-shape branch April 25, 2026 16:12

This was referenced Apr 25, 2026

Bug: switch_model hangs when called with _in_turn=True after preempt-cancelled drain #975

Closed

Feature: show (# sent, # received) message counts on the claude-code line in fido status #1018

Closed
