fix: reset streaming state and close stale WebSocket on session switch (fixes #144) #151
…element#143)

Two independent fixes for `_send_message_to_agent`:

1. **Retry on ReadTimeout**: wrap the `llm_client.complete()` call in a retry loop (up to 3 attempts, exponential back-off: 1 s then 2 s). A single transient `httpx.ReadTimeout` no longer aborts a long-running A2A task mid-execution; it is retried silently instead. Only after all 3 attempts fail does the exception propagate to the outer handler.
2. **Meaningful error message**: `httpx.ReadTimeout.__str__()` returns an empty string, so the UI previously showed only `"❌ Message send error:"` with no cause. The outer `except` block now falls back to `type(e).__name__` when `str(e)` is empty, producing e.g. `"❌ Message send error: ReadTimeout"`.

Fixes dataelement#143.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
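The retry schedule and the error-message fallback described in this commit can be sketched as follows. The actual change lives in the Python backend and uses `httpx`; this is only a TypeScript transliteration of the pattern, chosen to match the frontend snippets in this PR. Every name here (`retryWithBackoff`, `describeError`, `ReadTimeoutError`, `isReadTimeout`, the injectable `sleep`) is hypothetical and does not come from the codebase.

```typescript
// Illustrative sketch only: all names are hypothetical, not taken from the PR.
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Stand-in for a timeout error whose message is empty.
class ReadTimeoutError extends Error {
  constructor(message = "") {
    super(message);
    this.name = "ReadTimeout";
  }
}

const isReadTimeout = (err: unknown): boolean =>
  err instanceof Error && err.name === "ReadTimeout";

// Retry `call` up to `maxAttempts` times, but only for errors accepted by
// `isRetryable`. Delays double each time: baseDelayMs, then 2 * baseDelayMs.
async function retryWithBackoff<T>(
  call: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 1000,
  sleep: Sleep = realSleep,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      const isLastAttempt = attempt === maxAttempts - 1;
      // Non-retryable errors and the final failure propagate to the caller.
      if (!isRetryable(err) || isLastAttempt) {
        throw err;
      }
      await sleep(baseDelayMs * Math.pow(2, attempt));
    }
  }
  throw new Error("maxAttempts must be at least 1");
}

// Fall back to the error's type name when its message is empty.
function describeError(err: unknown): string {
  if (err instanceof Error) {
    return err.message !== "" ? err.message : err.name;
  }
  return String(err);
}
```

With the defaults, three failing attempts sleep for 1000 ms and then 2000 ms before the last error is rethrown, and `describeError(new ReadTimeoutError())` yields `"ReadTimeout"` rather than an empty string.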
…dataelement#144)

Three bugs caused the UI to get stuck in "outputting" state after switching conversation sessions or agents:

1. selectSession() did not reset isStreaming / isWaiting before loading the new session, so the input area remained disabled if the previous session was mid-stream.
2. selectSession() did not close the existing WebSocket. The old WS could still fire onmessage after the switch, appending stale streaming data into the new session's message list.
3. The agent-change useEffect also omitted both fixes above, so switching to a different agent carried over the same stuck state and orphaned WS.

Fix: in selectSession(), close wsRef.current when readyState !== CLOSED (covers both OPEN and CONNECTING states) and reset isStreaming/isWaiting. Apply the same two resets in the agent-change useEffect. The existing WS connect useEffect already has a cleanup function (cancelled = true + wsRef.close) that prevents double-close races.

Fixes dataelement#144.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Thanks for working on this. The overall direction makes sense: switching sessions or agents should reset stale streaming UI state, and data from one conversation should not leak into another. That said, I do not think this fully resolves #144 yet. From my reading, there are still a few blocking issues in the current fix:
One additional note: this PR is framed as a fix for #144, but it also includes an unrelated backend change for A2A timeout retries. I would strongly recommend splitting that into a separate PR so the review scope stays clean and easier to reason about.

My view is that the stuck streaming state is only partially addressed here, while the content-loss and race-condition aspects of #144 are still not fully resolved.
@yaojin3616 I think there is a broader lifecycle question underneath this bug: the current implementation appears to treat “currently visible in the UI” and “currently running conversation” as the same thing. That coupling makes session switching fragile and increases the chance of both stuck-state bugs and content-loss bugs.

A longer-term model worth considering is to separate UI visibility from conversation execution lifecycle, so that switching between sessions or agents does not necessarily interrupt work that is already in progress. That would make multi-conversation workflows much more robust.

I do not think that needs to be solved in this PR in order to fix #144. But it may be a useful follow-up direction once the immediate bug is fixed cleanly.
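To make the idea concrete, here is a rough sketch of what separating visibility from execution could look like: streaming state is stored per session, and selecting a session only changes which entry the UI reads. Everything here (`SessionStore`, `SessionState`, the method names) is hypothetical and not taken from `AgentDetail.tsx`. It is a framework-free illustration under those assumptions, not a proposed implementation.

```typescript
// Hypothetical sketch: none of these names exist in the PR.
interface SessionState {
  messages: string[];
  isStreaming: boolean;
}

class SessionStore {
  private sessions = new Map<string, SessionState>();
  private activeId: string | null = null;

  private get(id: string): SessionState {
    let state = this.sessions.get(id);
    if (!state) {
      state = { messages: [], isStreaming: false };
      this.sessions.set(id, state);
    }
    return state;
  }

  // Begin a new streamed reply in the given session.
  startStream(id: string): void {
    const state = this.get(id);
    state.isStreaming = true;
    state.messages.push("");
  }

  // Chunks are routed by session id, never by "whatever is visible".
  appendChunk(id: string, chunk: string): void {
    const state = this.get(id);
    if (!state.isStreaming) return;
    state.messages[state.messages.length - 1] += chunk;
  }

  endStream(id: string): void {
    this.get(id).isStreaming = false;
  }

  // Switching only changes the view; background streams keep running.
  select(id: string): void {
    this.activeId = id;
  }

  view(): SessionState | null {
    return this.activeId === null ? null : this.get(this.activeId);
  }
}
```

In this model, switching from session A to session B mid-stream leaves A's `isStreaming` flag and partial reply intact, while B's view starts with its own independent state, so the input area for B is not disabled by A's stream.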
Problem
When switching between conversation sessions or agents while a response was streaming, the UI got stuck in the "outputting" state permanently — the input bar stayed disabled and a loading indicator spun forever. Returning to the original session also showed an empty message list.
Three root causes in `AgentDetail.tsx`:

1. `selectSession()` did not reset `isStreaming` / `isWaiting`: switching sessions while streaming left both flags `true`, disabling the input area indefinitely.
2. `selectSession()` did not close the existing WebSocket: the old WS kept firing `onmessage`, appending stale streaming chunks into the new session's message list and corrupting its history.
3. The agent-change `useEffect` had the same two omissions: navigating to a different agent carried over the stuck state and kept the old WS alive.

Reported in #144.
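As an illustration of the second root cause, the sketch below shows one defensive pattern: bind each message handler to the session it was created for, and drop chunks once a different session is active. This is not what the PR does (the PR closes the socket instead), and the names `guardBySession` and `ChunkHandler` are hypothetical. In a React component, `getActiveSessionId` would presumably read from a ref rather than from state captured in a closure.

```typescript
// Hypothetical helper, not part of the PR: wraps a chunk handler so it
// only runs while the session that owns it is still the active one.
type ChunkHandler = (chunk: string) => void;

function guardBySession(
  ownerSessionId: string,
  getActiveSessionId: () => string | null,
  onChunk: ChunkHandler,
): ChunkHandler {
  return (chunk: string): void => {
    // A chunk from a socket opened for a previous session is stale: ignore it.
    if (getActiveSessionId() !== ownerSessionId) {
      return;
    }
    onChunk(chunk);
  };
}

// Usage: the handler created for session "a" stops appending after a switch to "b".
let activeSessionId: string | null = "a";
const received: string[] = [];
const handlerForA = guardBySession("a", () => activeSessionId, (c) => received.push(c));

handlerForA("first");   // delivered: "a" is active
activeSessionId = "b";
handlerForA("second");  // dropped: "a" is no longer active
```

Note that this guard discards the stale chunks, so on its own it protects the new session's message list but does nothing to preserve output for the session that was left.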
Fix
- `selectSession()`: close `wsRef.current` when `readyState !== WebSocket.CLOSED` (covers both `OPEN` and `CONNECTING` states), then reset `isStreaming` and `isWaiting` before loading the new session.
- Agent-change `useEffect`: apply the same WS close + state reset when `id` changes.

The existing WS connect `useEffect` already has a proper cleanup (`cancelled = true` + `wsRef.current?.close()`) that prevents double-close races when React unmounts the effect.

Changes
`frontend/src/pages/AgentDetail.tsx` only, two small additions:

```diff
 const selectSession = async (sess: any) => {
+    if (wsRef.current && wsRef.current.readyState !== WebSocket.CLOSED) {
+        wsRef.current.close();
+        wsRef.current = null;
+    }
     setChatMessages([]);
     setHistoryMsgs([]);
+    setIsStreaming(false);
+    setIsWaiting(false);
     setActiveSession(sess);
     ...

 useEffect(() => {
+    if (wsRef.current && wsRef.current.readyState !== WebSocket.CLOSED) {
+        wsRef.current.close();
+        wsRef.current = null;
+    }
     setActiveSession(null);
     setChatMessages([]);
     setHistoryMsgs([]);
+    setIsStreaming(false);
+    setIsWaiting(false);
     ...
 }, [id]);
```

🤖 Generated with Claude Code