fix(hub): mark queued sessions thinking before Codex task_start #514
tiann merged 3 commits into tiann:main
Findings
- [Major] Queued sends now refresh session liveness without any delivery confirmation: markMessageQueued() sets active=true and bumps activeAt immediately after messageService.sendMessage(), but that send path only writes to SQLite and emits to the Socket.IO room without an ack. If the CLI drops just before the 30s timeout, /sessions/:id/messages still passes requireActive, this code resets the liveness window, and the hub keeps showing a dead session as active/thinking even though no agent accepted the message. Evidence: hub/src/sync/sessionCache.ts:270, hub/src/sync/sessionCache.ts:271; context: hub/src/sync/messageService.ts:107, hub/src/web/routes/guards.ts:29.
Suggested fix:
// A queued web message is not proof that the CLI is still alive.
if (!session.active) return
session.thinking = true
session.thinkingAt = nextTime
session.updatedAt = Math.max(session.updatedAt, nextTime)
// Keep activeAt unchanged; only session-alive should extend liveness.
Summary
- Review mode: initial
- 1 major issue found: queued message acceptance now extends session liveness from the hub side instead of waiting for a CLI heartbeat, so a near-timeout disconnect can be misreported as active/thinking and timeout cleanup is delayed.
Testing
- Not run (automation environment: bun not installed)
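As a rough illustration of the suggested fix above, the following sketch marks a session as thinking without touching its liveness timestamp. The `SessionState` shape and the `markQueuedThinking` name are simplified stand-ins invented for this example, not the hub's actual types or method.

```typescript
// Hypothetical, simplified session shape; the real hub type likely has more fields.
interface SessionState {
    active: boolean
    activeAt: number
    thinking: boolean
    thinkingAt: number
    updatedAt: number
}

// Marks a session as thinking after a web message is queued.
// Returns true if the session state was changed.
function markQueuedThinking(session: SessionState, nextTime: number): boolean {
    // A queued web message is not proof that the CLI is still alive.
    if (!session.active) return false
    session.thinking = true
    session.thinkingAt = nextTime
    session.updatedAt = Math.max(session.updatedAt, nextTime)
    // activeAt is intentionally left alone; only a CLI heartbeat should move it.
    return true
}
```

With this shape, a message queued shortly before the liveness timeout still flips the session to thinking, but the timeout check keeps measuring from the last real heartbeat.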
HAPI Bot
const previousUpdatedAt = session.updatedAt

session.active = true
session.activeAt = Math.max(session.activeAt, nextTime)
[MAJOR] markMessageQueued() is resetting activeAt from the hub side even though sendMessage() only emits to the Socket.IO room and never confirms that any CLI actually received the message (hub/src/sync/messageService.ts:107). If the agent disconnected just before the 30s timeout, /sessions/:id/messages still passes requireActive, this line extends the liveness window, and the UI will keep a dead session active/thinking for another timeout period.
Suggested fix:
if (!session.active) return
session.thinking = true
session.thinkingAt = nextTime
session.updatedAt = Math.max(session.updatedAt, nextTime)
@tiann Could you take a look at this PR?
See the bot's review above.
On Thu, Apr 23, 2026 at 23:22, xiaobaifly7 wrote (comment on tiann/hapi#514):
@tiann Could you take a look at this PR?
Findings
- [Major] Queued-thinking expiry depends on hub/CLI clock alignment, so the 15s grace window can stretch far past its limit when the agent machine clock lags behind the hub.
markMessageQueued() stores pendingThinkingUntil from hub Date.now(), but handleSessionAlive() preserves and clears that state by comparing against CLI-supplied heartbeat time. Because the CLI sends session-alive timestamps from its own clock, a skewed client can keep thinking=true pinned until its local clock catches up instead of expiring after 15 real seconds. Evidence: hub/src/sync/sessionCache.ts:190, hub/src/sync/sessionCache.ts:196, hub/src/sync/sessionCache.ts:273; context: cli/src/api/apiSession.ts:145, cli/src/api/apiSession.ts:490.
Suggested fix:
const now = Date.now()
const preserveQueuedThinking = !requestedThinking && pendingThinkingUntil > now
session.thinking = requestedThinking || preserveQueuedThinking
session.thinkingAt = t
if (requestedThinking || pendingThinkingUntil <= now) {
this.pendingThinkingUntilBySessionId.delete(session.id)
}
Summary
- Review mode: follow-up after new commits
- 1 major issue found: the new queued-thinking grace window mixes hub time and CLI heartbeat time, so clock skew can keep a session stuck in thinking well past the intended 15 seconds.
Testing
- Not run (bun not installed in automation environment)
- Missing coverage: skewed CLI heartbeat timestamps relative to hub time during queued-thinking expiry
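To make the clock-skew finding concrete, here is a small self-contained sketch of the grace-window logic with the expiry compared against hub time only. The `QueuedThinkingTracker` class, the `AliveSession` shape, and the injectable `hubNow` clock are assumptions made for this example; only the 15s window and the field names in the suggested fix come from the review.

```typescript
// 15s grace window, as described in the review.
const QUEUED_THINKING_GRACE_MS = 15000

// Hypothetical, trimmed-down session shape for illustration.
interface AliveSession {
    id: string
    thinking: boolean
    thinkingAt: number
}

class QueuedThinkingTracker {
    private readonly pendingThinkingUntilBySessionId = new Map<string, number>()
    private readonly hubNow: () => number

    // The clock is injectable so that tests can simulate skew deterministically.
    constructor(hubNow: () => number = () => Date.now()) {
        this.hubNow = hubNow
    }

    markMessageQueued(session: AliveSession): void {
        const now = this.hubNow()
        session.thinking = true
        session.thinkingAt = now
        this.pendingThinkingUntilBySessionId.set(session.id, now + QUEUED_THINKING_GRACE_MS)
    }

    // heartbeatTime comes from the CLI clock and is used only as a display timestamp.
    handleSessionAlive(session: AliveSession, requestedThinking: boolean, heartbeatTime: number): void {
        const now = this.hubNow()
        const pendingThinkingUntil = this.pendingThinkingUntilBySessionId.get(session.id) ?? 0
        // Expiry is decided against hub time, never against the CLI-supplied timestamp.
        const preserveQueuedThinking = !requestedThinking && pendingThinkingUntil > now
        session.thinking = requestedThinking || preserveQueuedThinking
        session.thinkingAt = heartbeatTime
        if (requestedThinking || pendingThinkingUntil <= now) {
            this.pendingThinkingUntilBySessionId.delete(session.id)
        }
    }
}
```

Because both the write and the read of the deadline use the same clock, a CLI whose clock lags by a minute still sees the queued-thinking state expire 15 hub-seconds after the message was queued.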
HAPI Bot
const previousCollaborationMode = session.collaborationMode
const pendingThinkingUntil = this.pendingThinkingUntilBySessionId.get(session.id) ?? 0
const requestedThinking = Boolean(payload.thinking)
const preserveQueuedThinking = !requestedThinking && pendingThinkingUntil > t
[MAJOR] pendingThinkingUntil is written from hub time in markMessageQueued() but read back against CLI heartbeat time here. Since the CLI emits session-alive with its own Date.now() (cli/src/api/apiSession.ts:145, cli/src/api/apiSession.ts:490), any machine clock lag will keep preserveQueuedThinking true long past the intended 15s grace window and leave the UI stuck in thinking.
Suggested fix:
const now = Date.now()
const preserveQueuedThinking = !requestedThinking && pendingThinkingUntil > now
session.thinking = requestedThinking || preserveQueuedThinking
session.thinkingAt = t
if (requestedThinking || pendingThinkingUntil <= now) {
this.pendingThinkingUntilBySessionId.delete(session.id)
}
Findings
No findings.
Summary
- Review mode: follow-up after new commits
- No issues found in the latest full diff. The follow-up addresses the prior clock-skew finding by expiring queued-thinking state against hub time and adds coverage for skewed heartbeats.
Testing
- Not run (automation)
HAPI Bot
@tiann Done, please merge.
Summary
- Set thinking=true immediately after the hub accepts a user message
- Preserve that state across thinking=false keepalive heartbeats until a real task start arrives or a short grace window expires
Problem
When a remote Codex message was sent from the web app, the UI could stay on online for a while even though the message had already been accepted by the hub. The status only switched later when Codex emitted task_started or a later alive update.
In practice this produced a visible delay where the message was already sent but the session state did not reflect that work had begun yet.
Fix
The hub now sets queued sessions to thinking=true as soon as sendMessage() succeeds, and preserves that state across short false heartbeats from the CLI until either:
- a real thinking=true update arrives
- the short grace window expires
This keeps the UI responsive without leaving sessions stuck in thinking forever.
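The ordering described above (mark the session only once the send has succeeded) could look roughly like this. The `MessageSender` and `QueueMarker` interfaces and the `sendAndMarkQueued` helper are hypothetical stand-ins, and treating sendMessage() as returning a promise is an assumption of this sketch rather than something the PR states.

```typescript
// Hypothetical minimal interfaces standing in for the hub's services.
interface MessageSender {
    sendMessage(sessionId: string, text: string): Promise<void>
}

interface QueueMarker {
    markMessageQueued(sessionId: string): void
}

// Sends a message and, only if the send succeeds, marks the session as queued/thinking.
async function sendAndMarkQueued(
    sender: MessageSender,
    marker: QueueMarker,
    sessionId: string,
    text: string
): Promise<void> {
    // If sendMessage() rejects, the error propagates and the session is never marked.
    await sender.sendMessage(sessionId, text)
    marker.markMessageQueued(sessionId)
}
```

Keeping the mark strictly after the awaited send means a failed write never leaves the UI showing thinking for a message that was not accepted.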
Tests
- bun test src/sync/aliveEvents.test.ts
- bun test src/sync/sessionModel.test.ts