fix(subagent): retry announce on timeout#17028
fix(subagent): retry announce on timeout#17028Limitless2023 wants to merge 1 commit intoopenclaw:mainfrom
Conversation
- Add retry logic with exponential backoff (3 attempts) - Fixes subagent results silently dropped on gateway timeout - Log retries and failures for debugging Fixes openclaw#17000
| // Retry on timeout, up to maxAttempts | ||
| if (isTimeout && attempt < maxAttempts) { | ||
| const delayMs = Math.min(5000 * Math.pow(2, attempt - 1), 30000); | ||
| log.warn(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, { |
There was a problem hiding this comment.
log is not defined or imported, will throw ReferenceError. Use defaultRuntime.error instead (already imported on line 15).
| log.warn(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, { | |
| defaultRuntime.error?.(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, { |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/subagent-announce.ts
Line: 142:142
Comment:
`log` is not defined or imported, will throw ReferenceError. Use `defaultRuntime.error` instead (already imported on line 15).
```suggestion
defaultRuntime.error?.(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, {
```
How can I resolve this? If you propose a fix, please make it concise.| } | ||
| // Max retries exceeded or non-timeout error - log and drop | ||
| if (attempt >= maxAttempts) { | ||
| log.error(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, { |
There was a problem hiding this comment.
log is not defined or imported, will throw ReferenceError. Use defaultRuntime.error instead (already imported on line 15).
| log.error(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, { | |
| defaultRuntime.error?.(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, { |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/subagent-announce.ts
Line: 151:151
Comment:
`log` is not defined or imported, will throw ReferenceError. Use `defaultRuntime.error` instead (already imported on line 15).
```suggestion
defaultRuntime.error?.(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, {
```
How can I resolve this? If you propose a fix, please make it concise.bfc1ccb to
f92900f
Compare
| to: origin?.to, | ||
| threadId, | ||
| deliver: true, | ||
| idempotencyKey: crypto.randomUUID(), |
There was a problem hiding this comment.
New idempotencyKey per retry risks duplicate delivery
Each retry generates a fresh UUID here. The gateway's agent method deduplicates requests using idempotencyKey (see src/gateway/server-methods/agent.ts:209: context.dedupe.get(\agent:${idem}`)`). If the first attempt was processed by the gateway but the response timed out on the client side, the retry with a new key will bypass dedup and could trigger a duplicate message delivery to the parent agent.
Generate the key once before the try block and reuse it across retries. For example, add an idemKey parameter (defaulting to a new UUID on the first call) and pass it through on recursive calls:
// At the top of sendAnnounce, before the try block:
const idempotencyKey = idemKey ?? crypto.randomUUID();
// Then on recursive retry:
return sendAnnounce(item, attempt + 1, maxAttempts, idempotencyKey);
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/subagent-announce.ts
Line: 131:131
Comment:
**New idempotencyKey per retry risks duplicate delivery**
Each retry generates a fresh UUID here. The gateway's `agent` method deduplicates requests using `idempotencyKey` (see `src/gateway/server-methods/agent.ts:209`: `context.dedupe.get(\`agent:${idem}\`)`). If the first attempt was processed by the gateway but the *response* timed out on the client side, the retry with a new key will bypass dedup and could trigger a duplicate message delivery to the parent agent.
Generate the key once before the try block and reuse it across retries. For example, add an `idemKey` parameter (defaulting to a new UUID on the first call) and pass it through on recursive calls:
```
// At the top of sendAnnounce, before the try block:
const idempotencyKey = idemKey ?? crypto.randomUUID();
// Then on recursive retry:
return sendAnnounce(item, attempt + 1, maxAttempts, idempotencyKey);
```
How can I resolve this? If you propose a fix, please make it concise.
Fixes #17000
Problem
Sub-agent task results are silently lost when announcement delivery exceeds the 60-second gateway timeout. No retry, no error shown to user.
Solution
Impact
Greptile Summary
Adds retry logic with exponential backoff (up to 3 attempts, 5s/10s/20s delays) to
sendAnnounceinsrc/agents/subagent-announce.tsto handle gateway timeout failures when delivering subagent task results.idempotencyKeyis generated on each retry attempt. Since the gateway deduplicatesagentmethod calls by idempotency key, retries bypass dedup and can cause duplicate message delivery if the original request succeeded but its response timed outlogvariable issue (undefined/not imported) still needs to be resolved before mergingConfidence Score: 2/5
log) and a duplicate delivery risk from regenerating idempotency keys on retries.logReferenceError (already flagged, not yet fixed) will crash the retry/error paths at runtime, and the new idempotencyKey per retry bypasses gateway deduplication, risking duplicate subagent announcements to the parent session.src/agents/subagent-announce.ts— both thelogimport and idempotencyKey reuse need to be addressedLast reviewed commit: 59e87d9