fix(subagent): retry announce on timeout #17028

Open
Limitless2023 wants to merge 1 commit into openclaw:main from Limitless2023:fix/subagent-announce-retry

@Limitless2023 Limitless2023 commented Feb 15, 2026

Fixes #17000

Problem

Sub-agent task results are silently lost when announcement delivery exceeds the 60-second gateway timeout. There is no retry, and no error is shown to the user.

Solution

  • Add retry logic with exponential backoff (up to 3 attempts)
  • Delay between retries: 5s, 10s, 20s (capped at 30s)
  • Log retries and final failures for debugging
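For reviewers checking the schedule, here is a minimal sketch of the delay arithmetic, based on the `Math.min(5000 * Math.pow(2, attempt - 1), 30000)` expression quoted in the review below. The helper names `backoffDelayMs` and `retrySchedule` are hypothetical, and the sketch assumes attempts are numbered from 1 with no delay after the final attempt.

```typescript
// Hypothetical helpers; only the delay expression itself comes from the PR diff.
const BASE_DELAY_MS = 5000;
const MAX_DELAY_MS = 30000;

// Delay to wait after a failed attempt (assumed 1-based) before retrying.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt - 1), MAX_DELAY_MS);
}

// Delays that would actually be waited for a given attempt budget,
// assuming no delay follows the last attempt.
function retrySchedule(maxAttempts: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(backoffDelayMs(attempt));
  }
  return delays;
}
```

Under these assumptions a budget of 3 attempts waits 5s and then 10s; the 20s delay would only be reached with a fourth attempt, and the 30s cap only from the fourth failure onward.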

Impact

  • Subagent results are more reliably delivered to parent session
  • Timeout failures are now logged instead of silently dropped
  • Backoff prevents overwhelming busy gateway

Greptile Summary

Adds retry logic with exponential backoff (up to 3 attempts, 5s/10s/20s delays) to sendAnnounce in src/agents/subagent-announce.ts to handle gateway timeout failures when delivering subagent task results.

  • The retry mechanism correctly catches timeout errors and applies capped exponential backoff before retrying
  • Issue: A new idempotencyKey is generated on each retry attempt. Since the gateway deduplicates agent method calls by idempotency key, retries bypass dedup and can cause duplicate message delivery if the original request succeeded but its response timed out
  • The previously-flagged log variable issue (undefined/not imported) still needs to be resolved before merging

Confidence Score: 2/5

  • This PR has a runtime crash bug (undefined log) and a duplicate delivery risk from regenerating idempotency keys on retries.
  • Score of 2 reflects two issues: the log ReferenceError (already flagged, not yet fixed) will crash the retry/error paths at runtime, and the new idempotencyKey per retry bypasses gateway deduplication, risking duplicate subagent announcements to the parent session.
  • src/agents/subagent-announce.ts — both the log import and idempotencyKey reuse need to be addressed

Last reviewed commit: 59e87d9

- Add retry logic with exponential backoff (3 attempts)
- Fixes subagent results silently dropped on gateway timeout
- Log retries and failures for debugging

Fixes openclaw#17000
@openclaw-barnacle openclaw-barnacle bot added the labels "agents" (Agent runtime and tooling) and "size: S" on Feb 15, 2026
@greptile-apps greptile-apps bot left a comment

1 file reviewed, 2 comments

// Retry on timeout, up to maxAttempts
if (isTimeout && attempt < maxAttempts) {
const delayMs = Math.min(5000 * Math.pow(2, attempt - 1), 30000);
log.warn(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, {
log is not defined or imported, so this line will throw a ReferenceError. Use defaultRuntime.error instead (already imported on line 15).

Suggested change
- log.warn(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, {
+ defaultRuntime.error?.(`Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`, {
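To illustrate why the suggested `?.(` form is safe, here is a small sketch of an optional call on a logger method that may be absent. The `RuntimeLike` interface and `reportRetry` function are hypothetical; the real shape of `defaultRuntime` is not shown in this PR.

```typescript
// Hypothetical runtime shape; the actual defaultRuntime type is an assumption here.
interface RuntimeLike {
  error?: (message: string, meta?: Record<string, unknown>) => void;
}

function reportRetry(runtime: RuntimeLike, attempt: number, maxAttempts: number): string {
  const message = `Subagent announce timeout, retrying (attempt ${attempt + 1}/${maxAttempts})`;
  // Optional call: does nothing when runtime.error is undefined, rather than throwing.
  runtime.error?.(message, { attempt });
  return message;
}
```

Unlike a reference to an undeclared `log` identifier, this neither throws when the method is missing nor aborts the retry path.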

}
// Max retries exceeded or non-timeout error - log and drop
if (attempt >= maxAttempts) {
log.error(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, {
log is not defined or imported, so this line will throw a ReferenceError. Use defaultRuntime.error instead (already imported on line 15).

Suggested change
- log.error(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, {
+ defaultRuntime.error?.(`Subagent announce failed after ${maxAttempts} attempts: ${errorMsg}`, {

@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

to: origin?.to,
threadId,
deliver: true,
idempotencyKey: crypto.randomUUID(),
New idempotencyKey per retry risks duplicate delivery

Each retry generates a fresh UUID here. The gateway's agent method deduplicates requests using idempotencyKey (see src/gateway/server-methods/agent.ts:209: context.dedupe.get(`agent:${idem}`)). If the first attempt was processed by the gateway but the response timed out on the client side, the retry with a new key will bypass dedup and could trigger a duplicate message delivery to the parent agent.

Generate the key once before the try block and reuse it across retries. For example, add an idemKey parameter (defaulting to a new UUID on the first call) and pass it through on recursive calls:

// At the top of sendAnnounce, before the try block:
const idempotencyKey = idemKey ?? crypto.randomUUID();

// Then on recursive retry:
return sendAnnounce(item, attempt + 1, maxAttempts, idempotencyKey);
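The effect of reusing the key can be sketched with an in-memory stand-in for the gateway. Everything below is hypothetical (`FakeGateway`, `sendWithRetry`, and the counter-based `newKey` that stands in for `crypto.randomUUID()`), and the sketch is kept synchronous with the backoff delay omitted so it runs instantly. It assumes the gateway processes each request before the response is lost, which is the failure mode described above.

```typescript
// Hypothetical key generator standing in for crypto.randomUUID().
let keyCounter = 0;
const newKey = (): string => `key-${++keyCounter}`;

// Hypothetical gateway stub: deduplicates by idempotency key and counts deliveries.
class FakeGateway {
  delivered = 0;
  private seen = new Set<string>();

  constructor(private timeoutsLeft: number) {}

  agent(params: { message: string; idempotencyKey: string }): void {
    // The request is processed first...
    if (!this.seen.has(params.idempotencyKey)) {
      this.seen.add(params.idempotencyKey);
      this.delivered++;
    }
    // ...and only then is the response "lost", simulating a client-side timeout.
    if (this.timeoutsLeft > 0) {
      this.timeoutsLeft--;
      throw new Error("gateway timeout");
    }
  }
}

function sendWithRetry(gateway: FakeGateway, message: string, maxAttempts = 3, reuseKey = true): void {
  const stableKey = newKey(); // generated once, before the first attempt
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      gateway.agent({ message, idempotencyKey: reuseKey ? stableKey : newKey() });
      return;
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // retry budget exhausted
    }
  }
}
```

With two simulated timeouts, `sendWithRetry(gateway, "done", 3, true)` leaves `gateway.delivered` at 1, while passing `false` for `reuseKey` delivers the message three times.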

Development

Successfully merging this pull request may close these issues.

Sub-agent announcements silently dropped on gateway timeout (hardcoded 60s, no retry)

3 participants
