Skip to content

feat(kbagent): implement request-level timeout override and retry for action execution#10158

Merged
leon-ape merged 2 commits intoapecloud:mainfrom
gongna-au:feat/kbagent-request-timeout-retry
Apr 24, 2026
Merged

feat(kbagent): implement request-level timeout override and retry for action execution#10158
leon-ape merged 2 commits intoapecloud:mainfrom
gongna-au:feat/kbagent-request-timeout-retry

Conversation

@gongna-au
Copy link
Copy Markdown
Contributor

Summary

The ActionRequest proto already defined TimeoutSeconds and RetryPolicy fields, and the controller-side lifecycle code (pkg/controller/lifecycle/kbagent.go) already populated them when calling kbagent. However, the kbagent service handler completely ignored both fields (marked with // TODO: not implemented comments), always using the action-definition-level timeout and never retrying.

What Changed

Request-level timeout override:

  • Added resolveTimeout() that prefers ActionRequest.TimeoutSeconds over Action.TimeoutSeconds
  • If the request specifies a timeout, it overrides the action's default; otherwise falls back to the action-level setting
  • Applied to both blocking and non-blocking call paths

Retry support:

  • Added callActionWithRetry() that honors ActionRequest.RetryPolicy
  • Retries up to MaxRetries times with configurable RetryInterval (defaults to 1s)
  • Respects context cancellation between retries
  • Only retries blocking (synchronous) calls

Cleanup:

  • Removed // TODO: not implemented comments from proto.go

Files Changed (2)

  • pkg/kbagent/proto/proto.go — removed TODO comments
  • pkg/kbagent/service/action.go — added timeout resolution, retry logic, updated call paths

… action execution

The ActionRequest proto already defined TimeoutSeconds and RetryPolicy
fields, and the controller-side lifecycle code already populated them,
but the kbagent service-side handler ignored both fields entirely
(marked with TODO comments).

This commit:
- Adds resolveTimeout() to let request-level TimeoutSeconds override
  the action-level default, giving callers per-request timeout control
- Adds callActionWithRetry() to honor RetryPolicy with configurable
  max retries and interval, respecting context cancellation
- Passes the resolved timeout through to non-blocking calls as well
- Removes the 'TODO: not implemented' comments from proto.go
@gongna-au gongna-au requested review from a team and leon-ape as code owners April 22, 2026 08:14
@github-actions github-actions Bot added the size/M Denotes a PR that changes 30-99 lines. label Apr 22, 2026
@apecloud-bot
Copy link
Copy Markdown
Collaborator

Auto Cherry-pick Instructions

Usage:
  - /nopick: Not auto cherry-pick when PR merged.
  - /pick: release-x.x [release-x.x]: Auto cherry-pick to the specified branch when PR merged.

Example:
  - /nopick
  - /pick release-1.1

@apecloud-bot apecloud-bot added the pre-approve Fork PR Pre Approve Test label Apr 22, 2026
@gongna-au
Copy link
Copy Markdown
Contributor Author

/nopick

@apecloud-bot apecloud-bot added the nopick Not auto cherry-pick when PR merged label Apr 22, 2026
Comment thread pkg/kbagent/service/action.go Outdated
}

interval := retryPolicy.RetryInterval
if interval <= 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should stay consistent with the API definition: when RetryInterval is 0, retries should happen without waiting.

When RetryInterval is 0, retries should happen without waiting per the
API definition. Previously the code fell back to 1s for any interval<=0,
which silently overrode an explicit zero-interval setting. Now only
positive intervals trigger a sleep; zero means retry immediately.
@apecloud-bot apecloud-bot added pre-approve Fork PR Pre Approve Test and removed pre-approve Fork PR Pre Approve Test labels Apr 24, 2026
@leon-ape leon-ape merged commit a605e63 into apecloud:main Apr 24, 2026
20 checks passed
@github-actions github-actions Bot added this to the Release 1.2.0 milestone Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

nopick Not auto cherry-pick when PR merged pre-approve Fork PR Pre Approve Test size/M Denotes a PR that changes 30-99 lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants