goal: pause continuation loops on usage limits and blockers #23094
Manual testing note: I verified the usage-limit path with a local simulated Responses server that returned HTTP 429 errors, and confirmed the goal transitions into the usage-limited state instead of continuing to synthesize failing turns.
@codex review
if args.status != ThreadGoalStatus::Complete {
if !matches!(
    args.status,
    ThreadGoalStatus::Complete | ThreadGoalStatus::Blocked
I think this lets blocked overwrite a just-written budget_limited status. update_goal accounts for the turn first, so if that accounting crosses the token budget we write budget_limited, then this path calls set_thread_goal(Blocked ...) and the state update only preserves budget-limited for paused.
Yep, good catch. Fixed.
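For readers following the thread, one possible shape for such a guard is sketched below. It assumes a hypothetical `GoalStatus` enum and a hypothetical `resolve_status` helper; neither is taken from the PR, and the actual fix may look different. The idea is that a budget limit already recorded by accounting survives a later `Blocked` or `Paused` request.

```rust
/// Hypothetical goal states for illustration; not the real ThreadGoalStatus type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    BudgetLimited,
    Complete,
}

/// Decide which status to persist when `requested` arrives while `stored`
/// is already written. This is an assumed rule, not the PR's implementation.
pub fn resolve_status(stored: GoalStatus, requested: GoalStatus) -> GoalStatus {
    match (stored, requested) {
        // A budget limit written by turn accounting is kept when the agent
        // subsequently asks to pause or to report a blocker.
        (GoalStatus::BudgetLimited, GoalStatus::Paused | GoalStatus::Blocked) => {
            GoalStatus::BudgetLimited
        }
        // Every other request replaces the stored status.
        _ => requested,
    }
}
```

With this rule, `resolve_status(GoalStatus::BudgetLimited, GoalStatus::Blocked)` keeps `BudgetLimited`, while a request to mark the goal `Complete` still goes through.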
AppThreadGoalStatus::Paused => "Commands: /goal edit, /goal resume, /goal clear",
AppThreadGoalStatus::Paused
| AppThreadGoalStatus::Blocked
| AppThreadGoalStatus::UsageLimited => "Commands: /goal edit, /goal resume, /goal clear",
We now treat blocked and usage-limited as resumable in the menu/footer, but the thread-resume prompt path still only checks ThreadGoalStatus::Paused.
Am I missing something?
I was thinking that we wouldn't want to resume in those cases, but it makes sense to present the user with a choice. If they haven't resolved the block or usage limit, then they can always choose to cancel the resume.
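One way to keep the menu/footer and the resume prompt from drifting apart is to route both through a single predicate. The sketch below assumes a hypothetical `GoalStatus` enum and hypothetical helper names (`is_resumable`, `footer_offers_resume`, `should_offer_resume_prompt`); none of them come from the PR.

```rust
/// Hypothetical goal states for illustration; not the real status types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    UsageLimited,
    Complete,
}

/// Single source of truth for "the user may choose to resume this goal".
pub fn is_resumable(status: GoalStatus) -> bool {
    matches!(
        status,
        GoalStatus::Paused | GoalStatus::Blocked | GoalStatus::UsageLimited
    )
}

/// Footer path: show the resume command only for resumable goals.
pub fn footer_offers_resume(status: GoalStatus) -> bool {
    is_resumable(status)
}

/// Thread-resume path: prompt the user under exactly the same condition.
pub fn should_offer_resume_prompt(status: GoalStatus) -> bool {
    is_resumable(status)
}
```

Because both call sites delegate to `is_resumable`, adding another resumable state later would need a change in only one place.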
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5da03a9e57
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a67e33a1c
.await?;
let Some(goal) = state_db
    .thread_goals()
    .usage_limit_active_thread_goal(self.conversation_id)
Bind usage-limit updates to the turn's goal
When a usage-limit error from an old turn races with thread/goal/set replacing or resuming the goal, this unqualified update marks whichever goal is currently active or budget-limited as usageLimited. The accounting path above already uses the turn's expected goal id, but this call discards it, so a newly created/resumed goal can be stopped even though it never hit the usage limit; please filter this status update by the turn's active goal id as the other goal mutations do with expected_goal_id.
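The suggested filter can be illustrated with a small in-memory sketch. The `Goal` struct, the `GoalStatus` enum, and `mark_usage_limited` are hypothetical stand-ins for the state database call, and the set of statuses allowed to transition is an assumption based on the comment above ("currently active or budget-limited"). The point is only that the update is skipped when the stored goal is not the one the failing turn belonged to.

```rust
/// Hypothetical goal states for illustration; not the real status types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    BudgetLimited,
    UsageLimited,
}

/// Hypothetical stored goal record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Goal {
    pub id: u64,
    pub status: GoalStatus,
}

/// Mark the current goal as usage-limited only if it is still the goal the
/// failing turn was running under. Returns true when the status was changed.
pub fn mark_usage_limited(current: &mut Option<Goal>, expected_goal_id: u64) -> bool {
    if let Some(goal) = current.as_mut() {
        let same_goal = goal.id == expected_goal_id;
        let can_transition =
            matches!(goal.status, GoalStatus::Active | GoalStatus::BudgetLimited);
        if same_goal && can_transition {
            goal.status = GoalStatus::UsageLimited;
            return true;
        }
    }
    // Either no goal is stored, or it was replaced or resumed after the old
    // turn started, so the stale error must not touch it.
    false
}
```

In a real database-backed version the same condition would presumably be part of the update's filter, so that the check and the write happen atomically.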
Addresses #22833, #22245, #23067
Why
/goal can keep synthesizing turns even when the next turn cannot make meaningful progress. Hard usage exhaustion can replay failing turns, and repeated permission or external-resource blockers can keep burning tokens while waiting for user or system intervention.
What changed
- Added blocked and usageLimited goal states. As with paused, goal continuation stops in these states.
- The goal moves to usageLimited after usage-limit failures.
- The update_goal tool can set blocked only under explicit repeated-impasse guidance. The goal continuation prompt now specifies that the agent should use blocked only when it has made at least three attempts to get past an impasse.
Most of the files touched by this PR changed because of the small app server protocol update.
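The continuation rule described above can be summarized in a minimal sketch. The `GoalStatus` enum, `should_continue`, `may_report_blocked`, and the `MIN_ATTEMPTS_BEFORE_BLOCKED` constant are all hypothetical names. Note in particular that, per the description, the three-attempt rule is guidance in the continuation prompt; the numeric check here only illustrates that guidance and is not claimed to exist in the code.

```rust
/// Hypothetical goal states for illustration; not the real status types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    UsageLimited,
    Complete,
}

/// Assumed threshold mirroring the prompt guidance of "at least three attempts".
pub const MIN_ATTEMPTS_BEFORE_BLOCKED: u32 = 3;

/// Continuation only synthesizes another turn while the goal is active;
/// paused, blocked, and usage-limited goals all stop the loop.
pub fn should_continue(status: GoalStatus) -> bool {
    matches!(status, GoalStatus::Active)
}

/// Illustrates the guidance that `blocked` is reserved for repeated impasses.
pub fn may_report_blocked(failed_attempts: u32) -> bool {
    failed_attempts >= MIN_ATTEMPTS_BEFORE_BLOCKED
}
```

Under these assumptions, a goal that hits the same impasse twice keeps running, and only the third failed attempt makes `blocked` an appropriate status.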
Validation
I manually reproduced a number of situations where an agent can run into a true impasse and verified that it properly enters blocked state. I then resumed and verified that it once again entered blocked state several turns later if the impasse still exists.
I also manually reproduced the usage-limit condition by creating a simulated Responses API endpoint that returns 429 errors with the appropriate error message. Verified that the goal runtime properly moves the goal into usageLimited state and the TUI updates appropriately. Verified that /goal resume resumes (and immediately goes back into usageLimited state if appropriate).
Follow-up PRs
Small changes will be needed to the GUI clients to properly handle the two new states.