Add session YOLO auto-approval budget#201

Merged
lewtun merged 3 commits into main from codex/session-yolo-cost-cap
May 1, 2026

lewtun commented May 1, 2026

No description provided.

Co-authored-by: Codex <codex@openai.com>

github-actions Bot commented May 1, 2026

Claude finished @lewtun's task in 3m 57s


PR Review — Add session YOLO auto-approval budget

No blocking issues — 3 P1
Verdict: ready to merge


What I checked

  • agent/core/agent_loop.py — approval decision logic, batch reservation, EXEC_APPROVAL path, spend recording
  • agent/core/cost_estimation.py — price catalog fetching, timeout parsing, sandbox/job cost estimation
  • agent/core/session.py + session_persistence.py — policy fields, property, persistence round-trip
  • backend/session_manager.py — update_session_auto_approval, restore path, session info
  • backend/routes/agent.py — new PATCH /session/{id}/yolo endpoint, auth
  • frontend/src/components/YoloControl.tsx + sessionStore.ts + sse-chat-transport.ts — UI state, toggle, cap dialog
  • agent/main.py — CLI headless path, _is_scheduled_hf_job_tool guard
  • All three new test files + updated existing tests

P1 — Divergent scheduled-job detection between main.py and agent_loop.py

agent/main.py:58 uses startswith("scheduled ") to guard CLI YOLO:

return str(arguments.get("operation") or "").strip().lower().startswith("scheduled ")

agent/core/agent_loop.py:131 uses exact set membership:

_SCHEDULED_HF_JOB_RUNS = {"scheduled run", "scheduled uv"}

def _is_scheduled_hf_job_run(tool_name, tool_args):
    return tool_name == "hf_jobs" and _operation(tool_args) in _SCHEDULED_HF_JOB_RUNS

An LLM-generated operation like "scheduled  run" (double-space), "scheduled uv run", or any future scheduled variant would be caught by startswith in the CLI path but would silently fall through _is_scheduled_hf_job_run, causing _base_needs_approval to return False (the operation isn't in the union set either). The session YOLO path would then auto-execute without any approval or budget check.

Suggest extracting one predicate: _is_scheduled_op(operation: str) -> bool shared by both files.
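
A minimal sketch of where the two checks diverge and what a shared predicate could look like. The name is_scheduled_op follows the suggestion above (without the leading underscore). EXACT_SCHEDULED_OPS and the normalization helper are hypothetical stand-ins, and it is an assumption that the agent_loop.py path normalizes the operation the same way main.py does.

```python
# Hypothetical stand-in for the exact-match set quoted from agent_loop.py.
EXACT_SCHEDULED_OPS = {"scheduled run", "scheduled uv"}


def _normalize(operation: object) -> str:
    """Coerce None or non-strings to a stripped, lowercase string."""
    return str(operation or "").strip().lower()


def is_scheduled_op(operation: object) -> bool:
    """Prefix check, mirroring the startswith logic quoted from main.py."""
    return _normalize(operation).startswith("scheduled ")


def is_scheduled_exact(operation: object) -> bool:
    """Exact-membership check, mirroring the agent_loop.py logic."""
    return _normalize(operation) in EXACT_SCHEDULED_OPS


if __name__ == "__main__":
    # Print both verdicts side by side to see which inputs disagree.
    for op in ["scheduled run", "scheduled  run", "scheduled uv run", "run", None]:
        print(repr(op), is_scheduled_op(op), is_scheduled_exact(op))
```

Any input where the two functions disagree is an operation that one code path would gate and the other would let through, which is why a single shared predicate is the safer design.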


P1 — _auto_approval metadata pollutes the displayed tool-input arguments

frontend/src/lib/sse-chat-transport.ts:263:

const input = t.auto_approval_blocked
  ? { ...t.arguments, _auto_approval: { blocked: true, reason: ..., ... } }
  : t.arguments;
controller.enqueue({ type: 'tool-input-available', ..., input, dynamic: true });

The injected _auto_approval key becomes part of the tool-input chunk that drives the argument display in the UI. Users would see _auto_approval: { blocked: true, ... } rendered alongside actual tool arguments in the tool-call group, which is confusing. The sentinel is read back in ToolCallGroup.tsx:506 only by InlineApproval, but the surrounding argument rendering will also pick it up.

Consider passing the block metadata as a separate chunk type (e.g., tool-budget-block) rather than mutating the argument object.


P1 — EXEC_APPROVAL path records estimated spend regardless of whether YOLO is enabled

agent/core/agent_loop.py:1581:

if _is_budgeted_auto_approval_target(tool_name, tool_args):
    estimate = await estimate_tool_cost(tool_name, tool_args, session=session)
    _record_estimated_spend(session, ApprovalDecision(...))

This is called for every manually-approved budgeted tool, even when auto_approval_enabled=False. It means a session that has NEVER had YOLO on can silently accumulate spend in auto_approval_estimated_spend_usd. When the user later enables YOLO, the remaining budget shown in YoloControl may already be partially or fully exhausted by historical manual approvals — surprising behavior that isn't surfaced to the user.

The condition should check _session_auto_approval_enabled(session) (or at minimum session.auto_approval_cost_cap_usd is not None) before recording.
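
A minimal sketch of the suggested guard, assuming a simplified session object. The three field names come from the review text; SessionPolicy, record_manual_spend_if_enabled, and remaining_budget are hypothetical names for illustration and are not the PR's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionPolicy:
    """Hypothetical stand-in for the session's auto-approval fields."""
    auto_approval_enabled: bool = False
    auto_approval_cost_cap_usd: Optional[float] = None
    auto_approval_estimated_spend_usd: float = 0.0


def record_manual_spend_if_enabled(session: SessionPolicy, estimate_usd: float) -> bool:
    """Record spend for a manually approved tool only while YOLO is on.

    Returns True when the estimate was added to the running total.
    """
    if not session.auto_approval_enabled:
        return False
    session.auto_approval_estimated_spend_usd += estimate_usd
    return True


def remaining_budget(session: SessionPolicy) -> Optional[float]:
    """Remaining budget in USD, or None when no cap is set."""
    if session.auto_approval_cost_cap_usd is None:
        return None
    spent = session.auto_approval_estimated_spend_usd
    return max(0.0, session.auto_approval_cost_cap_usd - spent)
```

With the guard in place, manual approvals made before YOLO is enabled leave the running total untouched, so the budget shown after enabling starts from the full cap.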


Minor notes (P2)

  • cost_estimation.py:11: OPENID_PROVIDER_URL sourced from env makes JOBS_HARDWARE_URL operator-configurable, which is fine. Worth a comment that this is not user-controlled input.
  • session_manager.py:944: when enabled=True, cap_provided=True, and cost_cap_usd=None, the code silently uses DEFAULT_YOLO_COST_CAP_USD instead of allowing an uncapped mode. This is a deliberate safety choice but could confuse API callers who send {"enabled":true,"cost_cap_usd":null} expecting uncapped behavior. Not blocking, but worth a docstring note.
  • Batch reservation logic (agent_loop.py:1231–1243) is clean and the test test_batch_reservation_blocks_second_over_budget_job covers the key scenario well.
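
For readers unfamiliar with the batch reservation idea mentioned in the last note, here is a rough sketch of the general technique. It is an illustration under assumptions, not the code in agent_loop.py: reserve_batch and its signature are hypothetical, and the real implementation may differ in ordering, rounding, or how blocked jobs are reported.

```python
from typing import List, Tuple


def reserve_batch(remaining_usd: float, estimates_usd: List[float]) -> List[Tuple[float, bool]]:
    """Decide auto-approval for each job in a batch, in order.

    Each approved job reserves its estimate against the remaining budget
    before the next job is evaluated, so two jobs that each fit alone
    cannot both pass when their sum exceeds the budget.
    """
    decisions: List[Tuple[float, bool]] = []
    reserved = 0.0
    for estimate in estimates_usd:
        fits = reserved + estimate <= remaining_usd
        if fits:
            reserved += estimate
        decisions.append((estimate, fits))
    return decisions
```

Under these assumptions, a batch of two 3.0 USD jobs against a 5.0 USD remaining budget approves the first and blocks the second, which is the scenario the named test is described as covering.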

Co-authored-by: Codex <codex@openai.com>

github-actions Bot commented May 1, 2026

Claude finished @lewtun's task in 1m 40s


PR Review — Add session YOLO auto-approval budget (re-review)

No blocking issues — all P1s resolved
Verdict: ready to merge


P1 fixes verified

  • Divergent scheduled-job detection (agent_loop.py:140, main.py:70): Fixed. Both now call is_scheduled_operation() from the new shared approval_policy.py module, which uses startswith("scheduled "). _SCHEDULED_HF_JOB_RUNS set removed.

  • _auto_approval metadata in tool arguments (sse-chat-transport.ts:257): Fixed. The injection block was removed; block metadata is now stored via setToolBudgetBlock in useAgentChat.ts and read from agentStore.budgetBlocks[toolCallId] in ToolCallGroup.tsx:505. Tool input arguments are no longer mutated.

  • Spend recorded regardless of YOLO enabled (agent_loop.py:1603): Fixed. The inline record block was replaced with _record_manual_approved_spend_if_needed, which guards on _session_auto_approval_enabled(session) before recording.


No new P0 findings in the follow-up commit.

github-actions Bot commented May 1, 2026

Claude finished @lewtun's task in 54s


PR Review — Add session YOLO auto-approval budget (re-review #2)

LGTM
No new findings — only change since re-review #1 is a merge from main.
Verdict: ready to merge


What I checked

  • git diff 55f1368..bee5fac: the only delta is the merge of 7599843 ("Steer agent to HF kernels instead of pip install flash-attn"):
    • agent/prompts/system_prompt_v3.yaml:45: updated tip — replaces "install flash-attn" guidance with HF kernels advice. Text-only change, no logic.
    • agent/tools/docs_tools.py:935: updated kernels doc-string description to reflect the same. Text-only change.
  • All three P1 fixes from re-review #1 remain in place (verified in prior pass).
  • No new P0 findings.

lewtun merged commit 77324b8 into main on May 1, 2026
1 check passed
lewtun deleted the codex/session-yolo-cost-cap branch on May 1, 2026 17:05