Add session YOLO auto-approval budget by lewtun · Pull Request #201 · huggingface/ml-intern

lewtun · 2026-05-01T14:02:36Z

No description provided.

Co-authored-by: Codex <codex@openai.com>

github-actions · 2026-05-01T14:02:54Z

Claude finished @lewtun's task in 3m 57s

PR Review — Add session YOLO auto-approval budget

No blocking issues — 3 P1
Verdict: ready to merge

What I checked

agent/core/agent_loop.py — approval decision logic, batch reservation, EXEC_APPROVAL path, spend recording
agent/core/cost_estimation.py — price catalog fetching, timeout parsing, sandbox/job cost estimation
agent/core/session.py + session_persistence.py — policy fields, property, persistence round-trip
backend/session_manager.py — update_session_auto_approval, restore path, session info
backend/routes/agent.py — new PATCH /session/{id}/yolo endpoint, auth
frontend/src/components/YoloControl.tsx + sessionStore.ts + sse-chat-transport.ts — UI state, toggle, cap dialog
agent/main.py — CLI headless path, _is_scheduled_hf_job_tool guard
All three new test files + updated existing tests

P1 — Divergent scheduled-job detection between `main.py` and `agent_loop.py`

agent/main.py:58 uses startswith("scheduled ") to guard CLI YOLO:

return str(arguments.get("operation") or "").strip().lower().startswith("scheduled ")

agent/core/agent_loop.py:131 uses exact set membership:

_SCHEDULED_HF_JOB_RUNS = {"scheduled run", "scheduled uv"}

def _is_scheduled_hf_job_run(tool_name, tool_args):
    return tool_name == "hf_jobs" and _operation(tool_args) in _SCHEDULED_HF_JOB_RUNS

An LLM-generated operation such as "scheduled  run" (double space), "scheduled uv run", or any future scheduled variant would be caught by startswith in the CLI path but would silently fall through _is_scheduled_hf_job_run. _base_needs_approval would then return False (the operation isn't in the union set either), so the session YOLO path would auto-execute it without any approval or budget check.

Suggest extracting one predicate, _is_scheduled_op(operation: str) -> bool, shared by both files.
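
A minimal sketch of what such a shared predicate might look like, assuming the operation arrives as a free-form string. The helper names and the normalization step are illustrative and not taken from the PR; the prefix check mirrors the startswith("scheduled ") logic quoted from main.py, and the exact-set variant is included only to show where the two checks diverge.

```python
from typing import Any

# Exact-match set as quoted from agent_loop.py in this review.
_SCHEDULED_HF_JOB_RUNS = {"scheduled run", "scheduled uv"}


def _normalize(operation: Any) -> str:
    """Lower-case and trim the operation; None and other falsy values become ''."""
    return str(operation or "").strip().lower()


def is_scheduled_op(operation: Any) -> bool:
    """Prefix check (hypothetical shared helper): any 'scheduled <something>' matches."""
    return _normalize(operation).startswith("scheduled ")


def is_scheduled_exact(operation: Any) -> bool:
    """Exact-set check, shown only to contrast with the prefix check."""
    return _normalize(operation) in _SCHEDULED_HF_JOB_RUNS


if __name__ == "__main__":
    for op in ["scheduled run", "scheduled  run", "scheduled uv run", "run"]:
        print(f"{op!r}: prefix={is_scheduled_op(op)} exact={is_scheduled_exact(op)}")
```

With a single prefix-based helper, the double-space and "scheduled uv run" variants are classified as scheduled in both code paths, whereas the exact-set check accepts only the two literal strings.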

P1 — `_auto_approval` metadata pollutes the displayed tool-input arguments

frontend/src/lib/sse-chat-transport.ts:263:

const input = t.auto_approval_blocked
  ? { ...t.arguments, _auto_approval: { blocked: true, reason: ..., ... } }
  : t.arguments;
controller.enqueue({ type: 'tool-input-available', ..., input, dynamic: true });

The injected _auto_approval key becomes part of the tool-input chunk that drives the argument display in the UI. Users would see _auto_approval: { blocked: true, ... } rendered alongside actual tool arguments in the tool-call group, which is confusing. The sentinel is read back in ToolCallGroup.tsx:506 only by InlineApproval, but the surrounding argument rendering will also pick it up.

Consider passing the block metadata as a separate chunk type (e.g., tool-budget-block) rather than mutating the argument object.
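
The separation suggested here is language-agnostic. The sketch below illustrates it in Python, although the frontend itself is TypeScript. BudgetBlock, ToolCallView, and UiState are hypothetical names invented for this example: the idea is only that block metadata lives in a side table keyed by tool call id, so the arguments shown to the user stay exactly as the tool received them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class BudgetBlock:
    """Hypothetical record describing why auto-approval was blocked."""
    reason: str


@dataclass
class ToolCallView:
    """What the UI renders: arguments are never mutated with sentinel keys."""
    tool_call_id: str
    arguments: Dict[str, Any]


@dataclass
class UiState:
    # Side table keyed by tool call id, kept apart from the arguments.
    budget_blocks: Dict[str, BudgetBlock] = field(default_factory=dict)

    def set_budget_block(self, tool_call_id: str, block: BudgetBlock) -> None:
        self.budget_blocks[tool_call_id] = block

    def block_for(self, call: ToolCallView) -> Optional[BudgetBlock]:
        return self.budget_blocks.get(call.tool_call_id)


if __name__ == "__main__":
    state = UiState()
    call = ToolCallView("call-1", {"operation": "run", "timeout": "2h"})
    state.set_budget_block(call.tool_call_id, BudgetBlock(reason="cap exceeded"))
    print(call.arguments)         # arguments carry no _auto_approval key
    print(state.block_for(call))  # approval UI looks the block up by id
```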

P1 — EXEC_APPROVAL path records estimated spend regardless of whether YOLO is enabled

agent/core/agent_loop.py:1581:

if _is_budgeted_auto_approval_target(tool_name, tool_args):
    estimate = await estimate_tool_cost(tool_name, tool_args, session=session)
    _record_estimated_spend(session, ApprovalDecision(...))

This is called for every manually-approved budgeted tool, even when auto_approval_enabled=False. It means a session that has NEVER had YOLO on can silently accumulate spend in auto_approval_estimated_spend_usd. When the user later enables YOLO, the remaining budget shown in YoloControl may already be partially or fully exhausted by historical manual approvals — surprising behavior that isn't surfaced to the user.

The condition should check _session_auto_approval_enabled(session) (or at minimum that session.auto_approval_cost_cap_usd is not None) before recording.
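
A minimal sketch of the suggested guard, assuming a simplified stand-in for the session object. The field names follow the ones mentioned in this review; the Session class and the function name record_manual_spend_if_enabled are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Stand-in for the real session object; field names follow the review text."""
    auto_approval_enabled: bool = False
    auto_approval_cost_cap_usd: Optional[float] = None
    auto_approval_estimated_spend_usd: float = 0.0


def record_manual_spend_if_enabled(session: Session, estimate_usd: float) -> bool:
    """Record spend for a manually approved tool only while YOLO is on.

    Returns True if the spend was recorded.
    """
    if not session.auto_approval_enabled:
        return False  # YOLO off: manual approvals do not consume the budget
    session.auto_approval_estimated_spend_usd += estimate_usd
    return True
```

Under this guard, a session that never enabled YOLO keeps a recorded spend of zero, so the full cap is available if the user switches YOLO on later.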

Minor notes (P2)

cost_estimation.py:11: OPENID_PROVIDER_URL sourced from env makes JOBS_HARDWARE_URL operator-configurable, which is fine. Worth a comment that this is not user-controlled input.
session_manager.py:944: when enabled=True, cap_provided=True, and cost_cap_usd=None, the code silently uses DEFAULT_YOLO_COST_CAP_USD instead of allowing an uncapped mode. This is a deliberate safety choice but could confuse API callers who send {"enabled":true,"cost_cap_usd":null} expecting uncapped behavior. Not blocking, but worth a docstring note.
Batch reservation logic (agent_loop.py:1231–1243) is clean and the test test_batch_reservation_blocks_second_over_budget_job covers the key scenario well.
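
For readers unfamiliar with the pattern, batch reservation can be sketched roughly as follows. This is not the code from agent_loop.py: the function name, the return shape, and the rule that a blocked call leaves the remaining budget untouched are assumptions made for illustration.

```python
from typing import List, Tuple


def reserve_batch(
    remaining_usd: float, estimates_usd: List[float]
) -> Tuple[List[bool], float]:
    """Decide auto-approval for each call in a batch, in order.

    Each approved call reserves its estimate immediately, so later calls in
    the same batch see the reduced budget instead of the starting one.
    """
    decisions: List[bool] = []
    for estimate in estimates_usd:
        if estimate <= remaining_usd:
            remaining_usd -= estimate  # reserve before looking at the next call
            decisions.append(True)
        else:
            decisions.append(False)  # over budget: needs manual approval
    return decisions, remaining_usd


if __name__ == "__main__":
    # Budget of $5 with two $3 jobs: each fits alone, but not both together.
    print(reserve_batch(5.0, [3.0, 3.0]))
```

Without the in-loop reservation, both $3 jobs would be compared against the full $5 and both would be auto-approved, which is the scenario the test named above is meant to rule out.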

Co-authored-by: Codex <codex@openai.com>

github-actions · 2026-05-01T14:17:41Z

Claude finished @lewtun's task in 1m 40s

PR Review — Add session YOLO auto-approval budget (re-review)

No blocking issues — all P1s resolved
Verdict: ready to merge

P1 fixes verified

Divergent scheduled-job detection (agent_loop.py:140, main.py:70): Fixed. Both now call is_scheduled_operation() from the new shared approval_policy.py module, which uses startswith("scheduled "). _SCHEDULED_HF_JOB_RUNS set removed.
_auto_approval metadata in tool arguments (sse-chat-transport.ts:257): Fixed. The injection block was removed; block metadata is now stored via setToolBudgetBlock in useAgentChat.ts and read from agentStore.budgetBlocks[toolCallId] in ToolCallGroup.tsx:505. Tool input arguments are no longer mutated.
Spend recorded regardless of YOLO enabled (agent_loop.py:1603): Fixed. The inline record block was replaced with _record_manual_approved_spend_if_needed, which guards on _session_auto_approval_enabled(session) before recording.

No new P0 findings in the follow-up commit.

github-actions · 2026-05-01T16:59:23Z

Claude finished @lewtun's task in 54s

PR Review — Add session YOLO auto-approval budget (re-review #2)

LGTM
No new findings — only change since re-review #1 is a merge from main.
Verdict: ready to merge

What I checked

git diff 55f1368..bee5fac: the only delta is the merge of 7599843 ("Steer agent to HF kernels instead of pip install flash-attn"):
- agent/prompts/system_prompt_v3.yaml:45: updated tip — replaces "install flash-attn" guidance with HF kernels advice. Text-only change, no logic.
- agent/tools/docs_tools.py:935: updated kernels doc-string description to reflect the same. Text-only change.
All three P1 fixes from re-review #1 remain in place (verified in prior pass).
No new P0 findings.

Add session YOLO auto-approval budget

7629f97

Co-authored-by: Codex <codex@openai.com>

Address YOLO approval review feedback

55f1368

Co-authored-by: Codex <codex@openai.com>

Merge branch 'main' into codex/session-yolo-cost-cap

bee5fac

lewtun merged commit 77324b8 into main May 1, 2026
1 check passed

lewtun deleted the codex/session-yolo-cost-cap branch May 1, 2026 17:05

Praneeth16 mentioned this pull request May 7, 2026

Port HF #201: session YOLO auto-approval budget (defer until cost wired) Praneeth16/databricks-ml-intern#7

Closed

