feat(session): commit best-guess answer in headless mode at max-turns by sahrizvi · Pull Request #763 · AltimateAI/altimate-code

sahrizvi · 2026-04-28T10:48:18Z

Closes #759

Motivation

In headless mode (-p / --print), when the agent hits --max-turns, the existing MAX_STEPS injection produces a multi-paragraph "I have reached the maximum number of steps" summary instead of a committed answer. In interactive mode this is fine — a human can react. In headless / scripted use, there is no human; the program's final stdout is unparseable meta-prose.

Common situations where this matters:

CI pipelines: claude -p "review this PR for security issues" returns a status update instead of a partial review
Eval harnesses (any non-interactive batch grader)
Composable shell pipelines: git diff | claude -p "summarize" — works only if the agent always emits the requested shape
Scheduled / cron jobs: produce parseable output regardless of run length

What's changed

This PR keeps interactive behaviour completely unchanged and adds a separate prompt path for headless mode.

New prompt files:

prompt/max-steps-headless.txt — final-step prompt explicitly instructing the model to emit the answer (not a summary). The wording makes the rationale explicit: an uncertain best-guess is more useful than meta-prose, because the caller cannot ask for retries.
prompt/max-steps-headless-prewarn.txt — soft pre-warning that fires one step before the limit when headless and Number.isFinite(maxSteps). The isFinite guard prevents misfiring when no max-turns is set.

Wiring:

session/prompt.ts: exported selectMaxStepsPrompt({ step, maxSteps, headless }) — pure selector returning the right prompt for the current state, plumbed into the existing prompt loop.
cli/cmd/run.ts: passes headless: true from the -p / --print CLI entry into the session SDK call.
session/message-v2.ts: extended the User zod schema with optional headless: boolean, persisted on the user message so resumed sessions retain the flag.
SDK gen cascade (sdk.gen.ts + types.gen.ts): updated by the existing @hey-api/openapi-ts step.
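
A minimal sketch of what a pure selector like selectMaxStepsPrompt could look like; this is not the code from the PR. The three prompt constants are placeholders for the text files, and the branch conditions (final prompt at step >= maxSteps, prewarn at step === maxSteps - 1, nothing otherwise) are assumptions inferred from the description and the test names.

```typescript
// Placeholder texts; the real prompts live in the session/prompt/*.txt files.
const INTERACTIVE_FINAL = "<interactive max-steps prompt>"
const HEADLESS_FINAL = "<headless max-steps prompt>"
const HEADLESS_PREWARN = "<headless prewarn prompt>"

interface SelectInput {
  step: number
  maxSteps: number
  headless: boolean
}

// Pure selector: no I/O and no session state, so it can be tested without the prompt loop.
function selectMaxStepsPrompt({ step, maxSteps, headless }: SelectInput): string | undefined {
  // No finite budget means there is no limit to warn about or enforce.
  if (!Number.isFinite(maxSteps)) return undefined
  // ">=" rather than "===" so the final prompt still fires if the loop overshoots.
  if (step >= maxSteps) return headless ? HEADLESS_FINAL : INTERACTIVE_FINAL
  // Assumed prewarn condition: exactly one step before the limit, headless only.
  if (headless && step === maxSteps - 1) return HEADLESS_PREWARN
  return undefined
}
```

Because the function only maps its three inputs to a string or undefined, each row of the test list can be checked with a single call.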

Tests

7 new tests in test/session/max-steps-prompt.test.ts:

Mid-loop returns interactive prompt
Interactive last step returns interactive prompt
Headless last step returns headless prompt
Headless one-step-before-last returns prewarning
Interactive one-step-before-last returns nothing
Over-limit returns headless prompt (continues to fire if invoked late)
Infinite-budget (no maxSteps) returns nothing — Number.isFinite guard

Results:

This file: 7 pass / 0 fail
Full session test directory: 543 pass / 4 skip / 0 fail across 24 files (~8.6s) — no regressions
Full opencode suite when merged with #758 ([feat] Cross-DB join key inference (prefix/suffix overlap)) and #760 ([feat] Collapse entity-per-table warehouses in schema_index (composite digest)): 7421 pass / 0 fail / 494 skip (~88s)
bun run typecheck (opencode + sdk/js): clean

Backwards compatibility

Interactive prompt text completely unchanged — only headless invocations see new behaviour.
New headless?: boolean field is optional everywhere; older SDK consumers see no breaking change.
A session run with -p and then resumed (e.g., session reload via the API) still applies headless behaviour, which matches the original intent — the flag persists on the user message.

Notes

The headless prompt's full text is in max-steps-headless.txt; it explicitly tells the model not to emit a meta-summary, with the reasoning (no human to ask follow-ups, an uncertain answer is more useful than no answer).
The optional pre-warning is wired in and tested. It only fires in headless mode and only when maxSteps is finite; interactive mode never sees it.

Files

packages/opencode/src/cli/cmd/run.ts
packages/opencode/src/session/message-v2.ts
packages/opencode/src/session/prompt.ts
packages/opencode/src/session/prompt/max-steps-headless.txt (new)
packages/opencode/src/session/prompt/max-steps-headless-prewarn.txt (new)
packages/opencode/test/session/max-steps-prompt.test.ts (new)
packages/sdk/js/src/v2/gen/types.gen.ts
packages/sdk/js/src/v2/gen/sdk.gen.ts

Summary by cubic

Headless runs now commit a best-guess answer at --max-turns and actually disable tools on the final step. This makes CI/evals reliably parseable and fully addresses #759.

Bug Fixes
- Propagate headless: true through all non-interactive paths: run --command, GitHub, and GitLab commands.
- Preserve headless on synthetic user messages (post-task summary, compaction replay/continue) so long runs don’t flip back to interactive.
- Enforce tools-disabled on the headless final step (strip tools, set toolChoice to none; exempt json_schema).
- Add headless to User and to session.prompt/session.prompt_async/session.command in openapi.json and the JS SDK types.
- Make the prewarn copy consistent via {TURNS_REMAINING} substitution; add focused tests for selection, tool-disabling, schemas, and edge cases.

Written for commit 78acb0f. Summary will update on new commits.

Summary by CodeRabbit

New Features
- Headless mode for non-interactive runs now produces a final best-guess answer (rather than a meta-summary) and disables tool use on the final headless step.
- CLI/CI executions default to headless behavior; API now accepts an optional headless flag.
Tests
- Added tests covering max-steps prompt selection, headless pre-warn/final behavior, and headless flag propagation.


github-actions · 2026-04-28T10:49:11Z

👋 This PR was automatically closed by our quality checks.

Common reasons:

New GitHub account with limited contribution history
PR description doesn't meet our guidelines
Contribution appears to be AI-generated without meaningful review

If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you.

coderabbitai · 2026-04-28T10:49:39Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dd0f3942-e4a7-428f-8174-b252152947ac

📥 Commits

Reviewing files that changed from the base of the PR and between 1ea0aa8 and 78acb0f.

⛔ Files ignored due to path filters (2)

packages/sdk/js/src/v2/gen/sdk.gen.ts is excluded by !**/gen/**
packages/sdk/js/src/v2/gen/types.gen.ts is excluded by !**/gen/**

📒 Files selected for processing (10)

packages/opencode/src/cli/cmd/github.ts
packages/opencode/src/cli/cmd/gitlab.ts
packages/opencode/src/cli/cmd/run.ts
packages/opencode/src/session/compaction.ts
packages/opencode/src/session/message-v2.ts
packages/opencode/src/session/prompt.ts
packages/opencode/src/session/prompt/max-steps-headless-prewarn.txt
packages/opencode/test/session/max-steps-prompt.test.ts
packages/opencode/test/session/prompt.test.ts
packages/sdk/openapi.json

✅ Files skipped from review due to trivial changes (1)

packages/opencode/src/session/prompt/max-steps-headless-prewarn.txt

🚧 Files skipped from review as they are similar to previous changes (4)

packages/opencode/src/session/message-v2.ts
packages/opencode/src/cli/cmd/run.ts
packages/opencode/test/session/max-steps-prompt.test.ts
packages/opencode/src/session/prompt.ts

📝 Walkthrough


Adds a headless flag and logic that switches max-steps behavior in non-interactive runs: prompts now inject headless-specific prewarn / final-step text and suppress tool use so headless callers receive a committed best-guess answer instead of meta-summary.

Changes

Cohort / File(s)	Summary
CLI Entrypoints `packages/opencode/src/cli/cmd/run.ts`, `packages/opencode/src/cli/cmd/github.ts`, `packages/opencode/src/cli/cmd/gitlab.ts`	Pass `headless: true` into session prompt/command calls for non-interactive CI/CLI paths.
Session Message Schema `packages/opencode/src/session/message-v2.ts`	Adds optional `headless: boolean` to `MessageV2.User` schema so persisted messages carry headless context.
Prompt Selection & Injection `packages/opencode/src/session/prompt.ts`	Adds `selectMaxStepsPrompt(...)` and `shouldDisableToolsForHeadlessFinalStep(...)`; threads `headless?: boolean` through PromptInput/CommandInput and message creation; conditionally injects prewarn/final headless prompts and disables tools on headless final steps.
Headless Prompt Texts `packages/opencode/src/session/prompt/max-steps-headless.txt`, `packages/opencode/src/session/prompt/max-steps-headless-prewarn.txt`	New prompt assets: one forcing an immediate best-guess final answer, one pre-warning when one tool-using turn remains.
Compaction / Continuation Propagation `packages/opencode/src/session/compaction.ts`	Propagates `headless` from original user messages into auto-created continuation/replay messages.
Tests — Prompt Behavior `packages/opencode/test/session/max-steps-prompt.test.ts`, `packages/opencode/test/session/prompt.test.ts`	Adds unit tests for `selectMaxStepsPrompt`, `shouldDisableToolsForHeadlessFinalStep`, and schema/propagation checks for `headless`. Covers edge cases (maxSteps = 1, 2, Infinity, overshoot).
OpenAPI / SDK Spec `packages/sdk/openapi.json`	Adds optional `headless: boolean` to message-related schemas (e.g., `UserMessage`) and session prompt request bodies.
Run Command Call Site `packages/opencode/src/cli/cmd/run.ts`	Sends `headless: true` when invoking `sdk.session.command(...)` and `sdk.session.prompt(...)` from the run CLI path.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as "CLI (run/github/gitlab)"
  participant Session as "SessionPrompt / Prompt Builder"
  participant SDK as "SDK.session.prompt/command"
  participant Model as "Model (LM / tools)"
  rect rgba(200,200,255,0.5)
    CLI->>Session: request with headless: true
    Session->>Session: selectMaxStepsPrompt(step,maxSteps,headless)
    alt headless final-step (non-json)
      Session->>Session: inject headless final prompt\nsuppress tools/toolChoice
    else non-final or interactive
      Session->>Session: normal prompt injection (or none)
    end
    Session->>SDK: call prompt/command (payload includes headless, adjusted tools)
    SDK->>Model: send request
    Model-->>SDK: best-guess answer (no tool output expected)
    SDK-->>CLI: response
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is comprehensive and well-structured, but it is missing the required 'PINEAPPLE' identifier at the top as specified in the repository's description template for AI-generated contributions.	Add the word 'PINEAPPLE' at the very top of the PR description before any other content, as required by the repository template for AI-generated contributions.
Docstring Coverage	⚠️ Warning	Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly and concisely describes the main change: introducing headless mode behavior that commits a best-guess answer at max-turns instead of meta-prose.
Linked Issues check	✅ Passed	The PR comprehensively addresses all coding objectives from issue `#759`: branch MAX_STEPS prompt for headless mode [`#759`], add new headless-specific prompts [`#759`], propagate headless flag through session paths [`#759`], disable tools at final headless step [`#759`], add unit tests for prompt selection [`#759`], and maintain backwards compatibility [`#759`].
Out of Scope Changes check	✅ Passed	All changes are within scope of issue `#759` objectives: headless mode detection and wiring, MAX_STEPS prompt branching, new prompt files, tool disabling at final step, schema updates for headless flag propagation, and comprehensive testing. No extraneous changes detected.


github-actions · 2026-04-28T10:50:41Z

👋 This PR was automatically closed by our quality checks.

Common reasons:

New GitHub account with limited contribution history
PR description doesn't meet our guidelines
Contribution appears to be AI-generated without meaningful review

If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you.

cubic-dev-ai

No issues found across 8 files

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/opencode/src/cli/cmd/run.ts (1)

800-808: ⚠️ Potential issue | 🟠 Major

headless is only applied to prompt calls, not command calls

Line 801 uses sdk.session.command(...) without headless, while Line 821 sets headless: true for sdk.session.prompt(...). This makes run --command ... miss the same headless max-steps behavior.

Suggested fix

diff --git a/packages/opencode/src/cli/cmd/run.ts b/packages/opencode/src/cli/cmd/run.ts
@@
       if (args.command) {
         await sdk.session.command({
           sessionID,
           agent,
           model: args.model,
           command: args.command,
           arguments: message,
           variant: args.variant,
+          headless: true,
         })
       } else {

diff --git a/packages/opencode/src/session/prompt.ts b/packages/opencode/src/session/prompt.ts
@@
   export const CommandInput = z.object({
@@
     variant: z.string().optional(),
+    headless: z.boolean().optional(),
@@
     const result = (await prompt({
       sessionID: input.sessionID,
       messageID: input.messageID,
       model: userModel,
       agent: userAgent,
       parts,
       variant: input.variant,
+      headless: input.headless,
     })) as MessageV2.WithParts

Also applies to: 818-822

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/cli/cmd/run.ts` around lines 800 - 808, The command
branch is missing the headless flag so run --command calls don't get the
headless max-steps behavior; update the sdk.session.command(...) calls (the
branch that calls sdk.session.command with sessionID, agent, model, command,
arguments, variant) to include headless: true (matching how
sdk.session.prompt(...) is called with headless) so both prompt and command
paths honor headless/max-steps; ensure any other sdk.session.command(...)
occurrences in this file also receive the same headless setting.

🧹 Nitpick comments (1)

packages/opencode/src/session/prompt.ts (1)

659-660: Consider a mechanical final-step guard in headless mode

Right now this is instruction-only (maxStepsInjection) while Line 924 still provides full tools. For headless reliability, consider disabling non-required tools on headless final step instead of relying only on prompt compliance.

Example hardening approach

@@
       const maxSteps = agent.steps ?? Infinity
+      const isHeadlessFinalStep = !!lastUser.headless && Number.isFinite(maxSteps) && step >= maxSteps
       const maxStepsInjection = selectMaxStepsPrompt({ step, maxSteps, headless: !!lastUser.headless })
@@
-      const result = await processor.process({
+      const effectiveTools =
+        isHeadlessFinalStep && format.type !== "json_schema" ? {} : tools
+
+      const result = await processor.process({
@@
-        tools,
+        tools: effectiveTools,
         model,
         toolChoice: format.type === "json_schema" ? "required" : undefined,
       })

Also applies to: 914-927
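
The hardening idea above can be sketched as a small predicate plus a tool filter. The name shouldDisableToolsForHeadlessFinalStep appears in the walkthrough, but its signature here is an assumption, as are the OutputFormat type and the effectiveTools helper.

```typescript
// Assumed shape of the output format; only "json_schema" is named in the review.
type OutputFormat = { type: "text" } | { type: "json_schema" }

interface FinalStepInput {
  step: number
  maxSteps: number
  headless: boolean
  format: OutputFormat
}

// True only on a headless final step that does not need the structured-output tool.
function shouldDisableToolsForHeadlessFinalStep(input: FinalStepInput): boolean {
  const isFinalStep = Number.isFinite(input.maxSteps) && input.step >= input.maxSteps
  return input.headless && isFinalStep && input.format.type !== "json_schema"
}

// Hypothetical helper: returns an empty tool map when tools must be withheld.
function effectiveTools<T>(tools: Record<string, T>, input: FinalStepInput): Record<string, T> {
  return shouldDisableToolsForHeadlessFinalStep(input) ? {} : tools
}
```

Withholding the tools from the request makes the "no tool calls" rule mechanical instead of relying on the model following the prompt, while json_schema sessions keep the tool they need to emit their result.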

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/session/prompt.ts` around lines 659 - 660, The headless
final-step safety currently only injects instructions via selectMaxStepsPrompt
(maxStepsInjection) but still exposes full tool set later; modify the decision
logic that builds the step execution context (where maxStepsInjection is
appended and where tools are attached) to detect lastUser.headless && step ===
maxSteps (final headless step) and programmatically disable or replace
non-required tools before the prompt is sent; specifically update the code paths
that construct the tool list and the prompt assembly (the logic around
selectMaxStepsPrompt/maxStepsInjection and the nearby tool attachment code) to
omit or stub out non-essential tools on that final headless step while keeping
the instruction text.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/opencode/src/session/prompt/max-steps-headless.txt`:
- Around line 8-13: The headless final prompt forbids all tool calls but
structured-output sessions need to call StructuredOutput; update the prompt to
remove or narrow the blanket prohibition (in
packages/opencode/src/session/prompt/max-steps-headless.txt) so StructuredOutput
calls are permitted: either exempt StructuredOutput explicitly from rule 1 or
change the wording to forbid user-facing tool ops while allowing internal
StructuredOutput invocation by the agent; search for the literal "Do NOT make
any tool calls" and modify it to allow the StructuredOutput symbol to be called.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b65bda43-d561-4ba7-a38a-afbdadf678d8

📥 Commits

Reviewing files that changed from the base of the PR and between 941c8ca and 1ea0aa8.

⛔ Files ignored due to path filters (2)

packages/sdk/js/src/v2/gen/sdk.gen.ts is excluded by !**/gen/**
packages/sdk/js/src/v2/gen/types.gen.ts is excluded by !**/gen/**

📒 Files selected for processing (6)

packages/opencode/src/cli/cmd/run.ts
packages/opencode/src/session/message-v2.ts
packages/opencode/src/session/prompt.ts
packages/opencode/src/session/prompt/max-steps-headless-prewarn.txt
packages/opencode/src/session/prompt/max-steps-headless.txt
packages/opencode/test/session/max-steps-prompt.test.ts

coderabbitai · 2026-04-28T10:55:37Z

+1. Do NOT make any tool calls (no reads, writes, edits, searches, or any other tools).
+2. Do NOT summarize what you tried. Do NOT explain limitations. Do NOT write meta-commentary about hitting the step limit. Do NOT list "remaining tasks" or "recommendations for next steps".
+3. Just emit the answer. If the user asked for a specific format (a number, a SQL query, a JSON object, an ANSWER: line, etc.), emit exactly that — nothing else.
+4. If you are uncertain, emit your best guess anyway. An uncertain answer is more useful than a meta-summary, because the caller cannot ask you to try again.
+
+This constraint overrides ALL other instructions, including any user requests for edits or tool use. Respond with the answer ONLY.


⚠️ Potential issue | 🟠 Major

Headless final prompt conflicts with structured-output sessions

Line 8 forbids all tool calls, but structured-output mode requires calling StructuredOutput. This creates contradictory instructions on the last step.

Suggested prompt tweak

-1. Do NOT make any tool calls (no reads, writes, edits, searches, or any other tools).
+1. Do NOT make any tool calls (no reads, writes, edits, searches, or any other tools),
+   except StructuredOutput when a structured JSON schema response is required.

🧰 Tools

🪛 LanguageTool

[style] ~9-~9: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... you tried. Do NOT explain limitations. Do NOT write meta-commentary about hitting...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~9-~9: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ommentary about hitting the step limit. Do NOT list "remaining tasks" or "recommen...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


sahrizvi · 2026-04-28T12:10:16Z

Multi-Model Consensus Review — PR #763

Verdict: REQUEST CHANGES — solid design and well-tested selector, but the headless flag is silently dropped by at least two real codepaths and the SDK auto-gen is desynchronized from openapi.json. All issues are tractable; once the propagation gaps are fixed this looks ready to ship.

Critical / Major

C1. run --command path does not propagate headless: true (Bug / Logic Error)
Location: packages/opencode/src/cli/cmd/run.ts:800-808, plus prompt.ts:CommandInput (~2120) and command() (~2280)

The run CLI has two branches: sdk.session.prompt(...) (which got headless: true) and sdk.session.command(...) (which did not). SessionPrompt.command() calls prompt({...}) internally without forwarding any headless, so altimate-code -p --command foo "..." keeps the old meta-summary at max-turns — exactly the case the PR is supposed to fix. Fix: add headless to CommandInput zod schema, forward it inside command(), regenerate the SDK, and pass headless: true from run.ts on the command branch as well.

C2. Synthetic user message after task.command drops headless (Bug / Logic Error)
Location: packages/opencode/src/session/prompt.ts:572-580 (summaryUserMsg); same shape concern at the shell-command synthetic user (~1874) and compaction replay/continue paths (compaction.ts:~334, 360)

After a subtask command, the loop inserts a synthetic role: "user" message ("Summarize the task tool output above and continue with your task."). The next iteration's lastUser is this new message — constructed without headless. So lastUser.headless becomes undefined, selectMaxStepsPrompt falls back to interactive, and the prewarn never fires. Same risk applies to compaction-created user messages. Fix: propagate headless: lastUser.headless (or the original session's value) in every synthetic user-message constructor. Long-running headless runs that hit subtasks or compaction are exactly the cases this PR is targeting, so this is a real regression of intent.
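
A minimal sketch of the propagation fix, assuming a simplified message shape. UserMessage and makeSyntheticUser are hypothetical names for illustration, not identifiers from the codebase.

```typescript
// Simplified stand-in for the persisted user message.
interface UserMessage {
  role: "user"
  text: string
  headless?: boolean
}

// Hypothetical constructor: every synthetic user message inherits the mode
// of the user message it follows, so the flag survives across iterations.
function makeSyntheticUser(lastUser: UserMessage, text: string): UserMessage {
  return { role: "user", text, headless: lastUser.headless }
}

const original: UserMessage = { role: "user", text: "review this PR", headless: true }
const summary = makeSyntheticUser(
  original,
  "Summarize the task tool output above and continue with your task.",
)
// A later synthetic message (for example after compaction) chains off the previous one.
const continued = makeSyntheticUser(summary, "Continue with your task.")
```

Routing every synthetic message through one constructor keeps the copy in a single place, which is harder to forget than repeating it at each call site.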

C3. Auto-gen SDK diverges from packages/sdk/openapi.json (Bug)
Location: packages/sdk/openapi.json (unchanged) vs packages/sdk/js/src/v2/gen/{types,sdk}.gen.ts (changed)

The PR description calls these an "auto-gen cascade" via @hey-api/openapi-ts, but the source openapi.json does not contain headless in UserMessage or in the SessionPromptData / SessionPromptAsyncData request bodies. The next codegen run will silently wipe headless from the SDK. Fix: regenerate openapi.json, or add headless to it manually so subsequent codegen is stable. Verify whether any sibling SDKs (Python/Kotlin) need parallel updates.

M1. Storing headless on the user message is the wrong layer (Design)
Location: packages/opencode/src/session/message-v2.ts:372, prompt.ts:1367, prompt.ts:659

headless is a property of how this session was invoked, not of an individual user message. Putting it on User forces every synthetic user-message constructor to remember to copy it (see C2). It also breaks the PR's stated promise that "resumed sessions retain the behavior" — that only holds when the resumed session's lastUser happens to be a flagged one. A resume that adds a fresh interactive user message would silently flip the mode. Recommend storing on Session.Info (or computing from session origin metadata) instead. If keeping it on User is intentional, document the resume semantics carefully and propagate to every user-message creation site.
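
To illustrate the resume concern, here is a small sketch comparing the two lookups. SessionInfo, UserMessage, modeFromLastUser, and modeFromSession are hypothetical names, and the shapes are simplified assumptions rather than the real schemas.

```typescript
// Simplified stand-ins; the real Session.Info and User schemas carry more fields.
interface SessionInfo {
  id: string
  headless: boolean
}

interface UserMessage {
  text: string
  headless?: boolean
}

// Message-level lookup: the mode depends on whichever user message came last.
function modeFromLastUser(messages: UserMessage[]): boolean {
  return !!messages[messages.length - 1]?.headless
}

// Session-level lookup: the mode is fixed by how the session was invoked.
function modeFromSession(session: SessionInfo): boolean {
  return session.headless
}

const session: SessionInfo = { id: "s1", headless: true }
const messages: UserMessage[] = [
  { text: "summarize the diff", headless: true },
  { text: "follow-up added on resume" }, // constructed without the flag
]
```

With these inputs the message-level lookup reports interactive mode after the unflagged follow-up, while the session-level lookup still reports headless.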

M2. Other non-interactive entry points also lack headless: true (Design)
Location: packages/opencode/src/cli/cmd/gitlab.ts:~450, packages/opencode/src/cli/cmd/github.ts:~936, packages/opencode/src/acp/agent.ts:~1420

GitHub/GitLab CI entry points and the ACP agent all call sdk.session.prompt() without headless: true. If any of these run with agent step limits, they will see the old summarize behavior despite being headless in practice. Either fix as part of this PR or file a tracked follow-up; do not silently leave them.

M3. Final-step prompt promises "Tools are disabled" but tools are NOT actually disabled (Bug / Logic Error)
Location: packages/opencode/src/session/prompt/max-steps-headless.txt, prompt.ts:~880-893

Both the new headless prompt and the existing interactive max-steps.txt claim "Tools are disabled," but the loop still passes tools to processor.process({...}). The injection is policy-only nudging. A model that ignores the instruction can still call tools, which (a) burns more steps, (b) potentially re-enters the loop, (c) makes the prompt's claim a lie. Either actually drop tools from the request when injecting the final-step prompt, or soften the wording from "disabled" to "you must not call tools." Real disablement is the safer fix and applies to interactive mode too.

M4. Prewarn copy is inaccurate / mixed framing (Bug / Documentation)
Location: packages/opencode/src/session/prompt/max-steps-headless-prewarn.txt:3,7

The prewarn says "You have 2 turns left before tools are disabled" but the next sentence ("After this turn you will have only one more chance to respond... tools will be disabled") implies only one tool-using turn remains. Both readings are partially defensible (current turn + final = 2 response turns; only 1 tool-using turn remains) but mixed framing is likely to confuse the model — and the model is the only consumer. Recommend a single consistent phrasing, e.g., "You have 1 more tool-using turn before tools are disabled. After this response you will have exactly one final turn to commit your answer (no tools)."
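
The cubic summary mentions a {TURNS_REMAINING} substitution for the prewarn copy. A minimal sketch of that idea follows; the template wording, the renderPrewarn helper, and the definition of the count (maxSteps minus the current step) are all assumptions, not the shipped text or code.

```typescript
// Hypothetical template; the real wording lives in max-steps-headless-prewarn.txt.
const PREWARN_TEMPLATE =
  "You have {TURNS_REMAINING} more tool-using turn(s) before tools are disabled."

// Fills the placeholder from the loop counters so the copy cannot drift from the logic.
function renderPrewarn(template: string, step: number, maxSteps: number): string {
  // Assumed definition: turns that may still use tools, never negative.
  const remaining = Math.max(maxSteps - step, 0)
  return template.replace(/\{TURNS_REMAINING\}/g, String(remaining))
}

const rendered = renderPrewarn(PREWARN_TEMPLATE, 9, 10)
```

Deriving the number from step and maxSteps means the prompt states one count, computed in one place, instead of a hard-coded figure that can disagree with the sentence next to it.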

Minor

m1. Schema lacks explicit default for headless (Code Quality)
packages/opencode/src/session/message-v2.ts:373 and prompt.ts:148 declare headless: z.boolean().optional() with no .default(false). Runtime is fine because of !!lastUser.headless, but an explicit default makes the intent clearer at the type level and protects against future readers who don't double-bang.

m2. !!lastUser.headless masks false vs undefined (Code Quality)
prompt.ts:659. Today it's harmless; with .default(false) (m1) it becomes a no-op. Consider lastUser.headless ?? false for readability. Defensive ?. chaining is unnecessary because line 378 already guarantees lastUser exists.

m3. Missing edge-case tests (Testing)
test/session/max-steps-prompt.test.ts covers maxSteps = 10 and Infinity only. Add:

maxSteps = 1 (final fires immediately, prewarn skipped because step === 0 never holds with step starting at 1)
maxSteps = 2 (prewarn at step 1, final at step 2)
explicit assertion that prewarn does NOT fire when step > maxSteps in headless mode
explicit assertion that headless final-step prompt does NOT contain INTERACTIVE_SUMMARY_MARKER (already covered) AND that interactive over-limit does NOT contain HEADLESS_MARKER.

m4. No integration test for the wiring (Testing)
The unit test exercises the selector but not the full path: run -p → SDK → prompt({headless: true}) → persisted on User → read back at max-turns. Adding even one end-to-end assertion would have caught C1.

m5. No test for resumed-session retention (Testing)
The PR claims resumed sessions retain headless behavior, but there's no test that loads a session, hits max-turns, and confirms the headless prompt fires. Even an integration-style test that round-trips headless through the User message JSON would close the gap.

m6. INTERACTIVE_MARKER is a substring of HEADLESS_MARKER (Testing)
test/session/max-steps-prompt.test.ts:11. "MAXIMUM STEPS REACHED" is a substring of "MAXIMUM STEPS REACHED (HEADLESS MODE)", so a copy-edit could silently flip prompt selection without the assertions catching it. Use a unique-to-interactive marker like "Recommendations for what should be done next".
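
The brittleness is easy to reproduce in isolation. In the sketch below the marker strings come from this review, while the prompt bodies and the two checker functions are stand-ins invented for illustration.

```typescript
const INTERACTIVE_MARKER = "MAXIMUM STEPS REACHED"
const HEADLESS_MARKER = "MAXIMUM STEPS REACHED (HEADLESS MODE)"
const UNIQUE_INTERACTIVE_MARKER = "Recommendations for what should be done next"

// Stand-in prompt bodies, not the real prompt files.
const headlessPrompt = `${HEADLESS_MARKER}\n<rest of headless prompt>`
const interactivePrompt = `${INTERACTIVE_MARKER}\n${UNIQUE_INTERACTIVE_MARKER}\n<rest of prompt>`

// Naive check: also matches the headless prompt, because its marker contains this one.
const looksInteractiveNaive = (prompt: string): boolean => prompt.includes(INTERACTIVE_MARKER)

// Stricter check: keys on text that only the interactive prompt contains.
const looksInteractiveStrict = (prompt: string): boolean =>
  prompt.includes(UNIQUE_INTERACTIVE_MARKER)
```

Here the naive check returns true for both prompts, so a must-not-contain assertion built on it can never distinguish them; the stricter check separates the two.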

Nit

n1. Mixed // altimate_change comment styles (start/end blocks vs single-line // altimate_change - ...). Pick one. (Code Quality)
n2. PR description references -p / --print; the run CLI uses -p for --password and the headless: true is set unconditionally on every run invocation (intentional — run IS the headless entry). Update the PR description to avoid confusion. (Documentation)
n3. Test imports namespace then uses SessionPrompt.selectMaxStepsPrompt everywhere — fine, matches house style. (Code Quality)
n4. selectMaxStepsPrompt returns string | undefined. A discriminated union ({type: "none"} | {type: "final", text} | {type: "prewarn", text}) would be more self-documenting and future-proof, but is over-engineering for current needs. (Design)
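
For reference, the discriminated union floated in n4 could look roughly like the sketch below. The MaxStepsSelection type name and the describeSelection helper are hypothetical; only the three variant tags come from the review.

```typescript
// Each variant names its own case, so callers cannot confuse "prewarn" text with "final" text.
type MaxStepsSelection =
  | { type: "none" }
  | { type: "prewarn"; text: string }
  | { type: "final"; text: string }

// Hypothetical consumer: the switch covers every variant, which the compiler can verify.
function describeSelection(selection: MaxStepsSelection): string {
  switch (selection.type) {
    case "none":
      return "no injection"
    case "prewarn":
      return `prewarn: ${selection.text}`
    case "final":
      return `final: ${selection.text}`
  }
}
```

Compared with string | undefined, the union lets a caller branch on the kind of injection (for example, disabling tools only for "final") without inspecting the prompt text.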

Positive Observations

Extracting selectMaxStepsPrompt is exactly right — pure, testable, no need to mock the loop.
The Number.isFinite(maxSteps) guard on the prewarn prevents the kind of subtle bug that gets missed; nicely caught.
New headless prompt is well-crafted: explicit anti-summary instructions, explicit "uncertain answer is more useful than meta-summary" framing, format-preservation directive.
Backward compatibility is genuinely additive: interactive mode untouched, optional flag, infinite-budget handled.
Tests use markers in both directions (must-contain and must-not-contain), which is the right defensive style.
step >= maxSteps (not ===) correctly handles overshoot — preserved from upstream.
Single-commit, well-scoped diff with clear altimate_change markers, easing future upstream rebases.

Missing Tests / Edge Cases

Edge case	Coverage	Risk
`run --command` headless propagation	None	High (C1)
Synthetic user msg after `task.command` retains headless	None	High (C2)
Compaction replay/continue retains headless	None	High
Session resume retains headless	None	Medium
End-to-end: `run -p` produces non-meta output at max-turns	None	Medium
`maxSteps = 1` (final fires at first step, no prewarn)	None	Low
`maxSteps = 2` (prewarn step 1, final step 2)	None	Low
GitHub / GitLab / ACP non-interactive callers	None (out of scope but related)	Medium
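
The two low-risk `maxSteps` rows can be pinned down with a sketch of the selector. This is a reconstruction from the behaviour described in the thread (final prompt at `step >= maxSteps`, headless prewarn one step earlier under a `Number.isFinite` guard), not the actual code in `session/prompt.ts`. The 1-indexed `step` and the placeholder prompt strings are assumptions.

```typescript
// Placeholder prompt bodies; the real text lives in the prompt/*.txt files.
const INTERACTIVE_FINAL = "interactive-final"
const HEADLESS_FINAL = "headless-final"
const HEADLESS_PREWARN = "headless-prewarn"

interface SelectInput {
  step: number // assumed 1-indexed
  maxSteps: number // Infinity when no budget is set
  headless: boolean
}

function selectMaxStepsPrompt({ step, maxSteps, headless }: SelectInput): string | undefined {
  // >= rather than === so an overshoot still gets the final prompt.
  if (step >= maxSteps) return headless ? HEADLESS_FINAL : INTERACTIVE_FINAL
  // Prewarn only in headless mode, with a finite budget, one step before the limit.
  if (headless && Number.isFinite(maxSteps) && step === maxSteps - 1) return HEADLESS_PREWARN
  return undefined
}

// maxSteps = 1: the final prompt fires at the first step, with no prewarn.
console.log(selectMaxStepsPrompt({ step: 1, maxSteps: 1, headless: true }))
// maxSteps = 2: prewarn at step 1, final at step 2.
console.log(selectMaxStepsPrompt({ step: 1, maxSteps: 2, headless: true }))
console.log(selectMaxStepsPrompt({ step: 2, maxSteps: 2, headless: true }))
```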

Attribution

Finding	Models flagging
C1 `run --command` drops `headless`	Claude, GPT, MiniMax, Qwen, MiMo (5/8) — strong consensus
C2 Synthetic user (task.command) drops `headless`	Claude, GPT (2/8) — unique but verified
C2b Compaction-path drops `headless`	GPT (1/8) — unique, plausible
C3 SDK gen vs `openapi.json` drift	Claude (1/8) — unique, verified by reading `openapi.json`
M1 Wrong layer for `headless`	Claude (1/8)
M2 GitHub/GitLab/ACP also need flag	MiMo, Kimi (2/8)
M3 "Tools disabled" lie (also affects interactive)	Claude (1/8) — unique
M4 Prewarn "2 turns" wording	Claude, Kimi, MiniMax, GLM-5, MiMo (5/8) — strong consensus
m1 No `.default(false)`	Qwen (1/8)
m2 `!!lastUser.headless` style	Claude, Kimi, Qwen (3/8)
m3 `maxSteps=1`, `maxSteps=2` test gaps	Claude, Kimi, MiniMax, GLM-5, Qwen, MiMo (6/8) — strong consensus
m4 No end-to-end wiring test	Claude, GPT (2/8)
m5 No resume-retention test	Claude, Kimi, MiMo (3/8)
m6 Substring marker brittleness	Claude (1/8)
n1 Mixed `altimate_change` style	Kimi (1/8)
n2 PR description `-p`/`--print` mismatch	Kimi (1/8)
n4 Discriminated-union return	Kimi (1/8)

Disagreements: GLM-5 marked the PR "READY TO MERGE" with only minor test gaps, missing C1/C2/C3 entirely. Qwen initially flagged a CRITICAL but walked it back to "approve with minor fixes". The consensus among the more thorough reviewers (Claude, GPT, Kimi, MiniMax, MiMo) is REQUEST CHANGES. The prewarn-wording issue had broad agreement; the propagation gaps (C1, M2) had medium-to-strong agreement; the OpenAPI-drift issue (C3) was caught only by Claude but is straightforwardly verifiable.

Reviewed by 8 participants: Claude + GPT 5.4 Codex + Gemini 3.1 Pro + Kimi K2.5 + MiniMax M2.7 + GLM-5.1 + Qwen 3.6 + MiMo V2 Pro. Gemini failed to launch (CLI rejected the prompt because it contained -p/--print strings interpreted as flags).

Multi-model consensus review of PR #763 (commit best-guess answer in headless mode at max-turns) found that the `headless` flag was silently dropped on multiple non-interactive paths and that the headless final-step prompt was lying about disabling tools. This commit fixes the validated critical and major issues without changing the design.

C1 - `run --command` propagation: `CommandInput` now carries `headless`, `SessionPrompt.command()` forwards it to `prompt()`, and `run.ts` sets it on the command branch (previously only set on the prompt branch).

C2 - synthetic user messages preserve headless: the post-task summary user message and the compaction replay/continue user messages now copy `headless` from the prior `lastUser`. Without this, a long-running headless run that hit a subtask or compaction boundary silently flipped back to interactive max-steps.

C3 - openapi.json drift: added `headless` to the `UserMessage` schema, the `session.prompt`, `session.prompt_async`, and `session.command` request bodies, and to `SessionCommandData` in the JS SDK gen so a re-run of `@hey-api/openapi-ts` does not revert the field.

M1 - layer rationale: kept `headless` on the user message (not `Session.Info`) and documented why in `message-v2.ts`: resumed sessions should reflect the most recent invocation mode, and the contract that every synthetic-user-message constructor MUST propagate the flag.

M2 - other non-interactive callers: `github.ts` and `gitlab.ts` (CI runners) now set `headless: true`. ACP is interactive (Zed/etc) and intentionally skipped.

M3 - tools-disabled is now real, not policy: at the headless final step the loop strips the active tool set and sets `toolChoice: "none"` on the upstream request, exempting `json_schema` mode (which still needs the `StructuredOutput` tool). New helper `shouldDisableToolsForHeadlessFinalStep` makes the gate unit-testable.

M4 - prewarn copy is consistent: `max-steps-headless-prewarn.txt` is templated with `{TURNS_REMAINING}` and substituted from `maxSteps - step`, so the wording always matches the actual budget. Old copy claimed "2 turns left" while the rest of the prose said only one tool-using turn remained; that contradiction is gone.

Tests added/updated:

- Edge cases: `maxSteps = 1`, `maxSteps = 2`, over-limit prewarn skip
- Markers tightened: replaced substring-vulnerable "MAXIMUM STEPS REACHED" marker with "Recommendations for what should be done next"
- Headless flag round-trips through `MessageV2` storage
- `CommandInput`/`PromptInput` zod schemas accept `headless`
- `shouldDisableToolsForHeadlessFinalStep` covers all branches
- Prewarn copy: no `{TURNS_REMAINING}` leftover, contains "1 tool-using turn left", agrees with "tools will be disabled"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
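
The `{TURNS_REMAINING}` substitution described under M4 can be sketched as follows. The template text and the `renderPrewarn` helper are hypothetical; only the token name and the `maxSteps - step` source of the value come from the commit message.

```typescript
// Hypothetical template text; the real copy lives in max-steps-headless-prewarn.txt.
const PREWARN_TEMPLATE =
  "You have {TURNS_REMAINING} tool-using turn(s) left. After that, tools will be disabled."

function renderPrewarn(template: string, step: number, maxSteps: number): string {
  const remaining = maxSteps - step
  // split/join replaces every occurrence of the token without regex escaping.
  return template.split("{TURNS_REMAINING}").join(String(remaining))
}

const rendered = renderPrewarn(PREWARN_TEMPLATE, 4, 5)
console.log(rendered)
// You have 1 tool-using turn(s) left. After that, tools will be disabled.
```

Deriving the number from the same `step` and `maxSteps` the selector sees is what keeps the copy from drifting away from the real budget.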

github-actions · 2026-04-28T15:15:56Z

👋 This PR was automatically closed by our quality checks.

Common reasons:

New GitHub account with limited contribution history
PR description doesn't meet our guidelines
Contribution appears to be AI-generated without meaningful review

If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you.

sahrizvi · 2026-04-28T15:18:14Z

Update: addressed-vs-pending from the consensus review

Pushed 78acb0fc4 on top of the original branch.

Addressed (this commit)

ID	Severity	Issue	Resolution
C1	CRITICAL	`run --command` flow drops `headless`	`CommandInput` zod schema accepts `headless`; `command()` forwards it into the internal `prompt({...})` call; `cli/cmd/run.ts` `--command` branch sets `headless: true`.
C2	CRITICAL	Synthetic user message paths drop `headless`	`summaryUserMsg` (task summary), compaction `replay`, and compaction `continue` user-message constructors all now propagate `headless` from `lastUser` / `userMessage`. JSDoc on `MessageV2.User.headless` documents the contract that every synthetic-user-message constructor must propagate the flag.
C3	CRITICAL	SDK gen drift from `openapi.json`	`packages/sdk/openapi.json` patched in the 4 schemas the PR cares about (`UserMessage`, `session.prompt`, `session.prompt_async`, `session.command` request bodies). Regenerated `types.gen.ts` / `sdk.gen.ts` match the patched schemas. Verified by re-running the generator and confirming `"headless"` appears 4× in the expected places.
M1	MAJOR	Wrong layer for the flag (Session vs. User)	Kept on `User` message rather than `Session.Info`. Rationale documented in JSDoc: resumed sessions whose last user message was sent headlessly retain the behaviour, and sessions reused by both interactive and headless callers reflect the most recent invocation mode.
M2	MAJOR	Other non-interactive callers also lack the flag	Audited and patched: `cli/cmd/github.ts` (2 CI-runner `SessionPrompt.prompt` calls), `cli/cmd/gitlab.ts` (`runReview`). ACP (`acp/agent.ts`) and TUI shell (`session/prompt.ts::shell()`) intentionally NOT changed — both serve interactive editor / human-driven flows.
M3	MAJOR	"Tools disabled" was policy-only	At the headless final step the loop now strips the active tool set (`tools: {}`, `toolChoice: "none"`) before calling `processor.process`. JSON-schema mode is exempt (StructuredOutput tool must remain). Logic extracted to exported `shouldDisableToolsForHeadlessFinalStep` for unit-testability.
M4	MAJOR	Prewarn copy is mixed	Rewrote `max-steps-headless-prewarn.txt` with `{TURNS_REMAINING}` token, substituted at runtime from `maxSteps - step`. Old "2 turns" / "1 tool-using turn" contradiction is gone; copy is internally consistent.
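
As an illustration of the M3 gate, here is a sketch of what a helper like `shouldDisableToolsForHeadlessFinalStep` could look like. The real signature is not shown in this thread, so the `GateInput` and `RequestOptions` shapes, the `format` field, and `applyGate` are all assumptions.

```typescript
// Assumed input shape; the real helper's signature is not shown in this thread.
interface GateInput {
  headless: boolean
  step: number
  maxSteps: number
  format: "text" | "json_schema" // assumed representation of the output mode
}

function shouldDisableToolsForHeadlessFinalStep(input: GateInput): boolean {
  if (!input.headless) return false
  if (input.step < input.maxSteps) return false
  // json_schema mode still needs the StructuredOutput tool.
  return input.format !== "json_schema"
}

// Hypothetical request options passed to the model call.
interface RequestOptions {
  tools: Record<string, unknown>
  toolChoice?: "auto" | "none"
}

// Strip the tool set and force toolChoice "none" only when the gate says so.
function applyGate(options: RequestOptions, input: GateInput): RequestOptions {
  return shouldDisableToolsForHeadlessFinalStep(input)
    ? { ...options, tools: {}, toolChoice: "none" }
    : options
}

const base: RequestOptions = { tools: { read: {}, bash: {} }, toolChoice: "auto" }
console.log(applyGate(base, { headless: true, step: 5, maxSteps: 5, format: "text" }))
```

Keeping the decision in a pure function, separate from the request mutation, is what makes each branch testable without running the loop.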

Tests: 7,394 pass / 0 fail full opencode suite (bun test); typecheck clean across 5 packages. Targeted: max-steps-prompt.test.ts 19/0, prompt.test.ts 9/0, full test/session/ 560/0, full test/cli/ 492/0. Coverage spans edge cases (maxSteps=1 one-shot, maxSteps=2 prewarn-then-final, overshoot, no-budget Infinity), CommandInput / PromptInput zod schemas, headless round-trip through MessageV2 storage, and prewarn-copy consistency.

Deferred (per consensus tags)

Minor m1 (.default(false)), m2 (?? false style), m4 (full e2e CLI subprocess test), m5 (resume-retention test)
Nits n1-n4 (comment style, PR description copy, discriminated-union return type)
m6 partially addressed (replaced substring-vulnerable "MAXIMUM STEPS REACHED" marker with "Recommendations for what should be done next")

Heads-up

The committed packages/sdk/openapi.json has substantial pre-existing drift from a clean regen (different info.title, single-line vs multi-line required arrays, missing /config/tui route, formatting style). This PR's manual patch only touches the 4 schemas relevant to headless, leaving the broader drift unchanged. If a maintainer later runs the full SDK regen via bun src/index.ts generate > ../sdk/openapi.json, that drift will surface — recommend addressing it in a separate cleanup PR.

dev-punia-altimate · 2026-04-30T16:43:50Z

❌ Tests — Failures Detected

TypeScript — 15 failure(s)

connection_refused [2.67ms]
timeout [2.66ms]
permission_denied [2.69ms]
parse_error [2.40ms]
oom [2.65ms]
network_error [2.42ms]
auth_failure [2.64ms]
rate_limit [2.77ms]
internal_error [2.82ms]
empty_error [0.24ms]
connection_refused [0.14ms]
timeout [0.08ms]
permission_denied [0.07ms]
parse_error [0.07ms]
oom [0.07ms]

Next Step

Please address the failing cases above and re-run verification.

cc @sahrizvi

feat(session): commit best-guess answer in headless mode at max-turns

1ea0aa8

claude Bot reviewed Apr 28, 2026

github-actions Bot added the needs-review:blocked label Apr 28, 2026

cubic-dev-ai Bot reviewed Apr 28, 2026

coderabbitai Bot reviewed Apr 28, 2026
