🤖 fix: clarify best-of-n prompt guidance #2949

Merged: ammario merged 1 commit into main from fix/best-of-n-task-guidance on Mar 14, 2026

@ammar-agent
Collaborator

@ammar-agent ammar-agent commented Mar 14, 2026

Summary

Add explicit system-prompt guidance that a user request for best-of-n work should be interpreted as a request to use the task tool's n parameter with suitable sub-agents. Also tighten the surrounding test guidance so that prompt-copy assertions do not linger in the test suite.

Background

The task tool description already explains how best-of-n spawning works, but the shared prelude did not directly tell the model how to map a plain-English "best of n" request onto that mechanism. This follow-up also removes tautological tests that only mirrored static prompt prose and adds a stronger AGENTS rule against that pattern.
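
As a rough illustration of the mapping described above, the sketch below turns a plain-English request into a task call with `n` set. It rests on several assumptions: the `TaskCall` shape, the `subagentType` field, and the `parseBestOfN` and `toTaskCall` helpers are hypothetical and do not come from the mux codebase. In the actual change the model performs this interpretation from prompt guidance, not from parsing code; the only detail taken from this PR is that the task tool has an `n` parameter.

```typescript
// Hypothetical shape of a task tool call. Only `n` is named in this PR;
// the other fields are assumptions made for illustration.
interface TaskCall {
  prompt: string;
  subagentType: string;
  n: number;
}

// Hypothetical helper: pull N out of phrases like "best of 3" or "best-of-5".
// Returns undefined when no usable count is present.
function parseBestOfN(request: string): number | undefined {
  const match = /best[\s-]of[\s-](\d+)/i.exec(request);
  const digits = match?.[1];
  if (digits === undefined) return undefined;
  const n = Number.parseInt(digits, 10);
  // A single candidate is not a best-of-n run, so require at least two.
  return n >= 2 ? n : undefined;
}

// Hypothetical helper: map a best-of-n request onto a task call with `n` set.
function toTaskCall(request: string, subagentType: string): TaskCall | undefined {
  const n = parseBestOfN(request);
  if (n === undefined) return undefined;
  return { prompt: request, subagentType, n };
}
```

For example, `toTaskCall("Give me a best of 3 on this refactor", "exec")` would yield a call with `n` equal to 3, while a request that never mentions best-of-n yields `undefined` and would be handled as an ordinary task.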

Implementation

  • add a <best-of-n> section to the shared system prompt prelude in src/node/services/systemMessage.ts
  • regenerate docs/agents/system-prompt.mdx
  • remove tautological prelude string assertions from src/node/services/systemMessage.test.ts
  • strengthen the testing guidance in docs/AGENTS.md
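
The difference between a prompt-copy assertion and a behavioral check can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/node/services/systemMessage.ts`: `PromptSection`, `buildPrelude`, and `hasWellFormedSection` are hypothetical names, and the section body is a placeholder, not the real prompt text.

```typescript
// Hypothetical model of a prelude section; not the real systemMessage.ts types.
interface PromptSection {
  tag: string;
  body: string;
}

// Hypothetical builder: wraps each non-empty section in XML-like tags.
function buildPrelude(sections: PromptSection[]): string {
  return sections
    .filter((s) => s.body.trim().length > 0)
    .map((s) => `<${s.tag}>\n${s.body.trim()}\n</${s.tag}>`)
    .join("\n\n");
}

// Tautological pattern (the kind this PR removes): copy a sentence of static
// prompt prose into the test and assert the prompt contains it. Such a check
// only fails when someone rewords the prose, so it verifies nothing.

// Behavioral check: assert on what the builder does, independent of wording.
function hasWellFormedSection(prelude: string, tag: string): boolean {
  const open = prelude.indexOf(`<${tag}>`);
  const close = prelude.indexOf(`</${tag}>`);
  return open !== -1 && close > open;
}

const prelude = buildPrelude([
  { tag: "best-of-n", body: "placeholder guidance text" },
  { tag: "empty", body: "   " },
]);
```

Here `hasWellFormedSection(prelude, "best-of-n")` holds regardless of how the guidance is worded, and the blank `empty` section is dropped, which is behavior worth testing.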

Validation

  • bun test src/node/services/systemMessage.test.ts
  • make static-check

Risks

Low: the production behavior change is still limited to prompt guidance, and the rest of the diff removes brittle tests plus adds repo guidance.


Generated with mux • Model: openai:gpt-5.4 • Thinking: xhigh • Cost: n/a

@ammar-agent
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Add a dedicated best-of-n section to the shared system prompt prelude so plain-English best-of-n requests map to the task tool's n parameter with suitable sub-agents.

Also remove tautological prompt-copy assertions from systemMessage.test.ts and strengthen AGENTS guidance so tests focus on behavior instead of mirroring static strings.

---

_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `3.12`_

@ammar-agent ammar-agent force-pushed the fix/best-of-n-task-guidance branch from f9fd100 to 213cdcb March 14, 2026 16:43
@ammar-agent ammar-agent changed the title from "🤖 fix: clarify best-of-n system prompt guidance" to "🤖 fix: clarify best-of-n prompt guidance" Mar 14, 2026
@ammar-agent
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Another round soon, please!

@ammario ammario merged commit 44d8d9e into main Mar 14, 2026
24 checks passed
@ammario ammario deleted the fix/best-of-n-task-guidance branch March 14, 2026 19:48