🤖 feat: restore Orchestrator as hidden /orchestrate skill#3295
Conversation
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 145b3882e9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Replied + resolved the autocomplete thread — keeping |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 145b3882e9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Replied + resolved the ACP thread — same reasoning as before: the ACP filter for unadvertised skills is system-wide existing behavior, not specific to |
|
Codex Review: Didn't find any major issues. More of your lovely PRs please. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review Pulled the placeholder tip carousel forward from New files:
Wired in |
Reintroduces the delegate-first orchestration workflow that was deleted in #3224 ("rip out the Orchestrator agent feature") as an unadvertised built-in skill, invocable via /orchestrate. The original agent was removed because the surrounding scaffolding (LLM auto-router, settings UI, picker entry, plan-handoff button) had a high maintenance cost. The underlying primitives it orchestrated — task / task_await / task_apply_git_patch — were left in place, so a markdown skill is sufficient to recover the workflow with zero new scaffolding. advertise: false keeps the skill out of the system-prompt skill index (uncluttered default UX), while users who miss it can still type /orchestrate to load the playbook on-demand. Mirrors the existing hidden-skill pattern documented for /deep-review.
Pull the placeholder tip carousel from side-questions-2hr3 ahead of that branch landing, and add a tip surfacing /orchestrate alongside the existing slash-command tricks. The skill is unadvertised in the system-prompt skill index (advertise: false), so the carousel is one of the few passive discovery surfaces users will see it on. Wall-clock 20-minute bucket means switching workspaces mid-bucket never reshuffles the visible tip. Mobile keeps the plain 'Type a message...' placeholder because the on-screen keyboard squeezes the input. Generated with mux • Model: anthropic:claude-opus-4-7 • Thinking: high
69a5c1f to
458921f
Compare
|
Codex Review: Didn't find any major issues. Delightful! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review Pinned the placeholder tip carousel under Storybook so future tip-list edits don't cascade into Chromatic baseline diffs.
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d226c02150
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
d226c02 to
fd3c87f
Compare
|
@codex review Force-pushed a revised fix for the Storybook tip pinning. The first version used
|
|
Codex Review: Didn't find any major issues. More of your lovely PRs please. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
fd3c87f to
259fe4f
Compare
|
@codex review Force-pushed addressing the Codex pointed out that Also strengthened the test from "PLACEHOLDER_TIPS contains /orchestrate" to "PLACEHOLDER_TIPS[0] is the /orchestrate tip" so demoting it would fail two named surfaces at once (degenerate-input fallback + Storybook pin).
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 259fe4fbce
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Under Chromatic snapshots the carousel resolved a tip via `floor(NOW / 20min) mod PLACEHOLDER_TIPS.length`, so any reorder or insertion into PLACEHOLDER_TIPS shifted the displayed tip and cascaded into a fresh baseline diff on every story rendering ChatInput. Pin the default-arg path to PLACEHOLDER_TIPS[0] when running under Storybook. Detection uses a plain globalThis flag (`__MUX_STORYBOOK__`) set by `.storybook/preview.tsx` rather than `import.meta.env`, because this module is transitively imported by Jest-based UI tests (`tests/ui/**`) that run in CommonJS mode and choke on `import.meta`. Explicit `nowMs` arguments still rotate even when the flag is set, so unit tests for the rotation math stay meaningful in every environment. Also prune the carousel to only ungated, registered commands: - Drop `/btw` — not in the slash registry (ships in side-questions-2hr3). - Drop `/goal` — registered but `experimentGate: GOALS` (off by default), so default-config users hit 'Goal commands require the Goals experiment to be enabled.' - Promote `/orchestrate` to index 0 so the storybook pin + degenerate- input fallback both promote this PR's actual headline feature. Document the maintenance rule (every tip must map to a real, ungated command) on the module so future additions don't reintroduce the same dead-end class of bug.
259fe4f to
88165a6
Compare
|
@codex review Force-pushed addressing the
|
|
Codex Review: Didn't find any major issues. Hooray! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
The orchestrate playbook (restored in #3295) explicitly told the calling agent to delegate complex work to `plan` sub-agents. With plan no longer runnable as a sub-agent, those `task` calls would fail at Task.create. Update orchestrate to delegate all implementation work to `exec`, investing in richer briefs (via parallel `explore` tasks for context) when subtasks are non-trivial. The skill is purely instructional, so this is a prompt-only change.
## Summary
Stop letting the Plan agent be spawned as a sub-agent, and update the
`orchestrate` playbook to match. The runtime already restricts plan-mode
edits to the plan file via `validatePlanModeAccess`, but plan still had
`subagent.runnable: true` with a wide-open `add: ['.*']` tool list and
no read-only prose injunction — so a plan sub-agent would happily emit
`file_edit_*` calls every turn. The runtime would reject each call, but
the model kept trying, burning tokens and eroding the "plan never
touches code" guarantee. There is no legitimate use case for a plan
sub-agent: nothing downstream consumes its report.
## Background
- `src/node/builtinAgents/plan.md` defines tools with `add: ['.*']` and
only narrows the list via `remove` for
image/git-patch/config/skills/analytics tools. `file_edit_*` and `bash`
are left enabled by design — the comment says they "ARE available but
restricted to plan file only at runtime."
- That runtime restriction lives in
`src/node/services/tools/fileCommon.ts::validatePlanModeAccess` (gated
by `config.planFileOnly && config.planFilePath`) and is wired in via
`src/node/services/aiService.ts` (`planFileOnly: agentIsPlanLike`).
- `src/node/services/tools/task.ts:302` already prevents a plan
**parent** from spawning anything other than `explore`. The missing half
was preventing an exec parent from spawning a plan **child**.
- `TaskService.create` (`src/node/services/taskService.ts:1019`) already
has the matching guard:
```ts
if (frontmatter.subagent?.runnable !== true) {
return Err(
`Task.create: agentId is not runnable as a sub-agent (${agentId}).
${hint}`,
);
}
```
Flipping the flag is enough; no new enforcement code is needed.
- The `orchestrate` skill (restored in #3295) previously delegated
complex work to `plan` sub-agents and relied on the post-`propose_plan`
auto-handoff. With plan non-runnable, those `task` calls would now fail
at `Task.create`, so orchestrate is updated alongside.
## Implementation
`src/node/builtinAgents/plan.md`: `subagent.runnable: true` → `false`,
with a rationale comment explaining why future contributors should not
re-enable it.
`src/node/builtinSkills/orchestrate.md`: removed every recommendation to
delegate to `plan`. Orchestrate now sends all implementation work to
`exec`, and gets the "higher complexity" benefit from parallel `explore`
tasks shaping a richer `exec` brief (goal + constraints + acceptance)
rather than from a plan handoff. Added an explicit note that "plan is
intentionally not runnable as a sub-agent."
`src/node/services/agentDefinitions/builtInAgentContent.generated.ts`,
`src/node/services/agentSkills/builtInSkillContent.generated.ts`,
`docs/agents/index.mdx`: regenerated snapshots picked up by
`scripts/generate-builtin-agents.sh`,
`scripts/generate-builtin-skills.sh`, and `make fmt`.
No agent inherits from `plan` (only `desktop` and `explore` inherit from
`exec`), so no other built-in shape changes.
## Validation
- `bun test
src/node/services/agentDefinitions/builtInAgentDefinitions.test.ts` ✓
- `bun test src/node/services/taskService.test.ts` ✓
- `bun test src/node/services/tools/file_edit_operation.test.ts
src/node/services/tools/task.test.ts` ✓
- `bun test src/node/services/agentResolution.test.ts
src/node/services/agentDefinitions/
src/node/services/streamContextBuilder.test.ts
src/node/services/aiService.test.ts` ✓
- `make static-check` ✓ (lint + typecheck + fmt-check + docs sync)
## Risks
Low. The change is one frontmatter flag flip on a built-in agent plus
prompt-only edits to the orchestrate skill. The runtime path that
handles "user asked for an unrunnable sub-agent" is exercised by
`TaskService.create` for every non-runnable agent already (e.g.,
`compact`, `name_workspace`). Top-level plan workspaces — the only
legitimate use case for plan — are unaffected because `runnable` only
gates sub-agent spawning.
The legacy `handleSuccessfulProposePlanAutoHandoff` flow in
`taskService.ts` remains in place. It's now effectively defensive code
for any historical plan sub-agent workspaces that existed before this
change; it cannot be re-entered from `Task.create`.
The only behavior change a caller could notice is `task({ agentId:
"plan", ... })` returning `Task.create: agentId is not runnable as a
sub-agent (plan). …` instead of silently working-and-then-failing. That
is the intended outcome.
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking:
`max` • Cost: `$6.63`_
<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=max
costs=6.63 -->
Summary
Restores the delegate-first orchestration workflow that was deleted in #3224 as an unadvertised built-in agent skill. Users who miss the original Orchestrator can type
/orchestrate <request>to load the playbook into the conversation; the skill stays out of the default system-prompt skill index so it doesn't clutter the UX for users who don't miss it.Also pulls the ChatInput placeholder tip carousel from
side-questions-2hr3ahead of that branch landing, with a tip surfacing/orchestrateso users have a passive discovery path for the new hidden skill.Background
#3224 ripped out the Orchestrator agent wholesale because the scaffolding around it (LLM auto-router, settings UI, agent picker entry, post-
propose_planbutton, two helper methods ontaskService) carried real maintenance cost. The original PR explicitly chose not to migrate users.But the primitives Orchestrator orchestrated all remained:
task/task_await/task_list/task_terminate(toolDefinitions.ts:2181, 2289)task_apply_git_patch(toolDefinitions.ts:1449)git amintegrationSo the workflow itself never broke — only the prompt and the chrome that pointed agents toward it. A markdown skill is enough to put the workflow back without resurrecting the scaffolding.
Implementation
Orchestrate skill
src/node/builtinSkills/orchestrate.md— adapted from the deletedorchestrator.md(recovered frome623bb8af^). Frontmatter setsadvertise: false. Body keeps the original sections: delegate-first hard rules, delegation guide (explore/exec/plan/desktop), task brief template, dependency analysis, parallel + sequential patch integration loops, conflict reconciliation playbook (including thegit am --abort-only-when-in-progress nuance), and the "never read session storage" guardrail. Adds a prerequisites note about Max Task Nesting Depth.builtInSkillContent.generated.ts— regenerated viabun scripts/gen_builtin_skills.ts.agentSkillsService.test.ts— two existing assertions enumerate the built-in skill list; both now include"orchestrate".builtInOrchestrateSkill.test.ts— new focused test confirms the skill is registered, hasadvertise: false, and its body still carries the load-bearing directives (regex spot-checks ondelegate-first,task_apply_git_patch,dry-run,Max Task Nesting Depth) — wording can drift, but a wholesale gutting of the playbook would fail.Tip carousel (pulled forward from
side-questions-2hr3)src/browser/features/ChatInput/placeholderTips.ts— rotating placeholder tip list keyed off a wall-clock 20-minute bucket, so switching workspaces mid-bucket never reshuffles the visible tip./orchestrateis the lead/index-0 tip alongside/haiku,/+high,/compact,/fork,/plan,/clear --soft,/new,/vim,/truncate— every entry maps to a registered, ungated slash command or built-in skill, so following a tip never lands the user on an unknown-command or experiment-required dead end./btwis omitted (ships inside-questions-2hr3) and/goalis omitted (gated behindEXPERIMENT_IDS.GOALS, off by default). Under Storybook (detected via a__MUX_STORYBOOK__global set by.storybook/preview.tsx) the default-arg path pins toPLACEHOLDER_TIPS[0]so tip-list edits don't cascade into Chromatic baseline diffs across every story rendering ChatInput; pinning to/orchestratedoubles as a passive discovery surface for this PR's headline feature in every snapshot. ExplicitnowMsarguments still rotate, so unit tests stay meaningful. The runtime-flag indirection (instead ofimport.meta.env) keeps the module importable from Jest-based UI tests intests/ui/**, which run in CommonJS mode and choke onimport.meta.src/browser/features/ChatInput/placeholderTips.test.ts— covers single-bucket stability, bucket-boundary rotation, modulo wrap, degenerate-input fallback, Storybook-pin behavior under the runtime flag, and/orchestratepresence.src/browser/features/ChatInput/index.tsx— workspace variant usesgetPlaceholderTip()instead of"Type a message..."on non-mobile; mobile keeps the plain placeholder because the on-screen keyboard already squeezes the input.What this PR intentionally does not restore
tools.remove: [file_edit_.*]so it physically could not edit files. A skill can't strip tools from its caller. The skill prose enforces the rule strongly ("Do not implement features/bugfixes directly in this workspace") but it's a soft constraint.propose_plancards and theplanExecutorRouterLLM auto-router. Recreating those is a separate, scaffolding-heavy feature; users who want the post-plan handoff can type/orchestrate Implement the plan.themselves.Workflowicon. Users stay in their current agent (Exec); the chrome was incidental.This is the same hidden-skill pattern already documented for
/deep-reviewindocs/agents/agent-skills.mdx.Validation
bun test src/node/services/agentSkills/— 36/36 pass.bun test src/node/services/tools/agent_skill_{read,read_file,list}.test.ts src/node/services/tools/skills_catalog_read.test.ts— 89/89 pass alongside the skill suite.bun test src/browser/features/ChatInput/placeholderTips.test.ts— 5/5 pass.make static-check— green.Risks
Low. No schema changes, no IPC changes, no migrations. Behavioral surface:
agentSkills.listalready returns unadvertised skills, so slash autocomplete surfaces/orchestrateonce the user starts typing it, matching the documented behavior of unadvertised skills."Type a message..."on desktop. The text rotates every 20 minutes against a wall-clock bucket; the value is deterministic per bucket so tests / screenshots inside a single bucket are stable.The legacy config keys (
planSubagentExecutorRouting,planSubagentDefaultsToOrchestrator) deliberately removed in #3224 are not reintroduced.Generated with
mux• Model:anthropic:claude-opus-4-7• Thinking:high• Cost:$11.74