Skip to content

🤖 feat: restore Orchestrator as hidden /orchestrate skill#3295

Merged
ammario merged 3 commits into
mainfrom
orchestrate-skill
May 17, 2026
Merged

🤖 feat: restore Orchestrator as hidden /orchestrate skill#3295
ammario merged 3 commits into
mainfrom
orchestrate-skill

Conversation

@ammar-agent
Copy link
Copy Markdown
Collaborator

@ammar-agent ammar-agent commented May 16, 2026

Summary

Restores the delegate-first orchestration workflow that was deleted in #3224 as an unadvertised built-in agent skill. Users who miss the original Orchestrator can type /orchestrate <request> to load the playbook into the conversation; the skill stays out of the default system-prompt skill index so it doesn't clutter the UX for users who don't miss it.

Also pulls the ChatInput placeholder tip carousel from side-questions-2hr3 ahead of that branch landing, with a tip surfacing /orchestrate so users have a passive discovery path for the new hidden skill.

Background

#3224 ripped out the Orchestrator agent wholesale because the scaffolding around it (LLM auto-router, settings UI, agent picker entry, post-propose_plan button, two helper methods on taskService) carried real maintenance cost. The original PR explicitly chose not to migrate users.

But the primitives Orchestrator orchestrated all remained:

  • task / task_await / task_list / task_terminate (toolDefinitions.ts:2181, 2289)
  • task_apply_git_patch (toolDefinitions.ts:1449)
  • sub-agent patch artifact flow + git am integration
  • Max Task Nesting Depth setting

So the workflow itself never broke — only the prompt and the chrome that pointed agents toward it. A markdown skill is enough to put the workflow back without resurrecting the scaffolding.

Implementation

Orchestrate skill

  • src/node/builtinSkills/orchestrate.md — adapted from the deleted orchestrator.md (recovered from e623bb8af^). Frontmatter sets advertise: false. Body keeps the original sections: delegate-first hard rules, delegation guide (explore / exec / plan / desktop), task brief template, dependency analysis, parallel + sequential patch integration loops, conflict reconciliation playbook (including the git am --abort-only-when-in-progress nuance), and the "never read session storage" guardrail. Adds a prerequisites note about Max Task Nesting Depth.
  • builtInSkillContent.generated.ts — regenerated via bun scripts/gen_builtin_skills.ts.
  • agentSkillsService.test.ts — two existing assertions enumerate the built-in skill list; both now include "orchestrate".
  • builtInOrchestrateSkill.test.ts — new focused test confirms the skill is registered, has advertise: false, and its body still carries the load-bearing directives (regex spot-checks on delegate-first, task_apply_git_patch, dry-run, Max Task Nesting Depth) — wording can drift, but a wholesale gutting of the playbook would fail.

Tip carousel (pulled forward from side-questions-2hr3)

  • src/browser/features/ChatInput/placeholderTips.ts — rotating placeholder tip list keyed off a wall-clock 20-minute bucket, so switching workspaces mid-bucket never reshuffles the visible tip. /orchestrate is the lead/index-0 tip alongside /haiku, /+high, /compact, /fork, /plan, /clear --soft, /new, /vim, /truncate — every entry maps to a registered, ungated slash command or built-in skill, so following a tip never lands the user on an unknown-command or experiment-required dead end. /btw is omitted (ships in side-questions-2hr3) and /goal is omitted (gated behind EXPERIMENT_IDS.GOALS, off by default). Under Storybook (detected via a __MUX_STORYBOOK__ global set by .storybook/preview.tsx) the default-arg path pins to PLACEHOLDER_TIPS[0] so tip-list edits don't cascade into Chromatic baseline diffs across every story rendering ChatInput; pinning to /orchestrate doubles as a passive discovery surface for this PR's headline feature in every snapshot. Explicit nowMs arguments still rotate, so unit tests stay meaningful. The runtime-flag indirection (instead of import.meta.env) keeps the module importable from Jest-based UI tests in tests/ui/**, which run in CommonJS mode and choke on import.meta.
  • src/browser/features/ChatInput/placeholderTips.test.ts — covers single-bucket stability, bucket-boundary rotation, modulo wrap, degenerate-input fallback, Storybook-pin behavior under the runtime flag, and /orchestrate presence.
  • src/browser/features/ChatInput/index.tsx — workspace variant uses getPlaceholderTip() instead of "Type a message..." on non-mobile; mobile keeps the plain placeholder because the on-screen keyboard already squeezes the input.

What this PR intentionally does not restore

  • Tool restriction. The old agent did tools.remove: [file_edit_.*] so it physically could not edit files. A skill can't strip tools from its caller. The skill prose enforces the rule strongly ("Do not implement features/bugfixes directly in this workspace") but it's a soft constraint.
  • "Start Orchestrator" button on propose_plan cards and the planExecutorRouter LLM auto-router. Recreating those is a separate, scaffolding-heavy feature; users who want the post-plan handoff can type /orchestrate Implement the plan. themselves.
  • Distinct agent picker entry / Workflow icon. Users stay in their current agent (Exec); the chrome was incidental.

This is the same hidden-skill pattern already documented for /deep-review in docs/agents/agent-skills.mdx.

Validation

  • bun test src/node/services/agentSkills/ — 36/36 pass.
  • bun test src/node/services/tools/agent_skill_{read,read_file,list}.test.ts src/node/services/tools/skills_catalog_read.test.ts — 89/89 pass alongside the skill suite.
  • bun test src/browser/features/ChatInput/placeholderTips.test.ts — 5/5 pass.
  • make static-check — green.

Risks

Low. No schema changes, no IPC changes, no migrations. Behavioral surface:

  1. One additional unadvertised built-in skill is discoverable. agentSkills.list already returns unadvertised skills, so slash autocomplete surfaces /orchestrate once the user starts typing it, matching the documented behavior of unadvertised skills.
  2. The workspace ChatInput placeholder is no longer the static string "Type a message..." on desktop. The text rotates every 20 minutes against a wall-clock bucket; the value is deterministic per bucket so tests / screenshots inside a single bucket are stable.

The legacy config keys (planSubagentExecutorRouting, planSubagentDefaultsToOrchestrator) deliberately removed in #3224 are not reintroduced.


Generated with mux • Model: anthropic:claude-opus-4-7 • Thinking: high • Cost: $11.74

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 145b3882e9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/builtinSkills/orchestrate.md
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Replied + resolved the autocomplete thread — keeping /orchestrate discoverable in autocomplete is intentional (advertise=false is about the system-prompt skill index, not user-facing surfaces). No code changes since last review.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 145b3882e9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/builtinSkills/orchestrate.md
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Replied + resolved the ACP thread — same reasoning as before: the ACP filter for unadvertised skills is system-wide existing behavior, not specific to /orchestrate. Changing it would be an orthogonal scope. No code changes since last review.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. More of your lovely PRs please.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Pulled the placeholder tip carousel forward from side-questions-2hr3 (likely to land after this PR) and added a tip surfacing /orchestrate so the unadvertised skill has a passive discovery path.

New files:

  • src/browser/features/ChatInput/placeholderTips.ts
  • src/browser/features/ChatInput/placeholderTips.test.ts

Wired in src/browser/features/ChatInput/index.tsx (non-mobile only — mobile keeps the static "Type a message..." placeholder).

Reintroduces the delegate-first orchestration workflow that was deleted
in #3224 ("rip out the Orchestrator agent feature") as an unadvertised
built-in skill, invocable via /orchestrate.

The original agent was removed because the surrounding scaffolding (LLM
auto-router, settings UI, picker entry, plan-handoff button) had a high
maintenance cost. The underlying primitives it orchestrated — task /
task_await / task_apply_git_patch — were left in place, so a markdown
skill is sufficient to recover the workflow with zero new scaffolding.

advertise: false keeps the skill out of the system-prompt skill index
(uncluttered default UX), while users who miss it can still type
/orchestrate to load the playbook on-demand. Mirrors the existing
hidden-skill pattern documented for /deep-review.
Pull the placeholder tip carousel from side-questions-2hr3 ahead of that
branch landing, and add a tip surfacing /orchestrate alongside the
existing slash-command tricks. The skill is unadvertised in the
system-prompt skill index (advertise: false), so the carousel is one of
the few passive discovery surfaces users will see it on.

Wall-clock 20-minute bucket means switching workspaces mid-bucket never
reshuffles the visible tip. Mobile keeps the plain 'Type a message...'
placeholder because the on-screen keyboard squeezes the input.

Generated with mux • Model: anthropic:claude-opus-4-7 • Thinking: high
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Delightful!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Pinned the placeholder tip carousel under Storybook so future tip-list edits don't cascade into Chromatic baseline diffs.

  • src/browser/features/ChatInput/placeholderTips.tsgetPlaceholderTip() now detects Storybook-Vite via import.meta.env.STORYBOOK === "true" and pins to PLACEHOLDER_TIPS[0] on default-arg calls. Explicit nowMs still rotates, so unit tests are unaffected.
  • Verified by building storybook-static and inspecting the bundled function: import.meta.env is statically replaced with {..., STORYBOOK: "true"}, so isStorybookBuild() returns true and the function returns the lead /btw tip.
  • Production Vite builds don't get the STORYBOOK env replacement, so rotation continues normally for end users.

make static-check is green; bun test src/browser/features/ChatInput/placeholderTips.test.ts passes 5/5.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d226c02150

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/ChatInput/placeholderTips.ts
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Force-pushed a revised fix for the Storybook tip pinning.

The first version used import.meta.env.STORYBOOK === "true", which broke the Test / Integration job because tests/ui/** runs in Jest's CommonJS mode and can't parse import.meta. Switched to a plain globalThis.__MUX_STORYBOOK__ flag set at the top of .storybook/preview.tsx, which works in every environment (Storybook sets it before any story renders; Jest / Bun tests never touch it; production builds never see it).

  • Verified the Jest test that previously failed (tests/ui/layout/analyticsEscape.test.ts) now passes locally.
  • Verified the storybook build still bundles __MUX_STORYBOOK__ = true in the iframe entry and isStorybookRuntime() in the meta chunk.
  • Updated the unit test to actually flip the flag and assert both the pinned and rotating behaviors — no longer relying on the (untestable) Vite-only env check.

make static-check is green; bun test src/browser/features/ChatInput/placeholderTips.test.ts passes 6/6.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. More of your lovely PRs please.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Force-pushed addressing the /btw advertisement concern.

Codex pointed out that /btw isn't in the slash command registry (it ships with side-questions-2hr3), so users following the lead tip would have hit the unknown-command path. Dropped /btw from PLACEHOLDER_TIPS and promoted /orchestrate to index 0 — every remaining entry is a registered slash command or built-in skill, so the carousel never advertises an unimplemented command. Pinning to /orchestrate also turns the Storybook fallback into passive discovery for this PR's headline feature.

Also strengthened the test from "PLACEHOLDER_TIPS contains /orchestrate" to "PLACEHOLDER_TIPS[0] is the /orchestrate tip" so demoting it would fail two named surfaces at once (degenerate-input fallback + Storybook pin).

make static-check is green; bun test src/browser/features/ChatInput/placeholderTips.test.ts passes 6/6.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 259fe4fbce

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/ChatInput/placeholderTips.ts Outdated
Under Chromatic snapshots the carousel resolved a tip via
`floor(NOW / 20min) mod PLACEHOLDER_TIPS.length`, so any reorder or
insertion into PLACEHOLDER_TIPS shifted the displayed tip and cascaded
into a fresh baseline diff on every story rendering ChatInput.

Pin the default-arg path to PLACEHOLDER_TIPS[0] when running under
Storybook. Detection uses a plain globalThis flag (`__MUX_STORYBOOK__`)
set by `.storybook/preview.tsx` rather than `import.meta.env`, because
this module is transitively imported by Jest-based UI tests
(`tests/ui/**`) that run in CommonJS mode and choke on `import.meta`.

Explicit `nowMs` arguments still rotate even when the flag is set, so
unit tests for the rotation math stay meaningful in every environment.

Also prune the carousel to only ungated, registered commands:
- Drop `/btw` — not in the slash registry (ships in side-questions-2hr3).
- Drop `/goal` — registered but `experimentGate: GOALS` (off by default),
  so default-config users hit 'Goal commands require the Goals
  experiment to be enabled.'
- Promote `/orchestrate` to index 0 so the storybook pin + degenerate-
  input fallback both promote this PR's actual headline feature.

Document the maintenance rule (every tip must map to a real, ungated
command) on the module so future additions don't reintroduce the same
dead-end class of bug.
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Force-pushed addressing the /goal advertisement concern (same class of bug as /btw).

  • Dropped /goal from PLACEHOLDER_TIPS because it's gated by EXPERIMENT_IDS.GOALS (enabledByDefault: false), so default-config users would have hit "Goal commands require the Goals experiment to be enabled."
  • Re-audited the remaining carousel against src/browser/utils/slashCommands/registry.ts. The only other experimentGated command is /heartbeat, which was never in the carousel. Everything that remains (/orchestrate, /haiku, /+high, /compact, /fork, /plan, /clear, /new, /vim, /truncate) is registered and ungated.
  • Documented the invariant on the module doc-comment: every tip must map to a real, ungated command. Future additions are pointed at the experimentGate grep so this class of bug doesn't reappear.

make static-check is green; bun test src/browser/features/ChatInput/placeholderTips.test.ts passes 6/6.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammario ammario merged commit 316a835 into main May 17, 2026
24 checks passed
@ammario ammario deleted the orchestrate-skill branch May 17, 2026 20:37
ammar-agent added a commit that referenced this pull request May 17, 2026
The orchestrate playbook (restored in #3295) explicitly told the calling
agent to delegate complex work to `plan` sub-agents. With plan no longer
runnable as a sub-agent, those `task` calls would fail at Task.create.

Update orchestrate to delegate all implementation work to `exec`,
investing in richer briefs (via parallel `explore` tasks for context)
when subtasks are non-trivial. The skill is purely instructional, so
this is a prompt-only change.
ammario pushed a commit that referenced this pull request May 17, 2026
## Summary

Stop letting the Plan agent be spawned as a sub-agent, and update the
`orchestrate` playbook to match. The runtime already restricts plan-mode
edits to the plan file via `validatePlanModeAccess`, but plan still had
`subagent.runnable: true` with a wide-open `add: ['.*']` tool list and
no read-only prose injunction — so a plan sub-agent would happily emit
`file_edit_*` calls every turn. The runtime would reject each call, but
the model kept trying, burning tokens and eroding the "plan never
touches code" guarantee. There is no legitimate use case for a plan
sub-agent: nothing downstream consumes its report.

## Background

- `src/node/builtinAgents/plan.md` defines tools with `add: ['.*']` and
only narrows the list via `remove` for
image/git-patch/config/skills/analytics tools. `file_edit_*` and `bash`
are left enabled by design — the comment says they "ARE available but
restricted to plan file only at runtime."
- That runtime restriction lives in
`src/node/services/tools/fileCommon.ts::validatePlanModeAccess` (gated
by `config.planFileOnly && config.planFilePath`) and is wired in via
`src/node/services/aiService.ts` (`planFileOnly: agentIsPlanLike`).
- `src/node/services/tools/task.ts:302` already prevents a plan
**parent** from spawning anything other than `explore`. The missing half
was preventing an exec parent from spawning a plan **child**.
- `TaskService.create` (`src/node/services/taskService.ts:1019`) already
has the matching guard:
  ```ts
  if (frontmatter.subagent?.runnable !== true) {
    return Err(
`Task.create: agentId is not runnable as a sub-agent (${agentId}).
${hint}`,
    );
  }
  ```
  Flipping the flag is enough; no new enforcement code is needed.
- The `orchestrate` skill (restored in #3295) previously delegated
complex work to `plan` sub-agents and relied on the post-`propose_plan`
auto-handoff. With plan non-runnable, those `task` calls would now fail
at `Task.create`, so orchestrate is updated alongside.

## Implementation

`src/node/builtinAgents/plan.md`: `subagent.runnable: true` → `false`,
with a rationale comment explaining why future contributors should not
re-enable it.

`src/node/builtinSkills/orchestrate.md`: removed every recommendation to
delegate to `plan`. Orchestrate now sends all implementation work to
`exec`, and gets the "higher complexity" benefit from parallel `explore`
tasks shaping a richer `exec` brief (goal + constraints + acceptance)
rather than from a plan handoff. Added an explicit note that "plan is
intentionally not runnable as a sub-agent."

`src/node/services/agentDefinitions/builtInAgentContent.generated.ts`,
`src/node/services/agentSkills/builtInSkillContent.generated.ts`,
`docs/agents/index.mdx`: regenerated snapshots picked up by
`scripts/generate-builtin-agents.sh`,
`scripts/generate-builtin-skills.sh`, and `make fmt`.

No agent inherits from `plan` (only `desktop` and `explore` inherit from
`exec`), so no other built-in shape changes.

## Validation

- `bun test
src/node/services/agentDefinitions/builtInAgentDefinitions.test.ts` ✓
- `bun test src/node/services/taskService.test.ts` ✓
- `bun test src/node/services/tools/file_edit_operation.test.ts
src/node/services/tools/task.test.ts` ✓
- `bun test src/node/services/agentResolution.test.ts
src/node/services/agentDefinitions/
src/node/services/streamContextBuilder.test.ts
src/node/services/aiService.test.ts` ✓
- `make static-check` ✓ (lint + typecheck + fmt-check + docs sync)

## Risks

Low. The change is one frontmatter flag flip on a built-in agent plus
prompt-only edits to the orchestrate skill. The runtime path that
handles "user asked for an unrunnable sub-agent" is exercised by
`TaskService.create` for every non-runnable agent already (e.g.,
`compact`, `name_workspace`). Top-level plan workspaces — the only
legitimate use case for plan — are unaffected because `runnable` only
gates sub-agent spawning.

The legacy `handleSuccessfulProposePlanAutoHandoff` flow in
`taskService.ts` remains in place. It's now effectively defensive code
for any historical plan sub-agent workspaces that existed before this
change; it cannot be re-entered from `Task.create`.

The only behavior change a caller could notice is `task({ agentId:
"plan", ... })` returning `Task.create: agentId is not runnable as a
sub-agent (plan). …` instead of silently working-and-then-failing. That
is the intended outcome.

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking:
`max` • Cost: `$6.63`_

<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=max
costs=6.63 -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants