feat: load output styles from disk #256
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3614b2dcbf
    if let Some(new_style) = agent_code_lib::state::ResponseStyle::from_name(raw) {
        engine.state_mut().response_style = new_style;
        // Picking a built-in clears any active disk override.
        engine.state_mut().disk_output_style = None;
        println!("Response style set to '{}'.", new_style.name());
Prefer disk registry entry when style IDs collide
/output-style resolves built-ins before consulting the disk registry, so IDs like concise/default always select ResponseStyle and clear disk_output_style. That makes project/user overrides with the same ID impossible to activate, contradicting the loader behavior where disk entries override built-ins; users can see an override in the list but cannot actually use it.
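One way to address this is to consult the disk registry before the built-in names. The sketch below is illustrative only: `resolve_style`, `Resolved`, and the trimmed `ResponseStyle` and `DiskStyle` types are hypothetical stand-ins, not the real types in agent_code_lib.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the built-in style enum (variants are assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ResponseStyle {
    Default,
    Concise,
}

impl ResponseStyle {
    fn from_name(raw: &str) -> Option<Self> {
        match raw {
            "default" => Some(Self::Default),
            "concise" => Some(Self::Concise),
            _ => None,
        }
    }
}

/// Hypothetical disk-loaded style: an id plus the prompt body.
#[derive(Debug, Clone, PartialEq)]
struct DiskStyle {
    name: String,
    body: String,
}

/// Outcome of resolving a user-supplied style id.
#[derive(Debug, PartialEq)]
enum Resolved {
    Disk(DiskStyle),
    BuiltIn(ResponseStyle),
}

/// Check the disk registry first so an override that reuses a built-in id
/// can activate; fall back to built-in names only when no disk entry matches.
fn resolve_style(raw: &str, disk: &HashMap<String, DiskStyle>) -> Option<Resolved> {
    if let Some(style) = disk.get(raw) {
        return Some(Resolved::Disk(style.clone()));
    }
    ResponseStyle::from_name(raw).map(Resolved::BuiltIn)
}
```

With a disk entry named `concise` in the map, `resolve_style("concise", &disk)` yields the disk variant, while `default` still resolves to the built-in.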
    pub fn active_output_style_fragment(&self) -> &str {
        if let Some(style) = &self.disk_output_style {
            return style.body.as_str();
        }
        self.response_style.prompt_fragment()
Honor applies_to when applying disk output styles
The active disk style fragment is always returned without checking applies_to, so a style intended only for a subset of agent kinds (for example subagent) still affects main-agent system prompts. Since the frontmatter parser and docs expose applies_to, this causes incorrect prompt scoping whenever a style excludes main.
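A possible shape for the scoped lookup, as a rough sketch: the `AgentKind` variants, the `State` fields, and the "empty list means all kinds" convention are assumptions made for illustration, not the actual definitions in the crate.

```rust
/// Agent roles a style can target (assumed set; the thread mentions main and subagent).
#[derive(Debug, Clone, Copy, PartialEq)]
enum AgentKind {
    Main,
    Subagent,
}

/// Hypothetical disk style carrying its frontmatter scope.
struct DiskStyle {
    body: String,
    /// Empty means "applies to every kind" in this sketch.
    applies_to: Vec<AgentKind>,
}

impl DiskStyle {
    fn applies(&self, kind: AgentKind) -> bool {
        self.applies_to.is_empty() || self.applies_to.contains(&kind)
    }
}

/// Trimmed stand-in for the engine state.
struct State {
    disk_output_style: Option<DiskStyle>,
    built_in_fragment: String,
}

impl State {
    /// Returns None when a disk style is active but does not target `kind`,
    /// so a subagent-only style never reaches the main agent's prompt.
    fn active_output_style_fragment(&self, kind: AgentKind) -> Option<&str> {
        match &self.disk_output_style {
            Some(style) if style.applies(kind) => Some(style.body.as_str()),
            Some(_) => None,
            None => Some(self.built_in_fragment.as_str()),
        }
    }
}
```

Returning `Option<&str>` instead of `&str` forces the prompt builder to decide what to do when the active style is out of scope, rather than silently injecting it.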
    .disk_output_style
    .as_ref()
    .map(|s| s.name.as_str())
    .unwrap_or("")
Hash disk style content, not only style name
Prompt caching only includes the disk style ID, so if the user edits a style file and re-selects the same name in-session, the cached system prompt can remain stale because the hash does not change. In that scenario the new style body is ignored until some unrelated cache-busting change happens.
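The idea can be sketched as follows. `style_cache_key` and the two-field `DiskStyle` are hypothetical, and `DefaultHasher` from the standard library is used here only as a dependency-free stand-in for whatever digest the real cache key uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical disk style: id plus prompt body.
struct DiskStyle {
    name: String,
    body: String,
}

/// Mix both the style id and its body into the key, so an edited file
/// with an unchanged id still produces a different key.
fn style_cache_key(style: Option<&DiskStyle>) -> u64 {
    let mut hasher = DefaultHasher::new();
    match style {
        Some(s) => {
            s.name.hash(&mut hasher);
            s.body.hash(&mut hasher);
        }
        // No disk style active: hash an empty marker instead.
        None => "".hash(&mut hasher),
    }
    hasher.finish()
}
```

Two styles that share the id `concise` but differ in body now hash to different keys, so re-selecting the same name after an edit invalidates the cached prompt.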
…style content

Codex review on PR #256 surfaced three regressions in the disk output-styles loader:

- /output-style resolved built-ins before disk styles, so a disk override with id `concise` or `default` could never activate. Resolution now checks the disk registry first and falls back to ResponseStyle aliases (off, terse, explain, teach) only when no disk file matches.
- active_output_style_fragment ignored frontmatter `applies_to`, letting a `[subagent]`-only style leak into the main agent's prompt. Both `active_output_style_fragment` and `active_output_style_name` now take an `AgentKind` and return None when the active style doesn't apply to that role. build_system_prompt threads the kind through; the Agent tool marks spawned children with AGENT_CODE_SUBAGENT=1 so the CLI can distinguish a real subagent from a user-invoked one-shot.
- The cached system prompt's hash mixed in the active style id but not its body, so editing a style file in-session and re-selecting the same id left a stale prompt cached. OutputStyle now carries a 12-byte SHA256 prefix of its body, hashed into the cache key alongside the id and applies_to list.

Tests: adds disk-override-wins, applies_to-filters-main, and content-hash-changes regressions.
Addressed codex review findings: disk styles now win over built-ins on id collision,

Addressed second-round codex findings: subagent propagation via
Add a loader that picks up markdown output-style presets from <project>/.agent/output-styles/ and ~/.config/agent-code/output-styles/. Built-in styles continue to ship with the binary; a disk style with the same id wins on collision and a warning is logged. Malformed files are skipped (not fatal) so a single broken preset cannot crash startup.

The /output-style listing now shows every loaded style with a source label (built-in / project / user) and accepts disk-loaded ids as arguments. Active style selection clears any previous override so the two paths cannot drift apart.

Includes unit tests for the frontmatter parser and the override behaviour, plus an end-to-end integration test that stages golden markdown fixtures into a fake project and verifies the loader picks them up.
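The merge behaviour described in this commit could look roughly like the sketch below. It works on in-memory strings rather than the filesystem, the `parse_style` and `load_all` helpers and the `Style` and `Source` types are hypothetical, the frontmatter parser only handles simple `key: value` lines, and the precedence between project and user files is not specified here (the last entry simply wins).

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
enum Source {
    BuiltIn,
    Project,
    User,
}

#[derive(Debug, Clone, PartialEq)]
struct Style {
    name: String,
    description: String,
    body: String,
    source: Source,
}

/// Parse `---` delimited frontmatter made of `key: value` lines.
/// Returns None for malformed input (missing fences or required keys).
fn parse_style(text: &str, source: Source) -> Option<Style> {
    let rest = text.strip_prefix("---\n")?;
    let (front, body) = rest.split_once("\n---\n")?;
    let mut name = None;
    let mut description = None;
    for line in front.lines() {
        let Some((key, value)) = line.split_once(':') else { continue };
        match key.trim() {
            "name" => name = Some(value.trim().to_string()),
            "description" => description = Some(value.trim().to_string()),
            _ => {} // other keys (e.g. applies_to) are ignored in this sketch
        }
    }
    Some(Style {
        name: name.filter(|n| !n.is_empty())?,
        description: description?,
        body: body.trim().to_string(),
        source,
    })
}

/// Built-ins are inserted first; disk entries inserted afterwards replace any
/// style with the same id. Malformed files are skipped with a warning.
fn load_all(built_ins: Vec<Style>, disk_files: &[(&str, Source)]) -> BTreeMap<String, Style> {
    let mut registry = BTreeMap::new();
    for style in built_ins {
        registry.insert(style.name.clone(), style);
    }
    for (text, source) in disk_files {
        match parse_style(text, source.clone()) {
            Some(style) => {
                if registry.contains_key(&style.name) {
                    eprintln!("warning: disk style '{}' overrides an existing style", style.name);
                }
                registry.insert(style.name.clone(), style);
            }
            None => eprintln!("warning: skipping malformed output style"),
        }
    }
    registry
}
```

Keeping the source on each entry is what lets a listing label every style as built-in, project, or user.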
…rate active marker, hermetic tests
- Subagent propagation: the Agent tool now sets
AGENT_CODE_DISK_OUTPUT_STYLE=<name> when spawning a child, and the
child resolves the named style against its own loaded registry. Without
this, applies_to: [subagent] styles were dead at the subprocess
boundary. Missing styles log a warning instead of crashing the child.
- applies_to validation: reject unknown values ("mian", "Main") at
parse time so a typo can't silently produce a no-op style. Lowercase
is canonical; mixed case is rejected to keep the matcher simple.
- Active marker: identity is (name, source), not just name. Disk styles
shadowing built-in ids no longer falsely activate the disk version
when only the built-in is in use. /status uses a label that includes
the source so a disk override is visible at a glance.
- /reload: re-read the active disk style from its source file so
in-session edits land on the next turn. If the file is gone, clear
the override (best-effort, never error).
- Hermetic tests: add OutputStyleRegistry::load_all_with_user_dir and
the AGENT_CODE_USER_OUTPUT_STYLES_DIR env override so integration
tests don't read the developer's real ~/.config/agent-code/.
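The applies_to validation bullet above can be sketched as a strict parser. This is a hypothetical illustration: the set of valid kinds (`main`, `subagent`), the inline `[a, b]` list syntax, and the function names are assumptions, not the actual frontmatter parser.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AgentKind {
    Main,
    Subagent,
}

/// Accept only the canonical lowercase spellings; anything else is an error,
/// so typos like "mian" or mixed case like "Main" fail at parse time.
fn parse_agent_kind(raw: &str) -> Result<AgentKind, String> {
    match raw {
        "main" => Ok(AgentKind::Main),
        "subagent" => Ok(AgentKind::Subagent),
        other => Err(format!("unknown applies_to value: {other:?}")),
    }
}

/// Parse an inline list such as `[main, subagent]`, failing on the first bad entry.
fn parse_applies_to(raw: &str) -> Result<Vec<AgentKind>, String> {
    let inner = raw
        .trim()
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .ok_or_else(|| format!("applies_to must be a bracketed list, got {raw:?}"))?;
    inner
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(parse_agent_kind)
        .collect()
}
```

Collecting into `Result<Vec<_>, _>` stops at the first invalid entry, which matches the goal of rejecting the whole style instead of producing a silent no-op.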
The hermetic e2e test in output_styles_subagent.rs relies on --dump-system-prompt to inspect the rendered prompt without ever contacting an LLM. CI strips every API-key env var, so the binary errored out at the API-key check before reaching the --dump-system-prompt branch — green locally only because OPENAI_API_KEY happens to be set. --dump-system-prompt is a pure tool+state diagnostic that never uses the LLM provider, so accept an empty placeholder key for this path specifically. All other paths still demand a real key.
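A minimal sketch of that key check, assuming a hypothetical `resolve_api_key` helper; the real CLI's argument handling and error type are not shown in this thread.

```rust
/// Resolve the API key for a run. `dump_system_prompt` mirrors the
/// --dump-system-prompt flag; `env_key` is whatever the environment provided.
fn resolve_api_key(dump_system_prompt: bool, env_key: Option<String>) -> Result<String, String> {
    match env_key.filter(|k| !k.is_empty()) {
        Some(key) => Ok(key),
        // The dump path never contacts a provider, so an empty placeholder is enough.
        None if dump_system_prompt => Ok(String::new()),
        // Every other path still requires a real key.
        None => Err("no API key configured".to_string()),
    }
}
```

Scoping the placeholder to the dump flag keeps the normal startup path strict while letting the hermetic test run in a CI environment with no keys set.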
af74acc to f30f1f3
Rebased onto main; conflicts resolved across:
Ubuntu/Coverage test regression: fixed. Root cause was Local CI gates:

Final CI status (run 25232752430, head
…literals

#263 (task model variants) was based on main before #256 (output styles from disk) added active_disk_output_style to ToolContext. After both landed, three ToolContext literals in the task subsystem fail to compile with E0063. Add the field with None to:

- crates/lib/src/tools/tasks/executors/local_agent.rs:75
- crates/lib/src/tools/tasks/tools.rs (test helper)
- crates/lib/tests/task_kind_integration.rs:34

Task executors don't propagate disk output styles today; None is the correct value until 8.7 (swarm mode) and 8.10 (fork/resume) wire that through.
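For illustration, the kind of literal being patched looks roughly like this. The two-field `ToolContext`, the `Option<String>` field type, and `task_tool_context` are hypothetical; the real struct has more fields and its field type is not shown here.

```rust
/// Trimmed, hypothetical stand-in for ToolContext.
#[derive(Debug, PartialEq)]
struct ToolContext {
    working_dir: String,
    /// Field added by the output-styles change; its real type is an assumption.
    active_disk_output_style: Option<String>,
}

/// Task executors do not propagate a disk style yet, so the field is None.
/// Leaving it out of the literal is what triggers E0063 (missing struct field).
fn task_tool_context(working_dir: &str) -> ToolContext {
    ToolContext {
        working_dir: working_dir.to_string(),
        active_disk_output_style: None,
    }
}
```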
Summary
- `<project>/.agent/output-styles/*.md` and `~/.config/agent-code/output-styles/*.md`. The frontmatter schema is `name` (required, becomes the style id), `description` (required), and `applies_to` (optional list of subagent kinds; defaults to all).
- `/output-style` so the listing surfaces every style with a `(built-in)` / `(project)` / `(user)` source label and accepts disk-loaded ids alongside the built-in names.

Test Plan
- `cargo build --workspace`
- `cargo test -p agent-code-lib --lib output_style` (13 tests)
- `cargo test -p agent-code-lib --test output_styles_integration` (5 tests)
- `system_prompt_*` tests still pass: the prompt-cache hash now includes the active disk style id, so `/output-style` flips bust the cache.
- `cargo clippy -p agent-code-lib --tests` clean
- `.agent/output-styles/` of a fixture project and confirm it appears in the loader output (exercised by the integration test).