feat(prompts): allow embedders to override constitutional prompt text via OnceLock hooks#2356
Conversation
…ways shows params
The ApprovalRequest params_display() relied on pending_tool_uses,
which gets drained by MessageComplete before ApprovalRequired arrives
in the TUI event loop. When the model streaming response contained a
text block, ContentBlockStop(Text) set pending_message_complete=true,
triggering the drain and leaving the approval dialog with empty {}.
Fix: add input: Value to Event::ApprovalRequired and populate it
from tool_input in the engine tool execution loop. The TUI now reads
params directly from the event instead of searching pending_tool_uses.
… via OnceLock hooks An application embedding this engine may want to ship its own constitutional preamble, locale preamble, or authority-recap block without forking the prompt constants. Today the only way to change `BASE_PROMPT`, `LOCALE_PREAMBLE_ZH_HANS`, or `AUTHORITY_RECAP` is to edit the bundled markdown/strings in this crate. Add three process-global, set-once override hooks: - `set_base_prompt_override(String)` - `set_locale_preamble_zh_hans_override(String)` - `set_authority_recap_override(String)` Each is backed by a `OnceLock` and read through a small accessor (`base_prompt()` / `locale_preamble_zh_hans()` / `authority_recap()`) that falls back to the existing default constant when no override is set. Prompt assembly now calls these accessors instead of the constants directly. Default builds are unaffected (no override set → upstream constant is used). Overrides are expected to be installed once at startup, before the engine builds any prompt; `set_*` returns `Err` with the supplied value if called twice. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces startup override mechanisms for compile-time prompt constants (BASE_PROMPT, LOCALE_PREAMBLE_ZH_HANS, and AUTHORITY_RECAP) using std::sync::OnceLock, allowing embedders to customize prompts without modifying the files in-tree. Feedback suggests returning a Result from the setter functions instead of silently ignoring failures when setting the OnceLock values. Additionally, it is recommended to use isolated integration tests rather than standard unit tests to prevent global state pollution and test flakiness.
| pub fn set_base_prompt_override(s: String) { | ||
| let _ = BASE_PROMPT_OVERRIDE.set(s); | ||
| } | ||
|
|
||
| /// Replace the Simplified-Chinese locale preamble (`## 语言要求`). | ||
| pub fn set_locale_preamble_zh_hans_override(s: String) { | ||
| let _ = LOCALE_PREAMBLE_ZH_HANS_OVERRIDE.set(s); | ||
| } | ||
|
|
||
| /// Replace the trailing `## Authority Recap` block. | ||
| pub fn set_authority_recap_override(s: String) { | ||
| let _ = AUTHORITY_RECAP_OVERRIDE.set(s); | ||
| } |
There was a problem hiding this comment.
Instead of silently ignoring the result of OnceLock::set, it is highly recommended to return Result<(), String>. This allows the embedder/caller to handle or log failures if they attempt to set the override multiple times or after initialization.
| pub fn set_base_prompt_override(s: String) { | |
| let _ = BASE_PROMPT_OVERRIDE.set(s); | |
| } | |
| /// Replace the Simplified-Chinese locale preamble (`## 语言要求`). | |
| pub fn set_locale_preamble_zh_hans_override(s: String) { | |
| let _ = LOCALE_PREAMBLE_ZH_HANS_OVERRIDE.set(s); | |
| } | |
| /// Replace the trailing `## Authority Recap` block. | |
| pub fn set_authority_recap_override(s: String) { | |
| let _ = AUTHORITY_RECAP_OVERRIDE.set(s); | |
| } | |
| pub fn set_base_prompt_override(s: String) -> Result<(), String> { | |
| BASE_PROMPT_OVERRIDE.set(s) | |
| } | |
| /// Replace the Simplified-Chinese locale preamble (## 语言要求). | |
| pub fn set_locale_preamble_zh_hans_override(s: String) -> Result<(), String> { | |
| LOCALE_PREAMBLE_ZH_HANS_OVERRIDE.set(s) | |
| } | |
| /// Replace the trailing ## Authority Recap block. | |
| pub fn set_authority_recap_override(s: String) -> Result<(), String> { | |
| AUTHORITY_RECAP_OVERRIDE.set(s) | |
| } |
| static BASE_PROMPT_OVERRIDE: std::sync::OnceLock<String> = std::sync::OnceLock::new(); | ||
| static LOCALE_PREAMBLE_ZH_HANS_OVERRIDE: std::sync::OnceLock<String> = std::sync::OnceLock::new(); | ||
| static AUTHORITY_RECAP_OVERRIDE: std::sync::OnceLock<String> = std::sync::OnceLock::new(); |
There was a problem hiding this comment.
Since OnceLock is process-global, calling these setters in standard unit tests will pollute the global state and cause other tests in the same test runner process to fail or behave flakily. To safely test these overrides, we should add integration tests in a separate test target (e.g., under tests/prompt_overrides.rs), which runs in its own isolated process.
|
love this!! the plan is to make this a "harness creator" in that you'll be able to go thru steps and make your own constitution / preamble / base prompt for your codewhale + individual projects. this is a great start . thank you! |
Replace the incorrect CodeWhale backend example in /provider help text with DeepSeek. CodeWhale is the app name, while deepseek is the actual provider id accepted by /provider. Add a regression test for shipped locale descriptions.
Bumps the cargo group with 1 update in the / directory: [tar](https://github.com/composefs/tar-rs). Updates `tar` from 0.4.45 to 0.4.46 - [Release notes](https://github.com/composefs/tar-rs/releases) - [Commits](composefs/tar-rs@0.4.45...0.4.46) --- updated-dependencies: - dependency-name: tar dependency-version: 0.4.46 dependency-type: direct:production dependency-group: cargo ... Signed-off-by: dependabot[bot] <support@github.com>
Run the stdio MCP server inside a blocking section when launched from the async CLI entrypoint. This prevents Tokio from panicking when the MCP server's internal runtime is dropped after stdin closes.
Treat inputs like `/ hello` as plain user messages instead of slash commands. This lets users start a message with a literal slash while preserving normal slash command behavior for `/help`, `/model ...`, and other commands.
The right-click context menu rendered all entries in English regardless of ui_locale. Added 26 MessageId variants with translations for en, ja, zh-Hans, zh-Hant, pt-BR, es-419. ContextMenuView now accepts a localized title. build_context_menu_entries() uses app.tr() instead of hardcoded strings.
Keep the /statusline picker selection visible when navigating through all available footer items in smaller terminal windows. The picker now scrolls its visible rows as the cursor moves, wraps Up/Down at the list edges, and renders the selected row with a continuous, slightly brighter background.
…neral agent intro The general-purpose sub-agent intro tells the agent how to plan, but says nothing about when to *stop*. With less capable models this leads to a failure mode where the agent retries the same failing tool call (e.g. an unreachable or rate-limited external API) over and over until it hits the elapsed/step ceiling, wasting the whole budget and returning nothing useful. Add two short clauses to GENERAL_AGENT_INTRO: - Stop quickly on failure: after the same call fails twice, return partial results with a one-line note instead of looping. - Bounded effort: prefer one focused attempt; if the task can't be completed within a few tool calls, return current findings so the parent can compensate. Prompt-only change; no behavioral code paths touched.
…-session workspace embedders The environment block (`platform`, `shell`, `pwd`, `lang`) was inserted above the volatile-content boundary as a workspace-static cache layer. The original comment claimed 'workspace path is fixed for the run' → static- cacheable — true for the terminal use case where one process owns one workspace for its lifetime. It is **not** true for embedders that swap workspaces between sessions: - IDE/GUI integrations binding the engine to a per-tab workspace - Multi-engine pools with one engine per session - Anything sending Op::SyncSession with a new workspace For these embedders, `pwd` drifts session-to-session, dragging the entire static prefix (mode prompt, project context, skills, context management, compact template) out of cache reuse on every session switch. That's a ~30KB+ cache miss for a few-byte path change. Moves the environment block below the volatile-content boundary, alongside the configured `instructions = [...]` block. Effect: - Static prefix stays byte-stable across sessions even when workspace changes (better KV prefix cache hit rate) - Model still sees `pwd` / `platform` / `shell` (needed for exec_shell and structured search tools), just in a slightly later position - No behavior change for terminal callers (workspace fixed → block content unchanged → still in same logical place, just below the boundary) Added a comment explaining the per-session-workspace motivation.
… paths Article VII Tier 5 currently lists four hard-coded path conventions (AGENTS.md / CLAUDE.md / .codewhale/instructions.md / .deepseek/instructions.md) as the canonical Local Law sources. Embedders that inject instructions via `EngineConfig.instructions` (rather than placing files at one of those four paths) get their files classified by path — and since their paths don't match, the model defaults to treating their imperatives as Tier 7 Memory (the lowest tier per Article VII), overridable by a single user sentence. Adds an explicit clause to Tier 5: '...and any file configured via `EngineConfig.instructions` (rendered as <instructions source=...> blocks above). ...embedder-declared imperatives are Local Law, not Memory preferences.' Pairs naturally with Hmbown#2311 (InstructionSource enum) but stands alone — this is a doc/law clarification, not an API change. Adds `local_law_tier_covers_engine_config_instructions` test pinning the new clause's presence.
… instead of paste-burst buffer
Add SSE event forwarding for UserInputRequired, REST endpoint for submitting user input responses, timeout protection for await_user_input, and fix interrupt_turn to clear active_turn immediately.
The monitor_turn loop already handles full turn finalization when the engine shuts down after cancellation, including saving turn status, usage, error, emitting turn.completed, and clearing active_turn. Having interrupt_turn also save turn status and emit turn.completed causes duplicate SSE events and loses usage/error data that monitor_turn would have captured from TurnComplete. Keep only the active_turn cleanup so the 409 error is resolved while monitor_turn remains the single source of truth for turn completion.
- Change 'not loaded' to 'not found' in submit_user_input and cancel_user_input so map_thread_err correctly maps to 404 - Use map_thread_err in submit_user_input API endpoint for consistent error response (404 for missing thread, 409 for conflict, etc.) instead of always returning 500
Clearing active_turn immediately breaks is_interrupt_requested detection in monitor_turn, causing turn status to be Completed instead of Interrupted. Let monitor_turn handle the cleanup after it detects the interrupt flag and performs full finalization with correct status, usage, and error.
Closes Hmbown#2074 Adds FauxStep::Factory(Box<dyn Fn(&MessageRequest) -> CannedTurn + Send + Sync>) to MockLlmClient. When a Factory step is dequeued, its closure runs against the real outgoing MessageRequest before the response stream is built, so any assert! panic surfaces directly from the client call instead of later in stream polling. Internal storage moves from VecDeque<CannedTurn> to VecDeque<FauxStep>, but every existing public method keeps working: - MockLlmClient::new(Vec<CannedTurn>) wraps each turn in FauxStep::Canned. - push_turn(CannedTurn) appends as FauxStep::Canned. Adds push_factory(closure) for tests that want the Factory branch. Doc comment on the Factory variant captures the DeepSeek V4 thinking-mode tool-call invariant (the v0.4.9-v0.5.1 reasoning_content drop that produced HTTP 400 on follow-up turns). Adds: - crates/tui/tests/reasoning_content_replayed_after_tool_call.rs — a regression test whose factory asserts the assistant tool-call turn carries a Thinking content block after a thinking + tool-call round. - An additional unit test in mock.rs covering create_message_synthesizes_from_factory_turn. All 20 tests in the new file pass, and the existing integration_mock_llm suite (27 tests) is unchanged.
|
Thanks for the prompt override hook work. I am leaving this open because the branch is currently conflict-blocked, and the override API needs a little more test coverage before it is safe to carry into the prompt layer. The specific things I would tighten: surface/handle |
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
# Conflicts: # crates/tui/src/tui/ui.rs
task_shell_start delegates to ExecShellTool, providing the same shell execution capability as exec_shell. Previously, task_shell_start was registered unconditionally in with_runtime_task_tools while exec_shell was gated behind allow_shell, creating an inconsistent security gate. This caused the model to try exec_shell first, fail, then fall back to task_shell_start — wasting tokens and bypassing the intended security boundary. Split TaskShellStartTool and TaskShellWaitTool out of with_runtime_task_tools into a new with_runtime_task_shell_tools method, and gate both behind the allow_shell check in with_agent_tools. Closes Hmbown#2303
The second feature-flag gate in tool_setup.rs was calling with_shell_tools() again when allow_shell was already true, causing duplicate tool registration. Remove the redundant gate since with_agent_tools() already handles the allow_shell check. Also add task_shell_wait to MODES.md alongside task_shell_start.
# Conflicts: # crates/tui/src/tui/views/mod.rs
fix: approval dialog shows empty params when model response includes text block
…st-currency fix(tui): show effective cost currency context in config view
Route user cells in the transcript cache through the existing user-message renderer. The live chat view renders through lines_with_copy_metadata(), which previously bypassed the user-message highlight path. As a result, submitted user prompts did not show the intended full-row background in the main transcript. Add a regression test for the transcript cache path.
Resolve the prompt-layer conflict by keeping the core tool taxonomy block while routing base prompt composition through the override accessor. Tighten the override API so duplicate registrations return the rejected string, and make locale preamble/closer overrides symmetric across supported bookend locales.
|
Pushed a maintainer update to get this branch current with What changed:
Local verification on the updated head:
Thanks again @h3c-hexin — this is a nice foundation for the future harness/constitution customization path. |
|
One more stewardship note before merge: the only red CI leg is Windows, and I checked the failed job logs. The failures are the existing background shell process-handle tests (�[0mexpected completed vs still running�[0m) rather than anything in this prompt-only change set. Ubuntu/macOS/lint/npm smoke/Greptile/GitGuardian are green, and the focused prompt tests plus tui check pass locally, so I’m treating this as mergeable rather than sending more busywork back to you. Thanks again @h3c-hexin. |
Summary
An application embedding this engine may want to ship its own constitutional preamble, locale preamble, or authority-recap block without forking the prompt constants. Today the only way to change
BASE_PROMPT,LOCALE_PREAMBLE_ZH_HANS, orAUTHORITY_RECAPis to edit the bundled markdown/strings in this crate.This adds three process-global, set-once override hooks:
set_base_prompt_override(String)set_locale_preamble_zh_hans_override(String)set_authority_recap_override(String)Each is backed by a
OnceLockand read through a small accessor (effective_base_prompt()/effective_locale_preamble_zh_hans()/effective_authority_recap()) that falls back to the existing default constant when no override is set. Prompt assembly now calls these accessors instead of the constants directly.Default builds are unaffected (no override set → bundled constant is used). Overrides are expected to be installed once at startup, before the engine builds any prompt; later
set_*calls are no-ops (OnceLock semantics).Testing
cargo fmt --all -- --checkcargo build -p codewhale-tui(compiles)cargo clippy --workspace --all-targets --all-featurescargo test --workspace --all-featuresChecklist
Greptile Summary
This PR adds process-global
OnceLock-backed prompt override hooks (base prompt, all locale preambles/closers, authority recap) so embedders can customise prompt text without forking the crate. It also introducesInstructionSource(file-path or inline string) to replaceVec<PathBuf>, adds slash-command frontmatter with anallowed-toolsrestriction, and moves the environment block below the volatile-content boundary to improve KV-cache reuse across sessions.set_*/effective_*pairs backed byOnceLock; first call wins and returnsOk(()), subsequent calls return the rejectedErr(String), addressing the previously noted silent-no-op concern. All locales (zh-Hans, ja, pt-BR, vi — the last one also newly added) now have symmetric preamble and closer overrides.InstructionSourceenum —File(PathBuf)preserves existing semantics (missing → warn + skip, oversize → truncate);Inline { name, content }lets embedders inject rendered templates without staging disk files.From<PathBuf>provided for call-site compatibility.allowed_tools— YAML-like---block in command files;allowed-toolsfield restricts which tools the model may call during that turn;parse_allowed_toolsnormalises entries to lowercase; gating happens incommand_allows_toolinside the turn loop.Confidence Score: 4/5
Safe to merge with minor follow-up: the changes are well-scoped and backwards-compatible, but two small hardening fixes in the tool-taxonomy assert and the allowed_tools normalisation are recommended.
The prompt-override mechanism, InstructionSource, and frontmatter parsing are solid and well-tested. Two non-blocking quality issues remain: debug_assert! in render_core_tool_taxonomy_block won't surface an empty taxonomy block in release builds, and command_allows_tool normalises only the tool-name side of the comparison, leaving an implicit lowercase contract on EngineConfig.allowed_tools that could silently mismatch when embedders set the field directly.
crates/tui/src/prompts.rs (render_core_tool_taxonomy_block assert) and crates/tui/src/core/engine/turn_loop.rs (command_allows_tool normalisation)
Important Files Changed
Reviews (2): Last reviewed commit: "Merge main into prompt override hooks" | Re-trigger Greptile