Skip to content

feat(prompts): allow embedders to override constitutional prompt text via OnceLock hooks#2356

Merged
Hmbown merged 124 commits into
Hmbown:mainfrom
h3c-hexin:pr2/prompt-override-hooks
May 31, 2026
Merged

feat(prompts): allow embedders to override constitutional prompt text via OnceLock hooks#2356
Hmbown merged 124 commits into
Hmbown:mainfrom
h3c-hexin:pr2/prompt-override-hooks

Conversation

@h3c-hexin
Copy link
Copy Markdown
Contributor

@h3c-hexin h3c-hexin commented May 29, 2026

Summary

An application embedding this engine may want to ship its own constitutional preamble, locale preamble, or authority-recap block without forking the prompt constants. Today the only way to change BASE_PROMPT, LOCALE_PREAMBLE_ZH_HANS, or AUTHORITY_RECAP is to edit the bundled markdown/strings in this crate.

This adds three process-global, set-once override hooks:

  • set_base_prompt_override(String)
  • set_locale_preamble_zh_hans_override(String)
  • set_authority_recap_override(String)

Each is backed by a OnceLock and read through a small accessor (effective_base_prompt() / effective_locale_preamble_zh_hans() / effective_authority_recap()) that falls back to the existing default constant when no override is set. Prompt assembly now calls these accessors instead of the constants directly.

Default builds are unaffected (no override set → bundled constant is used). Overrides are expected to be installed once at startup, before the engine builds any prompt; later set_* calls are no-ops (OnceLock semantics).

Testing

  • cargo fmt --all -- --check
  • cargo build -p codewhale-tui (compiles)
  • cargo clippy --workspace --all-targets --all-features
  • cargo test --workspace --all-features

Checklist

  • Updated docs or comments as needed
  • Added or updated tests where relevant (mechanism is opt-in; default path unchanged)
  • Verified TUI behavior manually if UI changes (n/a)

Greptile Summary

This PR adds process-global OnceLock-backed prompt override hooks (base prompt, all locale preambles/closers, authority recap) so embedders can customise prompt text without forking the crate. It also introduces InstructionSource (file-path or inline string) to replace Vec<PathBuf>, adds slash-command frontmatter with an allowed-tools restriction, and moves the environment block below the volatile-content boundary to improve KV-cache reuse across sessions.

  • Prompt overrides — 10 new set_* / effective_* pairs backed by OnceLock; first call wins and returns Ok(()), subsequent calls return the rejected Err(String), addressing the previously noted silent-no-op concern. All locales (zh-Hans, ja, pt-BR, vi — the last one also newly added) now have symmetric preamble and closer overrides.
  • InstructionSource enumFile(PathBuf) preserves existing semantics (missing → warn + skip, oversize → truncate); Inline { name, content } lets embedders inject rendered templates without staging disk files. From<PathBuf> provided for call-site compatibility.
  • Slash-command frontmatter + allowed_tools — YAML-like --- block in command files; allowed-tools field restricts which tools the model may call during that turn; parse_allowed_tools normalises entries to lowercase; gating happens in command_allows_tool inside the turn loop.

Confidence Score: 4/5

Safe to merge with minor follow-up: the changes are well-scoped and backwards-compatible, but two small hardening fixes in the tool-taxonomy assert and the allowed_tools normalisation are recommended.

The prompt-override mechanism, InstructionSource, and frontmatter parsing are solid and well-tested. Two non-blocking quality issues remain: debug_assert! in render_core_tool_taxonomy_block won't surface an empty taxonomy block in release builds, and command_allows_tool normalises only the tool-name side of the comparison, leaving an implicit lowercase contract on EngineConfig.allowed_tools that could silently mismatch when embedders set the field directly.

crates/tui/src/prompts.rs (render_core_tool_taxonomy_block assert) and crates/tui/src/core/engine/turn_loop.rs (command_allows_tool normalisation)

Important Files Changed

Filename Overview
crates/tui/src/prompts.rs Core file for this PR: adds 10 OnceLock-backed prompt overrides (all locale preambles/closers + base + authority), InstructionSource enum, render_core_tool_taxonomy_block, and moves the env block below the volatile boundary; debug_assert! in taxonomy builder is the main concern.
crates/tui/src/commands/user_commands.rs Adds YAML-like frontmatter parsing with allowed-tools support; parse_allowed_tools correctly normalizes to lowercase, but command_allows_tool in turn_loop relies on that assumption without enforcing it on the EngineConfig.allowed_tools field directly.
crates/tui/src/core/engine/turn_loop.rs Refactors tool name resolution into resolve_tool_definition, adds allowed_tools gate, adds ApprovalRequired input field, and pins prefix-cache combined hash to events; structurally sound.
crates/tui/src/core/engine/tool_catalog.rs Adds max_results parameter to tool search (default 20, cap 100, previously hard-coded 5), exposes TOOL_SEARCH_REGEX_NAME, and extends DEFAULT_ACTIVE_NATIVE_TOOLS with new async shell tools; all changes are tested.
crates/tui/src/core/engine.rs Generalises EngineConfig.instructions from Vec to Vec, adds allowed_tools field (None = unrestricted), and plumbs both through handle_send_message cleanly.
crates/tui/src/core/events.rs Adds input field to ApprovalRequired and pinned_combined_hash to PrefixCacheStatus; clean event-shape additions.

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (2): Last reviewed commit: "Merge main into prompt override hooks" | Re-trigger Greptile

LING71671 and others added 4 commits May 27, 2026 21:38
…ways shows params

The ApprovalRequest params_display() relied on pending_tool_uses,
which gets drained by MessageComplete before ApprovalRequired arrives
in the TUI event loop. When the model streaming response contained a
text block, ContentBlockStop(Text) set pending_message_complete=true,
triggering the drain and leaving the approval dialog with empty {}.

Fix: add input: Value to Event::ApprovalRequired and populate it
from tool_input in the engine tool execution loop. The TUI now reads
params directly from the event instead of searching pending_tool_uses.
… via OnceLock hooks

An application embedding this engine may want to ship its own constitutional
preamble, locale preamble, or authority-recap block without forking the
prompt constants. Today the only way to change `BASE_PROMPT`,
`LOCALE_PREAMBLE_ZH_HANS`, or `AUTHORITY_RECAP` is to edit the bundled
markdown/strings in this crate.

Add three process-global, set-once override hooks:

- `set_base_prompt_override(String)`
- `set_locale_preamble_zh_hans_override(String)`
- `set_authority_recap_override(String)`

Each is backed by a `OnceLock` and read through a small accessor
(`base_prompt()` / `locale_preamble_zh_hans()` / `authority_recap()`) that
falls back to the existing default constant when no override is set. Prompt
assembly now calls these accessors instead of the constants directly.

Default builds are unaffected (no override set → upstream constant is used).
Overrides are expected to be installed once at startup, before the engine
builds any prompt; `set_*` returns `Err` with the supplied value if called
twice.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces startup override mechanisms for compile-time prompt constants (BASE_PROMPT, LOCALE_PREAMBLE_ZH_HANS, and AUTHORITY_RECAP) using std::sync::OnceLock, allowing embedders to customize prompts without modifying the files in-tree. Feedback suggests returning a Result from the setter functions instead of silently ignoring failures when setting the OnceLock values. Additionally, it is recommended to use isolated integration tests rather than standard unit tests to prevent global state pollution and test flakiness.

Comment thread crates/tui/src/prompts.rs Outdated
Comment on lines +243 to +255
pub fn set_base_prompt_override(s: String) {
let _ = BASE_PROMPT_OVERRIDE.set(s);
}

/// Replace the Simplified-Chinese locale preamble (`## 语言要求`).
pub fn set_locale_preamble_zh_hans_override(s: String) {
let _ = LOCALE_PREAMBLE_ZH_HANS_OVERRIDE.set(s);
}

/// Replace the trailing `## Authority Recap` block.
pub fn set_authority_recap_override(s: String) {
let _ = AUTHORITY_RECAP_OVERRIDE.set(s);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Instead of silently ignoring the result of OnceLock::set, it is highly recommended to return Result<(), String>. This allows the embedder/caller to handle or log failures if they attempt to set the override multiple times or after initialization.

Suggested change
pub fn set_base_prompt_override(s: String) {
let _ = BASE_PROMPT_OVERRIDE.set(s);
}
/// Replace the Simplified-Chinese locale preamble (`## 语言要求`).
pub fn set_locale_preamble_zh_hans_override(s: String) {
let _ = LOCALE_PREAMBLE_ZH_HANS_OVERRIDE.set(s);
}
/// Replace the trailing `## Authority Recap` block.
pub fn set_authority_recap_override(s: String) {
let _ = AUTHORITY_RECAP_OVERRIDE.set(s);
}
pub fn set_base_prompt_override(s: String) -> Result<(), String> {
BASE_PROMPT_OVERRIDE.set(s)
}
/// Replace the Simplified-Chinese locale preamble (## 语言要求).
pub fn set_locale_preamble_zh_hans_override(s: String) -> Result<(), String> {
LOCALE_PREAMBLE_ZH_HANS_OVERRIDE.set(s)
}
/// Replace the trailing ## Authority Recap block.
pub fn set_authority_recap_override(s: String) -> Result<(), String> {
AUTHORITY_RECAP_OVERRIDE.set(s)
}

Comment thread crates/tui/src/prompts.rs
Comment on lines +237 to +239
static BASE_PROMPT_OVERRIDE: std::sync::OnceLock<String> = std::sync::OnceLock::new();
static LOCALE_PREAMBLE_ZH_HANS_OVERRIDE: std::sync::OnceLock<String> = std::sync::OnceLock::new();
static AUTHORITY_RECAP_OVERRIDE: std::sync::OnceLock<String> = std::sync::OnceLock::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since OnceLock is process-global, calling these setters in standard unit tests will pollute the global state and cause other tests in the same test runner process to fail or behave flakily. To safely test these overrides, we should add integration tests in a separate test target (e.g., under tests/prompt_overrides.rs), which runs in its own isolated process.

Comment thread crates/tui/src/prompts.rs Outdated
Comment thread crates/tui/src/prompts.rs Outdated
Comment thread crates/tui/src/prompts.rs Outdated
@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

love this!! the plan is to make this a "harness creator" in that you'll be able to go thru steps and make your own constitution / preamble / base prompt for your codewhale + individual projects. this is a great start . thank you!

reidliu41 and others added 23 commits May 30, 2026 19:12
  Replace the incorrect CodeWhale backend example in /provider help text with DeepSeek.

  CodeWhale is the app name, while deepseek is the actual provider id accepted by /provider.
  Add a regression test for shipped locale descriptions.
Bumps the cargo group with 1 update in the / directory: [tar](https://github.com/composefs/tar-rs).


Updates `tar` from 0.4.45 to 0.4.46
- [Release notes](https://github.com/composefs/tar-rs/releases)
- [Commits](composefs/tar-rs@0.4.45...0.4.46)

---
updated-dependencies:
- dependency-name: tar
  dependency-version: 0.4.46
  dependency-type: direct:production
  dependency-group: cargo
...

Signed-off-by: dependabot[bot] <support@github.com>
  Run the stdio MCP server inside a blocking section when launched from the async CLI entrypoint.

  This prevents Tokio from panicking when the MCP server's internal runtime is dropped after stdin closes.
  Treat inputs like `/ hello` as plain user messages instead of slash commands.
  This lets users start a message with a literal slash while preserving normal
  slash command behavior for `/help`, `/model ...`, and other commands.
The right-click context menu rendered all entries in English regardless of ui_locale. Added 26 MessageId variants with translations for en, ja, zh-Hans, zh-Hant, pt-BR, es-419. ContextMenuView now accepts a localized title. build_context_menu_entries() uses app.tr() instead of hardcoded strings.
  Keep the /statusline picker selection visible when navigating through all
  available footer items in smaller terminal windows.

  The picker now scrolls its visible rows as the cursor moves, wraps Up/Down at
  the list edges, and renders the selected row with a continuous, slightly
  brighter background.
…neral agent intro

The general-purpose sub-agent intro tells the agent how to plan, but says
nothing about when to *stop*. With less capable models this leads to a
failure mode where the agent retries the same failing tool call (e.g. an
unreachable or rate-limited external API) over and over until it hits the
elapsed/step ceiling, wasting the whole budget and returning nothing useful.

Add two short clauses to GENERAL_AGENT_INTRO:
- Stop quickly on failure: after the same call fails twice, return partial
  results with a one-line note instead of looping.
- Bounded effort: prefer one focused attempt; if the task can't be completed
  within a few tool calls, return current findings so the parent can compensate.

Prompt-only change; no behavioral code paths touched.
…-session workspace embedders

The environment block (`platform`, `shell`, `pwd`, `lang`) was inserted
above the volatile-content boundary as a workspace-static cache layer. The
original comment claimed 'workspace path is fixed for the run' → static-
cacheable — true for the terminal use case where one process owns one
workspace for its lifetime.

It is **not** true for embedders that swap workspaces between sessions:

- IDE/GUI integrations binding the engine to a per-tab workspace
- Multi-engine pools with one engine per session
- Anything sending Op::SyncSession with a new workspace

For these embedders, `pwd` drifts session-to-session, dragging the entire
static prefix (mode prompt, project context, skills, context management,
compact template) out of cache reuse on every session switch. That's a
~30KB+ cache miss for a few-byte path change.

Moves the environment block below the volatile-content boundary, alongside
the configured `instructions = [...]` block. Effect:

- Static prefix stays byte-stable across sessions even when workspace
  changes (better KV prefix cache hit rate)
- Model still sees `pwd` / `platform` / `shell` (needed for exec_shell
  and structured search tools), just in a slightly later position
- No behavior change for terminal callers (workspace fixed → block content
  unchanged → still in same logical place, just below the boundary)

Added a comment explaining the per-session-workspace motivation.
… paths

Article VII Tier 5 currently lists four hard-coded path conventions
(AGENTS.md / CLAUDE.md / .codewhale/instructions.md / .deepseek/instructions.md)
as the canonical Local Law sources. Embedders that inject instructions via
`EngineConfig.instructions` (rather than placing files at one of those four
paths) get their files classified by path — and since their paths don't
match, the model defaults to treating their imperatives as Tier 7 Memory
(the lowest tier per Article VII), overridable by a single user sentence.

Adds an explicit clause to Tier 5: '...and any file configured via
`EngineConfig.instructions` (rendered as <instructions source=...> blocks
above). ...embedder-declared imperatives are Local Law, not Memory
preferences.'

Pairs naturally with Hmbown#2311 (InstructionSource enum) but stands alone — this
is a doc/law clarification, not an API change.

Adds `local_law_tier_covers_engine_config_instructions` test pinning the
new clause's presence.
gaord and others added 9 commits May 30, 2026 19:25
Add SSE event forwarding for UserInputRequired, REST endpoint for submitting user input responses, timeout protection for await_user_input, and fix interrupt_turn to clear active_turn immediately.
The monitor_turn loop already handles full turn finalization when the
engine shuts down after cancellation, including saving turn status,
usage, error, emitting turn.completed, and clearing active_turn.

Having interrupt_turn also save turn status and emit turn.completed
causes duplicate SSE events and loses usage/error data that
monitor_turn would have captured from TurnComplete.

Keep only the active_turn cleanup so the 409 error is resolved while
monitor_turn remains the single source of truth for turn completion.
- Change 'not loaded' to 'not found' in submit_user_input and
  cancel_user_input so map_thread_err correctly maps to 404
- Use map_thread_err in submit_user_input API endpoint for
  consistent error response (404 for missing thread, 409 for
  conflict, etc.) instead of always returning 500
Clearing active_turn immediately breaks is_interrupt_requested detection
in monitor_turn, causing turn status to be Completed instead of Interrupted.

Let monitor_turn handle the cleanup after it detects the interrupt flag
and performs full finalization with correct status, usage, and error.
Closes Hmbown#2074

Adds FauxStep::Factory(Box<dyn Fn(&MessageRequest) -> CannedTurn + Send + Sync>)
to MockLlmClient. When a Factory step is dequeued, its closure runs
against the real outgoing MessageRequest before the response stream is
built, so any assert! panic surfaces directly from the client call
instead of later in stream polling.

Internal storage moves from VecDeque<CannedTurn> to VecDeque<FauxStep>,
but every existing public method keeps working:

- MockLlmClient::new(Vec<CannedTurn>) wraps each turn in FauxStep::Canned.
- push_turn(CannedTurn) appends as FauxStep::Canned.

Adds push_factory(closure) for tests that want the Factory branch.

Doc comment on the Factory variant captures the DeepSeek V4
thinking-mode tool-call invariant (the v0.4.9-v0.5.1 reasoning_content
drop that produced HTTP 400 on follow-up turns).

Adds:

- crates/tui/tests/reasoning_content_replayed_after_tool_call.rs — a
  regression test whose factory asserts the assistant tool-call turn
  carries a Thinking content block after a thinking + tool-call round.
- An additional unit test in mock.rs covering create_message_synthesizes_from_factory_turn.

All 20 tests in the new file pass, and the existing
integration_mock_llm suite (27 tests) is unchanged.
@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

Thanks for the prompt override hook work. I am leaving this open because the branch is currently conflict-blocked, and the override API needs a little more test coverage before it is safe to carry into the prompt layer.

The specific things I would tighten: surface/handle OnceLock::set failures instead of silently dropping them, make the zh-Hans preamble/closer override behavior symmetric, and add focused tests for the global override path. After a rebase, the useful verification target is cargo test -p codewhale-tui prompts --all-features plus full CI.

hoclaptrinh33 and others added 17 commits May 30, 2026 21:38
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
# Conflicts:
#	crates/tui/src/tui/ui.rs
task_shell_start delegates to ExecShellTool, providing the same shell
execution capability as exec_shell. Previously, task_shell_start was
registered unconditionally in with_runtime_task_tools while exec_shell
was gated behind allow_shell, creating an inconsistent security gate.

This caused the model to try exec_shell first, fail, then fall back to
task_shell_start — wasting tokens and bypassing the intended security
boundary.

Split TaskShellStartTool and TaskShellWaitTool out of
with_runtime_task_tools into a new with_runtime_task_shell_tools method,
and gate both behind the allow_shell check in with_agent_tools.

Closes Hmbown#2303
The second feature-flag gate in tool_setup.rs was calling
with_shell_tools() again when allow_shell was already true, causing
duplicate tool registration. Remove the redundant gate since
with_agent_tools() already handles the allow_shell check.

Also add task_shell_wait to MODES.md alongside task_shell_start.
# Conflicts:
#	crates/tui/src/tui/views/mod.rs
fix: approval dialog shows empty params when model response includes text block
…st-currency

fix(tui): show effective cost currency context in config view
  Route user cells in the transcript cache through the existing user-message renderer.

  The live chat view renders through lines_with_copy_metadata(), which previously
  bypassed the user-message highlight path. As a result, submitted user prompts did
  not show the intended full-row background in the main transcript.

  Add a regression test for the transcript cache path.
Resolve the prompt-layer conflict by keeping the core tool taxonomy block while routing base prompt composition through the override accessor. Tighten the override API so duplicate registrations return the rejected string, and make locale preamble/closer overrides symmetric across supported bookend locales.
@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

Pushed a maintainer update to get this branch current with main and address the review nits while keeping your original embedder-hook direction intact.

What changed:

  • resolved the prompt-layer conflict with current main while preserving the new core tool taxonomy block
  • routed base prompt composition through the override accessor
  • made duplicate override registration observable by returning Result<(), String> instead of silently dropping the second value
  • made locale preamble/closer overrides symmetric across the supported locale bookends
  • added a focused regression for duplicate-set behavior without mutating the process-global prompt cells

Local verification on the updated head:

  • cargo fmt --all -- --check
  • python3 scripts/check-provider-registry.py
  • cargo test -p codewhale-tui prompt_override_storage_reports_duplicate_sets -- --nocapture
  • cargo test -p codewhale-tui prompts::tests:: -- --nocapture
  • cargo check -p codewhale-tui --all-features --locked

Thanks again @h3c-hexin — this is a nice foundation for the future harness/constitution customization path.

@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

One more stewardship note before merge: the only red CI leg is Windows, and I checked the failed job logs. The failures are the existing background shell process-handle tests (�[0mexpected completed vs still running�[0m) rather than anything in this prompt-only change set. Ubuntu/macOS/lint/npm smoke/Greptile/GitGuardian are green, and the focused prompt tests plus tui check pass locally, so I’m treating this as mergeable rather than sending more busywork back to you. Thanks again @h3c-hexin.

@Hmbown Hmbown merged commit 76a8442 into Hmbown:main May 31, 2026
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.