Skip to content

fix(domain): collapse system messages into a single one (#2894)#3436

Open
ImBIOS wants to merge 3 commits into
tailcallhq:mainfrom
ImBIOS:fix/2894-collapse-system-messages
Open

fix(domain): collapse system messages into a single one (#2894)#3436
ImBIOS wants to merge 3 commits into
tailcallhq:mainfrom
ImBIOS:fix/2894-collapse-system-messages

Conversation

@ImBIOS
Copy link
Copy Markdown
Contributor

@ImBIOS ImBIOS commented Jun 2, 2026

Summary

Context::set_system_messages previously inserted each entry in content
as a separate ContextMessage::system(...), producing N system messages
in the resulting chat. Chat templates such as Qwen3.5/3.6 (used by
llama.cpp and vLLM) only allow a single system message at messages[0]
and raise:

raise_exception('System message must be at the beginning.')

for any additional system message, breaking those providers entirely with
a 400/500 error from the upstream inference server.

Fix: join all entries into a single system message (separated by \n\n)
and insert it once at position 0. Empty payloads are filtered out, and an
all-empty content vector leaves the context untouched.

Changes

  • crates/forge_domain/src/context.rs
    • Rewrite Context::set_system_messages to:
      1. Drop every existing system message.
      2. Combine the new entries into a single \n\n-joined string (filtering
        out empty entries).
      3. Insert exactly one ContextMessage::system(combined) at position 0.
    • Added a Rustdoc block describing the chat-template constraint that
      motivates the change.
  • Two regression tests in the same file:
    • test_set_system_messages_collapses_into_single_message — asserts the
      context contains exactly one system message and that its content is the
      \n\n-joined payload.
    • test_set_system_messages_replaces_existing_single_system_message
      asserts a pre-existing system message is replaced, not retained or
      duplicated.

Why this is safe

  • Single-function, single-crate change. The public API of
    Context::set_system_messages is unchanged.
  • Behaviour is a strict superset of the old implementation for callers that
    pass a Vec of length 1 (still produces a single system message at
    position 0).
  • The downstream caller in forge_app/src/system_prompt.rs:136 already
    passes exactly two entries (static_block, non_static_block), so this
    is the path actually exercised in production.

Verification

  • cargo check -p forge_domain --tests: clean
  • cargo check -p forge_app --tests: clean
  • cargo clippy -p forge_domain --tests --no-deps: clean
  • cargo test -p forge_domain --lib: 596 passed, 0 failed (incl. 2 new
    regression tests)
  • cargo test -p forge_app --lib system_prompt: 7 passed, 0 failed

Related issues

How to test locally

  1. cargo build -p forge_domain (or run the two new tests directly).
  2. After the fix is merged, a Qwen3.5 / Qwen3.6 model served via llama.cpp
    or vLLM should no longer reject requests with the
    System message must be at the beginning. Jinja exception.

)

`Context::set_system_messages` previously inserted each entry in `content`
as a separate `ContextMessage::system(...)`, producing N system messages
in the resulting chat. Chat templates such as Qwen3.5/3.6 (used by
llama.cpp and vLLM) only allow a single system message at `messages[0]`
and raise `raise_exception('System message must be at the beginning.')`
for any additional one, breaking those providers entirely.

Join all entries into a single system message (separated by `\n\n`) and
insert it once at position 0. Empty payloads are filtered out and an
all-empty `content` vector leaves the context untouched.

Fixes tailcallhq#2894, also resolves the duplicate tailcallhq#2586 (qwen3.5-27b llama.cpp).

Co-Authored-By: ForgeCode <noreply@forgecode.dev>
Copilot AI review requested due to automatic review settings June 2, 2026 04:46
@github-actions github-actions Bot added the type: fix Iterations on existing features or infrastructure. label Jun 2, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates Context::set_system_messages to ensure compatibility with chat templates that require exactly one system message at messages[0].

Changes:

  • Replace any existing system messages and collapse all provided entries into a single system message block.
  • Filter out empty system-message entries and avoid inserting a system message when the combined payload is empty.
  • Add regression tests covering multi-entry collapse and replacement of pre-existing system messages.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/forge_domain/src/context.rs Outdated
Comment thread crates/forge_domain/src/context.rs Outdated
Comment thread crates/forge_domain/src/context.rs
Comment thread crates/forge_domain/src/context.rs Outdated
autofix-ci Bot and others added 2 commits June 2, 2026 04:49
Applied suggestions from the Copilot review on PR tailcallhq#3436:

1. Doc comments no longer wrap the Jinja exception
   `raise_exception('System message must be at the beginning.')` across a
   line break inside a single inline code span — both occurrences now use
   a fenced `text` code block so rustdoc/CommonMark render as intended.

2. The semantic of an empty `content` payload is now explicitly part of
   the public contract of `set_system_messages`: it clears every existing
   system message and inserts no replacement. This matches the pre-tailcallhq#2894
   behavior on the non-empty branch, but it was previously undocumented.
   Added two regression tests (`..._with_empty_payload_clears_existing`
   and `..._with_only_empty_strings_clears_existing`) to lock it in.

3. The combined system message is now built in a single `String`
   allocation (push_str + manual separator insertion) instead of the
   previous `collect::<Vec<_>>().join(...)` approach, which materialised
   an intermediate `Vec<String>` for no benefit.

Co-Authored-By: ForgeCode <noreply@forgecode.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: fix Iterations on existing features or infrastructure.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Multiple system messages break models with strict chat templates (e.g. Qwen3.5) [Bug]: template error with qwen3.5-27b (llama.cpp)

2 participants