
feat: model-tier prompt profiles — adapt per model capability #68

Merged
kienbui1995 merged 3 commits into main from feat/model-tier-prompts
Apr 15, 2026

Conversation

kienbui1995 (Owner) commented Apr 15, 2026

Problem

Today every model gets the same 1200-token prompt. Weak models (Qwen, Llama) can't follow complex rules and get confused by negative instructions.

Solution

3 prompt tiers auto-selected by model name:

| Tier | Models | Tokens | Tools | Rules |
| --- | --- | --- | --- | --- |
| 1 (frontier) | Claude Opus/Sonnet, GPT-4o/5 | ~800 | 30 | Full + negatives |
| 2 (strong) | Gemini, DeepSeek, Haiku | ~400 | 15 | Positive only |
| 3 (local) | Qwen, Llama, Ollama | ~150 | 8 | Simple English |

Dynamic context (CWD, git, skills, memory) appended to all tiers.
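The selection-plus-context flow can be sketched as below. `model_prompt_tier` and `build_system_prompt` are the names from the diff, but the matching rules and template strings here are simplified illustrations, not the shipped code:

```rust
// Illustrative tier templates; the real constants hold the full prompts.
const PROMPT_TIER1: &str = "Full prompt: 30 tools, all rules, negatives.";
const PROMPT_TIER2: &str = "Compact prompt: 15 tools, positive rules only.";
const PROMPT_TIER3: &str = "Minimal prompt: 8 tools, simple English.";

/// Map a model name to a prompt tier (1 = frontier, 2 = strong, 3 = local).
/// Plain substring matching, as a simplification of the shipped logic.
fn model_prompt_tier(model: &str) -> u8 {
    let m = model.to_lowercase();
    if m.contains("opus") || m.contains("sonnet") || m.contains("gpt-4") || m.contains("gpt-5") {
        1
    } else if m.contains("gemini") || m.contains("deepseek") || m.contains("claude") {
        2 // Haiku lands here: Opus/Sonnet were already caught above
    } else {
        3 // local/small models (Qwen, Llama, Ollama) fall through here
    }
}

/// Select the tier template, then append dynamic context for every tier.
fn build_system_prompt(cwd: &str, model: &str) -> String {
    let base = match model_prompt_tier(model) {
        1 => PROMPT_TIER1,
        2 => PROMPT_TIER2,
        _ => PROMPT_TIER3,
    };
    format!("{base}\n\n## Context\nWorking directory: {cwd}")
}
```

The key property is that tiering only changes the static template; the appended context block is identical for every tier.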

274 tests, 0 fail.

Summary by CodeRabbit

  • Refactor
    • System prompts now use a tiered, model-dependent template so initial guidance adapts by model class; project and runtime context (working dir, OS/arch, detected stack, git status, instruction files, discovered skills/agents) continue to be appended to the selected template.
  • Chores
    • Updated static analysis configuration to exclude main entry files from duplicate-code checks.

…bility

3 tiers based on model capability:

Tier 1 (frontier): Claude Opus/Sonnet, GPT-4o/5, o3/o4
  - Full prompt: 30 tools, all rules, security, negatives, cost awareness
  - ~800 tokens

Tier 2 (strong): Gemini, DeepSeek, Mistral Large, Claude Haiku
  - Compact: 15 tools, positive rules only, no negatives
  - ~400 tokens

Tier 3 (local/small): Qwen, Llama, Ollama, Mistral 7B
  - Minimal: 8 essential tools, simple English, 4 rules
  - ~150 tokens

Key differences:
- Tier 3 has NO negative rules (weak models tend to do the opposite of what a negative rule says)
- Tier 2 drops browser/debug/worktree/mcp/notebook tools
- Tier 3 drops subagent/task/memory/edit_plan tools
- All tiers get dynamic context (CWD, git, skills, agents, memory)

274 tests, 0 fail.
coderabbitai bot commented Apr 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3d0a2320-90cc-44fe-87d3-17ae1db1d38e

📥 Commits

Reviewing files that changed from the base of the PR and between 5f03070 and ad7537c.

📒 Files selected for processing (1)
  • sonar-project.properties
✅ Files skipped from review due to trivial changes (1)
  • sonar-project.properties

📝 Walkthrough

The system prompt construction is refactored to be model-aware: build_system_prompt now accepts a model parameter and selects one of several tiered prompt templates (via model_prompt_tier) before appending the existing project/runtime context details.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **System Prompt Tiering**<br>`mc/crates/mc-cli/src/main.rs` | Reworked prompt assembly: `build_system_prompt(project)` -> `build_system_prompt(project, model)`. Added `model_prompt_tier(model: &str) -> u8` and multiple file-scope templates (`PROMPT_TIER1`, `PROMPT_TIER2`, `PROMPT_TIER3`, `PROMPT_QWEN`). Prompt selection is now model-dependent; project/runtime context is appended as before. |
| **Sonar CPD Exclusion**<br>`sonar-project.properties` | Added `sonar.cpd.exclusions=**/main.rs` to exclude `main.rs` files from duplication analysis. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Builder as build_system_prompt()
    participant Tierer as model_prompt_tier()
    participant Template as Prompt Template
    participant Context as Project/Runtime Context
    participant LLM as Runtime/Model Client

    CLI->>Builder: call build_system_prompt(project, model)
    Builder->>Tierer: model_prompt_tier(model)
    Tierer-->>Builder: tier (1/2/3/QWEN)
    Builder->>Template: select template by tier
    Template-->>Builder: base prompt text
    Builder->>Context: append working dir, OS/arch, stack, git, files, skills/agents
    Builder-->>LLM: final system prompt
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I hopped through tiers both light and deep,
Chose prompts that wake models from sleep.
I carry context in a tidy sack,
Nudge the runtime, and hop right back. ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description explains the problem, solution, and implementation details, but does not follow the required template structure with clearly labeled 'What', 'Why', 'How', and 'Checklist' sections. | Restructure the description to match the required template: add 'What', 'Why', 'How' sections with clear headings, and include the 'Checklist' section with all required verification steps. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and accurately summarizes the main change: introducing model-tier prompt profiles that adapt prompts based on model capability. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/model-tier-prompts

Comment @coderabbitai help to get the list of available commands and usage tips.


coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
mc/crates/mc-cli/src/main.rs (1)

2011-2030: Consider more precise matching for short model identifiers.

The substrings "o3" and "o4" (lines 2017-2018) are quite short and could match unintended model names (e.g., a hypothetical "proto3-model" or "llama3-pro4"). Consider using word-boundary-aware matching or more specific patterns like "-o3" / "o3-" / exact match.

Additionally, "gpt-4o-mini" would match "gpt-4" and be classified as Tier 1, but mini models may benefit from Tier 2's simpler prompts.

💡 Potential refinement for more precise matching
```diff
 fn model_prompt_tier(model: &str) -> u8 {
     let m = model.to_lowercase();
+    // Check for mini/small variants first (demote to tier 2)
+    if m.contains("mini") || m.contains("small") {
+        return 2;
+    }
     if m.contains("opus")
         || m.contains("sonnet")
         || m.contains("gpt-4")
         || m.contains("gpt-5")
-        || m.contains("o3")
-        || m.contains("o4")
+        || m.starts_with("o3") || m.contains("-o3")
+        || m.starts_with("o4") || m.contains("-o4")
     {
         1 // frontier
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mc/crates/mc-cli/src/main.rs` around lines 2011 - 2030, The model_prompt_tier
function currently uses broad substring checks (e.g., "o3", "o4", "gpt-4") that
can produce false positives; update model_prompt_tier to use more precise
matching (e.g., regex or boundary-aware checks) for short identifiers like
"o3"/"o4" so they only match whole tokens or hyphenated forms (e.g., "-o3",
"o3-", or exact equality), and adjust the "gpt-4" rule to exclude variants like
"gpt-4o-mini" (treat those as Tier 2) by requiring a word boundary or specific
separators when matching "gpt-4". Ensure you modify the logic inside
model_prompt_tier to apply these refined checks while preserving existing tier
assignments for other names.
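A boundary-aware check along these lines would avoid the false positives; `has_segment` is a hypothetical helper sketched for illustration, not code from this PR:

```rust
/// True iff `token` appears in `model` as a whole segment when the name is
/// split on common separators, so "o3" matches "o3-mini" and "openai/o3"
/// but not "pro3-model".
fn has_segment(model: &str, token: &str) -> bool {
    model
        .to_lowercase()
        .split(|c: char| ['-', '/', ':', '.'].contains(&c))
        .any(|seg| seg == token)
}
```

With this helper, `has_segment(m, "o3")` replaces `m.contains("o3")` in the tier check while keeping the rest of the logic unchanged.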
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@mc/crates/mc-cli/src/main.rs`:
- Around line 1969-1991: The Tier 2 prompt constant PROMPT_TIER2 still lists
tools that should be removed per the PR description; update the PROMPT_TIER2
string in mc/crates/mc-cli/src/main.rs (the PROMPT_TIER2 const) to remove the
lines describing `debug` and `browser` (i.e. remove the "- `debug`: Structured
debugging." and "- `browser`: Test web UIs." entries) so the prompt matches the
intended toolset, preserving the existing string formatting and other tool
entries.

---

Nitpick comments:
In `@mc/crates/mc-cli/src/main.rs`:
- Around line 2011-2030: The model_prompt_tier function currently uses broad
substring checks (e.g., "o3", "o4", "gpt-4") that can produce false positives;
update model_prompt_tier to use more precise matching (e.g., regex or
boundary-aware checks) for short identifiers like "o3"/"o4" so they only match
whole tokens or hyphenated forms (e.g., "-o3", "o3-", or exact equality), and
adjust the "gpt-4" rule to exclude variants like "gpt-4o-mini" (treat those as
Tier 2) by requiring a word boundary or specific separators when matching
"gpt-4". Ensure you modify the logic inside model_prompt_tier to apply these
refined checks while preserving existing tier assignments for other names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19c21ab0-0a5d-42f9-8af8-5e619c19a0a1

📥 Commits

Reviewing files that changed from the base of the PR and between fa58554 and b89a439.

📒 Files selected for processing (1)
  • mc/crates/mc-cli/src/main.rs

Comment on lines +1969 to +1991
```rust
/// Tier 2: Strong models (Gemini, DeepSeek, Mistral Large) — compact, positive rules only.
const PROMPT_TIER2: &str = "\
You are magic-code, an AI coding assistant in the terminal.\n\n\
## Tools\n\
- `bash`: Run shell commands (streaming output).\n\
- `read_file`: Read files. `write_file`: Create files. `edit_file`: Edit specific text.\n\
- `glob_search`: Find files. `grep_search`: Search content. `codebase_search`: Search symbols.\n\
- `edit_plan`: Plan multi-file changes. `subagent`: Delegate tasks.\n\
- `memory_read`/`memory_write`: Save/read project facts across sessions.\n\
- `web_fetch`/`web_search`: Read docs or search web.\n\
- `ask_user`: Ask clarifying questions.\n\
- `debug`: Structured debugging. `browser`: Test web UIs.\n\
- `task_create`/`task_get`/`task_list`/`task_stop`: Background tasks.\n\n\
## Rules\n\
- Always read a file before editing it.\n\
- Use `edit_file` for small changes, `write_file` for new files.\n\
- Use `codebase_search` to find code before reading files.\n\
- Run tests after making changes.\n\
- Ask the user when requirements are unclear.\n\
- Be concise. Show code, not explanations.\n\n\
## Error Recovery\n\
- If `edit_file` fails, read the file first to see current content.\n\
- If stuck, try a different approach or ask the user.";
```

⚠️ Potential issue | 🟡 Minor

Tier 2 prompt includes tools that should be excluded per PR description.

The PR objectives state that Tier 2 "drops browser/debug/worktree/mcp/notebook tools", but lines 1980-1981 still include debug and browser:

- `debug`: Structured debugging. `browser`: Test web UIs.\n\

Either update the prompt to remove these tools, or update the PR description to reflect the actual tool set.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mc/crates/mc-cli/src/main.rs` around lines 1969 - 1991, The Tier 2 prompt
constant PROMPT_TIER2 still lists tools that should be removed per the PR
description; update the PROMPT_TIER2 string in mc/crates/mc-cli/src/main.rs (the
PROMPT_TIER2 const) to remove the lines describing `debug` and `browser` (i.e.
remove the "- `debug`: Structured debugging." and "- `browser`: Test web UIs."
entries) so the prompt matches the intended toolset, preserving the existing
string formatting and other tool entries.

gemini-code-assist bot left a comment

Code Review

This pull request introduces a tiered system prompt architecture, tailoring the instructions and tool descriptions based on the capability of the AI model being used. Frontier models receive a comprehensive prompt, while strong and local models receive more compact or simplified versions. The review feedback correctly identifies that 'mini' model variants (e.g., gpt-4o-mini) would currently be misclassified as Tier 1 frontier models and suggests adjustments to the categorization logic to place them in Tier 2.

Comment on lines +2013 to +2018
```rust
if m.contains("opus")
    || m.contains("sonnet")
    || m.contains("gpt-4")
    || m.contains("gpt-5")
    || m.contains("o3")
    || m.contains("o4")
```

medium

The current logic for Tier 1 will incorrectly categorize 'mini' models (like gpt-4o-mini or o3-mini) as frontier models because they contain the strings gpt-4 or o3. According to the PR description, Tier 2 is intended for models like Haiku, which are comparable to the 'mini' variants of GPT/o-series. These models often struggle with complex negative instructions and should likely receive the Tier 2 prompt profile.

```rust
    if (m.contains("opus")
        || m.contains("sonnet")
        || m.contains("gpt-4")
        || m.contains("gpt-5")
        || m.contains("o3")
        || m.contains("o4"))
        && !m.contains("mini")
```

Comment on lines +2021 to +2024
```rust
} else if m.contains("gemini")
    || m.contains("deepseek")
    || m.contains("mistral-large")
    || m.contains("claude")
```

medium

The check for claude in Tier 2 acts as a fallback for models like Haiku (since Opus/Sonnet are caught by Tier 1). However, if a 'mini' model is detected (e.g., gpt-4o-mini), it will currently fall through to Tier 3 (local/small) if the Tier 1 check is modified to exclude 'mini'. It would be more robust to explicitly include 'mini' in Tier 2 to ensure these capable but instruction-sensitive models get the 'strong' profile rather than the 'minimal' one.

```rust
    } else if m.contains("gemini")
        || m.contains("deepseek")
        || m.contains("mistral-large")
        || m.contains("claude")
        || m.contains("mini")
```

Research-backed optimizations for Qwen 3.5:
- Tool definitions REQUIRED to avoid reasoning loops (community fix)
- Explicit parameter descriptions (Qwen needs them)
- Step-by-step workflow (1. read → 2. edit → 3. test → 4. report)
- 'Always use tools' instruction (prevents hallucinating file contents)
- 10 tools (core coding set, no browser/debug/worktree)
- ~350 tokens (vs 800 for Tier 1)

Tier 4 auto-detected for any model containing 'qwen'.
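Since tier detection is substring-based, the `qwen` check has to run before the generic rules or Qwen models would land in tier 3. A sketch of that ordering (the return code 4 and the exact match lists are assumptions based on the description above, not the shipped code):

```rust
/// Tier selection with the dedicated Qwen template checked first.
/// 4 selects the Qwen profile; 1-3 mirror the tiers described earlier.
fn model_prompt_tier(model: &str) -> u8 {
    let m = model.to_lowercase();
    if m.contains("qwen") {
        return 4; // Qwen profile: required tool defs, step-by-step workflow
    }
    if m.contains("opus") || m.contains("sonnet") || m.contains("gpt-5") {
        1 // frontier
    } else if m.contains("gemini") || m.contains("deepseek") || m.contains("claude") {
        2 // strong
    } else {
        3 // local/small
    }
}
```

The early return is the important part: any later rule that would otherwise match a Qwen name (e.g. the local/small fallback) never runs.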

coderabbitai bot left a comment

🧹 Nitpick comments (1)
mc/crates/mc-cli/src/main.rs (1)

2038-2046: Consider more specific patterns for "o3"/"o4" model detection.

The substrings "o3" and "o4" (lines 2044-2045) are short and could match unintended model names (e.g., a hypothetical "pro3-model" or "expo4-vision"). Consider using more specific patterns like "o3-" or a regex/prefix match if OpenAI's naming convention is predictable.

💡 Optional: More specific matching
```diff
     || m.contains("gpt-4")
     || m.contains("gpt-5")
-    || m.contains("o3")
-    || m.contains("o4")
+    || m.starts_with("o3")
+    || m.starts_with("o4")
```

Or if models may have prefixes:

```diff
-    || m.contains("o3")
-    || m.contains("o4")
+    || m.contains("o3-") || m == "o3"
+    || m.contains("o4-") || m == "o4"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mc/crates/mc-cli/src/main.rs` around lines 2038 - 2046, The model_prompt_tier
function currently matches short substrings "o3" and "o4" which can produce
false positives; update the checks in model_prompt_tier to use more specific
patterns (e.g., "o3-", "o3/", "o4-", "o4/" or a regex with word boundaries or
prefix matches) so only intended OpenAI model names are detected; modify the
conditional that checks m.contains("o3") || m.contains("o4") to use these
stricter patterns or a compiled regex to avoid matching unrelated names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@mc/crates/mc-cli/src/main.rs`:
- Around line 2038-2046: The model_prompt_tier function currently matches short
substrings "o3" and "o4" which can produce false positives; update the checks in
model_prompt_tier to use more specific patterns (e.g., "o3-", "o3/", "o4-",
"o4/" or a regex with word boundaries or prefix matches) so only intended OpenAI
model names are detected; modify the conditional that checks m.contains("o3") ||
m.contains("o4") to use these stricter patterns or a compiled regex to avoid
matching unrelated names.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b7848513-6f7e-4b01-a353-d962183843d6

📥 Commits

Reviewing files that changed from the base of the PR and between b89a439 and 5f03070.

📒 Files selected for processing (1)
  • mc/crates/mc-cli/src/main.rs

@sonarqubecloud

@kienbui1995 kienbui1995 merged commit 2f96445 into main Apr 15, 2026
16 checks passed
@kienbui1995 kienbui1995 deleted the feat/model-tier-prompts branch April 15, 2026 03:18
