feat: model-tier prompt profiles — adapt per model capability #68

kienbui1995 merged 3 commits into main from
Conversation
…bility

3 tiers based on model capability:

- Tier 1 (frontier): Claude Opus/Sonnet, GPT-4o/5, o3/o4. Full prompt: 30 tools, all rules, security, negatives, cost awareness (~800 tokens)
- Tier 2 (strong): Gemini, DeepSeek, Mistral Large, Claude Haiku. Compact: 15 tools, positive rules only, no negatives (~400 tokens)
- Tier 3 (local/small): Qwen, Llama, Ollama, Mistral 7B. Minimal: 8 essential tools, simple English, 4 rules (~150 tokens)

Key differences:

- Tier 3 has NO negative rules (weak models do the opposite)
- Tier 2 drops browser/debug/worktree/mcp/notebook tools
- Tier 3 drops subagent/task/memory/edit_plan tools
- All tiers get dynamic context (CWD, git, skills, agents, memory)

274 tests, 0 fail.
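The tier dispatch described above can be sketched roughly as follows. This is a hypothetical reconstruction from the summary, not the merged code: the function name `model_prompt_tier` appears in the review below, but the exact substring lists here are assumptions.

```rust
// Hypothetical sketch of the tier selector; substring lists are assumed
// from the PR summary, not copied from the actual diff.
fn model_prompt_tier(model: &str) -> u8 {
    let m = model.to_lowercase();
    if m.contains("opus")
        || m.contains("sonnet")
        || m.contains("gpt-4")
        || m.contains("gpt-5")
        || m.contains("o3")
        || m.contains("o4")
    {
        1 // frontier: full prompt, ~800 tokens
    } else if m.contains("gemini")
        || m.contains("deepseek")
        || m.contains("mistral-large")
        || m.contains("claude") // Haiku falls past Opus/Sonnet to here
    {
        2 // strong: compact prompt, ~400 tokens
    } else {
        3 // local/small: minimal prompt, ~150 tokens
    }
}

fn main() {
    assert_eq!(model_prompt_tier("claude-haiku-3.5"), 2);
    println!("{}", model_prompt_tier("qwen2.5-coder"));
}
```

Note that order matters: Opus/Sonnet must be tested before the generic `claude` fallback so Haiku lands in Tier 2.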
📝 Walkthrough

The system prompt construction is refactored to be model-aware.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant Builder as build_system_prompt()
    participant Tierer as model_prompt_tier()
    participant Template as Prompt Template
    participant Context as Project/Runtime Context
    participant LLM as Runtime/Model Client
    CLI->>Builder: call build_system_prompt(project, model)
    Builder->>Tierer: model_prompt_tier(model)
    Tierer-->>Builder: tier (1/2/3/QWEN)
    Builder->>Template: select template by tier
    Template-->>Builder: base prompt text
    Builder->>Context: append working dir, OS/arch, stack, git, files, skills/agents
    Builder-->>LLM: final system prompt
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: 1 passed, 2 failed (2 warnings)
Actionable comments posted: 1
🧹 Nitpick comments (1)
mc/crates/mc-cli/src/main.rs (1)
Lines 2011-2030: Consider more precise matching for short model identifiers.

The substrings `"o3"` and `"o4"` (lines 2017-2018) are quite short and could match unintended model names (e.g., a hypothetical `"proto3-model"` or `"llama3-pro4"`). Consider using word-boundary-aware matching or more specific patterns like `"-o3"` / `"o3-"` / exact match.

Additionally, `"gpt-4o-mini"` would match `"gpt-4"` and be classified as Tier 1, but mini models may benefit from Tier 2's simpler prompts.

💡 Potential refinement for more precise matching:

```diff
 fn model_prompt_tier(model: &str) -> u8 {
     let m = model.to_lowercase();
+    // Check for mini/small variants first (demote to tier 2)
+    if m.contains("mini") || m.contains("small") {
+        return 2;
+    }
     if m.contains("opus")
         || m.contains("sonnet")
         || m.contains("gpt-4")
         || m.contains("gpt-5")
-        || m.contains("o3")
-        || m.contains("o4")
+        || m.starts_with("o3") || m.contains("-o3")
+        || m.starts_with("o4") || m.contains("-o4")
     {
         1 // frontier
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mc/crates/mc-cli/src/main.rs` around lines 2011 - 2030, The model_prompt_tier function currently uses broad substring checks (e.g., "o3", "o4", "gpt-4") that can produce false positives; update model_prompt_tier to use more precise matching (e.g., regex or boundary-aware checks) for short identifiers like "o3"/"o4" so they only match whole tokens or hyphenated forms (e.g., "-o3", "o3-", or exact equality), and adjust the "gpt-4" rule to exclude variants like "gpt-4o-mini" (treat those as Tier 2) by requiring a word boundary or specific separators when matching "gpt-4". Ensure you modify the logic inside model_prompt_tier to apply these refined checks while preserving existing tier assignments for other names.
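The boundary-aware matching the reviewer suggests could look something like the sketch below. The helper `matches_token` is hypothetical (it does not exist in this PR): it treats a short identifier as a match only when it forms a whole delimiter-separated token.

```rust
// Hypothetical helper illustrating the reviewer's suggestion: "o3"/"o4"
// match only as a whole token between common separators, so names like
// "proto3-model" no longer match "o3".
fn matches_token(model: &str, pat: &str) -> bool {
    model
        .split(|c: char| c == '-' || c == '/' || c == ':' || c == '.')
        .any(|tok| tok == pat)
}

fn main() {
    assert!(matches_token("o3-mini", "o3"));
    assert!(!matches_token("proto3-model", "o3"));
    println!("ok");
}
```

A regex with `\b` boundaries would be stricter still, but a token split avoids pulling in a regex dependency for a hot, simple check.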
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 19c21ab0-0a5d-42f9-8af8-5e619c19a0a1
📒 Files selected for processing (1)
mc/crates/mc-cli/src/main.rs
```rust
/// Tier 2: Strong models (Gemini, DeepSeek, Mistral Large) — compact, positive rules only.
const PROMPT_TIER2: &str = "\
You are magic-code, an AI coding assistant in the terminal.\n\n\
## Tools\n\
- `bash`: Run shell commands (streaming output).\n\
- `read_file`: Read files. `write_file`: Create files. `edit_file`: Edit specific text.\n\
- `glob_search`: Find files. `grep_search`: Search content. `codebase_search`: Search symbols.\n\
- `edit_plan`: Plan multi-file changes. `subagent`: Delegate tasks.\n\
- `memory_read`/`memory_write`: Save/read project facts across sessions.\n\
- `web_fetch`/`web_search`: Read docs or search web.\n\
- `ask_user`: Ask clarifying questions.\n\
- `debug`: Structured debugging. `browser`: Test web UIs.\n\
- `task_create`/`task_get`/`task_list`/`task_stop`: Background tasks.\n\n\
## Rules\n\
- Always read a file before editing it.\n\
- Use `edit_file` for small changes, `write_file` for new files.\n\
- Use `codebase_search` to find code before reading files.\n\
- Run tests after making changes.\n\
- Ask the user when requirements are unclear.\n\
- Be concise. Show code, not explanations.\n\n\
## Error Recovery\n\
- If `edit_file` fails, read the file first to see current content.\n\
- If stuck, try a different approach or ask the user.";
```
Tier 2 prompt includes tools that should be excluded per PR description.
The PR objectives state that Tier 2 "drops browser/debug/worktree/mcp/notebook tools", but lines 1980-1981 still include debug and browser:
- `debug`: Structured debugging. `browser`: Test web UIs.\n\
Either update the prompt to remove these tools, or update the PR description to reflect the actual tool set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@mc/crates/mc-cli/src/main.rs` around lines 1969 - 1991, The Tier 2 prompt
constant PROMPT_TIER2 still lists tools that should be removed per the PR
description; update the PROMPT_TIER2 string in mc/crates/mc-cli/src/main.rs (the
PROMPT_TIER2 const) to remove the lines describing `debug` and `browser` (i.e.
remove the "- `debug`: Structured debugging." and "- `browser`: Test web UIs."
entries) so the prompt matches the intended toolset, preserving the existing
string formatting and other tool entries.
Code Review
This pull request introduces a tiered system prompt architecture, tailoring the instructions and tool descriptions based on the capability of the AI model being used. Frontier models receive a comprehensive prompt, while strong and local models receive more compact or simplified versions. The review feedback correctly identifies that 'mini' model variants (e.g., gpt-4o-mini) would currently be misclassified as Tier 1 frontier models and suggests adjustments to the categorization logic to place them in Tier 2.
```rust
if m.contains("opus")
    || m.contains("sonnet")
    || m.contains("gpt-4")
    || m.contains("gpt-5")
    || m.contains("o3")
    || m.contains("o4")
```
The current logic for Tier 1 will incorrectly categorize 'mini' models (like gpt-4o-mini or o3-mini) as frontier models because they contain the strings gpt-4 or o3. According to the PR description, Tier 2 is intended for models like Haiku, which are comparable to the 'mini' variants of GPT/o-series. These models often struggle with complex negative instructions and should likely receive the Tier 2 prompt profile.
```rust
if (m.contains("opus")
    || m.contains("sonnet")
    || m.contains("gpt-4")
    || m.contains("gpt-5")
    || m.contains("o3")
    || m.contains("o4"))
    && !m.contains("mini")
```

```rust
} else if m.contains("gemini")
    || m.contains("deepseek")
    || m.contains("mistral-large")
    || m.contains("claude")
```
The check for claude in Tier 2 acts as a fallback for models like Haiku (since Opus/Sonnet are caught by Tier 1). However, if a 'mini' model is detected (e.g., gpt-4o-mini), it will currently fall through to Tier 3 (local/small) if the Tier 1 check is modified to exclude 'mini'. It would be more robust to explicitly include 'mini' in Tier 2 to ensure these capable but instruction-sensitive models get the 'strong' profile rather than the 'minimal' one.
```rust
} else if m.contains("gemini")
    || m.contains("deepseek")
    || m.contains("mistral-large")
    || m.contains("claude")
    || m.contains("mini")
```

Research-backed optimizations for Qwen 3.5:

- Tool definitions REQUIRED to avoid reasoning loops (community fix)
- Explicit parameter descriptions (Qwen needs them)
- Step-by-step workflow (1. read → 2. edit → 3. test → 4. report)
- 'Always use tools' instruction (prevents hallucinating file contents)
- 10 tools (core coding set, no browser/debug/worktree)
- ~350 tokens (vs 800 for Tier 1)

Tier 4 auto-detected for any model containing 'qwen'.
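Taken together, the two review suggestions above might combine into a classifier like the following sketch. This is hypothetical, not the merged code: it demotes `mini` variants out of Tier 1 and catches them explicitly in Tier 2 so they do not fall through to the minimal Tier 3 prompt.

```rust
// Hypothetical combination of both suggestions: exclude "mini" variants
// from the frontier tier and route them to the "strong" tier instead.
fn model_prompt_tier(model: &str) -> u8 {
    let m = model.to_lowercase();
    if (m.contains("opus")
        || m.contains("sonnet")
        || m.contains("gpt-4")
        || m.contains("gpt-5")
        || m.contains("o3")
        || m.contains("o4"))
        && !m.contains("mini")
    {
        1 // frontier
    } else if m.contains("gemini")
        || m.contains("deepseek")
        || m.contains("mistral-large")
        || m.contains("claude")
        || m.contains("mini") // capable but instruction-sensitive
    {
        2 // strong
    } else {
        3 // local/small
    }
}

fn main() {
    assert_eq!(model_prompt_tier("gpt-4o-mini"), 2);
    assert_eq!(model_prompt_tier("gpt-4o"), 1);
    println!("ok");
}
```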
🧹 Nitpick comments (1)
mc/crates/mc-cli/src/main.rs (1)
Lines 2038-2046: Consider more specific patterns for "o3"/"o4" model detection.

The substrings `"o3"` and `"o4"` (lines 2044-2045) are short and could match unintended model names (e.g., a hypothetical "pro3-model" or "expo4-vision"). Consider using more specific patterns like `"o3-"` or a regex/prefix match if OpenAI's naming convention is predictable.

💡 Optional: More specific matching

```diff
     || m.contains("gpt-4")
     || m.contains("gpt-5")
-    || m.contains("o3")
-    || m.contains("o4")
+    || m.starts_with("o3")
+    || m.starts_with("o4")
```

Or if models may have prefixes:

```diff
-    || m.contains("o3")
-    || m.contains("o4")
+    || m.contains("o3-") || m == "o3"
+    || m.contains("o4-") || m == "o4"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mc/crates/mc-cli/src/main.rs` around lines 2038 - 2046, The model_prompt_tier function currently matches short substrings "o3" and "o4" which can produce false positives; update the checks in model_prompt_tier to use more specific patterns (e.g., "o3-", "o3/", "o4-", "o4/" or a regex with word boundaries or prefix matches) so only intended OpenAI model names are detected; modify the conditional that checks m.contains("o3") || m.contains("o4") to use these stricter patterns or a compiled regex to avoid matching unrelated names.
…are intentionally similar)



Problem
The same 1200-token prompt was sent to every model. Weak models (Qwen, Llama) can't follow complex rules and get confused by negative instructions.
Solution
3 prompt tiers auto-selected by model name:
Dynamic context (CWD, git, skills, memory) appended to all tiers.
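Appending the shared dynamic context might look roughly like this sketch; the function name, signature, and section headings here are assumptions for illustration, not the PR's actual `build_system_prompt`.

```rust
// Hypothetical sketch: take the tier-selected base prompt and append
// runtime context shared by all tiers. Names and formatting are assumed.
fn assemble_prompt(base: &str, cwd: &str, git_branch: Option<&str>) -> String {
    let mut prompt = String::from(base);
    prompt.push_str("\n\n## Context\n");
    prompt.push_str(&format!("Working directory: {cwd}\n"));
    if let Some(branch) = git_branch {
        prompt.push_str(&format!("Git branch: {branch}\n"));
    }
    prompt
}

fn main() {
    let p = assemble_prompt("You are magic-code.", "/repo", Some("main"));
    assert!(p.contains("Working directory: /repo"));
    println!("ok");
}
```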
274 tests, 0 fail.