fix: prompt hardening — security, negative rules, tone (research-backed)#59

Merged
kienbui1995 merged 1 commit into main from fix/prompt-deep-research
Apr 14, 2026

@kienbui1995 kienbui1995 commented Apr 13, 2026

Research-backed prompt improvements

Sources: Augment Code 11 techniques, Claude Code leak analysis, 134K-star leaked prompts repo.

New sections

  • Security: prompt injection guardrail, no untrusted execution
  • What NOT to Do: 7 negative rules (Augment: 'telling model what NOT to do is safe and effective')
  • Output Format: confidence level, risks/side effects

Reordered

Cost Awareness + Error Recovery moved to END of static prompt (Augment: 'models pay more attention to beginning and especially end')
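
The reordering can be pictured with a minimal sketch, assuming the static prompt is assembled from named sections. `Section`, `assemble_prompt`, `reordered_prompt`, the `(elided)` bodies, and the relative order of the first three sections are all hypothetical and not taken from `main.rs`; only the placement of Cost Awareness and Error Recovery at the end reflects this PR.

```rust
/// Hypothetical prompt section; not the representation used in `main.rs`.
struct Section {
    title: &'static str,
    body: &'static str,
}

/// Joins sections in the order given, one `## Title` heading per section.
fn assemble_prompt(sections: &[Section]) -> String {
    sections
        .iter()
        .map(|s| format!("## {}\n{}\n", s.title, s.body))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Builds a prompt whose last two sections are the ones this PR moved.
/// Section bodies are placeholders, not the real prompt text.
fn reordered_prompt() -> String {
    let sections = [
        Section { title: "Security", body: "(elided)" },
        Section { title: "What NOT to Do", body: "(elided)" },
        Section { title: "Output Format", body: "(elided)" },
        // Moved to the end of the static prompt in this PR.
        Section { title: "Cost Awareness", body: "(elided)" },
        Section { title: "Error Recovery", body: "(elided)" },
    ];
    assemble_prompt(&sections)
}
```

Keeping the order in one array means a later reorder is a one-line move rather than an edit inside a long string literal.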

All 8 new rules

✅ Prompt injection detection
✅ No write_file for small edits
✅ No destructive commands without confirmation
✅ No modify tests unless asked
✅ No install deps silently
✅ No repeat failed approach
✅ State confidence level
✅ Mention risks/side effects

274 tests, 0 fail.
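
A prompt-level regression check could look roughly like the sketch below. It assumes the new sections appear as `## ` headings, which the review excerpt confirms only for "What NOT to Do"; `missing_sections`, `sample_prompt`, and `REQUIRED` are hypothetical names, and `sample_prompt` stands in for the real `build_system_prompt`, whose signature is not shown in this PR.

```rust
/// Returns the required headings that do not appear in `prompt`.
fn missing_sections<'a>(prompt: &str, required: &[&'a str]) -> Vec<&'a str> {
    required
        .iter()
        .copied()
        .filter(|heading| !prompt.contains(*heading))
        .collect()
}

/// Hypothetical stand-in for the real prompt builder; bodies are elided.
fn sample_prompt() -> String {
    String::from(
        "## Security\n(elided)\n\n## What NOT to Do\n(elided)\n\n## Output Format\n(elided)\n",
    )
}

/// Headings assumed for the three sections this PR adds or changes.
const REQUIRED: [&str; 3] = ["## Security", "## What NOT to Do", "## Output Format"];
```

In a test, `missing_sections(&sample_prompt(), &REQUIRED)` should come back empty; a non-empty result names the headings that were dropped.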

Summary by CodeRabbit

  • Chores
    • Enhanced system prompt guidelines with improved security safeguards, including prompt-injection handling and credential protection
    • Refined command execution requirements and output formatting standards

…tion reorder

Based on research from Augment Code (11 techniques), Claude Code leak,
and leaked prompts repo (134K stars):

1. Security section: prompt injection detection, no untrusted execution,
   no credential exposure
2. What NOT to Do: 7 negative rules (no write for small edits, no guess,
   no destructive commands, no modify tests, no silent deps, no repeat fails)
3. Output Format enhanced: confidence level, risks/side effects
4. Section reorder: Cost Awareness + Error Recovery moved to END
   (models pay most attention to beginning + end of prompt)

274 tests, 0 fail.

coderabbitai bot commented Apr 13, 2026

📝 Walkthrough

Walkthrough

The system prompt in the CLI's main function was extended with new Security, "What NOT to Do", and improved Output Format sections. Duplicate guidance was removed, consolidating overlapping instructions into a single comprehensive prompt.

Changes

| Cohort / File(s) | Summary |
|---|---|
| System Prompt Enhancement: `mc/crates/mc-cli/src/main.rs` | Extended `build_system_prompt` with Security section (prompt-injection handling, confirmation before executing untrusted commands, credential protection), What NOT to Do section (tool constraints, test/dependency handling, failure recovery), and refined Output Format requirements. Removed duplicate/overlapping Output Format block and redundant "Be concise" guidance. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 A rabbit's refrain on prompts made right:

Guidelines clearer, safer in sight,
Security badges and "do NOT" advice,
Duplicate whispers silenced—so nice!
The AI assistant now knows the way,
To serve with wisdom, come what may. 🌟

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The description covers the motivation (research-backed improvements), key implementation details (8 new rules, section reordering), and test results, but lacks the Checklist section required by the template. | Add the Checklist section from the template with items for cargo fmt, cargo test, cargo clippy, warnings check, and test coverage confirmation. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: adding security hardening, negative rules, and tone improvements to the system prompt with research backing. |

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the system prompt in mc-cli by adding new sections for Security, What NOT to Do, and Output Format, while reorganizing existing instructions. A review comment suggests consolidating the What NOT to Do section to remove redundancies with other parts of the prompt, which would help reduce token usage and improve clarity.

Comment on lines +1876 to +1883
## What NOT to Do\n\
- Do NOT use `write_file` to make small edits — use `edit_file` instead.\n\
- Do NOT read entire large files — use offset/limit in `read_file`.\n\
- Do NOT guess when requirements are unclear — use `ask_user`.\n\
- Do NOT run destructive commands (rm -rf, drop table) without user confirmation.\n\
- Do NOT modify test files unless explicitly asked.\n\
- Do NOT install new dependencies without mentioning it first.\n\
- Do NOT repeat a failed approach — try a different strategy.\n\n\
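
For readers unfamiliar with the `\n\` endings in the excerpt: in a Rust string literal, a backslash at the end of a line skips the newline and the leading whitespace of the next line, so only the explicit `\n` escapes become line breaks in the prompt. The sketch below reproduces three of the seven rules as a standalone constant; `WHAT_NOT_TO_DO` and `rule_count` are hypothetical names used only for illustration.

```rust
/// Shortened copy of the section above (three of the seven rules).
/// The trailing `\` on each source line removes the source newline and the
/// indentation that follows it; the `\n` escapes are the real line breaks.
const WHAT_NOT_TO_DO: &str = "## What NOT to Do\n\
    - Do NOT run destructive commands (rm -rf, drop table) without user confirmation.\n\
    - Do NOT modify test files unless explicitly asked.\n\
    - Do NOT install new dependencies without mentioning it first.\n\n";

/// Counts the bullet lines (rules) in a prompt section.
fn rule_count(section: &str) -> usize {
    section.lines().filter(|line| line.starts_with("- ")).count()
}
```

Because the indentation is stripped, each rule starts at column zero in the final prompt, and a helper like `rule_count` can check that the number of rules matches what the PR description claims.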
medium

The 'What NOT to Do' section introduces several rules that are already covered in other sections, leading to significant redundancy. For example:

  • Line 1877 is redundant with lines 1863 and 1893.
  • Line 1878 is redundant with line 1895.
  • Line 1879 is redundant with line 1871.
  • Line 1883 is redundant with line 1897.

While negative constraints are useful, repeating the same instructions multiple times across different sections increases token usage and can lead to instruction fatigue for the model. Consider consolidating these into a single, clear instruction per topic. For instance, you could move the unique negative constraints (like destructive commands or test file modifications) here and keep the tool-specific ones in 'Tool Usage Guidelines'.
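
One way to act on this suggestion, sketched under the assumption that each rule is tagged by hand with a topic key: keep the first rule per topic and drop later repeats. `Rule`, `consolidate`, `sample_rules`, the topic keys, and the first rule's wording are hypothetical; the prompt in `main.rs` is a plain string literal and carries no such tags.

```rust
use std::collections::HashSet;

/// Hypothetical rule record; `topic` is a hand-assigned key.
struct Rule {
    topic: &'static str,
    text: &'static str,
}

/// Keeps the first rule seen for each topic and drops later repeats.
fn consolidate(rules: &[Rule]) -> Vec<&'static str> {
    let mut seen = HashSet::new();
    rules
        .iter()
        // `insert` returns false when the topic was already present.
        .filter(|rule| seen.insert(rule.topic))
        .map(|rule| rule.text)
        .collect()
}

/// Illustrative input: two rules on the same topic plus one unique rule.
fn sample_rules() -> Vec<Rule> {
    vec![
        Rule { topic: "small-edits", text: "Prefer `edit_file` over `write_file` for small edits." },
        Rule { topic: "small-edits", text: "Do NOT use `write_file` to make small edits." },
        Rule { topic: "tests", text: "Do NOT modify test files unless explicitly asked." },
    ]
}
```

With the sample input, `consolidate` returns two rules: the first `small-edits` wording and the `tests` rule. Which wording survives depends on input order, so the preferred phrasing should be listed first.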

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
mc/crates/mc-cli/src/main.rs (1)

1881-1881: Soften the absolute “no test edits” rule to avoid blocking required fixes

Line 1881 is currently absolute; this can prevent necessary test updates when behavior changes are implemented. Consider allowing test edits when strictly required, with explicit justification.

✏️ Proposed wording tweak
-         - Do NOT modify test files unless explicitly asked.\n\
+         - Do NOT modify test files unless explicitly asked; if test changes are strictly required for correctness, do so and explain why.\n\
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mc/crates/mc-cli/src/main.rs` at line 1881, Update the hardline prohibition
string "- Do NOT modify test files unless explicitly asked.\n" in main.rs to a
softer message that allows test edits when necessary; change the text to
indicate test modifications are permitted only with explicit justification and a
brief note explaining why the change is required (e.g., "Do not modify tests
unless strictly necessary — if a test must be updated, include an explicit
justification and link to the relevant issue/PR"). Ensure you update the exact
string literal where it's defined so help output and any related help/usage text
reflect the new, permissive-but-justified policy.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b43cc99f-7d57-4b65-b6df-cb96e40e6756

📥 Commits

Reviewing files that changed from the base of the PR and between 135d92e and 19b46af.

📒 Files selected for processing (1)
  • mc/crates/mc-cli/src/main.rs

@kienbui1995 kienbui1995 merged commit 1d3b87e into main Apr 14, 2026
9 checks passed
@kienbui1995 kienbui1995 deleted the fix/prompt-deep-research branch April 14, 2026 01:42