# Section 3.2: Automate Code Review with Prompt Templates

| **Aspect** | **Details** |
|-------------|-------------|
| **Goal** | Build a reusable code review template that catches security, performance, and quality issues |
| **Time** | ~40 minutes |
| **Prerequisites** | Section 3.1 complete, `setup_utils.py` loaded |
| **What You'll Build** | Multi-dimensional review template with severity classification |
| **Next Steps** | Activity 3.2 → Section 3.3 (Test Generation) |

---

You mastered individual tactics in Module 2. Now you'll **combine them** into production templates that automate SDLC workflows. Start by building a comprehensive code review system that replaces ad-hoc reviews with structured, consistent feedback.

## 🔧 Quick Setup Check

Since you completed Section 3.1, setup is already done! You just need to import it.


In [None]:
# Quick setup check - imports setup_utils
try:
    import importlib
    import setup_utils
    importlib.reload(setup_utils)
    from setup_utils import *
    print(f"✅ Setup loaded! Using {PROVIDER.upper()} with {get_default_model()}")
    print("🚀 Ready to build code review templates!")
except ImportError:
    print("❌ Setup not found!")
    print("💡 Please run 3.1-setup-and-introduction.ipynb first to set up your environment.")

## 🔍 Code Review Automation Template

### Building a Comprehensive Code Review Prompt with a Multi-Tactic Stack

<div style="background:#fef3c7; border-left:4px solid #f59e0b; padding:16px; border-radius:6px; margin:20px 0; color:#000000;">
<strong style="color:#92400e;">🎯 What You'll Build in This Section</strong><br><br>

You'll create a **code review prompt template** that automatically checks code like an experienced engineer would. The template helps find bugs, security issues, and quality problems, and gives clear suggestions for fixing them.

**Time Required:** ~40 minutes (learning + examples + activity)
</div>

Layering tactics is the key to getting that level of rigor. Each block in the template leans on a different Module 2 technique so the model moves from context → reasoning → decision without dropping details. We'll call out those tactical touchpoints as you work through the section.

#### 🎯 The Problem We're Solving

Manual code reviews face three critical challenges:

1. **⏰ Time Bottlenecks** 
   - Senior engineers spend 8-12 hours/week reviewing PRs
   - Review queues delay feature delivery by 2-3 days on average
   - **Impact:** Slower velocity, frustrated developers

2. **🎯 Inconsistent Standards**
   - Different reviewers prioritize different concerns
   - New team members lack institutional knowledge
   - Review quality varies based on reviewer fatigue
   - **Impact:** Technical debt accumulates, security gaps emerge

3. **📝 Lost Knowledge**
   - Review reasoning buried in PR comments
   - No searchable audit trail for security decisions
   - Hard to train junior developers on review standards
   - **Impact:** Repeated mistakes, difficult compliance auditing


#### 🏗️ How We'll Build It: The Tactical Combination

We assemble this template by chaining together five Module 2 tactics. The table recaps what each tactic contributes and the callouts below map them to concrete sections of the prompt.

| **Tactic** | **Purpose in This Template** | **Why Modern LLMs Need This** |
|------------|------------------------------|-------------------------------|
| **Role Prompting** | Establishes "Senior Backend Engineer" perspective with specific expertise | LLMs respond better when given explicit expertise context rather than assuming generic knowledge |
| **Structured Inputs (XML)** | Separates code, context, and guidelines into clear sections | Prevents models from mixing different information types during analysis |
| **Task Decomposition** | Breaks review into 4 sequential steps (Think → Assess → Suggest → Verdict) | Advanced LLMs excel at following explicit numbered steps rather than implicit workflows |
| **Chain-of-Thought** | Makes reasoning visible in Analysis section | Improves accuracy by forcing deliberate analysis before conclusions |
| **Structured Output** | Uses readable markdown format with severity levels | Enables human readability while maintaining parseable structure for automation |


<div style="margin:16px 0; padding:16px; background:#eef2ff; border-left:5px solid #4338ca; border-radius:8px; color:#1f2937;">
<strong style="font-size:1.05em; color:#1e1b4b;">Choosing XML vs Markdown for Prompting LLMs</strong><br><br>
How well a format works varies by model and by the length and complexity of the prompt. Any notation the model parses accurately will do, so also weigh how easy the prompt is for humans to maintain.
<p style="margin:8px 0;">Pick the structure that keeps instructions crystal clear:</p>
<ul style="margin:8px 0 12px 18px; padding:0;">
  <li><strong>Match the model:</strong>
    <ul>
      <li>Claude is tuned for XML.</li>
      <li>GPT-4/5 works with XML or Markdown; try both in your workflow.</li>
      <li>Llama-class or other open models usually prefer XML on complex prompts.</li>
    </ul>
  </li>
  <li><strong>Match the prompt:</strong>
    <ul>
      <li>Short prompts can stay in Markdown. </li>
      <li>Multi-section prompts gain clarity from XML because each block (role, context, examples) is explicitly tagged.</li>
    </ul>
  </li>
  <li><strong>Match the stakes:</strong>
    <ul>
      <li>Markdown saves tokens for lightweight tasks. </li>
      <li>When accuracy matters more than cost, XML’s structure often pays off.</li>
    </ul>
  </li>
  <li><strong>Label everything:</strong> Whatever format you pick, clearly separate context, instructions, and examples—use descriptive tags in XML or consistent headings in Markdown.</li>
</ul>
</div>
<br>
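
To make the trade-off concrete, here is a minimal sketch that renders the same prompt sections in both notations. The helper names `render_xml` and `render_markdown` are hypothetical and are not part of `setup_utils`; treat this as one possible way to compare formats, not a recommended library.

```python
def render_xml(sections: dict[str, str]) -> str:
    """Wrap each section in a tag named after its key."""
    return "\n\n".join(
        f"<{name}>\n{body.strip()}\n</{name}>" for name, body in sections.items()
    )


def render_markdown(sections: dict[str, str]) -> str:
    """Give each section a heading derived from its key."""
    return "\n\n".join(
        f"## {name.replace('_', ' ').title()}\n{body.strip()}"
        for name, body in sections.items()
    )


# Illustrative content only; swap in your own sections.
sections = {
    "role": "Act as a Senior Software Engineer.",
    "code_diff": "+ print('hello')",
}

xml_prompt = render_xml(sections)
markdown_prompt = render_markdown(sections)

print(xml_prompt)
print("-" * 40)
print(markdown_prompt)
```

Keeping the sections in one dictionary lets you send both renderings to the same model and compare the reviews side by side.
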
<div style="margin-top:16px; color:#15803d; padding:12px; background:#dcfce7; border-radius:6px; border-left:4px solid #22c55e;">
<style>
code {
  font-family: Consolas,"courier new";
  color:rgb(238, 13, 13);
  background-color: #f1f1f1;
  padding: 2px;
  font-size: 110%;
}
</style>
<strong style="color:#166534;">🚀 Let's Build It!</strong><br><br>

In the next cell, you'll see the complete template structure. **Pay special attention to**:
- How we use explicit language to define severity levels (not "bad code" but "allows SQL injection")
- Why the markdown output format is more readable than XML while still being parseable
- How parameters like `{{tech_stack}}` and `{{change_purpose}}` make the template reusable across projects
- How the review dimensions (Security, Performance, Error Handling, etc.) ensure comprehensive analysis

After reviewing the template, you'll test it on real code and see how each tactic contributes to the result.
</div>


### 🤔 Why Templatize Prompts?

Templating turns good prompting habits into a repeatable system. Instead of rewriting long instructions, you:

- Swap in new repos, services, or code diffs with `{{variables}}`
- Guarantee every review covers the same dimensions and severity language
- Reduce drift as teammates inherit a proven prompt rather than inventing their own
- Make automation easy because the structure is predictable

Want a deeper dive? Claude's guidance on prompt templates and variables breaks down when to parameterise and how to organise reusable snippets: [Prompt templates & variables](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/prompt-templates-and-variables).
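
The substitution step itself can be sketched in a few lines. This is a minimal illustration that assumes placeholders use the `{{name}}` form shown in this section; `fill_template` is a hypothetical helper, and the substitution logic inside `setup_utils` may work differently.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template: str, variables: dict[str, str]) -> str:
    """Replace every {{name}} placeholder; fail loudly if a value is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(variables))
    if missing:
        raise KeyError(f"Missing template variables: {', '.join(missing)}")
    # A function replacement avoids backslash-escape surprises in the values.
    return PLACEHOLDER.sub(lambda m: variables[m.group(1)], template)


snippet = "Repository: {{repo_name}}\nChange Purpose: {{change_purpose}}"
filled = fill_template(
    snippet,
    {"repo_name": "analytics-platform", "change_purpose": "Add monthly report exporter"},
)
print(filled)
```

Raising on a missing variable is a deliberate choice: a prompt that still contains a literal `{{code_diff}}` tends to produce a confident review of nothing.
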




### 📋 Template Structure

Break the template into focused pieces so the model never has to parse a wall of XML. Each block is paired with the tactic that keeps it reliable.


**Where the tactics show up in the template:**

| Template Block | What It Does | Tactic Used |
| --- | --- | --- |
| **1. `<role>`** | Sets the reviewer persona (e.g., Senior Python Engineer) | Role Prompting |
| **2. `<context>`** | Provides repository, service, and change purpose | Structured Inputs |
| **3. `<code_diff>`** | Contains the code changes to review | Structured Inputs |
| **4. `<review_guidelines>`** | Lists what to check (security, performance, quality, etc.) | Task Decomposition |
| **5. `<tasks>`** | Guides the model: Think → Assess → Suggest → Verdict | Task Decomposition + Chain-of-Thought |
| **6. `<output_format>`** | Defines the structure: Summary → Findings → Verdict | Structured Output |

You can reuse this template for different projects by swapping variables like `{{repo_name}}`, `{{change_purpose}}`, or `{{tech_stack}}`. The review tactics and quality checks remain consistent across all uses.

````xml
<role>
Act as a Senior Software Engineer specializing in {{tech_stack}} backend services.
</role>

<context>
Repository: {{repo_name}}
Service: {{service_name}}
Change Purpose: {{change_purpose}}
Language: {{lang}}
</context>

<code_diff>
{{code_diff}}
</code_diff>

<review_guidelines>
Assess the change across:
1. Security (auth, data handling, injection)
2. Reliability and correctness
3. Performance and resource usage
4. Maintainability and readability
5. Observability and logging
</review_guidelines>

<tasks>
1. Think through the change and note risks.
2. Analyse the code against the review guidelines.
3. Suggest fixes with concrete recommendations.
4. Deliver a final verdict (approve, needs changes, block).
</tasks>

<output_format>
## Summary
[One paragraph that captures overall review stance.]

## Findings
### [SEVERITY] Issue Title
**Category:** [Security / Reliability / Performance / Maintainability / Observability]
**Line:** [line number]
**Issue:** [impact-focused description]
**Recommendation:**
```{{lang}}
# safer / faster / cleaner fix here
```

## Verdict
- Decision: [Approve / Needs Changes / Block]
- Rationale: [Why you chose this verdict]
</output_format>
````
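
Before reusing a template across projects, it can help to confirm that all six blocks are still present after someone edits it. The check below is a sketch under that assumption; `REQUIRED_BLOCKS` and `missing_blocks` are hypothetical names introduced here for illustration.

```python
import re

# The six blocks from the structure table above.
REQUIRED_BLOCKS = [
    "role",
    "context",
    "code_diff",
    "review_guidelines",
    "tasks",
    "output_format",
]


def missing_blocks(template: str) -> list[str]:
    """Return required blocks that lack a matching <tag>...</tag> pair."""
    return [
        name
        for name in REQUIRED_BLOCKS
        if not re.search(rf"<{name}>.*?</{name}>", template, flags=re.DOTALL)
    ]


# A deliberately incomplete draft for demonstration.
draft = """
<role>Act as a Senior Software Engineer.</role>
<context>Repository: {{repo_name}}</context>
<code_diff>{{code_diff}}</code_diff>
<tasks>1. Think through the change and note risks.</tasks>
"""

print(missing_blocks(draft))  # ['review_guidelines', 'output_format']
```
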


### 💻 Working Example: Comprehensive Review Walkthrough

Now let's see the template in action! We'll review a realistic code change: a monthly report exporter that touches database queries, caching, and S3 uploads.

**What to look for:**
- Each section of the prompt is marked with comments like `<!-- Block 1: Role -->`
- Match each block back to the structure table above
- Notice how the 6 blocks work together to produce a thorough review

Run the cell below to see the complete prompt and the model's response.


In [None]:
# Example: Comprehensive Code Review aligned with the six-block template
code_diff = '''
+ import json
+ import time
+ from decimal import Decimal
+
+ CACHE = {}
+
+ def generate_monthly_report(org_id, db, s3_client):
+     if org_id in CACHE:
+         return CACHE[org_id]
+
+     query = f"SELECT * FROM invoices WHERE org_id = '{org_id}' ORDER BY created_at DESC"
+     rows = db.execute(query)
+
+     total = Decimal(0)
+
+     items = []
+     for row in rows:
+         total += Decimal(row['amount'])
+         items.append({
+             'id': row['id'],
+             'customer': row['customer_name'],
+             'amount': float(row['amount'])
+         })
+
+     payload = {
+         'org': org_id,
+         'generated_at': time.strftime('%Y-%m-%d %H:%M:%S'),
+         'total': float(total),
+         'items': items
+     }
+
+     key = f"reports/{org_id}/{int(time.time())}.json"
+     time.sleep(0.5)
+     s3_client.put_object(
+         Bucket='company-reports',
+         Key=key,
+         Body=json.dumps(payload),
+         ACL='public-read'
+     )
+
+     CACHE[org_id] = key
+     return key
'''

messages = [
    {
        'role': 'system',
        'content': 'You follow structured review templates and produce clear, actionable findings.'
    },
    {
        'role': 'user',
        'content': f'''
<!-- Block 1: Role -->
<role>
Act as a Senior Software Engineer specializing in Python backend services.
Your expertise covers security best practices, performance tuning, reliability, and maintainable design.
</role>

<!-- Block 2: Context -->
<context>
Repository: analytics-platform
Service: Reporting API
Purpose: Add a monthly invoice report exporter that finance can trigger
Change Scope: Review focuses on the generate_monthly_report implementation
Language: python
</context>

<!-- Block 3: Code Diff -->
<code_diff>
{code_diff}
</code_diff>

<!-- Block 4: Review Guidelines -->
<review_guidelines>
Assess the change across multiple dimensions:
1. Security — SQL injection, S3 object exposure, sensitive data handling.
2. Performance — query efficiency, blocking calls, caching behaviour.
3. Error Handling — resilience to empty results, network/storage failures.
4. Code Quality — readability, global state, data conversions.
5. Correctness — totals, currency precision, repeated report generation.
6. Best Practices — configuration management, separation of concerns, testing hooks.
For each finding, cite the diff line, describe impact, and share an actionable fix.
</review_guidelines>

<!-- Block 5: Tasks -->
<tasks>
Step 1 - Think: Analyse the diff using the dimensions listed above.
Step 2 - Assess: For each issue, capture Severity (CRITICAL/MAJOR/MINOR/INFO), Category, Line, Issue, Impact.
Step 3 - Suggest: Provide a concrete remediation (code change or process tweak).
Step 4 - Verdict: Summarise overall risk and recommend APPROVE / REQUEST CHANGES / NEEDS WORK.
</tasks>

<!-- Block 6: Output Format -->
<output_format>
## Code Review Summary
[One paragraph on overall health and primary risks]

## Findings
### [SEVERITY] Issue Title
**Category:** [Security / Performance / Error Handling / Quality / Correctness / Best Practices]
**Line:** [line number]
**Issue:** [impact-focused description]
**Recommendation:**
```
# safer / faster / cleaner fix here
```

## Overall Assessment
**Recommendation:** [APPROVE | REQUEST CHANGES | NEEDS WORK]
**Summary:** [What to address before merge]
</output_format>
'''
    }
]

print('🔍 COMPREHENSIVE CODE REVIEW IN PROGRESS...')
print('=' * 70)
review_result = get_chat_completion(messages, temperature=0.0)
print(review_result)
print('=' * 70)
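
Because the output format is fixed, the review text can be turned into structured data for automation. The sketch below assumes the model followed the `### [SEVERITY] Issue Title` layout requested in Block 6. The function `parse_findings` is hypothetical, and `sample_review` is hand-written for illustration with invented line numbers; it is not real model output.

```python
import re

FINDING_HEADING = re.compile(
    r"^### \[?(CRITICAL|MAJOR|MINOR|INFO)\]?\s+(.+)$", re.MULTILINE
)
FIELD = re.compile(r"^\*\*(Category|Line|Issue):\*\*\s*(.+)$", re.MULTILINE)
NEXT_HEADING = re.compile(r"^#{2,3} ", re.MULTILINE)


def parse_findings(review: str) -> list[dict[str, str]]:
    """Extract severity, title, and labelled fields from each finding."""
    findings = []
    for match in FINDING_HEADING.finditer(review):
        # A finding runs until the next level-2 or level-3 heading.
        boundary = NEXT_HEADING.search(review, match.end())
        body = review[match.end(): boundary.start() if boundary else len(review)]
        finding = {"severity": match.group(1), "title": match.group(2).strip()}
        for key, value in FIELD.findall(body):
            finding[key.lower()] = value.strip()
        findings.append(finding)
    return findings


# Hand-written sample in the requested format (not real model output).
sample_review = """\
## Code Review Summary
The change has serious security gaps.

## Findings
### [CRITICAL] SQL injection in invoice query
**Category:** Security
**Line:** 11
**Issue:** org_id is interpolated directly into the SQL string.

### [MINOR] Artificial delay before upload
**Category:** Performance
**Line:** 33
**Issue:** time.sleep(0.5) blocks every request.

## Overall Assessment
**Recommendation:** REQUEST CHANGES
"""

findings = parse_findings(sample_review)
blocking = [f for f in findings if f["severity"] in {"CRITICAL", "MAJOR"}]

for f in findings:
    print(f["severity"], "|", f["category"], "|", f["title"])
print("Blocking findings:", len(blocking))
```

In the notebook you would pass `review_result` to `parse_findings` instead of the sample. Expect to loosen the patterns if your model decorates headings differently.
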


## 🎯 Guided Practice: Complete the API Rate Limiting Review

<div style="background:#e0f2fe; border-left:4px solid #0284c7; padding:16px; border-radius:6px; margin:20px 0; color:#000000;">
<strong style="color:#075985;">🏗️ Build with Partial Scaffolding</strong><br><br>

You just saw a complete worked example. Now practice with a **partially complete template** where you fill in the critical thinking parts. This helps you learn faster than building everything from scratch.

**Your Task:** Fill in the **⚠️ TODO** sections below.  
**Time:** 10-15 minutes
</div>

### The Scenario

An engineer implemented API rate limiting to prevent abuse. Review this code:

```python
+ import time
+ from collections import defaultdict
+
+ RATE_LIMITS = defaultdict(list)
+
+ def check_rate_limit(api_key, endpoint):
+     now = time.time()
+     
+     requests = RATE_LIMITS[api_key]
+     requests = [t for t in requests if now - t < 60]
+     
+     if len(requests) >= 100:
+         return {"allowed": False, "retry_after": 60 - (now - requests[0])}
+     
+     requests.append(now)
+     RATE_LIMITS[api_key] = requests
+     
+     return {"allowed": True}
```

**Complete the review template by filling in the ⚠️ TODO sections:**

In [None]:
# Guided Completion: API Rate Limiting Review
# Fill in the ⚠️ TODO sections below

rate_limit_code = '''
+ import time
+ from collections import defaultdict
+
+ RATE_LIMITS = defaultdict(list)
+
+ def check_rate_limit(api_key, endpoint):
+     now = time.time()
+     
+     requests = RATE_LIMITS[api_key]
+     requests = [t for t in requests if now - t < 60]
+     
+     if len(requests) >= 100:
+         return {"allowed": False, "retry_after": 60 - (now - requests[0])}
+     
+     requests.append(now)
+     RATE_LIMITS[api_key] = requests
+     
+     return {"allowed": True}
'''

# ✅ PROVIDED: Role and Context (Study these)
completion_template = f'''
<role>
Act as a Senior Software Engineer specializing in Python backend services with expertise in API security and distributed systems.
</role>

<context>
Repository: api-gateway
Service: Rate Limiting Middleware
Purpose: Add rate limiting to prevent API abuse
Language: python
</context>

<code_diff>
{rate_limit_code}
</code_diff>

<review_guidelines>
⚠️ TODO: Fill in what to check. Consider:
- What security risks does in-memory rate limiting have?
- What happens in distributed systems (multiple servers)?
- What performance issues could arise?
- What edge cases exist (clock skew, key collisions)?

Your answer:
1. Security — [YOUR CONCERNS HERE]
2. Reliability — [YOUR CONCERNS HERE]  
3. Performance — [YOUR CONCERNS HERE]
4. Distributed Systems — [YOUR CONCERNS HERE]
</review_guidelines>

<tasks>
1. Think through the change and note risks.
2. Analyse the code against the review guidelines you defined above.
3. Suggest fixes with concrete recommendations.
4. Deliver a final verdict (approve, needs work, block).
</tasks>

<output_format>
⚠️ TODO: Define your output structure. What severity levels will you use?
What categories? What should each finding include?

Your format:
## Summary
[YOUR FORMAT HERE]

## Findings
[YOUR FORMAT HERE]

## Verdict
[YOUR FORMAT HERE]
</output_format>
'''

# Run this after you've filled in the ⚠️ TODOs above
print("🧪 GUIDED COMPLETION: API RATE LIMITING REVIEW")
print("=" * 70)
print("📋 YOUR TEMPLATE (with ⚠️ TODOs filled in):")
print("=" * 70)
print(completion_template)
print("=" * 70)
print("\n🤖 AI REVIEW:")
print("=" * 70)

messages = [
    {"role": "system", "content": "You follow structured review templates and produce clear, actionable findings."},
    {"role": "user", "content": completion_template}
]

review = get_chat_completion(messages, temperature=0.0)
print(review)
print("=" * 70)
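
If you want a guard against sending a half-finished template, a small check like the one below can run before the model call. It is a sketch that assumes unfinished sections still contain the word `TODO` or a `[YOUR ... HERE]` stub, as in the scaffold above; `unfilled_markers` is a hypothetical helper.

```python
def unfilled_markers(template: str) -> list[str]:
    """Return lines that still contain a TODO marker or a [YOUR ... HERE] stub."""
    return [
        line.strip()
        for line in template.splitlines()
        if "TODO" in line or ("[YOUR" in line and "HERE]" in line)
    ]


# Illustrative draft: one item filled in, one still a stub.
draft = """
<review_guidelines>
1. Security: concerns written by the learner
2. Reliability: [YOUR CONCERNS HERE]
</review_guidelines>
"""

leftovers = unfilled_markers(draft)
if leftovers:
    print("Fill these in before running the review:")
    for line in leftovers:
        print(" -", line)
```
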

## 🏋️ Activity 3.2: Build Your Code Review Template

<div style="background:#dcfce7; border-left:4px solid #22c55e; padding:16px; border-radius:6px; margin:20px 0; color:#000000;">
<strong style="color:#166534;">🎯 Your Task:</strong> Create a production-ready code review template from scratch.<br><br>
<strong>Time:</strong> 30-40 minutes
</div>

### 📚 Your Learning Journey

**Step 1: Worked Example** ✅ - You studied the monthly report review  
**Step 2: Guided Completion** ✅ - You completed the API rate limiting template  
**Step 3: Independent Practice** ⬅️ **You are here** - Build an authentication review from scratch

### 📝 How to Complete This Activity

1. Open **[`activities/activity-3.2-code-review.md`](./activities/activity-3.2-code-review.md)**
2. Complete the template between the `<!-- TEMPLATE START -->` and `<!-- TEMPLATE END -->` markers
3. Return here and run the test cell below
4. Compare with **[solution](./solutions/activity-3.2-code-review-solution.md)** when done

> **💡 What You'll Build:** A multi-dimensional review template for authentication code that catches security, performance, and quality issues.

<div style="margin-top:16px; color:#991b1b; padding:12px; background:#fee2e2; border-radius:6px; border-left:4px solid #ef4444;">
<style>
code {
  font-family: Consolas,"courier new";
  color:rgb(238, 13, 13);
  background-color: #f1f1f1;
  padding: 2px;
  font-size: 110%;
}
</style>
<strong>⚠️ IMPORTANT:</strong> Complete your template in the activity file BEFORE running this!
<br><br>
<strong>Steps to complete first:</strong>
<ul style="margin: 8px 0 0 0;">
<li>Open <code>activities/activity-3.2-code-review.md</code></li>
<li>Replace all <code>&lt;!-- TODO: ... --&gt;</code> comments with your actual content</li>
<li>Fill in role, guidelines, tasks, and output format sections</li>
<li>Save the file, then come back and run the cell below</li>
</ul>
</div>

<div style="margin-top:16px; color:#78350f; padding:12px; background:#fef3c7; border-radius:6px; border-left:4px solid #f59e0b;">
<strong>💡 Model Quirk:</strong><br><br>
Sometimes the AI might start by quoting a line from your code (like <code>user = rows[0]</code>) before giving the actual review. This is normal model behavior and doesn't affect the quality of your results.
<br><br>
<strong>If this happens:</strong>
<ul style="margin: 8px 0 0 0;">
<li>Just ignore the quoted line - the rest of the review will be complete and properly formatted</li>
<li>Or re-run the cell - it might not happen the second time</li>
</ul>
</div>

In [None]:
# Test your Activity 3.2 template
from setup_utils import test_activity_3_2
# This is the vulnerable authentication code from the activity
test_code = """
+ import hashlib
+ import time
+
+ SESSION_CACHE = {}
+
+ def authenticate_user(db, username, password):
+     username = username or ""
+     password = password or ""
+
+     query = f"SELECT id, password_hash, failed_attempts FROM users WHERE username = '{username}'"
+     rows = db.execute(query)
+     user = rows[0]
+
+     hashed = hashlib.md5(password.encode()).hexdigest()
+
+     if hashed != user["password_hash"]:
+         db.execute(f"UPDATE users SET failed_attempts = {user['failed_attempts'] + 1} WHERE id = {user['id']}")
+         return {"status": "error"}
+
+     if username not in SESSION_CACHE:
+         SESSION_CACHE[username] = f"{user['id']}-{int(time.time())}"
+
+     permissions = []
+     for role in db.fetch_roles():
+         if db.has_role(user["id"], role["id"]):
+             permissions.append(role["name"])
+
+     time.sleep(0.5)
+     db.write_audit_entry(user["id"], username)
+
+     return {"status": "ok", "session": SESSION_CACHE[username], "permissions": permissions}
"""

# Run this to test your template from the activity file
test_activity_3_2(
    test_code=test_code,
    variables={
        'tech_stack': 'Python',
        'repo_name': 'user-auth-service',
        'service_name': 'Authentication API',
        'change_purpose': 'Add user login endpoint'
    }
)

# The function will:
# 1. Read your template from activities/activity-3.2-code-review.md
# 2. Substitute the variables
# 3. Send to the AI model
# 4. Display the results
# 5. Ask if you want to save results back to the activity file
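
The internals of `test_activity_3_2` are not shown in this section. As a rough mental model of steps 1 and 2, the sketch below pulls the text between the template markers and fills in the variables. The helpers `extract_template` and `substitute`, and the sample `activity_text`, are hypothetical and may differ from what `setup_utils` actually does.

```python
import re

MARKED_TEMPLATE = re.compile(
    r"<!-- TEMPLATE START -->(.*?)<!-- TEMPLATE END -->", re.DOTALL
)


def extract_template(markdown_text: str) -> str:
    """Return the text between the START and END markers."""
    match = MARKED_TEMPLATE.search(markdown_text)
    if match is None:
        raise ValueError("Template markers not found")
    return match.group(1).strip()


def substitute(template: str, variables: dict[str, str]) -> str:
    """Replace each {{name}} placeholder with its value."""
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template


# Stand-in for the contents of the activity file.
activity_text = """# Activity 3.2
<!-- TEMPLATE START -->
<role>
Act as a Senior Software Engineer specializing in {{tech_stack}} backend services.
</role>
<!-- TEMPLATE END -->
Notes below the template are ignored.
"""

prompt = substitute(extract_template(activity_text), {"tech_stack": "Python"})
print(prompt)
```
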

### 📚 Learn More: Advanced Code Review Patterns

Want to dive deeper into production code review automation?

**📖 AWS Anthropic Advanced Patterns:**
- [Code Review Command Pattern](https://github.com/aws-samples/anthropic-on-aws/blob/main/advanced-claude-code-patterns/commands/code-review.md) — Full prompt + workflow for automated reviews

**🔗 Related Best Practices:**
- [Claude 4 Prompt Engineering](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/claude-4-best-practices) — Guidance on structuring complex instructions
- [Prompt Templates & Variables](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/prompt-templates-and-variables) — When and how to parameterize prompts
- [OpenAI GPT-5 Prompting Guide](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide) — Latest guidance on tactic stacking and failure analysis for GPT-5 models


## ✅ Section 3.2 Complete!

<div style="margin-top:16px; padding:14px; background:#dcfce7; border-left:4px solid #22c55e; border-radius:6px; color:#065f46;">
<strong>🎉 Nice work!</strong> You've completed the Code Review Automation section.
</div>

**Key takeaways:**
- Combined Module 2 tactics into a reusable review template
- Built multi-dimensional reviews covering security, performance, and quality
- Learned to use parameterized templates with `{{variables}}`

### ⏭️ Next: Section 3.3 - Test Generation Automation

**[🚀 Open Section 3.3 now](./3.3-test-generation-automation.ipynb)** to learn how to generate comprehensive test specifications from requirements.

> You're halfway through the core material! Keep the momentum going.