# Section 3.2: Automate Code Review with Prompt Templates

| **Aspect** | **Details** |
|-------------|-------------|
| **Goal** | Build a reusable code review template that catches security, performance, and quality issues |
| **Time** | ~40 minutes |
| **Prerequisites** | Section 3.1 complete, `setup_utils.py` loaded |
| **What You'll Build** | Multi-dimensional review template with severity classification |
| **Next Steps** | Activity 3.2 → Section 3.3 (Test Generation) |

---

You mastered individual tactics in Module 2. Now you'll **combine them** into production templates that automate SDLC workflows. Start by building a comprehensive code review system that replaces ad-hoc reviews with structured, consistent feedback.

## Quick Setup Check

Since you completed Section 3.1, setup is already done. We just need to import it.


In [None]:
# Quick setup check - imports setup_utils
try:
    import importlib
    import setup_utils
    importlib.reload(setup_utils)
    from setup_utils import *
    print(f"Setup loaded! Using {get_provider().upper()} with {get_default_model()}")
    print("Ready to build code review templates!")
except ImportError:
    print("Setup not found!")
    print("Please run 3.1-setup-and-introduction.ipynb first to set up your environment.")

## Code Review Automation Template

### Building a Comprehensive Code Review Prompt with a Multi-Tactic Stack

<div style="background:#fef3c7; border-left:4px solid #f59e0b; padding:16px; border-radius:6px; margin:20px 0; color:#000000;">
<strong style="color:#92400e;">🎯 What You'll Build in This Section</strong><br><br>

You'll create a **code review prompt template** that automatically checks code like an experienced engineer would. The template will help you find bugs, security issues, and quality problems, and provide clear suggestions on how to fix them.

**Time Required:** ~40 minutes (learning + examples + activity)
</div>

---

### Your Learning Path

Here's how we'll get there (4 steps):

<div style="background:#fff; border:2px solid #e5e7eb; padding:20px; border-radius:8px; margin:20px 0; color:#000000;">

**Step 1: See It In Action** (Cells below)  
→ Run real code through two prompts: generic vs structured  
→ Compare results and spot the difference

---

**Step 2: Understand Why Templates Matter** (After examples)  
→ See how templates save time and ensure consistency  
→ Learn the 4 superpowers: Reusability, Consistency, Version Control, Onboarding

---

**Step 3: Learn the Structure** (Blueprint section)  
→ Break down the 6-block template you just saw  
→ Map each block to Module 2 tactics

---

**Step 4: Build Your Own** (Activity)  
→ Create a code review template from scratch  
→ Test it on real authentication code  
→ Get automated feedback and iterate

</div>

Combining multiple tactics is the key to getting the rigor of an experienced reviewer. Each block in the template applies a different Module 2 technique, and together they move the model from context → reasoning → decision without dropping details. We'll call out those tactical touchpoints as you work through the section.

---

#### 🤔 Quick Reflection: Your Code Review Experience

Before we dive in, take a moment to reflect on your own experience:

<div style="background:#e0f2fe; border-left:4px solid #0284c7; padding:16px; border-radius:6px; margin:16px 0; color:#000000;">
<strong style="color:#0c4a6e;">💭 Think about the last time you reviewed code or had your code reviewed:</strong><br><br>

**Question 1:** How long did it take? Was it hours? Days waiting in a queue?<br>
**Question 2:** Did different reviewers catch different things, or miss the same issues?<br>
**Question 3:** Could you easily find that review feedback later when you needed it?
</div>

---

#### The Problems We're Solving Together

Sound familiar? Let's connect these challenges to real scenarios you've probably experienced:

<div style="background:#fff; border:2px solid #e5e7eb; padding:16px; border-radius:8px; margin:16px 0; color:#000000;">

**1. ⏰ Time Bottlenecks**

<div style="margin-left:16px; padding:12px; background:#fef3c7; border-radius:6px; margin-top:8px; margin-bottom:16px;">
<strong>🔍 Spot This Pattern?</strong><br><br>
• It's 4pm Friday. You submit a PR for review.<br><br>
• Monday morning: still waiting.<br><br>
• Tuesday: "Looks good, just fix the SQL injection on line 47."<br><br>
• Wednesday: You fix it, re-submit, wait again…<br><br>
<strong>Result:</strong> A 10-minute fix became a 3-day delay.
</div>

**The time cost adds up:**
- Senior engineers often spend significant time just reviewing code each week
- PRs commonly wait days before getting first review
- **Your time:** How much faster could you ship if reviews took minutes instead of days?

---

**2. 🎯 Inconsistent Standards**

<div style="margin-left:16px; padding:12px; background:#fef3c7; border-radius:6px; margin-top:8px; margin-bottom:16px;">
<strong>🔍 Sound Familiar?</strong><br><br>
• Alex catches security issues but misses performance problems<br><br>
• Jordan focuses on style but overlooks error handling<br><br>
• New team members don't know what "good enough to merge" means<br><br>
<strong>Result:</strong> Every review is a lottery. Will they catch the critical bug or not?
</div>

**The consistency problem:**
- Different reviewers = different priorities
- Tired reviewers = missed issues
- No shared checklist = gaps in coverage
- **Ask yourself:** Has a critical bug ever made it to production because "someone else would catch it"?

---

**3. 📝 Lost Knowledge**

<div style="margin-left:16px; padding:12px; background:#fef3c7; border-radius:6px; margin-top:8px; margin-bottom:16px;">
<strong>🔍 Ever Been Here?</strong><br><br>
• "Why did we decide not to cache this API call?"<br><br>
• Searches through 47 closed PRs<br><br>
• Finds a comment from 8 months ago: "Memory leak in production - see incident #892"<br><br>
• That incident ticket is deleted<br><br>
<strong>Result:</strong> You're about to repeat the same mistake because the "why" disappeared.
</div>

**The knowledge drain:**
- Review decisions buried in PR comments
- No way to search: "Why didn't we use Redis here?"
- Hard to train new engineers on "lessons learned"
- **Challenge:** Try finding your team's security review decisions from 6 months ago. How long did it take?

</div>

---

#### Here's What You'll Build Today

By the end of this section, you'll have an automated code review template that:

✅ **Runs instantly** - No more waiting days for feedback<br>
✅ **Checks everything, every time** - Security, performance, error handling, quality (no gaps)<br>
✅ **Gives consistent feedback** - Same standards, every review, every developer<br>
✅ **Documents decisions** - Clear severity ratings and reasoning you can reference later<br>
✅ **Works across your whole team** - One template, infinite reuse

**Ready to build it?** Let's start with Step 1: See it in action. ⬇️

#### 🏗️ Learning By Example: Before & After

The best way to understand how tactics combine is to see them in action. Let's review the same piece of real code twice:
1. **Without tactics** - A generic prompt (what most people try first)
2. **With tactics** - A structured template (what professionals use)

You'll see exactly which problems each tactic solves.

---

### 💻 The Code We'll Review

Here's a realistic function with several issues - a monthly report generator that touches databases, caching, and S3:

```python
+ import json
+ import time
+ from decimal import Decimal
+
+ CACHE = {}
+
+ def generate_monthly_report(org_id, db, s3_client):
+     if org_id in CACHE:
+         return CACHE[org_id]
+
+     query = f"SELECT * FROM invoices WHERE org_id = '{org_id}' ORDER BY created_at DESC"
+     rows = db.execute(query)
+
+     total = Decimal(0)
+     items = []
+     for row in rows:
+         total += Decimal(row['amount'])
+         items.append({
+             'id': row['id'],
+             'customer': row['customer_name'],
+             'amount': float(row['amount'])
+         })
+
+     payload = {
+         'org': org_id,
+         'generated_at': time.strftime('%Y-%m-%d %H:%M:%S'),
+         'total': float(total),
+         'items': items
+     }
+
+     key = f"reports/{org_id}/{int(time.time())}.json"
+     time.sleep(0.5)
+     s3_client.put_object(
+         Bucket='company-reports',
+         Key=key,
+         Body=json.dumps(payload),
+         ACL='public-read'
+     )
+
+     CACHE[org_id] = key
+     return key
```

<div style="background:#fef3c7; border-left:4px solid #f59e0b; padding:16px; border-radius:6px; margin:16px 0; color:#000000;">
<strong>🕵️ Quick Challenge:</strong> Before scrolling down, scan the code above. What issues do you spot?<br><br>

Think about:
- **Security:** Any injection risks or exposed data?
- **Performance:** Blocking operations or unbounded caching?
- **Reliability:** What happens if the database is empty or S3 fails?

<em>Keep your guesses in mind - we'll see how different prompts catch (or miss) these issues!</em>
</div>

**Ready to see how prompting quality affects review quality?** Run the cells below! ⬇️

In [None]:
# =============================================================================
# APPROACH 1: ❌ GENERIC PROMPT (No Tactics)
# =============================================================================
# This is what most developers try first - a simple, unstructured request

code_to_review = '''
+ import json
+ import time
+ from decimal import Decimal
+
+ CACHE = {}
+
+ def generate_monthly_report(org_id, db, s3_client):
+     if org_id in CACHE:
+         return CACHE[org_id]
+
+     query = f"SELECT * FROM invoices WHERE org_id = '{org_id}' ORDER BY created_at DESC"
+     rows = db.execute(query)
+
+     total = Decimal(0)
+     items = []
+     for row in rows:
+         total += Decimal(row['amount'])
+         items.append({
+             'id': row['id'],
+             'customer': row['customer_name'],
+             'amount': float(row['amount'])
+         })
+
+     payload = {
+         'org': org_id,
+         'generated_at': time.strftime('%Y-%m-%d %H:%M:%S'),
+         'total': float(total),
+         'items': items
+     }
+
+     key = f"reports/{org_id}/{int(time.time())}.json"
+     time.sleep(0.5)
+     s3_client.put_object(
+         Bucket='company-reports',
+         Key=key,
+         Body=json.dumps(payload),
+         ACL='public-read'
+     )
+
+     CACHE[org_id] = key
+     return key
'''

generic_messages = [
    {
        "role": "user",
        "content": f"Please review this Python code and tell me if there are any issues:\n\n{code_to_review}"
    }
]

print("=" * 80)
print("❌ APPROACH 1: GENERIC PROMPT (What most people try)")
print("=" * 80)
print("\nPROMPT:")
print(generic_messages[0]["content"])
print("\n" + "-" * 80)
print("AI RESPONSE:")
print("-" * 80 + "\n")

generic_response = get_chat_completion(generic_messages, temperature=0.0)
print(generic_response)

print("\n" + "=" * 80)
print("❌ PROBLEMS WITH THIS APPROACH:")
print("=" * 80)
print("• Vague severity - No clear CRITICAL vs MINOR distinction")
print("• Inconsistent coverage - Might miss security or performance on different runs")
print("• Generic feedback - 'Could be improved' instead of specific fixes")
print("• No structure - Hard to parse for automation or tracking")
print("\n💡 Let's see how adding tactics fixes these issues... scroll down! ⬇️")

In [None]:
# =============================================================================
# APPROACH 2: ✅ STRUCTURED PROMPT WITH TACTICS
# =============================================================================
# This demonstrates how combining Module 2 tactics creates systematic reviews:
# - Role Prompting: Sets expert perspective
# - Structured Inputs (XML): Clear boundaries between sections
# - Task Decomposition: Systematic checklist of review dimensions
# - Chain-of-Thought: Forces visible reasoning steps
# - Structured Output: Consistent severity classification

structured_prompt = f'''
<role>
Act as a Senior Software Engineer specializing in Python backend services.
Your expertise covers security best practices, performance tuning, reliability, and maintainable design.
</role>

<context>
Repository: analytics-platform
Service: Reporting API
Purpose: Add monthly invoice report exporter for finance team
Change Scope: New generate_monthly_report function
Language: python
</context>

<code_diff>
{code_to_review}
</code_diff>

<review_guidelines>
Assess the change across multiple dimensions:

1. Security - SQL injection, S3 object exposure, sensitive data handling
2. Performance - Query efficiency, blocking calls, caching behavior
3. Error Handling - Resilience to empty results, network failures, storage failures
4. Code Quality - Readability, global state usage, data type conversions
5. Correctness - Calculation accuracy, currency precision, idempotency
6. Best Practices - Configuration management, separation of concerns, testability

For each finding, cite the line number, describe business impact, and provide an actionable fix.
</review_guidelines>

<tasks>
Step 1 - Think: Analyze the code using each dimension from review_guidelines
Step 2 - Assess: For each issue found, determine Severity (CRITICAL/MAJOR/MINOR/INFO) and Category
Step 3 - Suggest: Provide concrete remediation with code examples
Step 4 - Verdict: Summarize overall risk and recommend APPROVE / REQUEST CHANGES / BLOCK
</tasks>

<output_format>
## Code Review Summary
Write one paragraph summarizing overall assessment and key risks

## Findings
For each issue, use this structure:

### {{SEVERITY}} Issue Title
**Category:** Security / Performance / Reliability / Quality / Correctness / Best Practices
**Line:** Cite specific line number
**Issue:** Describe the impact in business terms
**Recommendation:**
```python
# Provide concrete fix with code example
```

## Overall Assessment
**Recommendation:** APPROVE | REQUEST CHANGES | BLOCK
**Summary:** Explain what must be addressed before merge
</output_format>
'''

structured_messages = [
    {
        "role": "system",
        "content": "You follow structured review templates and produce clear, actionable findings."
    },
    {
        "role": "user",
        "content": structured_prompt
    }
]

print("=" * 80)
print("✅ APPROACH 2: STRUCTURED PROMPT WITH TACTICS")
print("=" * 80)
print("\n" + "-" * 80)
print("AI RESPONSE:")
print("-" * 80 + "\n")

structured_response = get_chat_completion(structured_messages, temperature=0.0)
print(structured_response)

print("\n💡 Scroll down to see how each tactic contributed to this comprehensive review!")

---

### Side-by-Side Comparison: What Each Tactic Fixed

Now that you've seen both approaches in action, here's what each tactic contributed:

<table style="width:100%; border-collapse: collapse; margin:16px 0; background:#fff; border:2px solid #e5e7eb; color:#000000;">
<tr style="background:#f8fafc; font-weight:bold; color:#000000;">
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000; width:25%;">Tactic</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000; width:25%;">Problem</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000; width:50%;">What Changed</td>
</tr>
<tr>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">🎭 Role Prompting</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">Generic reviews</td>
<td style="padding:8px; border:1px solid #e5e7eb; background:#ecfdf5; color:#000000;">
Expert insights on security, performance, maintainability
</td>
</tr>
<tr>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">📦 Structured Inputs</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">AI mixes instructions with content</td>
<td style="padding:8px; border:1px solid #e5e7eb; background:#ecfdf5; color:#000000;">
Clear separation → AI focuses on the right parts
</td>
</tr>
<tr>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">🔢 Task Decomposition</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">Incomplete analysis</td>
<td style="padding:8px; border:1px solid #e5e7eb; background:#ecfdf5; color:#000000;">
Checks all 6 dimensions every time
</td>
</tr>
<tr>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">🧠 Chain-of-Thought</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">Vague feedback</td>
<td style="padding:8px; border:1px solid #e5e7eb; background:#ecfdf5; color:#000000;">
Reasoning: Think → Assess → Suggest
</td>
</tr>
<tr>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">📊 Structured Output</td>
<td style="padding:8px; border:1px solid #e5e7eb; color:#000000;">Inconsistent severity</td>
<td style="padding:8px; border:1px solid #e5e7eb; background:#ecfdf5; color:#000000;">
Format with CRITICAL/MAJOR/MINOR tags
</td>
</tr>
</table>

**💡 Key Insight:** Each tactic removes one type of failure. Combine them strategically, and you get production-ready reliability.

---

### 🤔 Why Turn This Into a Template?

You just saw the structured prompt produce better results. But here's the real power move: **making it reusable across your entire team and all your projects**.

Think about the last time you crafted a great prompt, then needed it again weeks later and couldn't remember exactly how you phrased it. Or when a teammate asked, "How did you get such good results?" and you had to explain it all over again.

That's the problem templates solve.

---

#### From One-Off Prompts to Reusable Systems

**Scenario: Your team adopts AI code reviews**

<div>
<table style="width:100%; margin-top:12px; border-collapse:collapse; border:2px solid #e5e7eb;">
<tr>
<td style="width:50%; padding:12px; vertical-align:top; border:2px solid #e5e7eb; background:#fef2f2; color:#000000;">
<strong style="color:#991b1b;">❌ Without Templates</strong><br><br>

<strong style="color:#000000;">Week 1:</strong> Craft prompt (3 tries)<br>
<span style="color:#4b5563;">→ Takes 20 minutes</span><br><br>

<strong style="color:#000000;">Week 2:</strong> Send via Slack<br>
<span style="color:#4b5563;">→ They miss XML tags</span><br><br>

<strong style="color:#000000;">Week 3:</strong> Can't find original<br>
<span style="color:#4b5563;">→ Rewrite from memory</span><br><br>

<strong style="color:#000000;">Week 4:</strong> Three versions<br>
<span style="color:#4b5563;">→ Inconsistent reviews</span><br><br>

<strong style="color:#991b1b;">Result:</strong> Reinvent the wheel
</td>
<td style="width:50%; padding:12px; vertical-align:top; border:2px solid #e5e7eb; background:#ecfdf5; color:#000000;">
<strong style="color:#166534;">✅ With Templates</strong><br><br>

<strong style="color:#000000;">Week 1:</strong> Create template<br>
<span style="color:#4b5563;">→ 20 minutes once</span><br><br>

<strong style="color:#000000;">Week 2:</strong> Run command<br>
<span style="color:#4b5563;">→ 30 seconds</span><br><br>

<strong style="color:#000000;">Week 3:</strong> Swap variables<br>
<span style="color:#4b5563;">→ Instant review</span><br><br>

<strong style="color:#000000;">Week 4:</strong> Same template<br>
<span style="color:#4b5563;">→ Consistent standards</span><br><br>

<strong style="color:#166534;">Result:</strong> Reuse forever
</td>
</tr>
</table>
</div>

---

#### The Four Superpowers of Templates

Templates aren't just saved text. They capture your review knowledge systematically and come with built-in advantages:

**1. 🔄 Reusability via Variables**

```python
# Instead of this every time:
"Review Python code in the auth service..."

# Use this once with placeholders:
"Review {{tech_stack}} in {{service}}..."
```

**Benefit:** One template → infinite projects

---

**2. 📏 Guaranteed Consistency**

Without templates: Reviews vary by reviewer  
With templates: **Same 6 dimensions every time**

Security → Performance → Error Handling → Quality → Correctness → Best Practices

---

**3. 🧬 Version Control + Team Collaboration**

**Real scenario:** You discover severity labels need improvement

**Without templates:** Email team, hope everyone updates  
**With templates:** One Git commit, everyone gets the fix

*Bonus: Git history documents why each change was made*

---

**4. 🎯 Instant Onboarding**

**Traditional:** "Watch me for a few weeks, then try..."  
**With templates:** "Run this command with your code."

New engineers are productive immediately, not eventually.

---

### Breaking Down the Template: A Walkthrough

Let's dissect the structured prompt you saw in the example above. We'll walk through each block and see how it uses Module 2 tactics:

---

#### Block 1: 🎭 Set the Expert Persona

**What it looked like:**
```xml
<role>
Act as a Senior Software Engineer specializing 
in Python backend services. Your expertise covers 
security, performance, reliability, and design.
</role>
```

**What this does:** Primes the model to respond the way a senior software engineer would. Instead of generic feedback, you get the kind of insights a seasoned engineer would catch.

**Module 2 Tactic:** Role Prompting

---

#### Block 2: 📦 Provide Context (What to Analyze)

**What it looked like:**
```xml
<context>
Repository: analytics-platform
Service: Reporting API
Purpose: Add monthly invoice exporter
</context>

<code_diff>
+ import json
+ def generate_monthly_report(org_id, db, s3):
  ...
</code_diff>
```

**What this does:** Separates "what you're reviewing" from "how to review it." The AI knows the code belongs to a reporting service (so performance matters) and that it's for finance (so correctness and security are critical).

**Module 2 Tactic:** Structured Inputs (XML)

---

#### Block 3: 🔢 Define Review Dimensions (What to Check)

**What it looked like:**
```xml
<review_guidelines>
Assess across multiple dimensions:

1. Security - SQL injection, S3 exposure
2. Performance - Query efficiency, blocking
3. Error Handling - Resilience to failures
4. Code Quality - Readability, global state
5. Correctness - Calculation accuracy
6. Best Practices - Configuration, testing
</review_guidelines>
```

**What this does:** Creates a systematic checklist. Every review checks ALL 6 dimensions, so there's no more "Alex catches security but misses performance" inconsistency.

**Module 2 Tactic:** Task Decomposition

---

#### Block 4: 🧠 Guide the Reasoning Process (How to Analyze)

**What it looked like:**
```xml
<tasks>
Step 1 - Think: Analyze using each dimension
Step 2 - Assess: Determine Severity/Category
Step 3 - Suggest: Provide concrete remediation
Step 4 - Verdict: Recommend action
</tasks>
```

**What this does:** Forces the model to show its work. You get specific findings with evidence and reasoning.

**Module 2 Tactic:** Chain-of-Thought

---

#### Block 5: 📊 Specify Output Format (How to Report)

**What it looked like:**
````xml
<output_format>
## Code Review Summary
Write one paragraph summarizing overall assessment and key risks

## Findings
For each issue, use this structure:

### {{SEVERITY}} Issue Title
**Category:** Security / Performance / Reliability / Quality / Correctness / Best Practices
**Line:** Cite specific line number
**Issue:** Describe the impact in business terms
**Recommendation:**
```python
# Provide concrete fix with code example
```

## Overall Assessment
**Recommendation:** APPROVE | REQUEST CHANGES | BLOCK
**Summary:** Explain what must be addressed before merge
</output_format>
````

**What this does:** Standardizes the output so it's machine-parseable. Your CI/CD can count CRITICAL issues, block merges, or post formatted comments.

**Module 2 Tactic:** Structured Output
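
To make the "machine-parseable" claim concrete, here is a minimal sketch of how a CI step might count findings by severity. It assumes the model followed the `### SEVERITY Issue Title` heading convention from `<output_format>` exactly; real responses can drift from the format, so treat the regex as a starting point. The helper names `count_findings` and `should_block_merge` are hypothetical and not part of `setup_utils`.

```python
import re
from collections import Counter

# Severity labels taken from the <tasks> block of the template.
SEVERITIES = ("CRITICAL", "MAJOR", "MINOR", "INFO")

# Matches headings such as "### CRITICAL SQL injection in query".
# Assumption: one finding per level-3 heading, severity first.
FINDING_HEADING = re.compile(
    r"^###[ \t]+\[?(CRITICAL|MAJOR|MINOR|INFO)\]?[ \t:\-]+(.+?)[ \t]*$",
    re.MULTILINE,
)


def count_findings(review_text: str) -> Counter:
    """Count findings per severity in a review that follows <output_format>."""
    counts = Counter({severity: 0 for severity in SEVERITIES})
    for match in FINDING_HEADING.finditer(review_text):
        counts[match.group(1)] += 1
    return counts


def should_block_merge(review_text: str, max_critical: int = 0) -> bool:
    """Hypothetical gate: block when CRITICAL findings exceed the threshold."""
    return count_findings(review_text)["CRITICAL"] > max_critical


# Hand-written sample in the template's format (not real model output).
sample_review = """## Findings
### CRITICAL SQL injection in query construction
**Category:** Security
### MAJOR Blocking sleep before upload
**Category:** Performance
### CRITICAL Public-read ACL on invoice report
**Category:** Security
"""

counts = count_findings(sample_review)
print(dict(counts))
print("Block merge:", should_block_merge(sample_review))
```

Keeping the severity list in one place means the parser and the template can be updated together if you rename or add a level.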

---

#### Block 6: 🔄 Making It Reusable

**Add variables** for the parts that change between projects:

```xml
<context>
Repository: {{repo_name}}
Service: {{service_name}}
Purpose: {{change_purpose}}
Language: {{tech_stack}}
</context>
```

Now you can use the same template across your organization: swap the variables and the rest of the prompt stays unchanged.
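
As an illustration of the substitution step, the sketch below fills `{{name}}` placeholders and fails loudly when a variable is missing. The function name `fill_template` is hypothetical; the notebook's own helpers may handle substitution differently (for example with plain `str.replace`, which silently leaves unknown placeholders in place).

```python
import re

# Matches {{name}} with optional inner whitespace, e.g. {{ repo_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template: str, variables: dict) -> str:
    """Replace {{name}} placeholders; raise if any placeholder has no value."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if name not in variables}
    )
    if missing:
        raise KeyError(f"Missing template variables: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


context_block = """<context>
Repository: {{repo_name}}
Service: {{service_name}}
Purpose: {{change_purpose}}
Language: {{tech_stack}}
</context>"""

filled = fill_template(context_block, {
    "repo_name": "user-auth-service",
    "service_name": "Authentication API",
    "change_purpose": "Add user login endpoint",
    "tech_stack": "Python",
})
print(filled)
```

Raising on a missing variable is a design choice: it trades convenience for the guarantee that a half-filled prompt never reaches the model. If your template keeps literal double braces on purpose (such as a `{{SEVERITY}}` marker in the output format), you would need to exclude those names from the check.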

---

### 🎨 A Quick Note on Formatting

You might have noticed we used XML tags (`<role>`, `<context>`) instead of Markdown. Here's why:

**The key principle:** Use **clear labels** so the AI never has to guess whether text is "an instruction to follow" or "content to analyze."

**Bad (ambiguous):**
```
You are an expert. Here's code: [code] Check it.
```
→ Where do instructions end and code begin?

**Good (labeled):**
```xml
<role>You are an expert engineer.</role>
<code_to_review>[code block]</code_to_review>
<instructions>Check for issues.</instructions>
```
→ Clear boundaries, zero confusion.

**XML vs Markdown?** Both work! XML gives stricter separation (good for complex prompts with 6+ sections). Markdown is fine for simpler prompts. The key is: **label your sections clearly**, whatever format you choose.
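
One way to keep labels consistent is to assemble the prompt from named sections in code rather than by hand. The sketch below is only an illustration: `build_prompt` is a hypothetical helper, and the tag names simply mirror the "Good (labeled)" example above.

```python
def build_prompt(sections: dict) -> str:
    """Wrap each section in an XML-style tag named after its key."""
    blocks = []
    for tag, body in sections.items():
        # Reject keys that would produce malformed tags (spaces, symbols).
        if not tag.isidentifier():
            raise ValueError(f"Invalid section tag: {tag!r}")
        # Strip only surrounding newlines so code indentation is preserved.
        blocks.append(f"<{tag}>\n{body.strip(chr(10))}\n</{tag}>")
    return "\n\n".join(blocks)


prompt = build_prompt({
    "role": "You are an expert engineer.",
    "code_to_review": "def add(a, b):\n    return a + b",
    "instructions": "Check for issues.",
})
print(prompt)
```

Because Python dictionaries preserve insertion order, the sections appear in the prompt in the order you list them.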

---

### How Will You Use This?

The example above showed a code review in a Python script. But where does this fit in your real workflow?

**💭 Common Integration Patterns:**

**Option 1: IDE Integration** - Use in your code editor  
Save the template as a custom slash command. Highlight code → Run → Get an instant review. Examples: Claude Code, GitHub Copilot, OpenAI Codex.

**Option 2: GitLab/GitHub CI/CD** - Automated reviews  
Add to your CI pipeline. Runs on every merge request, posts findings as comments, and can block merges when CRITICAL findings are present.

**Option 3: Pre-commit Hook** - Local validation  
Runs when you run `git commit`. Fast feedback (catches issues in seconds), no CI needed.

**Option 4: Custom Slack Bot / CLI** - Team tool  
Run commands in the terminal or Slack. Great for ad-hoc reviews during pairing.
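
To give a feel for Option 3, here is a rough sketch of the control flow a pre-commit hook might use: collect the staged diff, pass it to a reviewer, and return a non-zero exit code when a CRITICAL finding appears. Everything here is an assumption for illustration: `run_hook`, `staged_diff`, and `fake_review` are hypothetical names, and `fake_review` is a stand-in for a call that fills your template and sends it to the model.

```python
import subprocess


def staged_diff() -> str:
    """Return the staged changes as unified diff text (requires git on PATH)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=3"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def run_hook(diff_text: str, review_fn) -> int:
    """Return a process exit code: 0 allows the commit, 1 rejects it.

    review_fn is a hypothetical callable that takes the diff text and
    returns the review text in the template's output format.
    """
    if not diff_text.strip():
        return 0  # nothing staged, nothing to review
    review = review_fn(diff_text)
    print(review)
    has_critical = any(
        line.startswith("### CRITICAL") for line in review.splitlines()
    )
    return 1 if has_critical else 0


def fake_review(diff_text: str) -> str:
    """Stand-in reviewer used only to demonstrate the control flow."""
    if "ACL='public-read'" in diff_text:
        return "### CRITICAL Report uploaded with a public-read ACL"
    return "No findings."


# Demo with a literal diff so the example runs without a git repository.
demo_diff = "+     ACL='public-read'\n"
exit_code = run_hook(demo_diff, fake_review)
print("exit code:", exit_code)

# In a real hook script you would end with something like:
#   sys.exit(run_hook(staged_diff(), your_review_fn))
```

Passing the reviewer in as a function keeps the hook logic testable without network access or a model call.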

---

**🧪 For This Tutorial:** We're focusing on **building the template** itself. Integration with your specific tools (GitLab, IDE, etc.) comes after, but the template you build here works across all platforms!

---

#### Learn How to Use These Prompts in Claude Code

If you want to learn how these prompts can be used within Claude Code, check out these resources:

**📚 Workshop Materials:**
- [Workshop 101: Building Your AI Muscle](https://splunk.atlassian.net/wiki/spaces/PROD/pages/1079048930218/Workshop+101+Building+Your+AI+Muscle)
- [Workshop 201: Leverage Advanced Features of Claude Code](https://splunk.atlassian.net/wiki/spaces/PROD/pages/1079081894117/Workshop+201+Leverage+Advanced+Features+of+Claude+Code)

**🎬 Video Tutorials:**
- [Workshop 101: Claude Code - Building Your AI Muscle](https://splunk.plusplus.app/a/videos/53139595-b224-4872-ba62-6c37cb0e2a10_workshop-101-claude-code-building-your-ai-muscle)
- [Workshop 201: Leverage Advanced Features of Claude Code](https://splunk.plusplus.app/a/videos/a937db9d-732e-4f8b-a4bb-fe2d44ab8636_workshop-201-leverage-advanced-features-of-claude-code)

### Ready for Practice?

<div style="margin-top:16px; color:#15803d; padding:12px; background:#dcfce7; border-radius:6px; border-left:4px solid #22c55e;">
<style>
code {
  font-family: Consolas,"courier new";
  color:rgb(238, 13, 13);
  background-color: #f1f1f1;
  padding: 2px;
  font-size: 110%;
}
</style>
<strong style="color:#166534;">📍 You Are Here:</strong><br><br>

✅ Saw the before/after comparison (generic vs structured prompt)<br>
✅ Identified which tactic fixes which problem<br>
✅ Understood how to integrate this into your workflow (IDE, GitLab CI, pre-commit)<br>
✅ Learned why templates make this reusable across projects<br>
✅ Reviewed the 6-block template structure<br>
⏭️ <strong>Next:</strong> Try building your own template from scratch<br><br>

**Ready to apply what you learned?** Continue below for guided practice! ⬇️
</div>

---

## Activity 3.2: Build Your Own Template

Now that you've seen how the template works, try building one from scratch.

**Your task:** Create a code review template for authentication code.

Open **[`activities/activity-3.2-code-review.md`](./activities/activity-3.2-code-review.md)** and complete the template between the `<!-- TEMPLATE START -->` and `<!-- TEMPLATE END -->` markers.

The template should:
- Set an appropriate role (e.g., Senior Backend Engineer specializing in API security)
- Include context about the code being reviewed
- Define review dimensions (Security, Performance, Error Handling, etc.)
- Guide the model through analysis steps
- Specify a structured output format with severity levels

When you're done, come back and run the cell below to test it. Compare your result with the **[solution](./solutions/activity-3.2-code-review-solution.md)** afterward.

In [None]:
# Test your Activity 3.2 template
from setup_utils import test_activity_3_2
# This is the vulnerable authentication code from the activity
test_code = """
+ import hashlib
+ import time
+
+ SESSION_CACHE = {}
+
+ def authenticate_user(db, username, password):
+     username = username or ""
+     password = password or ""
+
+     query = f"SELECT id, password_hash, failed_attempts FROM users WHERE username = '{username}'"
+     rows = db.execute(query)
+     user = rows[0]
+
+     hashed = hashlib.md5(password.encode()).hexdigest()
+
+     if hashed != user["password_hash"]:
+         db.execute(f"UPDATE users SET failed_attempts = {user['failed_attempts'] + 1} WHERE id = {user['id']}")
+         return {"status": "error"}
+
+     if username not in SESSION_CACHE:
+         SESSION_CACHE[username] = f"{user['id']}-{int(time.time())}"
+
+     permissions = []
+     for role in db.fetch_roles():
+         if db.has_role(user["id"], role["id"]):
+             permissions.append(role["name"])
+
+     time.sleep(0.5)
+     db.write_audit_entry(user["id"], username)
+
+     return {"status": "ok", "session": SESSION_CACHE[username], "permissions": permissions}
"""

# Run this to test your template from the activity file
test_activity_3_2(
    test_code=test_code,
    variables={
        'tech_stack': 'Python',
        'repo_name': 'user-auth-service',
        'service_name': 'Authentication API',
        'change_purpose': 'Add user login endpoint'
    }
)

# The function will:
# 1. Read your template from activities/activity-3.2-code-review.md
# 2. Substitute the variables
# 3. Send to the AI model
# 4. Display the results
# 5. Ask if you want to save results back to the activity file

---

#### Evaluate Your Prompt Quality

<div style="background:#f0f9ff; border-left:4px solid #0ea5e9; padding:16px; border-radius:6px; margin:20px 0; color:#000000;">
<strong style="color:#0c4a6e;">🎓 Want Detailed Feedback on Your Template?</strong><br><br>

The <code style="color:#dc2626; background-color:#f1f1f1; padding:2px; font-family:Consolas,'courier new';">evaluate_prompt()</code> function provides comprehensive automated feedback by combining three evaluation methods:

**1. Traditional Metrics (40%)** - Objective pattern detection:
- ✅ Detects XML structure, role prompting, chain-of-thought keywords
- ✅ Fast and deterministic

**2. Quality Assessment (40%)** - AI-powered quality evaluation:
- 🎯 Evaluates HOW WELL each tactic is implemented (Quality Score: 0-10)
- 🎯 Includes confidence levels showing evaluation certainty (Confidence: 0-100%)
- 🎯 Provides specific improvement suggestions

**3. Semantic Similarity (20%)** - Comparison with reference solution:
- 📚 Shows gaps in role definition, guidelines, and structure

---

#### What is the Confidence Score?

Think of **Confidence** as "How sure is the AI about its quality rating?"

**Example Evaluation:**


**Quick Decision Guide:**

| Confidence | Quality | What It Means |
|------------|---------|---------------|
| High (≥80%) | High (≥8/10) | 🟢 **Great work!** Trust this feedback |
| High (≥80%) | Low (<7/10) | 🔴 **Fix this!** Clear issue identified |
| Low (<70%) | Any | 🟡 **Unclear** - Consider human review |

**Your Overall Confidence Score** = Average confidence across all tactics
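
The arithmetic behind these numbers can be sketched as follows. This is not the implementation of `evaluate_prompt()`; it only illustrates the 40/40/20 weighting, the averaging of per-tactic confidence, and the thresholds from the Quick Decision Guide. The function names, the sample scores, and the assumption that all three components share a 0-10 scale are hypothetical.

```python
# Weights from the three evaluation methods described above.
WEIGHTS = {"traditional": 0.40, "quality": 0.40, "similarity": 0.20}


def overall_score(component_scores: dict) -> float:
    """Weighted average of component scores (assumed to share a 0-10 scale)."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


def overall_confidence(tactic_confidences: dict) -> float:
    """Plain average of per-tactic confidence percentages."""
    return sum(tactic_confidences.values()) / len(tactic_confidences)


def interpret(confidence: float, quality: float) -> str:
    """Apply the Quick Decision Guide thresholds."""
    if confidence < 70:
        return "Unclear - consider human review"
    if confidence >= 80 and quality >= 8:
        return "Great work - trust this feedback"
    if confidence >= 80 and quality < 7:
        return "Fix this - clear issue identified"
    return "Not covered by the guide"


# Made-up numbers purely for illustration.
scores = {"traditional": 8.0, "quality": 7.0, "similarity": 6.0}
confidences = {
    "Role Prompting": 90,
    "Structured Inputs": 85,
    "Chain-of-Thought": 65,
}

print("Overall score:", round(overall_score(scores), 2))
print("Overall confidence:", overall_confidence(confidences))
print("Verdict:", interpret(overall_confidence(confidences), scores["quality"]))
```

Note that the guide leaves some combinations unspecified (for example, confidence between 70% and 80%), which the sketch reports explicitly instead of guessing.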

---

**💡 When to Use Evaluation:**
- After completing your template (get instant feedback)
- Before checking the solution (understand your gaps)
- To track improvement over multiple attempts
</div>
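
As a rough sketch of the arithmetic described in the box above: the 40/40/20 weights, the averaging of per-tactic confidence, and the thresholds in the Quick Decision Guide come from this section. The function names, the 0-100 scale for each component, and the handling of values the guide does not cover (for example 75% confidence) are assumptions. `evaluate_prompt()` may compute its scores differently.

```python
from statistics import mean

# Weights as stated in this section; the 0-100 component scale is an assumption.
WEIGHTS = {"traditional": 0.40, "quality": 0.40, "similarity": 0.20}


def combined_score(components: dict[str, float]) -> float:
    """Weighted sum of the three component scores (each assumed to be 0-100)."""
    return sum(weight * components[name] for name, weight in WEIGHTS.items())


def overall_confidence(per_tactic: dict[str, float]) -> float:
    """Average confidence (0-100%) across all evaluated tactics."""
    return mean(per_tactic.values())


def decision(confidence: float, quality: float) -> str:
    """Map a (confidence %, quality 0-10) pair onto the Quick Decision Guide."""
    if confidence < 70:
        return "unclear: consider human review"
    if confidence >= 80 and quality >= 8:
        return "great work: trust this feedback"
    if confidence >= 80 and quality < 7:
        return "fix this: clear issue identified"
    # The guide leaves 70-79% confidence and quality between 7 and 8 unspecified.
    return "not covered by the guide"


score = combined_score({"traditional": 90, "quality": 70, "similarity": 50})
confidence = overall_confidence({"Role Prompting": 90, "Chain-of-Thought": 80})
print(f"Combined score: {score:.1f}")
print(f"Overall confidence: {confidence:.0f}%")
print(decision(confidence, quality=8.5))
```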

**Run the cell below to evaluate your template!** ⬇️

In [None]:
# Optional: Evaluate your Activity 3.2 template for detailed feedback
from setup_utils import evaluate_prompt, extract_template_from_activity

# Extract your template from the activity file
template, error = extract_template_from_activity('activities/activity-3.2-code-review.md')

if error:
    print(error)
else:
    # Define the same variables used in cell 14 for substitution
    # These match what test_activity_3_2() uses to fill the template placeholders
    variables = {
        'tech_stack': 'Python',
        'repo_name': 'user-auth-service',
        'service_name': 'Authentication API',
        'change_purpose': 'Add user login endpoint'
    }

    # Substitute variables in template (same logic as test_activity uses internally)
    print("🔄 Substituting template variables...")
    substituted_template = template
    for key, value in variables.items():
        placeholder = "{{" + key + "}}"
        substituted_template = substituted_template.replace(placeholder, str(value))

    print("📖 Evaluating your Activity 3.2 template...")
    print("⏳ This will take ~30 seconds (running 3 evaluation methods)\n")

    # Run comprehensive evaluation
    evaluate_prompt(
        messages=substituted_template,  # ✅ Now fully substituted with actual content
        activity_name="Activity 3.2: Code Review Automation",
        expected_tactics=[
            "Role Prompting",
            "Structured Inputs",
            "Output Format Specification",
            "Chain-of-Thought"
        ],
        activity_file='activities/activity-3.2-code-review.md',
        compare_with_reference=True,  # Compare with solution
        track_progress=True            # Save to history
    )

    print("\n✅ Evaluation complete!")
    print("Next: Run view_progress() to see your improvement over time!")
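
Because the substitution loop above replaces only the exact `{{key}}` strings it is given, a misspelled or missing variable would leave a placeholder in the prompt unnoticed. A minimal sketch of a check for that follows. `unfilled_placeholders` is a hypothetical helper, not part of `setup_utils`, and it assumes placeholder names consist of letters, digits, and underscores.

```python
import re

# Matches {{name}} and also {{ name }}, which a plain str.replace would miss.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def unfilled_placeholders(text: str) -> list[str]:
    """Return the sorted, de-duplicated names of placeholders still in text."""
    return sorted(set(PLACEHOLDER.findall(text)))


partially_filled = "Review the Python change for {{service_name}} in {{repo_name}}."
missing = unfilled_placeholders(partially_filled)
if missing:
    print("Unfilled placeholders:", ", ".join(missing))
```

Running a check like this on `substituted_template` before calling `evaluate_prompt()` would catch leftover placeholders before they reach the evaluation.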

---

## Track Your Progress

After completing the evaluation above, run the cell below to see your learning journey:

- 📊 All your evaluation attempts for this section
- 📈 Your improvement over time
- 🏆 Achievement status (scores ≥ 80 earn the **SKILLS ACQUIRED** badge!)
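
To make the three items above concrete, here is a small sketch of how a list of attempt scores could be summarized. It is illustrative only: how `view_progress()` stores and reports history is not shown in this section, `summarize_attempts` is a hypothetical name, and awarding the badge on the best score (not the latest) is an assumption.

```python
BADGE_THRESHOLD = 80  # from the text above: scores >= 80 earn the badge


def summarize_attempts(scores: list[float]) -> dict:
    """Summarize a chronological list of evaluation scores (0-100)."""
    if not scores:
        return {"attempts": 0, "best": None, "improvement": None, "badge": False}
    return {
        "attempts": len(scores),
        "best": max(scores),
        "improvement": scores[-1] - scores[0],  # latest attempt minus first
        "badge": max(scores) >= BADGE_THRESHOLD,
    }


summary = summarize_attempts([62, 71, 84])
print(summary)
```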

---

In [None]:
# 📊 VIEW YOUR PROGRESS
# Run this cell anytime to see your evaluation history and improvement

from setup_utils import view_progress

print("=" * 70)
print("📊 YOUR SECTION 3.2 PROGRESS")
print("=" * 70)
print()

view_progress("Activity 3.2: Code Review Automation")

print()
print("=" * 70)
print("💡 TIP: Scored ≥ 80? You've mastered code review automation!")
print("=" * 70)

### Learn More: Advanced Code Review Patterns

Want to dive deeper into production code review automation?

**📖 AWS Anthropic Advanced Patterns:**
- [Code Review Command Pattern](https://github.com/aws-samples/anthropic-on-aws/blob/main/advanced-claude-code-patterns/commands/code-review.md): Full prompt + workflow for automated reviews

**🔗 Related Best Practices:**
- [Claude 4 Prompt Engineering](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/claude-4-best-practices): Guidance on structuring complex instructions
- [Prompt Templates & Variables](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/prompt-templates-and-variables): When and how to parameterize prompts
- [OpenAI GPT-5 Prompting Guide](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide): Latest guidance on tactic stacking and failure analysis for GPT-5 models


---

<div style="margin:24px 0; padding:20px 24px; background:linear-gradient(135deg, #f8fafc 0%, #e2e8f0 100%); border-radius:12px; border-left:5px solid #3b82f6; box-shadow:0 2px 8px rgba(0,0,0,0.1);">
  <div style="color:#1e293b; font-size:0.85em; font-weight:600; text-transform:uppercase; letter-spacing:1px; margin-bottom:8px;">⏭️ Next Section</div>
  <div style="color:#0f172a; font-size:1.15em; font-weight:700; margin-bottom:6px;">Section 3.3: Test Generation Automation</div>
  <div style="color:#475569; font-size:0.95em; line-height:1.5; margin-bottom:12px;">Learn how to generate comprehensive test specifications from requirements using strategic tactic combinations.</div>
  <a href="./3.3-test-generation-automation.ipynb" style="display:inline-block; padding:8px 16px; background:#3b82f6; color:#fff; text-decoration:none; border-radius:6px; font-weight:600; font-size:0.9em; transition:all 0.2s;">Continue to Section 3.3 →</a>
</div>

> **💡 Tip:** You're halfway through the core material! The setup utilities are already loaded, so just run `from setup_utils import *` in Section 3.3 to continue.