# Unit 4

## Review Engine: Bringing Automated Code Review Together

Welcome to this lesson on building a **Review Engine**!

So far, you have learned how to set up an **AI client**, parse code changes, and gather useful context for code review. Now, you will see how these pieces come together in a **Review Engine** — a tool that automates the process of reviewing code changes using AI.

A Review Engine is a program that takes a set of code changes (called a "**changeset**"), gathers all the important information about those changes, and then asks an AI model to review them. This helps developers catch mistakes, improve code quality, and save time.

By the end of this lesson, you will understand how to build a Review Engine that can review both individual files and entire changesets, using all the tools you have learned so far.

-----

## Recall: Connecting the Pieces

Before we dive in, let's quickly remind ourselves how the main parts work together:

  * **OpenAI Client:** This is the tool that sends code and context to the AI model and gets back a review.
  * **Diff Parser:** This breaks down the code changes into a format that is easy to work with.
  * **Context Generator:** This gathers extra information about the code, like recent changes and related files, to help the AI give better feedback.

In this lesson, you will see how the Review Engine uses all these parts to review code changes automatically.

-----

## Reviewing a Single File in a Changeset

Let's start by looking at how the Review Engine reviews one file at a time. This is the basic building block for reviewing larger sets of changes.

### Step 1: Parsing the Diff

The first thing the Review Engine does is parse the diff for the file. The diff shows what has changed in the code.

```python
import logging
import time

logger = logging.getLogger(__name__)

def review_changeset_file(self, session, changeset_file):
    start_time = time.time()
    file_path = changeset_file.file_path
    
    logger.info(f"Starting review for file: {file_path}")
    try:
        diff = parse_unified_diff(changeset_file.diff_content)
```

Here, `parse_unified_diff` is a function that takes the raw diff text and turns it into a structured object. This makes it easier to work with the changes.

  * `changeset_file.diff_content` is the text showing the changes for this file.
  * `diff` will now hold information like the file path and the specific lines that changed.
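
To make the idea of a "structured object" concrete, here is a minimal sketch of what a parser like `parse_unified_diff` might produce. The `ParsedDiff` dataclass, its field names, and the function name `parse_unified_diff_sketch` are assumptions made for illustration, not the course's actual implementation, and the parser only handles a simple single-file diff.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParsedDiff:
    """Hypothetical structured form of a single-file unified diff."""
    file_path: str = ""
    added: List[str] = field(default_factory=list)
    removed: List[str] = field(default_factory=list)


def parse_unified_diff_sketch(diff_content: str) -> ParsedDiff:
    """Extract the target file path and the added/removed lines."""
    parsed = ParsedDiff()
    for line in diff_content.splitlines():
        if line.startswith("+++ "):
            # "+++ b/path/to/file.py" names the file after the change
            path = line[4:].strip()
            parsed.file_path = path[2:] if path.startswith("b/") else path
        elif line.startswith("--- ") or line.startswith("@@"):
            continue  # old-file header and hunk headers carry no content lines
        elif line.startswith("+"):
            parsed.added.append(line[1:])
        elif line.startswith("-"):
            parsed.removed.append(line[1:])
    return parsed


sample = (
    "--- a/example.py\n"
    "+++ b/example.py\n"
    "@@ -1,2 +1,2 @@\n"
    " def greet():\n"
    "-    print('hi')\n"
    "+    print('hello')\n"
)
diff = parse_unified_diff_sketch(sample)
print(diff.file_path, diff.added, diff.removed)
```

A real parser would also need to handle multiple files, renames, and binary changes; this sketch deliberately ignores them.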

### Step 2: Gathering Context

Next, the Review Engine gathers extra information about the file. This helps the AI understand the code better.

```python
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        logger.debug(f"Gathered context for {file_path}: {len(recent_changes)} recent changes, {len(related_files)} related files")
```

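  * `get_file_context` gets the current content of the file. In the code shown here, it is gathered but not added to the context summary built in the next step.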
  * `get_recent_changes` finds recent changes made to this file.
  * `find_related_files` lists other files that are related to this one.

### Step 3: Building the Context Summary

The Review Engine then combines this information into a summary that will be sent to the AI.

```python
        context_parts = []
        if recent_changes:
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            context_parts.append(f"Recent changes: {recent_summary}")
            
        if related_files:
            context_parts.append(f"Related files: {', '.join(related_files)}")
            
        context = " | ".join(context_parts)
```

This code creates a list called `context_parts`.

  * If there are recent changes, it adds a summary of up to two of them (the first two entries returned by `get_recent_changes`).
  * If there are related files, it adds their names.
  * Finally, it joins everything into a single string called `context`.
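
The same logic can be pulled into a standalone function, which makes it easy to check the edge cases. This is a sketch: the function name `build_context_summary` and the `max_changes` parameter are hypothetical additions, while the formatting mirrors the lesson's code.

```python
from typing import Dict, List


def build_context_summary(recent_changes: List[Dict[str, str]],
                          related_files: List[str],
                          max_changes: int = 2) -> str:
    """Combine recent changes and related files into one context string."""
    context_parts = []
    if recent_changes:
        recent_summary = "; ".join(
            f"{change['hash']}: {change['message']}"
            for change in recent_changes[:max_changes]
        )
        context_parts.append(f"Recent changes: {recent_summary}")
    if related_files:
        context_parts.append(f"Related files: {', '.join(related_files)}")
    return " | ".join(context_parts)


changes = [
    {"hash": "abc12345", "message": "Initial commit"},
    {"hash": "def67890", "message": "Refactor code"},
    {"hash": "0a1b2c3d", "message": "Add tests"},
]
full = build_context_summary(changes, ["utils.py", "helpers.py"])
only_files = build_context_summary([], ["utils.py"])
nothing = build_context_summary([], [])
print(full)
print(repr(only_files))
print(repr(nothing))
```

When a part is missing, it is simply skipped, so no stray separators appear: with no recent changes the result is `Related files: utils.py`, and with no inputs at all it is an empty string.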

### Step 4: Generating the Review

Now, the Review Engine asks the AI to review the changes, using the context we just built.

```python
        review = self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
        
        duration = time.time() - start_time
        logger.info(f"Successfully reviewed {file_path} in {duration:.2f} seconds")
        
        return review
        
    except Exception as e:
        duration = time.time() - start_time
        logger.error(f"Failed to review {file_path} after {duration:.2f} seconds: {str(e)}")
        raise
```

  * `analyze_changeset` is a method that sends the file path, the diff, and the context to the AI.
  * The AI returns a review, which is stored in the `review` variable.
  * The engine logs both successful reviews and failures, including timing information.

**Example Output:**

```
INFO: Starting review for file: example.py
DEBUG: Gathered context for example.py: 2 recent changes, 2 related files
INFO: Successfully reviewed example.py in 3.42 seconds
```

With a stub AI client that simply echoes its inputs (like the one used in this course's practice code), the returned `review` is the string `Review for example.py with context: Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py`.

The log lines trace the review from start to finish, and the returned string confirms that the context we built was passed to the AI client.
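
The lesson does not show what happens inside `analyze_changeset`. As a rough illustration, a client might assemble its three arguments into a prompt along these lines. The function name `build_review_prompt` and the prompt wording are assumptions, and no API call is made here.

```python
def build_review_prompt(file_path: str, diff: str, context: str = "") -> str:
    """Assemble a plain-text review prompt (hypothetical format)."""
    sections = [f"Review the following change to {file_path}."]
    if context:
        # Only include the context section when there is something to say
        sections.append(f"Context: {context}")
    sections.append(f"Diff:\n{diff}")
    return "\n\n".join(sections)


prompt = build_review_prompt(
    file_path="example.py",
    diff="+print('Hello')",
    context="Related files: utils.py",
)
print(prompt)
```

Keeping prompt construction in a pure function like this makes it possible to test the prompt format without contacting the AI service.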

-----

## Reviewing an Entire Changeset

Now that you know how to review a single file, let's see how the Review Engine reviews all files in a changeset.

### Step 1: Looping Through Files

The Review Engine goes through each file in the changeset and reviews them one by one.

```python
def review_changeset(self, session, changeset):
    changeset_start_time = time.time()
    total_files = len(changeset.files)
    
    logger.info(f"Starting changeset review for {total_files} files")
    
    reviews = {}
    successful_reviews = 0
    failed_reviews = 0
    
    for changeset_file in changeset.files:
        try:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
            successful_reviews += 1
        except Exception as e:
            logger.warning(f"Skipping file {changeset_file.file_path} due to error: {str(e)}")
            failed_reviews += 1
            
    total_duration = time.time() - changeset_start_time
    logger.info(f"Changeset review completed in {total_duration:.2f} seconds. "
               f"Success: {successful_reviews}, Failed: {failed_reviews}")
```

  * `changeset.files` is a list of all the files that were changed.
  * For each file, the engine calls `review_changeset_file`, which does everything we just covered.
  * The results are stored in a dictionary called `reviews`, with the file path as the key.
  * The engine tracks and logs success/failure statistics for the entire changeset.

### Step 2: Returning All Reviews

After all files are reviewed, the engine returns the results.

```python
    return reviews
```

**Example Output:**

```
INFO: Starting changeset review for 2 files
INFO: Starting review for file: example.py
DEBUG: Gathered context for example.py: 2 recent changes, 2 related files
INFO: Successfully reviewed example.py in 3.42 seconds
INFO: Starting review for file: utils.py
DEBUG: Gathered context for utils.py: 1 recent changes, 1 related files
INFO: Successfully reviewed utils.py in 2.15 seconds
INFO: Changeset review completed in 5.73 seconds. Success: 2, Failed: 0
```

Review for `example.py`:
`Review for example.py with context: Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py`

Review for `utils.py`:
`Review for utils.py with context: Recent changes: abc12345: Initial commit | Related files: example.py`

This shows that each file in the changeset has been reviewed, the results are organized by file, and comprehensive logging tracks the entire process.
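
The key property of the loop above is that one failing file does not abort the whole changeset. The sketch below isolates that pattern with a stand-in reviewer; `review_all` and `flaky_reviewer` are hypothetical names, and unlike the lesson's method this variant also records which paths failed rather than only counting them.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger(__name__)


def review_all(paths: List[str],
               review_one: Callable[[str], str]) -> Tuple[Dict[str, str], List[str]]:
    """Review each path; collect failures instead of aborting the run."""
    reviews: Dict[str, str] = {}
    failed: List[str] = []
    for path in paths:
        try:
            reviews[path] = review_one(path)
        except Exception as exc:
            logger.warning("Skipping file %s due to error: %s", path, exc)
            failed.append(path)
    return reviews, failed


def flaky_reviewer(path: str) -> str:
    """Stand-in reviewer that fails on one file to exercise the error path."""
    if path == "broken.py":
        raise ValueError("could not parse diff")
    return f"Looks fine: {path}"


reviews, failed = review_all(["example.py", "broken.py", "utils.py"], flaky_reviewer)
print(sorted(reviews), failed)
```

Returning the failed paths lets a caller retry them or report them to the developer.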

-----

## Optimizing for Large Changesets: Batching and Parallelization

The sequential approach shown above works well for small changesets, but for large changesets with many files, reviewing them one by one can be slow. Here are strategies to improve performance:

### Batching Strategy

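Instead of reviewing files individually, you can group them into **batches** and send multiple files to the AI in a single request. The example relies on two helpers that this lesson does not define: `build_file_context`, which presumably wraps the context-gathering and summary steps from the single-file review, and `analyze_changeset_batch`, which reviews several files in one request: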

```python
def review_changeset_batch(self, session, changeset, batch_size=5):
    files = changeset.files
    reviews = {}
    
    # Group files into batches
    for i in range(0, len(files), batch_size):
        batch = files[i:i + batch_size]
        batch_contexts = []
        
        for changeset_file in batch:
            # Gather context for each file in the batch
            diff = parse_unified_diff(changeset_file.diff_content)
            context = self.build_file_context(session, diff.file_path)
            
            batch_contexts.append({
                'file_path': diff.file_path,
                'diff': changeset_file.diff_content,
                'context': context
            })
            
        # Send entire batch to AI for review
        batch_reviews = self.llm_client.analyze_changeset_batch(batch_contexts)
        
        # Add batch results to overall reviews
        for file_review in batch_reviews:
            reviews[file_review['file_path']] = file_review['review']
            
    return reviews
```
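
A fixed `batch_size` can still exceed the model's token limit when individual diffs are large. One possible refinement is to group diffs by an estimated token budget instead of by count. This is a sketch under stated assumptions: the figure of about 4 characters per token is only a rough rule of thumb, not a real tokenizer, and `estimate_tokens`, `batch_by_token_budget`, and the default budget are hypothetical.

```python
from typing import Iterator, List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumes about 4 characters per token)."""
    return max(1, len(text) // 4)


def batch_by_token_budget(diffs: List[str], max_tokens: int = 2000) -> Iterator[List[str]]:
    """Yield batches of diffs whose estimated token total stays within max_tokens.

    A single diff larger than the budget is yielded in a batch of its own.
    """
    batch: List[str] = []
    used = 0
    for diff in diffs:
        cost = estimate_tokens(diff)
        if batch and used + cost > max_tokens:
            # The current batch is full; emit it and start a new one
            yield batch
            batch, used = [], 0
        batch.append(diff)
        used += cost
    if batch:
        yield batch


diffs = ["a" * 400, "b" * 400, "c" * 4000, "d" * 40]
batches = list(batch_by_token_budget(diffs, max_tokens=250))
print([len(b) for b in batches])  # -> [2, 1, 1]
```

Here the two small diffs share a batch, while the oversized diff is sent alone rather than being dropped.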

### Parallelization Strategy

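For even better performance, you can review multiple files or batches concurrently. Note that this example passes the same `session` to every worker thread. If `session` is a SQLAlchemy `Session`, as in this course's exercise code, it is not safe to share across threads, so a production version would create a separate session inside each worker: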

```python
import concurrent.futures
import threading

def review_changeset_parallel(self, session, changeset, max_workers=3):
    reviews = {}
    reviews_lock = threading.Lock()
    
    def review_single_file(changeset_file):
        try:
            review = self.review_changeset_file(session, changeset_file)
            with reviews_lock:
                reviews[changeset_file.file_path] = review
            return True
        except Exception as e:
            logger.warning(f"Failed to review {changeset_file.file_path}: {str(e)}")
            return False
            
    # Process files in parallel
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(review_single_file, f) for f in changeset.files]
        
        # Wait for all reviews to complete
        concurrent.futures.wait(futures)
        
    return reviews
```
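
An alternative way to structure the same idea is to let workers return their results and collect them in the main thread with `concurrent.futures.as_completed`, which removes the need for a lock. This is a sketch, with `review_in_parallel` and `fake_review` as hypothetical names and a stand-in reviewer instead of a real AI call.

```python
import concurrent.futures
from typing import Callable, Dict, List


def review_in_parallel(paths: List[str],
                       review_one: Callable[[str], str],
                       max_workers: int = 3) -> Dict[str, str]:
    """Run review_one for each path in a thread pool and gather the results."""
    reviews: Dict[str, str] = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the path it is reviewing
        future_to_path = {executor.submit(review_one, p): p for p in paths}
        for future in concurrent.futures.as_completed(future_to_path):
            path = future_to_path[future]
            try:
                # result() re-raises any exception from the worker thread
                reviews[path] = future.result()
            except Exception as exc:
                print(f"Failed to review {path}: {exc}")
    return reviews


def fake_review(path: str) -> str:
    """Stand-in for a real review call."""
    if path.endswith(".lock"):
        raise ValueError("unsupported file type")
    return f"Reviewed {path}"


results = review_in_parallel(["a.py", "b.py", "deps.lock"], fake_review)
print(sorted(results))
```

Because only the main thread writes to `reviews`, the dictionary needs no synchronization, and each failure is still reported with the file it belongs to.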

### Trade-offs and Considerations

| Strategy | Benefit | Drawback / Consideration |
| :--- | :--- | :--- |
| **Batching** | Reduces API calls. | May hit **token limits** with very large batches. |
| **Parallelization** | Faster processing. | Consumes more **API rate limits** simultaneously and makes **error handling** more complex. |
| **Sequential** | Simplest for debugging. | Slow for large changesets. |

Whichever strategy you choose, large changesets require more **memory** to store all reviews.

**When to use each approach:**

  * **Sequential:** Small changesets (fewer than 10 files) or when debugging.
  * **Batching:** Medium changesets (10 to 50 files) with token limit considerations.
  * **Parallel:** Large changesets (more than 50 files) when speed is critical and rate limits allow.
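
These rules of thumb can be written down as a small helper. This is only a sketch: the thresholds are the approximate ones listed above, while the function name `choose_strategy`, the `rate_limit_ok` flag, and the choice to fall back to batching when rate limits are tight are assumptions.

```python
def choose_strategy(file_count: int, rate_limit_ok: bool = True) -> str:
    """Pick a review strategy from the file-count thresholds listed above."""
    if file_count < 10:
        return "sequential"
    if file_count <= 50 or not rate_limit_ok:
        # Assumed fallback: batch when parallel calls would exceed rate limits
        return "batching"
    return "parallel"


for n in (3, 25, 120):
    print(n, choose_strategy(n))
```

In practice the right cutoffs depend on diff sizes and your API quota, so treat the numbers as starting points to tune.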

-----

## Production Logging Best Practices

In a production Review Engine, proper logging is essential for monitoring, debugging, and performance optimization. Here are the key logging practices demonstrated above:

### Log Levels and What to Include

| Log Level | Purpose | What to Include |
| :--- | :--- | :--- |
| **INFO** | General process tracking. | Start and completion of each review, durations, and success/failure totals. |
| **DEBUG** | Detailed troubleshooting information. | Context statistics, such as the number of recent changes and related files gathered. |
| **WARNING** | Non-fatal issues. | Files that were skipped because their review failed while the rest of the changeset continued. |
| **ERROR** | Failure of a single operation. | The file path, elapsed time, and exception message when a file cannot be reviewed. |

### Key Metrics to Track

  * **File path:** Always log which file is being processed.
  * **Success/failure status:** Track whether each review completed successfully.
  * **Timing:** Measure how long each operation takes.
  * **Context statistics:** Log how much context was gathered (number of changes, related files).
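
The timing and status metrics follow the same pattern each time: record a start time, run the work, then log either success or failure with the elapsed time. One way to avoid repeating that code is a context manager. This is a sketch, and `log_duration` is a hypothetical helper rather than part of the lesson's engine.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("review_engine")


@contextmanager
def log_duration(operation: str):
    """Log how long the wrapped block took, whether it succeeded or failed."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        duration = time.perf_counter() - start
        logger.error("%s failed after %.2f seconds", operation, duration)
        raise  # let the caller decide how to handle the failure
    else:
        duration = time.perf_counter() - start
        logger.info("%s completed in %.2f seconds", operation, duration)


logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

with log_duration("review of example.py"):
    time.sleep(0.01)  # stand-in for the real review work
```

The exception is re-raised after logging, so the surrounding changeset loop can still count the file as failed.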

### Example Log Configuration

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('review_engine.log'),
        logging.StreamHandler()
    ]
)
```

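This configuration writes log messages at `INFO` level and above to both a file and the console, making it easy to monitor the Review Engine in production. `DEBUG` messages, such as the context statistics logged during a file review, are suppressed at this level; set `level=logging.DEBUG` to see them.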

-----

## Building and Using Context for Better Reviews

The quality of the AI's review depends on the **context** you provide. Let's look at how the Review Engine builds and uses this context.

### Example: Summarizing Recent Changes and Related Files

Suppose you have a file called `example.py` that was recently changed. The Review Engine gathers:

  * **The last two changes:**
      * `abc12345`: Initial commit
      * `def67890`: Refactor code
  * **Related files:**
      * `utils.py`
      * `helpers.py`

It combines this information into a single string:

```python
context = "Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py"
```

This context is then sent to the AI, helping it understand not just the current change, but also the history and connections to other files.

**Why is this important?**
By giving the AI more information, you help it make better suggestions and catch issues that might be missed if it only saw the code change by itself.

-----

## Summary and Practice Preview

In this lesson, you learned how to build a Review Engine that brings together the OpenAI client, diff parser, and context generator to review code changes automatically. You saw how to:

  * **Review a single file** by parsing its diff, gathering context, and generating a review.
  * **Review an entire changeset** by looping through all changed files.
  * **Build and use context summaries** to help the AI give better feedback.
  * **Implement production-ready logging** to track file paths, success/failure status, and timing.
  * **Optimize for large changesets** using batching and parallelization strategies.

Next, you will get a chance to practice these ideas by working with code that reviews changesets using the Review Engine. This hands-on practice will help you solidify your understanding and prepare you to use AI-powered code review in real projects.

## Building Context for Better Reviews

Now that you've learned about the Review Engine's structure, let's put that knowledge into practice! In this exercise, you'll implement the context-building logic that helps the AI better understand code changes.

You'll be working on the `review_changeset_file` method, which is a key part of providing useful context to the AI reviewer. The method already handles parsing diffs and gathering raw context data, but it needs your help to format this information properly.

Your task is to complete the context-building section by:

  * Creating a list to store different parts of the context
  * Formatting recent changes into a readable summary with semicolon separators
  * Adding related files as a comma-separated list
  * Joining all context parts with a pipe separator

This context-building is crucial because it helps the AI understand not just the current code changes, but also their history and relationships to other files, resulting in more accurate and helpful reviews.

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        # TODO: Create an empty list to store context parts
        
        if recent_changes:
            # TODO: Create a string that joins the hash and message of each recent change 
            # with a semicolon separator (limit to first 2 changes)
            
            # TODO: Add the recent changes summary to context_parts with "Recent changes: " prefix
        
        if related_files:
            # TODO: Add related files as a comma-separated list to context_parts
            # with "Related files: " prefix
        
        # TODO: Join all context parts with " | " separator
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")
```

That's a great exercise! Implementing the context-building logic is key to getting high-quality reviews from the AI.

Here is the completed `review_changeset_file` method with the context-building logic filled in:

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        context_parts = []  # Collects the individual parts of the context
        
        if recent_changes:
            # Join the hash and message of each recent change with a
            # semicolon separator (limited to the first 2 changes)
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            
            # Add the recent changes summary with a "Recent changes: " prefix
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            # Add related files as a comma-separated list
            # with a "Related files: " prefix
            related_summary = ", ".join(related_files)
            context_parts.append(f"Related files: {related_summary}")
        
        # Join all context parts with a " | " separator
        context = " | ".join(context_parts)
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")

```

### Example Output

The output for `example.py` now shows the structured context:

```
Review for example.py:
Review for example.py with context: Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py

Review for utils.py:
Review for example.py with context: Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py
```

***Note:** The stub `parse_unified_diff` always reports `example.py` as the file path, so the review text stored under `utils.py` also names `example.py`. Likewise, the dummy functions `get_recent_changes` and `find_related_files` return the same data regardless of the file path, so the context for both files is identical in this demonstration.*

### Explanation of Context Logic

1.  **`context_parts = []`**: Initializes an empty list to conditionally build the context string.
2.  **Recent Changes Formatting**:
    ```python
    recent_summary = "; ".join([
        f"{change['hash']}: {change['message']}" 
        for change in recent_changes[:2] # Limits to the first two changes
    ])
    context_parts.append(f"Recent changes: {recent_summary}")
    ```
    This uses a **list comprehension** to iterate over the first two recent changes (`[:2]`), formats each one as `hash: message`, and then joins them with a **semicolon and space (`"; "`)**.
3.  **Related Files Formatting**:
    ```python
    related_summary = ", ".join(related_files)
    context_parts.append(f"Related files: {related_summary}")
    ```
    This simply joins the list of file names with a **comma and space (`", "`)**.
4.  **Final Join**:
    ```python
    context = " | ".join(context_parts)
    ```
    All collected context strings are combined with the **pipe separator (`" | "`)**, creating a clear, single string for the AI model.

## Building Context for Better Reviews

You've just learned how the Review Engine connects all the pieces we've studied so far! Now it's time to implement a key part of this system — the context-building logic that makes AI reviews more accurate.

In this exercise, you'll complete the `review_changeset_file` method by adding code that formats context information in a way the AI can use effectively. The method already handles parsing diffs and gathering raw data but needs your help with the formatting.

Your task is to:

  * Create a list to store different parts of the context
  * Format recent changes into a readable summary with semicolon separators
  * Add related files as a comma-separated list
  * Join all context parts with a pipe separator

This context-building step is vital because it transforms raw data into structured information that helps the AI understand both the code changes and their broader context, leading to more insightful and helpful reviews.

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        # TODO: Create an empty list to store context parts
        
        if recent_changes:
            # TODO: Create a string that joins the hash and message of each recent change 
            # with a semicolon separator (limit to first 2 changes)
            
            # TODO: Add the recent changes summary to context_parts with "Recent changes: " prefix
        
        if related_files:
            # TODO: Add related files as a comma-separated list to context_parts
            # with "Related files: " prefix
        
        # TODO: Join all context parts with " | " separator
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")

```

That's an excellent task! Getting the context formatting right is what transforms raw data into **actionable insights** for the AI.

Here is the completed `review_changeset_file` method with the context-building logic implemented according to your requirements:

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        context_parts = []  # Create an empty list to store context parts
        
        if recent_changes:
            # Create a string that joins the hash and message of each recent change 
            # with a semicolon separator (limit to first 2 changes)
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            
            # Add the recent changes summary to context_parts with "Recent changes: " prefix
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            # Add related files as a comma-separated list to context_parts
            # with "Related files: " prefix
            related_summary = ", ".join(related_files)
            context_parts.append(f"Related files: {related_summary}")
        
        # Join all context parts with " | " separator
        context = " | ".join(context_parts)
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")
```

### Explanation of Implemented Logic

1.  **Initialization:** `context_parts = []` creates the dynamic list to store parts of the context only if the relevant data exists.
2.  **Recent Changes:**
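      * A **list comprehension** `[... for change in recent_changes[:2]]` formats each entry, and the `[:2]` slice limits the summary to the first two entries in the list. Whether those are the newest or the oldest commits depends on the order in which `get_recent_changes` returns them.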
      * `"; ".join(...)` combines the hashes and messages into a clean summary using the required semicolon separator.
3.  **Related Files:**
      * `", ".join(related_files)` quickly converts the list of file names into a comma-separated string.
4.  **Final Assembly:** `context = " | ".join(context_parts)` brings all the separate, prefixed parts together using the **pipe separator** (`|`), producing a concise and structured string that the AI model can easily parse and utilize.

This structure successfully translates the raw context data into the high-quality input needed for the AI reviewer.
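
The same formatting rules can be isolated in a small helper, which makes them easy to try out on their own. This is a minimal sketch: `build_context` is a hypothetical function name that does not appear in the lesson code, and it assumes each recent change is a dict with `hash` and `message` keys, as in the stubs above.

```python
from typing import Dict, List


def build_context(recent_changes: List[Dict[str, str]], related_files: List[str]) -> str:
    """Hypothetical helper: format context the same way review_changeset_file does."""
    context_parts = []

    if recent_changes:
        # "hash: message" pairs joined with "; ", limited to the first two entries
        recent_summary = "; ".join(
            f"{change['hash']}: {change['message']}" for change in recent_changes[:2]
        )
        context_parts.append(f"Recent changes: {recent_summary}")

    if related_files:
        # File names joined with ", "
        context_parts.append(f"Related files: {', '.join(related_files)}")

    # Sections joined with " | "; an empty list yields an empty string
    return " | ".join(context_parts)


if __name__ == "__main__":
    changes = [
        {"hash": "abc12345", "message": "Initial commit"},
        {"hash": "def67890", "message": "Refactor code"},
        {"hash": "0123abcd", "message": "Third change (dropped by the [:2] slice)"},
    ]
    print(build_context(changes, ["utils.py", "helpers.py"]))
    print(repr(build_context([], [])))
```

Because each section is appended only when its data exists, a file with no history and no related files produces an empty context string rather than dangling prefixes.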

## Fixing Context Format for Better Reviews

Now that you've seen how the Review Engine connects all the pieces together, let's put your debugging skills to the test! In this exercise, you'll fix a broken implementation of the context-building logic in our Review Engine.

The `review_changeset_file` method is supposed to format context information properly, but it contains several formatting bugs that could confuse the AI reviewer. Your job is to find and fix these issues.

Look for three specific problems:

  * Missing colons between commit hashes and messages
  * An incorrect separator for the related files list
  * The wrong separator for joining the final context parts

These small formatting details might seem minor, but they significantly impact how well the AI understands the context of code changes. Fixing these issues will ensure our Review Engine provides the most helpful and accurate feedback possible.

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        context_parts = []
        
        if recent_changes:
            recent_summary = "; ".join([
                f"{change['hash']} {change['message']}" 
                for change in recent_changes[:2]
            ])
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            context_parts.append(f"Related files: {'; '.join(related_files)}")
        
        context = ", ".join(context_parts)
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")

```

Fixing small formatting bugs like these is crucial for maintaining effective communication with the AI.

The three issues in the `review_changeset_file` method are fixed below, each marked with a `FIX` comment.

### Fixed Code

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        context_parts = []
        
        if recent_changes:
            # FIX 1: Added ':' between hash and message
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            # FIX 2: Changed separator from '; ' to ', ' for related files list
            context_parts.append(f"Related files: {', '.join(related_files)}")
        
        # FIX 3: Changed separator from ', ' to ' | ' for joining final context parts
        context = " | ".join(context_parts)
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")
```

-----

## Summary of Fixes

The formatting issues were resolved as follows:

1.  **Missing Colons in Recent Changes (Fix 1):**

      * **Old Code:** `f"{change['hash']} {change['message']}"` (Output: `abc12345 Initial commit`)
      * **New Code:** `f"{change['hash']}: {change['message']}"` (Output: `abc12345: Initial commit`)
      * **Rationale:** The colon is a standard delimiter that clearly separates the commit ID from the message, making the context easier for the AI to parse.

2.  **Incorrect Separator for Related Files (Fix 2):**

      * **Old Code:** `'; '.join(related_files)` (Output: `utils.py; helpers.py`)
      * **New Code:** `', '.join(related_files)` (Output: `utils.py, helpers.py`)
      * **Rationale:** Related files are typically presented as a **comma-separated list** for readability, distinguishing them from the semicolon-separated commit summaries.

3.  **Wrong Final Context Separator (Fix 3):**

      * **Old Code:** `context = ", ".join(context_parts)` (Output: `Recent changes: ..., Related files: ...`)
      * **New Code:** `context = " | ".join(context_parts)` (Output: `Recent changes: ... | Related files: ...`)
      * **Rationale:** Using the **pipe separator (`|`)** provides a strong, clear break between the main context categories (`Recent changes` and `Related files`), ensuring the AI doesn't mistake them for a single sentence.

The final, corrected context format is now:
`Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py`
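
To compare the broken and corrected formats side by side, the three separators can be exposed as parameters. This is only an illustrative sketch: `format_context` and its parameter names are hypothetical and are not part of the lesson's `ReviewEngine`.

```python
from typing import Dict, List


def format_context(
    recent_changes: List[Dict[str, str]],
    related_files: List[str],
    pair_sep: str = ": ",      # between commit hash and message (Fix 1)
    file_sep: str = ", ",      # between related file names (Fix 2)
    section_sep: str = " | ",  # between context sections (Fix 3)
) -> str:
    """Hypothetical helper with the three separators exposed as parameters."""
    parts = []
    if recent_changes:
        summary = "; ".join(
            f"{c['hash']}{pair_sep}{c['message']}" for c in recent_changes[:2]
        )
        parts.append(f"Recent changes: {summary}")
    if related_files:
        parts.append(f"Related files: {file_sep.join(related_files)}")
    return section_sep.join(parts)


changes = [
    {"hash": "abc12345", "message": "Initial commit"},
    {"hash": "def67890", "message": "Refactor code"},
]
files = ["utils.py", "helpers.py"]

# Separators as they were in the broken exercise code
buggy = format_context(changes, files, pair_sep=" ", file_sep="; ", section_sep=", ")
# Defaults match the corrected format
fixed = format_context(changes, files)

print("buggy:", buggy)
print("fixed:", fixed)
```

In the buggy string, `; ` separates both the commits and the file names, so the two lists are harder to tell apart; the corrected defaults give each level of the structure its own separator.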

## Scaling Up The Review Engine

Now that you've mastered the context-building part of our Review Engine, let's scale things up! In this exercise, you'll implement the `review_full_changeset` method, which processes an entire set of code changes at once.

You've already seen how the Review Engine handles individual files. Your job now is to write the code that orchestrates the review of multiple files in a changeset. The method signature and changeset creation are already set up for you.

Your task is to:

1.  Initialize an empty dictionary to store review results
2.  Loop through each file in the changeset
3.  Call the `review_changeset_file` method for each file
4.  Store each review result in the dictionary using the file path as the key
5.  Return the completed dictionary

This orchestration logic is what makes a Review Engine truly powerful: it can analyze complex changesets with multiple files and provide organized feedback. Implementing this feature will complete your understanding of how all the pieces work together in a full code review system.

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        context_parts = []
        
        if recent_changes:
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            context_parts.append(f"Related files: {', '.join(related_files)}")
        
        context = " | ".join(context_parts)
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        # TODO: Initialize an empty dictionary to store review results
        
        # TODO: Loop through each file in the changeset
        
            # TODO: Call review_changeset_file for each file and store the result
        
        # TODO: Return the completed reviews dictionary

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")
```

That's the final piece of the sequential Review Engine puzzle! Orchestrating the file-by-file review is exactly how you scale up from a single change to a full changeset analysis.

Here is the completed `review_full_changeset` method:

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        return f"Review for {file_path} with context: {context}"

def parse_unified_diff(diff_content: str):
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset"""
        # Parse the diff
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context
        file_context = get_file_context(session, diff.file_path)
        recent_changes = get_recent_changes(session, diff.file_path)
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary
        context_parts = []
        
        if recent_changes:
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            context_parts.append(f"Related files: {', '.join(related_files)}")
        
        context = " | ".join(context_parts)
        
        # Generate review
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py")
        ])
        
        reviews = {} # Initialize an empty dictionary to store review results
        
        for changeset_file in changeset.files: # Loop through each file in the changeset
            # Call review_changeset_file for each file and store the result
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews # Return the completed reviews dictionary

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")
```

### Example Output

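The output contains an entry for **both** files in the dummy changeset, which shows that the orchestration loop visits every file. Note that the review text itself names `example.py` in both entries: the stub `parse_unified_diff` always returns `file_path = "example.py"`, and `review_changeset_file` passes `diff.file_path` to the LLM client, while the dictionary key comes from `changeset_file.file_path`.

```
Review for example.py:
Review for example.py with context: Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py

Review for utils.py:
Review for example.py with context: Recent changes: abc12345: Initial commit; def67890: Refactor code | Related files: utils.py, helpers.py
```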
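
In the stub code above, `parse_unified_diff` always reports `example.py`, whatever diff it receives. As a rough illustration of how a parser could recover the path instead, the sketch below reads it from the `diff --git a/<path> b/<path>` header line. The name `extract_file_path` and its fallback behavior are assumptions made for this example, not part of the course's `diff_parser` module, and the pattern deliberately ignores harder cases such as quoted paths or paths containing spaces.

```python
import re
from typing import Optional

# Matches the header line git writes at the top of each file's diff,
# e.g. "diff --git a/utils.py b/utils.py". Group 2 is the "b/" (new) path.
_DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def extract_file_path(diff_content: str, default: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: return the new-side path from a git diff header."""
    match = _DIFF_HEADER.search(diff_content)
    if match is None:
        # No recognizable header: let the caller decide what to fall back to
        return default
    return match.group(2)


if __name__ == "__main__":
    diff = "diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass"
    print(extract_file_path(diff))                     # utils.py
    print(extract_file_path("not a diff", "unknown"))  # unknown
```

With a parser along these lines, the path passed to the LLM client would match the dictionary key for each file.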

## Error Proofing Your Review Engine

You've built a solid foundation with context building and learned how to review individual files. Now, let's make our Review Engine more robust! In this exercise, you'll add comprehensive error handling to ensure the system works reliably, even when things go wrong.

Real-world code review systems need to handle unexpected situations gracefully — malformed diffs, missing context, or AI service outages shouldn't crash your entire review process.

Your task is to enhance the `review_changeset_file` method with proper error handling:

  * Add try-except blocks around diff parsing to handle parsing failures
  * Implement error handling for context-gathering functions
  * Add defensive programming to the context-building section
  * Wrap the LLM client call with error handling to provide fallback messages

This error handling is essential because it transforms a fragile prototype into a dependable tool that can handle the messy realities of software development. By completing this exercise, you'll create a Review Engine that not only provides helpful feedback but also remains stable when facing unexpected challenges.
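
One common shape for this kind of error handling is sketched below, independent of the exercise code. The names `safe_call`, `summarize_changes`, and `flaky_lookup` are hypothetical and exist only for this illustration. Catching a broad `Exception` is a simplification; a real engine would likely catch narrower exception types where it can.

```python
import logging
from typing import Any, Callable, Dict, List, Optional

logger = logging.getLogger(__name__)


def safe_call(func: Callable[..., Any], *args: Any, default: Any = None, label: str = "call") -> Any:
    """Hypothetical helper: run func(*args); on any exception, log it and return default."""
    try:
        return func(*args)
    except Exception as exc:
        logger.warning("%s failed: %s", label, exc)
        return default


def summarize_changes(recent_changes: Optional[List[Dict[str, str]]]) -> str:
    """Format up to two changes, tolerating missing keys and None input."""
    entries = []
    for change in (recent_changes or [])[:2]:
        # dict.get() supplies a placeholder instead of raising KeyError
        commit_hash = change.get("hash", "unknown")
        message = change.get("message", "No message")
        entries.append(f"{commit_hash}: {message}")
    return "; ".join(entries)


def flaky_lookup(file_path: str) -> List[Dict[str, str]]:
    """Stand-in for a context function that fails."""
    raise RuntimeError("database unavailable")


if __name__ == "__main__":
    # The failure is logged and replaced by an empty list, so formatting still works
    changes = safe_call(flaky_lookup, "example.py", default=[], label="get_recent_changes")
    print(repr(summarize_changes(changes)))           # ''
    print(summarize_changes([{"hash": "abc12345"}]))  # abc12345: No message
```

The idea is that each failure is contained at the step where it happens, so a missing piece of context degrades the review rather than aborting it.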

```python
from typing import Dict
from sqlalchemy.orm import Session

# Dummy imports for demonstration; in a real course, these would be actual modules.
# from llm_client import LLMClient
# from diff_parser import parse_unified_diff
# from context_generator import get_file_context, get_recent_changes, find_related_files
# from models import ChangesetFile

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        # Simulating potential API errors
        if "error" in diff.lower():
            raise Exception("LLM API error: Service unavailable")
        return f"Review for {file_path} with context: {context}"

class DiffParsingError(Exception):
    """Exception raised when diff parsing fails."""
    pass

def parse_unified_diff(diff_content: str):
    # Simulating potential parsing errors
    if "invalid" in diff_content.lower():
        raise DiffParsingError("Invalid diff format")
    
    class DummyDiff:
        file_path = "example.py"
    return DummyDiff()

def get_file_context(session, file_path):
    # Simulating potential errors
    if "missing" in file_path:
        return None
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    # Simulating potential errors
    if "no_history" in file_path:
        return None
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    # Simulating potential errors
    if "isolated" in file_path:
        return None
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset with error handling"""
        # TODO: Get file_path safely using getattr with a default value in case file_path is missing
        
        # Parse the diff with error handling
        # TODO: Add try-except block around diff parsing to handle DiffParsingError and general exceptions
        diff = parse_unified_diff(changeset_file.diff_content)
        
        # Gather context with error handling
        # TODO: Add try-except block around file_context retrieval
        file_context = get_file_context(session, diff.file_path)
        
        # TODO: Add try-except block around recent_changes retrieval
        recent_changes = get_recent_changes(session, diff.file_path)
        
        # TODO: Add try-except block around related_files retrieval
        related_files = find_related_files(session, diff.file_path)
        
        # Build context summary with defensive programming
        context_parts = []
        
        if recent_changes:
            # TODO: Add try-except block around recent changes formatting
            # TODO: Use dict.get() with default values for safer dictionary access
            recent_summary = "; ".join([
                f"{change['hash']}: {change['message']}" 
                for change in recent_changes[:2]
            ])
            context_parts.append(f"Recent changes: {recent_summary}")
        
        if related_files:
            # TODO: Add try-except block around related files formatting
            context_parts.append(f"Related files: {', '.join(related_files)}")
        
        # TODO: Add a fallback for empty context_parts
        context = " | ".join(context_parts)
        
        # Generate review with error handling
        # TODO: Add try-except block around LLM client call with a helpful fallback message
        return self.llm_client.analyze_changeset(
            file_path=diff.file_path,
            diff=changeset_file.diff_content,
            context=context
        )
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset with error handling"""
        # from models import Changeset
        # For demonstration, we'll create a dummy changeset
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py"),
            ChangesetFile("invalid diff content", "invalid.py"),
            ChangesetFile("diff with error", "error.py")
        ])
        
        reviews = {}
        # TODO: Add try-except block around each file review to handle any unexpected errors
        for changeset_file in changeset.files:
            review = self.review_changeset_file(session, changeset_file)
            reviews[changeset_file.file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")

```

Adding error handling at each stage makes the `ReviewEngine` resilient: a failure in one step or one file no longer aborts the whole review. The updated code below wraps diff parsing, context gathering, and the LLM call in **try-except blocks** and applies **defensive programming** practices such as `getattr` and `dict.get()`.

```python
from typing import Dict
from sqlalchemy.orm import Session
import logging

# Configure basic logging to see error messages
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

# Minimal stubs for demonstration and to make the code runnable
class LLMClient:
    def analyze_changeset(self, file_path: str, diff: str, context: str = "") -> str:
        # Simulating potential API errors
        if "error" in diff.lower():
            raise Exception("LLM API error: Service unavailable")
        return f"Review for {file_path} (Context: {context[:30]}...)"

class DiffParsingError(Exception):
    """Exception raised when diff parsing fails."""
    pass

def parse_unified_diff(diff_content: str):
    # Simulating potential parsing errors
    if "invalid" in diff_content.lower():
        raise DiffParsingError("Invalid diff format")
    
    class DummyDiff:
        # In a real parser, this would extract the file path from the diff content
        file_path = "example.py" if "example" in diff_content else "utils.py" if "util" in diff_content else "default_file.py"
    return DummyDiff()

def get_file_context(session, file_path):
    # Simulating potential errors
    if "missing" in file_path:
        return None
    return "def foo():\n    pass"

def get_recent_changes(session, file_path):
    # Simulating potential errors
    if "no_history" in file_path:
        return None
    return [
        {'hash': 'abc12345', 'message': 'Initial commit', 'author': 'Alice', 'date': '2024-06-01'},
        {'hash': 'def67890', 'message': 'Refactor code', 'author': 'Bob', 'date': '2024-06-02'}
    ]

def find_related_files(session, file_path):
    # Simulating potential errors
    if "isolated" in file_path:
        return None
    return ["utils.py", "helpers.py"]

class ChangesetFile:
    def __init__(self, diff_content, file_path="example.py"):
        self.diff_content = diff_content
        self.file_path = file_path

class Changeset:
    def __init__(self, files):
        self.files = files

class ReviewEngine:
    def __init__(self):
        self.llm_client = LLMClient()
    
    def review_changeset_file(self, session: Session, changeset_file: ChangesetFile) -> str:
        """Review a single file in a changeset with error handling"""
        
        # Defensive programming: safely get file_path, falling back to a placeholder name if the attribute is missing
        file_path = getattr(changeset_file, 'file_path', 'unknown_file')
        diff_content = changeset_file.diff_content
        
        # --- Diff Parsing with Error Handling ---
        try:
            diff_object = parse_unified_diff(diff_content)
            file_path = diff_object.file_path  # Update file_path from the parsed diff if successful
        except DiffParsingError as e:
            error_msg = f"Failed to parse diff for file '{file_path}': {e}"
            logging.error(error_msg)
            return f"Review Error (Parsing): {error_msg}. Cannot proceed with review."
        except Exception as e:
            error_msg = f"An unexpected error occurred during diff parsing for file '{file_path}': {e}"
            logging.error(error_msg)
            return f"Review Error (Parsing): {error_msg}. Cannot proceed with review."
        
        # --- Gather Context with Error Handling ---
        
        # Context 1: File Content
        file_context = None
        try:
            file_context = get_file_context(session, file_path)
        except Exception as e:
            logging.warning(f"Could not get file context for '{file_path}': {e}")

        # Context 2: Recent Changes
        recent_changes = None
        try:
            recent_changes = get_recent_changes(session, file_path)
        except Exception as e:
            logging.warning(f"Could not get recent changes for '{file_path}': {e}")
        
        # Context 3: Related Files
        related_files = None
        try:
            related_files = find_related_files(session, file_path)
        except Exception as e:
            logging.warning(f"Could not get related files for '{file_path}': {e}")
        
        # --- Build Context Summary with Defensive Programming ---
        context_parts = []
        
        if file_context:
            context_parts.append(f"File Context (Snippet):\n{file_context[:50]}...")
        
        if recent_changes:
            try:
                # Defensive programming: Safely access dict keys using .get()
                recent_summary = "; ".join([
                    f"{change.get('hash', 'N/A')[:7]}: {change.get('message', 'No message')}" 
                    for change in recent_changes[:2]
                ])
                context_parts.append(f"Recent changes: {recent_summary}")
            except Exception as e:
                logging.warning(f"Error formatting recent changes for '{file_path}': {e}")
        
        if related_files:
            try:
                context_parts.append(f"Related files: {', '.join(related_files)}")
            except Exception as e:
                logging.warning(f"Error formatting related files for '{file_path}': {e}")
        
        # Fallback for empty context
        context = " | ".join(context_parts)
        if not context:
            context = "No context available."
        
        # --- Generate Review with Error Handling (LLM Call) ---
        try:
            return self.llm_client.analyze_changeset(
                file_path=file_path,
                diff=diff_content,
                context=context
            )
        except Exception as e:
            error_msg = f"LLM client failed to review file '{file_path}': {e}"
            logging.error(error_msg)
            # Provide a helpful fallback message
            return f"Review Error (AI Service): {error_msg}. The LLM service is currently unavailable or returned an error. Please try again later."
    
    def review_full_changeset(self, session: Session, changeset_id: int) -> Dict[str, str]:
        """Review all files in a changeset with error handling"""
        
        changeset = Changeset([
            ChangesetFile("diff --git a/example.py b/example.py\n@@ -1 +1,2 @@\n+print('Hello')", "example.py"),
            ChangesetFile("diff --git a/utils.py b/utils.py\n@@ -1 +1,2 @@\n+def util(): pass", "utils.py"),
            ChangesetFile("invalid diff content", "invalid.py"),  # Should fail parsing
            ChangesetFile("diff with error", "error.py")  # Should fail LLM call
        ])
        
        reviews = {}
        for changeset_file in changeset.files:
            # --- Try-except block for each file review to prevent a full crash ---
            file_path = getattr(changeset_file, 'file_path', 'unknown_file')
            try:
                review = self.review_changeset_file(session, changeset_file)
            except Exception as e:
                review = f"Fatal Error: Review Engine crashed while processing this file. {e}"
                logging.critical(f"A critical error occurred for file '{file_path}': {e}")
            
            reviews[file_path] = review
        
        return reviews

# Example usage (for demonstration purposes)
if __name__ == "__main__":
    print("--- Starting Review Engine with Error Proofing ---\n")
    engine = ReviewEngine()
    # session would be a SQLAlchemy session in a real app; here we use None
    reviews = engine.review_full_changeset(None, 1)
    
    print("\n--- Final Reviews ---\n")
    for file, review in reviews.items():
        print(f"Review for {file}:\n{review}\n")
```
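
The three context-gathering blocks in `review_changeset_file` share one shape: call a function, and on failure log a warning and carry on with `None`. The sketch below shows one way that pattern could be factored out. Note that `safe_call` is a hypothetical helper, not part of the lesson's code, and `working_lookup` and `failing_lookup` are stand-ins invented here for the real context functions.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def safe_call(label: str, func: Callable[..., Any], *args: Any) -> Optional[Any]:
    """Call func(*args); on any exception, log a warning and return None.

    Hypothetical helper, for illustration only.
    """
    try:
        return func(*args)
    except Exception as e:
        logger.warning(f"Could not get {label}: {e}")
        return None


# Stand-ins for the real context functions (assumptions for this sketch)
def working_lookup(session, file_path):
    return ["utils.py", "helpers.py"]


def failing_lookup(session, file_path):
    raise RuntimeError("database unavailable")


related_files = safe_call("related files", working_lookup, None, "example.py")
recent_changes = safe_call("recent changes", failing_lookup, None, "example.py")

print(related_files)   # the list returned by working_lookup
print(recent_changes)  # None: the failure was logged and swallowed
```

Whether a helper like this is worth adding is a judgment call: it removes repetition, but the explicit blocks in the lesson code make each failure point easier to see when you are first learning the pattern.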

-----

### Key Error Handling Implementations

1.  **Diff Parsing:**

      * A `try...except DiffParsingError` block is used to catch expected parsing issues (e.g., malformed diffs).
      * A generic `except Exception` catches any other unforeseen errors during parsing.
      * If parsing fails, the method immediately returns a helpful error message, preventing subsequent code from crashing due to a missing `diff_object`.

2.  **Defensive Context Gathering:**

      * `getattr(changeset_file, 'file_path', 'unknown_file')` is used to safely get the file path, providing a **default value** if the attribute is missing.
      * **Context functions** (`get_file_context`, `get_recent_changes`, `find_related_files`) are wrapped in individual `try...except` blocks. Each result variable starts as `None`, so if one context function fails, the code only logs a warning and leaves that variable as `None`, allowing the review to continue with partial context.

3.  **Defensive Context Building:**

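      * The code uses `dict.get('key', 'default_value')` within the list comprehension for `recent_changes`. If a change dictionary is missing a key (like `'hash'` or `'message'`), the code uses a safe fallback value (`'N/A'` or `'No message'`) instead of raising a `KeyError`. The default applies only when the key is absent: a key that is present with a value of `None` would still make the `[:7]` slice fail, which is why the formatting is also wrapped in its own `try...except` block that logs a warning and skips that part of the context.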

4.  **LLM Client Call:**

      * The call to `self.llm_client.analyze_changeset` is wrapped in a final `try...except` block.
      * If the LLM service fails (simulated here by the word `error` appearing in the diff), the block catches the exception, logs the error, and returns a **graceful fallback message** that makes clear the AI service is the problem, not the engine itself.

5.  **Full Changeset Review:**

      * The `review_full_changeset` method wraps each call to `review_changeset_file` inside the loop in its own `try...except` block. Even a catastrophic, unexpected crash while processing one file causes only that file's review to fail; the engine records a fallback message for it and continues with the remaining files in the changeset.
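
The per-file isolation described in point 5 can be seen on its own in a stripped-down sketch. Everything here is an assumption made for illustration: `review_all` and `flaky_reviewer` are hypothetical names, and files are represented as plain `(file_path, diff)` tuples rather than `ChangesetFile` objects.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger(__name__)


def review_all(
    files: List[Tuple[str, str]],
    review_one: Callable[[str, str], str],
) -> Dict[str, str]:
    """Review each (file_path, diff) pair, isolating failures per file."""
    reviews: Dict[str, str] = {}
    for file_path, diff in files:
        try:
            reviews[file_path] = review_one(file_path, diff)
        except Exception as e:
            # Only this file's entry records the failure; the loop continues
            logger.critical(f"Review failed for '{file_path}': {e}")
            reviews[file_path] = f"Fatal Error: {e}"
    return reviews


def flaky_reviewer(file_path: str, diff: str) -> str:
    """Hypothetical reviewer that crashes on one specific file."""
    if file_path == "broken.py":
        raise ValueError("unexpected crash")
    return f"Review for {file_path}: OK"


results = review_all(
    [("a.py", "+x = 1"), ("broken.py", "+y = 2"), ("c.py", "+z = 3")],
    flaky_reviewer,
)
for path, review in results.items():
    print(f"{path}: {review}")
```

Because the `try...except` sits inside the loop body, the crash on `broken.py` is recorded under that file's key while `a.py` and `c.py` still receive normal reviews. Placing the same block around the entire loop would instead stop at the first failure and leave later files unreviewed.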