# Lesson 23: Evaluator-Optimizer Pattern - Reviewing and Editing the Brown Agent

In this lesson, we'll explore how to implement the evaluator-optimizer pattern to review and edit generated articles. Building on the foundation from Lesson 22, we'll add a quality assurance layer that ensures the generated content meets all requirements.

Learning Objectives:

- Understand the evaluator-optimizer pattern and its real-world applications
- Implement an article reviewing system that checks content against multiple profiles
- Extend the article writer to handle review feedback
- Configure the entire system from a single YAML file
- Glue everything together into a robust LangGraph workflow


> [!NOTE]
> 💡 Remember that you can also run `brown` as a standalone Python package by going to `lessons/writing_workflow/` and following the instructions from there.

## 1. Setup

First, we run two IPython magic commands to automatically reload Python packages whenever they change:


In [1]:
%load_ext autoreload
%autoreload 2

### Set Up Python Environment

To set up your Python virtual environment using `uv` and load it into the Notebook, follow the step-by-step instructions from the `Course Admin` lesson from the beginning of the course.

**TL;DR:** Be sure the correct kernel pointing to your `uv` virtual environment is selected.


### Configure Gemini API

To run this lesson, you'll need the following API key configured:

1. **Gemini API Key**, `GOOGLE_API_KEY` variable: Get your key from [Google AI Studio](https://aistudio.google.com/app/apikey).

In [2]:
from utils import env

env.load(required_env_vars=["GOOGLE_API_KEY"])

Environment variables loaded from `/Users/pauliusztin/Documents/01_projects/TAI/course-ai-agents/.env`
Environment variables loaded successfully.


### Import Key Packages


In [3]:
import nest_asyncio
from utils import pretty_print

nest_asyncio.apply()  # Allow nested async usage in notebooks

pretty_print.wrapped("Using Pretty Prints")

----------------------------------------------------------------------------------------------------
  Using Pretty Prints
----------------------------------------------------------------------------------------------------


### Download Required Files

We need to download the configuration files and input data that Brown uses for article generation and editing.

First, let's download the configs folder:

In [4]:
%%capture

!rm -rf configs
!curl -L -o configs.zip https://raw.githubusercontent.com/iusztinpaul/agentic-ai-engineering-course-data/main/data/configs.zip
!unzip configs.zip
!rm -rf configs.zip

Now, let's download the inputs folder containing profiles, examples, and test data:

In [5]:
%%capture

!rm -rf inputs
!curl -L -o inputs.zip https://raw.githubusercontent.com/iusztinpaul/agentic-ai-engineering-course-data/main/data/inputs.zip
!unzip inputs.zip
!rm -rf inputs.zip

Let's verify what we downloaded:

In [6]:
%ls

aritcle_guideline.md   inputs/                notebook_guideline.md
configs/               notebook.ipynb


### Set Up Directory Constants

Now let's define constants to reference these directories throughout the notebook:

In [7]:
from pathlib import Path

CONFIGS_DIR = Path("configs")
INPUTS_DIR = Path("inputs")

# Verify they exist
print(f"Configs directory exists: {CONFIGS_DIR.exists()}")
print(f"Inputs directory exists: {INPUTS_DIR.exists()}")

Configs directory exists: True
Inputs directory exists: True


In [8]:
SAMPLE_DIR = Path("inputs/tests/01_sample")
EXAMPLES_DIR = Path("inputs/examples/course_lessons")
PROFILES_DIR = Path("inputs/profiles")

print(f"Samples directory exists: {SAMPLE_DIR.exists()}")
print(f"Examples directory exists: {EXAMPLES_DIR.exists()}")
print(f"Profiles directory exists: {PROFILES_DIR.exists()}")

Samples directory exists: True
Examples directory exists: True
Profiles directory exists: True


## 2. How the Writing Agent Works with Review-Edit Loop

Before diving into the implementation, let's understand how the writing agent now incorporates the review-editing process through the evaluator-optimizer pattern.

### The Extended Workflow

In Lesson 22, we learned about the three-step workflow:

1. **Load Context into Memory** - Gather guidelines, research, profiles, and examples
2. **Generate Media Items** - Use the orchestrator-worker pattern to create diagrams
3. **Write the Article** - Generate the first draft using the ArticleWriter

Now we add a fourth and a fifth step that run in a loop:

4. **Review the Article** (Evaluator) - Check the article against all profiles and guidelines
5. **Edit the Article** (Optimizer) - Fix all identified issues based on the reviews

This review-edit pattern continues for a configurable number of iterations, gradually improving the article quality.

### The Evaluator-Optimizer Pattern Explained

The evaluator-optimizer pattern is a fundamental AI workflow pattern that mirrors real-world quality assurance processes:

- **Evaluator**: Analyzes output and identifies issues or areas for improvement
- **Optimizer**: Takes the feedback and makes targeted improvements

In our case:
- **Article Reviewer Node** = Evaluator (checks if article follows all the standards)
- **Article Writer Node** = Optimizer (edits the article based on reviews)

This approach closely mirrors how a real-world writing process works:

1. The writer writes the article (initial draft)
2. A reviewer provides feedback with a fresh pair of eyes
3. The same writer edits the article based on the provided feedback
4. Repeat steps 2-3 until satisfied
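
The loop above can be sketched in a few lines of plain Python. This is a minimal illustration, not Brown's implementation: `evaluate`, `optimize`, and `run_review_edit_loop` are hypothetical stand-ins for the LLM-backed reviewer and writer, and the toy rules (no double spaces, text must end with a period) exist only to make the loop runnable. The early exit when no issues remain is also an assumption of this sketch; the lesson describes a configurable number of iterations.

```python
from typing import Callable


def evaluate(text: str) -> list[str]:
    """Toy evaluator: return a list of issues found in the text."""
    issues = []
    if "  " in text:
        issues.append("double_space")
    if not text.endswith("."):
        issues.append("missing_period")
    return issues


def optimize(text: str, issues: list[str]) -> str:
    """Toy optimizer: apply one targeted fix per reported issue."""
    if "double_space" in issues:
        while "  " in text:
            text = text.replace("  ", " ")
    if "missing_period" in issues:
        text = text + "."
    return text


def run_review_edit_loop(
    draft: str,
    evaluator: Callable[[str], list[str]],
    optimizer: Callable[[str, list[str]], str],
    max_iterations: int = 2,
) -> tuple[str, int]:
    """Alternate review and edit steps for at most `max_iterations` rounds.

    Returns the final text and the number of edit rounds actually performed.
    """
    text = draft
    rounds = 0
    for _ in range(max_iterations):
        issues = evaluator(text)
        if not issues:  # Assumption: stop early once the evaluator is satisfied
            break
        text = optimizer(text, issues)
        rounds += 1
    return text, rounds


final_text, rounds = run_review_edit_loop("Agents  are   useful", evaluate, optimize)
print(final_text, rounds)
```

The evaluator only reports problems and the optimizer only acts on reported problems, which is the same separation of concerns the reviewer and writer nodes follow.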

### Workflow Visualization

Let's visualize the complete workflow with the review-edit loop:

<img src="https://raw.githubusercontent.com/iusztinpaul/agentic-ai-engineering-course-data/main/images/l23_writing_workflow.png" alt="Workflow" height="800"/>

## 3. Review Entities: Modeling Feedback

Now let's explore the new Pydantic entities we need for the review process. In Lesson 22, we already covered the core entities like `Article`, `ArticleGuideline`, and `ArticleProfiles`. Now we need entities to represent the reviewing logic.

### Why Two Types of Reviews?

We support two review modes:

1. **Whole Article Reviews**: Review the entire article from top to bottom
2. **Selected Text Reviews**: Review only a specific portion of the article

Most of the time, only a section of the article needs editing, not the whole thing. This targeted approach saves time and reduces API costs by only reviewing what matters.

### The Review Entities

From `brown.entities.reviews`, we have these core entities:


### 1. The Review Entity

A `Review` represents a single piece of feedback about the article:


```python
from pydantic import BaseModel, Field
from brown.entities.mixins import ContextMixin


class Review(BaseModel, ContextMixin):
    profile: str = Field(
        description="The profile type listing the constraints based on which we will write the comment."
    )
    location: str = Field(
        description="The location from within the article where the comment is made. For example, the title of a section."
    )
    comment: str = Field(
        description="The comment made by the reviewer stating the issue relative to the profile."
    )

    def to_context(self) -> str:
        return f"""
<{self.xml_tag}>
    <profile>{self.profile}</profile>
    <location>{self.location}</location>
    <comment>{self.comment}</comment>
</{self.xml_tag}>
"""
```

**Key Fields:**

- **profile**: Which requirement was violated (e.g., "tonality_profile", "article_guideline", "structure_profile")
- **location**: Where in the article the issue exists, usually the title of the article section (e.g., "Introduction - Second paragraph")
- **comment**: Detailed explanation of what's wrong and why it deviates from the requirement

**Example Review:**

```python
Review(
    profile="tonality_profile",
    location="Introduction - First paragraph",
    comment="The tone is overly formal. The tonality profile specifies a conversational, friendly tone. The current opening reads like an academic paper rather than an engaging blog post."
)
```


### 2. The ArticleReviews Entity

`ArticleReviews` bundles multiple reviews for the whole article:


```python
class ArticleReviews(BaseModel, ContextMixin):
    article: Article
    reviews: list[Review]

    def to_context(self, include_article: bool = False) -> str:
        reviews_str = "\n".join([review.to_context() for review in self.reviews])
        return f"""
<{self.xml_tag}>
    {f"<article>{self.article}</article>" if include_article else ""}
    <reviews>
    {reviews_str}
    </reviews>
</{self.xml_tag}>
"""

    def __str__(self) -> str:
        return f"Reviews(len_reviews={len(self.reviews)})"
```
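
To make the rendering concrete, here is a rough stand-alone sketch of the XML these `to_context` methods produce. It uses plain dataclasses instead of Pydantic and hard-codes the `<review>` and `<reviews>` tag names. In Brown the outer tag comes from `xml_tag` on `ContextMixin`, whose exact value is not shown here, so treat the tag names and the `ReviewSketch` and `reviews_to_context` names as assumptions.

```python
from dataclasses import dataclass


@dataclass
class ReviewSketch:
    """Simplified stand-in for the Review entity."""

    profile: str
    location: str
    comment: str

    def to_context(self) -> str:
        # Render one review as an XML block, one field per tag
        return (
            "<review>\n"
            f"    <profile>{self.profile}</profile>\n"
            f"    <location>{self.location}</location>\n"
            f"    <comment>{self.comment}</comment>\n"
            "</review>"
        )


def reviews_to_context(reviews: list[ReviewSketch]) -> str:
    """Join individual review blocks inside a single <reviews> wrapper."""
    body = "\n".join(review.to_context() for review in reviews)
    return f"<reviews>\n{body}\n</reviews>"


rendered = reviews_to_context(
    [
        ReviewSketch("tonality_profile", "Introduction - First paragraph", "Too formal."),
        ReviewSketch("article_guideline", "Conclusion - First paragraph", "Missing summary."),
    ]
)
print(rendered)
```

Wrapping each field in its own tag gives the downstream editing prompt an unambiguous way to tell where one review ends and the next begins.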


### 3. The SelectedText Entity

Before looking at `SelectedTextReviews`, we need the `SelectedText` entity from `brown.entities.articles`, which models a selected portion of the text together with the article it comes from:


```python
class SelectedText(BaseModel, ContextMixin):
    article: Article
    content: str
    first_line_number: int
    last_line_number: int

    def to_context(self) -> str:
        return f"""
<{self.xml_tag}>
    <content>{self.content}</content>
    <first_line_number>{self.first_line_number}</first_line_number>
    <last_line_number>{self.last_line_number}</last_line_number>
</{self.xml_tag}>
"""
```

**Key Features:**

- Contains the full `article` for context
- `content`: The specific text selection to review/edit
- Line numbers help locate the selection within the full article
- This enables targeted reviews of specific sections


### 4. The SelectedTextReviews Entity

`SelectedTextReviews` handles reviews for just a portion of the article:


```python
class SelectedTextReviews(BaseModel, ContextMixin):
    article: Article
    selected_text: SelectedText
    reviews: list[Review]

    def to_context(self, include_article: bool = False) -> str:
        reviews_str = "\n".join([review.to_context() for review in self.reviews])
        return f"""
<{self.xml_tag}>
    {f"<article>{self.article.to_context()}</article>" if include_article else ""}
    <selected_text>{self.selected_text.to_context()}</selected_text>
    <reviews>
    {reviews_str}
    </reviews>
</{self.xml_tag}>
"""
```

**Use Case:**

When a user identifies a specific problematic section, we can:
1. Create a `SelectedText` entity pointing to that section
2. Review only that selection (faster, cheaper)
3. Edit only that selection
4. Replace the selection in the full article

This is particularly useful for human-in-the-loop workflows where humans can highlight specific sections for improvement. More on this in Lesson 24.
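
The extract-and-replace steps from the list above can be sketched with plain string operations. The helpers `extract_selection` and `replace_selection` are hypothetical and not part of Brown. The sketch also assumes 0-based, end-exclusive line numbers (the same convention as Python slicing); Brown's own convention for `first_line_number` and `last_line_number` may differ.

```python
def extract_selection(article_text: str, first_line: int, last_line: int) -> str:
    """Return lines [first_line, last_line) of the article as one string."""
    lines = article_text.split("\n")
    return "\n".join(lines[first_line:last_line])


def replace_selection(article_text: str, first_line: int, last_line: int, edited: str) -> str:
    """Swap lines [first_line, last_line) for the edited text."""
    lines = article_text.split("\n")
    return "\n".join(lines[:first_line] + edited.split("\n") + lines[last_line:])


article_text = "# Title\nIntro line\nWeak paragraph\nAnother weak line\nConclusion"

selection = extract_selection(article_text, 2, 4)
edited = selection.upper()  # Placeholder for the real review and edit steps
updated = replace_selection(article_text, 2, 4, edited)
print(updated)
```

Because the edited text is split back into lines before being spliced in, the edit is free to grow or shrink the selection without disturbing the rest of the article.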


### Entity Relationships

Let's visualize how these entities relate:

```
Article
  └── ArticleReviews
       └── reviews: list[Review]

Article + SelectedText
  └── SelectedTextReviews
       ├── selected_text: SelectedText
       └── reviews: list[Review]
```


## 4. The Article Reviewer Node: The Evaluator

Now let's explore the `ArticleReviewer` node, which acts as the **evaluator** in our evaluator-optimizer pattern. This node analyzes articles against all requirements and generates detailed feedback.

Remember that the core expectations are that the article follows the article guidelines and that all the writing profiles are respected.

### Node Abstraction Recap

First, a quick reminder that we leverage the same `Node` abstraction from Lesson 22 to implement all our nodes.


```python
from abc import ABC, abstractmethod
from typing import Any

from langchain_core.runnables import Runnable

from brown.nodes.base import Toolkit  # Node is redefined below for illustration


class Node(ABC):
    def __init__(self, model: Runnable, toolkit: Toolkit) -> None:
        self.toolkit = toolkit
        self.model = self._extend_model(model)

    def _extend_model(self, model: Runnable) -> Runnable:
        # Can be overridden to bind tools, structured output, etc.
        return model

    @abstractmethod
    async def ainvoke(self) -> Any:
        pass
```

All workflow nodes inherit from this base class, providing a consistent interface throughout the system.


### ArticleReviewer Class Structure

Let's examine the `ArticleReviewer` class from `brown.nodes.article_reviewer`:

**1. The Class and Initialization:**

```python
class ArticleReviewer(Node):
    system_prompt_template = """..."""  # We'll see this shortly
    selected_text_system_prompt_template = """..."""

    def __init__(
        self,
        to_review: Article | SelectedText,
        article_guideline: ArticleGuideline,
        model: Runnable,
        article_profiles: ArticleProfiles,
    ) -> None:
        self.to_review = to_review
        self.article_guideline = article_guideline
        self.article_profiles = article_profiles

        super().__init__(model, toolkit=Toolkit(tools=[]))
```

**Key Design Decisions:**

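- `to_review` can be either a full `Article` or just a `SelectedText`, so one class handles both review modes (polymorphic design)
- Takes all the requirements: the article guideline and the writing profiles
- Uses an empty toolkit, since reviewing is a pure generation task that needs no tools
- The snippet is abridged: the full class also accepts an optional `human_feedback` argument (passed as `None` in the examples below), and `ainvoke` relies on `self.article` and `self.is_selected_text`, which are not shown here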


**2. Model Extension with Structured Output:**

```python
    def _extend_model(self, model: Runnable) -> Runnable:
        model = cast(BaseChatModel, super()._extend_model(model))
        model = model.with_structured_output(ReviewsOutput)
        
        return model
```

The reviewer uses structured output to ensure we get properly formatted reviews. First, we need an intermediate Pydantic model:

```python
class ReviewsOutput(BaseModel):
    reviews: list[Review]
```

**Why an intermediate model?**

The LLM outputs `ReviewsOutput`, but the node returns either `ArticleReviews` or `SelectedTextReviews` (which also include the article or the selected text). This separation keeps the schema the LLM has to fill as small as possible, which reduces the chance of malformed output, while still allowing richer node outputs.
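
As a rough illustration of this separation, the sketch below uses plain dataclasses in place of the Pydantic models. All names ending in `Sketch`, as well as `wrap_llm_output`, are hypothetical. The point is only that the article is attached by our own code after generation, so the LLM never has to echo it back.

```python
from dataclasses import dataclass


@dataclass
class ReviewSketch:
    profile: str
    location: str
    comment: str


@dataclass
class ReviewsOutputSketch:
    """What the LLM is asked to produce: only the reviews."""

    reviews: list[ReviewSketch]


@dataclass
class ArticleReviewsSketch:
    """What the node returns: the reviews plus the article they refer to."""

    article: str
    reviews: list[ReviewSketch]


def wrap_llm_output(article: str, llm_output: ReviewsOutputSketch) -> ArticleReviewsSketch:
    # The article comes from the node's own state, not from the LLM response
    return ArticleReviewsSketch(article=article, reviews=llm_output.reviews)


llm_output = ReviewsOutputSketch(
    reviews=[ReviewSketch("tonality_profile", "Introduction - First paragraph", "Too formal.")]
)
node_output = wrap_llm_output("# My article", llm_output)
print(node_output.article, len(node_output.reviews))
```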


**3. The ainvoke Method:**

```python
    async def ainvoke(self) -> ArticleReviews | SelectedTextReviews:
        # Build the main system prompt with all requirements
        system_prompt = self.system_prompt_template.format(
            human_feedback=self.human_feedback.to_context() if self.human_feedback else "",
            article=self.article.to_context(),
            article_guideline=self.article_guideline.to_context(),
            character_profile=self.article_profiles.character.to_context(),
            article_profile=self.article_profiles.article.to_context(),
            structure_profile=self.article_profiles.structure.to_context(),
            mechanics_profile=self.article_profiles.mechanics.to_context(),
            terminology_profile=self.article_profiles.terminology.to_context(),
            tonality_profile=self.article_profiles.tonality.to_context(),
        )
        
        user_input_content = self.build_user_input_content(inputs=[system_prompt])
        inputs = [{"role": "user", "content": user_input_content}]
        
        # If reviewing selected text, add additional instructions
        if self.is_selected_text:
            inputs.extend([
                {
                    "role": "user",
                    "content": self.selected_text_system_prompt_template.format(
                        selected_text=self.to_review.to_context()
                    ),
                }
            ])
        
        # Generate reviews
        reviews = await self.model.ainvoke(inputs)
        if not isinstance(reviews, ReviewsOutput):
            raise InvalidOutputTypeException(ReviewsOutput, type(reviews))
        
        # Return appropriate review type
        if self.is_selected_text:
            return SelectedTextReviews(
                article=self.article,
                selected_text=cast(SelectedText, self.to_review),
                reviews=reviews.reviews,
            )
        else:
            return ArticleReviews(
                article=self.article,
                reviews=reviews.reviews,
            )
```

**Flow:**

1. Format the system prompt with all requirements
2. If reviewing selected text, add special instructions
3. Generate structured reviews from the LLM
4. Package the output entity into the appropriate review type
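
The message-building part of this flow (steps 1 and 2) can be sketched as a small stand-alone function. `build_inputs` is a hypothetical helper, and it simplifies the real code by using plain strings as message content instead of the result of `build_user_input_content`.

```python
from typing import Optional


def build_inputs(system_prompt: str, selected_text_prompt: Optional[str] = None) -> list[dict[str, str]]:
    """Build the message list: the main prompt first, then the optional extension."""
    inputs = [{"role": "user", "content": system_prompt}]
    if selected_text_prompt is not None:
        # Selected-text mode: append the extra instructions as a second message
        inputs.append({"role": "user", "content": selected_text_prompt})
    return inputs


whole_article_inputs = build_inputs("Review the whole article.")
selected_text_inputs = build_inputs("Review the whole article.", "Focus on the selected text.")
print(len(whole_article_inputs), len(selected_text_inputs))
```

Keeping the main prompt identical in both modes means the selected-text instructions only ever add to the context and never replace it.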


**4. The System Prompt (Main Review Logic):**

Here's the system prompt, which is designed to produce thorough, actionable reviews based on the article guideline and writing profiles:

```python
class ArticleReviewer(Node):
    system_prompt_template = """
You are Brown, an expert article writer, editor and reviewer specialized in reviewing technical, educative and informational articles.

Your task is to review a given article against a set of expected requirements and provide detailed feedback 
about any deviations. You will act as a quality assurance reviewer, identifying specific issues and suggesting 
how the article fails to meet the expected requirements.

These reviews will further be used to edit the article, ensuring it follows all the requirements.

## Requirements

The requirements are a set of rules, guidelines or profiles that the article should follow. Here they are:

- **article guideline:** the user intent describing how the article should look like. Specific to this particular article.
- **article profile:** rules specific to writing articles. Generic for all articles.
- **character profile:** the character you will impersonate while writing. Generic for all content.
- **structure profile:** Structure rules guiding the final output format. Generic for all content.
- **mechanics profile:** Mechanics rules guiding the writing process. Generic for all content.
- **terminology profile:** Terminology rules guiding word choice and phrasing. Generic for all content.
- **tonality profile:** Tonality rules guiding the writing style. Generic for all content.

## Article to Review

Here is the article that needs to be reviewed:

{article}

## Article Guideline

The <article_guideline> represents the user intent, describing how the actual article should look like.

The <article_guideline> will ALWAYS contain:
- all the sections of the article expected to be written, in the correct order
- a level of detail for each section, describing what each section should contain. Depending on how much detail you have in a
particular section of the <article_guideline>, you will use more or less information from the <research> tags to write the section.

The <article_guideline> can ALSO contain:
- length constraints for each section, such as the number of characters, words or reading time. If present, you will respect them.
- important (golden) references as URLs or titles present in the <research> tags. If present, always prioritize them over anything else 
from the <research>.
- information about anchoring the article into a series such as a course or a book. Extremely important when the article is part of 
something bigger and we have to anchor the article into the learning journey of the reader. For example, when introducing concepts
in previous articles that we don't want to reintroduce into the current one.
- concrete information about writing the article. If present, you will ALWAYS prioritize the instructions from the <article_guideline> 
over any other instructions.

Here is the article guideline:
{article_guideline}

## Character Profile

To make the writing more personable, we impersonated the following character profile when writing the article:
{character_profile}

## Terminology Profile

Here is the terminology profile, describing how to choose the right words and phrases
for the target audience:
{terminology_profile}

## Tonality Profile

Here is the tonality profile, describing the tone, voice and style of the writing:
{tonality_profile}

## Mechanics Profile

Here is the mechanics profile, describing how the sentences and words should be written:
{mechanics_profile}

## Structure Profile

Here is the structure profile, describing general rules on how to structure text, such as the sections, paragraphs, lists,
code blocks, or media items:
{structure_profile}

## Article Profile

Here is the article profile, describing particularities on how the end-to-end article should look like:
{article_profile}

## Reviewing Process

You will review the article against all the requirements above, creating a one-to-many relationship between each requirement and the 
number of required reviews. In other words, for each requirement, you will create 0 to N reviews. If the article follows the 
requirement 100%, you will not create any reviews for it. If it doesn't follow the requirement, you will create as many reviews 
as required to ensure the article follows the requirement.

Remember that these reviews will further be used to edit the article, ensuring it follows all the requirements. Thus, it's
important to make a thorough review, covering all the requirements and not missing any detail.

## Reviewing Rules

- **The first most important rule:** The requirements can contain some special sections labeled as "rules" or 
"correction rules". You should look for <(.*)?rules(.*)?> XML tags like <correction_media_rules>, 
<abbreviations_or_acronyms_never_to_expand_rules>, <correction_reference_rules>. These are special highlights that 
should always be prioritized over other rules during the review process. They should be respected at all costs when 
writing the article. You will always prioritize these rules over other rules from the requirements making them your 
No.1 focus.
- **The second most important rule:** The adherence to the <article_guideline>.
- **The third most important rule:** The adherence to the <article_profile>.
- **The fourth most important rule:** The adherence to the rest of the requirements.

Other more generic rules:
- Be thorough but fair - only flag genuine issues
- Emphasize WHY something is wrong, not just WHAT is wrong
- Focus on significant deviations, not minor nitpicks 

## Output Format

For each issue you identify, create a review with:
- **profile**: The requirement where the issue was found (e.g., "human_feedback", "article_guideline", "character_profile", 
"article_profile", "structure_profile", "mechanics_profile", "terminology_profile", "tonality_profile")
- **location**: The section title where the issue was found and the paragraph number. For example, "Introduction - First paragraph" 
or "Implementing GraphRAG - Third paragraph"
- **comment**: A detailed explanation of why it's wrong, what's wrong and how it deviates from the requirement.

## Chain of Thoughts

1. Read and analyze the article.
2. Read and analyze the <human_feedback>.
3. Read and analyze all the requirements considering the <human_feedback> as a guiding force.
4. Carefully compare the article against the requirements as instructed by the rules above.
5. For each requirement, create 0 to N reviews
6. Return the reviews of the article.
"""
```

**Key Prompt Engineering Techniques:**

1. **Clear Role**: Expert reviewer with specific expertise
2. **Explicit Priority System**: Rules are ranked (special rules > guideline > article profile > other profiles)
3. **Output**: Clear instructions on what we want the LLM to fill for each attribute
4. **Chain of Thought**: Explicit reasoning steps that glue together all the other sections
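
The top-priority rule in the prompt tells the LLM to look for XML tags matching `<(.*)?rules(.*)?>`. If you want to check which of these special tags your own profiles define, a small sketch like the following may help. It is not part of Brown: `find_rules_tags` is a hypothetical helper, the sample profile text is invented, and the pattern uses `\w*` instead of `.*` so that a single match cannot span several tags.

```python
import re

# Matches opening tags whose name contains "rules", e.g. <correction_media_rules>
RULES_TAG_PATTERN = re.compile(r"<(\w*rules\w*)>")


def find_rules_tags(profile_text: str) -> list[str]:
    """Return the names of opening XML tags that contain the word 'rules'."""
    return RULES_TAG_PATTERN.findall(profile_text)


sample_profile = (
    "<correction_media_rules>Always caption figures.</correction_media_rules>\n"
    "<tone>Keep it conversational.</tone>\n"
    "<correction_reference_rules>Cite sources inline.</correction_reference_rules>"
)

print(find_rules_tags(sample_profile))
# ['correction_media_rules', 'correction_reference_rules']
```

Closing tags are skipped because the `/` after `<` does not match `\w`.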


**5. The Selected Text System Prompt:**

When reviewing only a selected portion, we append additional instructions:

```python
class ArticleReviewer(Node):
    system_prompt_template = """..."""
    
    selected_text_system_prompt_template = """
You already reviewed and edited the whole article. Now we want to further review only a specific portion
of the article, which we label as the <selected_text>. Despite reviewing the selected text, instead of the
article as a whole, you will follow the exact same instructions from above as if you were reviewing the article as a whole.

## Selected Text to Review

Here is the selected text that needs to be reviewed:

{selected_text}

As pointed out before, the selected text is part of the larger <article> that is already reviewed.
You will use the full <article> as context and anchoring the reviewing process within the bigger picture.

The <first_line_number> and <last_line_number> numbers from the <selected_text> indicate the first and 
last line/row numbers of the selected text from the <article>. Use them to locate the selected text within the <article>.

## Chain of Thoughts

Here is the new chain of thoughts logic you will follow when reviewing the selected text. You can ignore the
previous chain of thoughts:

1. Read and analyze the article.
2. Locate the <selected_text> within the <article> based on the <first_line_number> and <last_line_number>.
3. Read and analyze the <human_feedback>.
4. Read and analyze all the requirements considering the <human_feedback> as a guiding force.
5. Carefully compare the selected text against the requirements as instructed by the rules above.
6. For each requirement, create 0 to N reviews
7. Return the reviews of the selected text.
"""
```

This allows focused reviews on specific sections while keeping the full article as context. Because this prompt is sent together with `system_prompt_template`, it only needs to act as an extension that explains what to do with the selected text.

The trick is the new `Chain of Thoughts` section: it overrides the one from the original system prompt with reasoning steps tailored to the new task, while the model still has the context from both prompts.


### Example: Reviewing a Whole Article

Now let's see the `ArticleReviewer` in action by reviewing a sample article.

First, load the sample article guideline and the standard profiles:



In [9]:
from brown.loaders import MarkdownArticleGuidelineLoader, MarkdownArticleLoader, MarkdownArticleProfilesLoader
from brown.models import SupportedModels, get_model
from brown.nodes import ArticleReviewer
from utils import pretty_print

# Load the article guideline
guideline_loader = MarkdownArticleGuidelineLoader(uri=Path("article_guideline.md"))
article_guideline = guideline_loader.load(working_uri=SAMPLE_DIR)

# Load the article profiles
profiles_input = {
    "article": PROFILES_DIR / "article_profile.md",
    "character": PROFILES_DIR / "character_profiles" / "paul_iusztin.md",
    "mechanics": PROFILES_DIR / "mechanics_profile.md",
    "structure": PROFILES_DIR / "structure_profile.md",
    "terminology": PROFILES_DIR / "terminology_profile.md",
    "tonality": PROFILES_DIR / "tonality_profile.md",
}
profiles_loader = MarkdownArticleProfilesLoader(uri=profiles_input)
article_profiles = profiles_loader.load(working_uri=SAMPLE_DIR)

pretty_print.wrapped(article_guideline.content[:1000], title="Sample article guideline")

2025-11-25 20:24:02.541 | INFO     | brown.config:<module>:10 - Loading environment file from `.env`


------------------------------------- Sample article guideline -------------------------------------
  ## Outline

1. Introduction: The Critical Decision Every AI Engineer Faces
2. Understanding the Spectrum: From Workflows to Agents
3. Choosing Your Path
4. Exploring Common Patterns
5. Zooming In on Our Favorite Examples
6. The Challenges of Every AI Engineer

## Section 1 - Introduction: The Critical Decision Every AI Engineer Faces

- **The Problem:** When building AI applications, engineers face a critical architectural decision early in their development process. Should they create a predictable, step-by-step workflow where they control every action, or should they build an autonomous agent that can think and decide for itself? This is one of the key decisions that will impact everything from the product such as development time and costs to reliability and user experience.
- **Why This Decision Matters:** Choose the wrong approach and you might end up with:
  - An overly

Load the sample article to review:

In [10]:
article_loader = MarkdownArticleLoader(uri=Path("article.md"))
article = article_loader.load(working_uri=SAMPLE_DIR)

pretty_print.wrapped(article.content[:1500], title="Sample article to review")

------------------------------------- Sample article to review -------------------------------------
  # AI Agents vs. LLM Workflows: The Critical Decision Every AI Engineer Faces
### A pragmatic guide to choosing the right architecture for your AI application.

When building AI applications, engineers face a critical architectural decision early in their development process. Should you create a predictable, step-by-step workflow where you control every action, or should you build an autonomous agent that can think and decide for itself? This is one of the key decisions that will impact everything from development time and costs to reliability and user experience.

Choose the wrong approach, and you might end up with an overly rigid system that breaks when users deviate from expected patterns. Or you could build an unpredictable agent that works brilliantly 80% of the time but fails catastrophically when it matters most. Either path can lead to months of wasted development tim

Now run the article reviewer:

In [11]:
# Create and run the reviewer
model = get_model(SupportedModels.GOOGLE_GEMINI_25_FLASH)
reviewer = ArticleReviewer(
    to_review=article,
    article_guideline=article_guideline,
    article_profiles=article_profiles,
    model=model,
    human_feedback=None,  # No human feedback for this example
)

print("Reviewing article...")
article_reviews = await reviewer.ainvoke()

pretty_print.wrapped(f"Generated {len(article_reviews.reviews)} reviews:", title="Article reviews")
for i, review in enumerate(article_reviews.reviews, 1):
    review_dict = {
        "Profile": review.profile,
        "Location": review.location,
        "Comment": review.comment[:200] + "..." if len(review.comment) > 200 else review.comment,
    }
    pretty_print.wrapped(review_dict, title=f"Review {i}")

Reviewing article...
----------------------------------------- Article reviews -----------------------------------------
  Generated 87 reviews:
----------------------------------------------------------------------------------------------------
--------------------------------------------- Review 1 ---------------------------------------------
  {
  "Profile": "article_guideline",
  "Location": "Introduction - First paragraph",
  "Comment": "The introduction does not fully align with the 'The Problem' detail from the article guideline. It should emphasize the architectural decision as the core problem, not just 'one of the key decisions'...."
}
----------------------------------------------------------------------------------------------------
--------------------------------------------- Review 2 ---------------------------------------------
  {
  "Profile": "article_guideline",
  "Location": "Introduction - Second paragraph",
  "Comment":

### Example: Reviewing Selected Text

Now let's review only a specific section of the article:


In [12]:
from brown.entities.articles import SelectedText

# Let's extract a specific section to review
article_lines = article.content.split("\n")
first_line_number = 11
last_line_number = 44
selected_content = "\n".join(article_lines[first_line_number:last_line_number])

selected_text = SelectedText(
    article=article,
    content=selected_content,
    first_line_number=first_line_number,
    last_line_number=last_line_number,
)

text = [
    f"Selected text: {len(selected_content)} characters",
    f"Lines: {selected_text.first_line_number}-{selected_text.last_line_number}",
]
pretty_print.wrapped("\n".join(text), title="Selected text to review")
pretty_print.wrapped(f"{selected_text.to_context()[:1500]}\n...", title="Selected text context (first 1500 characters)")

------------------------------------- Selected text to review -------------------------------------
  Selected text: 2577 characters
Lines: 11-44
----------------------------------------------------------------------------------------------------
-------------------------- Selected text context (first 1500 characters) --------------------------
  
<selected_text>
    
    <content>## Understanding the Spectrum: From Workflows to Agents

Before we can choose between workflows and agents, we need a clear understanding of what they are. Rather than focusing on the technical specifics, let's look at their core properties and how they function in practice.

An LLM workflow is a sequence of tasks that involves LLM calls or other operations, such as reading from a database or writing to a file system. It is largely predefined and orchestrated by developer-written code. The steps are defined in advance, resulting in deterministic or rule-based paths with predictable 

Now, let's review the selected text. Note that we reuse the same `ArticleReviewer` class for both input types, which keeps the reviewing logic in a single place:

In [13]:
model = get_model(SupportedModels.GOOGLE_GEMINI_25_FLASH)
reviewer = ArticleReviewer(
    to_review=selected_text,  # Now passing SelectedText instead of Article
    article_guideline=article_guideline,
    article_profiles=article_profiles,
    model=model,
    human_feedback=None,
)

print("Reviewing selected text...")
selected_text_reviews = await reviewer.ainvoke()

pretty_print.wrapped(
    f"Generated {len(selected_text_reviews.reviews)} reviews for selected text:", title="Selected text reviews"
)
for i, review in enumerate(selected_text_reviews.reviews, 1):
    review_dict = {
        "Profile": review.profile,
        "Location": review.location,
        "Comment": review.comment[:200] + "..." if len(review.comment) > 200 else review.comment,
    }
    pretty_print.wrapped(review_dict, title=f"Review {i}")

Reviewing selected text...
[93m-------------------------------------- Selected text reviews --------------------------------------[0m
  Generated 3 reviews for selected text:
[93m----------------------------------------------------------------------------------------------------[0m
[93m--------------------------------------------- Review 1 ---------------------------------------------[0m
  {
  "Profile": "mechanics_profile",
  "Location": "Understanding the Spectrum: From Workflows to Agents - First paragraph",
  "Comment": "The sentence 'Before we can choose between workflows and agents, we need a clear understanding of what they are.' uses 'we' to refer to the student, which is incorrect. The point of view rule states t..."
}
[93m----------------------------------------------------------------------------------------------------[0m
[93m--------------------------------------------- Review 2 ---------------------------------------------[0m
  {
  "Profile": "article_guideline"

## 5. Hooking Reviews to the Article Writer

Now let's see how the `ArticleWriter` node handles reviews to act as the **optimizer** in our evaluator-optimizer pattern.

### Design Philosophy

To keep the "writing" logic contained and avoid duplicated code, the `ArticleWriter` serves dual purposes:

1. **Writer**: Generates the initial article draft
2. **Editor**: Edits the article based on reviews

This mirrors real-world writing processes where the original author both writes and edits their own work based on feedback. It keeps all writing knowledge in one place.


### Changes to `ArticleWriter.__init__`

The `ArticleWriter` now accepts an optional `reviews` parameter:


```python
class ArticleWriter(Node):
    def __init__(
        self,
        article_guideline: ArticleGuideline,
        research: Research,
        article_profiles: ArticleProfiles,
        media_items: MediaItems,
        article_examples: ArticleExamples,
        model: Runnable,
        reviews: ArticleReviews | SelectedTextReviews | None = None,  # NEW!
    ) -> None:
        super().__init__(model, toolkit=Toolkit(tools=[]))
        
        self.article_guideline = article_guideline
        self.research = research
        self.article_profiles = article_profiles
        self.media_items = media_items
        self.article_examples = article_examples
        self.reviews = reviews  # Store reviews for editing mode
```

**Key Insight:**

When `reviews=None`, the writer generates a new article from scratch. When reviews are provided, it edits the existing article based on the feedback.


### Changes to the ainvoke Method

The `ainvoke` method now handles both writing and editing:
```python
async def ainvoke(self) -> Article | SelectedText:
    # Step 1: Build the main system prompt (same as before)
    system_prompt = self.system_prompt_template.format(
        article_guideline=self.article_guideline.to_context(),
        research=self.research.to_context(),
        # ... all other context ...
    )
    
    user_input_content = self.build_user_input_content(
        inputs=[system_prompt], 
        image_urls=self.research.image_urls
    )
    inputs = [{"role": "user", "content": user_input_content}]
    
    # Step 2: If reviews exist, add them to the conversation
    if self.reviews:
        # First, provide the previously written article as the assistant's response.
        # This is important because the editing will be done relative to the article.
        # Thus, we have to anchor the reviews on the evaluated article.
        inputs.extend([
            {
                "role": "assistant",
                "content": self.reviews.article.to_context(),
            },
        ])
        
        # Then, provide the reviews as user feedback, along with a new system prompt that
        # instructs the agent on how to edit the article. In this way, we can "hijack"
        # the original system template to edit the article instead of writing it from scratch.
        if isinstance(self.reviews, ArticleReviews):
            reviews_prompt = self.article_reviews_prompt_template.format(
                reviews=self.reviews.to_context(include_article=False),
            )
        elif isinstance(self.reviews, SelectedTextReviews):
            reviews_prompt = self.selected_text_reviews_prompt_template.format(
                selected_text=self.reviews.selected_text.to_context(),
                reviews=self.reviews.to_context(include_article=False),
            )
        
        inputs.extend([{"role": "user", "content": reviews_prompt}])
    
    # Step 3: Generate/edit the article
    written_output = await self.model.ainvoke(inputs)
    written_output = cast(str, written_output.text)
    
    # Step 4: Return appropriate type
    if isinstance(self.reviews, SelectedTextReviews):
        return SelectedText(
            article=self.reviews.article,
            content=written_output,
            first_line_number=self.reviews.selected_text.first_line_number,
            last_line_number=self.reviews.selected_text.last_line_number,
        )
    else:
        return Article(content=written_output)
```

**Context engineering for editing:**

1. **User**: System prompt with all context (guidelines, profiles, etc.)
2. **Assistant**: The previously written article
3. **User**: The reviews with specific issues to fix
4. **Assistant**: The edited article (generated by LLM)
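The turn order above can be sketched as a plain list of role/content dictionaries. This is a minimal illustration, not the `ArticleWriter` implementation: `build_editing_messages` and the placeholder strings are hypothetical, and the real node also attaches image URLs to the first user turn.

```python
from typing import TypedDict


class Message(TypedDict):
    role: str
    content: str


def build_editing_messages(
    system_prompt: str, previous_article: str, reviews_prompt: str
) -> list[Message]:
    """Assemble the three input turns; the model's reply becomes the fourth turn."""
    return [
        {"role": "user", "content": system_prompt},  # 1. full writing context
        {"role": "assistant", "content": previous_article},  # 2. draft being edited
        {"role": "user", "content": reviews_prompt},  # 3. reviews to address
    ]


messages = build_editing_messages(
    system_prompt="<guidelines, research, profiles>",
    previous_article="# Draft title\n...",
    reviews_prompt="Here are the reviews you have to fix: ...",
)
print([message["role"] for message in messages])
```

Placing the draft in an assistant turn lets the model treat it as its own earlier output, so the reviews in the next user turn read as feedback on that draft.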

### The Review Prompt Templates

The writer has two additional prompt templates for handling reviews: one for editing the whole article and one for editing the selected text.

**1. Article Reviews Prompt:**

```python
article_reviews_prompt_template = """
We personally reviewed the article and compiled a list of reviews based on which you have to 
edit the article you wrote one step before.

## Reviewing Logic

Here is how we created the feedback reviews:
- We compared the article against the <article_guideline> to ensure it follows user intent
- We compared against all profile constraints
- Manual human reviews create special "human_feedback" reviews (highest priority)
- For each broken rule, we created a review

## Ranking the Importance of the Reviews

1. Always prioritize the human feedback reviews above everything else
2. Next prioritize reviews based on the <article_guideline>
3. Finally prioritize reviews based on other profiles

## Reviews

Here are the reviews you have to fix:
{reviews}

## Chain of Thought

1. Analyze the reviews to understand what needs to be changed
2. Prioritize the reviews based on the importance ranking
3. Apply necessary edits while following all instructions from profiles and guidelines
4. Ensure edited text is still anchored in <research> and <article_guideline>
5. Ensure edited text flows naturally with surrounding content
6. Return the fully edited article
"""
```

**Key Features:**

- Explains the review creation process
- Provides clear priority ranking
- Adds a chain-of-thought section that lists the editing steps and ends with the final task: returning the fully edited article
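The prompt leaves the ranking to the LLM. If you wanted to enforce the same order in code before formatting the reviews, a sketch could look like the following. `ReviewStub` and `priority` are hypothetical, and the sketch assumes human feedback reviews carry the profile name `human_feedback`.

```python
from dataclasses import dataclass


@dataclass
class ReviewStub:
    """Hypothetical stand-in for a review with only the fields used here."""

    profile: str
    comment: str


def priority(review: ReviewStub) -> int:
    """Lower value means higher priority, following the prompt's ranking."""
    if review.profile == "human_feedback":
        return 0
    if review.profile == "article_guideline":
        return 1
    return 2  # every other profile


reviews = [
    ReviewStub("mechanics_profile", "..."),
    ReviewStub("article_guideline", "..."),
    ReviewStub("human_feedback", "..."),
    ReviewStub("structure_profile", "..."),
]

# sorted() is stable, so reviews with equal priority keep their original order.
ordered = sorted(reviews, key=priority)
print([review.profile for review in ordered])
```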


**2. Selected Text Reviews Prompt:**

```python
selected_text_reviews_prompt_template = """
We personally reviewed only a portion of the article and compiled reviews for editing just that 
selected text.

## Selected Text to Edit

{selected_text}

Remember this selected text is part of the article from one step before. Anchor your editing within 
the broader context of the article.

Selected text editing guidelines:
- Keep selected text consistent with surrounding article context
- Use first and last line numbers to locate the selection
- Only edit the selected text, don't modify the entire article

## [Rest similar to article reviews prompt - reviewing logic, priority ranking, etc.]

{reviews}

## Chain of Thought

1. Place the selected text in context of the full article
2. Analyze the reviews
3. Prioritize reviews based on importance ranking
4. Apply edits while following all instructions
5. Ensure edited selected text is still anchored in research/guideline
6. Ensure edited selected text flows naturally with surrounding content
7. Return the fully edited selected text
"""
```

The prompt is similar to the one for editing the whole article, but it adds explicit instructions on how to handle the selected text. Remember that an LLM knows nothing about your application or business logic beyond what the prompt tells it. Thus, you have to spell out every step of the process for this to work well.
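Once the selected text has been edited, it still has to be placed back into the full article. How `brown` does this is not shown in this lesson, so the helper below is only a sketch: `splice_selected_text` is a hypothetical name, and it assumes the same 0-based, end-exclusive line range used when slicing `article_lines` earlier.

```python
def splice_selected_text(
    article_content: str,
    edited_selection: str,
    first_line_number: int,
    last_line_number: int,
) -> str:
    """Replace lines [first_line_number, last_line_number) with the edited text."""
    lines = article_content.split("\n")
    # Slice assignment handles an edited selection that grew or shrank.
    lines[first_line_number:last_line_number] = edited_selection.split("\n")
    return "\n".join(lines)


article_content = "line 0\nline 1\nline 2\nline 3"
updated = splice_selected_text(
    article_content,
    edited_selection="edited 1\nedited 1b",
    first_line_number=1,
    last_line_number=2,
)
print(updated)
```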


## 6. End-to-End Example: Review and Edit Loop

Now let's run a complete example showing the full workflow: generate media, write the article, review it, and edit it. First, we'll run it without LangGraph. Then, we'll glue everything together into a standalone LangGraph workflow that can be shipped to production.

### Step 1: Load all necessary context


In [14]:
from brown.loaders import (
    MarkdownArticleExampleLoader,
    MarkdownArticleGuidelineLoader,
    MarkdownArticleProfilesLoader,
    MarkdownResearchLoader,
)

pretty_print.wrapped("STEP 1: Loading Context", width=100)

# Load guideline
guideline_loader = MarkdownArticleGuidelineLoader(uri=Path("article_guideline.md"))
article_guideline = guideline_loader.load(working_uri=SAMPLE_DIR)

# Load research
research_loader = MarkdownResearchLoader(uri=Path("research.md"))
research = research_loader.load(working_uri=SAMPLE_DIR)

# Load profiles
profiles_input = {
    "article": PROFILES_DIR / "article_profile.md",
    "character": PROFILES_DIR / "character_profiles" / "paul_iusztin.md",
    "mechanics": PROFILES_DIR / "mechanics_profile.md",
    "structure": PROFILES_DIR / "structure_profile.md",
    "terminology": PROFILES_DIR / "terminology_profile.md",
    "tonality": PROFILES_DIR / "tonality_profile.md",
}
profiles_loader = MarkdownArticleProfilesLoader(uri=profiles_input)
article_profiles = profiles_loader.load()

# Load examples
examples_loader = MarkdownArticleExampleLoader(uri=EXAMPLES_DIR)
article_examples = examples_loader.load()

print(f"✓ Guideline: {len(article_guideline.content):,} characters")
print(f"✓ Research: {len(research.content):,} characters, {len(research.image_urls)} images")
print("✓ Profiles: 6 profiles loaded")
print(f"✓ Examples: {len(article_examples.examples)} article examples")

[93m----------------------------------------------------------------------------------------------------[0m
  STEP 1: Loading Context
[93m----------------------------------------------------------------------------------------------------[0m
✓ Guideline: 23,127 characters
✓ Research: 211,792 characters, 18 images
✓ Profiles: 6 profiles loaded
✓ Examples: 2 article examples


### Step 2: Generate media items (we'll skip this for brevity - covered in Lesson 22)

In [15]:
from brown.entities.media_items import MediaItems

pretty_print.wrapped("STEP 2: Generating Media Items", width=100)

# For this example, we'll use empty media items to save time
# In a real scenario, you'd run the MediaGeneratorOrchestrator as shown in Lesson 22
media_items_entity = MediaItems.build([])

print(f"✓ Media items: {len(media_items_entity.media_items)} items (using empty for demo)")

[93m----------------------------------------------------------------------------------------------------[0m
  STEP 2: Generating Media Items
[93m----------------------------------------------------------------------------------------------------[0m
✓ Media items: 0 items (using empty for demo)


### Step 3: Write the first draft of the article

In [16]:
from brown.nodes import ArticleWriter

pretty_print.wrapped("STEP 3: Writing Article (First Draft)", width=100)
print("This may take 1-2 minutes...")

writer_model = get_model(SupportedModels.GOOGLE_GEMINI_25_FLASH)
article_writer = ArticleWriter(
    article_guideline=article_guideline,
    research=research,
    article_profiles=article_profiles,
    media_items=media_items_entity,
    article_examples=article_examples,
    model=writer_model,
    reviews=None,  # No reviews for first draft
)

article = await article_writer.ainvoke()

print(f"✓ Article generated: {len(article.content):,} characters")
pretty_print.wrapped(article.content[:1000], title="First Draft (First 1000 chars)", width=120)

[93m----------------------------------------------------------------------------------------------------[0m
  STEP 3: Writing Article (First Draft)
[93m----------------------------------------------------------------------------------------------------[0m
This may take 1-2 minutes...
✓ Article generated: 25,636 characters
[93m-------------------------------------------- First Draft (First 1000 chars) --------------------------------------------[0m
  # AI Workflows vs. Agents: The Architectural Decision Every Engineer Faces
### Navigating the Spectrum of AI Autonomy to Build Production-Ready Systems

When you build AI applications, you face a critical architectural decision early in the development process. Will you create a predictable, step-by-step workflow where you control every action, or will you build an autonomous agent that can think and decide for itself? This choice impacts everything from development time and costs to reliability and user experience.

Choosing the wr

### Step 4: Review the article

In [17]:
pretty_print.wrapped("STEP 4: Reviewing Article", width=100)

reviewer_model = get_model(SupportedModels.GOOGLE_GEMINI_25_FLASH)
article_reviewer = ArticleReviewer(
    to_review=article,
    article_guideline=article_guideline,
    article_profiles=article_profiles,
    model=reviewer_model,
    human_feedback=None,
)

article_reviews = await article_reviewer.ainvoke()

print(f"✓ Generated {len(article_reviews.reviews)} reviews")
print("\nReviews:")
for i, review in enumerate(article_reviews.reviews, 1):
    print(f"\n  {i}. Profile: {review.profile}")
    print(f"     Location: {review.location}")
    print(
        f"     Comment: {review.comment[:150]}..." if len(review.comment) > 150 else f"     Comment: {review.comment}"
    )

[93m----------------------------------------------------------------------------------------------------[0m
  STEP 4: Reviewing Article
[93m----------------------------------------------------------------------------------------------------[0m
✓ Generated 36 reviews

Reviews:

  1. Profile: structure_profile
     Location: Image 1: A simple Retrieval-Augmented Generation (RAG) workflow.
     Comment: The image caption for 'Image 1' is missing the source URL and citation as required by the `image_caption` format in the `structure_profile`. It should...

  2. Profile: structure_profile
     Location: Image 2: The components of an LLM-powered agent (Image by author from [A Developer’s Guide to Building Scalable AI: Workflows vs Agents](https://towardsdatascience.com/a-developers-guide-to-building-scalable-ai-workflows-vs-agents/))
     Comment: The image caption for 'Image 2' incorrectly uses the format '(Image by author from [A Developer’s Guide to Building Scalable AI: Workflo

### Step 5: Edit the article based on reviews

In [18]:
pretty_print.wrapped("STEP 5: Editing Article Based on Reviews", width=100)
print("This may take 1-2 minutes...")

editor_model = get_model(SupportedModels.GOOGLE_GEMINI_25_FLASH)
article_editor = ArticleWriter(
    article_guideline=article_guideline,
    research=research,
    article_profiles=article_profiles,
    media_items=media_items_entity,
    article_examples=article_examples,
    model=editor_model,
    reviews=article_reviews,  # Pass reviews to trigger editing mode
)

edited_article = await article_editor.ainvoke()

print(f"✓ Article edited: {len(edited_article.content):,} characters")
pretty_print.wrapped(edited_article.content[:1000], title="Edited Article (First 1000 chars)", width=120)

[93m----------------------------------------------------------------------------------------------------[0m
  STEP 5: Editing Article Based on Reviews
[93m----------------------------------------------------------------------------------------------------[0m
This may take 1-2 minutes...
✓ Article edited: 27,590 characters
[93m------------------------------------------ Edited Article (First 1000 chars) ------------------------------------------[0m
  Here is the revised article, incorporating all the feedback and adhering to the specified profiles.

<article>
# AI Workflows vs. Agents: The Architectural Decision Every Engineer Faces
### Navigating the Spectrum of AI Autonomy to Build Production-Ready Systems

When you build AI applications, you face a critical architectural decision early in the development process. Will you create a predictable, step-by-step workflow where you control every action, or will you build an autonomous agent that can think and decide for itself? This 
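Notice that the edited output above starts with a conversational preamble and an `<article>` tag before the actual content. If you wanted to keep only the article body, a small post-processing step could strip that wrapper. The helper below is a sketch: `extract_article_body` is a hypothetical name, and it assumes the body sits between `<article>` and an optional closing `</article>` tag.

```python
import re


def extract_article_body(raw_output: str) -> str:
    """Return the text inside <article>...</article>, or the stripped input if no tag is found."""
    match = re.search(
        r"<article>\s*(.*?)\s*(?:</article>|\Z)", raw_output, flags=re.DOTALL
    )
    return match.group(1) if match else raw_output.strip()


raw_output = (
    "Here is the revised article, incorporating all the feedback.\n\n"
    "<article>\n# Title\nBody\n</article>\n"
)
print(extract_article_body(raw_output))
```

Tightening the editing prompt so the model returns only the article is an alternative. The sketch only covers the case where the wrapper has already been generated.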

### Step 6: Compare original vs edited

In [19]:
comparison_text = f"""Original length: {len(article.content):,} characters
Edited length: {len(edited_article.content):,} characters
Difference: {len(edited_article.content) - len(article.content):+,} characters

Number of reviews addressed: {len(article_reviews.reviews)}"""

pretty_print.wrapped(comparison_text, title="COMPARISON: Original vs Edited", width=100)

[93m---------------------------------- COMPARISON: Original vs Edited ----------------------------------[0m
  Original length: 25,636 characters
Edited length: 27,590 characters
Difference: +1,954 characters

Number of reviews addressed: 36
[93m----------------------------------------------------------------------------------------------------[0m


### Step 7: Save the articles

In [20]:
from brown.renderers import MarkdownArticleRenderer

renderer = MarkdownArticleRenderer()

# Save first draft
first_draft_path = SAMPLE_DIR / "article_draft.md"
renderer.render(article, output_uri=first_draft_path)

# Save edited version
edited_path = SAMPLE_DIR / "article_edited.md"
renderer.render(edited_article, output_uri=edited_path)

print(f"\n✓ Saved first draft to: {first_draft_path}")
print(f"✓ Saved edited version to: {edited_path}")


✓ Saved first draft to: inputs/tests/01_sample/article_draft.md
✓ Saved edited version to: inputs/tests/01_sample/article_edited.md
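With both versions saved, a line-level comparison shows more than the character counts from Step 6. The sketch below uses the standard library's `difflib` to count added and removed lines. `summarize_changes` is a hypothetical helper, and the two strings are toy stand-ins for `article.content` and `edited_article.content`.

```python
import difflib


def summarize_changes(original: str, edited: str) -> tuple[int, int]:
    """Return (added_lines, removed_lines) between two versions of a text."""
    matcher = difflib.SequenceMatcher(
        a=original.splitlines(), b=edited.splitlines(), autojunk=False
    )
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines dropped from the original
        if tag in ("replace", "insert"):
            added += j2 - j1  # lines introduced by the edit
    return added, removed


original = "# Title\nOne of the key decisions.\nShared line."
edited = "# Title\nThe core architectural decision.\nShared line.\nNew closing line."
print(summarize_changes(original, edited))
```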


## 7. App Configuration: Centralized Control

Before gluing everything together into a LangGraph workflow, let's explore our centralized configuration system. This allows us to configure the entire application from a single YAML file.

### Why Centralized Configuration?

As our system grows more complex, we need:

- **Single source of truth**: One file that controls everything
- **Easy experimentation**: Change models, parameters without touching code
- **Environment-specific configs**: Different settings for dev/prod
- **Version control**: Track configuration changes over time

The Brown agent uses a Pydantic-based configuration system that validates all settings at load time.


### The AppConfig Class Structure

From `brown.config_app`, here's the configuration class hierarchy:

**1. Context Configuration:**


```python
from pathlib import Path
from pydantic import BaseModel, DirectoryPath, Field
from typing import Literal, Annotated


class Context(BaseModel):
    # Article guideline
    article_guideline_loader: Literal["markdown"]
    article_guideline_uri: Path

    # Research
    research_loader: Literal["markdown"]
    research_uri: Path

    # Article
    article_loader: Literal["markdown"]
    article_renderer: Literal["markdown"]
    article_uri: Path

    # Profiles
    profiles_loader: Literal["markdown"]
    profiles_uri: Annotated[DirectoryPath, Field(description="URI to profiles directory")]
    character_profile: str

    # Examples
    examples_loader: Literal["markdown"]
    examples_uri: Annotated[DirectoryPath, Field(description="URI to examples directory")]

    def build_article_uri(self, iteration: int) -> Path:
        return self.article_uri.with_stem(f"{self.article_uri.stem}_{iteration:03d}")
```

This defines all the paths and loaders for different content types.
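To see what `build_article_uri` produces, here is the same `with_stem` expression as a standalone function. `Path.with_stem` requires Python 3.9 or newer, and `article.md` is just the example value used in this lesson's configuration.

```python
from pathlib import Path


def build_article_uri(article_uri: Path, iteration: int) -> Path:
    """Append a zero-padded iteration suffix to the file stem, keeping the extension."""
    return article_uri.with_stem(f"{article_uri.stem}_{iteration:03d}")


article_uri = Path("article.md")
names = [build_article_uri(article_uri, i).name for i in range(3)]
print(names)  # ['article_000.md', 'article_001.md', 'article_002.md']
```

The zero padding keeps the saved iterations in order when the directory is sorted alphabetically.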


**2. Node and Tool Configuration:**
```python
from brown.models.config import ModelConfig, SupportedModels


class ToolConfig(BaseModel):
    name: str
    model_id: SupportedModels
    config: ModelConfig


class NodeConfig(BaseModel):
    model_id: SupportedModels
    config: ModelConfig
    tools: dict[str, ToolConfig]
```

Each node can have its own model and configuration. Tools (like diagram generators) have their own configs too.


**3. Memory Configuration:**
```python
class Memory(BaseModel):
    checkpointer: Literal["in_memory", "sqlite"]
```

Controls which checkpointing strategy to use for workflow state persistence.


**4. The Main AppConfig Class:**
```python
import yaml
from annotated_types import Ge


class AppConfig(BaseModel):
    context: Context
    memory: Memory
    
    num_reviews: Annotated[int, Ge(1), Field(
        description="The number of reviews to perform while generating the article"
    )]
    nodes: dict[str, NodeConfig]

    @classmethod
    def from_yaml(cls, file_path: Path) -> "AppConfig":
        """Load configuration from a YAML file."""
        if not file_path.exists():
            raise FileNotFoundError(f"Configuration file not found: {file_path}")
        
        with open(file_path, "r", encoding="utf-8") as f:
            data = yaml.safe_load(f)
        
        return cls(**data)
```

**Key Features:**

- `num_reviews`: How many review-edit iterations to run
- `nodes`: Configuration for each workflow node
- `from_yaml()`: Load configuration from YAML file
- Full Pydantic validation ensures type safety


### Example Configuration File

Let's look at an actual configuration file from `configs/course.yaml`:

```yaml
context:
  article_guideline_loader: "markdown"
  article_guideline_uri: "article_guideline.md"
  research_loader: "markdown"
  research_uri: "research.md"
  article_loader: "markdown"
  article_renderer: "markdown"
  article_uri: "article.md"
  profiles_loader: "markdown"
  profiles_uri: "inputs/profiles"
  character_profile: "paul_iusztin.md"
  examples_loader: "markdown"
  examples_uri: "inputs/examples/course_lessons"

memory:
  checkpointer: "in_memory"

num_reviews: 2

nodes:
  generate_media_items:
    model_id: "google_genai:gemini-2.5-flash"
    config:
      temperature: 0.0
      include_thoughts: false
      thinking_budget: null
    tools:
      mermaid_diagram_generator:
        model_id: "google_genai:gemini-2.5-flash"
        config:
          temperature: 0.0
          include_thoughts: false

  write_article:
    model_id: "google_genai:gemini-2.5-pro"
    config:
      temperature: 0.7
      include_thoughts: false

  review_article:
    model_id: "google_genai:gemini-2.5-pro"
    config:
      temperature: 0.0
      include_thoughts: false

  edit_article:
    model_id: "google_genai:gemini-2.5-pro"
    config:
      temperature: 0.1
      include_thoughts: false
```

**Configuration Highlights:**

- **Media generation**: Uses the fast Flash model with temperature 0 (near-deterministic output)
- **Article writing**: Uses Pro model with higher temperature (0.7) for creativity
- **Reviewing**: Uses Pro model with 0 temperature (strict adherence to rules)
- **Editing**: Uses Pro model with low temperature (0.1) for focused changes
- **2 review iterations**: Runs the review-edit loop twice

This fine-grained control lets you optimize for quality, speed, and cost.


### Loading and Using the Configuration

Let's load a configuration file and examine it:


In [21]:
from brown.config_app import AppConfig

# Load configuration from YAML
config_path = CONFIGS_DIR / "course.yaml"
app_config = AppConfig.from_yaml(config_path)

text = f"""
Configuration loaded successfully!
Num review iterations: {app_config.num_reviews}
Memory checkpointer: {app_config.memory.checkpointer}
Character profile: {app_config.context.character_profile}
"""
pretty_print.wrapped(text, title="Configuration", width=100)

node_info = {}
for node_name, node_config in app_config.nodes.items():
    node_dict = {
        "Model": node_config.model_id,
        "Temperature": node_config.config.temperature,
    }
    if node_config.tools:
        node_dict["Tools"] = list(node_config.tools.keys())
    node_info[node_name] = node_dict
pretty_print.wrapped(node_info, title="Configured nodes", width=100)

[93m------------------------------------------ Configuration ------------------------------------------[0m
  
Configuration loaded successfully!
Num review iterations: 2
Memory checkpointer: in_memory
Character profile: paul_iusztin.md

[93m----------------------------------------------------------------------------------------------------[0m
[93m----------------------------------------- Configured nodes -----------------------------------------[0m
  {
  "generate_media_items": {
    "Model": "google_genai:gemini-2.5-flash",
    "Temperature": 0.0,
    "Tools": [
      "mermaid_diagram_generator"
    ]
  },
  "write_article": {
    "Model": "google_genai:gemini-2.5-flash",
    "Temperature": 0.7
  },
  "review_article": {
    "Model": "google_genai:gemini-2.5-flash",
    "Temperature": 0.0
  },
  "edit_article": {
    "Model": "google_genai:gemini-2.5-flash",
    "Temperature": 0.1
  },
  "review_selected_text": {
    "Model": "google_genai:gemini-2.5-flash",
    "Temperature": 0

### Benefits of This Approach

1. **Experimentation**: Change models/temperatures without editing code
2. **Cost optimization**: Use cheaper models for simple tasks
3. **Quality tuning**: Adjust temperatures per task
4. **Environment flexibility**: Different configs for dev/staging/prod
5. **Reproducibility**: Version control your configurations
6. **Global overview**: You can see the whole setup at a glance

This configuration-driven approach makes the system highly flexible and easy to iterate on.
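One way to get environment-specific configs without duplicating the whole file is to layer a small override dictionary on top of a base configuration. This lesson does not show `brown` doing this, so treat the following as a sketch: `merge_config` and the `dev_overrides` values are hypothetical, and the merged dictionary would still need to pass `AppConfig` validation.

```python
from copy import deepcopy
from typing import Any


def merge_config(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge overrides into a copy of base; nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


base = {
    "num_reviews": 2,
    "memory": {"checkpointer": "in_memory"},
    "nodes": {
        "write_article": {
            "model_id": "google_genai:gemini-2.5-pro",
            "config": {"temperature": 0.7},
        }
    },
}
# Hypothetical dev settings: fewer iterations and a cheaper model.
dev_overrides = {
    "num_reviews": 1,
    "nodes": {"write_article": {"model_id": "google_genai:gemini-2.5-flash"}},
}

dev = merge_config(base, dev_overrides)
print(dev["num_reviews"], dev["nodes"]["write_article"])
```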


## 8. LangGraph Workflow Integration

Now let's explore how everything is glued together into a robust LangGraph workflow. The complete workflow is in `brown.workflows.generate_article`.

### Workflow Architecture

The workflow uses LangGraph's Functional API to orchestrate the complete article generation process. Let's break down the key components.


### Building the Workflow

**1. The Build Function:**
```python
from langgraph.func import entrypoint
from langgraph.checkpoint.base import BaseCheckpointSaver


def build_generate_article_workflow(checkpointer: BaseCheckpointSaver):
    """Create a generate article workflow with the given checkpointer.
    
    Args:
        checkpointer: Checkpointer to use for workflow state persistence
        
    Returns:
        Configured workflow entrypoint
    """
    return entrypoint(checkpointer=checkpointer)(_generate_article_workflow)
```

**Why This Pattern?**

We could directly decorate `_generate_article_workflow` with `@entrypoint`, but this builder function allows us to:

1. **Inject dependencies at runtime**: Pass the checkpointer when building the workflow
2. **Follow clean architecture**: Separate infrastructure (checkpointer) from business logic
3. **Enable testing**: Easily swap checkpointers for different environments

This pattern makes sense when you see how it's called in the next section.
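The builder relies on the fact that `decorator(arg)(fn)` is equivalent to applying `@decorator(arg)` to `fn`, only deferred until you call the builder. The toy example below illustrates the idea with the standard library alone. `toy_entrypoint` is a hypothetical stand-in that records the last result in an injected dictionary. It is not how LangGraph's `entrypoint` works internally.

```python
from typing import Any, Callable


def toy_entrypoint(checkpointer: dict[str, Any]) -> Callable:
    """Toy decorator factory: stores each result in the injected checkpointer."""

    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        def wrapped(value: str) -> str:
            result = fn(value)
            checkpointer["last_result"] = result
            return result

        return wrapped

    return decorator


def _workflow(topic: str) -> str:
    """Business logic that knows nothing about persistence."""
    return f"article about {topic}"


def build_workflow(checkpointer: dict[str, Any]) -> Callable[[str], str]:
    """Bind the dependency at build time instead of at import time."""
    return toy_entrypoint(checkpointer)(_workflow)


test_store: dict[str, Any] = {}
workflow = build_workflow(test_store)
print(workflow("agents"), test_store)
```

Because the dependency is passed in, a test can supply its own store, while production code can build the same workflow with a persistent one.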


**2. The Workflow Input:**

```python
from typing import TypedDict


class GenerateArticleInput(TypedDict):
    dir_path: Path
```

Simple typed input containing just the directory path where all resources are located.


**3. The Main Workflow Function:**


```python
from langgraph.config import get_stream_writer, RunnableConfig


async def _generate_article_workflow(inputs: GenerateArticleInput, config: RunnableConfig) -> str:
    dir_path = inputs["dir_path"]
    dir_path.mkdir(parents=True, exist_ok=True)
    
    writer = get_stream_writer()
    
    # Step 1: Load context
    writer(WorkflowProgress(progress=0, message="Loading context").model_dump(mode="json"))
    context = {}
    loaders = build_loaders(app_config)
    for context_name in ["article_guideline", "research", "profiles", "examples"]:
        loader = cast(Loader, loaders[context_name])
        context[context_name] = loader.load(working_uri=dir_path)
    writer(WorkflowProgress(progress=2, message="Loaded context").model_dump(mode="json"))
    
    # Step 2: Generate media items
    writer(WorkflowProgress(progress=3, message="Generating media items").model_dump(mode="json"))
    media_items = await generate_media_items(context["article_guideline"], context["research"])
    writer(WorkflowProgress(progress=10, message="Generated media items").model_dump(mode="json"))
    
    # Step 3: Write article
    writer(WorkflowProgress(progress=15, message="Writing article").model_dump(mode="json"))
    article = await write_article(...) 
    writer(WorkflowProgress(progress=20, message="Written raw article").model_dump(mode="json"))
    
    # Save iteration 0
    article_path = dir_path / app_config.context.build_article_uri(0)
    article_renderer = build_article_renderer(app_config)
    article_renderer.render(article, output_uri=article_path)
    
    # Steps 4-5: Review and edit loop
    for i in range(1, app_config.num_reviews + 1):
        # Review
        reviews = await generate_reviews(article, context["article_guideline"], context["profiles"])
        
        # Edit
        article = await edit_based_on_reviews(...)
        
        # Save iteration i
        article_path = dir_path / app_config.context.build_article_uri(i)
        article_renderer.render(article, output_uri=article_path)
    
    # Save final article
    article_path = dir_path / app_config.context.article_uri
    article_renderer.render(article, output_uri=article_path)
    
    return f"Final article rendered to `{article_path}`."
```

**Key Features:**

1. **Progress reporting**: Uses `get_stream_writer()` to send progress updates
2. **Iterative refinement**: Loops through review-edit cycles
3. **Version saving**: Saves each iteration for comparison
4. **Configuration-driven**: Uses `app_config` for all settings


**4. The Task Functions with Retry Policies:**

Each major step is wrapped in a `@task` decorator with retry policies:


```python
import asyncio
from typing import cast

from langgraph.config import get_stream_writer
from langgraph.func import task
from langgraph.types import RetryPolicy

retry_policy = RetryPolicy(max_attempts=3, retry_on=Exception)


@task(retry_policy=retry_policy)
async def generate_media_items(article_guideline: ArticleGuideline, research: Research) -> MediaItems:
    writer = get_stream_writer()
    
    model, toolkit = build_model(app_config, node="generate_media_items")
    media_generator_orchestrator = MediaGeneratorOrchestrator(...)
    media_items_to_generate_jobs = await media_generator_orchestrator.ainvoke()
    
    # Generate media items in parallel
    coroutines = [tool.ainvoke(job["args"]) for job in media_items_to_generate_jobs]
    media_items = await asyncio.gather(*coroutines)
    
    return MediaItems.build(media_items)


@task(retry_policy=retry_policy)
async def write_article(...) -> Article:
    model, _ = build_model(app_config, node="write_article")
    article_writer = ArticleWriter(...)
    article = await article_writer.ainvoke()
    return cast(Article, article)


@task(retry_policy=retry_policy)
async def generate_reviews(...) -> ArticleReviews:
    model, _ = build_model(app_config, node="review_article")
    article_reviewer = ArticleReviewer(...)
    reviews = await article_reviewer.ainvoke()
    return cast(ArticleReviews, reviews)


@task(retry_policy=retry_policy)
async def edit_based_on_reviews(...) -> Article:
    model, _ = build_model(app_config, node="edit_article")
    article_writer = ArticleWriter(..., reviews=reviews)
    article = await article_writer.ainvoke()
    return cast(Article, article)
```

**Why Retry Policies?**

- **Resilience**: API failures, rate limits, and network issues happen
- **Automatic recovery**: Retry failed steps without manual intervention
- **Task-level granularity**: Only retry the failed step, not the entire workflow
- **Production-ready**: Makes the system robust for real-world use

Having each step as a separate task with a single responsibility makes retry policies extremely effective.
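The task-level granularity point can be illustrated without LangGraph. The sketch below is a plain `asyncio` stand-in, not how `RetryPolicy` is implemented: `with_retries`, `write_step`, and `review_step` are hypothetical names, and details such as backoff between attempts are deliberately left out.

```python
import asyncio
import functools


def with_retries(max_attempts: int):
    """Retry an async function up to `max_attempts` times on any Exception."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return await func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # Out of attempts: surface the error.

        return wrapper

    return decorator


calls = {"write": 0, "review": 0}  # Counts how often each step body runs.


@with_retries(max_attempts=3)
async def write_step() -> str:
    calls["write"] += 1
    return "draft"


@with_retries(max_attempts=3)
async def review_step(article: str) -> str:
    calls["review"] += 1
    if calls["review"] < 3:
        raise ConnectionError("simulated rate limit")  # Fails twice, then succeeds.
    return f"review of {article}"


async def main() -> str:
    article = await write_step()
    return await review_step(article)


result = asyncio.run(main())
print(result, calls)
```

Here the review step fails twice and succeeds on the third attempt, while the write step runs exactly once. Only the failing step is repeated, which is the benefit of wrapping each step separately.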


## 9. Short-Term Memory: Checkpointing

Before running the complete workflow, let's understand the checkpointing system that provides workflow state persistence.

### Why Checkpointing?

Checkpointing serves several purposes:

1. **Resume from failure**: If a workflow crashes, resume from the last checkpoint. For example, if the `generate_reviews` step crashes, resume from that step instead of rerunning the whole workflow.
2. **State inspection**: Examine workflow state at any point
3. **Debugging**: Step through workflow execution
4. **Human-in-the-loop**: Pause for human input, then resume

For our use case, checkpointing enables the review-edit iterations to maintain state between steps.
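The resume-from-failure idea can be sketched with a plain dictionary acting as the checkpoint store. This is a simplified illustration, not LangGraph's checkpointer: `checkpoints`, `run_step`, and `workflow` are hypothetical names, and results are keyed by thread ID and step name purely for the demo.

```python
import asyncio

# Hypothetical stand-in for a checkpointer: thread_id -> {step name: result}.
checkpoints: dict[str, dict[str, str]] = {}
executions: list[str] = []  # Records every attempt to execute a step body.


async def run_step(thread_id: str, name: str, func):
    """Return the saved result for `name`, or run the step and save it."""
    state = checkpoints.setdefault(thread_id, {})
    if name not in state:
        executions.append(name)
        state[name] = await func()  # Saved only if the step succeeds.
    return state[name]


async def workflow(thread_id: str, fail_on_review: bool) -> str:
    async def write() -> str:
        return "draft"

    async def review() -> str:
        if fail_on_review:
            raise RuntimeError("simulated crash")
        return "looks good"

    article = await run_step(thread_id, "write_article", write)
    reviews = await run_step(thread_id, "generate_reviews", review)
    return f"{article} / {reviews}"


async def main() -> str:
    try:
        await workflow("thread-1", fail_on_review=True)
    except RuntimeError:
        pass  # The first run crashes after `write_article` was checkpointed.
    # Same thread ID: the saved `write_article` result is reused.
    return await workflow("thread-1", fail_on_review=False)


result = asyncio.run(main())
print(result, executions)
```

On the second run, `write_article` is served from the saved state and only `generate_reviews` executes again. A different thread ID would start from an empty state.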


### InMemory Checkpointer

From `brown.memory`, we have a simple in-memory checkpointer factory:


```python
from contextlib import asynccontextmanager
from typing import AsyncIterator

from langgraph.checkpoint.memory import InMemorySaver


@asynccontextmanager
async def build_in_memory_checkpointer() -> AsyncIterator[InMemorySaver]:
    """Build an in-memory checkpointer.
    
    Returns an async context manager that yields an InMemorySaver.
    
    Yields:
        InMemorySaver instance
    """
    yield InMemorySaver()
```

**Characteristics:**

- **Ephemeral**: State lost when process ends
- **Development-friendly**: Perfect for testing and iteration
- **No persistence**: Not suitable for long-running production workflows

**When to use**: Development, testing, short-lived workflows


## 10. Complete LangGraph Example

Now let's run the complete workflow with LangGraph integration! This brings together everything we've learned.


In [22]:
import uuid

from brown.memory import build_in_memory_checkpointer
from brown.workflows.generate_article import GenerateArticleInput, build_generate_article_workflow

pretty_print.wrapped("COMPLETE WORKFLOW WITH LANGGRAPH", width=100)

----------------------------------------------------------------------------------------------------
  COMPLETE WORKFLOW WITH LANGGRAPH
----------------------------------------------------------------------------------------------------


In [23]:
print("\n1. Building short-term memory...")
async with build_in_memory_checkpointer() as checkpointer:
    print("   ✓ In-memory checkpointer created")

    print("\n2. Building workflow...")
    workflow = build_generate_article_workflow(checkpointer=checkpointer)
    print("   ✓ Workflow built with checkpointer")

    print("\n3. Configuring workflow...")
    thread_id = str(uuid.uuid4())
    config = {"configurable": {"thread_id": thread_id}}
    print(f"   ✓ Thread ID: {thread_id}")

    print("\n4. Running workflow...")
    print("   This will take several minutes...")

    inputs = GenerateArticleInput(dir_path=SAMPLE_DIR)

    async for event in workflow.astream(inputs, config=config, stream_mode=["custom", "values"]):
        # Print progress updates
        event_type, event_data = event
        if event_type == "custom":
            pretty_print.wrapped(event_data, title="Event")
        elif event_type == "values":
            pretty_print.wrapped(event_data, title="Output")

pretty_print.wrapped("WORKFLOW COMPLETE - Check SAMPLE_DIR for generated articles", width=100)


1. Building short-term memory...
   ✓ In-memory checkpointer created

2. Building workflow...
   ✓ Workflow built with checkpointer

3. Configuring workflow...
   ✓ Thread ID: 0eaae8af-a61a-405b-86b3-10bf38655a42

4. Running workflow...
   This will take several minutes...
---------------------------------------------- Event ----------------------------------------------
  {
  "progress": 0,
  "message": "Loading context"
}
----------------------------------------------------------------------------------------------------
---------------------------------------------- Event ----------------------------------------------
  {
  "progress": 2,
  "message": "Loaded context"
}
----------------------------------------------------------------------------------------------------
---------------------------------------------- Event ----------------------------------------------
  {
  "progress": 3,
  "message": "Generating media items"
}

In [24]:
# Check the generated articles
print("\nGenerated article files:")
article_files = sorted(SAMPLE_DIR.glob("article*.md"))
for article_file in article_files:
    size = article_file.stat().st_size
    print(f"  - {article_file.name}: {size:,} bytes")


Generated article files:
  - article.md: 29,529 bytes
  - article_000.md: 26,203 bytes
  - article_001.md: 26,144 bytes
  - article_002.md: 29,529 bytes
  - article_draft.md: 25,644 bytes
  - article_edited.md: 27,598 bytes
  - article_guideline.md: 23,131 bytes


### What Just Happened?

Let's break down the workflow execution:

**Infrastructure Setup:**
1. Created an in-memory checkpointer for state persistence
2. Built the workflow by injecting the checkpointer
3. Generated a unique thread ID for this workflow run

**Workflow Execution:**
1. **Loaded Context** (Progress: 0-2%): All guidelines, research, profiles, examples
2. **Generated Media** (Progress: 3-10%): Orchestrator identified and delegated media generation
3. **Wrote Article** (Progress: 15-20%): ArticleWriter created first draft
4. **Review-Edit Loop** (Progress: 25-99%): For each iteration:
   - ArticleReviewer analyzed the article
   - ArticleWriter edited based on reviews
   - Saved intermediate version
5. **Final Save** (Progress: 100%): Saved the final refined article

**Key LangGraph Features Used:**
- **Checkpointing**: State persistence between steps
- **Streaming**: Real-time progress updates
- **Retry policies**: Automatic recovery from failures
- **Tasks**: Composable, retryable workflow steps
- **Configuration**: Thread-based workflow isolation
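The streaming loop in the cell above unpacks each event into an `(event_type, event_data)` pair. The sketch below reproduces that consumption logic against a fake stream so it can run without a model call. `fake_stream` and `consume` are hypothetical helpers, and the event payloads are modeled on the progress messages printed above.

```python
import asyncio
from typing import Any, AsyncIterator


async def fake_stream() -> AsyncIterator[tuple[str, Any]]:
    """Hypothetical stand-in for `workflow.astream(...)` with two stream modes."""
    yield "custom", {"progress": 0, "message": "Loading context"}
    yield "custom", {"progress": 2, "message": "Loaded context"}
    yield "values", "Final article rendered to `article.md`."


async def consume(stream: AsyncIterator[tuple[str, Any]]) -> tuple[list[str], Any]:
    """Split the stream into formatted progress lines and the final output."""
    progress_lines = []
    output = None
    async for event_type, event_data in stream:
        if event_type == "custom":
            # Progress updates sent through the stream writer.
            progress_lines.append(f"[{event_data['progress']:>3}%] {event_data['message']}")
        elif event_type == "values":
            # The value returned by the workflow.
            output = event_data
    return progress_lines, output


progress_lines, output = asyncio.run(consume(fake_stream()))
print("\n".join(progress_lines))
print(output)
```

Separating progress events from the final value this way makes it straightforward to route the former to a progress bar and keep the latter as the workflow result.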

**Output Files:**

Saved to the sample input directory:

- `article_000.md`: Initial draft
- `article_001.md`: After first review-edit iteration
- `article_002.md`: After second review-edit iteration (if `num_reviews=2`)
- `article.md`: Final refined article

This demonstrates a production-ready implementation of the evaluator-optimizer pattern!


Check the saved files at:

In [25]:
SAMPLE_DIR

PosixPath('inputs/tests/01_sample')

Listing them:

In [26]:
list(SAMPLE_DIR.glob("article*.md"))

[PosixPath('inputs/tests/01_sample/article_draft.md'),
 PosixPath('inputs/tests/01_sample/article_edited.md'),
 PosixPath('inputs/tests/01_sample/article.md'),
 PosixPath('inputs/tests/01_sample/article_001.md'),
 PosixPath('inputs/tests/01_sample/article_000.md'),
 PosixPath('inputs/tests/01_sample/article_002.md'),
 PosixPath('inputs/tests/01_sample/article_guideline.md')]
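Because every iteration is saved as its own file, consecutive versions can be compared to see what each review-edit cycle changed. The sketch below uses the standard library's `difflib` on two toy files. `diff_versions` is a hypothetical helper that is not part of Brown, and the file contents are made up for the demo.

```python
import difflib
import tempfile
from pathlib import Path


def diff_versions(old: Path, new: Path) -> list[str]:
    """Return a unified diff between two saved article versions."""
    return list(
        difflib.unified_diff(
            old.read_text().splitlines(),
            new.read_text().splitlines(),
            fromfile=old.name,
            tofile=new.name,
            lineterm="",
        )
    )


# Toy files standing in for two saved iterations.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "article_000.md").write_text("# Title\nThis is very very good.\n")
(demo_dir / "article_001.md").write_text("# Title\nThis is good.\n")

diff = diff_versions(demo_dir / "article_000.md", demo_dir / "article_001.md")
print("\n".join(diff))
```

To inspect a real run, the same helper could be pointed at the generated files, for example `diff_versions(SAMPLE_DIR / "article_000.md", SAMPLE_DIR / "article_001.md")`.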

## 11. Conclusion and Future Steps

Congratulations! You've learned how to implement the evaluator-optimizer pattern and integrate it into a production-ready workflow.

### What We've Learned

**1. The Evaluator-Optimizer Pattern:**
- Evaluator (ArticleReviewer) identifies issues against requirements
- Optimizer (ArticleWriter) fixes issues based on feedback
- Iterative refinement gradually improves quality
- Transparent, debuggable, and improvable process

**2. Centralized Configuration:**
- Single YAML file controls everything
- Per-node model and parameter configuration
- Easy experimentation and optimization
- Type-safe with Pydantic validation

**3. LangGraph Integration:**
- Workflow orchestration with the Functional API
- Task-based architecture with retry policies
- Checkpointing for state persistence
- Streaming for progress reporting
- Production-ready error handling


### Ideas for Extension

Now that you understand the system, here are ways to extend it:

**1. Different AI Frameworks:**

Replace the LangGraph Functional API with the LangGraph Graph API, or try PydanticAI or other frameworks. The clean architecture makes swapping easy.

**2. Improve the Reviewer:**

Play around with the reviewer to get a feel for how it extracts reviews from the profiles, then tweak either the profiles or the reviewer for better results.

**3. Modify the Configuration:**

Change the models, temperatures, or `num_reviews` in the `configs/course.yaml` file and see how the generated article changes.


### What's Next in the Course

In **Lesson 24: Human-in-the-Loop**, we'll explore:

- Adding two new workflows for iteratively editing the whole article or just a selected piece of text.
- Properly adding humans in the loop between the generated article and future edit iterations.
- Exposing the workflows as MCP tools.

You'll see why having a fixed number of review iterations (rather than scoring until "good enough") makes perfect sense when humans are in the loop.

### Resources

- **Brown Package**: Explore `../writing_workflow/` for complete source code
- **Configuration Examples**: Check `configs/` for different configurations
- **Test Data**: Use `inputs/tests/` for additional testing scenarios