## 🌟 Scenario: Content Moderation System

### The Problem

You've joined a growing tech community platform that has **50,000 users** but only **3 moderators**:
- **Sarah** manually reviews posts (6 hours/day, 200+ posts in queue)
- **Mike** tries to help users improve content (rarely has time)
- **Lisa** identifies harmful content (can't keep up)

**Current Issues:**
- Takes 5-10 minutes per post to check safety, tone, and grammar manually
- Users don't understand why content is rejected
- No time to enhance approved content

### Your Solution

Build an **AI-Powered Content Moderation System** that:

1. **Classifies** content type (social media post / article / comment)
2. **Analyzes** safety, tone, and grammar **in parallel**
3. **Scores** and decides: approve or reject
4. **Enhances** approved content automatically
5. **Provides feedback** to users

**Expected Impact:** Reduce moderation time from 5-10 minutes to 30 seconds per post!

### Example Test Cases

Your system should handle:

**✅ Good Content (needs enhancement):**
```
just finished reading an amzing book about AI ethics!
its really make me think about how we build responsible systems.
```
→ Approve, fix grammar, enhance

**⚠️ Problematic Content:**
```
I hate this stupid product! Complete waste of money.
```
→ Flag for aggressive language, suggest constructive rephrasing

---

## 📋 Challenge Overview

### Your Mission

Build an **AI-Powered Content Moderation & Enhancement System** that:
1. Analyzes user-submitted content (text posts)
2. Moderates for safety and quality
3. Provides improvement suggestions
4. Enhances approved content

### Why This Challenge?

This challenge combines **multiple agentic patterns** in a realistic scenario:
- **Routing**: Classify content type (social media post, article, comment)
- **Evaluator-Optimizer**: Assess content quality and iterate improvements
- **Parallelization**: Analyze multiple aspects simultaneously (tone, safety, grammar)
- **Orchestrator-Worker**: Coordinate the full moderation pipeline
- **Prompt Chaining**: Transform raw content through moderation → enhancement → finalization

---

## 🎓 Part 1: Framework Selection & Justification

### Task 1.1: Choose Your Framework

**Instructions:**
1. Review the 4 frameworks you learned
2. Select ONE framework for this challenge
3. Write a justification (150-200 words) explaining:
   - Why you chose this framework
   - What strengths make it suitable for this challenge
   - What trade-offs you considered
   - How its features align with the challenge requirements

**Available Frameworks:**
- CrewAI: Role-based agents, sequential/hierarchical processes
- LangGraph: Graph-based state management, conditional routing
- LlamaIndex: Data-centric, built-in RAG capabilities
- smolagents: Lightweight, tool-focused, minimal dependencies

---

### ✍️ YOUR FRAMEWORK SELECTION

**Selected Framework:** LangGraph

**Justification:**
I chose LangGraph because this content moderation system is a workflow-orchestration problem, which is what LangGraph is built for. The pipeline is not linear: three analysis agents run in parallel, the flow branches on the evaluation result, and a shared state object accumulates data as content moves through each step. LangGraph's graph model maps directly onto this architecture. Each node is one content processing step, edges connect the steps, and the state object carries the content, scores, and intermediate data through the entire pipeline. A conditional edge after the evaluator expresses the approve/reject decision: approved content is routed to the optimizer -> enhancer chain, and rejected content ends the run immediately. LangGraph also provides state persistence and traceability, so the state can be inspected at any point in the pipeline.
As trade-offs, I considered CrewAI, which is better suited to autonomous agents collaborating in roles; LlamaIndex, which is optimized for RAG; and smolagents, which is lightweight and tool-focused and gives less built-in support for parallel nodes and conditional branching. For a pipeline with explicit fan-out, fan-in, and branching, LangGraph is the closest fit.

## 🛠️ Part 2: Setup & Configuration

### Task 2.1: Install Dependencies

Install your chosen framework and configure your API keys.

In [68]:
# TODO: Install your chosen framework and dependencies
# Your code here:
!pip install -q langgraph langchain-openai langchain langchain-core langdetect





### Task 2.2: Configure API Keys & Model

In [69]:
# TODO: Configure your API key and model
import getpass
import os
from langchain_openai import ChatOpenAI

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OPENAI API key: ")
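if "OPENAI_BASE_URL" not in os.environ:
    # The base URL is not a secret, so plain input() is enough; leave it blank to use the default endpoint
    base_url = input("Enter your OPENAI base url (blank for default): ").strip()
    if base_url:
        os.environ["OPENAI_BASE_URL"] = base_url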


def get_model(temperature=0.7):
    """
    Create and return a configured gpt-4.1-mini model via OpenAI.

    Args:
        temperature (float): Controls randomness (0.0 = deterministic, 1.0 = creative)

    Returns:
        ChatOpenAI: Configured model instance
    """
    return ChatOpenAI(
        model="gpt-4.1-mini",
        temperature=temperature,
        max_tokens=2048,
        base_url=os.environ.get("OPENAI_BASE_URL"),
    )

# Test the model
test_model = get_model()
response = test_model.invoke("Hello! Can you confirm you're working?")
print(f"Model Response: {response.content}")
print("\n✅ Model configured and tested successfully!")

Model Response: Hello! Yes, I'm here and ready to help. How can I assist you today?

✅ Model configured and tested successfully!


---

## 🏗️ Part 3: Implementation

Build your **Content Moderation & Enhancement System** by implementing the following components:

### System Architecture

```
User Content Input
      |
      v
┌─────────────────┐
│  Router Agent   │ ──> Classify: Social Media / Article / Comment
└────────┬────────┘
         |
         v
┌─────────────────────────┐
│ Parallel Analysis       │
│  - Safety Check Agent   │ ──> Detect harmful content
│  - Tone Analyzer Agent  │ ──> Assess sentiment/tone
│  - Grammar Checker      │ ──> Identify language issues
└────────┬────────────────┘
         |
         v
┌─────────────────────────┐
│ Evaluator Agent         │ ──> Aggregate findings, score content
└────────┬────────────────┘
         |
    ┌────┴─────┐
    v          v
  REJECT    APPROVE
            |
            v
    ┌───────────────┐
    │  Optimizer    │ ──> Suggest improvements
    │  Agent        │
    └───────┬───────┘
            |
            v
    ┌───────────────┐
    │  Enhancer     │ ──> Apply improvements
    │  Agent        │
    └───────┬───────┘
            |
            v
    Final Enhanced Content
```

---

### Task 3.1: Router Agent (Routing Pattern)

**Requirements:**
- Create a router that classifies content into: "social_media", "article", or "comment"
- Route should be based on length, structure, and style
- Return the classification decision

In [70]:
from typing import TypedDict, Annotated, Sequence, Optional
from langgraph.graph import StateGraph, END, START
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate
import operator
import json
import time

from langdetect import detect

In [71]:
# TODO: Implement Router Agent
# This agent analyzes content and classifies it

class ModerationState(TypedDict):
    content: str
    language: str  # detected language code
    route: str
    # parallel analysis
    safety_result: Annotated[dict, lambda x, y: y]
    tone_result: Annotated[dict, lambda x, y: y]
    grammar_result: Annotated[dict, lambda x, y: y]
    # evaluator
    overall_score: Annotated[float, lambda x, y: y]
    decision: Annotated[str, lambda x, y: y]
    rejection_reason: Annotated[Optional[str], lambda x, y: y]
    suggestions: Annotated[list, lambda x, y: y]
    # optimizer
    optimized_instructions: Annotated[str, lambda x, y: y]
    # enhancer
    enhanced_content: Annotated[str, lambda x, y: y]
    feedback: Annotated[str, lambda x, y: y]

def detect_language(state: ModerationState) -> dict:
    """
    Language Detection Agent: Identifies the language of input content.
    Supports multiple languages (English, Spanish, French, German)
    """
    try:
        lang = detect(state["content"])
        lang_names = {  # available languages, can be expanded
            'en': 'English',
            'es': 'Spanish',
            'fr': 'French',
            'de': 'German',
        }
        lang_name = lang_names.get(lang, lang.upper())
        print(f"🌐 Detected language: {lang_name} ({lang})\n")
        return {"language": lang}
    except Exception as e:
        print(f"⚠️ Language detection failed: {e}, defaulting to 'en'")
        return {"language": "en"}

def route_query(state: ModerationState) -> dict:
    """
    Router Agent: Analyzes the content and determines the appropriate route.
    Routes: 'social_media', 'article', 'comment'
    """

    print("Analyzing query for routing...")

    # simple heuristic: long texts are routed to 'article' without calling the model
    text = state["content"].strip()
    word_count = len(text.split())
    if word_count > 100:
        route = "article"
        print(f"🔍 Heuristic detected long text ({word_count} words), routing to: {route}\n")
        return {"route": route}  # return only the key this node updates

    model = get_model(temperature=0.3)
    prompt = f"""Classify this content into one category based on its length, structure, and style:
- 'social_media': short, casual, informal (tweets, status updates)
- 'article': long, structured, formal writing
- 'comment': brief reply or reaction to something

Respond with ONLY the category name.

Content: {state['content']}

Category:"""

    response = model.invoke(prompt)
    # normalize the reply: the model may add quotes or a trailing period
    route = response.content.strip().strip("'\".").lower()

    if route not in ['social_media', 'article', 'comment']:
        route = 'social_media' # default if classification fails

    print(f"✅ Query routed to: {route}\n")
    return {"route": route}  # both branches return the same shape
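
The `Annotated[..., lambda x, y: y]` fields in `ModerationState` attach a reducer: a function that receives the value already stored for a key plus the incoming update, and returns what to keep. The sketch below is a simplified pure-Python model of that merge step, not LangGraph's own implementation; `apply_update`, `keep_new`, and `keep_old` are names invented for this illustration.

```python
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]

def keep_new(existing: Any, incoming: Any) -> Any:
    """Last write wins: same behaviour as `lambda x, y: y`."""
    return incoming

def keep_old(existing: Any, incoming: Any) -> Any:
    """First value wins: same behaviour as `lambda x, y: x`."""
    return existing

def apply_update(state: dict, update: dict, reducers: dict[str, Reducer]) -> dict:
    """Merge a node's partial update into the state (simplified model)."""
    merged = dict(state)
    for key, value in update.items():
        reducer = reducers.get(key)
        # Keys without a reducer are simply overwritten
        merged[key] = reducer(merged.get(key), value) if reducer else value
    return merged

state = {"content": "", "decision": ""}

# With keep_new, the evaluator's decision replaces the empty default
state_new = apply_update(state, {"decision": "approve"}, {"decision": keep_new})
print(state_new["decision"])  # approve

# With keep_old, the empty default is kept and the incoming text is discarded
state_old = apply_update(state, {"content": "hello world"}, {"content": keep_old})
print(repr(state_old["content"]))  # ''
```

The second case is the behaviour described under Challenge 5 in Part 5: a reducer that always keeps the existing value also discards the very first write.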


### Task 3.2: Parallel Analysis Agents (Parallelization Pattern)

**Requirements:**
- Implement 3 agents that run in parallel:
  1. **Safety Checker**: Detect toxic, harmful, or inappropriate content
  2. **Tone Analyzer**: Assess sentiment (positive/negative/neutral) and professionalism
  3. **Grammar Checker**: Identify spelling, grammar, and clarity issues
- Each agent should return a structured assessment
- Execute them concurrently for efficiency

In [72]:
def safety_checker(state: ModerationState) -> dict:
    """Detect toxic, harmful, or inappropriate content"""
    model = get_model(temperature=0.1)
    prompt = f"""Analyze the following text for safety issues (hate speech, threats, harassment, explicit content).

TEXT TO ANALYZE:
{state['content']}

Respond with JSON only, replace example values with your actual assessment:
{{"score": 0.95, "flagged": false, "issues": []}}

Where score is 0.0 (very unsafe) to 1.0 (completely safe)."""

    response = model.invoke(prompt)
    try:
        result = json.loads(response.content.strip())
        if not isinstance(result, dict):
            raise ValueError("expected a JSON object")
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        result = {"score": 0.5, "flagged": False, "issues": ["Could not parse"]} # default to neutral if parsing fails

    try:
        result['score'] = float(result.get('score', 0.5))
    except (TypeError, ValueError):
        result['score'] = 0.5 # non-numeric score: fall back to neutral
    result['score'] = max(0.0, min(1.0, result['score'])) # clamp out-of-range scores into [0, 1]

    print(f"🛡️ Safety: {result.get('score'):.2f} | Flagged: {result.get('flagged')}")
    return {"safety_result": result}


def tone_analyzer(state: ModerationState) -> dict:
    """Assess sentiment and professionalism"""
    model = get_model(temperature=0.1)
    prompt = f"""Analyze the following text for tone, sentiment, and professionalism.

TEXT TO ANALYZE:
{state['content']}

Respond with JSON only, replace example values with your actual assessment:
{{"score": 0.95, "sentiment": "positive", "issues": []}}

Where score is 0.0 (very unprofessional) to 1.0 (perfectly professional)."""

    response = model.invoke(prompt)
    try:
        result = json.loads(response.content.strip())
        if not isinstance(result, dict):
            raise ValueError("expected a JSON object")
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        result = {"score": 0.5, "sentiment": "neutral", "issues": ["Could not parse"]} # default to neutral

    try:
        result['score'] = float(result.get('score', 0.5))
    except (TypeError, ValueError):
        result['score'] = 0.5 # non-numeric score: fall back to neutral
    result['score'] = max(0.0, min(1.0, result['score'])) # clamp out-of-range scores into [0, 1]

    print(f"🎭 Tone: {result.get('score'):.2f} | Sentiment: {result.get('sentiment')}")
    return {"tone_result": result} # return only this agent's key


def grammar_checker(state: ModerationState) -> dict:
    """Identify spelling, grammar, and clarity issues"""
    model = get_model(temperature=0.1)
    prompt = f"""Analyze the following text for spelling, grammar, and clarity issues.

TEXT TO ANALYZE:
{state['content']}

Respond with JSON only, replace example values with your actual assessment:
{{"score": 0.95, "issues": []}}

Where score is 0.0 (very poor) to 1.0 (perfect grammar and clarity)."""

    response = model.invoke(prompt)
    try:
        result = json.loads(response.content.strip())
        if not isinstance(result, dict):
            raise ValueError("expected a JSON object")
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        result = {"score": 0.5, "issues": ["Could not parse"]} # default to neutral

    try:
        result['score'] = float(result.get('score', 0.5))
    except (TypeError, ValueError):
        result['score'] = 0.5 # non-numeric score: fall back to neutral
    result['score'] = max(0.0, min(1.0, result['score']))  # clamp out-of-range scores into [0, 1]

    print(f"📝 Grammar: {result.get('score'):.2f}")
    return {"grammar_result": result}
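
The three analyzers repeat the same parse-and-clamp steps, so one option is to move them into a shared helper. The sketch below is one possible version, not part of the original solution: `parse_assessment` is a hypothetical name, and the assumption that a model may wrap its JSON in a Markdown code fence comes from general experience rather than from the outputs shown in this notebook.

```python
import json
import re

def parse_assessment(raw: str, default: dict) -> dict:
    """Parse a model reply into a dict with a score clamped to [0.0, 1.0]."""
    # Keep only the {...} part, in case the reply has a fence or extra text around it
    match = re.search(r"\{.*\}", raw.strip(), flags=re.DOTALL)
    try:
        result = json.loads(match.group(0)) if match else None
    except json.JSONDecodeError:
        result = None
    if not isinstance(result, dict):
        result = dict(default)  # copy so the caller's default is not reassigned
    try:
        score = float(result.get("score", 0.5))
    except (TypeError, ValueError):
        score = 0.5  # non-numeric score: fall back to neutral
    result["score"] = max(0.0, min(1.0, score))
    return result

fallback = {"score": 0.5, "issues": ["Could not parse"]}
fence = "`" * 3  # built at runtime to avoid a literal fence inside this example
fenced_reply = f'{fence}json\n{{"score": 1.4, "issues": []}}\n{fence}'

print(parse_assessment(fenced_reply, fallback))  # {'score': 1.0, 'issues': []}
print(parse_assessment("not json", fallback))    # {'score': 0.5, 'issues': ['Could not parse']}
```

Each analyzer could then call the helper with its own default dict and keep only its prompt and print statement.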

### Task 3.3: Evaluator Agent (Evaluator-Optimizer Pattern - Part 1)

**Requirements:**
- Aggregate results from the 3 parallel agents
- Calculate an overall content quality score (0-100)
- Make a decision: APPROVE (score ≥ 70) or REJECT (score < 70)
- For approved content, provide specific improvement suggestions

In [73]:
def evaluator(state: ModerationState) -> dict:
    """Aggregate analysis results and make approve/reject decision"""

    # debug print full analysis results
    print("🔎 Debug state results:")
    print(state.get("safety_result"))
    print(state.get("tone_result"))
    print(state.get("grammar_result"))

    safety_score = state["safety_result"].get("score", 0.5) # default to neutral
    tone_score = state["tone_result"].get("score", 0.5)
    grammar_score = state["grammar_result"].get("score", 0.5)

    overall_score = (safety_score * 0.5 + tone_score * 0.3 + grammar_score * 0.2) * 100 # weighted average, scaled to 0-100
    decision = "approve" if overall_score >= 70 else "reject" # approve if 70 and above

    print(f"📊 Overall score: {overall_score:.1f} | Decision: {decision.upper()}")

    all_issues = (
        state["safety_result"].get("issues", []) +
        state["tone_result"].get("issues", []) +
        state["grammar_result"].get("issues", []) # combine all issues into a list
    )

    if decision == "reject":
        # use the collected issues as the reason if available, otherwise a default message
        rejection_reason = "; ".join(map(str, all_issues)) if all_issues else "Content did not meet quality standards"
        suggestions = []
        print(f"❌ Rejection reason: {rejection_reason}")
    else:
        rejection_reason = None
        suggestions = all_issues # suggestions are passed on only for approved content
        print(f"✅ Approved with {len(suggestions)} suggestions")

    # return only the keys this node updates instead of mutating the incoming state
    return {
        "overall_score": overall_score,
        "decision": decision,
        "rejection_reason": rejection_reason,
        "suggestions": suggestions
    }
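
The weighted average in `evaluator` can be checked by hand. The sketch below recomputes the score for Test Case 1 from the three scores printed in Part 4; `WEIGHTS`, `APPROVAL_THRESHOLD`, and `overall_score` are names chosen for this example, and the second set of scores is made up to show how the weights interact.

```python
# Weights taken from the evaluator: safety 50%, tone 30%, grammar 20%
WEIGHTS = {"safety": 0.5, "tone": 0.3, "grammar": 0.2}
APPROVAL_THRESHOLD = 70  # approve at 70 and above

def overall_score(scores: dict[str, float]) -> float:
    """Weighted average of the three 0-1 scores, scaled to 0-100."""
    return sum(scores[name] * weight for name, weight in WEIGHTS.items()) * 100

# Scores reported for Test Case 1 in Part 4
case_1 = {"safety": 1.0, "tone": 0.55, "grammar": 0.75}
score = overall_score(case_1)
print(f"{score:.1f}")  # 81.5
print("approve" if score >= APPROVAL_THRESHOLD else "reject")  # approve

# Hypothetical scores: a middling safety score offset by perfect tone and grammar
unsafe_but_polished = {"safety": 0.5, "tone": 1.0, "grammar": 1.0}
print(f"{overall_score(unsafe_but_polished):.1f}")  # 75.0
```

The second case shows a property of any pure weighted average: a weak score in one dimension can be compensated by the others, which is worth keeping in mind when choosing the weights.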

### Task 3.4: Optimizer & Enhancer Agents (Prompt Chaining + Evaluator-Optimizer)

**Requirements:**
- **Optimizer Agent**: Generate specific improvements based on evaluator feedback
- **Enhancer Agent**: Apply improvements to create an enhanced version
- Implement as a chain: Original Content → Optimizer → Enhancer → Final Content
- (Optional) Add a re-evaluation loop if initial enhancement score is still low

In [74]:
def optimizer(state: ModerationState) -> dict:
    """Generate rewrite instructions based on evaluator suggestions"""
    model = get_model(temperature=0.3)
    prompt = f"""You are a writing coach for {state['route'].replace('_', ' ')} content.
Given this content and its issues, write 2-3 simple instructions to improve it
while preserving the appropriate style for {state['route'].replace('_', ' ')}.

Content: {state['content']}
Issues: {state['suggestions']}

Write short, clear instructions only:"""

    response = model.invoke(prompt)
    print("🔧 [Optimizer] Rewrite plan generated")
    return {"optimized_instructions": response.content.strip()} # instructions that guide the enhancer


def enhancer(state: ModerationState) -> dict:
    """Rewrite content based on optimizer instructions"""
    model = get_model(temperature=0.7)
    # use the optimizer's instructions, or a generic fallback if none were produced
    prompt = f"""You are a skilled editor. Rewrite this content to improve clarity, grammar, and style.

Original Content: {state['content']}

Instructions: {state['optimized_instructions'] if state['optimized_instructions'] else 'Improve grammar, clarity and tone.'}

Provide only the rewritten content, no explanations:"""

    response = model.invoke(prompt)
    enhanced = response.content.strip()
    print(f"✨ [Enhancer] Result: {enhanced[:100]}") # preview the first 100 characters
    return {"enhanced_content": enhanced} # return enhanced content

### Task 3.5: Orchestrator (Orchestrator-Worker Pattern)

**Requirements:**
- Create a master orchestrator that coordinates the entire pipeline:
  1. Route content type
  2. Run parallel analysis
  3. Evaluate and decide
  4. If approved, optimize and enhance
  5. Return final result with metadata
- Handle both approval and rejection cases
- Provide clear logging of each step

In [75]:
# TODO: Implement Orchestrator
# This coordinates the entire moderation pipeline

def build_moderation_graph():
    """Build and return the full moderation pipeline graph"""
    
    workflow = StateGraph(ModerationState)

    # Add all nodes
    workflow.add_node("language_detector", detect_language)
    workflow.add_node("router", route_query)
    workflow.add_node("safety_checker", safety_checker)
    workflow.add_node("tone_analyzer", tone_analyzer)
    workflow.add_node("grammar_checker", grammar_checker)
    workflow.add_node("evaluator", evaluator)
    workflow.add_node("optimizer", optimizer)
    workflow.add_node("enhancer", enhancer)

    # Entry point: language detection first
    workflow.set_entry_point("language_detector")

    # Language detector → router
    workflow.add_edge("language_detector", "router")

    # Router → parallel agents (fan-out)
    workflow.add_edge("router", "safety_checker")
    workflow.add_edge("router", "tone_analyzer")
    workflow.add_edge("router", "grammar_checker")

    # Parallel agents → evaluator (fan-in)
    workflow.add_edge("safety_checker", "evaluator")
    workflow.add_edge("tone_analyzer", "evaluator")
    workflow.add_edge("grammar_checker", "evaluator")

    # Evaluator → approve or reject (conditional routing)
    workflow.add_conditional_edges(
        "evaluator",
        lambda state: state["decision"],
        {
            "approve": "optimizer",
            "reject": END
        }
    )

    # Approve path: optimizer → enhancer → done
    workflow.add_edge("optimizer", "enhancer")
    workflow.add_edge("enhancer", END)

    return workflow.compile()


def run_moderation(content: str): # "main" function that runs pipeline
    """Run content through the full moderation pipeline"""
    
    print("\n" + "="*50)
    print(f"📥 Input: {content[:80]}...")
    print("="*50)

    graph = build_moderation_graph()
    
    initial_state = ModerationState(
        content=content,
        language="",
        route="",
        safety_result={},
        tone_result={},
        grammar_result={},
        overall_score=0.0,
        decision="",
        rejection_reason=None,
        suggestions=[],
        optimized_instructions="",
        enhanced_content="",
        feedback=""
    )

    result = graph.invoke(initial_state)

    print("\n" + "="*50)
    print("📤 FINAL RESULT")
    print("="*50)
    if result["decision"] == "approve":
        print(f"✅ APPROVED (score: {result['overall_score']:.1f})")
        print(f"Enhanced: {result['enhanced_content']}")
    else:
        print(f"❌ REJECTED (score: {result['overall_score']:.1f})")
        print(f"Reason: {result['rejection_reason']}")
    print("="*50)

    return result

---

## 🧪 Part 4: Testing

### Task 4.1: Test with Sample Content

Test your system with the provided examples representing different scenarios.

In [76]:
# Test Case 1: Clean social media post (should be approved and enhanced)
test_content_1 = """
just finished reading an amzing book about AI ethics! 
its really make me think about how we build responsible systems. 
highly recomend it to anyone in tech!
"""

# Test Case 2: Professional article excerpt (should be approved, might need minor fixes)
test_content_2 = """
Machine learning algorithms have transformed the healthcare industry over the past decade.
These systems now assist in diagnosis, treatment planning, and patient monitoring.
However, concerns about data privacy and algorithmic bias remain significant challenges
that researchers and practitioners must address to ensure equitable healthcare delivery.
"""

# Test Case 3: Short comment with grammar issues (should be approved but needs enhancement)
test_content_3 = "this is grate! i totally agree with ur point about ai safety its so important"

# Test Case 4: Content with potential safety issues (might be rejected or flagged)
test_content_4 = """
I hate this stupid product! Complete waste of money. 
The company is terrible and everyone should avoid them.
"""

# TODO: Run your orchestrator on each test case
# Display the results clearly showing:
# - Content type classification
# - Analysis results (safety, tone, grammar)
# - Evaluation score and decision
# - Enhanced version (if approved)

print("="*70)
print("TEST CASE 1: Social Media Post with Errors")
print("="*70)
result1 = run_moderation(test_content_1)

print("\n" + "="*70)
print("TEST CASE 2: Professional Article")
print("="*70)
result2 = run_moderation(test_content_2)

print("\n" + "="*70)
print("TEST CASE 3: Short Comment")
print("="*70)
result3 = run_moderation(test_content_3)

print("\n" + "="*70)
print("TEST CASE 4: Potentially Problematic Content")
print("="*70)
result4 = run_moderation(test_content_4)

TEST CASE 1: Social Media Post with Errors

📥 Input: 
just finished reading an amzing book about AI ethics! 
its really make me think...
🌐 Detected language: English (en)

Analyzing query for routing...
✅ Query routed to: social_media

🛡️ Safety: 1.00 | Flagged: False
🎭 Tone: 0.55 | Sentiment: positive
📝 Grammar: 0.75
🔎 Debug state results:
{'score': 1.0, 'flagged': False, 'issues': []}
{'score': 0.55, 'sentiment': 'positive', 'issues': ['informal tone', "spelling errors: 'amzing', 'recomend'", "grammar errors: 'its really make me think'"]}
{'score': 0.75, 'issues': ["Spelling error: 'amzing' should be 'amazing'.", "Grammar: 'its' should be 'it' or 'it's'.", "Grammar: 'make' should be 'makes'.", "Spelling error: 'recomend' should be 'recommend'.", 'Clarity: Sentence fragments and informal tone reduce clarity. Consider capitalizing the first word of each sentence.']}
📊 Overall score: 81.5 | Decision: APPROVE
✅ Approved with 8 suggestions
🔧 [Optimizer] Rewri

---

## 📊 Part 5: Reflection & Analysis

### Task 5.1: Pattern Usage Documentation

Document how you used each agentic pattern in your implementation.

### ✍️ YOUR PATTERN USAGE ANALYSIS

**1. Routing Pattern:**
- Where used: The route_query node classifies incoming content into social_media, article, or comment based on length, structure, and style.
- Why effective: Knowing the content type lets downstream agents adapt to it. In this implementation the optimizer uses the route to preserve the appropriate style for each type.

**2. Parallelization Pattern:**
- Where used: After routing, three agents run in parallel (safety_checker, tone_analyzer, and grammar_checker), each analyzing a different aspect of the content.
- Why effective: Each agent focuses on a single concern, which keeps the prompts simple and tends to make the results more accurate.
- Performance benefit: All three run concurrently, so the analysis step takes roughly as long as the slowest agent instead of the sum of all three, a speedup of up to 3x over sequential execution.
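
The performance claim can be illustrated without calling a model. The sketch below is a stand-in, not the notebook's pipeline: `fake_agent` sleeps instead of invoking an LLM, the latencies are invented, and a thread pool from the standard library plays the role of the parallel step.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_agent(name: str, seconds: float) -> str:
    """Stand-in for an LLM call: sleeps instead of calling a model."""
    time.sleep(seconds)
    return name

# Hypothetical latencies; real model calls vary from request to request
agents = [("safety", 0.2), ("tone", 0.3), ("grammar", 0.1)]

# Sequential: total time is the sum of all three latencies
start = time.perf_counter()
for name, seconds in agents:
    fake_agent(name, seconds)
sequential = time.perf_counter() - start

# Parallel: total time is close to the slowest single latency
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(agents)) as pool:
    results = list(pool.map(lambda job: fake_agent(*job), agents))
parallel = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, parallel: {parallel:.2f}s")
```

With these made-up latencies the parallel run finishes in about the time of the slowest agent, so the speedup is 3x only when all three agents take equally long.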

**3. Evaluator-Optimizer Pattern:**
- Where used: The evaluator aggregates scores from the three parallel agents and makes the approve/reject decision. If approved the optimizer generates improvement instructions.
- How feedback loop works: The evaluator identifies issues from all three agents, passes them as suggestions to the optimizer, which turns them into actionable rewrite instructions for the enhancer.

**4. Prompt Chaining Pattern:**
- Where used: The approve path from evaluator to final output.
- Stages in chain: Original content -> Optimizer (generate instructions) -> Enhancer (apply instructions) -> Final enhanced content.

**5. Orchestrator-Worker Pattern:**
- How orchestration is managed: build_moderation_graph is the orchestrator, defining the full pipeline structure - entry point, parallel part, conditional branching and the enhancement chain.
- Worker coordination: Each node is an independent worker with a single responsibility. The graph edges and conditional logic coordinate their execution order, with LangGraph managing state passing between all workers automatically.

---

### Task 5.2: Challenges & Solutions

Reflect on difficulties you encountered and how you solved them.

### ✍️ YOUR CHALLENGES & SOLUTIONS

**Challenge 1:**
- Problem: Parallel agents caused InvalidUpdateError because all three returned the full state, resulting in multiple conflicting writes to the same keys like content and route.
- Solution: Changed each parallel agent to return only its own key as a dict instead of the full state, which eliminated the write conflicts.
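
The conflict in Challenge 1 can be modelled in a few lines. This is a simplified illustration of the rule that a key without a reducer may be written by only one node per step; it is not LangGraph's code, and `ConflictingUpdateError` and `merge_parallel_updates` are hypothetical names.

```python
class ConflictingUpdateError(Exception):
    """Raised when two parallel nodes write the same key (illustrative only)."""

def merge_parallel_updates(updates: list[dict]) -> dict:
    """Merge updates from nodes that ran in the same step, rejecting duplicate keys."""
    merged: dict = {}
    for update in updates:
        for key, value in update.items():
            if key in merged:
                raise ConflictingUpdateError(f"more than one node wrote '{key}'")
            merged[key] = value
    return merged

state = {"content": "hello", "route": "comment"}

# Each node returns only its own key: no overlap, so the merge succeeds
partial = [
    {"safety_result": {"score": 1.0}},
    {"tone_result": {"score": 0.7}},
    {"grammar_result": {"score": 0.9}},
]
print(sorted(merge_parallel_updates(partial)))
# ['grammar_result', 'safety_result', 'tone_result']

# Each node returns the full state plus its result: 'content' is written three times
full = [{**state, **update} for update in partial]
try:
    merge_parallel_updates(full)
except ConflictingUpdateError as err:
    print(err)  # more than one node wrote 'content'
```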

**Challenge 2:**
- Problem: The grammar checker was analyzing the prompt text itself instead of the user content, flagging phrases like "replace example values with your actual assessment" as grammar issues.
- Solution: Restructured all agent prompts to separate the instructions from the content with a "TEXT TO ANALYZE:" header. After this change the agents analyzed the user content instead of the instructions.

**Challenge 3:**
- Problem: JSON prompts using float and bool as type annotations caused the model to either return the literal word "float" or score everything as 0.0 or 1.0.
- Solution: Replaced type annotations with example values ("score": 0.95) and added "replace example values with your actual assessment" so the model understood the JSON as a template, not a fixed response.
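
A minimal sketch of the template approach from Challenge 3, assuming a simplified safety prompt; `build_prompt` is a hypothetical helper. It shows the doubled braces an f-string needs for literal JSON and confirms that the example line is itself valid JSON.

```python
import json

def build_prompt(content: str) -> str:
    """Build an analysis prompt whose JSON template uses example values."""
    # Doubled braces {{ }} produce literal braces inside an f-string
    return f"""Analyze the following text for safety issues.

TEXT TO ANALYZE:
{content}

Respond with JSON only, replace example values with your actual assessment:
{{"score": 0.95, "flagged": false, "issues": []}}"""

prompt = build_prompt("great post, thanks for sharing")
template_line = prompt.splitlines()[-1]
print(template_line)  # {"score": 0.95, "flagged": false, "issues": []}

# The template is valid JSON, so a reply in the same shape parses cleanly
print(json.loads(template_line)["score"])  # 0.95
```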

**Challenge 4:**
- Problem: The router classified every content type as "comment" regardless of the actual content, because the prompt was indented inside the f-string which added leading whitespace that confused the model.
- Solution: Fixed prompt indentation so the text was flush to the left inside the f-string, and added a word count heuristic to automatically route long texts to "article" without relying on the model.
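
One way to keep prompt text indented in the source while sending it flush-left to the model is `textwrap.dedent` from the standard library. The sketch below uses a shortened version of the router prompt; `build_router_prompt` is a hypothetical helper, not the function used in the notebook.

```python
import textwrap

def build_router_prompt(content: str) -> str:
    """Return the router prompt with no leading indentation on any line."""
    template = textwrap.dedent("""\
        Classify this content into one category:
        - 'social_media': short, casual, informal
        - 'article': long, structured, formal writing
        - 'comment': brief reply or reaction to something

        Respond with ONLY the category name.

        Content: {content}

        Category:""")
    # Insert the content after dedenting so multi-line input cannot change the indentation
    return template.format(content=content)

prompt = build_router_prompt("this is great!")
print(prompt.splitlines()[0])  # Classify this content into one category:
```

Formatting after dedenting matters: if the user text were interpolated first, an unindented line inside it would stop `dedent` from removing the common prefix.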

**Challenge 5:**
- Problem: The content field was never passed to the parallel agents, which caused agents to analyze nothing and return identical hardcoded scores across all test cases.
- Solution: Traced the bug using debug prints at the top of each agent, which revealed that state['content'] was empty. The cause was the reducer lambda x, y: x on the content field in ModerationState: it always keeps the existing value (x) and discards the new one (y), which also blocks the initial write and leaves content as "". Removing the reducer from the content and route fields fixed it.

---

### Task 5.3: Framework Reflection

Now that you've completed the challenge, reflect on your framework choice.

### ✍️ YOUR FRAMEWORK REFLECTION

**What worked well with your chosen framework?**

LangGraph's graph structure made the pipeline easy to reason about. Adding nodes and edges was straightforward, and the fan-out/fan-in pattern for parallel agents worked as expected. The conditional edge after the evaluator was clean and readable: just a lambda returning "approve" or "reject". State management felt familiar; it is something I recognised from Haskell. It was nice having one shared object that flows through the entire system.

**What was difficult or limiting?**

The Annotated reducers were not intuitive at first, and the InvalidUpdateError from parallel writes took significant debugging to understand and fix. 

**Would you choose the same framework again? Why or why not?**

Yes. I think LangGraph was the right tool for this pipeline, even though I haven't tested other frameworks on this task. The explicit graph structure made the architecture transparent and easy to modify.

**What would you do differently next time?**

Start with each agent returning only its own key as a dict from the beginning, rather than returning the full state. This would have avoided the parallel write conflicts entirely. I would also spend more time on prompt engineering upfront, since most of the debugging time went into fixing how agents interpreted their prompts, not the pipeline structure itself.

---

## 🎁 Bonus Challenges (Optional)

If you want to go further, try these enhancements:

### Bonus 1: Multi-Language Support
- Add a language detection agent
- Support content in at least 3 languages

### Bonus 2: Customizable Moderation Rules
- Allow users to set content policy preferences
- Adjust safety thresholds based on use case (e.g., strict for children's content)
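
One possible shape for configurable rules is a small policy object that carries the threshold and weights. Everything in this sketch is an assumption made for illustration: `ModerationPolicy`, its field names, and the stricter values for a children's community are invented, with defaults matching the weights and threshold used in the evaluator.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModerationPolicy:
    """Hypothetical per-community settings; names and defaults are illustrative."""
    approval_threshold: float = 70.0   # overall score (0-100) needed to approve
    min_safety_score: float = 0.0      # hard floor on the 0-1 safety score
    safety_weight: float = 0.5
    tone_weight: float = 0.3
    grammar_weight: float = 0.2

    def decide(self, safety: float, tone: float, grammar: float) -> str:
        """Return 'approve' or 'reject' for three 0-1 scores."""
        if safety < self.min_safety_score:
            return "reject"  # the floor applies regardless of the other scores
        overall = (safety * self.safety_weight
                   + tone * self.tone_weight
                   + grammar * self.grammar_weight) * 100
        return "approve" if overall >= self.approval_threshold else "reject"

default_policy = ModerationPolicy()
kids_policy = ModerationPolicy(approval_threshold=85.0, min_safety_score=0.9)

scores = {"safety": 0.8, "tone": 0.9, "grammar": 0.9}
print(default_policy.decide(**scores))  # approve
print(kids_policy.decide(**scores))     # reject
```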

### Bonus 3: Performance Optimization
- Measure execution time for each component
- Implement caching for repeated content
- Optimize parallel execution
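
Timing and caching can both be prototyped with the standard library. In the sketch below, `timed`, `timings`, `moderate`, and `moderate_cached` are hypothetical names, and `moderate_cached` only simulates latency with a short sleep; a real version would wrap the pipeline call.

```python
import time
from functools import lru_cache, wraps

timings: dict[str, list[float]] = {}

def timed(func):
    """Record the wall-clock duration of every call under the function's name."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings.setdefault(func.__name__, []).append(time.perf_counter() - start)
    return wrapper

@lru_cache(maxsize=256)
def moderate_cached(content: str) -> str:
    """Stand-in for the pipeline; a real version would run the moderation graph."""
    time.sleep(0.05)  # simulate model latency
    return "approve"

@timed
def moderate(content: str) -> str:
    # Normalise whitespace so trivially different inputs share a cache entry
    return moderate_cached(content.strip())

moderate("same post")
moderate("same post")   # served from the cache, so no simulated latency
first, second = timings["moderate"]
print(f"first: {first:.3f}s, second: {second:.3f}s")
print(moderate_cached.cache_info().hits)  # 1
```

Because model output at non-zero temperature is not deterministic, caching changes behaviour slightly: repeated content always gets the first answer, which may or may not be what a moderation system wants.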

### Bonus 4: Explainability Dashboard
- Create a visualization showing:
  - Agent decision flow
  - Confidence scores at each stage
  - Before/after content comparison

### Bonus 5: Iterative Re-evaluation
- If enhanced content scores < 90, run another optimization loop
- Limit to maximum 3 iterations to prevent infinite loops
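
The loop described in Bonus 5 can be sketched independently of any model. Here `refine` takes the scoring and enhancing steps as plain functions; `toy_score` and `toy_enhance` are toy stand-ins invented so the example runs offline, and stopping early when a pass does not improve the score is an extra assumption beyond the two rules listed above.

```python
from typing import Callable

def refine(
    content: str,
    score_fn: Callable[[str], float],
    enhance_fn: Callable[[str], str],
    target: float = 90.0,
    max_iterations: int = 3,
) -> tuple[str, float, int]:
    """Enhance content until it reaches the target score or the iteration cap."""
    score = score_fn(content)
    iterations = 0
    while score < target and iterations < max_iterations:
        candidate = enhance_fn(content)
        candidate_score = score_fn(candidate)
        iterations += 1
        if candidate_score <= score:
            break  # no improvement: stop rather than keep a worse draft
        content, score = candidate, candidate_score
    return content, score, iterations

# Toy stand-ins so the loop can run without a model: each pass fixes one error
FIXES = [("amzing", "amazing"), ("recomend", "recommend"), ("its really make", "it really makes")]

def toy_score(text: str) -> float:
    return 100.0 - 10.0 * sum(bad in text for bad, _ in FIXES)

def toy_enhance(text: str) -> str:
    for bad, good in FIXES:
        if bad in text:
            return text.replace(bad, good)
    return text

draft = "an amzing book, its really make me think, highly recomend it"
final, final_score, used = refine(draft, toy_score, toy_enhance)
print(final_score, used)  # 90.0 2
```

In a LangGraph version, the same idea would likely become a conditional edge from a re-evaluation node back to the optimizer, with the iteration count stored in the state.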

---

In [77]:
# Bonus 1: Multi-Language Support
# Test content in different languages

# English test
print("\n" + "="*70)
print("BONUS TEST 1: English Content")
print("="*70)
en_content = "Just finished reading an amazing book about AI ethics!"
result_en = run_moderation(en_content)

# Spanish test
print("\n" + "="*70)
print("BONUS TEST 2: Spanish Content")
print("="*70)
es_content = "Acabo de terminar de leer un libro increíble sobre ética en IA!"
result_es = run_moderation(es_content)

# French test
print("\n" + "="*70)
print("BONUS TEST 3: French Content")
print("="*70)
fr_content = "Je viens de terminer la lecture d'un livre incroyable sur l'éthique de l'IA!"
result_fr = run_moderation(fr_content)

# German test
print("\n" + "="*70)
print("BONUS TEST 4: German Content")
print("="*70)
de_content = "Ich habe gerade ein erstaunliches Buch über KI-Ethik gelesen!"
result_de = run_moderation(de_content)

# for fun
print("\n" + "="*70)
print("BONUS TEST 5: Multiple languages")
print("="*70)
en_content1 = """I love this! Me encanta! C'est magnifique!"""
result_ff = run_moderation(en_content1)


print("\n" + "="*70)
print("MULTI-LANGUAGE SUPPORT SUMMARY")
print("="*70)
print(f"English detected as: {result_en['language']}")
print(f"Spanish detected as: {result_es['language']}")
print(f"French detected as: {result_fr['language']}")
print(f"German detected as: {result_de['language']}")
print("="*70)



BONUS TEST 1: English Content

📥 Input: Just finished reading an amazing book about AI ethics!...
🌐 Detected language: English (en)

Analyzing query for routing...
✅ Query routed to: social_media

📝 Grammar: 1.00
🎭 Tone: 0.70 | Sentiment: positive
🛡️ Safety: 1.00 | Flagged: False
🔎 Debug state results:
{'score': 1.0, 'flagged': False, 'issues': []}
{'score': 0.7, 'sentiment': 'positive', 'issues': ['informal tone', 'lack of detail', 'exclamation mark']}
{'score': 1.0, 'issues': []}
📊 Overall score: 91.0 | Decision: APPROVE
✅ Approved with 3 suggestions
🔧 [Optimizer] Rewrite plan generated
✨ [Enhancer] Result: I have just completed reading an insightful book on AI ethics that explores the complex challenges o

📤 FINAL RESULT
✅ APPROVED (score: 91.0)
Enhanced: I have just completed reading an insightful book on AI ethics that explores the complex challenges of algorithmic bias and accountability.

BONUS TEST 2: Spanish Content

📥 Input: Acabo de t

---

## 📝 Evaluation Criteria

Your implementation will be assessed on:

### Functionality (20 points)
- ✅ Router correctly classifies content types
- ✅ Parallel agents execute concurrently
- ✅ Evaluator makes appropriate approve/reject decisions
- ✅ Enhancement chain improves content quality
- ✅ Orchestrator coordinates full pipeline

### Pattern Implementation (20 points)
- ✅ Routing pattern clearly implemented
- ✅ Parallelization working correctly
- ✅ Evaluator-optimizer feedback loop functional
- ✅ Prompt chaining evident in enhancement
- ✅ Orchestrator-worker hierarchy clear

### Code Quality (20 points)
- ✅ Clean, readable code
- ✅ Proper error handling
- ✅ Good documentation/comments
- ✅ Framework best practices followed

### Reflection & Analysis (**40 points**)
- ✅ Thoughtful framework justification
- ✅ Clear pattern usage documentation
- ✅ Honest challenge/solution discussion
- ✅ Insightful framework reflection

### Bonus Points (up to 10 extra points)
- Optional challenges attempted and completed

---

## 🎉 Conclusion

Congratulations on completing this challenge! You've built a sophisticated multi-agent system that combines multiple agentic patterns in a real-world scenario.

### Key Takeaways

Through this challenge, you've learned:
- How to select appropriate frameworks for specific tasks
- How to combine multiple agentic patterns effectively
- How to design complex multi-agent systems
- How to handle real-world challenges in agent development
- How to evaluate and reflect on your architectural decisions

### Next Steps

1. **Experiment**: Try implementing this challenge with a different framework
2. **Extend**: Add more sophisticated features (RAG, custom tools, memory)
3. **Deploy**: Consider how you'd productionize this system
4. **Share**: Document your learnings and share with the community

Keep building, keep learning, and keep pushing the boundaries of what's possible with agentic systems! 🚀

---

**Happy Coding!** 💻✨