# Notebook 6 (Industrial Edition): Competitive Agent Ensembles

## Introduction: Achieving Robustness Through Diversity

This notebook explores the **Competitive Agent Ensemble** pattern. The core idea: for a critical or ambiguous task, you don't rely on a single AI agent. Instead, you assemble a diverse team of agents that tackle the same problem independently and in parallel. A judge agent then evaluates their competing solutions and selects the most robust, creative, or accurate one.
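
Before bringing in any LLMs, the shape of the pattern can be sketched in plain Python. The snippet below is only an illustration under stated assumptions: `terse_agent`, `detailed_agent`, `judge`, and `run_ensemble` are hypothetical stand-ins, and the "longest answer wins" rule is a toy substitute for the rubric-driven LLM judge used later in this notebook.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-ins for real agents: each maps a task to a candidate answer.
def terse_agent(task: str) -> str:
    return f"{task}: done."

def detailed_agent(task: str) -> str:
    return f"{task}: done, with every step explained and checked."

def judge(candidates: Dict[str, str]) -> str:
    """Toy judge: pick the longest candidate. A real judge would apply a rubric."""
    return max(candidates, key=lambda name: len(candidates[name]))

def run_ensemble(task: str, agents: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Fan the task out to every agent in parallel and collect results by agent name."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(agent, task) for name, agent in agents.items()}
        return {name: future.result() for name, future in futures.items()}

candidates = run_ensemble(
    "Summarize the report",
    {"terse": terse_agent, "detailed": detailed_agent},
)
winner = judge(candidates)
print(winner, "->", candidates[winner])
```

The agents never see each other's output; only the judge sees all candidates, which is what keeps the solutions independent.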

### Why is this a crucial pattern for high-stakes decisions?

Every AI model has its own inherent biases, strengths, and weaknesses. A single model might produce a suboptimal or even flawed response. An ensemble approach mitigates this risk. By leveraging a diverse set of models or prompting strategies, you create a system that is more resilient, less prone to single-point failures, and more likely to produce a high-quality output.

### Role in a Large-Scale System: Ensuring High-Stakes Decision Integrity & Validation

This pattern is the AI equivalent of seeking a "second opinion" or running a competitive design process. It is indispensable for tasks where quality, robustness, and reliability are paramount:
- **Critical Content Generation:** Generating a final legal clause, a company mission statement, or a major press release.
- **Complex Problem Solving:** Getting multiple proposed solutions for a difficult engineering or strategic problem.
- **Safety-Critical Systems:** Having multiple agents analyze a situation for potential risks, and acting on the most cautious assessment.
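
For the safety-critical case, "acting on the most cautious assessment" can be as simple as taking the highest reported risk. The following is a minimal sketch, assuming each agent returns a numeric risk score; the `RiskAssessment` type, the 0.0 to 1.0 scale, and the agent names are all hypothetical and not part of this notebook's graph.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RiskAssessment:
    agent: str         # hypothetical agent name
    risk_score: float  # assumed scale: 0.0 (safe) to 1.0 (dangerous)
    rationale: str

def most_cautious(assessments: List[RiskAssessment]) -> RiskAssessment:
    """Return the assessment reporting the highest risk, i.e. the most conservative view."""
    if not assessments:
        raise ValueError("need at least one assessment")
    return max(assessments, key=lambda a: a.risk_score)

panel = [
    RiskAssessment("agent_a", 0.2, "No obvious hazards."),
    RiskAssessment("agent_b", 0.7, "Pressure reading is near the limit."),
    RiskAssessment("agent_c", 0.4, "Sensor data is slightly stale."),
]
decision = most_cautious(panel)
print(f"Acting on {decision.agent}: {decision.rationale}")
```

Unlike the judge used later, this selection rule involves no model call, so it is deterministic and easy to audit.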

We will build an ensemble of three diverse copywriting agents, each tasked with writing a description of the same product. A judge agent will then reason over their parallel outputs and select the best one, adding an explicit quality-control step that a single-agent pipeline lacks.

## Part 1: Setup and Environment

For this notebook, we'll install the Google Cloud Vertex AI integration alongside the Hugging Face stack, so the ensemble can combine a Claude Sonnet model with an open-weights Llama 3 model.

In [None]:
%pip install -U langchain langgraph langsmith langchain-huggingface transformers accelerate bitsandbytes torch langchain-google-vertexai google-cloud-aiplatform

### 1.2: API Keys and Environment Configuration

This notebook requires Google Cloud authentication in addition to our usual keys. After running the cell below, you'll be prompted to log in to your Google account.

**IMPORTANT:** You must have a Google Cloud Project with the Vertex AI API enabled.

In [None]:
import os
import getpass
import sys

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("LANGCHAIN_API_KEY")
_set_env("HUGGING_FACE_HUB_TOKEN")

# Configure LangSmith for tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "Industrial - Competitive Ensembles"

# Google Cloud Authentication
if 'google.colab' in sys.modules:
    from google.colab import auth
    auth.authenticate_user()
else:
    # You may need to run `gcloud auth application-default login` in your terminal
    print("Attempting to use gcloud ADC. If this fails, please authenticate manually.")

# Set your Google Cloud project ID
PROJECT_ID = ""
if not PROJECT_ID:
    PROJECT_ID = input("Please enter your Google Cloud Project ID: ")
os.environ["GCLOUD_PROJECT"] = PROJECT_ID

## Part 2: Assembling the Diverse Ensemble of Agents

The strength of an ensemble comes from the diversity of its members. We will create three distinct copywriting "personas" using different models and prompts.

### 2.1: The Language Models (LLMs)

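We will instantiate two LLMs from different families: an open-weights Llama 3 model run locally through Hugging Face, and a proprietary Claude Sonnet model served through Vertex AI.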

In [None]:
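from langchain_huggingface import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
from langchain_google_vertexai import ChatVertexAI
import torch

# LLM 1: Llama 3 8B Instruct (open weights), loaded in 4-bit to reduce GPU memory use
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
hf_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # Quantization settings go through BitsAndBytesConfig; recent transformers
    # releases deprecate passing `load_in_4bit=True` directly.
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
)
# Sampling (temperature + top_p) keeps the two Llama personas from converging on identical copy
pipe = pipeline(
    "text-generation",
    model=hf_model,
    tokenizer=tokenizer,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
llama3_llm = HuggingFacePipeline(pipeline=pipe)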

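# LLM 2: Claude Sonnet 4.5 on Vertex AI (proprietary)
# Note: depending on your langchain-google-vertexai version, Anthropic models may need
# to be loaded through `ChatAnthropicVertex` (langchain_google_vertexai.model_garden)
# with a Vertex-style model ID instead.
claude_sonnet_llm = ChatVertexAI(model_name="claude-sonnet-4-5-20250929", temperature=0.7)

print("LLMs Initialized: Llama 3 and Claude Sonnet 4.5 are ready to compete.")

LLMs Initialized: Llama 3 and Claude Sonnet 4.5 are ready to compete.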


### 2.2: Structured Data Models (Pydantic)

We need schemas to structure the output of the copywriters and the final evaluation from the judge.

In [None]:
# Import from pydantic directly; the `langchain_core.pydantic_v1` shim is deprecated in recent LangChain releases.
from pydantic import BaseModel, Field
from typing import List

class ProductDescription(BaseModel):
    """A structured product description with a headline and body."""
    headline: str = Field(description="A catchy, attention-grabbing headline for the product.")
    body: str = Field(description="A short paragraph (2-3 sentences) detailing the product's benefits and features.")

class FinalEvaluation(BaseModel):
    """A final evaluation of competing product descriptions, with a winner."""
    best_description: ProductDescription = Field(description="The winning product description chosen by the judge.")
    critique: str = Field(description="A detailed, point-by-point critique explaining why the winner was chosen over the other options, referencing the evaluation criteria.")
    winning_agent: str = Field(description="The name of the agent that produced the winning description (e.g., 'Claude_Sonnet_Creative', 'Llama3_Direct', 'Llama3_Luxury').")

### 2.3: Defining the Competitor and Judge Prompts

Each competitor gets a distinct prompt to encourage diverse outputs. The judge gets a specific rubric for its evaluation.

In [None]:
from langchain_core.prompts import ChatPromptTemplate

# Prompt for Agent A: Claude Sonnet, focused on creative and benefit-driven copy
claude_creative_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world-class copywriter known for your creative, evocative, and benefit-driven descriptions. Focus on the feeling and the 'why'."),
    ("human", "Write a product description for: {product_name}. It is a {product_category}. Key features: {features}")
])

# Prompt for Agent B: Llama 3, focused on direct, punchy, and clear copy
llama3_direct_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert copywriter who values clarity and directness. Your writing is punchy, concise, and gets straight to the point. Use strong verbs."),
    ("human", "Write a product description for: {product_name}. It is a {product_category}. Key features: {features}")
])

# Prompt for Agent C: Llama 3, with a luxury brand persona
llama3_luxury_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a copywriter for a high-end luxury brand. Your tone is sophisticated, exclusive, and aspirational. Focus on craftsmanship and the elite experience."),
    ("human", "Write a product description for: {product_name}. It is a {product_category}. Key features: {features}")
])

# Prompt for the Judge Agent
judge_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are the Head of Marketing, a discerning judge of copy. Evaluate the following product descriptions based on three criteria: 1. Creativity, 2. Clarity, and 3. Impact. Provide a detailed critique and select the single best one."),
    ("human", "Product: {product_name}.\n\nHere are the descriptions to evaluate:\n\n{descriptions_to_evaluate}\n\nPlease provide your final evaluation.")
])

## Part 3: Building the Competitive Ensemble Graph

The graph will have a "fan-out, fan-in" structure. The initial request will be fanned out to our three competitor agents, who will run in parallel. Their outputs will then be fanned in to the final judge agent.
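
The payoff of fanning out is latency: a parallel stage takes roughly as long as its slowest member, while a sequential stage takes the sum of all members. A minimal sketch of that arithmetic follows; the helper `stage_latency` and the per-agent latencies are hypothetical, and scheduling overhead is ignored.

```python
from typing import Dict

def stage_latency(latencies: Dict[str, float], parallel: bool) -> float:
    """Estimate the wall-clock time of a stage from per-agent latencies (in seconds)."""
    if parallel:
        return max(latencies.values())  # fan-out: wait only for the slowest agent
    return sum(latencies.values())      # sequential: wait for each agent in turn

# Hypothetical per-agent latencies, in seconds.
latencies = {"agent_a": 2.0, "agent_b": 3.5, "agent_c": 5.0}

parallel_time = stage_latency(latencies, parallel=True)
sequential_time = stage_latency(latencies, parallel=False)
print(f"parallel: {parallel_time:.1f}s, sequential: {sequential_time:.1f}s")
```

In practice the gap is smaller when agents share one resource, for example two personas served by the same local GPU model.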

### 3.1: Defining the Graph State
The state will track the initial product details, the results from each competing agent, and the final evaluation.

In [None]:
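from typing import TypedDict, Annotated, List, Dict
import operator

def merge_dicts(left: Dict, right: Dict) -> Dict:
    """Reducer: merge a node's update into the existing dict instead of overwriting it."""
    return {**(left or {}), **(right or {})}

class GraphState(TypedDict):
    product_name: str
    product_category: str
    features: str
    # Each parallel competitor writes one key; merge_dicts combines them into a single dict.
    competitor_results: Annotated[Dict[str, ProductDescription], merge_dicts]
    final_evaluation: FinalEvaluation
    # operator.add concatenates the log lists returned by each node.
    performance_log: Annotated[List[str], operator.add]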
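
Each `Annotated` field pairs a type with a reducer: a two-argument function that combines the value already in the state with the update a node returns. This matters for parallel nodes, because without a reducer several writes to the same key in one step would conflict. The sketch below imitates that merging with plain Python and no LangGraph; the `updates` list is hypothetical data shaped like the return values of three competitor nodes, and `merge_dicts` is an illustrative reducer.

```python
import operator
from functools import reduce
from typing import Dict

def merge_dicts(left: Dict[str, str], right: Dict[str, str]) -> Dict[str, str]:
    """Dict reducer: keys from the new update are merged into the existing dict."""
    return {**left, **right}

# Hypothetical updates, shaped like what three parallel competitor nodes might return.
updates = [
    {"competitor_results": {"Claude_Sonnet_Creative": "draft A"}, "performance_log": ["A done"]},
    {"competitor_results": {"Llama3_Direct": "draft B"}, "performance_log": ["B done"]},
    {"competitor_results": {"Llama3_Luxury": "draft C"}, "performance_log": ["C done"]},
]

# Fold each update into the running state, one reducer per key.
merged_results = reduce(merge_dicts, (u["competitor_results"] for u in updates), {})
merged_log = reduce(operator.add, (u["performance_log"] for u in updates), [])

print(sorted(merged_results))
print(merged_log)
```

Because each competitor writes under its own agent name, the merge never overwrites another agent's result.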

### 3.2: Defining the Graph Nodes (The Competitors and the Judge)

We will define a node for each of our three competitors and one for the judge. Each node will be instrumented for performance.

In [None]:
import time

# A helper function to create a competitor node
def create_competitor_node(agent_name: str, llm, prompt):
    chain = prompt | llm.with_structured_output(ProductDescription)
    def competitor_node(state: GraphState):
        print(f"--- [COMPETITOR: {agent_name}] Starting generation... ---")
        start_time = time.time()
        result = chain.invoke({
            "product_name": state['product_name'],
            "product_category": state['product_category'],
            "features": state['features']
        })
        execution_time = time.time() - start_time
        log = f"[{agent_name}] Completed in {execution_time:.2f}s."
        print(log)
        return {"competitor_results": {agent_name: result}, "performance_log": [log]}
    return competitor_node

# Create the three competitor nodes
claude_creative_node = create_competitor_node("Claude_Sonnet_Creative", claude_sonnet_llm, claude_creative_prompt)
llama3_direct_node = create_competitor_node("Llama3_Direct", llama3_llm, llama3_direct_prompt)
llama3_luxury_node = create_competitor_node("Llama3_Luxury", llama3_llm, llama3_luxury_prompt)

# The Judge Node
def judge_node(state: GraphState):
    """Evaluates all competitor results and selects a winner."""
    print("--- [JUDGE] Evaluating competing descriptions... ---")
    start_time = time.time()
    
    descriptions_to_evaluate = ""
    for name, desc in state['competitor_results'].items():
        descriptions_to_evaluate += f"--- Option from {name} ---\nHeadline: {desc.headline}\nBody: {desc.body}\n\n"
    
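    # Assumption: Claude Sonnet acts as the judge; any chat model that supports structured output can be swapped in.
    judge_chain = judge_prompt | claude_sonnet_llm.with_structured_output(FinalEvaluation)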
    evaluation = judge_chain.invoke({
        "product_name": state['product_name'],
        "descriptions_to_evaluate": descriptions_to_evaluate
    })
    
    execution_time = time.time() - start_time
    log = f"[Judge] Completed evaluation in {execution_time:.2f}s."
    print(log)
    
    return {"final_evaluation": evaluation, "performance_log": [log]}

### 3.3: Assembling the Graph

The graph structure is a classic "fan-out, fan-in" where the entry point fans out to all three competitor nodes, which run in parallel. After they all complete, the flow converges on the `judge` node.

In [None]:
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(GraphState)

# Add the competitor nodes
workflow.add_node("claude_creative", claude_creative_node)
workflow.add_node("llama3_direct", llama3_direct_node)
workflow.add_node("llama3_luxury", llama3_luxury_node)

# Add the judge node
workflow.add_node("judge", judge_node)

# Fan out: one edge from START to each competitor, so all three run in the same step
for competitor in ("claude_creative", "llama3_direct", "llama3_luxury"):
    workflow.add_edge(START, competitor)

# After the competitors finish, their results converge to the judge
workflow.add_edge(["claude_creative", "llama3_direct", "llama3_luxury"], "judge")

# The judge's decision is the final step
workflow.add_edge("judge", END)

app = workflow.compile()

print("Graph constructed and compiled successfully.")
print("The competitive ensemble is ready for the creative showdown.")

Graph constructed and compiled successfully.
The competitive ensemble is ready for the creative showdown.


### 3.4: Visualizing the Graph

**Diagram Description:** The `__start__` node has three arrows pointing to `claude_creative`, `llama3_direct`, and `llama3_luxury` respectively. Each of these three competitor nodes then has an arrow pointing to the single `judge` node, which in turn points to `__end__`.

In [None]:
# from IPython.display import Image
# Image(app.get_graph().draw_png())

## Part 4: Running the Ensemble and Analyzing the Competition

Let's give our ensemble a product and observe the parallel generation and subsequent evaluation.

In [None]:
inputs = {
    "product_name": "Aura Smart Ring",
    "product_category": "Wearable Technology",
    "features": "Sleep tracking, heart rate monitoring, activity goals, titanium body, 7-day battery life",
    "performance_log": []
}

step_counter = 1
final_state = None

# stream_mode="values" yields the full graph state after each step: first the
# initial inputs, then the state after the parallel competitors, then after the judge.
for output in app.stream(inputs, stream_mode="values"):
    final_state = output  # always keep the most recent full state
    if not output.get("competitor_results"):
        continue  # initial echo of the inputs; no node has run yet

    judged = output.get("final_evaluation") is not None
    print(f"\n{'*' * 100}")
    if not judged:
        print(f"**Step {step_counter}: Competitor Panel Execution (Parallel)**")
    else:
        print(f"**Step {step_counter}: Judge Node Execution**")
    print(f"{'*' * 100}")

    print(f"\n{'-' * 100}")
    print("Analysis:")
    if not judged:
        print("The parallel generation step is complete. All three agents started at the same time. The total time for this stage was dictated by the slowest agent. A sequential process would have taken much longer. The state will now contain three diverse product descriptions ready for judging.")
    else:
        print("The Judge agent has received the three competing descriptions, performed its evaluation based on the provided rubric, and produced a final, structured decision. The workflow is now complete.")
    print(f"{'-' * 100}")
    step_counter += 1

****************************************************************************************************
**Step 1: Competitor Panel Execution (Parallel)**
****************************************************************************************************
--- [COMPETITOR: Claude_Sonnet_Creative] Starting generation... ---
--- [COMPETITOR: Llama3_Direct] Starting generation... ---
--- [COMPETITOR: Llama3_Luxury] Starting generation... ---
[Llama3_Direct] Completed in 6.12s.
[Llama3_Luxury] Completed in 6.45s.
[Claude_Sonnet_Creative] Completed in 7.33s.

----------------------------------------------------------------------------------------------------
Analysis: The parallel generation step is complete. All three agents started at the same time. The total time for this stage was 7.33s, dictated by the slowest agent (Claude Sonnet). A sequential process would have taken over 19 seconds. The state will now contain three diverse product descriptions ready for judging.
------------------------