# Validating LangChain Agents with Mellea

This notebook demonstrates how to validate **LangChain agent outputs** using **Mellea's validation API**.

## The Problem

LangChain agents generate outputs that you may want to validate against requirements (e.g., "the response must include a specific value" or "the output should follow a certain format"). Mellea provides a validation API built around `m.validate()`, `req()`, and `simple_validate()`, but it expects outputs to come from Mellea's own generation pipeline.

## Our Solution

**Wrap external output in `ModelOutputThunk`** to use Mellea's full native validation API. This enables:
- `req()` and `simple_validate()` - Mellea's standard validation primitives
- Both programmatic and LLM-based validation
- Full IVR (Instruct-Validate-Repair) loop integration

## Table of Contents
1. [The Challenge (Technical Details)](#the-challenge)
2. [Setup](#setup)
3. [Basic Agent (The Problem)](#basic-agent)
4. [Proposed Solution: Native Validation with ModelOutputThunk](#solution)
5. [Alternative Approaches](#alternatives)
6. [Summary](#summary)

---
<a id="the-challenge"></a>
## The Challenge: Why Standard Validation Doesn't Work

### The Naive Approach (and why it fails)

You might expect to validate LangChain output like this:

```python
from mellea.stdlib.requirements.requirement import req, simple_validate

REQUIREMENTS = [
    req("The number must be greater than 5", 
        validation_fn=simple_validate(lambda x: int(x) > 5)),
]

# This fails!
validations = m.validate(REQUIREMENTS, output=langchain_response)
```

This fails with: `AssertionError: Context has no appropriate last output`

### Why It Fails

Mellea's validation expects a `ModelOutputThunk` in the context:

1. `m.validate()` looks for the last output using `ctx.last_output()`
2. `last_output()` searches for a `ModelOutputThunk` object
3. External strings (like LangChain responses) are stored as plain `CBlock` objects
4. No `ModelOutputThunk` found → validation fails
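
The lookup described in the steps above can be illustrated with a toy model. This is only a sketch: `ToyBlock`, `ToyThunk`, and `ToyContext` are hypothetical stand-ins written for this example, not Mellea's classes, and the backwards scan in `last_output()` is an assumption about the general idea rather than a copy of Mellea's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyBlock:
    """Stand-in for a plain content block (hypothetical, not Mellea's CBlock)."""

    value: str


@dataclass(frozen=True)
class ToyThunk(ToyBlock):
    """Stand-in for a model-output wrapper (hypothetical, not ModelOutputThunk)."""


class ToyContext:
    """Immutable sequence of entries; add() returns a new context."""

    def __init__(self, entries: tuple = ()):
        self._entries = entries

    def add(self, entry: ToyBlock) -> "ToyContext":
        return ToyContext(self._entries + (entry,))

    def last_output(self):
        """Return the most recent ToyThunk, or None if there is none."""
        for entry in reversed(self._entries):
            if isinstance(entry, ToyThunk):
                return entry
        return None


# An external string stored as a plain block is invisible to the lookup.
plain = ToyContext().add(ToyBlock("The number is 7."))
# The same string wrapped in the thunk type is found.
wrapped = ToyContext().add(ToyThunk("The number is 7."))

print(plain.last_output())    # None
print(wrapped.last_output())  # ToyThunk(value='The number is 7.')
```

In this toy model the only difference between the two contexts is the type of the stored entry, which is the same distinction the list above draws between `CBlock` and `ModelOutputThunk`.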

### The Solution

**Wrap external output in `ModelOutputThunk`**:

```python
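import mellea
from mellea.core.base import ModelOutputThunk
from mellea.stdlib.context import ChatContext

# langchain_response is the agent's final answer as a string
ctx = ChatContext()
thunk = ModelOutputThunk(value=langchain_response)
ctx = ctx.add(thunk)
m = mellea.start_session(ctx=ctx)
m.validate(REQUIREMENTS)  # Now it works!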
```

This is the key insight that enables native Mellea validation on external agent outputs.

---
<a id="setup"></a>
## Setup

First, let's import all necessary libraries and define the common components used across all approaches.

In [None]:
# install dependencies, if needed
%pip install langchain langchain-ollama mellea

In [1]:
import random
import re

from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_core.tools import tool
from langchain.agents import create_agent

import mellea
from mellea.core.base import ModelOutputThunk
from mellea.stdlib.components.chat_converters import langchain_messages_to_mellea
from mellea.stdlib.context import ChatContext
from mellea.stdlib.requirements.requirement import req, simple_validate

### Define Tools

These are the tools our LangChain agent will use. They're shared across all approaches.

In [2]:
@tool
def random_number() -> int:
    """Generate a random number between 1 and 10."""
    return random.randint(1, 10)


@tool
def get_team_name() -> str:
    """Returns a random team name."""
    return random.choice(["AI Foundations", "Quantum Plus AI"])


TOOLS = [random_number, get_team_name]

### Create the LLM and Base Agent

In [3]:
llm = ChatOllama(model="granite4:latest", temperature=0.0)

# Basic agent without validation
basic_agent = create_agent(
    model=llm, tools=TOOLS, system_prompt="You are a helpful AI agent."
)

---
<a id="basic-agent"></a>
## Basic Agent (No Validation)

First, let's see the basic agent in action without any validation. The agent will generate a random number and tell us the team name, but there's no guarantee the number will be greater than 5.

In [4]:
print("=== Basic Agent Result (No Validation) ===")
for i in range(5):
    result = basic_agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Generate a random number and tell me what team I'm on.",
                }
            ]
        }
    )
    last_message = result["messages"][-1]
    print(last_message.content)

=== Basic Agent Result (No Validation) ===
The random number generated is **3**. You are on the team named **Quantum Plus AI**.
The random number generated is **3**. You are on the team named **AI Foundations**.
The random number generated is **3**. You are on the team named **Quantum Plus AI**.
The random number generated is **7**. You are on the team named **Quantum Plus AI**.
The random number generated is **5**. You are on the team named **AI Foundations**.


**Problem**: The agent might return a number <= 5 or name a team other than the one we require. Without validation, nothing enforces requirements on the output.

Now let's see how to solve this with Mellea validation.

---
<a id="solution"></a>
## Proposed Solution: Native Validation with ModelOutputThunk

This is our **recommended approach** for validating LangChain agent outputs. By wrapping the agent's output in a `ModelOutputThunk`, we can use Mellea's full native validation API.

**Benefits:**
- Use `req()` and `simple_validate()` - Mellea's standard validation primitives
- Support both programmatic validation functions and LLM-based validation
- Get validation scores, reasons, and metadata
- Integrate with Mellea's IVR (Instruct-Validate-Repair) loop

### How It Works

1. Run your LangChain agent to get output
2. Convert the conversation context to Mellea format (excluding the last assistant message)
3. Wrap the agent's output in `ModelOutputThunk`
4. Use `m.validate()` with your requirements

In [5]:
def validate_langchain_output(
    lc_messages: list, requirements: list, verbose: bool = True
) -> list:
    """Validate a LangChain agent's output using Mellea's native validation API.

    Args:
        lc_messages: List of LangChain messages (the last one should be AIMessage with output to validate)
        requirements: List of Mellea requirements (created with req())
        verbose: Print progress information

    Returns:
        List of ValidationResult objects from m.validate()
    """
    if verbose:
        print("--- LangChain Conversation ---")
        for msg in lc_messages:
            print(
                f"  [{msg.type}] {msg.content[:80]}{'...' if len(msg.content) > 80 else ''}"
            )

    # Step 1: Convert messages EXCEPT the last assistant message to Mellea format
    lc_context = lc_messages[:-1]
    mellea_messages = langchain_messages_to_mellea(lc_context)

    # Step 2: Build ChatContext with converted messages
    ctx = ChatContext()
    for msg in mellea_messages:
        ctx = ctx.add(msg)

    # Step 3: Wrap the agent's output in ModelOutputThunk
    agent_output = lc_messages[-1].content
    thunk = ModelOutputThunk(value=agent_output)
    ctx = ctx.add(thunk)

    if verbose:
        print("\n--- Output to Validate ---")
        print(f"  '{agent_output}'")

    # Step 4: Start Mellea session and validate
    m = mellea.start_session(ctx=ctx)
    validations = m.validate(requirements)

    if verbose:
        print("\n--- Validation Results ---")
        for i, v in enumerate(validations):
            status = "PASS" if v.as_bool() else "FAIL"
            print(f"  [{status}] {requirements[i].description}")
            if v.reason:
                print(f"         Reason: {v.reason}")

    return validations

### Example: Validating Agent Output

Let's use our helper function to validate LangChain agent output against requirements.

In [6]:
print("=" * 60)
print("Proposed Solution: Native Validation with ModelOutputThunk")
print("=" * 60)

# Run the LangChain agent
result = basic_agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Generate a random number and tell me what team I'm on.",
            }
        ]
    }
)

# Build the conversation as LangChain messages
lc_conversation = [
    SystemMessage(content="You are a helpful AI agent."),
    HumanMessage(content="Generate a random number and tell me what team I'm on."),
    result["messages"][-1],  # The AIMessage from the agent
]

# Define requirements using Mellea's native API
REQUIREMENTS = [
    req(
        "The number must be greater than 5",
        validation_fn=simple_validate(
            lambda x: any(n in x for n in ["6", "7", "8", "9", "10"])
        ),
    ),
    req("The response should mention a team name"),  # LLM-based validation
]

# Validate!
validations = validate_langchain_output(lc_conversation, REQUIREMENTS)

Proposed Solution: Native Validation with ModelOutputThunk
--- LangChain Conversation ---
  [system] You are a helpful AI agent.
  [human] Generate a random number and tell me what team I'm on.
  [ai] The random number generated is **3**. You are on the team named **Quantum Plus A...

--- Output to Validate ---
  'The random number generated is **3**. You are on the team named **Quantum Plus AI**.'
Starting Mellea session: backend=ollama, model=granite4:micro, context=ChatContext

--- Validation Results ---
  [FAIL] The number must be greater than 5
  [PASS] The response should mention a team name
         Reason: yes


### Building a Full IVR Loop

We can wrap the validation in an Instruct-Validate-Repair loop that retries until requirements pass.

In [14]:
def run_agent_with_native_ivr(
    agent,
    user_input: str,
    requirements: list,
    loop_budget: int = 5,
    verbose: bool = True,
) -> dict:
    """Run a LangChain agent with Mellea native IVR validation.

    This wraps agent execution in an Instruct-Validate-Repair loop using
    Mellea's native validation API via ModelOutputThunk.
    """
    messages = [{"role": "user", "content": user_input}]

    for attempt in range(1, loop_budget + 1):
        if verbose:
            print(f"\n--- Attempt {attempt}/{loop_budget} ---")

        # INSTRUCT: Run the agent
        result = agent.invoke({"messages": messages})
        response = result["messages"][-1].content

        if verbose:
            print(
                f"Agent response: {response[:100]}{'...' if len(response) > 100 else ''}"
            )

        # Build LangChain conversation for validation
        lc_conversation = [
            SystemMessage(content="You are a helpful AI agent."),
            HumanMessage(content=user_input),
            AIMessage(content=response),
        ]

        # VALIDATE: Use native Mellea validation
        validations = validate_langchain_output(
            lc_conversation, requirements, verbose=False
        )

        # Check results
        failed_reqs = []
        for i, v in enumerate(validations):
            status = "PASS" if v.as_bool() else "FAIL"
            if verbose:
                print(f"  [{status}] {requirements[i].description}")
            if not v.as_bool():
                failed_reqs.append(requirements[i].description)

        if not failed_reqs:
            if verbose:
                print("All requirements passed!")
            return {"content": response, "success": True, "attempts": attempt}

        # REPAIR: Provide feedback for retry
        if attempt < loop_budget:
            repair_message = (
                f"Your response did not meet: {failed_reqs}. Please try again."
            )
            messages = [
                {"role": "user", "content": user_input},
                {"role": "assistant", "content": response},
                {"role": "user", "content": repair_message},
            ]

    return {"content": response, "success": False, "attempts": loop_budget}


# Run with IVR
print("=" * 60)
print("Proposed Solution: Full IVR Loop with Native Validation")
print("=" * 60)

STRICT_REQUIREMENTS = [
    req(
        "The number must be greater than 5",
        validation_fn=simple_validate(
            lambda x: any(n in x for n in ["6", "7", "8", "9", "10"])
        ),
    ),
    req("The team name must be AI Foundations"),  # Specific team required
]

ivr_result = run_agent_with_native_ivr(
    basic_agent,
    "Generate a random number and tell me what team I'm on.",
    STRICT_REQUIREMENTS,
)

print(f"\n{'=' * 60}")
print(f"FINAL: Success={ivr_result['success']}, Attempts={ivr_result['attempts']}")
print(f"Response: {ivr_result['content']}")

Proposed Solution: Full IVR Loop with Native Validation

--- Attempt 1/5 ---
Agent response: The random number generated is **4**. You are on the team named **AI Foundations**.
Starting Mellea session: backend=ollama, model=granite4:micro, context=ChatContext
  [FAIL] The number must be greater than 5
  [PASS] The team name must be AI Foundations

--- Attempt 2/5 ---
Agent response: The random number generated is **7**. You are on the team named **Quantum Plus AI**.
Starting Mellea session: backend=ollama, model=granite4:micro, context=ChatContext
  [PASS] The number must be greater than 5
  [FAIL] The team name must be AI Foundations

--- Attempt 3/5 ---
Agent response: The team name is **AI Foundations**.
Starting Mellea session: backend=ollama, model=granite4:micro, context=ChatContext
  [FAIL] The number must be greater than 5
  [PASS] The team name must be AI Foundations

--- Attempt 4/5 ---
Agent response: The random number generated is **3**, and you are on the team 
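
The control flow of the loop can also be exercised without a model or Mellea session. The following is a minimal sketch under that assumption: `ivr_loop` is a hypothetical helper written for this example, the "agent" is a stub that replays canned strings, and the checks are plain Python functions rather than Mellea requirements.

```python
from typing import Callable


def ivr_loop(
    generate: Callable[[list[str]], str],
    checks: dict[str, Callable[[str], bool]],
    loop_budget: int = 5,
) -> dict:
    """Retry generate() until every check passes or the budget runs out."""
    feedback: list[str] = []
    response = ""
    for attempt in range(1, loop_budget + 1):
        # INSTRUCT: produce a response, given the names of previously failed checks
        response = generate(feedback)
        # VALIDATE: collect the names of the checks that fail
        failed = [name for name, check in checks.items() if not check(response)]
        if not failed:
            return {"content": response, "success": True, "attempts": attempt}
        # REPAIR: hand the failures back to the generator for the next attempt
        feedback = failed
    return {"content": response, "success": False, "attempts": loop_budget}


# Stub agent: replays canned responses instead of calling a model.
canned = iter(["number 4, team AI Foundations", "number 7, team AI Foundations"])

result = ivr_loop(
    generate=lambda feedback: next(canned),
    checks={
        "number is 7": lambda text: "7" in text,
        "team is AI Foundations": lambda text: "AI Foundations" in text,
    },
)
print(result)
```

With the two canned responses, the first attempt fails the number check and the second passes both checks, so the loop stops after two attempts.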

---
<a id="alternatives"></a>
## Alternative Approaches

The following approaches are alternatives that may be useful in specific scenarios.

### Alternative 1: Validation as a Tool

Give the LangChain agent a **validation tool** that it can call to self-check its responses.

**When to use:**
- You want the agent to have autonomy over when to validate
- Requirements might vary based on context
- You want the agent to reason about and explain its validation decisions

#### Define the Validation Tool

In [None]:
@tool
def validate_response(response: str, requirements: list[str]) -> dict:
    """Validate a response against requirements using Mellea's native validation.

    Use this tool to check if a response meets specific requirements before
    providing it to the user. This helps ensure response quality.

    Args:
        response: The response text to validate
        requirements: List of requirement descriptions to check against

    Returns:
        Dict with 'valid' (bool), 'failed_requirements' (list), and 'details' (dict)
    """
    # Build a minimal context with the response as ModelOutputThunk
    ctx = ChatContext()
    ctx = ctx.add(ModelOutputThunk(value=response))
    m = mellea.start_session(ctx=ctx)

    failed = []
    details = {}

    for requirement in requirements:
        # Create a requirement and validate
        # For tool use, we use LLM-based validation (no validation_fn)
        reqs = [req(requirement)]
        validations = m.validate(reqs)

        passed = validations[0].as_bool()
        details[requirement] = "PASS" if passed else "FAIL"
        if not passed:
            failed.append(requirement)

    return {
        "valid": len(failed) == 0,
        "failed_requirements": failed,
        "details": details,
    }

#### Create Agent with Validation Tool

In [None]:
# Include the validation tool alongside domain tools
TOOLS_WITH_VALIDATION = [random_number, get_team_name, validate_response]

# The system prompt encourages (but doesn't require) validation
agent_with_validation_tool = create_agent(
    model=llm,
    tools=TOOLS_WITH_VALIDATION,
    system_prompt=(
        "You are a helpful AI agent. "
        "When appropriate, use the validate_response tool to check your answers "
        "meet quality requirements before providing them to the user."
    ),
)

#### Pattern 1: Simple Query (Validation Optional)

The agent has the validation tool available but isn't explicitly asked to use it.

In [None]:
print("=" * 60)
print("Alternative 1: Validation as a Tool")
print("Pattern 1: Simple query (validation optional)")
print("=" * 60)

result = agent_with_validation_tool.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Generate a random number and tell me what team I'm on.",
            }
        ]
    }
)

print("\n=== FINAL RESULT ===")
last_message = result["messages"][-1]
print(last_message.content)

#### Pattern 2: Explicit Validation Request

Here we explicitly ask the agent to validate its response against specific requirements.

In [None]:
print("=" * 60)
print("Alternative 1: Validation as a Tool")
print("Pattern 2: Explicit validation request")
print("=" * 60)

result_with_validation = agent_with_validation_tool.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": (
                    "Generate a random number and tell me what team I'm on. "
                    "Please validate your response meets these requirements: "
                    "1) The random number must be greater than 5, "
                    "2) The team name 'AI Foundations' must be mentioned. "
                    "If validation fails, try again until all requirements pass."
                ),
            }
        ]
    }
)

print("\n=== FINAL RESULT (with explicit validation request) ===")
last_message = result_with_validation["messages"][-1]
print(last_message.content)

### Alternative 2: Message Conversion for Hybrid Workflows

If you need to pass full conversation context between LangChain and Mellea (not just for validation), you can convert messages between formats.

Mellea provides two conversion strategies:

| Strategy | Function | Description |
|----------|----------|-------------|
| **Direct** | `langchain_messages_to_mellea()` | Parses LangChain message attributes directly |
| **Via OpenAI** | `langchain_messages_to_mellea_via_openai()` | Uses LangChain's `convert_to_openai_messages()` as intermediate |

**When to use**: Building hybrid systems where LangChain handles orchestration and Mellea handles specific tasks like generation or complex reasoning.
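
As a rough picture of what a direct conversion has to do, the sketch below maps the LangChain message types printed earlier in this notebook (`system`, `human`, `ai`) to chat roles (`system`, `user`, `assistant`) and back. It works on plain tuples and is only an illustration: `to_roles` and `to_lc_types` are hypothetical helpers, not Mellea's converters, and the sketch ignores tool messages and any other message types.

```python
# Role names on each side, taken from the message types and roles shown in this notebook.
LC_TYPE_TO_ROLE = {"system": "system", "human": "user", "ai": "assistant"}
ROLE_TO_LC_TYPE = {role: lc_type for lc_type, role in LC_TYPE_TO_ROLE.items()}


def to_roles(lc_pairs):
    """Map (langchain_type, text) pairs to (role, text) pairs."""
    return [(LC_TYPE_TO_ROLE[lc_type], text) for lc_type, text in lc_pairs]


def to_lc_types(role_pairs):
    """Map (role, text) pairs back to (langchain_type, text) pairs."""
    return [(ROLE_TO_LC_TYPE[role], text) for role, text in role_pairs]


conversation = [
    ("system", "You are a helpful assistant."),
    ("human", "What's the capital of France?"),
    ("ai", "The capital of France is Paris."),
]

converted = to_roles(conversation)
for role, text in converted:
    print(f"[{role}] {text}")

# The mapping is one-to-one, so converting back restores the original pairs.
round_trip = to_lc_types(converted)
print(round_trip == conversation)
```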

#### Example: Round-Trip Conversion

In [None]:
from mellea.stdlib.components.chat_converters import (
    langchain_messages_to_mellea,
    mellea_messages_to_langchain,
)
from mellea.stdlib.components import Message

# LangChain conversation
lc_conversation = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What's the capital of France?"),
    AIMessage(content="The capital of France is Paris."),
]

# Convert to Mellea
mellea_messages = langchain_messages_to_mellea(lc_conversation)
print("LangChain -> Mellea:")
for msg in mellea_messages:
    print(f"  [{msg.role}] {msg.content}")

# Use with Mellea
ctx = ChatContext()
for msg in mellea_messages:
    ctx = ctx.add(msg)
m = mellea.start_session(ctx=ctx)
response = m.chat("What about Germany?")
print(f"\nMellea response: {response.content}")

# Convert back to LangChain
updated_messages = list(mellea_messages) + [
    Message(role="user", content="What about Germany?"),
    Message(role="assistant", content=response.content),
]
lc_messages_back = mellea_messages_to_langchain(updated_messages)
print(f"\nMellea -> LangChain: {len(lc_messages_back)} messages")
print(f"Types: {[type(msg).__name__ for msg in lc_messages_back]}")

### Alternative 3: Validation with m.chat() (Legacy)

Before we discovered the `ModelOutputThunk` approach, we used `m.chat()` to ask yes/no questions for validation. This approach still works but doesn't use Mellea's native validation API.

**When to use**: If you prefer simpler code and don't need validation metadata, or if you're working with an older version of Mellea.

In [None]:
def validate_with_chat(response: str, requirements: list[dict]) -> list[dict]:
    """Validate using m.chat() instead of m.validate().

    Args:
        response: The response to validate
        requirements: List of dicts with 'description' and 'prompt' keys

    Returns:
        List of dicts with 'requirement', 'passed', and 'answer' keys
    """
    m = mellea.start_session()
    results = []

    for req_item in requirements:
        m.reset()
        prompt = (
            f"{req_item['prompt']}\n\nOutput:\n{response}\n\nAnswer YES or NO only."
        )
        answer = m.chat(prompt).content.strip().upper()
        results.append(
            {
                "requirement": req_item["description"],
                "passed": answer.startswith("YES"),
                "answer": answer,
            }
        )

    return results


# Example
CHAT_REQUIREMENTS = [
    {
        "description": "Number > 5",
        "prompt": "Does this contain a number greater than 5?",
    },
    {"description": "Mentions team", "prompt": "Does this mention a team name?"},
]

test_response = "The random number is 7. You are on the AI Foundations team."
results = validate_with_chat(test_response, CHAT_REQUIREMENTS)

print("Alternative 3: Validation with m.chat()")
for r in results:
    status = "PASS" if r["passed"] else "FAIL"
    print(f"  [{status}] {r['requirement']} (LLM said: {r['answer']})")
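
One weakness of this approach is the answer parsing: `answer.startswith("YES")` treats a reply such as `**YES**` as a failure. The sketch below shows a slightly more tolerant parser. `parse_yes_no` is a hypothetical helper written for this example, and returning `None` for ambiguous replies is a design choice of the sketch, not something `validate_with_chat` does.

```python
import re
from typing import Optional


def parse_yes_no(answer: str) -> Optional[bool]:
    """Return True for yes, False for no, None when the reply is ambiguous.

    Leading punctuation or markdown markers are skipped, and the word must
    end at a word boundary so that "Nothing" is not read as "No".
    """
    match = re.match(r"\W*(yes|no)\b", answer.strip(), flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"


print(parse_yes_no("**YES**"))           # True
print(parse_yes_no("No, it does not."))  # False
print(parse_yes_no("Probably"))          # None
```

Distinguishing "no" from "could not tell" lets the caller decide whether an unparseable reply should count as a failure or trigger another question.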

---
<a id="summary"></a>
## Summary

### Recommended Approach

**Use `ModelOutputThunk` for native validation.** This gives external output access to Mellea's validation API, including both programmatic and LLM-based checks:

```python
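import mellea
from mellea.core.base import ModelOutputThunk
from mellea.stdlib.context import ChatContext
from mellea.stdlib.requirements.requirement import req, simple_validate

# Wrap external output (langchain_response is the agent's answer as a string)
ctx = ChatContext()
thunk = ModelOutputThunk(value=langchain_response)
ctx = ctx.add(thunk)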

# Use native validation
m = mellea.start_session(ctx=ctx)
validations = m.validate([
    req("Requirement 1", validation_fn=simple_validate(lambda x: ...)),
    req("Requirement 2"),  # LLM-based
])
```

### When to Use Each Approach

| Approach | Best For |
|----------|----------|
| **ModelOutputThunk + m.validate()** | Full validation features, programmatic + LLM validation *(recommended)* |
| **Validation as Tool** | Agent autonomy, context-dependent validation |
| **Message Conversion** | Hybrid workflows, passing context between frameworks |
| **m.chat() Workaround** | Simple cases, legacy compatibility |

### Key Insight

Mellea's `m.validate()` expects a `ModelOutputThunk` in the context. By manually creating one with `ModelOutputThunk(value=external_output)`, we bridge the gap between external agent outputs and Mellea's native validation API.