# External Validation API

This notebook demonstrates Mellea's **External Validation API**, a high-level interface for validating outputs from external LLM frameworks (LangChain, the OpenAI SDK, and others).

## The Problem

When using external LLM frameworks like LangChain or the OpenAI SDK, you get outputs in their native formats (LangChain messages, OpenAI dicts, etc.). To validate these outputs with Mellea, you previously needed to:

1. Convert messages to Mellea format manually
2. Wrap outputs in `ModelOutputThunk`
3. Build a `ChatContext` with the converted messages
4. Start a Mellea session and call `m.validate()`

This was verbose and error-prone (see the [LangChain IVR notebook](langchain_mellea_ivr.ipynb) for the manual approach).

## The Solution

The **External Validation API** (`mellea.stdlib.interop`) provides:

- `external_validate()` - One-shot validation with automatic format detection
- `ExternalSession` - Session-based validation with factory methods
- `external_ivr()` - Full Instruct-Validate-Repair loop helper

## Table of Contents
1. [Setup](#setup)
2. [Quick Start: external_validate()](#quick-start)
3. [Session-Based Validation](#session-based)
4. [Full IVR Loop](#ivr-loop)
5. [Supported Formats](#formats)
6. [Summary](#summary)

---
<a id="setup"></a>
## Setup

First, let's import the necessary modules and set up a backend.

In [None]:
# Install dependencies if needed
%pip install mellea langchain langchain-ollama

In [None]:
from mellea import external_validate, ExternalSession
from mellea.stdlib.interop import external_ivr, IVRResult
from mellea.stdlib.requirements import req, simple_validate
from mellea.backends import OllamaBackend

# Optional: LangChain for examples
try:
    from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

    HAS_LANGCHAIN = True
except ImportError:
    HAS_LANGCHAIN = False
    print("LangChain not installed - LangChain examples will be skipped")

In [None]:
# Create a backend for validation
# You can use any Mellea backend (Ollama, OpenAI, Anthropic, etc.)
backend = OllamaBackend(model="granite4:micro")
print(f"Using backend: {backend}")

---
<a id="quick-start"></a>
## Quick Start: external_validate()

The simplest way to validate external output is with `external_validate()`. It accepts:

- **output**: String, OpenAI dict, LangChain message, or Mellea Message
- **requirements**: List of strings (LLM-as-judge) or `Requirement` objects
- **backend**: A Mellea backend for LLM-based validation
- **context** (optional): Conversation history in any supported format

In [None]:
# Example 1: Validate a simple string output
output = "The capital of France is Paris. It's known for the Eiffel Tower."

results = external_validate(
    output=output,
    requirements=["Must mention a city name", "Must be factually correct"],
    backend=backend,
)

print("String Output Validation:")
for i, result in enumerate(results):
    status = "PASS" if result.as_bool() else "FAIL"
    print(f"  [{status}] Requirement {i + 1}")
    if result.reason:
        print(f"          Reason: {result.reason}")

In [None]:
# Example 2: Validate with programmatic requirements
output = "The answer is 42."

results = external_validate(
    output=output,
    requirements=[
        # Programmatic validation with simple_validate
        req(
            "Must contain a number",
            validation_fn=simple_validate(lambda x: any(c.isdigit() for c in x)),
        ),
        # LLM-as-judge validation (string requirement)
        "Must provide an explanation",
    ],
    backend=backend,
)

print("Mixed Validation (Programmatic + LLM):")
print(
    f"  [{'PASS' if results[0].as_bool() else 'FAIL'}] Contains number (programmatic)"
)
print(f"  [{'PASS' if results[1].as_bool() else 'FAIL'}] Has explanation (LLM-judged)")

In [None]:
# Example 3: Validate OpenAI-format output
openai_response = {
    "role": "assistant",
    "content": "I'd be happy to help you with that task!",
}

results = external_validate(
    output=openai_response, requirements=["Must be polite and helpful"], backend=backend
)

print(f"OpenAI Format Validation: [{'PASS' if results[0].as_bool() else 'FAIL'}]")

In [None]:
# Example 4: Validate with conversation context
context = [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is 15 + 27?"},
]

output = "15 + 27 = 42"

results = external_validate(
    output=output,
    requirements=["Must answer the user's question", "Must show the calculation"],
    backend=backend,
    context=context,
)

print("Validation with Context:")
for i, result in enumerate(results):
    print(f"  [{('PASS' if result.as_bool() else 'FAIL')}] Requirement {i + 1}")

---
<a id="session-based"></a>
## Session-Based Validation with ExternalSession

`ExternalSession` provides a class-based interface for validation with convenient factory methods:

- `ExternalSession.from_output()` - Create from any output format
- `ExternalSession.from_openai()` - Create from OpenAI message list
- `ExternalSession.from_langchain()` - Create from LangChain message list

This is useful when you want to:
- Validate multiple requirement sets against the same output
- Access the `output` property directly
- Use the convenience `all_passed()` method
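
As a rough mental model, `all_passed()` can be thought of as collapsing the per-requirement results into a single boolean. The sketch below uses a hypothetical `FakeResult` stand-in that exposes the same `as_bool()` method used elsewhere in this notebook; whether Mellea implements `all_passed()` exactly this way is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in for a validation result exposing as_bool()."""

    passed: bool
    reason: Optional[str] = None

    def as_bool(self) -> bool:
        return self.passed


def all_passed(results: list[FakeResult]) -> bool:
    """Return True only when every requirement passed."""
    return all(r.as_bool() for r in results)


results = [FakeResult(True), FakeResult(False, reason="No time mentioned")]
print(all_passed(results))
# Keep the individual results when you need to know which requirement failed.
print([r.reason for r in results if not r.as_bool()])
```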

In [None]:
# Create a session from a string output
session = ExternalSession.from_output(
    output="The meeting is scheduled for 3pm tomorrow.", backend=backend
)

print(f"Session output: {session.output}")

# Validate with the session
results = session.validate(["Must mention a time", "Must be about scheduling"])

print("\nValidation Results:")
for i, r in enumerate(results):
    print(f"  [{('PASS' if r.as_bool() else 'FAIL')}] Requirement {i + 1}")

In [None]:
# Create a session from OpenAI-format messages
openai_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a joke."},
    {
        "role": "assistant",
        "content": "Why did the scarecrow win an award? Because he was outstanding in his field!",
    },
]

session = ExternalSession.from_openai(openai_messages, backend)

print("Output (auto-detected from last assistant message):")
print(f"  '{session.output}'")

# Use all_passed() for a simple boolean check
is_valid = session.all_passed(["Must be a joke", "Must be family-friendly"])
print(f"\nAll requirements passed: {is_valid}")

In [None]:
# Create a session from LangChain messages (if available)
if HAS_LANGCHAIN:
    lc_messages = [
        SystemMessage(content="You are a travel assistant."),
        HumanMessage(content="What's a good destination for beach lovers?"),
        AIMessage(
            content="I recommend Bali, Indonesia! It has beautiful beaches, great surfing, and amazing cultural experiences."
        ),
    ]

    session = ExternalSession.from_langchain(lc_messages, backend)

    print("Output (from LangChain AIMessage):")
    print(f"  '{session.output}'")

    results = session.validate(
        ["Must recommend a specific destination", "Must mention beaches"]
    )

    print("\nValidation:")
    for i, r in enumerate(results):
        print(f"  [{('PASS' if r.as_bool() else 'FAIL')}] Requirement {i + 1}")
else:
    print("Skipping LangChain example (langchain not installed)")

---
<a id="ivr-loop"></a>
## Full IVR Loop with external_ivr()

The `external_ivr()` function implements a complete Instruct-Validate-Repair loop for external LLMs. It:

1. **Instruct**: Calls your generation function
2. **Validate**: Checks requirements using Mellea
3. **Repair**: If validation fails, adds a repair prompt and retries

Use it when outputs must satisfy a set of requirements and failed attempts should be retried automatically, with the number of attempts capped by `loop_budget`.
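
The three steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Mellea's implementation: the names `ivr_loop` and `fake_llm`, the dict-based messages, the wording of the repair message, and the tuple return value are all hypothetical.

```python
from typing import Callable

Check = Callable[[str], bool]


def ivr_loop(
    generate: Callable[[list[dict]], str],
    checks: dict[str, Check],
    context: list[dict],
    budget: int = 3,
) -> tuple[bool, int, list[str]]:
    """Generate, validate, and retry with a repair message until checks pass.

    Illustrative only: names and return shape are hypothetical.
    """
    messages = list(context)  # copy so the caller's list is not mutated
    outputs: list[str] = []
    for attempt in range(1, budget + 1):
        output = generate(messages)  # Instruct
        outputs.append(output)
        # Validate: collect the names of all checks that fail.
        failed = [name for name, check in checks.items() if not check(output)]
        if not failed:
            return True, attempt, outputs
        # Repair: record the failed attempt and ask for a corrected answer.
        messages.append({"role": "assistant", "content": output})
        messages.append(
            {"role": "user", "content": "Unmet requirements: " + "; ".join(failed)}
        )
    return False, budget, outputs


def fake_llm(messages: list[dict]) -> str:
    """Deterministic stand-in for an external LLM that improves after a repair."""
    repaired = any("Unmet requirements" in m["content"] for m in messages)
    return "The number is 8." if repaired else "The number is 2."


success, attempts, outputs = ivr_loop(
    fake_llm,
    {"number > 5": lambda text: "8" in text},
    [{"role": "user", "content": "Pick a number greater than 5."}],
)
print(success, attempts, outputs)
```

The cells below use `external_ivr()` itself, which additionally accepts LLM-as-judge requirements and a custom repair prompt.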

In [None]:
from mellea.stdlib.components import Message
import random


# Simulate an external LLM that sometimes fails requirements
def mock_generate(messages: list[Message]) -> str:
    """Simulated LLM that generates responses.

    In real usage, this would call your LangChain agent, OpenAI API, etc.
    """
    # Check if this is a retry (repair message in context)
    is_retry = any(
        "requirements" in m.content.lower() for m in messages if m.role == "user"
    )

    if is_retry:
        # On retry, always include required elements
        return "The random number is 8. Welcome to the AI Foundations team!"
    else:
        # First attempt might fail
        num = random.randint(1, 10)
        team = random.choice(["AI Foundations", "Quantum Plus AI"])
        return f"The random number is {num}. Welcome to the {team} team!"


print("Simulated generation (first call):")
print(
    f"  '{mock_generate([Message(role='user', content='Generate a number and team.')])}'"
)

In [None]:
# Run the IVR loop
initial_context = [
    {
        "role": "user",
        "content": "Generate a random number and tell me what team I'm on.",
    }
]

requirements = [
    req(
        "The number must be greater than 5",
        validation_fn=simple_validate(
            lambda x: any(n in x for n in ["6", "7", "8", "9", "10"])
        ),
    ),
    "Must mention the AI Foundations team",
]

result = external_ivr(
    generate_fn=mock_generate,
    requirements=requirements,
    backend=backend,
    initial_context=initial_context,
    loop_budget=5,
)

print("=" * 50)
print("IVR Result:")
print(f"  Success: {result.success}")
print(f"  Attempts: {result.attempts}")
print(f"  Final output: '{result.output}'")
print("\nAll outputs generated:")
for i, out in enumerate(result.all_outputs, 1):
    print(f"  {i}. '{out}'")

In [None]:
# Custom repair prompt function
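def custom_repair_prompt(failed_reqs, output):
    """Generate a custom repair prompt from the failed requirements."""
    # Each entry is a pair whose first element is the failed requirement.
    # The loop variable is named `requirement` to avoid confusion with the
    # imported `req` helper.
    failed_descriptions = [requirement.description for requirement, _ in failed_reqs]
    return (
        f"IMPORTANT: Your response '{output[:50]}...' did not meet these requirements:\n"
        "- " + "\n- ".join(failed_descriptions) + "\n\n"
        "Please generate a new response that satisfies ALL requirements."
    )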


result = external_ivr(
    generate_fn=mock_generate,
    requirements=requirements,
    backend=backend,
    initial_context=initial_context,
    loop_budget=3,
    repair_prompt_fn=custom_repair_prompt,
)

print("IVR with custom repair prompt:")
print(f"  Success: {result.success}, Attempts: {result.attempts}")

### Using external_ivr() with a Real LangChain Agent

Here's how you would integrate `external_ivr()` with an actual LangChain agent:

In [None]:
# Example: Wrapping a LangChain agent for IVR
# (This is a template - uncomment and modify for your use case)

'''
from langchain_ollama import ChatOllama
from langchain.agents import create_agent

# Create your LangChain agent
llm = ChatOllama(model="granite4:latest", temperature=0.0)
agent = create_agent(model=llm, tools=[], system_prompt="You are helpful.")

def langchain_generate(messages: list[Message]) -> str:
    """Wrap LangChain agent for external_ivr."""
    # Convert Mellea messages to LangChain format
    lc_messages = [
        {"role": m.role, "content": m.content}
        for m in messages
    ]
    result = agent.invoke({"messages": lc_messages})
    return result["messages"][-1].content


# Run IVR with the LangChain agent
result = external_ivr(
    generate_fn=langchain_generate,
    requirements=["Must be helpful", "Must answer the question"],
    backend=backend,
    initial_context=[{"role": "user", "content": "What is Python?"}],
    loop_budget=3,
)
'''

print("See the code cell above for a LangChain integration template.")

---
<a id="formats"></a>
## Supported Formats

The External Validation API auto-detects and handles multiple formats:

### Output Formats

| Format | Example |
|--------|----------|
| String | `"The answer is 42."` |
| OpenAI dict | `{"role": "assistant", "content": "..."}` |
| LangChain message | `AIMessage(content="...")` |
| Mellea Message | `Message(role="assistant", content="...")` |
| ModelOutputThunk | `ModelOutputThunk(value="...")` |

### Context Formats

| Format | Example |
|--------|----------|
| OpenAI list | `[{"role": "user", "content": "..."}]` |
| LangChain list | `[HumanMessage(content="...")]` |
| Mellea list | `[Message(role="user", content="...")]` |
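
The main difference between these context formats is how each one names the speaker. The sketch below normalizes all three shapes to OpenAI-style dicts. It is illustrative only: `to_role_content` is a hypothetical helper, the objects are `SimpleNamespace` stand-ins rather than real LangChain or Mellea classes, and the sketch assumes LangChain messages expose a `type` of `"human"`, `"ai"`, or `"system"`.

```python
from types import SimpleNamespace
from typing import Any

# LangChain message types mapped to the role names used by OpenAI-style dicts.
LANGCHAIN_TYPE_TO_ROLE = {"human": "user", "ai": "assistant", "system": "system"}


def to_role_content(message: Any) -> dict[str, str]:
    """Normalize one message to an OpenAI-style role/content dict (hypothetical)."""
    if isinstance(message, dict):
        return {"role": message["role"], "content": message["content"]}
    if hasattr(message, "role"):  # Mellea-style: Message(role=..., content=...)
        return {"role": message.role, "content": message.content}
    if hasattr(message, "type"):  # LangChain-style: HumanMessage, AIMessage, ...
        return {
            "role": LANGCHAIN_TYPE_TO_ROLE[message.type],
            "content": message.content,
        }
    raise TypeError(f"Unsupported message type: {type(message).__name__}")


context = [
    {"role": "system", "content": "You are a math tutor."},
    SimpleNamespace(type="human", content="What is 15 + 27?"),  # LangChain-like
    SimpleNamespace(role="assistant", content="15 + 27 = 42"),  # Mellea-like
]
for message in context:
    print(to_role_content(message))
```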

### Requirement Formats

| Format | Validation Type |
|--------|----------------|
| String | LLM-as-judge (uses backend) |
| `req(description)` | LLM-as-judge |
| `req(description, validation_fn=...)` | Programmatic |
| `req(description, validation_fn=simple_validate(...))` | Programmatic (wraps a plain `str -> bool` function) |
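
Because programmatic checks are ordinary Python functions, they can be written and tested on their own before being wrapped in a requirement. The sketch below assumes, as in the examples in this notebook, that `simple_validate` accepts a function from the output string to a `bool`; the helper `number_greater_than` is hypothetical.

```python
import re
from typing import Callable


def number_greater_than(threshold: int) -> Callable[[str], bool]:
    """Build a check that passes when any integer in the text exceeds threshold."""

    def check(text: str) -> bool:
        # \d+ captures whole numbers, so "10" is read as ten, not as "1" and "0".
        return any(int(match) > threshold for match in re.findall(r"\d+", text))

    return check


greater_than_5 = number_greater_than(5)
print(greater_than_5("The random number is 8."))
print(greater_than_5("The random number is 3."))
```

The resulting function could then be passed as `req("The number must be greater than 5", validation_fn=simple_validate(greater_than_5))`, following the pattern used in the IVR example.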

In [None]:
# Demonstrate format auto-detection
test_outputs = [
    ("string", "Hello world"),
    ("OpenAI dict", {"role": "assistant", "content": "Hello world"}),
    ("Mellea Message", Message(role="assistant", content="Hello world")),
]

if HAS_LANGCHAIN:
    test_outputs.append(("LangChain", AIMessage(content="Hello world")))

print("Format Auto-Detection Test:")
for format_name, output in test_outputs:
    results = external_validate(
        output=output, requirements=["Must contain a greeting"], backend=backend
    )
    status = "PASS" if results[0].as_bool() else "FAIL"
    print(f"  [{status}] {format_name}: {type(output).__name__}")

---
<a id="summary"></a>
## Summary

### API Reference

```python
from mellea import external_validate, ExternalSession
from mellea.stdlib.interop import external_ivr, IVRResult
```

### Quick Reference

| Function/Class | Use Case |
|---------------|----------|
| `external_validate()` | One-shot validation of any output format |
| `ExternalSession.from_output()` | Session from output + optional context |
| `ExternalSession.from_openai()` | Session from OpenAI message list |
| `ExternalSession.from_langchain()` | Session from LangChain message list |
| `session.validate()` | Validate against requirements |
| `session.all_passed()` | Quick boolean check |
| `external_ivr()` | Full IVR loop with automatic retries |

### Key Benefits

1. **Format Agnostic**: Works with strings, OpenAI dicts, LangChain messages, or Mellea types
2. **Simple API**: One function call vs. manual context building
3. **Full Validation**: Supports both programmatic and LLM-as-judge validation
4. **IVR Support**: Built-in retry loop with customizable repair prompts
5. **Async Support**: All functions have async variants (`aexternal_validate`, `aexternal_ivr`)

### Comparison with Manual Approach

**Before (Manual):**
```python
import mellea
from mellea.core.base import ModelOutputThunk
from mellea.stdlib.context import ChatContext
from mellea.stdlib.components.chat_converters import langchain_messages_to_mellea

# Convert context
mellea_messages = langchain_messages_to_mellea(lc_messages[:-1])
ctx = ChatContext()
for msg in mellea_messages:
    ctx = ctx.add(msg)

# Wrap output
thunk = ModelOutputThunk(value=lc_messages[-1].content)
ctx = ctx.add(thunk)

# Validate
m = mellea.start_session(ctx=ctx)
results = m.validate(requirements)
```

**After (External Validation API):**
```python
from mellea import external_validate

results = external_validate(
    output=lc_messages[-1],
    requirements=["Must be helpful"],
    backend=backend,
    context=lc_messages[:-1],
)
```