# 🎭 Multi-Agent Debate

This tutorial demonstrates how to use `af.checkpoint`, `af.collect`, and `af.inject` to debug and replay multi-agent interactions, and how to use `af.pullback` to propagate feedback from the final output.

## Setup (Colab only)

If you are running in Google Colab, uncomment and run the following cell:

In [None]:
# !pip install autoform
# import os
# os.environ["OPENAI_API_KEY"] = "your-key-here"

In [None]:
import autoform as af

MODEL = "openai/gpt-4o"  # or "ollama/llama3.2:3b" for local

## 1. The Problem

Multi-agent systems are hard to debug. When three agents produce an unexpected output, which agent caused it? To answer that, you need to capture each agent's intermediate output and replay the pipeline with selected outputs modified.
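Before turning to the library, the capture-and-replay idea can be sketched in plain Python. This is only a toy illustration and not how autoform works internally: `Recorder`, its `checkpoint` method, and `toy_pipeline` are hypothetical names, and simple string formatting stands in for the LLM agents.

```python
from typing import Optional


class Recorder:
    """Hypothetical stand-in: records named intermediate values and can override them."""

    def __init__(self, overrides: Optional[dict] = None):
        self.captured: dict = {}
        self.overrides = overrides or {}

    def checkpoint(self, value: str, key: str) -> str:
        # On replay, an override replaces the value the agent computed.
        if key in self.overrides:
            value = self.overrides[key]
        self.captured.setdefault(key, []).append(value)
        return value


def toy_pipeline(topic: str, rec: Recorder) -> str:
    # String formatting stands in for the two debating agents.
    pro = rec.checkpoint(f"FOR {topic}", key="pro")
    con = rec.checkpoint(f"AGAINST {topic}", key="con")
    # The "synthesizer" just joins both sides.
    return f"{pro} | {con}"


# First run: capture what each stage produced.
first = Recorder()
out1 = toy_pipeline("remote work", first)

# Replay: swap in a different "pro" value and keep everything else.
replay = Recorder(overrides={"pro": "remote work helps focus"})
out2 = toy_pipeline("remote work", replay)

print(out1)
print(out2)
```

Comparing `out1` with `out2` isolates the effect of a single intermediate value, which is the same question the rest of this tutorial asks of real agents.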

## 2. Define Output Structure

In [None]:
class Position(af.Struct):
    argument: str
    confidence: str

## 3. Three-Agent Debate

Two agents argue opposing sides, and a third synthesizes their arguments:

In [None]:
def debate(topic: str) -> Position:
    """Two agents debate, third synthesizes."""

    # Agent 1: Proponent
    pro_prompt = af.format("Argue strongly FOR: {}", topic)
    msgs_pro = [{"role": "user", "content": pro_prompt}]
    pro = af.lm_call(msgs_pro, model=MODEL)
    pro = af.checkpoint(pro, key="pro", collection="debate")

    # Agent 2: Opponent
    con_prompt = af.format("Argue strongly AGAINST: {}", topic)
    msgs_con = [{"role": "user", "content": con_prompt}]
    con = af.lm_call(msgs_con, model=MODEL)
    con = af.checkpoint(con, key="con", collection="debate")

    # Agent 3: Synthesizer
    synth_prompt = af.format(
        "PRO:\n{}\n\nCON:\n{}\n\nSynthesize a balanced position:",
        pro,
        con,
    )
    msgs_synth = [{"role": "user", "content": synth_prompt}]
    return af.struct_lm_call(msgs_synth, model=MODEL, struct=Position)

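## 4. Build the IR

Trace `debate` once to obtain an intermediate representation (IR). The later steps (`af.collect`, `af.inject`, and `af.pullback`) all operate on this IR:
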
In [None]:
ir = af.trace(debate)("...")
print(ir)

## 5. Run with Collect

Capture what each agent said:

In [None]:
result, captured = af.collect(ir, collection="debate")("AI will replace programmers")

print("Final position:", result.argument)
print("\nConfidence:", result.confidence)
print("\nCaptured agents:", list(captured.keys()))
print("\nPRO argued:", captured["pro"][0])
print("\nCON argued:", captured["con"][0])
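The indexing above (`captured["pro"][0]`) suggests that `captured` maps each checkpoint key to a list of recorded values. Under that assumption, a small helper can print a compact overview when agent outputs are long. `summarize_captured` is a hypothetical helper, not part of autoform, and the `example` dictionary is a hand-written stand-in for real captured output.

```python
def summarize_captured(captured: dict, width: int = 60) -> dict:
    """Return {key: (number_of_values, preview_of_first_value)}."""
    summary = {}
    for key, values in captured.items():
        first = str(values[0]) if values else ""
        # Truncate long agent outputs so the overview stays readable.
        preview = first if len(first) <= width else first[: width - 3] + "..."
        summary[key] = (len(values), preview)
    return summary


# Hand-written stand-in for the mapping returned alongside the result (assumed shape).
example = {
    "pro": ["AI writes code faster than people."],
    "con": ["x" * 100],
}

summary = summarize_captured(example)
for key, (count, preview) in summary.items():
    print(f"{key}: {count} value(s), first = {preview!r}")
```

With a real run, you would pass the `captured` mapping from the cell above instead of `example`.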

## 6. Inject: Replay with Modifications

What if the PRO agent had argued differently? Inject a modified argument:

In [None]:
modified = {
    "pro": ["AI is a tool that augments programmers, making them 10x more productive."],
    "con": [captured["con"][0]],  # keep original CON
}

result2 = af.inject(ir, collection="debate", values=modified)("AI will replace programmers")

print("New position:", result2.argument)
print("\nConfidence:", result2.confidence)
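Writing the `modified` dictionary by hand gets tedious once there are more than two checkpoints. The sketch below builds it from a captured mapping by copying every key and replacing just one. `with_override` is a hypothetical helper, not an autoform function, and it assumes the same key-to-list shape used in the cell above.

```python
def with_override(captured: dict, key: str, new_value: str) -> dict:
    """Copy all captured values, replacing the values for one key."""
    if key not in captured:
        raise KeyError(f"no checkpoint named {key!r}; known keys: {sorted(captured)}")
    # Copy each list so the original captured mapping is left untouched.
    values = {k: list(v) for k, v in captured.items()}
    values[key] = [new_value]
    return values


# Hand-written stand-in for a captured mapping (assumed shape).
captured_example = {"pro": ["original pro"], "con": ["original con"]}

modified_example = with_override(captured_example, "pro", "a different pro argument")
print(modified_example)
```

The returned dictionary has the same shape as `modified` above, so it could be passed as `values=` to `af.inject` in the same way.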

## 7. Pullback: Get Feedback Flow

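Given a critique of the final output, what should change upstream? `af.pullback` propagates that feedback backward through the IR. Here the result is an improvement hint for the input topic: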

In [None]:
pb_ir = af.pullback(ir)

critique = Position(
    argument="Position was too one-sided towards AI",
    confidence="low confidence in balanced analysis",
)

output, gradient = af.call(pb_ir)(("AI will replace programmers", critique))

print("Gradient (improvement hint for topic):")
print(gradient)

## Summary

1. **Checkpoint** each agent's output for observability
2. **Collect** captures all agent outputs
3. **Inject** replays with modified intermediate values
4. **Pullback** traces feedback to suggest topic improvements

Use this pattern for any multi-agent system where you need to debug, replay, or counterfactually analyze agent interactions.