# LangGraph + LLMs: OpenAI vs Non-OpenAI Output Patterns

This notebook summarizes everything we discussed about:

1. How different LLM providers (OpenAI, Groq, Gemini) return outputs.
2. Why your original LangGraph code worked only with OpenAI.
3. How to write **provider-agnostic** code that works with *any* LLM.
4. How to implement the **Reflexion Agent** style graph in two ways:
   - Provider-agnostic (Groq / Gemini / OpenAI)
   - OpenAI-tools style (classic examples)

You can read this like notes + code templates.

## 1. The Core Contract: What LangGraph MessageGraph Expects

LangGraph's `MessageGraph` has a simple contract:

- **Node input:** `List[BaseMessage]`
- **Node output:** `BaseMessage` *or* `List[BaseMessage]`

Valid message types are the LangChain messages, e.g.:

- `HumanMessage`
- `AIMessage`
- `ToolMessage`
- (all subclasses of `BaseMessage`)

👉 If a node returns *anything else* (like a raw Pydantic model), LangGraph
will throw an error like:

> `Unsupported message type: <class 'schema.AnswerQuestion'>`

This is what happened in your earlier code.
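
The contract above can be sketched as a small check. This is only an illustration: `BaseMessage`, `AIMessage`, and `AnswerQuestion` here are stand-in dataclasses rather than the real LangChain and schema classes, and `check_node_output` is a hypothetical helper, not LangGraph's actual validation code.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class BaseMessage:
    """Stand-in for langchain_core.messages.BaseMessage (assumption)."""
    content: str


@dataclass
class AIMessage(BaseMessage):
    """Stand-in for langchain_core.messages.AIMessage (assumption)."""


@dataclass
class AnswerQuestion:
    """Stand-in for a structured-output model; it is NOT a message."""
    answer: str


def check_node_output(output: Any) -> List[BaseMessage]:
    """Hypothetical helper: accept one message or a list of messages."""
    items = output if isinstance(output, list) else [output]
    for item in items:
        if not isinstance(item, BaseMessage):
            raise TypeError(f"Unsupported message type: {type(item)}")
    return items


# A message passes the check.
ok = check_node_output(AIMessage(content="hello"))

# A raw structured-output object is rejected.
try:
    check_node_output(AnswerQuestion(answer="hello"))
except TypeError as err:
    error_text = str(err)
```

The point of the sketch is that the type of the returned object matters, not its content: the same answer text is accepted when wrapped in a message and rejected when returned as a bare model.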

## 2. Two Patterns of LLM Usage

There are two main ways we used LLMs:

### 2.1 OpenAI Tools Pattern

```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

@tool
def add(x: int, y: int) -> int:
    """Add two integers."""  # @tool needs a docstring (or an explicit description)
    return x + y

llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([add])

resp = llm.invoke("add 5 and 7")
print(resp)
```

OpenAI returns an `AIMessage` with a `tool_calls` field, e.g.:

```python
AIMessage(
  content="",
  tool_calls=[
    {
      "name": "add",
      "args": {"x": 5, "y": 7},
      "id": "call_123"
    }
  ]
)
```

Your graph nodes can then read `last_ai_message.tool_calls`.
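
As a rough sketch of what "reading `tool_calls`" can look like, the snippet below dispatches each call to a local Python function. The `TOOLS` registry and `run_tool_calls` helper are hypothetical names for illustration, and `SimpleNamespace` stands in for a real `AIMessage` so the example runs without any API key.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


def add(x: int, y: int) -> int:
    """Add two integers."""
    return x + y


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"add": add}


def run_tool_calls(message: Any) -> List[Dict[str, Any]]:
    """Run every tool call on the message and keep the call id with each result."""
    results = []
    for call in getattr(message, "tool_calls", None) or []:
        fn = TOOLS[call["name"]]
        results.append({"tool_call_id": call["id"], "output": fn(**call["args"])})
    return results


# Stand-in for the AIMessage shown above (same tool_calls structure).
last_ai_message = SimpleNamespace(
    content="",
    tool_calls=[{"name": "add", "args": {"x": 5, "y": 7}, "id": "call_123"}],
)

results = run_tool_calls(last_ai_message)
print(results)
```

Keeping the `id` next to each output matters because a `ToolMessage` is matched to its originating call through `tool_call_id`.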

### 2.2 Structured Output Pattern (Pydantic)

```python
from pydantic import BaseModel
from langchain_groq import ChatGroq

class Person(BaseModel):
    name: str
    age: int

llm = ChatGroq(model="llama-3.3-70b-versatile").with_structured_output(Person)

resp = llm.invoke("Name Alice, age 25")
print(resp)
```

Here, `resp` is **not** an `AIMessage`. It is a Pydantic model:

```python
Person(name="Alice", age=25)
```

If you plug this *directly* into `MessageGraph`, it will fail, because the
graph expects `BaseMessage` objects, not arbitrary Python models.
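
One way to bridge the gap is to serialize the structured result into a message before returning it from a node. The sketch below uses stand-in dataclasses for `Person` and `AIMessage` so it runs on its own; `wrap_structured` is a hypothetical helper, and with a real Pydantic model you would likely use `model_dump()` in place of `asdict()`.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class Person:
    """Stand-in for the Pydantic Person model above (assumption)."""
    name: str
    age: int


@dataclass
class AIMessage:
    """Stand-in for langchain_core.messages.AIMessage (assumption)."""
    content: str
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


def wrap_structured(result: Person) -> AIMessage:
    """Hypothetical helper: put the structured data inside a message."""
    payload = asdict(result)
    return AIMessage(
        content=json.dumps(payload),              # human-readable copy
        additional_kwargs={"structured": payload},  # machine-readable copy
    )


msg = wrap_structured(Person(name="Alice", age=25))
print(msg.content)
```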

## 3. Why Your Old Code Worked Only with OpenAI

Your original design assumed the **OpenAI tools pattern**:

1. LLM returns `AIMessage` with `.tool_calls`.
2. `execute_tools` reads `tool_calls` and produces `ToolMessage`s.
3. Your loop (`event_loop`) counted `ToolMessage`s to decide when to stop.

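This depends on the model call returning an `AIMessage` with native `tool_calls`,
which is what `bind_tools` gives you on OpenAI (and API-compatible providers).
A chain built with `with_structured_output` returns a parsed object instead.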

### Example of the old loop logic:

```python
from langchain_core.messages import BaseMessage, ToolMessage
from langgraph.graph import END

MAX_ITERATIONS = 2

def event_loop(state: list[BaseMessage]) -> str:
    count_tool_visit = sum(isinstance(m, ToolMessage) for m in state)
    if count_tool_visit > MAX_ITERATIONS:
        return END
    return "execute_tools"
```

This works only if `ToolMessage`s are actually being produced, which is true in
the OpenAI tools pattern. But when you switched to Groq/Gemini + structured
output, there were no `tool_calls` and often no `ToolMessage`s, so the count
never increased. That caused an infinite loop and a `GraphRecursionError`.
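
The failure mode can be reproduced without any LLM. The simulation below is a simplified sketch: `Message` is a stand-in class, `END` is a stand-in constant, and `MAX_STEPS` is an arbitrary cap that plays the role of the graph's recursion limit.

```python
from dataclasses import dataclass
from typing import List

END = "__end__"        # stand-in for langgraph.graph.END
MAX_ITERATIONS = 2
MAX_STEPS = 25         # arbitrary cap standing in for the recursion limit


@dataclass
class Message:
    kind: str  # "ai" or "tool"


def event_loop(state: List[Message]) -> str:
    """Same rule as the old loop: stop once enough tool messages exist."""
    count_tool_visit = sum(m.kind == "tool" for m in state)
    if count_tool_visit > MAX_ITERATIONS:
        return END
    return "execute_tools"


def simulate(tools_emit: bool) -> str:
    """Run draft -> (execute_tools -> revisor)* and report how it ended."""
    state = [Message("ai")]                 # draft
    for step in range(MAX_STEPS):
        if tools_emit:
            state.append(Message("tool"))   # execute_tools produced output
        state.append(Message("ai"))         # revisor
        if event_loop(state) == END:
            return f"stopped after {step + 1} loop(s)"
    return "recursion limit reached"


print(simulate(tools_emit=True))
print(simulate(tools_emit=False))
```

When the tools node emits nothing, the counter that the stop condition relies on never moves, so only the step cap ends the run.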

## 4. Provider-Agnostic Pattern (Works with Groq / Gemini / OpenAI)

To make your Reflexion agent work with **any** LLM, we did three things:

1. Use `with_structured_output()` to get Pydantic models.
2. Wrap those models into `AIMessage` manually.
3. Implement loop logic (and tool usage) based on **your own fields**, not
   on OpenAI's `tool_calls`.

Below is a minimal provider-agnostic template.

```python
# 4.1 Pydantic schemas (these are the same for all providers)
from pydantic import BaseModel, Field
from typing import List

class Reflection(BaseModel):
    missing: str = Field(description="Critique of what is missing.")
    superfluous: str = Field(description="Critique of what is superfluous.")

class AnswerQuestion(BaseModel):
    """Answer the question."""
    answer: str = Field(description="~250 word detailed answer to the question.")
    search_queries: List[str] = Field(
        description=(
            "1-3 search queries for researching improvements to"
            " address the critique of your current answer."
        )
    )
    reflection: Reflection = Field(
        description="Your reflection on the initial answer."
    )

class ReviseAnswer(AnswerQuestion):
    """Revise your original answer to your question."""
    references: List[str] = Field(
        description="Citations motivating your updated answer."
    )
```

```python
# 4.2 Draft and Revisor chains (LLM-agnostic setup)
import datetime
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import AIMessage, BaseMessage

# Example: here we would plug in Groq, Gemini, or OpenAI
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0)

actor_prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert AI researcher.
Current time: {time}

You must follow this process:

1. {first_instruction}
2. Reflect and critique your answer. Be severe to maximize improvement.
3. After the reflection, list 1-3 search queries separately for researching improvements.
   - Do NOT include the search queries inside the reflection.
4. Finally, answer the user's question above using the required format.
""".strip(),
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
).partial(time=lambda: datetime.datetime.now().isoformat())

first_responder_prompt = actor_prompt_template.partial(
    first_instruction="Provide a detailed ~250 word answer"
)

revise_instructions = """Revise your previous answer using the new information.
    - You should use the previous critique to add important information to your answer.
        - You MUST include numerical citations in your revised answer to ensure it can be verified.
        - Add a "References" section to the bottom of your answer.
    - You should use the previous critique to remove superfluous information from your answer
      and make SURE it is not more than 250 words.
"""

revisor_prompt = actor_prompt_template.partial(
    first_instruction=revise_instructions
)

# LLM with structured output: each chain returns a Pydantic object, not an AIMessage
first_responder_llm = llm.with_structured_output(AnswerQuestion)
revisor_llm = llm.with_structured_output(ReviseAnswer)

first_responder_chain = first_responder_prompt | first_responder_llm
revisor_chain = revisor_prompt | revisor_llm
```

```python
# 4.3 Draft node: Pydantic -> AIMessage
def draft_node(state: list[BaseMessage]) -> list[BaseMessage]:
    """Use AnswerQuestion as *structured output*, then wrap into AIMessage.
    This works with Groq, Gemini, OpenAI, etc.
    """
    result: AnswerQuestion = first_responder_chain.invoke({"messages": state})

    msg = AIMessage(
        content=result.answer,
        additional_kwargs={
            "search_queries": result.search_queries,
            "reflection": result.reflection.model_dump(),
            "role": "draft",
        },
    )
    return [msg]
```

```python
# 4.4 Tools node: use our own search_queries instead of tool_calls
import json
from typing import Dict, Any, List as PyList
from langchain_core.messages import ToolMessage
from langchain_community.tools import TavilySearchResults

tavily = TavilySearchResults(max_results=2)

def execute_tools(state: PyList[BaseMessage]) -> PyList[BaseMessage]:
    # Take the most recent AI message (draft or revisor output)
    last_ai = next((m for m in reversed(state) if isinstance(m, AIMessage)), None)
    if not last_ai:
        return []

    search_queries = last_ai.additional_kwargs.get("search_queries", [])
    if not search_queries:
        return []

    query_results: Dict[str, Any] = {}
    for q in search_queries:
        # Calls Tavily's API; requires TAVILY_API_KEY in the environment
        query_results[q] = tavily.invoke(q)

    # The tool_call_id is a placeholder because no native tool call exists here.
    # Some chat APIs reject a tool message that does not answer a preceding
    # tool call; if yours does, return the results in a HumanMessage instead.
    tool_msg = ToolMessage(
        content=json.dumps(query_results),
        tool_call_id="manual-tools-1",
    )
    return [tool_msg]
```

```python
# 4.5 Revisor node: structured ReviseAnswer -> AIMessage
from langchain_core.messages import AIMessage

def revisor_node(state: list[BaseMessage]) -> list[BaseMessage]:
    # Count prior revisions
    past_revisions = sum(
        isinstance(m, AIMessage) and m.additional_kwargs.get("role") == "revisor"
        for m in state
    )

    result: ReviseAnswer = revisor_chain.invoke({"messages": state})

    msg = AIMessage(
        content=result.answer,
        additional_kwargs={
            "role": "revisor",
            "iteration": past_revisions + 1,
            "search_queries": result.search_queries,
            "reflection": result.reflection.model_dump(),
            "references": result.references,
        },
    )
    return [msg]
```

```python
# 4.6 Loop logic: stop based on our own tags (provider-agnostic)
from langgraph.graph import MessageGraph, END

MAX_ITERATIONS = 2

def event_loop(state: list[BaseMessage]) -> str:
    # Count how many times revisor has run
    num_iterations = sum(
        isinstance(m, AIMessage) and m.additional_kwargs.get("role") == "revisor"
        for m in state
    )

    if num_iterations >= MAX_ITERATIONS:
        return END

    # Optional safety: if no search queries, just stop
    last_ai = next((m for m in reversed(state) if isinstance(m, AIMessage)), None)
    if not last_ai:
        return END
    if not last_ai.additional_kwargs.get("search_queries"):
        return END

    return "execute_tools"

# Wire the graph
graph = MessageGraph()
graph.add_node("draft", draft_node)
graph.add_node("execute_tools", execute_tools)
graph.add_node("revisor", revisor_node)

graph.add_edge("draft", "execute_tools")
graph.add_edge("execute_tools", "revisor")
graph.add_conditional_edges("revisor", event_loop)
graph.set_entry_point("draft")

app = graph.compile()

# Invoking `app` calls the LLM and Tavily, so it needs API keys in the environment.
print("Provider-agnostic Reflexion graph ready (not executed here).")
```
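
Once the graph has run, the tags stored in `additional_kwargs` also make it easy to pull out the final answer. The sketch below assumes the run's output is a list of messages tagged as in sections 4.3 and 4.5; `final_revision` is a hypothetical helper, and `SimpleNamespace` objects stand in for real messages so the example runs without API keys.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Optional


def final_revision(messages: List[Any]) -> Optional[Dict[str, Any]]:
    """Return the most recent message tagged role='revisor', if any."""
    for m in reversed(messages):
        kwargs = getattr(m, "additional_kwargs", None) or {}
        if kwargs.get("role") == "revisor":
            return {
                "answer": m.content,
                "iteration": kwargs.get("iteration"),
                "references": kwargs.get("references", []),
            }
    return None


# Stand-in history shaped like a finished run (draft, tool, revisor, tool, revisor).
history = [
    SimpleNamespace(content="draft text", additional_kwargs={"role": "draft"}),
    SimpleNamespace(content="{}", additional_kwargs={}),
    SimpleNamespace(
        content="revised v1",
        additional_kwargs={"role": "revisor", "iteration": 1, "references": ["ref A"]},
    ),
    SimpleNamespace(content="{}", additional_kwargs={}),
    SimpleNamespace(
        content="revised v2",
        additional_kwargs={"role": "revisor", "iteration": 2, "references": ["ref A", "ref B"]},
    ),
]

summary = final_revision(history)
print(summary)
```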

## 5. OpenAI Tools Style Version

Now, let's see how you would implement **the same idea** using OpenAI's
native tools (`bind_tools`) instead of structured output.

Key differences:

- Use `ChatOpenAI(...).bind_tools([...])` instead of `with_structured_output`.
- The LLM returns `AIMessage` with `.tool_calls`.
- `execute_tools` reads `.tool_calls` instead of `search_queries` from
  `additional_kwargs`.
- Your stop condition can be based on `ToolMessage` count *or* your own tags.
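
The third difference is the one that most affects node code. As a rough sketch, a single helper can read the queries from either place; `get_search_queries` is a hypothetical name, and `SimpleNamespace` objects stand in for real `AIMessage` instances.

```python
from types import SimpleNamespace
from typing import Any, List


def get_search_queries(message: Any) -> List[str]:
    """Read search queries from tool_calls if present, else additional_kwargs."""
    # OpenAI tools style: queries live in the arguments of a tool call.
    for call in getattr(message, "tool_calls", None) or []:
        queries = call.get("args", {}).get("search_queries")
        if queries:
            return list(queries)
    # Provider-agnostic style: queries were stored manually on the message.
    kwargs = getattr(message, "additional_kwargs", None) or {}
    return list(kwargs.get("search_queries", []))


openai_style = SimpleNamespace(
    tool_calls=[
        {"name": "AnswerQuestion", "args": {"search_queries": ["q1", "q2"]}, "id": "call_1"}
    ],
    additional_kwargs={},
)
generic_style = SimpleNamespace(
    tool_calls=[],
    additional_kwargs={"search_queries": ["q3"]},
)

print(get_search_queries(openai_style))
print(get_search_queries(generic_style))
```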

```python
# 5.1 OpenAI-specific setup (requires OPENAI_API_KEY)
# Reuses AnswerQuestion and ReviseAnswer from section 4.1.
import datetime
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import BaseMessage, ToolMessage, AIMessage

openai_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

actor_prompt_template_oa = ChatPromptTemplate.from_messages([
    (
        "system",
        """You are an expert AI researcher.
Current time: {time}

You must follow this process:
1. {first_instruction}
2. Reflect and critique your answer.
3. After the reflection, list 1-3 search queries separately for researching improvements.
""".strip(),
    ),
    MessagesPlaceholder(variable_name="messages"),
]).partial(time=lambda: datetime.datetime.now().isoformat())

first_responder_prompt_oa = actor_prompt_template_oa.partial(
    first_instruction="Provide a detailed ~250 word answer"
)

revisor_prompt_oa = actor_prompt_template_oa.partial(
    first_instruction="Revise your previous answer using the new information."
)

# Now bind tools instead of structured output
first_responder_chain_oa = first_responder_prompt_oa | openai_llm.bind_tools(
    tools=[AnswerQuestion],
    tool_choice="AnswerQuestion",  # force this tool
)

revisor_chain_oa = revisor_prompt_oa | openai_llm.bind_tools(
    tools=[ReviseAnswer],
    tool_choice="ReviseAnswer",
)

print("OpenAI tool-calling chains defined (not executed here).")
```

```python
# 5.2 execute_tools for OpenAI: read .tool_calls
from typing import List as PyList, Dict, Any
import json
from langchain_community.tools import TavilySearchResults

tavily_tool = TavilySearchResults(max_results=2)

def execute_tools_openai(state: PyList[BaseMessage]) -> PyList[BaseMessage]:
    last_ai = state[-1]
    if not isinstance(last_ai, AIMessage):
        return []

    if not getattr(last_ai, "tool_calls", None):
        return []

    tool_messages: PyList[ToolMessage] = []

    for tool_call in last_ai.tool_calls:
        name = tool_call["name"]
        args = tool_call["args"]
        call_id = tool_call["id"]

        if name in ["AnswerQuestion", "ReviseAnswer"]:
            search_queries = args.get("search_queries", [])
            query_results: Dict[str, Any] = {}
            for q in search_queries:
                query_results[q] = tavily_tool.invoke(q)

            tool_messages.append(
                ToolMessage(
                    content=json.dumps(query_results),
                    tool_call_id=call_id,
                )
            )

    return tool_messages

print("OpenAI execute_tools defined (not executed here).")
```

```python
# 5.3 Example of loop condition using ToolMessage count (OpenAI style)
from langgraph.graph import MessageGraph, END

MAX_ITERATIONS_OA = 2

def event_loop_openai(state: list[BaseMessage]) -> str:
    count_tool_visit = sum(isinstance(m, ToolMessage) for m in state)
    if count_tool_visit >= MAX_ITERATIONS_OA:
        return END
    return "execute_tools_openai"

graph_oa = MessageGraph()
graph_oa.add_node("draft", first_responder_chain_oa)
graph_oa.add_node("execute_tools_openai", execute_tools_openai)
graph_oa.add_node("revisor", revisor_chain_oa)

graph_oa.add_edge("draft", "execute_tools_openai")
graph_oa.add_edge("execute_tools_openai", "revisor")
graph_oa.add_conditional_edges("revisor", event_loop_openai)
graph_oa.set_entry_point("draft")

app_oa = graph_oa.compile()
print("OpenAI-based Reflexion graph ready (not executed here).")
```
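
A `ToolMessage` count alone still loops if the tools node ever returns nothing. One possible hardening, sketched below, is to also stop when the last message carries no tool calls. `event_loop_openai_safe` is a hypothetical variant, `END` is a stand-in constant, and the stand-in messages rely only on the `type` and `tool_calls` attributes.

```python
from types import SimpleNamespace
from typing import Any, List

END = "__end__"          # stand-in for langgraph.graph.END
MAX_ITERATIONS_OA = 2


def event_loop_openai_safe(state: List[Any]) -> str:
    """Stop on the tool-message budget OR when there is nothing left to execute."""
    tool_count = sum(getattr(m, "type", None) == "tool" for m in state)
    if tool_count >= MAX_ITERATIONS_OA:
        return END
    last = state[-1] if state else None
    if last is None or not getattr(last, "tool_calls", None):
        return END  # no tool calls means execute_tools would add nothing
    return "execute_tools_openai"


# Stand-in messages for a dry run.
tool = SimpleNamespace(type="tool")
ai_with_calls = SimpleNamespace(
    type="ai", tool_calls=[{"name": "ReviseAnswer", "args": {}, "id": "c1"}]
)
ai_without_calls = SimpleNamespace(type="ai", tool_calls=[])

route_continue = event_loop_openai_safe([ai_with_calls, tool, ai_with_calls])
route_budget = event_loop_openai_safe(
    [ai_with_calls, tool, ai_with_calls, tool, ai_with_calls]
)
route_no_calls = event_loop_openai_safe([ai_with_calls, tool, ai_without_calls])
```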

## 6. Summary: When to Use Which Pattern

- **If you use OpenAI and want tight tool integration**:
  - Use `bind_tools`.
  - Consume `AIMessage.tool_calls`.
  - Optionally base loop logic on `ToolMessage` count.

- **If you use Groq, Gemini, or want provider-agnostic code**:
  - Use `with_structured_output(PydanticModel)`.
  - Wrap the result into `AIMessage` manually.
  - Store metadata (search_queries, role, iteration, reflection, references)
    in `additional_kwargs`.
  - Implement `execute_tools` and loop conditions based on those fields.

If you follow these patterns, you will avoid the common errors:

- `Unsupported message type: <class '...'>`
- `GraphRecursionError: Recursion limit reached without hitting a stop condition`

and your agent graphs will work across different LLM providers.