
You’ve already seen three of the “big ones” (Parallel Processing, Reflection Loop, Tree of Thought), but there are a few other **agentic orchestration patterns** that are worth knowing because they come up often:

---

## 🔄 1. **ReAct (Reason + Act)**

* The model alternates between *thinking* (reasoning steps) and *acting* (calling tools).
* Popularized because it shows the model’s “chain of thought” explicitly.
* In LangGraph, this usually looks like an LLM node that decides whether to continue reasoning internally or output a tool call, wired into a ToolNode.
* **Use when**: you want transparency + interleaved reasoning/tool use.

---

## 🗳️ 2. **Self-Consistency (Voting / Majority)**

* Instead of one answer, you run multiple reasoning paths (like Tree of Thought) and then **vote** or take the consensus.
* Improves reliability by averaging over randomness.
* **Use when**: correctness matters more than efficiency (math, logic, factual Q&A).

---

## 🧭 3. **Routing / Mixture-of-Experts**

* The LLM decides which tool, agent, or subgraph to call next based on the input.
* Example: one branch handles math, another handles summarization, another handles retrieval.
* **Use when**: different queries need different specialized capabilities.

---

## 🛠️ 4. **Planner–Executor**

* Split into a **planner** (decides the sequence of steps) and an **executor** (carries them out).
* Planner produces a plan → executor executes each tool in order.
* **Use when**: tasks are multi-step but order can vary per query.
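
The planner/executor split can be sketched in plain Python. This is a minimal illustration, not a LangGraph implementation: `TOOLS`, `plan`, and `execute` are hypothetical names, and the planner is a hard-coded stand-in for what would normally be an LLM call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: each takes a string argument and returns a string result.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for '{query}'",
    "summarize": lambda text: f"summary of [{text}]",
}

Step = Tuple[str, str]  # (tool name, argument)

def plan(task: str) -> List[Step]:
    """Stand-in planner; a real one would ask an LLM to produce this list."""
    return [("search", task), ("summarize", "{previous}")]

def execute(steps: List[Step]) -> List[str]:
    """Run each planned step in order, feeding the previous result forward."""
    results: List[str] = []
    previous = ""
    for tool_name, arg in steps:
        arg = arg.replace("{previous}", previous)
        previous = TOOLS[tool_name](arg)
        results.append(previous)
    return results

outputs = execute(plan("EdTech competitors"))
print(outputs[-1])
```

Because the plan is plain data, it can be logged, validated, or shown to a user before anything is executed.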

---

## 🪞 5. **Critic–Improver**

* A variant of Reflection: one agent proposes, another critiques, a third improves.
* Can be looped multiple times or terminated when quality is high.
* **Use when**: you want higher quality, editorial style outputs.
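
A rough sketch of the propose → critique → improve loop with both stopping conditions (quality threshold and a round limit). The three role functions are hypothetical stand-ins for LLM calls, and the scoring rule is invented purely so the loop has something to converge on.

```python
from typing import Tuple

def propose(task: str) -> str:
    """Stand-in proposer (an LLM call in a real system)."""
    return "draft"

def critique(draft: str) -> Tuple[float, str]:
    """Stand-in critic: returns a quality score in [0, 1] plus feedback."""
    score = min(1.0, 0.4 + 0.3 * draft.count("+"))
    return score, "add more detail"

def improve(draft: str, feedback: str) -> str:
    """Stand-in improver: marks each revision with a '+'."""
    return draft + "+"

def critic_improver(task: str, threshold: float = 0.9, max_rounds: int = 5):
    draft = propose(task)
    score, feedback = critique(draft)
    rounds = 0
    # Stop when quality is high enough OR the round budget is spent.
    while score < threshold and rounds < max_rounds:
        draft = improve(draft, feedback)
        score, feedback = critique(draft)
        rounds += 1
    return draft, score, rounds

final_draft, final_score, rounds_used = critic_improver("write a product blurb")
print(final_draft, final_score, rounds_used)
```

The `max_rounds` cap matters in practice: without it, a critic that is never satisfied keeps the loop (and the token bill) running.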

---

## 🧵 6. **Streaming / Stepwise Debug**

* Orchestration where intermediate states are surfaced back to the user (or a monitoring agent).
* Lets you observe, steer, or stop the process mid-run.
* **Use when**: debugging, real-time dashboards, or human-in-the-loop setups.
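
The idea of surfacing intermediate states can be sketched with a plain Python generator. LangGraph's compiled graphs expose a `stream` method for this purpose; the version below is framework-free, and `run_stepwise` and the three pipeline steps are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

State = Dict[str, Any]
StepFn = Callable[[State], State]

def run_stepwise(steps: List[Tuple[str, StepFn]], state: State) -> Iterator[Tuple[str, State]]:
    """Yield (step name, state snapshot) after every step instead of only at the end."""
    for name, fn in steps:
        state = {**state, **fn(state)}
        yield name, dict(state)

pipeline = [
    ("fetch", lambda s: {"raw": "some fetched text"}),
    ("clean", lambda s: {"clean": s["raw"].upper()}),
    ("answer", lambda s: {"answer": f"Based on: {s['clean']}"}),
]

seen = []
for name, snapshot in run_stepwise(pipeline, {"question": "demo"}):
    seen.append(name)
    print(name, "->", snapshot)
    if name == "clean":  # a human or monitoring agent could decide to stop here
        break
```

Since the consumer drives the generator, breaking out of the loop stops the run before the remaining steps execute.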

---

✅ **Big takeaway:** LangGraph doesn’t force you into one pattern — it gives you primitives (`State`, `Nodes`, `Edges`, `Reducers`). Patterns are just *recipes* for wiring these primitives to match common reasoning workflows.
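
Of these primitives, reducers are the least intuitive, so here is a plain-Python sketch of the idea: a reducer decides how a node's partial update is merged into the existing state. This mimics the concept only and is not LangGraph's actual implementation; `REDUCERS` and `apply_update` are hypothetical names.

```python
import operator
from typing import Any, Callable, Dict

# One reducer per state key; keys without a reducer are simply overwritten.
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"answers": operator.add}

def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into the state, key by key."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in state:
            merged[key] = reducer(state[key], value)  # combine old and new
        else:
            merged[key] = value                       # overwrite
    return merged

state = {"answers": [], "final_answer": ""}
state = apply_update(state, {"answers": ["391"]})
state = apply_update(state, {"answers": ["381"]})
state = apply_update(state, {"final_answer": "391"})
print(state)
```

Without the `operator.add` reducer, the second update would replace the first answer instead of appending to it, which is a common source of bugs when several nodes write to the same key.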




# 🔄 What ReAct Is

* **Reason**: The model thinks out loud ("I should look up the weather").
* **Act**: The model then calls a tool (`get_weather("Paris")`).
* It alternates: *reason → act → reason → act → final answer*.

This makes the agent transparent: you can see its inner reasoning AND its actions, step by step.
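
Before bringing in a framework, the alternation can be sketched in plain Python. Everything here is an assumption made for illustration: `scripted_model` is a hard-coded stand-in for the LLM, and `get_weather` returns a canned string rather than calling a real API.

```python
from typing import Callable, Dict, List, Optional, Tuple

def get_weather(city: str) -> str:
    """Hypothetical tool with a canned answer."""
    return f"cloudy in {city}"

TOOLS: Dict[str, Callable[[str], str]] = {"get_weather": get_weather}

# What the "model" returns: (thought, tool call or None, final answer or None)
Decision = Tuple[str, Optional[Tuple[str, str]], Optional[str]]

def scripted_model(history: List[str]) -> Decision:
    """Stand-in for an LLM: acts once, then answers from the observation."""
    observations = [h for h in history if h.startswith("OBSERVATION")]
    if not observations:
        return "I should look up the weather", ("get_weather", "Paris"), None
    fact = observations[-1].split(": ", 1)[1]
    return "I have what I need", None, f"It is {fact}."

def react(question: str, max_steps: int = 5) -> List[str]:
    history = [f"QUESTION: {question}"]
    for _ in range(max_steps):
        thought, action, answer = scripted_model(history)   # reason
        history.append(f"THOUGHT: {thought}")
        if action is None:                                   # no tool needed: finish
            history.append(f"ANSWER: {answer}")
            break
        tool_name, arg = action                              # act
        history.append(f"ACTION: {tool_name}({arg!r})")
        history.append(f"OBSERVATION: {TOOLS[tool_name](arg)}")
    return history

trace = react("What's the weather in Paris?")
print("\n".join(trace))
```

The returned `trace` is the transparency benefit in concrete form: every thought, action, and observation is recorded in order.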

---

# 🧠 Why It’s Useful

* **Interleaving**: Many problems require alternating between thought and action (e.g., query → fetch data → think → refine).
* **Transparency**: You can inspect the reasoning chain.
* **Reliability**: The LLM doesn’t have to hallucinate results — it calls tools for facts.
* **Flexibility**: Works well when you have multiple tools but don’t know which ones the model will need.

---

# 🐍 Minimal LangGraph Example

Here’s a simple **ReAct agent** that can reason and call a calculator tool.

```python
from typing import TypedDict, Annotated, List
from langgraph.graph import StateGraph, END, add_messages
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain_openai import ChatOpenAI
from langchain.tools import tool
from langgraph.prebuilt import ToolNode

# Define the agent's memory (messages accumulate)
class AgentState(TypedDict):
    messages: Annotated[List, add_messages]

# A simple tool the agent can call
@tool
def add_two_numbers(x: int, y: int) -> int:
    """Add two integers and return the result."""
    return x + y

tools = [add_two_numbers]

# LLM that can decide to either reason or call tools
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools(tools)

# Define nodes
def call_model(state: AgentState):
    """LLM node: either reasons or calls a tool."""
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

tool_node = ToolNode(tools)

# Build the graph
builder = StateGraph(AgentState)
builder.add_node("reason", call_model)
builder.add_node("act", tool_node)

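# Routing: go to the tools only if the LLM's last message requested a tool call
def should_continue(state: AgentState) -> str:
    last_message = state["messages"][-1]
    return "act" if getattr(last_message, "tool_calls", None) else END

# Wiring: LLM (reason) → tool call (act) → back to LLM, or END when no tool is requested
builder.set_entry_point("reason")
builder.add_conditional_edges("reason", should_continue, {"act": "act", END: END})
builder.add_edge("act", "reason")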

graph = builder.compile()

# Run a test
result = graph.invoke({"messages": [HumanMessage(content="What is 2+3?")]})

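print("\n--- Conversation Trace ---")
for msg in result["messages"]:
    # AI messages that request a tool often have empty content; show the requested calls instead
    tool_calls = getattr(msg, "tool_calls", None)
    print(f"{msg.type.upper()}: {msg.content or tool_calls}")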
```

---

# 🔎 What Happens Here

1. User says: *“What is 2+3?”* → goes into memory.
2. LLM runs (`reason` node) and might output:

   * Thought: "I should call add_two_numbers with x=2, y=3." (With native tool calling, this text is often empty and only the structured call is emitted.)
   * Action: a structured tool call.
3. Graph routes to the `act` node (ToolNode), executes the tool, appends the result (`ToolMessage`).
4. Flow goes back to `reason`: the LLM now sees the tool result and can decide whether to stop or continue reasoning.
5. Eventually, it outputs a final natural language answer, and the graph ends.

---

✅ **This is the canonical ReAct loop**:

* Reason step → Act step → Reason step → Act step → Final Answer.

It’s super flexible: you can plug in more tools, add error handlers, or extend reasoning depth.



Once you grasp **ReAct**, you can scale it into surprisingly powerful orchestrators. Here are some **complex agent archetypes** that use this pattern as their backbone:

---

## 🔎 1. **Research Assistant / Retrieval Agent**

* **Reason step**: LLM decides *what it needs to know* next.
* **Act step**: Calls tools like:

  * `search_wikipedia("topic")`
  * `query_vector_db("keywords")`
  * `fetch_papers("arxiv")`
* Alternates reasoning and searching until it has enough evidence.
* **Complexity**: Multiple retrieval calls, filtering, synthesis before producing the answer.

---

## 🗂️ 2. **Multi-Tool Analyst**

* Handles heterogeneous data sources:

  * Tool for SQL queries.
  * Tool for spreadsheets.
  * Tool for APIs (weather, stock, finance).
* **Reason step**: “To answer this, I need database stats.”
* **Act step**: Calls SQL tool → gets results → reasons → calls finance API.
* Useful in **business intelligence, dashboards, operations monitoring**.

---

## 🔄 3. **Plan–Execute ReAct Agent**

* The agent **plans a sequence** of actions, then executes them one by one, checking results after each step.
* Example: *"Book me a trip"* →

  * Plan: (1) search flights, (2) pick hotel, (3) confirm itinerary.
  * Executes step by step with reasoning between each.
* **Complexity**: Combines ReAct with planning.
* Used in **autonomous task executors** like travel planners or project managers.

---

## 🧠 4. **Autonomous Researcher (AutoGPT-style)**

* Loops through:

  1. **Reason**: Decide subgoal (e.g. “Find competitors in EdTech”).
  2. **Act**: Use search tool.
  3. **Reason**: “Now I need to summarize findings.”
  4. **Act**: Use summarization tool.

  * … continues until high-level goal achieved.
* **Complexity**: Self-directed, may run indefinitely until a stopping criterion is hit.
* This is basically ReAct extended with **long-term memory + self-planning**.
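
The "stopping criterion" is the part that is easiest to forget, so here is a small sketch of a self-directed loop with two exits: the task queue runs dry, or a step budget is exhausted. `run_autonomous` and its rule for spawning follow-up tasks are hypothetical; in a real agent the LLM would choose the next subgoal.

```python
from collections import deque
from typing import Deque, List

def run_autonomous(goal: str, max_steps: int = 10) -> List[str]:
    """Work through self-generated subgoals until none remain or the budget is spent."""
    todo: Deque[str] = deque([f"research: {goal}"])
    memory: List[str] = []          # long-term record of completed work
    steps = 0
    while todo and steps < max_steps:
        task = todo.popleft()
        kind, _, subject = task.partition(": ")
        memory.append(f"done {task}")               # "act" on the current subgoal
        if kind == "research":                      # "reason": decide the follow-up
            todo.append(f"summarize: {subject}")
        steps += 1
    return memory

log = run_autonomous("EdTech competitors")
print(log)
```

Passing a small `max_steps` is a cheap safety net while developing: the agent cannot run indefinitely even if the subgoal logic keeps generating work.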

---

## 🤝 5. **Collaborative ReAct Agents**

* Multiple agents each running their own ReAct loop, but **sharing messages**.

  * E.g., a “Data Analyst” agent (tools: SQL, spreadsheet), and a “Writer” agent (tools: summarizer, formatter).
* They **reason + act independently**, then sync at checkpoints.
* Orchestrated like a conversation.
* **Use case**: report generation, code review, multi-role simulations.

---

## 🧭 Why These Are Powerful

ReAct provides:

* **Transparency** (you see reasoning + actions).
* **Flexibility** (the model dynamically picks tools).
* **Scalability** (you can add 10+ tools without changing orchestration).

That’s why ReAct-style loops underpin many modern **agent frameworks** (LangChain Agents, LangGraph ToolNodes, AutoGPT, BabyAGI, etc.).

---

✅ **Takeaway:**
The *basic ReAct loop* → can scale into **researchers, analysts, planners, or multi-agent collaborations**, all by chaining “reason → act” with state, tools, and orchestration.



**Self-Consistency (Voting / Majority)** is one of the more reliable agentic patterns, especially for math, logic, or factual tasks where a single generation may go astray.

Here’s the breakdown and a runnable LangGraph example:

---

# 🗳️ What Self-Consistency Is

* **Idea**: Don’t trust one LLM answer. Generate multiple *independent reasoning paths*.
* Then **aggregate**: majority vote, scoring, or heuristic selection.
* **Effect**: reduces variance, improves correctness (especially on deterministic problems).
* Think of it as “committee of LLMs” → majority wins.

---

# 🐍 Code Example (LangGraph)

Let’s build a **math solver with majority voting**:

```python
import collections
import operator
from typing import TypedDict, List, Annotated
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# --- State definition ---
class AgentState(TypedDict):
    question: str
    # operator.add is the reducer: each node's list is appended instead of overwriting the previous one
    answers: Annotated[List[str], operator.add]
    final_answer: str

# --- LLM setup ---
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)  # temp>0 → diverse reasoning

# --- Nodes ---
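def generate_answer(state: AgentState):
    """Each run generates one candidate answer."""
    # Ask for the final answer alone on the last line, so the vote compares answers rather than wording
    prompt = state["question"] + "\nReason step by step, then give only the final answer on the last line."
    response = llm.invoke([HumanMessage(content=prompt)])
    lines = response.content.strip().splitlines()
    return {"answers": [lines[-1].strip() if lines else ""]}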

def vote_on_answers(state: AgentState):
    """Majority voting to pick the most common answer."""
    counter = collections.Counter(state["answers"])
    winner, _ = counter.most_common(1)[0]
    return {"final_answer": winner}

# --- Graph building ---
builder = StateGraph(AgentState)

# Add nodes
builder.add_node("gen1", generate_answer)
builder.add_node("gen2", generate_answer)
builder.add_node("gen3", generate_answer)
builder.add_node("vote", vote_on_answers)

# Entry → three generations chained one after another (sequential here for simplicity) → vote
builder.set_entry_point("gen1")
builder.add_edge("gen1", "gen2")
builder.add_edge("gen2", "gen3")
builder.add_edge("gen3", "vote")
builder.add_edge("vote", END)

graph = builder.compile()

# --- Run test ---
state = {"question": "What is 17 * 23?", "answers": [], "final_answer": ""}
result = graph.invoke(state)

print("\n--- Candidate Answers ---")
print(result["answers"])
print("\n--- Final Answer (Majority) ---")
print(result["final_answer"])
```

---

# 🔎 How It Works

1. **Three generators** (`gen1`, `gen2`, `gen3`) each call the LLM with the same question.

   * Temperature is >0 so they produce *different reasoning paths*.
2. Each appends its candidate answer to the shared `answers` list.
3. The **vote node** tallies answers and picks the majority.

   * You could swap in more complex scoring (e.g., confidence, self-evaluation).
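
One practical detail: free-text responses rarely match word for word, so a vote over raw strings often has no real majority. A common workaround is to normalize each response to its final answer before counting. The sketch below assumes numeric answers and that the last number in a response is the answer; `extract_final_number` and `majority_vote` are hypothetical helper names, and the sample responses are made up.

```python
import re
from collections import Counter
from typing import List, Optional

def extract_final_number(text: str) -> Optional[str]:
    """Return the last number in a response, with thousands separators removed."""
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return numbers[-1].replace(",", "") if numbers else None

def majority_vote(responses: List[str]) -> Optional[str]:
    """Vote over normalized answers; responses without a number are ignored."""
    votes = [n for n in map(extract_final_number, responses) if n is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]

responses = [
    "17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391",
    "The answer is 391.",
    "17 * 23 is 381",
]
winner = majority_vote(responses)
print(winner)
```

Here two differently worded responses agree on 391 and outvote the single incorrect 381, which a raw string comparison would have treated as a three-way tie.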

---

# ✅ When To Use

* **Math / logic problems** → a single unlucky sample is outvoted by the majority.
* **Factual Q&A** → consistency across multiple tries signals reliability.
* **Structured outputs** → e.g., parsing JSON, SQL queries, code snippets.
* **Critical tasks** → correctness > speed/cost (since this pattern is more expensive).





# 🧭 Routing / Mixture-of-Experts Pattern

This is a fun one 😎

---

## 🔎 What It Is

* Instead of one big monolithic agent, you have **specialized sub-agents** (or “experts”).
* A **router LLM** reads the input and decides:

  * *Is this a math problem?* → send to `math_solver`.
  * *Is this a summarization request?* → send to `summarizer`.
  * *Is this a knowledge lookup?* → send to `retriever`.
* This is sometimes called **Mixture-of-Experts (MoE)**, by loose analogy with the neural-network architecture of that name, because only the *relevant expert* is used per query.

---

## ✅ Why It’s Useful

* **Efficiency**: Only call the tools/LLMs that matter.
* **Modularity**: Each expert is small, focused, and swappable.
* **Accuracy**: Experts can be fine-tuned or engineered for their task.
* **Scalability**: Easy to add more experts as your system grows.

---

## 🐍 LangGraph Code Example

```python
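import re
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# --- State ---
class AgentState(TypedDict):
    question: str
    route: str
    answer: str

# --- Experts ---
def math_solver(state: AgentState):
    # Pull the arithmetic out of the question, e.g. "5*12" from "What is 5*12?"
    match = re.search(r"[\d(][\d\s+\-*/().]*", state["question"])
    try:
        # eval is for demo only: builtins are disabled, but never eval untrusted input in production
        result = eval(match.group(0), {"__builtins__": {}})
        return {"answer": f"Math result: {result}"}
    except Exception:
        # Covers "no expression found" (match is None) as well as malformed expressions
        return {"answer": "I could not solve the math problem."}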

def summarizer(state: AgentState):
    return {"answer": f"Summary: {state['question'][:50]}..."}

def retriever(state: AgentState):
    return {"answer": f"Pretend I looked this up: {state['question']}"}

# --- Router LLM ---
router_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def router(state: AgentState):
    """Decide which expert should handle the question."""
    prompt = f"""
    Classify the user request: "{state['question']}"
    Options: math, summarize, retrieve.
    Only respond with one word.
    """
    response = router_llm.invoke([HumanMessage(content=prompt)])
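    route = response.content.strip().lower()
    # Fall back to retrieval if the model returns a label the graph has no edge for
    if route not in {"math", "summarize", "retrieve"}:
        route = "retrieve"
    return {"route": route}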

# --- Graph ---
builder = StateGraph(AgentState)

builder.add_node("router", router)
builder.add_node("math", math_solver)
builder.add_node("summarizer", summarizer)
builder.add_node("retriever", retriever)

builder.set_entry_point("router")

# conditional routing
builder.add_conditional_edges(
    "router",
    lambda state: state["route"],
    {
        "math": "math",
        "summarize": "summarizer",
        "retrieve": "retriever"
    }
)

# all experts go to END
builder.add_edge("math", END)
builder.add_edge("summarizer", END)
builder.add_edge("retriever", END)

graph = builder.compile()

# --- Run tests ---
tests = [
    "What is 5*12?",
    "Summarize the history of Rome in one line.",
    "Who is the CEO of Tesla?"
]

for t in tests:
    result = graph.invoke({"question": t, "route": "", "answer": ""})
    print(f"\nQ: {t}")
    print(f"Routed to: {result['route']}")
    print(f"A: {result['answer']}")
```

---

## 🔎 What Happens Here

* User input goes into the **router node**.
* Router LLM classifies the input (math / summarize / retrieve).
* The graph uses **conditional edges** to send the request to the right expert.
* That expert produces the answer → graph ends.

---

✅ **This is a textbook Routing/MoE pattern.**
It scales super well: just add more experts and update the routing schema.
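
One way to keep "add more experts" cheap is to hold the experts in a registry and normalize whatever label the classifier returns. This is a framework-free sketch: `EXPERTS`, `FALLBACK`, `normalize_route`, and `dispatch` are hypothetical names, and the experts are trivial lambdas standing in for real nodes.

```python
from typing import Callable, Dict

# Registry of experts: label → handler.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "math": lambda q: f"[math expert] {q}",
    "summarize": lambda q: f"[summarizer] {q}",
    "retrieve": lambda q: f"[retriever] {q}",
}
FALLBACK = "retrieve"  # assumption: unknown requests are treated as lookups

def normalize_route(label: str) -> str:
    """Map a raw classifier label onto a known expert, else the fallback."""
    cleaned = label.strip().lower().strip(".!\"'")
    return cleaned if cleaned in EXPERTS else FALLBACK

def dispatch(label: str, question: str) -> str:
    """Send the question to the expert selected by the (normalized) label."""
    return EXPERTS[normalize_route(label)](question)

# Adding an expert is a single registry entry.
EXPERTS["translate"] = lambda q: f"[translator] {q}"

print(dispatch(" Math.\n", "5*12"))
print(dispatch("poetry", "Write a haiku"))
```

The fallback is worth having even with a well-behaved classifier, since a label with stray punctuation or an unexpected word would otherwise have nowhere to go.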




LangGraph makes it *tempting* to keep adding nodes, loops, retries, and experts, but there are very real benefits *and* limits. Let’s unpack this:

---

# 🌟 Benefits of Increasing Orchestrator Complexity

1. **Specialization**

   * Routing queries to the right “expert” node/tool improves accuracy.
   * Complex workflows (reflection, multi-step planning, ToT) can yield higher-quality results.

2. **Transparency**

   * With LangGraph, more explicit steps = easier to debug and reason about than a single opaque LLM call.
   * You see where reasoning, acting, voting, or retrying happens.

3. **Resilience**

   * Fallbacks, retries, error edges, and voting patterns increase robustness.
   * Complex orchestrators reduce single-point failure.

4. **Scalability of use cases**

   * You can mix agents (summarizer, retriever, planner, editor) to cover more workflows.
   * The system grows organically by composing reusable patterns.
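
As an illustration of the resilience point, retries and fallbacks can be wrapped around any flaky step. The helper below is a minimal sketch with hypothetical names (`with_retries`, `flaky_tool`); in a graph, the same idea would typically be expressed as an error edge to a fallback node.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(primary: Callable[[], T], fallback: Callable[[], T],
                 attempts: int = 3, delay_s: float = 0.0) -> T:
    """Try `primary` up to `attempts` times, then fall back."""
    for attempt in range(attempts):
        try:
            return primary()
        except Exception as exc:  # real code should catch specific error types
            print(f"attempt {attempt + 1} failed: {exc}")
            time.sleep(delay_s)
    return fallback()

calls = {"n": 0}

def flaky_tool() -> str:
    """Hypothetical tool that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("tool timed out")
    return "tool result"

result = with_retries(flaky_tool, lambda: "cached answer")
print(result)
```

Note that each retry is another call, so this resilience is paid for in latency and cost.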

---

# ⚠️ Drawbacks of Too Much Complexity

1. **Latency / Cost**

   * Most nodes trigger an LLM call, API call, or tool call, and each one adds latency and cost.
   * If you add loops, voting, or multi-agent debate, costs can balloon quickly.
   * E.g. Tree-of-Thought + Self-Consistency could run dozens of calls for one answer.

2. **Cognitive Overhead (for devs)**

   * More nodes = harder to understand and maintain.
   * Even with diagrams, a 50-node graph can get confusing.
   * Debugging why an agent failed can take time if control flow is deeply nested.

3. **Diminishing Returns**

   * Beyond a certain point, extra orchestration adds **little marginal improvement**.
   * A 2-step reflection loop might help, but a 5-step loop could just waste tokens.
   * “More structure ≠ better results” if the task doesn’t need it.

4. **Fragility**

   * Complex orchestrators can fail in unexpected ways.
   * If routing misclassifies input, or one branch underperforms, the whole flow degrades.
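
To make the cost point concrete, here is a back-of-the-envelope call count for combining Tree-of-Thought with Self-Consistency. The model is an assumption: a fully expanded tree with no pruning, one LLM call to generate each thought and one to score it, repeated once per self-consistency sample.

```python
def tree_nodes(branching: int, depth: int) -> int:
    """Thoughts in a fully expanded tree (no pruning), excluding the root question."""
    return sum(branching ** level for level in range(1, depth + 1))

def llm_calls(samples: int, branching: int, depth: int) -> int:
    """One call to generate each thought and one to score it, per sample."""
    return samples * tree_nodes(branching, depth) * 2

single_tree = llm_calls(samples=1, branching=2, depth=2)
with_voting = llm_calls(samples=3, branching=2, depth=2)
print(single_tree)   # 12
print(with_voting)   # 36
```

Even a small tree (2 branches, 2 levels) sampled 3 times lands at 36 calls under these assumptions, and the count grows exponentially with depth.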

---

# 📈 What’s the Upper Limit of Complexity?

* **Practical limits** come from:

  * **Cost** (each step is $$).
  * **Latency** (user won’t wait 60s for a single answer).
  * **Reliability** (longer chains = more points of failure).

* **Rough sweet spot** (a rule of thumb for most real-world use cases, not a hard limit):

  * **5–15 nodes** in a graph.
  * **1–2 loops max** (reflection, retry).
  * **2–3 experts** (router).
  * Beyond this, you risk over-engineering.

* **When is it “too complex”?**

  * If the orchestrator costs more in tokens than it saves in accuracy.
  * If adding nodes doesn’t measurably improve outcomes (e.g. BLEU, pass@1).
  * If developers can’t explain the flow easily to new team members.

---

# 💡 Rule of Thumb

* **Start simple** (one ReAct loop or router).
* **Add complexity only when metrics justify it**:

  * Accuracy is too low → add voting.
  * Reliability is weak → add retries/error nodes.
  * Latency too high → add parallelism.
* **Don’t chase “agent maximalism.”** The best orchestrators are usually **lean graphs with a few well-placed patterns**, not sprawling monster graphs.
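
For the "latency too high → add parallelism" rule, independent calls can be issued concurrently instead of one after another. This sketch uses a thread pool from the standard library; `slow_call` is a hypothetical stand-in that sleeps to imitate network latency rather than calling a real model.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List

def slow_call(prompt: str) -> str:
    """Stand-in for an LLM or API call that takes about 0.1 s."""
    time.sleep(0.1)
    return f"answer to {prompt}"

prompts: List[str] = ["q1", "q2", "q3"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    # map preserves input order even though the calls run concurrently
    answers = list(pool.map(slow_call, prompts))
elapsed = time.perf_counter() - start

print(answers)
print(f"elapsed: {elapsed:.2f}s (about 0.3s if run sequentially)")
```

This only helps when the calls do not depend on each other, as with the independent samples in a voting pattern; it reduces wall-clock time but not the number of calls or their cost.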

---

✅ **Takeaway**:
LangGraph *allows* arbitrarily complex orchestrators, but the optimal designs are *purposeful, not maximal*.

* Complexity buys you specialization, robustness, and quality — up to a point.
* The upper limit is dictated by **cost, latency, and maintainability**.

