# 🧠 Production-Grade Self-Reflecting Agent with LangGraph

This notebook demonstrates **how self-reflection is implemented in production agentic AI systems** using **LangGraph**.

### Key characteristics
- ✔ LangGraph StateGraph
- ✔ No LLM or external API calls
- ✔ Multiple strategies & models
- ✔ Accuracy vs cost trade-off
- ✔ Retry budget & termination logic
- ✔ Fully auditable agent memory

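The act, reflect, retry-or-stop control flow mirrors **real production agent loops**; a local scikit-learn model stands in for the LLM so the loop itself stays in focus.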

## 🧩 Agent State (Production Schema)

The state is shared across all LangGraph nodes.

**Design goals:**
- Explicit memory
- Retry control
- Strategy selection
- Deterministic termination

In [1]:
from dataclasses import dataclass, field
from typing import List, Dict, Any

@dataclass
class AgentState:
    history: List[Dict[str, Any]] = field(default_factory=list)
    strategy: str = "fast"          # fast | accurate
    retries: int = 0
    max_retries: int = 3
    done: bool = False
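
One detail of this schema is worth a quick check: `field(default_factory=list)` gives every `AgentState` its own `history` list, so separate runs do not share memory. The sketch below restates the dataclass from the cell above so that it runs on its own; `run_a` and `run_b` are illustrative names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AgentState:
    history: List[Dict[str, Any]] = field(default_factory=list)
    strategy: str = "fast"          # fast | accurate
    retries: int = 0
    max_retries: int = 3
    done: bool = False


run_a = AgentState()
run_b = AgentState()

# Appending to one run's memory must not leak into another run.
run_a.history.append({"node": "train", "strategy": "fast"})

print(len(run_a.history), len(run_b.history))  # prints: 1 0
```

A bare `history: list = []` default would be rejected by `dataclass` with a `ValueError`, which is why the factory form is used.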

## 📦 Local ML Dependencies

We intentionally use **simple, local models** to focus on **agent behavior**, not ML complexity.

In [2]:
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

## ⚙️ LangGraph Action Node: Train Model

This node represents an **agent action**.

- `fast` strategy → cheaper, faster model
- `accurate` strategy → more expensive, higher-quality model

In [8]:
def train_model_node(state: AgentState) -> AgentState:
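    # Synthetic binary classification data; the fixed seed makes runs repeatable.
    # Note: class_sep also depends on the strategy, so the "accurate" run gets
    # a more separable dataset and its accuracy gain is not due to the model alone.
    X, y = make_classification(
        n_samples=500,
        n_features=10,
        class_sep=0.8 if state.strategy == "fast" else 1.5,
        flip_y=0.15,
        random_state=42
    )
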
    if state.strategy == "fast":
        model = LogisticRegression(max_iter=100)
        cost = 1
    else:
        model = SVC(kernel="rbf", gamma="scale")
        cost = 3

    # Train on the first 400 samples, evaluate on the remaining 100.
    model.fit(X[:400], y[:400])
    preds = model.predict(X[400:])
    acc = accuracy_score(y[400:], preds)

    state.history.append({
        "node": "train",
        "strategy": state.strategy,
        "accuracy": acc,
        "cost": cost
    })

    print(f"[Train] strategy={state.strategy} | acc={acc:.2f} | cost={cost}")
    return state

## 🔁 LangGraph Reflection Node (Core Intelligence)

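This node **reads the agent's own memory** and inspects:

- Accuracy of the last training step
- Compute cost (logged for the audit trail; the decision rule in this cell does not use it)
- Retry budget

It then decides whether to **retry with the `accurate` strategy or stop execution**.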

In [9]:
def reflection_node(state: AgentState) -> AgentState:
    last = state.history[-1]
    acc = last["accuracy"]
    cost = last["cost"]

    print(f"[Reflect] acc={acc:.2f} | cost={cost} | retries={state.retries}")

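    # Retry only while accuracy is below the 0.80 target and budget remains.
    # Note: cost is logged above but does not influence this decision.
    if acc < 0.80 and state.retries < state.max_retries:
        state.strategy = "accurate"   # escalate to the more expensive model
        state.retries += 1
        decision = "retry_with_better_model"
    else:
        # Either the target was met or the retry budget is exhausted.
        state.done = True
        decision = "stop"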

    state.history.append({
        "node": "reflect",
        "decision": decision,
        "next_strategy": state.strategy,
        "done": state.done
    })

    return state
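
The reflection rule can also be written as a pure function of the recorded history, which makes it easy to unit-test and to extend with a cost check. The sketch below is one possible variant, not the notebook's implementation: `decide`, `target_acc`, `cost_budget`, and `next_cost` are hypothetical names, and the assumption that the next attempt costs 3 simply reuses the `accurate` strategy's cost from the training node.

```python
from typing import Any, Dict, List


def decide(
    history: List[Dict[str, Any]],
    retries: int,
    max_retries: int = 3,
    target_acc: float = 0.80,
    cost_budget: int = 6,   # hypothetical cap on total spend
    next_cost: int = 3,     # assumed cost of one more "accurate" attempt
) -> str:
    """Return "retry" or "stop" based on the recorded training steps."""
    train_steps = [h for h in history if h.get("node") == "train"]
    last = train_steps[-1]
    spent = sum(h["cost"] for h in train_steps)

    if last["accuracy"] >= target_acc:
        return "stop"    # target met
    if retries >= max_retries:
        return "stop"    # retry budget exhausted
    if spent + next_cost > cost_budget:
        return "stop"    # one more attempt would exceed the cost budget
    return "retry"


history = [{"node": "train", "strategy": "fast", "accuracy": 0.73, "cost": 1}]

print(decide(history, retries=0))                  # prints: retry
print(decide(history, retries=0, cost_budget=3))   # prints: stop
```

Keeping the decision separate from the state mutation means the node body reduces to calling the function and recording its result.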

## 🧭 LangGraph Conditional Edge

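The reflection node's decision controls **graph execution flow** through a conditional edge, so no explicit Python loop is needed: `should_continue` returns the key of the next node.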

In [10]:
def should_continue(state: AgentState) -> str:
    return "end" if state.done else "train"
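
To make the routing concrete, the sketch below replays what a conditional edge does, using a plain dictionary of node functions instead of LangGraph. `State`, `fake_train`, `fake_reflect`, and `run` are hypothetical stand-ins, and the accuracies are hard-coded rather than measured; only `should_continue` follows the cell above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class State:
    trace: List[str] = field(default_factory=list)
    strategy: str = "fast"
    accuracy: float = 0.0
    done: bool = False


def fake_train(state: State) -> State:
    # Hard-coded stand-in for model training.
    state.accuracy = 0.73 if state.strategy == "fast" else 0.81
    state.trace.append(f"train:{state.strategy}")
    return state


def fake_reflect(state: State) -> State:
    if state.accuracy < 0.80:
        state.strategy = "accurate"
    else:
        state.done = True
    state.trace.append("reflect")
    return state


def should_continue(state: State) -> str:
    return "end" if state.done else "train"


def run(state: State, max_steps: int = 20) -> State:
    nodes: Dict[str, Callable[[State], State]] = {
        "train": fake_train,
        "reflect": fake_reflect,
    }
    current = "train"                          # entry point
    for _ in range(max_steps):                 # hard cap as a safety net
        state = nodes[current](state)
        if current == "train":
            current = "reflect"                # fixed edge: train -> reflect
        else:
            current = should_continue(state)   # conditional edge
            if current == "end":
                break
    return state


print(run(State()).trace)
# prints: ['train:fast', 'reflect', 'train:accurate', 'reflect']
```

The graph version expresses the same transitions declaratively, which keeps the routing rule in one small function.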

## 🕸️ Build the LangGraph (Production Pattern)

In [11]:
from langgraph.graph import StateGraph, END

graph = StateGraph(AgentState)

graph.add_node("train", train_model_node)
graph.add_node("reflect", reflection_node)

graph.set_entry_point("train")
graph.add_edge("train", "reflect")

graph.add_conditional_edges(
    "reflect",
    should_continue,
    {
        "train": "train",
        "end": END
    }
)

agent_graph = graph.compile()

## ▶️ Run the Self-Reflecting Agent

In [12]:
final_state = agent_graph.invoke(AgentState())

print("\n--- FINAL AGENT MEMORY ---")
for h in final_state["history"]:
    print(h)

[Train] strategy=fast | acc=0.73 | cost=1
[Reflect] acc=0.73 | cost=1 | retries=0
[Train] strategy=accurate | acc=0.81 | cost=3
[Reflect] acc=0.81 | cost=3 | retries=1

--- FINAL AGENT MEMORY ---
{'node': 'train', 'strategy': 'fast', 'accuracy': 0.73, 'cost': 1}
{'node': 'reflect', 'decision': 'retry_with_better_model', 'next_strategy': 'accurate', 'done': False}
{'node': 'train', 'strategy': 'accurate', 'accuracy': 0.81, 'cost': 3}
{'node': 'reflect', 'decision': 'stop', 'next_strategy': 'accurate', 'done': True}
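
Because every step is recorded, the memory trail can be summarized after the run. The sketch below copies the four records printed above into a list and computes total cost and best accuracy; `summarize` is a hypothetical helper, not part of the notebook.

```python
from typing import Any, Dict, List

# Records copied from the run output above.
history: List[Dict[str, Any]] = [
    {"node": "train", "strategy": "fast", "accuracy": 0.73, "cost": 1},
    {"node": "reflect", "decision": "retry_with_better_model",
     "next_strategy": "accurate", "done": False},
    {"node": "train", "strategy": "accurate", "accuracy": 0.81, "cost": 3},
    {"node": "reflect", "decision": "stop",
     "next_strategy": "accurate", "done": True},
]


def summarize(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate the training steps of an agent run."""
    train_steps = [h for h in history if h["node"] == "train"]
    best = max(train_steps, key=lambda h: h["accuracy"])
    return {
        "attempts": len(train_steps),
        "total_cost": sum(h["cost"] for h in train_steps),
        "best_strategy": best["strategy"],
        "best_accuracy": best["accuracy"],
    }


print(summarize(history))
# prints: {'attempts': 2, 'total_cost': 4, 'best_strategy': 'accurate', 'best_accuracy': 0.81}
```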


## ✅ Why This Is Production-Ready

- ✔ Explicit state & schema
- ✔ Deterministic execution
- ✔ Auditable memory trail
- ✔ Retry & cost control
- ✔ Reflection drives flow

The same act, reflect, route structure carries over to:
- LLM-as-judge reflection
- Tool-using agents
- RAG pipelines
- Multi-agent systems