## **What is Self-Reflection in RAG?**
Self-reflection means the LLM evaluates its own output by asking:
“Is this clear, complete, and accurate?”

**To implement self-reflection in RAG with LangGraph, we’ll design a workflow where the agent:**

1. Generates an initial answer using retrieved context
2. Reflects on that answer with a dedicated self-critic LLM step
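3. If the critic is unsatisfied, retrieves again and regenerates the answer, up to a fixed number of attempts (the code below does not rewrite the query, though that is a natural extension)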

In [1]:
import os
from typing import List
from pydantic import BaseModel
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
from langgraph.graph import StateGraph, END

In [2]:
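### Load the LLM
import os
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv

load_dotenv()  # read OPENAI_API_KEY from a local .env file into the environment
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set")
llm = init_chat_model("openai:gpt-4o")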

In [3]:
docs = TextLoader("internal_docs.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

In [4]:
# -------------------------
# 2. State Definition
# -------------------------
class RAGReflectionState(BaseModel):
    question: str
    retrieved_docs: List[Document] = []
    answer: str = ""
    reflection: str = ""
    revised: bool = False
    attempts: int = 0

In [8]:
# -------------------------
# 3. Nodes
# -------------------------

# a. Retrieve
def retrieve_docs(state: RAGReflectionState) -> RAGReflectionState:
    docs = retriever.invoke(state.question)
    return state.model_copy(update={"retrieved_docs": docs})

# b. Generate Answer
def generate_answer(state: RAGReflectionState) -> RAGReflectionState:
    
    context = "\n\n".join([doc.page_content for doc in state.retrieved_docs])
    prompt = f"""
Use the following context to answer the question:

Context:
{context}

Question:
{state.question}
"""
    answer = llm.invoke(prompt).content.strip()
    return state.model_copy(update={"answer": answer, "attempts": state.attempts + 1})

In [5]:
# c. Self-Reflect
def reflect_on_answer(state: RAGReflectionState) -> RAGReflectionState:
    
    prompt = f"""
Reflect on the following answer to see if it fully addresses the question. 
State YES if it is complete and correct, or NO with an explanation.

Question: {state.question}

Answer: {state.answer}

Respond like:
Reflection: YES or NO
Explanation: ...
"""
    result = llm.invoke(prompt).content
    is_ok = "reflection: yes" in result.lower()
    return state.model_copy(update={"reflection": result, "revised": not is_ok})
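
The substring check above is sensitive to formatting: a reply such as `**Reflection:** yes` would be read as a rejection. One option is a slightly more tolerant parser, sketched below. The helper name `parse_reflection` is hypothetical (it is not part of LangChain or LangGraph), and the sketch assumes the critic roughly follows the `Reflection:` / `Explanation:` template from the prompt.

```python
import re
from typing import Tuple

# Hypothetical helper: pulls the verdict and explanation out of the critic's reply.
_VERDICT = re.compile(r"reflection\W*(yes|no)\b", re.IGNORECASE)
_EXPLANATION = re.compile(r"explanation\W*(.*)", re.IGNORECASE | re.DOTALL)

def parse_reflection(text: str) -> Tuple[bool, str]:
    """Return (is_ok, explanation). A missing or unparsable verdict counts as not OK."""
    verdict = _VERDICT.search(text)
    is_ok = bool(verdict) and verdict.group(1).lower() == "yes"
    explanation = _EXPLANATION.search(text)
    return is_ok, explanation.group(1).strip() if explanation else ""

print(parse_reflection("Reflection: NO\nExplanation: The answer omits two variants."))
```

Treating an unparsable reply as "not OK" is a conservative choice: it triggers another attempt rather than silently accepting an answer the critic never approved.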

In [6]:
# d. Finalizer
def finalize(state: RAGReflectionState) -> RAGReflectionState:
    return state

In [9]:
# -------------------------
# 4. LangGraph Workflow (cyclic, so not a DAG)
# -------------------------
builder = StateGraph(RAGReflectionState)

builder.add_node("retriever", retrieve_docs)
builder.add_node("responder", generate_answer)
builder.add_node("reflector", reflect_on_answer)
builder.add_node("done", finalize)

builder.set_entry_point("retriever")

builder.add_edge("retriever", "responder")
builder.add_edge("responder", "reflector")
builder.add_conditional_edges(
    "reflector",
    lambda s: "done" if not s.revised or s.attempts >= 2 else "retriever"
)

builder.add_edge("done", END)
graph = builder.compile()
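
The routing lambda packs two stopping rules into one expression. Written as a named function, the same logic can be unit-tested without building the graph. This is a sketch: `route_after_reflection` and `MAX_ATTEMPTS` are hypothetical names, and the function accepts any object with `revised` and `attempts` attributes, as `RAGReflectionState` has.

```python
from types import SimpleNamespace

MAX_ATTEMPTS = 2  # assumed cap, mirroring the `s.attempts >= 2` check in the lambda

def route_after_reflection(state) -> str:
    """Return the name of the next node: "done" or "retriever"."""
    if not state.revised:                # the critic accepted the answer
        return "done"
    if state.attempts >= MAX_ATTEMPTS:   # out of retries, so stop anyway
        return "done"
    return "retriever"                   # the critic rejected it, so try again

# SimpleNamespace stands in for RAGReflectionState in this offline check.
print(route_after_reflection(SimpleNamespace(revised=True, attempts=1)))
```

The function could be passed to `builder.add_conditional_edges("reflector", route_after_reflection)` in place of the lambda.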

In [10]:
# -------------------------
# 5. Run the Agent
# -------------------------
if __name__ == "__main__":
    user_query = "What are the transformer variants in production deployments?"
    init_state = RAGReflectionState(question=user_query)
    result = graph.invoke(init_state)

    print("\n🧠 Final Answer:\n", result["answer"])
    print("\n🔁 Reflection Log:\n", result["reflection"])
    print("🔄 Total Attempts:", result["attempts"])


🧠 Final Answer:
 The transformer variants mentioned in the production deployments are:

1. EfficientFormer
3. Reformer
4. LLaMA2
5. TinyBERT

🔁 Reflection Log:
 Reflection: NO  
Explanation: While the answer lists some transformer variants such as EfficientFormer, Reformer, LLaMA2, and TinyBERT, it is incomplete. There are numerous other transformer variants widely used in production deployments that are not mentioned, such as BERT, GPT-3, DistilBERT, RoBERTa, and T5. The answer should include a broader range of examples to fully address the question. Additionally, it should provide some context or explanation for why these specific variants are notable in the context of production usage.
🔄 Total Attempts: 2


## **Self-Reflection with RAGs and ReAct Agents in LangGraph & LangChain**

---

## The Core Idea

In advanced AI systems, we want agents that can:

* **Retrieve** real data (RAG).
* **Reason** and make decisions (ReAct).
* **Reflect** on their own output and **self-correct** (Self-Reflection).

So, the agent doesn’t just generate answers — it:

> 💭 *Thinks → Acts → Evaluates → Improves → Responds.*

This forms a **self-improving reasoning loop**. LangGraph supplies the cyclic, stateful workflow, and LangChain supplies the models, retrievers, and tools that run inside it.

---

## The Three Pillars

| Concept                                     | Description                                                                                          | Analogy                           |
| ------------------------------------------- | ---------------------------------------------------------------------------------------------------- | --------------------------------- |
| 🧩 **RAG (Retrieval-Augmented Generation)** | Brings **facts** by retrieving relevant context from external data sources                           | “Looking it up in your notes”     |
| 🧠 **ReAct Agent**                          | Combines **Reasoning + Action** — LLM decides when to think, when to use a tool, and when to respond | “Thinking + Doing”                |
| 🔁 **Self-Reflection**                      | Model **reviews its own output**, identifies flaws, and improves before finalizing                   | “Proofreading your own reasoning” |

Together, these make your agent **smart, grounded, and self-improving.**

---

## Concept Flow

Let’s visualize the **Self-Reflective RAG ReAct Loop**:

```
User Query
   ↓
[Retriever] → Get Context
   ↓
[Reasoning (ReAct)] → Think + Take Actions
   ↓
[Reflection] → Review Output, Check for Hallucination or Gaps
   ↓
[Refine / Retry] → Generate Improved Final Answer
```

Each stage is a **LangGraph node**. A conditional edge from the reflection node forms a **cycle** that repeats until the critic accepts the answer or a retry limit is reached.
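
Stripped of any framework, the flow above can be sketched in plain Python. The callables `retrieve`, `reason`, and `reflect` and the `max_rounds` cap are hypothetical stand-ins for the real retriever and LLM calls. The stubs return canned values so the control flow can be run offline.

```python
from typing import Callable, List, Tuple

def reflective_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    reason: Callable[[str, List[str], str], str],
    reflect: Callable[[str, str, List[str]], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Run retrieve -> reason -> reflect until the critic accepts or rounds run out."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    context = retrieve(question)
    critique = ""  # feedback from the previous round, empty on the first pass
    answer = ""
    for round_no in range(1, max_rounds + 1):
        answer = reason(question, context, critique)
        ok, critique = reflect(question, answer, context)
        if ok:
            break
    return answer, round_no

# Canned stand-ins for the retriever and the two LLM calls.
def fake_retrieve(q):
    return ["LangGraph models workflows as graphs with shared state."]

def fake_reason(q, ctx, critique):
    # First pass ignores the context; after a critique it quotes the context.
    return "LangGraph is a graph library." if not critique else ctx[0]

def fake_reflect(q, answer, ctx):
    ok = answer in ctx  # accept only answers taken from the context
    return ok, "" if ok else "Answer is not grounded in the context."

answer, rounds = reflective_loop("What is LangGraph?", fake_retrieve, fake_reason, fake_reflect)
print(rounds, answer)
```

Passing the critique back into `reason` is one design choice among several; a variant could instead rewrite the query and retrieve again.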

---

## What is Self-Reflection (in LLM terms)?

**Definition:**
Self-Reflection = When an AI model evaluates its *own output* (or reasoning steps) and *adjusts them* before producing a final response.

**Prompt Example:**

> “Here is your previous reasoning and answer. Please critique your logic, identify any gaps or errors, and rewrite an improved version.”

This turns the model from a *static responder* into a *self-evaluating reasoning system.*

---

## What are ReAct Agents?

ReAct = **Reasoning + Acting**
Introduced in the paper *“ReAct: Synergizing Reasoning and Acting in Language Models”* (Yao et al., 2022).

A ReAct agent:

* Thinks about the problem
* Chooses an action (like a search, retrieval, or calculation)
* Observes the result
* Continues reasoning with updated info

**Loop:**

```
Thought → Action → Observation → Thought → Final Answer
```

In LangChain and LangGraph, ReAct agents are the base for **tool-using, multi-step reasoning agents**.
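
The Thought → Action → Observation cycle can be illustrated without an LLM. In the sketch below, a scripted policy (`scripted_policy`, a hypothetical name) plays the model's role and a toy `calculator` tool plays the environment. A real agent would replace the policy with an LLM call and parse the action out of its text output.

```python
import operator
from typing import Callable, Dict, List, Optional, Tuple

# One step is (thought, action_name, action_input); the action "finish" ends the loop.
Step = Tuple[str, str, str]

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def calculator(expression: str) -> str:
    """Toy tool: evaluates 'a op b' where op is one of + - * /."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))

def scripted_policy(question: str, observations: List[str]) -> Step:
    """Stand-in for the LLM: picks the next step from what has been observed so far."""
    if not observations:
        return ("I need to compute the product first.", "calculator", "12 * 7")
    return (f"The tool returned {observations[-1]}, so I can answer.", "finish", observations[-1])

def react_loop(
    question: str,
    policy: Callable[[str, List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    trace: List[str] = []
    observations: List[str] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(question, observations)
        trace.append(f"Thought: {thought}")
        if action == "finish":
            trace.append(f"Final Answer: {action_input}")
            return action_input, trace
        trace.append(f"Action: {action}[{action_input}]")
        observation = tools[action](action_input)  # run the chosen tool
        trace.append(f"Observation: {observation}")
        observations.append(observation)
    return None, trace  # step budget exhausted without a final answer

final, trace = react_loop("What is 12 * 7?", scripted_policy, {"calculator": calculator})
print("\n".join(trace))
```

The `max_steps` budget plays the same role as the retry cap in a reflection loop: it guarantees termination even if the policy never chooses `finish`.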

---

## Why Combine RAG + ReAct + Reflection?

| Layer         | Role                                | Outcome                 |
| ------------- | ----------------------------------- | ----------------------- |
| 🧠 ReAct      | Gives reasoning and decision-making | Structured thinking     |
| 📚 RAG        | Gives factual grounding             | Real-world data         |
| 💭 Reflection | Gives self-evaluation               | Quality and correctness |

So your AI doesn’t just “talk smart” — it **thinks, checks, and improves**.

---

## Implementing in LangChain (Step-by-Step)

Let’s build this pipeline incrementally.

---

### Step 1: Imports

```python
# Note: LLMChain, RetrievalQA, and initialize_agent are legacy APIs,
# deprecated in recent LangChain releases.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, RetrievalQA
from langchain.agents import initialize_agent, Tool
from langchain_community.vectorstores import Chroma
```

---

### Step 2: Setup RAG (Retriever)

```python
embeddings = OpenAIEmbeddings()
db = Chroma(persist_directory="./rag_db", embedding_function=embeddings)
retriever = db.as_retriever(search_kwargs={"k": 3})
```

---

### Step 3: Create a Self-Reflection Prompt

```python
reflection_prompt = PromptTemplate(
    input_variables=["query", "previous_answer"],
    template="""
You are a reflective AI assistant. Evaluate your previous answer critically.

Question: {query}
Previous Answer: {previous_answer}

Identify:
1. Any factual inaccuracies.
2. Missing reasoning steps.
3. Improvements for clarity and correctness.

Then produce an improved final answer.
"""
)
```

---

### Step 4: Define LLMs

```python
llm_reason = ChatOpenAI(model="gpt-4o")     # main reasoning LLM
llm_reflect = ChatOpenAI(model="gpt-4o-mini")  # lighter model for reflection
```

---

### Step 5: Create RAG + Reasoning Chain

```python
qa_chain = RetrievalQA.from_chain_type(
    llm=llm_reason,
    retriever=retriever,
    chain_type="stuff"
)
```

---

### Step 6: Add Reflection Loop

```python
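# Build the reflection chain once rather than on every call.
reflection_chain = LLMChain(llm=llm_reflect, prompt=reflection_prompt)

def self_reflective_rag(query: str) -> str:
    # Step 1: Generate an initial answer from the retrieved context
    initial_answer = qa_chain.invoke({"query": query})["result"]

    # Step 2: Critique the draft and return the improved version
    reflection = reflection_chain.invoke(
        {"query": query, "previous_answer": initial_answer}
    )
    return reflection["text"]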
```

---

**Usage:**

```python
response = self_reflective_rag("Explain LangGraph and its advantages over LangChain.")
print(response)
```

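The pipeline retrieves context, generates a draft, and then makes a single reflection pass that critiques and rewrites the draft. Note that this reflection prompt sees only the question and the draft, not the retrieved context, so it cannot check the draft against the sources.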

---

## Self-Reflective RAG in LangGraph

Now let’s build the **graph-based version**, with explicit reasoning and reflection nodes.

---

### Node Structure

| Node          | Function                             |
| ------------- | ------------------------------------ |
| 🔍 `retrieve` | Fetch documents from vector store    |
| 💬 `reason`   | Use LLM to generate answer           |
| 🔁 `reflect`  | Critique and improve previous answer |
| 🏁 `END`      | Return improved answer               |

---

### Code Example

```python
from typing import List, TypedDict

from langchain_core.documents import Document
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI

# Declare the state keys explicitly so LangGraph merges each node's
# partial update into the shared state.
class AgentState(TypedDict, total=False):
    question: str
    context: List[Document]
    answer: str
    final_answer: str

llm_reason = ChatOpenAI(model="gpt-4o")
llm_reflect = ChatOpenAI(model="gpt-4o-mini")

workflow = StateGraph(AgentState)

# `retriever` is the one created in Step 2.
def retrieve_node(state: AgentState) -> dict:
    docs = retriever.invoke(state["question"])
    return {"context": docs}

def reason_node(state: AgentState) -> dict:
    q = state["question"]
    # Pass the chunk text, not the Document objects, into the prompt.
    ctx = "\n\n".join(doc.page_content for doc in state["context"])
    answer = llm_reason.invoke(f"Question: {q}\nContext: {ctx}\nAnswer thoughtfully.")
    return {"answer": answer.content}  # keep the text, not the message object

def reflect_node(state: AgentState) -> dict:
    q, ans = state["question"], state["answer"]
    reflection = llm_reflect.invoke(
        f"Question: {q}\nPrevious Answer: {ans}\nCritically evaluate and rewrite improved version."
    )
    return {"final_answer": reflection.content}

workflow.add_node("retrieve", retrieve_node)
workflow.add_node("reason", reason_node)
workflow.add_node("reflect", reflect_node)

workflow.add_edge(START, "retrieve")
workflow.add_edge("retrieve", "reason")
workflow.add_edge("reason", "reflect")
workflow.add_edge("reflect", END)

self_reflective_rag_graph = workflow.compile()

result = self_reflective_rag_graph.invoke({"question": "What is a ReAct agent in LangChain?"})
print(result["final_answer"])
```

---

### Output (illustrative; actual text varies by model and run):

```
ReAct agents combine reasoning and acting. 
They think step-by-step, decide which tools to use, observe outcomes, and revise their answers.
(Reflection: Added missing detail about LangChain integration.)
Final Answer: ReAct agents in LangChain are reasoning agents that use external tools and environment feedback to solve tasks iteratively.
```

---

## Integrating ReAct Tools

You can add **tools** like retrievers, web search, or calculators into the reasoning loop:

```python
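from langchain.agents import AgentType, Tool, initialize_agent

def retrieve_context(query: str) -> str:
    """Return retrieved chunks as one string, since the agent reads tool output as text."""
    docs = retriever.invoke(query)
    return "\n\n".join(doc.page_content for doc in docs)

tools = [
    Tool(
        name="rag_retriever",
        func=retrieve_context,
        description="Retrieve factual context from vector DB"
    )
]

# initialize_agent is a legacy API; the agent type is passed via `agent`, not `agent_type`.
react_agent = initialize_agent(
    tools=tools,
    llm=llm_reason,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

result = react_agent.invoke({"input": "Find and explain what Self-Reflection in AI means."})
print(result["output"])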
```

You can now **combine this agent with reflection logic** inside the LangGraph loop — creating a *self-correcting ReAct agent.*

---

## Advanced: ReAct + Reflection Loop (LangGraph Hybrid)

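This version repeats the reflection step until the reflection node says the output is “good” or an iteration cap is reached. The cap matters because a critic that never says “good” would otherwise loop forever.

```python
MAX_ROUNDS = 3  # hard cap so the loop always terminates

# `state` must already hold "question" and "context" (e.g. from retrieve_node).
state.update(reason_node(state))  # store the first draft in state["answer"]
for _ in range(MAX_ROUNDS):
    feedback = reflect_node(state)["final_answer"]
    text = getattr(feedback, "content", feedback)  # accept a message object or a plain string
    # Naive stop test: assumes the reflection prompt asks the critic to say "good" when satisfied.
    if "good" in text.lower():
        break
    state["answer"] = text  # feed the rewrite into the next round
```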

This creates an **iterative feedback loop**, similar to how humans revise drafts.

---

## Best Practices

| Aspect          | Tip                                                       |
| --------------- | --------------------------------------------------------- |
| 🧠 Reflection   | Use a smaller, focused model to save cost                 |
| 🔁 Loop Control | Limit reflection iterations (2–3 loops max)               |
| 📚 Context      | Always pass retrieved context to reflection prompt        |
| 🧩 Memory       | Use LangGraph’s state to preserve reasoning between steps |
| 🔍 Debug        | Print intermediate reasoning for transparency             |
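
The context tip matters in practice: a critic that sees only the question and the answer judges against its general knowledge rather than the retrieved documents. A minimal sketch of a prompt builder that includes the context follows. `build_reflection_prompt` and the `max_chars` budget are hypothetical, the template wording is illustrative, and the sample chunks are placeholder text.

```python
from typing import List

def build_reflection_prompt(question: str, answer: str, context_chunks: List[str],
                            max_chars: int = 4000) -> str:
    """Assemble a critic prompt that judges the answer against the retrieved context only."""
    context = "\n\n".join(context_chunks)[:max_chars]  # crude truncation to bound prompt size
    return (
        "Judge the answer using ONLY the context below, not outside knowledge.\n"
        "State YES if the answer is complete and supported by the context, "
        "or NO with an explanation.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        f"Answer: {answer}\n\n"
        "Respond like:\nReflection: YES or NO\nExplanation: ..."
    )

prompt = build_reflection_prompt(
    "Which variants are deployed?",
    "EfficientFormer and TinyBERT.",
    ["EfficientFormer runs on mobile.", "TinyBERT serves search ranking."],  # placeholder chunks
)
print(prompt)
```

Character-based truncation is a simplification. A token-aware budget, or passing only the top-ranked chunks, would be a reasonable refinement.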

---

## Real-World Applications

| Use Case                | Description                                  |
| ----------------------- | -------------------------------------------- |
| 💬 Chatbots             | Self-correct hallucinations automatically    |
| 🧑‍🏫 Tutors            | Evaluate and refine explanations             |
| 📚 Knowledge Assistants | Verify factual accuracy from internal docs   |
| ⚙️ Agents               | Auto-correct tool misuse or reasoning errors |

---

## Visualization of the Full Loop

```
User Query
   ↓
[Retriever] → Context
   ↓
[Reasoning Node] → Draft Answer
   ↓
[Reflection Node] → Critique + Improve
   ↺ (Loop if needed)
   ↓
[Final Output]
```

---

## Summary Table

| Concept         | Role                            | Framework Support                           |
| --------------- | ------------------------------- | ------------------------------------------- |
| RAG             | Factual grounding               | LangChain (Retrievers, VectorStores)        |
| ReAct           | Reason + Act agentic behavior   | LangChain Agents                            |
| Self-Reflection | Self-evaluation and improvement | LangGraph loops / reflection nodes          |
| LangGraph       | State-driven workflow           | Well suited to chaining RAG → Reason → Reflect |
| LangChain       | Core building blocks            | LLMs, tools, retrievers                     |

---

## TL;DR Summary

> **Self-Reflective ReAct RAG Agent =**
>
> 🧩 Retrieval (for facts) +
> 🧠 ReAct (for reasoning + action) +
> 💭 Reflection (for self-correction)
>
> implemented using **LangChain (logic)** + **LangGraph (workflow)**.

---

## Example Use Case Summary

Imagine an **AI Research Assistant**:

1. Retrieves papers (RAG).
2. Analyzes content step-by-step (ReAct).
3. Checks its reasoning and correctness (Self-Reflection).
4. Produces a verified summary.

