### 📖 Where We Are

**In the last notebook**, we built a powerful **Agentic RAG** system that could intelligently choose between different tools. This introduced the idea of a dynamic, reasoning-driven workflow.

**In this notebook**, we'll explore another advanced agentic pattern: **Corrective RAG (C-RAG)**. This architecture adds a layer of self-correction to the retrieval process. The system uses an LLM to grade the relevance of its retrieved documents and, if none of them are good enough, takes corrective action by rewriting the query and falling back to a web search. The result is a RAG system that can recover from initial retrieval failures instead of answering from poor context.

### 1. Understanding Corrective RAG (C-RAG)

Corrective Retrieval-Augmented Generation (C-RAG) is an advanced, adaptive RAG pattern designed to make retrieval systems more robust and accurate. Its core innovation is a built-in **self-correction loop**. Instead of blindly trusting the documents it retrieves, a C-RAG system actively assesses the quality of those documents and takes corrective actions if they are irrelevant.

#### The Problem: When Initial Retrieval Fails

A standard RAG pipeline can fail if the initial document retrieval is poor, leading to hallucinations or "I don't know" answers. C-RAG adds a quality control gate to recover from this.

**Analogy: The Expert Researcher with Internet Access 👩‍💻**

-   A **Standard RAG System** is a researcher who can only look in the company's private library. If the answer isn't there, they write a poor report or give up.
-   A **Corrective RAG System** is a more seasoned researcher:
    1.  They first check the internal company library (**Retrieve**).
    2.  They critically read the documents they found and ask, "Does this actually answer my question?" (**Grade**).
    3.  **If the documents are good**, they write their report based on them (**Generate**).
    4.  **If the documents are bad**, they think, "The internal library is insufficient. I need to rephrase my search and look elsewhere." (**Transform Query**). Then, they use a public search engine (**Web Search**) to find the correct information before writing their final, well-sourced report (**Generate**).
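
The researcher's four steps can be sketched as plain control flow before any real components exist. This is a minimal illustration, not the notebook's implementation: `corrective_flow_sketch` and the lambda stand-ins are hypothetical names, and the "library" is a two-item list so the example runs without API keys.

```python
from typing import Callable, List


def corrective_flow_sketch(
    question: str,
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    web_search: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Retrieve -> Grade -> (Transform Query -> Web Search) -> Generate."""
    docs = retrieve(question)                                   # Retrieve
    good_docs = [d for d in docs if is_relevant(question, d)]   # Grade
    if not good_docs:                                           # nothing usable was found
        better_question = rewrite(question)                     # Transform Query
        good_docs = web_search(better_question)                 # Web Search
    return generate(question, good_docs)                        # Generate


# Toy stand-ins (hypothetical) for the real retriever, grader, rewriter, and LLM.
library = ["Agents use short-term and long-term memory.", "Bread needs yeast."]
answer = corrective_flow_sketch(
    "What is the capital of France?",
    retrieve=lambda q: library,
    is_relevant=lambda q, d: "France" in d,
    rewrite=lambda q: q.rstrip("?") + " (official capital city)?",
    web_search=lambda q: ["Paris is the capital of France."],
    generate=lambda q, docs: " ".join(docs),
)
print(answer)
```

Because neither library entry mentions France, the grade step rejects both and the sketch falls through to the rewrite and web-search branch.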

In [26]:
# --- Environment Setup ---
import os
from dotenv import load_dotenv
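load_dotenv()  # Load environment variables from a local .env file

# Check that the API keys for the services we'll use are available.
# Assigning os.getenv(...) straight back into os.environ raises a TypeError when a key is unset.
for key in ("GROQ_API_KEY", "TAVILY_API_KEY"):
    if not os.getenv(key):
        raise EnvironmentError(f"{key} is not set; add it to your .env file.")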

In [27]:
# --- 2. Build the Index (Primary Knowledge Base) ---
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Define the URLs for our knowledge base.
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

# Load, split, and index the documents into a FAISS vector store.
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(chunk_size=500, chunk_overlap=0)
doc_splits = text_splitter.split_documents(docs_list)
vectorstore = FAISS.from_documents(documents=doc_splits, embedding=HuggingFaceEmbeddings(model="all-MiniLM-L6-v2"))
retriever = vectorstore.as_retriever()
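
To make the retrieval step less of a black box, here is a toy sketch of what a vector-store retriever does conceptually: score every stored vector against the query vector and return the best matches. The three-dimensional vectors, `cosine`, and `top_k` are all made up for illustration. The real index above uses learned embeddings and FAISS's own search, and its distance metric may differ from the cosine similarity used here.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], index: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k stored texts whose vectors are most similar to the query."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical hand-made "embeddings", one axis per topic.
toy_index = {
    "agent memory": [1.0, 0.0, 0.0],
    "prompt engineering": [0.0, 1.0, 0.0],
    "adversarial attacks": [0.0, 0.0, 1.0],
}
results = top_k([0.9, 0.1, 0.0], toy_index, k=2)
print([text for text, _ in results])
```

The point for C-RAG is that a retriever always returns its top matches, even when none of them is actually relevant, which is why a separate grading step is useful.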

### 3. Building the C-RAG Components
We will build our Corrective RAG system from modular components using standard LangChain chains.

In [29]:
# --- Component 1: The Retrieval Grader ---
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field

class GradeDocuments(BaseModel):
    """Binary score for relevance check on retrieved documents."""
    binary_score: str = Field(description="Documents are relevant to the question, 'yes' or 'no'")

llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeDocuments)

system = """You are a grader assessing the relevance of a retrieved document to a user question. 
    If the document contains keywords or semantic meaning related to the question, grade it as relevant. 
    Give a binary score 'yes' or 'no' to indicate whether the document is relevant."""
grade_prompt = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", "Retrieved document: \n\n {document} \n\n User question: {question}"),
])

retrieval_grader = grade_prompt | structured_llm_grader
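
The grader's `binary_score` is a free-form string, so it can be worth normalizing it before comparing against `"yes"`. The sketch below shows one way to do that and to filter a list of documents. `ToyGrade`, `is_yes`, `filter_relevant`, and `keyword_grader` are hypothetical helpers, and the keyword grader is only a stand-in for the LLM call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyGrade:
    """Stand-in for the structured grader output (same field name as GradeDocuments)."""
    binary_score: str


def is_yes(score: str) -> bool:
    """Treat 'yes', 'Yes ', and quoted variants as a positive grade."""
    return score.strip().strip("'\"").lower() == "yes"


def filter_relevant(question: str, docs: List[str], grade_fn: Callable[[str, str], ToyGrade]) -> List[str]:
    """Keep only the documents that the grader marks as relevant."""
    return [d for d in docs if is_yes(grade_fn(question, d).binary_score)]


def keyword_grader(question: str, doc: str) -> ToyGrade:
    """Toy grader: relevant if the question and document share at least one word."""
    overlap = set(question.lower().split()) & set(doc.lower().split())
    return ToyGrade("Yes " if overlap else "no")


kept = filter_relevant(
    "agent memory types",
    ["Memory in an agent system", "Baking bread at home"],
    keyword_grader,
)
print(kept)
```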

In [30]:
# --- Component 2: The Answer Generator ---
from langchain import hub
from langchain_core.output_parsers import StrOutputParser

prompt = hub.pull("rlm/rag-prompt")
rag_chain = prompt | llm | StrOutputParser()



In [31]:
# --- Component 3: The Question Rewriter ---
system = """You are a question re-writer that converts an input question to a better version that is optimized for web search.
     Look at the input and try to reason about the underlying semantic intent / meaning."""
re_write_prompt = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", "Here is the initial question: \n\n {question} \n Formulate an improved question."),
])

question_rewriter = re_write_prompt | llm | StrOutputParser()

In [32]:
# --- Component 4: The Web Search Tool (Fallback) ---
from langchain_community.tools.tavily_search import TavilySearchResults
# `max_results` is the parameter that limits the number of results returned.
web_search_tool = TavilySearchResults(max_results=3)
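
Web search output is less predictable than vector-store output, so a small defensive helper can turn whatever comes back into plain text. This sketch assumes results are normally a list of dictionaries with a `content` key and that a plain string may come back instead in some cases (for example, an error message). `web_results_to_text` is a hypothetical helper, not part of the Tavily integration.

```python
from typing import Any, List


def web_results_to_text(results: Any) -> str:
    """Join the 'content' field of each search result into one string."""
    if isinstance(results, str):
        # Assumption: a bare string is passed through as a single text blob.
        return results
    parts: List[str] = []
    for item in results or []:
        # Skip entries that are not dictionaries or that lack usable content.
        if isinstance(item, dict) and item.get("content"):
            parts.append(str(item["content"]))
    return "\n".join(parts)


sample = [
    {"url": "https://example.com/a", "content": "First snippet."},
    {"url": "https://example.com/b"},  # no content, so it is skipped
    {"content": "Second snippet."},
]
print(web_results_to_text(sample))
```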

### 4. Implementing the Corrective RAG Logic
Now, we'll create a main function that orchestrates the entire self-correcting loop using the components we just built.

In [33]:
from langchain_core.documents import Document

def corrective_rag_flow(question: str):
    """Implements the full Corrective RAG workflow."""
    
    print("--- STEP 1: INITIAL RETRIEVAL ---")
    documents = retriever.invoke(question)
    print(f"Retrieved {len(documents)} documents.")
    
    print("\n--- STEP 2: GRADING DOCUMENTS ---")
    filtered_docs = []
    for doc in documents:
        grade = retrieval_grader.invoke({"question": question, "document": doc.page_content})
        # Normalize the score so variants like "Yes" or "yes " still count as relevant.
        if grade.binary_score.strip().lower() == "yes":
            print("- Document RELEVANT")
            filtered_docs.append(doc)
        else:
            print("- Document NOT RELEVANT")
    
    # If all retrieved documents are irrelevant, we trigger the corrective actions.
    if not filtered_docs:
        print("\n--- STEP 3a: CORRECTIVE ACTION - REWRITING QUERY ---")
        new_question = question_rewriter.invoke({"question": question})
        print("Rewritten Question:", new_question)
        
        print("\n--- STEP 3b: CORRECTIVE ACTION - WEB SEARCH ---")
        # The Tavily tool returns a list of dictionaries, so we join
        # the 'content' value of each result into a single string.
        web_search_results = web_search_tool.invoke({"query": new_question})
        web_content = "\n".join([d["content"] for d in web_search_results])
        
        # Add the web search results as a new document to our context.
        filtered_docs.append(Document(page_content=web_content))
    
    print("\n--- STEP 4: GENERATING FINAL ANSWER ---")
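    # Pass the documents' text rather than the Document objects themselves,
    # so the prompt does not contain object reprs and metadata.
    context = "\n\n".join(doc.page_content for doc in filtered_docs)
    final_answer = rag_chain.invoke({"context": context, "question": question})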
    
    return final_answer

In [34]:
# Run the workflow with a sample question.
final_answer = corrective_rag_flow("What are the types of agent memory?")
print("\n--- FINAL ANSWER ---")
print(final_answer)

--- STEP 1: INITIAL RETRIEVAL ---
Retrieved 4 documents.

--- STEP 2: GRADING DOCUMENTS ---
- Document NOT RELEVANT
- Document RELEVANT
- Document RELEVANT
- Document NOT RELEVANT

--- STEP 4: GENERATING FINAL ANSWER ---

--- FINAL ANSWER ---
The types of agent memory are short-term memory and long-term memory. Short-term memory utilizes in-context learning, while long-term memory provides the capability to retain and recall information over extended periods. This is achieved by leveraging an external vector store and fast retrieval.
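
In the run above, two of the four documents passed the grader, so Step 3 (rewrite and web search) was skipped. The rule that decides this can be isolated into a small function, sketched below. `needs_web_search` and `min_relevant` are hypothetical names. With the default `min_relevant=1` the function mirrors the notebook's `if not filtered_docs` check, and a higher value is one possible way to make the fallback fire more eagerly.

```python
from typing import Sequence


def needs_web_search(relevant_docs: Sequence[object], min_relevant: int = 1) -> bool:
    """Return True when too few documents passed the grader."""
    return len(relevant_docs) < min_relevant


print(needs_web_search(["doc_b", "doc_c"]))          # two documents passed, as in the run above
print(needs_web_search([]))                          # nothing passed, so the fallback fires
print(needs_web_search(["doc_b"], min_relevant=2))   # stricter threshold
```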


### 🔑 Key Takeaways

* **C-RAG is a Robust Pattern**: Corrective RAG builds a self-correction loop into your retrieval process, making it more resilient to initial retrieval failures.
* **Grading is the Quality Gate**: The core of C-RAG is the **grading** step, where an LLM assesses the relevance of retrieved documents. This allows the system to identify when it needs to take corrective action.
* **Correction through Rewriting and Fallbacks**: When retrieval fails, the system can **rewrite** the query for better clarity and/or fall back to an external knowledge source like a **web search** to find the necessary information.
* **Modular Implementation**: You can implement this pattern by creating separate, modular components (grader, rewriter, generator) and orchestrating them with a controlling function or a graph framework like LangGraph.