## **Corrective RAG**

Corrective RAG (CRAG) improves response quality by evaluating the retrieved documents and correcting the retrieval step when needed, combining a vector database with web search.

#### **Motivation** 

Despite improving information retrieval and response generation, traditional RAG can fall short when the retrieved information is outdated or irrelevant.

CRAG addresses these limitations by:

- Leveraging a pre-existing knowledge base
- Evaluating the relevance of retrieved information
- Searching the web for information when necessary
- Refining and combining knowledge from multiple sources
- Generating human-like responses based on the most appropriate knowledge

#### **Key Components**

1. **FAISS index** - Vector store for efficient similarity search
2. **Retrieval Evaluator** - Scores the relevance of each retrieved document to the query
3. **Knowledge Refinement** - Extracts key information from documents when necessary
4. **Web Search Query Rewriter** - Optimizes queries for web search when local knowledge is insufficient
5. **Response Generator** - Creates responses based on the selected knowledge

#### **Method Details**

1. **Document Retrieval**
- Performs a similarity search in FAISS to find the most relevant documents
- Retrieves the top-k documents

2. **Document Evaluation**
- Calculates a relevance score between 0 and 1 for each retrieved document and selects an action based on the highest score

3. **Action based on the relevance score**
- If score > 0.7: uses the best-scoring document as is
- If score < 0.3: corrects by performing a web search with a rewritten query
- If 0.3 ≤ score ≤ 0.7: corrects by combining the most relevant document with web search results

4. **Response Generation**
- Uses the LLM with a prompt that contains the query and the selected knowledge
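
The routing rule in step 3 can be sketched in isolation, without any LLM or vector store. This is a minimal illustration: the function name `choose_action`, the action labels, and the handling of an empty score list are assumptions made for this sketch, while the 0.7 and 0.3 thresholds come from the steps above.

```python
from typing import List, Tuple

# Thresholds taken from the method description above.
UPPER_THRESHOLD = 0.7
LOWER_THRESHOLD = 0.3


def choose_action(scores: List[float]) -> Tuple[str, int]:
    """Return a (hypothetical) action label and the index of the best-scoring document."""
    if not scores:
        # Assumption: with nothing retrieved, fall back to web search.
        return "web_search", -1
    # Index of the highest score (first one wins on ties).
    best_index = max(range(len(scores)), key=scores.__getitem__)
    best_score = scores[best_index]
    if best_score > UPPER_THRESHOLD:
        return "use_document", best_index
    if best_score < LOWER_THRESHOLD:
        return "web_search", best_index
    # Scores from 0.3 to 0.7 inclusive are treated as ambiguous.
    return "combine", best_index


print(choose_action([0.8, 0.9, 0.0]))  # ('use_document', 1)
print(choose_action([0.1, 0.2]))       # ('web_search', 1)
print(choose_action([0.5, 0.4]))       # ('combine', 0)
```

Keeping the decision in a small pure function like this makes the boundary cases (exactly 0.3 or 0.7) easy to check before wiring in the LLM-based evaluator.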

--- 

#### **LLM used**

In [1]:
from langchain_ollama import ChatOllama 

llm = ChatOllama(
    model='llama3.2',
    temperature=0.2,
    verbose=True
)

llm.invoke("Hey, How are you?")



AIMessage(content="I'm just a language model, so I don't have emotions or feelings like humans do. However, I'm functioning properly and ready to assist you with any questions or tasks you may have! How can I help you today?", additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-12-31T16:39:23.291207Z', 'done': True, 'done_reason': 'stop', 'total_duration': 19967040958, 'load_duration': 1548580000, 'prompt_eval_count': 31, 'prompt_eval_duration': 15100790792, 'eval_count': 47, 'eval_duration': 3311969916, 'logprobs': None, 'model_name': 'llama3.2', 'model_provider': 'ollama'}, id='lc_run--019b7546-f582-72d1-b912-1f3f9bf1dd38-0', usage_metadata={'input_tokens': 31, 'output_tokens': 47, 'total_tokens': 78})

--- 

#### **Embedding Model**

In [2]:
from langchain_huggingface import HuggingFaceEmbeddings 

embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

text_doc = "Hey, How are you?"
sample_embeddings = embedding_model.embed_query(text_doc)

print(f"Length of embedding : {len(sample_embeddings)}")
print(f"Sample embedding : {sample_embeddings[:100]}")

Length of embedding : 384
Sample embedding : [0.0045777312479913235, 0.0016591983148828149, 0.10974686592817307, 0.0874689593911171, 0.017562976107001305, -0.058710481971502304, 0.06303051114082336, 0.0037432187236845493, -0.07938601821660995, 0.01719500683248043, -0.03582882508635521, -0.014464635401964188, -0.03638741746544838, 0.002508581383153796, 0.028078017756342888, -0.057133182883262634, 0.07289010286331177, -0.11976979672908783, -0.14799512922763824, 0.05696568638086319, -0.006158941425383091, 0.02934739738702774, 0.03251346945762634, 0.04774465039372444, -0.06293025612831116, -0.0452665351331234, -0.011549439281225204, 0.037300825119018555, -0.05545136332511902, -0.061443645507097244, -0.09237056970596313, 0.05370642989873886, -0.08650102466344833, -0.006575974635779858, 0.02130102552473545, 0.007713349536061287, -0.04249992594122887, -0.13462883234024048, 0.009984271600842476, -0.00842222198843956, 0.0261904988437891, 0.005356464069336653, -0.017299529165029526, -0.014943229

---

#### **Load the Documents** 

In [3]:
from langchain_community.document_loaders import PyPDFLoader

pdf_path = "../data/Understanding_Climate_Change.pdf"
loader = PyPDFLoader(pdf_path)

pages = loader.load()
print(f"Number of pages : {len(pages)}")

Number of pages : 33


Split the pages into overlapping chunks:

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter 

text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)

chunks = text_splitter.split_documents(pages)
print(f"Number of chunks : {len(chunks)}")

Number of chunks : 215
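
To build intuition for what `chunk_size` and `chunk_overlap` mean, here is a simplified fixed-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); the helper `sliding_window_chunks` is a hypothetical name used only for illustration.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3):
    print(chunk)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk repeats the last three characters of the previous one, which is the role the 50-character overlap plays in the cell above: content that straddles a boundary still appears whole in at least one chunk.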


---

#### **Create a vector_store**

In [5]:
import faiss
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_community.vectorstores import FAISS

embedding_dim = len(embedding_model.embed_query("hello world"))
index = faiss.IndexFlatL2(embedding_dim)

vector_store = FAISS(
    embedding_function=embedding_model,
    index=index,
    docstore=InMemoryDocstore(),
    index_to_docstore_id={},
)

vector_store.add_documents(documents=chunks)

['54846de4-1464-46a1-b637-1c32dd07a03b',
 'cf208dc2-df6f-437d-98a5-eee8e515a5a1',
 'f540ca74-726a-4327-acb6-c32b160fa807',
 '5a4f214a-c2a0-43be-8d9a-d640b3b95c0a',
 '9441328e-ec34-4c40-9b1e-fda91c8a573c',
 'f72750b5-4247-4f51-a227-e9926b03e4e6',
 'a45de325-a520-49b3-a368-ff167afba53b',
 '68d895fc-da25-4c22-80e7-49a36237f69a',
 '2eed3218-2b85-46fc-acd8-d2071667dc2c',
 '413bb128-8ea4-44f2-b26a-12b1ce322039',
 '9a62b8ca-991d-4590-a039-1f7f07ab948d',
 '329cb904-165c-4425-9ac9-8950fabcc4ec',
 'a924fbb0-a048-4e17-9d32-8d8db123482b',
 'f68a0ddc-d2a4-4efb-9c81-ddcecdee0f7e',
 '825e3699-eb58-4953-9d9b-b4b26e3f5d9f',
 '6d1dd320-97e8-4e65-aa52-1256573e1d1b',
 '53cacfb8-d219-4972-a5a5-0038fbb32f41',
 '4ac11100-e92f-43d0-997c-03fda1d22ffe',
 '7f4e6733-5616-47d3-8248-e02462db1b79',
 '2efe907b-ada6-45cf-8726-9f17b70dcc6f',
 'cb07d4dd-61aa-43ed-89e5-1d2084a22cf9',
 'c378e583-cf27-43fc-928d-68910fb43b89',
 '8776071a-a005-47f0-8e90-836be720dffd',
 '73b9691d-ecd2-4c2d-8501-15dc0bcc2246',
 '084ea6bb-bfb8-
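
`faiss.IndexFlatL2` performs an exhaustive search: it compares the query vector against every stored vector and ranks them by squared Euclidean distance. The pure-Python sketch below mirrors that idea on toy 2-D vectors; the function names and data are illustrative assumptions and do not use the FAISS API.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(
    query: Sequence[float], vectors: List[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k stored vectors closest to the query."""
    distances = [(i, squared_l2(query, v)) for i, v in enumerate(vectors)]
    distances.sort(key=lambda pair: pair[1])  # smallest distance first
    return distances[:k]


toy_vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
for index, distance in flat_l2_search([0.9, 1.2], toy_vectors, k=2):
    print(index, round(distance, 2))
```

With 215 chunks of 384 dimensions, a brute-force scan like this is cheap, which is why a flat index is a reasonable choice for a notebook-sized corpus.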

#### **Web search tools**

In [8]:
from langchain_community.tools import DuckDuckGoSearchResults

In [10]:
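# Request JSON output so that parse_search_results() can decode it with json.loads.
# The default output format is a plain string; output_format is assumed to be
# supported by the installed langchain_community release.
search = DuckDuckGoSearchResults(output_format="json")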

#### **Retrieval evaluator | Knowledge Refinement | Query Rewriter**

In [12]:
from pydantic import BaseModel, Field
from typing import Annotated, List, Tuple
from langchain_core.prompts import PromptTemplate

## Structured output schema for the retrieval evaluator
class RetrievalEvaluatorInput(BaseModel):
    relevancy_score: Annotated[float, Field(description="Relevancy score of the document with respect to the query, between 0 and 1.")]

## Retrieval evaluator
def retrieval_evaluator(query: str, document: str) -> float:
    prompt_template = (
        "On a scale from 0 to 1, how relevant is the following document to the query?\n"
        "Query: {query}\nDocument: {document}\nRelevance score:"
    )
    prompt = PromptTemplate(
        template=prompt_template,
        input_variables=['query', 'document']
    )

    # Configure the LLM to return output that matches the schema above
    llm_relevancy_score = llm.with_structured_output(RetrievalEvaluatorInput)
    chain = prompt | llm_relevancy_score
    result = chain.invoke({'query': query, 'document': document}).relevancy_score
    return result

## Knowledge Refinement 
class KnowledgeRefinementInput(BaseModel):
    key_points: Annotated[str, Field(description='Key information extracted from the document, as bullet points.')]

def knowledge_refinement(document: str) -> str:
    prompt = PromptTemplate(
        input_variables=["document"],
        template="Extract the key information from the following document in bullet points:\n{document}\nKey points:"
    )
    chain = prompt | llm.with_structured_output(KnowledgeRefinementInput)
    input_variables = {"document": document}
    result = chain.invoke(input_variables).key_points
    return result 

# Web Search Query Rewriter
class QueryRewriterInput(BaseModel):
    query: Annotated[str, Field(description="The rewritten query, optimized for web search.")]
    
def rewrite_query(query: str) -> str:
    prompt = PromptTemplate(
        input_variables=["query"],
        template="Rewrite the following query to make it more suitable for a web search:\n{query}\nRewritten query:"
    )
    chain = prompt | llm.with_structured_output(QueryRewriterInput)
    input_variables = {"query": query}
    return chain.invoke(input_variables).query

In [13]:
import json

def parse_search_results(results_string: str) -> List[Tuple[str, str]]:
    """
    Parse a JSON string of search results into a list of title-link tuples.

    Args:
        results_string (str): A JSON-formatted string containing search results.

    Returns:
        List[Tuple[str, str]]: A list of tuples, where each tuple contains the title and link of a search result.
                               If parsing fails, an empty list is returned.
    """
    try:
        # Attempt to parse the JSON string
        results = json.loads(results_string)
        # Extract and return the title and link from each result
        return [(result.get('title', 'Untitled'), result.get('link', '')) for result in results]
    except json.JSONDecodeError:
        # Handle JSON decoding errors by returning an empty list
        print("Error parsing search results. Returning empty list.")
        return []

In [14]:
def retrieve_documents(query: str, faiss_index: FAISS, k: int = 3) -> List[str]:
    docs = faiss_index.similarity_search(query, k=k)
    return [doc.page_content for doc in docs]

def evaluate_documents(query: str, documents: List[str]) -> List[float]:
    return [retrieval_evaluator(query, doc) for doc in documents]

def perform_web_search(query: str) -> Tuple[str, List[Tuple[str, str]]]:
    rewritten_query = rewrite_query(query)
    web_results = search.run(rewritten_query)
    web_knowledge = knowledge_refinement(web_results)
    sources = parse_search_results(web_results)
    return web_knowledge, sources

def generate_response(query: str, knowledge: str, sources: List[Tuple[str, str]]) -> str:
    response_prompt = PromptTemplate(
        input_variables=["query", "knowledge", "sources"],
        template="Based on the following knowledge, answer the query. Include the sources with their links (if available) at the end of your answer:\nQuery: {query}\nKnowledge: {knowledge}\nSources: {sources}\nAnswer:"
    )
    input_variables = {
        "query": query,
        "knowledge": knowledge,
        "sources": "\n".join([f"{title}: {link}" if link else title for title, link in sources])
    }
    response_chain = response_prompt | llm
    return response_chain.invoke(input_variables).content

---

#### **CRAG process**

In [15]:
def crag_process(query: str, faiss_index: FAISS) -> str:
    """
    Process a query by retrieving, evaluating, and using documents or performing a web search to generate a response.

    Args:
        query (str): The query string to process.
        faiss_index (FAISS): The FAISS index used for document retrieval.

    Returns:
        str: The generated response based on the query.
    """
    print(f"\nProcessing query: {query}")

    # Retrieve and evaluate documents
    retrieved_docs = retrieve_documents(query, faiss_index)
    eval_scores = evaluate_documents(query, retrieved_docs)
    
    print(f"\nRetrieved {len(retrieved_docs)} documents")
    print(f"Evaluation scores: {eval_scores}")
    
    # Determine action based on evaluation scores
    max_score = max(eval_scores)
    sources = []
    
    if max_score > 0.7:
        print("\nAction: Correct - Using retrieved document")
        best_doc = retrieved_docs[eval_scores.index(max_score)]
        final_knowledge = best_doc
        sources.append(("Retrieved document", ""))
    elif max_score < 0.3:
        print("\nAction: Incorrect - Performing web search")
        final_knowledge, sources = perform_web_search(query)
    else:
        print("\nAction: Ambiguous - Combining retrieved document and web search")
        best_doc = retrieved_docs[eval_scores.index(max_score)]
        # Refine the retrieved knowledge
        retrieved_knowledge = knowledge_refinement(best_doc)
        web_knowledge, web_sources = perform_web_search(query)
        # Both values are strings, so join them as two items rather than character by character
        final_knowledge = "\n".join([retrieved_knowledge, web_knowledge])
        sources = [("Retrieved document", "")] + web_sources

    print("\nFinal knowledge:")
    print(final_knowledge)
    
    print("\nSources:")
    for title, link in sources:
        print(f"{title}: {link}" if link else title)

    # Generate response
    print("\nGenerating response...")
    response = generate_response(query, final_knowledge, sources)

    print("\nResponse generated")
    return response

In [17]:
query = "What are the main causes of climate change?"
result = crag_process(query, vector_store)
print(f"Query: {query}")
print(f"Answer: {result}")


Processing query: What are the main causes of climate change?

Retrieved 3 documents
Evaluation scores: [0.8, 0.9, 0.0]

Action: Correct - Using retrieved document

Final knowledge:
provide a historical record that scientists use to understand past climate conditions and 
predict future trends. The evidence overwhelmingly shows that recent changes are primarily 
driven by human activities, particularly the emission of greenhouse gases. 
Chapter 2: Causes of Climate Change 
Greenhouse Gases

Sources:
Retrieved document

Generating response...

Response generated
Query: What are the main causes of climate change?
Answer: The main causes of climate change can be attributed to several factors, but the most significant contributor is the emission of greenhouse gases (GHGs). The evidence overwhelmingly shows that recent changes are primarily driven by human activities.

Historically, scientists have used various records to understand past climate conditions and predict future trends. Some o