# **Introduction**
Self-Retrieval-Augmented Generation (Self-RAG) is a variant of Retrieval-Augmented Generation (RAG) in which the model iteratively refines its own search queries. Traditional RAG passes the user query to the retriever unchanged. Self-RAG instead lets the LLM generate and improve the search queries itself, which improves both the relevance of the retrieved documents and the quality of the generated response.

# **Concepts of Self-RAG**
Self-RAG introduces query self-refinement and iterative retrieval, making it more dynamic and capable of handling complex queries. It consists of the following key components:

1. **Query Expansion & Refinement**

  * The model rephrases or expands the user query to improve retrieval performance.
  * This ensures that retrieved documents are more relevant to the intent of the query.
2. **Self-Iterative Retrieval**

  * Instead of retrieving documents only once, the model refines its query iteratively to get the most accurate results.
3. **Feedback Loop**

  * The retrieved results influence further query modifications.
  * The system uses a feedback loop to adjust and refine the generated text.
4. **Multi-Step Reasoning**

  * Instead of generating an answer from a single retrieval pass, the model retrieves, processes, and refines information in multiple steps.
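
The four components above can be sketched as a single loop. The example below is a minimal, dependency-free illustration and not the LangChain implementation used later in this notebook. The token-overlap `retrieve` function and the rule-based `refine_query` function are hypothetical stand-ins: a real system would use a proper retriever and would ask the LLM to rewrite the query.

```python
from typing import List, Set

# Toy corpus standing in for a real document store (illustrative only).
CORPUS = [
    "The French Revolution began in 1789 with the storming of the Bastille.",
    "Louis XVI was executed in 1793 during the Reign of Terror.",
    "The Revolution led to the rise of Napoleon Bonaparte in 1799.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve(query: str, k: int = 2) -> List[str]:
    """Return up to k documents ranked by token overlap with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), doc) for doc in CORPUS]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [doc for _, doc in ranked[:k]]


def refine_query(query: str, docs: List[str]) -> str:
    """Hypothetical refiner: append up to three unseen terms from the top document."""
    if not docs:
        return query
    new_terms = sorted(tokenize(docs[0]) - tokenize(query))
    if not new_terms:
        return query
    return query + " " + " ".join(new_terms[:3])


def iterative_retrieve(query: str, max_iter: int = 3) -> List[str]:
    """Retrieve, refine the query from the results, and repeat."""
    collected: List[str] = []
    for _ in range(max_iter):
        docs = retrieve(query)
        new_docs = [d for d in docs if d not in collected]
        if not new_docs:  # nothing new was found, so the loop has converged
            break
        collected.extend(new_docs)
        query = refine_query(query, docs)  # feedback: results shape the next query
    return collected


result = iterative_retrieve("When was Louis XVI executed?")
print(result)
```

The loop stops either when a pass returns no new documents or when `max_iter` is reached, which keeps the number of retrieval calls bounded.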

# **Applications of Self-RAG**
Self-RAG is beneficial in various AI-driven applications, including:

1. Legal and Research Document Analysis

  * Helps refine complex legal queries for better case law retrieval.
2. Medical Q&A Systems

  * Improves retrieval of patient-specific or disease-related information.
3. Enterprise Knowledge Management

  * Enhances internal company knowledge retrieval by refining queries based on contextual clues.
4. Customer Support Chatbots

  * Enables intelligent, multi-step query resolution.
5. Code Search and Documentation Retrieval

  * Enhances search in large repositories like GitHub or API documentation.

# **Implementation of Self-RAG in GenAI**

In [1]:
!pip install -qU langchain langchain-openai langchain-community faiss-cpu


In [7]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.documents import Document
from typing import List, Optional, Dict
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [3]:
from google.colab import userdata

# Read the OpenAI API key from Colab's secret store (only available inside Google Colab)
openai_api = userdata.get("OPENAI_API_KEY")

In [4]:
# 1. Initialize components
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, openai_api_key=openai_api)
embeddings = OpenAIEmbeddings(model="text-embedding-3-small", openai_api_key=openai_api)

In [5]:
documents = [
    Document(page_content="The French Revolution began in 1789 with the storming of the Bastille."),
    Document(page_content="Louis XVI was executed in 1793 during the Reign of Terror."),
    Document(page_content="The Revolution led to the rise of Napoleon Bonaparte in 1799.")
]

In [9]:
!pip install rank_bm25

In [10]:
# Initialize hybrid retriever (BM25 + FAISS)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
split_docs = text_splitter.split_documents(documents)

# Sparse retriever
bm25_retriever = BM25Retriever.from_documents(split_docs)
bm25_retriever.k = 2

# Dense retriever
vectorstore = FAISS.from_documents(split_docs, embeddings)
faiss_retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

In [11]:
# 2. Define Self-RAG components
class SelfRAGProcessor:
    def __init__(self):
        self.retrieval_triggers = ["[Retrieve]", "[Verify]", "[Expand]"]

    def _detect_retrieval_need(self, text: str) -> bool:
        return any(trigger in text for trigger in self.retrieval_triggers)

    def _hybrid_retrieve(self, query: str) -> List[Document]:
        # Combine BM25 and FAISS results
        bm25_docs = bm25_retriever.invoke(query)
        faiss_docs = faiss_retriever.invoke(query)
        return self._merge_docs(bm25_docs + faiss_docs)

    def _merge_docs(self, docs: List[Document]) -> List[Document]:
        # Remove duplicates (by page content) while preserving order
        seen = set()
        unique_docs = []
        for doc in docs:
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                unique_docs.append(doc)
        return unique_docs
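
`_merge_docs` concatenates the BM25 and FAISS results and drops duplicates, so a document found by both retrievers ranks no higher than one found by only one of them. One common alternative, which this notebook does not use, is reciprocal rank fusion (RRF). The sketch below works on plain document IDs. The IDs and the two rankings are hypothetical, and with LangChain `Document` objects one could key on `page_content` instead.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is a commonly used default.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["doc_a", "doc_b"]    # hypothetical sparse (BM25) results
faiss_ranking = ["doc_b", "doc_c"]   # hypothetical dense (FAISS) results
fused = reciprocal_rank_fusion([bm25_ranking, faiss_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_c']
```

Here `doc_b` moves to the top because both retrievers returned it, even though neither ranked it first on its own merits in the sparse list.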

In [12]:
# 3. Create Self-RAG chain
self_rag_processor = SelfRAGProcessor()

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a Self-RAG assistant. Use these triggers when needed:
     [Retrieve] - When needing factual verification or additional context
     [Verify] - When confirming specific facts
     [Expand] - When needing broader perspective

     Current Context: {context}"""),
    ("human", "{query}")
])

def retrieval_wrapper(state: Dict) -> Dict:
    query = state["query"]
    context = state.get("context", "")

    # Generate initial response
    generation_chain = prompt | llm | StrOutputParser()
    response = generation_chain.invoke({"query": query, "context": context})

    # Check if retrieval needed
    if self_rag_processor._detect_retrieval_need(response):
        retrieved_docs = self_rag_processor._hybrid_retrieve(query)
        new_context = "\n".join([doc.page_content for doc in retrieved_docs])
        return {"query": query, "response": response, "context": new_context}

    return {"query": query, "response": response, "context": context}

self_rag_chain = RunnablePassthrough.assign(
    context=lambda x: x.get("context", "")
) | RunnableLambda(retrieval_wrapper)

# 4. Iterative generation with retrieval
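def full_self_rag(query: str, max_iter: int = 3) -> str:
    """Run generate/retrieve passes until the response has no trigger or max_iter is reached."""
    state = {"query": query, "context": ""}
    for _ in range(max_iter):
        # Each pass regenerates the response using the context gathered so far
        state = self_rag_chain.invoke(state)
        if not self_rag_processor._detect_retrieval_need(state["response"]):
            break
    # If max_iter is exhausted, the returned text may still contain a trigger token
    return state["response"]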
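
The control flow of this loop can be checked without an API key. The sketch below mirrors the generate, detect, retrieve, regenerate cycle using hypothetical stand-ins: `fake_llm` and `fake_retriever` are illustrative stubs, not part of LangChain or of the chain defined above.

```python
from typing import Dict, List

TRIGGERS = ["[Retrieve]", "[Verify]", "[Expand]"]


def needs_retrieval(text: str) -> bool:
    """True if the draft response contains any retrieval trigger."""
    return any(trigger in text for trigger in TRIGGERS)


def fake_llm(query: str, context: str) -> str:
    """Hypothetical LLM stub: asks for retrieval until it has context."""
    if not context:
        return "[Retrieve] I need sources before answering."
    return f"Answer based on: {context}"


def fake_retriever(query: str) -> List[str]:
    """Hypothetical stand-in for the hybrid retriever."""
    return ["The French Revolution began in 1789."]


def run_loop(query: str, max_iter: int = 3) -> Dict[str, object]:
    """Generate, retrieve on a trigger, and regenerate with the new context."""
    context, response, calls = "", "", 0
    for _ in range(max_iter):
        response = fake_llm(query, context)
        calls += 1
        if not needs_retrieval(response):
            break  # the model answered without asking for more context
        context = "\n".join(fake_retriever(query))
    return {"response": response, "llm_calls": calls}


result = run_loop("When did the French Revolution begin?")
print(result)
```

With these stubs the first pass emits `[Retrieve]`, the retriever fills the context, and the second pass returns a trigger-free answer, so the loop ends after two model calls.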

In [13]:
# 5. Test the implementation
query = "Explain the causes of the French Revolution and its consequences"
result = full_self_rag(query)
print("Self-RAG Output:\n", result)

Self-RAG Output:
 The French Revolution, which took place from 1789 to 1799, was a period of significant social and political upheaval in France. There were several causes of the French Revolution, including:

1. **Social Inequality**: The French society was divided into three estates, with the clergy and nobility enjoying privileges and exemptions from taxes, while the common people faced heavy taxation and economic hardship.

2. **Financial Crisis**: France was facing a severe financial crisis due to extravagant spending by the monarchy, costly wars, and a regressive tax system that burdened the common people.

3. **Enlightenment Ideas**: The ideas of the Enlightenment, which emphasized individual rights, equality, and popular sovereignty, inspired many French people to question the existing social and political order.

4. **Weak Leadership**: King Louis XVI's indecisiveness and inability to address the country's problems effectively weakened the monarchy's authority and legitimacy.
