# 🤖 Self-RAG 

Welcome to the **Self-Reflective Retrieval-Augmented Generation (Self-RAG)** project!  
This notebook demonstrates a workflow inspired by the [Self-RAG paper (arXiv:2310.11511)](https://arxiv.org/pdf/2310.11511), where a language model decides **when and how** to retrieve external knowledge to answer queries more effectively.

---

## 🚀 Overview

- **Self-RAG** enables an LLM to:
    - Decide if external retrieval is needed for a given query.
    - Retrieve relevant documents using a vector store.
    - Filter and retain only the most relevant context.
    - Generate answers using either retrieved context or internal knowledge.
    - Evaluate the factual support and usefulness of generated responses.
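
The decisions listed above can be sketched as a single control-flow function. This is a minimal illustration under stated assumptions, not the notebook's implementation: `needs_retrieval`, `retrieve`, `is_relevant`, `generate`, and `score` are hypothetical callables standing in for the LLM and vector-store calls.

```python
from typing import Callable, List, Optional


def self_rag_flow(
    query: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    generate: Callable[[str, Optional[str]], str],
    score: Callable[[str, str], float],
) -> str:
    """Route a query through the retrieve / filter / generate / score steps."""
    # Step 1: skip retrieval entirely when the model can answer on its own.
    if not needs_retrieval(query):
        return generate(query, None)

    # Step 2: retrieve candidate chunks and keep only the relevant ones.
    relevant = [c for c in retrieve(query) if is_relevant(query, c)]
    if not relevant:
        return generate(query, None)

    # Step 3: generate one candidate answer per chunk and keep the best-scored one.
    candidates = [generate(query, c) for c in relevant]
    return max(candidates, key=lambda answer: score(query, answer))
```

Passing the steps in as callables keeps the routing logic testable without any API keys, since each step can be replaced by a stub.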

---

## 🛠️ Workflow

1. **API Key Setup**  
     Securely input your API keys for Groq and Google.

2. **Model Initialization**  
     - Uses `langchain-groq` and `langchain-google-genai` for LLMs.
     - Uses Google GenAI embeddings for vector search.

3. **Document Loading & Indexing**  
     - Loads documents from the `data` directory.
     - Splits them into chunks and indexes them in a vector store.

4. **Self-RAG Pipeline**  
     - `retrieval_required`: Determines if retrieval is needed.
     - `is_context_relevant`: Filters retrieved chunks based on relevance.
     - `response_generation`: Generates answers using either context or model knowledge.
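     - `support_response` & `utility_response`: 
       - `support_response` rates how well the generated answer is supported by the retrieved context: fully, partially, or not at all.
       - `utility_response` rates usefulness on a 1 to 5 scale. This **utility score** is printed even when no retrieval is used, in line with the critique step described in the Self-RAG paper.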
     - `self_RAG`: Coordinates all the steps for each query.

5. **Example Queries**  
     - The notebook runs queries related to robotics and shows how Self-RAG handles them.

---

## 📦 Key Features

- **Dynamic Retrieval**: Avoids unnecessary lookups.
- **Context Filtering**: Keeps only what's truly relevant.
- **Answer Scoring**: Scores responses based on factual support and usefulness.
- **Modular Design**: Clean, reusable components.

---

## 📚 References

- [Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (arXiv:2310.11511)](https://arxiv.org/pdf/2310.11511)

---

## 💡 How to Use

1. Place your documents in the `data` folder.
2. Run the notebook cells sequentially.
3. Enter your API keys when prompted.
4. Run your own queries using the `self_RAG` function.

---

## 📝 Notes

- ⚠️ **This is not an exact reproduction of the Self-RAG paper**, but a simplified and practical implementation inspired by its core ideas.
- The current version avoids agentic loop structures and focuses on step-wise reasoning with modular functions.
- The **utility score is printed for every response**, even when no retrieval is performed, in alignment with the critique step in the paper.
- Feel free to adapt or extend it for your own use cases!


---

Happy experimenting! 🤗


Image reference: [LangChain blog](https://blog.langchain.com/agentic-rag-with-langgraph/)  
Paper reference: [Self-RAG paper](https://arxiv.org/pdf/2310.11511)

![Model](https://blog.langchain.com/content/images/2024/02/data-src-image-ebf55e8c-de51-49b8-9f32-94ccbf24741f.png)

In [51]:
import getpass
import os

if "GROQ_API_KEY" not in os.environ:
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter your Groq API key: ")

Enter your Groq API key: ··········


In [52]:
%pip install -qU langchain-groq

In [2]:
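import getpass
import os

# Read the Google key from the environment or prompt for it, rather than hard-coding it
if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")
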
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

In [54]:
from langchain_groq import ChatGroq

llm_2 = ChatGroq(
    model="deepseek-r1-distill-llama-70b",
    temperature=0,
    max_tokens=None,
    reasoning_format="parsed",
    timeout=None,
    max_retries=2,
)

In [5]:
from llama_index.core import VectorStoreIndex, Settings, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
documents = SimpleDirectoryReader("data").load_data()
splitter = SentenceSplitter(chunk_size=1000, chunk_overlap=200)
nodes = splitter.get_nodes_from_documents(documents)

embed_model = GoogleGenAIEmbedding(
    model_name="text-embedding-004",
    embed_batch_size=100,
)
Settings.embed_model = embed_model
Settings.llm = llm
vector_index = VectorStoreIndex(nodes)
retriever = vector_index.as_retriever(similarity_top_k=5)
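
To illustrate what `chunk_size` and `chunk_overlap` control, here is a simplified sliding-window splitter. It is only a sketch: the hypothetical `sliding_chunks` helper counts characters, whereas `SentenceSplitter` measures chunks in tokens and tries to respect sentence boundaries, so real chunk edges will differ.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# prints ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means text near a chunk boundary appears in both neighbouring chunks, which reduces the chance that a relevant passage is cut in half.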

In [14]:
def retrieval_required(query):
    """
    For a given query, determine if retrieval is necessary using the LLM.
    Returns True for 'Yes' (and for unclear replies, as a safe fallback), False for 'No'.
    """
    prompt = f"""
    Given the query: "{query}", determine if retrieval of external knowledge is required
    to accurately answer it. Reply with only one word: Yes or No.
    """

    response = llm.invoke(prompt)
    answer = response.content.strip().lower()
    print(answer)
    if "yes" in answer:
        return True
    elif "no" in answer:
        return False
    else:
        # fallback case
        return True
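
The substring checks above can misfire on replies such as "unknown", which contains "no". A stricter parser could match whole words only. The sketch below assumes a hypothetical helper name, `parse_yes_no`, and keeps the same "retrieve when unsure" fallback.

```python
import re


def parse_yes_no(answer: str, default: bool = True) -> bool:
    """Return True for a whole-word 'yes', False for 'no', else `default`."""
    # \b anchors the match to word boundaries, so "unknown" or "not" do not count as "no".
    match = re.search(r"\b(yes|no)\b", answer.lower())
    if match is None:
        return default
    return match.group(1) == "yes"
```

When both words appear, the first one wins, which suits replies like "No, retrieval is not needed."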


In [55]:
def is_context_relevant(query, context):
    """
    For a given query and context chunk, determine if the context is relevant.
    Returns True if relevant, False otherwise.
    """
    prompt = f"""
    Query: "{query}"
    Context: "{context}"

    Is the context relevant to the query? Reply with only one word: Relevant or Irrelevant.
    """

    response = llm_2.invoke(prompt)
    answer = response.content.strip().lower()
    print(f"LLM response: {answer}")

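    # Check the negative labels first: "irrelevant" and "not relevant"
    # both contain "relevant" as a substring
    if "irrelevant" in answer or "not relevant" in answer:
        return False
    # Any other reply mentioning "relevant" is positive; unclear replies fall back to False
    return "relevant" in answer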
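
Each relevance check is an independent, I/O-bound LLM call, so the per-chunk filtering could run concurrently. The sketch below is one possible approach, assuming the provider's rate limits allow parallel requests. `filter_relevant` is a hypothetical helper, and `judge` stands in for a function with the same signature as `is_context_relevant`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def filter_relevant(
    query: str,
    contexts: List[str],
    judge: Callable[[str, str], bool],
    max_workers: int = 4,
) -> List[str]:
    """Keep the contexts that `judge` marks as relevant, preserving input order."""
    if not contexts:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        verdicts = list(pool.map(lambda c: judge(query, c), contexts))
    return [c for c, keep in zip(contexts, verdicts) if keep]
```

With five retrieved chunks this replaces five sequential round trips with one batch of parallel ones. Output printed inside `judge` may interleave across threads.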


In [56]:
def response_generation(query, context):
    """
    Generate a response for a given query using the context.
    """
    if context == "No retrieval needed":
        # Sentinel value: answer from the model's own knowledge
        prompt = f"""
        Answer the following query based on your own knowledge:
        "{query}"
        """
        response = llm_2.invoke(prompt)
        return response.content.strip()
    else:
        prompt = f"""
        You are given the following context: "{context}"

        Now answer the following query using only the information from the context:
        "{query}"

        If the answer is not found in the context, say 'Insufficient information.'
        """
        response = llm.invoke(prompt)
        return response.content.strip()
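
The function above switches behaviour on the exact string `"No retrieval needed"`, so any other message passed as `context` is treated as real context. One alternative, shown here only as a sketch with a hypothetical `build_prompt` helper, is to use `None` to mean "no context" and keep prompt construction separate from the LLM call.

```python
from typing import Optional


def build_prompt(query: str, context: Optional[str] = None) -> str:
    """Build the generation prompt; `context=None` means answer from model knowledge."""
    if context is None:
        return f'Answer the following query based on your own knowledge:\n"{query}"'
    return (
        f'You are given the following context: "{context}"\n\n'
        "Now answer the following query using only the information from the context:\n"
        f'"{query}"\n\n'
        "If the answer is not found in the context, say 'Insufficient information.'"
    )
```

Because `None` cannot be confused with document text, a caller cannot select the wrong branch by passing a slightly different status message.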

In [57]:
def support_response(response, context):
    """
    Determines if response is supported by the given context.
    Returns: 'Fully supported', 'Partially supported', 'No support',
    or 'Unclear' if the reply matches none of these.
    """
    prompt = f"""
    Context: "{context}"
    Response: "{response}"

    Is the response supported by the context?
    Reply with only one of the following: Fully supported, Partially supported, or No support.
    """
    result = llm_2.invoke(prompt).content.strip().lower()

    if "fully" in result:
        return "Fully supported"
    elif "partially" in result:
        return "Partially supported"
    elif "no support" in result or "not supported" in result:
        return "No support"
    else:
        return "Unclear"


In [58]:
def utility_response(query, response):
    """
    Rates the utility of the response from 1 (poor) to 5 (excellent).
    Returns an integer score, or -1 if the reply is not a bare integer.
    """
    prompt = f"""
    Query: "{query}"
    Response: "{response}"

    On a scale of 1 to 5, how useful is this response in answering the query?
    Reply with only the number.
    """
    result = llm_2.invoke(prompt).content.strip()

    try:
        score = int(result)
        return min(max(score, 1), 5)
    except ValueError:
        return -1
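
`int(result)` only succeeds when the model replies with a bare number, so a reply such as "Score: 4" falls through to `-1`. A more tolerant parser could extract the first integer it finds. This is a sketch with a hypothetical `parse_score` helper, and it assumes the first number in the reply is the score.

```python
import re


def parse_score(text: str, low: int = 1, high: int = 5, default: int = -1) -> int:
    """Extract the first integer in `text` and clamp it to [low, high]."""
    match = re.search(r"\d+", text)
    if match is None:
        return default  # no digits at all, e.g. "n/a"
    return min(max(int(match.group()), low), high)
```

The clamping mirrors the `min(max(score, 1), 5)` logic in `utility_response`, and the `-1` default keeps unparseable replies distinguishable from real scores.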


In [75]:
def self_RAG(query, retriever):
    #  Check if retrieval is necessary
    if retrieval_required(query):
        print("🔍 Retrieval required. Fetching documents...")
        docs = retriever.retrieve(query)
        contexts = [d.text for d in docs]

        #  Filter relevant contexts
        relevant_contexts = []
        for i, c in enumerate(contexts):
            if is_context_relevant(query, c):
                print(f"✅ Document {i+1} is relevant.")
                relevant_contexts.append(c)
            else:
                print(f"❌ Document {i+1} is not relevant.")

        #  Handle no relevant context
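        if not relevant_contexts:
            print("⚠️ No relevant context found. Generating without context...")
            # Pass the sentinel that response_generation recognizes; any other string
            # would be treated as context and yield 'Insufficient information.'
            response = response_generation(query, "No retrieval needed")
            print(f"✅ Direct Response Utility Score: {utility_response(query, response)}")
            return response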

        #  Generate responses and evaluate
        responses = []
        for i, context in enumerate(relevant_contexts):
            _response = response_generation(query, context)
            _support = support_response(_response, context).strip().lower()
            try:
                _utility = utility_response(query, _response)
            except Exception:
                # Treat any LLM/API failure as the lowest possible utility
                _utility = 0

            responses.append((_response, _support, _utility))

        # Select best response (fully supported & highest utility)
        final_best_answer = max(
            responses, key=lambda x: (x[1] == 'fully supported', x[2])
        )
        print(f"\n🏁 Selected Best Response: [Support: {final_best_answer[1]}, Utility: {final_best_answer[2]}]")
        return final_best_answer[0]

    else:
        print("🧠 Retrieval not required. Generating without documents...")
        response= response_generation(query, "No retrieval needed")
        utility_raw = utility_response(query, response)
        print(f"✅ Direct Response Utility Score: {utility_raw}")
        return response
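
The selection key in `self_RAG` only separates "fully supported" answers from everything else, so a "partially supported" answer and a "no support" answer are ranked by utility alone. A graded ranking is one possible refinement. The sketch below uses hypothetical names (`SUPPORT_RANK`, `pick_best`) and assumes candidates are `(response, support, utility)` tuples as in the notebook.

```python
from typing import List, Tuple

# Higher rank means stronger support; unknown labels are treated as unsupported.
SUPPORT_RANK = {
    "fully supported": 2,
    "partially supported": 1,
    "no support": 0,
    "unclear": 0,
}

Candidate = Tuple[str, str, int]  # (response, support label, utility score)


def pick_best(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate with the strongest support, breaking ties by utility."""
    return max(
        candidates,
        key=lambda c: (SUPPORT_RANK.get(c[1].strip().lower(), 0), c[2]),
    )
```

With this key, a partially supported answer with utility 3 outranks an unsupported answer with utility 5, which the original boolean key would order the other way. An empty candidate list raises `ValueError`, as with any `max` call.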


In [76]:
final_SELF_RAG_response=self_RAG("what are degrees of freedom in a robot ?",retriever)
print(final_SELF_RAG_response)

no
🧠 Retrieval not required. Generating without documents...
✅ Direct Response Utility Score: 5
Degrees of freedom (DOF) in robotics refer to the number of independent movements a robot can perform. Each DOF corresponds to a specific direction or type of movement, such as translation along an axis or rotation around it. In three-dimensional space, there are six possible DOF: three translational (movement along x, y, and z axes) and three rotational (roll, pitch, and yaw).

Robots can have varying numbers of DOF depending on their design and purpose. For example, a simple robotic arm might have 3 DOF, allowing it to move in three planes, while a more complex robot might have 6 DOF, enabling it to move freely in all directions. Each joint in a robot contributes to its total DOF, with different types of joints offering different numbers of DOF. For instance, a revolute joint provides 1 DOF, while a spherical joint offers 3 DOF.

The total DOF of a robot is typically the sum of the DOF fro

In [64]:
query_2="tell about Advanced and Emerging Actuator Technologies"
response_2=self_RAG(query_2,retriever)
print(response_2)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: irrelevant
❌ Document 2 is not relevant.
LLM response: irrelevant
❌ Document 3 is not relevant.
LLM response: relevant
✅ Document 4 is relevant.
LLM response: relevant
✅ Document 5 is relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 4]
Several advanced actuator technologies are being developed and applied in robotics to address limitations of conventional systems or to enable new capabilities:

*   **Shape memory alloy actuators** use materials that change shape when heated, providing compact actuators with high force output. They are valuable for applications requiring small size, silent operation, or biomimetic behavior, but typically have slow response times and limited stroke length.
*   **Electroactive polymers** change shape when subjected to electric fields, potentially providing artificial muscles with characteristics similar to biological syste

In [65]:
query_3="how is Cloud and Edge Computing integrating into robots"
response_3=self_RAG(query_3,retriever)
print(response_3)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: relevant
✅ Document 2 is relevant.
LLM response: relevant
✅ Document 3 is relevant.
LLM response: irrelevant
❌ Document 4 is not relevant.
LLM response: irrelevant
❌ Document 5 is not relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 3]
Cloud and Edge Computing is integrating into robots in the following ways:
*   **Cloud robotics** allows robots to access powerful computational resources and shared knowledge bases through network connections.
*   **Edge computing** provides local processing capabilities.


In [66]:
query_4="what are the Technological Convergence and Breakthrough Potential discussed ?"
response_4=self_RAG(query_4,retriever)
print(response_4)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: relevant
✅ Document 2 is relevant.
LLM response: relevant
✅ Document 3 is relevant.
LLM response: relevant
✅ Document 4 is relevant.
LLM response: irrelevant
❌ Document 5 is not relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 5]
The Technological Convergence and Breakthrough Potential discussed include:

*   **Artificial intelligence**: advancements in machine learning, neural networks, cognitive architectures, and the integration of large language models for sophisticated reasoning, decision-making, and natural human-robot communication.
*   **Quantum computing**: potential to revolutionize robot control and planning algorithms by solving complex optimization problems and providing unprecedented sensitivity and precision for robot perception systems through quantum sensors.
*   **Biotechnology and robotics convergence**: creation of new possibilities 

In [77]:
query_5="what are different types of actuators ? explain them"
response_5=self_RAG(query_5,retriever)
print(response_5)

no
🧠 Retrieval not required. Generating without documents...
✅ Direct Response Utility Score: 5
Actuators are devices that convert energy into motion, essential in various systems like robotics, machines, and automation. Here's an organized overview of different types of actuators, their mechanisms, applications, and characteristics:

1. **Hydraulic Actuators**
   - **Mechanism:** Use fluid pressure to create movement via a piston in a cylinder.
   - **Applications:** Heavy machinery, construction equipment, automotive brakes.
   - **Pros/Cons:** High strength, but prone to fluid leaks.

2. **Pneumatic Actuators**
   - **Mechanism:** Utilize compressed air or gas to move a piston.
   - **Applications:** Factories, conveyor belts, automation.
   - **Pros/Cons:** Clean and lightweight, but less precise.

3. **Electric Actuators**
   - **Mechanism:** Convert electrical energy into motion using motors.
   - **Applications:** Fans, robotics, automation.
   - **Pros/Cons:** Versatile and pre

In [69]:
query_6="what are different classifications of robots as said in PDF  ?"
response_6=self_RAG(query_6,retriever)
print(response_6)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: relevant
✅ Document 2 is relevant.
LLM response: relevant
✅ Document 3 is relevant.
LLM response: relevant
✅ Document 4 is relevant.
LLM response: irrelevant
❌ Document 5 is not relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 5]
The context provides two classifications of robots:

1.  **By Configuration:**
    *   Cylindrical robots
    *   SCARA robots
    *   Parallel robots (including Delta robots as a specific type)

2.  **By Locomotion Method:**
    *   Wheeled robots
    *   Tracked robots
    *   Legged robots
    *   Flying robots
    *   Swimming robots


In [78]:
query_7="Visual Perception Systems in robots"
response_7=self_RAG(query_7,retriever)
print(response_7)

no
🧠 Retrieval not required. Generating without documents...
✅ Direct Response Utility Score: 5
**Visual Perception Systems in Robots: An Overview**

**Introduction:**
Visual perception in robots refers to the ability of robots to interpret and understand visual data from their environment, enabling them to perform tasks that require awareness and interaction with their surroundings.

**Key Components:**
1. **Sensors:** Robots use cameras or other visual sensors to capture images or video streams.
2. **Processing:** Software and algorithms, such as convolutional neural networks (CNNs), analyze the captured data.

**Functions:**
1. **Object Recognition:** Identifying objects like cats or chairs using CNNs.
2. **Object Detection:** Locating objects within images using tools like YOLO or SSD.
3. **Scene Understanding:** Interpreting entire environments, such as kitchens or streets.
4. **Object Tracking:** Following moving objects using methods like optical flow.
5. **Depth Perception:** U

In [72]:
query_8="Design Principles for Human-Robot Interaction as described in PDF"
response_8=self_RAG(query_8,retriever)
print(response_8)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: relevant
✅ Document 2 is relevant.
LLM response: relevant
✅ Document 3 is relevant.
LLM response: irrelevant
❌ Document 4 is not relevant.
LLM response: relevant
✅ Document 5 is relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 3]
Design Principles for Human-Robot Interaction, as described in the context, include:

*   **User-centered design:** This approach places human users at the center of the design process, focusing on meeting human needs, understanding user requirements, preferences, and limitations.
*   **Transparency and explainability:** This ensures that humans can understand robot behavior and decision-making processes, which is vital for building trust and enabling appropriate human oversight.
*   **Adaptability and personalization:** This allows robots to adjust their behavior to individual users and changing contexts, crucial for maintaini

In [73]:
query_9="Advanced Materials and Manufacturing for robots"
response_9=self_RAG(query_9,retriever)
print(response_9)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: relevant
✅ Document 2 is relevant.
LLM response: relevant
✅ Document 3 is relevant.
LLM response: irrelevant
❌ Document 4 is not relevant.
LLM response: relevant
✅ Document 5 is relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 5]
Bio-inspired materials derived from natural systems are providing new capabilities for robotics. Examples include gecko-inspired adhesives for climbing robots, shark-skin-inspired surfaces to reduce drag for underwater robots, and self-healing materials that could enable robots to repair themselves after damage.

Advanced manufacturing techniques, including 3D printing, additive manufacturing, and automated assembly, are reducing the cost and complexity of robot production. These techniques also enable mass customization, rapid prototyping, and the creation of complex structures that would be impossible with traditional manufac

In [74]:
query_10="what are the Sensor Technology Advances ?"
response_10=self_RAG(query_10,retriever)
print(response_10)

yes
🔍 Retrieval required. Fetching documents...
LLM response: relevant
✅ Document 1 is relevant.
LLM response: relevant
✅ Document 2 is relevant.
LLM response: relevant
✅ Document 3 is relevant.
LLM response: relevant
✅ Document 4 is relevant.
LLM response: relevant
✅ Document 5 is relevant.

🏁 Selected Best Response: [Support: fully supported, Utility: 5]
Sensor Technology Advances include:
*   **Advanced vision systems** incorporating multispectral imaging, hyperspectral analysis, and computational photography, providing superhuman visual capabilities. Event-based cameras offer fast response times and low power consumption.
*   **Tactile sensing technologies** are approaching the sensitivity and resolution of human touch, with distributed tactile sensors providing detailed information about contact forces, surface textures, and object properties. Some systems can even detect chemical properties through artificial smell and taste capabilities.
*   **Proprioceptive sensing** is becomin