# RAG System with Feedback Loop: Enhancing Retrieval and Response Quality

## Overview

This system implements a Retrieval-Augmented Generation (RAG) approach with an integrated feedback loop. It aims to improve the quality and relevance of responses over time by incorporating user feedback and dynamically adjusting the retrieval process.

## Motivation

Traditional RAG systems can sometimes produce inconsistent or irrelevant responses due to limitations in the retrieval process or the underlying knowledge base. By implementing a feedback loop, we can:

1. Continuously improve the quality of retrieved documents
2. Enhance the relevance of generated responses
3. Adapt the system to user preferences and needs over time

## Key Components

1. **PDF Content Extraction**: Extracts text from PDF documents
2. **Vectorstore**: Stores and indexes document embeddings for efficient retrieval
3. **Retriever**: Fetches relevant documents based on user queries
4. **Language Model**: Generates responses using retrieved documents
5. **Feedback Collection**: Gathers user feedback on response quality and relevance
6. **Feedback Storage**: Persists user feedback for future use
7. **Relevance Score Adjustment**: Modifies document relevance based on feedback
8. **Index Fine-tuning**: Periodically updates the vectorstore using accumulated feedback

## Method Details

### 1. Initial Setup
- The system reads PDF content and creates a vectorstore
- A retriever is initialized using the vectorstore
- A language model (LLM) is set up for response generation

### 2. Query Processing
- When a user submits a query, the retriever fetches relevant documents
- The LLM generates a response based on the retrieved documents
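
To make the retrieval step concrete, here is a minimal toy sketch of similarity-based top-k lookup. The three-dimensional vectors, the document ids, and the helper names `cosine_similarity` and `top_k` are made up for illustration; the notebook itself delegates this work to a FAISS vectorstore, which may use a different distance metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy "embeddings" (real embeddings have hundreds of dimensions)
toy_index = {
    "greenhouse_gases": [0.9, 0.1, 0.0],
    "sea_level_rise": [0.1, 0.9, 0.0],
    "renewable_energy": [0.0, 0.2, 0.9],
}

print(top_k([1.0, 0.2, 0.0], toy_index, k=2))
```

The retrieved documents would then be passed to the LLM as context for the answer.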

### 3. Feedback Collection
- The system collects user feedback on the response's relevance and quality
- Feedback is appended to a JSON file, one JSON object per line, so that it persists across sessions

### 4. Relevance Score Adjustment
- For subsequent queries, the system loads previous feedback
- An LLM evaluates the relevance of past feedback to the current query
- Document relevance scores are adjusted based on this evaluation
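
The adjustment used later in this notebook is multiplicative: a document's score is scaled by the average relevance rating of the matching feedback, divided by 3 (the neutral point of a 1-5 scale). The sketch below isolates that arithmetic; the helper name `adjust_score` is hypothetical and is not part of the notebook's code.

```python
def adjust_score(score, ratings, neutral=3.0):
    """Scale a relevance score by the mean rating relative to a neutral midpoint."""
    if not ratings:
        # No relevant feedback: leave the score unchanged
        return score
    avg = sum(ratings) / len(ratings)
    return score * (avg / neutral)

print(adjust_score(1.0, [5, 4]))  # mean rating 4.5 boosts the score
print(adjust_score(1.0, [1, 2]))  # mean rating 1.5 lowers the score
print(adjust_score(1.0, []))      # no feedback, score unchanged
```

Because the update is a multiplication, applying it repeatedly to the same document compounds the effect, which is worth keeping in mind when scores are stored between queries.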

### 5. Retriever Update
- The retriever is updated with the adjusted document scores
- This ensures that future retrievals benefit from past feedback

### 6. Periodic Index Fine-tuning
- At regular intervals, the system fine-tunes the index
- High-quality feedback is used to create additional documents
- The vectorstore is updated with these new documents, improving overall retrieval quality

## Benefits of this Approach

1. **Continuous Improvement**: The system learns from each interaction, gradually enhancing its performance.
2. **Personalization**: By incorporating user feedback, the system can adapt to individual or group preferences over time.
3. **Increased Relevance**: The feedback loop helps prioritize more relevant documents in future retrievals.
4. **Quality Control**: Low-quality or irrelevant responses are less likely to be repeated as the system evolves.
5. **Adaptability**: The system can adjust to changes in user needs or document contents over time.

## Conclusion

This RAG system with a feedback loop extends a basic RAG pipeline by learning from user interactions: stored ratings re-rank retrieved documents, and highly rated answers are periodically added to the index. The result is a more adaptive, user-centric approach to information retrieval and response generation. It is most useful in domains where accuracy and relevance are critical and where user needs evolve over time.

While the implementation adds complexity compared to a basic RAG system, the benefits in terms of response quality and user satisfaction make it a worthwhile investment for applications requiring high-quality, context-aware information retrieval and generation.

<div style="text-align: center;">

<img src="https://github.com/NirDiamant/RAG_Techniques/blob/main/images/retrieval_with_feedback_loop.svg?raw=1" alt="retrieval with feedback loop" style="width:40%; height:auto;">
</div>

# Package Installation and Imports

The cell below installs all necessary packages required to run this notebook.


In [1]:
# Install required packages
#!pip install langchain langchain-openai python-dotenv

In [2]:
#pip install langchain

In [3]:
#pip install PyPDF2

In [4]:
#!pip install langchain-community==0.2.15 langchain-text-splitters==0.2.2 langchain-huggingface==0.0.3 "transformers>=4.41.0" faiss-cpu langchain-groq==0.1.9 PyPDF2==3.0.1 unstructured==0.15.0 "unstructured[pdf]==0.15.0" nltk==3.8.1

In [8]:
#pip install pymupdf



In [5]:
# Standard library
import json
import os
from typing import Any, Dict, List

# LangChain components used in this notebook
from langchain.chains import LLMChain, RetrievalQA
from langchain.prompts import PromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_groq import ChatGroq
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


### Load the LLM

In [None]:
# Replace "API_KEY" with your own Groq API key before running this cell
llm = ChatGroq(
    api_key="API_KEY",
    model_name="llama-3.1-8b-instant"
)

### Download the data files

In [22]:
# Download required data files
import os
os.makedirs('data', exist_ok=True)

# Download the PDF document used in this notebook
!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/NirDiamant/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf
!wget -O data/feedback_data.json https://raw.githubusercontent.com/NirDiamant/RAG_TECHNIQUES/main/data/feedback_data.json


### Extract text from the PDF file

In [9]:
import pymupdf

def read_pdf_to_string(path):
    """
    Read a PDF document from the specified path and return its content as a string.

    Args:
    path (str): The file path to the PDF document.

    Returns:
    str: The concatenated text content of all pages in the PDF document.

    The function uses the PyMuPDF library (imported as `pymupdf`) to open the PDF document,
    iterate over each page, extract the text content from each page, and append it to a single string.
    """
    # Open the PDF document located at the specified path
    doc = pymupdf.open(path)
    content = ""
    # Iterate over each page in the document
    for page in doc:
        # Extract the text content from the current page and append it to the content string
        content += page.get_text()
    return content

### Load the embedding model

In [10]:
embedding_model = 'sentence-transformers/all-MiniLM-L6-v2'
embeddings = HuggingFaceEmbeddings(model_name=embedding_model)



### Create vector store and retrieval QA chain

In [11]:

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

def encode_from_string(content, chunk_size=1000, chunk_overlap=200):
    """
    Encodes a string into a FAISS vector store using the `embeddings` model loaded above.

    Args:
    content (str): The text content to be encoded.
    chunk_size (int): The size of each chunk of text.
    chunk_overlap (int): The overlap between chunks.

    Returns:
    FAISS: A vector store containing the encoded content.

    Raises:
    ValueError: If the input content is not valid.
    RuntimeError: If there is an error during the encoding process.
    """

    if not isinstance(content, str) or not content.strip():
        raise ValueError("Content must be a non-empty string.")

    if not isinstance(chunk_size, int) or chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer.")

    if not isinstance(chunk_overlap, int) or chunk_overlap < 0:
        raise ValueError("chunk_overlap must be a non-negative integer.")

    try:
        # Split the content into chunks
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=chunk_overlap,
            length_function=len,
            is_separator_regex=False,
        )
        chunks = text_splitter.create_documents([content])

        # Give every chunk a neutral starting relevance score
        for chunk in chunks:
            chunk.metadata['relevance_score'] = 1.0

        # Embed the chunks and index them in FAISS
        return FAISS.from_documents(chunks, embeddings)

    except Exception as e:
        raise RuntimeError(f"An error occurred during the encoding process: {e}") from e
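
To build intuition for `chunk_size` and `chunk_overlap`, the following simplified sketch cuts a string into fixed-size windows that share characters with their neighbours. It is only an illustration: the helper `sliding_chunks` is hypothetical, and `RecursiveCharacterTextSplitter` works differently, preferring to split on separators such as paragraph and line breaks.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With a window of 4 and an overlap of 2, each chunk repeats the last two characters of the previous one, so content that straddles a boundary still appears whole in at least one chunk.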

In [12]:
path = "data/Understanding_Climate_Change.pdf"

In [13]:
content = read_pdf_to_string(path)
vectorstore = encode_from_string(content)
retriever = vectorstore.as_retriever()

qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

### Function to format user feedback in a dictionary

In [14]:
def get_user_feedback(query, response, relevance, quality, comments=""):
    return {
        "query": query,
        "response": response,
        "relevance": int(relevance),
        "quality": int(quality),
        "comments": comments
    }
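
Since `get_user_feedback` calls `int()` on the ratings, non-numeric input would raise a `ValueError`. One possible guard, sketched here with a hypothetical helper `parse_rating` that is not part of the notebook, validates the raw input against the 1-5 scale first:

```python
def parse_rating(raw, low=1, high=5):
    """Convert raw user input to an int rating, or return None if it is invalid."""
    try:
        value = int(str(raw).strip())
    except ValueError:
        # Not a whole number (for example "abc" or "4.5")
        return None
    return value if low <= value <= high else None

print(parse_rating("4"))
print(parse_rating("seven"))
```

A caller could keep prompting until `parse_rating` returns something other than `None`, and only then build the feedback dictionary.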

### Function to store the feedback in a JSON file

In [15]:
def store_feedback(feedback):
    with open("/content/feedback_data.json", "a") as f:
        json.dump(feedback, f)
        f.write("\n")

### Function to read the feedback file

In [16]:
def load_feedback_data():
    feedback_data = []
    try:
        with open("/content/feedback_data.json", "r") as f:
            # The file holds one JSON object per line
            for line in f:
                line = line.strip()
                if line:  # skip blank lines instead of failing on them
                    feedback_data.append(json.loads(line))
    except FileNotFoundError:
        print("No feedback data file found. Starting with empty feedback.")
    return feedback_data
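
The store and load functions together form an append-only log with one JSON object per line. The self-contained sketch below shows the same round trip against a temporary file so it can be run without touching `/content`; the helper names `append_record` and `read_records`, the file name, and the sample records are illustrative assumptions.

```python
import json
import os
import tempfile

def append_record(path, record):
    """Append one feedback record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def read_records(path):
    """Read all records back, skipping blank lines; return [] if the file is missing."""
    if not os.path.exists(path):
        return []
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "feedback_demo.json")
    append_record(demo_path, {"query": "q1", "response": "r1", "relevance": 4, "quality": 5, "comments": ""})
    append_record(demo_path, {"query": "q2", "response": "r2", "relevance": 2, "quality": 3, "comments": ""})
    loaded = read_records(demo_path)

print(len(loaded), loaded[0]["relevance"])
```

Appending line by line means earlier feedback is never rewritten, at the cost of the file not being a single valid JSON document.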

### Function to adjust document relevance based on the stored feedback

In [17]:
def adjust_relevance_scores(query: str, docs: List[Any], feedback_data: List[Dict[str, Any]]) -> List[Any]:
    # Create a prompt template for relevance checking
    relevance_prompt = PromptTemplate(
        input_variables=["query", "feedback_query", "doc_content", "feedback_response"],
        template="""
        Determine if the following feedback response is relevant to the current query and document content.
        You are also provided with the Feedback original query that was used to generate the feedback response.
        Current query: {query}
        Feedback query: {feedback_query}
        Document content: {doc_content}
        Feedback response: {feedback_response}

        Is this feedback relevant? Respond with only 'Yes' or 'No'.
        """
    )

    # Pipe the prompt into the LLM for relevance checking
    relevance_chain = relevance_prompt | llm

    for doc in docs:
        relevant_feedback = []

        for feedback in feedback_data:
            # Use LLM to check relevance
            input_data = {
                "query": query,
                "feedback_query": feedback['query'],
                "doc_content": doc.page_content[:1000],
                "feedback_response": feedback['response']
            }
            # The chat model returns a message object; its text is in `.content`
            result = relevance_chain.invoke(input_data).content

            # Normalize the reply so that 'Yes', 'yes' and 'Yes.' all count as relevant
            if result.strip().lower().startswith('yes'):
                relevant_feedback.append(feedback)

        # Adjust the relevance score based on feedback
        if relevant_feedback:
            avg_relevance = sum(f['relevance'] for f in relevant_feedback) / len(relevant_feedback)
            doc.metadata['relevance_score'] *= (avg_relevance / 3)  # Assuming a 1-5 scale, 3 is neutral

    # Re-rank documents based on adjusted scores
    return sorted(docs, key=lambda x: x.metadata['relevance_score'], reverse=True)
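
The prompt asks the model to answer only 'Yes' or 'No', but chat models often vary capitalization or add punctuation and quotes, so an exact string comparison is fragile. A small normalizer, sketched here with a hypothetical helper `is_affirmative`, is one way to make the check more tolerant:

```python
import re

def is_affirmative(llm_output):
    """Return True when a free-text model reply begins with the word 'yes'."""
    # Allow leading quotes or punctuation, then require 'yes' as a whole word
    return re.match(r"\W*yes\b", llm_output.strip().lower()) is not None

for reply in ["Yes", " yes.", "'Yes'", "No", "Yesterday"]:
    print(repr(reply), is_affirmative(reply))
```

The word-boundary check keeps replies such as "Yesterday" from being counted as agreement, while still accepting "Yes." or a quoted 'Yes'.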

### Function to fine-tune the vector index so it also includes queries and answers that received good feedback

In [18]:
def fine_tune_index(feedback_data: List[Dict[str, Any]], texts: str) -> Any:
    # Filter high-quality responses
    good_responses = [f for f in feedback_data if f['relevance'] >= 4 and f['quality'] >= 4]

    # Combine each high-rated query with its response
    additional_texts = [f['query'] + " " + f['response'] for f in good_responses]

    # Join the original text and the high-quality texts into a single string
    all_texts = " ".join([texts] + additional_texts)

    # Build a new index from the combined text
    new_vectorstore = encode_from_string(all_texts)

    return new_vectorstore
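
The selection step can be tried in isolation, without building a vectorstore. The sketch below filters feedback by the same thresholds (relevance and quality of at least 4) and assembles the text that would be re-indexed; the helper names and sample records are hypothetical, and joining with newlines is a choice made for this example only.

```python
def select_high_quality(feedback_data, min_relevance=4, min_quality=4):
    """Keep only feedback entries rated at or above both thresholds."""
    return [
        f for f in feedback_data
        if f["relevance"] >= min_relevance and f["quality"] >= min_quality
    ]

def build_corpus(original_text, feedback_data):
    """Append each high-rated query/response pair to the original text."""
    extras = [f"{f['query']} {f['response']}" for f in select_high_quality(feedback_data)]
    return "\n".join([original_text] + extras)

sample_feedback = [
    {"query": "What is CO2?", "response": "A greenhouse gas.", "relevance": 5, "quality": 4},
    {"query": "What is weather?", "response": "Unclear answer.", "relevance": 2, "quality": 5},
]

print(build_corpus("Base text.", sample_feedback))
```

Only the first record passes both thresholds, so only that pair is added to the text that would be passed on for encoding.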

### Update the retriever

In [19]:
def retrieval_update(retriever):
    """
    Re-rank documents for the latest query using stored feedback, then rebuild the index.

    Relies on the notebook-level variables `query`, `content` and `llm`.
    """
    feedback_data = load_feedback_data()

    # Adjust relevance scores for the documents retrieved for the current query
    docs = retriever.invoke(query)
    adjusted_docs = adjust_relevance_scores(query, docs, feedback_data)

    # Record the re-ranked documents on the current retriever
    retriever.search_kwargs['k'] = len(adjusted_docs)
    retriever.search_kwargs['docs'] = adjusted_docs

    # Fine-tune the index (in production this would run periodically, e.g., daily or weekly)
    new_vectorstore = fine_tune_index(feedback_data, content)
    retriever = new_vectorstore.as_retriever()
    qa_chain = RetrievalQA.from_chain_type(llm, retriever=retriever)
    return retriever, qa_chain

### Demonstration of how to retrieve answers with respect to user feedback

In [None]:
while True:
    query = input("Enter your query: ")
    response = qa_chain.invoke({"query": query})["result"]
    print("\nResponse:", response)

    satisfaction = input("\nAre you satisfied with the result? (yes/no): ").strip().lower()

    if satisfaction == "yes":
        print("Thank you for your feedback!")
        break
    elif satisfaction == "no":
        action = input("\nWhat would you like to do? \n1. Give feedback rating\n2. Provide your own answer\nEnter 1 or 2: ").strip()

        if action == "1":
            relevance = input("Rate the relevance from 1 to 5: ")
            print("Thank you for your rating!")
            quality = input("Rate the quality from 1 to 5: ")
            print("Thank you for your rating!")
            # Collect feedback
            feedback = get_user_feedback(query, response, relevance, quality)
            # Store feedback
            store_feedback(feedback)

            retriever, qa_chain = retrieval_update(retriever)

        elif action == "2":
            user_answer = input("Please provide your own answer: ")
            print("Thank you for your input!")
            relevance = 5
            quality = 5
            # Collect feedback
            feedback = get_user_feedback(query, user_answer, relevance, quality)
            # Store feedback
            store_feedback(feedback)
            retriever, qa_chain = retrieval_update(retriever)
        else:
            print("Invalid choice. Exiting.")
        # End the session after one round of feedback
        break
    else:
        print("Invalid input. Please enter yes or no.")


Enter your query: How many layers are there in the atmosphere?

Response: The text doesn't explicitly mention the number of layers in the atmosphere, but it does mention the "troposphere" which is a specific layer. However, I can provide general information about the layers of the atmosphere.

The Earth's atmosphere is generally divided into 5 layers:

1. Troposphere: The lowest layer, extending up to about 12 km (7.5 miles) above the Earth's surface.
2. Stratosphere: Above the troposphere, extending up to about 50 km (31 miles).
3. Mesosphere: Above the stratosphere, extending up to about 85 km (53 miles).
4. Thermosphere: Above the mesosphere, extending up to about 600 km (373 miles).
5. Exosphere: The outermost layer, extending from the thermosphere to the edge of space, where the atmosphere interacts with the solar wind.

Please note that these layers are not sharply defined and can overlap in certain regions.



### Fine-tune the vectorstore periodically

In [21]:
# Periodically (e.g., daily or weekly), fine-tune the index
# new_vectorstore = fine_tune_index(load_feedback_data(), content)
# retriever = new_vectorstore.as_retriever()
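
The notebook leaves the scheduling of this step open. One simple option, sketched below under the assumption that a time-based trigger is acceptable, is a small helper that reports when the chosen interval has elapsed. The class name `FineTuneSchedule` and the injectable `clock` argument are hypothetical and exist only to keep the example testable.

```python
import time

class FineTuneSchedule:
    """Tracks when the index was last fine-tuned and reports when the next run is due."""

    def __init__(self, interval_seconds, clock=time.time):
        self.interval_seconds = interval_seconds
        self.clock = clock
        self.last_run = None

    def due(self):
        """Return True (and record the run time) if no run has happened yet or the interval has elapsed."""
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.interval_seconds:
            self.last_run = now
            return True
        return False

# Demonstration with a fake clock that returns 0, 10 and 100 seconds
fake_times = iter([0, 10, 100])
schedule = FineTuneSchedule(interval_seconds=60, clock=lambda: next(fake_times))
results = [schedule.due() for _ in range(3)]
print(results)
```

In the feedback loop, a check such as `if schedule.due():` could wrap the call to `fine_tune_index`, so the index is rebuilt at most once per interval rather than after every piece of feedback.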
