# RAG_Context_Relevancy_Checker


- [RAG_Context_Relevancy_Checker](https://medium.com/the-ai-forum/rag-context-relevancy-checker-agent-using-deepseek-r1-70b-on-groq-modernbert-langchain-58edb6b0f29c)
- [Paper: Developing Retrieval Augmented Generation (RAG) based LLM Systems from PDFs: An Experience Report](https://arxiv.org/pdf/2410.15944v1)


## SETUP


In [4]:
import os
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())

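# load_dotenv() already exports these keys into os.environ; fail early if either is missing
for key in ("GROQ_API_KEY", "HF_TOKEN"):
    if not os.getenv(key):
        raise EnvironmentError(f"{key} is not set; add it to your .env file or environment")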

In [6]:
from langchain.chains import SequentialChain, LLMChain
from langchain.prompts import PromptTemplate
from langchain_groq import ChatGroq

In [7]:
llm_judge = ChatGroq(model="deepseek-r1-distill-llama-70b")
llm_rag = ChatGroq(model="mixtral-8x7b-32768")

In [8]:
llm_judge.verbose = True
llm_rag.verbose = True

## DATA

In [9]:
!wget "https://arxiv.org/pdf/2410.15944v1" -O ../data/RAG.pdf

--2025-02-15 16:32:15--  https://arxiv.org/pdf/2410.15944v1
Resolving arxiv.org (arxiv.org)... 151.101.3.42, 151.101.131.42, 151.101.195.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.3.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 749669 (732K) [application/pdf]
Saving to: ‘../data/RAG.pdf’


2025-02-15 16:32:16 (1.09 MB/s) - ‘../data/RAG.pdf’ saved [749669/749669]



In [10]:
from langchain_community.document_loaders import PDFPlumberLoader

loader = PDFPlumberLoader("../data/RAG.pdf")

docs = loader.load()
print(len(docs))
print(docs[0].metadata)

36
{'source': '../data/RAG.pdf', 'file_path': '../data/RAG.pdf', 'page': 0, 'total_pages': 36, 'Author': '', 'CreationDate': 'D:20241022015619Z', 'Creator': 'LaTeX with hyperref', 'Keywords': '', 'ModDate': 'D:20241022015619Z', 'PTEX.Fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5', 'Producer': 'pdfTeX-1.40.25', 'Subject': '', 'Title': '', 'Trapped': 'False'}


## EMBEDDINGS

In [19]:
# chunk documents
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_experimental.text_splitter import SemanticChunker

# set the embeddings model
model_name = "nomic-ai/modernbert-embed-base"
model_kwargs = {"device": "cpu"}
encode_kwargs = {"normalize_embeddings": False}
embedding_model = HuggingFaceEmbeddings(
    model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs
)

text_splitter = SemanticChunker(embedding_model)
documents = text_splitter.split_documents(docs)

print(len(documents))
print(documents[0].page_content)

73
Developing Retrieval Augmented Generation
(RAG) based LLM Systems from PDFs: An
Experience Report
Ayman Asad Khan Md Toufique Hasan
Tampere University Tampere University
ayman.khan@tuni.fi mdtoufique.hasan@tuni.fi
Kai Kristian Kemell Jussi Rasku Pekka Abrahamsson
Tampere University Tampere University Tampere University
kai-kristian.kemell@tuni.fi jussi.rasku@tuni.fi pekka.abrahamsson@tuni.fi
Abstract. This paper presents an experience report on the develop-
ment of Retrieval Augmented Generation (RAG) systems using PDF
documentsastheprimarydatasource.TheRAGarchitecturecombines
generativecapabilitiesofLargeLanguageModels(LLMs)withthepreci-
sionofinformationretrieval.Thisapproachhasthepotentialtoredefine
how we interact with and augment both structured and unstructured
knowledge in generative models to enhance transparency, accuracy and
contextuality of responses. The paper details the end-to-end pipeline,
from data collection, preprocessing, to retrieval indexing and response
generat
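
`SemanticChunker` decides where to cut by comparing embeddings of neighbouring sentences and splitting where they differ sharply. The sketch below illustrates that idea only; it is not the library's implementation. `toy_embed` is a hypothetical bag-of-words stand-in for the real embedding model, and the fixed `threshold` is an assumption that replaces the percentile-based breakpoint the library is understood to use by default.

```python
import math
import re
from collections import Counter


def toy_embed(sentence):
    """Hypothetical stand-in for a real embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - (dot / norm if norm else 0.0)


def semantic_chunks(sentences, threshold=0.8):
    """Start a new chunk whenever adjacent sentences are farther apart than threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine_distance(toy_embed(prev), toy_embed(cur)) > threshold:
            chunks.append([cur])
        else:
            chunks[-1].append(cur)
    return [" ".join(chunk) for chunk in chunks]


sentences = [
    "RAG combines retrieval with generation.",
    "Retrieval finds passages for generation.",
    "Tampere is a city in Finland.",
]
for chunk in semantic_chunks(sentences):
    print(chunk)
```

The first two sentences share vocabulary and stay together; the third shares none and starts a new chunk.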

## VECTOR STORE

In [20]:
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_metadata={"hnsw:space": "cosine"},
    collection_name="deepseek_collection",
    embedding_function=embedding_model,
    persist_directory="../data/chroma_langchain_db",
)

In [21]:
# add embeddings to vector store
vector_store.add_documents(documents=documents)
len(vector_store.get()["documents"])

73

In [22]:
# setup the retriever
retriver = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 5})

In [43]:
retriver.invoke("what is RAG")

[Document(id='be70672d-0891-4a7b-a555-d8b6c8211703', metadata={'Author': '', 'CreationDate': 'D:20241022015619Z', 'Creator': 'LaTeX with hyperref', 'Keywords': '', 'ModDate': 'D:20241022015619Z', 'PTEX.Fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5', 'Producer': 'pdfTeX-1.40.25', 'Subject': '', 'Title': '', 'Trapped': 'False', 'file_path': '../data/RAG.pdf', 'page': 31, 'source': '../data/RAG.pdf', 'total_pages': 36}, page_content='Fig.6: Most Valuable Aspects of the Workshop. implementation of RAG systems.'),
 Document(id='04d47e53-02c4-439d-8ba6-fa89d18bffd6', metadata={'Author': '', 'CreationDate': 'D:20241022015619Z', 'Creator': 'LaTeX with hyperref', 'Keywords': '', 'ModDate': 'D:20241022015619Z', 'PTEX.Fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5', 'Producer': 'pdfTeX-1.40.25', 'Subject': '', 'Title': '', 'Trapped': 'False', 'file_path': '../data/RAG.pdf', 'page': 2, 'sou
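
With `search_type="similarity"` and `k=5`, the retriever returns the five chunks whose embeddings are closest to the query embedding. A minimal brute-force sketch of that ranking is shown here, assuming cosine similarity as configured through `hnsw:space`. Chroma uses an approximate HNSW index rather than an exhaustive scan, and the 2-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=5):
    """Return (index, similarity) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical 2-dimensional embeddings, for illustration only
doc_vecs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(top_k([1.0, 0.1], doc_vecs, k=2))
```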

## PROMPTS

In [27]:
relevancy_prompt = """You are an expert judge tasked with evaluating whether EACH CONTEXT provided in the CONTEXT LIST is self-sufficient to answer the QUERY asked.
Analyze the provided QUERY and CONTEXT LIST to determine whether each content in the CONTEXT LIST contains relevant information to answer the QUERY.

Guidelines:
1. The content must not introduce new information beyond what's provided in the QUERY.
2. Pay close attention to the subject of statements. Ensure that attributes, actions, or dates are correctly associated with the right entities (e.g., a person vs. a TV show they star in).
3. Be vigilant for subtle misattributions or conflations of information, even if the date or other details are correct.
4. Check that the content in the CONTEXT LIST doesn't oversimplify or generalize information in a way that changes the meaning of the QUERY.

Analyze the text thoroughly and assign a relevancy score of 0 or 1, where:
- 0: The content has all the necessary information to answer the QUERY
- 1: The content does not have the necessary information to answer the QUERY

```
EXAMPLE:

INPUT (for context only, not to be used for faithfulness evaluation):
What is the capital of France?

CONTEXT:
['France is a country in Western Europe. Its capital is Paris, which is known for landmarks like the Eiffel Tower.',
'Mr. Naveen Patnaik has been the chief minister of Odisha for 5 consecutive terms']

OUTPUT:
The Context has sufficient information to answer the query.

RESPONSE:
{{"score":0}}
```

CONTEXT LIST:
{context}

QUERY:
{retriever_query}
Provide your verdict in JSON format as a list with one object per content, using the keys 'content', 'score', and 'Reasoning', and no preamble:
[{{"content": 1, "score": <your score, either 0 or 1>, "Reasoning": <why you chose the score 0 or 1>}},
{{"content": 2, "score": <your score, either 0 or 1>, "Reasoning": <why you chose the score 0 or 1>}},
...]

"""

context_relevancy_checker_prompt = PromptTemplate(
    input_variables=["retriever_query", "context"], template=relevancy_prompt
)
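
The doubled braces in the template are escapes: only single-brace names such as `{context}` are treated as input variables, while `{{` and `}}` render as literal braces in the JSON example. `PromptTemplate` defaults to f-string style formatting, which follows the same escaping rule as Python's `str.format`, so the behaviour can be checked without LangChain. The shortened template below is illustrative, not the notebook's full prompt.

```python
# Shortened, illustrative template: one real variable plus an escaped JSON example
template = 'QUERY:\n{retriever_query}\nRespond as: {{"score": <0 or 1>}}'

rendered = template.format(retriever_query="What is RAG?")
print(rendered)
```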

In [28]:
# Relevant Context Picker Agent
relevant_prompt = PromptTemplate(
    input_variables=["relevancy_response"],
    template="""
    Your main task is to analyze the JSON structure that forms part of the Relevancy Response.
    Review the Relevancy Response and do the following:
    (1) Look at the JSON structure content.
    (2) Analyze the 'score' key in the JSON structure content.
    (3) Pick the value of the 'content' key for each entry whose 'score' value is 0.

    Relevancy Response:
    {relevancy_response}

    Provide your verdict in JSON format with a single key 'content' and no preamble or explanation:
    [{{"content":<content number>}}]

    """,
)
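
The judge model's reply starts with a `<think>` block before the JSON verdicts, as the printed output of this notebook shows. A deterministic parser is one alternative to a second LLM call for reading those verdicts. The sketch below is not part of the notebook's chain: `parse_judge_output` and `relevant_content_numbers` are hypothetical helper names, the sample reply is invented, and the code assumes the judge returns a well-formed JSON array.

```python
import json
import re


def parse_judge_output(raw):
    """Drop any <think>...</think> block, then parse the first JSON array found."""
    without_thoughts = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    match = re.search(r"\[.*\]", without_thoughts, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in judge output")
    return json.loads(match.group(0))


def relevant_content_numbers(verdicts):
    """Keep content numbers scored 0, which this notebook's prompt uses to mean relevant."""
    return [v["content"] for v in verdicts if v.get("score") == 0]


# Hypothetical judge output, not taken from a real run
raw = """<think>Content 1 only mentions RAG. Content 2 defines it.</think>
[{"content": 1, "score": 1, "Reasoning": "mentions RAG without defining it"},
 {"content": 2, "score": 0, "Reasoning": "defines RAG"}]"""
print(relevant_content_numbers(parse_judge_output(raw)))
```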

In [29]:
# Meaningful Context for Response Synthesis Agent
context_prompt = PromptTemplate(
    input_variables=["context_number"],
    template="""
    Your main task is to analyze the JSON structure in the Context Number Response together with the list of contexts provided in the 'Content List', and perform the following steps:
    (1) Look at the output from the Relevant Context Picker Agent.
    (2) Analyze the 'content' key in the JSON structure format ({{"content":<<content_number>>}}).
    (3) Retrieve the value of the 'content' key and pick the context corresponding to that element from the Content List provided.
    (4) Pass on the retrieved context for each corresponding element number referred to in the 'Context Number Response'.

    Context Number Response:
    {context_number}

    Content List:
    {context}

    Provide your verdict in JSON format with two keys, 'context_number' and 'relevant_content', and no preamble or explanation:
    [{{"context_number":<content1>,"relevant_content":<content corresponding to that element 1 in the Content List>}},
    {{"context_number":<content4>,"relevant_content":<content corresponding to that element 4 in the Content List>}},
    ...
    ]
    """,
)
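
Mapping content numbers back to their text is a plain list lookup, so it can also be done without a model call. This is a sketch under the assumption that content numbers are 1-based positions in the retrieved list, as the prompt examples (`content1`, `content4`) suggest. `select_contexts` is a hypothetical helper, not something the notebook defines.

```python
def select_contexts(contexts, content_numbers):
    """Map 1-based content numbers onto the context list, skipping invalid numbers."""
    selected = []
    for number in content_numbers:
        if isinstance(number, int) and 1 <= number <= len(contexts):
            selected.append(
                {"context_number": number, "relevant_content": contexts[number - 1]}
            )
    return selected


# Hypothetical retrieved contexts and picked numbers
contexts = ["RAG is mentioned in a figure caption.", "RAG combines retrieval and generation."]
print(select_contexts(contexts, [2, 7]))
```

Out-of-range numbers such as `7` are dropped rather than raising, which guards against a judge that miscounts the contexts.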

In [39]:
# Updated Final Response Chain with RAG-specific prompt
final_prompt = PromptTemplate(
    input_variables=["query", "relevant_contexts"],
    template="""You are a helpful assistant very proficient in formulating clear and meaningful answers from the context provided.

SYSTEM INSTRUCTIONS:
1. Use ONLY the information from the provided relevant contexts to formulate your response
2. Provide clear, concise, and meaningful answers
3. If the contexts don't contain sufficient information, respond with 'I do not know'
4. Maintain accuracy while synthesizing information
5. Use natural, coherent language
6. Avoid making up information not present in the contexts

QUERY:
{query}

RELEVANT CONTEXTS:
{relevant_contexts}

Based on the above information, provide a clear and concise response. Remember:
- Only use information from the provided contexts
- Keep the response focused and meaningful
- Say 'I do not know' if contexts are insufficient
- Do not make up or infer information not present in the contexts

ANSWER:""",
)

## CHAINS

In [40]:
from langchain.chains import SequentialChain, LLMChain

# Judge: score each retrieved context against the query
context_relevancy_evaluation_chain = LLMChain(
    llm=llm_judge,
    prompt=context_relevancy_checker_prompt,
    output_key="relevancy_response",
)
# Pick the content numbers that the judge scored 0
pick_relevant_context_chain = LLMChain(
    llm=llm_judge, prompt=relevant_prompt, output_key="context_number"
)
# Look up the text of the picked contexts in the content list
relevant_contexts_chain = LLMChain(
    llm=llm_judge, prompt=context_prompt, output_key="relevant_contexts"
)

# Generate the final answer from the relevant contexts only
response_chain = LLMChain(llm=llm_rag, prompt=final_prompt, output_key="final_response")

In [41]:
# Updated Sequential Chain with all outputs
context_management_chain = SequentialChain(
    chains=[
        context_relevancy_evaluation_chain,
        pick_relevant_context_chain,
        relevant_contexts_chain,
        response_chain,
    ],
    input_variables=["context", "retriever_query", "query"],
    output_variables=[
        "relevancy_response",
        "context_number",
        "relevant_contexts",
        "final_response",
    ],
    verbose=True,  # Set to False in production
)
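
Each chain writes its result under its `output_key`, and later chains read that key as an input variable; this is how `relevancy_response` reaches the picker and `context_number` reaches the context lookup. The sketch below shows that data flow in plain Python, assuming a shared dictionary of values. `run_sequential` and the lambda steps are hypothetical stand-ins for the LLM chains, not LangChain's implementation.

```python
def run_sequential(steps, inputs):
    """Run (output_key, function) steps in order, adding each result to a shared dict."""
    state = dict(inputs)
    for output_key, step in steps:
        state[output_key] = step(state)
    return state


# Hypothetical stand-ins for the LLM chains
steps = [
    ("relevancy_response", lambda s: f"verdicts for {len(s['context'])} contexts"),
    ("context_number", lambda s: f"picked from: {s['relevancy_response']}"),
]
result = run_sequential(steps, {"context": ["a", "b"], "query": "What is RAG?"})
print(result["context_number"])
```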

In [42]:
# Example usage with retriever
def get_rag_response(query, retriever):
    # Get contexts from retriever
    contexts = retriever.invoke(query)
    context = [d.page_content for d in contexts]

    # Run the chain
    final_output = context_management_chain.invoke(
        {"context": context, "retriever_query": query, "query": query}
    )

    return final_output
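
The function passes `context` as a Python list, so the prompt receives its `repr` and the judge has to count items itself to produce `content` numbers. One option, shown here as a sketch rather than as the notebook's approach, is to label each context explicitly before formatting the prompt. `number_contexts` and the `Content N:` label format are assumptions.

```python
def number_contexts(contexts):
    """Render contexts as a numbered block so 'content' numbers are unambiguous."""
    return "\n\n".join(
        f"Content {i}:\n{text}" for i, text in enumerate(contexts, start=1)
    )


print(number_contexts(["First retrieved chunk.", "Second retrieved chunk."]))
```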

In [44]:
# Example usage
query = "What is RAG?"
final_output = get_rag_response(query, retriver)

# Access individual outputs
relevancy_response = final_output["relevancy_response"]
context_number = final_output["context_number"]
relevant_contexts = final_output["relevant_contexts"]
final_response = final_output["final_response"]

> Entering new SequentialChain chain...

> Finished chain.


In [45]:
# Print the judge's relevancy verdicts (Context Relevancy Checker)
print("\n-------- 🟥 context_relevancy_evaluation_chain Statement 🟥 --------\n")
print(final_output["relevancy_response"])

# Print the content numbers picked as relevant (Relevant Context Picker Agent)
print("\n-------- 🟦 pick_relevant_context_chain Statement 🟦  --------\n")
print(final_output["context_number"])

print("\n-------- 🟥 relevant_contexts_chain Statement 🟥 --------\n")
print(final_output["relevant_contexts"])

print("\n-------- 🟥 Rag Response Statement 🟥 --------\n")
print(final_output["final_response"])


-------- 🟥 context_relevancy_evaluation_chain Statement 🟥 --------

<think>
Alright, I need to determine if each content in the provided context list sufficiently answers the query "What is RAG?" I'll evaluate each content one by one based on the guidelines given.

First, the query is asking for a definition or explanation of RAG. So, I'm looking for content that clearly defines RAG, perhaps by expanding the acronym and explaining its components.

Looking at content 1: It mentions "implementation of RAG systems" and "Retrieval Augmented Generation (RAG) system." It includes Figure 1 about the architecture, which might visually explain RAG, but without the figure, the text only refers to RAG without defining it. So, this might not be sufficient on its own.

Content 2: It talks about enhancing the model’s ability to respond and refers to Figure 1 again. It doesn't provide a definition of RAG, just mentions it in the context of architecture. So, it's similar to content 1 and probably ins