# **RAG Fusion**
RAG-Fusion extends the standard Retrieval-Augmented Generation (RAG) pipeline. Given a user query, a large language model first generates several related sub-queries, and each sub-query is sent to the retriever so that a broader set of relevant documents is found. Instead of passing the retrieved documents straight to the model, RAG-Fusion applies Reciprocal Rank Fusion (RRF) to score and reorder them across all result lists. The best-ranked documents are then used as context, with the aim of generating a more accurate response.

Research Paper: [RAG Fusion](https://arxiv.org/pdf/2402.03367)
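
The flow described above can be sketched with plain Python before any library is involved. This is a toy illustration under stated assumptions: the sub-queries, document ids, and the `fuse` helper are all hypothetical, and ranks are counted from 1 here, whereas the notebook's own function counts from 0.

```python
from collections import defaultdict

# Hypothetical ranked results, one list per generated sub-query (best hit first).
results_per_query = {
    "what are points on a mortgage": ["doc_a", "doc_b", "doc_c"],
    "how do mortgage points work": ["doc_b", "doc_a", "doc_d"],
    "are discount points worth paying": ["doc_b", "doc_c", "doc_a"],
}

def fuse(ranked_lists, k=60):
    """Sum 1 / (k + rank) over every list a document appears in (rank is 1-based)."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

fused = fuse(results_per_query.values())
# Keep only the two best documents as context for the generator
top_documents = [doc_id for doc_id, _ in fused[:2]]
print(top_documents)
```

A document that ranks consistently well across several sub-queries (`doc_b` here) ends up ahead of one that tops a single list, which is the behavior RRF is meant to reward.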

## **Initial Setup**

In [1]:
! pip install --q athina langsmith




In [2]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# load_dotenv() has already copied the keys into os.environ.
# Assigning os.getenv(...) back would raise a TypeError if a key were missing,
# so only check which keys are absent.
required_keys = ["OPENAI_API_KEY", "ATHINA_API_KEY", "QDRANT_API_KEY"]
missing_keys = [key for key in required_keys if not os.getenv(key)]

# Optional: Verify keys are loaded
if missing_keys:
    print(f"Warning: API keys not loaded from .env file: {missing_keys}")

## **Indexing**

In [3]:
# load embedding model
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [4]:
# load data
from langchain_community.document_loaders import CSVLoader
loader = CSVLoader(file_path="./context.csv", encoding="utf-8")
documents = loader.load()

In [5]:
# split documents
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
documents = text_splitter.split_documents(documents)
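
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a simplified fixed-window chunker. It is an illustrative sketch only: `RecursiveCharacterTextSplitter` additionally tries to split on separators such as paragraph and line breaks, which the hypothetical `chunk_text` helper below does not do.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> list[str]:
    """Split text into windows of at most chunk_size characters (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

sample = "x" * 1200
chunks = chunk_text(sample, chunk_size=500, chunk_overlap=0)
print([len(chunk) for chunk in chunks])
```

With `chunk_overlap=0`, as in the cell above, consecutive chunks share no characters, so joining them reproduces the input.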

## **Qdrant Vector Database**

In [8]:
# create vectorstore
from langchain_community.vectorstores import Qdrant
vectorstore = Qdrant.from_documents(
    documents,
    embeddings,
    url="https://5336c8c7-c338-4712-a2aa-43bf713258fd.us-west-2-0.aws.cloud.qdrant.io",
    prefer_grpc=True,
    collection_name="documents",
    api_key=os.environ["QDRANT_API_KEY"],
)

## **Chromadb (Optional)**

In [None]:
# # optional vectorstore
# !pip install chromadb
# # create vectorstore
# from langchain.vectorstores import Chroma
# vectorstore = Chroma.from_documents(documents, embeddings)

## **Retriever**

In [9]:
# create retriever
retriever = vectorstore.as_retriever()

## **Reciprocal Rank Fusion Chain**

In [10]:
# create llm
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()

In [11]:
# create chain
from langchain_core.output_parsers import StrOutputParser
from langsmith import Client
client = Client()
prompt = client.pull_prompt("langchain-ai/rag-fusion-query-generation")



In [12]:
# generate queries
generate_queries = (
    prompt | ChatOpenAI(temperature=0) | StrOutputParser() | (lambda x: x.split("\n"))
)
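
Splitting the model output on `"\n"` keeps whatever the model emits, including blank lines or list numbering, and each of those would be sent to the retriever as a query. A minimal sketch of a stricter parser, assuming the model returns one query per line; `parse_queries` is a hypothetical helper and not part of the original chain.

```python
import re

def parse_queries(llm_output: str) -> list[str]:
    """Return non-empty query lines with leading list numbering or bullets removed."""
    queries = []
    for line in llm_output.split("\n"):
        # Drop prefixes such as "1. ", "2) " or "- " and surrounding whitespace
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            queries.append(cleaned)
    return queries

example_output = "1. what are mortgage points\n\n2. how do discount points work\n"
print(parse_queries(example_output))
```

Because LCEL accepts plain callables in a pipe, the helper could stand in for the `lambda`, for example `prompt | ChatOpenAI(temperature=0) | StrOutputParser() | parse_queries`.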

In [13]:
# rerank results
from langchain.load import dumps, loads

def reciprocal_rank_fusion(results: list[list], k=60):
    """Fuse several ranked document lists into one list of (doc, score) pairs."""
    fused_scores = {}
    for docs in results:
        # Assumes the docs are returned in sorted order of relevance
        for rank, doc in enumerate(docs):
            # Serialize the document so it can be used as a dictionary key
            doc_str = dumps(doc)
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # rank is 0-based here, so the top hit of each list contributes 1 / k
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort by fused score, highest first, and restore the Document objects
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]
    return reranked_results

In [14]:
# create chain
chain = generate_queries | retriever.map() | reciprocal_rank_fusion

In [15]:
# check input schema
chain.input_schema.schema()

{'properties': {'original_query': {'title': 'Original Query',
   'type': 'string'}},
 'required': ['original_query'],
 'title': 'PromptInput',
 'type': 'object'}

In [16]:
# rerank results
chain.invoke("what are points on a mortgage")

[(Document(metadata={'source': './context.csv', 'row': 1, '_id': '0bde8e87-f7d5-4552-805b-69a6071e0ec8', '_collection_name': 'documents'}, page_content='context: ["Discount points, also called mortgage points or simply points, are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. By charging a borrower points, a lender effectively increases the yield on the loan above the amount of the stated interest rate.  Borrowers can offer to pay a lender points as a method to reduce the interest rate on the loan, thus obtaining a lower monthly payment in exchange for this'),
  0.06612021857923497),
 (Document(metadata={'source': './context.csv', 'row': 1, '_id': '17a9bc95-41e1-4a3c-9eaf-2b7f5a871d12', '_collection_name': 'documents'}, page_content='rate.Points may also be purchased to reduce the monthly payment for the purpose of qualifying for a loan.  Loan qualification based on monthly income versus the monthl
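
The top fused score in the output above, 0.06612021857923497, can be checked by hand. With `k=60` and 0-based ranks, a first-place hit contributes 1/60 and a second-place hit contributes 1/61. One decomposition consistent with the score is two first places plus two second places across four result lists. The per-query rankings are not shown in the notebook, so treat `assumed_ranks` as an assumption rather than a confirmed trace.

```python
import math

k = 60
# Assumed 0-based ranks of the top document in four result lists
assumed_ranks = [0, 0, 1, 1]
score = sum(1 / (rank + k) for rank in assumed_ranks)

print(score)
# Compare against the score reported in the notebook output
print(math.isclose(score, 0.06612021857923497, rel_tol=1e-12))
```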

## **RAG Chain**

In [17]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate

template = """Answer the question based only on the following context.
If you don't find the answer in the context, just say that you don't know.

Context: {context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

rag_fusion_chain = (
    {
        "context": chain,
        "question": RunnablePassthrough()
    }
    | prompt
    | llm
    | StrOutputParser()
)

In [18]:
rag_fusion_chain.invoke("what are points on a mortgage")

'Points on a mortgage are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. Borrowers can pay points to reduce the interest rate on the loan, thereby obtaining a lower monthly payment in exchange. Points may also be purchased to reduce the monthly payment for the purpose of qualifying for a loan.'
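
In the chain above, `{context}` receives the raw list of `(Document, score)` tuples from the fusion step, so the prompt contains their string representation, metadata and scores included. A minimal sketch of formatting only the top results into plain text first. Here `Doc` is a stand-in for LangChain's `Document`, and `format_context` and `top_n` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Stand-in for a LangChain Document (assumption for this sketch)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_context(fused_results, top_n=3):
    """Join the page_content of the top_n (doc, score) pairs, best first."""
    top_results = fused_results[:top_n]
    return "\n\n".join(doc.page_content for doc, _score in top_results)

fused_results = [
    (Doc("One point equals one percent of the loan amount."), 0.066),
    (Doc("Points may be purchased to reduce the monthly payment."), 0.049),
    (Doc("An unrelated passage."), 0.016),
]
print(format_context(fused_results, top_n=2))
```

Since the function only relies on the `page_content` attribute, it could plausibly be wired in as `"context": chain | format_context`, which also caps how many fused documents reach the prompt.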

## **Preparing Data for Evaluation**

In [19]:
question = ["what are points on a mortgage"]
response = []
contexts = []
ground_truths = ["Points, sometimes also called a 'discount point', are a form of pre-paid interest."]

# Inference
for query in question:
    response.append(rag_fusion_chain.invoke(query))
    # Contexts are taken from the plain retriever for the original query
    contexts.append([doc.page_content for doc in retriever.invoke(query)])

# To dict
data = {
    "query": question,
    "response": response,
    "context": contexts,
    "ground_truth": ground_truths
}


In [20]:
# create dataset
from datasets import Dataset
dataset = Dataset.from_dict(data)

In [21]:
# create dataframe
import pandas as pd
df = pd.DataFrame(dataset)

In [22]:
df

|   | query | response | context | ground_truth |
|---|---|---|---|---|
| 0 | what are points on a mortgage | Points on a mortgage are a form of pre-paid in... | [context: ["Discount points, also called mortg... | Points, sometimes also called a 'discount poin... |


In [23]:
# Convert to dictionary
df_dict = df.to_dict(orient='records')

# Convert context to list
for record in df_dict:
    if not isinstance(record.get('context'), list):
        if record.get('context') is None:
            record['context'] = []
        else:
            record['context'] = [record['context']]

## **Evaluation in Athina AI**

We will use the **Answer Relevancy** eval here. It measures how pertinent the generated response is to the given prompt. Please refer to our [documentation](https://docs.athina.ai/api-reference/evals/preset-evals/overview) for further details.
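
As a rough intuition for the metric, Ragas answer relevancy is commonly described as follows: an LLM generates questions from the response, and the score is the mean cosine similarity between the embeddings of those questions and the embedding of the original query. The sketch below assumes that description and uses hand-written toy vectors instead of real embeddings; `answer_relevancy` is a hypothetical helper, not the Athina or Ragas API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def answer_relevancy(question_vec, generated_question_vecs):
    """Mean similarity between the query and questions derived from the answer."""
    similarities = [cosine_similarity(question_vec, vec) for vec in generated_question_vecs]
    return sum(similarities) / len(similarities)

# Toy embeddings: one generated question matches the query, one is orthogonal
question_vec = [1.0, 0.0]
generated_question_vecs = [[1.0, 0.0], [0.0, 1.0]]
relevancy = answer_relevancy(question_vec, generated_question_vecs)
print(relevancy)
```

Under this reading, a response that drifts off topic yields generated questions far from the original query, which pulls the average down.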

In [25]:
from athina.keys import AthinaApiKey, OpenAiApiKey

OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))
AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY'))

In [28]:
!pip uninstall litellm -y
!pip install "litellm==1.20.7"



In [33]:
!pip uninstall athina -y
!pip install -U athina




In [None]:
from athina.loaders import Loader
dataset = Loader().load_dict(df_dict)

In [None]:
# evaluate
from athina.evals import RagasAnswerRelevancy
RagasAnswerRelevancy(model="gpt-4o").run_batch(data=dataset).to_df()


You can view your dataset at: https://app.athina.ai/develop/f9377fc3-cd03-4feb-9b52-c8e747590c5b


|   | query | context | response | expected_response | display_name | failed | grade_reason | runtime | model | ragas_answer_relevancy |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | what are points on a mortgage | [context: ["Discount points, also called mortgage points or simply points, are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. By charging a borrower points, a lender effectively increases the yield on the loan above the amount of the stated interest rate. Borrowers can offer to pay a lender points as a method to reduce the interest rate on the loan, thus obtaining a lower monthly payment in exchange for ... | Points on a mortgage are a form of pre-paid interest available in the United States when arranging a mortgage. One point equals one percent of the loan amount. Borrowers can offer to pay points to reduce the interest rate on the loan, thus obtaining a lower monthly payment in exchange for this up-front payment. |  | Ragas Answer Relevancy |  | A response is deemed relevant when it directly and appropriately addresses the original query. Importantly, our assessment of answer relevance does not consider factuality but instead penalizes cases where the response lacks completeness or contains redundant details | 1454 | gpt-4o | 0.919036 |
