
# RAG FUSION 

If you're working with Retrieval-Augmented Generation (RAG) systems in AI, you've probably noticed that sometimes... the retrieval step just doesn't bring back the best info.

That's where RAG Fusion comes in: it rewrites the query several ways, retrieves documents for each rewrite, and merges the ranked results before generating a response.


**Let's break it down with a real-world example** 👇

Say you ask your RAG system:

🧠 “What are the recent trends in electric vehicle (EV) battery technology?”

A standard RAG pipeline would:

1. Use that exact query to retrieve documents.

2. Feed those into a language model to generate an answer.



Good start. But what if that query misses better info phrased differently?



⚡ **Enter RAG Fusion:**

Instead of relying on just one query, RAG Fusion:

1. Generates multiple queries from the original one.

   Example:

   Query: "What are the recent trends in electric vehicle battery technology?"

   Generated queries:

   - "Latest innovations in EV battery design"
   - "Advancements in lithium-ion batteries for electric cars"
   - "Trends in EV energy storage tech"

2. Retrieves documents for each rewritten query.

3. Fuses, reranks, and deduplicates all the results. An LLM can help remove duplicate entries, and an algorithm such as reciprocal rank fusion (RRF) produces the final ranking.

4. Passes the most relevant content to the language model.
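
Deduplication in step 3 does not have to involve an LLM. A minimal sketch, assuming the duplicates are identical up to case and whitespace: key each chunk on a normalized form of its text and keep the first occurrence. This is a simpler stand-in for LLM-based deduplication, and the `dedupe_chunks` helper and the sample strings are hypothetical, not part of the notebook below.

```python
def dedupe_chunks(chunks):
    """Drop chunks whose text repeats an earlier chunk, ignoring case and whitespace."""
    seen = set()
    unique = []
    for text in chunks:
        # Normalize: lowercase and collapse runs of whitespace
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


results = [
    "Solid-state batteries offer faster charging.",
    "solid-state  batteries offer faster charging.",
    "Battery recycling reduces mining.",
]
unique_results = dedupe_chunks(results)
print(unique_results)
```

Near-duplicates that differ in wording would slip through this check; that is the case where an LLM or an embedding-similarity threshold becomes useful.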



The result? 🧠 A much more informed and accurate answer.



🧪 **Why it works:**

- Broader coverage of information sources.

- Higher chance of surfacing high-quality, diverse content.

- More resilient to vague or poorly phrased questions.





### Steps
1. Query generation
2. Retrieval for each query
3. Reciprocal rank fusion of the results
4. Final generation
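
The four steps can be sketched end to end without any external service. Everything here is a stand-in: `generate_queries` returns hard-coded rewrites where the notebook calls an LLM, `retrieve` ranks by shared words where the notebook uses embeddings, and `generate_answer` only joins the context. The toy `CORPUS` is made up as well. Only the control flow is meant to carry over.

```python
from collections import defaultdict

# Toy corpus (hypothetical), keyed by document id
CORPUS = {
    "d1": "solid-state batteries improve ev battery energy density",
    "d2": "battery recycling recovers lithium from ev batteries",
    "d3": "public fast chargers reduce ev charging times",
}

def generate_queries(query):
    # Step 1 stand-in: an LLM would produce these rewrites
    return [query, "ev battery energy density", "ev battery recycling"]

def retrieve(query, k=2):
    # Step 2 stand-in: rank documents by the number of words shared with the query
    words = set(query.lower().split())
    scored = [(len(words & set(text.split())), doc_id) for doc_id, text in CORPUS.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for overlap, doc_id in scored[:k] if overlap > 0]

def fuse(ranked_lists, k=60):
    # Step 3: reciprocal rank fusion over the per-query rankings
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def generate_answer(query, context):
    # Step 4 stand-in: an LLM would write the answer from this context
    return f"Q: {query}\nContext used: {' | '.join(context)}"

def rag_fusion(query, top_n=2):
    queries = generate_queries(query)
    ranked_lists = [retrieve(q) for q in queries]
    fused_ids = fuse(ranked_lists)
    context = [CORPUS[doc_id] for doc_id in fused_ids[:top_n]]
    return fused_ids, generate_answer(query, context)

fused_ids, answer = rag_fusion("ev batteries")
print(fused_ids)
print(answer)
```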


In [176]:
import os
from typing import DefaultDict

from azure.identity import ClientSecretCredential
from dotenv import load_dotenv

load_dotenv()


True

In [190]:
def get_access_token():
    """
    Gets an access token by authenticating with Azure AD using a client secret.
    """
    credential = ClientSecretCredential(
        os.getenv('OPENAI_AD_TENANT_ID'),
        os.getenv('OPENAI_AD_CLIENT_ID'),
        os.getenv('OPENAI_AD_CLIENT_SECRET'),
    )
    # OPENAI_AD_TOKEN_BASE holds the scope the token is requested for
    access_token = credential.get_token(os.getenv('OPENAI_AD_TOKEN_BASE'))
    return access_token.token


## Normal RAG Pipeline

**Vectorizing the documents**

In [191]:
Docs = [

"EV batteries primarily use lithium-ion chemistry, known for high energy density and rechargeability. Other components include cobalt, nickel, manganese, and graphite, each influencing performance, cost, and safety.",
"Solid-state batteries are emerging as a promising alternative to conventional lithium-ion batteries. They offer increased energy density, faster charging, and improved safety due to the absence of liquid electrolytes.",
"Legacy automakers like Ford, GM, and Volkswagen have invested billions in EV development, with GM committing to an all-electric future by 2035. These investments cover vehicle platforms, battery plants, and EV-specific R&D.",
"Tech companies like Apple and Xiaomi are also exploring EVs, either through direct vehicle development or software and autonomous systems. Their entry reflects how EVs are becoming a convergence point for the auto and tech industries.",
"Public charging infrastructure is critical for EV adoption, with fast chargers (DCFC) reducing charging times to under 30 minutes. Countries like China and the U.S. are rapidly expanding nationwide networks.",
"Home charging accounts for over 80% of EV charging, making Level 2 residential chargers an essential part of the ecosystem. Incentives often cover installation costs to promote at-home charging convenience.",
"Many EVs are integrated with advanced driver-assistance systems (ADAS), leveraging sensors, AI, and over-the-air updates. Tesla's Autopilot and Rivian’s Driver+ are examples of smart EV technologies evolving toward autonomy.",
"Global EV sales surpassed 14 million units in 2023, representing nearly 18% of all new car sales. Asia, especially China, leads the market, but Europe and North America are also seeing rapid growth",
"EVs produce zero tailpipe emissions, making them a cleaner alternative to internal combustion engine vehicles. However, lifecycle emissions depend on how electricity is generated.",
"Battery recycling and reuse are becoming crucial to EV sustainability. Companies like Redwood Materials are working on systems to recover lithium, cobalt, and nickel from end-of-life batteries to reduce mining and environmental impact."

]

### Initialize an embedding model
from langchain_openai import AzureOpenAIEmbeddings

def get_embedding_model():
    embedding_model = AzureOpenAIEmbeddings(
        # azure_deployment=os.getenv("AZURE_OPENAI_EMBEDDING_MODEL_NAME"),
        model=os.getenv("AZURE_OPENAI_EMBEDDING_MODEL_NAME"),
        azure_endpoint=os.getenv("AZURE_OPENAI_EMBEDDING_BASE"),
        openai_api_version=os.getenv('AZURE_OPENAI_API_VERSION'),
        openai_api_key=get_access_token()
    )
    return embedding_model


embedding_model = get_embedding_model()



# Create a vector store from the documents
from langchain_core.vectorstores import InMemoryVectorStore

vectorstore = InMemoryVectorStore.from_texts(
    Docs,
    embedding=embedding_model,
)

**Query retriever**

In [192]:


query = "Ev batteries"

def get_chunks(query, k):
    # Returns (Document, score) pairs, ordered from most to least similar
    retrieved_documents = vectorstore.similarity_search_with_score(query=query, k=k)

    return retrieved_documents

chunks = get_chunks(query=query,k=3)
chunks

[(Document(id='d5f4eb62-2323-4ff2-8735-5e0a3fb506c9', metadata={}, page_content='EV batteries primarily use lithium-ion chemistry, known for high energy density and rechargeability. Other components include cobalt, nickel, manganese, and graphite, each influencing performance, cost, and safety.'),
  0.6496517173862397),
 (Document(id='cbdce63e-7cb0-48bf-9c55-7e1e331487bf', metadata={}, page_content='Battery recycling and reuse are becoming crucial to EV sustainability. Companies like Redwood Materials are working on systems to recover lithium, cobalt, and nickel from end-of-life batteries to reduce mining and environmental impact.'),
  0.5457863616770828),
 (Document(id='65a72b17-60e0-4f5d-b101-965f031213a2', metadata={}, page_content='EVs produce zero tailpipe emissions, making them a cleaner alternative to internal combustion engine vehicles. However, lifecycle emissions depend on how electricity is generated.'),
  0.49025371123702144)]
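
The number next to each document is a similarity score between the query embedding and the document embedding, which `InMemoryVectorStore` computes, as far as we can tell, as cosine similarity. A minimal sketch of the idea with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions; `top_k` and the vectors are hypothetical):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k):
    # Score every document against the query and keep the k best
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up embeddings, for illustration only
doc_vecs = {
    "battery_chemistry": [0.9, 0.1, 0.0],
    "battery_recycling": [0.6, 0.6, 0.1],
    "charging_network": [0.1, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

ranked = top_k(query_vec, doc_vecs, k=2)
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

A single query vector can only point in one direction, which is why a document phrased very differently from the query may score low even when it is relevant.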

**Answer Generator**

Initialize a prompt template

In [None]:
from langchain_core.prompts import SystemMessagePromptTemplate,HumanMessagePromptTemplate
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import AzureChatOpenAI

def get_llm():

    llm = AzureChatOpenAI(
        azure_endpoint=os.getenv("AZURE_OPENAI_API_BASE"),
        model=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
        openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
        api_key=get_access_token()
    )
    return llm


def get_retriever_prompt_template():

    system_message = "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know."

    user_message = """Context: {context}
                        Question: {question}
                        Answer:"""

    system_prompt = SystemMessagePromptTemplate.from_template(system_message)
    user_prompt = HumanMessagePromptTemplate.from_template(user_message)

    prompt = ChatPromptTemplate.from_messages(
            [
                system_prompt,
                user_prompt
            ]
        )
    return prompt


Answer generation chain

In [None]:
from langchain_core.output_parsers import StrOutputParser

# Initialize the LLM used for answering
llm = get_llm()


# Prompt
prompt_template = get_retriever_prompt_template()

# Create a chain
retriever_chain = (
        prompt_template
        | llm
        | StrOutputParser()
    )



In [194]:


query = "EV batteries"
chunks = get_chunks(k=3,query=query)
context = "\n\n".join([doc[0].page_content for doc in chunks])


# Pass the user input
response = retriever_chain.invoke({
        "context": context,
        "question": query
    })
print(response)

EV batteries primarily use lithium-ion chemistry, which is known for its high energy density and rechargeability. They also contain other components like cobalt, nickel, manganese, and graphite, which influence the battery's performance, cost, and safety.


## RAG Fusion Pipeline

The first step in RAG Fusion is to generate several rephrased queries from the original query.

In [195]:



def get_rephrasing_prompt_template():


    system_message = """You are a helpful assistant that generates multiple search queries based on a single input query.
    #Rules
    - generated search queries must be separated by commas
    - you should output only generated queries and nothing else
    """

    user_message = """Generate 2-3 search queries related to: {original_query}.
                      Generated_queries: """

    system_prompt = SystemMessagePromptTemplate.from_template(system_message)
    user_prompt = HumanMessagePromptTemplate.from_template(user_message)

    prompt = ChatPromptTemplate.from_messages(
            [
                system_prompt,
                user_prompt
            ]
        )
    return prompt


prompt = get_rephrasing_prompt_template()

rephrasing_chain = (prompt | llm | StrOutputParser())

raw_queries = rephrasing_chain.invoke({"original_query": query})
# Split on commas and drop empty fragments (for example, from a trailing comma)
rephrased_queries = [q.strip() for q in raw_queries.split(",") if q.strip()]

for q in rephrased_queries:
    print(q)

EV battery technology
electric vehicle battery types comparison
how long do EV batteries last
EV battery recycling process
advancements in EV battery performance
best EV batteries for extended range
cost of EV batteries 2023
EV battery maintenance tips
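
Notice that the model returned eight queries even though the prompt asked for two or three, so the output cannot be trusted to follow the requested format. A slightly more defensive parser, sketched under the assumption that the model may separate queries with commas or newlines: it drops blanks and duplicates, caps the count, and puts the original query first so the user's own wording still takes part in retrieval (the notebook itself does not do this). `parse_queries` and `max_queries` are hypothetical names.

```python
import re

def parse_queries(raw_output, original_query, max_queries=4):
    """Split LLM output on commas/newlines, dedupe case-insensitively, cap the count."""
    candidates = [original_query] + re.split(r"[,\n]+", raw_output)
    seen = set()
    queries = []
    for candidate in candidates:
        # Trim whitespace and any surrounding double quotes
        cleaned = candidate.strip().strip('"').strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            queries.append(cleaned)
    return queries[:max_queries]

raw = "EV battery technology, how long do EV batteries last\nEV battery recycling process, ev battery technology,"
parsed = parse_queries(raw, original_query="EV batteries")
print(parsed)
```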


Now let's retrieve the top 3 search results for each of these queries. We will then apply reciprocal rank fusion to the ranked results.
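
For reference, the score that reciprocal rank fusion assigns to a document d is usually written as below. Here Q is the set of generated queries, R_q is the ranked list retrieved for query q, rank_q(d) is the 1-based position of d in that list, and K is a smoothing constant (60 in the code that follows). A query that did not retrieve d contributes nothing to its score.

```latex
\mathrm{RRF}(d) \;=\; \sum_{\substack{q \in Q \\ d \in R_q}} \frac{1}{K + \operatorname{rank}_q(d)}
```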

In [196]:
from collections import defaultdict

def reciprocal_rank_fusion(list_of_list_ranks_system, K=60):
    """
    Fuse rankings from multiple IR systems using Reciprocal Rank Fusion.

    Args:
        list_of_list_ranks_system: Ranked result lists, one per IR system (here, one per query).
        K (int): A constant used in the RRF formula (default is 60).

    Returns:
        List of (item, rrf_score) tuples sorted by score in descending order.
    """
    # Map each item to its accumulated RRF score
    rrf_map = defaultdict(float)

    # Each appearance adds 1 / (rank + K), with ranks starting at 1
    for rank_list in list_of_list_ranks_system:
        for rank, item in enumerate(rank_list, 1):
            rrf_map[item] += 1 / (rank + K)

    # Sort items based on their RRF scores in descending order
    sorted_items = sorted(rrf_map.items(), key=lambda x: x[1], reverse=True)

    return sorted_items
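
A small worked example may help to show what this scoring rewards. The sketch below repeats the same rule in a standalone helper (`rrf`, a hypothetical name chosen so it does not clash with the function above) and fuses three made-up rankings. Document `b` never ranks first, yet it comes out on top because it is second in every list.

```python
from collections import defaultdict

def rrf(ranked_lists, k=60):
    # Same rule as above: each appearance adds 1 / (k + rank), ranks start at 1
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, item in enumerate(ranked, start=1):
            scores[item] += 1 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Three made-up rankings, for illustration only
rankings = [
    ["a", "b", "c"],
    ["d", "b", "a"],
    ["c", "b", "a"],
]
fused = rrf(rankings)
for item, score in fused:
    print(item, round(score, 5))
```

With K = 60 the gap between adjacent ranks is small, so showing up in many lists matters more than a single first place.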

Next, we rerank the search results using reciprocal rank fusion.

In [197]:
# Prepare a list of ranked doc ids, one list per rephrased query
search_results = []

for rephrased_query in rephrased_queries:
    chunks = get_chunks(k=3, query=rephrased_query)

    search_results.append([doc[0].id for doc in chunks])

rrf_results = reciprocal_rank_fusion(search_results)
rrf_results


[('d5f4eb62-2323-4ff2-8735-5e0a3fb506c9', 0.130098293503899),
 ('cbdce63e-7cb0-48bf-9c55-7e1e331487bf', 0.12904904602419145),
 ('cf9c93b3-ae95-44d4-bd35-18b5db4a72d8', 0.06374807987711213),
 ('65a72b17-60e0-4f5d-b101-965f031213a2', 0.04787506400409626),
 ('09acd1ff-071b-4ac7-a605-fdd48720f1ae', 0.01639344262295082)]
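
The fused scores can be read backwards. Eight queries each returned three documents, so with K = 60 a document's score has to be a/61 + b/62 + c/63, where a, b and c count how many times it ranked first, second and third. The brute-force search below (a sketch; `rank_counts` is a hypothetical helper) recovers those counts for the two highest scores printed above.

```python
def rank_counts(score, n_queries=8, k=60, tol=1e-9):
    """Return every (firsts, seconds, thirds) split whose RRF score matches `score`."""
    matches = []
    for firsts in range(n_queries + 1):
        for seconds in range(n_queries + 1 - firsts):
            for thirds in range(n_queries + 1 - firsts - seconds):
                total = firsts / (k + 1) + seconds / (k + 2) + thirds / (k + 3)
                if abs(total - score) < tol:
                    matches.append((firsts, seconds, thirds))
    return matches

top = rank_counts(0.130098293503899)
second = rank_counts(0.12904904602419145)
print(top, second)
```

The only split that fits the top score is (5, 2, 1), and the only one that fits the second is (2, 4, 2): both documents were retrieved by all eight queries, and the first one simply ranked first more often. The last entry, 0.01639..., equals 1/61, so that document was ranked first by exactly one query and was not retrieved by any other.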

Now we take the top 3 chunks from the reranked results and use them to generate the answer.

In [200]:
chunk_ids = [i[0] for i in rrf_results[:3]]
rag_fusion_documents = vectorstore.get_by_ids(chunk_ids)


rag_fusion_context = "\n\n".join([doc.page_content for doc in rag_fusion_documents])
query = "EV batteries"
# Pass the user input
rag_fusion_response = retriever_chain.invoke({
        "context": rag_fusion_context,
        "question": query
    })
print(rag_fusion_response)

EV batteries primarily use lithium-ion chemistry, which is known for its high energy density and rechargeability. They also contain other components such as cobalt, nickel, manganese, and graphite, all of which affect the battery's performance, cost, and safety. Additionally, battery recycling and reuse are essential for EV sustainability, with companies like Redwood Materials focusing on recovering valuable materials like lithium, cobalt, and nickel from end-of-life batteries to minimize environmental impacts.


We can see that the response from the RAG Fusion pipeline is more complete: alongside battery chemistry, it also covers battery recycling and reuse, which the single-query answer did not mention.