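# How RAG Fusion works
RAG Fusion uses an LLM to generate several queries similar to the original question, retrieves documents for each query, and then merges the ranked result lists by computing a reciprocal rank fusion (RRF) score for each document.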
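
For reference, the RRF score of a document $d$ is usually written as below. Here $R$ is the set of ranked lists, $\mathrm{rank}_r(d)$ is the 1-based position of $d$ in list $r$, and $k$ is a smoothing constant (60 is a common default). A list that does not contain $d$ contributes nothing to the sum.

```latex
\mathrm{RRF}(d) = \sum_{r \in R} \frac{1}{k + \mathrm{rank}_r(d)}
```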

Example:
Suppose you have 2 ranked lists:

List 1: [A, B, C, D]

List 2: [B, D, E, F]

Using 𝑘=60, each list that contains a document contributes 1/(𝑘 + rank) to its score, with ranks starting at 1:


| Document | Rank in List 1 | Rank in List 2 | RRF Score                                      |
| -------- | -------------- | -------------- | ---------------------------------------------- |
| A        | 1              | -              | 1 / (60+1) = 0.0164                            |
| B        | 2              | 1              | 1/(60+2) + 1/(60+1) = 0.0161 + 0.0164 = 0.0325 |
| C        | 3              | -              | 1/(60+3) = 0.0159                              |
| D        | 4              | 2              | 1/(60+4) + 1/(60+2) = 0.0156 + 0.0161 = 0.0317 |
| E        | -              | 3              | 1/(60+3) = 0.0159                              |
| F        | -              | 4              | 1/(60+4) = 0.0156                              |

So the final ranking would be (C and E are tied):

B (0.0325)

D (0.0317)

A (0.0164)

C (0.0159)

E (0.0159)

F (0.0156)
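
The worked example above can be reproduced with a few lines of plain Python. This is a minimal sketch, not the LangChain-based implementation used later in the notebook: `rrf_scores` is a hypothetical helper name, documents are plain strings, and ranks are 1-based to match the table.

```python
from collections import defaultdict


def rrf_scores(ranked_lists, k=60):
    """Return {doc: score}, using 1-based ranks as in the table above."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] += 1.0 / (k + rank)
    return dict(scores)


list_1 = ["A", "B", "C", "D"]
list_2 = ["B", "D", "E", "F"]

scores = rrf_scores([list_1, list_2], k=60)

# Sort by descending score; ties (C and E here) are broken alphabetically.
final_ranking = sorted(scores, key=lambda doc: (-scores[doc], doc))
for doc in final_ranking:
    print(f"{doc}: {scores[doc]:.6f}")
```

Because the table adds terms that were already rounded to four decimals, its last digit can differ slightly from the unrounded sums printed here. The ordering is the same.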


In [3]:
import textwrap
def wrap_text(text, width=90): #preserve_newlines
    # Split the input text into lines based on newline characters
    lines = text.split('\n')

    # Wrap each line individually
    wrapped_lines = [textwrap.fill(line, width=width) for line in lines]

    # Join the wrapped lines back together using newline characters
    wrapped_text = '\n'.join(wrapped_lines)

    return wrapped_text

In [None]:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv()

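# Initialize the LLM. load_dotenv() has already copied OPENAI_API_KEY from .env
# into the environment, so fail early with a clear message if it is missing.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
llm = ChatOpenAI(model="gpt-4o")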


In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.chroma import Chroma
from langchain.document_loaders import PyPDFLoader

# Load the PDF document and split it into chunks
loader = PyPDFLoader("Harry-Potter-Chapter-1.pdf")
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap  = 100,
    length_function = len,
    is_separator_regex = False,
)
texts = text_splitter.split_documents(docs)
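
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sketch using fixed-width windows. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter breaks on separators such as paragraphs, lines and spaces before merging pieces). `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share text.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=100):
    """Fixed-width chunks where each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# Synthetic 1200-character sample text (digits repeated), for illustration only
sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.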

In [5]:
from langchain.embeddings import HuggingFaceBgeEmbeddings

model_name = "BAAI/bge-small-en-v1.5"
encode_kwargs = {'normalize_embeddings': True} # set True to compute cosine similarity

# create the embedding 
embedding_function = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    encode_kwargs=encode_kwargs
)
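
The `normalize_embeddings` comment above relies on the fact that, for unit-length vectors, the dot product equals the cosine similarity. A small sketch with made-up 2-D vectors (real embeddings have far more dimensions) shows the equivalence:

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors chosen for easy arithmetic; not real embeddings
u = [3.0, 4.0]
v = [4.0, 3.0]

cos_uv = cosine_similarity(u, v)
dot_normalized = dot(l2_normalize(u), l2_normalize(v))
print(cos_uv, dot_normalized)
```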



In [6]:
db = Chroma.from_documents(texts,embedding_function,persist_directory="./chroma_db")

query = "How did Mr. Dursley react when he overheard people talking about the Potters and their son?"
db.similarity_search(query, k=5)

# Create a retriever and invoke it with the query.
# Optionally use MMR instead: db.as_retriever(search_type="mmr", search_kwargs={"fetch_k": 20})
retriever = db.as_retriever()
retriever.invoke(query)

[Document(metadata={'author': 'Charlotte Roberts', 'creator': 'Microsoft® Word for Office 365', 'creationdate': '2021-01-08T11:54:40+00:00', 'source': 'Harry-Potter-Chapter-1.pdf', 'moddate': '2021-01-08T11:54:40+00:00', 'page': 3, 'page_label': '4', 'producer': 'Microsoft® Word for Office 365', 'total_pages': 11}, page_content='... her lot.’ \nMrs Dursley sipped her tea through pursed lips. Mr Dursley  wondered \nwhether he dared tell her he’d heard the name ‘Potter’.  He decided he didn’t \ndare. Instead he said, as casually as he could,  ‘Their son – he’d be about \nDudley’s age now, wouldn’t he?’ \n‘I suppose so,’ said Mrs Dursley stiffly. \n‘What’s his name again? Howard, isn’t it?’ \n‘Harry. Nasty, common name, if you ask me.’ \n‘Oh, yes,’ said Mr Dursley, his heart sinking horribly. ‘Yes, I quite agree.’'),
 Document(metadata={'creator': 'Microsoft® Word for Office 365', 'author': 'Charlotte Roberts', 'moddate': '2021-01-08T11:54:40+00:00', 'page': 3, 'creationdate': '2021-01-08

In [7]:
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnableLambda, RunnablePassthrough

# Baseline RAG chain (single query, no fusion)
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

text_reply = chain.invoke("What was unusual about the people Mr. Dursley saw near the baker’s")

print(wrap_text(text_reply))

Mr. Dursley noticed that the people near the baker's were dressed in strangely, wearing
cloaks. He found their attire unusual and assumed it might be some new, stupid fashion
trend.


# RAG Fusion

In [8]:
from langchain.schema.output_parser import StrOutputParser
from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain.prompts import ChatMessagePromptTemplate, PromptTemplate

In [9]:
# The template uses {question}, so the declared input variable must match it
prompt = ChatPromptTemplate(input_variables=['question'],
                            messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[],template='You are a helpful assistant that generates multiple search queries based on a single input query.')),
                            HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], template='Generate multiple search queries related to: {question} \n OUTPUT (4 queries):'))])

In [10]:
original_query = "Mr. Dursley see the owls flying outside his office"

# Generate 4 queries based on the original query
generate_queries = (
    prompt | llm | StrOutputParser() | (lambda x: x.split("\n"))
)

generate_queries

ChatPromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='You are a helpful assistant that generates multiple search queries based on a single input query.'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='Generate multiple search queries related to: {question} \n OUTPUT (4 queries):'), additional_kwargs={})])
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x00000278980E6F30>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x00000278985CB6E0>, root_client=<openai.OpenAI object at 0x0000027897FE8B60>, root_async_client=<openai.AsyncOpenAI object at 0x000002789819D5B0>, model_name='gpt-4o', model_kwargs={}, openai_api_key=SecretStr('**********'))
| Str
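
Splitting the model output on `"\n"` can leave blank strings or list markers such as `1.` in the generated queries, and each of those would be sent to the retriever as-is. The sketch below shows one way to clean the lines first. `parse_queries` is a hypothetical helper, and `sample_output` is made-up text, not actual model output.

```python
import re


def parse_queries(raw_output):
    """Split LLM output into clean query strings (hypothetical helper)."""
    queries = []
    for line in raw_output.split("\n"):
        # Drop leading list markers such as "1.", "2)", "-" or "*"
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            queries.append(cleaned)
    return queries


# Made-up example of what a model might return
sample_output = "1. owls flying in daylight\n\n2. Mr Dursley office window\n- strange owl behaviour\n"
print(parse_queries(sample_output))
```

If this behaviour is wanted, `parse_queries` could take the place of the `lambda x: x.split("\n")` step in `generate_queries`.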

In [None]:
from langchain.load import dumps, loads

# Reciprocal Rank Fusion (RRF) implementation
def reciprocal_rank_fusion(results: list[list], k=60):
    fused_scores = {}
    for docs in results:
        # Assumes the docs are returned in sorted order of relevance.
        # Note: rank is 0-based here, so the top document contributes 1/k,
        # whereas the worked example above uses 1-based ranks (1/(k+1)).
        for rank, doc in enumerate(docs):
            # Serialize the Document so it can be used as a dictionary key
            doc_str = dumps(doc)
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort by fused score, highest first, and deserialize back to Documents
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]
    return reranked_results

In [14]:
import langchain

# Create the RAG Fusion chain for generated queries
ragfusion_chain = generate_queries | retriever.map() | reciprocal_rank_fusion
langchain.debug = True
ragfusion_chain.invoke({"question": original_query})

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "question": "Mr. Dursley see the owls flying outside his office"
}
[chain/start] [chain:RunnableSequence > prompt:ChatPromptTemplate] Entering Prompt run with input:
{
  "question": "Mr. Dursley see the owls flying outside his office"
}
[chain/end] [chain:RunnableSequence > prompt:ChatPromptTemplate] [1ms] Exiting Prompt run with output:
[outputs]
[llm/start] [chain:RunnableSequence > llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "System: You are a helpful assistant that generates multiple search queries based on a single input query.\nHuman: Generate multiple search queries related to: Mr. Dursley see the owls flying outside his office \n OUTPUT (4 queries):"
  ]
}
[llm/end] [chain:RunnableSequence > llm:ChatOpenAI] [1.61s] Exiting LLM run with output:
{
  "generations"

[(Document(metadata={'page_label': '2', 'producer': 'Microsoft® Word for Office 365', 'source': 'Harry-Potter-Chapter-1.pdf', 'author': 'Charlotte Roberts', 'total_pages': 11, 'moddate': '2021-01-08T11:54:40+00:00', 'creationdate': '2021-01-08T11:54:40+00:00', 'creator': 'Microsoft® Word for Office 365', 'page': 1}, page_content='down in the street did; they pointed and gazed open-mouthed as owl after owl \nsped overhead. Most of them had never seen an owl even at nighttime.  Mr \nDursley, however, had a p erfectly normal, owl-free morning. He yelled at five \ndifferent people. He made several important telephone calls and shouted a bit \nmore. He was in a very good  mood until lunch -time, when he thought he’d \nstretch his legs and walk across the road to buy himself a bun from the baker’s \nopposite.'),
  0.016666666666666666),
 (Document(metadata={'page_label': '2', 'producer': 'Microsoft® Word for Office 365', 'author': 'Charlotte Roberts', 'creator': 'Microsoft® Word for Office 3
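
The fusion step returns a list of `(Document, score)` tuples. When that list is placed directly into a prompt's `{context}` slot, the metadata and scores are included along with the text. The sketch below shows one possible way to keep only the text of the top results. `format_fused_context` and `top_n` are hypothetical names, and `SimpleNamespace` objects stand in for LangChain `Document`s.

```python
from types import SimpleNamespace


def format_fused_context(fused_results, top_n=3):
    """Join the page_content of the top_n (doc, score) pairs (hypothetical helper)."""
    top_docs = [doc for doc, _score in fused_results[:top_n]]
    return "\n\n".join(doc.page_content for doc in top_docs)


# Stand-ins for LangChain Document objects; only .page_content is used here.
fused = [
    (SimpleNamespace(page_content="chunk about owls"), 0.0325),
    (SimpleNamespace(page_content="chunk about the baker's"), 0.0317),
    (SimpleNamespace(page_content="chunk about cloaks"), 0.0164),
]
context = format_fused_context(fused, top_n=2)
print(context)
```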

In [16]:
from operator import itemgetter


template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

full_rag_fusion_chain = (
    {
        # ragfusion_chain receives the whole input dict and returns (Document, score) pairs
        "context": ragfusion_chain,
        # Pass only the question string to the prompt, not the whole input dict
        "question": itemgetter("question"),
    }
    | prompt
    | llm
    | StrOutputParser()
)

# full rag fusion chain
full_rag_fusion_chain.invoke({"question": "Why did Mr. Dursley decide not to call his wife after almost dialing her"})

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "question": "Why did Mr. Dursley decide not to call his wife after almost dialing her"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<context,question>] Entering Chain run with input:
{
  "question": "Why did Mr. Dursley decide not to call his wife after almost dialing her"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnableSequence] Entering Chain run with input:
{
  "question": "Why did Mr. Dursley decide not to call his wife after almost dialing her"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnableSequence > prompt:ChatPromptTemplate] Entering Prompt run with input:
{
  "question": "Why did Mr. Dursley decide not to call his wife after almost dialing her"
}
[chain/end] [chain:

"Mr. Dursley decided not to call his wife after almost dialing her because he thought that he was being stupid, considering that Potter wasn't such an unusual name. He realized that there were likely many people named Potter who might have a son called Harry. Additionally, he wasn't even sure if his nephew's name was Harry; it could have been Harvey or Harold. Since there was no point in worrying Mrs. Dursley, as she always got upset at any mention of her sister, he chose not to call her."