### Reciprocal Rank Fusion (RRF)

This strategy is similar to multi-query retrieval, except that we apply a ranking algorithm to the retrieved documents. The algorithm, Reciprocal Rank Fusion (RRF), combines the ranks from several result lists into a single, unified ranking. Documents that rank highly across many of the generated queries rise to the top of the final list.
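For reference, RRF is usually written as below. The symbols are notation introduced here for illustration: $R$ is the set of ranked lists (one per generated query), $\operatorname{rank}_r(d)$ is the position of document $d$ in list $r$, and $k$ is a smoothing constant (60 is a common default, and it is the default used in this section's code).

$$
\mathrm{RRF}(d) = \sum_{r \in R} \frac{1}{k + \operatorname{rank}_r(d)}
$$

A document that does not appear in a list contributes nothing for that list. The usual statement counts ranks from 1, whereas the implementation in this section takes 0-based ranks from `enumerate`. That shifts every score slightly but keeps the same idea: a larger $k$ flattens the difference between adjacent ranks.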

In [43]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_postgres.vectorstores import PGVector
from langchain_core.runnables import chain
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
import re
from typing import List

In [2]:
prompt_rag_fusion = ChatPromptTemplate.from_template("""You are a helpful assistant that generates multiple search queries based on a single input query. \n
 Generate multiple search queries related to: {question} \n
 Output (4 queries):""")

In [4]:
message = prompt_rag_fusion.invoke("Who were Harry's closest friends")
message

ChatPromptValue(messages=[HumanMessage(content="You are a helpful assistant that generates multiple search queries based on a single input query. \n\n Generate multiple search queries related to: Who were Harry's closest friends \n\n Output (4 queries):", additional_kwargs={}, response_metadata={})])

In [6]:
llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=0)

In [8]:
temp_chain = prompt_rag_fusion | llm

message = temp_chain.invoke("Who were Harry's closest friends?")
message

AIMessage(content="1. Who were Harry Potter's closest friends in the Harry Potter series?\n2. What were the names of Harry Potter's best friends?\n3. Can you list the main friends of Harry Potter throughout the series?\n4. Who were the key friends that supported Harry Potter in the books and movies?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 60, 'prompt_tokens': 45, 'total_tokens': 105, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-Clve8t3pbbFmFAAJZ85DXTY22Bi2P', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--d03a13f6-cf06-47fd-85ba-1c10f12b51fb-0', usage_metadata={'input_tokens': 45, 'output_tokens': 60, 'total_tokens': 105, 'input_tok

In [10]:
message.content.split('\n')

["1. Who were Harry Potter's closest friends in the Harry Potter series?",
 "2. What were the names of Harry Potter's best friends?",
 '3. Can you list the main friends of Harry Potter throughout the series?',
 '4. Who were the key friends that supported Harry Potter in the books and movies?']

In [11]:
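def format_query_output(message):
    """Split the LLM reply into one query per line and strip the "1. " style numbering."""
    pattern = r"^\d+\.\s*"
    final = []
    for line in message.content.split('\n'):
        cleaned = re.sub(pattern, "", line.strip()).strip()
        # skip blank lines so empty queries never reach the retriever
        if cleaned:
            final.append(cleaned)

    return final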

In [14]:
query_gen = prompt_rag_fusion | llm | format_query_output
query_gen.invoke("Who were Harry's closest friends?")

["What were the names of Harry's closest friends in the Harry Potter series?",
 'How did Harry meet his closest friends in the Harry Potter books?',
 "Were Harry's closest friends loyal to him throughout the Harry Potter series?",
 "Did Harry's closest friends play a significant role in helping him defeat Voldemort in the Harry Potter books?"]

In [30]:
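def reciprocal_rank_fusion(results: List[List[Document]], k: int = 60) -> List[Document]:
    """Fuse several ranked lists of documents into one list using Reciprocal Rank Fusion.

    `results` holds one ranked list per query; `k` is the smoothing constant in the
    RRF formula (ranks here are 0-based, as produced by `enumerate`).
    """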
    scores = {} # dictionary to hold fused scores for each document
    documents = {} # dictionary to hold unique documents

    # iterate through the list of ranked documents
    for docs in results:
        # iterate through each document
        for rank, doc in enumerate(docs):
            # get the page_content for uniqueness
            doc_str = doc.page_content
            # if the document is new, initialize score to 0 and save in dict
            if doc_str not in documents:
                scores[doc_str] = 0
                documents[doc_str] = doc

            # update the score using RRF formula 1 / (rank + k)
            scores[doc_str] += 1 / (rank + k)

    # sort the documents based on their scores in descending order
    ranked_docs = sorted(scores, key = lambda x : scores[x], reverse = True)

    # return the documents based on their doc_str
    return [documents[doc_str] for doc_str in ranked_docs]
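
To make the scoring concrete, here is a minimal sketch that runs the same arithmetic on plain string IDs instead of `Document` objects. The helper `rrf_scores` and the three toy rankings are made up for illustration and are not part of the notebook's pipeline.

```python
from collections import defaultdict

def rrf_scores(rankings, k=60):
    """Return {doc_id: fused score} using 0-based ranks, as in the function above."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1 / (rank + k)
    return dict(scores)

# Three hypothetical result lists, one per generated query (best match first).
rankings = [
    ["A", "B", "C"],
    ["B", "A", "D"],
    ["B", "C", "A"],
]

scores = rrf_scores(rankings)
fused = sorted(scores, key=scores.get, reverse=True)

for doc_id in fused:
    print(doc_id, round(scores[doc_id], 4))
```

`B` ends up ahead of `A` because it is ranked first in two of the three lists, while `D`, which appears only once and in last place, lands at the bottom.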

In [31]:
connection = 'postgresql+psycopg://langchain:langchain@localhost:6024/langchain'
collection_name = "Harry_Potter_Complete"
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

db = PGVector(
    embeddings=embedding_model,
    collection_name=collection_name,
    connection=connection
)

retriever = db.as_retriever()

In [32]:
retrieval_chain = query_gen | retriever.batch | reciprocal_rank_fusion

In [33]:
retrieval_chain.invoke("Who were Harry's closest friends?")

[Document(id='a63dee9d-ee7a-4e16-89ca-eb9a6593eafc', metadata={'source': 'data/HP1.txt'}, page_content='“Am I?” said Harry, feeling dazed.\n\n“Goodness, didn’t you know, I’d have found out everything I could if it was me,” said Hermione. “Do either of you know what house you’ll be in? I’ve been asking around, and I hope I’m in Gryffindor, it sounds by far the best; I hear Dumbledore himself was in it, but I suppose Ravenclaw wouldn’t be too bad....Anyway, we’d better go and look for Neville’s toad. You two had better change, you know, I expect we’ll be there soon.”\n\nAnd she left, taking the toadless boy with her.\n\n“Whatever house I’m in, I hope she’s not in it,” said Ron. He threw his wand back into his trunk. “Stupid spell — George gave it to me, bet he knew it was a dud.”\n\n“What house are your brothers in?” asked Harry.\n\n“Gryffindor,” said Ron. Gloom seemed to be settling on him again. “Mom and Dad wer

In [34]:
prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context : {context}
question : {question}""")

@chain
def qa_rrf(question):
    # get the ranked docs
    docs = retrieval_chain.invoke(question)
    # extract the context
    context = '\n\n'.join(d.page_content for d in docs)
    # prepare prompt
    formatted = prompt.invoke({"context": context, "question": question})
    # feed to llm
    return llm.invoke(formatted).content

In [35]:
qa_rrf.invoke("Who were Harry's closest friends?")

"Harry's closest friends were Ron Weasley and Hermione Granger."

In [38]:
qa_rrf.invoke("Put these students in order according to the length of time it took the Sorting Hat to sort them: Ron Weasley, Seamus Finnigan, and Draco Malfoy")

'Seamus Finnigan, Ron Weasley, Draco Malfoy'

In [39]:
qa_rrf.invoke("What spell did Hermione use on Neville when he tried to keep her, Harry, and Ron from leaving the common room?")

'Hermione used the Petrificus Totalus spell on Neville when he tried to keep her, Harry, and Ron from leaving the common room.'

In [40]:
qa_rrf.invoke("What planet was unusually bright the first time Harry entered the Forbidden Forest?")

'Mars'

In [41]:
qa_rrf.invoke("What does Professor Dumbledore say he sees when he looks into the Mirror of Erised?")

'Professor Dumbledore says he sees himself holding a pair of thick, woolen socks when he looks into the Mirror of Erised.'

In [42]:
qa_rrf.invoke("What types of dragons are wild in Britain?")

'Common Welsh Green and Hebridean Blacks.'

### Hypothetical Document Embedding (HyDE)

HyDE is a strategy that creates a hypothetical document from the user's query, embeds that document, and then retrieves real documents by their vector similarity to it.

In [44]:
prompt_hyde = ChatPromptTemplate.from_template("""Please write a passage to answer the question. \n question : {question} \n passage:""")

In [45]:
generate_doc = prompt_hyde | llm | StrOutputParser()

In [46]:
generate_doc.invoke("Who were Harry Potter's closest friends?")

"Harry Potter's closest friends were Hermione Granger and Ron Weasley. Throughout their time at Hogwarts School of Witchcraft and Wizardry, the trio formed a strong bond and faced many challenges together. Hermione was known for her intelligence and quick thinking, while Ron was loyal and always had Harry's back. Together, they supported each other through thick and thin, proving that friendship can conquer even the darkest of times."

In [52]:
generate_doc.invoke("What are you supposed to feed a newborn dragon?")

'Feeding a newborn dragon can be a bit tricky as their dietary needs can vary depending on the species. Generally, newborn dragons can be fed a diet of small insects such as crickets, mealworms, and fruit flies. It is important to provide a variety of insects to ensure they are getting all the necessary nutrients. Some dragons may also benefit from the occasional small pinky mouse or other small vertebrates. It is important to research the specific dietary needs of your dragon species to ensure they are getting the proper nutrition to grow and thrive.'

Next, we take the hypothetical document and use it as input to the retriever, which will generate its embedding and search for similar documents in the vector store:

In [48]:
# use a separate name so the RRF retrieval_chain used by qa_rrf is not overwritten
hyde_retrieval_chain = generate_doc | retriever

hyde_retrieval_chain.invoke("Who were Harry Potter's closest friends?")

[Document(id='0a119cc0-0915-49fb-9843-b2982a9ad9e9', metadata={'source': 'data/HP1.txt'}, page_content="Hermione left.\n\nProfessor McGonagall turned to Harry and Ron.\n\n“Well, I still say you were lucky, but not many first years could have taken on a full-grown mountain troll. You each win Gryffindor five points. Professor Dumbledore will be informed of this. You may go.”\n\nThey hurried out of the chamber and didn’t speak at all until they had climbed two floors up. It was a relief to be away from the smell of the troll, quite apart from anything else.\n\n“We should have gotten more than ten points,” Ron grumbled.\n\n“Five, you mean, once she’s taken off Hermione’s.”\n\n“Good of her to get us out of trouble like that,” Ron admitted. “Mind you, we did save her.”\n\n“She might not have needed saving if we hadn’t locked the thing in with her,” Harry reminded him.\n\nThey had reached the portrait of the Fat Lady.\n\n“Pig snout,” t

In [50]:
@chain
def qa_hyde(question):
    # retrieve the docs most similar to the hypothetical document
    docs = hyde_retrieval_chain.invoke(question)
    # extract the context
    context = '\n\n'.join(d.page_content for d in docs)
    # prepare prompt
    formatted = prompt.invoke({"context": context, "question": question})
    # feed to llm
    return llm.invoke(formatted).content

In [51]:
qa_hyde.invoke("What are you supposed to feed a newborn dragon?")

'According to Hagrid, a newborn dragon should be fed on a bucket of brandy mixed with chicken blood every half hour.'

#### Why does this work?

You might expect HyDE to fail when the question falls outside the LLM's knowledge, since the passage produced by `generate_doc` will then contain wrong details. It still helps, because the passage only needs to be plausible and semantically close to the real answer: embeddings capture semantic similarity, not factual correctness. The dragon example above shows this. The generated passage talks about crickets and mealworms, yet retrieval still surfaces Hagrid's actual feeding instructions.
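
The effect can be illustrated with a deliberately crude sketch. It assumes a toy bag-of-words "embedding" (`embed`) and a hand-written cosine similarity in place of a real embedding model, and all three texts are made-up stand-ins rather than actual model output or book text. Word overlap is only a rough proxy for semantic closeness, but it shows the same pattern: a passage shaped like an answer sits closer to a stored chunk than the question does, even when its details are wrong.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Illustrative texts only: the hypothetical passage gets the details wrong on purpose.
question = "What are you supposed to feed a newborn dragon?"
hypothetical = "A newborn dragon should be fed a bucket of milk mixed with insects every hour."
stored_chunk = "A newborn dragon should be fed on a bucket of brandy mixed with chicken blood every half hour."

sim_question = cosine(embed(question), embed(stored_chunk))
sim_hypothetical = cosine(embed(hypothetical), embed(stored_chunk))

print("question vs chunk:    ", round(sim_question, 2))
print("hypothetical vs chunk:", round(sim_hypothetical, 2))
```

The hypothetical passage scores noticeably higher than the raw question, despite naming the wrong food. A real embedding model measures meaning rather than shared words, so the numbers would differ, but the direction of the effect is what HyDE relies on.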

However, HyDE can fail when:

- The LLM generates a passage that is too generic to be semantically distinctive.
- The topic is too far from anything that exists in your vector store.
- The LLM hallucinates content that pulls the embedding into the wrong region of semantic space.
- Your chunks are too large, so semantic similarity gets washed out.
- Your chunking is poor, for example when text is split at arbitrary boundaries so that no chunk carries a coherent idea.