Import the libraries

In [1]:
from langchain.vectorstores import Qdrant
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.schema import Document

In [2]:
# Example documents simulating a bank knowledge base
docs = [
    Document(page_content="If a transfer fails due to insufficient funds, ...", metadata={"source": "FAQ"}),
    Document(page_content="Transfers may be declined if daily limits are exceeded or account details are invalid, ...", metadata={"source": "Guide"}),
    Document(page_content="Error codes for failed bank transfers: 101 Insufficient Funds, 102 Account Not Found, ...", metadata={"source": "TechSupport"})
    # Additional documents can be added here
]


In [9]:
# Initialize OpenAI embeddings
embedding_model = OpenAIEmbeddings()

In [6]:
# Connect to Qdrant (ensure it is running at localhost:6333)
vectordb = Qdrant.from_documents(
    docs,
    embedding_model,
    url="http://localhost:6333",
    collection_name="bank_knowledge"
)

We load documents into memory, embed them using OpenAI's embedding model, and store them in a Qdrant collection for future retrieval.

We now use a language model to generate multiple versions of the user's query to increase recall.

In [7]:
from langchain.chat_models import ChatOpenAI
from langchain.retrievers import MultiQueryRetriever

# Use temperature=0 so the generated query variants are as repeatable as possible
llm = ChatOpenAI(temperature=0)

# Wrap the base retriever with MultiQueryRetriever
multi_retriever = MultiQueryRetriever.from_llm(
    retriever=vectordb.as_retriever(),
    llm=llm
)



We instantiate a ChatOpenAI model to generate multiple rephrasings of the input question. These variations are passed to the retriever to pull a wider range of relevant documents.

Now, let’s use the retriever to fetch documents related to the user’s query.

In [8]:
user_question = "Why did my bank transfer fail?"
docs_found = multi_retriever.get_relevant_documents(user_question)

print(f"Retrieved {len(docs_found)} documents with Multi-Query RAG.")
for i, doc in enumerate(docs_found, 1):
    snippet = doc.page_content[:60].strip()
    print(f"Doc {i}: {snippet}...")



Retrieved 3 documents with Multi-Query RAG.
Doc 1: Error codes for failed bank transfers: 101 Insufficient Fund...
Doc 2: If a transfer fails due to insufficient funds, ......
Doc 3: Transfers may be declined if daily limits are exceeded or ac...


What’s happening:
The get_relevant_documents method internally prompts the LLM to generate multiple versions of the input query. Each query is used to perform a similarity search on the Qdrant vector store. The results are combined and deduplicated before being returned.
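The combine-and-deduplicate step described above can be sketched without any external services. This is a simplified illustration, not LangChain's actual implementation: `multi_query_retrieve`, `fake_search`, and the toy index are hypothetical names, and the query variants are hard-coded instead of being generated by an LLM.

```python
from typing import Callable, Dict, List

def multi_query_retrieve(
    queries: List[str],
    search: Callable[[str], List[Dict[str, str]]],
) -> List[Dict[str, str]]:
    """Run every query variant through `search` and return the unique union of hits."""
    seen_ids = set()
    merged = []
    for query in queries:
        for hit in search(query):
            # Keep only the first occurrence of each document
            if hit["id"] not in seen_ids:
                seen_ids.add(hit["id"])
                merged.append(hit)
    return merged

# Hypothetical query variants (in the notebook these come from the LLM)
variants = [
    "why did my transfer fail",
    "reasons a transfer is declined",
    "bank transfer error codes",
]

# Hypothetical stand-in for the vector store: each variant maps to its top hits
faq = {"id": "faq", "text": "If a transfer fails due to insufficient funds, ..."}
guide = {"id": "guide", "text": "Transfers may be declined if daily limits are exceeded, ..."}
tech = {"id": "tech", "text": "Error codes for failed bank transfers: ..."}
FAKE_INDEX = {
    variants[0]: [faq, guide],
    variants[1]: [guide, tech],
    variants[2]: [tech, faq],
}

def fake_search(query: str) -> List[Dict[str, str]]:
    return FAKE_INDEX.get(query, [])

merged = multi_query_retrieve(variants, fake_search)
print([hit["id"] for hit in merged])  # documents in order of first appearance
```

Six hits across the three variants collapse to three unique documents, which mirrors why the cell above reports 3 documents even though several searches were run.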

# Reciprocal Rank Fusion

We assume the vectordb (Qdrant) and the embedding model from the earlier section are already available.

In [10]:

# Two different query formulations (e.g., produced by an LLM or manually crafted):
queries = [
    "Why would a bank transfer be declined? Insufficient funds scenario.", 
    "Reasons a bank transfer might fail due to account issues or limits."
]


These variations represent different phrasings of the same core user intent. They may yield different sets of relevant documents, which we aim to combine.

We run a similarity search using Qdrant for each query and collect the top-k results along with their scores.

In [11]:
results_lists = []  # List to hold results from each query

for q in queries:
    res = vectordb.similarity_search_with_score(q, k=5)  # list of (Document, similarity_score) tuples
    results_lists.append(res)


We store the results for each query in results_lists. Each list contains tuples of documents and their similarity scores. For RRF, we use only the rank position, not the score.

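We calculate a fused score for each document using a simplified form of Reciprocal Rank Fusion (RRF). Here $\text{rank}_i$ is the document's position in the result list of query $i$, and the sum runs over the $n$ lists in which the document appears:

$$
\text{Score}_{\text{doc}} = \sum_{i=1}^{n} \frac{1}{\text{rank}_i}
$$

The commonly cited form of RRF adds a smoothing constant $k$ (often 60) to the denominator, giving $\frac{1}{k + \text{rank}_i}$. This notebook uses $k = 0$.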
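A minimal sketch of this scoring as a reusable function, operating on plain lists of document IDs. The function name `reciprocal_rank_fusion`, the toy rankings, and the optional offset `k` are illustrative assumptions. With `k=0` the function matches the formula used in this notebook, while a positive `k` (60 is a frequent choice) dampens the advantage of first-place hits.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple

def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]],
    k: int = 0,
) -> List[Tuple[Hashable, float]]:
    """Fuse several ranked lists of IDs; each appearance adds 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy rankings for two hypothetical queries over documents "a", "b", "c"
toy_rankings = [["a", "b", "c"], ["b", "c", "a"]]

fused = reciprocal_rank_fusion(toy_rankings)                # k = 0, as in this notebook
fused_smoothed = reciprocal_rank_fusion(toy_rankings, k=60) # smoothed variant

for doc_id, score in fused:
    print(f"{doc_id}: {score:.2f}")
```

In this toy case both settings produce the same order. The offset matters more when lists are long and disagree strongly at the top.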




In [12]:
print(results_lists)

[[(Document(metadata={'source': 'FAQ', '_id': 'c55c6c24-2618-4dc9-96a7-4da7cb26215d', '_collection_name': 'bank_knowledge'}, page_content='If a transfer fails due to insufficient funds, ...'), 0.88985664), (Document(metadata={'source': 'Guide', '_id': '5562df7d-a42c-4392-91b6-f88005a42c89', '_collection_name': 'bank_knowledge'}, page_content='Transfers may be declined if daily limits are exceeded or account details are invalid, ...'), 0.8665598), (Document(metadata={'source': 'TechSupport', '_id': '8762a800-1c85-4a62-b6ab-cc22f0e36898', '_collection_name': 'bank_knowledge'}, page_content='Error codes for failed bank transfers: 101 Insufficient Funds, 102 Account Not Found, ...'), 0.8634267)], [(Document(metadata={'source': 'Guide', '_id': '5562df7d-a42c-4392-91b6-f88005a42c89', '_collection_name': 'bank_knowledge'}, page_content='Transfers may be declined if daily limits are exceeded or account details are invalid, ...'), 0.89812684), (Document(metadata={'source': 'TechSupport', '_id':

In [16]:
from collections import defaultdict

fused_scores = defaultdict(float)
doc_lookup = {}  # To map IDs back to Document objects

for res_list in results_lists:
    for rank, (doc, _) in enumerate(res_list, start=1):
        doc_id = doc.metadata.get("_id", doc.page_content)  # Qdrant stores the point ID under "_id"; fall back to text
        fused_scores[doc_id] += 1.0 / rank
        doc_lookup[doc_id] = doc  # Save reference to the document


For each list of results, we iterate through the documents and add 1/rank to each document's score. If a document appears in multiple lists, its score accumulates, boosting its rank in the final fused list.

Now we sort the documents in descending order of their accumulated RRF score.

In [17]:
ranked_docs = sorted(fused_scores.items(), key=lambda item: item[1], reverse=True)


Documents with higher total RRF scores (i.e., that appear higher and more frequently in multiple result sets) are ranked higher.

We print out the final top documents after fusion.

In [19]:
# Assume you already built this earlier:
# doc_lookup = {doc_id: Document}

print("Final RRF-ranked documents:")
for doc_id, total_score in ranked_docs[:5]:
    doc = doc_lookup[doc_id]  # Get the actual Document object
    title = doc.page_content.split('.')[0]  # use first sentence
    print(f"- {title}... (score={total_score:.2f})")


Final RRF-ranked documents:
- Transfers may be declined if daily limits are exceeded or account details are invalid, ... (score=1.50)
- If a transfer fails due to insufficient funds, ... (score=1.33)
- Error codes for failed bank transfers: 101 Insufficient Funds, 102 Account Not Found, ... (score=0.83)
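
As a check, the printed scores can be reproduced by hand from the rank positions. The first result list ranks FAQ, Guide, TechSupport in that order. The second starts with Guide and then TechSupport; the rest of that printed output is cut off, so FAQ in third place is an assumption (it is the only placement consistent with the score of 1.33):

$$
\begin{aligned}
\text{Score}_{\text{Guide}} &= \frac{1}{2} + \frac{1}{1} = 1.50 \\
\text{Score}_{\text{FAQ}} &= \frac{1}{1} + \frac{1}{3} \approx 1.33 \\
\text{Score}_{\text{TechSupport}} &= \frac{1}{3} + \frac{1}{2} \approx 0.83
\end{aligned}
$$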
