### 2) Re-Ranking Technique

**Re-ranking** is a second-stage scoring step in retrieval systems, especially RAG pipelines. It works in two passes:

1. First, use a fast retriever (such as BM25, FAISS, or a hybrid of the two) to fetch the top-k documents quickly.

2. Then, use a more accurate but slower model (such as a cross-encoder or an LLM) to re-score and reorder those documents by relevance to the query.

👉 This pushes the most relevant documents toward the top of the list, which tends to improve the final answer from the LLM.
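The two passes described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the pipeline used in the cells below: `cheap_score` (shared-token count) stands in for a fast retriever such as BM25 or FAISS, and `expensive_score` (Jaccard similarity) stands in for a cross-encoder or LLM. All function names and the tiny corpus are hypothetical.

```python
import re
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def cheap_score(query: str, doc: str) -> float:
    """Stage-1 stand-in: number of tokens shared with the query (fast, coarse)."""
    return float(len(tokenize(query) & tokenize(doc)))


def expensive_score(query: str, doc: str) -> float:
    """Stage-2 stand-in: Jaccard similarity. A real system would call a
    cross-encoder or an LLM here instead."""
    q, d = tokenize(query), tokenize(doc)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def retrieve_then_rerank(
    query: str,
    corpus: List[str],
    k: int = 3,
    top_n: int = 2,
    first_stage: Callable[[str, str], float] = cheap_score,
    second_stage: Callable[[str, str], float] = expensive_score,
) -> List[str]:
    # Stage 1: score the whole corpus cheaply and keep the top-k candidates.
    candidates = sorted(corpus, key=lambda d: first_stage(query, d), reverse=True)[:k]
    # Stage 2: re-score only those k candidates with the slower scorer.
    reranked = sorted(candidates, key=lambda d: second_stage(query, d), reverse=True)
    return reranked[:top_n]


corpus = [
    "Memory keeps context across steps in a conversation.",
    "Agents decide which tools to use and in what order.",
    "Bananas are rich in potassium.",
    "Tools and memory let an application call APIs and work with earlier turns.",
]
top_docs = retrieve_then_rerank("application with memory and tools", corpus)
```

The point of the split is cost: the slow scorer runs on only `k` candidates rather than on the whole corpus.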

### STEPS
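Step 1: Breadth (recall). A fast retriever fetches a wide set of candidate documents.

Step 2: Coverage (lexical + semantic). Combining keyword search with embedding search catches matches that either one alone would miss.

Step 3: Focus (precision). The re-ranker reorders the candidates so that the most relevant ones come first.

Step 4: Reasoning (high-quality answer). The LLM answers the question from the top-ranked documents.

This architecture balances speed, cost, and accuracy.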
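Step 2 does not say how the lexical and semantic result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below. The choice of RRF, the function name `rrf_fuse`, the constant `k=60`, and the document IDs are assumptions made for illustration; they do not come from the notebook.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs with reciprocal rank fusion.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Documents are returned by descending total score.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break ties by document ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical = ["d1", "d2", "d3"]    # e.g. a keyword (BM25-style) ranking
semantic = ["d3", "d1", "d4"]   # e.g. an embedding-based ranking
fused = rrf_fuse([lexical, semantic])
print(fused)
```

Documents that rank well in both lists (`d1`, `d3`) end up ahead of documents that appear in only one.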


In [20]:
import warnings
warnings.filterwarnings('ignore')

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import PromptTemplate
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_community.vectorstores import FAISS

llm = ChatOpenAI(model='gpt-4o')
embedding = OpenAIEmbeddings()
loader = TextLoader("langchain_sample.txt")
data = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=30)
docs = splitter.split_documents(data)
query = "How can I use langchain to build an application with memory and tools?"

# We need a vector store (or another index) for first-stage retrieval. Any fast search method would work here, e.g. BM25, bag-of-words, or TF-IDF.

vector_store = FAISS.from_documents(docs, embedding)
retriever = vector_store.as_retriever(search_kwargs={"k": 10})

# here reranking starts
# Prompt Template
prompt = PromptTemplate.from_template("""
You are a helpful assistant. Your task is to rank the following documents from most to least relevant.

User Question: "{question}"

Documents:
{documents}

Instructions:
- The documents are numbered starting from 1.
- Think about the relevance of each document to the user's question.
- Return only the document numbers in ranked order, starting from the most relevant, with no explanation.

Output format: comma-separated document numbers (e.g., 2,1,3,...)
""")


retrieved_docs = retriever.invoke(query)
retrieved_docs

[Document(id='369c50f5-a3d7-4e08-ad7d-e80b626c6957', metadata={'source': 'langchain_sample.txt'}, page_content='Memory in LangChain enables context retention across multiple steps in a conversation or task, making the application more coherent and stateful.'),
 Document(id='48f54e7a-cd0d-4508-b284-b0d84a0fa561', metadata={'source': 'langchain_sample.txt'}, page_content='LangChain is a flexible framework designed for developing applications powered by large language models (LLMs). It provides tools and abstractions to work with LLMs more effectively and includes components for prompt management, chains, memory, and agents.'),
 Document(id='07d0cd88-6097-4a50-92c5-19ef0cc4a989', metadata={'source': 'langchain_sample.txt'}, page_content='Agents in LangChain are chains that use LLMs to decide which tools to use and in what order. This makes them suitable for multi-step tasks like question answering with search and code execution.'),
 Document(id='912b2152-4a05-49a1-8852-27b16a4c0617', meta
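The comment in the cell above mentions BM25 as an alternative first-stage retriever. As a rough illustration of what such a scorer computes, here is a minimal pure-Python sketch of the Okapi BM25 formula. The function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, the regex tokenizer, and the sample texts are assumptions; the sketch also assumes a non-empty document list. The notebook itself uses FAISS, not this code.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Return one Okapi BM25 score per document (assumes docs is non-empty)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency, damped by k1 and normalized by document length.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


sample_docs = [
    "LangChain memory keeps context across steps.",
    "Agents choose which tools to call.",
    "Bananas are rich in potassium.",
]
scores = bm25_scores("memory and tools", sample_docs)
```

Sorting the documents by these scores and keeping the top `k` would play the same role as `retriever.invoke(query)` above, using keyword overlap instead of embeddings.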

In [21]:
chain = prompt | llm | StrOutputParser()

doc_lines = [f"{i+1}. {doc.page_content}" for i, doc in enumerate(retrieved_docs)]
formatted_docs = "\n".join(doc_lines)
response = chain.invoke({"question": query, "documents": formatted_docs})

In [22]:
response

'The user is specifically asking about building an application with memory and tools using LangChain. Here’s a ranked list of documents based on their relevance to the user\'s question:\n\n2, 1, 4, 3, 5, 7, 6, 8, 9, 10\n\nExplanation:\n- Document 2 directly explains what LangChain is, including its support for "memory" and "tools," which the user is interested in.\n- Document 1 provides specific information about the memory component in LangChain, making it very relevant.\n- Document 4 discusses the support for tool integration within LangChain, directly addressing the user\'s interest in tools.\n- Document 3 touches on agents, which are relevant as they involve tool use and decision-making, potentially incorporating memory.\n- Document 5, although not directly related to memory or tools, mentions integrations that could aid in building applications, indirectly supporting the user\'s task.\n- Document 7 discusses RAG, which is somewhat related to using tools and external knowledge, but

In [23]:
doc_lines = [f"{i+1}. {doc.page_content}" for i, doc in enumerate(retrieved_docs)]
formatted_docs = "\n".join(doc_lines)

In [24]:
doc_lines

['1. Memory in LangChain enables context retention across multiple steps in a conversation or task, making the application more coherent and stateful.',
 '2. LangChain is a flexible framework designed for developing applications powered by large language models (LLMs). It provides tools and abstractions to work with LLMs more effectively and includes components for prompt management, chains, memory, and agents.',
 '3. Agents in LangChain are chains that use LLMs to decide which tools to use and in what order. This makes them suitable for multi-step tasks like question answering with search and code execution.',
 '4. LangChain supports tool integration including web search, calculators, and APIs, allowing LLMs to interact with external systems and respond more accurately to dynamic queries.',
 '5. LangChain integrates with many third-party services such as OpenAI, Hugging Face, and Cohere. This enables developers to experiment with different models and optimize performance for specific 

In [25]:
formatted_docs

'1. Memory in LangChain enables context retention across multiple steps in a conversation or task, making the application more coherent and stateful.\n2. LangChain is a flexible framework designed for developing applications powered by large language models (LLMs). It provides tools and abstractions to work with LLMs more effectively and includes components for prompt management, chains, memory, and agents.\n3. Agents in LangChain are chains that use LLMs to decide which tools to use and in what order. This makes them suitable for multi-step tasks like question answering with search and code execution.\n4. LangChain supports tool integration including web search, calculators, and APIs, allowing LLMs to interact with external systems and respond more accurately to dynamic queries.\n5. LangChain integrates with many third-party services such as OpenAI, Hugging Face, and Cohere. This enables developers to experiment with different models and optimize performance for specific use cases lik

In [26]:
response = chain.invoke({"question": query, "documents": formatted_docs})
response

'2, 1, 4, 3, 7, 5, 6, 9, 8, 10'

In [27]:
# Parse the ranking: convert the 1-based document numbers in the response into 0-based list indices
indices = [int(x.strip()) - 1 for x in response.split(",") if x.strip().isdigit()]
indices

[1, 0, 3, 2, 6, 4, 5, 8, 7, 9]
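The one-line parser above works when the model returns only the list. It loses entries when the model wraps the list in prose, as in the first response shown earlier (`In [22]`). A more defensive parser might look like the sketch below. The function name `parse_ranking` and the rule "take the first line that consists only of comma-separated integers" are assumptions, not part of the original notebook.

```python
import re
from typing import List


def parse_ranking(response: str, n_docs: int) -> List[int]:
    """Extract a 1-based, comma-separated ranking and return 0-based indices.

    Only the first line made up entirely of comma-separated integers is used.
    Out-of-range numbers and repeats are dropped.
    """
    numbers: List[int] = []
    for line in response.splitlines():
        line = line.strip()
        if re.fullmatch(r"\d+(\s*,\s*\d+)*", line):
            numbers = [int(x) for x in line.split(",")]
            break

    seen = set()
    indices = []
    for n in numbers:
        idx = n - 1  # the prompt numbers documents from 1
        if 0 <= idx < n_docs and idx not in seen:
            seen.add(idx)
            indices.append(idx)
    return indices


clean = "2, 1, 4, 3, 7, 5, 6, 9, 8, 10"
verbose = "Here is the ranking:\n\n2, 1, 4, 3\n\nExplanation:\n- Document 2 is the most relevant."
print(parse_ranking(clean, 10))
print(parse_ranking(verbose, 10))
```

If no line matches, the function returns an empty list, so the caller can fall back to the retriever's original order.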

In [28]:
retrieved_docs

[Document(id='369c50f5-a3d7-4e08-ad7d-e80b626c6957', metadata={'source': 'langchain_sample.txt'}, page_content='Memory in LangChain enables context retention across multiple steps in a conversation or task, making the application more coherent and stateful.'),
 Document(id='48f54e7a-cd0d-4508-b284-b0d84a0fa561', metadata={'source': 'langchain_sample.txt'}, page_content='LangChain is a flexible framework designed for developing applications powered by large language models (LLMs). It provides tools and abstractions to work with LLMs more effectively and includes components for prompt management, chains, memory, and agents.'),
 Document(id='07d0cd88-6097-4a50-92c5-19ef0cc4a989', metadata={'source': 'langchain_sample.txt'}, page_content='Agents in LangChain are chains that use LLMs to decide which tools to use and in what order. This makes them suitable for multi-step tasks like question answering with search and code execution.'),
 Document(id='912b2152-4a05-49a1-8852-27b16a4c0617', meta

In [29]:
# Reorder the retrieved documents by the parsed indices, skipping any index that is out of range
reranked_docs = [retrieved_docs[i] for i in indices if 0 <= i < len(retrieved_docs)]
reranked_docs

[Document(id='48f54e7a-cd0d-4508-b284-b0d84a0fa561', metadata={'source': 'langchain_sample.txt'}, page_content='LangChain is a flexible framework designed for developing applications powered by large language models (LLMs). It provides tools and abstractions to work with LLMs more effectively and includes components for prompt management, chains, memory, and agents.'),
 Document(id='369c50f5-a3d7-4e08-ad7d-e80b626c6957', metadata={'source': 'langchain_sample.txt'}, page_content='Memory in LangChain enables context retention across multiple steps in a conversation or task, making the application more coherent and stateful.'),
 Document(id='912b2152-4a05-49a1-8852-27b16a4c0617', metadata={'source': 'langchain_sample.txt'}, page_content='LangChain supports tool integration including web search, calculators, and APIs, allowing LLMs to interact with external systems and respond more accurately to dynamic queries.'),
 Document(id='07d0cd88-6097-4a50-92c5-19ef0cc4a989', metadata={'source': 'l
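The list comprehension above skips out-of-range indices, but it would repeat a document if the model repeats a number, and it drops any document the model leaves out of its ranking. A hedged sketch of a stricter version follows. The helper name `apply_ranking` is hypothetical, and appending omitted documents in retriever order is one possible policy, not the notebook's behavior.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def apply_ranking(docs: Sequence[T], indices: Sequence[int]) -> List[T]:
    """Reorder docs by 0-based indices, ignoring invalid or repeated ones.

    Documents that the ranking never mentions are appended at the end
    in their original (retriever) order, so nothing is lost.
    """
    seen = set()
    ordered: List[T] = []
    for i in indices:
        if 0 <= i < len(docs) and i not in seen:
            seen.add(i)
            ordered.append(docs[i])
    # Fallback: keep any document the model omitted.
    ordered.extend(doc for i, doc in enumerate(docs) if i not in seen)
    return ordered


example = apply_ranking(["a", "b", "c", "d"], [2, 0, 2, 9])
print(example)
```

With an empty `indices` list the function returns the documents in their original order, which is a reasonable default when the LLM response cannot be parsed.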

In [30]:
# Show the final reranked results
print("\n📊 Final Reranked Results:")
for i, doc in enumerate(reranked_docs, 1):
    print(f"\nRank {i}:\n{doc.page_content}")


📊 Final Reranked Results:

Rank 1:
LangChain is a flexible framework designed for developing applications powered by large language models (LLMs). It provides tools and abstractions to work with LLMs more effectively and includes components for prompt management, chains, memory, and agents.

Rank 2:
Memory in LangChain enables context retention across multiple steps in a conversation or task, making the application more coherent and stateful.

Rank 3:
LangChain supports tool integration including web search, calculators, and APIs, allowing LLMs to interact with external systems and respond more accurately to dynamic queries.

Rank 4:
Agents in LangChain are chains that use LLMs to decide which tools to use and in what order. This makes them suitable for multi-step tasks like question answering with search and code execution.

Rank 5:
Retrieval-Augmented Generation (RAG) is a powerful technique where external knowledge is retrieved and passed into the prompt to ground LLM responses. La