### Re-Ranking Techniques:

Re-ranking is the second-stage filtering step in a retrieval system, and it is especially common in RAG pipelines. It works in two steps:

1. First, use a fast retriever (BM25, FAISS, or a hybrid retriever) to fetch the top-k documents quickly.
2. Then, use a more accurate but slower model (such as a cross-encoder or an LLM) to re-score and re-order those documents by relevance to the query.
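
The two steps above can be sketched without any external model. This is a minimal illustration only: `first_stage`, `rerank`, and `toy_scorer` are hypothetical helpers, the corpus is made up, and the Jaccard scorer merely stands in for the slower cross-encoder or LLM that a real pipeline would use.

```python
from typing import Callable, List, Tuple


def first_stage(query: str, corpus: List[str], k: int) -> List[str]:
    """Cheap recall step: rank documents by how many query tokens they share."""
    q_tokens = set(query.lower().split())
    scored = [
        (len(q_tokens & set(doc.lower().split())), i)
        for i, doc in enumerate(corpus)
    ]
    # Highest overlap first; ties keep the original corpus order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [corpus[i] for _, i in scored[:k]]


def toy_scorer(query: str, doc: str) -> float:
    """Stand-in for an expensive relevance model: Jaccard similarity of token sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def rerank(
    query: str, candidates: List[str], scorer: Callable[[str, str], float]
) -> List[Tuple[float, str]]:
    """Precision step: re-score only the shortlisted candidates and sort them."""
    scored = [(scorer(query, doc), doc) for doc in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


corpus = [
    "what makes a framework popular is hard to say but langchain has many users and many integrations today",
    "langchain is compelling",
    "faiss is a vector search library",
    "bm25 is a sparse retrieval function",
]
query = "what makes langchain compelling"

candidates = first_stage(query, corpus, k=2)      # fast, runs over the whole corpus
reranked = rerank(query, candidates, toy_scorer)  # slower, runs over k documents only

for score, doc in reranked:
    print(f"{score:.2f}  {doc}")
```

In this toy corpus the long first document wins the first stage on raw token overlap, while the second-stage scorer moves the short, on-topic document to the top. The expensive scorer only ever sees `k` candidates, which is what keeps the second stage affordable.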

In [48]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate
from langchain_core.documents import Document
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser




In [41]:
# Load the text file (explicit UTF-8 avoids mis-decoded punctuation)
loader = TextLoader('langchain_sample.txt', encoding='utf-8')
documents = loader.load()
print(f"Number of documents: {len(documents)}")

# Split the documents into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)
chunks = splitter.split_documents(documents)
print(f"Number of chunks: {len(chunks)}")
print("\nSample chunks: ")
for chunk in chunks[:3]:
    print(chunk)

  

Number of documents: 1
Number of chunks: 43

Sample chunks: 
page_content='LangChain is an open-source framework that enables developers to build powerful applications powered by large language models (LLMs), offering a modular, flexible, and highly extensible ecosystem that' metadata={'source': 'langchain_sample.txt'}
page_content='flexible, and highly extensible ecosystem that bridges the gap between raw model capabilities and real-world use cases.LangChain was created to simplify the process of connecting LLMs to external' metadata={'source': 'langchain_sample.txt'}
page_content='the process of connecting LLMs to external data sources, tools, and workflows, making it possible to build intelligent agents, retrieval-augmented generation (RAG) systems, and complex pipelines with' metadata={'source': 'langchain_sample.txt'}


In [42]:
# Embedding model and vector store
embed_model = HuggingFaceEmbeddings(model="sentence-transformers/all-MiniLM-L6-v2")

vectorstore = FAISS.from_documents(documents=chunks, embedding=embed_model)

# First-stage retriever: fetch the top 8 candidates to re-rank
retriever = vectorstore.as_retriever(search_kwargs={"k": 8})


In [43]:
#llm model
llm = init_chat_model(model="groq:llama-3.1-8b-instant")

#prompt template
prompt = PromptTemplate.from_template(
    '''
You are a helpful assistant. Your task is to rank the following documents from most to least relevant.

User question: "{question}"
Documents: {documents}

Instructions:
- Think about the relevance of each document to the user's question.
- Return a list of document indices in ranked order, starting from most relevant

Output format: comma-separated document indices (e.g., 2,1,4,0..)'''
)

prompt

PromptTemplate(input_variables=['documents', 'question'], input_types={}, partial_variables={}, template='\nYou are a helpful assistant. Your task is to rank the following documents from most to least relevant.\n\nUser question: "{question}"\nDocuments: {documents}\n\nInstructions:\n- Think about the relevance of each document to the user\'s question.\n- Return a list of document indices in ranked order, starting from most relevant\n\nOutput format: comma-separated document indices (e.g., 2,1,4,0..)')

In [None]:
from typing import List

query = "what makes langchain compelling"

def format_docs(relevant_docs: List[Document]) -> List[str]:
    """Prefix each retrieved chunk with a 1-based index for the re-ranking prompt."""
    formatted_docs = []
    for index, doc in enumerate(relevant_docs, start=1):
        formatted_docs.append(f"{index}. {doc.page_content} ")
    return formatted_docs

formatted_docs = format_docs(retriever.invoke(query))
formatted_docs


['1. What makes LangChain particularly compelling is its balance between ease of use for beginners and advanced customization for experts. Beginners can quickly build prototypes using pre-built chains and ',
 '2. Beyond its technical features, LangChain has fostered a vibrant community of developers and researchers who contribute extensions, share best practices, and collaborate on new use cases. The ',
 '3. In summary, LangChain is more than just a framework—it is an ecosystem that empowers developers to harness the full potential of large language models. By providing modular chains, agent ',
 '4. LangChain is an open-source framework that enables developers to build powerful applications powered by large language models (LLMs), offering a modular, flexible, and highly extensible ecosystem that ',
 '5. In practice, LangChain has been applied across diverse industries. In finance, it powers systems that retrieve and summarize market data. In healthcare, it supports applications that

In [45]:
combine_doc = "\n".join(formatted_docs)
combine_doc

'1. What makes LangChain particularly compelling is its balance between ease of use for beginners and advanced customization for experts. Beginners can quickly build prototypes using pre-built chains and \n2. Beyond its technical features, LangChain has fostered a vibrant community of developers and researchers who contribute extensions, share best practices, and collaborate on new use cases. The \n3. In summary, LangChain is more than just a framework—it is an ecosystem that empowers developers to harness the full potential of large language models. By providing modular chains, agent \n4. LangChain is an open-source framework that enables developers to build powerful applications powered by large language models (LLMs), offering a modular, flexible, and highly extensible ecosystem that \n5. In practice, LangChain has been applied across diverse industries. In finance, it powers systems that retrieve and summarize market data. In healthcare, it supports applications that help clinici

In [53]:
re_rank_chain = prompt | llm | StrOutputParser()
# Pass the newline-joined string so the prompt shows one numbered document per line
response = re_rank_chain.invoke({"documents": combine_doc, "question": query})
print(response)



Based on the user's question "what makes LangChain compelling," I have analyzed the documents and ranked them from most to least relevant. Here is the list of document indices:

3, 1, 7, 4, 6, 8, 2, 5

Explanation:
- Document 3 directly states that LangChain is more than just a framework, which implies its compelling features. This document is most relevant.
- Document 1 mentions LangChain's balance between ease of use and advanced customization, which is a compelling aspect of the framework. This document is the second most relevant.
- Document 7 highlights LangChain's agent architecture as a key feature, which contributes to its compelling nature.
- Document 4 describes LangChain's modular, flexible, and highly extensible ecosystem, making it a compelling choice for developers.
- Document 6 explains LangChain's core features, including chains and agent architecture, which are essential to its compelling nature.
- Document 8 mentions LangChain's emphasis on prompt management and optim

In [66]:
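# Alternative: fold retrieval and formatting into the chain itself

def f_docs(query: str) -> str:
    """Retrieve chunks for the query and return them as a numbered, newline-separated string."""
    relevant_docs = retriever.invoke(query)
    format_doc = []
    for i, doc in enumerate(relevant_docs, start=1):
        format_doc.append(f"{i}. {doc.page_content}")
    return "\n".join(format_doc)

# The dict is coerced to a RunnableParallel: the query is passed through as
# "question" and is also fed to f_docs to build "documents".
chain = (
    {"question": RunnablePassthrough(), "documents": f_docs}
    | prompt
    | llm
    | StrOutputParser()
)

result = chain.invoke(query)
print(result)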

Based on the user's question "what makes LangChain compelling," I've analyzed the relevance of each document to the question. Here are the ranked document indices:

2, 1, 4, 7, 0

Here's the reasoning behind the ranking:

- Document 2 directly answers the question by highlighting LangChain's balance between ease of use and advanced customization, making it a strong candidate for the top spot.
- Document 1 also addresses the question by mentioning LangChain's balance between ease of use and advanced customization, making it a close second.
- Document 4 provides a broader overview of LangChain's features and ecosystem, which indirectly answers the question by showcasing its compelling aspects.
- Document 7 highlights LangChain's agent architecture, which can be seen as a compelling feature, especially for those interested in autonomous decision-making.
- Document 0 (the user's question itself) is not a document in the classical sense, but it's the starting point for the analysis. However
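
Both replies wrap the ranking in free-form explanation, and the second one includes index 0, which matches none of the 1-based document numbers (possibly because the example in the prompt contains a 0). Before the ranking can be used to re-order the retrieved chunks, it has to be extracted and validated. Below is a minimal sketch of one way to do that; `parse_ranking` and `apply_ranking` are hypothetical helpers, not part of LangChain, and the sketch assumes the ranking appears on its own line as comma-separated integers, as it does in the replies above.

```python
import re
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def parse_ranking(response: str, n_docs: int) -> List[int]:
    """Turn an LLM reply containing 1-based indices into a list of 0-based positions."""
    ranking_line = None
    for line in response.splitlines():
        # The first line made up only of comma-separated integers is taken as the ranking.
        if re.fullmatch(r"\s*\d+(\s*,\s*\d+)+\s*", line):
            ranking_line = line
            break

    order: List[int] = []
    if ranking_line is not None:
        for token in ranking_line.split(","):
            idx = int(token) - 1  # documents were numbered starting at 1
            # Skip out-of-range indices (such as 0) and duplicates.
            if 0 <= idx < n_docs and idx not in order:
                order.append(idx)

    # Documents the model left out keep their retrieval order at the end.
    order.extend(i for i in range(n_docs) if i not in order)
    return order


def apply_ranking(docs: Sequence[T], order: List[int]) -> List[T]:
    """Re-order docs according to the 0-based positions in order."""
    return [docs[i] for i in order]


reply = (
    "Here are the ranked document indices:\n\n"
    "2, 1, 4, 7, 0\n\n"
    "Here's the reasoning behind the ranking:"
)
order = parse_ranking(reply, n_docs=8)
print(order)
```

With the retriever from earlier, the re-ranked chunks could then be obtained with something like `apply_ranking(retriever.invoke(query), order)`. Falling back to retrieval order for missing or invalid indices is a design choice of this sketch; a stricter pipeline might instead retry the LLM call.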