MMR Retriever Implementation

In [1]:
import os
from dotenv import load_dotenv
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chat_models import init_chat_model
from langchain.prompts import PromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain

In [2]:
load_dotenv()

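# load_dotenv() has already copied the keys from .env into os.environ.
# The chat model used below is served by Groq, so check for its key and fail early if it is missing.
if not os.getenv('GROQ_API_KEY'):
    raise RuntimeError('GROQ_API_KEY is not set; add it to your .env file')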

In [3]:
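# Step 1: Load the document
# Set the encoding explicitly so curly quotes and dashes in the file are decoded correctly
loader = TextLoader('sample.txt', encoding='utf-8')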
raw_doc = loader.load()
raw_doc

[Document(metadata={'source': 'sample.txt'}, page_content='LangChain: LangChain serves as an extensive open-source orchestration framework that simplifies the creation of complex, LLM-powered applications by providing a modular suite of tools for prompt management and document loading while abstracting the intricacies of model integration to allow for seamless switching between different APIs with minimal code changes.\nLangGraph: LangGraph is a specialized library within the LangChain ecosystem that introduces a robust, graph-based architecture to enable stateful, multi-agent applications that move beyond linear paths by representing workflows as a series of nodes and edges for granular control over execution.\nFAISS: FAISS is a high-performance, open-source library developed by Meta’s AI research team that specializes in the efficient similarity search and clustering of dense vectors at a massive scale, utilizing advanced indexing techniques like k-means clustering and product quan

In [4]:
# Step 2: Chunk the document
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)

chunks = splitter.split_documents(raw_doc)
chunks

[Document(metadata={'source': 'sample.txt'}, page_content='LangChain: LangChain serves as an extensive open-source orchestration framework that simplifies the creation of complex, LLM-powered applications by providing a modular suite of tools for prompt management and document loading while abstracting the intricacies of model integration to allow for seamless switching between different APIs with minimal code changes.'),
 Document(metadata={'source': 'sample.txt'}, page_content='LangGraph: LangGraph is a specialized library within the LangChain ecosystem that introduces a robust, graph-based architecture to enable stateful, multi-agent applications that move beyond linear paths by representing workflows as a series of nodes and edges for granular control over execution.'),
 Document(metadata={'source': 'sample.txt'}, page_content='FAISS: FAISS is a high-performance, open-source library developed by Meta’s AI research team that specializes in the efficient similarity search and clust
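
Each paragraph in the output above ends up as its own chunk because two neighbouring paragraphs together would exceed `chunk_size=500`. The sketch below illustrates that packing idea in plain Python. It is a simplified illustration, not the `RecursiveCharacterTextSplitter` algorithm: the function name `greedy_chunks` is hypothetical, only one separator is tried, overlap is ignored, and an oversized piece is kept whole instead of being split further.

```python
def greedy_chunks(text, chunk_size=500, separator="\n"):
    """Split on separator, then pack consecutive pieces into chunks
    of at most chunk_size characters (simplified: no overlap, no recursion)."""
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Still fits, or this is the first piece of a new chunk.
            current = candidate
        else:
            # Adding the piece would overflow: close the chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Toy text: three "paragraphs" of 300, 300 and 100 characters.
toy_text = "\n".join(["a" * 300, "b" * 300, "c" * 100])
toy_chunks = greedy_chunks(toy_text, chunk_size=500)
print([len(c) for c in toy_chunks])
```

The first two paragraphs cannot share a chunk, while the short third one is merged into the second chunk.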

In [5]:
# Step 3: Embed the chunks and index them in FAISS
embedding_model = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')
vectorstore = FAISS.from_documents(chunks, embedding_model)

vectorstore

<langchain_community.vectorstores.faiss.FAISS at 0x292e4bed150>
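
Conceptually, the vector store keeps one embedding per chunk and answers a query by returning the chunks whose vectors lie closest to the query vector. The sketch below shows that idea as a brute-force search in plain Python. The 3-dimensional vectors, the chunk labels, and the helper names `l2_distance` and `nearest` are made up for illustration, and Euclidean distance is an assumption here. Real embeddings from `all-MiniLM-L6-v2` have 384 dimensions, and FAISS uses optimised index structures instead of a Python loop.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_vec, index, k=2):
    """Return the k (distance, text) pairs closest to query_vec."""
    scored = sorted((l2_distance(query_vec, vec), text) for text, vec in index)
    return scored[:k]


# Hypothetical toy index: (chunk label, embedding) pairs.
toy_index = [
    ("chunk about memory", [0.9, 0.1, 0.0]),
    ("chunk about agents", [0.1, 0.9, 0.0]),
    ("chunk about FAISS", [0.0, 0.1, 0.9]),
]

results = nearest([0.8, 0.2, 0.0], toy_index, k=2)
for dist, text in results:
    print(f"{dist:.3f}  {text}")
```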

In [6]:
# Step 4: Create MMR Retriever
retriever = vectorstore.as_retriever(
    search_type='mmr',
    search_kwargs={'k': 3},
)
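
For intuition, the selection rule behind maximal marginal relevance (MMR) can be sketched in plain Python. This is a minimal illustration, not LangChain's implementation: the helper names `unit`, `cosine` and `mmr_select`, the toy 2-D vectors, and the default `lambda_mult=0.5` are assumptions made for the example. At each step the candidate with the highest `lambda_mult * relevance - (1 - lambda_mult) * redundancy` is picked, where redundancy is the similarity to the closest document already selected.

```python
import math


def unit(degrees):
    """2-D unit vector at the given angle (toy stand-in for an embedding)."""
    r = math.radians(degrees)
    return [math.cos(r), math.sin(r)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the most similar document picked so far (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query_vec = unit(0)
doc_vecs = [unit(10), unit(12), unit(-40)]  # docs 0 and 1 are near-duplicates

print(mmr_select(query_vec, doc_vecs, k=2))                   # balances relevance and diversity
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=1.0))  # relevance only
```

With the balanced setting the near-duplicate document 1 is skipped in favour of the less similar document 2, while `lambda_mult=1.0` reduces to plain similarity ranking. In LangChain, `search_kwargs` for an MMR retriever can also carry `fetch_k` (how many candidates to fetch before re-ranking) and `lambda_mult`; the values that suit a given corpus are a tuning choice.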

In [7]:
# Step 5: Prompt and LLM
prompt = PromptTemplate.from_template(
    '''
    Answer the question based on the context provided.
    
    Context: {context}
    
    Question: {input}
    '''
)

llm = init_chat_model('groq:llama-3.3-70b-versatile')

In [8]:
# Step 6: RAG Pipeline
document_chain = create_stuff_documents_chain(
    llm=llm,
    prompt=prompt
)

rag_chain = create_retrieval_chain(
    retriever=retriever, combine_docs_chain=document_chain
)
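
The "stuff" strategy simply concatenates the retrieved documents into a single context string and substitutes it into the prompt before calling the model. The sketch below mimics that step without any LangChain dependency. The `Doc` class and `stuff_prompt` function are hypothetical stand-ins, the two sample documents are invented, and the blank-line separator is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


PROMPT = (
    "Answer the question based on the context provided.\n\n"
    "Context: {context}\n\n"
    "Question: {input}"
)


def stuff_prompt(docs, question, separator="\n\n"):
    """Join all document texts into one context block and fill the prompt template."""
    context = separator.join(d.page_content for d in docs)
    return PROMPT.format(context=context, input=question)


sample_docs = [
    Doc("Memory stores past interactions."),
    Doc("LangGraph models workflows as graphs."),
]
final_prompt = stuff_prompt(sample_docs, "How does LangChain support memory?")
print(final_prompt)
```

Because every retrieved chunk is placed in one prompt, this approach works best when `k` and the chunk size keep the combined context within the model's context window.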

In [9]:
# Step 7: Query
query = {'input': 'How does langchain support agents and memory?'}
response = rag_chain.invoke(query)
print(f"Answer:\n{response['answer']}")

Answer:
According to the context provided, LangChain supports agents and memory in the following ways:

1. **Memory**: LangChain has a dedicated module for memory, which allows conversational AI to store and retrieve information from past interactions. This enables the AI to maintain context and remember user preferences over multiple turns.

2. **Agents**: LangChain, through its library LangGraph, supports stateful, multi-agent applications. LangGraph introduces a graph-based architecture that represents workflows as a series of nodes and edges, providing granular control over execution. This allows for complex, non-linear interactions and decision-making processes, which can involve multiple agents.

In summary, LangChain supports agents by providing a framework for multi-agent applications through LangGraph, and it supports memory by having a dedicated module for storing and retrieving information from past interactions.


In [10]:
response

{'input': 'How does langchain support agents and memory?',
 'context': [Document(id='39d498e9-fcb1-475d-894e-739cb18e5b78', metadata={'source': 'sample.txt'}, page_content='Memory: Memory in LangChain refers to the dedicated module responsible for storing and retrieving information from past interactions, which allows conversational AI to maintain context and remember user preferences over multiple turns rather than treating each prompt as an isolated event.'),
  Document(id='63f5170f-c858-4c68-944d-d04c14d03d8a', metadata={'source': 'sample.txt'}, page_content='LangChain: By centering its core philosophy on the "Chain" concept, the framework enables the automated linking of various components—such as memory, external tools, and prompts—into a unified workflow designed to handle multi-step reasoning tasks while building robust AI systems that maintain data connectivity across different environments.'),
  Document(id='d29c9519-d2b7-4cc1-b792-8a29e5e8b36c', metadata={'source': 'sampl