### Maximal Marginal Relevance 
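Maximal Marginal Relevance (MMR) is a diversity-aware retrieval technique used in information retrieval and RAG pipelines. It selects documents that are relevant to the query while penalizing candidates that are redundant with documents already selected, so the final set balances relevance against novelty.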
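
The selection rule can be sketched in a few lines of NumPy. This is a minimal illustration of greedy MMR under stated assumptions (cosine similarity, a single `lambda_mult` weight, non-zero vectors); the function name `mmr_select` is hypothetical, and this is not the code FAISS or LangChain runs internally.

```python
import numpy as np


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Greedy MMR sketch: return the indices of up to k selected documents.

    lambda_mult=1.0 ranks purely by relevance; lower values penalize
    candidates that resemble documents already selected.
    """
    query_vec = np.asarray(query_vec, dtype=float)
    doc_vecs = np.asarray(doc_vecs, dtype=float)

    def cosine(a, b):
        # Row-wise cosine similarity between a (n, d) and b (m, d).
        a = a / np.linalg.norm(a, axis=-1, keepdims=True)
        b = b / np.linalg.norm(b, axis=-1, keepdims=True)
        return a @ b.T

    relevance = cosine(doc_vecs, query_vec[None, :])[:, 0]  # doc vs. query
    pairwise = cosine(doc_vecs, doc_vecs)                   # doc vs. doc

    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max((pairwise[i, j] for j in selected), default=0.0)
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate vectors and one distinct vector, a low `lambda_mult` should skip the duplicate in favor of the distinct one, while `lambda_mult=1.0` falls back to plain similarity ranking.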


In [1]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_classic.document_loaders import TextLoader
from langchain_classic.text_splitter import RecursiveCharacterTextSplitter
from langchain.chat_models import init_chat_model
from langchain_classic.prompts import PromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_classic.chains.retrieval import create_retrieval_chain

In [2]:
import os
from dotenv import load_dotenv
load_dotenv()
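# Fail early with a clear message: assigning None to os.environ raises a TypeError
groq_api_key = os.getenv("GROQ_API_KEY")
if not groq_api_key:
    raise RuntimeError("GROQ_API_KEY is not set; add it to your environment or .env file.")
os.environ["GROQ_API_KEY"] = groq_api_key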



In [7]:
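# Step 1: Load the document
# Read explicitly as UTF-8 so characters such as non-breaking hyphens are not garbled
loader = TextLoader("mmr_rag_practice_document.txt", encoding="utf-8")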
raw_docs = loader.load()

# Step 2: Split the document
splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap = 50
)

chunks = splitter.split_documents(raw_docs)
# docs
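
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is only a sketch of the overlap idea: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, which this hypothetical `sliding_window_chunks` helper does not attempt.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that share an overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when it is shorter than the overlap.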

In [8]:
# Step 3: Embedding Model and FAISS vector store
embedding = HuggingFaceEmbeddings(
    model_name="all-MiniLM-L6-v2"
)
vector_store = FAISS.from_documents(chunks, embedding)


In [9]:
# Step 4: Create MMR Retriever
# k is the number of chunks returned; fetch_k and lambda_mult keep their defaults
retriever = vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 3}
)
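
The MMR retriever also accepts `fetch_k` (how many candidates are fetched before re-ranking) and `lambda_mult` (1 for minimum diversity, 0 for maximum) in `search_kwargs`. The helper below is a hypothetical convenience for building that dictionary with basic validation; the specific values are illustrative assumptions, not tuned recommendations.

```python
def mmr_search_kwargs(k=3, fetch_k=20, lambda_mult=0.5):
    """Build a search_kwargs dict for an MMR retriever, with sanity checks."""
    if not 0.0 <= lambda_mult <= 1.0:
        raise ValueError("lambda_mult must be between 0 and 1")
    if fetch_k < k:
        raise ValueError("fetch_k must be at least k")
    return {"k": k, "fetch_k": fetch_k, "lambda_mult": lambda_mult}


# Usage sketch (assumes `vector_store` from the previous cell):
# diverse_retriever = vector_store.as_retriever(
#     search_type="mmr",
#     search_kwargs=mmr_search_kwargs(k=3, fetch_k=20, lambda_mult=0.3),
# )
```

Comparing the retrieved chunks for a few `lambda_mult` values on the same query is a quick way to see the relevance/diversity trade-off in practice.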

In [13]:
# Step 5: Prompt and LLM
prompt = PromptTemplate.from_template("""
Answer the question based on the context provided.

Context:{context}

Question: {input}
"""
)

llm = init_chat_model("groq:groq/compound")

In [14]:
# Step 6: RAG Pipeline
# Stuff the retrieved chunks into the prompt's {context}, then put the retriever in front
document_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
rag_chain = create_retrieval_chain(retriever=retriever, combine_docs_chain=document_chain)

In [15]:
# Step 7: Query
query = {"input": "How does Langchain support agents and memeory?"}
response = rag_chain.invoke(query)
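
Before reading the answer, it can help to check which chunks the retriever actually passed to the LLM. A minimal sketch, assuming the response is a dict whose `"context"` entry is a list of objects with `page_content` and `metadata` attributes (as in the output of this chain); `summarize_context` is a hypothetical helper, not part of LangChain.

```python
def summarize_context(response, max_chars=80):
    """Return one short line per retrieved chunk: index, source, and a snippet."""
    lines = []
    for i, doc in enumerate(response.get("context", []), start=1):
        source = doc.metadata.get("source", "unknown")
        # Collapse whitespace so multi-line chunks fit on a single line.
        snippet = " ".join(doc.page_content.split())[:max_chars]
        lines.append(f"[{i}] {source}: {snippet}")
    return lines


# Usage sketch (assumes `response` from the cell above):
# for line in summarize_context(response):
#     print(line)
```

If the listed chunks do not mention the topic of the question, the answer is likely coming from the model's own knowledge rather than from the retrieved context.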

In [16]:
print("✔️ Answer:\n", response["answer"])

✔️ Answer:
 **How LangChain supports agents and memory**

LangChain (and its companion project LangGraph) provides a full‑stack set of primitives that let developers build **LLM‑powered agents** that can **remember** what has happened before, update that knowledge, and reuse it later. The main pieces are:

| Area | What LangChain offers | How it is used |
|------|----------------------|----------------|
| **Agent framework** | • `AgentExecutor`, `create_agent`, `ZeroShotAgent`, tool‑calling agents, and the newer **Agentic Graph** API (via LangGraph). <br>• Built‑in support for tool integration, planning, and multi‑step reasoning. | You define a set of tools (search, DB lookup, code execution, etc.) and a prompt template; the executor orchestrates calls to the LLM, decides which tool to invoke, and stitches the results together. |
| **Short‑term (thread‑level) memory** | • `ConversationBufferMemory`, `ConversationBufferWindowMemory`, `ConversationSummaryMemory`, etc. <br>• Memory is att

In [17]:
response

{'input': 'How does Langchain support agents and memeory?',
 'context': [Document(id='32f856ca-998b-419f-82c7-2b5ea7875388', metadata={'source': 'mmr_rag_practice_document.txt'}, page_content='metrics, and another about hybrid retrieval connections. This increases coverage and improves the final generated answer. LangChain provides built‑in support for MMR retrieval strategies through its retriever configuration options. Developers can adjust parameters such as lambda, which controls the trade‑off between relevance and diversity. A higher lambda value prioritizes relevance, while a lower value increases diversity. Experimenting with these values is an excellent exercise for'),
  Document(id='19b8fca0-d6c5-4961-938f-74215e509764', metadata={'source': 'mmr_rag_practice_document.txt'}, page_content='embeddings, vector databases, and re‑ranking techniques. If a user asks about embeddings, pure similarity search may return five nearly identical paragraphs explaining vector space conve