# LangChain Retrievers


In [24]:
from langchain_community.retrievers import WikipediaRetriever
# Return at most the top 2 English Wikipedia articles for each query
retrieverwiki = WikipediaRetriever(top_k_results=2, lang='en')

In [25]:
query="Attention is all you need"
docs=retrieverwiki.invoke(query)

In [26]:
docs

[Document(metadata={'title': 'Attention Is All You Need', 'summary': '"Attention Is All You Need" is a 2017 landmark research paper in machine learning authored by eight scientists working at Google. The paper introduced a new deep learning architecture known as the transformer, based on the attention mechanism proposed in 2014 by Bahdanau et al. It is considered a foundational paper in modern artificial intelligence, and a main contributor to the AI boom, as the transformer approach has become the main architecture of a wide variety of AI, such as large language models. At the time, the focus of the research was on improving Seq2seq techniques for machine translation, but the authors go further in the paper, foreseeing the technique\'s potential for other tasks like question answering and what is now known as multimodal generative AI.\nThe paper\'s title is a reference to the song "All You Need Is Love" by the Beatles. The name "Transformer" was picked because Jakob Uszkoreit, one of 

In [27]:
docs[0].page_content

'"Attention Is All You Need" is a 2017 landmark research paper in machine learning authored by eight scientists working at Google. The paper introduced a new deep learning architecture known as the transformer, based on the attention mechanism proposed in 2014 by Bahdanau et al. It is considered a foundational paper in modern artificial intelligence, and a main contributor to the AI boom, as the transformer approach has become the main architecture of a wide variety of AI, such as large language models. At the time, the focus of the research was on improving Seq2seq techniques for machine translation, but the authors go further in the paper, foreseeing the technique\'s potential for other tasks like question answering and what is now known as multimodal generative AI.\nThe paper\'s title is a reference to the song "All You Need Is Love" by the Beatles. The name "Transformer" was picked because Jakob Uszkoreit, one of the paper\'s authors, liked the sound of that word.\nAn early design 

# Maximal Marginal Relevance (MMR)

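MMR balances relevance to the query against diversity among the results already selected, so the retriever returns varied results rather than near-duplicates.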
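To make the relevance/diversity trade-off concrete, here is a minimal pure-Python sketch of greedy MMR selection. It is an illustration under stated assumptions, not LangChain's actual implementation: the helper names `cosine` and `mmr_select` and the two-dimensional toy vectors are hypothetical, and the score `lambda_mult * relevance - (1 - lambda_mult) * redundancy` is the commonly used form of the criterion.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Greedily pick k document indices (hypothetical helper).

    lambda_mult=1.0 ranks by relevance only; lower values penalize
    candidates that resemble documents that were already picked.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected

# Toy embeddings: documents 0 and 1 are near-duplicates, document 2 differs.
query_vec = [1.0, 1.0]
doc_vecs = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]

similarity_order = mmr_select(query_vec, doc_vecs, k=2, lambda_mult=1.0)
mmr_order = mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5)
print(similarity_order)  # [0, 1]
print(mmr_order)         # [0, 2]
```

With `lambda_mult=1.0` the two near-duplicates win; with `lambda_mult=0.5` the second pick switches to the less similar document, which is the behavior the `search_type='mmr'` retriever is configured for in this section.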

In [30]:
from langchain_core.documents import Document

doc = [
    Document(page_content="Agentic AI refers to artificial intelligence systems that can operate autonomously, make decisions, and take actions to achieve specific goals. Unlike traditional AI models that passively respond to prompts, agentic systems actively plan, reason, and interact with their environment. These systems combine LLMs with tools, memory, and multi-step workflows to complete tasks with minimal human input."),
    
    Document(page_content="One of the key features of agentic AI is its ability to break down complex tasks into smaller subtasks. Using reasoning capabilities and access to APIs, databases, or other tools, an agent can decide what actions to take and in what order. For example, a research assistant agent can read documents, summarize key points, and generate a report without being micromanaged."),
    
    Document(page_content="LangChain provides the infrastructure to build agentic AI applications by combining language models with tools, memory, and agents. Agents in LangChain use a chain of thought approach, where the LLM generates intermediate steps and decides which tools to use next. This makes it ideal for applications like automated customer support, market research, and workflow automation."),
    
    Document(page_content="Agentic AI is widely used in productivity tools such as smart email assistants, coding copilots, and task managers. These agents don't just respond — they reason, schedule tasks, write emails, and even manage projects by interacting with multiple apps. This level of autonomy can save professionals hours of repetitive work every week."),
    
    Document(page_content="Despite its power, agentic AI poses challenges such as reliability, safety, and alignment with human intent. Agents may take incorrect actions if not properly constrained or monitored. Developers must implement safeguards, reasoning checks, and fallback strategies to ensure agentic systems behave responsibly and transparently."),
]

In [31]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from dotenv import load_dotenv
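# Load environment variables (e.g. OPENAI_API_KEY) from a local .env file
load_dotenv()
embeddings = OpenAIEmbeddings()
# Embed the documents and index them in an in-memory FAISS vector store
vectorstore = FAISS.from_documents(doc, embeddings)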

In [35]:
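# Enable MMR in the retriever to get diverse results
retriever = vectorstore.as_retriever(
    search_type='mmr',  # use MMR instead of plain similarity search
    # k: number of results; lambda_mult: 1 = pure relevance, 0 = maximum diversity
    search_kwargs={"k": 3, "lambda_mult": 0.5}
)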

In [36]:
query="key features of agentic Ai"
results=retriever.invoke(query)

In [37]:
# Use a loop variable other than `doc` so the document list defined above is not overwritten
for i, result in enumerate(results):
    print(f"Result {i+1}:")
    print(result.page_content)
    print("-"*100)

Result 1:
One of the key features of agentic AI is its ability to break down complex tasks into smaller subtasks. Using reasoning capabilities and access to APIs, databases, or other tools, an agent can decide what actions to take and in what order. For example, a research assistant agent can read documents, summarize key points, and generate a report without being micromanaged.
----------------------------------------------------------------------------------------------------
Result 2:
LangChain provides the infrastructure to build agentic AI applications by combining language models with tools, memory, and agents. Agents in LangChain use a chain of thought approach, where the LLM generates intermediate steps and decides which tools to use next. This makes it ideal for applications like automated customer support, market research, and workflow automation.
----------------------------------------------------------------------------------------------------
Result 3:
Despite its power, 

# Multi-Query Retriever

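Resolve ambiguity in the user's query by having an LLM generate several rephrasings, retrieving documents for each, and returning the de-duplicated union.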
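The flow can be sketched without an LLM or a vector store. Everything in the block below is a hypothetical stand-in: `toy_rewrite` returns fixed query variants in place of LLM-generated ones, `toy_retrieve` matches on shared words instead of embeddings, and `unique_union` and `multi_query_retrieve` are illustrative names, not LangChain functions.

```python
def unique_union(result_lists):
    """Merge several result lists, keeping the first occurrence of each document."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

def multi_query_retrieve(query, rewrite, retrieve):
    """Run the retriever once per query variant and de-duplicate the union."""
    queries = rewrite(query)
    return unique_union(retrieve(q) for q in queries)

# Toy corpus (plain strings instead of Document objects)
corpus = [
    "Drinking water helps maintain energy.",
    "Deep sleep supports emotional regulation.",
    "Solar panels balance electricity demand.",
]

def toy_rewrite(query):
    # Assumption: fixed rephrasings standing in for LLM output
    return [query, "ways to boost energy", "how to keep emotional balance"]

def toy_retrieve(query):
    # Assumption: word overlap standing in for embedding similarity
    words = set(query.lower().split())
    return [d for d in corpus if words & set(d.lower().rstrip(".").split())]

merged = multi_query_retrieve("improve energy and balance", toy_rewrite, toy_retrieve)
for text in merged:
    print(text)
```

Because the result is a union across all query variants, it can contain more documents than the base retriever's `k`, and a variant can surface a document that the original wording missed (here, the sleep document).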

In [39]:
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

# Health & wellness documents (H*) mixed with unrelated distractors (I*) that reuse words like "energy" and "balance"
all_docs = [
    Document(page_content="Regular walking boosts heart health and can reduce symptoms of depression.", metadata={"source": "H1"}),
    Document(page_content="Consuming leafy greens and fruits helps detox the body and improve longevity.", metadata={"source": "H2"}),
    Document(page_content="Deep sleep is crucial for cellular repair and emotional regulation.", metadata={"source": "H3"}),
    Document(page_content="Mindfulness and controlled breathing lower cortisol and improve mental clarity.", metadata={"source": "H4"}),
    Document(page_content="Drinking sufficient water throughout the day helps maintain metabolism and energy.", metadata={"source": "H5"}),
    Document(page_content="The solar energy system in modern homes helps balance electricity demand.", metadata={"source": "I1"}),
    Document(page_content="Python balances readability with power, making it a popular system design language.", metadata={"source": "I2"}),
    Document(page_content="Photosynthesis enables plants to produce energy by converting sunlight.", metadata={"source": "I3"}),
    Document(page_content="The 2022 FIFA World Cup was held in Qatar and drew global energy and excitement.", metadata={"source": "I4"}),
    Document(page_content="Black holes bend spacetime and store immense gravitational energy.", metadata={"source": "I5"}),
]

In [40]:
vectorstore2=FAISS.from_documents(all_docs,embedding=embeddings)

In [41]:
similarity_retriver=vectorstore2.as_retriever(search_type="similarity",search_kwargs={"k":5})

In [52]:
multiquery_retriever=MultiQueryRetriever.from_llm(
    retriever=vectorstore2.as_retriever(search_kwargs={"k":5}),
    llm=ChatOpenAI(model="gpt-4o"),
)

In [53]:
query="how to improve energy level and maintain balance?"

In [54]:
# Retrieve results with both retrievers for comparison
similarity_results=similarity_retriver.invoke(query)
multiquery_results=multiquery_retriever.invoke(query)

In [55]:
for i, doc in enumerate(similarity_results):
    print(f"Result {i+1}:")
    print(doc.page_content)
    print("-"*100)

Result 1:
Drinking sufficient water throughout the day helps maintain metabolism and energy.
----------------------------------------------------------------------------------------------------
Result 2:
Mindfulness and controlled breathing lower cortisol and improve mental clarity.
----------------------------------------------------------------------------------------------------
Result 3:
Consuming leafy greens and fruits helps detox the body and improve longevity.
----------------------------------------------------------------------------------------------------
Result 4:
Regular walking boosts heart health and can reduce symptoms of depression.
----------------------------------------------------------------------------------------------------
Result 5:
Deep sleep is crucial for cellular repair and emotional regulation.
----------------------------------------------------------------------------------------------------


In [56]:
for i, doc in enumerate(multiquery_results):
    print(f"Result {i+1}:")
    print(doc.page_content)
    print("-"*100)

Result 1:
Drinking sufficient water throughout the day helps maintain metabolism and energy.
----------------------------------------------------------------------------------------------------
Result 2:
Consuming leafy greens and fruits helps detox the body and improve longevity.
----------------------------------------------------------------------------------------------------
Result 3:
Mindfulness and controlled breathing lower cortisol and improve mental clarity.
----------------------------------------------------------------------------------------------------
Result 4:
The solar energy system in modern homes helps balance electricity demand.
----------------------------------------------------------------------------------------------------
Result 5:
Deep sleep is crucial for cellular repair and emotional regulation.
----------------------------------------------------------------------------------------------------
Result 6:
Regular walking boosts heart health and can reduce s

# Contextual Compression Retriever

In [67]:
# Recreate the document objects from the previous data
docs1 = [
    Document(page_content=(
        """The Grand Canyon is one of the most visited natural wonders in the world.
        Photosynthesis is the process by which green plants convert sunlight into energy.
        Millions of tourists travel to see it every year. The rocks date back millions of years."""
    ), metadata={"source": "Doc1"}),

    Document(page_content=(
        """In medieval Europe, castles were built primarily for defense.
        The chlorophyll in plant cells captures sunlight during photosynthesis.
        Knights wore armor made of metal. Siege weapons were often used to breach castle walls."""
    ), metadata={"source": "Doc2"}),

    Document(page_content=(
        """Basketball was invented by Dr. James Naismith in the late 19th century.
        It was originally played with a soccer ball and peach baskets. NBA is now a global league."""
    ), metadata={"source": "Doc3"}),

    Document(page_content=(
        """The history of cinema began in the late 1800s. Silent films were the earliest form.
        Thomas Edison was among the pioneers. Photosynthesis does not occur in animal cells.
        Modern filmmaking involves complex CGI and sound design."""
    ), metadata={"source": "Doc4"})
]

In [83]:
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

In [84]:
vectorstore4=FAISS.from_documents(docs1,embeddings)

In [86]:
base_retriever=vectorstore4.as_retriever(search_kwargs={"k":5})

In [87]:
# setup the compressor using an LLM
llm=ChatOpenAI(model="gpt-4o")
compressor=LLMChainExtractor.from_llm(llm)

In [88]:
# Wrap the base retriever so each retrieved document is compressed before it is returned
compression_retriever=ContextualCompressionRetriever(
    base_retriever=base_retriever,
    base_compressor=compressor)

In [89]:
query3="What is photosynthesis?"
compressed_results=compression_retriever.invoke(query3)

In [90]:
for i, doc in enumerate(compressed_results):
    print(f"Result {i+1}:")
    print(doc.page_content)
    print("-"*100)

Result 1:
Photosynthesis is the process by which green plants convert sunlight into energy.
----------------------------------------------------------------------------------------------------
Result 2:
The chlorophyll in plant cells captures sunlight during photosynthesis.
----------------------------------------------------------------------------------------------------
Result 3:
Photosynthesis does not occur in animal cells.
----------------------------------------------------------------------------------------------------
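The output above keeps only the sentences about photosynthesis and drops the basketball document entirely. A rough way to picture that behavior is the keyword filter below. It is only a sketch: `split_sentences` and `compress` are hypothetical helpers, and a substring match is a crude stand-in for the LLM-based extraction that `LLMChainExtractor` performs.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]

def compress(documents, keyword):
    """Keep only sentences mentioning the keyword; drop documents with none."""
    compressed = []
    for text in documents:
        kept = [s for s in split_sentences(text) if keyword.lower() in s.lower()]
        if kept:
            compressed.append(" ".join(kept))
    return compressed

documents = [
    "The Grand Canyon is one of the most visited natural wonders in the world. "
    "Photosynthesis is the process by which green plants convert sunlight into energy. "
    "Millions of tourists travel to see it every year.",
    "Basketball was invented by Dr. James Naismith in the late 19th century. "
    "NBA is now a global league.",
]

compressed_texts = compress(documents, "photosynthesis")
for text in compressed_texts:
    print(text)
```

The first document shrinks to its single relevant sentence and the second is removed, mirroring how the compressed results are shorter and fewer than what the base retriever returns.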
