# Agentic Search 

<p>This section sets up agentic search using LangChain. We will define an agent with the following tools:</p>

- Ability to query our Qdrant database

  - The agent can also write its own queries and keep the one that works best

- Ability to open our PDF and read a particular page
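
One way to read "make its own queries and get the best query" is to run several phrasings of the same question and keep the best-scoring hit per chunk. The sketch below assumes hits shaped like the dictionaries the query tool returns later in this notebook (`pdf_path`, `idx`, `score`). `merge_results` and the stubbed scores are hypothetical and only illustrate the idea; they are not part of LangChain or Qdrant.

```python
from typing import Dict, List, Tuple


def merge_results(result_sets: List[List[dict]], limit: int = 10) -> List[dict]:
    """Merge hits from several query phrasings, keeping the best score per chunk."""
    best: Dict[Tuple[str, str], dict] = {}
    for results in result_sets:
        for hit in results:
            key = (hit["pdf_path"], hit["idx"])  # a chunk is identified by file + chunk id
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    # Highest similarity first, truncated to the requested number of hits.
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)[:limit]


# Stubbed results for two phrasings of the same question (made-up scores).
phrasing_a = [
    {"pdf_path": "spanner.pdf", "idx": "12.txt", "score": 0.61},
    {"pdf_path": "spanner.pdf", "idx": "13.txt", "score": 0.48},
]
phrasing_b = [
    {"pdf_path": "spanner.pdf", "idx": "12.txt", "score": 0.70},
    {"pdf_path": "spanner.pdf", "idx": "40.txt", "score": 0.55},
]
merged = merge_results([phrasing_a, phrasing_b])
```

Chunk `12.txt` appears in both result sets, so only its higher-scoring copy survives the merge.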

In [148]:
# !pip install -qU langchain-qdrant

In [149]:
MAX_CHUNK_SIZE = 1024

In [150]:
from langchain_openai import OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_qdrant import QdrantVectorStore
from langchain.tools.retriever import create_retriever_tool
from qdrant_client import QdrantClient
from langchain_core.tools import Tool, tool
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
import pymupdf

from langchain.agents import AgentExecutor, ZeroShotAgent  # Tool is already imported from langchain_core.tools
from langchain.chains import LLMChain
from langchain_openai import ChatOpenAI 

embeddings_model = OpenAIEmbeddings(model="text-embedding-3-small", chunk_size=MAX_CHUNK_SIZE)
client = QdrantClient(url="http://localhost:6333")

In [151]:
vector_store = QdrantVectorStore(
    client=client,
    collection_name="pdf_collection",
    embedding=embeddings_model,
)
retriever = vector_store.as_retriever()

## Building The Agent

In [152]:
@tool
def natural_language_query(input: str):
    """Query the vector store of research-paper chunks and return the 10 most relevant chunks across all papers.

    Args:
        input: The query string. It can be a question, a sentence, or any other text. Try different queries if the first one does not return the desired results.

    Each result has the following structure:
    {
        "text": "string",
        "pdf_path": "string", # the path to the pdf file
        "page_number": 0, # the page number where the text is located
        "idx": "string", # the id of the chunk, e.g. 1.txt, 2.txt. Each document is split sequentially into chunks.
        "score": 0.0 # the cosine similarity between the input and the text
    }
    """
    # embed_query is the single-string entry point for embedding a query.
    query_vector = embeddings_model.embed_query(input)
    result = client.search(collection_name="pdf_collection", query_vector=query_vector, limit=10)
    return [
        {
            'text': r.payload['text'],
            'pdf_path': r.payload['pdf_file'],
            'page_number': r.payload['page'],
            'idx': r.payload['idx'],
            'score': r.score,
        }
        for r in result
    ]
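
The tool's docstring describes `score` as a cosine similarity. Whether the collection really uses the cosine metric depends on how it was created, which this notebook does not show. As a rough illustration of what such a score means, here is a pure-Python sketch of top-k ranking by cosine similarity over toy two-dimensional vectors. `cosine_similarity`, `top_k`, and `toy_chunks` are hypothetical names, and Qdrant's actual search is an approximate index lookup rather than this brute-force scan.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], chunks: List[dict], k: int) -> List[dict]:
    """Score every chunk against the query and return the k best, highest first."""
    scored = [{**c, "score": cosine_similarity(query, c["vector"])} for c in chunks]
    return sorted(scored, key=lambda c: c["score"], reverse=True)[:k]


# Toy "embeddings"; real embedding vectors have many more dimensions.
toy_chunks = [
    {"idx": "1.txt", "vector": [1.0, 0.0]},
    {"idx": "2.txt", "vector": [0.0, 1.0]},
    {"idx": "3.txt", "vector": [1.0, 1.0]},
]
ranked = top_k([1.0, 0.0], toy_chunks, k=2)
```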

In [153]:
@tool
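def read_pdf_page(pdf_file: str, page: int):
    """Read an entire page of the PDF file and return its text.

    Args:
        pdf_file: The path to the PDF file.
        page: The 1-based page number to read from the PDF file.
    """
    # Page numbers are 1-based for the caller; pymupdf indexes pages from 0.
    # The context manager closes the document once the text has been extracted.
    with pymupdf.open(pdf_file) as pdf:
        return pdf[page - 1].get_text()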
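
Because the tool subtracts 1 from `page`, it treats the argument as 1-based. A `page` of 0 then becomes index -1, which sequence-style indexing usually resolves to the last page instead of raising an error, so a wrong page number from the model could go unnoticed. A minimal guard, assuming the page numbers stored in the vector store are 1-based (the ingestion code is not shown here), might look like this. `to_zero_based` is a hypothetical helper name.

```python
def to_zero_based(page: int, page_count: int) -> int:
    """Convert a 1-based page number to a 0-based index, rejecting out-of-range values."""
    if not 1 <= page <= page_count:
        raise ValueError(f"page must be between 1 and {page_count}, got {page}")
    return page - 1


first_index = to_zero_based(1, 10)
last_index = to_zero_based(10, 10)
```

Inside the tool, `page_count` could come from `len(pdf)` on the opened document, and the raised message gives the agent something concrete to react to.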

In [154]:
research_prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        """
        You are an agent that searches technical research papers. You will be given a topic to find information on. You will need to find the most relevant information and return it with sources.
        Make sure you only return information that you have retrieved from the research papers. Do not make up any information. You will have access to tools to help you find the information.

        Make sure to make use of the tools provided. You can use the tools as many times as needed. Make sure you have all the information needed before returning it. If you cannot find the information, just state that you could not find it. Do not make up any information.

        Return the output by stating the information you found and the source of the information in the format pdf_path - page number - the section (if mentioned) in the papers we queried. The page number and pdf_path should only be from our vector store, do not make up any page numbers or pdf paths.
        Return only the bare minimum information needed and nothing more. Just enough to answer the question. Answer it concisely and only based on the information we have gotten through tool use.

        Always make sure to use read_pdf_page to make sure we have the right answer and understand the context of the information.
        Also make sure to use the natural_language_query tool many times with different queries that can be used to find the information needed. This means phrasing the question in different ways to get the most relevant information, asking your own questions, etc.

        eg: 
        // brief summary of the information only based on the tools we used
        // pdf_path 
        // page number
        // section (if it can be found in the text of the result)
        """
    ),
    ("human", "Research {research_topic}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
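
The prompt tells the agent to cite only paths and page numbers that came from the vector store. A prompt instruction alone does not enforce that, so one option is to check the citations afterwards. The sketch below assumes the citations have already been parsed into `(pdf_path, page_number)` pairs and that retrieved hits follow the result structure from the query tool's docstring. `unsupported_citations` and the sample data are hypothetical.

```python
from typing import Iterable, List, Tuple

Citation = Tuple[str, int]  # (pdf_path, page_number)


def unsupported_citations(cited: Iterable[Citation], retrieved: Iterable[dict]) -> List[Citation]:
    """Return citations that do not match any retrieved chunk's file and page."""
    seen = {(hit["pdf_path"], hit["page_number"]) for hit in retrieved}
    return [c for c in cited if c not in seen]


# Made-up retrieval results and citations for illustration.
retrieved_hits = [
    {"pdf_path": "spanner.pdf", "page_number": 4},
    {"pdf_path": "spanner.pdf", "page_number": 5},
]
missing = unsupported_citations([("spanner.pdf", 4), ("spanner.pdf", 9)], retrieved_hits)
```

Pages the agent opened through `read_pdf_page` could be added to the allowed set as well, since those are also legitimate sources.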

In [155]:
llm = ChatOpenAI(
    model="gpt-4o",
    max_retries=2,
    temperature=0.1,
)
agent_tools = [natural_language_query, read_pdf_page]
llm = llm.bind_tools(agent_tools)

In [156]:
def agent_scratchpad_formatter(x):
    return format_to_openai_tool_messages(x["intermediate_steps"])

agent = (
    {
        "research_topic": lambda x: x["research_topic"],
        "agent_scratchpad": agent_scratchpad_formatter,
    }
    | research_prompt
    | llm
    # Parses the OpenAI API response into tool invocations, which the AgentExecutor then executes.
    | OpenAIToolsAgentOutputParser()
)

In [157]:
agent_executor = AgentExecutor(agent=agent, tools=agent_tools, verbose=True)
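
Conceptually, an executor like this alternates between asking the model for its next step and running the requested tool, until the model returns a final answer. The toy loop below shows that control flow with a scripted stand-in for the model. It is a simplified sketch, not LangChain's implementation, and `run_agent_loop`, `scripted_policy`, and `toy_tools` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple, Union

ToolCall = Tuple[str, dict]  # (tool name, keyword arguments)
Step = Tuple[ToolCall, str]  # (tool call, observation returned by the tool)


def run_agent_loop(
    policy: Callable[[str, List[Step]], Union[ToolCall, str]],
    tools: Dict[str, Callable[..., str]],
    topic: str,
    max_iterations: int = 5,
) -> str:
    steps: List[Step] = []  # plays the role of the agent scratchpad
    for _ in range(max_iterations):
        decision = policy(topic, steps)
        if isinstance(decision, str):
            return decision  # a plain string is treated as the final answer
        name, args = decision
        observation = tools[name](**args)  # run the requested tool
        steps.append((decision, observation))
    return "Stopped: iteration limit reached."


def scripted_policy(topic: str, steps: List[Step]) -> Union[ToolCall, str]:
    # Stand-in for the LLM: search once, then answer from the last observation.
    if not steps:
        return ("search", {"query": topic})
    return f"Answer based on: {steps[-1][1]}"


toy_tools = {"search": lambda query: f"stub result for '{query}'"}
answer = run_agent_loop(scripted_policy, toy_tools, "Paxos in Spanner")
```

The iteration cap matters in practice: a model that keeps requesting tools would otherwise loop indefinitely.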

In [158]:
research_agent_result = await agent_executor.ainvoke({"research_topic": "What is Paxos in Spanner?"})
research_agent_result['output']



> Entering new AgentExecutor chain...

Invoking: `natural_language_query` with `{'input': 'Paxos in Spanner'}`


[{'text': 'To support replication, each spanserver implements a single Paxos state machine on top of each tablet. (An early Spanner incarnation supported multiple Paxos state machines per tablet, which allowed for more ﬂexible replication conﬁgurations. The complexity of that design led us to abandon it.) Each state machine stores its meta- data and log in its corresponding tablet. Our Paxos implementation supports long-lived leaders with time-based leader leases, whose length defaults to 10 seconds. The cur- rent Spanner implementation logs every Paxos write twice: once in the tablet’s log, and once in the Paxos log. This choice was made out of expediency, and we are likely to rem- edy this eventually. Our implementation of Paxos is pipelined, so as to improve Span- ner’s throughput in the presence of WAN latencies. By “pipelined,” we

'Paxos in Spanner is implemented as a single Paxos state machine on top of each tablet to support replication. Each state machine stores its metadata and log in its corresponding tablet. The Paxos implementation in Spanner supports long-lived leaders with time-based leader leases, typically lasting 10 seconds. The implementation is pipelined to improve throughput in the presence of WAN latencies, allowing for concurrent voting on different decrees. Although decrees may be approved out of order, they are applied in order. The Paxos state machines are used to implement a consistently replicated bag of mappings, with writes initiating the Paxos protocol at the leader and reads accessing state directly from any sufficiently up-to-date replica.\n\n- pdf_path: /Users/faizahmed/Documents/SJSU/topics_in_db/Project/data-ingestion/data/spanner.pdf\n- page number: 4'

[Rect(98.37019348144531, 306.89703369140625, 195.812255859375, 317.1660461425781)]