# **Agentic RAG**

Authored by [Kalyan KS](https://www.linkedin.com/in/kalyanksnlp/). For LLM, RAG, and agent updates, you can follow me on [Twitter](https://x.com/kalyan_kpl).


1. **Rewrite Agent**: Rewrites the user query if required.

2. **Router Agent**: Analyzes the user query and returns "vector_store" if the query is about git, "web_search" otherwise.

3. **Retriever Agent**: Depending on the router agent's output, uses the PDF search tool (vector store) or the DuckDuckGo search tool and returns the relevant context.

4. **Evaluator Agent**: Analyzes the query and context, returns "yes" if the context is relevant and contains the answer, "no" otherwise.

5. **Answer Agent**: Based on the evaluator's output, generates a clear and concise answer if "yes", or returns "Unable to answer the given query" if "no".
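The five steps above can be read as a plain control flow. The following is a minimal sketch of that flow only, not the CrewAI implementation used later in this notebook. `run_pipeline` and the stub callables passed to it are hypothetical names introduced for illustration, and the router stub keys on the word "git" purely as a stand-in for an LLM decision.

```python
from typing import Callable

FALLBACK = "Unable to answer the given query"


def run_pipeline(
    query: str,
    rewrite: Callable[[str], str],
    route: Callable[[str], str],
    retrievers: dict[str, Callable[[str], str]],
    evaluate: Callable[[str, str], str],
    answer: Callable[[str, str], str],
) -> str:
    """Hypothetical helper: chain rewrite -> route -> retrieve -> evaluate -> answer."""
    rewritten = rewrite(query)                # 1. rewrite the query if required
    source = route(rewritten)                 # 2. "vector_store" or "web_search"
    context = retrievers[source](rewritten)   # 3. fetch context from the chosen source
    if evaluate(rewritten, context) != "yes": # 4. relevance check
        return FALLBACK
    return answer(rewritten, context)         # 5. generate the final answer


# Stub callables standing in for the LLM-backed agents (assumptions, not real agents).
stubs = dict(
    rewrite=lambda q: q.strip(),
    route=lambda q: "vector_store" if "git" in q.lower() else "web_search",
    retrievers={
        "vector_store": lambda q: "git rebase [branch]",
        "web_search": lambda q: "",
    },
    evaluate=lambda q, ctx: "yes" if ctx else "no",
    answer=lambda q, ctx: f"Answer based on: {ctx}",
)

git_result = run_pipeline("  What is the command for git rebase?  ", **stubs)
other_result = run_pipeline("What is the weather today?", **stubs)
```

With these stubs, the git question is routed to the vector store and answered, while the second query gets empty context and falls through to the fallback string.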

## **Install and import libraries**

In [2]:
# %pip install crewai crewai-tools langchain-community langchain_huggingface duckduckgo-search

In [3]:
from crewai import Agent, Task, Crew, LLM
from crewai_tools import PDFSearchTool
from langchain_community.tools import DuckDuckGoSearchResults


## **Set up the LLM API Key**

In [4]:
import os
groq_api_key = os.getenv("GROQ_API_KEY_FE")
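`os.getenv` returns `None` when the variable is missing, so a missing key would only surface later as an authentication error. A small guard can fail earlier with a clearer message. This is a sketch; `require_env` is a hypothetical helper, not part of any library used here.

```python
import os


def require_env(name: str) -> str:
    """Hypothetical helper: return the variable's value or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

In this notebook it could replace the plain lookup, for example `groq_api_key = require_env("GROQ_API_KEY_FE")`.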

In [5]:
from langchain_community.embeddings import SentenceTransformerEmbeddings
# Note: custom_embeddings is not referenced again in this notebook;
# PDFSearchTool below configures its own embedder.
custom_embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")


## **Set up Tools**

In [6]:
# Set up PDF search tool
pdf_search_tool = PDFSearchTool(pdf='git-cheat-sheet-education.pdf',
    config=dict(
        llm=dict(
            provider="groq",
            config=dict(
                model="llama-3.3-70b-versatile",
                max_tokens=50
            ),
        ),
        embedder=dict(
            provider="huggingface",
            config=dict(
                model="all-MiniLM-L6-v2"
            ),
        ),
    )
)



In [7]:
pdf_search_tool.run("What is the command for git reset?")

Using Tool: Search a PDF's content


"Relevant Content:\nGIT CHEAT SHEET STAGE & SNAPSHOT Working with snapshots and the Git staging area git status show modiﬁed ﬁles in working directory, staged for your next commit git add [file] add a ﬁle as it looks now to your next commit (stage) git reset [file] unstage a ﬁle while retaining the changes in working directory git diff diﬀ of what is changed but not staged git diff -staged diﬀ of what is staged but not yet committed git commit -m “[descriptive message]” commit your staged content as a new commit snapshot SETUP Conﬁguring user information used across all local repositories git config -global user.name “[firstname lastname]” set a name that is identiﬁable for credit when review version history git config -global user.email “[valid-email]” set an email address that will be associated with each history marker git config -global color.ui auto set automatic command line coloring for Git for easy reviewing SETUP & INIT Conﬁguring user information, initializing and cloning rep

In [8]:
# Web Search Tool
from typing import Type, Any
from crewai.tools import BaseTool
from pydantic import BaseModel, Field

class DuckDuckGoSearchInput(BaseModel):
    """Input schema for DuckDuckGoSearch tool."""
    query: str = Field(..., description="The search query to use with DuckDuckGo.")

class DuckDuckGoSearchTool(BaseTool):
    name: str = "DuckDuckGo Search"
    description: str = "Searches the web using DuckDuckGo for the provided query."
    args_schema: Type[BaseModel] = DuckDuckGoSearchInput

    def _run(self, query: str) -> Any:
        search = DuckDuckGoSearchResults(output_format="list")
        results = search.run(query)
        return results

In [9]:
web_search_tool = DuckDuckGoSearchTool()
web_search_tool.run("Who is the current president of USA?")

Using Tool: DuckDuckGo Search


[{'snippet': 'The White House, official residence of the president of the United States, in July 2008. The president of the United States is the head of state and head of government of the United States, [1] indirectly elected to a four-year term via the Electoral College. [2] Under the U.S. Constitution, the officeholder leads the executive branch of the federal government and is the commander-in-chief of ...',
  'title': 'List of presidents of the United States - Wikipedia',
  'link': 'https://en.wikipedia.org/wiki/List_of_presidents_of_the_United_States'},
 {'snippet': 'As the head of the government of the United States, the president is arguably the most powerful government official in the world. The president is elected to a four-year term via an electoral college system. Since the Twenty-second Amendment was adopted in 1951, the American presidency has been limited to a maximum of two terms.. Click on a president below to learn more about each presidency ...',
  'title': 'list of
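
As the output above shows, the web search tool returns a list of dictionaries with `snippet`, `title`, and `link` keys. If a plain-text context with sources is preferred, the list can be flattened before it reaches the agent. This is a sketch; `format_results` and its `max_results` parameter are hypothetical and not part of the tool.

```python
def format_results(results: list[dict], max_results: int = 3) -> str:
    """Hypothetical helper: turn search hits into a readable context string with sources."""
    blocks = []
    for item in results[:max_results]:
        title = item.get("title", "")
        snippet = item.get("snippet", "")
        link = item.get("link", "")
        blocks.append(f"{title}\n{snippet}\nSource: {link}")
    # Separate hits with a blank line so each one stays readable.
    return "\n\n".join(blocks)


# Made-up sample data in the same shape as the tool output.
sample = [
    {"title": "A", "snippet": "first hit", "link": "https://example.com/a"},
    {"title": "B", "snippet": "second hit", "link": "https://example.com/b"},
]
formatted = format_results(sample, max_results=1)
```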

## **Define Agents**

In [10]:
# Set up LLM
llm = LLM(api_key=groq_api_key, model="groq/llama-3.3-70b-versatile", temperature=0)

In [11]:
# Rewrite Agent
rewrite_agent = Agent(
    role="Query Rewriting Specialist",
    goal="Analyze and improve search query for better results",
    backstory="""Expert in natural language processing and query optimization
    with years of experience in improving search effectiveness.""",
    llm=llm,
    allow_delegation=False,
    verbose=True
)

In [12]:
# Router Agent
router_agent = Agent(
    role="Traffic Router",
    goal="Determine the appropriate search destination based on query content",
    backstory="""Specialized in query analysis and routing with deep understanding
    of various data sources and their applicability.""",
    llm=llm,
    allow_delegation=False,
    verbose=True
)

In [13]:
# Retriever Agent
retriever_agent = Agent(
    role="Information Retriever",
    goal="Fetch relevant information from appropriate source",
    backstory="""Experienced in efficient information retrieval from multiple
    sources with expertise in both web and vector store searches.""",
    llm=llm,
    tools=[pdf_search_tool, web_search_tool],
    allow_delegation=False,
    verbose=True
)

In [14]:
# Evaluator Agent
evaluator_agent = Agent(
    role="Content Evaluator",
    goal="Assess the relevance and completeness of retrieved information",
    backstory="""Expert in content analysis and quality assessment with
    strong analytical skills and attention to detail.""",
    llm=llm,
    allow_delegation=False,
    verbose=True
)

In [15]:
# Answer Agent
answer_agent = Agent(
    role="Response Generator",
    goal="Generate clear and accurate answers",
    backstory="""Skilled in synthesizing information and crafting clear,
    concise responses while maintaining accuracy and relevance.""",
    llm=llm,
    allow_delegation=False,
    verbose=True
)

## **Define Tasks**

In [16]:
# Task for Rewrite Agent
rewrite_task = Task(
    description="""Analyze the input query: {query}, and rewrite it if needed to improve search effectiveness.
    Keep the query as is if it's already well-formed. Consider adding relevant keywords
    and removing unnecessary words while maintaining the original intent.""",
    expected_output="""A well-formed query that maintains the original intent but is optimized
    for search. Return the original query if no rewriting is needed.""",
    agent=rewrite_agent
)

In [17]:
# Task for Router Agent
router_task = Task(
    description="""Analyze the query to determine if it's related to git functions.
    Return 'vector_store' if the query is about git, otherwise return 'web_search'.
    Look for git commands, subcommands, or other git-related keywords.""",
    expected_output="""A string containing either 'vector_store' or 'web_search' based on
    the query analysis. The output should be lowercase and match exactly one of these two options.""",
    agent=router_agent,
    context=[rewrite_task]
)
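
The router task asks for exactly one of two lowercase strings, but an LLM may still wrap the label in quotes or add extra words. If the router output were consumed by ordinary code, a normalization step like the one below could make it safe to branch on. This is a sketch under that assumption; `normalize_route` and the `web_search` default are hypothetical choices, not part of CrewAI.

```python
VALID_ROUTES = ("vector_store", "web_search")


def normalize_route(raw: str, default: str = "web_search") -> str:
    """Hypothetical helper: map free-form router output to one valid route label."""
    # Drop surrounding whitespace, quotes, backticks, and trailing periods.
    cleaned = raw.strip().strip("'\"`. ").lower()
    if cleaned in VALID_ROUTES:
        return cleaned
    # Otherwise accept the label only if exactly one valid route is mentioned.
    mentioned = [route for route in VALID_ROUTES if route in raw.lower()]
    return mentioned[0] if len(mentioned) == 1 else default
```

Falling back to `web_search` on ambiguous output is a design choice for this sketch: a web search can still return something for any query, whereas the PDF only covers git.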

In [18]:
# Task for Retriever Agent
retriever_task = Task(
    description="""Use the appropriate search tool based on the router's output to fetch relevant information.
    For 'vector_store', use the PDF search tool. For 'web_search', use the web search tool.""",
    expected_output="""A string containing the search results. The output should include relevant
    information that directly addresses the query, with sources when available.""",
    agent=retriever_agent,
    context=[rewrite_task, router_task]
)

In [19]:
# Task for Evaluator Agent
evaluator_task = Task(
    description="""Analyze the retrieved information to determine if it's relevant to the query,
    and contains a sufficient answer. Consider factors like content relevance, completeness,
    and reliability of the information.""",
    expected_output="""A string containing 'yes' if the content is relevant and contains
    a sufficient answer, or 'no' if the content is irrelevant or insufficient.""",
    agent=evaluator_agent,
    context=[rewrite_task, retriever_task]
)
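
The evaluator is expected to return 'yes' or 'no', and the answer step depends on that verdict. If the verdict were checked in code rather than by the next agent, it could be parsed strictly so that anything other than a leading "yes" counts as a rejection. This is a sketch; `parse_verdict` and `gate_answer` are hypothetical helpers.

```python
import re

FALLBACK = "Unable to answer the given query"


def parse_verdict(raw: str) -> bool:
    """Hypothetical helper: True only if the first word of the output is 'yes'."""
    # Skip leading punctuation such as quotes, then require a whole word.
    match = re.match(r"\W*(yes|no)\b", raw.strip(), flags=re.IGNORECASE)
    return bool(match) and match.group(1).lower() == "yes"


def gate_answer(verdict_raw: str, draft_answer: str) -> str:
    """Hypothetical helper: return the draft only when the evaluator approved the context."""
    return draft_answer if parse_verdict(verdict_raw) else FALLBACK
```

Treating unparseable output as "no" keeps the pipeline conservative: it declines to answer rather than answering from context that was never confirmed as relevant.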

In [20]:
# Task for Answer Agent
answer_task = Task(
    description="""Generate a comprehensive answer to the query based on the retrieved information.
    If the evaluator returned 'yes', synthesize the information into a clear and concise response.
    If 'no', return 'Unable to answer the given query'.""",
    expected_output="""A string containing either a well-formatted answer based on the retrieved
    information or 'Unable to answer the given query' if the information was deemed insufficient.""",
    agent=answer_agent,
    context=[rewrite_task, retriever_task, evaluator_task]
)

## **Define Crew**

In [21]:
agentic_rag_crew = Crew(
    agents=[rewrite_agent, router_agent, retriever_agent, evaluator_agent, answer_agent],
    tasks=[rewrite_task, router_task, retriever_task, evaluator_task, answer_task],
    verbose=True,
)

## **Run Crew**

In [22]:
inputs = {"query": "What is quantum computing? and What is the command for git rebase?"}
result = agentic_rag_crew.kickoff(inputs=inputs)

[1m[95m# Agent:[00m [1m[92mQuery Rewriting Specialist[00m
[95m## Task:[00m [92mAnalyze the input query: What is quantum computing? and What is the command for git rebase?, and rewrite it if needed to improve search effectiveness.
    Keep the query as is if it's already well-formed. Consider adding relevant keywords
    and removing unnecessary words while maintaining the original intent.[00m


[1m[95m# Agent:[00m [1m[92mQuery Rewriting Specialist[00m
[95m## Final Answer:[00m [92m
The input query consists of two separate questions: "What is quantum computing?" and "What is the command for git rebase?" 

For the first question, "What is quantum computing?", the query is already well-formed and clear. However, to improve search effectiveness, we can add relevant keywords. A rewritten query could be: "Introduction to quantum computing basics" or "Quantum computing definition and principles". 

For the second question, "What is the command for git rebase?", the query is 

In [23]:
print(result)

Introduction to quantum computing basics and Git rebase command syntax. Quantum computing is reshaping the technological landscape, offering unprecedented computational power to solve complex problems that were once deemed unsolvable. The git rebase command is used for rebasing commits to reflect a more logical flow, editing commit messages before pushing them to a remote repository, and for keeping a clean, linear commit history. The command for git rebase is "git rebase [branch name]".
