# Agentic RAG

- Author: [Harheem Kim](https://github.com/harheem)
- Design:
- Peer Review:
- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/15-Agent/06-Agentic-RAG.ipynb) [![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/15-Agent/06-Agentic-RAG.ipynb)

## Overview

**Agentic RAG** extends traditional RAG (Retrieval-Augmented Generation) systems by incorporating an agent-based approach for more sophisticated information retrieval and response generation. This system goes beyond simple document retrieval and response generation by enabling agents to utilize various tools for more intelligent information processing. These tools include `Tavily Search` for accessing up-to-date information, `Python` code execution capabilities, and custom function implementations, all integrated within the `LangChain` framework to provide a comprehensive solution for information processing and generation tasks.

This tutorial demonstrates how to build a document retrieval system using `FAISS DB` for effective PDF document processing and searching. Using the chapter "What Is AI?" from *An Introduction to Ethics in Robotics and AI* as an example document, we'll explore how to integrate PDF document loaders, text splitters, vector stores, and `OpenAI` embeddings to create a practical **Agentic RAG** system. The implementation shows how the `Retriever` tool can be combined with other `LangChain` components to build a document search and response generation pipeline.

### Table of Contents

- [Overview](#overview)
- [Environment Setup](#environment-setup)
- [Configuring Tools](#configuring-tools)
- [Building the Agent](#building-the-agent)
- [Implementing Chat History](#implementing-chat-history)
- [Running Examples](#running-examples)

### References

- [LangChain Docs - Build an Agent with AgentExecutor (Legacy)](https://python.langchain.com/docs/how_to/agent_executor/)
- [LangChain Docs - How to use a vectorstore as a retriever](https://python.langchain.com/docs/how_to/vectorstore_retriever/)
- [LangChain Docs - How to add chat history](https://python.langchain.com/docs/how_to/qa_chat_history_how_to/)
- [Tavily](https://tavily.com/)
----

## Environment Setup

Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.

**[Note]**
- `langchain-opentutorial` is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials.
- You can check out [`langchain-opentutorial`](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details.

In [1]:
%%capture --no-stderr
%pip install langchain-opentutorial

In [2]:
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langchain_community",
        "langchain_openai",
        "langchain_core",
        "faiss-cpu",
        "pypdf",
    ],
    verbose=False,
    upgrade=False,
)

`LangChain` provides built-in tools that make it easy to use the `Tavily` search engine as a tool in your applications.

To use `Tavily Search`, you'll need to obtain an API key.

Click [here](https://app.tavily.com/sign-in) to sign up on the `Tavily` website and get your `Tavily Search` API key.

In [3]:
# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "TAVILY_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "06-Agentic-RAG",
    }
)

Environment variables have been set successfully.


You can alternatively set API keys in a `.env` file and load it.

[Note] This is not necessary if you've already set API keys in previous steps.

In [4]:
from dotenv import load_dotenv

# Load API key information
load_dotenv(override=True)

True

## Configuring Tools

This section sets up the tools the agent can use: a web search tool built on the `Tavily Search API` and a PDF document retrieval tool. These tools let the agent search and use information from several sources, and with both available the agent can choose the appropriate tool for each query based on context.

### Implementing Web Search

The web search tool utilizes the `Tavily Search API` to retrieve real-time information from the web. It returns up to 6 results ranked by relevance, with each result containing a URL and content snippet.
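
Because each result is a plain dictionary with a URL and a content snippet, the list is easy to post-process before showing it to a user. The following is a minimal sketch that assumes that shape. `sample_results` and `format_results` are hypothetical names introduced here for illustration, and the sample data is made up rather than returned by Tavily.

```python
# Hypothetical sample data in the same shape as the search output:
# a list of dicts, each with "url" and "content" keys.
sample_results = [
    {
        "url": "https://example.com/a",
        "content": "A.I. Artificial Intelligence is a 2001 science fiction film directed by Steven Spielberg.",
    },
    {"url": "https://example.com/b", "content": "Short snippet."},
]


def format_results(results, max_chars=60):
    """Render each result as a numbered line: snippet preview plus source URL."""
    lines = []
    for i, item in enumerate(results, start=1):
        snippet = item["content"]
        # Truncate long snippets so each line stays readable
        if len(snippet) > max_chars:
            snippet = snippet[:max_chars].rstrip() + "..."
        lines.append(f"[{i}] {snippet} ({item['url']})")
    return "\n".join(lines)


print(format_results(sample_results))
```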

In [None]:
from langchain_community.tools.tavily_search import TavilySearchResults

# Create a search tool instance that returns up to 6 results
# (`max_results` is the parameter that controls the result count)
search = TavilySearchResults(max_results=6)

In [None]:
# Example usage
result = search.invoke("When was the movie A.I. released and who is the director?")
print(result)

[{'url': 'https://spielberg.fandom.com/wiki/A.I._Artificial_Intelligence', 'content': "A.I. Artificial Intelligence | Spielberg Wiki | Fandom Spielberg Wiki All Pages Recent Blog Posts Blogs Wikis Explore Wikis Community Central Spielberg Wiki All Pages Recent Blog Posts Blogs Sign In Register Spielberg Wiki pages All Pages Recent Blog Posts Blogs A.I. Artificial Intelligence[1] (or simply A.I.) is a 2001 science fiction film directed by Steven Spielberg. In 1995, Kubrick handed A.I. to Spielberg, but the film did not gain momentum until Kubrick died in 1999. Spielberg remained close to Watson's treatment for the screenplay, and dedicated the film to Kubrick. In a 2016 BBC poll of 177 critics around the world, A.I. Artificial Intelligence was voted the eighty-third greatest film since 2000. Spielberg Wiki is a FANDOM Movies Community."}, {'url': 'https://movies.fandom.com/wiki/A.I._Artificial_Intelligence', 'content': 'films A.I. Artificial Intelligence, also known as A.I., is a 2001 A

### Implementing PDF Search

This tutorial demonstrates how to build a PDF search tool that leverages vector databases for efficient document retrieval. The system divides PDF documents into manageable chunks and utilizes `OpenAI` embeddings for text vectorization alongside `FAISS` for fast similarity searching.
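
To make the two ideas in this paragraph concrete, here is a toy sketch of overlapping chunking and similarity ranking that uses only the standard library. It is not how `RecursiveCharacterTextSplitter`, `OpenAI` embeddings, or `FAISS` work internally: the splitter here cuts fixed character windows, and the "embedding" is a plain word count. The names `chunk_text`, `embed`, `cosine`, and `top_k_chunks` are hypothetical helpers for illustration only.

```python
import math
from collections import Counter


def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows sharing `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


print(chunk_text("0123456789", chunk_size=4, chunk_overlap=2))

docs = [
    "weak ai is limited to a single narrow task",
    "strong ai would have a mind like humans",
    "sensors limit what robots can perceive",
]
print(top_k_chunks("what is weak ai", docs, k=1))
```

The overlap means that text falling on a chunk boundary still appears intact in at least one chunk, which is the reason the tutorial sets `chunk_overlap` alongside `chunk_size`.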

For this tutorial, we'll work with a sample document from the academic text "*An Introduction to Ethics in Robotics and AI*" (2021). This comprehensive book explores fundamental concepts including AI definitions, machine learning principles, robotics fundamentals, and the current limitations of AI technology.

- Title: What Is AI?
- Authors:
    - Christoph Bartneck (University of Canterbury)
    - Christoph Lütge (Technical University of Munich)
- Link: https://www.researchgate.net/publication/343611353_What_Is_AI
- File: What_Is_AI.pdf

To begin, please place the PDF file in your data directory.

In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.tools.retriever import create_retriever_tool

# Load and process the PDF
loader = PyPDFLoader("data/What_Is_AI.pdf")

# Create text splitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

# Split the document
split_docs = loader.load_and_split(text_splitter)

# Create vector store
vector = FAISS.from_documents(split_docs, OpenAIEmbeddings())

# Create retriever
retriever = vector.as_retriever()

# Create retriever tool
retriever_tool = create_retriever_tool(
    retriever,
    name="pdf_search",
    description="use this tool to search information from the PDF document",
)

In [None]:
retriever.invoke("What are the main limitations of AI discussed in the text?")

[Document(id='21264c71-a716-4b90-8909-2eece1374346', metadata={'source': 'What_Is_AI.pdf', 'page': 9}, page_content='14 2 What Is AI?\nalgorithms and optimisation software can handle everything from airline reservation\nsystems to the management of nuclear power plants. But they only take well-deﬁned\nactions within strictly deﬁned limits. In this section, we focus on some of the major\nchallenges that make AI so difﬁcult. The limitations of sensors and the resulting lack\nof perception have already been highlighted.\nAI systems are rarely capable of generalising across learned concepts. Although\na classiﬁer may be trained on very related problems, typically classiﬁer performance\ndrops substantially when the data is generated from other sources or in other ways.\nFor example, face recognition classiﬁers may obtain excellent results when faces are\nviewed straight on, but performance drops quickly as the view of the face changes\nto, say proﬁle. Considered another way, AI systems lack

### Combining Tools

We combine multiple tools into a single list, allowing the agent to select and use the appropriate tool based on the context. This enables flexible switching between web search and document retrieval.

In [None]:
# Combine tools into a single list for the agent to use
tools = [search, retriever_tool]

## Building the Agent

This section builds the agent itself. We initialize a Large Language Model (LLM) and define a prompt template that tells the agent when to use each tool: PDF search first, then web search if the PDF does not contain the answer. Specifically, we use `create_tool_calling_agent` to create an agent with tool-using capabilities and `AgentExecutor` to set up the execution environment that runs it.
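
The following sketch illustrates the general loop that an agent executor runs: the model proposes a tool call, the executor looks the tool up by name and runs it, and the observation is fed back until the model produces a final answer. This is a simplified illustration, not the `AgentExecutor` implementation. `fake_model`, `toy_tools`, and `run_agent` are hypothetical stand-ins, and the "model" is a scripted function rather than an LLM.

```python
def fake_model(question, scratchpad):
    """Scripted stand-in for an LLM: try pdf_search first, then web_search, then answer."""
    if not scratchpad:
        return {"tool": "pdf_search", "tool_input": question}
    last_tool, last_observation = scratchpad[-1]
    if last_tool == "pdf_search" and last_observation is None:
        # Nothing found in the PDF, so fall back to the web
        return {"tool": "web_search", "tool_input": question}
    return {"output": f"Answer based on {last_tool}: {last_observation}"}


# Hypothetical tools: each maps a query string to a result (None means "nothing found").
toy_tools = {
    "pdf_search": lambda q: "Weak AI handles one narrow task."
    if "weak ai" in q.lower()
    else None,
    "web_search": lambda q: f"web result for '{q}'",
}


def run_agent(question, model, tools, max_steps=5):
    """Alternate between model decisions and tool calls until a final answer appears."""
    scratchpad = []  # (tool_name, observation) pairs from earlier steps
    for _ in range(max_steps):
        decision = model(question, scratchpad)
        if "output" in decision:
            return decision["output"], scratchpad
        tool_name = decision["tool"]
        observation = tools[tool_name](decision["tool_input"])
        scratchpad.append((tool_name, observation))
    return "Stopped: step limit reached.", scratchpad


answer, steps = run_agent("What is weak AI?", fake_model, toy_tools)
print(answer, [name for name, _ in steps])

answer, steps = run_agent("Samsung Gauss", fake_model, toy_tools)
print(answer, [name for name, _ in steps])
```

In the real agent, the scratchpad role is played by the `{agent_scratchpad}` placeholder in the prompt, and the decision about which tool to call comes from the LLM rather than from fixed rules.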

In [None]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.agents import create_tool_calling_agent, AgentExecutor

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Define prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. "
            "Make sure to use the `pdf_search` tool for searching information from the PDF document. "
            # Refer to the web search tool by its registered name
            "If you can't find the information from the PDF document, use the `tavily_search_results_json` tool for searching information from the web.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)

# Create agent
agent = create_tool_calling_agent(llm, tools, prompt)

# Create agent executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)

> *Note*: We set `verbose=False` to suppress intermediate step outputs from the agent executor.


## Implementing Chat History

This section adds conversation memory to the agent. We implement a session-based chat history store that allows the agent to remember and reference previous conversations. Using `ChatMessageHistory`, we maintain an independent conversation history for each session, and through `RunnableWithMessageHistory`, the stored messages are passed to the agent so that it can follow the conversation context. This allows users to ask follow-up questions naturally based on previous interactions.
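
The idea of independent per-session histories can be sketched with a plain dictionary. This toy version only records messages and returns a canned reply. `toy_store` and `chat` are hypothetical names for illustration and do not use `ChatMessageHistory` or `RunnableWithMessageHistory`.

```python
from collections import defaultdict

# Toy stand-in for the session store: one message list per session ID.
toy_store = defaultdict(list)


def chat(session_id, user_message):
    """Append the user message and a canned reply to that session's history."""
    history = toy_store[session_id]
    history.append(("human", user_message))
    # The turn number is derived only from this session's own history
    reply = f"(reply #{len(history) // 2 + 1} in {session_id})"
    history.append(("ai", reply))
    return reply


print(chat("session_1", "Hello"))
print(chat("session_1", "Follow-up question"))
print(chat("session_2", "Unrelated question"))

# Each session only sees its own messages
print(len(toy_store["session_1"]), len(toy_store["session_2"]))
```

Because the history is keyed by session ID, a follow-up question in one session never picks up context from another session.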

In [None]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Create a store for session histories
store = {}


def get_session_history(session_id):
    """Return the chat history for a session, creating it on first use."""
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


# Create agent with chat history
agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

## Running Examples

This section runs the agent and examines its results. Using streaming output, we can observe the agent's tool calls and results in real time. The examples cover the agent's core functionality: PDF document search, web search, independent session management across conversations, and restructuring of an earlier response. The `process_response` function prints each streamed chunk in an organized way, clearly showing which tool was used and what the final answer is.
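
The chunk handling can be tried out without any API calls by feeding a function hand-made chunks. The sketch below assumes the stream yields dictionaries in which a tool-call step carries an `actions` list and the final step carries an `output` string, which matches the keys that `process_response` checks. `FakeAction`, `fake_stream`, and `summarize_stream` are hypothetical names, and the chunk contents are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for the action objects found in streamed chunks.
FakeAction = namedtuple("FakeAction", ["tool", "tool_input", "log"])


def fake_stream():
    """Yield chunks shaped like an agent stream: an action step, then the final output."""
    yield {
        "actions": [
            FakeAction("pdf_search", {"query": "weak ai"}, "Invoking: `pdf_search`")
        ]
    }
    yield {"steps": ["(tool observation would appear here)"]}
    yield {"output": "Weak AI is limited to a narrow task."}


def summarize_stream(chunks):
    """Collect the tool names used and the final answer from a chunk iterator."""
    tools_used, final_output = [], None
    for chunk in chunks:
        if chunk.get("output"):
            final_output = chunk["output"]
        elif chunk.get("actions"):
            tools_used.extend(action.tool for action in chunk["actions"])
    return tools_used, final_output


print(summarize_stream(fake_stream()))
```

Chunks that contain neither key, such as the intermediate `steps` chunk in this sketch, are simply skipped.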

In [None]:
def process_response(response):
    """
    Process and display streaming response from the agent.

    Args:
        response: Agent's streaming response iterator
    """
    for chunk in response:
        if chunk.get("output"):
            print(chunk["output"])
        elif chunk.get("actions"):
            for action in chunk["actions"]:
                print(f"\nTool Used: {action.tool}")
                print(f"Tool Input: {action.tool_input}")
                if action.log:
                    print(f"Tool Log: {action.log}")

In [None]:
# Example 1: Searching in PDF
response = agent_with_chat_history.stream(
    {
        "input": "What information can you find about Samsung's AI model in the document?"
    },
    config={"configurable": {"session_id": "tutorial_session_1"}},
)
process_response(response)


Tool Used: pdf_search
Tool Input: {'query': 'Samsung AI model'}
Tool Log: 
Invoking: `pdf_search` with `{'query': 'Samsung AI model'}`



The document does not contain specific information about Samsung's AI model. If you need information about Samsung's AI model, I can search the web for you. Would you like me to do that?


In [None]:
# Example 2: Following up with web search (same session)
response = agent_with_chat_history.stream(
    {
        "input": "Yes, please search the web for information about Samsung's latest AI model"
    },
    config={"configurable": {"session_id": "tutorial_session_1"}},
)
process_response(response)


Tool Used: tavily_search_results_json
Tool Input: {'query': 'Samsung latest AI model 2023'}
Tool Log: 
Invoking: `tavily_search_results_json` with `{'query': 'Samsung latest AI model 2023'}`



Samsung has recently unveiled its new generative AI model called Samsung Gauss. This model was introduced at the Samsung AI Forum 2023. Named after the mathematician Carl Friedrich Gauss, the model signifies the infinite possibilities of generative AI that Samsung aims to realize. Samsung Gauss is designed to improve performance and efficiency, and it will be used to enhance Galaxy AI features. It is also capable of generating text, code, and images, positioning it as a potential alternative to models like ChatGPT. The development of Samsung Gauss was led by Samsung Research.


In [None]:
# Example 3: New session with different topic (Session 2)
response = agent_with_chat_history.stream(
    {"input": "What can you tell me about Strong and Weak AI from the PDF document?"},
    config={"configurable": {"session_id": "tutorial_session_2"}},
)
process_response(response)


Tool Used: pdf_search
Tool Input: {'query': 'Strong and Weak AI'}
Tool Log: 
Invoking: `pdf_search` with `{'query': 'Strong and Weak AI'}`



The PDF document discusses the concepts of Strong and Weak AI as follows:

- **Weak AI**: This type of AI is limited to a single, narrowly defined task. Most modern AI systems fall into this category. They are developed to handle a specific problem, task, or issue and generally cannot solve other problems, even if they are related. Examples of weak AI include systems that can beat a grandmaster in chess or Go, or experienced players in Poker.

- **Strong AI**: In contrast, Strong AI is defined by John Searle as an AI that, when appropriately programmed with the right inputs and outputs, would have a mind in exactly the same sense human beings have minds. This implies that Strong AI would have the ability to understand, reason, and have consciousness similar to humans. As of the document's writing, no AI system has achieved Strong AI.

The docume

In [None]:
# Example 4: Request to summarize previous response in a table (Session 2)
response = agent_with_chat_history.stream(
    {"input": "Can you organize your previous response into a table format?"},
    config={"configurable": {"session_id": "tutorial_session_2"}},
)
process_response(response)

Certainly! Here's the information organized into a table format:

| Type of AI  | Description                                                                                                                                         | Examples                                                                                   |
|-------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|
| Weak AI     | Limited to a single, narrowly defined task. Most modern AI systems fall into this category and cannot solve unrelated problems.                     | Systems that can beat a grandmaster in chess or Go, or experienced players in Poker.       |
| Strong AI   | Defined by John Searle as an AI that would have a mind in the same sense human beings have minds, with the ability to understand and reason. 