# 1. Building a RAG Agent with CrewAI

Let's build a RAG agent with `CrewAI` by integrating `RagTool` from `crewai_tools` with a `CrewAI` agent. `RagTool` provides a way to create and query knowledge bases from various data sources, which gives the agent access to specialized context. In this notebook, we will give the RAG tool PDF brochures containing details about the Europe packages offered by a travel agency. By the end of the notebook, we will have built a travel package agent specialized in answering queries about Europe tour packages. In the next notebook, we will wrap this agent in an ACP server and make it interact with other ACP agents.

## 1.1. Import Libraries

In [9]:
from crewai import Crew, Task, Agent, LLM
from crewai_tools import RagTool

In [10]:
import warnings
warnings.filterwarnings('ignore')

## 1.2. Define the Agent's Large Language Model

We'll now define the large language model that our CrewAI agent will use. The `max_tokens` parameter sets the maximum number of tokens the model can generate in a single response.

**Note**: If you define this model locally, you need to provide the API key in a **.env** file as follows:
```
# Required
OPENAI_API_KEY=sk-...

# Optional
OPENAI_API_BASE=<custom-base-url>
OPENAI_ORGANIZATION=<your-org-id>
```
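
Before creating the model, it can help to confirm that the required setting is actually visible to the Python process. The sketch below is a minimal check using only the standard library. It assumes the variables from the **.env** file have already been loaded into the environment (for example with a package such as `python-dotenv`, which this notebook does not show). The helper name `missing_vars` is hypothetical.

```python
import os

# Names taken from the .env example above.
REQUIRED_VARS = ["OPENAI_API_KEY"]
OPTIONAL_VARS = ["OPENAI_API_BASE", "OPENAI_ORGANIZATION"]


def missing_vars(names, env):
    """Return the names that are absent or empty in the given mapping."""
    return [name for name in names if not env.get(name)]


# Hypothetical pre-flight check; os.environ is the process environment.
missing = missing_vars(REQUIRED_VARS, os.environ)
if missing:
    print("Missing required settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```

If the check reports a missing key, the `LLM` call in the next cell is likely to fail when the agent first contacts the model provider.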

In [11]:
llm = LLM(model="openai/gpt-4", max_tokens=1024)

## 1.3. Define the RAG Tool

For the RAG tool, we can define the model provider and the embedding model in a Python configuration dictionary. We can also define the details of our vector database. If we don't specify the vector database, `RagTool` uses Chroma (ChromaDB) as the default vector database in local/in-memory mode.

In [12]:
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4",
        }
    },
    "embedding_model": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-ada-002"
        }
    }
}
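
For intuition about what a vector database does with the embeddings, here is a toy sketch of similarity search in plain Python. It is not how Chroma or `RagTool` is implemented: the three-dimensional vectors and chunk labels are made up for illustration (real embeddings from an embedding model are much longer), and the helper names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the k chunk labels whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:k]]


# Toy store: (chunk label, made-up embedding vector).
store = [
    ("chunk about itinerary", [0.9, 0.1, 0.0]),
    ("chunk about exclusions", [0.1, 0.9, 0.1]),
    ("chunk about departure dates", [0.2, 0.1, 0.9]),
]

# A made-up query vector that points mostly along the first dimension.
print(top_k([1.0, 0.0, 0.1], store, k=1))
```

The chunks returned by a search like this are what the agent receives as context when it calls the tool.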

We can then pass the `config` to the `RagTool` and specify the data sources from which the knowledge base will be constructed. When embedding our data, the `RagTool` splits each document into chunks and creates an embedding vector for each chunk. We can specify the chunk size (`chunk_size`, in characters) and how many characters overlap between consecutive chunks (`chunk_overlap`), or rely on the default behavior.
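
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified character-based splitter. It is a sketch of the general sliding-window idea, not the splitter `RagTool` uses internally (which may, for example, prefer to break on separators). The function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters.

    Each window starts (chunk_size - chunk_overlap) characters after the
    previous one, so consecutive chunks share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# A 3000-character stand-in document, split with the values used in this notebook.
sample = "".join(chr(ord("a") + i % 26) for i in range(3000))
chunks = split_into_chunks(sample, chunk_size=1200, chunk_overlap=200)
print([len(c) for c in chunks])  # → [1200, 1200, 1000]
```

With these settings, the last 200 characters of one chunk are repeated at the start of the next, which reduces the chance that a sentence describing a package is cut off at a chunk boundary.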

In [13]:
rag_tool = RagTool(
    config=config,
    chunk_size=1200,    # characters per chunk
    chunk_overlap=200,  # characters shared by consecutive chunks
)

rag_tool.add("../data/Europe-Packages-from-Ahmedabad.pdf", data_type="pdf_file")
rag_tool.add("../data/Europe-Packages-from-Chennai.pdf", data_type="pdf_file")
rag_tool.add("../data/Europe-Packages-from-Delhi.pdf", data_type="pdf_file")
rag_tool.add("../data/Europe-Packages-from-Hyderabad.pdf", data_type="pdf_file")
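
Because the four brochure paths differ only in the city name, they can also be generated and checked before being added. The sketch below only builds the paths and reports which files are missing. The names `DATA_DIR`, `CITIES`, and `brochure_paths` are hypothetical, and the final loop over `rag_tool.add` is left as a comment since it needs the `rag_tool` object from the cell above.

```python
from pathlib import Path

DATA_DIR = Path("../data")
CITIES = ["Ahmedabad", "Chennai", "Delhi", "Hyderabad"]


def brochure_paths(data_dir, cities):
    """Build one brochure path per departure city."""
    return [data_dir / f"Europe-Packages-from-{city}.pdf" for city in cities]


paths = brochure_paths(DATA_DIR, CITIES)

# Report missing files up front instead of failing midway through indexing.
for path in paths:
    status = "found" if path.exists() else "missing"
    print(f"{path}: {status}")

# In the notebook, the files could then be added in a loop:
# for path in paths:
#     rag_tool.add(str(path), data_type="pdf_file")
```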

## 1.4. Define the Travel Package Agent

Now that we have the `rag_tool` defined, we can define the CrewAI agent that will assist with travel package queries.

In [14]:
travel_package_agent = Agent(
    role="Senior Travel Package Advisor",
    goal="Accurately answer questions about Europe travel packages including pricing, itinerary, inclusions, exclusions, and departure cities.",
    backstory=(
        "You are a seasoned travel consultant specializing in curated Europe travel packages. "
        "You assist customers by interpreting and explaining offerings from detailed travel brochures "
        "for different departure cities like Ahmedabad, Delhi, Chennai, and Hyderabad."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm,
    tools=[rag_tool],
    max_retry_limit=5
)

## 1.5. Define the Agent Task

Let's now test the travel package agent. To do that, we define the agent's task, passing it the query (as the task description) and the agent.

In [15]:
task1 = Task(
    description=(
        "The user wants to know the total cost and duration of the 'Best of Switzerland & Italy' "
        "package from Hyderabad. Provide pricing, number of days, and what's included if available."
    ),
    expected_output=(
        "A clear and concise summary of the 'Best of Switzerland & Italy' package including cost, "
        "duration, and any details about inclusions such as flights, hotels, and sightseeing."
    ),
    agent=travel_package_agent
)

## 1.6. Run the Travel Package Agent

To run the agent, we pass the agent and the task to a `Crew` object and run it with the `kickoff` method.

In [None]:
crew = Crew(agents=[travel_package_agent], tasks=[task1], verbose=True)
task_output = crew.kickoff()
print(task_output)