# CrewAI + LlamaIndex Cookbook

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/cookbooks/crewai_llamaindex.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This cookbook shows you how to build an advanced research assistant by plugging LlamaIndex-powered tools into a CrewAI-powered multi-agent setup.

LlamaIndex is a framework enabling developers to easily build LLM-powered applications over their data; it contains production modules for indexing, retrieval, and prompt/agent orchestration. A core use case is building a generalized QA interface enabling knowledge synthesis over complex questions.

Plugging a LlamaIndex RAG pipeline into a CrewAI agent setup as a tool enables more sophisticated research flows. This cookbook covers two examples:

1. **Set Up a Simple Calculator Agent**: We port over tools available on LlamaHub (https://llamahub.ai/), here the Wolfram Alpha tool.
2. **Set Up a Financial Analyst Agent**: We plug in a RAG query engine as a tool accessible to a CrewAI agent.

In [None]:
!pip install llama-index-core
!pip install llama-index-readers-file
!pip install llama-index-llms-openai
!pip install llama-index-embeddings-openai
!pip install llama-index-tools-wolfram-alpha
!pip install 'crewai[tools]'

In [None]:
import os
from crewai import Agent, Task, Crew, Process
from crewai_tools import LlamaIndexTool
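
The OpenAI-backed LLM used later in this cookbook needs an API key. The following is a minimal sketch for checking credentials up front, assuming the key is supplied through the `OPENAI_API_KEY` environment variable (the variable the OpenAI client reads by default). The helper name `missing_env_vars` is hypothetical and not part of CrewAI or LlamaIndex.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Illustrative check only; nothing is sent to any API here.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Set these variables before running the agents: {missing}")
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.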

## Set Up a Simple Calculator Agent

In this section we set up a crew of agents that can solve math problems and generate a 10-question multiple choice test (with answers) from the results.

#### Wolfram Alpha Tool
Let's set up Wolfram Alpha as a general math computation tool.

In [None]:
from llama_index.tools.wolfram_alpha import WolframAlphaToolSpec

wolfram_spec = WolframAlphaToolSpec(app_id="<app_id>")
wolfram_tools = wolfram_spec.to_tool_list()

In [None]:
wolfram_tools[0]("(7 * 12 ^ 10) / 321")

ToolOutput(content='144473849856/107', tool_name='wolfram_alpha_query', raw_input={'args': ('(7 * 12 ^ 10) / 321',), 'kwargs': {}}, raw_output='144473849856/107', is_error=False)
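
As a quick sanity check, the exact result above can be reproduced locally with Python's standard `fractions` module. This is only a cross-check of the arithmetic, not a replacement for the tool. Note that Wolfram Alpha's `^` means exponentiation, which is `**` in Python.

```python
from fractions import Fraction

# Exact rational arithmetic: (7 * 12^10) / 321, reduced to lowest terms.
value = Fraction(7 * 12**10, 321)
print(value)  # 144473849856/107
```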

In [None]:
wolfram_tools[0]("How many calories are there in a pound of apples")

ToolOutput(content='about 227 dietary Calories', tool_name='wolfram_alpha_query', raw_input={'args': ('How many calories are there in a pound of apples',), 'kwargs': {}}, raw_output='about 227 dietary Calories', is_error=False)

In [None]:
crewai_wolfram_tools = [LlamaIndexTool.from_tool(t) for t in wolfram_tools]

In [None]:
print(crewai_wolfram_tools[0].description)

wolfram_alpha_query(query: 'string') - wolfram_alpha_query(query: str)

    Make a query to wolfram alpha about a mathematical or scientific problem.

    Example inputs:
        "(7 * 12 ^ 10) / 321"
        "How many calories are there in a pound of strawberries"

    Args:
        query (str): The query to be passed to wolfram alpha.


In [None]:
# Define your agents with roles and goals
calculator_agent = Agent(
    role="Calculator",
    goal="Solve complex math problems",
    backstory="""You are an AI computer that has access to Wolfram Alpha to perform complex computations.""",
    verbose=True,
    tools=crewai_wolfram_tools,
)
teacher_agent = Agent(
    role="Math Teacher",
    goal="Make tests for students.",
    backstory="""You are a math teacher preparing a simple arithmetic test for your 2nd grade students.""",
    verbose=True,
    allow_delegation=False,
)

# Create tasks for your agents
task1 = Task(
    description="""Using the math operators (+, -, *, /), and numbers from 1-100, generate 10 medium-difficulty arithmetic problems
  that consist of numbers/operators/parentheses in different ways.

  Generate the actual answer for each problem too. Use the Wolfram tool for this.
  """,
    expected_output="10 arithmetic expressions with the actual answers",
    agent=calculator_agent,
)
# Generate a multiple choice math test with answers. Will later give this test out to students (without answers of course).

task2 = Task(
    description="""Using the generated expressions/answers, generate a multiple choice test for students.
  Each question should have 4 options, one being the correct answer. """,
    expected_output="Test with 10 multiple choice questions",
    agent=teacher_agent,
)

# Instantiate your crew with a sequential process
crew = Crew(
    agents=[calculator_agent, teacher_agent],
    tasks=[task1, task2],
    verbose=2,  # You can set it to 1 or 2 for different logging levels
)
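
The crew above runs its tasks in order, so the second task can build on the output of the first. The following is a conceptual sketch of that chaining with plain functions, not CrewAI's internals. `run_sequential`, `make_problems`, and `make_test` are hypothetical stand-ins, and no LLM calls are made.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], initial_context: str = "") -> str:
    """Run each step in order, feeding the previous output in as context."""
    context = initial_context
    for step in steps:
        context = step(context)
    return context


# Hypothetical stand-in for the calculator task: returns one solved problem.
def make_problems(_context: str) -> str:
    return "(25 + 13) * 2 = 76"


# Hypothetical stand-in for the teacher task: turns the solved problem into a question.
def make_test(context: str) -> str:
    expression = context.split(" = ")[0]
    return f"Q1. What is {expression}? A) 76 B) 51 C) 63 D) 40"


print(run_sequential([make_problems, make_test]))
```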



In [None]:
# Get your crew to work!
result = crew.kickoff()

print("######################")
print(result)

[1m[95m [2024-06-08 19:30:03][DEBUG]: == Working Agent: Calculator[00m
[1m[95m [2024-06-08 19:30:03][INFO]: == Starting Task: Using the math operators (+, -, *, /), and numbers from 1-100, generate 10 medium-difficulty arithmetic problems 
  that consist of numbers/operators/parentheses in different ways.

  Generate the actual answer for each problem too. Use the Wolfram tool for this.
  [00m


[1m> Entering new CrewAgentExecutor chain...[0m
[32;1m[1;3mTo generate 10 medium-difficulty arithmetic problems using numbers from 1-100 and various operators (+, -, *, /) and parentheses, I will create a set of problems first. Then, I will use Wolfram Alpha to find the actual answers for each problem.

Let's start by generating the problems:

1. (25 + 13) * 2
2. 100 / (5 + 3)
3. (45 - 15) * 3
4. 75 / (5 * 3)
5. (60 + 40) / 4
6. 35 * 2 - 50
7. (80 / 4) + 15
8. 90 - (20 * 2)
9. (50 + 25) * 2 - 10
10. 100 / (10 + 5) * 2

Next, I will use Wolfram Alpha to get the answer for each of these
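
To cross-check the answers the agent reports for the ten problems listed above, a small local evaluator is enough. The following is a minimal sketch using Python's `ast` module, restricted to integers, the four operators, and parentheses. The `evaluate` helper is hypothetical and independent of the Wolfram tool.

```python
import ast
import operator
from fractions import Fraction

# Only the four arithmetic operators used in the task are allowed.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str) -> Fraction:
    """Evaluate an expression made of integers, + - * / and parentheses, exactly."""

    def walk(node: ast.AST) -> Fraction:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, int)
            and not isinstance(node.value, bool)
        ):
            return Fraction(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return walk(ast.parse(expression, mode="eval").body)


problems = [
    "(25 + 13) * 2",
    "100 / (5 + 3)",
    "(45 - 15) * 3",
    "75 / (5 * 3)",
    "(60 + 40) / 4",
    "35 * 2 - 50",
    "(80 / 4) + 15",
    "90 - (20 * 2)",
    "(50 + 25) * 2 - 10",
    "100 / (10 + 5) * 2",
]
for problem in problems:
    print(f"{problem} = {evaluate(problem)}")
```

Using `Fraction` keeps non-integer results such as `100 / (5 + 3)` exact instead of rounding them to floats.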

## Set Up a Simple Financial Analyst Agent

In this section we set up a crew that uses a LlamaIndex RAG pipeline over an Uber 10K as its core query tool.

In [None]:
!wget "https://s23.q4cdn.com/407969754/files/doc_financials/2019/ar/Uber-Technologies-Inc-2019-Annual-Report.pdf" -O uber_10k.pdf

In [None]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI


reader = SimpleDirectoryReader(input_files=["uber_10k.pdf"])
docs = reader.load_data()

In [None]:
docs[1].get_content()

'69  \nCountries\n10K+  \nCities\n$65B  \nGross Bookings\n111M  \nMAPCs\n7B  \nTripsA global tech \nplatform at \nmassive scale\nServing multiple multi-trillion \ndollar markets with products \nleveraging our core technology \nand infrastructure\nWe believe deeply in our bold mission. Every minute \nof every day, consumers and Drivers on our platform \ncan tap a button and get a ride or tap a button and \nget work. We revolutionized personal mobility with \nridesharing, and we are leveraging our platform to \nredefine the massive meal delivery and logistics \nindustries. The foundation of our platform is our \nmassive network, leading technology, operational \nexcellence, and product expertise. Together, these \nelements power movement from point A to point B.'

In [None]:
llm = OpenAI(model="gpt-4o")
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(similarity_top_k=5, llm=llm)
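
The `similarity_top_k=5` argument asks the query engine to retrieve the five chunks most similar to the query before the LLM synthesizes an answer. The following is a toy sketch of top-k retrieval by cosine similarity, assuming hand-made 2-D vectors in place of real embeddings. It illustrates the idea only and is not LlamaIndex's implementation; `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], chunks: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k chunk names with the highest similarity to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in chunks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy 2-D "embeddings"; real embedding vectors are much higher-dimensional.
chunks = {
    "risk factors": [1.0, 0.0],
    "gross bookings": [0.0, 1.0],
    "legal proceedings": [0.8, 0.6],
}
for name, score in top_k([1.0, 0.1], chunks, k=2):
    print(f"{name}: {score:.3f}")
```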

In [None]:
# Wrap the query engine as a CrewAI tool

query_tool = LlamaIndexTool.from_query_engine(
    query_engine,
    name="Uber 2019 10K Query Tool",
    description="Use this tool to look up information in the 2019 Uber 10K Annual Report",
)

In [None]:
query_tool.args_schema.schema()

{'title': 'QueryToolSchema',
 'description': 'Schema for query tool.',
 'type': 'object',
 'properties': {'query': {'title': 'Query',
   'description': 'Search query for the query tool.',
   'type': 'string'}},
 'required': ['query']}
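
The schema above tells the agent what a valid tool input looks like: an object with a required string field named `query`. The following is a hand-rolled sketch that checks a payload against that schema. It is an illustration only and not how CrewAI validates inputs; `validation_errors` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Copy of the schema printed above.
SCHEMA = {
    "title": "QueryToolSchema",
    "description": "Schema for query tool.",
    "type": "object",
    "properties": {
        "query": {
            "title": "Query",
            "description": "Search query for the query tool.",
            "type": "string",
        }
    },
    "required": ["query"],
}

# Only the JSON types that appear in this schema are mapped.
_JSON_TYPES = {"string": str, "object": dict}


def validation_errors(payload: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """List missing required fields and fields whose type does not match."""
    errors = []
    for field in schema.get("required", []):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        expected = _JSON_TYPES.get(spec.get("type"))
        if field in payload and expected and not isinstance(payload[field], expected):
            errors.append(f"field {field} should be of type {spec['type']}")
    return errors


print(validation_errors({"query": "risk factors"}, SCHEMA))  # []
```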

## Generate a Research Report

Now that the query interface over the Uber 10K is set up with LlamaIndex, we can generate a research report with CrewAI.

We follow the researcher/writer setup in the CrewAI quickstart tutorial, and modify it to use the query tool.

We then run it and analyze the results.

In [None]:
# Define your agents with roles and goals
researcher = Agent(
    role="Senior Financial Analyst",
    goal="Uncover insights about different tech companies",
    backstory="""You work at an asset management firm.
  Your goal is to understand tech stocks like Uber.""",
    verbose=True,
    allow_delegation=False,
    tools=[query_tool],
)
writer = Agent(
    role="Tech Content Strategist",
    goal="Craft compelling content on tech advancements",
    backstory="""You are a renowned Content Strategist, known for your insightful and engaging articles.
  You transform complex concepts into compelling narratives.""",
    verbose=True,
    allow_delegation=False,
)

# Create tasks for your agents
task1 = Task(
    description="""Conduct a comprehensive analysis of Uber's risk factors in 2019.""",
    expected_output="Full analysis report in bullet points",
    agent=researcher,
)

task2 = Task(
    description="""Using the insights provided, develop an engaging blog
  post that highlights the headwinds that Uber faces.
  Your post should be informative yet accessible, catering to a casual audience.
  Make it sound cool, avoid complex words.""",
    expected_output="Full blog post of at least 4 paragraphs",
    agent=writer,
)

# Instantiate your crew with a sequential process
crew = Crew(
    agents=[researcher, writer],
    tasks=[task1, task2],
    verbose=2,  # You can set it to 1 or 2 for different logging levels
)

In [None]:
# Get your crew to work!
result = crew.kickoff()

print("######################")
print(result)

[1m[95m [2024-06-08 21:33:31][DEBUG]: == Working Agent: Senior Financial Analyst[00m
[1m[95m [2024-06-08 21:33:31][INFO]: == Starting Task: Conduct a comprehensive analysis of Uber's risk factors in 2019.[00m


[1m> Entering new CrewAgentExecutor chain...[0m
[32;1m[1;3mTo conduct a comprehensive analysis of Uber's risk factors in 2019, I need to query the Uber 2019 10K Annual Report for relevant information.

Action: Uber 2019 10K Query Tool
Action Input: {"query": "risk factors"}[0m()
kwargs: 
getting query str
...got query str {'query': 'risk factors'}
[95m 

The company faces several risk factors, including liabilities from data security breaches, challenges in retaining and attracting users to its platform, and the need to continuously invest in new products and features. Additionally, government inquiries and investigations, as well as compliance with stringent data privacy laws, pose significant risks. The company is also exposed to claims related to traffic accidents