# Automatic Deep Research 

Welcome to this new practice lab! By now you should have a clearer view of the elements that compose a multi-agent system. In this lab you will get to put it into action by creating your first crew.

**What you'll learn:**
- How to define agents with specific roles and expertise
- How to provide agents with tools to perform their tasks
- How to create your own tasks that agents will execute
- How to assemble agents and tasks into a Crew, all using CrewAI

## Background

As a research consultant, you're constantly tasked with producing comprehensive reports on diverse topics for demanding clients. You need to build an automatic deep research solution that can rapidly gather, verify, and synthesize information from across the internet, delivering reliable, fact-checked reports that meet tight deadlines and exacting standards regardless of the subject matter. 

## General instructions
In this lab you will be presented with a structure of the code, but you will need to complete some of it. 

To successfully run this lab, replace all instances of the placeholder `None` with your own code. Sections where you need to write code will be delimited between `### START CODE HERE ###` and `### END CODE HERE ###`.

If you are stuck, or simply want to copy a solution into your notebook so that you can execute it, you can find all solution code inside the [Solution](Solution) folder.

**<font color='#5DADEC'>Please make sure to save your work periodically, so you don't lose any progress.</font>**

## Table of contents

- [1. Understanding the problem](#1)
- [2. Set up your notebook](#2)
- [3. Define the Agents](#3)
  - [3.1. Create tool instances](#3-1)
  - [3.2. Define the Research Planner agent](#3-2)
  - [3.3. Define the remaining agents](#3-3)
- [4. Create the Tasks](#4)
  - [4.1. Define the Create research plan task](#4-1)
  - [4.2. Define the remaining tasks](#4-2)
- [5. Define the Crew and get the results](#5)

<a id="1"></a>

## 1. Understanding the problem
In this lab, you will build a custom deep research crew. This crew is in charge of creating a research plan based on the user's input, executing it, and reviewing and fact-checking the findings. Finally, it generates a report from the gathered information.

Take some time to decompose the problem into different tasks. Who would be the appropriate "person" to solve each task? 

Once you've done your thinking, click below to find an agent/task diagram for this lab.    


<details>    
<summary>
    <font size="3" color="#237b946b"><b>Diagram</b></font>
</summary>

<img src="../images/lab2-agents-tasks-diagram.PNG">

</details>

<a id="2"></a>

## 2. Set up your notebook

Before you start coding, run the next two cells to import all necessary modules and configure the environment variables. 

In [1]:
from crewai import Agent, Task, Crew
import os
os.environ["CREWAI_TESTING"] = "true"
from utils import get_openai_api_key

# set the OpenAI model (gpt-4o-mini)
os.environ["MODEL"] = "gpt-4o-mini"
# set up the OpenAI API key 
os.environ["OPENAI_API_KEY"] = get_openai_api_key()

<a id="3"></a>

## 3. Define the Agents

Based on the diagram, you should have four agents:
- **Research Planner**: its goal is to analyze queries and break them down into smaller, specific research topics.
- **Internet Researcher**: its job is to perform research tasks.
- **Fact Checker**: its goal is to review information for factual accuracy to avoid misinformation.
- **Report Writer**: it is in charge of writing reports based on the gathered information.

<a id="3-1"></a>

### 3.1. Create tool instances
As you can see in the diagram, you will be providing the **Internet Researcher Agent** with tools, so that it can do its job better. In particular, you will give this agent the ability to search the internet and scrape information from the retrieved webpages.

There are different tools inside CrewAI you can use to search the web. In this lab you will use the [**EXA Search Web Loader**](https://docs.crewai.com/en/tools/search-research/exasearchtool#exa-search-web-loader) tool, which is designed to perform a semantic search for a specified query from a text's content across the internet. It uses the [exa.ai](https://exa.ai/) API to fetch and display the most relevant search results based on the query provided by the user. exa.ai enhances semantic search by capturing richer contextual relationships between concepts, allowing for more precise information retrieval than conventional embedding approaches.

For webscraping, you will use the [**Scrape Website**](https://docs.crewai.com/en/tools/web-scraping/scrapewebsitetool) tool, which is designed to extract and read the content of a specified website.

In the next cell you will define instances of these tools, so you can later assign them to the agents.

In [2]:
# import the tools
from crewai_tools import EXASearchTool, ScrapeWebsiteTool
from utils import get_exa_api_key

# set the exa API key
os.environ["EXA_API_KEY"] = get_exa_api_key()

### START CODE HERE ###

# Create the EXASearchTool instance
exa_search_tool = EXASearchTool(base_url=os.getenv("EXA_BASE_URL"))
# Create the ScrapeWebsiteTool instance
scrape_website_tool = ScrapeWebsiteTool()

### END CODE HERE ###
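
At this point the lab has set three environment variables: `MODEL`, `OPENAI_API_KEY`, and `EXA_API_KEY`. If a later cell fails with an authentication error, a quick check like the sketch below can help narrow it down. The helper `missing_env_vars` is a hypothetical name introduced here for illustration, not part of CrewAI or the lab's `utils` module, and the demo uses a fake mapping so no real keys are needed.

```python
import os
from typing import List, Mapping


def missing_env_vars(required: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration only.
    """
    return [name for name in required if not env.get(name)]


# Demo with a fake environment mapping (assumed values, not real keys).
fake_env = {"MODEL": "gpt-4o-mini", "OPENAI_API_KEY": ""}
missing = missing_env_vars(["MODEL", "OPENAI_API_KEY", "EXA_API_KEY"], env=fake_env)
print(missing)
```

In the notebook, you could call `missing_env_vars(["MODEL", "OPENAI_API_KEY", "EXA_API_KEY"])` without the `env` argument to inspect the real environment.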

<a id="3-2"></a>

### 3.2. Define the Research Planner agent

In the cell below, you will see how you can create the first agent. This time, all the parameters are set up for you. Here is a quick recap of what each of the parameters represents:

- `role`: If this was a person doing the job, what title would they have?
- `goal`: What is the goal this agent in particular is trying to accomplish? Make sure to write a concrete goal.
- `backstory`: It should be something that highlights the skills of the agent relevant to its role. Make sure to use keywords that will actually help your agent get better results.

In the labs, we have added two parameters not shown in the demo videos: `max_rpm`, and `max_iter`. `max_rpm` sets the maximum requests per minute to avoid rate limits, while `max_iter` limits the maximum iterations before the agent must provide its best answer. Setting these two parameters helps make the agents run a little faster, so the lab doesn't take as long to complete. 
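
To build some intuition for what an iteration cap does, here is a toy loop in plain Python. It is only a sketch of the idea described above and makes no claim about how CrewAI implements `max_iter` internally; `run_with_iteration_cap` and `toy_step` are hypothetical names.

```python
from typing import Callable, Optional


def run_with_iteration_cap(step: Callable[[int], Optional[str]], max_iter: int) -> str:
    """Call step(i) until it returns an answer or the cap is reached.

    Illustrative only; not CrewAI's implementation.
    """
    for i in range(max_iter):
        answer = step(i)
        if answer is not None:
            return answer  # the "agent" finished before hitting the cap
    # Cap reached: return a best-effort answer instead of looping forever.
    return "best answer so far"


def toy_step(i: int) -> Optional[str]:
    """A toy step that only finishes on its fourth attempt (i == 3)."""
    return "done" if i == 3 else None


early = run_with_iteration_cap(toy_step, max_iter=10)
capped = run_with_iteration_cap(toy_step, max_iter=2)
print(early, "|", capped)
```

With a generous cap the toy step has room to finish. With a cap of 2 the loop stops early and falls back to the best-effort answer, which is the trade-off you accept in exchange for a shorter run.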

In [3]:
# define the research planner agent
research_planner = Agent(
    role="Research Planner",
    goal="Analyze queries and break them down into smaller, specific research topics.",
    backstory=(
         "You are a research strategist who excels at breaking down complex questions "
         "into manageable research components. You identify what needs to be researched "
         "and create clear research objectives."
    ),
    verbose=True, # set to True to see detailed agent actions
    max_rpm=150,
    max_iter=10
)
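
The `backstory` above is written as several adjacent string literals inside parentheses. Python joins such literals into one string without adding anything between them, so each piece needs its own trailing space. The short sketch below (with made-up text) shows the difference.

```python
# Adjacent string literals are concatenated exactly as written.
with_space = (
    "You are a research strategist. "  # trailing space keeps sentences apart
    "You create clear research objectives."
)
missing_space = (
    "You are a research strategist."  # no trailing space here
    "You create clear research objectives."
)
print(with_space)
print(missing_space)
```

The second string runs the two sentences together (`strategist.You`), which is easy to miss in a long prompt.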

<a id="3-3"></a>

### 3.3. Define the remaining agents

Now you can define the three remaining agents. The `role` and `goal` parameters are already filled in for you; use your own creativity to fill in the `backstory`.  

Do not forget to assign the tools to the **Internet Researcher** and **Fact Checker** agents. You can do this by setting the `tools` argument.

In [4]:
researcher = Agent(
    role="Internet Researcher",
    goal="Research thoroughly all assigned topics",
    ### START CODE HERE ###
    backstory=(
        "You are a skilled researcher with experience in finding up-to-date, reliable information and data collection. You know how to find reliable sources, extract relevant as well as recent information, and always verify facts across multiple sources to avoid misinformation. "
        "You never invent facts and always trace information back to its origin."
    ),
    # add the 2 tool instances you created
    tools=[exa_search_tool, scrape_website_tool],
    ### END CODE HERE ###
    verbose=True,
    max_rpm=150,
    max_iter=10
)

fact_checker = Agent(
    role="Fact Checker",
    goal=(
        "Verify data for accuracy, identify inconsistencies, "
        "and flag potential misinformation"
    ),
    ### START CODE HERE ###
    backstory=( 
        "You are a Quality Assurance Specialist with expertise in fact-checking. You identify any misinformation and hallucinations in the data presented to you. You cross-reference information to spot inconsistencies and ensure all data meets high accuracy standards. You correctly check for hallucinated or invented content and make sure that all facts are supported by evidence."
    ),
    tools=[exa_search_tool, scrape_website_tool],
    ### END CODE HERE ###
    verbose=True,
    max_rpm=150,
    max_iter=10
)

report_writer = Agent(
    role="Report Writer",
    goal="Write clear, concise, and well-structured reports based on gathered information",
    ### START CODE HERE ###
    backstory=( 
         "You are an expert writer who specializes in preparing clear, concise and well-structured research reports. You convert complex information into readable formats and always include proper citations and sources."
        # "You are known for giving a brief introduction in a very short paragraph before highlighting the key facts and trends in a listed manner."
        # "Your ability to describe complex concepts or terms in a simple manner is highly appreciated by non-technical readers and professionals from other domains and backgrounds."
    ),
    ### END CODE HERE ###
    verbose=True,
    max_rpm=150,
    max_iter=10
)
        

<a id="4"></a>

## 4. Create the Tasks

Now that you have set up the agents, it is time to define the tasks. If you go back to the diagram, you will see you need four tasks:

- **Create research plan**: Based on the user's query, break it down into specific topics and key questions, and create a focused research plan.
    - Output: A research plan with main research topics to investigate, key questions for each topic, and success criteria for the research.

- **Gather research data**: Using the research plan, collect information on all identified topics. Cite all sources used.
    - Output: Comprehensive research data including: information for each research topic, and citations used along with source credibility notes.

- **Verify information quality**: Review all collected research. Identify any conflicting information, potential misinformation, or gaps that need addressing.
    - Output: A report with all the collected data and its review. It should include consistency check results and source reliability ratings.

- **Write final report**: Create a comprehensive report that answers the original query using all verified research data. Structure it with clear sections, include citations, and provide actionable insights.
    - Output: The final research report. In addition to the full answer, it should have an executive summary, and complete source citations.


For each `Task` you need to define the following parameters:
- `description`: A thorough description of the task. You can even break it down into different items.
- `expected_output`: What the output should contain. Be specific, especially if you want any structure in your result, like a dictionary with specific keys.
- `agent`: Who is performing the task? You need to match the task to one of the agents you already defined.

In the description you will need to pass the inputs to the tasks. In this lab, the only input is the user's query, which will be saved as `user_query`.
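
As a rough illustration of how a `{user_query}` placeholder gets filled in, the sketch below uses Python's built-in `str.format` as a stand-in. CrewAI performs the substitution for you when the crew is kicked off, and its actual mechanism may differ. The example query is made up.

```python
# A description template with a placeholder in curly braces.
description_template = (
    "Based on the user's query, create a focused research plan. "
    "The user's query is: {user_query}"
)

# The dictionary key must match the placeholder name exactly.
inputs = {"user_query": "How do solar panels work?"}

filled = description_template.format(**inputs)
print(filled)
```

The point to take away is that the placeholder name in the description and the key in the inputs dictionary have to match.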


<a id="4-1"></a>

### 4.1. Define the Create research plan task

In the cell below, you will see how you can create the first task. This time, all the parameters are set up for you. Notice how the context variables are passed to the description between curly braces.

In [5]:
# define the create research plan task
create_research_plan_task = Task(
    description=(
        "Based on the user's query, break it down into specific topics and key questions, "
        "and create a focused research plan. "
        "The user's query is: {user_query}"
    ),
    expected_output=(
        "A research plan with main research topics to investigate, "
        "key questions for each topic, and success criteria for the research."
        ),
    agent=research_planner,
)

<a id="4-2"></a>

### 4.2. Define the remaining tasks

Now define the three remaining tasks. The `description` is already filled in for you. You will need to define the `expected_output` and `agent` for each of the tasks.

In [6]:
# define the gather research data task
gather_research_data_task = Task(
    description=(
        "Using the research plan, collect information on all identified topics. "
        "Cite all sources used."
    ),
    ### START CODE HERE ###
    expected_output=(
        "Comprehensive research data including: information for each research topic, and citations used along with source credibility notes."
    ),
    agent=researcher
    ### END CODE HERE ###
)

# define the verify information quality task
verify_information_quality_task = Task(
    description=(
        "Review all collected research. Identify any conflicting information, "
        "potential misinformation, or gaps that need addressing."
    ),
    ### START CODE HERE ###
    expected_output=( 
        "A report with all the original data you have, separating verified facts from questionable information. Make sure this is as comprehensive as possible for the final report generation."
    ),
    agent=fact_checker
    ### END CODE HERE ###
)

# define the write final report task
write_final_report_task = Task(
    description=(
        "Create a comprehensive report that answers the original query using all verified research data. "
        "Structure it with clear sections, include citations, and provide actionable insights."
    ),
    ### START CODE HERE ###
    expected_output=( 
        "A final research report containing: executive summary, detailed findings that answer the user query, supporting evidence and analysis, complete source citations."
    ),
    agent=report_writer
    ### END CODE HERE ###
)
    

<a id="5"></a>

## 5. Define the Crew and get the results

Once the agents and tasks have been defined, you are ready to create the crew. In order to do so, you will need to set the following arguments:
- `agents`: list of agents in the crew
- `tasks`: list of tasks in the crew. The tasks should be listed in the order they should be executed

In the next cell, fill in the agents and tasks for the crew.

In [7]:
# create the crew with the defined agents and tasks
crew = Crew(
    ### START CODE HERE ###
    agents=[research_planner, researcher, fact_checker, report_writer],
    tasks=[create_research_plan_task, gather_research_data_task, verify_information_quality_task, write_final_report_task]
    ### END CODE HERE ###
)
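
The order of the `tasks` list matters because, in this lab, each task builds on what the previous one produced (the research data follows the plan, the review follows the data, and so on). The toy pipeline below sketches that idea in plain Python, assuming each step's output is handed to the next as context. It is not CrewAI code, and `run_in_order` and `toy_tasks` are hypothetical names.

```python
from typing import Callable, List, Tuple

# Each toy "task" is a (name, function) pair. The function receives the
# context produced so far and returns its own output.
ToyTask = Tuple[str, Callable[[str], str]]


def run_in_order(tasks: List[ToyTask], user_query: str) -> List[str]:
    """Run toy tasks one after another, passing each output to the next."""
    context = user_query
    log: List[str] = []
    for name, work in tasks:
        output = work(context)
        log.append(f"{name}: {output}")
        context = output  # the next task builds on this output
    return log


toy_tasks: List[ToyTask] = [
    ("plan", lambda ctx: f"plan for [{ctx}]"),
    ("research", lambda ctx: f"data for [{ctx}]"),
    ("verify", lambda ctx: f"checked [{ctx}]"),
    ("report", lambda ctx: f"report on [{ctx}]"),
]

steps = run_in_order(toy_tasks, "solar panels")
for line in steps:
    print(line)
```

Reordering `toy_tasks` changes what each step receives, which is why the crew's tasks should be listed in the order they are meant to run.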

Before running the crew, you need to define the query, which will be used as input for the tasks.

In [8]:
### START CODE HERE ###

# Write your query, which will be used as input for the tasks.
user_query = "Evaluate the top five emerging AI tools for automating competitive market analysis for 2025, including their features, limitations, costs and ideal use cases for a mid-sized marketing firm."

### END CODE HERE ###

Now all that is left is to kick off the crew and get the results. Since you set `verbose=True` in the agents, you should be able to monitor the whole process.

In [9]:
result = crew.kickoff(
    inputs={
        "user_query": user_query,
    }
)

In the output of the previous cell, check the output of each task. Do they match what you expected? If not, go back and refine the `expected_output`.

You can also print the final report to see the final result of the crew.

In [11]:
from IPython.display import Markdown
Markdown(result.raw) 

**Comprehensive Research Report on Emerging AI Tools for Automating Competitive Market Analysis**

---

### Executive Summary
The landscape of competitive market analysis is rapidly evolving, with emerging artificial intelligence (AI) tools enhancing the capabilities of marketing firms. This report evaluates the top five emerging AI tools specifically designed to automate competitive market analysis for mid-sized marketing organizations. The analysis includes identification of the tools, features, limitations, cost breakdowns, and ideal use cases, aiming to provide actionable insights and recommendations.

---

### 1. Identification of Top Five Emerging AI Tools

#### 1.1 Madgicx AI Marketer
- **Description**: An advanced AI tool that helps optimize advertising strategies through real-time insights into competitors' marketing activities.
- **Differentiation**: It uniquely integrates competitor data with campaign optimization features, providing a holistic view of performance.
- **User Feedback**: Users report substantial time efficiencies and enhanced results in campaign metrics.
- **Verified Sources**: [Madgicx Blog](https://madgicx.com/blog/competitive-benchmarking-ai).

#### 1.2 Izola by Forrester
- **Description**: A generative AI solution that synthesizes research insights via a conversational interface.
- **Differentiation**: Offers seamless access to evidence-backed summaries directly linked to authoritative Forrester reports.
- **User Feedback**: Early users commend its speed and efficacy in retrieving actionable insights.
- **Verified Sources**: [Forrester Press Release](https://www.forrester.com/press-newsroom/forrester-izola-generative-ai-tool/).

#### 1.3 AlphaSense
- **Description**: A comprehensive market intelligence platform utilizing generative AI to extract insights from extensive financial documentation and news sources.
- **Differentiation**: Provides notable coverage of both public and private corporate data, enhancing the breadth of insights available.
- **User Feedback**: Praises focus on usability and the ability to digest large datasets effectively.
- **Verified Sources**: [AlphaSense Blog](https://www.alpha-sense.com/blog/product/smart-summaries-for-companies/).

#### 1.4 Nexis+ AI
- **Description**: An advanced business intelligence platform that employs generative AI to deliver research insights quickly and efficiently.
- **Differentiation**: Notable for its combination of robust data sources, yielding reliable and timely insights.
- **User Feedback**: Users commend its streamlined research capabilities and adaptability.
- **Verified Sources**: [LexisNexis](https://www.lexisnexis.com/en-us/products/nexis-plus-ai.page).

#### 1.5 Comparables.ai
- **Description**: A tool that enables comprehensive company and market analysis, focusing on performance insights.
- **Differentiation**: Excels at identifying key valuation drivers and enabling cross-market comparisons.
- **User Feedback**: Praised for facilitating quick understanding of market dynamics.
- **Verified Sources**: [Comparables.ai](https://www.comparables.ai/use-cases/ai-insights).

---

### 2. Features of Each AI Tool

| Tool Name        | Key Features                                                               | Unique Capabilities                                 |
|------------------|---------------------------------------------------------------------------|----------------------------------------------------|
| Madgicx AI       | Automated ad monitoring, comprehensive budget insights                     | AI-driven creative optimization                      |
| Izola            | Chat-based inquiries for fast research insights                            | Dynamic generative responses across vast data       |
| AlphaSense       | Real-time data extraction from diverse datasets                            | Amalgamation of both historical and live data      |
| Nexis+ AI        | Multi-source answer generation, customizable reporting capabilities        | Strong focus on legal compliance and market data    |
| Comparables.ai   | Detailed analysis on performance metrics, comparables visualization        | Insight into market trends influencing valuation processes |

---

### 3. Limitations of Each AI Tool

- **Madgicx AI Marketer**: Primarily targets advertising strategies; it may not provide comprehensive insights outside marketing contexts.
- **Izola**: Access is restricted to Forrester subscribers; breadth of information may be limited depending on subscription scope.
- **AlphaSense**: Operating costs may restrict its accessibility for smaller firms needing budget-friendly options.
- **Nexis+ AI**: The platform's complexity may demand training, presenting a steep learning curve for new users.
- **Comparables.ai**: Effectiveness is tied to the quality of available data, posing challenges in niche market assessments.

---

### 4. Cost Analysis of Each AI Tool

| Tool Name        | Pricing Model                      | Cost Range                      |
|------------------|-----------------------------------|---------------------------------|
| Madgicx AI       | Subscription-based                 | Starting at $99/month           |
| Izola            | Included in Forrester subscription  | Pricing varies based on plan    |
| AlphaSense       | Tiered subscription                | Typically starts from $1000/month |
| Nexis+ AI        | Custom pricing                     | Begins at $200/month             |
| Comparables.ai   | Subscription-based                 | Starting at $50/month            |

---

### 5. Ideal Use Cases for a Mid-Sized Marketing Firm

- **Madgicx AI Marketer**: Best implemented for enhancing and optimizing ad campaigns on social media platforms.
- **Izola**: Most beneficial for synthesizing quick research insights ahead of strategic meetings.
- **AlphaSense**: Designed for firms that require an in-depth market and financial analysis support in strategic decision-making.
- **Nexis+ AI**: Optimal for conducting legal and regulatory market research where compliance is a critical factor.
- **Comparables.ai**: Highly recommended for benchmarking company performance against fast-changing competitors in the marketplace.

---

### Conclusions and Recommendations

The analysis provides crucial insights into five emerging AI tools that enhance competitive market analysis efforts within mid-sized marketing firms. 

1. **Ad-Focused Firms**: Should consider integrating Madgicx AI and Nexis+ AI for extensive advertising oversight and business intelligence.
   
2. **Comprehensive Data Needs**: AlphaSense and Comparables.ai are ideal choices for firms seeking extensive data analysis for effective decision-making.

3. **Research-Oriented Environments**: Firms engaged in research-intensive activities will benefit significantly from Izola and Nexis+ AI for obtaining quick, actionable insights.

By leveraging the attributes of these AI tools, mid-sized marketing firms can enhance their market positioning and streamline their competitive analysis efforts, ensuring they remain agile and effective in an evolving marketplace.

--- 

This structured research provides not only a detailed evaluation of the tools available but also insights that are actionable, ensuring marketing firms can make informed decisions that best suit their strategies.

You made it to the end of the lab! You can go back and experiment with the goals and backstories of the agents, as well as description and expected outputs of tasks. You can also change the inputs to any research topic you wish. Have fun with it!
