# Automatic Deep Research - Adding custom tools

Welcome to the second practice lab of this module! 

In this lab, you will continue working on the deep research crew from Lesson 1. This time you will write your own custom tools and add them to your agents so that they can give more accurate responses.

**What you'll learn:**
- How to create custom tools for your agents

## Background

As a research consultant, you're constantly tasked with producing comprehensive reports on diverse topics for demanding clients. You need to build an AI research crew that can rapidly gather, verify, and synthesize information from across the internet, delivering reliable, fact-checked reports that meet tight deadlines and exacting standards regardless of the subject matter.

## General instructions
In this lab you will be given the structure of the code, but you will need to complete parts of it.

To successfully run this lab, replace all instances of the placeholder `None` with your own code. Sections where you need to write code will be delimited between `### START CODE HERE ###` and `### END CODE HERE ###`.

**<font color='#5DADEC'>Please make sure to save your work periodically, so you don't lose any progress.</font>**

## Table of contents

- [1. Problem statement](#1)
- [2. Set up your notebook](#2)
- [3. Tools](#3)
- [4. Agents](#4)
- [5. Guardrails](#5)
- [6. Tasks](#6)
- [7. Execution hooks](#7)
- [8. Crew](#8)
  - [8.1. Define the crew](#8-1)
  - [8.2. Define the inputs](#8-2)
  - [8.3. Run the crew](#8-3)

<a id="1"></a>

## 1. Problem statement

The goal of this lab is to take the multi-agent system from the previous lab, which interprets a user's input, creates an action plan, carries out the research and fact checking, and outputs a report you can share with the client, and to add tools to its agents so they can better achieve their goals.

You will reuse the code from the first practice lab of this module, so you only need to write new code in the sections [Tools](#3), [Agents](#4), [Tasks](#6) and [Define the inputs](#8-2). The rest of the lab remains the same, with the solution to the previous lab already given to you.

Here is a visual summary of the structure of your crew, as well as the new elements you will be adding: 

<img src="../images/lab2-agents-tasks-diagram.png">


<a id="2"></a>

## 2. Set up your notebook

Begin by setting up the notebook by importing all necessary modules, and configuring the environment variables so you can connect to OpenAI.

In [6]:
from crewai import Agent, Task, Crew, LLM
from crewai_tools import EXASearchTool, ScrapeWebsiteTool
import os
os.environ["CREWAI_TESTING"] = "true"
from utils import get_openai_api_key, get_exa_api_key
from IPython.display import Markdown
import yaml

# set the OpenAI model (gpt-4o-mini)
os.environ["MODEL"] = "gpt-4o-mini"
# set up the OpenAI API key 
os.environ["OPENAI_API_KEY"] = get_openai_api_key()
# set the exa API key
os.environ["EXA_API_KEY"] = get_exa_api_key()

<a id="3"></a>

## 3. Tools

The final goal of this Crew you've been building during the course is to provide the user with a complete report containing researched information about a topic, and what is a report without some cool graphics?

In the next cell, you will write a custom tool that automatically creates charts based on a report of gathered information. This tool will be added to the **Report Writer** agent, so it can add visualizations to the final report.

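Remember that the base structure for a custom tool is:
```python
from typing import Type
from crewai.tools import BaseTool
from pydantic import BaseModel, Field

class MyToolInput(BaseModel):
    """Input schema for MyCustomTool."""
    argument: str = Field(..., description="Description of the argument.")

class MyCustomTool(BaseTool):
    name: str = "Name of my tool"
    description: str = "What this tool does. It's vital for effective utilization."
    args_schema: Type[BaseModel] = MyToolInput

    def _run(self, argument: str) -> str:
        # Your tool's logic here
        return "Tool's result"
```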

In this case, you will need to complete the `CustomPlotTool` class with: 
- `name`: a suitable name for the tool
- `description`: This should be a detailed description of the tool. Mention:
    - The expected input: the full validated information, as a string
    - What it does: automatically generates plots from text
- `_run()` function: specify the type of input and output expected by the tool

The code for generating the plots is already given to you.
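
Because `CustomPlotTool` declares no explicit `args_schema`, the type hints on `_run` are what describe its input to the agent. The following is a minimal sketch, using only the standard library, of how the annotations on a method can be read. It illustrates why accurate type hints matter and is not CrewAI's actual implementation; the `ToyTool` class and the `describe_arguments` helper are hypothetical names.

```python
import inspect


class ToyTool:
    """Hypothetical stand-in for a tool class; not a CrewAI class."""

    def _run(self, research: str) -> str:
        return f"received {len(research)} characters"


def describe_arguments(tool_cls) -> dict:
    """Map each `_run` parameter (except self) to the name of its annotated type."""
    signature = inspect.signature(tool_cls._run)
    described = {}
    for name, param in signature.parameters.items():
        if name == "self":
            continue
        annotation = param.annotation
        # Parameters without a type hint carry inspect.Parameter.empty
        if annotation is inspect.Parameter.empty:
            described[name] = "unannotated"
        else:
            described[name] = getattr(annotation, "__name__", str(annotation))
    return described


print(describe_arguments(ToyTool))  # {'research': 'str'}
```

If the hint were left as the placeholder `None`, the same inspection would report `None` as the expected type, which is not what an agent passing a string needs to see.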

In [7]:
# import packages needed for the custom tool
from crewai.tools import BaseTool
from crewai import LLM
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import json

# Define the custom tool for creating plots
class CustomPlotTool(BaseTool):
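    ### START CODE HERE ###
    name: str = "Custom plot tool"
    description: str = (
        "Automatically generates plots from text. "
        "Input: the full validated research information, as a single string. "
        "Output: a message listing the file paths of the plots that were saved."
    )
    def _run(self, research: str) -> str:
    ### END CODE HERE ###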
        try:
            extraction_prompt = f"""
            You are an expert data visualization assistant. Analyze the provided research text and identify meaningful, insightful charts that can be created to visualize quantifiable data supporting the research's key insights and findings. Only suggest charts for data that includes numerical values, measurable trends, comparisons, or categorical distributions that can be effectively plotted.

            Focus on creating visualizations that highlight trends, comparisons, distributions, or relationships that add value to the research. Avoid suggesting charts for purely qualitative or non-quantifiable information.

            For each chart, provide a JSON object with:
              - "chart_type" (string: choose from "line" for trends over time/continuous, "bar" for comparisons, "histogram" for distributions, "scatter" for relationships, "pie" for proportions)
              - "x_axis" (string: variable name for x-axis, e.g., "year", "category")
              - "y_axis" (string: variable name for y-axis, e.g., "value", "count")
              - "color" (string: optional variable for color grouping/hue, or null if not applicable)
              - "Title" (string: descriptive, insightful title that explains what the chart shows)
              - "data" (dictionary: keys matching x_axis, y_axis, and color variables; values as lists of extracted numerical/categorical data from the research)

            Ensure data is accurately extracted and formatted as lists. If a variable has multiple series (e.g., for color), include all in the data dictionary.

            If no quantifiable data suitable for meaningful visualization is present in the research, return an empty array [].

            Text:
            {research}

            Example output (return valid JSON only):
            [
              {{"chart_type": "line", "x_axis": "year", "y_axis": "funding_amount", "color": "sector", "Title": "AI Research Funding Trends by Sector", "data": {{"year": [2020, 2021, 2022], "funding_amount": [2.5, 3.8, 5.2], "sector": ["Healthcare", "Finance", "Tech"]}}}},
              {{"chart_type": "bar", "x_axis": "tool_name", "y_axis": "adoption_rate", "color": null, "Title": "Market Adoption Rates of AI Tools", "data": {{"tool_name": ["ToolA", "ToolB", "ToolC"], "adoption_rate": [45, 67, 23]}}}}
            ]

            Return only the JSON array, no additional text or explanations.
            """
            llm = LLM(model="gpt-4o-mini",)  # Initialize the LLM instance
            llm_response = llm.call([{"role": "user", "content": extraction_prompt}])

            # Clean the response to extract just the JSON part
            llm_response = llm_response.strip()
            if llm_response.startswith('```json'):
                llm_response = llm_response[7:]  # Remove ```json
            if llm_response.endswith('```'):
                llm_response = llm_response[:-3]  # Remove ```
            llm_response = llm_response.strip()

            # --- Step 2: Parse the LLM output ---
            charts_data = json.loads(llm_response)

            if not isinstance(charts_data, list) or len(charts_data) == 0:
                return "No information found in the research to visualize."

            plots_created = []

            # --- Step 3: Create plots for each chart ---
            for i, chart_info in enumerate(charts_data):
                try:
                    # Extract chart configuration
                    # Fall back to an empty string so a missing chart_type does not raise on .lower()
                    chart_type = str(chart_info.get("chart_type") or "").lower()
                    x_axis = chart_info.get("x_axis", "x")
                    y_axis = chart_info.get("y_axis", "y") 
                    title = chart_info.get("Title", f"Chart {i+1}")
                    hue = chart_info.get("color", None)
                    data = chart_info.get("data", {})

                    # Create DataFrame from the data
                    df = pd.DataFrame(data)

                    if df.empty:
                        continue

                    # Create the plot
                    plt.figure(figsize=(10, 6))

                    if chart_type == "line":
                        sns.lineplot(data=df, x=x_axis, y=y_axis, marker="o", hue=hue)
                    elif chart_type in ["bar", "column"]:
                        sns.barplot(data=df, x=x_axis, y=y_axis, hue=hue)
                    elif chart_type == "histogram":
                        # plt.hist has no `hue` argument; seaborn's histplot supports hue grouping
                        sns.histplot(data=df, x=y_axis, bins=10, alpha=0.7, hue=hue)
                        plt.xlabel(y_axis)
                        plt.ylabel("Frequency")
                    elif chart_type == "scatter":
                        # Scatter plot for relationships between two variables
                        sns.scatterplot(data=df, x=x_axis, y=y_axis, hue=hue)
                    elif chart_type == "pie":
                        # For pie chart, assume y_axis is values, x_axis is labels
                        plt.pie(df[y_axis], labels=df[x_axis], autopct='%1.1f%%', startangle=90)
                        plt.title(title)
                        plt.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.

                    plt.title(title)
                    plt.xticks(rotation=45)
                    plt.tight_layout()

                    # --- Step 4: Save the plot ---
                    os.makedirs("plots", exist_ok=True)
                    filename = f"plots/plot_{i+1}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.png"
                    plt.savefig(filename, dpi=300, bbox_inches='tight')
                    plt.close()

                    plots_created.append(filename)

                except Exception as e:
                    print(f"Error creating chart {i+1}: {str(e)}")
                    continue

            if plots_created:
                return f"Successfully created {len(plots_created)} plots: {', '.join(plots_created)}"
            else:
                return "No plots could be created from the extracted data."

        except json.JSONDecodeError as e:
            return f"Error parsing LLM response as JSON: {str(e)}"
        except Exception as e:
            return f"Error generating smart plot: {str(e)}"
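
The response-cleaning and parsing steps inside `_run` can be tried in isolation, without calling an LLM. The sketch below is an illustration under stated assumptions rather than part of the lab's solution: the helper name `parse_chart_specs` is hypothetical, and the extra length check is an addition motivated by the fact that `pd.DataFrame` raises a `ValueError` when the lists in a dictionary have different lengths.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def parse_chart_specs(llm_response: str) -> list:
    """Strip an optional Markdown code fence, parse JSON, and keep usable chart specs."""
    text = llm_response.strip()
    if text.startswith(FENCE + "json"):
        text = text[len(FENCE + "json"):]
    if text.endswith(FENCE):
        text = text[:-len(FENCE)]
    specs = json.loads(text.strip())
    if not isinstance(specs, list):
        return []

    usable = []
    for spec in specs:
        data = spec.get("data", {}) if isinstance(spec, dict) else {}
        values = list(data.values())
        all_lists = bool(values) and all(isinstance(v, list) for v in values)
        # Every column must have the same number of entries to form a table
        if all_lists and len({len(v) for v in values}) == 1:
            usable.append(spec)
    return usable


sample = (
    FENCE + "json\n"
    '[{"chart_type": "bar", "x_axis": "tool", "y_axis": "rate",'
    ' "data": {"tool": ["A", "B"], "rate": [45, 67]}},'
    ' {"chart_type": "bar", "data": {"tool": ["A"], "rate": [1, 2]}}]\n'
    + FENCE
)
print(len(parse_chart_specs(sample)))  # 1
```

The second spec in the sample is dropped because its two lists differ in length, so only one chart would be attempted.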

As in the previous labs, you will still use the web search and scraping tools for the **Internet Researcher** and **Fact Checker** agents, which you will initialize in the next cell.

In [8]:
# create tools instances
exa_search_tool = EXASearchTool(base_url=os.getenv("EXA_BASE_URL"))
scrape_website_tool = ScrapeWebsiteTool()

<a id="4"></a>

## 4. Agents

For this system, you will use four agents:
- Research Planner
- Internet Researcher
- Fact Checker
- Report Writer

All their arguments (`role`, `goal`, `backstory`) are already given to you in a YAML file, which you will use to configure the agents. If you want to take a closer look, open the [config/agents.yaml](config/agents.yaml) file in the file navigator on the left.

Don't forget to add the new custom tool to the **Report Writer** agent!

In [11]:
# load the configuration file for the agents
with open('config/agents.yaml', 'r') as file:
        agent_config = yaml.safe_load(file)


# create the agents using the configuration
research_planner = Agent(
        config=agent_config['research_planner'],
        verbose=True
        )

internet_researcher = Agent( 
        config=agent_config['internet_researcher'],
        verbose=True,
        tools=[exa_search_tool, scrape_website_tool]
        )

fact_checker = Agent(
        config=agent_config['fact_checker'],
        verbose=True,
        tools=[exa_search_tool, scrape_website_tool]
        )
report_writer = Agent(
        config=agent_config['report_writer'],
        verbose=True,
        ### START CODE HERE ### 
        # add the automatic plot tool
        tools=[CustomPlotTool()]
        ### END CODE HERE ###
        )

<a id="5"></a>

## 5. Guardrails

To make your system more robust, you want to add guardrails to your tasks. These guardrails provide a way to validate and transform task outputs before they are passed to the next task, helping ensure data quality and providing feedback to agents when their output doesn't meet specific criteria. You can find out more about guardrails in the [docs](https://docs.crewai.com/en/concepts/tasks#task-guardrails).


In particular, you will implement a guardrail for the final output. You want to make sure the final report has all the sections needed: 
- Summary
- Insights (or recommendations)
- Citations (or References)

In [12]:
import re

# write the custom guardrail function
def write_report_guardrail(output):
    # get the raw output from the TaskOutput object
    try:
        output = output if isinstance(output, str) else output.raw
    except Exception as e:
        return (False, ("Error retrieving the `raw` argument: "
                        f"\n{str(e)}\n"
                        )
                )
    
    # convert the output to lowercase
    output_lower = output.lower()

    # check that the summary section exists
    if not re.search(r'#+.*summary', output_lower):
        return (False, 
                "The report must include a Summary section with a header like '## Summary'"
                )

    # check that the insights or recommendations sections exist
    if not re.search(r'#+.*insights|#+.*recommendations', output_lower):
        return (False, 
                "The report must include an Insights (or Recommendations) section with a header like '## Insights'"
                )

    # check that the citations (or references) section exists
    if not re.search(r'#+.*citations|#+.*references', output_lower): 
        return (False, 
                "The report must include a Citations (or References) section with a header like '## Citations'"
                )
    return (True, output)
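
The header checks used by the guardrail can be exercised on their own before running the whole crew. The sketch below reuses the same three regular expressions in a hypothetical helper, `missing_sections`, and runs them on two made-up report strings; it is a quick way to see which inputs pass, not a replacement for the guardrail function.

```python
import re

# Same patterns as the guardrail, keyed by the section they detect
SECTION_PATTERNS = {
    "Summary": r'#+.*summary',
    "Insights": r'#+.*insights|#+.*recommendations',
    "Citations": r'#+.*citations|#+.*references',
}


def missing_sections(report: str) -> list:
    """Return the names of required sections whose header is not found."""
    report_lower = report.lower()
    return [name for name, pattern in SECTION_PATTERNS.items()
            if not re.search(pattern, report_lower)]


# Made-up sample reports for illustration
complete = "# Report\n## Summary\ntext\n## Recommendations\ntext\n## References\n- source"
incomplete = "# Report\n## Summary\ntext"

print(missing_sections(complete))    # []
print(missing_sections(incomplete))  # ['Insights', 'Citations']
```

Note that each pattern matches the keyword anywhere on a line that contains a `#`, so a header such as `## Executive Summary` also satisfies the Summary check.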

<a id="6"></a>

## 6. Tasks
Now you are ready to create the tasks. Just as you did with the agents, you will load the configuration from a YAML file, which you can find in [`config/tasks.yaml`](config/tasks.yaml).  

To actually add the charts to the final report, you will need to update the `write_final_report` task in the [`tasks.yaml`](config/tasks.yaml) file. Adapt the `description` and `expected_output` so that they instruct the agent to include the charts generated by the custom tool.
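
To make the target format concrete, the sketch below turns the success message returned by the plot tool into Markdown image links. In the lab, the **Report Writer** agent writes these links itself based on your task instructions, so this is only an illustration of what the final report could contain. The helper name `plots_to_markdown` and the sample file names are hypothetical.

```python
def plots_to_markdown(tool_result: str) -> str:
    """Turn the plot tool's success message into Markdown image links."""
    prefix, separator, paths = tool_result.partition(" plots: ")
    if not separator or not prefix.startswith("Successfully created"):
        return ""  # no plots were created, so there is nothing to embed
    lines = []
    for index, path in enumerate(paths.split(", "), start=1):
        lines.append(f"![Chart {index}]({path.strip()})")
    return "\n".join(lines)


# Illustrative message in the format the tool returns; the timestamps are made up
message = (
    "Successfully created 2 plots: "
    "plots/plot_1_20250101_120000.png, plots/plot_2_20250101_120000.png"
)
print(plots_to_markdown(message))
```

Mentioning this image-link syntax in the task's `expected_output` is one way to tell the agent how the charts should appear in the report.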

Once that's done, run the next cell to create each task. The agents, guardrails, and context are already defined.

In [13]:
# load the configuration file for the tasks
with open('config/tasks.yaml', 'r') as file:
    task_config = yaml.safe_load(file)


# create the tasks using the configuration
create_research_plan = Task( 
    config=task_config['create_research_plan'],
    agent=research_planner 
)

gather_research_data = Task(
    config=task_config['gather_research_data'],
    agent=internet_researcher,
)

verify_information_quality = Task(
    config=task_config['verify_information_quality'],
    agent=fact_checker, 
)

write_final_report = Task( 
    config=task_config['write_final_report'],
    agent=report_writer, 
    guardrails=[write_report_guardrail],
)

<a id="7"></a>

## 7. Execution hooks

The last step before creating the Crew is creating an [after kickoff hook](https://docs.crewai.com/en/learn/before-and-after-kickoff-hooks#after-kickoff-hook). 

In this case, you will create a hook that takes the final output and saves it to a Markdown file on your local file system. 

In [14]:
def save_file_hook(result):
    """
    Save the final research report to a local markdown file
    """
    try:
        # Get the final report content from the last task output
        if hasattr(result, 'tasks_output') and result.tasks_output:
            report_content = result.tasks_output[-1].raw
        else:
            report_content = str(result)
        
        filename = "research_report-p2.md"
        
        # Save to file
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(report_content)
        
        print(f"Report successfully saved to: {filename}")
        
    except Exception as e:
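        print(f"Error saving report to file: {str(e)}")

    # After-kickoff callbacks are expected to hand the result back,
    # so that kickoff() still returns it to the caller
    return result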
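
The hook's logic can be checked without running a crew. The sketch below mirrors how the report text is selected and writes it to a temporary folder. The `SimpleNamespace` objects are hypothetical stand-ins for the object a crew run returns, and `extract_report` is an illustrative helper name.

```python
import os
import tempfile
from types import SimpleNamespace


def extract_report(result) -> str:
    """Mirror the hook's logic: prefer the last task's raw output, else str(result)."""
    if hasattr(result, "tasks_output") and result.tasks_output:
        return result.tasks_output[-1].raw
    return str(result)


# Hypothetical stand-in for the object returned by a crew run
fake_result = SimpleNamespace(
    tasks_output=[SimpleNamespace(raw="plan"), SimpleNamespace(raw="# Final report")]
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "research_report-p2.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(extract_report(fake_result))
    with open(path, encoding="utf-8") as f:
        saved = f.read()

print(saved)  # "# Final report"
```

Only the last task's output is saved, which is why the report-writing task is listed last when the crew is defined.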

<a id="8"></a>

## 8. Crew

<a id="8-1"></a>

### 8.1. Define the crew
Run the next cell to define the crew. 

In [15]:
# Create the deep research crew
deep_research_crew = Crew(
    # include all the agents
    agents=[research_planner, 
            internet_researcher, 
            fact_checker, 
            report_writer],
    # include all the tasks in the order to be executed
    tasks=[create_research_plan, 
           gather_research_data, 
           verify_information_quality, 
           write_final_report],
    # add memory to the crew
    memory=True,
    # add the after kickoff hook
    after_kickoff_callbacks=[save_file_hook]
)

<a id="8-2"></a>

### 8.2. Define the inputs

Use the next cell to define the inputs to your Crew. This should represent the user's query. Try using the same query as in the previous lab, so you can compare the results.

In [16]:
### START CODE HERE ###

# write your query in the "user_query" value
inputs = { 
        "user_query": "how to be the best ceo?"
}
### END CODE HERE ###   

<a id="8-3"></a>

### 8.3. Run the crew
Now you can run, or kick off, the crew to get the result.

In [17]:
# Execute the crew's tasks
result = deep_research_crew.kickoff(inputs=inputs)


Report successfully saved to: research_report-p2.md


After it finishes running, you should be able to see the newly created Markdown file with your report in the file navigator on the left. Compare it with the results from the first lab. Can you see any differences?
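
One way to compare two reports is a line-level diff. The sketch below uses the standard library's `difflib` on two short made-up strings; to compare real reports you would first read both Markdown files into strings. The helper name `summarize_differences` is hypothetical.

```python
import difflib


def summarize_differences(old_report: str, new_report: str) -> list:
    """Return the lines that were added to or removed from the report."""
    diff = difflib.ndiff(old_report.splitlines(), new_report.splitlines())
    # ndiff prefixes added lines with "+ " and removed lines with "- "
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Made-up report contents for illustration
old = "## Summary\nText"
new = "## Summary\nText\n![Chart 1](plots/plot_1.png)"
print(summarize_differences(old, new))  # ['+ ![Chart 1](plots/plot_1.png)']
```

Because agent outputs vary between runs, expect wording differences as well as the added charts.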

Congratulations! You've successfully completed this lab 🎉