# Automatic Deep Research - Building more reliable systems

Welcome to the first practice lab of this module! 

In the last module, you worked through an interesting use case for multi-agent systems: building your own custom deep research system. In this lab, you will take what you built in Module 1 and add decision-making tools, such as execution hooks and guardrails, as well as memory, to improve the reliability of your crew.

**What you'll learn:**
- How to add programmatic guardrails to make your multi-agent system more robust
- How to add execution hooks to inject logic after the agents run
- How to add memory to your crew

## Background

As a research consultant, you're constantly tasked with producing comprehensive reports on diverse topics for demanding clients. You need to build an AI research crew that can rapidly gather, verify, and synthesize information from across the internet, delivering reliable, fact-checked reports that meet tight deadlines and exacting standards regardless of the subject matter.

## General instructions
In this lab, you are given the overall structure of the code, and you will need to complete parts of it.

To successfully run this lab, replace all instances of the placeholder `None` with your own code. Sections where you need to write code are delimited by `### START CODE HERE ###` and `### END CODE HERE ###`.

If you are stuck, or simply want to copy a solution into your notebook so that you can execute it, you can find all solution code inside the [Solution](Solution) folder.

**<font color='#5DADEC'>Please make sure to save your work periodically, so you don't lose any progress.</font>**

## Table of contents

- [1. Problem statement](#1)
- [2. Set up your notebook](#2)
- [3. Agents](#3)
- [4. Guardrails](#4)
- [5. Tasks](#5)
- [6. Execution hooks](#6)
- [7. Crew](#7)
  - [7.1. Define the crew](#7-1)
  - [7.2. Define the inputs](#7-2)
  - [7.3. Run the crew](#7-3)

<a id="1"></a>

## 1. Problem statement

The goal of this lab is to build on a multi-agent system that interprets a user's input, creates an action plan, carries out the research and fact-checking, and finally outputs a report you can share with the client. To make the output more reliable, you will add guardrails, execution hooks, and memory to strategic elements of the crew.

Here is a visual summary of the structure of your crew, as well as the new elements you will be adding: 

<img src="../images/lab1-agents-tasks-diagram.PNG">


<a id="2"></a>

## 2. Set up your notebook

Begin by setting up the notebook by importing all necessary modules, and configuring the environment variables so you can connect to OpenAI.

In [None]:
from crewai import Agent, Task, Crew
from crewai_tools import EXASearchTool, ScrapeWebsiteTool
import os
os.environ["CREWAI_TESTING"] = "true"
from utils import get_openai_api_key, get_exa_api_key
from IPython.display import Markdown
import yaml

# set the OpenAI model (gpt-4o-mini)
os.environ["MODEL"] = "gpt-4o-mini"
# set up the OpenAI API key 
os.environ["OPENAI_API_KEY"] = get_openai_api_key()
# set the EXA API key
os.environ["EXA_API_KEY"] = get_exa_api_key()

<a id="3"></a>

## 3. Agents

For this system, you will use four agents:
- Research Planner
- Internet Researcher
- Fact checker
- Report Writer

All their arguments (`role`, `goal`, `backstory`) are already provided in a YAML file that you can use to import the configuration. If you want to take a closer look, open the [config/agents.yaml](config/agents.yaml) file in the file navigator on the left.

In the labs, we have added two parameters not shown in the demo videos: `max_rpm`, and `max_iter`. `max_rpm` sets the maximum requests per minute to avoid rate limits, while `max_iter` limits the maximum iterations before the agent must provide its best answer. Setting these two parameters helps make the agents run a little faster, so the lab doesn't take as long to complete. 

Run the next cell to create an instance of each agent, as well as the tools used by the **Internet Researcher** and **Fact checker** agents.

In [None]:
# create the tool instances
exa_search_tool = EXASearchTool(base_url=os.getenv("EXA_BASE_URL")) 
scrape_website_tool = ScrapeWebsiteTool()

# load the configuration file for the agents
with open('config/agents.yaml', 'r') as file:
        agent_config = yaml.safe_load(file)

# create the agents using the configuration
research_planner = Agent(
        config=agent_config['research_planner'],
        verbose=True,
        max_rpm=150,
        max_iter=15
        )
internet_researcher = Agent(
        config=agent_config['internet_researcher'],
        tools=[exa_search_tool, scrape_website_tool],
        verbose=True,
        max_rpm=150,
        max_iter=15
        )
fact_checker = Agent(
        config=agent_config['fact_checker'],
        tools=[exa_search_tool, scrape_website_tool],
        verbose=True,
        max_rpm=150,
        max_iter=15
        )
report_writer = Agent(
        config=agent_config['report_writer'],
        verbose=True,
        max_rpm=150,
        max_iter=15
        )

<a id="4"></a>

## 4. Guardrails

To make your system more robust, you will add guardrails to your tasks. In this lab, you will work with [**Task Guardrails**](https://docs.crewai.com/en/concepts/tasks#task-guardrails): custom functions that validate, and optionally transform, a task's output before it is passed to the next task. They help ensure data quality and give feedback to agents when their output doesn't meet your criteria.

A guardrail function must accept exactly one parameter (the task output it is reviewing) and return a tuple. If the validation succeeds, it returns a tuple of `(bool, Any)`, for example `(True, validated_result)`. If the validation fails, it returns a tuple of `(bool, str)`, for example `(False, "Error message explaining the failure")`. For more information, you can check out the [docs](https://docs.crewai.com/en/concepts/tasks#task-guardrails).
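
As a minimal sketch of this contract, the hypothetical guardrail below rejects outputs that are shorter than a minimum word count. The function name, the `MIN_WORDS` threshold, and the `getattr` fallback are illustrative assumptions and are not part of the lab's solution.

```python
MIN_WORDS = 5  # hypothetical threshold, chosen only for this example

def min_length_guardrail(output):
    # accept a plain string, or any object exposing a `raw` attribute
    text = output if isinstance(output, str) else getattr(output, "raw", "")
    word_count = len(text.split())
    if word_count < MIN_WORDS:
        # failure: (False, message describing what needs to be fixed)
        return (False, f"Output has {word_count} words; at least {MIN_WORDS} are required.")
    # success: (True, validated result)
    return (True, text)

print(min_length_guardrail("Too short"))
print(min_length_guardrail("This output is long enough to pass"))
```

The first call fails the check and returns an error message, while the second returns the validated text unchanged.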

In particular, you will implement a guardrail for the final output. You want to make sure the final report has all the sections needed: 
- Summary
- Insights (or recommendations)
- Citations (or References)

To make sure each keyword actually appears in a section title, check that it follows one or more `#` characters on the same line. You will use regular expressions for that.
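
The sketch below illustrates one way to write such a check, using a stricter variant that also anchors the pattern to the start of the line with `re.MULTILINE`. The helper name `has_section` and the sample keywords are hypothetical, and the lab's own patterns are simpler than this one.

```python
import re

def has_section(markdown_text, keywords):
    # hypothetical helper: True if any keyword appears on a Markdown heading line
    alternatives = "|".join(re.escape(k) for k in keywords)
    # ^#+ requires the line to start with one or more '#' characters
    pattern = rf"^#+.*\b(?:{alternatives})\b"
    flags = re.MULTILINE | re.IGNORECASE
    return re.search(pattern, markdown_text, flags=flags) is not None

sample = "# Report\n\n## Methodology\nWe mention the # of samples and the appendix here.\n"
print(has_section(sample, ["methodology", "approach"]))  # keyword is in a heading
print(has_section(sample, ["appendix"]))                 # keyword is only in body text
```

Because `.` does not match newlines by default, the keyword has to be on the same line as the `#` characters. Anchoring with `^` additionally ignores a `#` that appears in the middle of a sentence.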

In [None]:
import re

# write the custom guardrail function
def write_report_guardrail(output):
    # get the raw output from the TaskOutput object
    try:
        output = output if isinstance(output, str) else output.raw
    except Exception as e:
        return (False, ("Error retrieving the `raw` argument: "
                        f"\n{str(e)}\n"
                        )
                )
    
    # convert the output to lowercase
    output_lower = output.lower()

    # check that the summary section exists
    if not re.search(r'#+.*summary', output_lower):
        return (False, 
                "The report must include a Summary section with a header like '## Summary'"
                )

    # check that the insights or recommendations sections exist
    if not re.search(r'#+.*insights|#+.*recommendations', output_lower):
        return (False, 
                "The report must include an Insights or Recommendations section with a header like '## Insights'"
                )

    ### START CODE HERE ###

    # check that the citations (or references) section exists
    if not re.search(None, None):
        return (None, None)

    ### END CODE HERE ###
    return (True, output)

Run the next two cells to test the guardrail function. The first report has a valid structure, and the second is missing sections.

In [None]:
test_report_pass = """
# Report title

## Executive Summary
This is a summary.

## Insights
These are the insights.

## Citations
1. Citation 1
2. Citation 2
"""

write_report_guardrail(test_report_pass)

In [None]:
test_report_fail = """
# Report title

## Executive Summary
This is a summary.
"""

write_report_guardrail(test_report_fail)

<a id="5"></a>

## 5. Tasks
Now you are ready to create the tasks. Just as you did with the agents, you will load the configuration from a YAML file. You can find it in [`config/tasks.yaml`](config/tasks.yaml).
In this case, you will need to assign the corresponding agent to each task, and add the guardrail you just created to the final task.

In [None]:
# load the configuration file for the tasks
with open('config/tasks.yaml', 'r') as file:
    task_config = yaml.safe_load(file)

### START CODE HERE ###

# create the tasks using the configuration
create_research_plan = Task( 
    config=task_config['create_research_plan'],
    agent=None
)

gather_research_data = Task( 
    config=task_config['gather_research_data'],
    agent=None,
)

verify_information_quality = Task( 
    config=task_config['verify_information_quality'],
    agent=None,
)

write_final_report = Task( 
    config=task_config['write_final_report'],
    agent=None,
    guardrails=[None], # add the custom guardrail
)

### END CODE HERE ###

<a id="6"></a>

## 6. Execution hooks

The last step before creating the Crew is creating an [after kickoff hook](https://docs.crewai.com/en/learn/before-and-after-kickoff-hooks#after-kickoff-hook). This is a function that will execute after your crew has finished all the tasks. These functions receive a result object, which contains the outputs of the crew's execution.

In this case, you will create a hook that takes the final output and saves it to a Markdown file on your local file system. You do not need to write any code in this next cell.

In [None]:
def save_file_hook(result):
    """
    Save the final research report to a local markdown file
    """
    try:
        # Get the final report content from the last task output
        if hasattr(result, 'tasks_output') and result.tasks_output:
            report_content = result.tasks_output[-1].raw
        else:
            report_content = str(result)
        
        filename = "research_report.md"
        
        # Save to file
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(report_content)
        
        print(f"Report successfully saved to: {filename}")
        
    except Exception as e:
        print(f"Error saving report to file: {e}")

    # return the result unchanged so the caller of kickoff() still receives it
    return result
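
To see what the hook reads without running a crew, the sketch below builds a stand-in result object with `types.SimpleNamespace`. The stand-in and the helper name `extract_report` are assumptions made for illustration; the real object passed by CrewAI has more attributes than the two mimicked here.

```python
from types import SimpleNamespace

def extract_report(result):
    # mirrors the lookup in save_file_hook: prefer the last task's raw output
    if hasattr(result, "tasks_output") and result.tasks_output:
        return result.tasks_output[-1].raw
    # otherwise fall back to the string form of the result
    return str(result)

# hypothetical stand-in for a crew result with two task outputs
fake_result = SimpleNamespace(
    tasks_output=[
        SimpleNamespace(raw="research plan"),
        SimpleNamespace(raw="# Final report"),
    ]
)

print(extract_report(fake_result))
print(extract_report("plain text result"))
```

The first call picks the last task's output, and the second shows the fallback used when the object has no `tasks_output` attribute.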

<a id="7"></a>

## 7. Crew

<a id="7-1"></a>

### 7.1. Define the crew
Now you are ready to define the crew to run the deep research. As with the previous lab, you will need to define the agents and tasks. This time, you will also add the after kickoff hook and memory to the Crew.

To add the execution hook, set the argument `after_kickoff_callbacks` to a list containing all the after kickoff hooks you need to run, in this case just `save_file_hook`.

In [None]:
# Create the deep research crew
deep_research_crew = Crew(
    # include all the agents
    agents=[research_planner, 
            internet_researcher, 
            fact_checker, 
            report_writer],
    # include all the tasks in the order to be executed
    tasks=[create_research_plan, 
           gather_research_data, 
           verify_information_quality, 
           write_final_report],

    ### START CODE HERE ###
    
    # add memory to the crew
    memory=None,
    # add the after kickoff hook
    after_kickoff_callbacks=[None]

    ### END CODE HERE ###
)

<a id="7-2"></a>

### 7.2. Define the inputs

Use the next cell to define the inputs to your Crew. The input should represent the user's query, so write your own: what would you like information about?

In [None]:
### START CODE HERE ###

# write your query in the "user_query" value
inputs = { 
    "user_query": None
}

### END CODE HERE ###

<a id="7-3"></a>

### 7.3. Run the crew
Now you can run, or kick off, the crew to get the result.

In [None]:
# Execute the crew's tasks
result = deep_research_crew.kickoff(inputs=inputs)

Once the crew is done, you should be able to see the newly created Markdown file with your report in the file navigator on the left. 

Congratulations, you reached the end of this lab! 🎉