# Lab 6 - Agent Collaboration

In this notebook, we will explore new ways to manage collaboration among a team of agents. It contains guided exercises: each step is explained, and you are expected to try completing it yourself.

## What are we going to cover in this notebook?

We will cover the following topics:

1. Hierarchy
2. Managers
3. Delegation
4. Asynchronous tasks
5. Example Use Case

We'll see agent collaboration while trying to solve another example use case.

Given a compilation of AI-related news ([AI news](https://www.kaggle.com/datasets/deepanshudalal09/mit-ai-news-published-till-2023)) as a CSV file, we will create a team of agents to extract and summarize one article that was published in a specific month and year. The problem statement is as follows:

**Team Task**: News Article Summarization for a Specific Month and Year
The goal of this task is to summarize one selected news article from a CSV file that was published in a specific month and year, which will be determined at runtime.

**Description of the Data**:
The input data is a CSV file containing news article metadata and text. Each row represents a single news article.

**Team Goal**:
The team's objective is to extract all news articles from a specific month and year (which will be determined at runtime), select one of them, and provide a brief summary of the article that is understandable by the general public.

**Note**: This is just a toy example to illustrate several collaboration functionalities. The objective is not to obtain the best outputs, but rather to learn how to design and configure an appropriate Multi-Agent System.

## Step 1: Analyse the data

1. First, you'll need to download the articles CSV file from the URV Virtual Campus or Kaggle.
2. Open the file and analyze the columns it contains.

In [1]:
import pandas as pd

articles = pd.read_csv('AI_articles.csv')
articles.head()

Unnamed: 0.1,Unnamed: 0,Published Date,Author,Source,Article Header,Sub_Headings,Article Body,Url
0,0,"July 7, 2023",Adam Zewe,MIT News Office,Learning the language of molecules to predict ...,This AI system only needs a small amount of da...,['Discovering new materials and drugs typicall...,https://news.mit.edu/2023/learning-language-mo...
1,1,"July 6, 2023",Alex Ouyang,Abdul Latif Jameel Clinic for Machine Learning...,MIT scientists build a system that can generat...,"BioAutoMATED, an open-source, automated machin...",['Is it possible to build machine-learning mod...,https://news.mit.edu/2023/bioautomated-open-so...
2,2,"June 30, 2023",Jennifer Michalowski,McGovern Institute for Brain Research,"When computer vision works more like a brain, ...",Training artificial neural networks with data ...,"['From cameras to self-driving cars, many of t...",https://news.mit.edu/2023/when-computer-vision...
3,3,"June 30, 2023",Mary Beth Gallagher,School of Engineering,Educating national security leaders on artific...,"Experts from MIT’s School of Engineering, Schw...",['Understanding artificial intelligence and ho...,https://news.mit.edu/2023/educating-national-s...
4,4,"June 30, 2023",Adam Zewe,MIT News Office,Researchers teach an AI to write better chart ...,A new dataset can help scientists develop auto...,['Chart captions that explain complex trends a...,https://news.mit.edu/2023/researchers-chart-ca...
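
The `Published Date` column stores dates as text such as `July 7, 2023`. Before choosing a month and year for the crew, it can help to see how those strings map to (year, month) pairs. The following is a minimal sketch using only the standard library; the sample dates are copied from the preview above, and the helper name `count_by_month` is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Sample values copied from the 'Published Date' column shown above
sample_dates = ["July 7, 2023", "July 6, 2023", "June 30, 2023", "June 30, 2023", "June 30, 2023"]

def count_by_month(date_strings):
    """Count articles per (year, month), assuming the 'Month D, YYYY' format."""
    counts = Counter()
    for text in date_strings:
        parsed = datetime.strptime(text, "%B %d, %Y")
        counts[(parsed.year, parsed.month)] += 1
    return counts

counts = count_by_month(sample_dates)
print(counts)
```

Running the same count over the whole column would show which months actually contain articles, which is worth knowing before asking the crew for a given period.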


## Step 2: Library Initialization

1. Load the necessary crewai libraries

In [None]:
# Load libraries



In [2]:
from crewai import Agent, Task, Crew, Process
from IPython.display import Markdown

## Step 3: Create a Data Schema

Create a new pydantic data schema to store an article. The schema should contain:
- **Class Name**: News
- **Fields**:
  - Heading (str)
  - Subheading (str)
  - Text (str)

**Note**: don't forget to add the `get_schema` @classmethod from Lab 05!

In [None]:
# Create the News pydantic schema



In [3]:
from pydantic import BaseModel, Field
from typing import List

class News(BaseModel):
    """Output of the NewsLoader tool"""
    heading: str = Field(..., description='Heading of the article')
    subheading: str = Field(..., description='Subheading of the article')
    text: str = Field(..., description='Text of the article')

    @classmethod
    def get_schema(cls) -> str:
        schema = '\n'
        for field_name, field_instance in cls.__fields__.items():
            schema += f'{field_name}, described as: {field_instance.description}\n'
        return schema

## Step 4: Create a New Custom Tool

We'll create a new custom tool to load CSV data and select one article. The tool must have the following features:

- Have an option to load the CSV when it is initialized
- The run method must take **month** and **year** as input parameters
- The run method can optionally load the CSV at runtime
- Filter the news articles for the specified month and year
- Select one of them, which can be done randomly or using a more complex approach

A boilerplate for the tool is provided.

In [4]:
from crewai_tools import BaseTool
import pandas as pd


class NewsLoader(BaseTool):
    name: str
    description: str
    news_df: pd.DataFrame = None

    class Config:
        arbitrary_types_allowed = True
    
    def __init__(self, csv_path: str = None, **kwargs):
        super().__init__(**kwargs)
        # Code to load the CSV file if provided
    
    def _run(self, month: int, year: int, csv_path: str = None) -> News:
        # Optionally, load the CSV from the csv_path
        # Code to filter the dataframe and return one randomly selected article
        
        
        return selected_article

In [5]:
import ast, random
from crewai_tools import BaseTool
import pandas as pd


class NewsLoader(BaseTool):
    name: str = 'CSV News Loader'
    description: str = 'Returns one complete article from a CSV file written in a specified month and year'
    news_df: pd.DataFrame = None

    class Config:
        arbitrary_types_allowed = True
    
    def __init__(self, csv_path: str = None, **kwargs):
        super().__init__(**kwargs)
        if csv_path: # Load CSV if provided
            try:
                self.news_df = pd.read_csv(csv_path)
            except Exception as e:
                print(f'Error loading CSV file: {e}')
    
    def _run(self, month: int, year: int, csv_path: str = None) -> News:
        if csv_path: # Load CSV at runtime if provided
            try:
                self.news_df = pd.read_csv(csv_path)
            except Exception as e:
                print(f'Error loading CSV file: {e}')
                return
        if self.news_df is None: # Check if a CSV file has been previously provided 
            print('The csv_path to the file must be provided!')
            return
        self.news_df['Published Date'] = pd.to_datetime(self.news_df['Published Date'])
        dates = self.news_df['Published Date']
        filtered_df = self.news_df.loc[(dates.dt.year == year) & (dates.dt.month == month)] # Filter by month and year (valid for months of any length)
        filtered_news = []
        for _, row in filtered_df.iterrows(): # For each available article
            text = ast.literal_eval(row['Article Body']) # Preprocess the multiple paragraphs text as a single string
            final_text = ''
            for paragraph in text:
                final_text += paragraph
            filtered_news.append(News(heading=row['Article Header'], subheading=row['Sub_Headings'], text=final_text)) # Create a list of articles    
        
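        if not filtered_news: # No article was published in the requested period
            print(f'No articles found for month {month} of year {year}')
            return
        return random.choice(filtered_news) # Randomly select one of them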
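
The `Article Body` column holds each article as the string representation of a Python list of paragraphs, which is why the tool calls `ast.literal_eval` before building the text. Here is a minimal sketch of that preprocessing step in isolation, using a made-up two-paragraph body. The sample string and the `join_paragraphs` helper are illustrative and are not part of the dataset or of CrewAI.

```python
import ast

def join_paragraphs(raw_body: str, separator: str = " ") -> str:
    """Parse a stringified list of paragraphs and join it into one string."""
    paragraphs = ast.literal_eval(raw_body)  # evaluates Python literals only
    return separator.join(paragraphs)

# Made-up example in the same shape as the 'Article Body' column
raw = "['First paragraph.', 'Second paragraph.']"
print(join_paragraphs(raw))
```

Passing `separator=''` reproduces plain concatenation, while a space keeps sentences from running together at paragraph boundaries.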

Now, you can verify that the tool is working correctly.

We're providing the [`result_as_answer=True` parameter](https://docs.crewai.com/how-to/Force-Tool-Ouput-as-Result/) to ensure the tool's output is returned as the agent's answer.

In [6]:
news_loader = NewsLoader(result_as_answer=True)
news = news_loader.run(month=1, year=2023, csv_path='AI_articles.csv')

Using Tool: CSV News Loader


In [7]:
news

News(heading='2022-23 Takeda Fellows: Leveraging AI to positively impact human health', subheading='New fellows are working on health records, robot control, pandemic preparedness, brain injuries, and more.', text='The MIT-Takeda Program, a collaboration between MIT’s School of Engineering and Takeda Pharmaceuticals Company, fuels the development and application of artificial intelligence capabilities to benefit human health and drug development. Part of the Abdul Latif Jameel Clinic for Machine Learning in Health, the program coalesces disparate disciplines, merges theory and practical implementation, combines algorithm and hardware innovations, and creates multidimensional collaborations between academia and industry.With the aim of building a community dedicated to the next generation of AI and system-level breakthroughs, the MIT-Takeda Program is also creating educational opportunities. Every year Takeda funds fellowships to support graduate students pursuing research related to he

## Step 5: Create the Agents

Now we'll create the agents.
We propose creating three agents (Data Ingestion Agent, Writer Agent, and Editor Agent) to work together efficiently in a collaborative environment.

Each agent has a specific role:
- **Data Ingestion Agent**: processes and filters the news data. It has access to the News Loader tool in order to complete its main task.
- **Writer Agent**: creates concise summaries for easier understanding.
- **Editor Agent**: reviews and refines content for publication.

We enable `allow_delegation` in the Writer Agent so it can collaborate with the Editor Agent to obtain a higher-quality text.

A proposed set of agents is provided below. Now, you must create them. You're free to modify the descriptions!

**Data Ingestion Agent**
| **Role** | **Goal** | **Backstory** | **Allow Delegation** | **Tools** |
| --- | --- | --- | --- | --- |
| Data Reader | Fetch an article written in a given time period from a CSV file source. | The Data Ingestion Agent is responsible for reading the CSV file {csv_file} containing the news articles in a given time period of month {month} and year {year}, providing one article to other agents. | False | [News Loader] |

**Writer Agent**
| **Role** | **Goal** | **Backstory** | **Allow Delegation** |
| --- | --- | --- | --- |
| Writer | Summarize the fetched news article and write a report on that article. | The Writer Agent takes the fetched article from the Data Reader and creates a concise summary to provide an easy description of the article for the general public. | True |

**Editor Agent**
| **Role** | **Goal** | **Backstory** | **Allow Delegation** |
| --- | --- | --- | --- |
| Content Editor | Review news articles for publication. | As an experienced editor, you'll be reviewing the content written by other agents to ensure it meets our quality standards. | False |

In [None]:
# Create the agents




In [8]:
data_ingestion_agent = Agent(
    role='Data Reader',
    goal='Fetch a news article containing heading, subheading and text written in a given time period from a CSV file source.',
    backstory=(
        "The Data Ingestion Agent is responsible for reading the CSV file {csv_file} containing "
        "news articles in a given time period of month {month} and year {year}, providing "
        "one news article to other agents."
    ),
    allow_delegation=False,
    tools=[NewsLoader(result_as_answer=True)],
    verbose=True,
    llm='ollama/llama3.1'
)

writer_agent = Agent(
    role='Writer',
    goal='Summarize the fetched news article and write a report on that article.',
    backstory=(
        "The writer Agent takes the fetched article from the Data Reader and creates a summary "
        "to provide an easy description of the article for the general public."
    ),
    allow_delegation=True,
    verbose=True,
    llm='ollama/llama3.1'
)

editor_agent = Agent(
    role='Content Editor',
    goal='Spellcheck news articles for publication.',
    backstory=(
        "As an experienced editor, you'll be providing spellchecked content written by other agents "
        "to ensure it meets our quality standards."
    ),
    allow_delegation=False,
    verbose=True,
    llm='ollama/llama3.1'
)

## Step 6: Create a Callback Function

We will create a [callback function](https://docs.crewai.com/concepts/tasks#callback-mechanism), which is a function that is executed after a task is completed. We will use it to write the final output of the agents to a Markdown file.

When creating a callback function, the input parameter of the function will be a [Task Output](https://docs.crewai.com/concepts/tasks#task-output-attributes). Check the attributes of the Task Output and complete the following function to save the results to a Markdown file:

**Note**: To implement the callback function properly, please refer to the documentation for available attributes in Task Output.

In [9]:
def save_markdown(task_output):
    filename = 'Articles Summary.md'
    # Save task output to 'Articles Summary.md'
    

In [10]:
def save_markdown(task_output):
    filename = 'Articles Summary.md'
    with open(filename, 'w') as file:
        file.write(task_output.raw)
    print(f'Articles saved as {filename}')
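
Before wiring the callback into a task, it can be tried on its own. The sketch below assumes only that the object passed in exposes a `raw` string attribute, as used in the solution above. `types.SimpleNamespace` is a stand-in for the real Task Output, and the `save_markdown_to` name and its `filename` parameter are introduced here purely for illustration.

```python
import os
import tempfile
from types import SimpleNamespace

def save_markdown_to(task_output, filename):
    """Write the raw text of a task output to the given Markdown file."""
    with open(filename, 'w', encoding='utf-8') as file:
        file.write(task_output.raw)
    print(f'Articles saved as {filename}')

# Stand-in object: it only mimics the `raw` attribute of a Task Output
fake_output = SimpleNamespace(raw='# Title\n\nA short summary.')
target = os.path.join(tempfile.mkdtemp(), 'Articles Summary.md')
save_markdown_to(fake_output, target)

# Read the file back to confirm what was written
with open(target, encoding='utf-8') as file:
    saved_text = file.read()
```

Testing the callback this way avoids running the whole crew just to discover a file-handling mistake.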

## Step 7: Create the Tasks

Now we'll create the needed tasks. The proposed ones are specified below, although you're free to modify them or introduce additional ones. Before implementing them, read through them to understand the purpose of each one.

Tasks can use [asynchronous execution](https://docs.crewai.com/how-to/sequential-process#asynchronous-execution), which lets them run asynchronously and in parallel. In this simple example, it's not required since the tasks are quite streamlined.

You can also [limit the number of iterations](https://docs.crewai.com/how-to/hierarchical-process#key-features) allowed to complete a task or an agent's answer. Here, we want to directly provide the fetched text.

Finally, you can provide [context](https://docs.crewai.com/concepts/tasks#referring-to-other-tasks) to specifically define which tasks' outputs should be used as context for a given task. In this simple case, it's not necessary, but complex cases might benefit from it, as it helps the system know how to coordinate tasks.

**Fetch News**

| **Task Detail** | **Value** |
| --- | --- |
| **Description** | Provide one complete news article from the period month {month} of year {year} from the CSV file {csv_file}. |
| **Agent** | `data_ingestion_agent` |
| **Async Execution** | False |
| **Expected Output** | One news article of a specific month ({month}) and year ({year}). The news should be complete, without introducing any modification, providing heading, subheading and the article text. The article must not be summarised. The output should adhere to the following schema: `{News.get_schema()}` |
| **Output Pydantic** | News |
| **Max iterations** | 1 |

**Summarize News**

| **Task Detail** | **Value** |
| --- | --- |
| **Description** | Summarize a news story that has been fetched by the data ingestion agent into a summary of around 500 words. |
| **Agent** | `writer_agent` |
| **Async Execution** | False |
| **Expected Output** | A concise summary highlighting the key points and findings of an article. The summary should be no longer than 500 words. Summary should be formatted with concise language to make complex concepts accessible to a general audience. |
| **Context** | [fetch_news] |

**Edit News**

| **Task Detail** | **Value** |
| --- | --- |
| **Description** | A news article that has been spellchecked and is ready for publication. The article summary is provided by the writer agent. |
| **Agent** | `editor_agent` |
| **Async Execution** | False |
| **Expected Output** | A news article, ensuring it has been spellchecked, with any necessary corrections or revisions made. The editor should:<br>- Introduce any necessary modifications to obtain a spellchecked text.<br>- Provide the news article without any additional comments. |
| **Context** | [summarise_news] |
| **Callback Function** | `save_markdown` |

In [None]:
# Create the tasks




In [11]:
fetch_news = Task(
    description='Provide one complete news article from the period month {month} of year {year} from the CSV file {csv_file}.',
    agent=data_ingestion_agent,
    async_execution=False,
    expected_output=(
        "One news article of a specific month ({month}) and year ({year}). The news should be complete, "
        "without introducing any modification, providing heading, subheading and the article text. The "
        "article must not be summarised. "
        f"The output should adhere to the following schema: {News.get_schema()}"
    ),
    output_pydantic=News,
    max_iter=1
)

summarise_news = Task(
    description='Summarize a news story that has been fetched by the data ingestion agent into a summary of around 500 words.',
    agent=writer_agent,
    async_execution=False,
    expected_output=(
        "A concise summary highlighting the key points and findings of an article. "
        "The summary should be no longer than 500 words. Summary should be formatted with concise language to "
        "make complex concepts accessible to a general audience."
    ),
    context=[fetch_news]
)

edit_news = Task(
    description='A news article that has been spellchecked and is ready for publication. The article summary is provided by the writer agent',
    agent=editor_agent,
    async_execution=False,
    expected_output=(
        "A news article, ensuring it has been spellchecked, with "
        "any necessary corrections or revisions made. The editor should:\n"
        "- Introduce any necessary modifications to obtain a spellchecked text.\n"
        "- Provide the news article without any additional comments."
    ),
    context=[summarise_news],
    callback=save_markdown
)

## Step 8: Crew and Process

The next step is to create our team of agents. To do this, we need to create a crew that specifies the agents and tasks in no specific order.

Then, add the following parameters to configure the crew:

```python
process=Process.hierarchical
manager_llm='ollama/llama3.1'
```

In [None]:
# Create the crew



In [12]:
crew = Crew(
    agents=[data_ingestion_agent, writer_agent, editor_agent],
    tasks=[fetch_news, summarise_news, edit_news],
    process=Process.hierarchical,
    #manager_agent=manager_agent,
    manager_llm='ollama/llama3.1',
)

## Hierarchical Process and Planning

By specifying `Process.hierarchical`, we replace the previous sequential task completion behavior. Instead of executing tasks in the specified order, a manager agent delegates tasks to worker agents to execute.

When using a [hierarchical process](https://docs.crewai.com/concepts/processes), you must define either a `manager_agent` or a `manager_llm` that will perform the management task. In this case, we used a `manager_llm`. The temperature of the model controls how random its sampling is: a higher value produces more varied answers, at the cost of less predictable behavior.

A custom manager passed as `manager_agent` is created in the same way as any other agent, and its role is to oversee the execution of tasks by the worker agents.

Sequential processes, for their part, offer the option to use [Planning](https://docs.crewai.com/concepts/planning) to enhance agent outputs. Before execution starts, a [planning_llm](https://docs.crewai.com/concepts/planning#planning-llm) creates a step-by-step plan to complete the tasks, potentially leading to improved outcomes.

## Step 9: Run the Crew

You've now reached the final step: executing your crew!

**Important:** Before running the crew, make sure you provide the necessary inputs:
- A month
- A year
- The path to the CSV file containing the news articles

With these inputs in place, you can execute the crew and see it in action. Results will be saved in a Markdown file, and you can also visualise them here.
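
The keys of the `inputs` dictionary correspond to the `{month}`, `{year}` and `{csv_file}` placeholders used in the agent and task definitions. CrewAI performs this substitution internally. The sketch below only mimics the idea with Python's built-in `str.format`, so it should be read as an illustration rather than as CrewAI's actual implementation.

```python
# Task description template taken from the Fetch News task above
description = (
    'Provide one complete news article from the period month {month} '
    'of year {year} from the CSV file {csv_file}.'
)

inputs = {'month': 1, 'year': 2023, 'csv_file': './AI_articles.csv'}

# Fill every placeholder with the matching input value
filled = description.format(**inputs)
print(filled)
```

In this sketch a missing key raises a `KeyError`, which is a useful reminder to pass all three inputs.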

In [None]:
# Kick off the crew


In [13]:

result = crew.kickoff(inputs={'month': 1, 'year': 2023, 'csv_file': './AI_articles.csv'})

# Agent: Data Reader
## Task: Get one complete news article from the CSV file ./AI_articles.csv


# Agent: Data Reader
## Using tool: CSV News Loader
## Tool Input:
"{\"month\": 1, \"year\": 2023, \"csv_path\": \"./AI_articles.csv\"}"
## Tool Output:
heading='Gaining real-world industry experience through Break Through Tech AI at MIT' subheading='A new experiential learning opportunity challenges undergraduates across the Greater Boston area to apply their AI skills to a range of industry projects.' text='Taking what they learned conceptually about artificial intelligence and machine learning (ML) this year, students from across the Greater Boston area had the opportunity to apply their new skills to real-world industry projects as part of an experiential learning opportunity offered through Break Through Tech AI at MIT.Hosted by the MIT Schwarzman Colleg

## Step 10: Reviewing Results

With the crew execution complete, you can now review the results.

You can either:
- Check the crew's output directly
- Open the `Articles Summary.md` file to view the summary of news articles
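
Since the Summarize News task asks for at most 500 words, a quick sanity check on the final text can be useful. This is a minimal sketch, assuming whitespace-separated tokens are an acceptable approximation of words. The `within_word_limit` helper is hypothetical, and in the notebook the text to check would come from `result.raw`.

```python
def within_word_limit(text: str, limit: int = 500) -> bool:
    """Return True if the text has at most `limit` whitespace-separated words."""
    return len(text.split()) <= limit

# Short made-up sentence used only to exercise the helper
sample = 'The program gives students hands-on experience in machine learning.'
print(within_word_limit(sample))
```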

In [14]:
Markdown(result.raw)

The Breakthrough Tech AI program is an innovative 18-month education initiative that provides students with hands-on experience in machine learning and Artificial Intelligence (AI). This comprehensive program aims to bridge the gap between theoretical knowledge and practical application. Empowering students with the skills required to excel in the ever-evolving tech industry.

The Break Through Tech AI program has made a significant impact in its inaugural year, empowering students with the knowledge and skills required to succeed in the tech industry. By providing hands-on experience in machine learning and AI, this initiative is poised to create a new generation of innovators who can tackle complex problems and drive positive change.

## Conclusions

Several mechanisms are available to improve collaboration among agents.

One primary option is the hierarchical process, where a manager agent oversees task delegation and distribution. While powerful and potentially efficient, this approach relies heavily on the quality of the LLM providing effective management. Unfortunately, LLMs often struggle with "hallucination", generating responses that are not based on actual data or context.

This limitation is particularly relevant in this use case.

As an alternative to hierarchical processes, you can consider using sequential processes with planning. This approach involves creating a step-by-step plan for task execution and can help mitigate the challenges associated with LLMs. Additionally, more advanced techniques such as routing or conditional tasks may also be used to optimize collaboration among agents.

## Useful resources

- [CrewAI Docs](https://docs.crewai.com/)
- [CrewAI Repository](https://github.com/crewAIInc/crewAI)
- [Langchain Docs](https://python.langchain.com/v0.2/docs/introduction/)