# Data analytical multi-agent workflow:

You can also use multiple agents in a workflow. Here is an example:

1. Data Analysis Planning:

   The planning agent writes a comprehensive data analysis plan, outlining the steps required to analyze the data.

2. Code Generation and Execution:

   For each step in the analysis plan, the Python agent generates the corresponding code, then executes it to perform the specified analysis.

3. Analysis Report Summarization:

   Based on the results of the executed code, the summarization agent writes an analysis report that summarizes the findings and insights derived from the data analysis.
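
The three stages above can be sketched as a plain function pipeline. This is a minimal illustration rather than the notebook's implementation: `run_pipeline` and the three `stub_*` functions are hypothetical stand-ins for the agents, stubbed locally so the control flow is visible without any API calls.

```python
from typing import Callable, List


def run_pipeline(
    dataset_description: str,
    plan: Callable[[str], List[str]],
    generate_and_run: Callable[[str], str],
    summarize: Callable[[str, List[str]], str],
) -> str:
    """Chain the three stages: plan, run each step, then summarize."""
    steps = plan(dataset_description)
    results = [generate_and_run(step) for step in steps]
    return summarize(dataset_description, results)


# Hypothetical stubs standing in for the three agents.
def stub_plan(description: str) -> List[str]:
    return ["load data", "describe data"]


def stub_generate_and_run(step: str) -> str:
    return f"result of '{step}'"


def stub_summarize(description: str, results: List[str]) -> str:
    return f"{description}: " + "; ".join(results)


report = run_pipeline("toy dataset", stub_plan, stub_generate_and_run, stub_summarize)
print(report)
```

In the real workflow each stub would be replaced by a call to the corresponding agent, and the second stage would also execute the generated code.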


## Install dependencies

First, we install the Python SDK and set our API key.

In [None]:
!pip install mistralai==1.0.0

In [None]:
import os
import re

from mistralai import Mistral

# Read the API key from the environment; raises KeyError if MISTRAL_API_KEY is unset.
api_key = os.environ["MISTRAL_API_KEY"]

client = Mistral(api_key=api_key)

## Agents
You can create an agent at https://console.mistral.ai/build/agents/new. For this notebook, we use mistral-large-2407 as the model powering our agents.

Here are the instructions provided to the agents we created:

### Planning agent:

```
You are a data analytical planning assistant. Given a dataset and its description,
your task is to provide specific and simple analysis plans, detailed instructions,
and suggested Python code that can later be given to a separate Python agent to generate
the Python code for executing the analysis plan.
Do not create figures.

Return output with the following format:

## Total number of steps:

## Step 1:
```
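
Because the planning agent is asked to follow a fixed format, its reply can be checked before any code is generated. The sketch below is one assumed way to do that check: `parse_plan` and `sample_plan` are hypothetical names, and the sample text only mimics the requested format.

```python
import re
from typing import Dict


def parse_plan(plan_text: str) -> Dict[int, str]:
    """Split a plan into {step_number: step_body} and verify the declared count."""
    total = re.search(r"## Total number of steps:\s*(\d+)", plan_text)
    if total is None:
        raise ValueError("Plan does not declare a total number of steps.")
    # re.split with one capture group yields [preamble, num, body, num, body, ...].
    parts = re.split(r"^## Step (\d+):", plan_text, flags=re.MULTILINE)
    steps = {int(num): body.strip() for num, body in zip(parts[1::2], parts[2::2])}
    if len(steps) != int(total.group(1)):
        raise ValueError(f"Expected {total.group(1)} steps, found {len(steps)}.")
    return steps


# Hypothetical plan text that follows the requested output format.
sample_plan = """## Total number of steps: 2

## Step 1:
Load the data.

## Step 2:
Summarize the data.
"""

steps = parse_plan(sample_plan)
print(steps)
```

Failing early on a malformed plan avoids spending agent calls on steps that cannot be extracted.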

### Python agent:
```
You are a Python coding assistant that only outputs Python code without any explanations or comments.
Given an instruction and the suggested Python code, return the correct Python code.
```

### Summarization agent:
```
You are an analysis summarization assistant.
Given a dataset's description and the analysis results. Provide an analysis report.
```

### Agent IDs
Next, we retrieve the agent IDs from the UI where we created the agents.


In [None]:
planning_agent_id =  "ag:04f796e4:20240731:planning-assistant:fcaeb51a"
summarization_agent_id = "ag:04f796e4:20240731:analysis-summarization-assistant:e43645c4"
python_agent_id = "ag:04f796e4:20240731:python-assistant:65d8e2f6"

# Analysis Planning

In [None]:
def run_analysis_planning_agent(query):
    """
    Sends a user query to the planning agent and returns the response.

    Args:
        query (str): The user query to be sent to the planning agent.

    Returns:
        str: The response content from the planning agent, or None if the request fails.
    """
    print("### Run Planning agent")
    print(f"User query: {query}")
    try:
        # mistralai 1.x exposes agent completions as `client.agents.complete`.
        response = client.agents.complete(
            agent_id=planning_agent_id,
            messages=[
                {
                    "role": "user",
                    "content": query,
                },
            ],
        )
        # The SDK returns a typed response object; read the text via attributes.
        return response.choices[0].message.content
    except Exception as e:
        print(f"Request failed: {e}. Please check your request.")
        return None
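
The same request pattern is used for every agent in this notebook, so it could be factored into one helper. The sketch below assumes the mistralai 1.x call shape `client.agents.complete(agent_id=..., messages=[...])` and a response exposing `choices[0].message.content`. `run_agent`, `FakeAgents`, and `fake_client` are hypothetical and exist only so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Optional


def run_agent(client, agent_id: str, query: str) -> Optional[str]:
    """Send one user message to an agent and return the reply text, or None on failure."""
    try:
        response = client.agents.complete(
            agent_id=agent_id,
            messages=[{"role": "user", "content": query}],
        )
        return response.choices[0].message.content
    except Exception as e:  # Broad on purpose: this is a sketch, not production handling.
        print(f"Request failed: {e}")
        return None


class FakeAgents:
    """Hypothetical stand-in for the SDK's agents client; echoes the query back."""

    def complete(self, agent_id, messages):
        text = f"[{agent_id}] {messages[0]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(agents=FakeAgents())
reply = run_agent(fake_client, "planning-agent", "Load this data")
print(reply)
```

Passing the client in as an argument, instead of reading a global, is what makes the helper testable with a fake.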

In [None]:
query = """
Load this data: https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv

The dataset consists of 51 datapoints and has eight columns:
- State
- Number of drivers involved in fatal collisions per billion miles
- Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding
- Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired
- Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted
- Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents
- Car Insurance Premiums ($)
- Losses incurred by insurance companies for collisions per insured driver ($)
"""

In [None]:
planning_result = run_analysis_planning_agent(query)

### Run Planning agent
User query: 
Load this data: https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv

The dataset consists of 51 datapoints and has eight columns:
- State
- Number of drivers involved in fatal collisions per billion miles
- Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding
- Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired
- Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted
- Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents	
- Car Insurance Premiums ($)
- Losses incurred by insurance companies for collisions per insured driver ($)



In [None]:
print(planning_result)

## Total number of steps: 3

## Step 1:

### Description:
Load the dataset from the provided URL.

### Detailed Instructions:
1. Use the pandas library to read the CSV file from the URL.
2. Store the data in a DataFrame.

### Suggested Python Code:
```python
import pandas as pd

# Load the dataset
url = "https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv"
data = pd.read_csv(url)
```

## Step 2:

### Description:
Explore the dataset to understand its structure and contents.

### Detailed Instructions:
1. Display the first few rows of the DataFrame to get an overview of the data.
2. Check the data types of each column.
3. Get a summary of the dataset to understand the distribution of the data.

### Suggested Python Code:
```python
# Display the first few rows of the dataset
print(data.head())

# Check the data types of each column
print(data.dtypes)

# Get a summary of the dataset
print(data.describe())
```

## Step 3:

### Description:
Perform basi

# Generate and execute Python code for each planning step

In [None]:
class PythonAgentWorkflow:
    def __init__(self):
        pass

    def extract_pattern(self, text, pattern):
        """
        Extracts a pattern from the given text.

        Args:
            text (str): The text to search within.
            pattern (str): The regex pattern to search for.

        Returns:
            str: The extracted pattern or None if not found.
        """
        match = re.search(pattern, text, flags=re.DOTALL)
        if match:
            return match.group(1).strip()
        return None

    def extract_step_i(self, planning_result, i, n_step):
        """
        Extracts the content of a specific step from the planning result.

        Args:
            planning_result (str): The planning result text.
            i (int): The step number to extract.
            n_step (int): The total number of steps.

        Returns:
            str: The extracted step content or None if not found.
        """
        if i < n_step:
            pattern = rf'## Step {i}:(.*?)## Step {i+1}'
        elif i == n_step:
            pattern = rf'## Step {i}:(.*)'
        else:
            print(f"Invalid step number {i}. It should be between 1 and {n_step}.")
            return None

        step_i = self.extract_pattern(planning_result, pattern)
        if not step_i:
            print(f"Failed to extract Step {i} content.")
            return None

        return step_i

    def extract_code(self, python_agent_result):
        """
        Extracts the Python code block from the Python agent's response.

        Args:
            python_agent_result (str): The response content from the Python agent.

        Returns:
            tuple: A tuple containing the extracted Python code (or None) and a retry flag.
        """
        retry = False
        print("### Extracting Python code")
        python_code = self.extract_pattern(python_agent_result, r'```python(.*?)```')
        if not python_code:
            retry = True
            print("Python function failed to generate or wrong output format. Setting retry to True.")

        return python_code, retry

    def run_python_agent(self, query):
        """
        Sends a user query to a Python agent and returns the response.

        Args:
            query (str): The user query to be sent to the Python agent.

        Returns:
            str: The response content from the Python agent.
        """
        print("### Run Python agent")
        print(f"User query: {query}")
        try:
            # mistralai 1.x exposes agent completions as `client.agents.complete`.
            response = client.agents.complete(
                agent_id=python_agent_id,
                messages=[
                    {
                        "role": "user",
                        "content": query,
                    },
                ],
            )
            # The SDK returns a typed response object; read the text via attributes.
            return response.choices[0].message.content
        except Exception as e:
            print(f"Request failed: {e}. Please check your request.")
            return None

    def check_code(self, python_function, state):
        """
        Executes the Python code and checks for any errors.

        Args:
            python_function (str): The Python code to be executed.
            state (dict): The namespace in which the code runs; it is shared across steps.

        Returns:
            bool: A flag indicating whether the code execution needs to be retried.

        Warning:
            This method runs model-generated code, which may not be entirely reliable.
            It's strongly recommended to run this in a sandbox environment.
        """
        retry = False
        try:
            print(f"### Python function to run: {python_function}")
            exec(python_function, state)
            print("Code executed successfully.")
        except Exception as e:
            # Report the error so it also appears in the captured output.
            print(f"Code failed: {e}")
            retry = True
            print("Setting retry to True")
        return retry

    def process_step(self, planning_result, i, n_step, max_retries, state):
        """
        Processes a single step, including retries.

        Args:
            planning_result (str): The planning result text.
            i (int): The step number to process.
            n_step (int): The total number of steps.
            max_retries (int): The maximum number of attempts.
            state (dict): The namespace shared by all executed steps.

        Returns:
            None
        """

        retry = True
        j = 0
        while j < max_retries and retry:
            print(f"TRY # {j}")
            j += 1
            step_i = self.extract_step_i(planning_result, i, n_step)
            if step_i:
                print(step_i)
                python_agent_result = self.run_python_agent(step_i)
                if python_agent_result is None:
                    # The request failed, so there is nothing to parse; try again.
                    continue
                python_code, retry = self.extract_code(python_agent_result)
                print(python_code)
                # Only execute when a code block was actually extracted.
                if not retry:
                    retry = self.check_code(python_code, state)
        return None

    def workflow(self, planning_result):
        """
        Executes the workflow for processing planning results.

        Args:
            planning_result (str): The planning result text.
        """
        state = {}
        print("### ENTER WORKFLOW")
        n_step = int(self.extract_pattern(planning_result, r'## Total number of steps:\s*(\d+)'))
        for i in range(1, n_step + 1):
            print(f"STEP # {i}")
            self.process_step(planning_result, i, n_step, max_retries=2, state=state)


        print("### Exit WORKFLOW")
        return None
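
The workflow passes one `state` dictionary to every `exec` call, which is what lets a later step use variables defined by an earlier one. Here is a minimal sketch of that mechanism, with hard-coded snippets in place of agent-generated code. The `run_snippet` helper and the `snippets` list are hypothetical. As with the workflow itself, untrusted code should only be passed to `exec` inside a sandbox.

```python
def run_snippet(code: str, state: dict) -> bool:
    """Execute code in the shared namespace; return True if it succeeded."""
    try:
        exec(code, state)
        return True
    except Exception as e:
        print(f"Code failed: {e}")
        return False


state = {}
# Hypothetical snippets standing in for the code generated for three steps.
snippets = [
    "values = [1, 2, 3]",           # step 1 defines a variable
    "total = sum(values)",          # step 2 reuses it through the shared state
    "broken = undefined_name + 1",  # step 3 fails without stopping the loop
]

outcomes = [run_snippet(code, state) for code in snippets]
print(outcomes, state["total"])
```

A failed snippet leaves the names defined by earlier snippets intact, so a retry of that step can still build on them.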

In [None]:
import sys
import io

# Echo print output to the console while also capturing it in a variable.
class Tee(io.StringIO):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.original_stdout = sys.stdout

    def write(self, data):
        self.original_stdout.write(data)
        super().write(data)

    def flush(self):
        self.original_stdout.flush()
        super().flush()

# Create an instance of the Tee class
tee_stream = Tee()

# Redirect stdout to the Tee instance
sys.stdout = tee_stream

try:
    python_agent = PythonAgentWorkflow()
    python_agent.workflow(planning_result)
finally:
    # Restore the original stdout even if the workflow raises
    sys.stdout = tee_stream.original_stdout

# Get the captured output
captured_output = tee_stream.getvalue()
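
An alternative to reassigning `sys.stdout` by hand is the standard library's `contextlib.redirect_stdout`, which restores the original stream automatically, even if the wrapped code raises. The sketch below is one assumed way to get the same tee behaviour; `TeeStream` and `noisy_task` are hypothetical names.

```python
import contextlib
import io
import sys


class TeeStream(io.StringIO):
    """Write to the wrapped stream and keep a copy in memory."""

    def __init__(self, target):
        super().__init__()
        self.target = target

    def write(self, data):
        self.target.write(data)
        return super().write(data)


def noisy_task():
    # Stand-in for the workflow call whose prints we want to capture.
    print("step 1 done")
    print("step 2 done")


tee = TeeStream(sys.stdout)
with contextlib.redirect_stdout(tee):
    noisy_task()

# Outside the `with` block, sys.stdout is the original stream again.
captured = tee.getvalue()
```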

### ENTER WORKFLOW
STEP # 1
TRY # 0
### Description:
Load the dataset from the provided URL.

### Detailed Instructions:
1. Use the pandas library to read the CSV file from the URL.
2. Store the data in a DataFrame.

### Suggested Python Code:
```python
import pandas as pd

# Load the dataset
url = "https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv"
data = pd.read_csv(url)
```
### Run Python agent
User query: ### Description:
Load the dataset from the provided URL.

### Detailed Instructions:
1. Use the pandas library to read the CSV file from the URL.
2. Store the data in a DataFrame.

### Suggested Python Code:
```python
import pandas as pd

# Load the dataset
url = "https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv"
data = pd.read_csv(url)
```
### Extracting Python code
import pandas as pd

# Load the dataset
url = "https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv

# Summarization

In [None]:
# Send the dataset description plus the captured execution log to the summarization agent.
response = client.agents.complete(
    agent_id=summarization_agent_id,
    messages=[
        {
            "role": "user",
            "content": query + captured_output,
        },
    ],
)
result = response.choices[0].message.content


In [None]:
print(result)

### Analysis Report

#### Dataset Description
The dataset contains information on driving statistics and insurance data for 51 states in the United States. The dataset includes the following columns:
- **State**: The name of the state.
- **Number of drivers involved in fatal collisions per billion miles**: The rate of drivers involved in fatal collisions per billion miles driven.
- **Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding**: The percentage of drivers involved in fatal collisions who were speeding.
- **Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired**: The percentage of drivers involved in fatal collisions who were alcohol-impaired.
- **Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted**: The percentage of drivers involved in fatal collisions who were not distracted.
- **Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents**: The percentage of drivers inv