# Building a Multi-Agent AI Solution

Daniel Godden

This document walks through the process of building a multi-agent approach to solving generative AI problems. By breaking a problem down into smaller tasks, each handled by a dedicated agent, multi-agent AI can solve more complex problems. The solution is first built in Python and then adapted to AutoGen Studio. It takes a PDF file and creates an article or blog post. To achieve this, the problem is broken down into agents performing the following tasks: PDF extraction, summarisation, keyword extraction, title generation, and blog formatting. The LLM used throughout this project is OpenAI's GPT-4.

## Python Based Multi-Agent Approach

### Workflow

1. **PDF Extraction**: Extracts text from an uploaded PDF document.
2. **Summarisation**: Breaks down the text into smaller chunks and generates summaries using GPT-4.
3. **Keyword Extraction**: Extracts keywords and key topics from the text.
4. **Title Generation**: Creates a title for the blog post.
5. **Blog Formatting**: Combines the title, summary, and keywords into a well-structured blog post.
6. **Coordinator**: Orchestrates the other agents, running them in sequence and passing each result to the next step.

### Technology Stack

- **OpenAI GPT-4**: Used for generating summaries, extracting keywords, and creating titles.
- **PyPDF2**: Python library used to extract text from PDF files.
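- **CrewAI**: A framework for coordinating multi-agent systems so that agents can work together. The code in this section does not import it; coordination is handled by a plain Python coordinator class.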

The following code loads environment variables from a `.env` file and retrieves the OpenAI API key from them.

In [None]:
from dotenv import load_dotenv
import os

# Load the environment variables from .env file
load_dotenv()

# Retrieve the API key from environment variables
api_key = os.getenv("OPENAI_API_KEY")
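
If the `.env` file is missing or the variable is misnamed, `os.getenv` returns `None` and the problem only surfaces later, at the first API call. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper name, not part of the original notebook.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set; check your .env file.")
    return value

# Usage (assumes load_dotenv() has already been called):
# api_key = require_env("OPENAI_API_KEY")
```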

#### 1. PDF Extraction Agent

This agent extracts the text from the PDF document. 

In [None]:
import PyPDF2

class PDFExtractionAgent:
    """
    A class for extracting text from a PDF file.
    """

    def extract_text(self, pdf_path):
        """
        Extracts text from the given PDF file.

        Args:
            pdf_path (str): Path to the PDF file.

        Returns:
            str: The extracted text from the PDF.

        Example:
            agent = PDFExtractionAgent()
            text = agent.extract_text('example.pdf')
        """
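        with open(pdf_path, 'rb') as file:
            reader = PyPDF2.PdfReader(file)
            text = ""
            for page in reader.pages:
                # A page without a text layer (e.g. a scanned image) may yield no text;
                # fall back to an empty string and separate pages with a newline
                text += (page.extract_text() or "") + "\n"
        return text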

#### 2. Summarisation Agent

This agent splits the extracted text into smaller chunks and generates a summary for each chunk using GPT-4.

In [None]:
import openai

# Set up OpenAI API key
openai.api_key = api_key

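TOKEN_LIMIT = 2000  # Maximum chunk size, counted in words as a rough proxy for tokens

def split_text(text, max_tokens):
    """
    Splits the input text into smaller chunks based on an approximate token limit.

    Sentences are split on full stops and sizes are counted in whitespace-separated
    words, which only approximates the model's token count. A single sentence longer
    than the limit becomes its own chunk.

    Args:
        text (str): The text to split.
        max_tokens (int): The maximum number of words per chunk.

    Returns:
        list: A list of text chunks that fit within the limit.

    Example:
        chunks = split_text(text, 2000)
    """
    sentences = text.split('.')
    chunks = []
    current_chunk = ''

    for sentence in sentences:
        if not sentence.strip():
            continue  # Skip empty fragments, e.g. the one after the final full stop
        if len(current_chunk.split()) + len(sentence.split()) <= max_tokens:
            current_chunk += sentence + '.'
        else:
            if current_chunk:  # Avoid appending an empty first chunk
                chunks.append(current_chunk)
            current_chunk = sentence + '.'

    if current_chunk:
        chunks.append(current_chunk)

    return chunks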
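
`split_text` counts whitespace-separated words rather than model tokens. As a rough sanity check, the sketch below estimates token usage with the common rule of thumb that one token is about three quarters of an English word, and compares the total against the 8,192-token context window of the base `gpt-4` model. The names `estimate_tokens` and `fits_context`, and the `prompt_overhead` allowance, are illustrative assumptions; a tokenizer library such as `tiktoken` would give exact counts.

```python
import math

GPT4_CONTEXT_WINDOW = 8192  # context size of the base gpt-4 model, in tokens

def estimate_tokens(text):
    """Rough estimate: about 4 tokens for every 3 English words (heuristic, not exact)."""
    return math.ceil(len(text.split()) * 4 / 3)

def fits_context(chunk, max_completion_tokens, prompt_overhead=50):
    """Check that prompt plus completion plausibly fit in the context window.

    prompt_overhead is an assumed allowance for the system and instruction text.
    """
    needed = estimate_tokens(chunk) + prompt_overhead + max_completion_tokens
    return needed <= GPT4_CONTEXT_WINDOW

chunk = "word " * 2000                # a chunk at the 2000-word limit
print(estimate_tokens(chunk))         # 2667
print(fits_context(chunk, 2000))      # True
```

Under this estimate, a 2000-word chunk combined with a 2000-token completion stays well inside the window, so the limits used in this notebook leave some headroom.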

class SummarisationAgent:
    """
    A class to generate text summaries using the OpenAI GPT model.
    """

    def generate_summary(self, text, chunk_size=TOKEN_LIMIT):
        """
        Generates a summary of the input text by splitting it into chunks.

        Args:
            text (str): The text to summarize.
            chunk_size (int): Maximum number of tokens per chunk (default is TOKEN_LIMIT).

        Returns:
            str: The combined summary of the input text.

        Example:
            agent = SummarisationAgent()
            summary = agent.generate_summary(text)
        """
        text_chunks = split_text(text, chunk_size)
        summary_chunks = []

        for chunk in text_chunks:
            prompt = f"Summarize the following text for a blog post:\n\n{chunk}"
            response = openai.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": prompt}
                ],
                temperature=0.7,
                max_tokens=2000,
            )
            summary = response.choices[0].message.content
            summary_chunks.append(summary)
        
        final_summary = ' '.join(summary_chunks)
        
        return final_summary
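
Each chunk triggers a separate API call, so a long PDF can hit rate limits or transient network errors partway through. The following is a generic retry-with-exponential-backoff sketch. `call_with_retries` and its parameters are hypothetical and not part of the OpenAI library; in practice you would narrow `retry_on` to the exception types you actually expect, for example `openai.RateLimitError`.

```python
import time

def call_with_retries(fn, max_attempts=4, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt seconds and try again."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

# Demo with a stand-in function that fails twice before succeeding
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "summary text"

result = call_with_retries(flaky, base_delay=0.01)
print(result, calls["n"])  # summary text 3
```

Inside `generate_summary`, the call to `openai.chat.completions.create(...)` could be wrapped in a `lambda` and passed to this helper.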

#### 3. Keyword Agent

This agent extracts keywords from the text using GPT-4.

In [None]:
class KeywordAgent:
    """
    A class to extract keywords from a given text using the OpenAI GPT model.
    """

    def extract_keywords(self, text, chunk_size=TOKEN_LIMIT):
        """
        Extracts keywords from the provided text by splitting it into smaller chunks.

        Args:
            text (str): The text from which to extract keywords.
            chunk_size (int): The maximum number of tokens allowed per chunk (default is TOKEN_LIMIT).

        Returns:
            str: A string of extracted keywords, separated by commas.

        Example:
            agent = KeywordAgent()
            keywords = agent.extract_keywords(text)
        """
        text_chunks = split_text(text, chunk_size)
        keyword_chunks = []

        for chunk in text_chunks:
            prompt = f"Extract keywords from the following text:\n\n{chunk}"
            response = openai.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": prompt}
                ],
                temperature=0.5,
                max_tokens=200,
            )
            keywords = response.choices[0].message.content
            keyword_chunks.append(keywords)
        
        final_keywords = ', '.join(keyword_chunks)
        return final_keywords
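
Because keywords are extracted per chunk and then joined, the same keyword can appear several times in the final string. A minimal post-processing sketch follows, assuming the model returns comma-separated keywords (it may instead return a numbered or bulleted list, which this does not handle). `dedupe_keywords` is a hypothetical helper.

```python
def dedupe_keywords(keyword_string):
    """Split on commas, trim whitespace, and drop case-insensitive duplicates in first-seen order."""
    seen = set()
    unique = []
    for raw in keyword_string.split(","):
        keyword = raw.strip()
        key = keyword.lower()
        if keyword and key not in seen:
            seen.add(key)
            unique.append(keyword)
    return ", ".join(unique)

print(dedupe_keywords("LLM, generative AI, llm, prompts, Generative AI"))
# LLM, generative AI, prompts
```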

#### 4. Title Agent

This agent generates a title for the blog post.

In [None]:
class TitleAgent:
    """
    A class to generate an engaging title for a blog post using the OpenAI GPT model.
    """

    def generate_title(self, text, chunk_size=TOKEN_LIMIT):
        """
        Generates a title for the provided text by using the first chunk of the content.

        Args:
            text (str): The content based on which the title is generated.
            chunk_size (int): The maximum number of tokens in the first chunk (default is TOKEN_LIMIT).

        Returns:
            str: A generated title for the blog post.

        Example:
            agent = TitleAgent()
            title = agent.generate_title(text)
        """
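        # Use only the first non-empty chunk for title generation;
        # fall back to the full text if no chunk is available
        chunks = split_text(text, chunk_size)
        text_chunk = next((chunk for chunk in chunks if chunk.strip()), text)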
        prompt = f"Generate an engaging title for a blog post based on this content:\n\n{text_chunk}"
        response = openai.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.7,
            max_tokens=60,
        )
        title = response.choices[0].message.content
        return title
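
The model's reply is used verbatim as the title. Chat models sometimes wrap a title in quotation marks or prefix it with a label, so a small clean-up step can help. `clean_title` is a hypothetical helper, and the patterns it handles are assumptions about possible output rather than behaviour observed in this notebook.

```python
def clean_title(raw_title):
    """Keep the first non-empty line, then drop a leading 'Title:' label and surrounding quotes."""
    lines = [line.strip() for line in raw_title.splitlines() if line.strip()]
    if not lines:
        return ""
    title = lines[0]
    if title.lower().startswith("title:"):
        title = title[len("title:"):].strip()
    return title.strip("\"'“”")

print(clean_title('Title: "Demystifying Large Language Models"\n'))
# Demystifying Large Language Models
```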

#### 5. Blog Formatting Agent

This agent combines the title, summary, and keywords into a formatted blog post.

In [None]:
class FormattingAgent:
    """
    A class to format a blog post by combining a title, summary, and keywords.
    """

    def format_blog_post(self, title, summary, keywords):
        """
        Formats a blog post with a given title, summary, and keywords.

        Args:
            title (str): The title of the blog post.
            summary (str): The summary or main content of the blog post.
            keywords (str): A string of keywords related to the blog post.

        Returns:
            str: A formatted blog post with the title, summary, and keywords.

        Example:
            agent = FormattingAgent()
            blog_post = agent.format_blog_post("Blog Title", "This is a summary.", "keyword1, keyword2")
        """
        blog_post = f"# {title}\n\n{summary}\n\n**Keywords**: {keywords}\n"
        return blog_post
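
To make the output shape concrete, the sketch below applies the same template as `FormattingAgent.format_blog_post` to sample values and shows how the part before the keywords line can be recovered. The standalone function and the sample strings are for illustration only.

```python
def format_blog_post(title, summary, keywords):
    # Same template as FormattingAgent.format_blog_post
    return f"# {title}\n\n{summary}\n\n**Keywords**: {keywords}\n"

post = format_blog_post("Blog Title", "This is a summary.", "keyword1, keyword2")
print(post)
# # Blog Title
#
# This is a summary.
#
# **Keywords**: keyword1, keyword2

# Everything before the keywords marker: the title line and the summary
body = post.split("**Keywords**")[0].rstrip()
print(body.splitlines()[0])  # # Blog Title
```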

#### 6. Coordinator

The coordinator class manages all the agents, orchestrating the workflow from PDF extraction to blog post generation. 

In [None]:
class BlogPostGeneratorCoordinator:
    """
    A class to coordinate the generation of a complete blog post from a PDF file 
    by leveraging multiple agents (PDF extraction, summarisation, keyword extraction, title generation, and formatting).
    """

    def __init__(self):
        """
        Initializes the BlogPostGeneratorCoordinator by creating instances of the required agents:
        - PDFExtractionAgent for extracting text from PDF.
        - SummarisationAgent for summarising the extracted text.
        - KeywordAgent for extracting keywords.
        - TitleAgent for generating a title.
        - FormattingAgent for formatting the blog post.
        """
        self.pdf_agent = PDFExtractionAgent()
        self.summary_agent = SummarisationAgent()
        self.keyword_agent = KeywordAgent()
        self.title_agent = TitleAgent()
        self.formatting_agent = FormattingAgent()

    def generate_blog_post_from_pdf(self, pdf_path):
        """
        Generates a complete blog post from a PDF file by coordinating the use of several agents.

        Args:
            pdf_path (str): The file path to the PDF document.

        Returns:
            str: A formatted blog post containing a title, summary, and keywords, based on the content of the PDF.

        Example:
            coordinator = BlogPostGeneratorCoordinator()
            blog_post = coordinator.generate_blog_post_from_pdf('document.pdf')
        """
        # Agent 1: Extract text from PDF
        pdf_text = self.pdf_agent.extract_text(pdf_path)

        # Agent 2: Generate a summary for the blog post
        summary = self.summary_agent.generate_summary(pdf_text)

        # Agent 3: Extract keywords
        keywords = self.keyword_agent.extract_keywords(pdf_text)

        # Agent 4: Generate a title for the blog post
        title = self.title_agent.generate_title(pdf_text)

        # Agent 5: Format the blog post
        blog_post = self.formatting_agent.format_blog_post(title, summary, keywords)

        return blog_post
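
The summary, keyword, and title steps each depend only on the extracted text, not on one another, so they could run concurrently to reduce wall-clock time. The sketch below uses a thread pool, with stub functions standing in for the agents' API calls. `run_independent_steps` and the stubs are hypothetical. Concurrent calls also reach API rate limits sooner, so this is a trade-off rather than a strict improvement.

```python
from concurrent.futures import ThreadPoolExecutor

def run_independent_steps(text, summarise, extract_keywords, generate_title):
    """Run the three text-only steps concurrently and return (summary, keywords, title)."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        summary_future = pool.submit(summarise, text)
        keywords_future = pool.submit(extract_keywords, text)
        title_future = pool.submit(generate_title, text)
        # .result() blocks until the step finishes and re-raises any exception it hit
        return summary_future.result(), keywords_future.result(), title_future.result()

# Stubs standing in for the agents' methods (no API calls are made here)
summary, keywords, title = run_independent_steps(
    "some extracted text",
    summarise=lambda t: f"Summary of: {t}",
    extract_keywords=lambda t: "text, extraction",
    generate_title=lambda t: t.title(),
)
print(title)  # Some Extracted Text
```

In the coordinator, the three callables would be `self.summary_agent.generate_summary`, `self.keyword_agent.extract_keywords`, and `self.title_agent.generate_title`.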

Now that the agents have been configured, the process can be invoked. This creates a text file and a Word document containing the blog post.

In [None]:
from docx import Document

if __name__ == "__main__":
    # Define the path to the PDF file that will be used for generating the blog post
    pdf_path = "C:/Users/DanielGodden/Downloads/Generative-AI-and-LLMs-for-Dummies.pdf"

    # Create an instance of BlogPostGeneratorCoordinator to handle the blog post generation
    coordinator = BlogPostGeneratorCoordinator()

    # Generate the blog post by extracting content from the specified PDF
    blog_post = coordinator.generate_blog_post_from_pdf(pdf_path)

    # Remove the "Keywords" section from the blog post before saving to a text file
    blog_post_without_keywords = blog_post.split("**Keywords**")[0]  # Split and exclude keywords section

    # Save the blog post (without keywords) to a text file; UTF-8 avoids
    # encoding errors on Windows when the text contains non-ASCII characters
    with open("generated_blog_post.txt", "w", encoding="utf-8") as text_file:
        text_file.write(blog_post_without_keywords)  # Write blog post content to the file
    print(blog_post_without_keywords)  # Optionally print the blog post to the console

    # Create a new Word document object to save the blog post in DOCX format
    doc = Document()

    # Split the blog post content into lines for further processing
    lines = blog_post_without_keywords.split('\n')

    # Extract and format the title from the first line of the blog post
    title = lines[0].strip("# ").strip()  # Remove the markdown-style "#" and surrounding spaces

    # Add the extracted title as a heading to the Word document
    doc.add_heading(title, 0)

    # Add the body of the blog post (everything after the title line) as a paragraph,
    # so the title is not repeated with its markdown "#" prefix
    body = '\n'.join(lines[1:]).strip()
    doc.add_paragraph(body)

    # Save the blog post as a Word document with the specified filename
    doc.save("blog_post.docx")
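
The script above hard-codes the PDF path. A small sketch of reading it from the command line instead, using `argparse` from the standard library, is shown below. The argument and option names are illustrative assumptions.

```python
import argparse

def parse_args(argv=None):
    """Parse the PDF path and optional output file names from the command line."""
    parser = argparse.ArgumentParser(description="Generate a blog post from a PDF.")
    parser.add_argument("pdf_path", help="Path to the source PDF file")
    parser.add_argument("--txt-out", default="generated_blog_post.txt", help="Text output file")
    parser.add_argument("--docx-out", default="blog_post.docx", help="Word output file")
    return parser.parse_args(argv)

# Passing a list simulates command-line arguments; omit it to read sys.argv
args = parse_args(["report.pdf", "--docx-out", "report_post.docx"])
print(args.pdf_path, args.txt_out, args.docx_out)
# report.pdf generated_blog_post.txt report_post.docx
```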

## Setting Up and Running AutoGen Studio

### Introduction to AutoGen Studio

AutoGen Studio is a tool designed to simplify the creation and management of AI-driven agents. These agents can automate tasks, interact with users, or perform operations based on predefined instructions or AI models such as GPT-4. AutoGen Studio provides a user interface for designing, testing, and deploying these agents.

This guide walks through the steps to set up AutoGen Studio in your development environment and get it running.

### Prerequisites

Before you begin, ensure that you have Python installed on your machine. This guide assumes you are using an IDE (Integrated Development Environment) like Visual Studio Code.

### Step-by-Step Setup Guide

#### 1. Create a New Project Folder

1. Create a new folder where you want to store your project files.
2. Open this folder in your IDE.

#### 2. Set Up a Virtual Environment

1. Open the terminal in your IDE.
2. Create a virtual environment by typing the following command:
    ```bash
    python -m venv .venv
    ```
3. Activate the virtual environment:
    - On Windows, use:
      ```bash
      .venv\Scripts\activate
      ```

#### 3. Install AutoGen Studio

1. With your virtual environment activated, install AutoGen Studio using pip:
    ```bash
    pip install autogenstudio
    ```
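
To confirm the setup so far, the following sketch checks from Python whether a virtual environment is active and whether the package can be found. The helper names are hypothetical; the venv test relies on `sys.prefix` differing from `sys.base_prefix` inside a virtual environment.

```python
import importlib.util
import sys

def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix

def is_installed(package_name):
    """True when the named top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

print("virtual environment active:", in_virtualenv())
print("autogenstudio installed:", is_installed("autogenstudio"))
```
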
#### 4. Clear the Terminal Screen (Optional)

- To keep your terminal clean, you can clear the screen:
    - On Windows:
      ```bash
      cls
      ```

#### 5. Set Your OpenAI API Key

1. Visit the OpenAI API Keys page: [OpenAI API Keys](https://platform.openai.com/api-keys).
2. Create a new secret key.
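3. Set the API key in your terminal (Command Prompt):
    ```bash
    set OPENAI_API_KEY=your-secret-key
    ```
   Replace `your-secret-key` with the actual key you generated. Do not wrap the value in quotes, because the Command Prompt `set` command stores them as part of the value. In PowerShell, use `$env:OPENAI_API_KEY = "your-secret-key"` instead.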

#### 6. Start AutoGen Studio

1. Start the AutoGen Studio UI by typing the following command in your terminal:
    ```bash
    autogenstudio ui --port 8081
    ```
2. The terminal will display a message similar to:
    ```
    Uvicorn running on http://127.0.0.1:8081
    ```
3. Hold `CTRL` and click on the provided address (`http://127.0.0.1:8081`). This will open the AutoGen Studio UI in your default web browser.
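
If the page does not load, it can help to check whether anything is listening on that address. A minimal sketch using the standard `socket` module follows; `is_port_open` is a hypothetical helper, and the host and port are taken from the Uvicorn message above.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Prints True while the server is running, False otherwise
print(is_port_open("127.0.0.1", 8081))
```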

## AutoGen Studio UI Overview

AutoGen Studio is a platform for creating and managing automated workflows built from AI-driven agents. The interface is designed to be accessible to both beginners and advanced users. The UI is divided into two main sections: **Build** and **Playground**.

### Build Section

The **Build** section is where you design, configure, and manage the components that drive your automated processes. It is divided into four main subsections:

#### 1. Skills
In the **Skills** section, you define the specific abilities your AI models will use. Skills can range from natural language processing to data manipulation, and they are the building blocks for your workflows. Here, you can create, edit, and manage these skills, tailoring them to suit the needs of your projects.

#### 2. Models
The **Models** section is where you select and configure the AI models that power your skills. AutoGen Studio supports a variety of models, allowing you to choose the most appropriate one based on your task requirements. You can also manage model versions, update configurations, and monitor performance metrics.

#### 3. Agents
In the **Agents** section, you design and manage the agents that will execute tasks using the skills and models you've set up. Agents are essentially the active entities that perform actions based on the input they receive. You can customize their behavior, set triggers, and define how they interact with other agents or external systems.

#### 4. Workflows
The **Workflows** section ties everything together. Here, you create automated processes by linking skills, models, and agents into cohesive workflows. Workflows are visualized as flowcharts, making it easy to understand the sequence of operations. You can drag and drop components, set conditions, and define branching logic to create complex automation tasks.

### Playground Section

The **Playground** section is your testing ground. Once you've built your workflows, you can simulate them in the Playground to see how they perform in real-time. This section allows you to tweak parameters, observe outputs, and make adjustments before deploying your workflows in a live environment.

### Setting Up Your Environment

#### Step 1: Define Your Skills
Start by navigating to the **Skills** section in the Build tab. Here, you'll define the core capabilities your agents will need. For example, if you're building a chatbot, you might create skills for understanding user queries, retrieving information, and generating responses. For the PDF-to-blog problem above, the code from the PDF extraction and summarisation agents can be adapted into skills.
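
A skill is typically written as a self-contained Python function. The sketch below shows how the PDF extraction agent might be adapted into one. The function name, the `max_chars` parameter, and its default value are assumptions made for illustration, not requirements of AutoGen Studio.

```python
def extract_pdf_text(pdf_path: str, max_chars: int = 20000) -> str:
    """
    Extract text from a PDF file and return at most max_chars characters.

    max_chars is an assumed safeguard to keep the result within the model's context.
    """
    # Imported inside the function so the skill carries its own dependency
    from PyPDF2 import PdfReader

    with open(pdf_path, "rb") as file:
        reader = PdfReader(file)
        # extract_text() may yield nothing for image-only pages, hence the `or ""`
        pages = [page.extract_text() or "" for page in reader.pages]
    return "\n".join(pages)[:max_chars]
```

The docstring matters here: it is the description an agent would rely on when deciding how to call the function.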

#### Step 2: Choose and Configure Models
Next, move to the **Models** section. Select an appropriate model for each skill you've defined. Configure the models according to your specific needs, adjusting parameters such as input data types, output formats, and processing speed.

#### Step 3: Design Your Agents
In the **Agents** section, create agents that will use the skills and models you've set up. Define their roles, set triggers for when they should act, and determine how they should interact with other agents or external systems. You can reuse the settings from the Python agents above, such as the model and temperature, and provide each agent's prompt as its message.
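
To keep the settings in one place while entering them in the UI, they can be collected in a plain Python dictionary. The instructions, temperatures, and token limits below are taken from the Python agents earlier in this document. The agent names and the dictionary layout are assumptions and do not represent AutoGen Studio's own configuration format.

```python
# Plain-Python summary of the settings used earlier; not AutoGen Studio's config schema
AGENT_SETTINGS = {
    "summariser": {
        "instruction": "Summarize the following text for a blog post.",
        "temperature": 0.7,
        "max_tokens": 2000,
    },
    "keyword_extractor": {
        "instruction": "Extract keywords from the following text.",
        "temperature": 0.5,
        "max_tokens": 200,
    },
    "title_writer": {
        "instruction": "Generate an engaging title for a blog post based on this content.",
        "temperature": 0.7,
        "max_tokens": 60,
    },
}

for name, settings in AGENT_SETTINGS.items():
    print(f"{name}: temperature={settings['temperature']}, max_tokens={settings['max_tokens']}")
```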

#### Step 4: Build Workflows
With your agents in place, head to the **Workflows** section. Start by dragging and dropping skills, models, and agents onto the workflow canvas. Connect these components to form a logical sequence of operations. Add conditions, loops, and branches as needed to handle different scenarios.

#### Step 5: Test in the Playground
Before deploying your workflow, test it in the **Playground** section. Run simulations to ensure everything functions as expected. Use this opportunity to fine-tune your setup, making any necessary adjustments to improve performance or accuracy.

### Conclusion

AutoGen Studio's interface and modular design support building and deploying automation workflows without writing all of the orchestration code by hand. By following the structured process of defining skills, configuring models, designing agents, and constructing workflows, you can create AI-driven solutions tailored to your specific needs. The Playground section offers a safe environment to test and refine your workflows before they are used in a real-world application.
