# AutoBuild
AutoGen offers conversable agents powered by LLMs, tools, or humans, which can collectively perform tasks via automated chat. The framework supports tool use and human participation through multi-agent conversation.
Documentation for this feature is available [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).

In this notebook, we introduce a new class, `AgentBuilder`, that helps users build an automatic task-solving process powered by a multi-agent system. The building pipeline consists of two main calls, `build()` and `start()`. In `build()`, we prompt an LLM to create multiple participant agents, initialize a group chat, and decide whether the task requires programming. The user can then call `start()` at an appropriate time to complete the task. AutoBuild also supports open-source LLMs through [vLLM](https://docs.vllm.ai/en/latest/index.html) and [FastChat](https://github.com/lm-sys/FastChat). See the list of supported models [here](https://docs.vllm.ai/en/latest/models/supported_models.html).

## Requirement

AutoBuild requires the latest version of AutoGen.
You can install AutoGen with the following command:

In [None]:
!pip install pyautogen

## Step 1: prepare configuration
Prepare a `config_path` for the assistant agents to restrict the LLMs that can be used in this task. This config can be the path to a JSON file or the name of an environment variable. A `default_llm_config` is also required to initialize LLM-specific settings such as seed and temperature.

In [1]:
config_path = '/home/elpis_ubuntu/LLM/autogen/OAI_CONFIG_LIST'  # modify path
default_llm_config = {
    'temperature': 0
}
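For reference, the file behind `config_path` is typically a JSON list with one entry per allowed model. The sketch below uses only the standard library: it writes a small hypothetical `OAI_CONFIG_LIST` to a temporary directory and reads back the model names. The `model` and `api_key` field names follow the common AutoGen config-list layout and are assumptions here, the key values are placeholders, and `models_in_config` is a hypothetical helper, not part of AutoGen.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical config list: one entry per model the agents may use.
# The api_key values are placeholders, not real credentials.
sample_config_list = [
    {"model": "gpt-4-1106-preview", "api_key": "<your OpenAI API key>"},
    {"model": "gpt-3.5-turbo", "api_key": "<your OpenAI API key>"},
]


def models_in_config(path):
    """Return the model names listed in a JSON config-list file."""
    with open(path, encoding="utf-8") as f:
        return [entry["model"] for entry in json.load(f)]


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "OAI_CONFIG_LIST"
    sample_path.write_text(json.dumps(sample_config_list, indent=4), encoding="utf-8")
    available_models = models_in_config(sample_path)

print(available_models)
```

Checking the model names this way before building can catch a mismatch between the config file and the `builder_model` or `agent_model` you intend to use.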

## Step 2: create an AgentBuilder
Create an `AgentBuilder` with the specified `config_path`. By default, AgentBuilder uses GPT-4 to complete the whole process; you can set `builder_model` to another OpenAI model if you prefer. You can also specify an OpenAI or open-source LLM as the agent backbone (`agent_model`); see the blog for more details.

In [2]:
from autogen.agentchat.contrib.agent_builder import AgentBuilder

builder = AgentBuilder(config_path=config_path, builder_model='gpt-4-1106-preview', agent_model='gpt-4-1106-preview')

## Step 3: specify a building task

Specify a building task with a general description. The building task helps the build manager (an LLM) decide which agents should be built.

In [3]:
building_task = "Find a paper on arxiv by programming, and analysis its application in some domain. For example, find a latest paper about gpt-4 on arxiv and find its potential applications in software."

## Step 4: build group chat agents
Use `build()` to let the build manager (the specified `builder_model`) generate the group chat agents. If you think coding is necessary for your task, you can pass `coding=True` to add a user proxy (an automatic code interpreter) to the agent list, like:
```python
builder.build(building_task, default_llm_config, coding=True)
```
If `coding` is not specified, AgentBuilder will determine on its own whether to add the user proxy, based on the task.

In [4]:
builder.build(building_task, default_llm_config)

Generating agents...
Data_scientist,Domain_expert,Software_engineer are generated.
Preparing configuration for Data_scientist...
Preparing configuration for Domain_expert...
Preparing configuration for Software_engineer...
Creating agent Data_scientist with backbone gpt-4-1106-preview...
Creating agent Domain_expert with backbone gpt-4-1106-preview...
Creating agent Software_engineer with backbone gpt-4-1106-preview...
Adding user console proxy...
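
The optional `coding` argument behaves like a three-state switch: `True`, `False`, or left unspecified so that the builder decides. The sketch below illustrates that pattern only. `resolve_coding` and `task_needs_code` are hypothetical names, and the keyword check is a stand-in for the LLM query the builder would make; it is not AgentBuilder's implementation.

```python
from typing import Callable, Optional


def task_needs_code(task: str) -> bool:
    """Hypothetical stand-in for asking an LLM whether a task needs programming."""
    keywords = ("programming", "script", "code")
    return any(word in task.lower() for word in keywords)


def resolve_coding(task: str, coding: Optional[bool] = None,
                   decide: Callable[[str], bool] = task_needs_code) -> bool:
    """Honor an explicit True/False; fall back to `decide` when coding is None."""
    if coding is None:
        return decide(task)
    return coding


example_task = "Find a paper on arxiv by programming."
auto_choice = resolve_coding(example_task)               # decided from the task text
forced_off = resolve_coding(example_task, coding=False)  # explicit value wins
print(auto_choice, forced_off)
```

Keeping "unspecified" distinct from `False` is what allows an explicit `coding=False` to override the automatic decision.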


## Step 5: execute task
Let the agents generated in `build()` complete the task collaboratively in a group chat.

In [5]:
builder.start(task="Find a latest paper about gpt-4 on arxiv and find its potential applications in software.")

User_console_and_Python_code_interpreter (to chat_manager):

Find a latest paper about gpt-4 on arxiv and find its potential applications in software.

--------------------------------------------------------------------------------
Software_engineer (to chat_manager):

To find the latest paper about GPT-4 on arXiv, we can use the arXiv API to search for papers that match the query. We'll write a Python script that uses the `requests` library to make a GET request to the arXiv API with the appropriate query parameters. The script will parse the response to find the most recent paper related to GPT-4 and then print out the title, authors, summary, and potential applications in software.

First, ensure you have the `requests` library installed. If not, you can install it using pip:

```bash
pip install requests
```

Now, here's the Python script to find the latest paper about GPT-4 on arXiv:

```python
import requests
from datetime import datetime
import xml.etree.Eleme
```

execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


>>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is python)...


execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


User_console_and_Python_code_interpreter (to chat_manager):

exitcode: 0 (execution succeeded)
Code output: 

Title: MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
Authors: Zeming Chen, Alejandro Hernández Cano, Angelika Romanou, Antoine Bonnet, Kyle Matoba, Francesco Salvi, Matteo Pagliardini, Simin Fan, Andreas Köpf, Amirkeivan Mohtashami, Alexandre Sallinen, Alireza Sakhaeirad, Vinitra Swamy, Igor Krawczuk, Deniz Bayazit, Axel Marmet, Syrielle Montariol, Mary-Anne Hartley, Martin Jaggi, Antoine Bosselut
Summary: Large language models (LLMs) can potentially democratize access to medical
knowledge. While many efforts have been made to harness and improve LLMs'
medical knowledge and reasoning capacities, the resulting models are either
closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters),
which restricts their abilities. In this work, we improve access to large-scale
medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 

## Step 6 (Optional): clear all agents and prepare for the next task
You can clear all agents generated for this task with the following code if your task is completed or the next task differs substantially from the current one. If an agent's backbone is an open-source LLM, this process will also shut down the endpoint server. If necessary, you can pass `recycle_endpoint=False` to keep the previous open-source LLM's endpoint server running.

In [6]:
builder.clear_all_agents(recycle_endpoint=True)

All agents have been cleared.


## Save & load configs

You can save all the information needed to rebuild the group chat agents. Here is an example for the agents generated in the task above:
```json
{
    "building_task": "Find a paper on arxiv by programming, and analysis its application in some domain. For example, find a latest paper about gpt-4 on arxiv and find its potential applications in software.",
    "agent_configs": [
        {
            "name": "Data_scientist",
            "model": "gpt-4-1106-preview",
            "system_message": "As a Data Scientist, you are tasked with automating the retrieval and analysis of academic papers from arXiv. Utilize your Python programming acumen to develop scripts for gathering necessary information such as searching for relevant papers, downloading them, and processing their contents. Apply your analytical and language skills to interpret the data and deduce the applications of the research within specific domains.\n\n1. To compile information, write and implement Python scripts that search and interact with online resources, download and read files, extract content from documents, and perform other information-gathering tasks. Use the printed output as the foundation for your subsequent analysis.\n\n2. Execute tasks programmatically with Python scripts when possible, ensuring results are directly displayed. Approach each task with efficiency and strategic thinking.\n\nProgress through tasks systematically. In instances where a strategy is not provided, outline your plan before executing. Clearly distinguish between tasks handled via code and those utilizing your analytical expertise.\n\nWhen providing code, include only Python scripts meant to be run without user alterations. Users should execute your script as is, without modifications:\n\n```python\n# filename: <filename>\n# Python script\nprint(\"Your output\")\n```\n\nUsers should not perform any actions other than running the scripts you provide. Avoid presenting partial or incomplete scripts that require user adjustments. Refrain from requesting users to copy-paste results; instead, use the 'print' function when suitable to display outputs. Monitor the execution results they share.\n\nIf an error surfaces, supply corrected scripts for a re-run. If the strategy fails to resolve the issue, reassess your assumptions, gather additional details as needed, and explore alternative approaches.\n\nUpon successful completion of a task and verification of the results, confirm the achievement of the stated objective. Ensuring accuracy and validity of the findings is paramount. Evidence supporting your conclusions should be provided when feasible.\n\nUpon satisfying the user's needs and ensuring all tasks are finalized, conclude your assistance with \"TERMINATE\"."
        },
        {
            "name": "Research_analyst",
            "model": "gpt-4-1106-preview",
            "system_message": "As a Research Analyst, you are expected to be a proficient AI assistant possessing a strong grasp of programming, specifically in Python, and robust analytical capabilities. Your primary responsibilities will include:\n\n1. Conducting comprehensive searches and retrieving information autonomously through Python scripts, such as querying databases, accessing web services (like arXiv), downloading and reading files, and retrieving system information.\n2. Analyzing the content of the retrieved documents, particularly academic papers, and extracting insights regarding their application in specific domains, such as the potential uses of GPT-4 in software development.\n3. Presenting your findings in a clear, detailed manner, explaining the implications of the research and its relevance to the assigned task.\n4. Employing your programming skills to automate tasks where possible, ensuring the output is delivered through Python code with clear, executable instructions. Your code will be designed for the user to execute without amendment or additional input.\n5. Verifying the results of information gathering and analysis to ensure accuracy and completeness, providing evidence to support your conclusions when available.\n6. Communicating the completion of each task and confirming that the user's needs have been satisfied through a clear and conclusive statement, followed by the word \"TERMINATE\" to signal the end of the interaction."
        },
        {
            "name": "Software_developer",
            "model": "gpt-4-1106-preview",
            "system_message": "As a dedicated AI assistant for a software developer, your role involves employing your Python programming prowess and proficiency in natural language processing to facilitate the discovery and analysis of scholarly articles on arXiv. Your tasks include crafting Python scripts to automatically search, retrieve, and present information regarding the latest research, with a focus on applicable advancements in technology such as GPT-4 and its potential impact on the domain of software development.\n\n1. Utilize Python to programmatically seek out and extract pertinent data, for example, navigating or probing the web, downloading/ingesting documents, or showcasing content from web pages or files. When enough information has been accumulated to proceed, you will then analyze and interpret the findings.\n\n2. When there's a need to perform an operation programmatically, your Python code should accomplish the task and manifest the outcome. Progress through the task incrementally and systematically.\n\nProvide a clear plan outlining each stage of the task, specifying which components will be executed through Python coding and which through your linguistic capabilities. When proposing Python code, remember to:\n\n- Label the script type within the code block\n- Avoid suggesting code that the user would need to alter\n- Refrain from including more than one code block in your response\n- Circumvent requesting the user to manually transcribe any results; utilize 'print' statements where applicable\n- Examine the user's reported execution outcomes\n\nIf an error arises, your responsibility is to rectify the issue and submit the corrected script. Should an error remain unresolvable, or if the task remains incomplete post successful code execution, re-evaluate the scenario, gather any further required information, and formulate an alternative approach.\n\nUpon confirming that the task has been satisfactorily accomplished and the user's requirements have been met, indicate closure of the procedure with a concluding statement."
        }
    ],
    "manager_system_message": "Group chat manager.",
    "coding": true,
    "default_llm_config": {
        "temperature": 0
    }
}
```
This information is saved in JSON format. You can provide a specific filename; otherwise, AgentBuilder saves the config to the current path with a generated filename of the form 'save_config_TASK_MD5.json'.

In [7]:
saved_path = builder.save()

Building config saved to ./save_config_eb1be857faa608aeb4c5af11fe4ab245.json
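
To make the saved format concrete, the sketch below writes a trimmed-down config with the same top-level keys as the example above and reads it back. The filename is derived here by hashing the UTF-8 bytes of the building task with MD5, which is an assumption about how 'TASK_MD5' is produced. The shortened task string and the `"..."` system messages are placeholders.

```python
import hashlib
import json
import tempfile
from pathlib import Path

# Trimmed-down config using the same top-level keys as the saved example.
config = {
    "building_task": "Find a paper on arxiv by programming.",  # placeholder task
    "agent_configs": [
        {"name": "Data_scientist", "model": "gpt-4-1106-preview", "system_message": "..."},
        {"name": "Research_analyst", "model": "gpt-4-1106-preview", "system_message": "..."},
    ],
    "manager_system_message": "Group chat manager.",
    "coding": True,
    "default_llm_config": {"temperature": 0},
}

# Assumption: TASK_MD5 is the MD5 hex digest of the building task's UTF-8 bytes.
task_md5 = hashlib.md5(config["building_task"].encode("utf-8")).hexdigest()
filename = f"save_config_{task_md5}.json"

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_file = Path(tmp_dir) / filename
    saved_file.write_text(json.dumps(config, indent=4), encoding="utf-8")
    loaded = json.loads(saved_file.read_text(encoding="utf-8"))

agent_names = [agent["name"] for agent in loaded["agent_configs"]]
print(filename)
print(agent_names, "coding:", loaded["coding"])
```

Because the generated name depends only on the task text under this assumption, saving twice for the same building task would overwrite the earlier file unless you pass an explicit filename.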


After that, you can load the saved config and skip the building process. AgentBuilder will create agents from that information without prompting the build manager.

In [8]:
new_builder = AgentBuilder(config_path=config_path).load(saved_path)
new_builder.start(task="Find a latest paper about Llava on arxiv and find its potential applications in multi-modal model.")
new_builder.clear_all_agents()

Loding config from ./save_config_eb1be857faa608aeb4c5af11fe4ab245.json
Creating agent Data_scientist with backbone gpt-4-1106-preview...
Creating agent Domain_expert with backbone gpt-4-1106-preview...
Creating agent Software_engineer with backbone gpt-4-1106-preview...
Adding user console proxy...
User_console_and_Python_code_interpreter (to chat_manager):
Find a latest paper about Llava on arxiv and find its potential applications in multi-modal model.

--------------------------------------------------------------------------------
Software_engineer (to chat_manager):

To find the latest paper about "Llava" on arXiv and evaluate its potential applications in multi-modal models, we will break down the task into the following steps:

1. Search for the latest papers on arXiv with the keyword "Llava".
2. Download and read the most recent paper found.
3. Analyze the content to determine its potential applications in multi-modal models.

Let's start with step 1. We will 

execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


>>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is python)...


execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


User_console_and_Python_code_interpreter (to chat_manager):

exitcode: 0 (execution succeeded)
Code output: 

Title: PG-Video-LLaVA: Pixel Grounding Large Video-Language Models
Authors: Shehan Munasinghe, Rusiru Thushara, Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Mubarak Shah, Fahad Khan
Abstract: Extending image-based Large Multimodal Models (LMM) to videos is challenging
due to the inherent complexity of video data. The recent approaches extending
image-based LMM to videos either lack the grounding capabilities (e.g.,
VideoChat, Video-ChatGPT, Video-LLaMA) or do not utilize the audio-signals for
better video understanding (e.g., Video-ChatGPT). Addressing these gaps, we
propose Video-LLaVA, the first LMM with pixel-level grounding capability,
integrating audio cues by transcribing them into text to enrich video-context
understanding. Our framework uses an off-the-shelf tracker and a novel
grounding module, enabling it to spatially and temporally localize objects in


## Use GPTs

[GPTs](https://openai.com/blog/introducing-gpts) allow users to create an assistant from a simple instruction describing the task. They have plugin support that lets ChatGPT complete complex instructions, and they can optionally update the assistant's instruction to adapt it to a new task or improve on the current one.
AutoBuild also supports the GPTs API: pass `use_gpts=True` to the `build()` function.

In [9]:
new_builder = AgentBuilder(config_path=config_path)
new_builder.build(building_task, default_llm_config, use_gpts=True)  # Transfer to GPTs API.
new_builder.start(task="Find a latest paper about XAI on arxiv and find its potential applications in medical.")
new_builder.clear_all_agents()

Generating agents...
Data_scientist,Domain_expert,Software_engineer are generated.
Preparing configuration for Data_scientist...
Preparing configuration for Domain_expert...
Preparing configuration for Software_engineer...
Creating agent Data_scientist with backbone gpt-4-1106-preview...


Multiple assistants with name Data_scientist found. Using the first assistant in the list. Please specify the assistant ID in llm_config to use a specific assistant.


Creating agent Domain_expert with backbone gpt-4-1106-preview...
Creating agent Software_engineer with backbone gpt-4-1106-preview...
Adding user console proxy...
User_console_and_Python_code_interpreter (to chat_manager):
Find a latest paper about XAI on arxiv and find its potential applications in medical.

--------------------------------------------------------------------------------
Software_engineer (to chat_manager):

To accomplish the task of finding the latest paper about Explainable Artificial Intelligence (XAI) on arXiv and evaluating its potential applications in the medical field, we will break down the task into several steps:

1. **Search arXiv for the latest papers on XAI**: Use the arXiv API to search for papers that match the topic "Explainable Artificial Intelligence".
2. **Download and read the paper's summary**: Extract and present the content of the paper's summary to gain insight into its subject matter.
3. **Evaluate real-world applications in

execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


User_console_and_Python_code_interpreter (to chat_manager):

exitcode: 1 (execution failed)
Code output: 
  File "", line 21
    print(f"Summary: {paper.summary.replace('\n', ' ')}")
                                                        ^
SyntaxError: f-string expression part cannot include a backslash


--------------------------------------------------------------------------------
Software_engineer (to chat_manager):

It appears there was an error due to the use of a backslash within an f-string expression. Let's correct the script to properly print the summary without causing a syntax error. We can use double quotes for the string inside the `replace` method to avoid this issue.

Here is the corrected Python script:

```python
import arxiv
import datetime

# Define the query parameters
query = "cat:cs.AI AND ti:XAI"
sort_by = arxiv.SortCriterion.SubmittedDate  # Sort by most recent submissions

# Search for papers on arXiv that match the query
search = arxiv.Sea
```

execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change


User_console_and_Python_code_interpreter (to chat_manager):

exitcode: 0 (execution succeeded)
Code output: 
Title: Forms of Understanding of XAI-Explanations
Authors: Hendrik Buschmeier, Heike M. Buhl, Friederike Kern, Angela Grimminger, Helen Beierling, Josephine Fisher, André Groß, Ilona Horwath, Nils Klowait, Stefan Lazarov, Michael Lenke, Vivien Lohmer, Katharina Rohlfing, Ingrid Scharlau, Amit Singh, Lutz Terfloth, Anna-Lisa Vollmer, Yu Wang, Annedore Wilmes, Britta Wrede
Submitted Date: 2023-11-15 08:06:51+00:00
Summary: Explainability has become an important topic in computer science and artificial intelligence, leading to a subfield called Explainable Artificial Intelligence (XAI). The goal of providing or seeking explanations is to achieve (better) 'understanding' on the part of the explainee. However, what it means to 'understand' is still not clearly defined, and the concept itself is rarely the subject of scientific investigation. This conceptual article aims to p