<a href="https://colab.research.google.com/github/aknip/Autogen-AutoGPT/blob/main/AutoGen_groupchat_research.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Auto Generated Agent Chat: Performs Research with Multi-Agent Group Chat


Notes:
- Known issue: long runs can exceed the API's tokens-per-minute rate limit.
- Source: https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_research.ipynb


AutoGen offers conversable agents powered by LLMs, tools, or humans, which can perform tasks collectively via automated chat. The framework supports tool use and human participation through multi-agent conversation.
See the [agent chat documentation](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) for details.


In [None]:
import json
import os
from getpass import getpass
import psutil
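# Detect whether the parent process is a Jupyter notebook server
IN_NOTEBOOK = any("jupyter-notebook" in arg for arg in psutil.Process().parent().cmdline())
if IN_NOTEBOOK:
    # Prompt for the secrets once and cache them in the environment
    CREDS = json.loads(getpass("Secrets (JSON string): "))
    os.environ['CREDS'] = json.dumps(CREDS)
# Outside a notebook, the CREDS environment variable must already be set
CREDS = json.loads(os.getenv('CREDS'))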

Secrets (JSON string): ··········
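The shape of the secrets string is not spelled out in the notebook, but it can be inferred from the lookup `CREDS['OpenAI']['v1']['credential']` used in the configuration cell. The following is a minimal sketch under that assumption; `example_secrets`, `extract_openai_key`, and the placeholder key are hypothetical names introduced for illustration.

```python
import json

# Hypothetical secrets string. The nesting mirrors the lookup
# CREDS['OpenAI']['v1']['credential'] used by the configuration cell.
# The key value is a placeholder, not a real credential.
example_secrets = '{"OpenAI": {"v1": {"credential": "sk-PLACEHOLDER"}}}'


def extract_openai_key(secrets_json: str) -> str:
    """Parse the secrets string and return the OpenAI API key."""
    creds = json.loads(secrets_json)
    try:
        return creds["OpenAI"]["v1"]["credential"]
    except (KeyError, TypeError) as exc:
        raise ValueError(
            "secrets JSON must contain OpenAI -> v1 -> credential"
        ) from exc


api_key = extract_openai_key(example_secrets)
print(api_key)
```

Validating the structure up front gives a clear error message at the prompt instead of a `KeyError` later in the configuration cell.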


In [None]:
%%capture --no-stderr
%pip install pyautogen~=0.1.0

## Set your API Endpoint

The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file. This notebook instead builds the list directly from the `CREDS` secrets loaded above.
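To make the idea of a configuration list concrete, here is a stdlib-only sketch of loading and filtering one. `load_config_list` is a hypothetical stand-in written for this example, not the library's implementation, and the variable name `OAI_CONFIG_LIST` and the placeholder keys are assumptions. With the library itself, the corresponding call would be along the lines of `autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4"]})`.

```python
import json
import os


def load_config_list(env_or_file, filter_dict=None):
    """Hypothetical stand-in: read a JSON list of configs from an
    environment variable, falling back to a file of the same name."""
    raw = os.environ.get(env_or_file)
    if raw is None:
        with open(env_or_file, encoding="utf-8") as fh:
            raw = fh.read()
    configs = json.loads(raw)
    if filter_dict:
        # Keep only configs whose value for each key is in the allowed list
        configs = [
            cfg for cfg in configs
            if all(cfg.get(key) in allowed for key, allowed in filter_dict.items())
        ]
    return configs


# Placeholder configurations; the API keys are not real credentials
os.environ["OAI_CONFIG_LIST"] = json.dumps([
    {"model": "gpt-4", "api_key": "sk-PLACEHOLDER"},
    {"model": "gpt-3.5-turbo", "api_key": "sk-PLACEHOLDER"},
])

gpt4_only = load_config_list("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4"]})
print(gpt4_only)
```

Each entry in the resulting list is a plain dictionary, which is the same shape as the hand-built `config_list_gpt4` in the next cell.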

In [None]:
import autogen

config_list_gpt4 = [{'model': 'gpt-4', 'api_key': CREDS['OpenAI']['v1']['credential']}]

#print(config_list_gpt4)

llm_config = {
    "seed": 43,  # change the seed for different trials
    "temperature": 0,
    "config_list": config_list_gpt4,
    "request_timeout": 120,
}

## Construct Agents

In [None]:
gpt4_config = {
    "seed": 42,  # change the seed for different trials
    "temperature": 0,
    "config_list": config_list_gpt4,
    "request_timeout": 120,
}
user_proxy = autogen.UserProxyAgent(
    name="Admin",
    system_message="A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
    code_execution_config=False,
)
engineer = autogen.AssistantAgent(
    name="Engineer",
    llm_config=gpt4_config,
    system_message='''Engineer. You follow an approved plan. You write python/shell code to solve tasks. Wrap the code in a code block that specifies the script type. The user can't modify your code. So do not suggest incomplete code which requires others to modify. Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response. Do not ask others to copy and paste the result. Check the execution result returned by the executor.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
''',
)
scientist = autogen.AssistantAgent(
    name="Scientist",
    llm_config=gpt4_config,
    system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code."""
)
planner = autogen.AssistantAgent(
    name="Planner",
    system_message='''Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.
The plan may involve an engineer who can write code and a scientist who doesn't write code.
Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.
''',
    llm_config=gpt4_config,
)
executor = autogen.UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineer and report the result.",
    human_input_mode="NEVER",
    code_execution_config={"last_n_messages": 3, "work_dir": "paper"},
)
critic = autogen.AssistantAgent(
    name="Critic",
    system_message="Critic. Double check plan, claims, code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URL.",
    llm_config=gpt4_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=50)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4_config)
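The cell above wires six agents into one shared conversation capped at `max_round=50`. As a rough mental model only, the toy loop below shows a shared message history, one speaker per round, and a round cap. Everything in it is an assumption made for illustration: `ToyAgent`, `ToyGroupChat`, and `run_toy_chat` are hypothetical names, the round-robin speaker order is a simplification (the notebook does not show how `GroupChatManager` picks the next speaker), and stopping on `TERMINATE` is a convention chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyAgent:
    name: str
    reply: Callable[[List[dict]], str]  # maps the shared history to a reply


@dataclass
class ToyGroupChat:
    agents: List[ToyAgent]
    max_round: int = 50
    messages: List[dict] = field(default_factory=list)


def run_toy_chat(chat, first_speaker, message):
    """Run a round-robin chat; total messages never exceed chat.max_round."""
    chat.messages.append({"name": first_speaker.name, "content": message})
    speaker_index = chat.agents.index(first_speaker)
    for _ in range(chat.max_round - 1):
        # Assumption: simple round-robin instead of manager-driven selection
        speaker_index = (speaker_index + 1) % len(chat.agents)
        speaker = chat.agents[speaker_index]
        content = speaker.reply(chat.messages)
        chat.messages.append({"name": speaker.name, "content": content})
        if "TERMINATE" in content:
            break
    return chat.messages


admin = ToyAgent("Admin", lambda history: "approved")
planner = ToyAgent("Planner", lambda history: f"plan after {len(history)} message(s)")
critic = ToyAgent("Critic", lambda history: "looks fine, TERMINATE")

chat = ToyGroupChat(agents=[admin, planner, critic], max_round=5)
for msg in run_toy_chat(chat, admin, "find papers on LLM applications"):
    print(f"{msg['name']}: {msg['content']}")
```

Because every reply is computed from the full shared history, each additional round sees a longer context than the one before it.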

## Start Chat

In [None]:
user_proxy.initiate_chat(
    manager,
    message="""
find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.
""",
)

Admin (to chat_manager):


find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.


--------------------------------------------------------------------------------
Planner (to chat_manager):

Plan:

1. Engineer: Write a script to scrape arXiv for papers on LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link to the paper.

2. Scientist: Review the scraped data to identify the different domains in which LLM is applied. This could be based on the content of the abstracts or the sections of arXiv where the papers are published.

3. Engineer: Modify the script to categorize the papers based on the domains identified by the scientist. The script should output a markdown table with columns for the domain, title, authors, abstract, and link.

4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the information is accurate.

5. Eng



Executor (to chat_manager):

exitcode: 1 (execution failed)
Code output: 
  File "", line 1
    !pip install arxiv
    ^
SyntaxError: invalid syntax
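
The `SyntaxError` above arises because `!pip install ...` is IPython shell syntax, not Python, so it fails when the generated code is run as a plain script. The sketch below shows one way to check for that and to build an install command that works from plain Python; `is_valid_python`, `pip_install_command`, and `is_installed` are hypothetical helpers introduced here, not part of AutoGen.

```python
import importlib.util
import sys


def is_valid_python(source: str) -> bool:
    """Return True if the source compiles as plain Python."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


def pip_install_command(package: str) -> list:
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def is_installed(module_name: str) -> bool:
    """Check for a top-level module without importing it."""
    return importlib.util.find_spec(module_name) is not None


print(is_valid_python("!pip install arxiv"))  # False
print(is_valid_python("import json"))         # True
print(pip_install_command("arxiv")[1:])
print(is_installed("json"))                   # True
```

Passing the returned list to `subprocess.check_call` would perform the installation; that step is left out here so the example has no side effects.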


--------------------------------------------------------------------------------
Engineer (to chat_manager):

I apologize for the confusion. The `!pip install arxiv` command is intended to be run in a Jupyter notebook. If you're not using a Jupyter notebook, you can install the `arxiv` library by running `pip install arxiv` in your command line.

Let's proceed with the Python script to fetch the papers. Please ensure that the `arxiv` library is installed before running the script.

```python
import arxiv
import datetime
import pandas as pd

# Define the date one week ago
one_week_ago = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime('%Y%m%d%H%M%S')

# Search for papers on LLM applications
search = arxiv.Search(
  query="LLM applications",
  max_results=100,
  sort_by=arxiv.SortCriterion.SubmittedDate,
  sort_order=arxiv.S
```



Executor (to chat_manager):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "", line 1, in <module>
    import arxiv
ModuleNotFoundError: No module named 'arxiv'


--------------------------------------------------------------------------------
Engineer (to chat_manager):

I apologize for the confusion. It seems that the environment where the code is being executed does not have the `arxiv` library installed, and we are unable to install it directly here.

In this case, we can use an alternative approach by using the `requests` and `feedparser` libraries to fetch and parse the RSS feed from arXiv. These libraries are commonly included in Python environments.

Here is the Python script:

```python
import requests
import feedparser
import pandas as pd
from datetime import datetime, timedelta

# Define the date one week ago
one_week_ago = (datetime.now() - timedelta(days=7)).date()

# URL for the arXiv API
url = "http://export.arxiv.org/api/query?"
```




Executor (to chat_manager):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "", line 2, in <module>
    import feedparser
ModuleNotFoundError: No module named 'feedparser'


--------------------------------------------------------------------------------
Engineer (to chat_manager):

I apologize for the confusion. It seems that the environment where the code is being executed does not have the `feedparser` library installed, and we are unable to install it directly here.

In this case, we can use an alternative approach by using the `requests` and `bs4` (BeautifulSoup) libraries to fetch and parse the HTML page from arXiv. These libraries are commonly included in Python environments.

Here is the Python script:

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
from datetime import datetime, timedelta

# Define the date one week ago
one_week_ago = (datetime.now() - timedelta(days=7)).date()

# URL for the arXiv search pag
```



Executor (to chat_manager):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "", line 24, in <module>
    published_date = datetime.strptime(entry.find('p', class_='is-size-7').text.split(';')[1].strip(), '%a, %d %b %Y %H:%M:%S GMT').date()
  File "/usr/lib/python3.10/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.10/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data 'originally announced October 2023.' does not match format '%a, %d %b %Y %H:%M:%S GMT'


--------------------------------------------------------------------------------
Engineer (to chat_manager):

I apologize for the confusion. It seems that the date format or the date string extracted from the webpage is not as expected, causing the `strptime` function to raise a `ValueError`.

In this case, we can use an alternative app



Executor (to chat_manager):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "", line 24, in <module>
    published_date = datetime.strptime(entry.xpath('.//p[contains(@class, "is-size-7")]/text()')[0].split(';')[1].strip(), '%a, %d %b %Y %H:%M:%S GMT').date()
  File "/usr/lib/python3.10/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.10/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '' does not match format '%a, %d %b %Y %H:%M:%S GMT'


--------------------------------------------------------------------------------
[autogen.oai.completion: 10-25 05:39:45] {252} INFO - retrying in 10 seconds...
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/autogen/oai/completion.py", line 222, in _get_response
    response = openai_completion.create(**config)
  F

INFO:autogen.oai.completion:retrying in 10 seconds...
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/autogen/oai/completion.py", line 222, in _get_response
    response = openai_completion.create(**config)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/abstract/engine_api_resource.py", line 155, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 299, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 710, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 775, in _interpret_response_line
    raise self.handle_er

[autogen.oai.completion: 10-25 05:39:55] {252} INFO - retrying in 10 seconds...
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/autogen/oai/completion.py", line 222, in _get_response
    response = openai_completion.create(**config)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/abstract/engine_api_resource.py", line 155, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 299, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 710, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 775, in _interpret_response_lin

KeyboardInterrupt: ignored

## Create Group Chat without Critic for Comparison

In [None]:
groupchat_nocritic = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor], messages=[], max_round=50)
for agent in groupchat.agents:
    agent.reset()
manager_nocritic = autogen.GroupChatManager(groupchat=groupchat_nocritic, llm_config=gpt4_config)
user_proxy.initiate_chat(
    manager_nocritic,
    message="""
find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.
""",
)

Admin (to chat_manager):


find papers on LLM applications from arxiv in the last week, create a markdown table of different domains.


--------------------------------------------------------------------------------
Planner (to chat_manager):

Plan:

1. Engineer: Write a script to scrape the arXiv website for papers related to LLM (Language Model) applications published in the last week. The script should extract the title, authors, abstract, and link to the paper.

2. Scientist: Review the scraped data to identify the different domains in which LLM is applied. This could be based on keywords in the title or abstract, or the scientist's knowledge of the field.

3. Engineer: Modify the script to categorize the papers based on the domains identified by the scientist. The script should output a markdown table with columns for the domain, title, authors, abstract, and link.

4. Scientist: Review the markdown table to ensure the papers are correctly categorized and the in