<a href="https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_groupchat_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Auto Generated Agent Chat: Group Chat with Retrieval Augmented Generation

AutoGen supports conversable agents powered by LLMs, tools or humans, performing tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.
Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).

## Requirements

AutoGen requires `Python>=3.8`. To run this notebook example, please install:
```bash
pip install "pyautogen[retrievechat]"
```

In [None]:
%pip install pyautogen[retrievechat]

In [6]:
import logging
import sys

## Set your API Endpoint

The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file.

In [7]:
import autogen

config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    file_location=".",
    filter_dict={
        "model": ["gpt-3.5-turbo-16k", "gpt-35-turbo", "gpt-35-turbo-0613", "gpt-4", "gpt4", "gpt-4-32k"],
    },
)

print("LLM models: ", [config_list[i]["model"] for i in range(len(config_list))])

print(autogen.__file__)

LLM models:  ['gpt-4', 'gpt-3.5-turbo-16k']
/Users/parkgyutae/dev/Pado/autogen/autogen/__init__.py


It first looks for the environment variable `OAI_CONFIG_LIST`, which must be a valid JSON string. If that variable is not found, it looks for a JSON file named `OAI_CONFIG_LIST`. It then filters the configs by model (you can filter by other keys as well).
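
The lookup order described above can be sketched in plain Python. This is a minimal illustration, not AutoGen's implementation: `load_config_list` and the `DEMO_CONFIG_LIST` variable are hypothetical names, and the sketch assumes the filter maps each key to a list of accepted values.

```python
import json
import os


def load_config_list(env_or_file, file_location=".", filter_dict=None):
    """Hypothetical helper mirroring the lookup order described above."""
    raw = os.environ.get(env_or_file)
    if raw is not None:
        # The environment variable must hold a valid JSON string.
        configs = json.loads(raw)
    else:
        # Otherwise fall back to a JSON file with the same name.
        with open(os.path.join(file_location, env_or_file)) as f:
            configs = json.load(f)
    if filter_dict:
        # Keep a config only if every filtered key has an accepted value.
        configs = [
            c for c in configs
            if all(c.get(key) in accepted for key, accepted in filter_dict.items())
        ]
    return configs


# Demo data invented for this example; the keys are placeholders.
os.environ["DEMO_CONFIG_LIST"] = json.dumps(
    [{"model": "gpt-4", "api_key": "<key>"}, {"model": "other-model", "api_key": "<key>"}]
)
demo_configs = load_config_list("DEMO_CONFIG_LIST", filter_dict={"model": ["gpt-4"]})
print([c["model"] for c in demo_configs])
```

Only the entry whose `model` value appears in the filter list survives, which is the same effect the `filter_dict` argument has in the cell above.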

The config list looks like the following:
```python
config_list = [
    {
        "model": "gpt-4",
        "api_key": "<your OpenAI API key>",
    },  # OpenAI API endpoint for gpt-4
    {
        "engine": "gpt-35-turbo-0613",
        "model": "gpt-35-turbo-0613",  # 0613 or newer is needed to use functions
        "api_base": "<your Azure OpenAI API base>", 
        "api_type": "azure", 
        "api_version": "2023-07-01-preview", # 2023-07-01-preview or newer is needed to use functions
        "api_key": "<your Azure OpenAI API key>"
    }
]
```

If you open this notebook in Colab, you can upload your files by clicking the file icon in the left panel and then choosing the "upload file" icon.

You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file.

In [8]:
user_goal = "Making a Microsoft investment report for investor"
user_context = "Target user is not expert."
toc_part = "Company background in the Introduction"

In [9]:
supervisor_prompt_template = '''
You'll be responsible for a specific part of the report.
The report is designed to help users achieve their goal: {goal}. Context: {context}
Your main role is to ask the questions you need to build that report, look at the search results for those questions, rephrase them if you don't think you're getting enough results, and ask other questions to build the report if you do. If a result is insufficient, say "INSUFFICIENT" and ask another question; if it is sufficient, say "SUFFICIENT" and ask another question.
Ask questions one at a time. Once you think you have asked enough questions, call the ask_Summary function,
and after that say "TERMINATE".
The part you will be responsible for is the {toc_part} of the report.'''

dialog_summarizer_prompt_template = '''
You'll be responsible for a specific part of the report.
The report is designed to help users achieve their goal: {goal}. Context: {context}
Your main role is to read the dialog between the user (questions) and the assistant (answers) and summarize it in a few paragraphs.
The part you will be responsible for is the {toc_part} of the report. When you finish summarizing, reply with "TERMINATE"
'''

In [10]:
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent
from autogen.agentchat.contrib.retrieve_assistant_agent import RetrieveAssistantAgent
from autogen import AssistantAgent, UserProxyAgent
import chromadb
from chromadb.utils import embedding_functions

logging.basicConfig(format='%(asctime)s | %(filename)s:%(lineno)d | %(levelname)s : %(message)s',
                     level=logging.DEBUG, stream=sys.stdout)

openai_ef = embedding_functions.OpenAIEmbeddingFunction( 
    api_key=config_list[0]["api_key"],
    model_name="text-embedding-ada-002"
)
llm_config = {
    "request_timeout": 100,
    "seed": 42,
    "config_list": config_list,
    "temperature": 0,
}

llm_config_function=llm_config.copy()
llm_config_function['functions'] = [
        {
            "name": "ask_DB",
            "description": "ask expert that can access to my database",
            "parameters": {
                "type": "object",
                "properties": {
                    "message": {
                        "type": "string",
                        "description": "question to ask expert. Ensure the question includes enough context, such as the code and the execution result. The expert does not know the conversation between you and the user unless you share the conversation with the expert."
                    },
                    "is_sufficient": {
                        "type": "string",
                        "enum": ["yes", "no", "init"],
                        "description": "Indicate if the previous answer was sufficient. Use 'yes' if the previous answer was sufficient, 'no' if it was not, and 'init' for the first question."
                    }
                },
                "required": ["message"]
            }
        },
        {
            "name": "ask_Summary",
            "description": "ask expert to summarize all of communication you did in report form.",
            "parameters": {
                "type": "object", 
                "properties" : {
                    "name": {
                        "type": "string", 
                        "description": "your name"
                    }
                }, 
                "required": ["name"]
            }
        }
    ]

supervisor_prompt = supervisor_prompt_template.format(goal=user_goal, context=user_context, toc_part=toc_part)
dialog_summarize_prompt = dialog_summarizer_prompt_template.format(goal=user_goal, context=user_context, toc_part=toc_part)
autogen.ChatCompletion.start_logging()
termination_msg = lambda x: isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()

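def ask_DB(message, is_sufficient="init"):
    # `is_sufficient` is declared in the function schema above, so it is accepted here
    # even though retrieval itself does not use it.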
    try:
        toc_QA = RetrieveUserProxyAgent(
            name="toc_QA",
            is_termination_msg=termination_msg,
            system_message="Assistant who has extra content retrieval in database. ",
            human_input_mode="NEVER",
            retrieve_config={
                "task": "QA",
                "docs_path": "/Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/",
                "chunk_token_size": 1024,
                "model": config_list[0]["model"],
                "client": chromadb.PersistentClient(path="/tmp/chromadb"),
                "collection_name": "groupchat",
                "embedding_function" : openai_ef,
                "get_or_create": True,
            },
            code_execution_config=False,  # we don't want to execute code in this case.
        )

        assistant = RetrieveAssistantAgent(
            name="assistant", 
            system_message="You are a helpful assistant.",
            is_termination_msg=termination_msg,
            llm_config={
                "request_timeout": 600,
                "seed": 42,
                "config_list": config_list,
            },
        )

        toc_QA.initiate_chat(assistant, problem= message, silent=False, n_results=5)
        # Return the last message in the assistant's conversation with toc_QA.
        return assistant.last_message()["content"]
    except Exception:
        # Log the failure with its traceback before re-raising it.
        logging.exception("ask_DB failed")
        raise

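def ask_Summary(name=None):
    # `name` is required by the function schema above, so it is accepted here.
    toc_summarizer = AssistantAgent(
        name="toc_summarizer",
        system_message=dialog_summarize_prompt,
        llm_config=llm_config,
    )

    # chat_messages maps each peer agent to a list of message dicts; flatten the
    # conversation with user_proxy into plain text so it can be sent as one message.
    chat_log = "\n".join(
        f"{m.get('role')}: {m.get('content')}"
        for m in toc_manager.chat_messages[user_proxy]
        if m.get("content")
    )
    user_proxy.initiate_chat(toc_summarizer, message=chat_log)
    # user_proxy now has more than one conversation, so name the peer explicitly.
    return user_proxy.last_message(toc_summarizer)["content"]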

toc_manager = AssistantAgent(
    name = "toc_manager",
    is_termination_msg=termination_msg,
    system_message=supervisor_prompt,
    llm_config=llm_config_function,
    human_input_mode='NEVER',
    code_execution_config=False
)

user_proxy = UserProxyAgent(
    name = "tester",
    llm_config=False, 
    is_termination_msg=termination_msg,
    human_input_mode= "NEVER",
    code_execution_config=False,
    function_map={"ask_DB": ask_DB, "ask_Summary" : ask_Summary}
)

user_proxy.initiate_chat(toc_manager,silent=False,message="")


[33mtester[0m (to toc_manager):



--------------------------------------------------------------------------------
2023-11-07 16:11:22,223 | conversable_agent.py:822 | INFO : It executed generate_oai_reply
[33mtoc_manager[0m (to tester):

{
"message": "Can you provide me with a detailed background of Microsoft Corporation?"
}
[32m***** Suggested function Call: ask_DB *****[0m
Arguments: 
{
"message": "Can you provide me with a detailed background of Microsoft Corporation?"
}
[32m*******************************************[0m

--------------------------------------------------------------------------------
[35m
>>>>>>>> EXECUTING FUNCTION ask_DB...[0m
Trying to create collection.


0it [00:00, ?it/s]
  0%|          | 0/3 [00:00<?, ?it/s]

2023-11-07 16:11:22,251 | directory.py:98 | DEBUG : Processing file: /Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/MSFT Financials.pdf


2023-11-07 16:11:22,662 | connectionpool.py:291 | DEBUG : Resetting dropped connection: app.posthog.com
2023-11-07 16:11:23,302 | connectionpool.py:474 | DEBUG : https://app.posthog.com:443 "POST /batch/ HTTP/1.1" 200 None


 33%|███▎      | 1/3 [00:02<00:04,  2.50s/it]

2023-11-07 16:11:24,753 | directory.py:98 | DEBUG : Processing file: /Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/microsoft 10k 2022.pdf


 67%|██████▋   | 2/3 [00:13<00:07,  7.26s/it]

2023-11-07 16:11:35,343 | directory.py:98 | DEBUG : Processing file: /Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/microsoft 10k 2023.pdf


100%|██████████| 3/3 [00:23<00:00,  7.76s/it]
0it [00:00, ?it/s]

2023-11-07 16:11:45,577 | retrieve_utils.py:325 | INFO : Found 341 chunks.
2023-11-07 16:11:45,584 | util.py:60 | DEBUG : message='Request to OpenAI API' method=post path=https://api.openai.com/v1/engines/text-embedding-ada-002/embeddings
2023-11-07 16:11:45,589 | util.py:60 | DEBUG : api_version=None data='{"input": ["Important disclosures appear on the last page of this report.  The Henry Fund    Henr y B. Tippie College of Business   Fangxing Wei  [Fangxing -wei@uiowa.edu]      MICROSOFT CORPORATION (MSFT)  November  17, 202 1  Information Technology \\u2013 Software  Stock Rating  HOLD   Investment Thesis  Target Price  $366    Based on our research, we have a HOLD  rating on Microsoft Corporation.  Microsoft\'s stock is now trading at $339.12 and our model\'s price target is  $366, which implies an upside of 7.92%. But Microsoft already has the largest  weighting of any investment in  the Henry Fund portfolio, 8.92%, which is  already 0.88 percentage points over market weight. The




2023-11-07 16:11:47,726 | connectionpool.py:474 | DEBUG : https://api.openai.com:443 "POST /v1/engines/text-embedding-ada-002/embeddings HTTP/1.1" 200 None
2023-11-07 16:11:48,909 | util.py:60 | DEBUG : message='OpenAI API response' path=https://api.openai.com/v1/engines/text-embedding-ada-002/embeddings processing_ms=880 request_id=45ce68f786e78d9ef4ca2010d5ca9aed response_code=200
2023-11-07 16:11:49,704 | util.py:60 | DEBUG : message='Request to OpenAI API' method=post path=https://api.openai.com/v1/engines/text-embedding-ada-002/embeddings
2023-11-07 16:11:49,705 | util.py:60 | DEBUG : api_version=None data='{"input": ["Can you provide me with a detailed background of Microsoft Corporation?"], "encoding_format": "base64"}' message='Post details'
2023-11-07 16:11:49,962 | connectionpool.py:474 | DEBUG : https://api.openai.com:443 "POST /v1/engines/text-embedding-ada-002/embeddings HTTP/1.1" 200 None
2023-11-07 16:11:49,973 | util.py:60 | DEBUG : message='OpenAI API response' path=ht

KeyboardInterrupt: 

In [None]:
from autogen.retrieve_utils import split_files_to_chunks

files = [
    "/Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/MSFT Financials.pdf",
    "/Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/microsoft 10k 2022.pdf",
    "/Users/parkgyutae/dev/Pado/ASQ_Summarizer/test_files/microsoft 10k 2023.pdf",
]
chunks=split_files_to_chunks(files=files)
print(chunks)

In [None]:
chunks[0]

## Construct Agents

In [None]:
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent
from autogen import AssistantAgent

import chromadb

llm_config = {
    "request_timeout": 60,
    "seed": 42,
    "config_list": config_list,
    "temperature": 0,
}

autogen.ChatCompletion.start_logging()
termination_msg = lambda x: isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()

boss = autogen.UserProxyAgent(
    name="Boss",
    is_termination_msg=termination_msg,
    human_input_mode="TERMINATE",
    system_message="The boss who asks questions and gives tasks.",
    code_execution_config=False,  # we don't want to execute code in this case.
)

boss_aid = RetrieveUserProxyAgent(
    name="Boss_Assistant",
    is_termination_msg=termination_msg,
    system_message="Assistant who has extra content retrieval power for solving difficult problems.",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        "docs_path": "/Users/parkgyutae/dev/Pado/ASQ_Summarizer/markdown/",
        "chunk_token_size": 3000,
        "model": config_list[0]["model"],
        "client": chromadb.PersistentClient(path="/tmp/chromadb"),
        "collection_name": "groupchat",
        "get_or_create": True,
    },
    code_execution_config=False,  # we don't want to execute code in this case.
)

coder = AssistantAgent(
    name="Senior_Python_Engineer",
    is_termination_msg=termination_msg,
    system_message="You are a senior python engineer. Reply `TERMINATE` in the end when everything is done.",
    llm_config=llm_config,
)

pm = autogen.AssistantAgent(
    name="Product_Manager",
    is_termination_msg=termination_msg,
    system_message="You are a product manager. Reply `TERMINATE` in the end when everything is done.",
    llm_config=llm_config,
)

reviewer = autogen.AssistantAgent(
    name="Code_Reviewer",
    is_termination_msg=termination_msg,
    system_message="You are a code reviewer. Reply `TERMINATE` in the end when everything is done.",
    llm_config=llm_config,
)

PROBLEM = "How to use spark for parallel training in FLAML? Give me sample code."

def _reset_agents():
    boss.reset()
    boss_aid.reset()
    coder.reset()
    pm.reset()
    reviewer.reset()

def rag_chat():
    _reset_agents()
    groupchat = autogen.GroupChat(
        agents=[boss_aid, coder, pm, reviewer], messages=[], max_round=12
    )
    manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

    # Start chatting with boss_aid as this is the user proxy agent.
    boss_aid.initiate_chat(
        manager,
        problem=PROBLEM,
        n_results=3,
    )

def norag_chat():
    _reset_agents()
    groupchat = autogen.GroupChat(
        agents=[boss, coder, pm, reviewer], messages=[], max_round=12
    )
    manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

    # Start chatting with boss as this is the user proxy agent.
    boss.initiate_chat(
        manager,
        message=PROBLEM,
    )

def call_rag_chat():
    _reset_agents()
    # In this case, we will have multiple user proxy agents and we don't initiate the chat
    # with RAG user proxy agent.
    # In order to use RAG user proxy agent, we need to wrap RAG agents in a function and call
    # it from other agents.
    def retrieve_content(message, n_results=3):
        boss_aid.n_results = n_results  # Set the number of results to be retrieved.
        # Check if we need to update the context.
        update_context_case1, update_context_case2 = boss_aid._check_update_context(message)
        if (update_context_case1 or update_context_case2) and boss_aid.update_context:
            boss_aid.problem = message if not hasattr(boss_aid, "problem") else boss_aid.problem
            _, ret_msg = boss_aid._generate_retrieve_user_reply(message)
        else:
            ret_msg = boss_aid.generate_init_message(message, n_results=n_results)
        return ret_msg if ret_msg else message
    
    boss_aid.human_input_mode = "NEVER" # Disable human input for boss_aid since it only retrieves content.
    
    llm_config = {
        "functions": [
            {
                "name": "retrieve_content",
                "description": "retrieve content for code generation and question answering.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "message": {
                            "type": "string",
                            "description": "Refined message which keeps the original meaning and can be used to retrieve content for code generation and question answering.",
                        }
                    },
                    "required": ["message"],
                },
            },
        ],
        "config_list": config_list,
        "request_timeout": 60,
        "seed": 42,
    }

    for agent in [coder, pm, reviewer]:
        # update llm_config for assistant agents.
        agent.llm_config.update(llm_config)

    for agent in [boss, coder, pm, reviewer]:
        # register functions for all agents.
        agent.register_function(
            function_map={
                "retrieve_content": retrieve_content,
            }
        )

    groupchat = autogen.GroupChat(
        agents=[boss, coder, pm, reviewer], messages=[], max_round=12
    )
    manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

    # Start chatting with boss as this is the user proxy agent.
    boss.initiate_chat(
        manager,
        message=PROBLEM,
    )

## Start Chat

### UserProxyAgent doesn't get the correct code
[FLAML](https://github.com/microsoft/FLAML) was open sourced in 2020, so ChatGPT is familiar with it. However, Spark-related APIs were added in 2022, so they were not in ChatGPT's training data. As a result, we end up with invalid code.

In [None]:
norag_chat()

### RetrieveUserProxyAgent gets the correct code
Since RetrieveUserProxyAgent can perform retrieval-augmented generation based on the given documentation file, ChatGPT can generate the correct code for us!

In [None]:
rag_chat()
# type exit to terminate the chat
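
To illustrate why retrieval helps here, the following is a minimal sketch of retrieval-augmented prompting. It is not how `RetrieveUserProxyAgent` works internally: word-count cosine similarity stands in for a real embedding model and vector database, and the function names and the two demo chunks are invented for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, n_results=1):
    """Return the n_results chunks most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:n_results]


def build_prompt(question, chunks, n_results=1):
    """Prepend the retrieved chunks to the question as context."""
    context = "\n".join(retrieve(question, chunks, n_results))
    return f"Answer using only this context.\n\nContext:\n{context}\n\nQuestion: {question}"


# Placeholder documentation chunks, invented for this example.
demo_chunks = [
    "Spark can be used to run tuning trials in parallel on a cluster.",
    "The weather today is sunny with a light breeze.",
]
demo_question = "How to use spark for parallel training?"
top_chunks = retrieve(demo_question, demo_chunks)
rag_prompt = build_prompt(demo_question, demo_chunks)
print(rag_prompt)
```

The model then answers from the retrieved context rather than from its training data alone, which is why documentation added after the training cutoff can still be used.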

### Call RetrieveUserProxyAgent while initiating the chat with another user proxy agent
Sometimes you may need to use RetrieveUserProxyAgent in a group chat without initiating the chat with it. In that case, create a function that wraps the RAG agent so that other agents can call it.

In [None]:
call_rag_chat()
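
The wrapping pattern used in `call_rag_chat` relies on mapping a function name requested by the model to a Python callable. The sketch below shows that dispatch step in isolation. It is an assumption-laden illustration, not AutoGen's code: `execute_function_call` is a hypothetical helper, and this `retrieve_content` returns a placeholder string instead of querying a vector database.

```python
import json


def retrieve_content(message, n_results=3):
    """Stand-in for the retrieval wrapper; returns a placeholder string."""
    return f"[{n_results} retrieved chunks for: {message}]"


# Maps the function name the model may request to a Python callable.
function_map = {"retrieve_content": retrieve_content}


def execute_function_call(function_call, function_map):
    """Hypothetical dispatcher: look up the callable and pass the JSON arguments."""
    name = function_call["name"]
    func = function_map.get(name)
    if func is None:
        return f"Error: function {name} not found."
    try:
        arguments = json.loads(function_call["arguments"])
        return func(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Bad JSON, or argument names that do not match the signature.
        return f"Error: {exc}"


demo_call = {"name": "retrieve_content", "arguments": '{"message": "spark in FLAML"}'}
demo_result = execute_function_call(demo_call, function_map)
print(demo_result)
```

Because the arguments are passed as keyword arguments, every property declared in a function's JSON schema should also be a parameter of the Python function; otherwise the call fails with a `TypeError`.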