# Group Chat with Retrieval Augmented Generation

AutoGen supports conversable agents powered by LLMs, tools, or humans, performing tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.
Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).

````{=mdx}
:::info Requirements
Some extra dependencies are needed for this notebook, which can be installed via pip:

```bash
pip install pyautogen[retrievechat]
```

For more information, please refer to the [installation guide](/docs/installation/).
:::
````

## Set your API Endpoint

The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file.

In [1]:
import chromadb
from typing_extensions import Annotated

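# Points Python at a local development checkout of AutoGen.
# Drop these two lines when using the pip-installed package.
import sys
sys.path.append('/home/simges/autogen/autogen/')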

import autogen
from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

print("LLM models: ", [cfg["model"] for cfg in config_list])

LLM models:  ['mistral']


````{=mdx}
:::tip
Learn more about configuring LLMs for agents [here](/docs/topics/llm_configuration).
:::
````
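As a rough illustration of what the loading step above does, the sketch below uses a simplified stand-in for `config_list_from_json` that relies only on the standard library. The helper name `load_config_list`, the `base_url`, and the `api_key` value are hypothetical placeholders, not values from this notebook; only the `"model": "mistral"` entry mirrors the output shown above.

```python
import json
import os
import tempfile
from typing import Dict, List


def load_config_list(env_or_file: str) -> List[Dict[str, str]]:
    """Simplified stand-in: read JSON from an env var if set, else from a file."""
    raw = os.environ.get(env_or_file)
    if raw is not None:
        return json.loads(raw)
    with open(env_or_file, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical entry for a locally served model; all values are placeholders.
example = [
    {"model": "mistral", "base_url": "http://localhost:11434/v1", "api_key": "placeholder"}
]

# Write the example to a temporary file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "OAI_CONFIG_LIST")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(example, fh)
    loaded = load_config_list(path)

models = [cfg["model"] for cfg in loaded]
print("LLM models: ", models)
```

Each entry is a plain dictionary, so the first model name can be read as `config_list[0]["model"]`, which is how the retrieval configuration in the next section picks its model.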

## Construct Agents

In [3]:
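def termination_msg(x):
    # The termination check is disabled here: this always returns False, so a
    # `TERMINATE` reply never ends the chat and each run continues until a limit
    # such as max_round is reached. The original check is kept below for reference.
    return False
    # return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()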


llm_config = {"config_list": config_list, "timeout": 60, "temperature": 0.8, "seed": 1234}

boss = autogen.UserProxyAgent(
    name="Boss",
    is_termination_msg=termination_msg,
    human_input_mode="NEVER",
    code_execution_config=False,  # we don't want to execute code in this case.
    default_auto_reply="Reply `TERMINATE` if the task is done.",
    description="The boss who asks questions and gives tasks.",
)

boss_aid = RetrieveUserProxyAgent(
    name="Boss_Assistant",
    is_termination_msg=termination_msg,
    human_input_mode="NEVER",
    default_auto_reply="Reply `TERMINATE` if the task is done.",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        "docs_path": "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md",
        "chunk_token_size": 1000,
        "model": config_list[0]["model"],
        "collection_name": "groupchat",
        "get_or_create": True,
    },
    code_execution_config=False,  # we don't want to execute code in this case.
    description="Assistant who has extra content retrieval power for solving difficult problems.",
)

coder = AssistantAgent(
    name="Senior_Python_Engineer",
    is_termination_msg=termination_msg,
    system_message="You are a senior python engineer, you provide python code to answer questions. Reply `TERMINATE` in the end when everything is done.",
    llm_config=llm_config,
    description="Senior Python Engineer who can write code to solve problems and answer questions.",
)

pm = autogen.AssistantAgent(
    name="Product_Manager",
    is_termination_msg=termination_msg,
    system_message="You are a product manager. Reply `TERMINATE` in the end when everything is done.",
    llm_config=llm_config,
    description="Product Manager who can design and plan the project.",
)

reviewer = autogen.AssistantAgent(
    name="Code_Reviewer",
    is_termination_msg=termination_msg,
    system_message="You are a code reviewer. Reply `TERMINATE` in the end when everything is done.",
    llm_config=llm_config,
    description="Code Reviewer who can review the code.",
)

PROBLEM = "How to use spark for parallel training in FLAML? Give me sample code."


def _reset_agents():
    boss.reset()
    boss_aid.reset()
    coder.reset()
    pm.reset()
    reviewer.reset()


def rag_chat():
    _reset_agents()
    groupchat = autogen.GroupChat(
        agents=[boss_aid, pm, coder, reviewer],
        messages=[],
        max_round=12,
        speaker_selection_method="round_robin",
    )
    manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

    # Start chatting with boss_aid as this is the user proxy agent.
    boss_aid.initiate_chat(
        manager,
        message=boss_aid.message_generator,
        problem=PROBLEM,
        n_results=3,
    )


def norag_chat():
    _reset_agents()
    groupchat = autogen.GroupChat(
        agents=[boss, pm, coder, reviewer],
        messages=[],
        max_round=12,
        speaker_selection_method="auto",
        allow_repeat_speaker=False,
    )
    manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

    # Start chatting with the boss as this is the user proxy agent.
    boss.initiate_chat(
        manager,
        message=PROBLEM,
    )


def call_rag_chat():
    _reset_agents()

    # In this case, there are multiple user proxy agents, and the chat is not
    # initiated with the RAG user proxy agent.
    # To use the RAG user proxy agent, wrap it in a function that the other
    # agents can call.
    def retrieve_content(
        message: Annotated[
            str,
            "Refined message which keeps the original meaning and can be used to retrieve content for code generation and question answering.",
        ],
        n_results: Annotated[int, "number of results"] = 3,
    ) -> str:
        print("fffffffffffffffffffffffffffffffffffff")
        boss_aid.n_results = n_results  # Set the number of results to be retrieved.
        # Check if we need to update the context.
        update_context_case1, update_context_case2 = boss_aid._check_update_context(message)
        if (update_context_case1 or update_context_case2) and boss_aid.update_context:
            boss_aid.problem = message if not hasattr(boss_aid, "problem") else boss_aid.problem
            _, ret_msg = boss_aid._generate_retrieve_user_reply(message)
        else:
            _context = {"problem": message, "n_results": n_results}
            ret_msg = boss_aid.message_generator(boss_aid, None, _context)
        return ret_msg if ret_msg else message

    boss_aid.human_input_mode = "NEVER"  # Disable human input for boss_aid since it only retrieves content.

    for caller in [pm, coder, reviewer]:
        d_retrieve_content = caller.register_for_llm(
            description="retrieve content for code generation and question answering."
        )(retrieve_content)
        print("funcccccccccccs: " + str(caller.llm_config["tools"]))

    for executor in [boss, pm]:
        executor.register_for_execution()(d_retrieve_content)
    assert pm.function_map["retrieve_content"]._origin == retrieve_content

    groupchat = autogen.GroupChat(
        agents=[boss, pm, coder, reviewer],
        messages=[],
        max_round=12,
        speaker_selection_method="round_robin",
        allow_repeat_speaker=False,
    )

    manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

    # Start chatting with the boss as this is the user proxy agent.
    boss.initiate_chat(
        manager,
        message=PROBLEM,
    )

import torch

print("torch.cuda.is_available(): ", torch.cuda.is_available())
print("torch.cuda.device_count(): ", torch.cuda.device_count())
print("torch.cuda.current_device(): ", torch.cuda.current_device())
print("torch.cuda.device(0): ", torch.cuda.device(0))
print("torch.cuda.get_device_name(0): ", torch.cuda.get_device_name(0))

[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None


torch.cuda.is_available():  True
torch.cuda.device_count():  1
torch.cuda.current_device():  0
torch.cuda.device(0):  <torch.cuda.device object at 0x7fe010253190>
torch.cuda.get_device_name(0):  NVIDIA Graphics Device


## Start Chat

### UserProxyAgent doesn't get the correct code
[FLAML](https://github.com/microsoft/FLAML) was open-sourced in 2020, so ChatGPT is familiar with it. However, its Spark-related APIs were added in 2022, so they were not in ChatGPT's training data. As a result, without retrieval we end up with invalid code.

In [None]:
norag_chat()
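`norag_chat` uses `speaker_selection_method="auto"`, while the other two helpers defined earlier use `"round_robin"`. As a simplified mental model of round-robin ordering, the sketch below cycles through the agent names in list order. It is an assumption-laden toy: the function name `round_robin_order` is made up here, and it ignores termination, tool-call turns, and any other logic in the real group chat manager.

```python
from typing import List


def round_robin_order(agent_names: List[str], first_speaker: str, max_round: int) -> List[str]:
    """Cycle through the agents in list order, starting from first_speaker."""
    start = agent_names.index(first_speaker)
    return [agent_names[(start + i) % len(agent_names)] for i in range(max_round)]


agents = ["Boss", "Product_Manager", "Senior_Python_Engineer", "Code_Reviewer"]
order = round_robin_order(agents, "Boss", 6)
print(order)
```

With `"auto"`, by contrast, the next speaker is chosen by the manager's LLM, so the order is not fixed in advance.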

### RetrieveUserProxyAgent gets the correct code
Since RetrieveUserProxyAgent can perform retrieval-augmented generation based on the given documentation file, ChatGPT can generate the correct code for us!

In [None]:
rag_chat()
# type exit to terminate the chat
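To make the retrieval-augmented flow concrete, here is a minimal sketch of the three steps involved: chunk a document, pick the chunks most relevant to the question, and place them in the prompt. It is not how `RetrieveUserProxyAgent` is implemented. Word overlap stands in for the vector search, word counts stand in for `chunk_token_size`, and the toy document text and all function names are invented for illustration. The `User's question is: ... Context is: ...` layout follows the prompt visible in the tool response output further down.

```python
import re
from typing import List, Set


def chunk_words(text: str, chunk_size: int) -> List[str]:
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def tokenize(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(query: str, chunks: List[str], n_results: int) -> List[str]:
    """Return the chunks sharing the most words with the query."""
    n_results = min(n_results, len(chunks))  # cannot return more chunks than exist
    query_tokens = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(query_tokens & tokenize(c)), reverse=True)
    return ranked[:n_results]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    context = "\n".join(context_chunks)
    return f"User's question is: {question}\n\nContext is: {context}"


# Toy document made of three eight-word sentences (placeholder text).
document = (
    "FLAML has integrated Spark for distributed model training. "
    "Spark can act as a parallel tuning backend. "
    "This padding sentence talks about something else entirely."
)
question = "How to use spark for parallel training in FLAML?"

chunks = chunk_words(document, chunk_size=8)
top = retrieve(question, chunks, n_results=2)
prompt = build_prompt(question, top)
print(prompt)
```

Requesting more results than there are chunks is clipped in `retrieve`, similar in spirit to the `updating n_results = 1` message in the output of the last section, where the collection held a single chunk.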

### Call RetrieveUserProxyAgent while initiating the chat with another user proxy agent
Sometimes you need to use RetrieveUserProxyAgent in a group chat without initiating the chat with it. In that case, wrap the RAG agent in a function that the other agents can call.
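The `Annotated` type hints on `retrieve_content` are what supply the parameter descriptions in the tool definition printed by the cell below. The following sketch shows one way such a schema could be derived with the standard library alone. It is an illustration, not AutoGen's implementation: `tool_schema` and `JSON_TYPES` are hypothetical names, the function body is a placeholder, the `message` description is shortened, and the helper assumes every parameter is annotated with `Annotated[type, "description"]`.

```python
import inspect
from typing import Annotated, Any, Dict, get_args, get_type_hints

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func, description: str) -> Dict[str, Any]:
    """Build a function-tool schema from Annotated parameter hints."""
    hints = get_type_hints(func, include_extras=True)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Annotated[str, "text"] unpacks to (str, "text").
        base_type, *metadata = get_args(hints[name])
        prop = {"type": JSON_TYPES[base_type], "description": metadata[0]}
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {
        "type": "function",
        "function": {
            "description": description,
            "name": func.__name__,
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


def retrieve_content(
    message: Annotated[str, "Refined message used to retrieve content."],
    n_results: Annotated[int, "number of results"] = 3,
) -> str:
    return message  # placeholder body for the sketch


schema = tool_schema(
    retrieve_content, "retrieve content for code generation and question answering."
)
print(schema)
```

Parameters without a default end up in `required`, which matches the printed tool definition where only `message` is required and `n_results` carries a default of 3.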

In [4]:
call_rag_chat()

[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_wrapper: autogen logger is None
[runtime logging] log_new_client: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None
[runtime logging] log_new_agent: autogen logger is None


funcccccccccccs: [{'type': 'function', 'function': {'description': 'retrieve content for code generation and question answering.', 'name': 'retrieve_content', 'parameters': {'type': 'object', 'properties': {'message': {'type': 'string', 'description': 'Refined message which keeps the original meaning and can be used to retrieve content for code generation and question answering.'}, 'n_results': {'type': 'integer', 'default': 3, 'description': 'number of results'}}, 'required': ['message']}}}]
funcccccccccccs: [{'type': 'function', 'function': {'description': 'retrieve content for code generation and question answering.', 'name': 'retrieve_content', 'parameters': {'type': 'object', 'properties': {'message': {'type': 'string', 'description': 'Refined message which keeps the original meaning and can be used to retrieve content for code generation and question answering.'}, 'n_results': {'type': 'integer', 'default': 3, 'description': 'number of results'}}, 'required': ['message']}}}]
func

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function GroupChatManager.a_run_chat at 0x7fe0d54d76a0>
reply funcccc : <function GroupChatManager.run_chat at 0x7fe0d54d7600>
reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': 'How to use spark for parallel training in FLAML? Give me sample code.', 'name': 'Boss', 'role': 'user'}
reply funcccc : <function ConversableAgent.a_generate_tool_calls_reply at 0x7fe0d54d4680>
reply funcccc : <function ConversableAgent.generate_tool_calls_reply at 0x7fe0d54d4540>
reply funcccc : <function ConversableAgent.a_generate_oai_reply at 0x7fe0d54d4180>
reply funcccc : <function ConversableAgent.generate_oai_reply at 0x7fe0d54d4040>
Pr

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': ' In Fairlearn for Accurate, Robust and Unbiased Machine Learning (FLAML), Spark can be utilized for parallel training of machine learning models. Here\'s an example using a logistic regression model:\n\nFirstly, you need to set up a Spark session:\n\n```python\nfrom pyspark.sql import SparkSession\nspark = SparkSession.builder \\\n    .appName("FLAML Sample Code") \\\n    .config("spark.some.config.option", "some-value") \\\n    .getOrCreate()\n```\n\nNow, let\'s load the data into a DataFrame:\n\n```python\nfrom pyspark.sql.types import DoubleType, StringType\ndata = [(0.0, 1), (1.0, 0), (

2024-09-28 18:23:29,677 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Use the existing collection `groupchat`.


fffffffffffffffffffffffffffffffffffff
Trying to create collection.


2024-09-28 18:23:30,131 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 1 chunks.
Number of requested results 3 is greater than number of elements in index 1, updating n_results = 1
Model mistral not found. Using cl100k_base encoding.


VectorDB returns doc_ids:  [['bdfbc921']]
Adding content of doc bdfbc921 to context.
Boss (to chat_manager):

Boss (to chat_manager):

***** Response from calling tool (call_6zj6i061) *****
You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
For code generation, you must obey the following rules:
Rule 1. You MUST NOT install any packages because all the packages needed are already installed.
Rule 2. You must follow the formats below to write your code:
```language
# your code
```

User's question is: You can refer the provided code as an example on how to use spark for parallel training in FLAML, specifically with a logistic regression model.

Context is: # Integrate - Spark

FLAML has integrated Spark for distributed training. There are two main aspect

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': 'You\'re a retrieve augmented coding assistant. You answer user\'s questions based on your own knowledge and the\ncontext provided by the user.\nIf you can\'t answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.\nFor code generation, you must obey the following rules:\nRule 1. You MUST NOT install any packages because all the packages needed are already installed.\nRule 2. You must follow the formats below to write your code:\n```language\n# your code\n```\n\nUser\'s question is: You can refer the provided code as an example on how to use spark 

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': ' It looks like you have provided a brief explanation of using SparkML models with AutoML in Flaml, along with an example code snippet and links to relevant notebooks for further reading. The first part explains how Flaml uses estimators with the `_spark` postfix by default when working with data in pandas-on-spark format.\n\nThe second part explains the use of Spark as a parallel backend for tasks such as AutoML and Hyperparameter Tuning, providing benefits for large models and datasets but also increasing overhead in some cases. It outlines various arguments that are available to control t

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': ' Here is some python code to further elaborate on your explanation:\n\n```python\nfrom flaml import AutoML, HyperparameterTuner\nfrom pyspark.ml.feature import Word2Vec, HashingTF\nfrom pyspark.ml.classification import LogisticRegression, RandomForestClassifier\nfrom pyspark.ml.regression import LinearRegression\n\n# Initialize Spark session\nspark = SparkSession.builder\\\n    .appName(\'FLAML demo\')\\\n    .getOrCreate()\n\n# Prepare data\ndata = spark.read.csv("data.csv", header=True)\n\n# Define model types for AutoML\nmodel_types = ["Classification", "Regression"]\n\n# Initialize Auto

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': 'Reply `TERMINATE` if the task is done.', 'name': 'Boss', 'role': 'user'}
reply funcccc : <function ConversableAgent.a_generate_tool_calls_reply at 0x7fe0d54d4680>
reply funcccc : <function ConversableAgent.generate_tool_calls_reply at 0x7fe0d54d4540>
reply funcccc : <function ConversableAgent.a_generate_oai_reply at 0x7fe0d54d4180>
reply funcccc : <function ConversableAgent.generate_oai_reply at 0x7fe0d54d4040>
Product_Manager (to chat_manager):

 TERMINATE

--------------------------------------------------------------------------------


[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': ' TERMINATE', 'name': 'Product_Manager', 'role': 'user'}
reply funcccc : <function ConversableAgent.a_generate_tool_calls_reply at 0x7fe0d54d4680>
reply funcccc : <function ConversableAgent.generate_tool_calls_reply at 0x7fe0d54d4540>
reply funcccc : <function ConversableAgent.a_generate_oai_reply at 0x7fe0d54d4180>
reply funcccc : <function ConversableAgent.generate_oai_reply at 0x7fe0d54d4040>
Senior_Python_Engineer (to chat_manager):

 Thank you for your suggestions and improvements! I have updated the explanation to include relevant comments in the example code, provide an examp

[runtime logging] log_chat_completion: autogen logger is None


reply funcccc : <function ConversableAgent.a_check_termination_and_human_reply at 0x7fe0d54d47c0>
reply funcccc : <function ConversableAgent.check_termination_and_human_reply at 0x7fe0d54d4720>
reply funcccc : <function ConversableAgent.a_generate_function_call_reply at 0x7fe0d54d4400>
reply funcccc : <function ConversableAgent.generate_function_call_reply at 0x7fe0d54d4360>
messageee : {'content': ' Thank you for your suggestions and improvements! I have updated the explanation to include relevant comments in the example code, provide an example error handling section, and suggest setting up environment prerequisites for getting started with Flaml using SparkML models, AutoML or Hyperparameter Tuning. Additionally, I\'ve added a note mentioning known limitations and potential performance issues when utilizing these features. Here\'s the updated explanation:\n\n   To utilize Flaml with SparkML models in your machine learning tasks, follow this brief guide, complete with an example code