In [2]:
import os
from dotenv import load_dotenv
from autogen.agentchat import ConversableAgent

# Configure the LLM

You need to set up your favorite LLM API as the backend for AutoGen. Save the API key, endpoint URL, and other settings in a `.env` file; `load_dotenv()` then makes them available through `os.getenv`.
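Before building the config, it can help to confirm that the settings were actually loaded. The sketch below assumes the three variable names used in the config cell of this notebook; `missing_env_vars` is a hypothetical helper, not part of AutoGen or python-dotenv.

```python
import os

# Variable names taken from the config cell in this notebook.
REQUIRED_VARS = ("AZURE_OPENAI_KEY", "OPENAI_API_VERSION", "AZURE_OPENAI_ENDPOINT")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Call this after load_dotenv(); an empty list means everything is set.
missing = missing_env_vars()
if missing:
    print("Missing settings:", ", ".join(missing))
```

An unset variable makes `os.getenv` return `None`, which would otherwise only surface later as a failed API call.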


In [3]:
load_dotenv()

llm_list = [
    {
        "model": "gpt-4o",
        "api_type": "azure",
        "api_key": os.getenv("AZURE_OPENAI_KEY"),
        "api_version": os.getenv("OPENAI_API_VERSION"),
        "base_url": os.getenv("AZURE_OPENAI_ENDPOINT"),
    }
]

llm_config = {
    "temperature": 0.7,
    "config_list": llm_list,
    "timeout": 60,
}

### Check if the LLM is working

In [6]:
assistant = ConversableAgent(
    name='assistant',
    llm_config=llm_config,
    code_execution_config=False, # turn off code execution
    function_map=None, # no registered functions (None is the default)
    human_input_mode='NEVER', # turn off human input
    description='helpful assistant',
)

response = assistant.generate_reply(
    messages = [
        {
            "content": "tell me a joke",
            "role": "user"
        }
    ]
)

print(response)

Sure, here's a joke for you:

Why don't scientists trust atoms?

Because they make up everything!


## Human Input

In this section, I will explain how to use the human input modes in AutoGen.

There are three modes:

1. `NEVER`: The agent never prompts for human input. In this mode, the conversation stops when the number of consecutive auto replies reaches `max_consecutive_auto_reply` or when `is_termination_msg` returns True.
2. `TERMINATE`: The agent prompts for human input only when a termination message is received or the number of consecutive auto replies reaches `max_consecutive_auto_reply`.
3. `ALWAYS`: The agent prompts for human input every time a message is received. In this mode, the conversation stops when the human input is "exit", or when `is_termination_msg` returns True and there is no human input.
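
The three rules can be condensed into one small decision function. This is a simplified model of the description above, not AutoGen's actual implementation; `should_prompt_human` and its parameters are hypothetical names.

```python
def should_prompt_human(mode, is_termination, auto_replies, max_auto_replies):
    """Decide whether to ask the human, following the three rules above."""
    if mode == "ALWAYS":
        return True  # every incoming message triggers a prompt
    if mode == "TERMINATE":
        # prompt only at a termination message or when the reply budget is used up
        return is_termination or auto_replies >= max_auto_replies
    if mode == "NEVER":
        return False  # no prompt; the conversation stops at those points instead
    raise ValueError(f"unknown human input mode: {mode!r}")


print(should_prompt_human("TERMINATE", is_termination=False, auto_replies=2, max_auto_replies=5))  # False
```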

The following is an example when set to `ALWAYS`:

In [8]:
human_proxy = ConversableAgent(
    name='human_proxy',
    llm_config=False, # no need to use llm for human input
    human_input_mode='ALWAYS',
    description='human proxy',
)

agent_with_number = ConversableAgent(
    name='agent_with_number',
    system_message="""
        You are playing a game of guess-my-number. You have the number 53 in your mind, and I will try to guess it.
        If I guess too high, you will say 'too high'; if I guess too low, you will say 'too low'.
    """,
    llm_config=llm_config,
    is_termination_msg=lambda x: '53' in x['content'], # The dict can contain the following keys: "content", "role", "name", "function_call".
    human_input_mode='NEVER', # turn off human input
    description='agent with number',
)

human_proxy.initiate_chat(
    recipient=agent_with_number,
    message="17"
)

human_proxy (to agent_with_number):

17

--------------------------------------------------------------------------------
agent_with_number (to human_proxy):

too low

--------------------------------------------------------------------------------
human_proxy (to agent_with_number):

23

--------------------------------------------------------------------------------
agent_with_number (to human_proxy):

too low

--------------------------------------------------------------------------------
human_proxy (to agent_with_number):

43

--------------------------------------------------------------------------------
agent_with_number (to human_proxy):

too low

--------------------------------------------------------------------------------
human_proxy (to agent_with_number):

51

--------------------------------------------------------------------------------
agent_with_number (to human_proxy):

too low

-------------

ChatResult(chat_id=None, chat_history=[{'content': '17', 'role': 'assistant', 'name': 'human_proxy'}, {'content': 'too low', 'role': 'user', 'name': 'agent_with_number'}, {'content': '23', 'role': 'assistant', 'name': 'human_proxy'}, {'content': 'too low', 'role': 'user', 'name': 'agent_with_number'}, {'content': '43', 'role': 'assistant', 'name': 'human_proxy'}, {'content': 'too low', 'role': 'user', 'name': 'agent_with_number'}, {'content': '51', 'role': 'assistant', 'name': 'human_proxy'}, {'content': 'too low', 'role': 'user', 'name': 'agent_with_number'}, {'content': '53', 'role': 'assistant', 'name': 'human_proxy'}], summary='53', cost={'usage_including_cached_inference': {'total_cost': 0.0021200000000000004, 'gpt-4o-2024-05-13': {'cost': 0.0021200000000000004, 'prompt_tokens': 400, 'completion_tokens': 8, 'total_tokens': 408}}, 'usage_excluding_cached_inference': {'total_cost': 0.0017250000000000002, 'gpt-4o-2024-05-13': {'cost': 0.0017250000000000002, 'prompt_tokens': 327, 'com
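
The termination check in the cell above indexes `x['content']` directly, which would raise an error if a message had no text content. A more defensive variant might look like the sketch below; `is_number_guessed` is a hypothetical name, and the assumption is that `content` may be missing or `None` in some messages.

```python
def is_number_guessed(message, target="53"):
    """Return True when the message text contains the target number."""
    content = message.get("content")
    # content may be missing or None, so check the type before searching it
    return isinstance(content, str) and target in content


print(is_number_guessed({"content": "Is it 53?", "role": "user"}))  # True
```

The function can be passed as `is_termination_msg=is_number_guessed` in place of the lambda.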

## Tool Use

Tools are predefined functions that agents can use. Instead of writing arbitrary code, agents can call tools to perform actions, such as searching the web, performing calculations, reading files, or calling remote APIs. Because you can control what tools are available to an agent, you can control what actions an agent can perform.

### 1. Define the tool

A tool is a regular Python function.
To help the LLM understand a tool, provide a docstring that describes it and annotate the types of its inputs and output. To attach a description to each input and output, use `Annotated` from `typing`. If the inputs and output are described only in the docstring, the whole docstring is treated as the function description, for example:
```txt
    Generate a random number from a Gamma distribution with the given shape and scale.

        Args:
            shape (int): The shape parameter of the Gamma distribution.
            scale (int): The scale parameter of the Gamma distribution.

        Returns:
            int: A random number from the Gamma distribution.
```

The following is an example of a tool that generates a random number from a Gamma distribution.

In [8]:
import numpy as np
from typing import Annotated

def generate_random_number(
    shape: Annotated[int, "The shape parameter of the Gamma distribution."],
    scale: Annotated[int, "The scale parameter of the Gamma distribution."]
) -> Annotated[int, "A random number from the Gamma distribution."]:
    """
    Generate a random number from a Gamma distribution with the given shape and scale.

    Args:
        shape (int): The shape parameter of the Gamma distribution.
        scale (int): The scale parameter of the Gamma distribution.

    Returns:
        int: A random number from the Gamma distribution.
    """
    if (shape <= 0) or (scale <= 0):
        raise ValueError("Shape and scale must be positive integers.")
        
    return int(np.random.gamma(shape, scale))

generate_random_number(2, 100)


34
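
To see what `Annotated` attaches to a signature, the sketch below reads the metadata back with `typing.get_type_hints`. It only illustrates the standard-library mechanism; how AutoGen reads the annotations internally is not shown here, and `parameter_descriptions` and `example_tool` are hypothetical names.

```python
from typing import Annotated, get_type_hints


def parameter_descriptions(func):
    """Map each parameter name to (type, description or None)."""
    hints = get_type_hints(func, include_extras=True)  # keep the Annotated wrappers
    described = {}
    for name, hint in hints.items():
        if name == "return":
            continue
        metadata = getattr(hint, "__metadata__", None)
        if metadata:  # Annotated[T, "text"] keeps T in __origin__ and the text in __metadata__
            described[name] = (hint.__origin__, metadata[0])
        else:
            described[name] = (hint, None)
    return described


# Hypothetical tool: only `shape` carries a per-parameter description.
def example_tool(shape: Annotated[int, "The shape parameter."], scale: int) -> int:
    """Docstring text stays a single block; it is not split per parameter."""
    return shape * scale


print(parameter_descriptions(example_tool))
```

A parameter without `Annotated` still has a type, but no description of its own.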

### 2. Register the tool

A tool must be registered with at least two agents for it to be useful in conversation. The agent registered with the tool’s signature through `register_for_llm` can call the tool; the agent registered with the tool’s function object through `register_for_execution` can execute the tool’s function.

- `register_for_llm`
    - Registers the tool's signature (name, description, and parameter schema) with the agent's LLM
    - Lets the LLM propose a call to the tool as part of the conversation flow
    - The schema is included in the LLM request, so it counts toward the context window
    
    ```python
    # Applied on the agent whose LLM decides when to call the tool
    @agent.register_for_llm()
    ```

- `register_for_execution`
    - Registers the tool's function object with the agent that runs it
    - Handles the work outside the LLM: running code, API calls, file operations, etc.

    ```python
    # Applied on the agent that executes the proposed tool call
    @agent.register_for_execution()
    ```


In [28]:
assistant = ConversableAgent(
    name='Assistant',
    system_message="""
        You are a helpful assistant that can use tools to perform actions.
        You can help with generating a random number from a Gamma distribution.
        Return 'DONE' when the task is completed.
        """,
    llm_config=llm_config
)


# the user proxy is used for interacting with the assistant agent and executes tool calls
user_proxy = ConversableAgent(
    name='User',
    llm_config=False, 
    is_termination_msg=lambda x: x.get('content') is not None and 'DONE' in x.get('content'),
    human_input_mode='NEVER'
)

# register the tool with the assistant agent
assistant.register_for_llm(
    name='generator'
)(generate_random_number)

# register the tool with user proxy agent
user_proxy.register_for_execution(
    name='generator'
)(generate_random_number)


<autogen.tools.tool.Tool at 0x1e258200cd0>

The registered signature is shown below. The docstring becomes the function description, and the type and `Annotated` description of each input are included automatically.

In [29]:
assistant.llm_config['tools']

[{'type': 'function',
  'function': {'description': '\n    Generate a random number from a Gamma distribution with the given shape and scale.\n\n    Args:\n        shape (int): The shape parameter of the Gamma distribution.\n        scale (int): The scale parameter of the Gamma distribution.\n\n    Returns:\n        int: A random number from the Gamma distribution.\n    ',
   'name': 'generator',
   'parameters': {'type': 'object',
    'properties': {'shape': {'type': 'integer',
      'description': 'The shape parameter of the Gamma distribution.'},
     'scale': {'type': 'integer',
      'description': 'The scale parameter of the Gamma distribution.'}},
    'required': ['shape', 'scale']}}}]

### 3. Using the tool

Once the tool is registered, we can use it in conversation. In the code below, we ask the assistant to generate a random number from a Gamma distribution with the given shape and scale.


In [30]:
number = user_proxy.initiate_chat(
    recipient=assistant,
    message="Generate a random number from a Gamma distribution with the shape 2 and scale 100."
)


User (to Assistant):

Generate a random number from a Gamma distribution with the shape 2 and scale 100.

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
Assistant (to User):

***** Suggested tool call (call_2Q2RmhWCxw0XgK8VFnzTBsOt): generator *****
Arguments: 
{"shape":2,"scale":100}
**************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION generator...
Call ID: call_2Q2RmhWCxw0XgK8VFnzTBsOt
Input arguments: {'shape': 2, 'scale': 100}
User (to Assistant):

***** Response from calling tool (call_2Q2RmhWCxw0XgK8VFnzTBsOt) *****
127
**********************************************************************

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
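
What the executing agent does with a suggested tool call can be pictured as a dictionary lookup followed by a function call. The sketch below is a toy model of that step, not AutoGen's implementation; `execute_tool_call`, `function_map`, and the stand-in tool `add` are hypothetical, and a deterministic tool is used so the result is predictable.

```python
import json


def add(a: int, b: int) -> int:
    """Stand-in tool so the example is deterministic (hypothetical)."""
    return a + b


# Conceptually what registering for execution provides: tool name -> callable.
function_map = {"adder": add}


def execute_tool_call(tool_call, function_map):
    """Look up the proposed tool by name and call it with the decoded arguments."""
    func = function_map.get(tool_call["name"])
    if func is None:
        return f"Error: tool {tool_call['name']!r} is not registered for execution"
    arguments = json.loads(tool_call["arguments"])  # arguments arrive as JSON text
    return str(func(**arguments))  # the result goes back into the chat as text


# Shape modelled on the log above: a tool name plus JSON-encoded arguments.
proposed = {"name": "adder", "arguments": '{"a": 2, "b": 100}'}
print(execute_tool_call(proposed, function_map))  # 102
```

This also shows why the tool must be registered on both sides: the LLM can only propose names it has been told about, and the executor can only run names it holds a function for.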

## Conversation Patterns

There are four types of conversation:
1. **Two-agent chat**: The simplest form of conversation pattern involves two agents communicating with each other using `initiate_chat`. This basic interaction allows direct communication between agents, with one agent initiating the conversation and the other responding.

2. **Sequential chat**: This pattern enables a sequence of chats between two agents. The conversations are linked through a carryover mechanism that passes the summary of previous chats into the context of subsequent chats. This allows for continuous, context-aware conversations where each interaction builds upon previous discussions.

3. **Group Chat**: This advanced pattern enables multiple agents to participate in a single conversation. The key consideration is determining which agent speaks next. AutoGen provides several selection strategies:
    - `round_robin`: Agents take turns in a fixed order
    - `random`: Random selection of next speaker
    - `manual`: Human user selects the next speaker
    - `auto`: Default mode where an LLM decides the next speaker
    
    The system also supports constraining speaker selection and custom selection functions. This enables building StateFlow models for deterministic workflows among agents.

4. **Nested Chat**: This pattern allows packaging complex multi-agent workflows into a single agent interface for modular reuse in larger systems. By encapsulating multiple agents and their interactions within a single interface, nested chats enable building hierarchical and modular conversation systems. This is particularly useful for creating reusable components in larger automated workflows.
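
The two simplest group chat strategies, `round_robin` and `random`, can be sketched in a few lines. This is a toy model that works on agent names only; `next_speaker` is a hypothetical function, and details such as whether the real strategies may repeat the last speaker are not modelled.

```python
import random


def next_speaker(agents, last_speaker, method="round_robin", rng=None):
    """Pick the next speaker name from `agents` (toy model, names only)."""
    if method == "round_robin":
        # fixed order: the agent after the last speaker, wrapping around
        index = agents.index(last_speaker)
        return agents[(index + 1) % len(agents)]
    if method == "random":
        return (rng or random).choice(agents)
    raise ValueError(f"unsupported method in this sketch: {method!r}")


agents = ["agent1", "agent2", "agent3"]
print(next_speaker(agents, "agent3"))  # agent1, because the order wraps around
```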

### Group Chat

The figure illustrates how group chat works:

![group_chat_figure](https://microsoft.github.io/autogen/0.2/assets/images/group-chat-2c88a3b4769d60898d0d4d2313d2d168.png)

The `GroupChatManager` can use any of the strategies listed above to select the next agent. To help the Group Chat Manager choose, provide a `description` for each agent. Without a `description`, the Group Chat Manager falls back to the agent's `system_message`, which may **not** be the best choice.

First, create a `GroupChat` object and provide the list of agents. Next, create a `GroupChatManager` object and pass it the `GroupChat` object.

Example:
```python
from autogen import GroupChat, GroupChatManager

group_chat = GroupChat(
    agents=[agent1, agent2, agent3],
    messages=[],
    max_round=6
)

group_chat_manager = GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config
)

# speaker_selection_method was not set on the GroupChat, so it defaults to "auto"
chat_result = agent1.initiate_chat(
    group_chat_manager,
    message="My number is 3, I want to turn it into 13.",
    summary_method="reflection_with_llm",
)
```

In the previous example, the `description` of each agent helps the Group Chat Manager select the next agent. However, it only helps the manager; it does not tell the participating agents about each other. Sometimes it is useful to have each agent introduce itself to the other agents in the group chat. This can be done by setting `send_introductions=True`.

```python
group_chat_with_instruction = GroupChat(
    agents=[agent1, agent2, agent3],
    messages=[],
    max_round=6,
    send_introductions=True
)
```

### Group Chat in a Sequential Chat
Group chat can also be used as a part of a sequential chat. In this case, the Group Chat Manager is treated as a regular agent in the sequence of two-agent chats.

```python
group_chat_manager_with_intros = GroupChatManager(
    groupchat=group_chat_with_instruction,
    llm_config=llm_config,
)

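# initiate_chats takes a list of chat configurations and runs them in order
chat_results = agent4.initiate_chats(
    [
        {
            "recipient": group_chat_manager_with_intros,
            "message": "My number is 3, I want to turn it into 13.",
            "summary_method": "reflection_with_llm",
        },
    ]
)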
```
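
The carryover mechanism behind sequential chats can be pictured as a loop that hands the summaries of finished chats to each later chat. The sketch below is a toy model under that assumption; `run_sequence` and `fake_chat` are hypothetical stand-ins, and no LLM is involved.

```python
def run_sequence(chats, run_chat):
    """Run chats in order, passing earlier summaries to later chats as carryover."""
    summaries = []
    for chat in chats:
        # each chat sees the summaries of all chats that finished before it
        summary = run_chat(chat["message"], carryover=list(summaries))
        summaries.append(summary)
    return summaries


def fake_chat(message, carryover):
    """Hypothetical stand-in for a real chat: the 'summary' just records its inputs."""
    return f"{message} (context: {len(carryover)} earlier summaries)"


results = run_sequence([{"message": "first task"}, {"message": "second task"}], fake_chat)
print(results[-1])  # second task (context: 1 earlier summaries)
```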

### Constrained Speaker Selection

When the number of agents is large, it is hard to control the conversation flow. The `allowed_or_disallowed_speaker_transitions` argument is a dictionary that maps a given agent to a list of agents that can (or cannot) be selected to speak next; `speaker_transitions_type` states which of the two meanings applies.

```python
# Keys and values are the agent objects themselves, not their names
allowed_transitions = {
    agent1: [agent2, agent3],
    agent2: [agent1, agent3],
}

constrained_group_chat = GroupChat(
    agents=[agent1, agent2, agent3],
    messages=[],
    max_round=6,
    allowed_or_disallowed_speaker_transitions=allowed_transitions,
    speaker_transitions_type="allowed",
    send_introductions=True,
)
```
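
An allowed map and a disallowed map can describe the same constraints, since one is the complement of the other over the full set of agents. The sketch below converts between the two using plain agent names for brevity (a real `GroupChat` expects agent objects); `complement_transitions` is a hypothetical helper, and it assumes an agent with no entry has an empty list.

```python
def complement_transitions(transitions, agent_names):
    """Turn an allowed map into the matching disallowed map, or vice versa."""
    return {
        name: [other for other in agent_names if other not in transitions.get(name, [])]
        for name in agent_names
    }


names = ["agent1", "agent2", "agent3"]
allowed = {"agent1": ["agent2", "agent3"], "agent2": ["agent1", "agent3"], "agent3": []}
disallowed = complement_transitions(allowed, names)
print(disallowed["agent1"])  # ['agent1'], so agent1 may not speak twice in a row
```

Applying the function twice returns the original map, which is a quick way to check that both forms say the same thing.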



### Nested Chats

Nested chats package a workflow into a single agent for reuse in a larger workflow. They are powered by the nested chats handler, which is a pluggable component of `ConversableAgent`. In each of the nested chats, the sender agent is always the same agent that triggered the nested chats. In the end, the nested chat handler uses the results of the nested chats to produce a response to the original message. By default, the nested chat handler uses the summary of the last chat as the response.

![nested-chat](https://microsoft.github.io/autogen/0.2/assets/images/nested-chats-857c3a6a2115d53d588e4fea09d5daf7.png)
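
The idea of hiding a workflow behind a single reply can be sketched as a function that runs the inner chats and returns the last summary. This is a toy model of the default behaviour described above, not the handler's real code; `make_nested_reply`, `draft`, and `review` are hypothetical, and passing the previous summary to the next inner chat is an assumption of this sketch.

```python
def make_nested_reply(inner_chats):
    """Wrap a list of inner chats behind a single reply function (toy model)."""
    def reply(message):
        summary = None
        for chat in inner_chats:
            # every inner chat is started on behalf of the same triggering agent
            summary = chat(message, summary)
        return summary  # the summary of the last chat becomes the response
    return reply


# Hypothetical inner chats: each takes the message and the previous summary.
def draft(message, previous):
    return f"draft of '{message}'"


def review(message, previous):
    return f"reviewed {previous}"


writer_reply = make_nested_reply([draft, review])
print(writer_reply("a short poem"))  # reviewed draft of 'a short poem'
```

From the outside, `writer_reply` looks like one agent answering one message, which is what makes the pattern reusable in a larger workflow.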