# Customize Speaker Selection

In GroupChat, we can also customize the speaker selection by passing in a function to `speaker_selection_method`:
```python
def custom_speaker_selection_func(
    last_speaker: Agent,
    groupchat: GroupChat
) -> Union[Agent, Literal['auto', 'manual', 'random', 'round_robin'], None]:

    """Define a customized speaker selection function.
    A recommended way is to define a transition for each speaker in the groupchat.

    Parameters:
        - last_speaker: Agent
            The last speaker in the group chat.
        - groupchat: GroupChat
            The GroupChat object
    Return:
        Return one of the following:
        1. an `Agent` instance; it must be one of the agents in the group chat.
        2. a string from ['auto', 'manual', 'random', 'round_robin'] to select a default method to use.
        3. None, which indicates the chat should be terminated.
    """
    pass

groupchat = autogen.GroupChat(
    speaker_selection_method=custom_speaker_selection_func,
    ...,
)
```
The last speaker and the groupchat object are passed to the function.
Commonly used variables from groupchat are `groupchat.messages` and `groupchat.agents`, which are the message history and the agents in the group chat, respectively. You can also access other attributes of the groupchat, such as `groupchat.allowed_speaker_transitions_dict`, which holds the pre-defined allowed speaker transitions.
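
As a rough sketch of how such a function might read these attributes, the example below uses small stand-in classes (`StubAgent` and `StubGroupChat`, both hypothetical and not part of AutoGen) so that it runs without an LLM. It ends the chat when the last message contains `TERMINATE` (an assumed convention for this example) and otherwise hands the turn to the next agent in `groupchat.agents`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StubAgent:
    """Hypothetical stand-in for an AutoGen agent; only the name is modeled."""
    name: str


@dataclass
class StubGroupChat:
    """Hypothetical stand-in exposing the two attributes discussed above."""
    agents: List[StubAgent]
    messages: List[Dict[str, str]] = field(default_factory=list)


def select_next_speaker(last_speaker: StubAgent, groupchat: StubGroupChat) -> Optional[StubAgent]:
    """Return the next agent, or None to terminate the chat."""
    # Inspect the message history: stop once the latest message says TERMINATE.
    if groupchat.messages and "TERMINATE" in groupchat.messages[-1].get("content", ""):
        return None
    # Otherwise rotate through the agents in their listed order.
    index = groupchat.agents.index(last_speaker)
    return groupchat.agents[(index + 1) % len(groupchat.agents)]


alice, bob = StubAgent("alice"), StubAgent("bob")
chat = StubGroupChat(agents=[alice, bob], messages=[{"name": "alice", "content": "Hello"}])
print(select_next_speaker(alice, chat).name)  # bob
```

With real AutoGen objects the same shape should apply: the function receives the last speaker and the group chat, and returns an agent, one of the built-in method names, or `None`.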

Here is a simple example that builds a research workflow with customized speaker selection.


```{=mdx}
![group_chat](../../../blog/2024-02-29-StateFlow/img/sf_example_1.png)
```

We define the following agents:

- Initializer: Start the workflow by sending a task.
- Coder: Retrieve papers from the internet by writing code.
- Executor: Execute the code.
- Scientist: Read the papers and write a summary.

In the figure, we define a simple research workflow with four states: Init, Retrieve, Research, and End. Within each state, we call different agents to perform the tasks.

- Init: We use the initializer to start the workflow.
- Retrieve: We first call the coder to write code and then call the executor to execute the code.
- Research: We call the scientist to read the papers and write a summary.
- End: We end the workflow.
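
The four states can be viewed as a small state machine. The sketch below models only the state-level transitions in plain Python, independent of AutoGen. The `State` enum and the `next_state` helper are illustrative names introduced here, and the `execution_succeeded` flag is an assumption about how the Retrieve state decides whether to repeat.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    INIT = "Init"
    RETRIEVE = "Retrieve"
    RESEARCH = "Research"
    END = "End"


def next_state(state: State, execution_succeeded: bool = True) -> Optional[State]:
    """Return the state that follows `state`, or None once the workflow has ended."""
    if state is State.INIT:
        return State.RETRIEVE
    if state is State.RETRIEVE:
        # Stay in Retrieve until the generated code runs successfully.
        return State.RESEARCH if execution_succeeded else State.RETRIEVE
    if state is State.RESEARCH:
        return State.END
    return None  # State.END has no successor


# Walk the happy path from Init to End.
state: Optional[State] = State.INIT
path = []
while state is not None:
    path.append(state.value)
    state = next_state(state)
print(" -> ".join(path))  # Init -> Retrieve -> Research -> End
```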

In [None]:
!pip install pyautogen[retrievechat]



In [None]:
from dotenv import load_dotenv
import os
os.chdir("../../")
load_dotenv()

True

In [None]:
import autogen

# Put your API key in the environment variable OPENAI_API_KEY
config_list = [
    {
        "model": "gpt-4-0125-preview",
        "api_key": os.getenv("OPENAI_API_KEY"),
    },
    {
        "model": "gemini-pro",
        "api_key": os.getenv("GOOGLE_API_KEY"),
        "api_type": "google"
    },
    {
        "model": "gemini-pro-vision",
        "api_key": os.getenv("GOOGLE_API_KEY"),
        "api_type": "google"
    }
]

# You can also create a file called "OAI_CONFIG_LIST" and store your config there
# config_list = autogen.config_list_from_json(
#     "OAI_CONFIG_LIST",
#     filter_dict={
#         "model": ["gpt-4-0125-preview"],
#     },
# )

gpt4_config = {
    "cache_seed": 42,  # change the cache_seed for different trials
    "temperature": 0,
    "config_list": [config_list[0]],
    "timeout": 120,
}
gemini_config = {
    "cache_seed": 42,  # change the cache_seed for different trials
    "temperature": 0,
    "config_list": [config_list[1]],
    "timeout": 120,
}
# choose llm config
llm_config = gemini_config
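
Selecting a configuration by list position, as in `config_list[0]`, can silently pick the wrong model if the list is reordered. One possible alternative is to look the entry up by model name. The `pick_config` helper below is a hypothetical convenience written for this sketch, not an AutoGen function, and the placeholder keys stand in for real credentials.

```python
from typing import Any, Dict, List


def pick_config(config_list: List[Dict[str, Any]], model: str) -> Dict[str, Any]:
    """Return the first config entry whose "model" field equals `model`."""
    for config in config_list:
        if config.get("model") == model:
            return config
    raise ValueError(f"No config found for model {model!r}")


# Placeholder entries; real code would read the keys from the environment.
example_config_list = [
    {"model": "gpt-4-0125-preview", "api_key": "<openai-key>"},
    {"model": "gemini-pro", "api_key": "<google-key>", "api_type": "google"},
]

example_llm_config = {
    "cache_seed": 42,
    "temperature": 0,
    "config_list": [pick_config(example_config_list, "gemini-pro")],
    "timeout": 120,
}
print(example_llm_config["config_list"][0]["model"])  # gemini-pro
```

Raising an error for a missing model makes a misconfigured list fail early instead of at the first LLM call.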



In [None]:
initializer = autogen.UserProxyAgent(
    name="Init",
)

coder = autogen.AssistantAgent(
    name="Retrieve_coder",
    llm_config=llm_config,
    system_message="""You are the Coder. Given a topic, write code to retrieve related papers from the arXiv API, print their title, authors, abstract, and link.
    You write python/shell code to solve tasks. Wrap the code in a code block that specifies the script type. The user can't modify your code. So do not suggest incomplete code which requires others to modify. Don't use a code block if it's not intended to be executed by the executor.
    Don't include multiple code blocks in one response. Do not ask others to copy and paste the result. Check the execution result returned by the executor.
    If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try. Ensure proper error handling such that an appropriate format of results is returned with the error code.
    """,
)
executor = autogen.UserProxyAgent(
    name="Retrieve_executer",
    system_message="Executor. Execute the code written by the Coder and report the result.",
    human_input_mode="NEVER",
    code_execution_config={
        "last_n_messages": 3,
        "work_dir": "paper",
        "use_docker": False,
    },  # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
)
scientist = autogen.AssistantAgent(
    name="Research_Action_1",
    llm_config=llm_config,
    system_message="""You are the Scientist. Please categorize papers after seeing their abstracts printed and create a markdown table with Domain, Title, Authors, Summary and Link""",
)


def state_transition(last_speaker, groupchat):
    messages = groupchat.messages

    if last_speaker is initializer:
        # init -> retrieve
        return coder
    elif last_speaker is coder:
        # retrieve: action 1 -> action 2
        return executor
    elif last_speaker is executor:
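        # The executor's reply is expected to start with "exitcode: <n> (...)",
        # so match the prefix instead of comparing the whole message.
        if messages[-1]["content"].startswith("exitcode: 1"):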
            # retrieve --(execution failed)--> retrieve
            return coder
        else:
            # retrieve --(execution success)--> research
            return scientist
    elif last_speaker is scientist:
        # research -> end
        return None


groupchat = autogen.GroupChat(
    agents=[initializer, coder, executor, scientist],
    messages=[],
    max_round=20,
    speaker_selection_method=state_transition,
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4_config)
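
The failure check in `state_transition` depends on the text of the executor's reply. Assuming the reply begins with a line such as `exitcode: 1 (execution failed)` followed by the code output (this format is an assumption about the executor's message and may vary between AutoGen versions), a slightly more defensive sketch extracts the numeric code instead of comparing strings. The `parse_exit_code` and `execution_failed` helpers are hypothetical names introduced for illustration.

```python
import re
from typing import Optional

# Matches a leading "exitcode: <integer>" prefix, e.g. "exitcode: 1 (execution failed)".
_EXITCODE_PATTERN = re.compile(r"^\s*exitcode:\s*(-?\d+)")


def parse_exit_code(content: Optional[str]) -> Optional[int]:
    """Return the exit code reported at the start of `content`, or None if absent."""
    if not content:
        return None
    match = _EXITCODE_PATTERN.match(content)
    return int(match.group(1)) if match else None


def execution_failed(content: Optional[str]) -> bool:
    """Treat any non-zero exit code as a failure; unknown formats count as success."""
    code = parse_exit_code(content)
    return code is not None and code != 0


print(execution_failed("exitcode: 1 (execution failed)\nCode output: Traceback ..."))  # True
print(execution_failed("exitcode: 0 (execution succeeded)\nCode output: done"))  # False
```

Inside a speaker selection function, `execution_failed(messages[-1]["content"])` could then decide whether to return the coder again or move on to the scientist.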



In [None]:
from datetime import datetime
chat_result = initializer.initiate_chat(
    manager, message=f"Topic: LLM applications papers within the past month from today: {str(datetime.now().date())}. Requirement: 5 papers from different domains."
)

Init (to chat_manager):

Topic: LLM applications papers within the past month from today: 2024-06-23. Requirement: 5 papers from different domains.

--------------------------------------------------------------------------------
Retrieve_coder (to chat_manager):

```python
import requests
import datetime
from lxml import html

def get_arxiv_papers(query, start_date, end_date, max_results=10):
  """Get a list of arXiv papers matching the given query and date range."""

  # Construct the arXiv API query URL
  base_url = "http://export.arxiv.org/api/query?"
  params = {
      "search_query": query,
      "start_date": start_date,
      "end_date": end_date,
      "sortBy": "relevance",
      "max_results": max_results
  }
  url = base_url + "&".join("%s=%s" % (k, v) for k, v in params.items())

  # Make the request and parse the response
  response = requests.get(url)
  root = html.fromstring(response.content)

  # Extract the paper titles, authors, abstracts, and links
```

In [None]:
import json
print(json.dumps(chat_result.chat_history[-1], indent=4)) 

{
    "content": "| Domain | Title | Authors | Summary | Link |\n| ----------- | ----------- | ----------- | ----------- | ----------- |\n| Information Retrieval | Lost in Translation: Large Language Models in Non-English Content Analysis | Gabriel Nicholas, Aliya Bhatia | Explores the capabilities and limits of multilingual language models and offers recommendations for their research, development, and deployment. | http://arxiv.org/abs/2306.07377v1 |\n| Language Modeling | Cedille: A large autoregressive French language model | Martin M\u00fcller, Florian Laurent | Introduces Cedille, a large open-source autoregressive language model specifically trained for the French language, demonstrating its competitive performance on French zero-shot benchmarks. | http://arxiv.org/abs/2202.03371v1 |\n| Machine Translation | How Good are Commercial Large Language Models on African Languages? | Jessica Ojo, Kelechi Ogueji | Analyzes the performance of commercial language models on eight African l