[Bug]: Code execution seems strange for AUTOGEN STUDIO's group chat feature #1423

Open
ShaneYuTH opened this issue Jan 26, 2024 · 21 comments
Labels
bug Something isn't working studio Related to AutoGen Studio.

Comments

@ShaneYuTH

Describe the bug

Hi @victordibia, I did a clean installation to test the group chat feature, and the role selection seems quite strange when it comes to tool/function calling. In my experience with autogen's native group chat, code execution is done by the userproxy agent; in autogen studio, however, code execution is assigned to an assistant agent. The assistant agent tries to execute the code and fails, and on some tries it even hallucinates and gives a false result (in my case it claimed the image was generated, but nothing showed up under the scratch folder).

Steps to reproduce

  1. Create a group chat workflow set up as described below
  2. Ask any painting related questions (e.g. Help me create a painting containing only these keywords: forest, monster, sunrise, human)

Expected Behavior

The tool/function call should be executed by the userproxy agent, as in autogen's native group chat. Instead, it is not executed correctly: an assistant agent tries to execute the code and gives false results. On some tries it even hallucinates, saying the code executed successfully and the image was generated, yet nothing appears under the scratch folder.

Screenshots and logs


Below is the workflow setup:

Workflow Name:
Creative Art Group Workflow
Workflow Description:
Creative Art Group Workflow
Summary Method:
last

Sender:
{"type": "userproxy", "config": {"name": "userproxy", "llm_config": false, "human_input_mode": "NEVER", "max_consecutive_auto_reply": 5, "system_message": "", "is_termination_msg": null, "code_execution_config": {"work_dir": null, "use_docker": false}}, "id": "c98af1ec-af7c-483e-86f6-0d2ec3b4cc05", "timestamp": "2024-01-26T17:20:22.881648", "user_id": "default", "skills": null, "description": "User proxy agent to execute code"}

Receiver (the content for skills is not included; it is the default skill that comes with dbdefaults.json):
{"type": "groupchat", "config": {"name": "group_chat_manager", "llm_config": {"config_list": [{"model": "gpt-4-1106-preview"}], "temperature": 0.1, "cache_seed": null, "timeout": 600}, "human_input_mode": "NEVER", "max_consecutive_auto_reply": 8, "system_message": "Group chat manager", "is_termination_msg": null, "code_execution_config": null}, "groupchat_config": {"agents": [{"type": "assistant", "config": {"name": "primary_assistant", "llm_config": {"config_list": [{"model": "gpt-4-1106-preview"}], "temperature": 0.1, "cache_seed": null, "timeout": 600}, "human_input_mode": "NEVER", "max_consecutive_auto_reply": 8, "system_message": "You are a helpful assistant that can provide creative art work for a user. You are the primary coordinator who will receive suggestions or advice from other agents (paint_assistant, design_assistant). You must ensure that the finally work integrates the suggestions and results from other agents or team members. YOUR FINAL RESPONSE MUST BE THE COMPLETE PLAN that ends with the word TERMINATE. 
", "is_termination_msg": null, "code_execution_config": null}, "id": "c5861e30-0a56-476a-a372-576ffd811682", "timestamp": "2024-01-26T17:20:22.881968", "user_id": "default", "skills": null, "description": null}, {"type": "assistant", "config": {"name": "design_assistant", "llm_config": {"config_list": [{"model": "gpt-4-1106-preview"}], "temperature": 0.1, "cache_seed": null, "timeout": 600}, "human_input_mode": "NEVER", "max_consecutive_auto_reply": 8, "system_message": "You are a creative design assistant that provides design ideas for artists based on user requests.", "is_termination_msg": null, "code_execution_config": null}, "id": "05e3806e-cc76-4959-a9a1-55169b2fd220", "timestamp": "2024-01-26T17:20:22.882009", "user_id": "default", "skills": null, "description": "Design assistant is responsible for the creative design ideas of an art work."}, {"type": "assistant", "config": {"name": "paint_assistant", "llm_config": {"config_list": [{"model": "gpt-4-1106-preview"}], "temperature": 0.1, "cache_seed": null, "timeout": 600}, "human_input_mode": "NEVER", "max_consecutive_auto_reply": 8, "system_message": "You are a helpful assistant that can use available functions when needed to generate painting.", "is_termination_msg": null, "code_execution_config": null}, "id": "9251df00-dd77-4fa4-babe-1d729b59d035", "timestamp": "2024-01-26T17:15:44.452450", "user_id": "default", "skills": [{"title": "generate_images", "file_name": "generate_images.py", "content": (content for generate_images.py), "id": "b8c34d70-e8b3-4df6-aee0-77adfb4e81cc", "description": "This skill generates images from a given query using OpenAI's DALL-E model and saves them to disk.", "timestamp": "2024-01-26T17:15:44.452448", "user_id": "default"}], "description": "A paint assistant agent that code to generate images based on design_assistant's request."}], "admin_name": "groupchat_assistant", "messages": [], "max_round": 10, "speaker_selection_method": "auto", "allow_repeat_speaker": false}, "id": 
"e673a664-cb9d-473d-b04a-ffd86522f2a3", "timestamp": "2024-01-26T17:20:22.882050", "user_id": "default", "description": ""}


Below is the execution result in console:

The execution result suggests that the execution succeeded and the file was created; however, nothing is generated under the scratch folder!
Also worth noting: assistant agents (design_assistant, primary_assistant) are trying to execute the code instead of the user proxy agent.

userproxy (to group_chat_manager):

Help me create a painting containing only these keywords: forest, monster, sunrise, human

--------------------------------------------------------------------------------
paint_assistant (to group_chat_manager):

To create a painting containing the keywords "forest," "monster," "sunrise," and "human," I will use the `generate_and_save_images` function from the `skills.py` file. This function will generate an image based on the provided query using OpenAI's DALL-E model.

Here is the Python code that you need to execute. Make sure that the `skills.py` file is in the same directory as the script you are running. If you need to install any Python packages, please let me know before running the code, and I will provide the necessary shell script to install them.

```python
# filename: generate_painting.py
from generate_images import generate_and_save_images

# Define the query with the given keywords
query = "A painting of a forest with a monster, a human, and a sunrise in the background"

# Call the function to generate and save the image
generate_and_save_images(query)
```

Please save the above code in a file named `generate_painting.py` and run it. The function will generate the image and save it to your local disk. It will also print the path to the saved image file. Let me know once you have run the code and what the output is.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change
design_assistant (to group_chat_manager):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "generate_painting.py", line 2, in <module>
    from generate_images import generate_and_save_images
ModuleNotFoundError: No module named 'generate_images'


--------------------------------------------------------------------------------
paint_assistant (to group_chat_manager):

It seems there was a mistake in the import statement. The function `generate_and_save_images` should be imported from `skills.py` rather than `generate_images`. Let's correct the import statement and try running the code again.

Please update the code in `generate_painting.py` with the following corrected version:

```python
# filename: generate_painting.py
from skills import generate_and_save_images

# Define the query with the given keywords
query = "A painting of a forest with a monster, a human, and a sunrise in the background"

# Call the function to generate and save the image
generate_and_save_images(query)
```

Make sure that the `skills.py` file is in the same directory as the `generate_painting.py` script. Run the updated script and let me know the output.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change
primary_assistant (to group_chat_manager):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "generate_painting.py", line 2, in <module>
    from skills import generate_and_save_images
ModuleNotFoundError: No module named 'skills'


--------------------------------------------------------------------------------
paint_assistant (to group_chat_manager):

It appears that there is an issue with locating the `skills.py` module. To resolve this, we need to ensure that the `skills.py` file is in the same directory as the script you are trying to run. If the `skills.py` file is indeed in the same directory, then the issue might be with the Python environment or the way the script is being executed.

Let's try a different approach by directly including the `generate_and_save_images` function in the script you are running. This will eliminate the need for an import statement.

Please update the code in `generate_painting.py` with the following code that includes the function definition:

```python
# filename: generate_painting.py
from typing import List
import uuid
import requests
from pathlib import Path

# Assuming openai package is installed and OpenAI key is set in the environment
from openai import OpenAI

def generate_and_save_images(query: str, image_size: str = "1024x1024") -> List[str]:
    client = OpenAI()  # Initialize the OpenAI client
    response = client.images.generate(model="dall-e-3", prompt=query, n=1, size=image_size)  # Generate images

    saved_files = []

    if response.data:
        for image_data in response.data:
            file_name = str(uuid.uuid4()) + ".png"
            file_path = Path(file_name)

            img_url = image_data.url
            img_response = requests.get(img_url)
            if img_response.status_code == 200:
                with open(file_path, "wb") as img_file:
                    img_file.write(img_response.content)
                    print(f"Image saved to {file_path}")
                    saved_files.append(str(file_path))
            else:
                print(f"Failed to download the image from {img_url}")
    else:
        print("No image data found in the response!")

    return saved_files

# Define the query with the given keywords
query = "A painting of a forest with a monster, a human, and a sunrise in the background"

# Call the function to generate and save the image
generate_and_save_images(query)
```

Please save the above code in a file named `generate_painting.py` and run it. This script includes the necessary function and should not require the `skills.py` file. Let me know the output after running this updated script.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
execute_code was called without specifying a value for use_docker. Since the python docker package is not available, code will be run natively. Note: this fallback behavior is subject to change
primary_assistant (to group_chat_manager):

exitcode: 0 (execution succeeded)
Code output: 
Image saved to f1e69d8e-da51-487e-9945-b9259667c418.png


--------------------------------------------------------------------------------
paint_assistant (to group_chat_manager):

Great! The script executed successfully and the image has been saved with the filename `f1e69d8e-da51-487e-9945-b9259667c418.png`. You should now be able to find this image in the same directory where you ran the script.

The painting should contain the elements you requested: a forest, a monster, a human, and a sunrise. You can view the image by navigating to the file location and opening it with any image viewer software you have available.

If you need further assistance or would like to create another image, feel free to ask. Otherwise, if you're satisfied with the result, that concludes our task.

TERMINATE

--------------------------------------------------------------------------------
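
One possible reading of the log above, not confirmed anywhere in this thread: the script saves the image under a relative file name, so the file lands in whatever directory the executing process uses as its current directory, which need not be the scratch folder. The sketch below only demonstrates that general Python behavior. The folder names `scratch` and `elsewhere`, the file name `painting.png`, and the helper `save_relative` are hypothetical and stand in for the real AutoGen Studio paths.

```python
import os
import tempfile
from pathlib import Path


def save_relative(file_name: str, work_dir: str) -> Path:
    """Write a file by relative name while work_dir is the current directory."""
    previous = os.getcwd()
    os.chdir(work_dir)
    try:
        # A relative path is resolved against the current working directory.
        Path(file_name).write_bytes(b"fake image bytes")
        return Path(file_name).resolve()
    finally:
        os.chdir(previous)


with tempfile.TemporaryDirectory() as root:
    scratch = Path(root) / "scratch"      # hypothetical: the folder the user inspects
    elsewhere = Path(root) / "elsewhere"  # hypothetical: the folder the executor runs in
    scratch.mkdir()
    elsewhere.mkdir()

    saved = save_relative("painting.png", str(elsewhere))

    in_scratch = (scratch / "painting.png").exists()
    in_elsewhere = (elsewhere / "painting.png").exists()
    print(f"in scratch: {in_scratch}, in elsewhere: {in_elsewhere}")
```

If this is what happened, searching the machine for the UUID file name printed in the log would show where the executing agent actually ran.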

Additional Information

autogenstudio==0.0.33a0
pyautogen==0.2.0

@ShaneYuTH ShaneYuTH added the bug Something isn't working label Jan 26, 2024
@gagb gagb added the studio Related to AutoGen Studio. label Jan 26, 2024
@jmanhype

Major issue

@ahernd2

ahernd2 commented Feb 2, 2024

I am seeing the same. I have a group chat, and I can see in the chat log that scripts are being suggested, however the bots never create or run those scripts

@sonichi
Contributor

sonichi commented Feb 11, 2024

@victordibia Is this a known bug?

@saibhaskerraju

Hi @victordibia ,

this is really sad to see the state of AutogenStudio.

image

Autogenstudio : 0.0.44a
python : 3.11
env : Docker

@ShaneYuTH
Author

> @victordibia Is this a known bug?

Is it a known bug? I saw some similar issues in the past weeks that seem to relate to this problem.

@cleverpig

MARK! it's very important

@khayyamkhan

I can only get code execution to work with a two-agent chat. Once I add the same agents to a group chat, they are unable to execute code and return a generic error instead:
"I'm sorry for any confusion, but as an AI developed by OpenAI, I'm unable to directly execute code or scripts. However, I can guide you through the process of running the code on your local machine or server."

@sidhujag
Collaborator

sidhujag commented Feb 24, 2024

works for me... but maybe my fork is too different

@victordibia maybe we should sync up to see the differences and find a way to pull some changes into yours. There are some dependencies that your autogen doesn't have, like persistence, but we can work around those things, I think.

@ShaneYuTH
Author

> works for me... but maybe my fork is too different
>
> @victordibia maybe we should sync up to see the differences and find a way to pull some changes into yours. There are some dependencies that your autogen doesn't have, like persistence, but we can work around those things, I think.

Would you mind sharing your version for autogenstudio and autogen? Thanks a lot.

@victordibia
Collaborator

victordibia commented Feb 28, 2024

@ALL,
@saibhaskerraju , @ShaneYuTH , @khayyamkhan

The way GroupChat is configured is critical to the behaviors you see.
Understanding AutoGen agent class behaviors is valuable here. I wrote an article here. The relevant section is:

  • UserProxyAgent as code executor: Use the userProxyAgent as a code executor (it should have a code_execution_config) that serves to execute code in any message it receives and then shares the result with other agents.

  • AssistantAgent as the main work horse: The AssistantAgent should be primed to implement the core behaviors you want to achieve. E.g. if you want to generate charts, retrieve documents, call functions etc, these should be specified in the system message of an AssistantAgent. Also specific skills (either as function calls or code added to system message) should be added to the assistant agent (not the user proxy). From this perspective, the AssistantAgent should not have code_execution_config.

  • Extend the Default Assistant Prompt: In writing prompts for your assistant, a good first step is to extend the default assistant prompt and add your own instructions. The main benefit here is that the default assistant prompt instructs the agent to create a plan and write code to solve the task. This is important as the code written is received by the user proxy and then executed. If no code is written, the user proxy does not get any code to execute and nothing happens. (@saibhaskerraju)
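
To make the last point concrete, here is a minimal sketch of the idea that an executor can only run what it finds inside fenced code blocks in a message. This is an illustration only, not AutoGen's actual extraction logic; the names `CODE_BLOCK` and `extract_code_blocks` and the two sample messages are made up for this example.

```python
import re
from typing import List, Tuple

FENCE = "`" * 3  # built programmatically so this example can itself sit inside a fence
CODE_BLOCK = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(message: str) -> List[Tuple[str, str]]:
    """Return (language, code) pairs for every fenced block in a message."""
    return [(lang or "unknown", code) for lang, code in CODE_BLOCK.findall(message)]


# An assistant reply that contains code gives the executor something to run.
with_code = "Run this:\n" + FENCE + "python\nprint('hi')\n" + FENCE

# A reply with no fenced block gives the executor nothing to do.
without_code = "I'm unable to directly execute code or scripts."

blocks_found = extract_code_blocks(with_code)
blocks_missing = extract_code_blocks(without_code)
print(blocks_found)
print(blocks_missing)
```

Under this simplified model, a refusal-style reply like the one quoted earlier in the thread yields an empty list, so nothing is executed.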

As an example, I am attaching a sample workflow config that shows a team structured to generate art content. 3 agents

  • art director - assistant agent that plans the art pieces to be created, generates ideas
  • art implementer -- assistant agent that writes the code that implements the ideas
  • user proxy -- executes code
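
The team structure above can be sketched as a receiver config and checked for a code executor. The dictionary shape below mirrors the workflow JSON posted at the top of this issue, trimmed to the fields that matter here. The agent names `art_director`, `art_implementer` and `user_proxy`, and the helpers `agent` and `code_executors`, are hypothetical, since the contents of the attached workflow file are not shown in this thread.

```python
from typing import Any, Dict, List


def agent(kind: str, name: str, code_execution_config: Any = None) -> Dict[str, Any]:
    """Build a trimmed agent entry in the shape used by the workflow JSON above."""
    return {
        "type": kind,
        "config": {"name": name, "code_execution_config": code_execution_config},
    }


def group(agents: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a trimmed groupchat receiver holding the given agents."""
    return {
        "type": "groupchat",
        "config": {"name": "group_chat_manager", "code_execution_config": None},
        "groupchat_config": {"agents": agents},
    }


def code_executors(receiver: Dict[str, Any]) -> List[str]:
    """Names of userproxy agents inside the group chat that can execute code."""
    agents = receiver.get("groupchat_config", {}).get("agents", [])
    return [
        a["config"]["name"]
        for a in agents
        if a.get("type") == "userproxy" and a["config"].get("code_execution_config")
    ]


# Hypothetical rendering of the three-agent painter team described above.
painter_team = group([
    agent("assistant", "art_director"),
    agent("assistant", "art_implementer"),
    agent("userproxy", "user_proxy", {"work_dir": None, "use_docker": False}),
])

# The group chat from the original report: three assistants, no userproxy inside.
reported_team = group([
    agent("assistant", "primary_assistant"),
    agent("assistant", "design_assistant"),
    agent("assistant", "paint_assistant"),
])

print(code_executors(painter_team))
print(code_executors(reported_team))
```

A check like this flags the originally reported workflow, where the userproxy is only the sender and is not listed among the group chat agents.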

Results are in the attached video.

workflow_Painter Team.json

groupchat.mp4

@victordibia
Collaborator

victordibia commented Feb 28, 2024

@gagb @sonichi ,
Any other tips you might have to add to the above (mostly on how to configure agents to ensure code gets generated and executed correctly)?

@khayyamkhan

@victordibia Thanks for sharing the workflow and video. It still seems strange to me that agents within the groupchat (underneath the groupchatmanager) can't execute any code. It doesn't allow any role separation, meaning I would have to stack all of the skills with the userproxy. Is that correct, or am I still missing something?

@victordibia
Collaborator

@khayyamkhan,

Can you explain what you mean by "agents within the groupchat cannot execute code?".
Would you expect that all agents should execute code, some of them or only one of them? What would be the behavior you expect?

@khayyamkhan

> @khayyamkhan,
>
> Can you explain what you mean by "agents within the groupchat cannot execute code?". Would you expect that all agents should execute code, some of them or only one of them? What would be the behavior you expect?

The expected behavior is that an agent with skills should be able to join a groupchat and still be able to execute its skills. I attached an image to try and help explain my issue.

When agent_bob is in a two-agent chat, his skills work. But when I add him into a group-chat, his skills do not work.
image

@ShaneYuTH
Author

Thanks @victordibia's great explanation, it helps a lot!

Also, it is a bit counterintuitive that you have to add USERPROXY agent again after you click receiver (make sure to add it in group chat agents). It works for me now. @khayyamkhan.

@khayyamkhan

> Thanks @victordibia's great explanation, it helps a lot!
>
> Also, it is a bit counterintuitive that you have to add USERPROXY agent again after you click receiver (make sure to add it in group chat agents). It works for me now. @khayyamkhan.

I added the USERPROXY agent into the groupchat but it still doesn't work. What am I missing here?

@ShaneYuTH
Author

> > Thanks @victordibia's great explanation, it helps a lot!
> > Also, it is a bit counterintuitive that you have to add USERPROXY agent again after you click receiver (make sure to add it in group chat agents). It works for me now. @khayyamkhan.
>
> I added the USERPROXY agent into the groupchat but it still doesn't work. What am I missing here?

If you'd like to share some more detail maybe I can try to figure out what's going on.

@hqnicolas

Hello dear friends!
Here I have 2 agents working
and when I create the same stack, with 3 agents
the code doesn't get executed

@victordibia
Collaborator

@hqnicolas , do you have a user_proxy in your groupchat?
There needs to be a user_proxy which is what executes the code.

Please review the painter team example above.
workflow_Painter Team.json

@hqnicolas

hqnicolas commented Apr 10, 2024

@victordibia yes, I'm using userproxy and not user_proxy.
I changed from MacOS to Windows and got the same problem:
the two agents execute code normally, but the group chat can't run the code.

Two agents example:
workflow_Youtube Video Transcript.json

Just change the LLM to your GPT4 agent and run

Edit: delete C:\Users\username\.autogenstudio
pip uninstall autogenstudio
pip install autogenstudio
autogenstudio ui --port 8081
delete all userproxy inside workflows
use the original userproxy that came with autogen

@hqnicolas

hqnicolas commented Apr 10, 2024

@victordibia I had edited the original userproxy,
and that causes this error in group chat.
Please disable the "remove" option for the userproxy agent; it will cause that bug.

Here is the Fixed Workflow

workflow_Youtube Video Transcript With Group Chat.json

Huge Thanks
