GPT-4 often outputs code before describing dependencies. This leads to inefficiencies when Autogen runs them in this order. #430

Open
afourney opened this issue Oct 25, 2023 · 1 comment

@afourney
Member

When asked to perform a task that involves code execution, GPT-4 often correctly outputs both a code block and instructions for installing the necessary dependencies, but it emits them in that order: code first, install command second.

This results in Autogen attempting to execute the code block, failing (due to missing dependencies), debugging the failure, and only then trying to execute the code block that would have correctly installed the dependencies in the first place.

This can lead to numerous messages and inefficiencies, and can sometimes result in total failures.

Proposed solution: Update the AssistantAgent's prompt to instruct the model to list code blocks in the order they should be executed, so that dependency installation comes before the code that needs it.
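A minimal sketch of what such a prompt change might look like. The instruction wording and the helper name `with_ordering_instruction` are hypothetical; the issue does not specify the exact text to add.

```python
# Hypothetical wording; the issue does not prescribe the exact instruction.
ORDERING_INSTRUCTION = (
    "When your reply contains more than one code block, list the blocks in the "
    "order they must be executed. Put any dependency installation commands "
    "(for example, an sh block running `pip install`) before the code that "
    "needs them."
)


def with_ordering_instruction(system_message: str) -> str:
    """Return system_message with the ordering instruction appended once."""
    if ORDERING_INSTRUCTION in system_message:
        # Already present; avoid appending it a second time.
        return system_message
    return system_message.rstrip() + "\n\n" + ORDERING_INSTRUCTION


if __name__ == "__main__":
    # Placeholder base prompt, used here only for illustration.
    print(with_ordering_instruction("You are a helpful AI assistant."))
```

The resulting string could then be passed as the assistant's system message (assuming the `system_message` argument of the AssistantAgent constructor). Whether GPT-4 reliably follows the instruction would still need to be checked against traces like the one below.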

Here is a typical trace of this problem:

user_proxy (to assistant):

Plot and save to disk a chart of NVDA and TESLA stock price YTD.

--------------------------------------------------------------------------------
assistant (to user_proxy):

Sure. We will use 'yfinance' to download stock prices from Yahoo finance. 'Matplotlib' will be used for plotting the graph. 

Here is the Python code for you to execute. 

'''python
# filename : plot_stock.py

import yfinance as yf
import matplotlib.pyplot as plt
from datetime import date, timedelta

# use 'yfinance' to download the stock prices
nvda = yf.download('NVDA', start=date(date.today().year, 1, 1), end=date.today())
tsla = yf.download('TSLA', start=date(date.today().year, 1, 1), end=date.today())

plt.figure(figsize=(14,7))
plt.plot(nvda.index, nvda.Close, label='NVDA')
plt.plot(tsla.index, tsla.Close, label='TSLA')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.title('NVDA and TESLA stock price YTD')
plt.legend(loc='upper left')
plt.grid(True)
plt.savefig('plot.png')
print("'plot.png' has been saved!")
'''

Before running this, make sure you have all the required dependencies by running:

'''sh
pip install yfinance matplotlib
'''
After your dependencies are installed, save the Python code into a file named 'plot_stock.py' and run the script with `python plot_stock.py`. This code downloads the stock data for NVDA and TESLA from Yahoo finance and plots them. A png image named 'plot.png' containing the graph will be saved to disk.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):

exitcode: 1 (execution failed)
Code output: 
Traceback (most recent call last):
  File "", line 3, in <module>
    import yfinance as yf
ModuleNotFoundError: No module named 'yfinance'


--------------------------------------------------------------------------------
assistant (to user_proxy):

I apologize for the inconvenience. It seems that the 'yfinance' module is not installed in your environment. To resolve this issue, you need to install the 'yfinance' module. The pip command installs this module:

'''sh
pip install yfinance
'''
  
Then, please rerun our python script:

'''python
# filename : plot_stock.py

import yfinance as yf
import matplotlib.pyplot as plt
from datetime import date, timedelta

# use 'yfinance' to download the stock prices
nvda = yf.download('NVDA', start=date(date.today().year, 1, 1), end=date.today())
tsla = yf.download('TSLA', start=date(date.today().year, 1, 1), end=date.today())

plt.figure(figsize=(14,7))
plt.plot(nvda.index, nvda.Close, label='NVDA')
plt.plot(tsla.index, tsla.Close, label='TSLA')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.title('NVDA and TESLA stock price YTD')
plt.legend(loc='upper left')
plt.grid(True)
plt.savefig('plot.png')
print("'plot.png' has been saved!")
'''

Please install the library and rerun the python script.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is sh)...

>>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is python)...
user_proxy (to assistant):

exitcode: 1 (execution failed)
Code output: 
Collecting yfinance
  Obtaining dependency information for yfinance from https://files.pythonhosted.org/packages/e6/b3/388ab967a387cc92926f70e97688dd9a7189b29a0773db815ffc5289e2b5/yfinance-0.2.31-py2.py3-none-any.whl.metadata
  Downloading yfinance-0.2.31-py2.py3-none-any.whl.metadata (11 kB)
Collecting pandas>=1.3.0 (from yfinance)
  Obtaining dependency information for pandas>=1.3.0 from https://files.pythonhosted.org/packages/de/ce/b5d9c7ce1aaf9023b823c81932a50cd5e8f407198a696b0d1c6025a40b03/pandas-2.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading pandas-2.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Requirement already satisfied: numpy>=1.16.5 in /usr/local/lib/python3.11/site-packages (from yfinance) (1.26.1)
Requirement already satisfied: requests>=2.31 in /usr/local/lib/python3.11/site-packages (from yfinance) (2.31.0)
Collecting multitasking>=0.0.7 (from yfinance)
  Downloading multitasking-0.0.11-py3-none-any.whl (8.5 kB)
Collecting lxml>=4.9.1 (from yfinance)
  Obtaining dependency information for lxml>=4.9.1 from https://files.pythonhosted.org/packages/ed/62/ffc30348ae141f69f9f23b65ba769db7ca209856c9a9b3406279e0ea24de/lxml-4.9.3-cp311-cp311-manylinux_2_28_x86_64.whl.metadata
  Downloading lxml-4.9.3-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (3.8 kB)
Collecting appdirs>=1.4.4 (from yfinance)
  Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Collecting pytz>=2022.5 (from yfinance)
  Obtaining dependency information for pytz>=2022.5 from https://files.pythonhosted.org/packages/32/4d/aaf7eff5deb402fd9a24a1449a8119f00d74ae9c2efa79f8ef9994261fc2/pytz-2023.3.post1-py2.py3-none-any.whl.metadata
  Downloading pytz-2023.3.post1-py2.py3-none-any.whl.metadata (22 kB)
Collecting frozendict>=2.3.4 (from yfinance)
  Downloading frozendict-2.3.8-py311-none-any.whl (14 kB)
Collecting peewee>=3.16.2 (from yfinance)
  Downloading peewee-3.17.0.tar.gz (2.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 3.4 MB/s eta 0:00:00
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Collecting beautifulsoup4>=4.11.1 (from yfinance)
  Downloading beautifulsoup4-4.12.2-py3-none-any.whl (142 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 143.0/143.0 kB 9.7 MB/s eta 0:00:00
Collecting html5lib>=1.1 (from yfinance)
  Downloading html5lib-1.1-py2.py3-none-any.whl (112 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 112.2/112.2 kB 5.3 MB/s eta 0:00:00
Collecting soupsieve>1.2 (from beautifulsoup4>=4.11.1->yfinance)
  Obtaining dependency information for soupsieve>1.2 from https://files.pythonhosted.org/packages/4c/f3/038b302fdfbe3be7da016777069f26ceefe11a681055ea1f7817546508e3/soupsieve-2.5-py3-none-any.whl.metadata
  Downloading soupsieve-2.5-py3-none-any.whl.metadata (4.7 kB)
Collecting six>=1.9 (from html5lib>=1.1->yfinance)
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting webencodings (from html5lib>=1.1->yfinance)
  Downloading webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Collecting python-dateutil>=2.8.2 (from pandas>=1.3.0->yfinance)
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 kB 4.4 MB/s eta 0:00:00
Collecting tzdata>=2022.1 (from pandas>=1.3.0->yfinance)
  Downloading tzdata-2023.3-py2.py3-none-any.whl (341 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 341.8/341.8 kB 4.4 MB/s eta 0:00:00
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/site-packages (from requests>=2.31->yfinance) (3.3.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/site-packages (from requests>=2.31->yfinance) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/site-packages (from requests>=2.31->yfinance) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/site-packages (from requests>=2.31->yfinance) (2023.7.22)
Downloading yfinance-0.2.31-py2.py3-none-any.whl (65 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.6/65.6 kB 3.6 MB/s eta 0:00:00
Downloading lxml-4.9.3-cp311-cp311-manylinux_2_28_x86_64.whl (7.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.9/7.9 MB 5.6 MB/s eta 0:00:00
Downloading pandas-2.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.2/12.2 MB 6.2 MB/s eta 0:00:00
Downloading pytz-2023.3.post1-py2.py3-none-any.whl (502 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 502.5/502.5 kB 6.6 MB/s eta 0:00:00
Downloading soupsieve-2.5-py3-none-any.whl (36 kB)
Building wheels for collected packages: peewee
  Building wheel for peewee (pyproject.toml): started
  Building wheel for peewee (pyproject.toml): finished with status 'done'
  Created wheel for peewee: filename=peewee-3.17.0-cp311-cp311-linux_x86_64.whl size=895533 sha256=ccb5ca29911ba0c0ea55e6e022c2132292a87fe1dccd06e53a00b01861963aa3
  Stored in directory: /root/.cache/pip/wheels/02/20/23/74a10d0cd31f5d41c19b92ddf4c138ceff01b9f4675f19dbf5
Successfully built peewee
Installing collected packages: webencodings, pytz, peewee, multitasking, appdirs, tzdata, soupsieve, six, lxml, frozendict, python-dateutil, html5lib, beautifulsoup4, pandas, yfinance
Successfully installed appdirs-1.4.4 beautifulsoup4-4.12.2 frozendict-2.3.8 html5lib-1.1 lxml-4.9.3 multitasking-0.0.11 pandas-2.1.1 peewee-3.17.0 python-dateutil-2.8.2 pytz-2023.3.post1 six-1.16.0 soupsieve-2.5 tzdata-2023.3 webencodings-0.5.1 yfinance-0.2.31

Traceback (most recent call last):
  File "", line 4, in <module>
    import matplotlib.pyplot as plt
ModuleNotFoundError: No module named 'matplotlib'


--------------------------------------------------------------------------------
assistant (to user_proxy):

Ah, I see that 'matplotlib' is also not installed. Please run this shell command:

'''sh
pip install matplotlib
'''

After successfully installing 'matplotlib', rerun the python script:

'''python
# filename : plot_stock.py

import yfinance as yf
import matplotlib.pyplot as plt
from datetime import date, timedelta

# use 'yfinance' to download the stock prices
nvda = yf.download('NVDA', start=date(date.today().year, 1, 1), end=date.today())
tsla = yf.download('TSLA', start=date(date.today().year, 1, 1), end=date.today())

plt.figure(figsize=(14,7))
plt.plot(nvda.index, nvda.Close, label='NVDA')
plt.plot(tsla.index, tsla.Close, label='TSLA')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.title('NVDA and TESLA stock price YTD')
plt.legend(loc='upper left')
plt.grid(True)
plt.savefig('plot.png')
print("'plot.png' has been saved!")
'''

Please remember to install the 'matplotlib' library before rerunning the Python script.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is sh)...

>>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is python)...
user_proxy (to assistant):

exitcode: 0 (execution succeeded)
Code output: 
Collecting matplotlib
  Obtaining dependency information for matplotlib from https://files.pythonhosted.org/packages/65/5b/3b8fd7d66043f0638a35fa650570cbe69efd42fe169e5024f9307598b47e/matplotlib-3.8.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading matplotlib-3.8.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.8 kB)
Collecting contourpy>=1.0.1 (from matplotlib)
  Obtaining dependency information for contourpy>=1.0.1 from https://files.pythonhosted.org/packages/b7/f6/78f60fa0b6ae64971178e2542e8b3ad3ba5f4f379b918ab7b18038a3f897/contourpy-1.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading contourpy-1.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.9 kB)
Collecting cycler>=0.10 (from matplotlib)
  Obtaining dependency information for cycler>=0.10 from https://files.pythonhosted.org/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl.metadata
  Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
  Obtaining dependency information for fonttools>=4.22.0 from https://files.pythonhosted.org/packages/72/2c/7634a6c16b29d0c31cf54051beefab796abdfe8f52abead6d09e5403696e/fonttools-4.43.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading fonttools-4.43.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (152 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 152.4/152.4 kB 2.7 MB/s eta 0:00:00
Collecting kiwisolver>=1.0.1 (from matplotlib)
  Obtaining dependency information for kiwisolver>=1.0.1 from https://files.pythonhosted.org/packages/17/ba/17a706b232308e65f57deeccae503c268292e6a091313f6ce833a23093ea/kiwisolver-1.4.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading kiwisolver-1.4.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.4 kB)
Requirement already satisfied: numpy<2,>=1.21 in /usr/local/lib/python3.11/site-packages (from matplotlib) (1.26.1)
Collecting packaging>=20.0 (from matplotlib)
  Obtaining dependency information for packaging>=20.0 from https://files.pythonhosted.org/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl.metadata
  Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pillow>=6.2.0 (from matplotlib)
  Obtaining dependency information for pillow>=6.2.0 from https://files.pythonhosted.org/packages/6f/d8/f31dd84b4363b5f24c71b25a13ec3855f5ff233e07e1c3f1f8e979e12be2/Pillow-10.1.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata
  Downloading Pillow-10.1.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (9.5 kB)
Collecting pyparsing>=2.3.1 (from matplotlib)
  Obtaining dependency information for pyparsing>=2.3.1 from https://files.pythonhosted.org/packages/39/92/8486ede85fcc088f1b3dba4ce92dd29d126fd96b0008ea213167940a2475/pyparsing-3.1.1-py3-none-any.whl.metadata
  Downloading pyparsing-3.1.1-py3-none-any.whl.metadata (5.1 kB)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/site-packages (from matplotlib) (2.8.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/site-packages (from python-dateutil>=2.7->matplotlib) (1.16.0)
Downloading matplotlib-3.8.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 11.7 MB/s eta 0:00:00
Downloading contourpy-1.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (302 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.8/302.8 kB 11.5 MB/s eta 0:00:00
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.43.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.8/4.8 MB 13.9 MB/s eta 0:00:00
Downloading kiwisolver-1.4.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 13.7 MB/s eta 0:00:00
Downloading packaging-23.2-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.0/53.0 kB 5.4 MB/s eta 0:00:00
Downloading Pillow-10.1.0-cp311-cp311-manylinux_2_28_x86_64.whl (3.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 14.9 MB/s eta 0:00:00
Downloading pyparsing-3.1.1-py3-none-any.whl (103 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.1/103.1 kB 7.6 MB/s eta 0:00:00
Installing collected packages: pyparsing, pillow, packaging, kiwisolver, fonttools, cycler, contourpy, matplotlib
Successfully installed contourpy-1.1.1 cycler-0.12.1 fonttools-4.43.1 kiwisolver-1.4.5 matplotlib-3.8.0 packaging-23.2 pillow-10.1.0 pyparsing-3.1.1


[*********************100%%**********************]  1 of 1 completed

[*********************100%%**********************]  1 of 1 completed
'plot.png' has been saved!


--------------------------------------------------------------------------------
assistant (to user_proxy):

Great! The python code ran successfully and the graph is saved in the file named 'plot.png'. This file is located in the same directory from where you ran the script. You can check it to see the plotted graph of NVDA and TESLA stock price Year-to-Date.

Feel free to ask if you have more queries or tasks.

TERMINATE

--------------------------------------------------------------------------------
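The trace above shows three round trips for what could have been one. As a complement to the prompt change (and not what this issue proposes), the executing side could in principle move install commands ahead of other blocks before running them. The sketch below is a standalone illustration under that assumption; `extract_blocks` and `install_first` are hypothetical helpers, not Autogen functions, and the fence regex is a simplification.

```python
import re

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this example

# Simplified pattern: opening fence, optional language tag, body, closing fence.
BLOCK_RE = re.compile(FENCE + r"(\w*)[ \t]*\r?\n(.*?)" + FENCE, re.DOTALL)

SHELL_LANGS = {"sh", "bash", "shell"}
PIP_INSTALL_RE = re.compile(r"\bpip3?\s+install\b")


def extract_blocks(message: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs in the order they appear in message."""
    return [(lang.lower(), code) for lang, code in BLOCK_RE.findall(message)]


def is_install_block(block: tuple[str, str]) -> bool:
    """True for shell blocks that contain a pip install command."""
    lang, code = block
    return lang in SHELL_LANGS and PIP_INSTALL_RE.search(code) is not None


def install_first(blocks: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Move install blocks to the front, keeping relative order otherwise."""
    # sorted() is stable, so blocks with equal keys keep their original order.
    return sorted(blocks, key=lambda b: 0 if is_install_block(b) else 1)


if __name__ == "__main__":
    reply = (
        "Here is the code:\n"
        f"{FENCE}python\nimport yfinance as yf\n{FENCE}\n"
        "Before running this, install the dependencies:\n"
        f"{FENCE}sh\npip install yfinance matplotlib\n{FENCE}\n"
    )
    for lang, code in install_first(extract_blocks(reply)):
        print(lang, "->", code.strip())
```

Reordering on the executor side is a heuristic: it only recognizes `pip install` in shell blocks and would not help when the model omits the install step entirely.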
@afourney added the robustness (issues/pr related to robustness) and code-execution (execute generated code) labels Oct 25, 2023
@afourney self-assigned this Oct 25, 2023
@afourney changed the title from "GPT-4's often outputs code before describing dependencies. This leads to inefficiencies when Autogen runs them in this oder." to "GPT-4 often outputs code before describing dependencies. This leads to inefficiencies when Autogen runs them in this order." Oct 25, 2023
@bioshazard
Collaborator

Would be great if it knew to recommend a requirements.txt as a first step
