Is there a better way to execute the generated code? #768
Good observation here. I recall that @afourney also noted a few cases where the order of code blocks had some impact on the results (e.g. installing deps before code).
Yes @victordibia, that is issue #430. I think prompting can help, but the existing prompt is pretty strong. It states: "The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user." I think one problem is that the system prompt is right at the top of the conversation and can be forgotten in longer conversations. Perhaps a floating system prompt would be better (moving it dynamically to right before the generation).
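A minimal sketch of the "floating" idea, assuming a plain list of chat-style message dicts (the helper name and message shapes are illustrative, not part of autogen's API): before each generation, the system message is moved to the end of the history so it sits closest to the model's next turn.

```python
def float_system_prompt(messages):
    """Move any system messages to the end of the history so they are
    the last thing the model sees before generating.
    Hypothetical helper -- names and message format are illustrative."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    return others + system

history = [
    {"role": "system", "content": "Reply with exactly one code block."},
    {"role": "user", "content": "Sum the numbers 1 to 10."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "Now sum 1 to 100."},
]
print(float_system_prompt(history)[-1]["role"])  # prints "system"
```

Whether a trailing system message actually improves adherence depends on the model; this only shows the reordering mechanics.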
Thanks @victordibia, @afourney; it seems that many weaker models will not follow the prompt as expected.
What if, instead of executing code, we have the user proxy return a static message when the extracted block count is > 1 (and the languages match)? Something like "Please consolidate this into only one self-contained code block." This would result in an extra call, but would use the LLM's coding abilities to hopefully synthesize the code correctly.
Sounds good! Maybe we can create a function to consolidate the code, and in the function, it actually calls the LLM to do it. In this way, we could save some token usage. |
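The detection side of this proposal can be sketched without any LLM call at all: count the fenced blocks, and if there is more than one in the same language, either return the static "please consolidate" message or (as a cheap local fallback shown here) concatenate them. The function name and behavior are illustrative, not an actual autogen API.

```python
import re

FENCE = "`" * 3  # triple backtick, built up to keep this example readable
CODE_BLOCK = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)

def consolidate(reply: str) -> str:
    """If the reply contains multiple fenced code blocks in the same
    language, merge them into one block; otherwise return it unchanged.
    A real implementation might instead ask the LLM to do the merge,
    as suggested in the discussion above."""
    blocks = CODE_BLOCK.findall(reply)
    langs = {lang for lang, _ in blocks}
    if len(blocks) > 1 and len(langs) == 1:
        merged = "\n".join(code.strip("\n") for _, code in blocks)
        return f"{FENCE}{langs.pop()}\n{merged}\n{FENCE}"
    return reply
```

Naively concatenating blocks is only safe when they were meant to run in order; the LLM-based merge discussed above handles the general case better.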
Is this solved by using the stateful jupyter code executor? |
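A stateful executor sidesteps the problem because every block runs in the same long-lived session, so later blocks see names defined by earlier ones. The toy class below demonstrates the principle with a shared namespace; it is a stand-in for illustration, not autogen's actual Jupyter executor API.

```python
class StatefulExecutor:
    """Toy stand-in for a stateful (Jupyter-style) code executor:
    all blocks run against one shared namespace, so definitions
    persist across blocks. Illustrative only."""

    def __init__(self):
        self.ns = {}

    def run(self, code: str) -> None:
        # exec against the shared namespace keeps state between calls
        exec(code, self.ns)

ex = StatefulExecutor()
ex.run("def sum_numbers(n):\n    return sum(range(1, n + 1))")
ex.run("result = sum_numbers(10)")   # sees sum_numbers from the first block
print(ex.ns["result"])               # prints 55
```

A real Jupyter-kernel-backed executor adds process isolation and timeouts on top of the same idea: one persistent session per conversation.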
I get
Replace
@jackgerrits please see the above comment. |
What version of websocket-client are you using?
websocket-client 1.6.4 |
Could you retry using websocket-client 1.7.0?
1.7.0 works well. I checked the code of websocket-client.
Closing as resolved.
The current code execution logic has an issue when code is generated in several blocks and later blocks depend on earlier ones.
For example:
The generated code is good, but the execution fails because the code blocks are executed separately. With the feedback
NameError: name 'sum_numbers' is not defined
GPT-3.5-turbo can usually merge the blocks into one block, but models that are not as capable as GPT-3.5 will not merge them and keep failing. Either way, it would be better to execute the blocks correctly.
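The failure mode is easy to reproduce by running each block in a fresh interpreter, which is effectively what a stateless executor does. The two blocks below are a hypothetical reconstruction of the kind of generated code described above, not the exact code from the issue.

```python
import subprocess
import sys

# Hypothetical stand-ins for the two generated code blocks.
block_1 = "def sum_numbers(n):\n    return sum(range(1, n + 1))\n"
block_2 = "print(sum_numbers(10))\n"

errors = []
for block in (block_1, block_2):
    # A stateless executor starts a fresh interpreter per block, so the
    # second block cannot see sum_numbers defined by the first.
    result = subprocess.run(
        [sys.executable, "-c", block],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        errors.append(result.stderr.strip().splitlines()[-1])

print(errors)  # the second block fails with a NameError for sum_numbers
```

Running both blocks through a single stateful session (as discussed above) would make the second block succeed and print 55.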