Refactored prompt.py to reduce token usage (#1996)

* Refactored prompt.py to reduce token usage * Reverted some destructive changes * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * Apply suggestions from code review * Apply suggestions from code review * Update agenthub/codeact_agent/prompt.py * fix integration test * make lint * feat: support ToolQA benchmark (#2263) * Add files via upload * Update README.md * Update run_infer.py * Update utils.py * make lint * Update evaluation/toolqa/run_infer.py --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> * feat: revert hiden special paths change in file action (#2328) * revert change in file action * remove useless code * make lint * Support gpqa benchmark evaluation (#2080) * feat: add gpqa benchmark evaluation * add metrics * reset configs in final block * make lint --------- Co-authored-by: yufansong <yufan@risingwave-labs.com> * fix(frontend): prevent API key from resetting after modal change (#2329) * remove bottom chatbox fade * Modal wider; fix lint error * settings: attempt to not clear api key for same provider * prevent api key from resetting after changing the model * revert other changes and fix post test tear down error --------- Co-authored-by: amanape <83104063+amanape@users.noreply.github.com> * fix: codeact bug [If running a command that never returns, it gets stuck #1895] (#2034) * fix: codeact bug #1895 * fix: add CmdRunAction timeout hint. * Update agenthub/codeact_agent/prompt.py Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> * regenerate integration test --------- Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> * Feat: Support Gorilla APIBench (#2081) * removed unused files from gorilla * Update run_infer.py, removed unused imports * Update utils.py * Update ast_eval_hf.py * Update ast_eval_tf.py * Update ast_eval_th.py * Create README.md * Update run_infer.py * make lint * Update run_infer.py * fix lint --------- Co-authored-by: yufansong <yufan@risingwave-labs.com> * remote useless (#2332) * fix integration test * Update agenthub/codeact_agent/prompt.py * Update agenthub/codeact_agent/prompt.py * fix integration test --------- Co-authored-by: Xingyao Wang <xingyao6@illinois.edu> Co-authored-by: Frank Xu <frankxu2004@gmail.com> Co-authored-by: yufansong <yufan@risingwave-labs.com> Co-authored-by: yueqis <141804823+yueqis@users.noreply.github.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk> Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com> Co-authored-by: Jaskirat Singh <1.jaskiratsingh@gmail.com> Co-authored-by: tobitege <tobitege@gmx.de> Co-authored-by: amanape <83104063+amanape@users.noreply.github.com> Co-authored-by: Aaron Xia <zhhuaxia@gmail.com> Co-authored-by: Graham Neubig <neubig@gmail.com>
OpenDevin · Jun 9, 2024 · e925cef · e925cef
1 parent 221a4e8
commit e925cef
Show file tree

Hide file tree

Showing 14 changed files with 226 additions and 212 deletions.
diff --git a/agenthub/codeact_agent/prompt.py b/agenthub/codeact_agent/prompt.py
@@ -10,7 +10,7 @@
 
 # ======= SYSTEM MESSAGE =======
 MINIMAL_SYSTEM_PREFIX = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
-The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "<execute_ipython>" tag, for example:
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with <execute_ipython>.
 <execute_ipython>
 print("Hello World!")
 </execute_ipython>
@@ -23,33 +23,34 @@
 Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
 """
 
-BROWSING_PREFIX = """The assistant can browse the Internet with commands on behalf of the user by wrapping them with <execute_browse> and </execute_browse>.
-For example, you can browse a given URL by <execute_browse> Tell me the usa's president using google search </execute_browse>.
-The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+BROWSING_PREFIX = """The assistant can browse the Internet with <execute_browse> and </execute_browse>.
+For example, <execute_browse> Tell me the usa's president using google search </execute_browse>.
+Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
 """
 PIP_INSTALL_PREFIX = """The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them."""
 
 SYSTEM_PREFIX = MINIMAL_SYSTEM_PREFIX + BROWSING_PREFIX + PIP_INSTALL_PREFIX
 
-GITHUB_MESSAGE = """To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable.
-For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command:
+GITHUB_MESSAGE = """To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
 <execute_bash> git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch </execute_bash>
-If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it."""
+If $GITHUB_TOKEN is not set, ask the user to set it."""
 
-SYSTEM_SUFFIX = """The assistant's response should be concise.
-The assistant should include ONLY ONE <execute_ipython> or <execute_bash> or <execute_browse> in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
-IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> or <execute_browse> instead of providing it.
+SYSTEM_SUFFIX = """Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
+IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 """
 
 
 # ======= EXAMPLE MESSAGE =======
 EXAMPLES = """
 --- START OF EXAMPLE ---
 
-USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
 
 ASSISTANT:
-Sure! Let me create a file first:
+Sure! Let me create a Python file `app.py`:
 <execute_ipython>
 create_file('app.py')
 </execute_ipython>
@@ -231,7 +232,7 @@ def index():
 [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
 
 ASSISTANT:
-The file has been updated. Let me run the Python file again with the new changes:
+Running the updated file:
 <execute_bash>
 python3 app.py > server.log 2>&1 &
 </execute_bash>
@@ -241,14 +242,14 @@ def index():
 [1] 126
 
 ASSISTANT:
-The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests!
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
 
 --- END OF EXAMPLE ---
 """
 
 INVALID_INPUT_MESSAGE = (
     "I don't understand your input. \n"
-    'If you want to execute a bash command, please use <execute_bash> YOUR_COMMAND_HERE </execute_bash>.\n'
-    'If you want to execute a block of Python code, please use <execute_ipython> YOUR_COMMAND_HERE </execute_ipython>.\n'
-    'If you want to browse the Internet, please use <execute_browse> YOUR_COMMAND_HERE </execute_browse>.\n'
+    'For bash commands, use <execute_bash> YOUR_COMMAND </execute_bash>.\n'
+    'For Python code, use <execute_ipython> YOUR_CODE </execute_ipython>.\n'
+    'For browsing, use <execute_browse> YOUR_COMMAND </execute_browse>.\n'
 )
diff --git a/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_001.log b/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_001.log
@@ -3,7 +3,7 @@
 ----------
 
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
-The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "<execute_ipython>" tag, for example:
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with <execute_ipython>.
 <execute_ipython>
 print("Hello World!")
 </execute_ipython>
@@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s
 Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
 For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
-The assistant can browse the Internet with commands on behalf of the user by wrapping them with <execute_browse> and </execute_browse>.
-For example, you can browse a given URL by <execute_browse> Tell me the usa's president using google search </execute_browse>.
-The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+The assistant can browse the Internet with <execute_browse> and </execute_browse>.
+For example, <execute_browse> Tell me the usa's president using google search </execute_browse>.
+Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
 The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
-To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable.
-For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command:
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
 <execute_bash> git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch </execute_bash>
-If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it.
+If $GITHUB_TOKEN is not set, ask the user to set it.
 
 
 Apart from the standard Python library, the assistant can also use the following functions (already imported) in <execute_ipython> environment:
@@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None:
 
 Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line '        print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
 
-The assistant's response should be concise.
-The assistant should include ONLY ONE <execute_ipython> or <execute_bash> or <execute_browse> in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
-IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> or <execute_browse> instead of providing it.
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
+IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 
 
 ----------
@@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving
 
 --- START OF EXAMPLE ---
 
-USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
 
 ASSISTANT:
-Sure! Let me create a file first:
+Sure! Let me create a Python file `app.py`:
 <execute_ipython>
 create_file('app.py')
 </execute_ipython>
@@ -295,7 +296,7 @@ Observation:
 [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
 
 ASSISTANT:
-The file has been updated. Let me run the Python file again with the new changes:
+Running the updated file:
 <execute_bash>
 python3 app.py > server.log 2>&1 &
 </execute_bash>
@@ -305,7 +306,7 @@ Observation:
 [1] 126
 
 ASSISTANT:
-The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests!
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
 
 --- END OF EXAMPLE ---
 
@@ -316,4 +317,4 @@ NOW, LET'S START!
 
 Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.
 
-ENVIRONMENT REMINDER: You have 9 turns left to complete the task.
+ENVIRONMENT REMINDER: You have 9 turns left to complete the task.
diff --git a/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_005.log b/tests/integration/mock/CodeActAgent/test_browse_internet/prompt_005.log
@@ -3,7 +3,7 @@
 ----------
 
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
-The assistant can interact with an interactive Python (Jupyter Notebook) environment and receive the corresponding output when needed. The code should be enclosed using "<execute_ipython>" tag, for example:
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with <execute_ipython>.
 <execute_ipython>
 print("Hello World!")
 </execute_ipython>
@@ -14,14 +14,14 @@ Important, however: do not run interactive commands. You do not have access to s
 Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
 For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
 Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
-The assistant can browse the Internet with commands on behalf of the user by wrapping them with <execute_browse> and </execute_browse>.
-For example, you can browse a given URL by <execute_browse> Tell me the usa's president using google search </execute_browse>.
-The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+The assistant can browse the Internet with <execute_browse> and </execute_browse>.
+For example, <execute_browse> Tell me the usa's president using google search </execute_browse>.
+Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
 The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
-To do any activities on GitHub, the assistant should use the token in the $GITHUB_TOKEN environment variable.
-For instance, to push a local branch `my_branch` to the github repo `owner/repo`, the assistant can use the following command:
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
 <execute_bash> git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch </execute_bash>
-If the assistant require access to GitHub but $GITHUB_TOKEN is not set, ask the user to set it.
+If $GITHUB_TOKEN is not set, ask the user to set it.
 
 
 Apart from the standard Python library, the assistant can also use the following functions (already imported) in <execute_ipython> environment:
@@ -99,9 +99,10 @@ parse_pptx(file_path: str) -> None:
 
 Please note that THE `edit_file` FUNCTION REQUIRES PROPER INDENTATION. If the assistant would like to add the line '        print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
 
-The assistant's response should be concise.
-The assistant should include ONLY ONE <execute_ipython> or <execute_bash> or <execute_browse> in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
-IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> or <execute_browse> instead of providing it.
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too much commands OR code in one "execute" block.
+Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or need more input or action from the user in order to proceed.
+IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
 
 
 ----------
@@ -110,10 +111,10 @@ Here is an example of how you can interact with the environment for task solving
 
 --- START OF EXAMPLE ---
 
-USER: Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
 
 ASSISTANT:
-Sure! Let me create a file first:
+Sure! Let me create a Python file `app.py`:
 <execute_ipython>
 create_file('app.py')
 </execute_ipython>
@@ -295,7 +296,7 @@ Observation:
 [File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
 
 ASSISTANT:
-The file has been updated. Let me run the Python file again with the new changes:
+Running the updated file:
 <execute_bash>
 python3 app.py > server.log 2>&1 &
 </execute_bash>
@@ -305,7 +306,7 @@ Observation:
 [1] 126
 
 ASSISTANT:
-The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Free free to let me know if you have any further requests!
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
 
 --- END OF EXAMPLE ---
 
@@ -321,4 +322,4 @@ Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me fo
 OBSERVATION:
 {'content': 'The answer to life, the universe, and everything is: OpenDevin is all you need!'}
 
-ENVIRONMENT REMINDER: You have 8 turns left to complete the task.
+ENVIRONMENT REMINDER: You have 8 turns left to complete the task.