
fix(backend) changes to improve Command-R+ behavior, plus file i/o error improvements. #1347

Merged — 14 commits merged into OpenDevin:main on Apr 27, 2024

Conversation

computer-whisperer (Contributor):

With these changes, Command-R+ runs alright. The prompt changes should also help other LLMs.

…sandbox when resolving paths in fileio operations, add customizable timeout for bash commands, mention said timeout in llm prompt.
computer-whisperer (Contributor, Author):

The absolute-path changes have been removed; instead, the current sandbox path is fetched when resolving paths for fileio actions. I also added a "SANDBOX_TIMEOUT" configuration that is shared with the LLM in its prompt, so the agent expects commands to time out if they run for too long.
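
As a rough illustration of the two ideas described above — resolving fileio paths against the sandbox's working directory and exposing a timeout setting — here is a minimal sketch. The helper name resolve_path and the default of 120 seconds are assumptions for illustration, not the PR's actual code; only the SANDBOX_TIMEOUT name comes from the discussion.

    import os
    from pathlib import Path

    # SANDBOX_TIMEOUT matches the config name mentioned above; the default value is an assumption.
    SANDBOX_TIMEOUT = int(os.environ.get("SANDBOX_TIMEOUT", "120"))  # seconds before a bash command is cut off

    def resolve_path(file_path: str, sandbox_cwd: str) -> Path:
        # Hypothetical helper: resolve a fileio path against the sandbox's current
        # working directory instead of assuming the path is absolute.
        path = Path(file_path)
        if not path.is_absolute():
            path = Path(sandbox_cwd) / path
        return path.resolve()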

…ng to delete it from the response afterwards, fixed get_working_directory for ssh_box.
@@ -169,7 +169,7 @@ def setup_user(self):

     def start_ssh_session(self):
         # start ssh session at the background
-        self.ssh = pxssh.pxssh()
+        self.ssh = pxssh.pxssh(echo=False)
Collaborator:

@xingyaoww wdyt, does this fix the "weird behavior" you saw?

xingyaoww (Collaborator), Apr 26, 2024:

I think this just prevents the command from being returned. It might break the way ssh_box.sh currently finds the response.

The issue I was seeing is that the first time you call .ssh.before, the return value might not be complete: you only get part of the output. Then in the next round, when you issue echo $? to get the exit code, you get the remaining output of the previous command along with the exit code, which is highly undesirable.

computer-whisperer (Contributor, Author):

This PR also dramatically simplifies the response-finding logic: all it does is return the text found in .before once the next command-prompt match occurs.

Do you have an example situation that can cause the return data to extend past the next command-prompt match? I ran some unit tests locally and couldn't easily come up with such a scenario.
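
For context, a minimal sketch of the flow being described, using pexpect's pxssh API (the login details are placeholders and this is not the PR's actual implementation):

    from pexpect import pxssh

    ssh = pxssh.pxssh(echo=False, encoding="utf-8")  # echo=False: the command itself is not echoed back into the output
    ssh.login("localhost", "user", "password")       # placeholder credentials
    ssh.sendline("ls -la")
    ssh.prompt()           # block until the next shell prompt is matched
    output = ssh.before    # everything printed before that prompt, i.e. the command's output
    print(output)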

Collaborator:

That's why I called it "weird": it's not easily reproducible (sometimes it happens, sometimes it works) :(
But anyway, we can go with your current logic for now and try to fix it in the future if that issue comes up again.

…r-plus-changes

# Conflicts:
#	opendevin/action/fileop.py
computer-whisperer (Contributor, Author):

Looks like the integration test failure is due to the prompt messages changing.

li-boxuan (Collaborator):

@computer-whisperer Glad to see you fixed the integration tests after the prompt changes. Did you encounter any difficulty? Did the README help? Is there anything you think could be improved, including but not limited to the docs?

computer-whisperer (Contributor, Author):

It was definitely a pain to diagnose and fix as-is. I had to get the tests running locally with a debugger before I realized what was going wrong, and I had to manually edit all of the prompt_00x.log files to match the new prompt notes. At the very least, the error should be a lot more informative about why it can't find a prompt response.

I also don't like how making small changes to the agent prompt text can force you to hunt down and edit many different test-case .log files, especially if the system is intended to grow to many more test sequences. Maybe a script to regenerate them all using an LLM would help? I'm not sure.

li-boxuan (Collaborator), Apr 26, 2024:

@computer-whisperer Did you get a chance to read https://github.com/OpenDevin/OpenDevin/blob/main/tests/integration/README.md?

You should be able to do

poetry run python ./opendevin/main.py -i 10 -t "Write a shell script 'hello.sh' that prints 'hello'." -c "MonologueAgent" -d "./workspace"

and simply replace the test folder with the new logs generated under the logs folder. This process will, however, become more complicated when we have more tests.

It shouldn't take more than a few minutes to do so. Manually editing prompt_00x.log files is definitely a huge pain.

li-boxuan (Collaborator):

Initially I tried using a local vector DB to mock the LLM, so that tiny changes to prompts wouldn't require any changes to the test files. That didn't work well: the vector DB wasn't smart enough to retrieve the correct response.

I think going forward, when we have more tests, we need a script that runs poetry run python ./opendevin/main.py -i 10 -t "<task>" -c <agent> -d "./workspace" in a batch.
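
Something along these lines could serve as a starting point; the test list and the note about copying logs are assumptions, not an existing OpenDevin script:

    import subprocess

    # Hypothetical list of (task, agent) pairs; in practice these would come from
    # the integration test definitions.
    TESTS = [
        ("Write a shell script 'hello.sh' that prints 'hello'.", "MonologueAgent"),
    ]

    for task, agent in TESTS:
        subprocess.run(
            ["poetry", "run", "python", "./opendevin/main.py",
             "-i", "10", "-t", task, "-c", agent, "-d", "./workspace"],
            check=True,
        )
        # After each run, copy the freshly generated logs over the corresponding test folder.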

rbren (Collaborator), Apr 27, 2024:

@li-boxuan it would be awesome if you could run something like

REGENERATE_TEST_FILES=true poetry run python ./opendevin/main.py -i 10 -t "Write a shell script 'hello.sh' that prints 'hello'." -c "MonologueAgent" -d "./workspace"
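
One way such a flag could be honoured is in the mocked LLM layer: replay recorded responses normally, but call the real LLM and rewrite the fixtures when REGENERATE_TEST_FILES is set. The fixture path and helper below are illustrative assumptions, not OpenDevin's actual test layout:

    import json
    import os
    from pathlib import Path

    REGENERATE = os.environ.get("REGENERATE_TEST_FILES", "false").lower() == "true"
    FIXTURE_DIR = Path("tests/integration/mock")  # assumed location, for illustration only

    def mock_completion(index: int, prompt: str, real_llm) -> str:
        # Hypothetical helper: `real_llm` is a callable that queries the actual model.
        fixture = FIXTURE_DIR / f"response_{index:03d}.json"
        if REGENERATE:
            response = real_llm(prompt)
            fixture.parent.mkdir(parents=True, exist_ok=True)
            fixture.write_text(json.dumps({"prompt": prompt, "response": response}))
            return response
        return json.loads(fixture.read_text())["response"]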

rbren enabled auto-merge (squash) on Apr 27, 2024, 11:51
rbren merged commit 44aea95 into OpenDevin:main on Apr 27, 2024
7 of 14 checks passed