
Refactored prompt.py to reduce token usage #1996

Merged · 29 commits · Jun 9, 2024

Conversation

temotskipa
Contributor

Refactored prompt.py to reduce token usage. Also included a band-aid fix for an issue Devin encountered while editing prompt.py: the example commands included in prompt.py were getting executed in the terminal, which left the agent stuck in a loop trying (and failing) to edit the file. This should get a proper fix in a timely manner, imo.
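The failure mode described above can be sketched as follows. This is a hypothetical minimal illustration, not OpenDevin's actual parser: when an agent pulls shell commands out of LLM output with a loose pattern, example commands quoted from prompt.py can leak into execution. Requiring an explicit sentinel tag (the `<execute>` tag here is invented for illustration) is one band-aid.

```python
import re

# Hypothetical sketch of the bug and the band-aid. All names here
# (EXECUTE_RE, extract_commands, the <execute> tag) are illustrative,
# not OpenDevin's real implementation.
EXECUTE_RE = re.compile(r"<execute>(.*?)</execute>", re.DOTALL)

def extract_commands(llm_output: str) -> list[str]:
    """Return only commands wrapped in the explicit <execute> sentinel.

    A naive extractor that grabbed any backtick-quoted string would also
    pick up example commands echoed from the system prompt and run them.
    """
    return [m.strip() for m in EXECUTE_RE.findall(llm_output)]

output = (
    "Here is an example from the prompt: `ls -la` (do not run).\n"
    "<execute>cat prompt.py</execute>"
)
print(extract_commands(output))  # -> ['cat prompt.py']
```

With the sentinel, the quoted `ls -la` example is ignored and only the deliberately tagged command survives extraction.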

@enyst enyst requested a review from xingyaoww May 23, 2024 11:23
Collaborator

@enyst enyst left a comment


Can you please tell us what version you encountered the loop on? A fix for that kind of issue was merged to main just in the last few days, although I suspect it may still miss commands because of different command ids. Do you by any chance have logs from that issue?

IMHO some of these edits are a great idea, but I hope @xingyaoww can take a look when we consider this kind of change.

@temotskipa
Contributor Author

I encountered it on the opendevin:main docker image.

Collaborator

@yufansong yufansong left a comment


I'd prefer not to change the prompts unless we can run experiments showing the change actually works.

@neubig
Contributor

neubig commented May 23, 2024

I agree with @yufansong: this seems like a great change if it doesn't reduce our accuracy, but I am worried that it might. Maybe we could run SWE-bench Lite with this.

rbren
rbren previously requested changes May 23, 2024
Collaborator

@rbren rbren left a comment


CC @xingyaoww

Most likely a lot of these extra tokens are required for quality--the LLM often needs some reminding.

But there are also a few typo fixes etc here so we might want to take some of it.

@xingyaoww
Member

xingyaoww commented May 23, 2024

Yep! I agree with @yufansong @neubig @rbren -- these typo changes are actually good. How about we wait until #1941 is merged (since it also changes a bunch of prompts and bumps the version to v1.5)? Then we merge all the typo fixes here, so we can do just ONE eval pass of SWE-Bench Lite to make sure we are not degrading performance.

@temotskipa
Contributor Author

temotskipa commented May 23, 2024

Is there anything else that I can do, or do I just wait for #1941 to be merged?

@xingyaoww
Member

@temotskipa I think that PR is merged :) feel free to re-adjust the prompt!

@temotskipa
Contributor Author

I mean I personally think this prompt is fine, but IDK if the maintainers also think so.

@xingyaoww
Member

@temotskipa Can you resolve those merge conflicts so we can review again and try to merge it?

@rbren
Collaborator

rbren commented Jun 3, 2024

@temotskipa are you interested in pushing this one forward? @xingyaoww do you have specific changes you'd like to see?

@yufansong
Collaborator

@temotskipa are you interested in pushing this one forward? @xingyaoww do you have specific changes you'd like to see?

I suggest holding any prompt changes until we finish all benchmark evaluations.

@xingyaoww
Member

xingyaoww commented Jun 3, 2024

I am mainly looking for changes that fix the conflicts. Once those are fixed, I'm happy to merge this PR (after 2 days, when we finish running all the evals for the paper, as Yufan suggested).

@temotskipa
Contributor Author

So am I supposed to revert the changes and just fix any typos and things like that? It's unclear to me what is conflicting here.

@xingyaoww
Member

@temotskipa Sorry for getting back late! We were running for a huge deadline :(
Feel free to keep whatever you want to change! As long as the conflict is gone, we can review and merge it :)

@neubig neubig assigned xingyaoww and unassigned temotskipa Jun 8, 2024
@enyst enyst mentioned this pull request Jun 9, 2024
yueqis and others added 9 commits June 9, 2024 13:04
* Add files via upload

* Update README.md

* Update run_infer.py

* Update utils.py

* make lint

* Update evaluation/toolqa/run_infer.py

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
* revert change in file action

* remove useless code

* make lint
* feat: add gpqa benchmark evaluation

* add metrics

* reset configs in final block

* make lint

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>
…nDevin#2329)

* remove bottom chatbox fade

* Modal wider; fix lint error

* settings: attempt to not clear api key for same provider

* prevent api key from resetting after changing the model

* revert other changes and fix post test tear down error

---------

Co-authored-by: amanape <83104063+amanape@users.noreply.github.com>
…uck OpenDevin#1895] (OpenDevin#2034)

* fix: codeact bug OpenDevin#1895

* fix: add CmdRunAction timeout hint.

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* regenerate integration test

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: yufansong <yufan@risingwave-labs.com>
* removed unused files from gorilla

* Update run_infer.py, removed unused imports

* Update utils.py

* Update ast_eval_hf.py

* Update ast_eval_tf.py

* Update ast_eval_th.py

* Create README.md

* Update run_infer.py

* make lint

* Update run_infer.py

* fix lint

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>
@xingyaoww
Member

The integration tests should be fixed as well -- @rbren @enyst feel free to take a look and unblock the PR if possible :)

@li-boxuan li-boxuan requested a review from rbren June 9, 2024 17:18
@li-boxuan li-boxuan dismissed rbren’s stale review June 9, 2024 17:18

I have a PR #2326 which also tweaks the prompt a bit. Follow-ups could be included in that PR.

@li-boxuan li-boxuan merged commit e925cef into OpenDevin:main Jun 9, 2024
18 checks passed
@temotskipa temotskipa deleted the concise-prompts branch June 9, 2024 19:22