Reproduce results on visual7w. #513

Open
1 of 2 tasks
sleepyshep opened this issue Jul 28, 2024 · 2 comments

Comments

@sleepyshep

System Info

cuda 11.8
torch 2.3.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts and tasks

Reproduction

I previously reproduced the RefCOCO results from the paper, but I ran into a problem when evaluating on Visual7W.
Following Shikra's Visual7W evaluation code, I prompt CogVLM as follows:
"Please give a brief and direct reply to 'Which item in the photo is the chair at the desk? Candidates: [433,755,512,966] [003,013,053,895] [002,816,438,996] [180,596,397,996] answer in box format.' with the image"

Expected behavior

Is there something wrong with my prompt? Could you share the prompt you used for the Visual7W task? Thanks!

@sleepyshep
Author

@1049451037

@1049451037
Member

1049451037 commented Jul 28, 2024

The template for visual7w is:

f"""{Question} Select from:
A. [[x1,y1,x2,y2]]
B. [[x1,y1,x2,y2]]
C. [[x1,y1,x2,y2]]
D. [[x1,y1,x2,y2]]
Answer:"""
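
For anyone wiring this template into an evaluation script, here is a minimal sketch of how it might be filled in. The helper names `format_box` and `build_visual7w_prompt` are hypothetical and not part of CogVLM. The sketch assumes the candidate boxes are already integers scaled to the 0-999 range and zero-padded to three digits, as in the prompt quoted in the question; the maintainer's reply does not confirm the padding.

```python
from typing import Sequence

# (x1, y1, x2, y2); assumed to be already scaled to integers in 0-999.
Box = Sequence[int]


def format_box(box: Box) -> str:
    """Render one box as [[x1,y1,x2,y2]] with zero-padded 3-digit coordinates.

    The zero padding is an assumption taken from the boxes quoted in the
    question (e.g. 003,013,053,895), not something the template itself states.
    """
    x1, y1, x2, y2 = box
    return f"[[{x1:03d},{y1:03d},{x2:03d},{y2:03d}]]"


def build_visual7w_prompt(question: str, candidates: Sequence[Box]) -> str:
    """Fill the four-option template from the maintainer's reply."""
    if len(candidates) != 4:
        raise ValueError("the template expects exactly four candidate boxes")
    lines = [f"{question} Select from:"]
    for letter, box in zip("ABCD", candidates):
        lines.append(f"{letter}. {format_box(box)}")
    lines.append("Answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Question and boxes taken from the prompt quoted in the issue.
    print(
        build_visual7w_prompt(
            "Which item in the photo is the chair at the desk?",
            [
                (433, 755, 512, 966),
                (3, 13, 53, 895),
                (2, 816, 438, 996),
                (180, 596, 397, 996),
            ],
        )
    )
```

How the model's answer is parsed (for example, reading the letter that follows "Answer:") is not covered by the reply above, so that step is left out here.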
