
💡 [REQUEST] - <title> #23

Closed
paulpaul91 opened this issue Aug 28, 2023 · 1 comment
Labels: question (Further information is requested)

@paulpaul91

Start Date

08/28/2023

Implementation PR

No response

Reference Issues

I have two questions:

  1. How many compute resources were used for pretraining and for multi-task pretraining respectively, and roughly how long is each expected to take?
  2. How much does text-only performance degrade, and are there any comparable results?

Summary

Basic Example

Drawbacks

Unresolved questions

No response

@logicwong (Member)

Hi, thanks for your interest in our work.

  1. Details about the training resources will not be released for now; if there are updates, they will be added to the arXiv paper.
  2. We evaluated Qwen-VL on MMLU and CMMLU. Although it drops somewhat relative to the text-only Qwen-7B, it still achieves fairly leading results on both datasets. Hope these results help.
| Model | MMLU | CMMLU |
| --- | --- | --- |
| LLaMA-7B | 35.1 | - |
| Baichuan-7B | 42.3 | 44.4 |
| ChatGLM2-6B | 47.9 | 48.8 |
| Qwen-7B | 56.7 | 58.8 |
| Qwen-VL | 50.7 | 49.5 |
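
For reference, here is a minimal sketch of the text-only setting these benchmarks measure: querying Qwen-VL with no image inputs, exactly like a text-only LLM. It assumes the public `Qwen/Qwen-VL` checkpoint on Hugging Face and its `trust_remote_code` loading path; the prompt is a made-up illustration, not the actual MMLU/CMMLU evaluation harness.

```python
# Sketch: text-only inference with Qwen-VL (assumes the public
# Qwen/Qwen-VL Hugging Face checkpoint; prompt is illustrative only).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-VL", device_map="auto", trust_remote_code=True
).eval()

# No image tokens in the input: the model is used as a plain causal LM.
prompt = "Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=16)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```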
