Colossal-LLaMA-2-7b-base: Why does adding only a small amount of Chinese data produce such a large gain on the English MMLU benchmark? #4868
tomyoung903 started this conversation in Development | Core
Replies: 4 comments 2 replies
-
Hello, and thanks first of all for your interest in Colossal-LLaMA-2. During the continual pretraining stage we added not only Chinese data but also a small amount of English data, which mainly serves as replay to mitigate the model's catastrophic forgetting. This English data was carefully curated to reawaken, as much as possible, the knowledge the model acquired in the first pretraining stage (LLaMA-2).
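For readers unfamiliar with replay, the sketch below shows the general idea: a small replayed sample of the original-domain (English) corpus is mixed into the new-domain (Chinese) corpus before continual pretraining. The `mix_with_replay` helper, the toy corpora, and the 5% replay ratio are all illustrative assumptions, not the actual Colossal-LLaMA-2 data recipe.

```python
# Minimal sketch of replay-style data mixing for continual pretraining.
# Everything here (helper name, corpora, 5% ratio) is a hypothetical
# illustration, not the actual Colossal-LLaMA-2 pipeline.
import random

def mix_with_replay(new_corpus, replay_pool, replay_ratio=0.05, seed=0):
    """Blend a replayed sample of the original-domain corpus (English)
    into the new-domain corpus (Chinese), so that roughly `replay_ratio`
    of the final mix is replayed data."""
    rng = random.Random(seed)
    # Number of replay docs needed so they make up `replay_ratio` of the mix.
    n_replay = int(len(new_corpus) * replay_ratio / (1.0 - replay_ratio))
    replayed = rng.sample(replay_pool, min(n_replay, len(replay_pool)))
    mixed = list(new_corpus) + replayed
    rng.shuffle(mixed)
    return mixed

# Toy usage: 950 Chinese documents plus a curated English replay pool.
chinese_docs = [f"zh_doc_{i}" for i in range(950)]
english_docs = [f"en_doc_{i}" for i in range(200)]
training_mix = mix_with_replay(chinese_docs, english_docs)
print(len(training_mix))  # 1000 = 950 new-domain + 50 replayed docs
```

The intuition is that even a small replayed fraction keeps the original-language distribution represented throughout training, which is what the reply above credits for preserving the English abilities.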
-
That seems like a major new algorithm to me. Do you plan on open-sourcing
the whole training process?
Will there be a detailed explanation of this in a paper or blog post in the
future?
Cheers,
Tom Young
tomyoung903.github.io
-
Also, if the English data is only there to prevent forgetting / reawaken existing knowledge, why is the model so much better than it was before any forgetting occurred?
-
Thanks! Looking forward to your report!
-
Why does adding only a small amount of Chinese data produce such a large gain on the English MMLU benchmark?
The Luchen Tech (潞晨科技) WeChat account published "Half a Day of Training on a Thousand-Yuan Budget, Performance Rivaling Mainstream Large Models: an Open-Source, Commercially Usable Chinese LLaMA-2":
https://mp.weixin.qq.com/s/25r6hJqNDQhqR4EHu0uctA