StarCoderBase/StarCoder, 2023 #661

Open
AkihikoWatanabe opened this issue May 6, 2023 · 4 comments

Comments

@AkihikoWatanabe
Owner

https://huggingface.co/bigcode/starcoderbase

@AkihikoWatanabe
Owner Author

・15.5B parameters
・Trained on more than 80 programming languages
・Uses Multi Query Attention
・Context window size of 8192
・Uses the Fill in the middle objective
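
To make the Multi Query Attention point concrete, here is a rough sketch of how sharing a single key/value head across all query heads shrinks the key/value cache at the 8192-token context length noted above. The layer count, head count, and head dimension below are illustrative assumptions and are not taken from these notes; only the context length is.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate key/value cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes 16-bit storage (an assumption).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative model dimensions (assumptions, not from the notes).
N_LAYERS = 40
N_QUERY_HEADS = 48
HEAD_DIM = 128

# Context window size from the notes.
SEQ_LEN = 8192

# Multi-head attention: one key/value head per query head.
mha_bytes = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)

# Multi Query Attention: a single key/value head shared by all query heads.
mqa_bytes = kv_cache_bytes(N_LAYERS, 1, HEAD_DIM, SEQ_LEN)

print(f"multi-head cache : {mha_bytes / 2**20:.0f} MiB")
print(f"multi-query cache: {mqa_bytes / 2**20:.0f} MiB")
print(f"reduction factor : {mha_bytes // mqa_bytes}x")
```

Whatever dimensions are plugged in, the reduction factor equals the number of query heads, since every other term cancels.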

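The model is not instruction-tuned. It is trained to fill in the span between a prefix and a suffix, so the intended usage appears to be, for example, giving a function name as input and having the model output the middle (the function body).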
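
The prefix/suffix usage described above could be sketched as follows. This only builds the prompt string; it does not load the model. The sentinel tokens `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` follow the fill-in-the-middle example on the linked model card, while `build_fim_prompt` and `extract_middle` are hypothetical helper names introduced here for illustration.

```python
# Sentinel tokens as used in the model card's fill-in-the-middle example.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange prefix and suffix so the model is asked to generate the middle.

    Hypothetical helper: the order is prefix, then suffix, then the middle
    sentinel, after which the model continues with the missing code.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_middle(decoded: str) -> str:
    """Keep only the text after the middle sentinel (hypothetical helper)."""
    return decoded.split(FIM_MIDDLE, 1)[-1]


# Give the function signature as the prefix and the trailing code as the
# suffix; the model is expected to produce the body in between.
prompt = build_fim_prompt(
    prefix="def print_hello_world():\n    ",
    suffix="\n    print('Hello world!')",
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model's generation call, and `extract_middle` applied to the decoded output to recover just the filled-in span.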

@AkihikoWatanabe changed the title from "StarCoder" to "StarCoderBase/StarCoder" on May 6, 2023

@AkihikoWatanabe
Owner Author

StarCoder:
https://huggingface.co/bigcode/starcoder

@AkihikoWatanabe
Owner Author

A model obtained by fine-tuning StarCoderBase on 35B Python tokens.
The authors claim it outperforms existing models.

