Support top Chinese language models #116
List of expected supported models:
Transformers officially supports Flash Attention 2, but it only covers models like Llama. Is it necessary for us to support Flash Attention 2 for other models?
huggingface/transformers#26350 The Transformers community is adding support for more architectures.
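For reference, a minimal sketch of how Flash Attention 2 is typically enabled through transformers on an already-supported architecture. The checkpoint name is just an example, and it assumes a recent transformers release where the `attn_implementation` argument exists (older versions used `use_flash_attention_2=True`) plus an installed `flash-attn` package:

```python
# Minimal sketch: load a causal LM with Flash Attention 2 enabled.
# Assumes: recent transformers, flash-attn installed, a FA2-capable GPU,
# and a model architecture that transformers has ported to FA2.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # example checkpoint only
    torch_dtype=torch.bfloat16,              # FA2 requires fp16 or bf16
    attn_implementation="flash_attention_2",
)
```

Architectures that have not been ported upstream raise an error at this call, which is exactly the gap this issue is about.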
Done for Baichuan (e770404). Test data:
More tests, please!
Qwen-14B support is in progress:
What's the progress?
Add RWKV. If you'd like to implement RLHF for RWKV, here are some suggestions:
It seems it is not easy to make OpenRLHF compatible with RWKV.
I am wondering if
Not yet |
OpenRLHF supports Mistral since its model architecture is the same as Llama 2's.
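As a quick illustration of why this works: the two configs expose the same decoder-layer fields, so any Llama-style code path applies unchanged. The checkpoint names below are examples (the Llama 2 one is gated on the Hub), not something OpenRLHF itself requires:

```python
# Sketch: compare the structural fields Llama 2 and Mistral share.
# Checkpoint names are examples; each call only fetches a small config.json.
from transformers import AutoConfig

llama = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
mistral = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
for field in ("hidden_size", "num_hidden_layers", "num_attention_heads"):
    print(field, getattr(llama, field), getattr(mistral, field))
```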
Is Qwen supported now?
I notice that Qwen's special token <|endoftext|> isn't reflected anywhere in the code?
It should be supported; I'll test it over the weekend~
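One thing worth checking is whether the tokenizer already registers <|endoftext|> as its EOS token, in which case no extra handling is needed in OpenRLHF. A rough sketch, assuming the Qwen/Qwen-7B Hub checkpoint (whether `eos_token` is pre-wired depends on the checkpoint's tokenizer config):

```python
# Sketch: inspect which special tokens Qwen's tokenizer registers.
# Assumes the Qwen/Qwen-7B checkpoint; trust_remote_code is required
# because Qwen ships a custom tokenizer implementation.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
print(tok.eos_token, tok.eos_token_id)   # expect <|endoftext|> if pre-wired
print(tok.all_special_tokens)            # every registered special token
```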