Add Gemma model #2964
Conversation
LGTM! Thanks for submitting the PR!
This version adds support for more models: Gemma (vllm-project#2964) and OLMo (vllm-project#2832).
@xiangxu-google The deployment command used is:
The test request used is:
The response returned is as follows:
Is this error Gemma specific? This PR added the modeling code only, which should not affect the API server.
The chat template doesn't support the system role: https://huggingface.co/google/gemma-2b-it/blob/718cb189da9c5b2e55abe86f2eeffee9b4ae0dad/tokenizer_config.json#L59
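For illustration, here is a minimal Python sketch of what that template does: Gemma's chat template raises an exception on any `system` message and renders the rest as `<start_of_turn>`/`<end_of_turn>` blocks, mapping the `assistant` role to `model`. The function name `render_gemma_chat` is hypothetical; the real rendering happens via the Jinja template in `tokenizer_config.json`.

```python
def render_gemma_chat(messages):
    """Sketch of Gemma's chat template behavior (hypothetical helper,
    not part of vLLM or transformers)."""
    # The real Jinja template calls raise_exception() when it sees a
    # system-role message, which is why the API request above fails.
    if any(m["role"] == "system" for m in messages):
        raise ValueError("System role not supported")
    parts = []
    for m in messages:
        # Gemma's template renders the assistant role as "model".
        role = "model" if m["role"] == "assistant" else m["role"]
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    return "".join(parts)
```

A workaround on the client side is to fold the system prompt into the first user message before sending the request.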
https://blog.google/technology/developers/gemma-open-models/
The PR contains the Gemma model implementation which is able to load checkpoints from HF: