
Add Gemma model #2964

Merged: 1 commit into vllm-project:main on Feb 21, 2024

Conversation

@xiangxu-google (Contributor) commented Feb 21, 2024

https://blog.google/technology/developers/gemma-open-models/

The PR contains the Gemma model implementation, which can load the following checkpoints from Hugging Face:

  • google/gemma-2b
  • google/gemma-2b-it
  • google/gemma-7b
  • google/gemma-7b-it

@WoosukKwon (Collaborator) left a comment

LGTM! Thanks for submitting the PR!

@WoosukKwon WoosukKwon merged commit 5253eda into vllm-project:main Feb 21, 2024
19 of 21 checks passed
zhuohan123 added a commit that referenced this pull request Feb 21, 2024
This version is for more model support. Add support for Gemma models (#2964) and OLMo models (#2832).
simon-mo pushed a commit that referenced this pull request Feb 21, 2024
This version is for more model support. Add support for Gemma models (#2964) and OLMo models (#2832).
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 22, 2024
This version is for more model support. Add support for Gemma models (vllm-project#2964) and OLMo models (vllm-project#2832).
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 4, 2024
This version is for more model support. Add support for Gemma models (vllm-project#2964) and OLMo models (vllm-project#2832).
@MrRace commented Mar 6, 2024

@xiangxu-google
After deploying Gemma with vLLM's OpenAI-compatible API server, it seems that the Chat API does not accept a system message in the input.

The deployment command used is:

python3 -m vllm.entrypoints.openai.api_server --served-model-name gemma-2b-it --model /data/share_model_zoo/LLM/google/gemma-2b-it

The test request used is:

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gemma-2b-it",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello"}
        ]
    }'

The response returned is as follows:

{"object":"error","message":"System role not supported","type":"BadRequestError","param":null,"code":400}

@xiangxu-google (Contributor, Author) commented

Is this error Gemma-specific? This PR added only the modeling code, which should not affect the API server.

@simon-mo (Collaborator) commented Mar 8, 2024

Gemma's chat template does not support the system role: https://huggingface.co/google/gemma-2b-it/blob/718cb189da9c5b2e55abe86f2eeffee9b4ae0dad/tokenizer_config.json#L59
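Since the template only accepts alternating user/model turns, one client-side workaround is to fold the system prompt into the first user message before sending the request. A minimal sketch in Python; the `fold_system_into_user` helper below is hypothetical, not part of vLLM or any OpenAI client library:

```python
# Hypothetical client-side helper: merge any system messages into the first
# user message, since Gemma's chat template rejects the "system" role.
def fold_system_into_user(messages):
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if system_parts and rest and rest[0]["role"] == "user":
        # Prepend the system text to the first user turn.
        rest[0] = {
            "role": "user",
            "content": "\n\n".join(system_parts + [rest[0]["content"]]),
        }
    return rest

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]
print(fold_system_into_user(messages))
# → [{'role': 'user', 'content': 'You are a helpful assistant.\n\nHello'}]
```

Sending the transformed messages in the curl request above would avoid the 400 error, at the cost of the instructions living in the user turn rather than a dedicated system slot.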

4 participants