
[Model] Add support for xverse #3610

Merged: 4 commits, Mar 28, 2024

Conversation

hxer7963
Contributor

Add support for xverse models.

We tested the xverse 7B/13B/65B chat models and the GPTQ-quantized model locally, and the vLLM responses are normal.

You can verify by downloading xverse models from Hugging Face (https://huggingface.co/xverse) or Modelscope (https://www.modelscope.cn/search?search=xverse).

Furthermore, before opening the PR, we ran `format.sh`.

Request: after this PR is merged, could you tag a new release so that the latest pip-installable package supports inference with xverse models?
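Once a release includes this support, vLLM selects the implementation from the `architectures` field of a checkpoint's `config.json`. A minimal sketch of that lookup key (the JSON below is a hypothetical excerpt, not copied from a real checkpoint):

```python
import json

# Hypothetical excerpt of the config.json shipped with an xverse checkpoint.
# The "architectures" field is what vLLM matches against its model registry
# when deciding which implementation to load.
raw = '{"architectures": ["XverseForCausalLM"], "model_type": "xverse"}'
config = json.loads(raw)
arch = config["architectures"][0]
print(arch)  # -> XverseForCausalLM
```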

@Alec-Lin

What is the architectural difference between llama and xverse?

@hxer7963
Contributor Author

What is the architectural difference between llama and xverse?

Nice question.

The current xverse model architecture is identical to llama's, but xverse is expected to add MoE features within two weeks and merge them into xverse.py.

To keep that work on its own update cadence, xverse needs to be supported as a separate architecture in vLLM.

@hxer7963
Contributor Author

Hello @WoosukKwon ,

I've submitted PR #3610 and would greatly appreciate your help reviewing the code changes and providing feedback.

Thank you for your time and assistance.

Best regards,
willhe.

@WoosukKwon
Collaborator

@hxer7963 Could you please fix the formatting error by running `pip install -r requirements-dev.txt; ./format.sh`?

README.md Outdated
Comment on lines 92 to 93
- Yi (`01-ai/Yi-6B`, `01-ai/Yi-34B`, etc.)
- Xverse (`xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.)
Collaborator


nit: Let's keep the alphabetic order :)

Suggested change (move Xverse above Yi):
- Xverse (`xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.)
- Yi (`01-ai/Yi-6B`, `01-ai/Yi-34B`, etc.)

hf_model_weights_iterator)
from vllm.sequence import SamplerOutput

KVCache = Tuple[torch.Tensor, torch.Tensor]
Collaborator


Please replace `KVCache` with `torch.Tensor`. We changed the type recently.
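The requested type change amounts to dropping the two-tensor alias. A minimal before/after sketch (the function names are illustrative, and a stub `Tensor` class stands in for `torch.Tensor` so the snippet is self-contained):

```python
from typing import Tuple

class Tensor:
    """Stand-in for torch.Tensor so this sketch runs without torch installed."""

# Old convention: the KV cache was passed as a (key_cache, value_cache) pair.
KVCache = Tuple[Tensor, Tensor]

def attention_forward_old(hidden: Tensor, kv_cache: KVCache) -> Tensor:
    key_cache, value_cache = kv_cache  # callers had to unpack the pair
    return hidden

# New convention: a single tensor, so the alias (and the unpacking) disappears.
def attention_forward_new(hidden: Tensor, kv_cache: Tensor) -> Tensor:
    return hidden
```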

Contributor Author


Thank you, @WoosukKwon, for the reply and suggestions. The issues mentioned in the review have been fixed in the latest commit.

@WoosukKwon added the `new model` (requests to new models) and `action-required` labels on Mar 27, 2024
@esmeetu
Collaborator

esmeetu commented Mar 27, 2024

Can we simply use this: `"XverseForCausalLM": ("llama", "LlamaForCausalLM")`?
Will the new MoE model use the same name as XverseForCausalLM?

- Fix typo in README to keep the alphabetic order
- Replace KVCache with torch.Tensor in xverse.py
@hxer7963
Contributor Author

Can we simply use this: `"XverseForCausalLM": ("llama", "LlamaForCausalLM")`? Will the new MoE model use the same name as XverseForCausalLM?

Yes, the new MoE model will use the same name as XverseForCausalLM, so we want to add a new name.
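The trade-off being discussed can be made concrete with a registry sketch (modeled on the shape of vLLM's model registry, which maps a Hugging Face architecture string to a module/class pair; the entries and helper below are illustrative, not copied from vLLM's source):

```python
# Illustrative architecture registry: HF architecture name -> (module, class).
_MODELS = {
    "LlamaForCausalLM": ("llama", "LlamaForCausalLM"),
    # esmeetu's alternative: alias xverse onto the llama implementation.
    # "XverseForCausalLM": ("llama", "LlamaForCausalLM"),
    # This PR's choice: a dedicated module that can diverge later (e.g. MoE).
    "XverseForCausalLM": ("xverse", "XverseForCausalLM"),
}

def resolve(arch: str) -> tuple:
    """Return the (module, class) pair registered for an architecture name."""
    return _MODELS[arch]
```

With a dedicated entry, the upcoming MoE changes can land in `xverse.py` without touching the llama implementation.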

@hxer7963
Contributor Author

Hello @WoosukKwon ,

I've addressed all the feedback in the PR, and all checks have passed.

Could you please take another look at the code changes and provide your review?

Thank you for your time and assistance.

Best regards,

hxer7963.

Collaborator

@WoosukKwon WoosukKwon left a comment


LGTM! Thanks for submitting the PR. Looking forward to the next model release!

@WoosukKwon WoosukKwon merged commit 098e177 into vllm-project:main Mar 28, 2024
33 checks passed
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 31, 2024
Co-authored-by: willhe <hexin@xverse.cn>
Co-authored-by: root <root@localhost.localdomain>
4 participants