[Model] Add support for xverse #3610
Conversation
What is the difference in architecture between llama and xverse?
Nice question. The current xverse model architecture is no different from llama, but xverse is expected to add MoE features within two weeks. To maintain an independent update schedule, it is necessary to support the xverse architecture separately in vLLM.
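To illustrate why a separate implementation is worthwhile even though the architectures currently match, here is a minimal sketch (not the code in this PR) of giving xverse its own class that can later diverge from llama; subclassing `LlamaForCausalLM` and the import path shown are assumptions for illustration only:

```python
# Illustrative sketch only -- not the actual code added in this PR.
# Because today's XVERSE architecture matches llama, the xverse implementation
# could start by reusing the existing llama model class under its own name,
# so that xverse-specific changes (such as the upcoming MoE variant) can
# evolve independently of llama.
from vllm.model_executor.models.llama import LlamaForCausalLM


class XverseForCausalLM(LlamaForCausalLM):
    """Placeholder; the PR itself adds a standalone xverse.py implementation."""
```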
Hello @WoosukKwon, I've submitted PR #3610 and I would greatly appreciate your help with reviewing the code changes and providing your feedback. Thank you for your time and assistance. Best regards,
@hxer7963 Could you please fix the formatting error by running `format.sh`?
README.md
- Yi (`01-ai/Yi-6B`, `01-ai/Yi-34B`, etc.)
- Xverse (`xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.)
nit: Let's keep the alphabetic order :)
Suggested change:
- Xverse (`xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.)
- Yi (`01-ai/Yi-6B`, `01-ai/Yi-34B`, etc.)
vllm/model_executor/models/xverse.py
hf_model_weights_iterator)
from vllm.sequence import SamplerOutput

KVCache = Tuple[torch.Tensor, torch.Tensor]
Please replace `KVCache` with `torch.Tensor`. We changed the type recently.
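For context, the requested change amounts to dropping the tuple-based alias from the cache annotations. Below is a minimal sketch of that edit; the forward signature shown is an illustrative assumption, not the actual signature in xverse.py:

```python
from typing import List, Tuple

import torch

# Old style: each per-layer cache was annotated via a (key, value) tuple alias.
KVCache = Tuple[torch.Tensor, torch.Tensor]


def forward_old(kv_caches: List[KVCache]) -> None:  # illustrative signature
    ...


# Style requested in the review: annotate each cache entry directly as a
# torch.Tensor, matching the recent type change in vLLM.
def forward_new(kv_caches: List[torch.Tensor]) -> None:  # illustrative signature
    ...
```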
Thank you @WoosukKwon for the reply and suggestions. The issues mentioned in the review have been fixed in the latest commit.
Can we simply use this:
- Fix typo in README to keep the alphabetic order
- Replace KVCache with torch.Tensor in xverse.py
Yes, the new MoE model will use the same name as XverseForCausalLM, so we want to add a new name.
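To make the naming point concrete, the sketch below shows the idea of the architecture-name mapping: checkpoints advertise an `architectures` string in their Hugging Face config, and vLLM resolves it to an implementation, so XVERSE checkpoints need their own entry rather than reusing llama's. The dictionary is purely illustrative and is not the registry code touched by this PR:

```python
# Sketch only: the real registry lives inside vLLM and may be structured
# differently. The point is that the architecture name advertised by XVERSE
# checkpoints ("XverseForCausalLM") maps to its own module, which can later
# diverge from llama (e.g. for the upcoming MoE variant).
MODEL_REGISTRY_SKETCH = {
    "LlamaForCausalLM": "vllm.model_executor.models.llama",    # existing entry
    "XverseForCausalLM": "vllm.model_executor.models.xverse",  # new entry
}
```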
Hello @WoosukKwon, I've addressed all the feedback in the PR, and all checks have passed. Could you please take another look at the code changes and provide your review? Thank you for your time and assistance. Best regards, hxer7963.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for submitting the PR. Looking forward to the next model release!
Co-authored-by: willhe <hexin@xverse.cn>
Co-authored-by: root <root@localhost.localdomain>
Add support for xverse models.
We tested the xverse 7B/13B/65B chat models and the quantized GPTQ model locally, and the vLLM responses are normal.
You can verify by downloading xverse models from Hugging Face (https://huggingface.co/xverse) or Modelscope (https://www.modelscope.cn/search?search=xverse).
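For anyone who wants to reproduce that check once this PR is in a release, a minimal generation script with vLLM's public Python API might look like the sketch below; the sampling values are arbitrary, and `trust_remote_code=True` is an assumption that may not be needed with native support:

```python
# Minimal sketch of sanity-checking an XVERSE chat checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="xverse/XVERSE-7B-Chat", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Generate a short completion and print it; each RequestOutput carries the
# prompt and its generated candidates.
outputs = llm.generate(["Hello, please introduce yourself."], sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)
```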
Furthermore, we executed `format.sh` before submitting the PR.
Request: after the current PR is merged, could we add a new release tag so that the latest version of the package supports inference with xverse models via pip installation?