
[Doc] Add doc to state our model support policy #3948

Merged
10 commits merged into vllm-project:main from doc_model_policy on Apr 10, 2024

Conversation

youkaichao (Member)

Part of #3780.

youkaichao (Member, Author) commented Apr 9, 2024

Two TODOs:

  1. How do we list the test status of models? Adding one more column would make the already large table very crowded. And it might offend some contributors when they see their models are not tested :(
  2. It seems we only have a strict consistency test for facebook/opt-125m and meta-llama/Llama-2-7b-hf, under tests/basic_correctness/test_basic_correctness.py. An output sensibility test is on the way in #3730 ("[CI/Build] A perplexity-computing test for the FP8 KV cache system", originally used in the context of PR #3290). The runtime functionality test works for some models, but I don't have a complete list yet; currently I do a grep search over the tests and examples folders to collect all the model names. (A minimal sketch of this kind of greedy-decoding consistency check is shown below.)
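For reference, here is a minimal sketch of what such a greedy-decoding consistency check looks like, assuming a working vLLM install. It is not the actual vLLM test code; the model name, prompt, and token count are illustrative.

```python
# Sketch of a strict-consistency check: generate with greedy decoding in both
# vLLM and HuggingFace Transformers and require identical generated token ids.
# Illustrative only; the real checks live under tests/ in the vLLM repo.
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "facebook/opt-125m"          # small model, purely for illustration
PROMPT = "The capital of France is"
NUM_TOKENS = 32

# Greedy decoding with vLLM (temperature=0.0 means greedy).
llm = LLM(model=MODEL)
params = SamplingParams(temperature=0.0, max_tokens=NUM_TOKENS)
vllm_ids = list(llm.generate([PROMPT], params)[0].outputs[0].token_ids)

# Greedy decoding with HuggingFace Transformers.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
hf_model = AutoModelForCausalLM.from_pretrained(MODEL)
inputs = tokenizer(PROMPT, return_tensors="pt")
hf_out = hf_model.generate(**inputs, max_new_tokens=NUM_TOKENS, do_sample=False)
hf_ids = hf_out[0][inputs["input_ids"].shape[1]:].tolist()

# Strict consistency: the generated token ids must match exactly.
assert vllm_ids == hf_ids, f"vLLM {vllm_ids} != HF {hf_ids}"
```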

youkaichao requested a review from simon-mo on April 9, 2024, 20:12

We have the following levels of testing for models:

1. **Strict consistency**: We compare the output of the model with the output of the model in the HuggingFace Transformers library under greedy decoding. This is the most stringent test. The following models fall under this category:
Collaborator

QQ: This list could be out of date very easily. Should we just link to a test file instead?

youkaichao (Member, Author)

That is also an option. But essentially it does not tell users what models are tested. Or we can tell users to grep our tests & examples folder to see if a model is tested? 🤣

Collaborator

second @rkooo567 that a static list on the doc is probably not ideal - maybe we can refer to the CI?

rkooo567 (Collaborator), Apr 10, 2024

One simple solution I can think of is to centralize this constant to tested_model.py (new file) and link to this file instead?
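For illustration, such a centralized file might look roughly like the sketch below. This is hypothetical: the file does not exist, the constant names are assumptions, and the model names are simply the ones mentioned elsewhere in this thread.

```python
# tested_model.py -- hypothetical sketch of the file suggested above,
# so the docs could link to one place instead of duplicating the list.

# Models covered by the strict consistency test (greedy-decoding comparison
# against HuggingFace Transformers).
STRICT_CONSISTENCY_MODELS = [
    "facebook/opt-125m",
    "meta-llama/Llama-2-7b-hf",
]

# Models exercised by runtime functionality tests and examples.
RUNTIME_FUNCTIONALITY_MODELS = [
    "EleutherAI/pythia-70m",
    "bigcode/tiny_starcoder_py",
    "gpt2",
]
```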


ywang96 (Collaborator) left a comment

Left a few comments! Thank you very much for putting this down in the doc!

5 review comments on docs/source/models/supported_models.rst (outdated, resolved)
- EleutherAI/pythia-70m
- bigcode/tiny_starcoder_py
- gpt2
4. **Community feedback**: We rely on the community to provide feedback on the models. If a model is broken or not working as expected, we encourage users to raise issues to report it or open pull requests to fix it. The rest of the models fall under this category.
Collaborator

The naming here doesn't feel like a level but rather a general guideline - do you mean if certain models are not working at all (due to layer changes, kernel changes, etc), then we rely on the community to fix these models?

youkaichao (Member, Author)

> do you mean if certain models are not working at all (due to layer changes, kernel changes, etc), then we rely on the community to fix these models

Currently I would say yes. Because we don't test them, it is possible they are broken. But we will do our best to maintain them, e.g. when we make some change in the vLLM core, we typically update the model files. It's best-effort anyway.


At vLLM, we are committed to facilitating the integration and support of third-party models within our ecosystem. Our approach is designed to balance the need for robustness and the practical limitations of supporting a wide range of models. Here’s how we manage third-party model support:

1. **Community-Driven Support**: We encourage community contributions for adding new models. When a user requests support for a new model, we welcome pull requests (PRs) from the community. These contributions are evaluated primarily on the sensibility of the output they generate, rather than strict consistency with existing implementations such as those in transformers.
Collaborator

Probably also worth pointing out that a basic sensibility test is also required in the model support PR.

youkaichao (Member, Author)

A sensibility report is required in the PR. However, as for adding it to the test suite, there is a strong concern about the time and resource burden it would add to our CI system.

Collaborator

Totally makes sense - I think an attached report from the author in the PR is fine for now, and we don't need to build it into our CI.

docs/source/models/supported_models.rst (outdated review comment, resolved)


youkaichao (Member, Author)

@ywang96 @rkooo567 please check ff1ae0d, where I leave URLs instead of a static list of supported models.

youkaichao (Member, Author)

@ywang96 thanks for the detailed review!


We have the following levels of testing for models:

1. **Strict Consistency**: We compare the output of the model with the output of the model in the HuggingFace Transformers library under greedy decoding. This is the most stringent test. Please refer to the https://github.com/vllm-project/vllm/tree/main/tests/basic_correctness folder for the models that have passed this test.
Collaborator

nit: but I feel like linking these files, https://github.com/vllm-project/vllm/blob/main/tests/models/test_models.py (small models) and https://github.com/vllm-project/vllm/blob/main/tests/models/test_big_models.py (big models), is better because they have a stricter consistency check (basically they check tokens up to 96, whereas the basic correctness test only checks the first 5).

youkaichao (Member, Author)

fixed in d59e482

We have the following levels of testing for models:

1. **Strict Consistency**: We compare the output of the model with the output of the model in the HuggingFace Transformers library under greedy decoding. This is the most stringent test. Please refer to the https://github.com/vllm-project/vllm/tree/main/tests/basic_correctness folder for the models that have passed this test.
2. **Output Sensibility**: We check if the output of the model is sensible and coherent, by measuring the perplexity of the output and checking for any obvious errors. This is a less stringent test.
Collaborator

Also consider adding a link?

youkaichao (Member, Author)

We don't have any test for Output Sensibility yet.
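For reference, a minimal sketch of what a perplexity-based sensibility check could look like is shown below. It is not part of vLLM's test suite; the scorer model, the sample text, and the threshold are assumptions.

```python
# Sketch of an output-sensibility check: score a generated completion with a
# reference language model and require its perplexity to stay below a loose
# threshold. Illustrative only; not part of vLLM's CI.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

SCORER = "gpt2"                # reference LM, chosen purely for illustration
GENERATED_TEXT = "Paris is the capital of France and one of its largest cities."
PPL_THRESHOLD = 100.0          # arbitrary bound; incoherent text scores far higher

tokenizer = AutoTokenizer.from_pretrained(SCORER)
model = AutoModelForCausalLM.from_pretrained(SCORER)

# Causal-LM loss over the generated text; exp(loss) is its perplexity.
input_ids = tokenizer(GENERATED_TEXT, return_tensors="pt").input_ids
with torch.no_grad():
    loss = model(input_ids, labels=input_ids).loss
perplexity = torch.exp(loss).item()

assert perplexity < PPL_THRESHOLD, f"output looks incoherent (perplexity={perplexity:.1f})"
```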

rkooo567 (Collaborator)

Looks pretty good to me! just some nits

youkaichao requested a review from ywang96 on April 10, 2024, 15:51
ywang96 (Collaborator) left a comment

LGTM! Left some final nits to add names to the links (otherwise they will all show as vllm-project/vllm on the doc).

2 review comments on docs/source/models/supported_models.rst (outdated, resolved)
youkaichao and others added 2 commits on April 10, 2024, 09:24 (both co-authored by Roger Wang <136131678+ywang96@users.noreply.github.com>)
ywang96 enabled auto-merge (squash) on April 10, 2024, 16:29
ywang96 merged commit e353974 into vllm-project:main on Apr 10, 2024
35 checks passed
youkaichao deleted the doc_model_policy branch on April 10, 2024, 17:06
SageMoore pushed a commit to neuralmagic/nm-vllm that referenced this pull request on Apr 11, 2024
andy-neuma pushed a commit to neuralmagic/nm-vllm that referenced this pull request on Apr 12, 2024
z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request on Apr 22, 2024
mawong-amd pushed a commit to ROCm/vllm that referenced this pull request on Jun 3, 2024