[Doc] Add doc to state our model support policy #3948
Conversation
Two TODOs:
> We have the following levels of testing for models:
>
> 1. **Strict consistency**: We compare the output of the model with the output of the model in the HuggingFace Transformers library under greedy decoding. This is the most stringent test. The following models fall under this category:
QQ: This list could be out of date very easily. Should we just link to a test file instead?
That is also an option. But essentially it does not tell users what models are tested. Or we can tell users to grep our `tests` & `examples` folders to see if a model is tested? 🤣
I second @rkooo567 that a static list in the doc is probably not ideal - maybe we can refer to the CI?
One simple solution I can think of is to centralize this constant in a new file, `tested_model.py`, and link to that file instead?

- `vllm/tests/models/test_models.py` (line 10 in 0258b7a): `MODELS = [`
- `vllm/tests/models/test_big_models.py` (line 9 in 0258b7a): `MODELS = [`
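For illustration, a minimal sketch of what such a centralized file could look like. The file name follows the proposal above; the structure and the big-model entry are hypothetical, not the actual contents of the two test files (the small-model names are taken from the list quoted later in this thread):

```python
# tests/models/tested_model.py -- hypothetical new file that would hold
# the single source of truth the docs could link to.

# Small models exercised on every CI run (entries illustrative).
SMALL_MODELS = [
    "EleutherAI/pythia-70m",
    "gpt2",
]

# Larger models exercised less frequently (entry illustrative).
BIG_MODELS = [
    "meta-llama/Llama-2-7b-hf",
]

# The union that both test files and the documentation would import.
TESTED_MODELS = SMALL_MODELS + BIG_MODELS
```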
Left a few comments! Thank you very much for putting this down in the doc!
> - EleutherAI/pythia-70m
> - bigcode/tiny_starcoder_py
> - gpt2
>
> 4. **Community feedback**: We rely on the community to provide feedback on the models. If a model is broken or not working as expected, we encourage users to raise issues to report it or open pull requests to fix it. The rest of the models fall under this category.
The naming here doesn't feel like a level but rather a general guideline - do you mean that if certain models are not working at all (due to layer changes, kernel changes, etc.), then we rely on the community to fix these models?
> do you mean that if certain models are not working at all (due to layer changes, kernel changes, etc.), then we rely on the community to fix these models
Currently I would say yes. Because we don't test them, it is possible they are broken. But we will do our best to maintain them; e.g., when we make some change in the vLLM core, we typically update the model files. It's best-effort anyway.
> At vLLM, we are committed to facilitating the integration and support of third-party models within our ecosystem. Our approach is designed to balance the need for robustness and the practical limitations of supporting a wide range of models. Here’s how we manage third-party model support:
>
> 1. **Community-Driven Support**: We encourage community contributions for adding new models. When a user requests support for a new model, we welcome pull requests (PRs) from the community. These contributions are evaluated primarily on the sensibility of the output they generate, rather than strict consistency with existing implementations such as those in transformers.
Probably also worth pointing out that a basic sensibility test is also required in the model support PR.
A sensibility report is required in the PR. However, with respect to adding it as a test, there is a strong concern about the time and resource burden it would add to our CI system.
Totally makes sense - I think an attached report from the author in the PR is fine for now, and we don't need to build it into our CI.
@ywang96 thanks for the detailed review!
> We have the following levels of testing for models:
>
> 1. **Strict Consistency**: We compare the output of the model with the output of the model in the HuggingFace Transformers library under greedy decoding. This is the most stringent test. Please refer to https://github.com/vllm-project/vllm/tree/main/tests/basic_correctness folder for the models that have passed this test.
nit: but I feel like linking these files is better: https://github.com/vllm-project/vllm/blob/main/tests/models/test_models.py (small) and https://github.com/vllm-project/vllm/blob/main/tests/models/test_big_models.py (big), because they have a stricter consistency check (basically they check up to 96 tokens, whereas the basic correctness test only checks the first 5).
fixed in d59e482
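For context, a minimal sketch of the kind of strict consistency check those test files perform, comparing vLLM's greedy output against HuggingFace Transformers. This is simplified: the real tests use shared fixtures and different helper names, and the prompts here are made up.

```python
import pytest
from transformers import AutoModelForCausalLM, AutoTokenizer

from vllm import LLM, SamplingParams

MODELS = ["gpt2"]
PROMPTS = ["Hello, my name is", "The capital of France is"]  # made-up prompts
MAX_TOKENS = 96  # the model tests compare up to 96 generated tokens


@pytest.mark.parametrize("model_name", MODELS)
def test_greedy_matches_hf(model_name):
    # Reference output from HuggingFace Transformers with greedy decoding.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    hf_model = AutoModelForCausalLM.from_pretrained(model_name)
    hf_texts = []
    for prompt in PROMPTS:
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = hf_model.generate(
            **inputs, max_new_tokens=MAX_TOKENS, do_sample=False
        )
        # Drop the prompt tokens so only the generated continuation remains.
        generated = output_ids[0][inputs.input_ids.shape[1]:]
        hf_texts.append(tokenizer.decode(generated, skip_special_tokens=True))

    # vLLM output; temperature=0 selects greedy decoding.
    llm = LLM(model=model_name)
    params = SamplingParams(temperature=0.0, max_tokens=MAX_TOKENS)
    vllm_texts = [out.outputs[0].text for out in llm.generate(PROMPTS, params)]

    # Strict consistency: every generated continuation must match exactly.
    for hf_text, vllm_text in zip(hf_texts, vllm_texts):
        assert hf_text == vllm_text
```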
> We have the following levels of testing for models:
>
> 1. **Strict Consistency**: We compare the output of the model with the output of the model in the HuggingFace Transformers library under greedy decoding. This is the most stringent test. Please refer to https://github.com/vllm-project/vllm/tree/main/tests/basic_correctness folder for the models that have passed this test.
> 2. **Output Sensibility**: We check if the output of the model is sensible and coherent, by measuring the perplexity of the output and checking for any obvious errors. This is a less stringent test.
Also consider adding a link?
We don't have any test for Output Sensibility yet.
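Purely as an illustration of what such a check could look like once it exists (no such test is in the repo; the helper names, reference model choice, and threshold below are all hypothetical):

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical threshold: coherent text scores well below this under a
# small reference model, while degenerate output scores far above it.
MAX_PERPLEXITY = 100.0


def perplexity(text: str, model, tokenizer) -> float:
    """Perplexity of `text` under a reference language model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids makes the HF causal LM return the mean
        # cross-entropy loss over the sequence.
        loss = model(**enc, labels=enc.input_ids).loss
    return math.exp(loss.item())


def check_sensibility(generated_text: str) -> None:
    # Obvious-error check first: reject empty or whitespace-only output.
    assert generated_text.strip(), "empty output"
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    ppl = perplexity(generated_text, model, tokenizer)
    assert ppl < MAX_PERPLEXITY, f"perplexity too high: {ppl:.1f}"
```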
Looks pretty good to me! Just some nits.
LGTM! Left some final nits to add names to the links (otherwise they will all show as `vllm-project/vllm` in the doc).
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Part of #3780.