Description
Proposal to improve performance
How can I install and use vLLM to serve multiple large language models, specifically:
- google/gemma-3-27b-it
- mistralai/Mistral-Small-3.1-24B-Instruct-2503
- Skywork/Skywork-R1V-38B
- Qwen/Qwen2.5-VL
The core issue is that google/gemma-3-27b-it requires a specific, relatively new version of the transformers library. Installing that version can break compatibility with vLLM itself, or with other models (such as certain Mistral or Qwen releases) that depend on a different, often older, transformers version that vLLM commonly supports.
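To make the conflict concrete, here is a small sketch that checks an installed transformers version against per-model minimum requirements before launching anything. The minimum versions in the table are illustrative assumptions, not verified pins.

```python
# Sketch: detect which models a given transformers version cannot serve.
# The per-model minimum versions below are illustrative assumptions,
# not verified requirements.

def parse_version(v: str) -> tuple:
    """Turn a version string like '4.50.0' into (4, 50, 0) for comparison."""
    return tuple(int(part) for part in v.split("."))

# Hypothetical minimum transformers versions for each model.
MIN_TRANSFORMERS = {
    "google/gemma-3-27b-it": "4.50.0",
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503": "4.49.0",
    "Skywork/Skywork-R1V-38B": "4.49.0",
    "Qwen/Qwen2.5-VL": "4.49.0",
}

def incompatible_models(installed: str, requirements=MIN_TRANSFORMERS):
    """Return the models whose minimum transformers version exceeds `installed`."""
    return [
        model for model, minimum in requirements.items()
        if parse_version(installed) < parse_version(minimum)
    ]

if __name__ == "__main__":
    # Under the assumed pins, an environment on 4.49.0 can serve
    # everything except the Gemma 3 checkpoint.
    print(incompatible_models("4.49.0"))
```

A check like this run at startup makes the failure mode explicit instead of surfacing as an opaque import or config error deep inside model loading.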
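One common workaround, assuming the dependency conflict cannot be resolved inside a single environment, is to give each model family its own virtual environment with its own transformers pin and run one `vllm serve` process per model on a separate port. The version pins and port numbers below are illustrative assumptions, not tested values.

```shell
# Sketch: isolate each model's dependencies in its own venv.
# The transformers pins and ports are illustrative assumptions.

# Environment for Gemma 3 (assumed to need a newer transformers).
python3 -m venv ~/venvs/vllm-gemma3
~/venvs/vllm-gemma3/bin/pip install vllm "transformers>=4.50.0"

# Environment for the other models (assumed to work with vLLM's default pin).
python3 -m venv ~/venvs/vllm-default
~/venvs/vllm-default/bin/pip install vllm

# One server process per model, each on its own port.
~/venvs/vllm-gemma3/bin/vllm serve google/gemma-3-27b-it --port 8000 &
~/venvs/vllm-default/bin/vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --port 8001 &
```

The cost of this layout is one GPU memory footprint per server process, so it only works when the models fit on separate GPUs or the hardware can hold them concurrently.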
Your current environment (if you think it is necessary)
The output of `python collect_env.py`
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.