How to install and use vLLM to serve multiple large language models

### Proposal to improve performance

How to install and use vLLM to serve multiple large language models:

- google/gemma-3-27b-it
- mistralai/Mistral-Small-3.1-24B-Instruct-2503
- Skywork/Skywork-R1V-38B
- Qwen/Qwen2.5-VL

The core issue is that google/gemma-3-27b-it requires a specific (often newer) version of the transformers library. Installing that version might break compatibility with vLLM itself, or with the other models (such as certain Mistral or Qwen versions), which might rely on a different, possibly older and more stable, transformers version that vLLM commonly supports.
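One possible way around the conflict described above is to give each model its own virtual environment and its own server process, so that a transformers version installed for one model cannot affect the others. The sketch below only builds and prints the commands; it does not run them. The `ModelEnv` class, the `plan_commands` helper, the `venvs` directory, and the port numbers are hypothetical names chosen for illustration. It assumes a POSIX venv layout (`bin/pip`, `bin/vllm`) and leaves the transformers requirement unset, because the issue does not state which version each model needs.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelEnv:
    model: str  # Hugging Face model id, copied verbatim from the issue
    port: int   # hypothetical port for this model's server
    # Hypothetical pip requirement for transformers; None keeps whatever
    # version the vllm package pulls in on its own.
    transformers_spec: Optional[str] = None


def plan_commands(env: ModelEnv, venv_root: str = "venvs") -> List[List[str]]:
    """Return the commands that create an isolated venv and serve one model."""
    name = env.model.split("/")[-1]
    venv_dir = f"{venv_root}/{name}"
    pip = f"{venv_dir}/bin/pip"    # assumes a POSIX venv layout
    vllm = f"{venv_dir}/bin/vllm"
    cmds = [
        ["python3", "-m", "venv", venv_dir],
        [pip, "install", "vllm"],
    ]
    if env.transformers_spec:
        # Applied after vllm so the model-specific requirement wins.
        cmds.append([pip, "install", env.transformers_spec])
    cmds.append([vllm, "serve", env.model, "--port", str(env.port)])
    return cmds


# Ports are arbitrary; fill in transformers_spec per model once the
# required version is known.
MODELS = [
    ModelEnv("google/gemma-3-27b-it", 8001),
    ModelEnv("mistralai/Mistral-Small-3.1-24B-Instruct-2503", 8002),
    ModelEnv("Skywork/Skywork-R1V-38B", 8003),
    ModelEnv("Qwen/Qwen2.5-VL", 8004),
]

for env in MODELS:
    for cmd in plan_commands(env):
        print(shlex.join(cmd))
    print()
```

Each printed group can be run by hand in its own shell. The trade-off is disk space and GPU memory: every environment carries its own copy of vLLM, and every server needs enough GPU memory for its model.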

### Report of performance regression

### Misc discussion on performance

### Your current environment (if you think it is necessary)

```text
The output of `python collect_env.py`
```


### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
