
[Usage]: Does Qwen2-VL support torch.compile in vLLM (V0)? #12693


Description

@bao231

Your current environment

Hi there,

First of all, thank you for all the hard work on vLLM; it's an excellent project!

I am currently exploring the use of torch.compile within vLLM to optimize inference performance. I have seen that many decoder-only models (such as the GPT series and LLaMA) work well with torch.compile. However, I am particularly interested in the Qwen2-VL model and could not find any documentation or discussion regarding torch.compile support for it.
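For concreteness, the kind of invocation I have in mind is sketched below. The `compilation_config` argument is what I understand to be the torch.compile entry point in recent vLLM releases for decoder-only models; whether Qwen2-VL respects it is exactly my question, so treat this as an assumption rather than a working recipe:

```python
from vllm import LLM, SamplingParams

# Sketch, not a confirmed recipe: compilation_config=3 is the integer
# shorthand recent vLLM releases accept for the full torch.compile
# optimization level on decoder-only models. Whether the multi-modal
# Qwen2-VL path honors it is the open question here.
llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    compilation_config=3,
)

out = llm.generate(["Describe the weather today."], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```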

Could you please clarify the following:

1. Is Qwen2-VL currently supported with torch.compile in the latest version of vLLM?
2. If not, are there any plans to add support for Qwen2-VL with torch.compile in the near future?
3. Are there any known workarounds or tips for using torch.compile with multi-modal models like Qwen2-VL? (A sketch of the kind of workaround I mean follows this list.)
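
To make question 3 concrete, the only workaround I could think of sits outside vLLM entirely: compile just the language-model decoder of the HuggingFace implementation and leave the vision tower eager, since variable-resolution images would presumably force constant recompilation of the ViT. The attribute layout below (`model.model` for the text decoder, `model.visual` for the vision encoder) is my reading of the transformers implementation and may differ across versions:

```python
import torch
from transformers import Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

# Compile only the token-shaped text decoder; dynamic=True asks the
# compiler to tolerate varying sequence lengths instead of recompiling
# per shape. The vision tower (model.visual) stays eager on purpose.
model.model = torch.compile(model.model, dynamic=True)
```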
Any guidance or insights would be greatly appreciated!

Thank you for your time and assistance.

How would you like to use vllm

I want to run inference of a specific model (Qwen2-VL). I don't know how to integrate it with vLLM.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Labels

stale (over 90 days of inactivity), usage (how to use vllm)
