Issues: vllm-project/vllm
missing latest tag in vllm-cpu image
usage (How to use vllm)
#15142 opened Mar 19, 2025 by nirrozenbaum
[Feature]: Configurable metrics export format - Prometheus, OpenTelemetry
feature request (New feature or request)
#15141 opened Mar 19, 2025 by pavolloffay
[Bug]: vllm 0.7.3-0.8.0 Qwen VL 2.5 - No available block found in 60 second. for hours for a video of 300 frames (cpus 100%, gpu: 0%)
bug (Something isn't working)
#15136 opened Mar 19, 2025 by denadai2
[Feature]: Don't append pid in triton cache directory when using vllm's torch_compile_cache
feature request (New feature or request)
#15133 opened Mar 19, 2025 by youkaichao
[Usage]: multiround QA when using qwen2.5vl with the same input image
usage (How to use vllm)
#15132 opened Mar 19, 2025 by QuLiao1117
[Usage]: relationship between embedding size and vocab_size
usage (How to use vllm)
#15131 opened Mar 19, 2025 by Happy2Git
[Bug]: vllm.engine.async_llm_engine.AsyncEngineDeadError
bug (Something isn't working)
#15127 opened Mar 19, 2025 by Arya-Mayank
[Usage]: Qwen2.5-VL-7B-Instruct <|vision_start|><|image_pad|><|vision_end|> always appears before the user text, even i set image after user text
usage (How to use vllm)
#15125 opened Mar 19, 2025 by chiaitian
[Bug]: AssertionError - assert loaded_weight.shape[output_dim] == self.org_vocab_size
bug (Something isn't working)
#15124 opened Mar 19, 2025 by UCC-team
[RFC]: layer-wise kv cache offloading to enable larger batches
RFC
#15123 opened Mar 19, 2025 by sleepwalker2017
[Bug]: Qwen VL 2.5 doesn't work in v0.8.0 - again
bug (Something isn't working)
#15122 opened Mar 19, 2025 by andrePankraz
[Bug]: [V1] AWQ, GPTQ fail to start due to TypeError: an integer is required
bug (Something isn't working)
#15121 opened Mar 19, 2025 by nFunctor
[Bug]: BadRequestError(400) when using completions API with stream=true and echo=true
bug (Something isn't working)
#15119 opened Mar 19, 2025 by fyuan1316
[Feature]: Improve GPTQ implementation
feature request (New feature or request)
#15116 opened Mar 19, 2025 by 1096125073
[Usage]: Model compute_logits always get None for sampling_metadata
usage (How to use vllm)
#15115 opened Mar 19, 2025 by yanyongyu
[Bug]: flash_attn_with_kvcache kernel, an illegal memory access
bug (Something isn't working)
#15113 opened Mar 19, 2025 by pengcuo
[Bug]: Internal Server Error when using Qwen2-VL-7B with vLLM Docker Container
bug (Something isn't working)
#15110 opened Mar 19, 2025 by ChengaoJ
[Usage]: How to use VLLM added functions for torch in a separate environment?
usage (How to use vllm)
#15108 opened Mar 19, 2025 by luentong
First tpot/itl is too long?
performance (Performance-related issues)
#15106 opened Mar 19, 2025 by jessiewiswjc
[Bug]: 0.8.0(V1) RayChannelTimeoutError when inferencing DeepSeekV3 on 16 H20 with large batch size
bug (Something isn't working)
#15102 opened Mar 19, 2025 by jeffye-dev
[Bug]: Run DeepSeek-R1-awq model on AMD MI210 meet an error
bug (Something isn't working)
#15101 opened Mar 19, 2025 by liebedir
[Bug]: 0.8.0(V1) Ray cannot find model pyarrow and pandas
bug (Something isn't working)
#15100 opened Mar 19, 2025 by jeffye-dev
[Bug]: 0.8.0(V1) crash on NCCL when load MoE model on 16 GPUs(H20)
bug (Something isn't working)
#15098 opened Mar 19, 2025 by jeffye-dev
[Bug]: loading default chat template occurs TypeError: unhashable type: 'dict'
bug (Something isn't working)
#15095 opened Mar 19, 2025 by Shuntw6096
[Usage]: torch.compile is turned on, but the model LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct does not support it.
usage (How to use vllm)
#15093 opened Mar 19, 2025 by Bhaveshdhapola