Search Results · repo:vllm-project/vllm language:Python
9k results

Your current environment
The output of `python collect_env.py`:
==============================
System Info
==============================
OS ...
bug
twright8
- 1
- Opened 1 hour ago
- #20193
Name of failing test
tests/quantization/test_fp8.py::test_scaled_fp8_quant
Basic information
- [ ] Flaky test
- [x] Can reproduce locally
- [ ] Caused by external libraries (e.g. bug in transformers) ...
ci-failure
mgoin
- Opened 2 hours ago
- #20192
Something has changed since the working commit (0.9.2.dev223+gee5ad8d2c plus my PR). I can reproduce the same gibberish
on 0.9.2.dev283+ge9fd658af even without the full cudagraph compile option.
After ...
cjackal
- 1
- Opened 4 hours ago
- #20186
WARNING 06-27 13:31:35 [sampling_params.py:344] temperature 1e-06 is less than 0.01, which may cause numerical errors nan or inf in tensors. We have maxed it out to 0.01.
(VllmWorker rank=1 pid=1162710) ...
bug
sleepwalker2017
- Opened 4 hours ago
- #20184
I've spent hours trying to get the Hunyuan model working with vLLM. Downgrading, upgrading, testing different
versions... nothing worked so far.
The team shared a Docker image using vLLM 0.8.5, so I’m assuming ...
usage
summersonnn
- Opened 5 hours ago
- #20183
🚀 The feature, motivation and pitch
Tencent released this new model: https://huggingface.co/tencent/Hunyuan-A13B-Instruct
It matches bigger models on benchmarks. It has a decent size to run locally and ...
feature request
RodriMora
- 2
- Opened 7 hours ago
- #20182
🚀 The feature, motivation and pitch
It would be great if we could have support for batch inference for online serving. It seems only supported for offline
inference. Also, it seems that the OpenAI interface ...
feature request
eslambakr
- Opened 7 hours ago
- #20181
Your current environment
The output of `python collect_env.py`:
Collecting environment information...
==============================
System Info
============================== ...
bug
tfia
- Opened 9 hours ago
- #20178
Your current environment
The output of `python collect_env.py`:
Collecting environment information...
==============================
System Info
============================== ...
bug
luoling1993
- 2
- Opened 9 hours ago
- #20177
Proposal to improve performance
By reading relative parts of source code and running some test, we find that when launching a MoE model like Qwen3, vLLM
seems to use Triton-based fused moe kernel. While ...
performance
oldcpple
- Opened 9 hours ago
- #20176
