Skip to content

[Bug]: qwen36 infinite loop issue #1737

@wenhuach21

Description

@wenhuach21

Problem Description

docker run --gpus all --rm --name qwen36-35 -p 8080:8000 -v ~/.cache/huggingface:/root/.cache/huggingface --ipc=host -e HUGGING_FACE_HUB_TOKEN="$HUGGING_FACE_HUB_TOKEN" -e VLLM_ATTENTION_BACKEND=FLASH_ATTN vllm-nightly-transformers-main --model Intel/Qwen3.6-35B-A3B-int4-AutoRound --served-model-name "qwen/qwen36-35b" --trust-remote-code --api-key mumu-102495153 --max-model-len 192382 --max-num-seqs 4 --gpu-memory-utilization 0.98 --enable-auto-tool-choice --tool-call-parser qwen3_coder --kv-cache-dtype fp8 --reasoning-parser qwen3 --max-num-batched-tokens 8192 --enable-prefix-caching

lianglv
Intel org
2 days ago

We don't have your docker image. Could you provide minimal steps to reproduce the infinite loop issue?

wenhuach
Intel org
2 days ago

Does this issue occur for all prompts, or only for specific ones? We would appreciate it if you could share some example prompts that reproduce the issue.

pathosethoslogos
2 days ago

edited 2 days ago

I can confirm, indeed this is the case.

You can tell by the high number of downloads and low number of hearts for this model.

zsmweb
about 21 hours ago

I use an RTX 3090 with 24GB to run the model. Not every conversation gets stuck in a loop. I use CherryStudio to check the weather.
Here is my Dockerfile.

cat Dockerfile
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive
ENV PIP_NO_CACHE_DIR=1
ENV PYTHONUNBUFFERED=1

RUN apt-get update && apt-get install -y
python3 python3-pip git
&& rm -rf /var/lib/apt/lists/*

RUN python3 -m pip install --upgrade pip setuptools wheel

RUN python3 -m pip install -U
vllm --pre
--index-url https://pypi.org/simple
--extra-index-url https://wheels.vllm.ai/nightly

RUN python3 -m pip install -U
git+https://github.com/huggingface/transformers.git

RUN python3 -m pip install conch-triton-kernels

ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]

Reproduction Steps

~

Environment Information

No response

Error Logs

Additional Context

No response

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions