Qwen3.5-35B-A3B: mean acceptance length is low (#118)

Hello, I tested DFlash with Qwen3.5-35B-A3B using vLLM and found that the mean acceptance length is usually only around 2–3, regardless of whether num_spec_tokens is set to 4 or 16.
Are there multiple versions of Qwen3.5-35B-A3B, and does the DFlash draft model only work with a specific one?

I also filed an issue in vLLM: https://github.com/vllm-project/vllm/issues/42505
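
For context on why the numbers above look suspicious, here is a rough sanity check. It is only a sketch under a simplifying assumption that is mine, not something taken from DFlash or vLLM: each draft token is accepted independently with the same probability `alpha`, verification stops at the first rejection, and the reported length counts the one extra token the target model emits per step. The function name `expected_acceptance_length` is hypothetical. Under that assumption the expected length is `(1 - alpha**(k + 1)) / (1 - alpha)` for `k` draft tokens.

```python
def expected_acceptance_length(alpha: float, num_spec_tokens: int) -> float:
    """Expected tokens emitted per verification step, including the bonus token.

    Assumption (not from DFlash or vLLM): every draft token is accepted
    independently with probability `alpha`, and verification stops at the
    first rejected token.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if alpha == 1.0:
        # Every draft token is accepted, plus one bonus token.
        return float(num_spec_tokens + 1)
    # Closed form of the geometric sum 1 + alpha + ... + alpha**k.
    return (1.0 - alpha ** (num_spec_tokens + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    for k in (4, 16):
        for alpha in (0.5, 0.6, 0.7):
            length = expected_acceptance_length(alpha, k)
            print(f"num_spec_tokens={k:2d} alpha={alpha:.1f} -> {length:.2f}")
```

With `alpha = 0.6` this gives roughly 2.31 for 4 draft tokens and roughly 2.50 for 16, so a length stuck at 2–3 for both settings would be consistent with a per-token acceptance rate of only about 0.5 to 0.7 under this model. That points at a draft/target mismatch rather than at `num_spec_tokens` itself, which is why I am asking about model versions.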
