Qwen3.5-35B-A3B Mean acceptance length low #118

@luolun

Hello, I tested DFlash with Qwen3.5-35B-A3B using vLLM and found that the mean acceptance length stays around 2 to 3 most of the time, whether num_spec_tokens is set to 4 or 16.
Are there multiple versions of Qwen3.5-35B-A3B, and does the DFlash draft model only work with a specific one?

Also filed an issue in vllm: vllm-project/vllm#42505
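
For context, here is a minimal sketch of how the mean acceptance length is commonly derived from speculative decoding counters. It assumes the usual definition: one token from the target model per verification step, plus the average number of accepted draft tokens. The function names and the counter values are hypothetical and are not measurements from this issue.

```python
def mean_acceptance_length(num_drafts: int, num_accepted_tokens: int) -> float:
    """Average tokens emitted per verification step.

    Assumed definition: each step always yields one token from the target
    model, plus however many draft tokens were accepted in that step.
    """
    if num_drafts <= 0:
        raise ValueError("num_drafts must be positive")
    return 1.0 + num_accepted_tokens / num_drafts


def max_acceptance_length(num_spec_tokens: int) -> int:
    """Upper bound per step: every draft token accepted, plus one target token."""
    return num_spec_tokens + 1


# Hypothetical counters, for illustration only.
NUM_DRAFTS = 1000
NUM_ACCEPTED = 1500

for k in (4, 16):
    mal = mean_acceptance_length(NUM_DRAFTS, NUM_ACCEPTED)
    print(f"num_spec_tokens={k}: mean={mal:.2f}, upper bound={max_acceptance_length(k)}")
```

Under this definition, a mean that stays near 2 to 3 for both settings means that on average only one or two draft tokens are accepted per step, so raising num_spec_tokens from 4 to 16 raises the upper bound without raising the observed mean.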
