
[SpecDecode] support spec decode#395

Closed
MengqingCao wants to merge 1 commit into vllm-project:main from MengqingCao:spec

Conversation

@MengqingCao
Collaborator

What this PR does / why we need it?

Support spec decode.
Please merge this after vllm-project/vllm#15195 is merged.

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao
Collaborator Author

Using this PR will raise a segmentation fault in vllm/spec_decode/metrics.py at

        self._aggregate_num_accepted_tokens = torch.tensor(
            0, dtype=torch.long, device="cpu", pin_memory=pin_memory)
        self._aggregate_num_emitted_tokens = torch.tensor(
            0, dtype=torch.long, device="cpu", pin_memory=pin_memory)

This is a bug from PyTorch 2.5.1; reproduction code:

    import torch
    torch.tensor(0, dtype=torch.long, device="cpu", pin_memory=True)
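A possible workaround (a sketch only, not necessarily the fix adopted here) is to allocate the CPU tensor first and pin it afterwards with `Tensor.pin_memory()`, avoiding the `pin_memory=True` constructor path that crashes on PyTorch 2.5.1. Pinning is also guarded on accelerator availability, since `pin_memory()` requires one:

```python
import torch

# Pin only when an accelerator is available; pinning requires one.
pin = torch.cuda.is_available()

# Allocate first, without pin_memory=True in the constructor.
t = torch.tensor(0, dtype=torch.long, device="cpu")
if pin:
    # Pinning after allocation sidesteps the constructor path
    # that segfaults on PyTorch 2.5.1.
    t = t.pin_memory()
```

The resulting tensor behaves identically for accumulation purposes; only the allocation order differs.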

@wangxiyuan closed this May 28, 2025


2 participants