
[SpecDecode] support spec decode#395

Closed
MengqingCao wants to merge 1 commit into vllm-project:main from MengqingCao:spec

Conversation

@MengqingCao
Collaborator

What this PR does / why we need it?

Support spec decode.
Please merge this after vllm-project/vllm#15195 is merged.

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao
Collaborator Author

Using this PR will raise a segmentation fault in vllm/spec_decode/metrics.py at

        self._aggregate_num_accepted_tokens = torch.tensor(
            0, dtype=torch.long, device="cpu", pin_memory=pin_memory)
        self._aggregate_num_emitted_tokens = torch.tensor(
            0, dtype=torch.long, device="cpu", pin_memory=pin_memory)

This is a bug from PyTorch 2.5.1; reproduction code:

    import torch
    torch.tensor(0, dtype=torch.long, device="cpu", pin_memory=True)
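A possible workaround (a sketch only, not necessarily the fix adopted here) is to allocate the CPU tensor first and pin it afterwards with `Tensor.pin_memory()`, avoiding the `pin_memory=True` constructor path that crashes on PyTorch 2.5.1. Pinning is also guarded on accelerator availability, since `pin_memory()` requires one:

```python
import torch

# Pin only when an accelerator is available; pinning requires one.
pin = torch.cuda.is_available()

# Allocate first, without pin_memory=True in the constructor.
t = torch.tensor(0, dtype=torch.long, device="cpu")
if pin:
    # Pinning after allocation sidesteps the constructor path
    # that segfaults on PyTorch 2.5.1.
    t = t.pin_memory()
```

The resulting tensor behaves identically for accumulation purposes; only the allocation order differs.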

@wangxiyuan closed this May 28, 2025


2 participants