fix `Q.to` on multi-GPU GPTQ when processing is fast with many experts and GPUs #1774
This fixes `Q.to` on multi-GPU GPTQ when processing is fast and there are many experts and GPUs (for example with `mock_quantization=True` on 8 GPUs with GLM-4.5-Air). The error originally thrown is `torch.AcceleratorError: CUDA error: invalid argument`. For me it is thrown after the first layer with experts has finished processing.
Retrying `Q.to` fixes it.

In the original code, `Q.type_as` not only changes the dtype but also moves the tensor to `weight.data.device`; that device-moving logic is preserved.
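For context, a minimal illustration (not code from this PR) of why both `device` and `dtype` must be passed when replacing `type_as` with an explicit `.to(...)`:

```python
# Illustration only: `type_as` matches both the dtype and the device of the
# reference tensor, so an explicit `.to(...)` replacement must pass both
# `device` and `dtype` to preserve the device move.
import torch

weight = torch.nn.Parameter(torch.zeros(4, 4, dtype=torch.float16))
Q = torch.randn(4, 4)  # float32, possibly on a different device in practice

q_type_as = Q.type_as(weight.data)  # -> float16, on weight.data.device
q_explicit = Q.to(device=weight.data.device, dtype=weight.data.dtype)  # equivalent

assert q_type_as.dtype == q_explicit.dtype
assert q_type_as.device == q_explicit.device
```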
It also seems that `wq = wq.to(device=DEVICE_0, non_blocking=False)` is redundant, so it was removed.

===
`torch_empty_cache` is redundant as well, as the retry works without it.
I have tried `torch.sync` and `torch.accelerator.synchronize`, but only a retry wrapped in try/except works.
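Below is a minimal sketch of the retry-with-try/except approach described above; the helper name `move_q_to_device` and the fallback from `torch.AcceleratorError` to `RuntimeError` are illustrative assumptions, not the exact code in this PR.

```python
# Minimal sketch of the retry-on-failure approach. `torch.AcceleratorError`
# only exists in recent PyTorch releases, so fall back to RuntimeError.
import torch

_TRANSFER_ERROR = getattr(torch, "AcceleratorError", RuntimeError)


def move_q_to_device(Q: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Cast Q to weight's dtype and move it to weight.data.device, retrying
    once if the first transfer fails with a transient CUDA error."""
    device = weight.data.device
    dtype = weight.data.dtype
    try:
        # Same effect as the original `Q.type_as(weight.data)`: change dtype
        # and move to weight.data.device in one call.
        return Q.to(device=device, dtype=dtype)
    except _TRANSFER_ERROR:
        # On fast multi-expert / multi-GPU runs the first attempt can fail
        # with "CUDA error: invalid argument"; a plain retry succeeds.
        return Q.to(device=device, dtype=dtype)
```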