Conversation

@avtc (Contributor) commented Nov 3, 2025

@Qubitium the issue reproduces on the latest main branch (a0e065a):

INFO  | gptq    | 0     | block_sparse_moe.experts.128.w1 | 3072, 1536    | bf16: 9.6MB  | 999999999.0000000000 | 0       | 0.02500 | 0.032  | 27.833   | cuda 9.18G, 16.72G, 21.02G, 16.74G, 17.09G, 16.8G, 16.74G, 16.74G  |         |
INFO  +---------+-------+---------------------------------+---------------+--------------+----------------------+---------+---------+--------+----------+--------------------------------------------------------------------+
Traceback (most recent call last):
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/utils/threadx.py", line 415, in _run
    result = fn(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/stage_subset.py", line 436, in _process_on_worker
    proc.process(
    ~~~~~~~~~~~~^
        module=nm,
        ^^^^^^^^^^
    ...<3 lines>...
        subset_total=subset_total_count,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/gptq_processor.py", line 186, in process
    wq, q_scales, q_zeros, q_g_idx, duration, avg_loss, damp_percent, nsamples = g.quantize()
                                                                                 ~~~~~~~~~~^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/quantization/gptq.py", line 725, in quantize
    self.finalize_hessian(target_device=target_device)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/quantization/gptq.py", line 588, in finalize_hessian
    self.materialize_global_hessian(target_device=target_device)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/quantization/gptq.py", line 573, in materialize_global_hessian
    result_accum.add_(partial.to(device=result_accum.device, dtype=torch.float32))
                      ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: CUDA error: invalid argument

I propose adding the retry back, since it works.
[image attachment]

@Qubitium merged commit 1a1c5a5 into ModelCloud:main on Nov 4, 2025.