Conversation

@avtc (Contributor) commented Nov 3, 2025

@Qubitium the issue reproduces on the latest main branch (a0e065a):

INFO  | gptq    | 0     | block_sparse_moe.experts.128.w1 | 3072, 1536    | bf16: 9.6MB  | 999999999.0000000000 | 0       | 0.02500 | 0.032  | 27.833   | cuda 9.18G, 16.72G, 21.02G, 16.74G, 17.09G, 16.8G, 16.74G, 16.74G  |         |
INFO  +---------+-------+---------------------------------+---------------+--------------+----------------------+---------+---------+--------+----------+--------------------------------------------------------------------+
Traceback (most recent call last):
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/utils/threadx.py", line 415, in _run
    result = fn(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/stage_subset.py", line 436, in _process_on_worker
    proc.process(
    ~~~~~~~~~~~~^
        module=nm,
        ^^^^^^^^^^
    ...<3 lines>...
        subset_total=subset_total_count,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/gptq_processor.py", line 186, in process
    wq, q_scales, q_zeros, q_g_idx, duration, avg_loss, damp_percent, nsamples = g.quantize()
                                                                                 ~~~~~~~~~~^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/quantization/gptq.py", line 725, in quantize
    self.finalize_hessian(target_device=target_device)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/quantization/gptq.py", line 588, in finalize_hessian
    self.materialize_global_hessian(target_device=target_device)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/quantization/gptq.py", line 573, in materialize_global_hessian
    result_accum.add_(partial.to(device=result_accum.device, dtype=torch.float32))
                      ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: CUDA error: invalid argument

I propose adding the retry back, since it works.
[image attachment]

@Qubitium merged commit 1a1c5a5 into ModelCloud:main on Nov 4, 2025.