
[BUG] AttributeError: module 'torch.backends' has no attribute 'fp32_precision' #1976

@avtc

Description

Describe the bug

Trying to quantize GLM-4.5-Air with 8 x RTX 3090 on Kubuntu 24.04 (torch 2.8.0, CUDA 12.8, Python 3.13.7t), I got the following error:

INFO  Hooked Modules: Using legacy based config for targeting of modules
INFO  ModuleLooper: forward start (processor=`gptq`, layer=`model.layers.0`, subset=1/7, batches=1057)
Quantizing layer 0 of 45 [0 of 45] ██------------------------------------------------------------------------------------------------------------------------| 0:00:02 / 0:01:32 [1/46] 2.2%
Traceback (most recent call last):
  File "/home/ubuntu/Documents/Quantize/quantize-glm4.5-Air-gptqmodel-moe-prune-smart-4.py", line 489, in <module>
    model.quantize(
    ~~~~~~~~~~~~~~^
        calibration_dataset,
        ^^^^^^^^^^^^^^^^^^^^
        batch_size=BATCH_SIZE,
        ^^^^^^^^^^^^^^^^^^^^^^
        )
        ^
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/models/base.py", line 875, in quantize
    return module_looper.loop(
           ~~~~~~~~~~~~~~~~~~^
        backend=backend,
        ^^^^^^^^^^^^^^^^
        fail_safe=self.quantize_config.fail_safe,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/looper/module_looper.py", line 786, in loop
    forward_outputs = self._run_forward_batches(
        module=module,
    ...<10 lines>...
        reuse_kv=reuse_kv,
    )
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/looper/module_looper.py", line 248, in _run_forward_batches
    return self._run_forward_batches_parallel(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        module=module,
        ^^^^^^^^^^^^^^
    ...<11 lines>...
        devices=devices,
        ^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/looper/module_looper.py", line 394, in _run_forward_batches_parallel
    batch_idx, module_output, kv_next = fut.result()
                                        ~~~~~~~~~~^^
  File "/home/ubuntu/.pyenv/versions/3.13.7t/lib/python3.13t/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/home/ubuntu/.pyenv/versions/3.13.7t/lib/python3.13t/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/utils/threadx.py", line 367, in _run
    result = fn(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/utils/looper_helpers.py", line 291, in forward_batch_worker
    module_output = module(*inputs, **additional_inputs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/transformers/modeling_layers.py", line 94, in __call__
    return super().__call__(*args, **kwargs)
           ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/transformers/models/glm4_moe/modeling_glm4_moe.py", line 380, in forward
    hidden_states, _ = self.self_attn(
                       ~~~~~~~~~~~~~~^
        hidden_states=hidden_states,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<6 lines>...
        **kwargs,
        ^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/transformers/models/glm4_moe/modeling_glm4_moe.py", line 170, in forward
    query_states = self.q_proj(hidden_states).view(hidden_shape)
                   ~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/nn_modules/hooked_linear.py", line 218, in forward
    with tf32_enable_guard():
         ~~~~~~~~~~~~~~~~~^^
  File "/home/ubuntu/.pyenv/versions/3.13.7t/lib/python3.13t/contextlib.py", line 141, in __enter__
    return next(self.gen)
  File "/home/ubuntu/git/avtc/GPTQModel/gptqmodel/utils/torch.py", line 251, in tf32_enable_guard
    if torch.backends.fp32_precision == "tf32":
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'torch.backends' has no attribute 'fp32_precision'
terminate called without an active exception
terminate called recursively
Aborted (core dumped)
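
The failing check is in gptqmodel/utils/torch.py: tf32_enable_guard reads torch.backends.fp32_precision, an attribute that does not exist in torch 2.8.0 and appears only in newer PyTorch releases. A backwards-compatible guard could probe for the attribute and fall back to the long-standing allow_tf32 flag instead. The sketch below is illustrative only, assuming the guard's intent is "enable TF32 matmuls inside the block"; it is not the project's actual fix:

import contextlib
import torch

@contextlib.contextmanager
def tf32_enable_guard():
    # Newer PyTorch exposes a unified fp32 precision control; older builds
    # (such as torch 2.8.0 here) only have the per-backend allow_tf32 flags.
    if hasattr(torch.backends, "fp32_precision"):
        prev = torch.backends.fp32_precision
        torch.backends.fp32_precision = "tf32"  # assumed string value, matching the "tf32" comparison in the traceback
        try:
            yield
        finally:
            torch.backends.fp32_precision = prev
    else:
        # Fallback for older PyTorch: toggle the CUDA matmul TF32 flag.
        prev = torch.backends.cuda.matmul.allow_tf32
        torch.backends.cuda.matmul.allow_tf32 = True
        try:
            yield
        finally:
            torch.backends.cuda.matmul.allow_tf32 = prev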

Version info:

gptqmodel commit hash: 5d80bdcc28e88ea642cdd79a2e9dd6fd78c8b7e9
last known working hash: d8f3c78988bb8f11982a5e52361537ffba05d145
(intermediate commits between the two were not checked)

transformers Version: 4.56.1
accelerate Version: 1.10.1
triton Version: 3.4.0
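
For reference, the failing attribute lookup reproduces in isolation, which makes it easy to bisect between the two hashes above or to verify a fix; the output comments below reflect torch 2.8.0 in this environment and are an assumption for other versions:

import torch

print(torch.__version__)                           # 2.8.0 in this environment
print(hasattr(torch.backends, "fp32_precision"))   # False on 2.8.0, so the guard raises AttributeError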
