Conversation

Qubitium (Collaborator) commented Sep 30, 2025

zoom zoom

Signed-off-by: Qubitium <Qubitium@modelcloud.ai>
Qubitium marked this pull request as ready for review September 30, 2025 09:52
Qubitium (Collaborator, Author) commented:

@codex Check this PR for bugs

Qubitium merged commit 10f3d1f into main Sep 30, 2025
5 checks passed
Qubitium deleted the meta-dedup branch September 30, 2025 09:54
avtc (Contributor) commented Oct 1, 2025

@Qubitium Please check this error during save; it may be related to this PR:
main branch hash: 3da0344

INFO  Format: Converting GPTQ v2 to v1                                                                                     
Traceback (most recent call last):
  File "/home/ubuntu/Documents/Quantize/quantize-glm4.5-air-gptqmodel-clean.py", line 63, in <module>
    model.save(OUTPUT_DIR)
    ~~~~~~~~~~^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 832, in save
    self.save_quantized(
    ~~~~~~~~~~~~~~~~~~~^
        save_dir=save_dir,
        ^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
        meta_quantizer=meta_quantizer,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        eora_path=eora_path)
        ^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/writer.py", line 231, in save_quantized
    model = convert_gptq_v2_to_v1_format(
        model, quantize_config=quantize_config, qlinear_kernel=self.qlinear_kernel
    )
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/utils/model.py", line 669, in convert_gptq_v2_to_v1_format
    convert_gptq_v2_to_v1_format_module(module=submodule, quantize_config=quantize_config)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/utils/model.py", line 635, in convert_gptq_v2_to_v1_format_module
    module.qzeros.data[:, range(0, module.qzeros.data.shape[1], 3)] -= (
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed.You can make a clone to get a normal tensor before doing inplace update.See https://github.com/pytorch/rfcs/pull/17 for more details.
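
For context, this failure mode is generic PyTorch behavior rather than anything gptqmodel-specific: a tensor created under torch.inference_mode() becomes an "inference tensor" and rejects in-place updates once inference mode is exited. A minimal, hypothetical reproduction (the tensor name only mirrors the failing line above):

```python
import torch

with torch.inference_mode():
    qzeros = torch.zeros(4, 12, dtype=torch.int32)  # becomes an "inference tensor"

try:
    # Same indexing pattern as the failing line in convert_gptq_v2_to_v1_format_module.
    qzeros[:, range(0, qzeros.shape[1], 3)] -= 1
except RuntimeError as err:
    print(err)  # "Inplace update to inference tensor outside InferenceMode is not allowed..."

# Cloning first, as the error message suggests, gives a normal tensor
# that accepts in-place updates.
qzeros = qzeros.clone()
qzeros[:, range(0, qzeros.shape[1], 3)] -= 1
```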

Qubitium (Collaborator, Author) commented Oct 1, 2025

@Qubitium Please check this error during save; it may be related to this PR: main branch hash: 3da0344

Ok, this looks like an easy fix. It appears unrelated to the PR itself, but the new code changed timing, so you are hitting more/different thread-state bugs.
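
A minimal sketch of what such a fix could look like, assuming the conversion simply needs to clone qzeros before the in-place subtraction; the function name follows the traceback, but the body is a hypothetical simplification of gptqmodel/utils/model.py, not the actual patch:

```python
import torch

def convert_gptq_v2_to_v1_format_module(module, quantize_config):
    # Hypothetical simplification; only the inference-tensor handling is the point here.
    qzeros = module.qzeros.data

    # Tensors created under torch.inference_mode() reject in-place mutation
    # outside inference mode; cloning (as the error message suggests) yields
    # a normal tensor that can be updated in place.
    if qzeros.is_inference():
        qzeros = qzeros.clone()
        module.qzeros = qzeros  # assumes qzeros is a plain attribute/buffer on the module

    # Placeholder for the v2 -> v1 zero-point adjustment; the real column
    # stride and packed offset depend on quantize_config.bits.
    qzeros[:, range(0, qzeros.shape[1], 3)] -= 1
```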
