safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 5, kind: Uncategorized, message: "Input/output error" }) #35895

@JohnConnor123

Description

System Info

  • transformers version: 4.47.1
  • Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.39
  • Python version: 3.12.3
  • Huggingface_hub version: 0.27.0
  • Safetensors version: 0.4.5
  • Accelerate version: 1.2.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA GeForce RTX 3060 Laptop GPU

Who can help?

@SunMarc @MekkCyber

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Error:

Traceback (most recent call last):
  File "/mnt/d/Python_Projects/Jupyter/other/call-center-prompter/debug/quant/check-quantizations.py", line 31, in <module>
    quantize_gptq(model_id=model_id, quant_config=gptq_config, prefix_dir=prefix_dir)
  File "/mnt/d/Python_Projects/Jupyter/other/call-center-prompter/debug/quant/gptq_quantize.py", line 32, in quantize_gptq
    model.save_pretrained(prefix_dir + quant_path)
  File "/mnt/d/Python_Projects/Jupyter/other/call-center-prompter/debug/quant/venv-wsl2/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3034, in save_pretrained
    safe_save_file(shard, os.path.join(save_directory, shard_file), metadata={"format": "pt"})
  File "/mnt/d/Python_Projects/Jupyter/other/call-center-prompter/debug/quant/venv-wsl2/lib/python3.12/site-packages/safetensors/torch.py", line 286, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 5, kind: Uncategorized, message: "Input/output error" })
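The `IoError(Os { code: 5, ... })` is an OS-level write failure surfaced through safetensors, not a serialization bug in transformers itself; on WSL2, paths under `/mnt/d` go through the Windows drvfs mount, which is a common source of such errors. Before digging into the script below, the filesystem can be ruled in or out with a small stdlib-only probe (the `check_writable` helper is hypothetical, written for this report):

```python
import os
import shutil
import tempfile

def check_writable(directory: str, needed_bytes: int = 0) -> bool:
    """Return True if `directory` accepts a flushed test write and has
    at least `needed_bytes` of free space; False on OSError or low space."""
    os.makedirs(directory, exist_ok=True)
    if shutil.disk_usage(directory).free < needed_bytes:
        return False
    try:
        # fsync forces the bytes through the mount, which is where
        # drvfs-related I/O errors tend to appear.
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            f.write(b"safetensors write probe")
            f.flush()
            os.fsync(f.fileno())
        return True
    except OSError:
        return False
```

Running this against the intended save directory (and comparing with a path on the Linux-native filesystem, e.g. under `$HOME`) should show whether the error is reproducible outside of `save_pretrained`.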

Code:

import os
import logging
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
from huggingface_hub import login, snapshot_download

logger = logging.getLogger(__name__)

logger.info("Logging in HF")
login(token=<mytoken>)

def quantize_gptq(model_id: str, quant_config: dict, prefix_dir: str = './') -> str:
    # Normalize prefix_dir to end with '/'
    prefix_dir += '/' if prefix_dir[-1] != '/' else ''
    # Prefer a local copy of the model if one exists, otherwise use the hub id
    model_name = model_id.split('/')[1]
    model_path = prefix_dir + model_name if os.path.exists(prefix_dir + model_name) else model_id
    quant_path = model_name + f"-GPTQ-{quant_config['bits']}bit"

    if os.path.exists(prefix_dir + quant_path):
        logger.info("Skipping GPTQ quantization because it already exists")
    else:
        tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
        config = GPTQConfig(**quant_config, dataset="c4", tokenizer=tokenizer) # exllama_config={"version":2}

        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            device_map="auto",
            trust_remote_code=False,
            quantization_config=config,
            revision="main"
        )

        logger.info("Save GPTQ quantized model")
        os.makedirs(prefix_dir + quant_path, exist_ok=True)
        model.save_pretrained(prefix_dir + quant_path)
        tokenizer.save_pretrained(prefix_dir + quant_path)

        logger.info("Push to hub GPTQ quantized model")
        model.push_to_hub(quant_path)
        tokenizer.push_to_hub(quant_path)

    return prefix_dir + quant_path
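If the probe above confirms the `/mnt/d` mount is at fault, one possible workaround (a sketch only, not confirmed to fix this issue; `save_then_copy` is a hypothetical helper) is to stage the save on the Linux-native filesystem and copy the finished files across afterwards:

```python
import shutil
import tempfile
from pathlib import Path

def save_then_copy(model, tokenizer, final_dir: str) -> None:
    # Stage the save under /tmp (Linux-native on WSL2), so safetensors
    # never writes through the drvfs mount directly.
    staging = Path(tempfile.mkdtemp(prefix="quant-staging-"))
    model.save_pretrained(str(staging))
    tokenizer.save_pretrained(str(staging))
    # Copy the completed files to the target (e.g. under /mnt/d) in one pass.
    shutil.copytree(staging, final_dir, dirs_exist_ok=True)
    shutil.rmtree(staging)
```

`shutil.copytree(..., dirs_exist_ok=True)` requires Python 3.8+, which matches the Python 3.12.3 reported above.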

Expected behavior

The model saves to disk without errors.
