Unexpected slow speed on Windows with RTX4060 #870

@kalle07

from auto_round import AutoRound
from pathlib import Path

model_dir = Path(r"c:\Users\XXX\Documents\python\autoround\facebook")  # <-- enter your path here

# --- Check that the path exists and is a directory ---
if not model_dir.exists():
    print(f"❌ The path '{model_dir}' does not exist.")
elif not model_dir.is_dir():
    print(f"⚠️ The path '{model_dir}' is not a directory.")
else:
    print(f"📁 Contents of directory: {model_dir}\n")

    # --- List all files in the directory ---
    files = [f for f in model_dir.iterdir() if f.is_file()]

    if not files:
        print("ℹ️ No files found in the directory.")
    else:
        for file in files:
            print(f" - {file.name}")

# Available schemes: "W2A16", "W3A16", "W4A16", "W8A16", "NVFP4", "MXFP4" (no real kernels), "GGUF:Q4_K_M", etc.
ar = AutoRound(model_dir, scheme="W4A16", iters=50, lr=5e-3)

# Highest accuracy (4–5× slower).
# `low_gpu_mem_usage=True` saves ~20GB VRAM but runs ~30% slower.
# ar = AutoRound(model_name_or_path, nsamples=512, iters=1000, low_gpu_mem_usage=True)

# Faster quantization (2–3× speedup) with slight accuracy drop at W4G128.
# ar = AutoRound(model_name_or_path, nsamples=128, iters=50, lr=5e-3)

# Supported formats: "auto_round" (default), "auto_gptq", "auto_awq", "llm_compressor", "gguf:q4_k_m", etc.
ar.quantize_and_save(output_dir=r"c:\Users\XXX\Documents\python\autoround\tmp_autoround", format="q4_k_m")

This is what the shell shows:

2025-10-08 16:52:23,670 INFO utils.py L164: NumExpr defaulting to 16 threads.
📁 Contents of directory: c:\Users\XXX\Documents\python\autoround\facebook

  • config.json
  • flax_model.msgpack
  • generation_config.json
  • gitattributes
  • LICENSE.md
  • merges.txt
  • pytorch_model.bin
  • README.md
  • special_tokens_map.json
  • tf_model.h5
  • tokenizer_config.json
  • vocab.json

Traceback (most recent call last):
  File "C:\Users\XXX\Documents\python\autoround\auto01.py", line 24, in <module>
    ar = AutoRound(model_dir, scheme="W4A16", iters=50, lr=5e-3)
TypeError: AutoRound.__init__() missing 1 required positional argument: 'tokenizer'

What am I missing?
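
A TypeError like the one above means the installed constructor expects more required arguments than the script passes. One way to check this is to inspect the constructor's signature. The following is only a minimal sketch using the standard library: `required_init_params` and `FakeQuantizer` are hypothetical names made up for illustration, and `FakeQuantizer` is a stand-in, not the real `auto_round` class.

```python
import inspect


def required_init_params(cls):
    """Return the names of __init__ parameters that have no default value."""
    sig = inspect.signature(cls.__init__)
    named_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return [
        name
        for name, param in sig.parameters.items()
        if name != "self"
        and param.kind in named_kinds  # skip *args / **kwargs
        and param.default is inspect.Parameter.empty
    ]


# Hypothetical stand-in class (an assumption for this example only);
# it is NOT the real auto_round.AutoRound.
class FakeQuantizer:
    def __init__(self, model, tokenizer, scheme="W4A16", iters=200):
        self.model = model
        self.tokenizer = tokenizer


print(required_init_params(FakeQuantizer))  # → ['model', 'tokenizer']
```

Running `required_init_params(AutoRound)` in the same environment as the script would show what the installed version actually requires. If `tokenizer` appears in that list, one thing to try (an assumption, not confirmed in this thread) is to load the model and tokenizer with `transformers` (`AutoModelForCausalLM.from_pretrained(model_dir)` and `AutoTokenizer.from_pretrained(model_dir)`) and pass both to the constructor, or to check whether the installed auto-round version matches the README the script was copied from.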
