
4-bit quantization of Qwen1.5-72B gets stuck at: Quantizing model.layers blocks : 100% .... #4018

Closed
1 task done
camposs1979 opened this issue May 31, 2024 · 0 comments
Labels
wontfix This will not be worked on

Comments

@camposs1979

Reminder

  • I have read the README and searched the existing issues.

Reproduction

#!/bin/bash
# DO NOT use a quantized model or quantization_bit when merging LoRA weights

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python3.10 export_model.py \
    --model_name_or_path /hy-tmp/models/Qwen1.5-72B-Chat-sft \
    --export_quantization_bit 4 \
    --export_quantization_dataset ../data/c4_demo.json \
    --template qwen \
    --export_dir ../../models/Qwen1.5-72B-Chat-sft-INT4 \
    --export_size 2 \
    --export_device cpu \
    --export_legacy_format False
The process got stuck at:
Quantizing model.layers blocks : 100%|████████████████████████████████████████████████████████████| 80/80 [1:30:22<00:00, 67.78s/it]
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:4225: FutureWarning: _is_quantized_training_enabled is going to be deprecated in transformers 4.39.0. Please use model.hf_quantizer.is_trainable instead
warnings.warn(
and there has been no further progress since.
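
For context, exporting with --export_quantization_bit 4 appears to delegate to GPTQ quantization via transformers/optimum/auto-gptq (the "Quantizing model.layers blocks" progress bar comes from that path). Below is a minimal standalone sketch of that step using the public GPTQConfig API; it reuses the paths from the command above for illustration and is not the exact code that export_model.py runs. Note that after the per-layer progress bar reaches 100%, packing and saving the INT4 weights of a 72B model can itself take a long time without printing anything.

# Hedged sketch: GPTQ 4-bit quantization with transformers + optimum + auto-gptq.
# Paths are placeholders taken from the reproduction command, not verified settings.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "/hy-tmp/models/Qwen1.5-72B-Chat-sft"        # merged (non-quantized) checkpoint
out_dir = "../../models/Qwen1.5-72B-Chat-sft-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# "c4" lets optimum fetch calibration samples from the C4 dataset;
# a list of raw calibration strings can be passed instead of a dataset name.
quant_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Quantization runs layer by layer while the model is loaded.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quant_config,
)

# Packing and serializing the INT4 weights is slow for a 72B model; be patient here.
model.save_pretrained(out_dir, max_shard_size="2GB")
tokenizer.save_pretrained(out_dir)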

Expected behavior

I expect Qwen1.5-72B to be quantized successfully.

System Info

Package Version


accelerate 0.28.0
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.9.3
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
bitsandbytes 0.43.0
certifi 2019.11.28
cffi 1.16.0
chardet 3.0.4
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
cmake 3.29.2
coloredlogs 15.0.1
contourpy 1.2.0
crcmod 1.7
cryptography 42.0.5
cupy-cuda12x 12.1.0
cycler 0.12.1
datasets 2.18.0
dbus-python 1.2.16
deepspeed 0.14.0
dill 0.3.8
diskcache 5.6.3
distro 1.4.0
distro-info 0.23ubuntu1
docstring_parser 0.16
einops 0.7.0
exceptiongroup 1.2.0
fastapi 0.110.0
fastrlock 0.8.2
ffmpy 0.3.2
filelock 3.13.3
fire 0.6.0
flash-attn 2.5.9.post1
fonttools 4.50.0
frozenlist 1.4.1
fsspec 2024.2.0
galore-torch 1.0
gast 0.5.4
gekko 1.0.7
gradio 4.29.0
gradio_client 0.16.1
h11 0.14.0
hjson 3.1.0
httpcore 1.0.4
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.22.0
humanfriendly 10.0
idna 2.8
importlib_metadata 7.1.0
importlib_resources 6.4.0
interegular 0.3.3
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lark 1.1.9
llvmlite 0.42.0
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
mdurl 0.1.2
modelscope 1.13.3
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.2.1
ninja 1.11.1.1
numba 0.59.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
optimum 1.16.0
orjson 3.9.15
oss2 2.18.4
outlines 0.0.34
packaging 24.0
pandas 2.2.1
peft 0.10.0
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
prometheus_client 0.20.0
protobuf 3.20.3
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.4
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
PyGObject 3.36.0
pynvml 11.5.0
pyparsing 3.1.2
python-apt 2.0.1+ubuntu0.20.4.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
ray 2.10.0
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
requests-unixsocket 0.2.0
rich 13.7.1
rouge 1.0.1
rpds-py 0.18.0
ruff 0.4.6
safetensors 0.4.2
scipy 1.12.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 69.2.0
shellingham 1.5.4
shtab 1.7.1
simplejson 3.19.2
six 1.14.0
sniffio 1.3.1
sortedcontainers 2.4.0
sse-starlette 2.0.0
ssh-import-id 5.10
starlette 0.36.3
sympy 1.12
termcolor 2.4.0
tiktoken 0.6.0
tokenizers 0.15.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.1.0
tqdm 4.66.2
transformers 4.39.1
triton 2.1.0
trl 0.8.2
typer 0.12.3
typing_extensions 4.10.0
tyro 0.7.3
tzdata 2024.1
unattended-upgrades 0.1
unsloth 2024.5
urllib3 2.2.1
uvicorn 0.29.0
uvloop 0.19.0
vllm 0.4.0
watchfiles 0.21.0
websockets 11.0.3
wheel 0.43.0
xformers 0.0.22.post7+cu118
xxhash 3.4.1
yapf 0.40.2
yarl 1.9.4
zipp 3.18.1

Others

No response

@camposs1979 shortened the issue title (removing the pasted progress bar and FutureWarning text) May 31, 2024
@hiyouga added the wontfix label and removed the pending label Jun 3, 2024
@hiyouga closed this as not planned Jun 5, 2024