LoRA fine-tuning fails with TypeError: QWenPreTrainedModel._set_gradient_checkpointing() got an unexpected keyword argument 'enable' #661

Closed
1424153694 opened this issue Nov 22, 2023 · 6 comments

@1424153694:

root@llm-server:/home/llm/Qwen-main# sh finetune/finetune_lora_ds.sh
[2023-11-22 01:09:58,735] torch.distributed.run: [WARNING]
[2023-11-22 01:09:58,735] torch.distributed.run: [WARNING] *****************************************
[2023-11-22 01:09:58,735] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2023-11-22 01:09:58,735] torch.distributed.run: [WARNING] *****************************************
[2023-11-22 01:09:59,854] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[the line above is repeated on each of the 4 ranks]
/opt/conda/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[the warning above is repeated on each of the 4 ranks]
[2023-11-22 01:10:01,260] [INFO] [comm.py:637:init_distributed] cdb=None
[the line above is repeated on each of the 4 ranks]
[2023-11-22 01:10:01,260] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
Warning: please make sure that you are using the latest codes and checkpoints, especially if you used Qwen-7B before 09.25.2023. (The same warning is then repeated in Chinese.)
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
[the three messages above are repeated on each of the 4 ranks]
[2023-11-22 01:10:09,081] [INFO] [partition_parameters.py:348:exit] finished initializing model - num_params = 323, num_elems = 14.17B
Loading checkpoint shards: 100%|██████████| 15/15 [00:44<00:00, 2.96s/it]
[the progress bar above is repeated on each of the 4 ranks]
Loading data...
Formatting inputs...Skip in lazy mode
Traceback (most recent call last):
  File "/home/llm/Qwen-main/finetune.py", line 363, in <module>
    train()
  File "/home/llm/Qwen-main/finetune.py", line 356, in train
    trainer.train()
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1555, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1668, in _inner_training_loop
    self.model.gradient_checkpointing_enable(gradient_checkpointing_kwargs=gradient_checkpointing_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1872, in gradient_checkpointing_enable
    self._set_gradient_checkpointing(enable=True, gradient_checkpointing_func=gradient_checkpointing_func)
TypeError: QWenPreTrainedModel._set_gradient_checkpointing() got an unexpected keyword argument 'enable'
[the same traceback, interleaved in the original output, is raised on each of the 4 ranks]
[2023-11-22 01:10:58,791] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 119) of binary: /opt/conda/bin/python
Traceback (most recent call last):
  File "/opt/conda/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.1.0', 'console_scripts', 'torchrun')())
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(

Environment:
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
accelerate 0.24.1 pypi_0 pypi
addict 2.4.0 pypi_0 pypi
aiofiles 23.2.1 pypi_0 pypi
aiohttp 3.8.6 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
aliyun-python-sdk-core 2.14.0 pypi_0 pypi
aliyun-python-sdk-kms 2.16.2 pypi_0 pypi
altair 5.1.2 pypi_0 pypi
annotated-types 0.6.0 pypi_0 pypi
anyio 3.7.1 pypi_0 pypi
asttokens 2.0.5 pyhd3eb1b0_0
astunparse 1.6.3 pypi_0 pypi
async-timeout 4.0.3 pypi_0 pypi
attrs 23.1.0 pypi_0 pypi
backcall 0.2.0 pyhd3eb1b0_0
beautifulsoup4 4.12.2 py310h06a4308_0
blas 1.0 mkl
boltons 23.0.0 py310h06a4308_0
brotlipy 0.7.0 py310h7f8727e_1002
bzip2 1.0.8 h7b6447c_0
c-ares 1.19.0 h5eee18b_0
ca-certificates 2023.08.22 h06a4308_0
certifi 2023.7.22 py310h06a4308_0
cffi 1.15.1 py310h5eee18b_3
chardet 4.0.0 py310h06a4308_1003
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.0.4 py310h06a4308_0
cmake 3.26.4 h96355d8_0
colorama 0.4.6 pypi_0 pypi
conda 23.9.0 py310h06a4308_0
conda-build 3.27.0 py310h06a4308_0
conda-content-trust 0.2.0 py310h06a4308_0
conda-index 0.3.0 py310h06a4308_0
conda-libmamba-solver 23.7.0 py310h06a4308_0
conda-package-handling 2.2.0 py310h06a4308_0
conda-package-streaming 0.9.0 py310h06a4308_0
contourpy 1.2.0 pypi_0 pypi
crcmod 1.7 pypi_0 pypi
cryptography 41.0.3 py310hdda0065_0
cuda-cudart 12.1.105 0 nvidia
cuda-cupti 12.1.105 0 nvidia
cuda-libraries 12.1.0 0 nvidia
cuda-nvrtc 12.1.105 0 nvidia
cuda-nvtx 12.1.105 0 nvidia
cuda-opencl 12.2.140 0 nvidia
cuda-runtime 12.1.0 0 nvidia
cycler 0.12.1 pypi_0 pypi
datasets 2.13.0 pypi_0 pypi
decorator 5.1.1 pyhd3eb1b0_0
deepspeed 0.12.3 pypi_0 pypi
dill 0.3.6 pypi_0 pypi
dnspython 2.4.2 pypi_0 pypi
dropout-layer-norm 0.1 pypi_0 pypi
einops 0.7.0 pypi_0 pypi
exceptiongroup 1.0.4 py310h06a4308_0
executing 0.8.3 pyhd3eb1b0_0
expat 2.5.0 h6a678d5_0
expecttest 0.1.6 pypi_0 pypi
fastapi 0.104.1 pypi_0 pypi
ffmpeg 4.3 hf484d3e_0 pytorch
ffmpy 0.3.1 pypi_0 pypi
filelock 3.9.0 py310h06a4308_0
flash-attn 2.3.3 pypi_0 pypi
fmt 9.1.0 hdb19cb5_0
fonttools 4.44.0 pypi_0 pypi
freetype 2.12.1 h4a9f257_0
frozenlist 1.4.0 pypi_0 pypi
fsspec 2023.9.2 pypi_0 pypi
gast 0.5.4 pypi_0 pypi
giflib 5.2.1 h5eee18b_3
gmp 6.2.1 h295c915_3
gmpy2 2.1.2 py310heeb90bb_0
gnutls 3.6.15 he1e5248_0
gradio 4.2.0 pypi_0 pypi
gradio-client 0.7.0 pypi_0 pypi
h11 0.14.0 pypi_0 pypi
hjson 3.1.0 pypi_0 pypi
httpcore 1.0.2 pypi_0 pypi
httpx 0.25.1 pypi_0 pypi
huggingface-hub 0.17.3 pypi_0 pypi
hypothesis 6.87.2 pypi_0 pypi
icu 58.2 he6710b0_3
idna 3.4 py310h06a4308_0
importlib-metadata 6.8.0 pypi_0 pypi
importlib-resources 6.1.1 pypi_0 pypi
intel-openmp 2023.1.0 hdb19cb5_46305
ipython 8.15.0 py310h06a4308_0
jedi 0.18.1 py310h06a4308_1
jinja2 3.1.2 py310h06a4308_0
jmespath 0.10.0 pypi_0 pypi
jpeg 9e h5eee18b_1
jsonpatch 1.32 pyhd3eb1b0_0
jsonpointer 2.1 pyhd3eb1b0_0
jsonschema 4.19.2 pypi_0 pypi
jsonschema-specifications 2023.7.1 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
krb5 1.20.1 h143b758_1
lame 3.100 h7b6447c_0
lcms2 2.12 h3be6417_0
ld_impl_linux-64 2.38 h1181459_1
lerc 3.0 h295c915_0
libarchive 3.6.2 h6ac8c49_2
libcublas 12.1.0.26 0 nvidia
libcufft 11.0.2.4 0 nvidia
libcufile 1.7.2.10 0 nvidia
libcurand 10.3.3.141 0 nvidia
libcurl 8.1.1 h251f7ec_1
libcusolver 11.4.4.55 0 nvidia
libcusparse 12.0.2.55 0 nvidia
libdeflate 1.17 h5eee18b_1
libedit 3.1.20221030 h5eee18b_0
libev 4.33 h7f8727e_1
libffi 3.4.4 h6a678d5_0
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libiconv 1.16 h7f8727e_2
libidn2 2.3.4 h5eee18b_0
libjpeg-turbo 2.0.0 h9bf148f_0 pytorch
liblief 0.12.3 h6a678d5_0
libmamba 1.4.1 h2dafd23_1
libmambapy 1.4.1 py310h2dafd23_1
libnghttp2 1.52.0 h2d74bed_1
libnpp 12.0.2.50 0 nvidia
libnvjitlink 12.1.105 0 nvidia
libnvjpeg 12.1.1.14 0 nvidia
libpng 1.6.39 h5eee18b_0
libsolv 0.7.22 he621ea3_0
libssh2 1.10.0 hdbd6064_2
libstdcxx-ng 11.2.0 h1234567_1
libtasn1 4.19.0 h5eee18b_0
libtiff 4.5.1 h6a678d5_0
libunistring 0.9.10 h27cfd23_0
libuuid 1.41.5 h5eee18b_0
libuv 1.44.2 h5eee18b_0
libwebp 1.3.2 h11a3e52_0
libwebp-base 1.3.2 h5eee18b_0
libxml2 2.10.3 hcbfbd50_0
llvm-openmp 14.0.6 h9e868ea_0
lz4-c 1.9.4 h6a678d5_0
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.1 py310h7f8727e_0
matplotlib 3.8.1 pypi_0 pypi
matplotlib-inline 0.1.6 py310h06a4308_0
mdurl 0.1.2 pypi_0 pypi
mkl 2023.1.0 h213fc3f_46343
mkl-service 2.4.0 py310h5eee18b_1
mkl_fft 1.3.8 py310h5eee18b_0
mkl_random 1.2.4 py310hdb19cb5_0
modelscope 1.9.5 pypi_0 pypi
more-itertools 8.12.0 pyhd3eb1b0_0
mpc 1.1.0 h10f8cd9_1
mpfr 4.0.2 hb69a4c5_1
mpmath 1.3.0 py310h06a4308_0
multidict 6.0.4 pypi_0 pypi
multiprocess 0.70.14 pypi_0 pypi
ncurses 6.4 h6a678d5_0
nettle 3.7.3 hbbd107a_1
networkx 3.1 py310h06a4308_0
ninja 1.11.1.1 pypi_0 pypi
numpy 1.26.0 py310h5f9d8c6_0
numpy-base 1.26.0 py310hb5e798b_0
openh264 2.1.1 h4ff587b_0
openjpeg 2.4.0 h3ad879b_0
openssl 3.0.11 h7f8727e_2
orjson 3.9.10 pypi_0 pypi
oss2 2.18.3 pypi_0 pypi
packaging 23.1 py310h06a4308_0
pandas 2.1.3 pypi_0 pypi
parso 0.8.3 pyhd3eb1b0_0
patch 2.7.6 h7b6447c_1001
patchelf 0.17.2 h6a678d5_0
pcre2 10.37 he7ceb23_1
peft 0.6.2 pypi_0 pypi
pexpect 4.8.0 pyhd3eb1b0_3
pickleshare 0.7.5 pyhd3eb1b0_1003
pillow 10.0.1 py310ha6cbd5a_0
pip 23.2.1 py310h06a4308_0
pkginfo 1.9.6 py310h06a4308_0
platformdirs 4.0.0 pypi_0 pypi
pluggy 1.0.0 py310h06a4308_1
prompt-toolkit 3.0.36 py310h06a4308_0
psutil 5.9.0 py310h5eee18b_0
ptyprocess 0.7.0 pyhd3eb1b0_2
pure_eval 0.2.2 pyhd3eb1b0_0
py-cpuinfo 9.0.0 pypi_0 pypi
py-lief 0.12.3 py310h6a678d5_0
pyarrow 14.0.1 pypi_0 pypi
pybind11-abi 4 hd3eb1b0_1
pycosat 0.6.6 py310h5eee18b_0
pycparser 2.21 pyhd3eb1b0_0
pycryptodome 3.19.0 pypi_0 pypi
pydantic 2.4.2 pypi_0 pypi
pydantic-core 2.10.1 pypi_0 pypi
pydub 0.25.1 pypi_0 pypi
pygments 2.15.1 py310h06a4308_1
pynvml 11.5.0 pypi_0 pypi
pyopenssl 23.2.0 py310h06a4308_0
pyparsing 3.1.1 pypi_0 pypi
pysocks 1.7.1 py310h06a4308_0
python 3.10.13 h955ad1f_0
python-dateutil 2.8.2 pypi_0 pypi
python-etcd 0.4.5 pypi_0 pypi
python-libarchive-c 2.9 pyhd3eb1b0_1
python-multipart 0.0.6 pypi_0 pypi
pytorch 2.1.0 py3.10_cuda12.1_cudnn8.9.2_0 pytorch
pytorch-cuda 12.1 ha16c6d3_5 pytorch
pytorch-mutex 1.0 cuda pytorch
pytz 2023.3.post1 py310h06a4308_0
pyyaml 6.0 py310h5eee18b_1
readline 8.2 h5eee18b_0
referencing 0.30.2 pypi_0 pypi
regex 2023.10.3 pypi_0 pypi
reproc 14.2.4 h295c915_1
reproc-cpp 14.2.4 h295c915_1
requests 2.31.0 py310h06a4308_0
rhash 1.4.3 hdbd6064_0
rich 13.6.0 pypi_0 pypi
rotary-emb 0.1 pypi_0 pypi
rpds-py 0.12.0 pypi_0 pypi
ruamel.yaml 0.17.21 py310h5eee18b_0
ruamel.yaml.clib 0.2.6 py310h5eee18b_1
safetensors 0.4.0 pypi_0 pypi
scipy 1.11.3 pypi_0 pypi
semantic-version 2.10.0 pypi_0 pypi
setuptools 68.0.0 py310h06a4308_0
shellingham 1.5.4 pypi_0 pypi
simplejson 3.19.2 pypi_0 pypi
six 1.16.0 pyhd3eb1b0_1
sniffio 1.3.0 pypi_0 pypi
sortedcontainers 2.4.0 pypi_0 pypi
soupsieve 2.5 py310h06a4308_0
sqlite 3.41.2 h5eee18b_0
stack_data 0.2.0 pyhd3eb1b0_0
starlette 0.27.0 pypi_0 pypi
sympy 1.12 pypi_0 pypi
tbb 2021.8.0 hdb19cb5_0
tiktoken 0.5.1 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
tokenizers 0.14.1 pypi_0 pypi
tomli 2.0.1 py310h06a4308_0
tomlkit 0.12.0 pypi_0 pypi
toolz 0.12.0 py310h06a4308_0
torchaudio 2.1.0 py310_cu121 pytorch
torchelastic 0.2.2 pypi_0 pypi
torchtriton 2.1.0 py310 pytorch
torchvision 0.16.0 py310_cu121 pytorch
tqdm 4.65.0 py310h2f386ee_0
traitlets 5.7.1 py310h06a4308_0
transformers 4.35.0 pypi_0 pypi
transformers-stream-generator 0.0.4 pypi_0 pypi
truststore 0.8.0 py310h06a4308_0
typer 0.9.0 pypi_0 pypi
types-dataclasses 0.6.6 pypi_0 pypi
typing-extensions 4.8.0 pypi_0 pypi
tzdata 2023.3 pypi_0 pypi
urllib3 1.26.16 py310h06a4308_0
uvicorn 0.24.0.post1 pypi_0 pypi
wcwidth 0.2.5 pyhd3eb1b0_0
websockets 11.0.3 pypi_0 pypi
wheel 0.41.2 py310h06a4308_0
xxhash 3.4.1 pypi_0 pypi
xz 5.4.2 h5eee18b_0
yaml 0.2.5 h7b6447c_0
yaml-cpp 0.7.0 h295c915_1
yapf 0.40.2 pypi_0 pypi
yarl 1.9.2 pypi_0 pypi
zipp 3.17.0 pypi_0 pypi
zlib 1.2.13 h5eee18b_0
zstandard 0.19.0 py310h5eee18b_0
zstd 1.5.5 hc292b87_0

@jklj077 (Contributor) commented Nov 22, 2023:

Hi, this is a compatibility problem in finetune.py. For now we suggest using a transformers version lower than 4.35.0 (and adjusting the versions of related packages, such as peft, optimum and accelerate, to satisfy its dependencies).
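For example, a pin along these lines downgrades transformers below the breaking release. The companion versions shown for peft and accelerate are illustrative guesses, not versions confirmed in this thread; adjust them to satisfy your other dependencies:

```shell
# Pin transformers below the 4.35.0 gradient-checkpointing refactor.
# peft/accelerate versions here are examples only; pick ones that
# resolve cleanly against the rest of your environment.
pip install "transformers<4.35.0" "peft==0.5.0" "accelerate==0.23.0"
```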

@1424153694 (Author) replied:

> Hi, this is a compatibility problem in finetune.py. For now we suggest using a transformers version lower than 4.35.0 (and adjusting the versions of related packages, such as peft, optimum and accelerate, to satisfy its dependencies).

Thank you for your reply.

@ehartford:

Hello, I hit the same error.
I would really like to use Qwen.

@ehartford:

> Hi, this is a compatibility problem in finetune.py. We suggest using a transformers version lower than 4.35.0 (and adjusting related package versions, such as peft, optimum and accelerate, to satisfy the dependencies).

The right answer is not to downgrade transformers to a previous version.
The right answer is to update Qwen to conform to the latest transformers.

@jklj077 (Contributor) commented Dec 1, 2023:

@ehartford Hi,

transformers recently refactored the logic for gradient checkpointing in 4.35.0. The release notes said, and I quote:

> The refactor should be totally backward compatible with previous behaviour.

It is not listed as a breaking change, but in fact it breaks essentially every model on the Hugging Face Hub, including Qwen, that uses custom code for gradient checkpointing. Fortunately, inference is not affected.

The fix in transformers has not been released as of 2023-12-01, but it has been merged into main.

We will try to support both the old and the new behaviour, but we cannot promise a timeline right now. To use our models, you could (a) wait for the next patch release (if there is one), since 4.35 breaks a lot of things and already has two patches; (b) update to the "latest" transformers on main; or (c) as we previously suggested, use a version lower than 4.35.0.
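A small helper along these lines (hypothetical, not part of the Qwen repo or transformers) can tell at startup which of the two conventions the installed transformers uses, so a script can fail fast with a clear message instead of crashing mid-training:

```python
def version_tuple(v: str) -> tuple:
    """Parse a dotted version string like '4.34.1' into a comparable tuple,
    ignoring any non-numeric suffix (e.g. '4.36.0.dev0' -> (4, 36, 0))."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def predates_gc_refactor(transformers_version: str) -> bool:
    """True if this transformers version still uses the old
    _set_gradient_checkpointing(module, value=...) convention,
    i.e. it is older than the 4.35.0 refactor."""
    return version_tuple(transformers_version) < (4, 35, 0)

print(predates_gc_refactor("4.34.1"))  # True  -> old convention, Qwen code works
print(predates_gc_refactor("4.35.0"))  # False -> refactored convention
```

In practice you would pass `transformers.__version__` to `predates_gc_refactor` and either warn or switch code paths.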

@ehartford:

I have worked around this by modifying modeling_qwen.py as follows:

```python
# Replacement for QWenPreTrainedModel._set_gradient_checkpointing in
# modeling_qwen.py (requires `from typing import Callable` at the top of
# the file), matching the keyword arguments passed by transformers>=4.35.
def _set_gradient_checkpointing(self, enable: bool = False, gradient_checkpointing_func: Callable = None):
    is_gradient_checkpointing_set = False

    # The receiver may itself be the QWenModel being configured...
    if isinstance(self, QWenModel):
        self.gradient_checkpointing = enable
        self._gradient_checkpointing_func = gradient_checkpointing_func
        is_gradient_checkpointing_set = True

    # ...or the QWenModel may be nested inside a wrapper model.
    for module in self.modules():
        if isinstance(module, QWenModel):
            module.gradient_checkpointing = enable
            module._gradient_checkpointing_func = gradient_checkpointing_func
            is_gradient_checkpointing_set = True

    if not is_gradient_checkpointing_set:
        raise ValueError(
            f"{self.__class__.__name__} is not compatible with gradient checkpointing. "
            "Make sure all the architecture support it by setting a boolean attribute "
            "'gradient_checkpointing' to modules of the model that uses checkpointing."
        )
```
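For context, here is a minimal sketch of why the original signature fails under transformers>=4.35 and why the patched one works. The classes are hypothetical stand-ins, not the real Qwen or transformers code:

```python
class OldQWenStyle:
    """Stand-in for the signature Qwen's modeling code used before the fix."""
    def _set_gradient_checkpointing(self, module, value=False):
        module.gradient_checkpointing = value

class PatchedQWenStyle:
    """Stand-in for the signature that transformers>=4.35 expects."""
    def _set_gradient_checkpointing(self, enable=False, gradient_checkpointing_func=None):
        self.gradient_checkpointing = enable
        self._gradient_checkpointing_func = gradient_checkpointing_func

def trainer_call(model):
    # Trainer in transformers 4.35 invokes the hook with these new
    # keyword arguments, which the old positional signature rejects.
    model._set_gradient_checkpointing(
        enable=True,
        gradient_checkpointing_func=lambda fn, *args: fn(*args),
    )

try:
    trainer_call(OldQWenStyle())
except TypeError as exc:
    print(exc)  # ... got an unexpected keyword argument 'enable'

patched = PatchedQWenStyle()
trainer_call(patched)
print(patched.gradient_checkpointing)  # True
```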

@jklj077 jklj077 closed this as completed Feb 20, 2024