Error during finetune: KeyError: 'models.llama' #50

Closed

simonqian opened this issue Apr 9, 2023 · 17 comments

@simonqian

1. OS: Ubuntu
2. GPU: 3090
3. Python: 3.8
4. Other Python library versions:
pytorch-mutex             1.0                        cuda    pytorch
torch                     2.0.0                    pypi_0    pypi
torchaudio                0.12.1               py38_cu113    pytorch
torchvision               0.13.1               py38_cu113    pytorch
cudatoolkit               11.3.1               h2bc3f7f_2    defaults
nvidia-cuda-cupti-cu11    11.7.101                 pypi_0    pypi
nvidia-cuda-nvrtc-cu11    11.7.99                  pypi_0    pypi
nvidia-cuda-runtime-cu11  11.7.99                  pypi_0    pypi
transformers              4.27.4                   pypi_0    pypi
tokenizers                0.13.3                   pypi_0    pypi
sentencepiece             0.1.97                   pypi_0    pypi
5. nvidia-smi:
NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4  
6. nvcc -V:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
7. The upstream model, the LoRA model, and the data have all been downloaded locally.
8. The finetune script:
DATA_PATH="./sample/merge.json" #"../dataset/instruction/guanaco_non_chat_mini_52K-utf8.json" #"./sample/merge_sample.json"
OUTPUT_PATH="my-lora-Vicuna"
MODEL_PATH="../llama-13b-hf/"
lora_checkpoint="../Chinese-Vicuna-lora-13b-belle-and-guanaco/"
TEST_SIZE=2000

python finetune.py \
--data_path $DATA_PATH \
--output_path $OUTPUT_PATH \
--model_path $MODEL_PATH \
--eval_steps 200 \
--save_steps 200 \
--test_size $TEST_SIZE

Running the finetune script fails as follows:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /home/doudou/miniconda3/envs/chinese-vicuna/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /home/doudou/miniconda3/envs/chinese-vicuna/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
Traceback (most recent call last):
  File "finetune.py", line 13, in <module>
    "LlamaTokenizer" in transformers._import_structure["models.llama"]
KeyError: 'models.llama'

Can anyone help? What is causing this?

@simonqian (Author)

I uninstalled the pip-installed transformers and reinstalled it with conda install -c huggingface transformers. The resulting version:

transformers              4.27.4                     py_0    huggingface

Running finetune still fails with the same error.

@Facico (Owner) commented Apr 9, 2023

You can't install transformers that way. It has to be pulled directly from GitHub, the way requirements.txt specifies: LLaMA hasn't shipped in a transformers release yet and only exists in the GitHub repository. After pulling from GitHub the version will be 4.28.0.dev.
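
For context, line 13 of finetune.py indexes transformers._import_structure["models.llama"] directly, which is what raises the KeyError on any build that predates LLaMA support. A minimal sketch of a more defensive version of that check (an illustration, not the repository's actual code):

import transformers

# Builds older than the LLaMA merge have no "models.llama" entry, so look it
# up with a default instead of letting the KeyError escape.
if "LlamaTokenizer" not in transformers._import_structure.get("models.llama", []):
    raise ImportError(
        "This transformers build has no LLaMA support; "
        "install 4.28.0.dev0 or newer from the GitHub repository."
    )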

@simonqian (Author)

@Facico
Thanks. I'm installing from source now:

git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .

The installed version:

transformers              4.28.0.dev0              pypi_0    pypi

After the install I went back to the Chinese-Vicuna directory and ran finetune, and hit a different error:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
/home/doudou/miniconda3/envs/chinese-vicuna/lib/python3.8/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
Traceback (most recent call last):
  File "finetune.py", line 8, in <module>
    import transformers
  File "/home/doudou/projects/github/transformers-main/src/transformers/__init__.py", line 26, in <module>
    from . import dependency_versions_check
  File "/home/doudou/projects/github/transformers-main/src/transformers/dependency_versions_check.py", line 17, in <module>
    from .utils.versions import require_version, require_version_core
  File "/home/doudou/projects/github/transformers-main/src/transformers/utils/__init__.py", line 57, in <module>
    from .hub import (
  File "/home/doudou/projects/github/transformers-main/src/transformers/utils/hub.py", line 32, in <module>
    from huggingface_hub import (
  File "/home/doudou/miniconda3/envs/chinese-vicuna/lib/python3.8/site-packages/huggingface_hub/__init__.py", line 278, in __getattr__
    submod = importlib.import_module(submod_path)
  File "/home/doudou/miniconda3/envs/chinese-vicuna/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/doudou/miniconda3/envs/chinese-vicuna/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 21, in <module>
    from filelock import FileLock
ModuleNotFoundError: No module named 'filelock'

Is something still wrong with my environment?

@simonqian (Author) commented Apr 9, 2023

In China, installing with pip install git+https://github.com/huggingface/transformers fails with the error below:

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting git+https://github.com/huggingface/transformers
  Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-lxrdj0gq
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-lxrdj0gq
  fatal: unable to access 'https://github.com/huggingface/transformers/': gnutls_handshake() failed: The TLS connection was non-properly terminated.
  error: subprocess-exited-with-error
  
  × git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-lxrdj0gq did not run successfully.
  │ exit code: 128
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-lxrdj0gq did not run successfully.
│ exit code: 128
╰─> See above for output.

It's probably a network issue, so I downloaded the source zip and installed it manually; that install succeeded.

@Facico (Owner) commented Apr 9, 2023

This still looks like a dependency problem. When you set things up, did you follow our instructions and run "pip install -r requirements.txt"?

"The TLS connection was non-properly terminated" is clearly a network problem. There's not much we can do about that; we usually run a proxy in the terminal.

@simonqian (Author)

I did run pip install -r requirements.txt, but it kept failing because of the network, so I removed the https (git) entries.
Every other dependency installed successfully; only transformers and peft were installed manually.

@Facico (Owner) commented Apr 9, 2023

You can refer to the dependency list provided here, or, whenever you hit a ModuleNotFoundError: No module named ... error, just try installing the module it names (for the trace above, that would be pip install filelock).

@simonqian (Author)

Great, I'll give it a try. Thanks!

@simonqian (Author)

After retrying pip install git+https://github.com/huggingface/transformers several times, it finally installed! finetune runs now. Thanks a lot! @Facico

One more question, though: I've already downloaded merge.json from https://huggingface.co/datasets/Chinese-Vicuna/guanaco_belle_merge_v1.0, so why is it still downloading data? Did I write my script wrong?
[screenshot: dataset download progress]

My script:

DATA_PATH="./sample/merge.json" # the downloaded merge.json
OUTPUT_PATH="my-lora-Vicuna"
MODEL_PATH="../llama-13b-hf/" # the downloaded llama-13b model
lora_checkpoint="../Chinese-Vicuna-lora-13b-belle-and-guanaco/" # downloaded locally
TEST_SIZE=2000

python finetune.py \
--data_path $DATA_PATH \
--output_path $OUTPUT_PATH \
--model_path $MODEL_PATH \
--eval_steps 200 \
--save_steps 200 \
--test_size $TEST_SIZE
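
A plausible explanation for the "downloading" messages (an assumption, not something confirmed in this thread): the Hugging Face datasets library prints "Downloading data files" progress bars even for a purely local JSON file, because load_dataset first copies the file into its cache. A minimal sketch of the kind of call involved:

from datasets import load_dataset

# Even with a purely local path, this step reports "Downloading data files"
# while the JSON is copied into ~/.cache/huggingface/datasets; no network
# access is needed.
data = load_dataset("json", data_files="./sample/merge.json")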

@simonqian (Author)

@Facico Could I ask how much GPU memory finetune needs?
I'm now hitting this exception:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB (GPU 0; 23.70 GiB total capacity; 13.51 GiB already allocated; 95.56 MiB free; 13.94 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@Facico (Owner) commented Apr 9, 2023

With 7B, the defaults in the code take roughly 9 GB. With 13B it should need close to 20 GB; you can try setting MICRO_BATCH_SIZE smaller.
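
Two mitigations follow from the error message and the reply above; a minimal sketch, with illustrative values (2 and 128 are not the repository defaults):

import os

# The OOM message itself suggests max_split_size_mb to reduce allocator
# fragmentation; this must be set before torch initializes CUDA.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# A smaller per-step batch lowers activation memory; gradient accumulation
# in the trainer keeps the effective batch size unchanged.
MICRO_BATCH_SIZE = 2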

@simonqian (Author)

Okay, thanks!

@molyswu commented Apr 13, 2023

python finetune.py --data_path merge.json --test_size 2000
fails with the following error:
CUDA SETUP: CUDA runtime path found: /root/anaconda3/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /root/anaconda3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
Chinese-Vicuna-lora-13b-belle-and-guanaco
Overriding torch_dtype=None with torch_dtype=torch.float16 due to requirements of bitsandbytes to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Traceback (most recent call last):
  File "/home/Chinese-Vicuna/finetune.py", line 64, in <module>
    model = LlamaForCausalLM.from_pretrained(
  File "/root/anaconda3/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2405, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory Chinese-Vicuna-lora-13b-belle-and-guanaco.

parser = argparse.ArgumentParser()
parser.add_argument("--wandb", action="store_true", default=False)
parser.add_argument("--data_path", type=str, default="./sample/merge.json")
parser.add_argument("--output_path", type=str, default="out")
parser.add_argument("--model_path", type=str, default="Chinese-Vicuna-lora-13b-belle-and-guanaco")
parser.add_argument("--eval_steps", type=int, default=200)
parser.add_argument("--save_steps", type=int, default=200)
parser.add_argument("--test_size", type=int, default=200)
parser.add_argument("--resume_from_checkpoint", type=str, default=None)
parser.add_argument("--ignore_data_skip", type=str, default="False")
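
These defaults explain the OSError: without --model_path, the script falls back to the LoRA directory, which holds only adapter weights rather than the full model files (pytorch_model.bin and friends) that from_pretrained looks for. The base model and the adapter are loaded separately; a minimal sketch of the usual transformers + peft pattern (paths illustrative):

from transformers import LlamaForCausalLM
from peft import PeftModel

# --model_path must point at a full base model directory, i.e. one that
# contains the weights and config.json ...
base = LlamaForCausalLM.from_pretrained("../llama-13b-hf/")

# ... while the LoRA directory holds only adapter weights applied on top.
model = PeftModel.from_pretrained(base, "Chinese-Vicuna-lora-13b-belle-and-guanaco")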

@molyswu commented Apr 13, 2023

Downloading the llama-7b model from Hugging Face fixed it.

@96005900

(quoting @simonqian's earlier comment: merge.json already downloaded from https://huggingface.co/datasets/Chinese-Vicuna/guanaco_belle_merge_v1.0, yet data is still being downloaded, along with the same finetune script)

How did you get it working in the end? I'm running into the same problem.

@96005900 commented Dec 13, 2023

pip install git+https://github.com/huggingface/transformers keeps failing with network errors, and it's not convenient for me to run a proxy here.
