Model loading problem #96

Open
Tzx11 opened this issue May 25, 2024 · 8 comments

Tzx11 commented May 25, 2024

config = MonkeyConfig.from_pretrained(
    "monkey_model",
    cache_dir=training_args.cache_dir,
    trust_remote_code=True,
)

Because of "monkey_model", this code raises the following error:
OSError: monkey_model is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with huggingface-cli login or by passing token=<your_token>

Collaborator

echo840 commented May 25, 2024

I apologize for any confusion caused. You can replace "monkey_model" with the local path of the model weights you have downloaded, or with "echo840/Monkey".
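One way to apply this suggestion without hard-coding either value is to prefer a local checkpoint folder when it exists and fall back to the Hub ID otherwise. The sketch below is illustrative only: resolve_model_source is a hypothetical helper, and treating a folder as a checkpoint when it contains config.json is an assumption about the usual Hugging Face layout, not something stated in this thread.

```python
from pathlib import Path


def resolve_model_source(local_dir, hub_id="echo840/Monkey"):
    """Return a local checkpoint path if one is present, else the Hub ID."""
    path = Path(local_dir).expanduser()
    # Assumption: a usable checkpoint folder contains a config.json file.
    if path.is_dir() and (path / "config.json").is_file():
        return str(path.resolve())
    # Otherwise fall back to the identifier on the Hugging Face Hub.
    return hub_id
```

The returned string could then be passed as the first argument to MonkeyConfig.from_pretrained(...) in place of "monkey_model".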

Author

Tzx11 commented May 25, 2024

Thank you for your reply. After changing "monkey_model" to "echo840/Monkey", the following error occurred:
Traceback (most recent call last):
File "/home/tongzixuan/code/Monkey-main/finetune/../finetune_multitask.py", line 422, in <module>
train()
File "/home/tongzixuan/code/Monkey-main/finetune/../finetune_multitask.py", line 405, in train
trainer.train()
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/transformers/trainer.py", line 1555, in train
return inner_training_loop(
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/transformers/trainer.py", line 1837, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/transformers/trainer.py", line 2682, in training_step
loss = self.compute_loss(model, inputs)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/transformers/trainer.py", line 2707, in compute_loss
outputs = model(**inputs)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1855, in forward
loss = self.module(*inputs, **kwargs)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_qwen.py", line 1004, in forward
transformer_outputs = self.transformer(
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_monkey.py", line 82, in forward
return super().forward(input_ids,
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_qwen.py", line 840, in forward
outputs = torch.utils.checkpoint.checkpoint(
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 249, in checkpoint
return CheckpointFunction.apply(function, preserve, *args)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 107, in forward
outputs = run_function(*args)
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_qwen.py", line 836, in custom_forward
return module(*inputs, use_cache, output_attentions)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_qwen.py", line 516, in forward
attn_outputs = self.attn(
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_qwen.py", line 418, in forward
query = apply_rotary_pos_emb(query, q_pos_emb)
File "/home/tongzixuan/code/Monkey-main/monkey_model/modeling_qwen.py", line 1343, in apply_rotary_pos_emb
output = apply_rotary_emb_func(t, cos, sin).type_as(t)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/flash_attn/layers/rotary.py", line 122, in apply_rotary_emb
return ApplyRotaryEmb.apply(
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/flash_attn/layers/rotary.py", line 48, in forward
out = apply_rotary(
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/flash_attn/ops/triton/rotary.py", line 213, in apply_rotary
rotary_kernel[grid](
File "<string>", line 41, in rotary_kernel
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/triton/compiler.py", line 1587, in compile
so_path = make_stub(name, signature, constants)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/triton/compiler.py", line 1476, in make_stub
so = _build(name, src_path, tmpdir)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/site-packages/triton/compiler.py", line 1391, in _build
ret = subprocess.check_call(cc_cmd)
File "/home/tongzixuan/anaconda3/envs/monkey/lib/python3.9/subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpks7a4w5v/main.c', '-O3', '-I/usr/local/cuda/include', '-I/home/tongzixuan/anaconda3/envs/monkey/include/python3.9', '-I/tmp/tmpks7a4w5v', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmpks7a4w5v/rotary_kernel.cpython-39-x86_64-linux-gnu.so']' returned non-zero exit status 1.
/usr/bin/ld: cannot find -lcuda: No such file or directory
collect2: error: ld returned 1 exit status
Traceback (most recent call last):
File "<string>", line 21, in rotary_kernel
KeyError: ('2-.-0-.-0-7d1eb0d2fed8ff2032dccb99c2cc311a-d6252949da17ceb5f3a278a70250af13-1af5134066c618146d2cd009138944a0-37a3350d9f1920364a7e68ae67c1a1f0-3498c340fd4b6ee7805fd54b882a04f5-e1f133f98d04093da2078dfc51c36b72-b26258bf01f839199e39d64851821f26-d7c06e3b46e708006c15224aac7a1378-f585402118c8a136948ce0a49cfe122c', (torch.float32, torch.float32, torch.float32, torch.float32, None, 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32', 'i32'), (128, False, False, False, False, 4), (True, True, True, True, (False,), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (True, False), (False, True), (True, False), (True, False), (True, False), (False, True)))
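The decisive line in this log is "/usr/bin/ld: cannot find -lcuda": gcc is asked to link the Triton stub against the CUDA driver library, and the linker cannot find a file named libcuda.so on its search path. The trailing KeyError is most likely just the kernel-cache miss that triggered the compile, not a separate fault. A minimal diagnostic sketch follows; the directory list is an assumption about common Linux layouts, and find_libcuda and linker_can_resolve are hypothetical helper names.

```python
import glob
import os

# Assumption: typical locations for the CUDA driver library on Linux.
DEFAULT_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib64",
    "/usr/local/cuda/lib64",
    "/usr/local/cuda/lib64/stubs",
    "/usr/local/cuda/compat",
)


def find_libcuda(search_dirs=DEFAULT_DIRS):
    """List every libcuda.so* file found in the given directories."""
    hits = []
    for directory in search_dirs:
        hits.extend(sorted(glob.glob(os.path.join(directory, "libcuda.so*"))))
    return hits


def linker_can_resolve(hits):
    """-lcuda needs a file named exactly libcuda.so, not only libcuda.so.1."""
    return any(os.path.basename(hit) == "libcuda.so" for hit in hits)
```

If only a versioned file such as libcuda.so.1 turns up, a commonly reported remedy is to make a directory containing an unversioned libcuda.so visible to the linker, for example through the LIBRARY_PATH environment variable that gcc consults. Whether that applies here depends on the machine.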

TAOSHss commented May 27, 2024

[2024-05-27 07:42:53,689] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
You are using a model of type qwen to instantiate a model of type monkey. This is not supported for all configurations of models and can yield errors.
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
Your device does NOT seem to support bf16, you can switch to fp16 or fp32 by by passing fp16/fp32=True in "AutoModelForCausalLM.from_pretrained".
Your device does NOT seem to support bf16, you can switch to fp16 or fp32 by by passing fp16/fp32=True in "AutoModelForCausalLM.from_pretrained".
/opt/conda/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1695392020201/work/aten/src/ATen/native/TensorShape.cpp:3526.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
File "/root/code/extra/Monkey/textMonkey_testocr.py", line 123, in <module>
main()
File "/root/code/extra/Monkey/textMonkey_testocr.py", line 82, in main
model, tokenizer = _load_model_tokenizer(args)
File "/root/code/extra/Monkey/textMonkey_testocr.py", line 38, in _load_model_tokenizer
tokenizer = QWenTokenizer.from_pretrained(checkpoint_path,
File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2028, in from_pretrained
return cls._from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2260, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/root/code/extra/Monkey/monkey_model/tokenization_qwen.py", line 114, in __init__
super().__init__(**kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 367, in __init__
self._add_tokens(
File "/root/code/extra/Monkey/monkey_model/tokenization_qwen.py", line 218, in _add_tokens
if surface_form not in SPECIAL_TOKENS + self.IMAGE_ST:
AttributeError: 'QWenTokenizer' object has no attribute 'IMAGE_ST'

This error occurred while testing the TextMonkey model.

Collaborator

echo840 commented May 27, 2024

Hello, you should either use transformers==4.32.0 or apply the fix described in these discussions: https://huggingface.co/echo840/Monkey-Chat/discussions/1 and https://huggingface.co/echo840/Monkey/discussions/4
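The AttributeError on IMAGE_ST is characteristic of an initialization-order problem: the base-class constructor calls _add_tokens, which the subclass overrides to read self.IMAGE_ST, before the subclass has assigned that attribute. The toy classes below reproduce the pattern without transformers. They are hypothetical stand-ins, not the real QWenTokenizer, and the idea that the linked fix reorders the assignments in this way is an assumption that has not been checked against those discussions.

```python
SPECIAL_TOKENS = ("<|endoftext|>",)


class BaseTokenizer:
    """Stand-in for a base class whose constructor registers tokens."""

    def __init__(self):
        self.added = self._add_tokens(["<img>", "<custom>"])

    def _add_tokens(self, tokens):
        return list(tokens)


class BrokenTokenizer(BaseTokenizer):
    def __init__(self):
        super().__init__()          # runs _add_tokens too early
        self.IMAGE_ST = ("<img>",)  # assigned only afterwards

    def _add_tokens(self, tokens):
        # Reads self.IMAGE_ST, which must already exist at this point.
        return [t for t in tokens if t not in SPECIAL_TOKENS + self.IMAGE_ST]


class FixedTokenizer(BrokenTokenizer):
    def __init__(self):
        self.IMAGE_ST = ("<img>",)    # set the attribute first
        BaseTokenizer.__init__(self)  # then run the base constructor
```

Constructing BrokenTokenizer() raises an AttributeError of the same shape as the one in the log, while FixedTokenizer() completes normally.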

Collaborator

echo840 commented May 27, 2024

Could you tell me whether the error occurred during training or during testing?

Author

Tzx11 commented May 27, 2024

The error occurred during training.

Collaborator

echo840 commented May 27, 2024

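Are you running the training .sh script from the corresponding project folder?

tokenizer = QWenTokenizer.from_pretrained(
    "monkey_model",
    cache_dir=training_args.cache_dir,
    model_max_length=training_args.model_max_length,
    padding_side="right",
    use_fast=False,
    trust_remote_code=True,
)

Here "monkey_model" is a relative path that points to the link folder in your project. You should also check your environment for problems.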
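Because a relative path such as "monkey_model" is resolved against the current working directory, launching the script from a different folder changes what it points to. A small check like the following can confirm what the path resolves to before training starts. inspect_relative_path is a hypothetical helper written for this illustration.

```python
from pathlib import Path


def inspect_relative_path(name="monkey_model", cwd=None):
    """Report how a relative path resolves from a given working directory."""
    base = Path(cwd) if cwd is not None else Path.cwd()
    candidate = base / name
    return {
        "resolved": str(candidate.resolve()),  # absolute target after following links
        "exists": candidate.exists(),          # False if the folder or link target is missing
        "is_symlink": candidate.is_symlink(),  # True when the entry is a link
    }
```

Calling inspect_relative_path() from the directory where the training script is launched shows whether "monkey_model" exists there and where it leads.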

Author

Tzx11 commented May 29, 2024

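I changed "monkey_model" here to a local path and set up the environment step by step following the README, but running the training script afterwards still produces the error above.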
