Finetune error: RuntimeError: FIND was unable to find an engine to execute this computation #52

Closed
yt7589 opened this issue May 24, 2023 · 2 comments


yt7589 commented May 24, 2023

I downloaded the latest version of VisualGLM-6B and used the following commands to set up the development environment:

conda create -n glm python=3.9
conda activate glm
git clone https://github.com/THUDM/VisualGLM-6B.git
cd VisualGLM-6B
pip install -i https://mirrors.aliyun.com/pypi/simple/ -r requirements.txt
# edit finetune/finetune_visualglm.sh to set NUM_GPUS_PER_WORKER=2, the number of GPUs in my server
unzip fewshot-data.zip
bash finetune/finetune_visualglm.sh

It reported errors as below:

Traceback (most recent call last):
  File "/media/zjkj/2t/yantao/VisualGLM-6B/finetune_visualglm.py", line 188, in <module>
    training_main(args, model_cls=model, forward_step_function=forward_step, create_dataset_function=create_dataset_function, collate_fn=data_collator)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 130, in training_main
    iteration, skipped = train(model, optimizer,
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 274, in train
    lm_loss, skipped_iter, metrics = train_step(train_data_iterator,
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 348, in train_step
    forward_ret = forward_step(data_iterator, model, args, timers, **kwargs)
  File "/media/zjkj/2t/yantao/VisualGLM-6B/finetune_visualglm.py", line 84, in forward_step
    logits = model(input_ids=tokens, image=image, pre_image=pre_image)[0]
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1724, in forward
    loss = self.module(*inputs, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/official/chatglm_model.py", line 192, in forward
    return super().forward(input_ids=input_ids, attention_mask=attention_mask, position_ids=position_ids, past_key_values=past_key_values, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/base_model.py", line 144, in forward
    return self.transformer(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/transformer.py", line 451, in forward
    hidden_states = self.hooks['word_embedding_forward'](input_ids, output_cross_layer=output_cross_layer, **kw_args)
  File "/media/zjkj/2t/yantao/VisualGLM-6B/model/visualglm.py", line 20, in word_embedding_forward
    image_emb = self.model(**kw_args)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/VisualGLM-6B/model/blip2.py", line 65, in forward
    enc = self.vit(image)[0]
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/VisualGLM-6B/model/blip2.py", line 29, in forward
    return super().forward(input_ids=input_ids, position_ids=None, attention_mask=attention_mask, image=image)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/base_model.py", line 144, in forward
    return self.transformer(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/transformer.py", line 451, in forward
    hidden_states = self.hooks['word_embedding_forward'](input_ids, output_cross_layer=output_cross_layer, **kw_args)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/official/vit_model.py", line 55, in word_embedding_forward
    embeddings = self.proj(images)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: FIND was unable to find an engine to execute this computation

Please note that my PyTorch version is 2.0. Does VisualGLM-6B have a problem with PyTorch 2.0?
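For context: the "FIND was unable to find an engine" error raised from F.conv2d usually means cuDNN could not select a kernel, most often because the CUDA runtime bundled in the PyTorch wheel does not match the installed NVIDIA driver. A minimal sketch of the compatibility rule (a hypothetical pure-Python helper, not part of VisualGLM; drivers are backward compatible, so the driver's CUDA version must be at least the wheel's):

```python
# Sketch: decide whether a PyTorch wheel built for a given CUDA runtime
# can run under the installed NVIDIA driver. The rule of thumb is
# driver CUDA version >= wheel CUDA version.

def parse_version(v: str) -> tuple:
    """Turn a dotted version string like '11.7' into (11, 7) for comparison."""
    return tuple(int(part) for part in v.split("."))

def cuda_compatible(wheel_cuda: str, driver_cuda: str) -> bool:
    """True if the driver can host the wheel's bundled CUDA runtime."""
    return parse_version(driver_cuda) >= parse_version(wheel_cuda)

# torch.version.cuda reports the wheel's CUDA; `nvidia-smi` reports the driver's.
print(cuda_compatible("11.7", "12.1"))  # True: a newer driver hosts an older runtime
print(cuda_compatible("11.8", "11.7"))  # False: the wheel needs a newer driver
```

Comparing these two version strings on the failing machine is a quick way to confirm the mismatch before reinstalling anything.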

@Sleepychord (Contributor) commented:
Could you try reinstalling PyTorch with a build that matches your CUDA version? I think 2.0 is okay, but 1.13.1 supports more CUDA versions.
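One way to do this (a sketch, not the only option: pick the cuXXX tag that matches what `nvidia-smi` reports on your machine, e.g. cu117 for CUDA 11.7) is to reinstall from the official PyTorch wheel index for that CUDA version:

```shell
# Remove the mismatched build, then install a wheel compiled against CUDA 11.7.
pip uninstall -y torch torchvision
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu117
```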


yt7589 commented May 26, 2023

@Sleepychord Thanks. Installing the right CUDA version (11.7) solved my problem.

yt7589 closed this as completed May 26, 2023