
BLIP2 inference error: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:2 #26806

Closed
YongLD opened this issue Oct 14, 2023 · 7 comments



YongLD commented Oct 14, 2023

System Info

Describe the bug
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:2.

Screenshots

Traceback (most recent call last):
  File "/home/cike/ldy/ner/test-blip2-1.py", line 18, in <module>
    out = model.generate(**inputs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cike/anaconda3/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
     ......
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/cike/.local/lib/python3.11/site-packages/transformers/generation/utils.py", line 2494, in greedy_search
    next_tokens = next_tokens * unfinished_sequences + pad_token_id * (1 - unfinished_sequences)
                  ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:2

System info (please complete the following information):

  • OS: Ubuntu 18.04.2 LTS
  • One machine with 8x Tesla P100-PCIE-16GB

How can I fix this bug?

Who can help?

@pacman100

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

To Reproduce
I am trying to enable multi-GPU inference on the BLIP2 model.
I tried the following code snippet:

import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

model_path = "/home/cike/.cache/huggingface/hub/models--Salesforce--blip2-flan-t5-xl/snapshots/cc2bb7bce2f7d4d1c37753c7e9c05a443a226614/"
processor = Blip2Processor.from_pretrained(model_path)
model = Blip2ForConditionalGeneration.from_pretrained(model_path, device_map="auto")

img_url = 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg' 
raw_image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')

question = "how many dogs are in the picture?"
inputs = processor(raw_image, question, return_tensors="pt").to("cuda")

print("model: ",model.hf_device_map)

out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
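
A hedged workaround sketch for the snippet above (not a confirmed fix): send the inputs to the GPU that hosts the vision tower instead of the bare "cuda" default, so generation starts on the device accelerate actually chose. The "vision_model" key is an assumption about how accelerate names BLIP2's top-level modules in hf_device_map.

# Sketch, reusing model/processor/raw_image/question from the snippet above
first_device = model.hf_device_map.get("vision_model", 0)  # assumed key
inputs = processor(raw_image, question, return_tensors="pt").to(f"cuda:{first_device}")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))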

Expected behavior

The BLIP2 model loads and runs successfully on multi-GPUs.

ArthurZucker (Collaborator) commented:

pinging @SunMarc and @younesbelkada as well!


SunMarc commented Oct 16, 2023

Hi @YongLD, please make sure you have the latest version of transformers. We fixed a similar issue in the past. On my side, I'm able to run on 2 GPUs. LMK how it goes. If it doesn't work, please provide your environment config.
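
For reference, a minimal way to confirm which versions are installed before and after upgrading:

# Print the three versions that matter for big-model inference here
import torch
import accelerate
import transformers

print("transformers:", transformers.__version__)
print("accelerate:", accelerate.__version__)
print("torch:", torch.__version__)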


YongLD commented Oct 18, 2023

Environment Config

transformers==4.34.0
accelerate==0.23.0
torch==2.0.1+cu117

Besides, I found a warning when I run with device_map="auto":

The `language_model` is not in the `hf_device_map` dictionary and you are running your script in a multi-GPU environment. 
this may lead to unexpected behavior when using `accelerate`. Please pass a `device_map` that contains `language_model` to remove this warning.

Does Accelerate's big-model inference support blip2-flan-t5-xl or blip2-flan-t5-xxl?
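
One hedged way to act on that warning is to pass an explicit device_map that mentions language_model, instead of "auto". This is only a sketch; the module names are assumptions based on Blip2ForConditionalGeneration's top-level attributes, and the 0/1 placement is arbitrary.

from transformers import Blip2ForConditionalGeneration

# Explicit per-module placement so accelerate does not have to guess;
# every top-level BLIP2 module, including language_model, gets a device.
device_map = {
    "vision_model": 0,
    "query_tokens": 0,
    "qformer": 0,
    "language_projection": 0,
    "language_model": 1,
}
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl", device_map=device_map
)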

Another question (although it may be a CUDA bug):
I get a DeferredCudaCallError when I use to("cuda") with multiple GPUs. Do you know why?

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
import torch
from transformers import Blip2ForConditionalGeneration

model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b").to("cuda")

Error: torch.cuda.DeferredCudaCallError: CUDA call failed lazily at initialization with error: device >= 0 && device < num_gpus
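
A common cause of this error (an assumption here, not a confirmed diagnosis) is that CUDA is initialized before CUDA_VISIBLE_DEVICES is set, e.g. by an earlier import torch elsewhere in the process; the variable must be exported before torch first touches CUDA:

import os

# Must run before the first `import torch` anywhere in the process;
# setting it afterwards leaves torch's cached device count stale, which
# can surface as DeferredCudaCallError.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import torch
print(torch.cuda.device_count())  # expect 2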


YongLD commented Oct 18, 2023

@SunMarc How can I lock device_map="auto" to specific GPUs?
I have tried os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2", but it does not work;
the command torchrun test.py --nproc_per_node=3 does not work either.
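
A hedged alternative sketch: from_pretrained accepts a max_memory argument that accelerate honors when computing device_map="auto", so you can cap which GPUs may receive weights. The 15GiB budgets below are assumptions for 16GB cards:

from transformers import Blip2ForConditionalGeneration

# Only GPUs listed in max_memory are eligible for weights; unlisted
# devices receive nothing, effectively pinning "auto" to GPUs 0-2.
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl",
    device_map="auto",
    max_memory={0: "15GiB", 1: "15GiB", 2: "15GiB"},
)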


SunMarc commented Oct 18, 2023

I think it is a problem with torch and CUDA. We had a similar case in the past. Can you reinstall and try again?

Also, the following code snippet works on my side:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
import torch
from transformers import Blip2ForConditionalGeneration

model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b").to("cuda")

As for the warning, that is something we need to fix; it shouldn't be shown.


YongLD commented Oct 20, 2023

@SunMarc Yes, I can use Salesforce/blip2-opt-2.7b with to("cuda"), but I cannot fit Salesforce/blip2-flan-t5-xxl on a single 16GB GPU.
There is always a RuntimeError when I use device_map="auto" for the BLIP2 multi-GPU test, but I can use it for the T5 model:

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xxl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xxl", device_map="auto")

So I wonder: is this a problem specific to Salesforce/blip2-flan-t5-xxl, or does it affect other BLIP2 models as well?
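
For completeness, a hedged single-GPU workaround sketch (assuming the xl checkpoint fits on one 16GB card in fp16; the xxl variant will not): force the whole model onto one device so generate() never crosses GPUs.

import torch
from transformers import Blip2ForConditionalGeneration

# {"": 0} is accelerate's "place the entire model on device 0" mapping
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl",
    torch_dtype=torch.float16,
    device_map={"": 0},
)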


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
