Loading the model on multiple GPUs #46
Comments
I also would like to know how to do this.
I have the same request.
I have the same request too.
It can run on two RTX 2080 Ti cards on my machine.
It seems the model can be split across two devices, but during inference the tensors end up on different devices and a device-mismatch error is thrown.
(1) Load LLaMA with the device map set to 'auto' (MiniGPT-4/minigpt4/models/mini_gpt4.py, line 94 in 22d8888):

device_map = 'auto'

(2) Change the device in the line below (line 64 in 22d8888) from 'cuda:{}'.format(args.gpu_id) to 'cuda'; tensors will then be assigned automatically to device 0 or device 1 if you have two devices:

chat = Chat(model, vis_processor, device='cuda')

(3) The .to(device) call can be removed from the line below (line 60 in 22d8888), because LLaMA has already been placed on the GPUs automatically:

model = model_cls.from_config(model_config)

(4) When encoding the image, encode it on the CPU and move only the image embedding to the GPU:

image_emb, _ = self.model.encode_img(image.to('cpu'))
img_list.append(image_emb.to('cuda'))

With these changes the model should work if you have multiple GPUs, each with limited memory (an assembled sketch of the edits follows below).
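For reference, here is a minimal sketch of the four edits in one place. It assumes the surrounding code roughly matches the lines referenced above at commit 22d8888; the from_pretrained keyword arguments and the placement of steps (2) and (3) in demo.py are assumptions, so adapt to your local checkout.

```python
# Sketch of the four edits above (patch to existing MiniGPT-4 files, not a
# standalone script). Adjust names and paths to your local copy.

# (1) minigpt4/models/mini_gpt4.py, around line 94:
#     let Accelerate shard LLaMA across all visible GPUs
self.llama_model = LlamaForCausalLM.from_pretrained(
    llama_model,
    torch_dtype=torch.float16,   # assumed to match the repo's existing kwargs
    load_in_8bit=True,           # only when low-resource / 8-bit mode is enabled
    device_map='auto',           # was: a single fixed device
)

# (2) + (3) demo.py, around lines 60-64 (assumed location):
#     drop the explicit .to('cuda:{gpu_id}') move and pass a generic 'cuda'
model = model_cls.from_config(model_config)        # no .to(device) here
chat = Chat(model, vis_processor, device='cuda')   # was: 'cuda:{}'.format(args.gpu_id)

# (4) when an image is uploaded, encode it on the CPU and move only the
#     resulting embedding to the GPU
image_emb, _ = self.model.encode_img(image.to('cpu'))
img_list.append(image_emb.to('cuda'))
```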
I did all of these steps but I still get:
Traceback (most recent call last):
@JainitBITW Is it working for you now?
Yes, I just restarted CUDA.
@JainitBITW Did you do anything apart from @thcheung's instructions?
Nope, exactly the same.
What error are you getting?
I'm trying to run the 13B model on multiple GPUs. The authors have written that they currently don't support multi-GPU inference, so I want to be sure that inference on multiple GPUs is possible before provisioning the EC2 instance.
I think you can go ahead.
@JainitBITW @thcheung Thanks, it worked for me (8-bit). Any idea how to do it for 16-bit (low_resource = False)?
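Not verified against this repo, but with Hugging Face transformers plus accelerate the usual way to load in 16-bit across several GPUs is to drop load_in_8bit and pass torch_dtype=torch.float16 with device_map='auto', optionally capping per-GPU memory. A minimal sketch, with a placeholder checkpoint path and illustrative memory limits:

```python
import torch
from transformers import LlamaForCausalLM

# Minimal sketch of a 16-bit multi-GPU load (low_resource = False).
# "/path/to/vicuna-13b" is a placeholder; the max_memory values are
# illustrative caps for two 24 GB cards, leaving headroom for the
# vision encoder and Q-Former.
llama_model = LlamaForCausalLM.from_pretrained(
    "/path/to/vicuna-13b",
    torch_dtype=torch.float16,            # fp16 weights instead of 8-bit
    device_map="auto",                    # shard layers across visible GPUs
    max_memory={0: "20GiB", 1: "20GiB"},  # optional per-GPU cap
)
```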
I got through this error by setting
My solution is:
I have two 24 GB RTX 4090s. If possible, please provide an extra argument to demo.py to load the model either on the CPU or on two or more GPUs, and another argument to run in 16-bit and take advantage of the extra GPU RAM, instead of requiring edits to the config files.
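One way such flags could look is sketched below. This is purely illustrative: the --device and --bits names are hypothetical and not existing demo.py options.

```python
import argparse

# Hypothetical extension of demo.py's argument parsing; --device and --bits
# are illustrative flag names, not part of the current repo.
parser = argparse.ArgumentParser(description="MiniGPT-4 demo")
parser.add_argument("--cfg-path", required=True, help="path to configuration file")
parser.add_argument(
    "--device", default="cuda",
    help="'cpu', a specific device such as 'cuda:0', or 'cuda' to shard "
         "across all visible GPUs via device_map='auto'",
)
parser.add_argument(
    "--bits", type=int, choices=[8, 16], default=8,
    help="load LLaMA in 8-bit (low memory) or 16-bit (uses more GPU RAM)",
)
args = parser.parse_args()
```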