I'm attempting to run the model on dual RTX 4090s. Enabling this would be a great update and would allow more people to run the full float16 model.
Some changes would need to be made, starting by passing

```python
kwargs = {
    "device_map": "auto",
    "max_memory": {i: "13GiB" for i in range(num_gpus)},
}
```

to `LlamaForCausalLM.from_pretrained`.
After making this change the model loads, but it throws the following error when submitting text:

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
```

Full logs: minigpt4-error-logs.txt
I'm successfully running it on dual RTX 3090s.
Try with `CUDA_VISIBLE_DEVICES=0 python demo.py --cfg-path eval_configs/minigpt4_eval.yaml`.
Curious, what's your VRAM usage across the cards? With `CUDA_VISIBLE_DEVICES=0` I can run the model in 8-bit mode, but my second card sits idle. I'm hoping to run the full 16-bit model split between my two cards.
One card uses 16 GB and the other shows no usage. `CUDA_VISIBLE_DEVICES` just limits which GPUs are visible to the Python program, e.g.:

- `CUDA_VISIBLE_DEVICES=0`: only GPU 0 is visible
- `CUDA_VISIBLE_DEVICES=1`: only GPU 1 is visible
- `CUDA_VISIBLE_DEVICES=0,1`: both GPU 0 and GPU 1 are visible
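To illustrate (a hypothetical check, not part of the original exchange), you can confirm what PyTorch actually sees under each setting:

```python
import torch

# Launch as, e.g.: CUDA_VISIBLE_DEVICES=0 python check_devices.py
# (check_devices.py is a hypothetical script name.)
# PyTorch sees only the GPUs listed in CUDA_VISIBLE_DEVICES and
# renumbers them from cuda:0, so cuda:1 does not exist under "=0".
print(torch.cuda.device_count())  # 1 with CUDA_VISIBLE_DEVICES=0; 2 with 0,1
for i in range(torch.cuda.device_count()):
    print(f"cuda:{i} -> {torch.cuda.get_device_name(i)}")
```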
If you want to run inference across two cards, you need to do something like the sketch below.
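Here is a minimal sketch using the standard `transformers`/`accelerate` multi-GPU loading path. The weight path is a placeholder, and the standalone-script framing is an assumption; in MiniGPT-4 the equivalent `from_pretrained` call lives inside the repo's model-loading code rather than a separate script:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

num_gpus = torch.cuda.device_count()

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-weights")  # placeholder path
model = LlamaForCausalLM.from_pretrained(
    "path/to/llama-weights",                           # placeholder path
    torch_dtype=torch.float16,
    device_map="auto",                                 # shard layers across visible GPUs
    max_memory={i: "13GiB" for i in range(num_gpus)},  # cap per-GPU usage
)

# With the model sharded, inputs must live on the device of the first
# layer (the embeddings); accelerate moves activations between shards.
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Feeding tensors created on one GPU into a layer that accelerate placed on another is typically what produces the "Expected all tensors to be on the same device" error above; keeping inputs on `model.device` and letting accelerate route activations between shards avoids it.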