
Training unbalance on different GPUs? #8

Open
Epiphqny opened this issue Dec 2, 2019 · 2 comments

Comments

Epiphqny commented Dec 2, 2019

I used 8 GPUs to train the model, but most of the memory is placed on the first GPU and I cannot fully utilize the other GPUs. Is there any solution? Thanks!
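
For reference, a minimal sketch of the kind of single-process nn.DataParallel loop being described (the model, data, and loss below are toy placeholders, not the actual training code). With DataParallel the per-GPU outputs are gathered back onto device_ids[0], so GPU 0 holds the gathered outputs and the loss on top of its own replica, which is what causes the imbalance:

import torch
import torch.nn as nn

# toy stand-ins so the snippet runs on its own; replace with the real model and data loader
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.Conv2d(64, 19, 1))
loader = [(torch.randn(16, 3, 128, 128), torch.randint(0, 19, (16, 128, 128))) for _ in range(2)]

device = torch.device("cuda:0")
model = nn.DataParallel(model.to(device), device_ids=[0, 1, 2, 3, 4, 5, 6, 7])
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for images, labels in loader:
    images = images.to(device)
    labels = labels.to(device)
    optimizer.zero_grad()
    outputs = model(images)            # forward pass runs on all 8 GPUs...
    loss = loss_fn(outputs, labels)    # ...but the gathered outputs and the loss live on cuda:0
    loss.backward()
    optimizer.step()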

Epiphqny changed the title from "training unbalanced on different GPUs?" to "Training unbalance on different GPUs?" on Dec 2, 2019
PingoLH (Owner) commented Dec 2, 2019

Hello, good question! I've also faced this problem before. You can try this one:
# distribute the model replicas over the first 7 GPUs (device is the default GPU, cuda:0)
model = torch.nn.DataParallel(model, device_ids=[0, 1, 2, 3, 4, 5, 6])
images = images.to(device)
# send the labels to the last GPU
labels = labels.to(device).cuda(7)
optimizer.zero_grad()
# move the gathered outputs to the last GPU as well, so the loss is computed there
outputs = model(images).cuda(7)
# after computing the loss, send the loss back to GPU 0 for backpropagation
loss = loss_fn(input=outputs, target=labels).cuda(0)
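
Expanded into a complete training step (again a sketch only, with toy placeholders for the model, loader, loss_fn, and optimizer), the idea is to keep the replicas on GPUs 0-6 and push the labels, the gathered outputs, and the loss computation onto the otherwise idle GPU 7:

import torch
import torch.nn as nn

# toy stand-ins so the snippet runs on its own; replace with the real model and data loader
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.Conv2d(64, 19, 1))
loader = [(torch.randn(14, 3, 128, 128), torch.randint(0, 19, (14, 128, 128))) for _ in range(2)]

device = torch.device("cuda:0")                     # DataParallel gathers outputs here by default
model = nn.DataParallel(model.to(device), device_ids=[0, 1, 2, 3, 4, 5, 6])  # replicas on GPUs 0-6 only
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for images, labels in loader:
    images = images.to(device)
    labels = labels.cuda(7)                         # targets go straight to the spare GPU
    optimizer.zero_grad()
    outputs = model(images).cuda(7)                 # move the gathered outputs from GPU 0 to GPU 7
    loss = loss_fn(input=outputs, target=labels)    # loss is computed on GPU 7
    loss = loss.cuda(0)                             # send the scalar loss back to GPU 0, as suggested above
    loss.backward()                                 # gradients still flow back through GPUs 0-6
    optimizer.step()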

Epiphqny (Author) commented Dec 3, 2019

I have tried this code, but the memory of GPU 7 still limits the batch size, and the memory of the other GPUs cannot be fully utilized, so there is not much point in using multiple GPUs...
