This repository has been archived by the owner on Oct 31, 2023. It is now read-only.
If the network has few parameters you will only see small memory reductions (for example with ResNet-50), since the savings come from the optimizer state, which scales with the number of parameters.
The other problem is that nvidia-smi only shows the total memory reserved by the process. PyTorch's caching allocator keeps freed memory around to make future allocations more efficient, so it may seem that memory usage has not decreased when it actually has.
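To see the real difference, it helps to compare PyTorch's own counters rather than nvidia-smi. A minimal sketch (assuming a standard PyTorch install; the helper name `cuda_mem_report` is made up for illustration) that distinguishes memory actually allocated by tensors from memory merely reserved by the caching allocator:

```python
import torch

def cuda_mem_report():
    """Return allocated vs. reserved CUDA memory in MiB.

    `memory_allocated` counts bytes held by live tensors;
    `memory_reserved` counts bytes the caching allocator has
    claimed from the driver (what nvidia-smi roughly reflects).
    """
    if not torch.cuda.is_available():
        # No GPU in this environment; report zeros so the sketch still runs.
        return {"allocated_mb": 0.0, "reserved_mb": 0.0}
    return {
        "allocated_mb": torch.cuda.memory_allocated() / 2**20,
        "reserved_mb": torch.cuda.memory_reserved() / 2**20,
    }

report = cuda_mem_report()
print(report)
```

Comparing `allocated_mb` between the two optimizer runs should show the 8-bit savings even when `reserved_mb` (and nvidia-smi) look identical; `torch.cuda.empty_cache()` can also release the cached-but-unused memory back to the driver.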
Hi.
I am training my network with
bnb.optim.Adam8bit
vs. torch.optim.Adam
but I don't see any difference in memory consumption. Running on an RTX 2080 Ti (single GPU or DDP)
with cudatoolkit 11.1.74
bitsandbytes-cuda111
Looking at nvidia-smi, I see 9.6 GB in both cases.
Am I missing something here?
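For context, the swap being described is a one-line change of optimizer constructor. A minimal sketch (the `make_optimizer` helper and the fallback logic are illustrative, not from the original post) that uses `bnb.optim.Adam8bit` when bitsandbytes is importable and falls back to `torch.optim.Adam` otherwise:

```python
import torch
import torch.nn as nn

def make_optimizer(model, lr=1e-3, use_8bit=True):
    """Build Adam8bit when bitsandbytes is available, else plain Adam."""
    if use_8bit:
        try:
            import bitsandbytes as bnb
            # 8-bit optimizer: stores Adam's two state tensors in int8,
            # so savings grow with the model's parameter count.
            return bnb.optim.Adam8bit(model.parameters(), lr=lr)
        except ImportError:
            pass  # bitsandbytes not installed; fall back below.
    return torch.optim.Adam(model.parameters(), lr=lr)

model = nn.Linear(64, 64)
opt = make_optimizer(model, use_8bit=True)
```

Note that the optimizer state only materializes after the first `step()`, so memory measured before any training step will look identical for both optimizers.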