This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

no difference in memory usage #22

Closed
ofrimasad opened this issue Dec 16, 2021 · 3 comments

Comments

@ofrimasad

Hi.
I am training my network with bnb.optim.Adam8bit vs. torch.optim.Adam, but I don't see any difference in memory consumption.

Running on an RTX 2080 Ti (single GPU or DDP),
with cudatoolkit 11.1.74
and bitsandbytes-cuda111.

Looking at nvidia-smi, I see 9.6 GB in both cases.
Am I missing something here?

@JoeyQQ

JoeyQQ commented Jan 3, 2022

Hi there, I ran into the same issue.
Did you solve it?

@LeeDoYup

LeeDoYup commented Feb 2, 2022

Me too.

@TimDettmers
Contributor

If the network has few parameters, you will only see small memory reductions (for example, with ResNet-50).

The other problem is that nvidia-smi only shows the memory reserved by the process, not what is currently in use. PyTorch's caching allocator keeps freed memory around to make future allocations more efficient, so the reported number may stay the same even though actual usage has decreased.
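The parameter-count point can be sanity-checked with simple arithmetic: standard Adam keeps two fp32 state tensors per parameter (exp_avg and exp_avg_sq), i.e. 8 bytes per parameter, while the 8-bit optimizer stores each state in 1 byte (ignoring the small per-block quantization constants). A rough sketch, with an approximate ResNet-50 parameter count:

```python
def optimizer_state_mib(num_params: int, bytes_per_state: int) -> float:
    """Approximate Adam optimizer-state size in MiB.

    Adam keeps two state tensors per parameter (exp_avg and exp_avg_sq).
    bytes_per_state is 4 for 32-bit Adam, 1 for the 8-bit variant.
    """
    return 2 * num_params * bytes_per_state / 2**20

resnet50_params = 25_600_000  # ~25.6M parameters (approximate)

fp32 = optimizer_state_mib(resnet50_params, 4)
int8 = optimizer_state_mib(resnet50_params, 1)
print(f"32-bit Adam state: {fp32:.0f} MiB")  # → 195 MiB
print(f" 8-bit Adam state: {int8:.0f} MiB")  # → 49 MiB
print(f"           saving: {fp32 - int8:.0f} MiB")  # → 146 MiB
```

A saving of roughly 150 MiB is easy to miss next to 9.6 GB of activations, gradients, and cached allocator memory; comparing `torch.cuda.max_memory_allocated()` between the two runs gives a more precise picture than nvidia-smi.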
