
Minimizing CPU RAM vs. only using GPU RAM #199

Closed
vince62s opened this issue Feb 6, 2024 · 1 comment
Labels
enhancement (New feature or request)

Comments


vince62s commented Feb 6, 2024

🚀 Feature

Load the model directly onto the GPU when one is available, instead of loading it onto the CPU first and then moving it to the GPU.

Motivation

I am trying to use comet-score with cometkiwi-xl on Colab.

Currently, the load_checkpoint method forces loading on torch.device("cpu"). On Colab Free there is only 12 GB of CPU RAM, so the XL model does not fit.
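For illustration, a minimal sketch of the requested behavior with a bare torch.load call (COMET's actual load_checkpoint wrapper does more than this, and the checkpoint path is a placeholder):

```python
import torch

# Pick the GPU when one is available instead of hard-coding the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# map_location sends checkpoint tensors to `device` as they are
# deserialized, rather than staging the whole model in CPU RAM.
checkpoint = torch.load("checkpoints/model.ckpt", map_location=device)
```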

Then I switched torch.device() to "cuda" in __init__.py, and now the model loads on the GPU fine.

BUT just before scoring starts, CPU RAM suddenly jumps to > 12 GB; I am not sure why.

Any clue?

vince62s added the enhancement (New feature or request) label Feb 6, 2024

vince62s commented Feb 6, 2024

Usually, the way it should work is:

  1. build the model on the meta device (empty weights) so it takes zero RAM
  2. load the weights from the checkpoint directly onto the GPU (see the sketch below)

I am trying to amend the code, but no luck so far.
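A minimal sketch of that pattern, assuming PyTorch >= 2.1 and a generic nn.Module rather than COMET's actual classes ("checkpoint.pt" is a placeholder path):

```python
import torch
import torch.nn as nn

# Build on the meta device: parameters exist only as shapes/dtypes,
# so no CPU RAM is allocated for the weights.
with torch.device("meta"):
    model = nn.Linear(4096, 4096)  # stand-in for the real architecture

# map_location="cuda" moves each tensor to the GPU as it is deserialized,
# so the full state dict is never held in CPU RAM at once; assign=True
# (PyTorch >= 2.1) swaps the loaded tensors in for the meta parameters.
state_dict = torch.load("checkpoint.pt", map_location="cuda")
model.load_state_dict(state_dict, assign=True)
```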

vince62s closed this as completed Jun 6, 2024