
Using 8-bit optims #1

Open
alceballosa opened this issue Mar 23, 2024 · 2 comments

Comments

@alceballosa

Hi!

Thanks for making this happen, it's a super useful resource!

I was wondering whether there is any reason to use bnb's 8-bit optimizers when doing QLoRA fine-tuning. Or is it better to just use the vanilla optimizers from PyTorch?

Best,

Alberto

@michaelnny
Owner

Hi Alberto,

There's no specific reason to use the 8-bit optimizer from bnb; we didn't run tests to compare it with the vanilla optimizer from PyTorch.

My guess is that when fine-tuning very large models like 70B or bigger, the 8-bit optimizer could potentially save more GPU RAM, but I'm not sure since I haven't tried it due to limited GPU resources.
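
For reference, here's a rough sketch of the two options being discussed (not something we benchmarked in this repo; the placeholder model and learning rate are just illustrative):

```python
import torch
import bitsandbytes as bnb  # assumes bitsandbytes is installed

# Placeholder standing in for a QLoRA-prepared model whose only trainable
# parameters are the LoRA adapter weights.
model = torch.nn.Linear(16, 16)
trainable_params = [p for p in model.parameters() if p.requires_grad]

# Option 1: vanilla PyTorch AdamW.
optimizer = torch.optim.AdamW(trainable_params, lr=2e-4)

# Option 2: bitsandbytes 8-bit AdamW. It keeps the optimizer state
# (exp_avg, exp_avg_sq) in 8-bit, so the memory saving scales with the
# number of trainable parameters; with LoRA-only training that state is
# relatively small either way.
optimizer = bnb.optim.AdamW8bit(trainable_params, lr=2e-4)
```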

@alceballosa
Author

Hi Michael!

Thanks for the reply. In the end I just used the standard AdamW and didn't see much difference at the scale I'm training at.

Best,

Alberto
