
Using 8-bit optims #1

Open
alceballosa opened this issue Mar 23, 2024 · 2 comments

Comments

@alceballosa

Hi!

Thanks for making this happen, it's a super useful resource!

I was wondering whether there is any reason to use bnb's 8-bit optimizers when doing QLoRA fine-tuning. Or is it better to just use the vanilla optimizers from PyTorch?

Best,

Alberto

@michaelnny
Owner

Hi Alberto,

There's no specific reason to use the 8-bit optimizer from bnb; we didn't run tests to compare it with the vanilla optimizer from PyTorch.

My guess is that when fine-tuning very large models like 70B or bigger, the 8-bit optimizer could potentially save more GPU RAM, but I'm not sure since I haven't tried it due to limited GPU resources.
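
For reference, here's a rough sketch of the two options being discussed (not something we benchmarked in this repo; the placeholder model and learning rate are just illustrative):

```python
import torch
import bitsandbytes as bnb  # assumes bitsandbytes is installed

# Placeholder standing in for a QLoRA-prepared model whose only trainable
# parameters are the LoRA adapter weights.
model = torch.nn.Linear(16, 16)
trainable_params = [p for p in model.parameters() if p.requires_grad]

# Option 1: vanilla PyTorch AdamW.
optimizer = torch.optim.AdamW(trainable_params, lr=2e-4)

# Option 2: bitsandbytes 8-bit AdamW. It keeps the optimizer state
# (exp_avg, exp_avg_sq) in 8-bit, so the memory saving scales with the
# number of trainable parameters; with LoRA-only training that state is
# relatively small either way.
optimizer = bnb.optim.AdamW8bit(trainable_params, lr=2e-4)
```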

@alceballosa
Author

Hi Michael!

Thanks for the reply. In the end I just used the standard AdamW and didn't see much difference at the scale I'm training at.

Best,

Alberto
