
Evaluate 8 bit optimisers #95

Open
Abe404 opened this issue Mar 29, 2023 · 0 comments
Abe404 (Owner) commented Mar 29, 2023

If I can get RootPainter working with an 8-bit optimiser, it could reduce memory requirements and speed up training.

See https://arxiv.org/pdf/2303.10181.pdf, whose authors state: "The use of 8-bit optimiser reduces the GPU memory utilised and the convergence time. The more interesting observation is that in almost all cases (except ViT), it also converges to a better solution, yielding a small performance improvement."

One concern is training stability. They also note: "One reason for the degradation in performance in transformers when using the 8-bit optimiser could be instability during training."

See also:
Dettmers, T., Lewis, M., Shleifer, S., Zettlemoyer, L.: 8-bit optimizers via block-wise quantization. In: International Conference on Learning Representations (2022), https://openreview.net/forum?id=shpkpVXzo3h

https://github.com/TimDettmers/bitsandbytes
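For intuition about where the memory savings come from: the Dettmers et al. paper stores optimizer state (e.g. Adam's moment estimates) as int8 codes with one absmax scale per block, rather than as 32-bit floats. Below is a minimal pure-Python sketch of that block-wise absmax quantization idea, just to illustrate the mechanism; it is not the bitsandbytes implementation (which uses a non-linear quantization map and runs on GPU), and the block size and helper names here are made up for the example.

```python
# Sketch of block-wise absmax quantization, the core idea behind 8-bit
# optimizer state in Dettmers et al. (2022). Illustrative only; the real
# implementation lives in the bitsandbytes library.

def quantize_blockwise(values, block_size=4096):
    """Map floats to int8 codes, with one absmax scale stored per block."""
    scales, quantized = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # guard against all-zero block
        scales.append(absmax)
        # Scale each value into [-1, 1] by the block absmax, then to [-127, 127].
        quantized.extend(round(v / absmax * 127) for v in block)
    return quantized, scales

def dequantize_blockwise(quantized, scales, block_size=4096):
    """Recover approximate floats from int8 codes and per-block scales."""
    return [q / 127 * scales[i // block_size] for i, q in enumerate(quantized)]

# Example: 6 optimizer-state values stored in two blocks of 3.
state = [0.001, -0.5, 2.0, 0.25, -1.0, 0.0001]
codes, scales = quantize_blockwise(state, block_size=3)
approx = dequantize_blockwise(codes, scales, block_size=3)
max_err = max(abs(a - b) for a, b in zip(state, approx))
print(max_err < 0.02)  # reconstruction error stays small within each block
```

Per-block scales are what keep the quantization error small: an outlier in one block cannot crush the precision of values in other blocks, which is the paper's argument for why 8-bit state can match 32-bit training quality.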
