The bitsandbytes library is a lightweight Python wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and 8-bit & 4-bit quantization functions.

The library includes quantization primitives for 8-bit & 4-bit operations through `bitsandbytes.nn.Linear8bitLt` and `bitsandbytes.nn.Linear4bit`, and 8-bit optimizers through the `bitsandbytes.optim` module.
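As a quick illustration of these entry points, here is a minimal usage sketch (layer sizes, dtypes, and the learning rate are illustrative, and a CUDA device is assumed):

```python
import torch
import bitsandbytes as bnb

# Quantized linear layers as drop-in replacements for torch.nn.Linear;
# weights are quantized when the module is moved to the GPU.
int8_layer = bnb.nn.Linear8bitLt(1024, 1024, has_fp16_weights=False).cuda()
fp4_layer = bnb.nn.Linear4bit(1024, 1024, compute_dtype=torch.float16).cuda()

x = torch.randn(8, 1024, dtype=torch.float16, device="cuda")
y = int8_layer(x)  # mixed-precision int8 matmul via LLM.int8()
z = fp4_layer(x)   # 4-bit quantized weights (FP4 by default)

# An 8-bit optimizer, used like its torch.optim counterpart.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)
```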
There are ongoing efforts to support further hardware backends, i.e. Intel CPU + GPU, AMD GPU, and Apple Silicon. Windows support is also well underway.
Please head to the official documentation page: https://huggingface.co/docs/bitsandbytes/main
bitsandbytes multi-backend ALPHA release is out!

Big news! After months of hard work and incredible community contributions, we're thrilled to announce the bitsandbytes multi-backend ALPHA release!
Now supporting:
- AMD GPUs (ROCm)
- Intel CPUs & GPUs
We'd love your early feedback!
Instructions for your preferred backend here
We're super excited about these recent developments and grateful for any constructive input or support you can give to help us make this a reality (e.g. helping with the upcoming Apple Silicon backend or reporting bugs). BNB is a community project, and we look forward to your collaboration!
bitsandbytes is MIT licensed.
We thank Fabio Cannizzo for his work on FastBinarySearch, which we use for CPU quantization.