bitsandbytes

The bitsandbytes library is a lightweight Python wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and 8 & 4-bit quantization functions.

The library includes quantization primitives for 8-bit & 4-bit operations through bitsandbytes.nn.Linear8bitLt and bitsandbytes.nn.Linear4bit, and 8-bit optimizers through the bitsandbytes.optim module.
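The idea behind the 8-bit primitives is absmax quantization: scale a tensor by its absolute maximum so values fit in the int8 range, and keep the scale to dequantize later. A minimal pure-Python sketch of that idea (illustrative only — not the library's CUDA kernel, and the helper names are ours):

```python
# Sketch of absmax 8-bit quantization: scale floats into [-127, 127]
# and store the scale factor for dequantization.

def quantize_absmax(values):
    """Map floats to int8 codes in [-127, 127] using the absolute maximum."""
    scale = max(abs(v) for v in values) / 127.0
    if scale == 0.0:
        return [0] * len(values), 0.0
    return [round(v / scale) for v in values], scale

def dequantize_absmax(qvalues, scale):
    """Recover approximate floats from int8 codes and the stored scale."""
    return [q * scale for q in qvalues]

weights = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_absmax(weights)   # q = [50, -127, 2, 100]
approx = dequantize_absmax(q, s)  # close to the original weights
```

The real library applies this per block (4-bit) or per row/column with outlier handling (LLM.int8()), but the quantize/dequantize round trip is the same principle.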

There are ongoing efforts to support further hardware backends, namely Intel CPUs and GPUs, AMD GPUs, and Apple Silicon. Windows support is also well underway.

Please head to the official documentation page:

https://huggingface.co/docs/bitsandbytes/main

๐—ฏ๐—ถ๐˜๐˜€๐—ฎ๐—ป๐—ฑ๐—ฏ๐˜†๐˜๐—ฒ๐˜€ ๐—บ๐˜‚๐—น๐˜๐—ถ-๐—ฏ๐—ฎ๐—ฐ๐—ธ๐—ฒ๐—ป๐—ฑ ๐™–๐™ก๐™ฅ๐™๐™– ๐—ฟ๐—ฒ๐—น๐—ฒ๐—ฎ๐˜€๐—ฒ is out!

๐Ÿš€ Big news! After months of hard work and incredible community contributions, we're thrilled to announce the ๐—ฏ๐—ถ๐˜๐˜€๐—ฎ๐—ป๐—ฑ๐—ฏ๐˜†๐˜๐—ฒ๐˜€ ๐—บ๐˜‚๐—น๐˜๐—ถ-๐—ฏ๐—ฎ๐—ฐ๐—ธ๐—ฒ๐—ป๐—ฑ ๐™–๐™ก๐™ฅ๐™๐™– ๐—ฟ๐—ฒ๐—น๐—ฒ๐—ฎ๐˜€๐—ฒ! ๐Ÿ’ฅ

Now supporting:

  • ๐Ÿ”ฅ ๐—”๐— ๐—— ๐—š๐—ฃ๐—จ๐˜€ (ROCm)
  • โšก ๐—œ๐—ป๐˜๐—ฒ๐—น ๐—–๐—ฃ๐—จ๐˜€ & ๐—š๐—ฃ๐—จ๐˜€

Weโ€™d love your early feedback! ๐Ÿ™

๐Ÿ‘‰ Instructions for your ๐š™๐š’๐š™ ๐š’๐š—๐šœ๐š๐šŠ๐š•๐š• here

We're super excited about these recent developments and grateful for any constructive input or support that you can give to help us make this a reality (e.g. helping us with the upcoming Apple Silicon backend or reporting bugs). BNB is a community project and we're excited for your collaboration ๐Ÿค—

License

bitsandbytes is MIT licensed.

We thank Fabio Cannizzo for his work on FastBinarySearch which we use for CPU quantization.
