This is an unofficial implementation of BitNet.
Only the BitNetLinear layer is implemented; a minimal sketch of the idea follows the paper list below.
- BitNet: Scaling 1-bit Transformers for Large Language Models
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
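The core of BitNet b1.58 is its linear layer: weights are quantized to the ternary values {-1, 0, +1} with an absmean scale, activations are quantized to 8 bits with an absmax scale, and a straight-through estimator keeps the backward pass in full precision. Below is a minimal Torch sketch of that idea; the class and variable names are illustrative assumptions, not necessarily this repository's API, and the paper's normalization step is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BitLinear(nn.Linear):
    """Minimal BitNet b1.58 linear layer sketch (illustrative, not this repo's exact API)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Absmean weight quantization to {-1, 0, +1} (ternary, ~1.58 bits per weight).
        w = self.weight
        scale_w = w.abs().mean().clamp(min=1e-5)
        w_q = (w / scale_w).round().clamp(-1, 1) * scale_w
        # Straight-through estimator: quantized forward, full-precision gradient.
        w_q = w + (w_q - w).detach()

        # Absmax activation quantization to 8 bits, also with a straight-through estimator.
        scale_x = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
        x_q = (x * scale_x).round().clamp(-128, 127) / scale_x
        x_q = x + (x_q - x).detach()

        return F.linear(x_q, w_q, self.bias)
```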
I have also posted an article about this on Qiita (in Japanese):
I implemented BitNet b1.58 (BitLinear) and tested it on MNIST (Tensorflow/Torch)
Execution examples
# Tensorflow
> python examples/mnist_tf.py
# Torch
> python examples/mnist_torch.py
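For orientation, here is a hypothetical snippet in the spirit of the Torch MNIST example, dropping the BitLinear sketch above into a small classifier; the actual example scripts may be structured differently.

```python
import torch.nn as nn

# Hypothetical MNIST classifier; assumes the BitLinear sketch above is in scope.
model = nn.Sequential(
    nn.Flatten(),             # 28x28 images -> 784-dim vectors
    BitLinear(28 * 28, 256),
    nn.ReLU(),
    BitLinear(256, 10),       # 10 digit classes
)
```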
Tested versions:
- Python: 3.12.2
- Tensorflow: 2.16.1
- Torch: 2.2.1