BiPer

Binary Neural Networks using a Periodic Function
Paper · Supplementary Material · Video

News and Updates 🗞️

April 1, 2024

  • Our preprint is available on arXiv.

February 27, 2024

  • Our paper has been accepted to CVPR 2024!

Table of Contents
  1. Abstract
  2. Getting Started
  3. CIFAR10
  4. ImageNet
  5. How to Cite
  6. License
  7. Contact
  8. Acknowledgments

Abstract

Quantized neural networks employ reduced-precision representations for both weights and activations. This quantization process significantly reduces the memory requirements and computational complexity of the network. Binary Neural Networks (BNNs) are the extreme quantization case, representing values with just one bit. Since the sign function is typically used to map real values to binary values, smooth approximations are introduced to mimic its gradient during error backpropagation. The resulting mismatch between the forward and backward models corrupts the direction of the gradient, causing training inconsistency problems and performance degradation. In contrast to current BNN approaches, we propose to employ a binary periodic (BiPer) function during binarization. Specifically, we use a square wave in the forward pass to obtain the binary values, and the trigonometric sine function with the same period as the square wave as a differentiable surrogate in the backward pass. We demonstrate that this approach can control the quantization error via the frequency of the periodic function and improves network performance. Extensive experiments validate the effectiveness of BiPer on benchmark datasets and network architectures, with improvements of up to 1% and 0.69% over state-of-the-art methods for classification on CIFAR-10 and ImageNet, respectively.
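
The scheme above can be sketched in a few lines of PyTorch. This is a minimal illustration of the forward/backward construction described in the abstract, not the repository's actual implementation; the class name BiPerBinarize is ours, and omega denotes the frequency of the periodic function (the --freq flag used in the training commands below).

import torch

class BiPerBinarize(torch.autograd.Function):
    # Forward pass: the square wave sign(sin(omega * w)) yields the binary values.
    @staticmethod
    def forward(ctx, w, omega):
        ctx.save_for_backward(w)
        ctx.omega = omega
        return torch.sign(torch.sin(omega * w))

    # Backward pass: differentiate the smooth surrogate sin(omega * w)
    # with the same period, i.e. grad_w = omega * cos(omega * w).
    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        return grad_output * ctx.omega * torch.cos(ctx.omega * w), None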

(back to top)

Getting Started

Clone this repository

Clone our repo to your local machine using the following command:

git clone https://github.com/edmav4/BiPer.git
cd BiPer

Prerequisites

Create a new conda environment using the provided environment.yml file.

conda env create --prefix ./venv -f environment.yml
conda activate ./venv

Download Datasets

BiPer was trained on the CIFAR-10 and ImageNet datasets. You can download them with the following commands:

  • CIFAR-10 (a torchvision-based alternative is sketched after this list)

    python cifar10/dataset/download.py --dataset cifar10 --data_path cifar10/data/CIFAR10
  • ImageNet

    See ImageNet for more details.
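
If the helper script is unavailable, CIFAR-10 can also be fetched directly with torchvision. This is an equivalent sketch, not the contents of cifar10/dataset/download.py:

from torchvision.datasets import CIFAR10

# Download the train and test splits into the path the training scripts expect.
CIFAR10(root="cifar10/data/CIFAR10", train=True, download=True)
CIFAR10(root="cifar10/data/CIFAR10", train=False, download=True)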

(back to top)

CIFAR10

Training

Our approach uses a two-stage training strategy that splits the problem into two subproblems: feature binarization and weight binarization. In the first stage, the network is trained with real-valued weights and binary features. In the second stage, the weights are warm-initialized from the binary representation of the first-stage weights, and the model is then fully trained with binary weights (a sketch of this warm start follows).
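
The warm start can be pictured as follows. This is a hedged sketch under our own assumptions (the checkpoint key names, the rule for selecting binarized layers, and the helper name warm_init are all illustrative); in practice, stage 2 handles this through the --load_ckpt_stage1 flag shown below.

import torch

def warm_init(model, ckpt_path, omega=20.0):
    # Load the stage-1 checkpoint (real-valued weights, binary features).
    state = torch.load(ckpt_path, map_location="cpu")["state_dict"]  # key name assumed
    for name, w in state.items():
        # Assumption: the binarized layers are the 4-D convolution weights.
        if name.endswith(".weight") and w.dim() == 4:
            state[name] = torch.sign(torch.sin(omega * w))  # binary representation
    model.load_state_dict(state, strict=False)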

Stage 1

To train Stage 1, use a command like the following:

# Example for BiPer-ResNet18 model
python -u main.py \
--gpus 0 \
--model resnet18_1w1a \
--results_dir ./result/stage1 \
--dataset cifar10 \
--epochs 600 \
--lr 0.021 \
-b 256 \
-bt 128 \
--lr_type cos \
--warm_up \
--weight_decay 0.0016 \
--tau 0.037 \
--freq 20

See this example in run_stage1.sh, and run it with bash run_stage1.sh.

Stage 2

After training the first stage, you can train the second stage using the following command:

# Example for BiPer-ResNet18 model
python -u main_stage2.py \
--gpus 0 \
--model resnet18_1w1a \
--results_dir ./result/stage2 \
--dataset cifar10 \
--epochs 300 \
--lr 0.0037 \
-b 256 \
-bt 128 \
--lr_type cos \
--warm_up \
--weight_decay 0.00016 \
--tau 0.0468 \
--load_ckpt_stage1 ./result/stage1/model_best.pth.tar

Note that --load_ckpt_stage1 should be specified to load the pretrained model from the first stage. See this example in run_stage2.sh, and run it with bash run_stage2.sh.

Evaluation

To evaluate a pretrained model, you can use the following command:

# see eval.sh
python main_stage2.py \
--gpus 0 \
-e {checkpoint_path} \
--model {model arch} \
--dataset cifar10 \
-bt 128 \

For example, using the pretrained BiPer-ResNet18 model:

# example ResNet18
python main_stage2.py \
--gpus 0 \
-e ./pretrained_models/biper_cifar10_resnet18_stage2/model_best.pth.tar \
--model resnet18_1w1a \
--dataset cifar10 \
-bt 128

(back to top)

Results

Quantization Error

To compute the quantization error, you can use the following command:

python compute_QE.py

Please specify the model and data path in the script.
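
Conceptually, the quantization error is the gap between the real-valued weights and their binary counterparts, which BiPer controls through the frequency of the periodic function. A minimal sketch of this idea (our own formulation, not necessarily the exact metric in compute_QE.py):

import torch

def quantization_error(w, omega=20.0):
    # Binary weights from the square wave; report the mean squared gap.
    wb = torch.sign(torch.sin(omega * w))
    return torch.mean((w - wb) ** 2).item()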

Pretrained Models

Quantized Model    Dataset    Params (M)   Top-1 (%)   Config        Download
BiPer-ResNet18     CIFAR-10   11.01        93.75       Config File   Model | Log
BiPer-ResNet20     CIFAR-10   0.27         86.98       Config File   Model | Log
BiPer-VGG-Small    CIFAR-10   4.66         92.46       Config File   Model | Log

(back to top)

ImageNet

Training

The training process for ImageNet mirrors the CIFAR-10 pipeline above.

Stage 1

To train Stage 1, use a command like the following:

# example BiPer-ResNet18
python main.py \
--gpus 0,1,2,3 \
--model resnet18_1w1a \
--data_path data \
--dataset imagenet \
--epochs 200 \
--lr 0.1 \
--weight_decay 1e-4 \
-b 512 \
-bt 256 \
--lr_type cos \
--freq 20 \
--warm_up \
--tau_min 0.85  \
--tau_max 0.99  \
--print_freq 250 \
--use_dali

See this example in run_stage1.sh, and run it with bash run_stage1.sh.

Stage 2

After training the first stage, you can train the second stage with a command like the following:

python main_stage2.py \
--gpus 0 \
--model resnet18_1w1a \
--data_path data \
--dataset imagenet \
--epochs 100 \
--lr 0.01 \
-b 512 \
-bt 256 \
--lr_type cos \
--weight_decay 1e-4 \
--tau_min 0.0  \
--tau_max 0.0  \
--freq 20 \
--load_ckpt_stage1 ./result/stage1/model_best.pth.tar \
--use_dali
# --resume

See this example in run_stage2.sh, and run it with bash run_stage2.sh.

Evaluation

To evaluate a pretrained model, you can use the following command:

# see eval.sh
python main_stage2.py \
--gpus 0 \
-e {checkpoint_path} \
--model {model arch} \
--dataset imagenet \
-bt 256

For example, using the pretrained BiPer-ResNet18 model:

# example BiPer-ResNet18
python main_stage2.py \
--gpus 0 \
-e pretrained_models/biper_imagenet_resnet18_stage2/model_best.pth.tar \
--model resnet18_1w1a \
--dataset imagenet \
-bt 256

(back to top)

Results

Pretrained Models

Quantized Model   Dataset      Params (M)   Top-1 (%)   Config        Download
BiPer-ResNet18    ImageNet1K   11.69        61.40       Config File   Model | Log
BiPer-ResNet34    ImageNet1K   21.81        65.73       Config File   Model | Log

(back to top)

How to Cite

If you use the code or models from this project in your research, please cite our work as follows:

@article{vargas2024biper,
  title={BiPer: Binary Neural Networks using a Periodic Function},
  author={Vargas, Edwin and Correa, Claudia and Hinojosa, Carlos and Arguello, Henry},
  journal={arXiv preprint arXiv:2404.01278},
  year={2024}
}

(back to top)

License

BiPer is distributed under the MIT License. See LICENSE for more information.

(back to top)

Contact

(back to top)

Acknowledgments

  • Our code is based on the ReCU repository: https://github.com/z-hXu/ReCU. We thank the authors for making their code publicly available.
  • This work was supported by the Vicerrectoría de Investigación y Extensión of Universidad Industrial de Santander (UIS), Colombia, under the research project VIE-3735.

(back to top)
