On the Expressivity of Random Features in CNNs - TF 2.3 (Community) #9174

Closed · wants to merge 1 commit

Conversation

Vishal-V

Only BatchNorm

Paper

This repository is an unofficial implementation of the following paper:

  • Training BatchNorm and Only BatchNorm: On the Expressivity of Random Features in CNNs

Description/Abstract

Batch normalization (BatchNorm) has become an indispensable tool for training
deep neural networks, yet it is still poorly understood. Although previous work
has typically focused on studying its normalization component, BatchNorm also
adds two per-feature trainable parameters—a coefficient and a bias—whose role
and expressive power remain unclear. To study this question, we investigate the
performance achieved when training only these parameters and freezing all others
at their random initializations. We find that doing so leads to surprisingly high
performance. For example, sufficiently deep ResNets reach 82% (CIFAR-10) and
32% (ImageNet, top-5) accuracy in this configuration, far higher than when training
an equivalent number of randomly chosen parameters elsewhere in the network.
BatchNorm achieves this performance in part by naturally learning to disable
around a third of the random features. Not only do these results highlight the
under-appreciated role of the affine parameters in BatchNorm, but—in a broader
sense—they characterize the expressive power of neural networks constructed
simply by shifting and rescaling random features.
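The "only BatchNorm" setting described above can be sketched in Keras by freezing every layer except BatchNormalization, whose per-feature coefficient (gamma) and bias (beta) then remain the only trainable parameters. This is a minimal illustration under assumed names, not the PR's actual code; the helper `freeze_all_but_batchnorm` and the toy model are assumptions.

```python
import tensorflow as tf

def freeze_all_but_batchnorm(model):
    """Freeze all layers except BatchNormalization, leaving only the
    per-feature gamma (scale) and beta (shift) parameters trainable."""
    for layer in model.layers:
        layer.trainable = isinstance(layer, tf.keras.layers.BatchNormalization)
    return model

# Toy model: a randomly initialized conv stack with one BatchNorm layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(16, 3, padding="same"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])
freeze_all_but_batchnorm(model)
# Only gamma and beta remain trainable (2 variables here); the BatchNorm
# moving mean/variance stay non-trainable as usual.
```

Training such a model with a standard optimizer then updates only the shift-and-rescale parameters of the random features, which is exactly the configuration the abstract studies.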

Key Features

  • TensorFlow 2.3.0
  • Inference example (Colab Demo)
  • Graph mode training with model.fit
  • Functional model with tf.keras.layers
  • Input pipeline using tf.data and tfds
  • GPU accelerated
  • Fully integrated with absl-py from abseil.io
  • Clean implementation
  • Follows best practices
  • Apache 2.0 License
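As a sketch of the absl-py integration listed above: command-line hyperparameters are typically declared with `absl.flags` and the script entry point is launched through `absl.app`. The flag names below (num_blocks, batch_size) are illustrative assumptions, not necessarily the flags this PR defines.

```python
from absl import app, flags

FLAGS = flags.FLAGS
flags.DEFINE_integer("num_blocks", 2, "N in the ResNet depth formula 6N + 2.")
flags.DEFINE_integer("batch_size", 128, "Per-step training batch size.")

def main(argv):
    del argv  # Unused.
    print(f"Training ResNet-{6 * FLAGS.num_blocks + 2} "
          f"with batch size {FLAGS.batch_size}")

# In a script, this would be launched with: app.run(main)
```

With this pattern, a run like `python3 resnet_cifar.py --num_blocks=5` would select the ResNet-32 configuration.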

Requirements

  • TensorFlow 2.3
  • Python 3.7

To install requirements:

pip install -r requirements.txt

Results

Image Classification (Only BatchNorm weights)

| Model name        | Download   | Top-1 Accuracy |
|-------------------|------------|----------------|
| ResNet-14 (N=2)   | Checkpoint | 46.67%         |
| ResNet-32 (N=5)   | Checkpoint | 51.29%         |
| ResNet-56 (N=9)   | Checkpoint | 55.21%         |
| ResNet-110 (N=18) | Checkpoint | 65.19%         |
| ResNet-218 (N=36) | Checkpoint | 70.09%         |
| ResNet-434 (N=72) | Checkpoint | 73.67%         |
| ResNet-866 (N=144)| Checkpoint | 77.83%         |

Dataset

CIFAR10 dataset - 10 classes with 50,000 images in the train set and 10,000 images in the test set.
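A minimal sketch of a tf.data input pipeline of the kind the Key Features list describes (normalize, shuffle, batch, prefetch). To stay self-contained it runs on random tensors with CIFAR-10 shapes; the actual implementation would load the real data through tfds (e.g. tfds.load("cifar10")), and the helper name is an assumption.

```python
import tensorflow as tf

def make_pipeline(images, labels, batch_size=128, training=True):
    """Standard tf.data pipeline: scale to [0, 1], shuffle (train only),
    batch, and prefetch."""
    ds = tf.data.Dataset.from_tensor_slices((images, labels))
    ds = ds.map(lambda x, y: (tf.cast(x, tf.float32) / 255.0, y),
                num_parallel_calls=tf.data.experimental.AUTOTUNE)
    if training:
        ds = ds.shuffle(buffer_size=len(images))
    return ds.batch(batch_size).prefetch(tf.data.experimental.AUTOTUNE)

# Stand-in tensors shaped like CIFAR-10 (32x32 RGB images, labels 0-9).
images = tf.random.uniform([256, 32, 32, 3], maxval=256, dtype=tf.int32)
labels = tf.random.uniform([256], maxval=10, dtype=tf.int32)
train_ds = make_pipeline(images, labels, batch_size=128)
```

Swapping the stand-in tensors for the real train/test splits (50,000 and 10,000 images respectively) gives pipelines ready for graph-mode training with model.fit.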

Training

📝 Provide training information.

  • Provide details for preprocessing, hyperparameters, random seeds, and environment.
  • Provide a command line example for training.

Run the following command to train:

python3 resnet_cifar.py

This trains the OnlyBN model with the ResNet-14 architecture (N=2). Set num_blocks to the value of N from the results table above to train the corresponding ResNet variant.
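The N values in the results table follow the standard CIFAR ResNet depth formula, depth = 6N + 2: three stages of N residual blocks, two conv layers per block, plus the stem convolution and the final classifier. A quick check against the table:

```python
def resnet_depth(num_blocks):
    """CIFAR-style ResNet depth: 3 stages x num_blocks blocks x 2 conv
    layers per block, plus the stem conv and the final dense layer."""
    return 6 * num_blocks + 2

for n in (2, 5, 9, 18, 36, 72, 144):
    print(f"N={n} -> ResNet-{resnet_depth(n)}")
# N=2 -> ResNet-14, N=5 -> ResNet-32, ..., N=144 -> ResNet-866
```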

Evaluation

Run the following command to evaluate a trained model:

python3 ...

References

📝 Provide links to references.

Citation

📝 Make your repository citable.

If you want to cite this repository in your research paper, please use the following information.

Authors or Maintainers

  • Vishal-V

License

This project is licensed under the terms of the Apache License 2.0.

@Vishal-V Vishal-V requested a review from a team as a code owner August 31, 2020 17:45
@review-notebook-app
Check out this pull request on ReviewNB to review the Jupyter notebook visual diffs and provide feedback.

@jaeyounkim
Collaborator

The community directory is only for providing a curated list of community models.

@jaeyounkim jaeyounkim closed this Oct 2, 2020