# Parameter Efficiency Leaderboard

## MNIST, 99% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 910 | ConvNet with customized Hough voting layer | code | Satoshi Tanaka |
| 1,398 | Three-layer Sharpened Cosine Similarity with paired depthwise and pointwise operations | code | Raphael Pisoni |
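Several entries on this leaderboard use Sharpened Cosine Similarity (SCS). As a rough orientation, here is a minimal scalar sketch of the operation, based on Rohrer's public write-ups rather than code from this repo; the sharpening exponent `p` and the small stabilizer `q` are typically learned or tuned in practice:

```python
import math

def scs(x, w, p=2.0, q=1e-6):
    """Sharpened cosine similarity between an input patch x and a kernel w.

    Computes cosine similarity, then sharpens it by raising its magnitude
    to the power p while keeping the sign. q keeps the division stable
    when either vector is near zero.
    """
    dot = sum(a * b for a, b in zip(x, w))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_w = math.sqrt(sum(b * b for b in w))
    cos = dot / ((norm_x + q) * (norm_w + q))
    return math.copysign(abs(cos) ** p, cos)

# A patch aligned with the kernel scores near 1; an orthogonal patch scores 0.
print(scs([1.0, 0.0], [2.0, 0.0]))  # close to 1.0
print(scs([1.0, 0.0], [0.0, 2.0]))  # 0.0
```

Because the score depends only on the angle between patch and kernel, not their magnitudes, each kernel acts as a feature detector that is insensitive to local contrast, which is part of why SCS layers can get by with so few parameters.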

## Fashion MNIST, 90% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 1,890 | ConvNet with customized Hough voting layer | code | Satoshi Tanaka |
| 2,764 | Four-layer SCS (layers include depthwise and pointwise operations) with 20 kernels per layer | code | Brandon Rohrer |
| 7,156 | WaveMix Lite-8/5 | code, paper | Pranav Jeevan P, Amit Sethi |

## Fashion MNIST, 95% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|

## CIFAR-10, 80% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 25,214 | Three-layer Sharpened Cosine Similarity with Mixer layer (paired depthwise and pointwise layers) | code | Brandon Rohrer |
| 37,058 | WaveMix Lite-32/7 (replaced DeConv with Upsample) | code, paper | Pranav Jeevan P, Amit Sethi |
| 37,086 | Three-layer Sharpened Cosine Similarity with 56 kernels in each layer | code | Brandon Rohrer |
| 45,962 | WaveMix Lite-32/4 (ff=16, mult=1, dropout=0.25) | code, paper | Pranav Jeevan P, Amit Sethi |
| 47,643 | Three-layer Sharpened Cosine Similarity with 30 5x5 kernels in each layer | code | Brandon Rohrer |

## CIFAR-10, 90% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 103,000 | ConvMixer-128/4, achieved 91.26% | paper, code | Asher Trockman, J. Zico Kolter |
| 520,106 | WaveMix Lite-64/6 | code, paper | Pranav Jeevan P, Amit Sethi |
| 639,702 | kEffNet-B0, an EfficientNet with paired pointwise convolutions, achieved 91.64% | paper | Joao Paulo Schwarz Schuler, Santiago Romani, Mohamed Abdel-Nasser, Hatem Rashwan, Domenec Puig |
| 1.2M | SCS-based network, achieved 91.3% | code | Håkon Hukkelås |

## CIFAR-10, 95% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 594,000 | ConvMixer-256/8 | paper, code | Asher Trockman, J. Zico Kolter |

## ImageNet top-1, 80% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 21.1M | ConvMixer-768/32 | paper, code | Asher Trockman, J. Zico Kolter |

## ImageNet top-1, 90% accuracy

| Parameters | Model | Links | Authors |
|---|---|---|---|
| 390M | EfficientNet-B6-Wide with Meta Pseudo Labels, trained with 300M unlabeled images from JFT | paper, code | Hieu Pham, Zihang Dai, Qizhe Xie, Quoc V. Le |

## Why parameter efficiency?

A model's performance has many dimensions, and parameter efficiency is one that often gets overlooked. If two models have similar accuracy but one has fewer parameters, it will probably be cheaper to store, run, distribute, and maintain. Some model families are inherently more parameter efficient than others, but those differences aren't showcased in accuracy leaderboards. This is a chance for parameter-efficient architectures to get their time in the spotlight.
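The ranking metric here is just the raw trainable-parameter count. As a hedged illustration of why "paired depthwise and pointwise" designs appear so often in the tables, this sketch (layer sizes are made up, not taken from any listed model) compares the parameter count of a standard convolution against a depthwise/pointwise pair:

```python
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Trainable parameters in a standard k x k 2D convolution."""
    return out_ch * in_ch * k * k + (out_ch if bias else 0)

def depthwise_separable_params(in_ch, out_ch, k, bias=True):
    """Depthwise k x k convolution followed by a pointwise 1x1 convolution."""
    depthwise = in_ch * k * k + (in_ch if bias else 0)
    pointwise = conv2d_params(in_ch, out_ch, 1, bias=bias)
    return depthwise + pointwise

# Hypothetical layer: 32 input channels, 64 output channels, 3x3 kernels.
standard = conv2d_params(32, 64, 3)                 # 32*64*9 + 64 = 18,496
separable = depthwise_separable_params(32, 64, 3)   # (32*9 + 32) + (32*64 + 64) = 2,432
print(standard, separable)
```

For this made-up layer the separable pair uses roughly an eighth of the parameters, which is the kind of saving that adds up across a whole network on a leaderboard like this one.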

## Isn't this just a cherry-picked metric that sharpened cosine similarity does well on?

Yes.

## About

The most parameter-efficient machine learning models on a few popular benchmarks.
