Parameters | Model | Links | Authors |
---|---|---|---|
910 | ConvNet with customized Hough voting layer | code | Satoshi Tanaka |
1,398 | Three-layer Sharpened Cosine Similarity with paired depthwise and pointwise operations | code | Raphael Pisoni |
Parameters | Model | Links | Authors |
---|---|---|---|
1,890 | ConvNet with customized Hough voting layer | code | Satoshi Tanaka |
2,764 | Four-layer SCS (layers include depthwise and pointwise operations) with 20 kernels per layer | code | Brandon Rohrer |
7,156 | WaveMix Lite-8/5 | code, paper | Pranav Jeevan P, Amit Sethi |
Parameters | Model | Links | Authors |
---|---|---|---|
25,214 | Three-layer Sharpened Cosine Similarity with Mixer layer (paired depthwise and pointwise layers) | code | Brandon Rohrer |
37,058 | WaveMix Lite-32/7 (replaced DeConv with Upsample) | code, paper | Pranav Jeevan P, Amit Sethi |
37,086 | Three-layer Sharpened Cosine Similarity with 56 kernels in each layer | code | Brandon Rohrer |
45,962 | WaveMix Lite-32/4 (ff=16, mult=1, dropout=0.25) | code, paper | Pranav Jeevan P, Amit Sethi |
47,643 | Three-layer Sharpened Cosine Similarity with 30 5x5 kernels in each layer | code | Brandon Rohrer |
Parameters | Model | Links | Authors |
---|---|---|---|
103,000 | ConvMixer-128/4 (91.26% accuracy) | paper, code | Asher Trockman, J. Zico Kolter |
520,106 | WaveMix Lite-64/6 | code, paper | Pranav Jeevan P, Amit Sethi |
639,702 | kEffNet-B0, an EfficientNet with paired pointwise convolutions (91.64% accuracy) | paper | Joao Paulo Schwarz Schuler, Santiago Romani, Mohamed Abdel-Nasser, Hatem Rashwan, Domenec Puig |
1.2M | SCS-based network (91.3% accuracy) | code | Håkon Hukkelås |
Parameters | Model | Links | Authors |
---|---|---|---|
594,000 | ConvMixer-256/8 | paper, code | Asher Trockman, J. Zico Kolter |
Parameters | Model | Links | Authors |
---|---|---|---|
21.1M | ConvMixer-768/32 | paper, code | Asher Trockman, J. Zico Kolter |
Parameters | Model | Links | Authors |
---|---|---|---|
390M | EfficientNet-B6-Wide trained with Meta Pseudo Labels on 300M unlabeled images from JFT | paper, code | Hieu Pham, Zihang Dai, Qizhe Xie, Quoc V. Le |
A model's performance has many dimensions, and parameter efficiency is one that often gets overlooked. If two models have similar accuracy but one has fewer parameters, the smaller one will probably be cheaper to store, run, distribute, and maintain. Some model families are inherently more parameter efficient than others, but accuracy leaderboards don't showcase those differences. This is a chance for parameter-efficient architectures to get their time in the spotlight.
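To make the Parameters column above concrete, here is a minimal sketch of how a model's parameter count is typically tallied by hand: each standard 2-D convolution contributes `out_ch * in_ch * k * k` weights plus `out_ch` biases, and each linear layer contributes `out_f * in_f` weights plus `out_f` biases. The three-layer network below is a hypothetical example, not one of the leaderboard entries.

```python
# Parameter counting for standard layers (assumes bias terms are included).
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * k * k + (out_ch if bias else 0)

def linear_params(in_f, out_f, bias=True):
    """Weights (out_f * in_f) plus one bias per output feature."""
    return out_f * in_f + (out_f if bias else 0)

# Hypothetical tiny CIFAR-10 net: 3x3 conv 3->8, 3x3 conv 8->16,
# global average pooling (no parameters), then linear 16->10.
total = (conv2d_params(3, 8, 3)    # 8*3*9 + 8  = 224
         + conv2d_params(8, 16, 3) # 16*8*9 + 16 = 1,168
         + linear_params(16, 10))  # 10*16 + 10  = 170
print(total)  # 1562
```

In a framework like PyTorch the same number falls out of summing the element counts of every parameter tensor, which is how submissions are usually verified.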