Synchronization-aware NAS for an Efficient Collaborative Inference on Mobile Platforms

(NOTICE) Our paper has been accepted at LCTES 2023! [Paper]

Evaluation of Pre-trained Versions of Our Models

Requirements

Install the Python libraries:
```
pip3 install -r requirements.txt
```
Set the environment variable PYTHONPATH to the repository root:
```
export PYTHONPATH=/path/to/SyncNAS
```
Prepare the ImageNet 2012 dataset (used for validation).

How to use

To load a PyTorch model:

```
# Run from the repository root (~/SyncNAS/)
# Note: load_params is not imported in this snippet; it must already be in scope.
from torch_modules import TorchBranchedModel

# base_model: mobilenet_v2
model = TorchBranchedModel('model_configs/syncnas_mobilenet_v2_100.json')
model.load_state_dict(load_params('pretrained/syncnas_mobilenet_v2_100.pth'))
```

To evaluate on ImageNet:

```
python3 eval.py --base_model mobilenet_v2 --path 'your/path/to/imagenet'
# --base_model: base model that is being adapted -> available: ['mobilenet_v2', 'mnasnet_b1', 'fbnet_c']
# -j: number of workers (default: 4)
# -b: batch_size (default: 128)
```
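
The options listed above could be declared with `argparse` roughly as follows. This is only a sketch of the documented interface, not the contents of `eval.py`: the long names `--workers` and `--batch_size`, the `required=True` settings, and the `build_parser` helper are assumptions made for illustration.

```python
import argparse

# Base models listed as available in the usage notes above.
BASE_MODELS = ["mobilenet_v2", "mnasnet_b1", "fbnet_c"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Evaluate a pre-trained model on ImageNet")
    parser.add_argument("--base_model", choices=BASE_MODELS, required=True,
                        help="base model that is being adapted")
    parser.add_argument("--path", required=True, help="path to the ImageNet dataset")
    # Only -j and -b are documented; the long option names are assumptions.
    parser.add_argument("-j", "--workers", type=int, default=4, help="number of workers")
    parser.add_argument("-b", "--batch_size", type=int, default=128, help="batch size")
    return parser


args = build_parser().parse_args(
    ["--base_model", "mobilenet_v2", "--path", "your/path/to/imagenet"]
)
print(args.base_model, args.workers, args.batch_size)
```

Restricting `--base_model` with `choices` makes an unsupported model name fail at parse time rather than during model construction.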

Monte Carlo Tree Search

Algorithm 1 in our paper corresponds to SyncNAS/src/local_worker.py
Algorithm 2 in our paper corresponds to SyncNAS/src/mcts.py

Appendix

A. Lightweight Model Design Trend

Recent lightweight CNNs consist of multiple inverted residual (MBConv) blocks, following the design convention inspired by MobileNetV2 [1], which is favored for its computational efficiency.
- Stages: The conventional unit into which a number of MBConv blocks are grouped together
- Exp: The expanded channel size of each MBConv layer
- Out: The output channel size of each MBConv layer
- Stride: The stride size of the depthwise convolution in each MBConv layer
- Kernel: The kernel size of the depthwise convolution in each MBConv layer
Note that a $*$ mark indicates that squeeze-and-excitation is applied.
We omit other details, such as nonlinearities, to highlight the general structure.

* Each bracket in a stage specifies an MBConv block in the format (Exp-Out,Stride,Kernel)

| Model | Stem (Out) | Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 | Stage 6 | Stage 7 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNetV2 [1] | (32) | (32-16,1,3) | (96-24,2,3) (144-24,1,3) | (144-32,2,3) (192-32,1,3) (192-32,1,3) | (192-64,2,3) (384-64,1,3) (384-64,1,3) (384-64,1,3) | (384-96,1,3) (576-96,1,3) (576-96,1,3) | (576-160,2,3) (960-160,1,3) (960-160,1,3) | (960-320,1,3) |
| MnasNet-A1 [2] | (32) | (32-16,1,3) | (96-24,2,3) (144-24,1,3) | (72-40,2,5) (120-40,1,5) (120-40,1,5) | (240-80,2,3) (480-80,1,3) (480-80,1,3) (480-80,1,3) | (480-112,1,3) (672-112,1,3) | (672-160,2,5) (960-160,1,5) (960-160,1,5) | (960-320,1,3) |
| MnasNet-B1 [2] | (32) | (32-16,1,3) | (48-24,2,3) (72-24,1,3) (72-24,1,3) | (72-40,2,5) (120-40,1,5) (120-40,1,5) | (240-80,2,5) (480-80,1,5) (480-80,1,5) | (480-96,1,3) (576-96,1,3) | (576-192,2,5) (1152-192,1,5) (1152-192,1,5) (1152-192,1,5) | (1152-320,1,3) |
| FBNet-B [3] | (16) | (16-16,1,3) | (96-24,2,3) (24-24,1,5) (24-24,1,3) (24-24,1,3) | (144-32,2,5) (96-32,1,5) (192-32,1,3) (192-32,1,5) | (192-64,2,5) (64-64,1,5) (192-64,1,5) | (384-112,1,5) (112-112,1,3) (112-112,1,5) (336-112,1,5) | (672-184,2,5) (184-184,1,5) (1104-184,1,5) (1104-184,1,5) | (1104-352,1,3) |
| FBNet-C [3] | (16) | (16-16,1,3) | (96-24,2,3) (24-24,1,5) (24-24,1,3) | (144-32,2,5) (96-32,1,5) (192-32,1,5) (192-32,1,3) | (192-64,2,5) (192-64,1,5) (384-64,1,5) (384-64,1,5) | (384-112,1,5) (672-112,1,5) (672-112,1,5) (336-112,1,5) | (672-184,2,5) (1104-184,1,5) (1104-184,1,5) (1104-184,1,5) | (1104-352,1,3) |
| Proxyless-R [4] | (32) | (32-16,1,3) | (48-32,2,5) (96-32,1,3) | (96-40,2,7) (120-40,1,3) (120-40,1,5) (120-40,1,5) | (240-80,2,7) (240-80,1,5) (240-80,1,5) (240-80,1,5) | (480-96,1,5) (288-96,1,5) (288-96,1,5) (288-96,1,5) | (576-192,2,7) (1152-192,1,7) (576-192,1,7) (576-192,1,7) | (1152-320,1,7) |
| Single-Path NAS [5] | (32) | (32-16,1,3) | (48-24,2,3) (72-24,1,3) (72-24,1,3) | (144-40,2,5) (120-40,1,3) (120-40,1,3) (120-40,1,3) | (240-80,2,5) (240-80,1,3) (240-80,1,3) (240-80,1,3) | (480-96,1,5) (288-96,1,5) (288-96,1,5) (288-96,1,5) | (576-192,2,5) (1152-192,1,5) (1152-192,1,5) (1152-192,1,5) | (1152-320,1,3) |
| MobileNetV3-Large [6] | (32) | (32-16,1,3) | (64-24,2,3) (72-24,1,3) | (72-40,2,5)* (120-40,1,5)* (120-40,1,5)* | (240-80,2,3) (200-80,1,3) (184-80,1,3) (184-80,1,3) | (480-112,1,3)* (672-112,1,3)* | (672-160,2,5)* (960-160,1,5)* (960-160,1,5)* | Conv2D (X-960,1,1) |
| EfficientNet-B0 [7] | (32) | (32-16,1,3)* | (96-24,2,3) (144-24,1,3) | (144-40,2,5) (240-40,1,5) | (240-80,2,3) (480-80,1,3)* (480-80,1,3) | (480-112,1,5)* (672-112,1,5)* (672-112,1,5)* | (672-192,2,5)* (1152-192,1,5)* (1152-192,1,5)* (1152-192,1,5)* | (1152-320,1,3)* |
| MixNet-M [8] | (24) | (24-24,1,3) | (144-32,2,3/5/7) (96-32,1,3) | (192-40,2,3/5/7/9)* (240-40,1,3/5)* (240-40,1,3/5)* (240-40,1,3/5)* | (240-80,2,3/5/7)* (240-80,1,3/5/7/9)* (240-80,1,3/5/7/9)* (240-80,1,3/5/7/9)* | (480-120,1,3)* (360-120,1,3/5/7/9)* (360-120,1,3/5/7/9)* (360-120,1,3/5/7/9)* | (720-200,2,3/5/7/9)* (1200-200,1,3/5/7/9)* (1200-200,1,3/5/7/9)* | (1200-200,1,3/5/7/9)* |
| ReXNet [9] | (32) | (32-16,1,3) | (96-27,2,3) (162-38,1,3) | (228-50,2,3)* (300-61,1,3)* | (366-72,2,3)* (432-84,1,3)* (504-95,1,3)* | (570-106,1,3)* (636-117,1,3)* (702-128,1,3)* | (768-140,2,3)* (840-151,1,3)* (906-162,1,3)* (972-174,1,3)* | (1044-185,1,3) |
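
To work with the table programmatically, the (Exp-Out,Stride,Kernel) notation can be parsed into structured records. The sketch below is illustrative only and is not part of the SyncNAS code base: `MBConvSpec` and `parse_stage` are hypothetical names, and a kernel written as `3/5/7` is assumed to denote a block that mixes several kernel sizes.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MBConvSpec:
    """One MBConv block as written in the table (hypothetical helper type)."""
    exp: int                  # expanded channel size
    out: int                  # output channel size
    stride: int               # stride of the depthwise convolution
    kernels: Tuple[int, ...]  # kernel size(s) of the depthwise convolution
    se: bool                  # True when the entry carries a trailing '*'


# Matches "(Exp-Out,Stride,Kernel)" with an optional trailing "*".
_BLOCK = re.compile(r"\((\d+)-(\d+),(\d+),([\d/]+)\)(\*?)")


def parse_stage(cell: str) -> List[MBConvSpec]:
    """Parse one table cell; entries that do not follow the format are skipped."""
    return [
        MBConvSpec(int(e), int(o), int(s),
                   tuple(int(k) for k in ks.split("/")), star == "*")
        for e, o, s, ks, star in _BLOCK.findall(cell)
    ]


# Stage 3 of MobileNetV3-Large, copied from the table.
stage = parse_stage("(72-40,2,5)* (120-40,1,5)* (120-40,1,5)*")
for block in stage:
    print(block)
```

A cell such as `Conv2D (X-960,1,1)` does not match the pattern and yields an empty list, so non-MBConv entries need separate handling.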

B. Visualization of Searched Models

We visualize the network architectures searched by SyncNAS below. Each block is parameterized by in_c, exp_c, out_c, k, and s, depending on the type of block.

Baseline Model Training Information

Optimizer: Stochastic Gradient Descent
Learning Rate Scheduler: Cosine Annealing with Warm Restarts
- Warm-up: 10 epochs
Weight Decay: 1e-5
Initial Learning Rate
- MobileNetV2: 0.256
- FBNet-B: 0.512
- MnasNet: 0.512
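
As a rough illustration of the schedule described above, the following sketch computes a per-epoch learning rate with a 10-epoch warm-up followed by cosine annealing with warm restarts. Only the warm-up length and the initial learning rate (the MobileNetV2 value) come from the settings listed here. The linear warm-up shape, the fixed cycle length `cycle_epochs=50`, the minimum rate `min_lr=0.0`, and the function name `learning_rate` are assumptions, not the values used to train the baselines.

```python
import math


def learning_rate(epoch: int,
                  base_lr: float = 0.256,   # MobileNetV2 initial learning rate
                  warmup_epochs: int = 10,  # warm-up length from the settings above
                  cycle_epochs: int = 50,   # assumed restart period
                  min_lr: float = 0.0) -> float:  # assumed minimum rate
    """Learning rate for a zero-based epoch index (illustrative sketch)."""
    if epoch < warmup_epochs:
        # Assumed linear warm-up: reaches base_lr on the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Position inside the current cosine cycle; resets to 0 at every restart.
    t_cur = (epoch - warmup_epochs) % cycle_epochs
    cosine = 1.0 + math.cos(math.pi * t_cur / cycle_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cosine


for epoch in (0, 9, 10, 35, 59, 60):
    print(epoch, round(learning_rate(epoch), 4))
```

With these assumed settings the rate decays along a half cosine over each 50-epoch cycle and jumps back to `base_lr` at every restart.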

References

[1] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. “Mobilenetv2: Inverted Residuals And Linear Bottlenecks”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018, pp. 4510–4520.

[2] Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V Le. “Mnasnet: Platform-Aware Neural Architecture Search For Mobile”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, pp. 2820–2828.

[3] Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, and Kurt Keutzer. “Fbnet: Hardware-Aware Efficient Convnet Design Via Differentiable Neural Architecture Search”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, pp. 10734–10742.

[4] Han Cai, Ligeng Zhu, and Song Han. “ProxylessNAS: Direct Neural Architecture Search On Target Task And Hardware”. In: International Conference on Learning Representations (ICLR). 2019.

[5] Dimitrios Stamoulis, Ruizhou Ding, Di Wang, Dimitrios Lymberopoulos, Bodhi Priyantha, Jie Liu, and Diana Marculescu. “Single-Path Nas: Designing Hardware-Efficient Convnets In Less Than 4 Hours”. In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2019.

[6] Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et al. “Searching For MobilenetV3”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019, pp. 1314–1324.

[7] Mingxing Tan and Quoc Le. “Efficientnet: Rethinking Model Scaling For Convolutional Neural Networks”. In: International Conference on Machine Learning. 2019, pp. 6105–6114.

[8] Mingxing Tan and Quoc V Le. “Mixconv: Mixed Depthwise Convolutional Kernels”. In: Proceedings of the British Machine Vision Conference. 2019.

[9] Dongyoon Han, Sangdoo Yun, Byeongho Heo, and YoungJoon Yoo. “Rethinking Channel Dimensions For Efficient Model Design”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021.
