
Stronger NAS with Weaker Predictors

[NeurIPS'21] Stronger NAS with Weaker Predictors.

Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang Wang, Zicheng Liu, Mei Chen and Lu Yuan

Overview

Our WeakNAS Pipeline

Search Dynamic Visualization of WeakNAS in t-SNE

Reproduce the figure above with:
visualize_search_dynamic.ipynb
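
A minimal sketch of the kind of plot the notebook produces, assuming one-hot architecture encodings colored by search iteration; the arrays below are random placeholders, not real search data:

```python
# Sketch: project sampled architectures with t-SNE, colored by iteration.
# Placeholder data only -- the real figure comes from the notebook above.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

encodings = np.random.rand(500, 30)        # (N, D) one-hot encodings (assumed)
iteration = np.repeat(np.arange(50), 10)   # (N,) round each sample was drawn

xy = TSNE(n_components=2, perplexity=30, init="random",
          random_state=0).fit_transform(encodings)
plt.scatter(xy[:, 0], xy[:, 1], c=iteration, cmap="viridis", s=10)
plt.colorbar(label="search iteration")
plt.title("WeakNAS search dynamics in t-SNE (sketch)")
plt.show()
```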

Implementation

  • NAS-Bench Search Space
    • NAS-Bench-101 Search Space (CIFAR10)
    • NAS-Bench-201 Search Space (CIFAR10, CIFAR100, ImageNet16-120)
  • Open Domain Search Space
    • NASNet Search Space (ImageNet)
    • MobileNet Search Space (ImageNet)
  • Interpretation
    • Search Dynamic Visualization in t-SNE (NAS-Bench-201)

Environment

pip install -r requirements.txt

NAS-Bench Search Space

NAS-Bench-101

Download the pre-processed NAS-Bench-101 from this Link and replace $BENCH_PATH with the file path

Replace --save_dir with your own log path; repeat at least 100 times for stable results (a sketch of the search loop behind these flags follows this subsection)

  • Run WeakNAS

  • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set valid --test_set test \
    --save_dir OUTPUT/nasbench101/init_100_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_uniform/deter_False \
    --bench_path $BENCH_PATH --bench nasbench101 --dataset cifar10 --deterministic False \
    --top_start 100 --top_end 100 --init_sample 100 --sample_each_iter 10 --sampling_method uniform \
    --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100
  • Run the WeakNAS + EI variant

  • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set valid --test_set test \
    --save_dir OUTPUT/nasbench101/init_100_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_ei/deter_False \
    --bench_path $BENCH_PATH --bench nasbench101 --dataset cifar10 --deterministic False \
    --top_start 100 --top_end 100 --init_sample 100 --sample_each_iter 10 --sampling_method ei \
    --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100 
    Note: the EI calculation takes a very long time (~10 hours per run)
  • Plot Figure

  • plot_nasbench101.ipynb
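
For readers who want the algorithm behind these flags: each round fits a deliberately weak predictor on the architectures evaluated so far, ranks the rest of the space with it, and samples the next batch from the predicted top region, so successive predictors only need to be accurate near the top. A minimal sketch, assuming hypothetical `encode` and `query_accuracy` helpers in place of the benchmark API and using an sklearn MLP as a stand-in for the repo's predictor (hyperparameters mirror the CLI flags above):

```python
# Sketch of the iterative weak-predictor loop (not the repo's exact code).
import numpy as np
from sklearn.neural_network import MLPRegressor

def weaknas_search(pool, encode, query_accuracy,
                   init_sample=100, sample_each_iter=10,
                   top_k=100, max_sample=1000, seed=0):
    rng = np.random.default_rng(seed)
    sampled = [int(i) for i in rng.choice(len(pool), size=init_sample,
                                          replace=False)]
    accs = [query_accuracy(pool[i]) for i in sampled]
    while len(sampled) < max_sample:
        # Fit a *weak* predictor on the architectures evaluated so far.
        X = np.stack([encode(pool[i]) for i in sampled])
        mlp = MLPRegressor(hidden_layer_sizes=(1000, 1000, 1000, 1000),
                           max_iter=100).fit(X, np.asarray(accs))
        # Rank the unevaluated architectures with it ...
        seen = set(sampled)
        rest = [i for i in range(len(pool)) if i not in seen]
        preds = mlp.predict(np.stack([encode(pool[i]) for i in rest]))
        top = [rest[j] for j in np.argsort(preds)[-top_k:]]
        # ... and draw the next batch uniformly from the predicted top-k,
        # progressively steering sampling toward the promising region.
        batch = rng.choice(top, size=min(sample_each_iter, len(top)),
                           replace=False)
        for i in batch:
            sampled.append(int(i))
            accs.append(query_accuracy(pool[i]))
    best = int(np.argmax(accs))
    return pool[sampled[best]], accs[best]
```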

NAS-Bench-201

  • CIFAR10 Subset

    • Download the pre-processed NAS-Bench-201 CIFAR10 subset from this Link and replace $BENCH_PATH with the file path

    • Replace --save_dir with your own log path; repeat at least 100 times for stable results

    • Run WeakNAS

    • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set x-valid --test_set ori-test \
      --save_dir OUTPUT/nasbench201/cifar10/init_10_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_uniform/deter_False \
      --bench_path $BENCH_PATH --bench nasbench201 --dataset cifar10-valid --deterministic False \
      --top_start 100 --top_end 100 --init_sample 10 --sample_each_iter 10 --sampling_method uniform \
      --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100
    • Run the WeakNAS + EI variant

    • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set x-valid --test_set ori-test \
      --save_dir OUTPUT/nasbench201/cifar10/init_10_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_ei/deter_False \
      --bench_path $BENCH_PATH --bench nasbench201 --dataset cifar10-valid --deterministic False \
      --top_start 100 --top_end 100 --init_sample 10 --sample_each_iter 10 --sampling_method ei \
      --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100

      Note: the EI calculation takes a very long time (~5 hours per run)

  • CIFAR100 Subset

    • Download the pre-processed NAS-Bench-201 CIFAR100 subset from this Link and replace $BENCH_PATH with the file path

    • Replace --save_dir with your own log path; repeat at least 100 times for stable results

    • Run WeakNAS

    • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set x-valid --test_set x-test \
      --save_dir OUTPUT/nasbench201/cifar100/init_10_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_uniform/deter_False \
      --bench_path $BENCH_PATH --bench nasbench201 --dataset cifar100 --deterministic False \
      --top_start 100 --top_end 100 --init_sample 10 --sample_each_iter 10 --sampling_method uniform \
      --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100
    • Run the WeakNAS + EI variant

    • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set x-valid --test_set x-test \
      --save_dir OUTPUT/nasbench201/cifar100/init_10_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_ei/deter_False \
      --bench_path $BENCH_PATH --bench nasbench201 --dataset cifar100 --deterministic False \
      --top_start 100 --top_end 100 --init_sample 10 --sample_each_iter 10 --sampling_method ei \
      --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100

      Note: the EI calculation takes a very long time (~5 hours per run)

  • ImageNet16-120 Subset

    • Download the pre-processed NAS-Bench-201 ImageNet16-120 subset from this Link and replace $BENCH_PATH with the file path

    • Replace --save_dir with your own log path; repeat at least 100 times for stable results

    • Run WeakNAS

    • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set x-valid --test_set x-test \
      --save_dir OUTPUT/nasbench201/ImageNet16-120/init_10_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_uniform/deter_False \
      --bench_path $BENCH_PATH --bench nasbench201 --dataset ImageNet16-120 --deterministic False \
      --top_start 100 --top_end 100 --init_sample 10 --sample_each_iter 10 --sampling_method uniform \
      --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100
    • Run the WeakNAS + EI variant

    • python WeakNAS.py --rand_seed -1 --repeat 100 --train_set x-valid --test_set x-test \
      --save_dir OUTPUT/nasbench201/ImageNet16-120/init_10_sample_10/top_start_100_end_100/MLP/onehot_size_1000_1000_1000_1000_iter_100/acq_ei/deter_False \
      --bench_path $BENCH_PATH --bench nasbench201 --dataset ImageNet16-120 --deterministic False \
      --top_start 100 --top_end 100 --init_sample 10 --sample_each_iter 10 --sampling_method ei \
      --predictor MLP --max_sample 1000 --mlp_size 1000 1000 1000 1000 --mlp_iter 100

      Note: the EI calculation takes a very long time (~5 hours per run); a sketch of the EI acquisition follows this section

  • Plot Figure

  • plot_nasbench201.ipynb
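
On the EI variant used above: it swaps the uniform draw from the predicted top region for an expected-improvement acquisition. A minimal sketch of the standard Gaussian-EI score, assuming each architecture's predictive mean/std come from repeated weak-predictor fits (an assumption, not necessarily the repo's exact implementation):

```python
# Sketch: Gaussian expected improvement, EI(x) = E[max(f(x) - best, 0)],
# with f(x) ~ N(mu, sigma^2) estimated from an ensemble of weak predictors.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_so_far, eps=1e-9):
    sigma = np.maximum(sigma, eps)          # guard against zero variance
    z = (mu - best_so_far) / sigma
    return (mu - best_so_far) * norm.cdf(z) + sigma * norm.pdf(z)

# Scoring every candidate in the bench each round (~423k architectures in
# NAS-Bench-101, ~15.6k in NAS-Bench-201), with mu/sigma from repeated
# fits, is what makes the EI runs take hours.
```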

Open Domain Search Space

ImageNet (MobileNet Setting)

Best architecture found by WeakNAS

  • Train SuperNet
    • We use the OFA codebase as our training pipeline and directly reuse the weights of the pretrained SuperNet variant "ofa_mbv3_d234_e346_k357_w1.2" (see the loading sketch at the end of this section).
  • Search
    • See imagenet_mobilenet_search.ipynb for more details; it prints out the best architecture(s) found
  • Train from scratch
    • We use the pytorch-image-models codebase as our training pipeline.

    • Our runs of the best architectures found by WeakNAS

      • Best architecture @800 Queries
      cd pytorch-image-models;
      bash distributed_train.sh $NUM_GPU $IMAGENET_PATH --model ofa_mbv3_800 -b 128 \
      --sched cosine --img-size 236 --epochs 300 --warmup-epochs 3 --decay-rate .97 \
      --opt rmsproptf --opt-eps .001 -j 10 --warmup-lr 1e-6 --weight-decay 1e-05 --drop 0.3 \
      --drop-path 0.0 --model-ema --model-ema-decay 0.9999 --aa rand-m9-mstd0.5 \
      --remode pixel --reprob 0.2 --lr 1e-02 --output $LOG_PATH \
      --experiment res_236/bs_128/cosine/lr_5e-03/wd_1e-05/epoch_300/dp_0.0 --log-interval 200
      • Best architecture @1000 Queries
      cd pytorch-image-models;
      bash distributed_train.sh $NUM_GPU $IMAGENET_PATH --model ofa_mbv3_1000 -b 128 \
      --sched cosine --img-size 236 --epochs 600 --warmup-epochs 3 --decay-rate .97 \
      --opt rmsproptf --opt-eps .001 -j 10 --warmup-lr 1e-6 --weight-decay 1e-05 --drop 0.3 \
      --drop-path 0.0 --model-ema --model-ema-decay 0.9999 --aa rand-m9-mstd0.5 \
      --remode pixel --reprob 0.2 --lr 1e-02 --output $LOG_PATH \
      --experiment res_236/bs_128/cosine/lr_5e-03/wd_1e-05/epoch_600/dp_0.0 --log-interval 200
    • Adapting to your own best architecture found by WeakNAS

      • Modify lines 24-53 of pytorch-image-models/timm/models/ofa_mbv3.py, adding the configuration of the architecture found in the search stage as "ofa_mbv3_custom" to default_cfgs
      cd pytorch-image-models;
      bash distributed_train.sh $NUM_GPU $IMAGENET_PATH --model ofa_mbv3_custom -b 128 \
      --sched cosine --img-size 236 --epochs 600 --warmup-epochs 3 --decay-rate .97 \
      --opt rmsproptf --opt-eps .001 -j 10 --warmup-lr 1e-6 --weight-decay 1e-05 --drop 0.3 \
      --drop-path 0.0 --model-ema --model-ema-decay 0.9999 --aa rand-m9-mstd0.5 \
      --remode pixel --reprob 0.2 --lr 1e-02 --output $LOG_PATH \
      --experiment res_236/bs_128/cosine/lr_5e-03/wd_1e-05/epoch_600/dp_0.0 --log-interval 200
    Previous Tensorboard.dev Logs: Link
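
As referenced under "Train SuperNet" above, here is a minimal sketch of loading the pretrained SuperNet and extracting one subnet via the OFA model-zoo API; the ks/e/d values below are placeholders, not a searched architecture:

```python
# Sketch: pull a subnet (with inherited weights) out of the OFA SuperNet.
# The ks/e/d values are placeholders -- substitute the architecture found
# by the search notebook.
from ofa.model_zoo import ofa_net

supernet = ofa_net("ofa_mbv3_d234_e346_k357_w1.2", pretrained=True)
supernet.set_active_subnet(ks=7, e=6, d=4)   # kernel size / expansion / depth
subnet = supernet.get_active_subnet(preserve_weight=True)
```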

Acknowledgement

NAS-Bench codebase from AutoDL-Projects
ImageNet codebase from timm

Citation

If you find this repo helpful, please cite:

@article{wu2021weak,
  title={Stronger NAS with Weaker Predictors},
  author={Junru Wu and Xiyang Dai and Dongdong Chen and Yinpeng Chen and Mengchen Liu and Ye Yu and Zhangyang Wang and Zicheng Liu and Mei Chen and Lu Yuan},
  journal={arXiv preprint arXiv:2102.10490},
  year={2021}
}
