DAAS has been accepted by Pattern Recognition (2021); the arXiv version is here.
Weight-sharing methods determine the final sub-network by discretization, i.e., by pruning off weak candidate operations, and this discretization step incurs significant inaccuracy. We propose discretization-aware architecture search (DA2S) to alleviate this issue. The main idea is to introduce an additional term into the loss function, so that the architectural parameters of the super-network are gradually pushed towards the desired discrete configuration during the search process.
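The exact form of the additional term is given in the paper; the following is only a minimal sketch, assuming a simple entropy-style regularizer on the softmax of the architectural parameters (the function name, `coeff`, and the layout of `alpha` are illustrative assumptions, not this repository's API):

```python
import torch
import torch.nn.functional as F

def discretization_loss(alpha, coeff=1.0):
    # alpha: (num_edges, num_ops) architectural parameters; coeff scales
    # the regularizer. Penalizing the entropy of softmax(alpha) pushes
    # each edge's operation distribution towards one-hot, so the final
    # pruning (discretization) changes the super-network as little as possible.
    probs = F.softmax(alpha, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1)
    return coeff * entropy.mean()

# total_loss = task_loss + discretization_loss(alpha, coeff)
```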
Figure 1: Pipeline of DA2S
The algorithm is based on continuous relaxation and gradient descent in the architecture space; only a single GPU is required. Requirements: Python == 3.6, PyTorch == 0.4.
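For reference, continuous relaxation replaces the discrete choice of an operation on each edge with a softmax-weighted mixture of all candidates, as in DARTS; a minimal PyTorch sketch (the class and argument names are our own, not this repository's):

```python
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    # One edge of the super-network: its output is the softmax-weighted
    # sum of all candidate operations, which makes the architecture space
    # continuous and searchable by plain gradient descent.
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)  # candidate operations on this edge

    def forward(self, x, alpha):
        weights = F.softmax(alpha, dim=-1)  # relax the discrete choice
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```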
CIFAR-10 and ImageNet.
To carry out architecture search using 2nd-order approximation, run
cd cnn && python train_search.py
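The 2nd-order approximation differentiates the validation loss through one step of weight update, which requires a Hessian-vector product; this is typically approximated by finite differences, as in the DARTS reference implementation. A simplified sketch follows (the function name, arguments, and fixed `eps` are assumptions; DARTS additionally scales `eps` by the norm of `vector`):

```python
import torch

def hessian_vector_product(loss_fn, weights, alphas, vector, eps=1e-2):
    # Approximate d^2 L_train / (d alpha d w) . v with finite differences:
    # [grad_alpha L(w + eps*v) - grad_alpha L(w - eps*v)] / (2 * eps).
    # loss_fn() must rebuild the training loss graph on each call;
    # vector holds the gradients of the validation loss w.r.t. the weights.
    with torch.no_grad():
        for w, v in zip(weights, vector):   # w+ = w + eps * v
            w.add_(eps * v)
    grads_pos = torch.autograd.grad(loss_fn(), alphas)

    with torch.no_grad():
        for w, v in zip(weights, vector):   # w- = w - eps * v
            w.sub_(2 * eps * v)
    grads_neg = torch.autograd.grad(loss_fn(), alphas)

    with torch.no_grad():
        for w, v in zip(weights, vector):   # restore the original weights
            w.add_(eps * v)

    return [(gp - gn) / (2 * eps) for gp, gn in zip(grads_pos, grads_neg)]
```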
DA2S: change of the softmax of the operation weights α during the search procedure in a normal cell on CIFAR-10.
DA2S: change of the softmax of the edge weights β of nodes 3/4/5 during the search procedure in a normal cell searched on CIFAR-10.
DARTS: change of the softmax of the operation weights α during the search procedure in a normal cell on CIFAR-10.
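Plots like the ones above can be reproduced by logging the softmax of α at every epoch; below is a hypothetical helper using TensorBoard (the `alpha` layout and the `writer` argument are assumptions, not part of this repository):

```python
import torch.nn.functional as F

def log_operation_weights(alpha, epoch, writer):
    # alpha: (num_edges, num_ops) parameters of a normal cell;
    # writer: a torch.utils.tensorboard.SummaryWriter instance.
    probs = F.softmax(alpha, dim=-1).detach().cpu()
    for edge_idx, row in enumerate(probs):
        for op_idx, p in enumerate(row):
            writer.add_scalar(f'alpha/edge{edge_idx}/op{op_idx}', p.item(), epoch)
```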
To evaluate our best cells by training from scratch, run
cd cnn && python train.py --auxiliary --cutout # CIFAR-10
cd cnn && python train_imagenet.py --auxiliary # ImageNet
Customized architectures are supported through the --arch flag once they are specified in genotypes.py.
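For example, assuming this repository follows the DARTS genotype convention, a custom architecture could be registered in cnn/genotypes.py as follows (MY_ARCH and the listed operations are purely illustrative):

```python
from collections import namedtuple

Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

# Hypothetical entry for cnn/genotypes.py; each pair is (operation, input node).
MY_ARCH = Genotype(
    normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 0), ('sep_conv_3x3', 1),
            ('sep_conv_3x3', 1), ('skip_connect', 0),
            ('skip_connect', 0), ('dil_conv_3x3', 2)],
    normal_concat=[2, 3, 4, 5],
    reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1),
            ('skip_connect', 2), ('max_pool_3x3', 1),
            ('max_pool_3x3', 0), ('skip_connect', 2),
            ('skip_connect', 2), ('max_pool_3x3', 1)],
    reduce_concat=[2, 3, 4, 5],
)
```

Such an entry can then be trained with, e.g., cd cnn && python train.py --arch MY_ARCH --auxiliary --cutout.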