# Neural Architecture Search
Neural Architecture Search (NAS) is a special kind of Hyperparameter Optimization (HO) in which we tune the model architecture, i.e. the structural properties of the model, instead of hyperparameters such as the learning rate. Architectural choices are hyperparameters as well, but their search space grows combinatorially, so the runtime of classical HO algorithms quickly becomes prohibitive. Specialized algorithms have been developed to search architectures more efficiently. Two of them, ENAS and DARTS, are covered here.

Both algorithms are *one-shot* approaches to NAS: they do not train each architecture sampled from the search space independently. Instead, both exploit *weight sharing*, which means that all architectures in the search space share the same model parameters. As a consequence, the architecture and the model parameters are optimized at the same time. This makes the search much faster, and ENAS and DARTS have been shown empirically to find architectures that are competitive with the state of the art.
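To make the combinatorial growth and the idea of weight sharing concrete, here is a minimal sketch in plain Python. The operation names, the number of edges, and the dictionary of "weights" are hypothetical placeholders chosen for illustration; they are not taken from ENAS or DARTS.

```python
import itertools

# Hypothetical candidate operations and number of edges (illustrative only).
OPS = ["conv3x3", "conv5x5", "maxpool", "skip"]
NUM_EDGES = 3

# An architecture picks one operation per edge, so the search space
# contains len(OPS) ** NUM_EDGES architectures.
search_space = list(itertools.product(OPS, repeat=NUM_EDGES))
print(len(search_space))

# Weight sharing: one parameter entry per (edge, operation) pair,
# reused by every architecture that selects that pair.
shared_weights = {(edge, op): 0.0 for edge in range(NUM_EDGES) for op in OPS}

def weights_used_by(architecture):
    """Return the keys of the shared weights a given architecture uses."""
    return {(edge, op) for edge, op in enumerate(architecture)}

arch_a = ("conv3x3", "maxpool", "skip")
arch_b = ("conv3x3", "conv5x5", "skip")

# Both architectures read and update the same entries on edges 0 and 2.
overlap = weights_used_by(arch_a) & weights_used_by(arch_b)
print(sorted(overlap))
```

In this toy setting, training every architecture independently would need `len(search_space) * NUM_EDGES` separate parameter entries, whereas the shared dictionary only holds `len(OPS) * NUM_EDGES`.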

## The Problem
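Usually, NAS is defined as a bi-level optimization problem:
\begin{align}
    & \min_{\mathbf{a} \in \mathcal{A}} \mathcal{L}(\mathbf{X}_{val}, \mathbf{y}_{val}; \mathbf{w}^*(\mathbf{a}), \mathbf{a}) \\
    \text{s.t. } & \mathbf{w}^*(\mathbf{a}) = \arg \min_{\mathbf{w} \in \mathbb{R}^n} \mathcal{L}(\mathbf{X}_{train}, \mathbf{y}_{train}; \mathbf{w}, \mathbf{a})
\end{align}
Here $\mathbf{a}$ is an architecture from the search space $\mathcal{A}$, $\mathbf{w}$ are the model parameters, and $\mathcal{L}$ is the loss. The inner problem trains the parameters of a given architecture on the training data; the outer problem selects the architecture whose trained parameters give the lowest validation loss.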
We will now consider two different approaches to solve this optimization problem.
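Before turning to them, it may help to see the nested structure on a toy problem. The sketch below solves a bi-level problem by brute force: for every "architecture" it solves the inner training problem and then evaluates the outer validation loss. The two-element search space (a constant model and a linear model), the data, and all function names are assumptions made up for this illustration. Neither ENAS nor DARTS works this way; both exist precisely to avoid this exhaustive loop.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_constant(xs, ys):
    """Inner problem for the 'constant' architecture: predict the mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Inner problem for the 'linear' architecture: least-squares line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Hypothetical toy search space: architecture name -> training routine.
SEARCH_SPACE = {"constant": fit_constant, "linear": fit_linear}

x_train, y_train = [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9]
x_val, y_val = [4.0, 5.0], [4.0, 5.1]

val_losses = {}
for name, fit in SEARCH_SPACE.items():
    model = fit(x_train, y_train)                # inner problem: w*(a)
    val_losses[name] = mse(model, x_val, y_val)  # outer objective

# Outer problem: pick the architecture with the lowest validation loss.
best_architecture = min(val_losses, key=val_losses.get)
print(best_architecture, val_losses)
```

The cost of this loop is one full training run per architecture, which is what makes the naive approach infeasible for realistic search spaces.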

## DARTS
Differentiable Architecture Search (DARTS) defines the search space over architectures as a single directed acyclic graph (DAG). In this DAG, each node is a latent representation of the input (e.g. a feature map) and each edge is a computation. Every edge is a mixture of candidate operations, each associated with a parameter indicating how "important" that operation is on that edge. These architecture parameters are optimized w.r.t. the validation loss with gradient-based optimization, whereas the model parameters (i.e. the parameters of the operations on the edges) are updated w.r.t. the training loss, also using standard backpropagation.
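The mixture on a single edge can be sketched in a few lines of plain Python. This is a simplified illustration, assuming scalar inputs and three made-up parameter-free operations. The softmax weighting over the importance parameters and the final argmax selection follow the DARTS formulation, but the names `CANDIDATE_OPS`, `mixed_op`, and `alphas` are hypothetical and unrelated to the NNI API used in this notebook.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations on one edge (scalars instead of tensors).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Weighted sum of all candidate operations on one edge."""
    weights = softmax([alphas[name] for name in CANDIDATE_OPS])
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

# With equal importance parameters every operation contributes equally.
alphas = {"identity": 0.0, "double": 0.0, "zero": 0.0}
y_uniform = mixed_op(3.0, alphas)

# Pretend the search increased the importance of "double".
alphas["double"] = 2.0
y_after = mixed_op(3.0, alphas)

# After the search, the edge keeps only its most important operation.
chosen_op = max(alphas, key=alphas.get)
print(y_uniform, y_after, chosen_op)
```

Because the output is a smooth function of `alphas`, the validation loss can be differentiated with respect to them, which is what allows the architecture to be optimized with gradients.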

DARTS searches for two cell types: normal cells and reduction cells. Both are built from the same search space. The difference is that a reduction cell reduces the spatial dimensions of its input (e.g. down-scaling an image or its feature maps), whereas a normal cell keeps all dimensions the same.
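The effect of the two cell types on the feature-map shape can be traced with a small sketch. Several details here are assumptions rather than facts from the text above: that a reduction cell halves height and width and doubles the channel count, that reduction cells sit at one third and two thirds of the network depth, and that the starting shape of 16 channels at 32x32 with 8 cells matches the arguments of the `CNN(32, 3, 16, 10, 8)` call used later in this notebook.

```python
def normal_cell_shape(channels, height, width):
    """A normal cell keeps all dimensions unchanged."""
    return channels, height, width

def reduction_cell_shape(channels, height, width):
    """Assumed behaviour: halve the spatial size, double the channels."""
    return channels * 2, height // 2, width // 2

NUM_CELLS = 8
# Assumed placement of reduction cells at 1/3 and 2/3 of the depth.
REDUCTION_POSITIONS = {NUM_CELLS // 3, 2 * NUM_CELLS // 3}

shape = (16, 32, 32)  # (channels, height, width) entering the first cell
shapes = [shape]
for index in range(NUM_CELLS):
    if index in REDUCTION_POSITIONS:
        shape = reduction_cell_shape(*shape)
    else:
        shape = normal_cell_shape(*shape)
    shapes.append(shape)

for index, cell_output in enumerate(shapes[1:]):
    kind = "reduction" if index in REDUCTION_POSITIONS else "normal"
    print(index, kind, cell_output)
```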

In [3]:
import json
import logging
import time
from argparse import ArgumentParser

import torch
import torch.nn as nn

import datasets
from model import CNN
from utils import accuracy
from nni.retiarii.oneshot.pytorch import DartsTrainer, EnasTrainer
from nni.retiarii.nn.pytorch import LayerChoice, InputChoice
import ops
from collections import OrderedDict

In [4]:
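# Load CIFAR-10 and build the search network (CNN comes from the local model.py).
dataset_train, dataset_valid = datasets.get_dataset("cifar10")
model = CNN(32, 3, 16, 10, 8)
criterion = nn.CrossEntropyLoss()
# Optimizer for the shared model parameters.
optim = torch.optim.SGD(model.parameters(), 0.025, momentum=0.9, weight_decay=3.0E-4)
# Note: this scheduler is created but not passed to the trainer below.
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, 10, eta_min=0.001)
trainer = DartsTrainer(
    model=model,
    loss=criterion,
    metrics=lambda output, target: accuracy(output, target, topk=(1,)),
    optimizer=optim,
    num_epochs=10,
    dataset=dataset_train,
    batch_size=64,
    log_frequency=10,
    unrolled=False  # first-order approximation of the architecture gradient
)
trainer.fit()
# Export the architecture once and reuse it for printing and saving.
final_architecture = trainer.export()
print('Final architecture:', final_architecture)
with open('checkpoint.json', 'w') as f:
    json.dump(final_architecture, f)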

Files already downloaded and verified
Files already downloaded and verified

