## Prior-Guided Neural Architecture Search 

Competition Homepage: [2022 CVPR Track1: SuperNet Track](https://aistudio.baidu.com/aistudio/competition/detail/149/0/introduction)

Table of Contents

- Environment Setup
- SuperNet Training
- SuperNet Evaluation 

### 1. Environment Setup


1.1 Env Requirements

```
paddle==2.2.2
fire
matplotlib
visualdl 
```

Install them with:

```bash
pip install -r requirements.txt 
```
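To confirm the installation, a quick import check can help. The sketch below uses only the standard library; `check_modules` is a hypothetical helper (not part of this repository), and it only tests that each module can be located, not that its version matches the list above.

```python
import importlib.util

# Import names of the packages listed in the requirements above.
REQUIRED_MODULES = ["paddle", "fire", "matplotlib", "visualdl"]


def check_modules(names):
    """Return {module_name: bool}, True when the module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```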

1.2 Data Preparation

You can download ImageNet-mini [from Kaggle](https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000).

ImageNet-mini is a small subset of ImageNet-1k: it keeps all 1000 classes but is only 4.24 GB.

For quick iteration, we use ImageNet-mini for both training and evaluation. If you have enough compute resources, you can train on ImageNet-1k directly.

On PaddlePaddle AI Studio, an [ImageNet-mini dataset](https://aistudio.baidu.com/aistudio/datasetdetail/89857) is already available.
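Before training, it can be worth checking that the downloaded data has the expected layout. The following is a minimal sketch, assuming one subdirectory per class (the layout that `paddle.vision.datasets.DatasetFolder` expects); `summarize_image_folder` and the example path are hypothetical and not part of this repository.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def summarize_image_folder(root):
    """Return {class_dir: image_count} for a root/<class>/<image> layout."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


if __name__ == "__main__":
    # Hypothetical location; point this at your own copy of the dataset.
    counts = summarize_image_folder("data/imagenet-mini/train")
    print(f"{len(counts)} classes, {sum(counts.values())} images")
```

For ImageNet-mini the class count should be 1000; a smaller number usually means an incomplete download or a wrong root directory.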


1.3 Architecture JSON Preparation

The 45,000 test architectures are available [here](https://aistudio.baidu.com/aistudio/datasetdetail/134077). Download `CVPR_2022_NAS_Track1_test.json` and place it at `checkpoints/CVPR_2022_NAS_Track1_test.json`.
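A short check that the file is in place and parses correctly might look like the sketch below. `load_architectures` is a hypothetical helper, and the internal structure of the JSON is not described here, so the code only assumes that the top-level value is a dict or list with one entry per architecture.

```python
import json
import os

ARCH_JSON = "checkpoints/CVPR_2022_NAS_Track1_test.json"


def load_architectures(path=ARCH_JSON):
    """Load the architecture file and return the parsed JSON object."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Expected the architecture file at {path}")
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    archs = load_architectures()
    # Assumption: one top-level entry per architecture.
    print(f"Loaded {len(archs)} entries")
```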


### 2. SuperNet Training

First, import the necessary modules.

```python
import os

import paddle
import paddle.nn as nn
from paddle.nn import CrossEntropyLoss
from paddle.vision.transforms import (
    RandomHorizontalFlip, RandomResizedCrop, SaturationTransform,
    Compose, Resize, HueTransform, BrightnessTransform, ContrastTransform,
    RandomCrop, Normalize, RandomRotation, CenterCrop)
from paddle.io import DataLoader
from paddle.optimizer.lr import CosineAnnealingDecay, MultiStepDecay, LinearWarmup

from hnas.utils.callbacks import LRSchedulerM, MyModelCheckpoint
from hnas.utils.transforms import ToArray
from hnas.dataset.random_size_crop import MyRandomResizedCrop
from paddle.vision.datasets import DatasetFolder

from paddleslim.nas.ofa.convert_super import Convert, supernet
from paddleslim.nas.ofa import RunConfig, DistillConfig, ResOFA
from paddleslim.nas.ofa.utils import utils

import paddle.distributed as dist
from hnas.utils.yacs import CfgNode
from hnas.models.builder import build_classifier
from hnas.utils.hapi_wrapper import Trainer
```

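Then, we patch Paddle's `CrossEntropyLoss.forward` and `paddle.metric.Accuracy.compute` so that both can handle the teacher network's output.

The patched loss supports three modes:

- Normal cross-entropy loss for the teacher network.
- Inplace distillation for the student network.
- Knowledge distillation for the student network (cross-entropy plus inplace distillation).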

```python
def _loss_forward(self, input, tea_input=None, label=None):
    if tea_input is not None and label is not None:
        # knowledge distillation = cross entropy + inplace distillation
        ce = paddle.nn.functional.cross_entropy(
            input,
            label,
            weight=self.weight,
            ignore_index=self.ignore_index,
            reduction=self.reduction,
            soft_label=self.soft_label,
            axis=self.axis,
            name=self.name)

        kd = paddle.nn.functional.cross_entropy(
            input,
            paddle.nn.functional.softmax(tea_input),
            weight=self.weight,
            ignore_index=self.ignore_index,
            reduction=self.reduction,
            soft_label=True,
            axis=self.axis)
        return ce, kd
    elif tea_input is not None and label is None:
        # inplace distillation
        kd = paddle.nn.functional.cross_entropy(
            input,
            paddle.nn.functional.softmax(tea_input),
            weight=self.weight,
            ignore_index=self.ignore_index,
            reduction=self.reduction,
            soft_label=True,
            axis=self.axis)
        return kd
    elif label is not None:
        # normal cross entropy
        ce = paddle.nn.functional.cross_entropy(
            input,
            label,
            weight=self.weight,
            ignore_index=self.ignore_index,
            reduction=self.reduction,
            soft_label=False,
            axis=self.axis,
            name=self.name)
        return ce
    else:
        # Raising a plain string is a TypeError in Python 3; use an exception class.
        raise NotImplementedError("Either `tea_input` or `label` must be provided.")

# Patch Paddle's CrossEntropyLoss so that it accepts the teacher output.
CrossEntropyLoss.forward = _loss_forward
```
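To see what the three branches compute without installing Paddle, the dispatch can be mirrored in plain Python. This is only an illustrative sketch for a single sample with no reduction, class weights, or `ignore_index`; `hard_ce`, `soft_ce`, and `loss_forward` are hypothetical names and not part of this repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def hard_ce(logits, label):
    """Cross entropy against a class index."""
    return -log_softmax(logits)[label]


def soft_ce(logits, target_probs):
    """Cross entropy against a probability distribution (soft label)."""
    return -sum(p * lp for p, lp in zip(target_probs, log_softmax(logits)))


def loss_forward(student_logits, teacher_logits=None, label=None):
    if teacher_logits is not None and label is not None:
        # Knowledge distillation: hard-label CE plus CE against the teacher.
        return (hard_ce(student_logits, label),
                soft_ce(student_logits, softmax(teacher_logits)))
    if teacher_logits is not None:
        # Inplace distillation: the teacher's softmax is the only target.
        return soft_ce(student_logits, softmax(teacher_logits))
    if label is not None:
        # Normal cross entropy.
        return hard_ce(student_logits, label)
    raise NotImplementedError("Provide teacher_logits, label, or both.")


if __name__ == "__main__":
    print(loss_forward([2.0, 0.5, -1.0], label=0))
    print(loss_forward([2.0, 0.5, -1.0], teacher_logits=[1.5, 1.0, -0.5]))
    print(loss_forward([2.0, 0.5, -1.0], teacher_logits=[1.5, 1.0, -0.5], label=0))
```

Note that the knowledge-distillation branch returns both terms as a tuple rather than a weighted sum, so how they are combined is left to the caller.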

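```python
def _compute(self, pred, tea_pred, label=None, *args):
    # When called with only two tensors, the second one is the label.
    if label is None:
        label = tea_pred
    # Indices of the top-`maxk` classes, best first.
    pred = paddle.argsort(pred, descending=True)
    pred = paddle.slice(
        pred, axes=[len(pred.shape) - 1], starts=[0], ends=[self.maxk])
    if (len(label.shape) == 1) or \
        (len(label.shape) == 2 and label.shape[-1] == 1):
        # Class-index labels: reshape to a column so they broadcast against pred.
        label = paddle.reshape(label, (-1, 1))
    elif label.shape[-1] != 1:
        # One-hot or soft labels: convert to class indices.
        label = paddle.argmax(label, axis=-1, keepdim=True)
    correct = pred == label
    return paddle.cast(correct, dtype='float32')

# Patch Paddle's Accuracy metric so that it accepts the extra teacher output.
paddle.metric.Accuracy.compute = _compute
```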
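The patched `compute` ranks the class scores, keeps the top `maxk` indices, and compares each of them with the label. For a single sample, the same idea can be sketched in plain Python; `topk_correct` is a hypothetical helper used only for illustration.

```python
def topk_correct(scores, label, k=1):
    """Return one 0.0/1.0 flag per top-k position, 1.0 where the class matches."""
    # Class indices sorted by score, highest first (like argsort descending).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [1.0 if idx == label else 0.0 for idx in ranked[:k]]


if __name__ == "__main__":
    scores = [0.1, 0.7, 0.2]
    print(topk_correct(scores, label=2, k=1))
    print(topk_correct(scores, label=2, k=2))
```

With `label=2`, the top-1 prediction (class 1) is wrong, but class 2 appears in second place, so the sample counts as correct for top-2 accuracy.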

Next, we should set the parameters:

```python
backbone='resnet48_prelu'
image_size='224'
max_epoch=70
lr=0.001
weight_decay=0.
momentum=0.9
batch_size=256
dyna_batch_size=4
warmup=2
phase=None
resume=None
pretrained='checkpoints/resnet48.pdparams'
image_dir='/root/paddlejob/workspace/env_run/data/ILSVRC2012/'
save_dir='checkpoints/res48-depth'
save_freq=20
log_freq=100
visualdl_dir="./visualdl_log/autoslim3"
```
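The imports earlier include `LinearWarmup` and `CosineAnnealingDecay`, and the parameters set `lr=0.001`, `warmup=2`, and `max_epoch=70`. How the training script combines them is not shown here, so the following is only an illustration, assuming a linear warmup from 0 followed by cosine decay to 0, stepped once per epoch. `lr_at_epoch` is a hypothetical helper that uses the textbook formulas rather than Paddle's scheduler classes.

```python
import math


def lr_at_epoch(epoch, base_lr=0.001, warmup_epochs=2, max_epoch=70):
    """Learning rate under linear warmup followed by cosine decay (assumed schedule)."""
    if epoch < warmup_epochs:
        # Linear ramp from 0 up to base_lr.
        return base_lr * epoch / warmup_epochs
    # Fraction of the post-warmup training that has elapsed, in [0, 1].
    progress = (epoch - warmup_epochs) / (max_epoch - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 1, 2, 36, 70):
        print(epoch, lr_at_epoch(epoch))
```

Under these assumptions the rate peaks at `lr` right after warmup and falls to half of it midway through the remaining epochs (epoch 36 here).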