# Coding Project: Differentiable NAS

* ### Based on the paper: H. Liu, K. Simonyan and Y. Yang, “DARTS: Differentiable Architecture Search,” International Conference on Learning Representations (ICLR), 2019

* ### Assignment

  1. Find a codebase for this paper (the original DARTS implementation is available, and you can find a few variants) and download the CIFAR10 and CIFAR100 datasets

  **The dataset and codebase have already been uploaded to the OBS of Huawei Cloud Platform, so you can use them directly in ModelArts.**

  2. Run the basic code on the server with the standard configuration of the selected paper on CIFAR10 (take the computational costs into consideration)

  3. Finish the required task and one of the optional tasks (see the following slides). Of course, you can do more than one optional task if you wish (bonus points)
  4. If you have more ideas, please specify a new task yourself (bonus points)
  5. Remember: integrate your results into your reading report
  6. Date assigned: Nov. 19, 2019;    Date Due: Dec 14, 2020


# Required Task

* The basic training and testing pipeline
    * Run a complete search process with DARTS or any of its variants on CIFAR10 (PC-DARTS is preferred due to its low cost)
    * Note: because computational resources are limited, you may not have enough to perform the re-training process
    * Pay attention to the hyper-parameters (config, epochs, etc.)
* Questions that should be answered in the report
    * Paste complete training and testing curves and the final architecture
    * Report the training and validation accuracy throughout the process
    * How is performance changing with the number of iterations?
    * Any other significant features that can be recognized in the curves?
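
To answer these questions, the accuracy values have to be pulled out of the search log. The following is a minimal sketch, assuming log lines of the form printed by `train_search.py` later in this notebook (`... train_acc 54.344000` and `... epoch 3 lr 9.912322e-02`). The function name `parse_search_log` is hypothetical, and the `valid_acc` line format is an assumption modeled on the `train_acc` lines.

```python
import re

# Patterns for the log lines of interest, e.g.
#   "03/01 12:44:04 AM train_acc 54.344000"
#   "03/01 12:44:05 AM epoch 3 lr 9.912322e-02"
TRAIN_ACC_RE = re.compile(r"\btrain_acc\s+([0-9.]+)")
VALID_ACC_RE = re.compile(r"\bvalid_acc\s+([0-9.]+)")  # assumed format
EPOCH_RE = re.compile(r"\bepoch\s+(\d+)\s+lr\s+([0-9.eE+-]+)")


def parse_search_log(text):
    """Collect per-epoch values from the text of a search log (hypothetical helper)."""
    return {
        "train_acc": [float(m.group(1)) for m in TRAIN_ACC_RE.finditer(text)],
        "valid_acc": [float(m.group(1)) for m in VALID_ACC_RE.finditer(text)],
        "epoch_lr": [(int(m.group(1)), float(m.group(2)))
                     for m in EPOCH_RE.finditer(text)],
    }


# Three lines copied from the search log in this notebook.
sample = (
    "03/01 12:43:41 AM train 150 1.268171e+00 53.621689 94.846854\n"
    "03/01 12:44:04 AM train_acc 54.344000\n"
    "03/01 12:44:05 AM epoch 3 lr 9.912322e-02\n"
)
curves = parse_search_log(sample)
print(curves)
```

The returned lists can be passed to any plotting library to draw the training and validation curves required in the report.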

## Preparation
One-time installation of the required libraries from requirement.txt and creation of the data path

In [1]:
# !pip3 install torch
!mkdir data

mkdir: cannot create directory ‘data’: File exists


Downloading CIFAR10

In [2]:
from dataset.dataset_dowloader_ import *

cifar10_dowloader()

Successfully download file cv-course-public/coding-1/cifar-10-python.tar.gz from OBS to local ./data/cifar-10-python.tar.gz


Let's start!

We are going to search for a couple of genotypes, using the following combinations of initial hyperparameters:

  * `N3-E50-CS6-BS256-CT10-BT96` - classic
     - `N3` - number of nodes (3) in each cell during search
     - `E50` - number of epochs (50) for searching the final genotype
     - `CS6` - number of cells (6) stacked as "layers"
     - `BS256` - batch size (256) from CIFAR10 during search (training portion of data is 0.5, i.e. 25k images)
     - `CT10` - nodes number (4) in each cell during eval
     - `BT96` - batch size (96) from CIFAR10 (training portion of data is 0.5 - 25k) during eval
     
  * `N3-E50-CS6-BS128-CT10-BT96` - batch size (128)
     
  * `N3-E100-CS6-BS256-CT10-BT96` - epochs number (100)
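
The naming scheme above can be decoded mechanically. The sketch below is an illustration only: `parse_tag` and `search_flags` are hypothetical helpers, and the mapping from tag fields to the `--nodes`, `--epochs`, `--layers` and `--batch_size` flags is inferred from the commands used in this notebook rather than taken from the codebase.

```python
import re


def parse_tag(tag):
    """Split a tag such as 'N3-E50-CS6-BS128-CT10-BT96' into {prefix: value}."""
    fields = {}
    for part in tag.split("-"):
        match = re.fullmatch(r"([A-Z]+)(\d+)", part)
        if match is None:
            raise ValueError(f"unexpected tag component: {part!r}")
        fields[match.group(1)] = int(match.group(2))
    return fields


def search_flags(tag):
    """Build the search flags encoded in a tag (assumed mapping, see lead-in)."""
    cfg = parse_tag(tag)
    return (f"--nodes={cfg['N']} --epochs={cfg['E']} "
            f"--layers={cfg['CS']} --batch_size={cfg['BS']}")


cfg = parse_tag("N3-E50-CS6-BS128-CT10-BT96")
print(cfg)
print(search_flags("N3-E50-CS6-BS128-CT10-BT96"))
```

Flags that the tag does not encode, such as `--multiplier`, still have to be passed separately.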

In [None]:
!python train_search.py --data='./data' --save='N3-E50-CS6-BS256-CT10-BT96' --nodes=3 --multiplier=3 --layers=6

In [None]:
!python train_search.py --data='./data' --save='N3-E50-CS6-BS128-CT10-BT96' --nodes=3 --multiplier=3 --layers=6 --batch_size=128

Experiment dir : search-N3-E50-CS6-BS128-CT10-BT96-20200301-003852
03/01 12:38:52 AM gpu device = 0
03/01 12:38:52 AM args = Namespace(arch_learning_rate=0.0006, arch_weight_decay=0.001, batch_size=128, cutout=False, cutout_length=16, data='./data', drop_path_prob=0.3, epochs=50, gpu=0, grad_clip=5, init_channels=16, layers=6, learning_rate=0.1, learning_rate_min=0.001, model_path='saved_models', momentum=0.9, multiplier=3, nodes=3, report_freq=50, save='search-N3-E50-CS6-BS128-CT10-BT96-20200301-003852', seed=2, set='cifar10', train_portion=0.5, unrolled=False, weight_decay=0.0003)
03/01 12:38:57 AM param size = 0.134410MB
Using downloaded and verified file: ./data/cifar-10-python.tar.gz
03/01 12:39:00 AM epoch 0 lr 1.000000e-01
03/01 12:39:00 AM genotype_debug = Genotype(normal=[('dil_conv_3x3', 'max_pool_3x3', 0), ('dil_conv_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('skip_connect', 'max_pool_3x3', 2), ('sep_conv_3x3', 'avg_pool_3x3', 0), ('avg_pool_3x3', 'max_p

03/01 12:42:24 AM train 000 1.248078e+00 52.343750 93.750000
03/01 12:42:50 AM train 050 1.299097e+00 52.711397 94.653799
03/01 12:43:16 AM train 100 1.278301e+00 53.163676 94.647277
03/01 12:43:41 AM train 150 1.268171e+00 53.621689 94.846854
03/01 12:44:04 AM train_acc 54.344000
03/01 12:44:05 AM epoch 3 lr 9.912322e-02
03/01 12:44:05 AM genotype_debug = Genotype(normal=[('dil_conv_3x3', 'max_pool_3x3', 0), ('dil_conv_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('skip_connect', 'max_pool_3x3', 2), ('sep_conv_3x3', 'avg_pool_3x3', 0), ('avg_pool_3x3', 'max_pool_3x3', 0), ('skip_connect', 'avg_pool_3x3', 2), ('dil_conv_3x3', 'sep_conv_3x3', 1), ('dil_conv_5x5', 'max_pool_3x3', 3)], normal_concat=range(2, 5), reduce=[('avg_pool_3x3', 'max_pool_3x3', 0), ('sep_conv_5x5', 'skip_connect', 1), ('sep_conv_5x5', 'skip_connect', 0), ('max_pool_3x3', 'max_pool_3x3', 2), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 

        [0.1251, 0.1250, 0.1249, 0.1250, 0.1249, 0.1251, 0.1251, 0.1249],
        [0.1250, 0.1248, 0.1249, 0.1252, 0.1251, 0.1252, 0.1250, 0.1249],
        [0.1250, 0.1251, 0.1251, 0.1250, 0.1249, 0.1250, 0.1249, 0.1249],
        [0.1249, 0.1251, 0.1251, 0.1249, 0.1250, 0.1250, 0.1250, 0.1249],
        [0.1250, 0.1252, 0.1251, 0.1250, 0.1250, 0.1249, 0.1250, 0.1249],
        [0.1250, 0.1251, 0.1251, 0.1248, 0.1250, 0.1250, 0.1250, 0.1249],
        [0.1252, 0.1251, 0.1249, 0.1249, 0.1249, 0.1249, 0.1250, 0.1251],
        [0.1251, 0.1250, 0.1251, 0.1252, 0.1249, 0.1250, 0.1248, 0.1249]],
       device='cuda:0', grad_fn=<SoftmaxBackward>)
tensor([0.3330, 0.3336, 0.3334], device='cuda:0', grad_fn=<SoftmaxBackward>)
03/01 12:47:28 AM train 000 1.002734e+00 64.843750 98.437500
03/01 12:47:54 AM train 050 1.023046e+00 62.714461 96.798407
03/01 12:48:19 AM train 100 1.026852e+00 62.755260 96.712562
03/01 12:48:45 AM train 150 1.010802e+00 63.539942 96.843957
03/01 12:49:08 AM train_acc 63.9880

        [0.1251, 0.1250, 0.1249, 0.1250, 0.1249, 0.1251, 0.1251, 0.1249],
        [0.1250, 0.1248, 0.1249, 0.1252, 0.1251, 0.1252, 0.1250, 0.1249],
        [0.1250, 0.1251, 0.1251, 0.1250, 0.1249, 0.1250, 0.1249, 0.1249],
        [0.1249, 0.1251, 0.1251, 0.1249, 0.1250, 0.1250, 0.1250, 0.1249],
        [0.1250, 0.1252, 0.1251, 0.1250, 0.1250, 0.1249, 0.1250, 0.1249],
        [0.1250, 0.1251, 0.1251, 0.1248, 0.1250, 0.1250, 0.1250, 0.1249],
        [0.1252, 0.1251, 0.1249, 0.1249, 0.1249, 0.1249, 0.1250, 0.1251],
        [0.1251, 0.1250, 0.1251, 0.1252, 0.1249, 0.1250, 0.1248, 0.1249]],
       device='cuda:0', grad_fn=<SoftmaxBackward>)
tensor([0.3330, 0.3336, 0.3334], device='cuda:0', grad_fn=<SoftmaxBackward>)
03/01 12:52:33 AM train 000 1.028069e+00 58.593750 96.093750
03/01 12:52:59 AM train 050 8.804079e-01 68.244485 97.702206
03/01 12:53:24 AM train 100 8.749590e-01 68.672649 97.756807
03/01 12:53:50 AM train 150 8.742217e-01 68.641349 97.847682
03/01 12:54:13 AM train_acc 69.0280

03/01 12:57:37 AM train 000 7.976441e-01 72.656250 99.218750
03/01 12:58:02 AM train 050 8.028778e-01 71.905637 98.192402
03/01 12:58:28 AM train 100 7.893840e-01 71.898205 98.073948
03/01 12:58:54 AM train 150 7.817504e-01 72.112997 98.183982
03/01 12:59:17 AM train_acc 72.388000
03/01 12:59:17 AM epoch 12 lr 8.658395e-02
03/01 12:59:17 AM genotype_debug = Genotype(normal=[('dil_conv_3x3', 'max_pool_3x3', 0), ('dil_conv_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('skip_connect', 'max_pool_3x3', 2), ('sep_conv_3x3', 'avg_pool_3x3', 0), ('avg_pool_3x3', 'max_pool_3x3', 0), ('skip_connect', 'avg_pool_3x3', 2), ('dil_conv_3x3', 'sep_conv_3x3', 1), ('dil_conv_5x5', 'max_pool_3x3', 3)], normal_concat=range(2, 5), reduce=[('avg_pool_3x3', 'max_pool_3x3', 0), ('sep_conv_5x5', 'skip_connect', 1), ('sep_conv_5x5', 'skip_connect', 0), ('max_pool_3x3', 'max_pool_3x3', 2), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3',

03/01 01:02:41 AM train 000 7.880456e-01 75.781250 98.437500
03/01 01:03:07 AM train 050 7.000935e-01 75.934436 98.590686
03/01 01:03:33 AM train 100 7.035018e-01 75.293936 98.584468
03/01 01:03:58 AM train 150 6.933117e-01 75.848510 98.608237
03/01 01:04:22 AM train_acc 75.964000
03/01 01:04:22 AM epoch 15 lr 7.959537e-02
03/01 01:04:22 AM genotype_debug = Genotype(normal=[('dil_conv_3x3', 'max_pool_3x3', 0), ('dil_conv_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('skip_connect', 'max_pool_3x3', 2), ('sep_conv_3x3', 'avg_pool_3x3', 0), ('avg_pool_3x3', 'max_pool_3x3', 0), ('skip_connect', 'avg_pool_3x3', 2), ('dil_conv_3x3', 'sep_conv_3x3', 1), ('dil_conv_5x5', 'max_pool_3x3', 3)], normal_concat=range(2, 5), reduce=[('avg_pool_3x3', 'max_pool_3x3', 0), ('sep_conv_5x5', 'skip_connect', 1), ('sep_conv_5x5', 'skip_connect', 0), ('max_pool_3x3', 'max_pool_3x3', 2), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3',

03/01 01:10:11 AM train 000 6.444074e-01 77.343750 98.437500
03/01 01:10:55 AM train 050 6.447510e-01 77.236520 98.682598
03/01 01:11:40 AM train 100 6.374932e-01 77.328280 98.731436
03/01 01:12:24 AM train 150 6.336520e-01 77.700745 98.763452
03/01 01:13:03 AM train_acc 77.888000
03/01 01:13:03 AM epoch 18 lr 7.157607e-02
03/01 01:13:03 AM genotype_debug = Genotype(normal=[('sep_conv_5x5', 'sep_conv_3x3', 1), ('dil_conv_3x3', 'sep_conv_3x3', 0), ('sep_conv_3x3', 'max_pool_3x3', 2), ('sep_conv_5x5', 'sep_conv_3x3', 1), ('dil_conv_3x3', 'sep_conv_3x3', 0), ('sep_conv_3x3', 'max_pool_3x3', 2), ('sep_conv_5x5', 'sep_conv_3x3', 3), ('sep_conv_5x5', 'max_pool_3x3', 1), ('sep_conv_5x5', 'sep_conv_3x3', 0)], normal_concat=range(2, 5), reduce=[('sep_conv_3x3', 'max_pool_3x3', 0), ('sep_conv_5x5', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 2), ('sep_conv_5x5', 'max_pool_3x3', 0), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 3), ('sep_conv_5x5', 'max_pool_3x3',

03/01 01:18:53 AM train 000 7.334464e-01 77.343750 99.218750
03/01 01:19:38 AM train 050 5.764575e-01 80.407475 99.065564
03/01 01:20:22 AM train 100 5.717085e-01 80.275371 99.056312
03/01 01:21:06 AM train 150 5.786865e-01 79.822020 99.032492
03/01 01:21:46 AM train_acc 79.748000
03/01 01:21:46 AM epoch 21 lr 6.281015e-02
03/01 01:21:46 AM genotype_debug = Genotype(normal=[('sep_conv_5x5', 'sep_conv_3x3', 1), ('dil_conv_3x3', 'sep_conv_3x3', 0), ('sep_conv_5x5', 'sep_conv_3x3', 1), ('dil_conv_5x5', 'sep_conv_5x5', 2), ('dil_conv_5x5', 'dil_conv_3x3', 0), ('sep_conv_3x3', 'max_pool_3x3', 2), ('sep_conv_5x5', 'avg_pool_3x3', 1), ('sep_conv_5x5', 'sep_conv_3x3', 3), ('sep_conv_5x5', 'sep_conv_3x3', 0)], normal_concat=range(2, 5), reduce=[('sep_conv_5x5', 'sep_conv_3x3', 1), ('sep_conv_5x5', 'sep_conv_3x3', 0), ('max_pool_3x3', 'max_pool_3x3', 2), ('sep_conv_5x5', 'max_pool_3x3', 0), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 3), ('max_pool_3x3', 'max_pool_3x3',

03/01 01:27:34 AM train 000 5.172673e-01 81.250000 97.656250
03/01 01:28:18 AM train 050 5.177096e-01 81.740196 99.172794
03/01 01:29:02 AM train 100 5.283115e-01 81.427908 99.234220
03/01 01:29:45 AM train 150 5.335048e-01 81.317260 99.239445
03/01 01:30:25 AM train_acc 81.444000
03/01 01:30:25 AM epoch 24 lr 5.360813e-02
03/01 01:30:25 AM genotype_debug = Genotype(normal=[('sep_conv_5x5', 'sep_conv_3x3', 1), ('dil_conv_3x3', 'sep_conv_5x5', 0), ('sep_conv_5x5', 'sep_conv_3x3', 1), ('dil_conv_5x5', 'sep_conv_3x3', 2), ('dil_conv_5x5', 'dil_conv_3x3', 0), ('sep_conv_5x5', 'avg_pool_3x3', 1), ('sep_conv_5x5', 'sep_conv_3x3', 3), ('sep_conv_3x3', 'avg_pool_3x3', 2), ('dil_conv_3x3', 'sep_conv_5x5', 0)], normal_concat=range(2, 5), reduce=[('sep_conv_5x5', 'sep_conv_3x3', 0), ('sep_conv_5x5', 'sep_conv_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 2), ('sep_conv_5x5', 'max_pool_3x3', 0), ('max_pool_3x3', 'max_pool_3x3', 1), ('max_pool_3x3', 'max_pool_3x3', 3), ('sep_conv_5x5', 'max_pool_3x3',

        [0.1181, 0.1255, 0.1105, 0.1166, 0.1260, 0.1552, 0.1144, 0.1336],
        [0.1160, 0.1380, 0.1142, 0.1201, 0.1201, 0.1537, 0.1220, 0.1158],
        [0.1213, 0.1335, 0.1209, 0.1287, 0.1258, 0.1272, 0.1274, 0.1152],
        [0.1194, 0.1537, 0.1155, 0.1208, 0.1169, 0.1244, 0.1153, 0.1339],
        [0.1118, 0.1367, 0.1231, 0.1198, 0.1262, 0.1444, 0.1196, 0.1184],
        [0.1132, 0.1369, 0.1217, 0.1210, 0.1409, 0.1379, 0.1129, 0.1155],
        [0.1147, 0.1366, 0.1135, 0.1119, 0.1323, 0.1382, 0.1190, 0.1340],
        [0.1158, 0.1533, 0.1125, 0.1148, 0.1276, 0.1165, 0.1308, 0.1286]],
       device='cuda:0', grad_fn=<SoftmaxBackward>)
tensor([0.2804, 0.3470, 0.3726], device='cuda:0', grad_fn=<SoftmaxBackward>)
03/01 01:36:11 AM train 000 5.033362e-01 81.250000 98.437500
03/01 01:36:55 AM train 050 4.956421e-01 82.720588 99.463848
03/01 01:37:39 AM train 100 4.956055e-01 82.812500 99.350248
03/01 01:38:23 AM train 150 4.956443e-01 82.890108 99.332575
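
The `lr` values in the log above (for example `9.912322e-02` at epoch 3 and `8.658395e-02` at epoch 12) are consistent with a cosine annealing schedule using the logged arguments `learning_rate=0.1`, `learning_rate_min=0.001` and `epochs=50`. The sketch below recomputes the schedule in plain Python. That the codebase uses exactly this scheduler is an assumption, and `cosine_lr` is a hypothetical helper name.

```python
import math


def cosine_lr(epoch, lr_max=0.1, lr_min=0.001, total_epochs=50):
    """Cosine annealing from lr_max down to lr_min over total_epochs.

    Defaults are taken from the args printed in the log; the choice of
    scheduler itself is an assumption.
    """
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


# Epochs whose learning rate appears in the log excerpt above.
for epoch in (0, 3, 12, 15, 18, 21, 24):
    print(f"epoch {epoch:2d} lr {cosine_lr(epoch):e}")
```

Comparing the printed values with the `epoch ... lr ...` lines is a quick check that a run used the intended learning-rate configuration.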


In [None]:
!python train_search.py --data='./data' --save='N3-E100-CS6-BS256-CT10-BT96' --nodes=3 --multiplier=3 --layers=6 --epochs=100

In [None]:
!python train.py --auxiliary --cutout --arch='' --data='./data' --save='N4-E50-CS8-BS256'

In [None]:
!python train.py --auxiliary --cutout --arch='N4-E50-CS8-BS128-20200118-105259' --data='./data' --save='N4-E50-CS8-BS128'

In [None]:
!python train.py --auxiliary --cutout --arch='N4-E50-CS4-BS256-20200118-105518' --data='./data' --save='N4-E50-CS4-BS256'

In [None]:
!python train.py --auxiliary --cutout --arch='N4-E20-CS8-BS256-20200118-105659' --data='./data' --save='N4-E20-CS8-BS256'