# AnalogNAS Tutorial

[AnalogAINAS](https://github.com/IBM/analog-nas) is a framework for building efficient, analog-aware deep learning models. AnalogNAS is built on top of the [AIHWKIT](https://github.com/IBM/aihwkit). The IBM Analog Hardware Acceleration Kit (AIHWKIT) is an open-source Python toolkit for exploring and using the capabilities of in-memory computing devices in the context of artificial intelligence.

At a high level, AnalogAINAS consists of four main building blocks that can interact with each other:
* Configuration spaces: a search space of architectures targeting a specific dataset.
* Evaluator: an ML predictor model that estimates:
    * The 1-day accuracy: the evaluator models the drift effect encountered in analog devices. The accuracy after 1 day of drift is predicted and used as the objective to maximize.
    * The Accuracy Variation for One Month (AVM): the difference between the accuracy after 1 month and the accuracy after 1 second.
    * The 1-day accuracy standard deviation: because the noise is stochastic, a model's accuracy varies between evaluations, and the size of that variation depends on its architecture.
* Optimizer: an optimization strategy such as an evolutionary algorithm or Bayesian optimization.
* Worker: a global object that runs the architecture search loop and the final network training pipeline.
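
To make the three evaluator targets concrete, the sketch below computes them from accuracy measurements taken at different drift times. The function name, its arguments, the sign convention for the AVM (accuracy at 1 second minus accuracy at 1 month), and the numbers are all illustrative assumptions; the evaluator itself predicts these quantities rather than measuring them.

```python
import statistics

def drift_metrics(acc_1s, acc_1day_runs, acc_1month):
    """Summarize drift robustness from measured accuracies (in percent).

    acc_1day_runs holds the 1-day accuracy of several noisy evaluations
    of the same network.
    """
    return {
        # Objective to maximize: mean accuracy after 1 day of drift.
        "acc_1day": statistics.mean(acc_1day_runs),
        # Assumed sign: positive when accuracy degrades over the month.
        "avm": acc_1s - acc_1month,
        # Spread of the 1-day accuracy across noisy evaluations.
        "std_1day": statistics.stdev(acc_1day_runs),
    }

# Made-up measurements, for illustration only.
metrics = drift_metrics(acc_1s=92.0, acc_1day_runs=[91.0, 90.0, 92.0], acc_1month=89.5)
print(metrics)
```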

## Installation and setup
**NOTE:** this installation has been tested on Linux and Windows machines.

First, refer to the [AIHWKit installation](https://aihwkit.readthedocs.io/en/latest/install.html) guide to install PyTorch and the AIHWKit toolkit.

Install the additional requirements, using:
```
pip install -r requirements.txt 
```

Afterwards, install AnalogNAS by running the ```setup.py``` file:
``` 
python setup.py install 
```

## Create a Configuration Space
AnalogNAS presents a general search space composed of ResNet-like architectures. 

The macro-architecture defined in the file ```search_spaces/resnet_macro_architecture.py``` can be adapted to any image classification dataset by specifying the input shape and the number of output classes.

In [1]:
from analogainas.search_spaces.config_space import ConfigSpace

In [2]:
## Default Search Space
CS = ConfigSpace("resnet-like",'CIFAR-10')
CS

Architecture Type: resnet-like
Search Space Size: 773094113280
------------------------------------------------
0)
Name: out_channel0
Min_Value:0
Max_value:0
Step:1

1)
Name: M
Min_Value:1
Max_value:5
Step:1

2)
Name: R1
Min_Value:1
Max_value:16
Step:1

3)
Name: R2
Min_Value:0
Max_value:16
Step:1

4)
Name: R3
Min_Value:0
Max_value:16
Step:1

5)
Name: R4
Min_Value:0
Max_value:16
Step:1

6)
Name: R5
Min_Value:0
Max_value:16
Step:1

7)
Name: convblock1
Min_Value:0
Max_value:0
Step:1

8)
Name: widenfact1
Min_Value:0.5
Max_value:0.8
Step:1

9)
Name: B1
Min_Value:1
Max_value:5
Step:1

10)
Name: convblock2
Min_Value:0
Max_value:0
Step:1

11)
Name: widenfact2
Min_Value:0.5
Max_value:0.8
Step:1

12)
Name: B2
Min_Value:1
Max_value:5
Step:1

13)
Name: convblock3
Min_Value:0
Max_value:0
Step:1

14)
Name: widenfact3
Min_Value:0.5
Max_value:0.8
Step:1

15)
Name: B3
Min_Value:1
Max_value:5
Step:1

16)
Name: convblock4
Min_Value:0
Max_value:0
Step:1

17)
Name: widenfact4
Min_Value:0.5
Max_value:0.8
St

In [3]:
CS.get_hyperparameters()

['out_channel0', 'M', 'R1', 'R2', 'R3', 'R4', 'R5', 'convblock1', 'widenfact1', 'B1', 'convblock2', 'widenfact2', 'B2', 'convblock3', 'widenfact3', 'B3', 'convblock4', 'widenfact4', 'B4', 'convblock5', 'widenfact5', 'B5']


In [4]:
## Add a hyperparameter 
## Name should be a unique ID. 
CS.add_hyperparameter("out_channel", "discrete", min_value=8, max_value=32, step=3)

In [5]:
## Adding a second hyperparameter with the same name raises an error (triggered here on purpose)
CS.add_hyperparameter("out_channel", "discrete", min_value=8, max_value=32, step=3)

Exception: Name should be unique!
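
When a script may run more than once, it can be convenient to skip a hyperparameter that is already present instead of catching the exception. The helper below is a sketch that relies only on the `get_hyperparameters` and `add_hyperparameter` calls shown above; `add_if_absent` and the `FakeSpace` stand-in used to exercise it are hypothetical and not part of AnalogNAS.

```python
def add_if_absent(space, name, kind, **kwargs):
    """Add a hyperparameter unless its name is already registered.

    Returns True when the hyperparameter was added, False when skipped.
    """
    if name in space.get_hyperparameters():
        return False
    space.add_hyperparameter(name, kind, **kwargs)
    return True


class FakeSpace:
    """Hypothetical stand-in that mimics the two calls used above."""

    def __init__(self):
        self._names = []

    def get_hyperparameters(self):
        return list(self._names)

    def add_hyperparameter(self, name, kind, **kwargs):
        if name in self._names:
            raise Exception("Name should be unique!")
        self._names.append(name)


space = FakeSpace()
first = add_if_absent(space, "out_channel", "discrete", min_value=8, max_value=32, step=3)
second = add_if_absent(space, "out_channel", "discrete", min_value=8, max_value=32, step=3)
print(first, second)  # True False
```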

In [6]:
CS.compute_cs_size()

6184752906240

In [7]:
CS.remove_hyperparameter("out_channel")
CS.compute_cs_size()

773094113280
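
The two sizes printed above differ by exactly a factor of 8, which matches the number of values in `range(8, 32, 3)`. This suggests that `max_value` is treated as an exclusive bound for discrete hyperparameters; that is an inference from the printed numbers, not a documented guarantee. The arithmetic can be checked directly:

```python
base_size = 773094113280        # size before adding "out_channel"
extended_size = 6184752906240   # size after adding "out_channel"

# Candidate values if max_value is exclusive, as in Python's range().
values = list(range(8, 32, 3))
print(values)        # [8, 11, 14, 17, 20, 23, 26, 29]
print(len(values))   # 8

# The space grows by one factor per value of the new hyperparameter.
assert base_size * len(values) == extended_size
```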

In [8]:
# Sample possible configurations
configs = CS.sample_arch_uniformly(5)

In [9]:
# configs is a list of 5 dictionaries, each describing one possible architecture in our search space.
configs 

[{'out_channel0': 32,
  'M': 4,
  'R1': 11,
  'R2': 7,
  'R3': 7,
  'R4': 0,
  'R5': 0,
  'convblock1': 1,
  'widenfact1': 0.6734827172614789,
  'B1': 4,
  'convblock2': 2,
  'widenfact2': 0.5117470659165128,
  'B2': 3,
  'convblock3': 1,
  'widenfact3': 0.7852849803876965,
  'B3': 2,
  'convblock4': 1,
  'widenfact4': 0.6249027462741319,
  'B4': 1,
  'convblock5': 0,
  'widenfact5': 0,
  'B5': 0},
 {'out_channel0': 12,
  'M': 2,
  'R1': 11,
  'R2': 11,
  'R3': 0,
  'R4': 0,
  'R5': 0,
  'convblock1': 1,
  'widenfact1': 0.7024563876233106,
  'B1': 3,
  'convblock2': 2,
  'widenfact2': 0.774605503165751,
  'B2': 1,
  'convblock3': 0,
  'widenfact3': 0,
  'B3': 0,
  'convblock4': 0,
  'widenfact4': 0,
  'B4': 0,
  'convblock5': 0,
  'widenfact5': 0,
  'B5': 0},
 {'out_channel0': 16,
  'M': 1,
  'R1': 1,
  'R2': 0,
  'R3': 0,
  'R4': 0,
  'R5': 0,
  'convblock1': 1,
  'widenfact1': 0.5633164544867959,
  'B1': 4,
  'convblock2': 0,
  'widenfact2': 0,
  'B2': 0,
  'convblock3': 0,
  'widenf
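
In the samples printed above, the number of blocks with a nonzero `B` entry equals `M`, and the entries of the remaining blocks are zero. The sketch below checks that pattern on the `M` and `B` fields copied from the first two samples; `count_active_blocks` is a hypothetical helper, and reading `M` as the number of active blocks is an observation from this output rather than documented behavior.

```python
def count_active_blocks(config, max_blocks=5):
    """Count the blocks whose B entry is nonzero."""
    return sum(1 for i in range(1, max_blocks + 1) if config.get(f"B{i}", 0) > 0)

# M and B fields copied from the first two sampled configurations.
samples = [
    {"M": 4, "B1": 4, "B2": 3, "B3": 2, "B4": 1, "B5": 0},
    {"M": 2, "B1": 3, "B2": 1, "B3": 0, "B4": 0, "B5": 0},
]

for config in samples:
    print(config["M"], count_active_blocks(config))
```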

## Evaluator 

To speed up the search, we built a machine learning predictor to evaluate the accuracy and robustness of any given architecture from the configuration space. 

In [10]:
# Load the evaluator 
from analogainas.evaluators.xgboost import XGBoostEvaluator
evaluator = XGBoostEvaluator()

In [11]:
# The ranker model ranks the architectures according to their 1-day accuracy. It is trained with a listwise training loss. 
evaluator.ranker 

XGBRegressor(base_score='8.5658985E-1', booster='gbtree', callbacks=None,
             colsample_bylevel=None, colsample_bynode=None,
             colsample_bytree=None, device=None, early_stopping_rounds=None,
             enable_categorical=False, eval_metric=None, feature_types=None,
             gamma=None, grow_policy=None, importance_type=None,
             interaction_constraints=None, learning_rate=None, max_bin=None,
             max_cat_threshold=None, max_cat_to_onehot=None,
             max_delta_step=None, max_depth=None, max_leaves=None,
             min_child_weight=None, missing=nan, monotone_constraints=None,
             multi_strategy=None, n_estimators=None, n_jobs=None,
             num_parallel_tree=None, random_state=None, ...)

In [12]:
# The AVM predictor regresses the accuracy variation over one month (AVM).
evaluator.avm_predictor

XGBRegressor(base_score='9.32594E0', booster='gbtree', callbacks=None,
             colsample_bylevel=None, colsample_bynode=None,
             colsample_bytree=None, device=None, early_stopping_rounds=None,
             enable_categorical=False, eval_metric=None, feature_types=None,
             gamma=None, grow_policy=None, importance_type=None,
             interaction_constraints=None, learning_rate=None, max_bin=None,
             max_cat_threshold=None, max_cat_to_onehot=None,
             max_delta_step=None, max_depth=None, max_leaves=None,
             min_child_weight=None, missing=nan, monotone_constraints=None,
             multi_strategy=None, n_estimators=None, n_jobs=None,
             num_parallel_tree=None, random_state=None, ...)
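
A predictor of this kind consumes a fixed-length numeric vector per architecture. The sketch below shows one way to flatten configuration dictionaries into such vectors and rank candidates by a predicted score. `to_vector`, `rank_configs`, and the toy scoring function are hypothetical stand-ins; the real evaluator's feature encoding and prediction interface may differ.

```python
def to_vector(config, names):
    """Flatten a configuration dict into a list, in a fixed key order."""
    return [float(config[name]) for name in names]

def rank_configs(configs, names, predict):
    """Return configs sorted from highest to lowest predicted score."""
    return sorted(configs, key=lambda c: predict(to_vector(c, names)), reverse=True)

# Toy predictor with an arbitrary scoring rule. Illustration only.
def toy_predict(vector):
    m, widenfact1 = vector
    return widenfact1 - 0.01 * m

names = ["M", "widenfact1"]
candidates = [
    {"M": 4, "widenfact1": 0.55},
    {"M": 2, "widenfact1": 0.75},
    {"M": 1, "widenfact1": 0.60},
]
ranked = rank_configs(candidates, names, toy_predict)
print([c["M"] for c in ranked])  # [2, 1, 4]
```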

## Search Optimizer and Worker 
In this example, we will use evolutionary search to look for the best architecture in CS using our evaluator. 

In [13]:
from analogainas.search_algorithms.ea_optimized import EAOptimizer
from analogainas.search_algorithms.worker import Worker


In [14]:
optimizer = EAOptimizer(evaluator, population_size=20, nb_iter=10)  
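
For readers unfamiliar with evolutionary search, the sketch below shows a generic elitist loop driven by the same two knobs, `population_size` and `nb_iter`. It is not the `EAOptimizer` implementation: the selection rule, the mutation operator, and the toy fitness function are all assumptions chosen to keep the example self-contained.

```python
import random

def evolutionary_search(fitness, sample, mutate, population_size=20, nb_iter=10, seed=0):
    """Generic elitist evolutionary loop (not the EAOptimizer implementation)."""
    rng = random.Random(seed)
    population = [sample(rng) for _ in range(population_size)]
    history = []
    for _ in range(nb_iter):
        # Keep the better half unchanged, so the best score never decreases.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(1, population_size // 2)]
        history.append(fitness(ranked[0]))
        # Refill the population with mutated copies of random parents.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=fitness), history

# Toy problem: two integer hyperparameters with a known optimum at M=3, B1=2.
BOUNDS = {"M": (1, 5), "B1": (1, 5)}

def sample(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in BOUNDS.items()}

def mutate(config, rng):
    # Resample one randomly chosen hyperparameter within its bounds.
    child = dict(config)
    key = rng.choice(sorted(BOUNDS))
    child[key] = rng.randint(*BOUNDS[key])
    return child

def fitness(config):
    return -((config["M"] - 3) ** 2) - (config["B1"] - 2) ** 2

best, history = evolutionary_search(fitness, sample, mutate)
print(best, history[-1])
```

Because the parents are carried over unchanged, the best score recorded in `history` is non-decreasing from one iteration to the next.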

In [15]:
NB_RUN = 1
worker = Worker(CS, optimizer=optimizer, runs=NB_RUN)

In [16]:
worker.search()

The 'results' directory already exists.

Search 0 started
2
ITERATION 0 completed: best acc [0.92193866]
ITERATION 1 completed: best acc [0.92193866]
ITERATION 2 completed: best acc [0.92193866]
ITERATION 3 completed: best acc [0.92193866]
ITERATION 4 completed: best acc [0.92193866]
ITERATION 5 completed: best acc [0.92193866]
ITERATION 6 completed: best acc [0.92193866]
ITERATION 7 completed: best acc [0.92193866]
ITERATION 8 completed: best acc [0.92193866]
ITERATION 9 completed: best acc [0.92193866]
Best Acc = [0.92193866]
SEARCH ENDED




In [17]:
worker.result_summary()

Best architecture accuracy:  [0.92193866]
Standard deviation of accuracy over 1 runs: nan
Best architecture:  {'out_channel0': 16, 'M': 4, 'R1': 7, 'R2': 7, 'R3': 12, 'R4': 0, 'R5': 0, 'convblock1': 1, 'widenfact1': 0.5168686107241711, 'B1': 4, 'convblock2': 1, 'widenfact2': 0.5981899279712694, 'B2': 1, 'convblock3': 1, 'widenfact3': 0.7993492021086113, 'B3': 3, 'convblock4': 1, 'widenfact4': 0.5517837667404678, 'B4': 4, 'convblock5': 0, 'widenfact5': 0, 'B5': 0}
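
The `nan` reported for the standard deviation is what one would expect from a sample standard deviation, which is undefined for a single run; whether the library computes it exactly this way is an assumption. The sketch below, using the hypothetical helper `sample_std`, reproduces the behavior with the standard library and suggests that setting `NB_RUN` to 2 or more would give a finite value.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation, or nan when fewer than two values exist."""
    if len(values) < 2:
        return float("nan")
    return statistics.stdev(values)

one_run = [0.92193866]            # accuracy from the single run above
two_runs = [0.92193866, 0.915]    # second value is made up for illustration

print(math.isnan(sample_std(one_run)))   # True
print(sample_std(two_runs) > 0)          # True
```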
