# AIDK DENAS ASR DEMO

Neural Architecture Search (NAS) is quickly becoming a standard methodology for designing neural network models. However, NAS is typically compute-intensive because many candidate models must be evaluated before the best one is chosen. As one of the core capabilities of AIDK, DENAS is a hardware-aware, train-free neural architecture search framework that generates compact, lightweight network architectures capable of delivering higher throughput. DENAS is a domain-specific solution that supports the CV, NLP, ASR and RecSys domains.

DENAS core components include:
- Reduced Search Space: The architecture is composed of popular network skeletons such as ResNet (bottleneck blocks, residual blocks), recurrent-cell-based models and transformer-based models. Incorporating prior knowledge about the typical properties of architectures well suited to a task reduces the size of the search space and simplifies the search.

- Hardware Aware Search Algorithm: Adds more Intel-optimized network operators and leverages Intel tools to make search and training as fast as possible on the CPU. We extend LINAS's multi-objective evolutionary algorithm (MOEA) to search for better network architectures under hardware-aware multi-objective optimization (train-free score, latency, FLOPs and so on).

- Train-free evaluation: A novel zero-cost metric that combines network trainability, network expressivity and generalization. Computing the DE-Score takes only a few forward inferences without training, making it extremely fast, lightweight and data-free.

<center>
<img src="./img/denas_prototype.png" width="800"/>
</center>

# DENAS Search Space

Transformer-based search space: consists of attention layers, layer normalization and feed-forward layers. The search space is controlled by setting the network depth, the number of attention heads, the MLP layer ratio and the layer dimension.

<center>
<img src="./img/transformer.png" width="800"/><figure>Overall architecture of NAS ASR</figure>
</center>
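
As a minimal sketch of how such a search space could be represented, the snippet below samples one candidate and flattens it into a tuple. The candidate values and the tuple layout `(depth, MLP ratios..., head counts..., dimension)` are assumptions inferred from the sample search log later in this document, not taken from the DENAS configuration; `SEARCH_SPACE`, `sample_architecture`, `to_structure` and `space_size` are hypothetical names.

```python
import random

# Hypothetical candidate values, chosen to match those visible in the sample search log.
SEARCH_SPACE = {
    "depth": [5, 6, 9, 11],
    "mlp_ratio": [3.0, 3.5, 4.0],
    "num_heads": [2, 4],
    "embed_dim": [128, 256, 512],
}

def sample_architecture(space, rng):
    """Draw a depth, then one MLP ratio and one head count per layer, plus a shared dimension."""
    depth = rng.choice(space["depth"])
    return {
        "depth": depth,
        "mlp_ratio": [rng.choice(space["mlp_ratio"]) for _ in range(depth)],
        "num_heads": [rng.choice(space["num_heads"]) for _ in range(depth)],
        "embed_dim": rng.choice(space["embed_dim"]),
    }

def to_structure(arch):
    """Flatten to (depth, ratio_1..ratio_d, heads_1..heads_d, embed_dim); layout is an assumption."""
    return (arch["depth"], *arch["mlp_ratio"], *arch["num_heads"], arch["embed_dim"])

def space_size(space):
    """Count distinct architectures: per-layer choices multiply, depths and dimensions add options."""
    per_layer = len(space["mlp_ratio"]) * len(space["num_heads"])
    return len(space["embed_dim"]) * sum(per_layer ** d for d in space["depth"])

rng = random.Random(0)
arch = sample_architecture(SEARCH_SPACE, rng)
print(to_structure(arch))
print(space_size(SEARCH_SPACE))
```

Even with only a handful of values per knob, the per-layer choices make the space grow exponentially with depth, which is why a cheap train-free score matters.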

# DENAS Search Engine

## Evolutionary Algorithm

- Initialization: Initialize the population by randomly sampling the possible components of each layer.
- Selection: Use our training-free proxy to rank the architectures by evaluation score, and measure the other objective metrics (such as latency and FLOPs), which are combined using the LINAS multi-objective search method.
- Crossover and Mutation: Search over the candidate transformer architectures. To focus the search on relevant portions of the search space, we enforce an upper bound on the latency of sampled architectures and reject samples outside this bound.

<center>
<img src="./img/EA.PNG" width="800"/><figure>Evolutionary Algorithm</figure>
</center>
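
The three steps can be sketched as a small loop. This is an illustration under stated assumptions, not the DENAS implementation: `toy_score` stands in for the train-free proxy, `toy_latency` is a made-up cost model standing in for measured latency, the candidate values are hypothetical, and selection here ranks by a single score with latency as a hard constraint rather than using the LINAS multi-objective method.

```python
import random

# Hypothetical candidate values (not read from the DENAS config).
DEPTHS, RATIOS, HEADS, DIMS = [5, 6, 9, 11], [3.0, 3.5, 4.0], [2, 4], [128, 256, 512]

def sample(rng):
    """Initialization: draw every per-layer component at random."""
    depth = rng.choice(DEPTHS)
    return {
        "depth": depth,
        "mlp_ratio": [rng.choice(RATIOS) for _ in range(depth)],
        "num_heads": [rng.choice(HEADS) for _ in range(depth)],
        "embed_dim": rng.choice(DIMS),
    }

def toy_latency(arch):
    """Placeholder cost model: a rough weight count in millions, standing in for latency."""
    d = arch["embed_dim"]
    return sum(d * d * (4 + 2 * r) for r in arch["mlp_ratio"]) / 1e6

def toy_score(arch):
    """Placeholder for the train-free proxy score."""
    return sum(arch["num_heads"]) / arch["depth"] + arch["embed_dim"] / 512

def crossover(a, b, rng):
    """Keep the depth of `a`; take each layer gene from `b` with probability 0.5 where `b` has it."""
    def pick(key, i):
        return b[key][i] if i < b["depth"] and rng.random() < 0.5 else a[key][i]
    depth = a["depth"]
    return {
        "depth": depth,
        "mlp_ratio": [pick("mlp_ratio", i) for i in range(depth)],
        "num_heads": [pick("num_heads", i) for i in range(depth)],
        "embed_dim": rng.choice([a["embed_dim"], b["embed_dim"]]),
    }

def mutate(arch, rng, prob=0.2):
    """Resample each gene independently with probability `prob`."""
    child = {**arch, "mlp_ratio": list(arch["mlp_ratio"]), "num_heads": list(arch["num_heads"])}
    for i in range(child["depth"]):
        if rng.random() < prob:
            child["mlp_ratio"][i] = rng.choice(RATIOS)
        if rng.random() < prob:
            child["num_heads"][i] = rng.choice(HEADS)
    if rng.random() < prob:
        child["embed_dim"] = rng.choice(DIMS)
    return child

def evolve(score_fn, latency_fn, max_latency, pop_size=20, generations=10, seed=0):
    rng = random.Random(seed)
    population = []
    for _ in range(10_000):  # bounded rejection sampling for the initial population
        cand = sample(rng)
        if latency_fn(cand) <= max_latency:
            population.append(cand)
        if len(population) == pop_size:
            break
    if len(population) < 2:
        raise ValueError("latency bound rejects almost every sample")
    for _ in range(generations):
        # Selection: keep the top half by proxy score.
        population.sort(key=score_fn, reverse=True)
        parents = population[: max(2, len(population) // 2)]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = mutate(crossover(p1, p2, rng), rng)
            if latency_fn(child) <= max_latency:  # reject samples outside the bound
                children.append(child)
        population = parents + children
    return max(population, key=score_fn)

best = evolve(toy_score, toy_latency, max_latency=10.0)
print(best["depth"], best["embed_dim"], round(toy_latency(best), 2))
```

Rejecting over-budget candidates before scoring keeps the proxy evaluations focused on architectures that could actually be deployed.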
 
## Reinforcement Learning

Use an RNN model as the controller that searches the network architecture. The RNN weights are updated by casting the search as a reinforcement learning problem:

- Action: the output emitted by the RNN controller at each step.
- State: the network structure generated by the controller up to step t.
- Reward/cost: the loss on the validation set of the resulting neural network.

This formulation allows reinforcement learning methods to be used to update the RNN controller weights.

<center>
<img src="./img/RL.png" width="800"/><figure>RNN controller Neural Architecture Search</figure>
</center>
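
To make the action/reward loop concrete, here is a deliberately simplified REINFORCE sketch. It swaps the RNN controller for independent softmax logits per decision, so the policy does not condition on the state, which keeps the example dependency-free. The decision list `CHOICES`, the `toy_validation_loss` function and its hidden target are all hypothetical; the reward is taken as the negative validation loss.

```python
import math
import random

# Hypothetical decisions: embed_dim, num_heads, mlp_ratio.
CHOICES = [[128, 256, 512], [2, 4], [3.0, 3.5, 4.0]]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_actions(logits, rng):
    """One action (an index into CHOICES[i]) per decision step."""
    actions = []
    for step_logits in logits:
        probs = softmax(step_logits)
        actions.append(rng.choices(range(len(probs)), weights=probs)[0])
    return actions

def toy_validation_loss(actions):
    """Placeholder loss: number of decisions that differ from a made-up best architecture."""
    target = [1, 0, 2]
    return float(sum(a != t for a, t in zip(actions, target)))

def train_controller(loss_fn, steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * len(c) for c in CHOICES]
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        actions = sample_actions(logits, rng)
        reward = -loss_fn(actions)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for step_logits, a in zip(logits, actions):
            probs = softmax(step_logits)
            for k in range(len(step_logits)):
                # d log pi(a) / d logit_k for a softmax policy
                grad_logp = (1.0 if k == a else 0.0) - probs[k]
                step_logits[k] += lr * advantage * grad_logp
    return logits

logits = train_controller(toy_validation_loss)
best = [max(range(len(l)), key=l.__getitem__) for l in logits]
print([c[i] for c, i in zip(CHOICES, best)])
```

In a real controller each sampled architecture would have to be trained, or scored with a proxy, to obtain the reward, which is what makes this approach far more expensive than the toy loss suggests.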

# Train-free Score (DE_Score)

DE_Score_cnn:

Trainability, expressivity, and generalization are three important, distinct, and complementary properties for characterizing and understanding neural networks. For our prototype DE_Score_base, we combine network expressivity and trainability as our proxy performance predictor.

<center>
<img src="./img/DE_Score_base.PNG" width="800"/><figure>DE_Score_cnn algorithm</figure>
</center>

DE_Score_transformer:

Recently, Transformers have achieved remarkable success in several natural language processing tasks. This progress is closely tied to architecture design, so it is worthwhile to use Transformer-based Neural Architecture Search to find better architectures automatically. We propose a unified, effective method that measures the synaptic diversity of MSA (multi-head self-attention) and the synaptic saliency of MLP, which are the basic components of the transformer.
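
One common way to formalize these two quantities is shown below. It is given for orientation only: whether DE_Score_transformer uses exactly these definitions is an assumption, and the symbols are introduced here for illustration. $W_m$ ranges over the MSA weight matrices, $W_n$ over the MLP weight matrices, $\mathcal{L}$ is a loss evaluated at initialization, $\lVert\cdot\rVert_{*}$ is the nuclear norm (a convex surrogate for rank), and the bars in the second expression denote the sum of absolute values of the entries.

$$
S_{\mathrm{MSA}} = \sum_{m} \left\lVert \frac{\partial \mathcal{L}}{\partial W_m} \right\rVert_{*} \cdot \left\lVert W_m \right\rVert_{*},
\qquad
S_{\mathrm{MLP}} = \sum_{n} \left| \frac{\partial \mathcal{L}}{\partial W_n} \odot W_n \right|
$$

A combined train-free score can then be taken as $S_{\mathrm{MSA}} + S_{\mathrm{MLP}}$ summed over all layers, which needs only one forward and backward pass per candidate.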

# DE-NAS for ASR

## Environment Setup

Build the Docker image and start the container:

```
cd Dockerfile-ubuntu18.04
docker build -t e2eaiok_pytorch . -f DockerfilePytorch110
docker run -itd --name aidk-denas-asr --privileged --ipc=host --network host --device=/dev/dri -v ${dataset_path}:/home/vmagent/app/dataset -v ${aidk_code_path}:/home/vmagent/app/e2eaiok -w /home/vmagent/app/ e2eaiok_pytorch:latest /bin/bash
```

Enter the container with `docker exec -it aidk-denas-asr bash`.

Start the Jupyter notebook service:

```
source /opt/intel/oneapi/setvars.sh --ccl-configuration=cpu_icc --force
conda activate pytorch-1.10.0
pip install jupyter
nohup jupyter notebook --notebook-dir=/home/vmagent/app/e2eaiok/ --ip=0.0.0.0 --port=8888 --allow-root &
```

You can now open the AIDK DENAS ASR demo at http://${hostname}:8888/

## Launch search

```
%%bash
cd /home/vmagent/app/hydro.ai/DeNas
source /opt/intel/oneapi/setvars.sh --ccl-configuration=cpu_icc --force
conda activate pytorch-1.10.0
python search.py --domain asr --conf ../conf/denas/asr/aidk_denas_asr.conf
```

 
```
:: initializing oneAPI environment ...
   bash: BASH_VERSION = 4.4.20(1)-release
   args: Using "$@" for setvars.sh arguments: --ccl-configuration=cpu_icc --force
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpl -- latest
:: intelpython -- latest

CommandNotFoundError: Your shell has not been properly configured to use 'conda deactivate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: modelzoo -- latest
:: mpi -- latest
:: neural-compressor -- latest
:: pytorch -- latest
:: tbb -- latest
:: tensorflow -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

10/14/2022 04:42:55 - INFO - DENAS -   epoch = 0 structure = (11, 3.5, 3.0, 3.5, 4.0, 3.5, 3.5, 3.5, 3.5, 3.5, 4.0, 3.0, 2, 4, 2, 2, 4, 2, 4, 2, 2, 4, 4, 512) nas_score = 0.1436631679534912 params = 60.231936
10/14/2022 04:43:00 - INFO - DENAS -   epoch = 1 structure = (6, 3.5, 3.5, 3.5, 4.0, 4.0, 3.0, 4, 4, 4, 2, 2, 2, 512) nas_score = 0.1554543673992157 params = 46.044416
10/14/2022 04:43:04 - INFO - DENAS -   epoch = 2 structure = (11, 4.0, 4.0, 4.0, 3.0, 4.0, 3.5, 3.5, 3.0, 4.0, 3.5, 4.0, 2, 2, 2, 2, 2, 4, 2, 2, 4, 4, 2, 256) nas_score = 0.19446012377738953 params = 19.309184
10/14/2022 04:43:06 - INFO - DENAS -   epoch = 3 structure = (9, 4.0, 3.0, 4.0, 3.5, 4.0, 3.0, 4.0, 4.0, 4.0, 4, 4, 2, 4, 4, 2, 4, 2, 2, 128) nas_score = 0.020150352269411087 params = 6.462656
10/14/2022 04:43:10 - INFO - DENAS -   epoch = 4 structure = (5, 4.0, 4.0, 3.5, 3.5, 3.0, 2, 2, 2, 2, 2, 512) nas_score = 0.06112755835056305 params = 43.154432
10/14/2022 04:43:21 - INFO - DENAS -   epoch = 5 structure 

paths: /home/vmagent/app/hydro.ai/DeNas/asr/utils, /home/vmagent/app/hydro.ai/DeNas/asr
['/home/vmagent/app/hydro.ai/DeNas', '/opt/intel/oneapi/advisor/2022.1.0/pythonapi', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.10.0/lib/python39.zip', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.10.0/lib/python3.9', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.10.0/lib/python3.9/lib-dynload', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.10.0/lib/python3.9/site-packages', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.10.0/lib/python3.9/site-packages/warprnnt_pytorch-0.1-py3.9-linux-x86_64.egg', '', '..', '/home/vmagent/app/hydro.ai/DeNas', '/home/vmagent/app/hydro.ai/DeNas', '/home/vmagent/app/hydro.ai/DeNas', '/home/vmagent/app/hydro.ai/DeNas', '/home/vmagent/app/hydro.ai/DeNas', '/home/vmagent/app/hydro.ai/DeNas/asr']
DE-NAS search best structure took 50.74229020252824 sec
DE-NAS completed, best structure is (11, 4.0, 4.0, 3.5, 4.0, 3.0, 3.0, 4.0, 4.0, 3.5
```
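
The search log can be post-processed to compare candidates. The sketch below parses `epoch`, `structure`, `nas_score` and `params` from log lines in the format shown above and picks the highest-scoring entry. Splitting the tuple into depth, per-layer MLP ratios, per-layer head counts and a final dimension is an interpretation inferred from the tuple lengths, not something the log states; `parse_search_log` is a hypothetical helper.

```python
import ast
import re

LOG_LINE = re.compile(
    r"epoch = (?P<epoch>\d+) structure = (?P<structure>\(.*?\)) "
    r"nas_score = (?P<score>\S+) params = (?P<params>\S+)"
)

def parse_search_log(text):
    """Return one record per complete log line; truncated lines are skipped."""
    records = []
    for match in LOG_LINE.finditer(text):
        structure = ast.literal_eval(match["structure"])
        depth = structure[0]
        records.append({
            "epoch": int(match["epoch"]),
            "depth": depth,
            # Assumed layout: depth, then `depth` MLP ratios, then `depth` head counts, then dimension.
            "mlp_ratio": list(structure[1:1 + depth]),
            "num_heads": list(structure[1 + depth:1 + 2 * depth]),
            "embed_dim": structure[-1],
            "nas_score": float(match["score"]),
            "params": float(match["params"]),
        })
    return records

sample = """\
10/14/2022 04:43:00 - INFO - DENAS -   epoch = 1 structure = (6, 3.5, 3.5, 3.5, 4.0, 4.0, 3.0, 4, 4, 4, 2, 2, 2, 512) nas_score = 0.1554543673992157 params = 46.044416
10/14/2022 04:43:10 - INFO - DENAS -   epoch = 4 structure = (5, 4.0, 4.0, 3.5, 3.5, 3.0, 2, 2, 2, 2, 2, 512) nas_score = 0.06112755835056305 params = 43.154432
"""
records = parse_search_log(sample)
best = max(records, key=lambda r: r["nas_score"])
print(best["epoch"], best["embed_dim"], best["num_heads"])
```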

## Load, build and train model

```
%%bash
cd /home/vmagent/app/hydro.ai/DeNas
source /opt/intel/oneapi/setvars.sh --ccl-configuration=cpu_icc --force
conda activate pytorch-1.10.0
python trainer/train.py --domain=asr --conf=./asr/config/trainer_config.yaml
```

 
```
:: initializing oneAPI environment ...
   bash: BASH_VERSION = 4.4.20(1)-release
   args: Using "$@" for setvars.sh arguments: --ccl-configuration=cpu_icc --force
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpl -- latest
:: intelpython -- latest

CommandNotFoundError: Your shell has not been properly configured to use 'conda deactivate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: modelzoo -- latest
:: mpi -- latest
:: neural-compressor -- latest
:: pytorch -- latest
:: tbb -- latest
:: tensorflow -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

10/14/2022 04:25:09 - INFO - asr.utils.utils -   Beginning experiment!
10/14/2022 04:25:09 - INFO - asr.utils.utils -   Experiment folder: results/transformer/74443

ASR args:Namespace(param_file='asr/config/transformer.yaml', device='cpu', local_rank=None, distributed_launch=False, distributed_backend='gloo', find_unused_parameters=False, max_grad_norm=None, grad_accumulation_factor=None, seed=74443, output_folder='results/transformer/74443', save_folder='results/transformer/74443/save', data_folder='/home/vmagent/app/dataset/LibriSpeech', train_csv='/home/vmagent/app/dataset/LibriSpeech/train-test.csv', valid_csv='/home/vmagent/app/dataset/LibriSpeech/dev-test.csv', test_csv=['/home/vmagent/app/dataset/LibriSpeech/dev-test.csv'], skip_prep=False, lm_model_ckpt=None, tokenizer_ckpt='asr/results/transformer/74443/save/tokenizer.ckpt')

10/14/2022 04:25:30 - INFO - train -   epoch: 1, step: 1|4, time: 16.27s, loss: 1165.78955078125, avg_loss: 1165.7896, lr: 0.001
10/14/2022 04:25:42 - INFO - train -   epoch: 1, step: 2|4, time: 12.40s, loss: 1218.5321044921875, avg_loss: 1192.1608, lr: 0.001
10/14/2022 04:25:52 - INFO - train -   epoch: 1, step: 3|4, time: 9.83s, loss: 671.30810546875, avg_loss: 1018.5433, lr: 0.001
10/14/2022 04:25:55 - INFO - train -   epoch: 1, step: 4|4, time: 2.85s, loss: 739.382080078125, avg_loss: 948.7530, lr: 4e-07
10/14/2022 04:25:55 - INFO - train -   epoch: 1, time: 44.11s, avg_loss: 948.7530
10/14/2022 04:26:02 - INFO - evaluate -   epoch: 1, time: 6.978513717651367, wer: 100.0, acc: 0.011384062312762133, avg_loss: 270.6415820187086
10/14/2022 04:26:02 - INFO - train -   training time: 51.15848112106323
```
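
When reading the training log, `avg_loss` appears to be the running mean of the per-step losses within the epoch. That interpretation is an assumption about the trainer, but the short check below reproduces the four logged values from the step losses above.

```python
from itertools import accumulate

# Per-step losses and logged running averages, copied from the training log above.
step_losses = [1165.78955078125, 1218.5321044921875, 671.30810546875, 739.382080078125]
logged_avg = [1165.7896, 1192.1608, 1018.5433, 948.7530]

# Running mean after each step: cumulative sum divided by the number of steps so far.
running_avg = [total / (i + 1) for i, total in enumerate(accumulate(step_losses))]

for step, (computed, logged) in enumerate(zip(running_avg, logged_avg), start=1):
    print(step, f"{computed:.4f}", logged)
```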


## Performance Overview

![asr_performance](./img/nas_asr_perf.png)

9.18x training speedup over RNN-T baseline (1x CLX)