# DENAS ASR DEMO

DE-NAS is a training-free, hardware-aware neural architecture search (NAS) method. This demo shows how to use DE-NAS to search, without training, for lighter, faster, high-performance transformer-based automatic speech recognition (ASR) models.

# Content

* [DENAS ASR Search Space](#DENAS-ASR-Search-Space)
* [Performance Overview](#Performance-Overview)
* [Demo](#Demo)

# DENAS ASR Search Space

Transformers have recently achieved remarkable success on several automatic speech recognition tasks. Because this progress depends heavily on architecture design, it is worthwhile to apply Transformer-based neural architecture search to find better architectures automatically. We propose a unified, effective scoring method based on the synaptic diversity of MSA (multi-head self-attention) and the synaptic saliency of MLP, which are the basic components of the transformer.

The Transformer-based search space consists of attention layers, layer normalization, and feed-forward layers. The search space is controlled by four settings: network depth, number of attention heads, MLP ratio, and layer (embedding) dimension.

<center>
<img src="./img/transformer.png" width="600"/><figure>Overall architecture of NAS ASR</figure>
</center>
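To give a feel for how depth, MLP ratio, and embedding dimension drive model size, here is a minimal sketch that counts the weights and biases of standard transformer encoder layers. It assumes the textbook layer layout (four attention projections, a two-layer MLP, two LayerNorms); `encoder_layer_params` and `encoder_params` are illustrative names, not DE-NAS functions. It ignores the decoder, front-end, and output layers, so it will not reproduce the `params` values reported by the search.

```python
def encoder_layer_params(embed_dim: int, mlp_ratio: float) -> int:
    """Count weights and biases in one standard transformer encoder layer."""
    hidden = int(embed_dim * mlp_ratio)
    # Q, K, V, and output projections: four (embed_dim x embed_dim) matrices plus biases.
    attention = 4 * (embed_dim * embed_dim + embed_dim)
    # Two linear layers: embed_dim -> hidden -> embed_dim, each with a bias.
    feed_forward = (embed_dim * hidden + hidden) + (hidden * embed_dim + embed_dim)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * embed_dim)
    return attention + feed_forward + layer_norms


def encoder_params(embed_dim: int, mlp_ratios: list) -> int:
    """Sum the per-layer counts; the depth is the length of mlp_ratios."""
    return sum(encoder_layer_params(embed_dim, ratio) for ratio in mlp_ratios)


# Example: a 6-layer encoder with per-layer MLP ratios and embedding dimension 512.
ratios = [3.5, 3.5, 3.5, 4.0, 4.0, 3.0]
total = encoder_params(512, ratios)
print(f"encoder-only parameters: {total / 1e6:.2f}M")
```

In this standard layout the number of attention heads does not change the parameter count, because the heads split the same embedding dimension among themselves.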

# Performance Overview

DE-NAS is used to search for the optimal transformer-based ASR model structure.

* Training dataset: LibriSpeech train-clean-100
* Early-stop metric: 25% WER

Performance data to be added.

# Demo

## Environment setup
``` bash
# Setup ENV
git clone https://github.com/intel/e2eAIOK.git
cd e2eAIOK
python3 scripts/start_e2eaiok_docker.py -b pytorch120 -w ${host0} ${host1} ${host2} ${host3} --proxy ""
```

## Enter Docker

``` bash
sshpass -p docker ssh ${host0} -p 12347
```

## Prepare workflow

``` bash
# Download Dataset
# Download and unzip dataset from https://www.openslr.org/12 to /home/vmagent/app/dataset/LibriSpeech
# Download tokenizer from https://huggingface.co/speechbrain/asr-transformer-transformerlm-librispeech/blob/main/tokenizer.ckpt to /home/vmagent/app/dataset/LibriSpeech

# Process audio data
cd /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr
conda activate pytorch
bash scripts/preprocess_librispeech.sh
```

## Launch search

``` bash
# DENAS ASR search configuration
cat /home/vmagent/app/e2eaiok/conf/denas/asr/e2eaiok_denas_asr.conf
```

```
# conf for transformer based asr
model_type: asr
search_engine: RandomSearchEngine
batch_size: 32
random_max_epochs: 10
max_epochs: 10
select_num: 50
population_num: 50
m_prob: 0.2
s_prob: 0.4
crossover_num: 25
mutation_num: 25
max_param_limits: 100
min_param_limits: 1
supernet_cfg: ../../conf/denas/asr/supernet_large.conf
img_size: 224
seed: 0
expressivity_weight: 0
complexity_weight: 0
diversity_weight: 1
saliency_weight: 1
latency_weight: 0
```
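The five `*_weight` entries suggest that several per-candidate scores are blended into a single `nas_score`. The sketch below assumes a plain weighted sum, which is an assumption made for illustration: the actual DE-NAS formula is not shown in this demo and may differ (for example, latency might be subtracted). The function name `combine_scores` and the example score values are hypothetical.

```python
# Weights copied from e2eaiok_denas_asr.conf above.
WEIGHTS = {
    "expressivity": 0,
    "complexity": 0,
    "diversity": 1,
    "saliency": 1,
    "latency": 0,
}


def combine_scores(scores: dict, weights: dict) -> float:
    """Blend per-candidate scores with a weighted sum (assumed form, not confirmed)."""
    return sum(weight * scores.get(name, 0.0) for name, weight in weights.items())


# Hypothetical scores for one candidate; with the weights above only
# diversity and saliency contribute to the result.
example_scores = {"diversity": 0.10, "saliency": 0.05, "latency": 3.0}
print(combine_scores(example_scores, WEIGHTS))
```

With this configuration, only the synaptic diversity and synaptic saliency terms influence the ranking, since the other weights are zero.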

``` bash
# DENAS ASR supernet structure and search space
cat /home/vmagent/app/e2eaiok/conf/denas/asr/supernet_large.conf
```

```
SUPERNET:
  MLP_RATIO: 4.0
  NUM_HEADS: 4
  EMBED_DIM: 512
  DEPTH: 12
SEARCH_SPACE:
  MLP_RATIO:
    - 3.0
    - 3.5
    - 4.0
  NUM_HEADS:
    - 2
    - 4
  DEPTH:
    - 5
    - 6
    - 7
    - 8
    - 9
    - 10
    - 11
    - 12
  EMBED_DIM:
    - 128
    - 256
    - 512
```
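As an illustration of what a candidate drawn from this search space looks like, here is a minimal sampling sketch. The tuple layout `(depth, one MLP ratio per layer, one head count per layer, embed_dim)` is inferred from the `structure = (...)` entries in the search log, and the assumption that each layer picks its ratio and head count independently is mine; `sample_structure` and `search_space_size` are illustrative names, not the DE-NAS API.

```python
import random

# Values copied from the SEARCH_SPACE section of supernet_large.conf.
SEARCH_SPACE = {
    "MLP_RATIO": [3.0, 3.5, 4.0],
    "NUM_HEADS": [2, 4],
    "DEPTH": [5, 6, 7, 8, 9, 10, 11, 12],
    "EMBED_DIM": [128, 256, 512],
}


def sample_structure(space: dict, rng: random.Random) -> tuple:
    """Draw one candidate: depth, per-layer MLP ratios, per-layer heads, embed dim."""
    depth = rng.choice(space["DEPTH"])
    mlp_ratios = [rng.choice(space["MLP_RATIO"]) for _ in range(depth)]
    num_heads = [rng.choice(space["NUM_HEADS"]) for _ in range(depth)]
    embed_dim = rng.choice(space["EMBED_DIM"])
    return (depth, *mlp_ratios, *num_heads, embed_dim)


def search_space_size(space: dict) -> int:
    """Count candidates, assuming independent per-layer choices."""
    per_layer = len(space["MLP_RATIO"]) * len(space["NUM_HEADS"])
    return len(space["EMBED_DIM"]) * sum(per_layer ** d for d in space["DEPTH"])


rng = random.Random(0)  # fixed seed so repeated runs give the same candidate
candidate = sample_structure(SEARCH_SPACE, rng)
print(candidate)
print(search_space_size(SEARCH_SPACE))
```

A tuple produced this way always has `2 * depth + 2` entries, which matches the lengths of the structures printed during the search.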

``` bash
cd /home/vmagent/app/e2eaiok/e2eAIOK/DeNas
conda activate pytorch
python search.py --domain asr --conf /home/vmagent/app/e2eaiok/conf/denas/asr/e2eaiok_denas_asr.conf
```


Sample output (truncated). The `CommandNotFoundError` is printed because `conda activate` ran in a shell that had not been initialized with `conda init`; the search log lines follow it.

```
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.


12/01/2022 07:44:33 - INFO - DENAS -   epoch = 0 structure = (11, 3.5, 3.0, 3.5, 4.0, 3.5, 3.5, 3.5, 3.5, 3.5, 4.0, 3.0, 2, 4, 2, 2, 4, 2, 4, 2, 2, 4, 4, 512) nas_score = 0.14366315305233002 params = 60.231936
12/01/2022 07:44:39 - INFO - DENAS -   epoch = 1 structure = (6, 3.5, 3.5, 3.5, 4.0, 4.0, 3.0, 4, 4, 4, 2, 2, 2, 512) nas_score = 0.1554543524980545 params = 46.044416
12/01/2022 07:44:44 - INFO - DENAS -   epoch = 2 structure = (11, 4.0, 4.0, 4.0, 3.0, 4.0, 3.5, 3.5, 3.0, 4.0, 3.5, 4.0, 2, 2, 2, 2, 2, 4, 2, 2, 4, 4, 2, 256) nas_score = 0.19446012377738953 params = 19.309184
...

paths: /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr/utils, /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr
['/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/opt/intel/oneapi/advisor/2022.1.0/pythonapi', '/home/spark-3.2.1-bin-hadoop3.2/python/lib/py4j-0.10.9.3-src.zip', '/home/spark-3.2.1-bin-hadoop3.2/python', '/opt/intel/oneapi/intelpython/latest/envs/pytorch/lib/python39.zip', '/opt/intel/oneapi/intelpython/latest/envs/pytorch/lib/python3.9', '/opt/intel/oneapi/intelpython/latest/envs/pytorch/lib/python3.9/lib-dynload', '/opt/intel/oneapi/intelpython/latest/envs/pytorch/lib/python3.9/site-packages', '/opt/intel/oneapi/intelpython/latest/envs/pytorch/lib/python3.9/site-packages/warprnnt_pytorch-0.1-py3.9-linux-x86_64.egg', '', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr']
...
```
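To compare candidates after a search, the `epoch = ... structure = ... nas_score = ... params = ...` lines can be parsed from the log. The sketch below is one way to do that with a regular expression written against the log lines shown above; `parse_search_log` is a hypothetical helper, and picking the maximum assumes that a higher `nas_score` is better, which this demo does not state explicitly.

```python
import re

# Pattern written against the DENAS log lines shown above.
LOG_PATTERN = re.compile(
    r"epoch = (?P<epoch>\d+) structure = \((?P<structure>[^)]*)\) "
    r"nas_score = (?P<score>[-+0-9.eE]+) params = (?P<params>[-+0-9.eE]+)"
)


def parse_search_log(lines):
    """Extract epoch, structure tuple, nas_score, and params from DENAS log lines."""
    records = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not search results
        fields = [item.strip() for item in match["structure"].split(",")]
        structure = tuple(float(f) if "." in f else int(f) for f in fields)
        records.append({
            "epoch": int(match["epoch"]),
            "structure": structure,
            "nas_score": float(match["score"]),
            "params": float(match["params"]),
        })
    return records


log_lines = [
    "12/01/2022 07:44:39 - INFO - DENAS -   epoch = 1 structure = (6, 3.5, 3.5, 3.5, 4.0, 4.0, 3.0, 4, 4, 4, 2, 2, 2, 512) nas_score = 0.1554543524980545 params = 46.044416",
    "12/01/2022 07:44:44 - INFO - DENAS -   epoch = 2 structure = (11, 4.0, 4.0, 4.0, 3.0, 4.0, 3.5, 3.5, 3.0, 4.0, 3.5, 4.0, 2, 2, 2, 2, 2, 4, 2, 2, 4, 4, 2, 256) nas_score = 0.19446012377738953 params = 19.309184",
]
records = parse_search_log(log_lines)
# Assumption: higher nas_score is better.
best = max(records, key=lambda record: record["nas_score"])
print(best["epoch"], best["structure"], best["params"])
```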

## Launch training with best searched model

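Edit `/home/vmagent/app/e2eaiok/conf/denas/asr/e2eaiok_denas_train_asr.conf` (the file passed to `--conf` in the training command) to set the dataset CSVs, the tokenizer checkpoint, and the number of epochs. This demo configuration uses `dev-clean.csv` for training, validation, and testing, and trains for a single epoch: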

```
train_csv: "/home/vmagent/app/dataset/LibriSpeech/dev-clean.csv"
valid_csv: "/home/vmagent/app/dataset/LibriSpeech/dev-clean.csv"
test_csv: "/home/vmagent/app/dataset/LibriSpeech/dev-clean.csv"
tokenizer_ckpt: "/home/vmagent/app/dataset/LibriSpeech/tokenizer.ckpt"
train_epochs: 1
```
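Before launching training, it can help to confirm that the files named in the config actually exist. A minimal sketch, assuming the relevant entries are flat `key: "value"` lines like the excerpt above; `read_flat_conf` and `missing_files` are hypothetical helpers, not part of e2eAIOK.

```python
from pathlib import Path

# Excerpt of the training config shown above.
CONF_TEXT = '''
train_csv: "/home/vmagent/app/dataset/LibriSpeech/dev-clean.csv"
valid_csv: "/home/vmagent/app/dataset/LibriSpeech/dev-clean.csv"
test_csv: "/home/vmagent/app/dataset/LibriSpeech/dev-clean.csv"
tokenizer_ckpt: "/home/vmagent/app/dataset/LibriSpeech/tokenizer.ckpt"
train_epochs: 1
'''

PATH_KEYS = ["train_csv", "valid_csv", "test_csv", "tokenizer_ckpt"]


def read_flat_conf(text: str) -> dict:
    """Parse flat `key: value` lines, dropping surrounding double quotes."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip().strip('"')
    return conf


def missing_files(conf: dict, keys: list) -> list:
    """Return the keys whose configured path is not an existing file."""
    return [key for key in keys if not Path(conf.get(key, "")).is_file()]


conf = read_flat_conf(CONF_TEXT)
for key in missing_files(conf, PATH_KEYS):
    print(f"missing file for {key}: {conf.get(key)}")
```

If nothing is printed inside the container, all four paths resolve to existing files.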

``` bash
cd /home/vmagent/app/e2eaiok/e2eAIOK/DeNas
conda activate pytorch
python -m intel_extension_for_pytorch.cpu.launch /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/train.py \
    --domain asr --conf /home/vmagent/app/e2eaiok/conf/denas/asr/e2eaiok_denas_train_asr.conf --random_seed 74443
```




The launcher first logs its thread and affinity settings (the last line is cut off in the captured output):

```
2022-12-01 07:47:43,833 - __main__ - INFO - OMP_NUM_THREADS=48
2022-12-01 07:47:43,833 - __main__ - INFO - Using Intel OpenMP
2022-12-01 07:47:43,833 - __main__ - INFO - KMP_AFFINITY=granularity=fine,compact,1,0
2022-12-01 07:47:43,833 - __main__ - INFO - KMP_BLOCKTIME=1
2022-12-01 07:47:43,834 - __main__ - INFO - LD_PRELOAD=/opt/intel/oneapi/intelpython/latest/envs/pytorch/lib/libiomp5.so
2022-12-01 07:47:43,834 - __main__ - INFO - numactl -C 0-47 /opt/intel/oneapi/intelpython/latest/envs/pytorch/bin/python -u /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/train.py --domain asr --conf /home/vmagent/app/e2ea
```

Excerpt of the printed model (truncated):

```
...
          (dropout1): Dropout(p=0.1, inplace=False)
          (dropout2): Dropout(p=0.1, inplace=False)
        )
        (6): TransformerEncoderLayer(
          (self_att): MultiheadAttention(
            (att): MultiheadAttention(
              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
            )
          )
          (pos_ffn): PositionalwiseFeedForward(
            (fc1): Linear(in_features=512, out_features=2048, bias=True)
            (act): GELU()
            (drop): Dropout(p=0.1, inplace=False)
            (fc2): Linear(in_features=2048, out_features=512, bias=True)
          )
          (norm1): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          (norm2): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          (dropout1): Dropout(p=0.1, inplace=False)
          (dropout2): Dropout(p=0.1, inplace=False)
        )
        (7): TransformerEncoderLayer(
          (self_att): MultiheadAttention(
            ...

            (act): GELU()
            (drop): Dropout(p=0.1, inplace=False)
            (fc2): Linear(in_features=2048, out_features=512, bias=True)
          )
          (norm1): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          (norm2): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          (norm3): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          (dropout1): Dropout(p=0.1, inplace=False)
          (dropout2): Dropout(p=0.1, inplace=False)
          (dropout3): Dropout(p=0.1, inplace=False)
        )
        (4): TransformerDecoderLayer(
          (self_attn): MultiheadAttention(
            (att): MultiheadAttention(
              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
            )
          )
          (mutihead_attn): MultiheadAttention(
            (att): MultiheadAttention(
              (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
            )
...
```


Training log excerpt (truncated):

```
12/01/2022 07:51:41 - INFO - Trainer -   epoch: 1, step: 21|84, time: 5.46s, loss: 233.23403930664062, avg_loss: 278.8202, lr: 8e-06
12/01/2022 07:51:49 - INFO - Trainer -   epoch: 1, step: 22|84, time: 6.16s, loss: 229.02133178710938, avg_loss: 276.5567, lr: 8.4e-06
12/01/2022 07:52:07 - INFO - Trainer -   epoch: 1, step: 23|84, time: 17.03s, loss: 245.9324493408203, avg_loss: 275.2252, lr: 8.8e-06
12/01/2022 07:52:26 - INFO - Trainer -   epoch: 1, step: 24|84, time: 17.61s, loss: 221.91502380371094, avg_loss: 273.0039, lr: 9.2e-06
12/01/2022 07:52:35 - INFO - Trainer -   epoch: 1, step: 25|84, time: 7.92s, loss: 190.89065551757812, avg_loss: 269.7194, lr: 9.600000000000001e-06
12/01/2022 07:52:45 - INFO - Trainer -   epoch: 1, step: 26|84, time: 8.62s, loss: 248.9710235595703, avg_loss: 268.9214, lr: 9.999999999999999e-06
12/01/2022 07:52:53 - INFO - Trainer -   epoch: 1, step: 27|84, time: 6.80s, loss: 218.224365234375, avg_loss: 267.0437, lr: 1.04e-05
...
12/01/2022 08:02:24 - INFO - Trainer -   epoch: 1, step: 80|84, time: 6.71s, loss: 177.7315216064453, avg_loss: 223.1468, lr: 3.1599999999999996e-05
12/01/2022 08:02:35 - INFO - Trainer -   epoch: 1, step: 81|84, time: 10.42s, loss: 193.48794555664062, avg_loss: 222.7806, lr: 3.2e-05
12/01/2022 08:02:44 - INFO - Trainer -   epoch: 1, step: 82|84, time: 7.25s, loss: 182.6701202392578, avg_loss: 222.2915, lr: 3.24e-05
12/01/2022 08:02:57 - INFO - Trainer -   epoch: 1, step: 83|84, time: 11.35s, loss: 222.13172912597656, avg_loss: 222.2896, lr: 3.2800000000000004e-05
12/01/2022 08:03:04 - INFO - Trainer -   epoch: 1, step: 84|84, time: 6.56s, loss: 163.16845703125, avg_loss: 221.5857, lr: 3.32e-05
12/01/2022 08:03:04 - INFO - Trainer -   epoch: 1, time: 918.91s, avg_loss: 221.5857
12/01/2022 08:06:08 - INFO - Trainer -   epoch: 1, time: 184.09759664535522, wer: 112.10249623175619, avg_loss: 188.94259714577498
12/01/2022 08:06:08 - INFO - Trainer -   Evaluate time:184.0989682674408
...
```
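The evaluation line in the log reports a `wer` above 100 after a single epoch, which may look impossible at first. WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words, so a hypothesis with many inserted words can exceed 100%. The sketch below uses that conventional definition with a word-level edit distance; it is not the trainer's own implementation, and `word_error_rate` is a name chosen here for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using word-level Levenshtein distance."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            current.append(min(
                previous[j] + 1,                             # deletion
                current[j - 1] + 1,                          # insertion
                previous[j - 1] + (ref_word != hyp_word),    # substitution or match
            ))
        previous = current
    return 100.0 * previous[-1] / len(ref_words)


# Four inserted words against a three-word reference push WER past 100%.
wer = word_error_rate("the cat sat", "a the big cat sat down now")
print(f"WER: {wer:.2f}%")
```

A barely trained model often emits repeated or spurious tokens, which is one plausible reason for a value above 100 this early in training.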