[![open in colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/intel/e2eAIOK/blob/main/demo/denas/hf/DENAS_HF_DEMO.ipynb)

# AIOK DE-NAS HF Demo

DE-NAS is a multi-model, hardware-aware, train-free neural architecture search (NAS) method that constructs compact model architectures directly for a target platform. It provides a CNN-based search space for the CV domain and a Transformer-based search space for the CV/NLP/ASR domains, and it uses a hardware-aware, train-free scoring method to evaluate candidate architectures without training them.

This demo shows how DE-NAS integrates with Hugging Face (HF) to search for lighter, faster, and higher-performing transformer-based HF models in a training-free way.

# Content

* [Overview](#overview)
    * [DE-NAS on HF Domain](#de-nas-on-hf-domain)
* [Getting Started](#getting-started)
    * [1. Environment Setup](#1-environment-setup)
    * [2. Workflow Prepare](#2-workflow-prepare)
    * [3. Data Prepare](#3-data-prepare)
    * [4. Launch Search](#4-launch-search)
    * [5. Launch Training with Best Searched Model Structure](#5-launch-training-with-best-searched-model-structure)

# Overview

## DE-NAS on HF Domain

### DE-NAS on HF Search Space and Supernet
The Transformer-based search space consists of the number of transformer layers, the number of attention heads, the size of query/key/value, the size of the MLP, and the embedding dimension. The supernet of DE-NAS on HF is an HF BERT-based structure. Both are shown in the figure below.

<center>
<img src="./img/HF_BERT_Search_Space.png" width="800"/><figure>DE-NAS on HF BERT search space</figure>
</center>

### DE-NAS Searched HF BERT Architecture
By running the train-free evolutionary algorithm (EA) search engine on the DE-NAS HF BERT search space and supernet, DE-NAS produced an architecture that is more compact than the HF BERT-Base model, as shown in the figure below:

<center>
<img src="./img/DENAS HF BERT Architecture.png" width="500"/><figure>DE-NAS Searched HF BERT Architecture</figure>
</center>
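
To make "more compact" concrete, the parameter count of a BERT-style encoder can be estimated from the dimensions in the search space. The sketch below is not part of DE-NAS: `estimate_bert_params` is a hypothetical helper, it assumes the query/key/value size equals `hidden_size` (so the number of heads does not change the count), and the smaller candidate is an invented example rather than the searched architecture.

```python
def estimate_bert_params(num_layers, hidden_size, intermediate_size,
                         vocab_size=30522, max_positions=512, type_vocab_size=2):
    """Rough parameter count for a BERT-style encoder (embeddings + layers + pooler)."""
    h, i = hidden_size, intermediate_size
    # word, position and token-type embeddings, plus one LayerNorm (weight + bias)
    embeddings = (vocab_size + max_positions + type_vocab_size) * h + 2 * h
    # query/key/value/output projections (weight + bias each) and one LayerNorm
    attention = 4 * (h * h + h) + 2 * h
    # two feed-forward projections (weight + bias each) and one LayerNorm
    feed_forward = (h * i + i) + (i * h + h) + 2 * h
    # pooler dense layer applied to the first token
    pooler = h * h + h
    return embeddings + num_layers * (attention + feed_forward) + pooler


base = estimate_bert_params(num_layers=12, hidden_size=768, intermediate_size=3072)
small = estimate_bert_params(num_layers=6, hidden_size=640, intermediate_size=2560)  # hypothetical candidate

print(f"BERT-Base: {base / 1e6:.1f}M parameters")
print(f"Hypothetical candidate: {small / 1e6:.1f}M parameters")
```

A count like this is what bounds such as `max_param_limits` and `min_param_limits` in the search configuration would be compared against, assuming those limits are expressed in millions of parameters.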

# Getting Started
Note: the dataset and pre-trained model must be downloaded manually before running this demo.

## 1. Environment Setup

### (Option 1) Use Pip install

``` shell
# in a notebook cell, prefix the command with "!"
pip install e2eAIOK-denas --pre
```

### (Option 2) Use Docker

Step 1. Prepare the code

``` shell
# clone the e2eAIOK repo and fetch its submodules
git clone https://github.com/intel/e2eAIOK.git
cd e2eAIOK
git submodule update --init --recursive
```

Step 2. Build the Docker image

```shell
python3 scripts/start_e2eaiok_docker.py -b pytorch112 -w ${host0} ${host1} ${host2} ${host3} --proxy ""
```

Step 3. Run Docker and start the conda env

``` shell
sshpass -p docker ssh ${host0} -p 12347
```

## 2. Workflow Prepare

* Conf for HF BERT DE-NAS Search

```yaml
# conf for HF bert
model_type: hf
search_engine: EvolutionarySearchEngine
pretrained_model_path: /home/vmagent/app/dataset/
batch_size: 32

# conf for evolutionary search engine
random_max_epochs: 1000 #random search max epochs
max_epochs: 10 #search epoch
select_num: 50
population_num: 50
m_prob: 0.2
s_prob: 0.4
crossover_num: 25
mutation_num: 25
img_size: 128
max_param_limits: 110
min_param_limits: 55
seed: 0

# enable/disable each DE-Score
expressivity_weight: 0
complexity_weight: 0
diversity_weight: 0.00001
saliency_weight: 1
latency_weight: 0.01
```

The YAML file above shows the DE-NAS search configuration for BERT, located at `/home/vmagent/app/e2eaiok/conf/denas/hf/e2eaiok_denas_hf.conf`. It sets the type of search engine, the search hyperparameters (e.g., `batch_size`, `select_num`, and `population_num`), the DE-Score parameters (e.g., the expressivity score weight and the latency weight), and the supernet/search space configuration (e.g., `supernet_cfg`).
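
As an illustration of how the DE-Score weights could enable or disable individual scores, the sketch below combines per-candidate scores into one number. This is an assumption for illustration only: `combine_scores` is a hypothetical function, the weighted sum with a latency penalty may differ from the formula DE-NAS actually uses, and the candidate scores are made-up values.

```python
# Weights copied from the configuration shown above.
weights = {
    "expressivity": 0,
    "complexity": 0,
    "diversity": 0.00001,
    "saliency": 1,
    "latency": 0.01,
}


def combine_scores(scores, weights):
    """Hypothetical aggregation: weighted sum of scores minus a weighted latency penalty."""
    total = 0.0
    for name, weight in weights.items():
        if weight == 0:
            # a zero weight disables that score, so it does not need to be computed
            continue
        value = scores[name]
        # latency is treated as a cost, every other score as a reward
        total += -weight * value if name == "latency" else weight * value
    return total


# Made-up scores for a single candidate architecture.
candidate = {"diversity": 1000.0, "saliency": 5.0, "latency": 20.0}
print(combine_scores(candidate, weights))
```

With the weights above, the saliency score dominates, and the expressivity and complexity scores are skipped entirely.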

* Conf for BERT Supernet and Search Space

```yaml
# HF BERT supernet definition
supernet:
  bert-base-uncased

# BERT search space definition
#search_space:
#  intermediate_size: 
#    max: 3072
#    step: 16
```

The YAML file above describes the BERT-Base supernet and search space configuration, which is stored in the same file, `/home/vmagent/app/e2eaiok/conf/denas/hf/e2eaiok_denas_hf.conf`. The common HF layer fields `hidden_size`, `num_attention_heads`, and `num_hidden_layers` are included in the DE-NAS HF search space by default; other fields (e.g., `embedding_size`) can be added by the user.
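
The commented-out `search_space` entry above gives a `max` and a `step` for `intermediate_size`. A minimal sketch of how such an entry could expand into candidate values is shown below. Both helpers are hypothetical, and starting the range at `step` when no minimum is given is an assumption, not documented DE-NAS behavior. The divisibility check reflects the stock Hugging Face BERT attention layer, which requires `hidden_size` to be a multiple of `num_attention_heads`.

```python
def candidate_values(max_value, step, min_value=None):
    """Expand a {max, step} entry into the list of values it could represent.

    Assumption: when min_value is omitted, the smallest candidate is `step`.
    """
    start = step if min_value is None else min_value
    return list(range(start, max_value + 1, step))


def is_valid_bert_shape(hidden_size, num_attention_heads):
    """Standard BERT self-attention splits hidden_size evenly across the heads."""
    return hidden_size % num_attention_heads == 0


intermediate_sizes = candidate_values(max_value=3072, step=16)
print(len(intermediate_sizes), intermediate_sizes[0], intermediate_sizes[-1])  # 192 16 3072

print(is_valid_bert_shape(768, 12))  # True
print(is_valid_bert_shape(640, 12))  # False
```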

* Download pre-trained model from Hugging Face
    * Download and extract one of BERT-Base-Uncased pretrained models from [Hugging Face repository](https://huggingface.co/bert-base-uncased/tree/main) to `/home/vmagent/app/dataset/bert-base-uncased/`
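
Before launching the search, it can help to confirm that the download landed in the expected directory. The sketch below uses only the standard library. The list of required files is an assumption based on the files published in the `bert-base-uncased` repository; the exact set that DE-NAS reads is not stated in this demo.

```python
from pathlib import Path

# Files assumed to be needed (hypothetical list, adjust to your setup).
REQUIRED_FILES = ("config.json", "vocab.txt", "pytorch_model.bin")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


missing = missing_model_files("/home/vmagent/app/dataset/bert-base-uncased/")
print("missing files:", missing or "none")
```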

## 3. Data Prepare

* Prepare Dataset
    * Download Dataset: The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage. SQuAD 1.1 contains 100,000+ question-answer pairs on 500+ articles.
    * Download the files below to `/home/vmagent/app/dataset/SQuAD`
        * Train Data: [train-v1.1.json](https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json)
        * Test Data: [dev-v1.1.json](https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json)

Data format:

``` json
{
    "answers": {
        "answer_start": [1],
        "text": ["This is a test text"]
    },
    "context": "This is a test context.",
    "id": "1",
    "question": "Is this a test?",
    "title": "train test"
}
```
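
The downloaded SQuAD v1.1 files nest questions under `data`, `paragraphs`, and `qas`, while the format above is flat, with one record per question. The sketch below shows one way the two could relate. `flatten_squad` is a hypothetical helper, and whether the DE-NAS data builder performs this exact conversion is an assumption.

```python
import json


def flatten_squad(raw):
    """Yield one flat record per question from a SQuAD v1.1-style dict."""
    for article in raw["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield {
                    "answers": {
                        "answer_start": [a["answer_start"] for a in qa["answers"]],
                        "text": [a["text"] for a in qa["answers"]],
                    },
                    "context": paragraph["context"],
                    "id": qa["id"],
                    "question": qa["question"],
                    "title": article["title"],
                }


# Tiny inline sample with the same nesting as train-v1.1.json.
sample = {
    "version": "1.1",
    "data": [{
        "title": "train test",
        "paragraphs": [{
            "context": "This is a test context.",
            "qas": [{
                "id": "1",
                "question": "Is this a test?",
                "answers": [{"answer_start": 10, "text": "test"}],
            }],
        }],
    }],
}

records = list(flatten_squad(sample))
print(json.dumps(records[0], indent=4))
```

To run it on the real data, load the file with `json.load` and pass the resulting dict to `flatten_squad`.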

## 4. Launch Search

Launch the DE-NAS search process on the HF BERT model with the overall search configuration `/home/vmagent/app/e2eaiok/conf/denas/hf/e2eaiok_denas_hf.conf` as input. The search writes the best model structure to the `best_model_structure.txt` file as a tuple `(layer_num, head_num, hidden_size)`.

``` python
from e2eAIOK.DeNas.search.utils import parse_config
from e2eAIOK.DeNas.thirdparty.supernet_hf import SuperHFModel
from e2eAIOK.DeNas.search.SearchEngineFactory import SearchEngineFactory


# parse the DE-NAS search configuration
params = parse_config('/home/vmagent/app/e2eaiok/conf/denas/hf/e2eaiok_denas_hf.conf')

# construct the supernet and the search space
super_net = SuperHFModel.from_pretrained(params.supernet)
search_space = SuperHFModel.search_space_generation(params.supernet)

# create the DE-NAS searcher
searcher = SearchEngineFactory.create_search_engine(params=params, super_net=super_net, search_space=search_space)

# trigger the search process
searcher.search()
best_structure = searcher.get_best_structures()
print(f"DE-NAS completed, best structure is {best_structure}")
```

```
paths: /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr/utils, /home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr
['/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/opt/intel/oneapi/advisor/2022.3.0/pythonapi', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.12.0/lib/python39.zip', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.12.0/lib/python3.9', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.12.0/lib/python3.9/lib-dynload', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.12.0/lib/python3.9/site-packages', '/opt/intel/oneapi/intelpython/latest/envs/pytorch-1.12.0/lib/python3.9/site-packages/e2eAIOK-0.2.1-py3.9.egg', '', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas', '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas/asr']
loading archive file /home/vmagent/app/dataset/bert-base-uncased
12/01/2022 13:43:12 - INFO - nlp.super
```
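
For intuition about what an evolutionary search loop does with settings such as `population_num`, `select_num`, `crossover_num`, `mutation_num`, and `m_prob`, here is a toy sketch over `(layer_num, head_num, hidden_size)` tuples. It is not the DE-NAS implementation: the search space values, the legality rule, and `toy_score` (a stand-in for the train-free DE-Score) are all assumptions, and `s_prob` is not modeled.

```python
import random

# Toy search space; these value lists are illustrative, not the DE-NAS defaults.
SEARCH_SPACE = {
    "layer_num": list(range(1, 13)),
    "head_num": list(range(1, 13)),
    "hidden_size": list(range(64, 769, 64)),
}
KEYS = tuple(SEARCH_SPACE)


def is_legal(cand):
    """Assumed constraint: hidden_size must split evenly across the heads."""
    _, head_num, hidden_size = cand
    return hidden_size % head_num == 0


def random_candidate(rng):
    while True:
        cand = tuple(rng.choice(SEARCH_SPACE[k]) for k in KEYS)
        if is_legal(cand):
            return cand


def mutate(cand, rng, m_prob):
    """Resample each field independently with probability m_prob."""
    return tuple(rng.choice(SEARCH_SPACE[k]) if rng.random() < m_prob else v
                 for k, v in zip(KEYS, cand))


def crossover(a, b, rng):
    """Take each field from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def toy_score(cand):
    """Placeholder score: prefers 6 layers and a hidden size of 512."""
    layer_num, _, hidden_size = cand
    return -abs(hidden_size - 512) / 64 - abs(layer_num - 6)


def evolutionary_search(max_epochs=10, population_num=50, select_num=50,
                        crossover_num=25, mutation_num=25, m_prob=0.2, seed=0):
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(population_num)]
    top = []
    for _ in range(max_epochs):
        # keep the best select_num candidates seen so far
        top = sorted(set(top + population), key=toy_score, reverse=True)[:select_num]
        children = []
        while len(children) < mutation_num:
            child = mutate(rng.choice(top), rng, m_prob)
            if is_legal(child):
                children.append(child)
        while len(children) < mutation_num + crossover_num:
            child = crossover(rng.choice(top), rng.choice(top), rng)
            if is_legal(child):
                children.append(child)
        population = children
    top = sorted(set(top + population), key=toy_score, reverse=True)[:select_num]
    return top[0]


best = evolutionary_search()
print(f"best (layer_num, head_num, hidden_size): {best}")
```

Because scoring needs no training, each epoch only costs the time to score the new children, which is what makes a population-based search practical here.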

## 5. Launch Training with Best Searched Model Structure

Launch the DE-NAS training process with a user-defined training pipeline. Below is a simple introduction to adding the DE-NAS constructed model to a custom training pipeline or to the DE-NAS training pipeline (e.g., fine-tuning DE-NAS HF BERT on the SQuAD v1.1 task).

``` python
from e2eAIOK.DeNas.thirdparty.supernet_hf import SuperHFModel
from e2eAIOK.DeNas.thirdparty.utils import decode_arch
from e2eAIOK.DeNas.search.utils import parse_config
from e2eAIOK.common.trainer.data.nlp.data_builder_squad import DataBuilderSQuAD
from e2eAIOK.DeNas.nlp.utils import bert_create_optimizer, bert_create_criterion, bert_create_scheduler, bert_create_metric
from e2eAIOK.DeNas.nlp.bert_trainer import BERTTrainer

from easydict import EasyDict as edict
from torch import nn


### Add a task-specific output layer to the DE-NAS model ###
class SuperHFTaskModel(nn.Module):
    def __init__(self, encoder_model, cfg):
        super().__init__()
        self.cfg = cfg
        self.encoder_model = encoder_model
        if self.cfg["task_type"] == "classification":
            self.output = nn.Linear(self.cfg["hidden_size"], self.cfg["num_labels"])
        else:
            raise NotImplementedError("This task type is not supported yet!")
        self.output.apply(self.init_output_weight)

    def init_output_weight(self, module):
        # draw weights from a normal distribution and zero the bias
        if isinstance(module, nn.Linear):
            module.weight.data.normal_(mean=0.0, std=self.cfg["initializer_range"])
            if hasattr(module, "bias") and module.bias is not None:
                module.bias.data.zero_()

    def forward(self, x):
        # x stacks the model inputs along its last dimension,
        # in the order listed in cfg["input_id"]
        input_keys = self.cfg["input_id"].strip().split()
        item = x.split(1, -1)
        inputs = dict()
        for idx, input_key in enumerate(input_keys):
            inputs[input_key] = item[idx].squeeze(-1)
        output = self.encoder_model(**inputs)
        last_hidden_state = output.last_hidden_state
        logits = self.output(last_hidden_state)
        return logits


### Model construction with the DE-NAS configuration ###
model_arch = decode_arch("/home/vmagent/app/e2eaiok/e2eAIOK/DeNas/best_model_structure.txt")
cfg = edict(parse_config("/home/vmagent/app/e2eaiok/conf/denas/hf/e2eaiok_denas_train_bert.conf"))
print(cfg)
model = SuperHFModel.set_sample_config("/home/vmagent/app/dataset/bert-base-uncased/", **model_arch)
model = SuperHFTaskModel(model, cfg)

### Build the data loaders, optimizer, loss, scheduler and metric ###
train_dataloader, eval_dataloader, other_data = DataBuilderSQuAD(cfg).get_dataloader()
cfg.num_train_steps = len(train_dataloader)
optimizer = bert_create_optimizer(model, cfg)
criterion = bert_create_criterion(cfg)
scheduler = bert_create_scheduler(cfg)
metric = bert_create_metric(cfg)

### Create the DE-NAS trainer ###
trainer = BERTTrainer(cfg, model, train_dataloader, eval_dataloader, other_data, optimizer, criterion, scheduler, metric)

### Trigger the training process ###
trainer.fit()
```

```
{'domain': 'hf', 'task_name': 'squad1', 'task_type': 'classification', 'supernet': 'bert-base-uncased', 'tokenizer': 'bert-base-uncased', 'optimizer': 'BertAdam', 'criterion': 'CrossEntropyQALoss', 'lr_scheduler': 'warmup_linear', 'eval_metric': 'qa_f1', 'dist_backend': 'gloo', 'input_id': 'input_ids attention_mask token_type_ids', 'data_set': 'SQuADv1.1', 'best_model_structure': '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas/best_model_structure.txt', 'model': '/home/vmagent/app/dataset/bert-base-uncased/', 'model_dir': '/home/vmagent/app/dataset/bert-base-uncased/', 'data_dir': '/home/vmagent/app/dataset/SQuAD_small/', 'output_dir': '/home/vmagent/app/e2eaiok/e2eAIOK/DeNas/thirdparty/', 'hidden_size': 640, 'gradient_accumulation_steps': 1, 'warmup_proportion': 0.1, 'learning_rate': 6e-05, 'weight_decay': 0.01, 'initializer_range': 0.02, 'train_epochs': 2, 'max_seq_length': 384, 'doc_stride': 128, 'train_batch_size': 32, 'eval_batch_size': 8, 'eval_step': 500, 'n_best_size': 20, 'max_answer

Some weights of the model checkpoint at /home/vmagent/app/dataset/bert-base-uncased/ were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
03/18/2023 11:17:56 - INFO - e2eAIOK.DeNas.module.nlp.tokenization -   loading vocabulary file
03/18/2023 11:17:56 - INFO - e2eA

***** Eval results *****
em = 7.559523809523809
infer_cnt = 213
infer_time = 140.6275164319249
qa_f1 = 13.243573213106428


Iteration: 100%|##############################################################################################################| 34/34 [02:21<00:00,  4.15s/it]
03/18/2023 11:22:19 - INFO - Trainer -   Epoch 1 training time:141.16966462135315
03/18/2023 11:22:19 - INFO - Trainer -   **************S*************
task_name = squad1
total training time = 257.77339577674866
best_acc = 13.243573213106428
**************E*************
```
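
The evaluation output reports `em` and `qa_f1`. As a reference for reading those numbers, the sketch below follows the text normalization and token-overlap F1 used by the standard SQuAD v1.1 evaluation; whether the DE-NAS `qa_f1` metric matches it exactly is an assumption. Scores are per example in the range 0 to 1, and a dataset-level value such as the one printed above would be the average multiplied by 100.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and ground truth."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower!"))
print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))
```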

