[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/intel/e2eAIOK/blob/main/demo/ma/finetuner/Model_Adapter_Finetuner_builtin_ResNet50_CIFAR100.ipynb)

# Model Adapter Finetuner Builtin DEMO
Model Adapter is a convenient framework that can reduce training and inference time, or data labeling cost, by efficiently utilizing public pretrained models and datasets from many domains. It contains three components that serve different cases: Finetuner, Distiller, and Domain Adapter.

This demo introduces the usage of Finetuner. Taking image classification as an example, it shows how to integrate Finetuner with ResNet50 on the CIFAR100 dataset. This demo covers the built-in usage; for a detailed, customized walkthrough, see [this notebook](./Model_Adapter_Finetuner_Walkthrough_ResNet50_CIFAR100.ipynb).

# Content

* [Overview](#Overview)
    * [Model Adapter Finetuner Overview](#Model-Adapter-Finetuner-Overview)
* [Getting Started](#Getting-Started)
    * [1. Environment Setup](#1.-Environment-Setup)
    * [2. Launch training on baseline](#2.-Launch-training-on-baseline)
    * [3. Launch training with Finetuner](#3.-Launch-training-with-Finetuner)

# Overview

## Model Adapter Finetuner Overview
Finetuner is based on the pretraining and finetuning technique: it transfers knowledge from a pretrained model to a target model with the same network structure.

Pretrained models are usually produced by a pretraining process, that is, training a specific model on a specific dataset, performed with DE-NAS, PyTorch, TensorFlow, or HuggingFace. Finetuner retrieves a pretrained model with the same network structure and copies its weights to the corresponding layers of the target model, instead of randomly initializing the target model. Finetuner can greatly speed up training and usually achieves better performance.

<img src="../imgs/finetuner.png" width="50%">
<center>Model Adapter Finetuner Structure</center>
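
The weight-copying idea described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the Finetuner implementation: `Param` and `copy_matching_weights` are hypothetical names, and tensors are reduced to a shape plus a label. The toy shapes assume a ResNet50-like classifier head with 2048 input features, 11221 pretraining classes (the value used in this demo's Finetuner configuration) and 100 CIFAR100 classes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Param:
    """Hypothetical stand-in for a tensor: only a shape and a label for its origin."""
    shape: Tuple[int, ...]
    source: str


def copy_matching_weights(pretrained: Dict[str, Param],
                          target: Dict[str, Param]) -> Tuple[Dict[str, Param], List[str]]:
    """Copy every pretrained entry whose name and shape match the target.

    Returns the merged weights and the names that kept their random initialization.
    """
    merged, kept_random = {}, []
    for name, param in target.items():
        candidate = pretrained.get(name)
        if candidate is not None and candidate.shape == param.shape:
            merged[name] = candidate      # reuse the pretrained weight
        else:
            merged[name] = param          # no match: keep random initialization
            kept_random.append(name)
    return merged, kept_random


# Toy layers: the backbone shapes agree, the classifier head does not.
pretrained = {
    "conv1.weight": Param((64, 3, 7, 7), "pretrained"),
    "fc.weight": Param((11221, 2048), "pretrained"),
}
target = {
    "conv1.weight": Param((64, 3, 7, 7), "random"),
    "fc.weight": Param((100, 2048), "random"),
}

merged, kept_random = copy_matching_weights(pretrained, target)
print(kept_random)  # ['fc.weight']
```

In PyTorch, the same filter would typically be applied to the entries of a `state_dict()` before calling `load_state_dict(..., strict=False)`, because `strict=False` tolerates missing keys but still raises an error on shape mismatches.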

# Getting Started

## 1. Environment Setup

### (Option 1) Use Pip install
We can install the ModelAdapter module from Intel® End-to-End AI Optimization Kit directly with the following command.

In [None]:
!pip install e2eAIOK-ModelAdapter --pre
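
After installing, it can be useful to confirm that the package is visible to the notebook's Python interpreter. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, and the distribution name is the one passed to `pip install` above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("e2eAIOK-ModelAdapter")
print(version if version else "e2eAIOK-ModelAdapter is not installed")
```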

### (Option 2) Use Docker 

We can also use Docker, which contains a complete environment.

Step1. Prepare code
   ``` bash
   git clone https://github.com/intel/e2eAIOK.git
   cd e2eAIOK
   git submodule update --init --recursive
   ```
    
Step2. Build docker image
   ``` bash
   python3 scripts/start_e2eaiok_docker.py -b pytorch112 --dataset_path ${dataset_path} -w ${host0} ${host1} ${host2} ${host3} --proxy  "http://addr:ip"
   ```
   
Step3. Run docker and start conda env
   ``` bash
   sshpass -p docker ssh ${host0} -p 12347
   conda activate pytorch-1.12.0
   ```
  
Step4. Start the jupyter notebook and tensorboard service
   ``` bash
   nohup jupyter notebook --notebook-dir=/home/vmagent/app/e2eaiok --ip=${hostname} --port=8899 --allow-root &
   nohup tensorboard --logdir /home/vmagent/app/data/tensorboard --host=${hostname} --port=6006 & 
   ```
   Now you can visit the demos at `http://${hostname}:8899/` and see the TensorBoard logs at `http://${hostname}:6006`.

## 2. Launch training on baseline
First we train a vanilla ResNet50 on CIFAR100 as a baseline for comparison.

### 2.1 Configuration
Let's download a configuration for ResNet50 with CIFAR100.

In [None]:
!wget https://raw.githubusercontent.com/intel/e2eAIOK/main/conf/ma/demo/baseline/cifar100_res50.yaml

--2023-03-19 22:25:15--  https://raw.githubusercontent.com/intel/e2eAIOK/main/conf/ma/demo/baseline/cifar100_res50.yaml
Resolving child-prc.intel.com (child-prc.intel.com)... 10.239.120.55
Connecting to child-prc.intel.com (child-prc.intel.com)|10.239.120.55|:913... connected.
Proxy request sent, awaiting response... 200 OK
Length: 512 [text/plain]
Saving to: ‘cifar100_res50.yaml’


2023-03-19 22:25:17 (15.2 MB/s) - ‘cifar100_res50.yaml’ saved [512/512]



Take a detailed look at the configuration.

In [None]:
!cat cifar100_res50.yaml

experiment:
  project: "demo"
  tag: "cifar100_res50"
  
output_dir: "./data"
train_epochs: 1

### dataset
data_set: "cifar100"
data_path:  "./data"
num_workers: 4

### model
model_type: "resnet50"

## optimizer
optimizer: "SGD"
learning_rate: 0.00753
weight_decay: 0.00115
momentum: 0.9

### scheduler
lr_scheduler: "CosineAnnealingLR"
lr_scheduler_config:
    T_max: 200

### early stop
early_stop: "EarlyStopping"
early_stop_config:
    tolerance_epoch: 15


### 2.2 Launch training
**Training resnet50 on CIFAR100 from scratch:**

We can train the model directly with a single command.

In [None]:
! python -u /usr/local/lib/python3.9/dist-packages/e2eAIOK/ModelAdapter/main.py --cfg cifar100_res50.yaml



Please cite the following paper when using nnUNet:

Isensee, F., Jaeger, P.F., Kohl, S.A.A. et al. "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation." Nat Methods (2020). https://doi.org/10.1038/s41592-020-01008-z


If you have questions or suggestions, feel free to open an issue at https://github.com/MIC-DKFZ/nnUNet

configurations:
{'train_batch_size': 128, 'start_epoch': 0, 'initial_pretrain': '', 'kd': {'temperature': 4}, 'drop_last': False, 'optimizer': 'SGD', 'data_path': '/home/vmagent/app/data/dataset/cifar', 'loss_weight': {'backbone': 1.0, 'distiller': 0.0, 'adapter': 0.0}, 'dkd': {'alpha': 1.0, 'beta': 8.0, 'temperature': 4.0, 'warmup': 20}, 'enable_ipex': False, 'log_interval_step': 10, 'train_epochs': 1, 'metric_threshold': 100.0, 'profiler': False, 'warmup_scheduler_epoch': 0, 'distiller': {'type': '', 'teacher': {'type': '', 'initial_pretrain': '', 'pretrain': '', 'frozen': True}, 'save_logits': False, 'use_saved_logits': False, 
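
The launch command above hard-codes a `python3.9/dist-packages` path, which may differ in other environments. As a hedged alternative, the sketch below looks up the installed `e2eAIOK` package and searches it for `ModelAdapter/main.py`, the relative path taken from that command. `find_entry_script` is a hypothetical helper, not part of e2eAIOK.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_entry_script(package: str = "e2eAIOK",
                      relative: str = "ModelAdapter/main.py") -> Optional[Path]:
    """Locate a script inside an installed top-level package without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package is not installed, or is a single-file module
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative
        if candidate.is_file():
            return candidate
    return None


script = find_entry_script()
print(script if script else "e2eAIOK ModelAdapter entry script not found")
```

In a notebook cell the result could then be passed to the shell, for example `!python -u {script} --cfg cifar100_res50.yaml`.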

## 3. Launch training with Finetuner
Then we train ResNet50 on CIFAR100 with Finetuner to show the performance improvement.

### 3.1 Prepare pretrained model 
Download a ResNet50 model pretrained on ImageNet21k and put it in the "data" folder.

In [1]:
! wget https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/ImageNet_21K_P/models/resnet50_miil_21k.pth && mkdir -p data && mv resnet50_miil_21k.pth data/ 

--2023-03-27 09:57:01--  https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/ImageNet_21K_P/models/resnet50_miil_21k.pth
Resolving child-prc.intel.com (child-prc.intel.com)... 10.239.120.55
Connecting to child-prc.intel.com (child-prc.intel.com)|10.239.120.55|:913... connected.
Proxy request sent, awaiting response... 200 OK
Length: 186531247 (178M) [application/octet-stream]
Saving to: ‘resnet50_miil_21k.pth’


2023-03-27 09:57:34 (5.67 MB/s) - ‘resnet50_miil_21k.pth’ saved [186531247/186531247]



### 3.2 Configuration

Now we download a configuration for Finetuner with ResNet50 on CIFAR100.

In [None]:
! wget https://raw.githubusercontent.com/intel/e2eAIOK/main/conf/ma/demo/finetuner/cifar100_res50PretrainI21k.yaml

--2023-03-19 22:47:45--  https://raw.githubusercontent.com/intel/e2eAIOK/main/conf/ma/demo/finetuner/cifar100_res50PretrainI21k.yaml
Resolving child-prc.intel.com (child-prc.intel.com)... 10.239.120.56
Connecting to child-prc.intel.com (child-prc.intel.com)|10.239.120.56|:913... connected.
Proxy request sent, awaiting response... 200 OK
Length: 788 [text/plain]
Saving to: ‘cifar100_res50PretrainI21k.yaml’


2023-03-19 22:47:46 (22.8 MB/s) - ‘cifar100_res50PretrainI21k.yaml’ saved [788/788]



Take a detailed look at the configuration.

In [None]:
! cat cifar100_res50PretrainI21k.yaml

experiment:
  project: "finetuner"
  tag: "cifar100_res50_PretrainI21k"
  strategy: "OnlyFinetuneStrategy"

output_dir: ".data/"
train_epochs: 1
enable_ipex: True

### dataset
data_set: "cifar100"
data_path:  ".data/"
num_workers: 4
input_size: 112

### model
model_type: "resnet50"

## finetuner
finetuner:
    type: "Basic"
    pretrain: '.data/resnet50_miil_21k.pth'
    pretrained_num_classes: 11221
    finetuned_lr: 0.00445
    frozen: False

## optimizer
optimizer: "SGD"
learning_rate: 0.00753
weight_decay: 0.00115
momentum: 0.9

### scheduler
lr_scheduler: "CosineAnnealingLR"
lr_scheduler_config:
    T_max: 200

### early stop
early_stop: "EarlyStopping"
early_stop_config:
    tolerance_epoch: 5
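
The early-stop settings can be read as "stop once the validation metric has not improved for `tolerance_epoch` consecutive epochs". The sketch below illustrates that interpretation. The `delta` and `is_max` parameters mirror keys that appear in the printed runtime configuration, but their exact semantics in e2eAIOK are an assumption here, and `EarlyStopper` is a hypothetical class, not the library's `EarlyStopping`.

```python
class EarlyStopper:
    """Hypothetical early-stopping helper; not the e2eAIOK implementation."""

    def __init__(self, tolerance_epoch: int = 5, delta: float = 0.0001, is_max: bool = True):
        self.tolerance_epoch = tolerance_epoch  # epochs without improvement to tolerate
        self.delta = delta                      # minimum change that counts as improvement
        self.is_max = is_max                    # True if a larger metric is better
        self.best = None
        self.stale_epochs = 0

    def should_stop(self, metric: float) -> bool:
        """Record one validation metric and report whether training should stop."""
        if self.best is None:
            improved = True
        elif self.is_max:
            improved = metric > self.best + self.delta
        else:
            improved = metric < self.best - self.delta
        if improved:
            self.best = metric
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.tolerance_epoch


stopper = EarlyStopper(tolerance_epoch=2)
history = [80.62, 80.90, 80.85, 80.88, 81.50]  # made-up accuracies for illustration
stopped_at = next((i for i, acc in enumerate(history) if stopper.should_stop(acc)), None)
print(stopped_at)  # 3
```

With a tolerance of 2, training stops at index 3 because neither 80.85 nor 80.88 beats the best value of 80.90, even though a later epoch would have improved. A larger tolerance trades longer training for a lower chance of stopping too early.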

### 3.3 Launch Training with Finetuner
**Training resnet50 on CIFAR100 with Finetuner:**

Only the configuration file needs to change; we can then train the model with Finetuner using a single command.

In [None]:
! python -u /usr/local/lib/python3.9/dist-packages/e2eAIOK/ModelAdapter/main.py --cfg cifar100_res50PretrainI21k.yaml



Please cite the following paper when using nnUNet:

Isensee, F., Jaeger, P.F., Kohl, S.A.A. et al. "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation." Nat Methods (2020). https://doi.org/10.1038/s41592-020-01008-z


If you have questions or suggestions, feel free to open an issue at https://github.com/MIC-DKFZ/nnUNet

See abnormal behavior in dataloader when enable IPEX in PyTorch 1.12, set enable_ipex to False!
configurations:
{'lr_scheduler': 'CosineAnnealingLR', 'pretrain': '', 'eval_epochs': 1, 'criterion': 'CrossEntropyLoss', 'data_set': 'cifar100', 'early_stop_config': {'tolerance_epoch': 15, 'delta': 0.0001, 'is_max': True}, 'dkd': {'alpha': 1.0, 'beta': 8.0, 'temperature': 4.0, 'warmup': 20}, 'output_dir': '/home/vmagent/app/data/model', 'data_path': '/home/vmagent/app/data/dataset/cifar', 'loss_weight': {'backbone': 1.0, 'distiller': 0.0, 'adapter': 0.0}, 'eval_batch_size': 128, 'lr_scheduler_config': {'decay_stages': [], 'decay_patien

2023-02-06 03:59:54 50/391
2023-02-06 04:00:01 60/391
2023-02-06 04:00:08 70/391
[2023-02-06 04:00:14] rank(0) epoch(0) Validation: accuracy = 80.6200;	loss = 0.6625
Best Epoch: 0, accuracy: 80.62000274658203
Epoch 0 took 998.8511202335358 seconds
Total seconds:998.85387
Totally take 1001.8232228755951 seconds
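
The `/391` step counter in the log can be checked with a short calculation. The sketch assumes the CIFAR100 training split of 50,000 images and the `train_batch_size` of 128 and `drop_last` of `False` printed in the baseline configuration; that the Finetuner run used the same values is an assumption, since its printed configuration is truncated.

```python
import math

train_images = 50_000   # size of the CIFAR100 training split
batch_size = 128        # 'train_batch_size' in the printed baseline configuration

# With drop_last=False the final partial batch is kept, so round up.
steps_keep_last = math.ceil(train_images / batch_size)
# With drop_last=True the final partial batch would be discarded, so round down.
steps_drop_last = train_images // batch_size

print(steps_keep_last, steps_drop_last)  # 391 390
```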
