## Reference Implementation

### ***E2E Architecture***
### **Use Case E2E flow**
![Use_case_flow](assets/E2E_2.PNG)


### Solution setup
Use the following cell to switch to the correct kernel, then check that you are in the `stock` kernel. If not, navigate to `Kernel > Change kernel > Python [conda env:stock]`. Note that the cell will keep showing `*` after it runs, but you can continue running the cells that follow.

In [None]:
%%javascript
Jupyter.notebook.session.restart({kernel_name: 'conda-env-stock-tf-py'})

### Training

We trained a CNN model with 6 convolution layers and 5 dense layers to classify chest X-ray images as normal or pneumonia in the production pipeline.

| **Input Size** | 416x608
| :--- | :---
| **Output Model format** | TensorFlow checkpoint

### Training CNN model

**Capturing the time for training**
<br>Run the training module as shown below to start training and prediction in the active environment. The module takes the following option.
```
usage: medical_diagnosis_initial_training.py  [--datadir] 

optional arguments:
  -h,                   show this help message and exit
  
  --datadir 
                        Absolute path to the "chest_xray" data folder, which
                        contains the "train", "test" and "val" folders; each of
                        these contains the "Pneumonia" and "NORMAL" folders
```
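
Before starting a long training run, it can help to confirm that the data folder matches the layout described in the usage text. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and the class folder names are copied from the usage text, so adjust them if your copy of the dataset uses different capitalization.

```python
from pathlib import Path

# Assumed names: split and class folders as described in the usage text.
SPLITS = ("train", "test", "val")
CLASSES = ("NORMAL", "Pneumonia")


def missing_dataset_dirs(data_dir, splits=SPLITS, classes=CLASSES):
    """Return the expected sub-folders that are absent under data_dir."""
    root = Path(data_dir)
    missing = []
    for split in splits:
        for cls in classes:
            folder = root / split / cls
            if not folder.is_dir():
                missing.append(str(folder.relative_to(root)))
    return missing


problems = missing_dataset_dirs("./data/chest_xray")
if problems:
    print("Missing folders:", ", ".join(problems))
else:
    print("Dataset layout looks complete.")
```

An empty list means every split contains both class folders.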

**Command to run training**

```sh
python src/medical_diagnosis_initial_training.py  --datadir ./data/chest_xray
```
By default, the model checkpoint is saved in the "model" folder.

> **Note**: If a cv2 dependency error occurs, such as "cv2 import *ImportError: libGL.so.1: cannot open shared object file", run `sudo apt install libgl1-mesa-glx`, `apt-get install libgl1 -y` and `pip install protobuf==3.20.*`.

In [None]:
!python3 src/medical_diagnosis_initial_training.py --datadir ./data/chest_xray

### Hyperparameter tuning

**The hyperparameters used are listed below**
<br>The dataset remains the same, with a 90:10 split for training and testing. Training is run multiple times on the same dataset with different hyperparameters.

The following parameters are used for tuning:

<br>"learning rates"      : [0.001, 0.01]
<br>"batchsize"           : [10, 20]
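
To make the size of the search concrete, the grid above can be enumerated as in the sketch below. This is an illustration only and is not taken from `medical_diagnosis_hyperparameter_tuning.py`: the function names and dictionary keys are hypothetical, and `split_counts` simply shows what a 90:10 split means for a given number of samples.

```python
from itertools import product

# Values taken from the tuning grid listed above.
LEARNING_RATES = [0.001, 0.01]
BATCH_SIZES = [10, 20]


def hyperparameter_grid(learning_rates, batch_sizes):
    """Yield one dict per (learning rate, batch size) combination."""
    for lr, bs in product(learning_rates, batch_sizes):
        yield {"learning_rate": lr, "batch_size": bs}


def split_counts(n_samples, train_fraction=0.9):
    """Return (train, test) sample counts for a train/test split."""
    n_train = round(n_samples * train_fraction)
    return n_train, n_samples - n_train


grid = list(hyperparameter_grid(LEARNING_RATES, BATCH_SIZES))
for trial in grid:
    print(trial)
```

Two learning rates and two batch sizes give four training runs on the same dataset.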

```
usage: medical_diagnosis_hyperparameter_tuning.py 

optional arguments:
  -h,                   show this help message and exit
  

  --datadir 
                        Absolute path to the "chest_xray" data folder, which
                        contains the "train", "test" and "val" folders; each of
                        these contains the "Pneumonia" and "NORMAL" folders

```
**Command to run hyperparameter tuning**

```sh
python src/medical_diagnosis_hyperparameter_tuning.py   --datadir  ./data/chest_xray
```
By default, the best model checkpoint is saved in the "model" folder.

In [None]:
!python src/medical_diagnosis_hyperparameter_tuning.py   --datadir  ./data/chest_xray



**Convert the model to frozen graph**

Run the conversion module to convert the TensorFlow checkpoint model format to frozen graph format.

```
usage: python src/model_conversion.py [-h] [--model_dir] [--output_node_names]

optional arguments:
  -h  
                            show this help message and exit
  --model_dir
                            Path to the latest checkpoint, e.g.
                            "./model"; a default path is set

  --output_node_names       Default name is set to "Softmax"
```
**Command to run conversion**

```sh
python src/model_conversion.py --model_dir ./model  --output_node_names Softmax
```
>**Note**: We also need to generate the stock frozen_graph.pb and move all stock model files into a new folder named "stockmodel" inside the "model" folder, so that they are not overwritten when we run the scripts in the Intel environment.
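
The move described in the note can be done by hand or with a few lines of Python. The sketch below is one possible way to do it, assuming the stock model files sit directly in the `model` folder; the helper name `move_model_files` is hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def move_model_files(model_dir, subfolder="stockmodel"):
    """Move every regular file in model_dir into model_dir/subfolder."""
    src = Path(model_dir)
    dest = src / subfolder
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() snapshots the listing before any file is moved.
    for item in sorted(src.iterdir()):
        if item.is_file():  # sub-folders, including dest, are left in place
            shutil.move(str(item), str(dest / item.name))
            moved.append(item.name)
    return moved
```

Calling `move_model_files("./model")` after the stock conversion step leaves the `model` folder free for the files produced in the Intel environment.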

In [None]:
!python src/model_conversion.py --model_dir ./model  --output_node_names Softmax

### Inference

Running inference using stock TensorFlow 2.8.0

```
usage: inference.py [--codebatchsize ] [--modeldir ]

optional arguments:
  -h,                       show this help message and exit

  --codebatchsize           --codebatchsize
                              batchsize used for inference
                        
  --modeldir                --modeldir         
                              provide frozen Model path ".pb" file...users can also
                              use INC INT8 quantized model here

```
**Command to run inference**

```sh
python src/inference.py --codebatchsize 1  --modeldir ./stockmodel/updated_model.pb
```
>**Note**: As mentioned earlier, all models generated in the stock environment need to be moved to the "stockmodel" folder. The value of codebatchsize can be changed (1, 32, 64, 128).
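
To sweep the batch sizes listed in the note without retyping the command, the argument lists can be generated in Python. This is a sketch under the assumption that `src/inference.py` accepts exactly the two flags shown above; the helper name `inference_commands` is hypothetical, and the loop only prints the commands rather than running them.

```python
import sys

# Batch sizes listed in the note above.
BATCH_SIZES = (1, 32, 64, 128)


def inference_commands(model_path, batch_sizes=BATCH_SIZES,
                       script="src/inference.py"):
    """Build one argument list per batch size for the inference script."""
    return [
        [sys.executable, script,
         "--codebatchsize", str(bs),
         "--modeldir", model_path]
        for bs in batch_sizes
    ]


for cmd in inference_commands("./stockmodel/updated_model.pb"):
    # To execute a command, pass it to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```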

In [None]:
!python src/inference.py --codebatchsize 1  --modeldir ./model/updated_model.pb

## Optimizing the E2E solution with Intel® oneAPI components

### **Use Case E2E flow**

![Use_case_flow](assets/E2E_1.PNG)

### Environment Creation

**Setting up the environment for Intel oneDNN optimized TensorFlow**<br>Follow the conda installation commands below to set up the Intel oneDNN optimized TensorFlow environment for model training and prediction.
```sh
conda env create -f env/intel/intel-tf.yml
```
*Activate intel conda environment*
Use the following command to activate the environment that was created:

```sh
conda activate intel-tf 
export TF_ENABLE_ONEDNN_OPTS=1
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
```
>**Note**: We need to set the above flags every time before running the scripts below.
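
When the scripts are launched from Python rather than from a shell, the same flags can be passed through the child process environment. The sketch below shows one way to do this; the helper name `intel_tf_env` is hypothetical, and the flag values are the ones exported above.

```python
import os

# Flags from the activation step above.
INTEL_TF_FLAGS = {
    "TF_ENABLE_ONEDNN_OPTS": "1",
    "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION": "python",
}


def intel_tf_env(base_env=None):
    """Return a copy of the environment with the Intel TensorFlow flags set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(INTEL_TF_FLAGS)
    return env


env = intel_tf_env()
print({name: env[name] for name in INTEL_TF_FLAGS})
```

The returned dictionary can be passed as `env=` to `subprocess.run`. If the flags are set inside a Python process instead, they should be placed in `os.environ` before TensorFlow is imported, since such settings are generally read at import time.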

### Solution setup
Use the following cell to switch to the correct kernel, then check that you are in the `intel` kernel. If not, navigate to `Kernel > Change kernel > Python [conda env:intel]`. Note that the cell will keep showing `*` after it runs, but you can continue running the cells that follow.

In [None]:
%%javascript
Jupyter.notebook.session.restart({kernel_name: 'conda-env-intel-tf-py'})

### Training CNN model

**Capturing the time for training**
<br>Run the training module as shown below to start training and prediction in the active environment. The module takes the following option.
```
usage: medical_diagnosis_initial_training.py  [--datadir] 

optional arguments:
  -h,                   show this help message and exit
  
  --datadir 
                        Absolute path to the "chest_xray" data folder, which
                        contains the "train", "test" and "val" folders; each of
                        these contains the "Pneumonia" and "NORMAL" folders
```
**Command to run training**

```sh
python src/medical_diagnosis_initial_training.py  --datadir ./data/chest_xray
```
By default, the model checkpoint is saved in the "model" folder.

> **Note**: If a gcc dependency error occurs, upgrade gcc using `sudo apt install build-essential`.
The above training command runs in the Intel environment, and the trained model is saved in TensorFlow checkpoint format.

In [None]:
%%bash
export TF_ENABLE_ONEDNN_OPTS=1
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
python src/medical_diagnosis_initial_training.py  --datadir ./data/chest_xray

### Hyperparameter tuning

**The hyperparameters used are listed below**
<br>The dataset remains the same, with a 90:10 split for training and testing. Training is run multiple times on the same dataset with different hyperparameters.

The following parameters are used for tuning:

<br>"learning rates"      : [0.001, 0.01]
<br>"batchsize"           : [10, 20]

```
usage: medical_diagnosis_hyperparameter_tuning.py 

optional arguments:
  -h,                   show this help message and exit
  

  --datadir 
                        Absolute path to the "chest_xray" data folder, which
                        contains the "train", "test" and "val" folders; each of
                        these contains the "Pneumonia" and "NORMAL" folders

```
**Command to run hyperparameter tuning**

```sh
python src/medical_diagnosis_hyperparameter_tuning.py --datadir  ./data/chest_xray
```
By default, the model checkpoint is saved in the "model" folder.

> **Note**: The best accuracy was obtained with --codebatchsize 20 and --learningRate 0.001; the resulting model is also compatible with INC conversion.

In [None]:
%%bash
export TF_ENABLE_ONEDNN_OPTS=1
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
python src/medical_diagnosis_hyperparameter_tuning.py --datadir  ./data/chest_xray

<br>**Convert the model to frozen graph**

Run the conversion module to convert the TensorFlow checkpoint model format to frozen graph format. This frozen graph can be later used for Inferencing, INC and Intel® Distribution of OpenVINO™.
```
usage: python src/model_conversion.py [-h] [--model_dir] [--output_node_names]

optional arguments:
  -h  
                            show this help message and exit
  --model_dir
                            Path to the latest checkpoint, e.g.
                            "./model"; a default path is set

  --output_node_names       Default name is set to "Softmax"
```
**Command to run conversion**

```sh
python src/model_conversion.py --model_dir ./model --output_node_names Softmax
```
> **Note**: Make sure that the Intel frozen_graph.pb is generated using only the Intel model files.


In [None]:
!python src/model_conversion.py --model_dir ./model --output_node_names Softmax

### Inference

Inference was performed on the trained model using TensorFlow 2.9.0 with oneDNN.

#### Running inference using TensorFlow

```
usage: inference.py [--codebatchsize ] [--modeldir ]

optional arguments:
  -h,                       show this help message and exit

  --codebatchsize           --codebatchsize
                              batchsize used for inference
                        
  --modeldir                --modeldir         
                              provide frozen Model path ".pb" file...users can also
                              use INC INT8 quantized model here

```
**Command to run inference**

```sh
OMP_NUM_THREADS=4 KMP_BLOCKTIME=100 python src/inference.py --codebatchsize 1  --modeldir ./model/updated_model.pb
```
>**Note**: The above inference script can be run in the Intel environment with different batch sizes.<br>

In [None]:
!OMP_NUM_THREADS=4 KMP_BLOCKTIME=100 python src/inference.py --codebatchsize 1  --modeldir ./model/updated_model.pb

### Quantize trained models using Intel® Neural Compressor

Intel® Neural Compressor is used to quantize the FP32 model to an INT8 model. The optimized model is then used for evaluation and timing analysis.
Intel® Neural Compressor supports many optimization methods. In this case, we used post-training quantization with the `Default Quantization Mode` method to quantize the FP32 model.
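
As background on what moving from FP32 to 8-bit integers involves, the sketch below applies a generic affine quantization to a handful of values and then maps them back. It is a conceptual illustration only and does not reproduce the algorithm used by Intel® Neural Compressor; the function names and the unsigned 0 to 255 range are assumptions made for this example.

```python
def quantize_uint8(values):
    """Affine-quantize a list of floats to integers in [0, 255]."""
    lo = min(min(values), 0.0)  # include 0.0 so it stays exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:            # all-zero input: any positive scale works
        scale = 1.0
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 2.0]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each restored value differs from the original by at most half a quantization step, which is the precision traded for the smaller and faster INT8 representation.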

>**Note**: Make sure that the Intel frozen_graph.pb is generated using only the Intel model files. We recommend first running the hyperparameter tuning script with default parameters to get a new model, then converting it to a frozen graph and using that to produce the compressed model. If the model is corrupted for any reason, the script below will not run.

*Step-1: Conversion of FP32 Model to INT8 Model*

```
usage: src/INC/neural_compressor_conversion.py  [--modelpath] ./model/updated_model.pb  [--outpath] ./model/output/compressedmodel.pb [--config]  ./src/INC/deploy.yaml

optional arguments:
  -h                          show this help message and exit

  --modelpath                 --modelpath 
                                Model path trained with TensorFlow ".pb" file
  --outpath                   --outpath 
                                by default, the output quantized model is saved in the "./model/output" folder
  --config                    --config 
                                Yaml file for quantizing model, default is "./deploy.yaml"
  
```

**Command to run the neural_compressor_conversion**
> Activate intel Environment before running

```
 python src/INC/neural_compressor_conversion.py  --modelpath  ./model/updated_model.pb  --outpath ./model/output/compressedmodel.pb  --config  ./src/INC/deploy.yaml
```
> Quantized model will be saved by default in `model/output` folder as `compressedmodel.pb`

In [None]:
!python src/INC/neural_compressor_conversion.py  --modelpath  ./model/updated_model.pb  --outpath ./model/output/compressedmodel.pb  --config  ./src/INC/deploy.yaml

*Step-2: Inferencing using quantized Model*

```
usage: inference_inc.py [--codebatchsize ] [--modeldir ]

optional arguments:
  -h,                       show this help message and exit

  --codebatchsize           --codebatchsize
                              batchsize used for inference
                        
  --modeldir                --modeldir         
                              provide frozen Model path ".pb" file...users can also
                              use INC INT8 quantized model here

```
**Command to run inference**

```sh
OMP_NUM_THREADS=4 KMP_BLOCKTIME=100 python src/INC/inference_inc.py --codebatchsize 1  --modeldir ./model/updated_model.pb
```
>**Note**: The above inference script can be run in the Intel environment with different batch sizes.<br>
The same script can be used to benchmark the INC INT8 quantized model. For more details, refer to the INC quantization section. By using different batch sizes, one can observe the gain obtained with Intel® oneDNN optimized TensorFlow in the Intel environment.<br>

Run this script over multiple trials and take the minimum of the recorded values.
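
Taking the minimum over several trials can be sketched as follows. The helper name `min_elapsed` is hypothetical, and the workload in the example is a stand-in; in practice `run_once` would launch the inference script or call the inference function being measured.

```python
import time


def min_elapsed(run_once, trials=5):
    """Call run_once() `trials` times and return the shortest wall-clock time in seconds."""
    timings = []
    for _ in range(trials):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in workload; replace it with a call that runs inference.
best = min_elapsed(lambda: sum(range(100_000)), trials=3)
print(f"best of 3 trials: {best:.6f} s")
```

The minimum is less sensitive to one-off delays such as cache warm-up or background load than a single measurement.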

In [None]:
!OMP_NUM_THREADS=4 KMP_BLOCKTIME=100 python src/INC/inference_inc.py --codebatchsize 1  --modeldir ./model/updated_model.pb

*Step-3: Performance of quantized Model*

```
usage: src/INC/run_inc_quantization_acc.py  [--datapath]   [--fp32modelpath]  [--config]   [--int8modelpath ]

optional arguments:
  -h,                       show this help message and exit

  --datapath                --datapath
                              need to mention absolute path of data
                        
  --fp32modelpath           --fp32modelpath         
                              provide frozen Model path ".pb" file...(Absolute path)

  --config                  --config        
                              provide config path...(Absolute path)

  --int8modelpath          --int8modelpath      
                             provide int8 model path ".pb" file...(Absolute path)
                              

```

**Command to run Evaluation of INT8 Model**

```sh
python src/INC/run_inc_quantization_acc.py --datapath ./data/chest_xray/val --fp32modelpath ./model/updated_model.pb --config ./src/INC/deploy.yaml --int8modelpath ./model/output/compressedmodel.pb
```

In [None]:
!python src/INC/run_inc_quantization_acc.py --datapath ./data/chest_xray/val --fp32modelpath ./model/updated_model.pb --config ./src/INC/deploy.yaml --int8modelpath ./model/output/compressedmodel.pb

### Quantize trained models using  Intel® Distribution of OpenVINO™

When deploying this model on edge devices with limited computing and memory resources, we need to explore further options for quantizing and compressing the model that deliver the same level of accuracy while using the underlying computing resources efficiently. Intel® Distribution of OpenVINO™ Toolkit facilitates the optimization of a deep learning model from a framework and its deployment, using an inference engine, on computing platforms based on Intel hardware accelerators. The section below covers the steps to use this toolkit to quantize the model and measure its performance.

**Intel® Distribution of OpenVINO™ Intermediate Representation (IR) conversion** <br>
Below are the steps to convert TensorFlow frozen graph representation to OpenVINO IR using model optimizer.

*Environment Setup*

Intel® Distribution of OpenVINO™ is installed in a separate OpenVINO environment, since Intel® Distribution of OpenVINO™ supports TensorFlow<2.6.0.

```sh
conda env create -f env/OpenVINO.yml
```
*Activate OpenVINO environment*
```sh
conda activate OpenVINO
```


The frozen graph model should be generated from the trained TensorFlow checkpoint model using `model_conversion.py` after training.

**Command to create Intel® Distribution of OpenVINO™ FPIR model**

```sh
mo --input_meta_graph ./model/Medical_Diagnosis_CNN.meta --input_shape="[1,300,300,3]" --mean_values="[127.5,127.5,127.5]" --scale_values="[127.5]" --data_type FP32 --output_dir ./model  --input="Placeholder" --output="Softmax"
```

>**Note**: The above step generates `Medical_Diagnosis_CNN.bin` and `Medical_Diagnosis_CNN.xml` as output in `model`, which can be used with the OpenVINO inference application. The default precision is FP32.
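
The `--mean_values` and `--scale_values` options in the command above embed input normalization into the converted model: the mean is subtracted from each pixel value and the result is divided by the scale. The short sketch below works through that arithmetic for the values used here; the function name `normalize_pixel` is only for illustration.

```python
MEAN = 127.5   # from --mean_values
SCALE = 127.5  # from --scale_values


def normalize_pixel(value, mean=MEAN, scale=SCALE):
    """Apply mean subtraction followed by division by the scale value."""
    return (value - mean) / scale


for pixel in (0, 127.5, 255):
    print(pixel, "->", normalize_pixel(pixel))
```

With these settings, 8-bit pixel values in the range 0 to 255 are mapped to the range -1 to 1.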



In [None]:
%%javascript
Jupyter.notebook.session.restart({kernel_name: 'conda-env-OpenVINO-py'})

In [None]:
!mo --input_meta_graph ./model/Medical_Diagnosis_CNN.meta --input_shape="[1,300,300,3]" --mean_values="[127.5,127.5,127.5]" --scale_values="[127.5]" --data_type FP32 --output_dir ./model  --input="Placeholder" --output="Softmax"

#### Model Quantization

```
python src/OPENVINO/run_openvino_script.py  --datapath ./data/chest_xray/val  --modelpath ./model/Medical_Diagnosis_CNN.xml

optional arguments:
  -h,                     show this help message and exit

  --modelpath,            --modelpath
  
  --datapath              --datapath
                            dataset folder containing "val"
      
```
**Command to run conversion of OpenVINO FPIR model to INT8 model**

```sh
python src/OPENVINO/run_openvino_script.py  --datapath ./data/chest_xray/val  --modelpath ./model/Medical_Diagnosis_CNN.xml
```

> The above step quantizes the model and generates `Medical_Diagnosis_CNN.bin` and `Medical_Diagnosis_CNN.xml` as output in `./model/optimized`, which can be used for Intel® Distribution of OpenVINO™ throughput and latency benchmarking. The precision after quantization is INT8.

In [None]:
!python src/OPENVINO/run_openvino_script.py  --datapath ./data/chest_xray/val  --modelpath ./model/Medical_Diagnosis_CNN.xml



#### Benchmarking with  Intel® Distribution of OpenVINO™ Post-Training Optimization Tool

**Running inference using Intel® Distribution of OpenVINO™**<br>The commands below perform inference using Intel® Distribution of OpenVINO™. The model first needs to be converted to IR format as described in the IR conversion section above.
The Post-training Optimization Tool (POT) is designed to accelerate the inference of deep learning models by applying special methods, such as post-training quantization, without model retraining or fine-tuning.

*Pre-requisites*
-  Intel® Distribution of OpenVINO™ Toolkit
-  Intel® Distribution of OpenVINO™ IR model converted at FP32/FP16 precision
-  Intel® Distribution of OpenVINO™ INT8 model converted from the FPIR model

**Performance Benchmarking of full precision (FP32) Model**<br>Use the below command to run the benchmark tool for the FPIR model generated using this codebase for the Pneumonia detection. 

```sh
# Latency mode:
benchmark_app -m ./model/Medical_Diagnosis_CNN.xml -api async -niter 120 -nireq 1 -b 1 -nstreams 1 -nthreads 8

# Throughput mode:
benchmark_app -m ./model/Medical_Diagnosis_CNN.xml -api async -niter 120 -nireq 8 -b 32 -nstreams 8 -nthreads 8
```

**Performance Benchmarking of INT8 precision Model**<br>Use the below command to run the benchmark tool for the quantized INT8 model. 

```sh
# Latency mode:
benchmark_app -m ./model/optimized/Medical_Diagnosis_CNN.xml  -api async -niter 120 -nireq 1 -b 1 -nstreams 1 -nthreads 8

# Throughput mode:
benchmark_app -m ./model/optimized/Medical_Diagnosis_CNN.xml  -api async -niter 120 -nireq 8 -b 32 -nstreams 8 -nthreads 8
```
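
As a rough cross-check of the throughput that the benchmark tool reports, the number of images processed per second can be estimated from the iteration count, the batch size and the total run time. The sketch below assumes that every iteration processes one full batch; the function name is hypothetical and the duration in the example is a placeholder, not a measured result.

```python
def frames_per_second(niter, batch_size, total_seconds):
    """Images processed per second, assuming each iteration handles one full batch."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return niter * batch_size / total_seconds


# -niter and -b come from the throughput-mode command above; the duration is a
# made-up placeholder to be replaced with the value from your own run.
example_duration_s = 10.0
print(frames_per_second(niter=120, batch_size=32, total_seconds=example_duration_s))
```

Comparing this figure for the FP32 and INT8 models under the same settings gives a simple view of the speedup from quantization.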

In [None]:
!benchmark_app -m ./model/Medical_Diagnosis_CNN.xml -api async -niter 120 -nireq 1 -b 1 -nstreams 1 -nthreads 8 -hint none
!benchmark_app -m ./model/Medical_Diagnosis_CNN.xml -api async -niter 120 -nireq 8 -b 32 -nstreams 8 -nthreads 8 -hint none

In [None]:
!benchmark_app -m ./model/optimized/Medical_Diagnosis_CNN.xml  -api async -niter 120 -nireq 1 -b 1 -nstreams 1 -nthreads 8 -hint none
!benchmark_app -m ./model/optimized/Medical_Diagnosis_CNN.xml  -api async -niter 120 -nireq 8 -b 32 -nstreams 8 -nthreads 8 -hint none