## OpenVINO optimizations for a text classification task


## Import the packages needed for successful execution

In [17]:
from transformers import AutoConfig, AutoTokenizer, default_data_collator
from datasets import load_dataset, load_metric
from optimum.intel.openvino import OVAutoModelForSequenceClassification

from torch.utils.data import DataLoader

from tqdm import tqdm



### Instructions on conversion to OpenVINO
We will use the OpenVINO™ Integration with Optimum module to convert the sentiment classification model to an OpenVINO model object.
We will then use the Hugging Face `datasets` library and one of its metrics to evaluate the converted model.

In [37]:
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
ov_model = OVAutoModelForSequenceClassification.from_pretrained(model_name, config=config, from_pt=True)
ov_model.save_pretrained('saved_model')

Loading model from PT file, inputs: None


Exception raised from index_select_out_cpu_ at ../aten/src/ATen/native/TensorAdvancedIndexing.cpp:758 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f5f89458302 in /home/dkarkada/miniconda3/envs/optimumtests/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: at::native::index_select_out_cpu_(at::Tensor const&, long, at::Tensor const&, at::Tensor&) + 0x2a9 (0x7f5f77db9b89 in /home/dkarkada/miniconda3/envs/optimumtests/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #2: at::native::index_select_cpu_(at::Tensor const&, long, at::Tensor const&) + 0x60 (0x7f5f77dbc5e0 in /home/dkarkada/miniconda3/envs/optimumtests/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0x19acf62 (0x7f5f78531f62 in /home/dkarkada/miniconda3/envs/optimumtests/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: at::redispatch::index_select(c10::DispatchKeySet, at::Tensor const&, long, at::Tensor 

### Preprocess function for the dataset


In [12]:
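def preprocess_function(examples):
    # Tokenize each sentence and pad or truncate it to exactly 128 tokens
    result = tokenizer(examples['sentence'], padding="max_length", max_length=128, truncation=True)
    # The data collator expects the target column to be named "labels"
    result["labels"] = examples["label"]
    return result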
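
To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a minimal pure-Python sketch. It is not the tokenizer's actual implementation: the helper name `pad_or_truncate`, the example token ids, and the padding id `0` are assumptions chosen for illustration, and a real tokenizer also takes care to keep special tokens in place when it truncates.

```python
# Illustrative sketch only: mimics the *effect* of padding="max_length" and
# truncation=True on a list of token ids. Hypothetical helper, not library code.

def pad_or_truncate(token_ids, max_length=128, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    ids = list(token_ids)[:max_length]   # truncation: drop tokens past max_length
    mask = [1] * len(ids)                # 1 marks a real token
    n_pad = max_length - len(ids)
    ids += [pad_id] * n_pad              # padding: fill up to the fixed length
    mask += [0] * n_pad                  # 0 marks a padding position
    return ids, mask


# The token ids below are made up for the demonstration.
input_ids, attention_mask = pad_or_truncate([101, 2023, 3185, 102], max_length=8)
print(input_ids)
print(attention_mask)
```

Every example ends up with the same length, which is what lets the data collator stack the examples into fixed-shape arrays for the model. The attention mask tells the model which positions to ignore.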



### Evaluating the model performance

In [39]:
dataset = load_dataset("sst2")
# Accuracy is the metric reported in the output of this cell
metric = load_metric('accuracy')

# Tokenize every split and drop the raw columns (sentence, label, idx)
dataset = dataset.map(preprocess_function, batched=True, remove_columns=dataset["train"].column_names)
# No batch_size is given, so the DataLoader yields one example per batch
val_dataloader = DataLoader(
    dataset['validation'], shuffle=True, collate_fn=default_data_collator
)

for batch in tqdm(val_dataloader, desc="Looping over validation data"):
    outputs = ov_model(input_ids=batch['input_ids'].numpy(), attention_mask=batch['attention_mask'].numpy())
    # outputs[0] holds the logits; with a single example per batch, argmax() is the predicted class
    preds = outputs[0].argmax()
    references = batch['labels'].numpy()
    metric.add_batch(predictions=[preds], references=[references])

ov_score = metric.compute()
print(f'Score for SST2 dataset with OV Optimum: {ov_score}')



100%|██████████| 3/3 [00:00<00:00, 434.57it/s]
Downloading builder script: 4.21kB [00:00, 862kB/s]                    
100%|██████████| 2/2 [00:00<00:00, 11.59ba/s]
Looping over validation data: 100%|██████████| 872/872 [00:18<00:00, 47.82it/s]


Score for SST2 dataset with OV Optimum: {'accuracy': 0.9105504587155964}
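
As a rough illustration of how a score like the one above is obtained, the sketch below turns per-example logits into class predictions with an argmax and then computes accuracy as the fraction of correct predictions. It is a hand-rolled sketch, not the metric library's code. The helper names `argmax` and `accuracy` and the logit values are made up for the example.

```python
# Illustrative sketch only: accuracy computed from raw logits in pure Python.

def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits_per_example, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    predictions = [argmax(logits) for logits in logits_per_example]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical logits for four sentences: [negative score, positive score]
logits = [[-2.1, 2.3], [1.7, -1.5], [0.2, 0.9], [-0.4, 0.1]]
labels = [1, 0, 0, 1]
print(accuracy(logits, labels))  # 3 of 4 predictions are correct: 0.75
```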


### Benchmark the converted model using the benchmark app
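The OpenVINO toolkit provides a benchmarking application that measures the platform-specific runtime performance a given model can achieve under optimal configuration parameters. The command in this section runs the saved model on the CPU (`-d CPU`) with the asynchronous API (`-api async`) for 10 seconds (`-t 10`), using the latency performance hint (`-hint latency`). For more details, refer to: https://docs.openvino.ai/latest/openvino_inference_engine_tools_benchmark_tool_README.html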
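
For intuition about what a time-boxed latency measurement involves, here is a minimal synchronous timing loop in plain Python. It is only a sketch under simplifying assumptions and does not reproduce how `benchmark_app` works internally (in particular, it does not model asynchronous inference requests). The names `measure_latency` and `dummy_inference` are hypothetical, and the dummy workload stands in for a real model call.

```python
import statistics
import time

# Illustrative sketch only: repeatedly call a function for a fixed duration
# and summarize the per-call latency. Not the benchmark_app implementation.

def measure_latency(infer_fn, duration_s=1.0, warmup=3):
    """Call infer_fn until duration_s has elapsed and return latency statistics."""
    for _ in range(warmup):              # warm-up calls are excluded from the stats
        infer_fn()
    latencies_ms = []
    begin = time.perf_counter()
    while True:
        start = time.perf_counter()
        infer_fn()
        end = time.perf_counter()
        latencies_ms.append((end - start) * 1000.0)
        if end - begin >= duration_s:    # stop once the time budget is used up
            break
    elapsed_s = end - begin
    return {
        "iterations": len(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "mean_ms": statistics.fmean(latencies_ms),
        "throughput_per_s": len(latencies_ms) / elapsed_s,
    }


def dummy_inference():
    """Stand-in workload; replace with a real model call to time it."""
    sum(i * i for i in range(10_000))


print(measure_latency(dummy_inference, duration_s=0.2))
```

The median is usually a more stable summary than the mean here, because a few slow calls (for example during warm-up or due to other processes) can skew the average.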

In [38]:
base_model_name = 'saved_model/ov_model.xml'
print('Benchmark OpenVINO model using the benchmark app')
! benchmark_app -m "$base_model_name" -d CPU -api async -t 10 -hint latency

Benchmark OpenVINO model using the benchmark app
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
[Step 1/11] Parsing and validating input arguments
[Step 2/11] Loading OpenVINO
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1

[Step 3/11] Setting device configuration
[Step 4/11] Reading network files
[ INFO ] Read model took 248.27 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the mode