# The attention center model with OpenVINO™

This notebook demonstrates how to use the [attention center model](https://github.com/google/attention-center/tree/main) with OpenVINO. The model is distributed in the [TensorFlow Lite format](https://www.tensorflow.org/lite). See [this article](https://opensource.googleblog.com/2022/12/open-sourcing-attention-center-model.html) for more information about the model.

## Imports

In [None]:
import time
import cv2

import numpy as np
import tensorflow as tf
from pathlib import Path
import matplotlib.pyplot as plt

from openvino.tools import mo
from openvino.runtime import serialize, Core

## Download the attention-center model

Download the model by cloning the repository https://github.com/google/attention-center/tree/main. The repository includes the model in its `model` folder. The clone is skipped if the `./attention-center` directory already exists.

In [None]:
if not Path('./attention-center').exists():
    ! git clone https://github.com/google/attention-center

### Convert TensorFlow Lite model to OpenVINO IR format

The attention-center model is a pre-trained model in TensorFlow Lite format. To use it with OpenVINO, convert it to OpenVINO IR format with Model Optimizer. For more information about Model Optimizer, see the [Model Optimizer Developer Guide](https://docs.openvino.ai/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html). The conversion is skipped if the IR model already exists; in that case the saved IR file is read instead.

In [None]:
tflite_model_path = Path("./attention-center/model/center.tflite")

ir_model_path = Path("./model/ir_center_model.xml")

core = Core()

if not ir_model_path.exists():
    model = mo.convert_model(tflite_model_path)
    serialize(model, ir_model_path.as_posix())
    print("IR model saved to {}".format(ir_model_path))
else:
    print("Read IR model from {}".format(ir_model_path))
    model = core.read_model(ir_model_path)

compiled_model = core.compile_model(model=model, device_name="CPU")

input_layer = compiled_model.input(0)
print(input_layer)

## Prepare image to use with attention-center model

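The attention-center model takes an RGB image with shape (480, 640), that is, height 480 and width 640, as input. The cell below converts the image from BGR to RGB, resizes it to that shape, and adds a batch dimension.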

In [None]:
class Image():
    def __init__(self, model_input_image_shape, filename):
        self.model_input_image_shape = model_input_image_shape

        self.image = cv2.imread(filename)
        self.real_input_image_shape = self.image.shape

    def prepare_image_tensor(self):
        rgb_image = cv2.cvtColor(self.image, cv2.COLOR_BGR2RGB)
        resized_image = cv2.resize(rgb_image, (self.model_input_image_shape[1], self.model_input_image_shape[0]))

        image_tensor = tf.constant(np.expand_dims(resized_image, axis=0),
                                   dtype=tf.float32)
        return image_tensor

    def scalt_center_to_real_image_shape(self, predicted_center):
        # Shapes are (height, width): index 1 scales x, index 0 scales y.
        new_center_x = round(predicted_center[0] * self.real_input_image_shape[1] / self.model_input_image_shape[1])
        new_center_y = round(predicted_center[1] * self.real_input_image_shape[0] / self.model_input_image_shape[0])
        return (new_center_x, new_center_y)

    def print_image(self, predicted_center=None):
        # Draw on a copy so that repeated calls do not accumulate circles on self.image.
        image_to_print = self.image.copy()
        if predicted_center is not None:
            image_to_print = cv2.circle(image_to_print,
                                        predicted_center,
                                        radius=10,
                                        color=(3, 3, 255),
                                        thickness=-1)

        plt.imshow(cv2.cvtColor(image_to_print, cv2.COLOR_BGR2RGB))


image_file_name = Path("../data/image/coco.jpg")
input_image = Image((480, 640), image_file_name.as_posix())
image_tensor = input_image.prepare_image_tensor()
input_image.print_image()
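
The rescaling step in the class above can be checked in isolation. The following is a minimal sketch, not part of the original notebook: `scale_point` is a hypothetical helper, and it assumes the point is given in (x, y) order, which matches how the notebook passes the scaled result to `cv2.circle`. The image sizes are made-up values for illustration.

```python
def scale_point(point_xy, model_hw, image_hw):
    """Map an (x, y) point from model-input pixels to original-image pixels.

    Both shapes are given as (height, width), as in the notebook.
    """
    model_h, model_w = model_hw
    image_h, image_w = image_hw
    x = round(point_xy[0] * image_w / model_w)  # x scales with width
    y = round(point_xy[1] * image_h / model_h)  # y scales with height
    return (x, y)


# Hypothetical values: the middle of the 480x640 model input,
# mapped onto an assumed 960x1280 original image.
print(scale_point((320.0, 240.0), model_hw=(480, 640), image_hw=(960, 1280)))
```

Keeping x paired with width and y with height is what makes the returned tuple directly usable as an OpenCV drawing coordinate.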

## Get result with OpenVINO IR model

The attention-center model returns a 2D point as output, which is the predicted center of human attention on the image.

In [None]:
output_layer = compiled_model.output(0)

# run inference
res = compiled_model([image_tensor])[output_layer]
print(f'Prediction at model input resolution: {res[0][0], res[0][1]}')
predicted_center = input_image.scalt_center_to_real_image_shape(res[0])
print(f'Prediction for real image resolution {predicted_center}')
input_image.print_image(predicted_center)

## Get result with TensorFlow Lite API

In [None]:
# load the TFLite model file
with open(tflite_model_path, 'rb') as f:
    tflite_model_content = f.read()
# initialize the interpreter
interpreter = tf.lite.Interpreter(model_content=tflite_model_content)
interpreter.allocate_tensors()

# create the interpreter's input index-by-name map
index_map = {m['name']: m['index'] for m in interpreter.get_input_details()}

# run inference
interpreter.set_tensor(index_map['image:0'], image_tensor)
interpreter.invoke()
output_details = interpreter.get_output_details()
tf_res = interpreter.get_tensor(output_details[0]['index'])
print(f'Prediction at model input resolution: {tf_res[0][0], tf_res[0][1]}')

tf_predicted_center = input_image.scalt_center_to_real_image_shape(tf_res[0])
print(f'Prediction for real image resolution {tf_predicted_center}')
input_image.print_image(tf_predicted_center)
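
Since the same image is run through both runtimes, it can be useful to check numerically that the two predicted centers agree. The sketch below is an illustration only: `centers_agree` and the one-pixel tolerance are assumptions, and the sample coordinates are placeholders rather than outputs of the model.

```python
import numpy as np


def centers_agree(center_a, center_b, tol_px=1.0):
    """Return True if two 2D points differ by at most tol_px on each axis."""
    a = np.asarray(center_a, dtype=np.float64)
    b = np.asarray(center_b, dtype=np.float64)
    return bool(np.all(np.abs(a - b) <= tol_px))


# Placeholder coordinates for illustration, not real model outputs.
print(centers_agree([300.2, 210.7], [300.6, 210.1]))
```

In this notebook, a call such as `centers_agree(res[0], tf_res[0])` would compare the OpenVINO and TensorFlow Lite predictions at model input resolution.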

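## Compare performance

Measure the average inference time over 100 runs of the same image, first with the OpenVINO IR model on CPU and then with the TensorFlow Lite interpreter.
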
In [None]:
num_images = 100

start = time.perf_counter()
for _ in range(num_images):
    res = compiled_model([image_tensor])[output_layer]
end = time.perf_counter()
time_ir = end - start

print(
    f"IR model in OpenVINO Runtime/CPU: {time_ir/num_images:.4f} "
    f"seconds per image, FPS: {num_images/time_ir:.2f}"
)


start = time.perf_counter()
for _ in range(num_images):
    interpreter.set_tensor(index_map['image:0'], image_tensor)
    interpreter.invoke()
    output_details = interpreter.get_output_details()
    pred_from_tflite = interpreter.get_tensor(output_details[0]['index'])
end = time.perf_counter()
time_tflite = end - start

print(
    f"TensorFlow Lite API: {time_tflite/num_images:.4f} "
    f"seconds per image, FPS: {num_images/time_tflite:.2f}"
)
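
The loops above report a mean over 100 runs, which includes any first-call overhead. One possible refinement, sketched here under assumptions, is to discard a few warm-up runs and report the median per-call time. `benchmark` is a hypothetical helper, and the lambda is a stand-in workload rather than the model.

```python
import statistics
import time


def benchmark(fn, num_runs=100, num_warmup=10):
    """Return the median wall-clock time in seconds of one call to fn."""
    for _ in range(num_warmup):
        fn()  # warm-up calls are executed but not timed
    timings = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Stand-in workload for illustration; replace with an inference call.
median_s = benchmark(lambda: sum(range(10_000)), num_runs=20, num_warmup=2)
print(f"median: {median_s:.6f} seconds per call")
```

In this notebook, the OpenVINO path could be timed with `benchmark(lambda: compiled_model([image_tensor])[output_layer])`.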