<img src="https://10botics.com/logo_jnb.png" width="300"/>

# Using a pose estimation model

<img src="./resource/estimation_on_image_sample.png" width=300 />

In this lesson, you will learn how to:

1. Import the TensorFlow libraries
2. Download a pose detection model (MoveNet)
3. Load a sample image
4. Resize the sample image
5. Run the model
6. Draw detected keypoints on the image

## Preparation

Refer to `install_tf.sh` if you have not yet installed the required TensorFlow libraries.

## Import the libraries

In [2]:
import tensorflow as tf
import numpy as np
import cv2

# Import matplotlib libraries
from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection
import matplotlib.patches as patches

# Some modules to display an animation using imageio.
import imageio
from IPython.display import HTML, display

print("TF version:", tf.__version__)

## Download the model

We are going to use MoveNet as the sample model. It is an ultra-fast and accurate pose detection model. For more information about MoveNet, see:

- https://www.tensorflow.org/hub/tutorials/movenet
- https://blog.tensorflow.org/2021/05/next-generation-pose-detection-with-movenet-and-tensorflowjs.html
- https://storage.googleapis.com/movenet/MoveNet.SinglePose%20Model%20Card.pdf

In [3]:
# Download the model 
!wget -nc -q -O model.tflite "https://tfhub.dev/google/lite-model/movenet/singlepose/lightning/tflite/int8/4?lite-format=tflite"

# Initialize the TFLite interpreter
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

## Define a function to use the model

In [4]:
def movenet(input_image):
    """Runs detection on an input image.

    Args:
      input_image: A [1, height, width, 3] tensor representing the input image
        pixels. Note that the height/width should already be resized and match the
        expected input resolution of the model before passing into this function.

    Returns:
      A [1, 1, 17, 3] float numpy array representing the predicted keypoint
      coordinates and scores.
    """
    # This TFLite model expects a uint8 input tensor.
    input_image = tf.cast(input_image, dtype=tf.uint8)
    input_details = interpreter.get_input_details()    
    output_details = interpreter.get_output_details()
    interpreter.set_tensor(input_details[0]['index'], input_image.numpy())
    # Invoke inference.
    interpreter.invoke()
    # Get the model prediction.
    keypoints_with_scores = interpreter.get_tensor(output_details[0]['index'])
    return keypoints_with_scores

### Model Input

`interpreter.get_input_details()` returns the specification of the input the model requires.

In [5]:
interpreter.get_input_details()

In [6]:
# Define model input size
model_input_size = 192
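
The value 192 matches the `shape` entry reported by `interpreter.get_input_details()`. As a minimal sketch, the size could be read from that entry instead of being typed by hand. The helper name `input_size_from_details` and the `example_details` list are hypothetical; the list only mimics the `shape` and `dtype` keys of the real output so that the snippet runs without the model.

```python
def input_size_from_details(input_details):
    """Return (height, width) from a TFLite-style input details list.

    Assumes the first input has shape [batch, height, width, channels].
    """
    shape = list(input_details[0]["shape"])
    if len(shape) != 4:
        raise ValueError(f"Expected a 4-D input shape, got {shape}")
    _, height, width, _ = shape
    return int(height), int(width)


# Hypothetical stand-in for interpreter.get_input_details().
example_details = [{"shape": [1, 192, 192, 3], "dtype": "uint8"}]

height, width = input_size_from_details(example_details)
print(height, width)  # 192 192
```

With the real interpreter, the call would be `input_size_from_details(interpreter.get_input_details())`.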

### Model Output

`interpreter.get_output_details()` returns the specification of the output the model generates.

In [7]:
interpreter.get_output_details()

## 1st Attempt: Feed the image to the model

### Load a sample image

In [8]:
# Show the image
image_path = 'resource/input_image4.jpeg'
import matplotlib.image as mpimg
image = mpimg.imread(image_path)
imgplot = plt.imshow(image)
plt.show()

In [9]:
# The following line will generate an error! What's the problem?
keypoints_with_scores = movenet(image)

### Error

- What is the meaning of this error message?
- What is our image shape?

In [10]:
image.shape

## 2nd Attempt: Adding dimension

In [13]:
expanded_image = tf.expand_dims(image, axis=0)

In [14]:
expanded_image.shape
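
The same idea can be shown with NumPy alone. This sketch uses a blank array as a stand-in for the photo (the 480 x 640 size is arbitrary) and shows that adding a leading axis turns a `(height, width, 3)` image into a batch that holds one image.

```python
import numpy as np

# Stand-in for a loaded RGB photo: 480 rows, 640 columns, 3 channels.
fake_image = np.zeros((480, 640, 3), dtype=np.uint8)

# Add a leading "batch" axis, as tf.expand_dims(image, axis=0) does.
batched = np.expand_dims(fake_image, axis=0)

print(fake_image.shape)  # (480, 640, 3)
print(batched.shape)     # (1, 480, 640, 3)
```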

In [15]:
# The following line will generate an error! What's the problem?

keypoints_with_scores = movenet(expanded_image)

## 3rd Attempt: Resize the image

- Our model only accepts a 192x192 image.
- What is the size of our image?
- How do we reduce it into 192x192?

In [16]:
resized_image = tf.image.resize_with_pad(expanded_image, model_input_size, model_input_size)

In [None]:
print(resized_image.shape)
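
`tf.image.resize_with_pad` keeps the aspect ratio of the image and fills the remaining space with zeros (black bars). The sketch below estimates the resulting geometry with plain integer arithmetic. The function name `letterbox_geometry` is hypothetical, and TensorFlow's own rounding may differ by a pixel.

```python
def letterbox_geometry(height, width, target):
    """Estimate the resized size and padding for a square target.

    The longer side is scaled to `target`; the shorter side keeps the
    aspect ratio and the leftover space is split between both edges.
    """
    if height >= width:
        new_height = target
        new_width = width * target // height
    else:
        new_width = target
        new_height = height * target // width
    pad_top = (target - new_height) // 2
    pad_left = (target - new_width) // 2
    return new_height, new_width, pad_top, pad_left


# Example: a 720 x 1280 (height x width) landscape photo.
print(letterbox_geometry(720, 1280, 192))  # (108, 192, 42, 0)
```

In this example the photo shrinks to 108 x 192 pixels and gets 42 rows of padding above and below it.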

In [17]:
# The following lines will generate an error! What's the problem?

imgplot = plt.imshow(resized_image)
plt.show()

### Exercise #1 - Fix the error

In [18]:
imgplot = plt.imshow(resized_image)
plt.show()

#### Answer

In [19]:
imgplot = plt.imshow(resized_image[0])
plt.show()

### Conform to imshow standard

According to the `imshow` documentation, the supported array shapes for image data include:

- (M, N, 3): an image with RGB values (0-1 float or 0-255 int).

https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.imshow.html

In [None]:
# Check the top-left pixel
resized_image[0][0][0]

In [None]:
# Check a pixel somewhere in the centre
resized_image[0][100][100]

In [None]:
type(resized_image[0][100][100].numpy()[0])

#### The solution

In [20]:
imgplot = plt.imshow(resized_image[0].numpy() / 255.0)
plt.show()
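
The resized image holds float values on a 0-255 scale, which fits neither of the two ranges `imshow` accepts. The sketch below uses a single made-up row of pixels as a stand-in for `resized_image[0]` and shows two possible fixes: rescaling to 0-1, as in the cell above, or casting to `uint8`.

```python
import numpy as np

# Stand-in for resized_image[0]: float32 values on a 0-255 scale.
float_pixels = np.array([[[0.0, 127.5, 255.0]]], dtype=np.float32)

# Option 1: rescale to the 0-1 range that imshow expects for floats.
scaled = float_pixels / 255.0

# Option 2: cast to uint8 so imshow treats the values as 0-255 integers.
# The cast drops the fractional part (127.5 becomes 127).
as_uint8 = float_pixels.astype(np.uint8)

print(scaled.min(), scaled.max())
print(as_uint8.dtype, as_uint8.tolist())
```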

### Run the model again

In [21]:
# Run model inference.
keypoints_with_scores = movenet(resized_image)
print(keypoints_with_scores)
keypoints_with_scores.shape

## Interpret the model output

In [22]:
# Dictionary that maps from joint names to keypoint indices.
KEYPOINT_DICT = {
    'nose': 0,
    'left_eye': 1,
    'right_eye': 2,
    'left_ear': 3,
    'right_ear': 4,
    'left_shoulder': 5,
    'right_shoulder': 6,
    'left_elbow': 7,
    'right_elbow': 8,
    'left_wrist': 9,
    'right_wrist': 10,
    'left_hip': 11,
    'right_hip': 12,
    'left_knee': 13,
    'right_knee': 14,
    'left_ankle': 15,
    'right_ankle': 16
}
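
Each of the 17 rows in the model output holds three numbers: the `y` coordinate, the `x` coordinate (both normalized to the 0-1 range) and a confidence score. The sketch below uses made-up values in place of a real prediction, and the 0.3 threshold is an arbitrary choice for illustration, not a recommended setting.

```python
import numpy as np

# Made-up stand-in for keypoints_with_scores, shape [1, 1, 17, 3].
fake_output = np.zeros((1, 1, 17, 3), dtype=np.float32)
fake_output[0, 0, 0] = [0.20, 0.50, 0.90]  # nose: y, x, score
fake_output[0, 0, 5] = [0.35, 0.60, 0.10]  # left_shoulder: low score

# Subset of KEYPOINT_DICT, repeated so the example runs on its own.
keypoint_index = {"nose": 0, "left_shoulder": 5}

SCORE_THRESHOLD = 0.3  # hypothetical cut-off

# Keep only the keypoints the (fake) model is reasonably sure about.
confident = {}
for name, index in keypoint_index.items():
    y, x, score = fake_output[0, 0, index]
    if score >= SCORE_THRESHOLD:
        confident[name] = (float(y), float(x))

print(confident)
```

Only the nose passes the threshold here, because the made-up left shoulder has a score of 0.10.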


### Reorganize the points into a more readable format

In [23]:
sides = ["left" , "right"]
body_parts = ["shoulder", "hip", "ankle", "knee"]

points = dict()

for side in sides:
    points[side] = dict()
    for body_part in body_parts:
        points[side][body_part] = keypoints_with_scores[0][0][KEYPOINT_DICT[f"{side}_{body_part}"]]


print(points)

In [128]:
points['left']['shoulder']

In [24]:
points['left']['knee']

## Annotate the points on the image

### Draw the annotation on the original image

In [25]:
def annotate_and_show(image, points):

    height, width, _ = image.shape
    
    plt.imshow(image / 255.0)
    
    for side in sides:
        for body_part in body_parts:            
            absolute_x = points[side][body_part][1] * width
            absolute_y = points[side][body_part][0] * height
            # print(f"{side}_{body_part}, x = {absolute_x} , y = {absolute_y}")
            plt.plot(absolute_x, absolute_y, 'bo')
    
    plt.show()

annotate_and_show(image, points)


### Draw the annotation on the resized and padded image

In [26]:
annotate_and_show(resized_image[0], points)

### Compare the two outputs

Which one is more accurate? Why?

<img src="./resource/model_image_compare.png" width=600/>
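
A hint for the comparison: the keypoints are normalized to the padded square image that the model received, not to the original photo. One possible way to plot them on the original photo is to undo the padding first. The sketch below is an illustration under that assumption; the function name `to_original_pixels` is hypothetical.

```python
def to_original_pixels(y_norm, x_norm, height, width):
    """Map a normalized keypoint from the padded square to the original image.

    Assumes the image was centred in a square whose side equals the
    longer side of the original, as resize_with_pad does before scaling.
    """
    side = max(height, width)
    pad_top = (side - height) / 2
    pad_left = (side - width) / 2
    y_pixel = y_norm * side - pad_top
    x_pixel = x_norm * side - pad_left
    return y_pixel, x_pixel


# The centre of the padded square is also the centre of a 720 x 1280 photo.
print(to_original_pixels(0.5, 0.5, 720, 1280))  # (360.0, 640.0)
```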

## Consolidate our work

In [28]:
def read_image(image_path):
    # Read the image file into a NumPy array

    import matplotlib.image as mpimg
    image = mpimg.imread(image_path)
    
    return image    

def convert_to_movenet_format(image):
    expanded_image = tf.expand_dims(image, axis=0)    
    resized_image = tf.image.resize_with_pad(expanded_image, model_input_size, model_input_size)
    return resized_image


def reorganize_output(movenet_output):
    sides = ["left", "right"]
    body_parts = ["shoulder", "hip", "ankle", "knee"]

    points = dict()

    for side in sides:
        points[side] = dict()
        for body_part in body_parts:
            points[side][body_part] = movenet_output[0][0][KEYPOINT_DICT[f"{side}_{body_part}"]]

    return points



### Let's try different images

In [30]:
# image_path = 'resource/input_image.jpeg'
image_path = 'resource/input_image2.jpeg'
# image_path = 'resource/input_image3.jpeg'
# image_path = 'resource/input_image4.jpeg'

image = read_image(image_path)
movenet_image = convert_to_movenet_format(image)
movenet_output = movenet(movenet_image)
points = reorganize_output(movenet_output)
annotate_and_show(movenet_image[0], points)


<hr/>

## Congratulations! You have finished this chapter.

This Jupyter notebook was created by 10Botics. <br>
For permission to use in school, please contact info@10botics.com <br>
All rights reserved. 2024.