# Custom Gesture Creation

In [1]:
import os
import tensorflow as tf
from mediapipe_model_maker import gesture_recognizer

import matplotlib.pyplot as plt


## Get The Dataset

The dataset for gesture recognition in Model Maker requires the following format: `<dataset_path>/<label_name>/<img_name>.*`. In addition, one of the label names must be `none`. The `none` label represents any gesture that isn't classified as one of the other gestures.
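The layout rules above can be checked before loading anything. The following is a minimal sketch using only the standard library; `list_gesture_labels` is a hypothetical helper name, not part of Model Maker, and the "no empty label folder" check is an extra assumption rather than a documented requirement.

```python
import os


def list_gesture_labels(dataset_path):
    """Return the sorted label folder names after checking the expected layout."""
    # Every sub-directory of dataset_path is treated as one label.
    labels = sorted(
        name for name in os.listdir(dataset_path)
        if os.path.isdir(os.path.join(dataset_path, name))
    )
    # The format requires one label to be called "none".
    if "none" not in labels:
        raise ValueError(f"{dataset_path!r} has no 'none' label folder; found {labels}")
    # Assumption: a label folder without any files is a mistake worth flagging.
    empty = [label for label in labels
             if not os.listdir(os.path.join(dataset_path, label))]
    if empty:
        raise ValueError(f"label folders without images: {empty}")
    return labels
```

Calling `list_gesture_labels(dataset_path)` on a well-formed dataset returns the label names; otherwise it raises a `ValueError` describing the problem.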

This example uses a sample rock-paper-scissors dataset, originally downloaded from GCS and stored locally at the path set below.

In [2]:
dataset_path = "../data/Rock-Paper-Scissors-Data"

In [3]:
print(dataset_path)
labels = []
for i in os.listdir(dataset_path):
  if os.path.isdir(os.path.join(dataset_path, i)):
    labels.append(i)
print(labels)

../data/Rock-Paper-Scissors-Data
['none', 'rock', 'paper', 'scissors']


In [4]:
NUM_EXAMPLES = 5

for label in labels:
  label_dir = os.path.join(dataset_path, label)
  example_filenames = os.listdir(label_dir)[:NUM_EXAMPLES]
  fig, axs = plt.subplots(1, NUM_EXAMPLES, figsize=(10,2))
  for i in range(NUM_EXAMPLES):
    axs[i].imshow(plt.imread(os.path.join(label_dir, example_filenames[i])))
    axs[i].get_xaxis().set_visible(False)
    axs[i].get_yaxis().set_visible(False)
  fig.suptitle(f'Showing {NUM_EXAMPLES} examples for {label}')

plt.show()



## Load the dataset

Load the dataset located at `dataset_path` by using the `Dataset.from_folder` method. While loading, Model Maker runs the pre-packaged hand detection model from MediaPipe Hands to detect the hand landmarks in each image. Any images without detected hands are omitted from the dataset. The resulting dataset contains the extracted hand landmark positions from each image, rather than the images themselves.

The `HandDataPreprocessingParams` class contains two configurable options for the data loading process:

`shuffle`: A boolean controlling whether to shuffle the dataset. Defaults to `True`.
`min_detection_confidence`: A float between 0 and 1 controlling the confidence threshold for hand detection.

After loading, split the dataset: 80% for training, 10% for validation, and 10% for testing.

In [5]:
data = gesture_recognizer.Dataset.from_folder(
    dirname=dataset_path,
    hparams=gesture_recognizer.HandDataPreprocessingParams()
)
train_data, rest_data = data.split(0.8)
validation_data, test_data = rest_data.split(0.5)

Downloading https://storage.googleapis.com/mediapipe-assets/palm_detection_full.tflite to /tmp/model_maker/gesture_recognizer/palm_detection_full.tflite
Downloading https://storage.googleapis.com/mediapipe-assets/hand_landmark_full.tflite to /tmp/model_maker/gesture_recognizer/hand_landmark_full.tflite
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/scissors/264.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/scissors/122.jpg


I0000 00:00:1711378702.477952   79209 gl_context_egl.cc:85] Successfully initialized EGL. Major : 1 Minor: 5
I0000 00:00:1711378702.487642   79288 gl_context.cc:357] GL version: 3.2 (OpenGL ES 3.2 Mesa 24.0.3-arch1.2), renderer: Mesa Intel(R) HD Graphics 620 (KBL GT2F)
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.


INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/paper/103.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/rock/239.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/paper/791.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/rock/529.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/none/1299.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/rock/760.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/paper/465.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/paper/269.jpg
INFO:tensorflow:Loading image /home/Suraj/Documents/GitHub/HGR/data/Rock-Paper-Scissors-Data/none/1446.jpg
INFO:tensorflow:Loading image /home/Sura

INFO:tensorflow:Load valid hands with size: 473, num_label: 4, labels: none,paper,rock,scissors.
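The log above reports 473 valid hand samples. As a rough sketch of how the two `split` calls divide them, the snippet below assumes that `split(fraction)` gives the first part `int(size * fraction)` samples and the second part the remainder; that truncation rule is an assumption about the library, and `split_sizes` is a hypothetical helper.

```python
def split_sizes(total, fraction):
    """Sizes of the two parts, assuming the first part gets int(total * fraction)."""
    first = int(total * fraction)
    return first, total - first


total = 473  # valid hand samples reported in the log above
train_size, rest_size = split_sizes(total, 0.8)
validation_size, test_size = split_sizes(rest_size, 0.5)
print(train_size, validation_size, test_size)  # 378 47 48
```

Under that assumption the 80/10/10 split is only approximate: the odd remainder of 95 samples cannot be halved evenly, so the test set ends up one sample larger than the validation set.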


## Train the model

Train the custom gesture recognizer by calling the `create` method with the training data, validation data, model options, and hyperparameters. For more information on model options and hyperparameters, see the Hyperparameters section below.

In [10]:
hparams = gesture_recognizer.HParams(learning_rate=0.003, epochs=40, export_dir="../tasks")
options = gesture_recognizer.GestureRecognizerOptions(hparams=hparams)
model = gesture_recognizer.GestureRecognizer.create(
    train_data=train_data,
    validation_data=validation_data,
    options=options
)

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 hand_embedding (InputLayer  [(None, 128)]             0         
 )                                                               
                                                                 
 batch_normalization_1 (Bat  (None, 128)               512       
 chNormalization)                                                
                                                                 
 re_lu_1 (ReLU)              (None, 128)               0         
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 custom_gesture_recognizer_  (None, 4)                 516       
 out (Dense)                                                     
                                                           

Non-trainable params: 256 (1.00 KB)
_________________________________________________________________
None
INFO:tensorflow:Training the models...




Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## Evaluate the model performance

After training the model, evaluate it on a test dataset and print the loss and accuracy metrics.


In [11]:
loss, acc = model.evaluate(test_data, batch_size=1)
print(f"Test loss:{loss}, Test accuracy:{acc}")

Test loss:0.1339777261018753, Test accuracy:0.8541666865348816
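The long decimal in the reported accuracy is a float32 rounding artifact rather than extra precision. As a sanity check, the sketch below assumes the test split holds 48 samples (473 valid samples split 80/10/10 with truncation) and that the metric is stored as float32; the count of 41 correct predictions is inferred from the printed value, not read from the model.

```python
import struct


def as_float32(value):
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", value))[0]


test_size = 48  # assumed size of the test split
correct = 41    # inferred number of correct predictions (an assumption)
print(as_float32(correct / test_size))
```

Under these assumptions the printed value matches the accuracy above, which would correspond to 7 misclassified test samples.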


## Export to TensorFlow Lite Model

After creating the model, convert and export it to the TensorFlow Lite model format for later use in an on-device application. The export also includes model metadata, which contains the label file.


In [12]:
model.export_model("../tasks/custom_gestures.task")

Using existing files at /tmp/model_maker/gesture_recognizer/gesture_embedder.tflite
Using existing files at /tmp/model_maker/gesture_recognizer/palm_detection_full.tflite
Using existing files at /tmp/model_maker/gesture_recognizer/hand_landmark_full.tflite
Using existing files at /tmp/model_maker/gesture_recognizer/canned_gesture_classifier.tflite
INFO:tensorflow:Assets written to: /tmp/tmpzz4lzhhy/saved_model/assets


2024-03-25 20:32:12.088475: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:378] Ignored output_format.
2024-03-25 20:32:12.088514: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:381] Ignored drop_control_dependency.
2024-03-25 20:32:12.088737: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: /tmp/tmpzz4lzhhy/saved_model
2024-03-25 20:32:12.089753: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2024-03-25 20:32:12.089771: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: /tmp/tmpzz4lzhhy/saved_model
2024-03-25 20:32:12.093244: I tensorflow/cc/saved_model/loader.cc:233] Restoring SavedModel bundle.
2024-03-25 20:32:12.137516: I tensorflow/cc/saved_model/loader.cc:217] Running initialization op on SavedModel bundle at path: /tmp/tmpzz4lzhhy/saved_model
2024-03-25 20:32:12.153187: I ten
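Once the `.task` file has been written, it can be loaded with the MediaPipe Tasks Python API. The following is a minimal sketch, assuming the `mediapipe` package is installed; `recognize_gesture` is a hypothetical helper name, and the image path passed to it is up to the caller.

```python
def recognize_gesture(task_path, image_path):
    """Return (label, score) for the top gesture in an image, or None if no hand is found."""
    # Imported lazily so that defining this helper does not require MediaPipe.
    import mediapipe as mp
    from mediapipe.tasks import python
    from mediapipe.tasks.python import vision

    base_options = python.BaseOptions(model_asset_path=task_path)
    options = vision.GestureRecognizerOptions(base_options=base_options)
    with vision.GestureRecognizer.create_from_options(options) as recognizer:
        image = mp.Image.create_from_file(image_path)
        result = recognizer.recognize(image)

    # result.gestures holds one ranked list of categories per detected hand.
    if not result.gestures:
        return None
    top = result.gestures[0][0]
    return top.category_name, top.score
```

For example, `recognize_gesture("../tasks/custom_gestures.task", some_image_path)` would return a pair such as a label name and its score for the first detected hand, where `some_image_path` is any image file you supply.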

## Hyperparameters

You can further customize the model using the `GestureRecognizerOptions` class, which has two optional parameters: `model_options` (a `ModelOptions` instance) and `hparams` (an `HParams` instance). Use the `ModelOptions` class to customize parameters related to the model itself, and the `HParams` class to customize other parameters related to training and saving the model.

`ModelOptions` has two customizable parameters that affect accuracy:

`dropout_rate`: The fraction of the input units to drop. Used in the dropout layer. Defaults to 0.05.
`layer_widths`: A list of hidden layer widths for the gesture model. Each element in the list creates a new hidden layer with the specified width. The hidden layers are separated with BatchNorm, Dropout, and ReLU. Defaults to an empty list (no hidden layers).
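To get a feel for how `layer_widths` changes model size, the sketch below counts parameters for an assumed layer stack: a BatchNorm block on the 128-dimensional hand embedding, then one Dense plus BatchNorm block per hidden width, then the output Dense layer. The exact ordering inside the library is an assumption, and `count_params` is a hypothetical helper; with no hidden layers it reproduces the 512 and 516 parameter counts and the 256 non-trainable parameters shown in the model summary above.

```python
def count_params(embedding_size, layer_widths, num_classes):
    """Return (total, non_trainable) parameter counts for the assumed layer stack."""
    total = 0
    non_trainable = 0
    width = embedding_size
    # Input block: BatchNorm on the embedding (ReLU and Dropout add no parameters).
    total += 4 * width          # gamma, beta, moving mean, moving variance
    non_trainable += 2 * width  # the moving statistics are not trained
    for hidden in layer_widths:
        total += width * hidden + hidden  # Dense weights and biases
        total += 4 * hidden               # BatchNorm after the hidden layer
        non_trainable += 2 * hidden
        width = hidden
    total += width * num_classes + num_classes  # output Dense layer
    return total, non_trainable


print(count_params(128, [], 4))    # (1028, 256)
print(count_params(128, [64], 4))  # (9284, 384)
```

Even a single 64-unit hidden layer multiplies the parameter count roughly ninefold under this assumption, which matters with a training set of only a few hundred samples.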

HParams has the following list of customizable parameters which affect model accuracy:

`learning_rate`: The learning rate to use for gradient descent training. Defaults to 0.001.
`batch_size`: Batch size for training. Defaults to 2.
`epochs`: Number of training iterations over the dataset. Defaults to 10.
`steps_per_epoch`: An optional integer that indicates the number of training steps per epoch. If not set, the training pipeline calculates the default steps per epoch as the training dataset size divided by batch size.
`shuffle`: True if the dataset is shuffled before training. Defaults to False.
`lr_decay`: Learning rate decay to use for gradient descent training. Defaults to 0.99.
`gamma`: Gamma parameter for focal loss. Defaults to 2.
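The arithmetic behind three of these parameters can be sketched in plain Python. The helper names are hypothetical, and two details are assumptions rather than documented behavior: that `steps_per_epoch` uses floor division, and that `lr_decay` is applied once per epoch as an exponential factor. The focal loss shown is the standard single-example form, `-(1 - p)^gamma * log(p)`, where `p` is the probability the model assigns to the true class.

```python
import math


def steps_per_epoch(train_size, batch_size):
    """Default steps per epoch: dataset size divided by batch size (floored)."""
    return train_size // batch_size


def decayed_learning_rate(learning_rate, lr_decay, epoch):
    """Learning rate after `epoch` completed epochs, assuming per-epoch decay."""
    return learning_rate * lr_decay ** epoch


def focal_loss(p_true, gamma=2.0):
    """Focal loss for one example given the probability of its true class."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


print(steps_per_epoch(378, 2))                 # 189
print(decayed_learning_rate(0.001, 0.99, 10))  # slightly below 0.001
print(focal_loss(0.9), focal_loss(0.5))        # confident examples contribute far less
```

With `gamma` set to 0 the focal loss reduces to ordinary cross-entropy; larger values shrink the contribution of examples the model already classifies confidently.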

Additional HParams parameter that does not affect model accuracy:

`export_dir`: The location of the model checkpoint files and exported model files.

For example, the following trains a new model with a `dropout_rate` of 0.2 and a `learning_rate` of 0.003.

In [None]:
hparams = gesture_recognizer.HParams(learning_rate=0.003, export_dir="exported_model_2")
model_options = gesture_recognizer.ModelOptions(dropout_rate=0.2)
options = gesture_recognizer.GestureRecognizerOptions(model_options=model_options, hparams=hparams)
model_2 = gesture_recognizer.GestureRecognizer.create(
    train_data=train_data,
    validation_data=validation_data,
    options=options
)

In [None]:
loss, accuracy = model_2.evaluate(test_data)
print(f"Test loss:{loss}, Test accuracy:{accuracy}")