# Hand Detection via Transfer Learning on YOLOv5

This project demonstrates transfer learning using a pre-trained YOLOv5 model to identify hands.

## Table of contents
* [Part 0: Goals](#p0)
* [Part 1: Import YOLO Model and Architecture summary](#p1)
* [Part 2: Data Loading and Preprocessing](#p2)
* [Part 3: Transfer Learning on model](#p3)
* [Part 4: Training model](#p4)
* [Part 5: Evaluation](#p5)
* [Part 6: Visualizations of predictions](#p6)
* [Part 7: Hyperparameters tuning / Optimization](#p7)
* [Part 8: Deployment?](#p8)



<a id='p1'></a>
### Part 1: Import YOLO Model and Architecture summary

In [1]:
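import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # Suppress TensorFlow C++ INFO, WARNING and ERROR log messages

import tensorflow as tf

# Note: tf.device returns a context manager; it only affects ops created inside a `with` block.
gpu_device = tf.device('GPU:0')

# Load the exported YOLOv5s SavedModel
yolo_model = tf.saved_model.load('../models/yolov5s_saved_model')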


I0000 00:00:1728305132.412326      25 cuda_executor.cc:1001] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

In [2]:
infer = yolo_model.signatures["serving_default"]
print(infer)
print(infer.graph)

ConcreteFunction Input Parameters:
  x (KEYWORD_ONLY): TensorSpec(shape=(1, 640, 640, 3), dtype=tf.float32, name='x')
Output Type:
  Dict[['output_0', TensorSpec(shape=(1, 25200, 85), dtype=tf.float32, name='output_0')]]
Captures:
  None
FuncGraph(name=signature_wrapper, id=139849093187008)
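
The output shape `(1, 25200, 85)` can be sanity-checked by hand. The sketch below assumes the standard YOLOv5 layout, which this notebook does not state explicitly: three detection heads at strides 8, 16 and 32, three anchors per grid cell, and 80 pre-trained classes.

```python
# Assumed standard YOLOv5 settings (not stated in this notebook).
INPUT_SIZE = 640
STRIDES = (8, 16, 32)
ANCHORS_PER_CELL = 3
NUM_CLASSES = 80


def count_predictions(input_size, strides, anchors_per_cell):
    """Total number of candidate boxes across all detection heads."""
    return sum(anchors_per_cell * (input_size // s) ** 2 for s in strides)


num_predictions = count_predictions(INPUT_SIZE, STRIDES, ANCHORS_PER_CELL)
values_per_prediction = 4 + 1 + NUM_CLASSES  # box + objectness + class scores

print(num_predictions, values_per_prediction)  # 25200 85
```

Under these assumptions each of the 25200 rows holds four box values, one objectness score and 80 class scores.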


<a id='p2'></a>
### Part 2: Data Loading and Preprocessing

Datasets in use: 
* [Hagrid](https://www.kaggle.com/datasets/kapitanov/hagrid)
* [Hands and Palms](https://www.kaggle.com/datasets/shyambhu/hands-and-palm-images-dataset)

In [3]:
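import cv2 as cv
import numpy as np

# Hagrid dataset paths (defined here, not used yet)
HAGRID_PATH = '../data/Hagrid/'
HAGRID_TRAIN_PATH = HAGRID_PATH + 'ann_train_val/'

# HandP: load one sample image and shape it like the model input (1, 640, 640, 3)
img = cv.imread('../data/HandP/Hands/Hands/Hand_0000002.jpg', cv.IMREAD_COLOR)
assert img is not None, 'Image not found; check the path'
img = cv.resize(img, (640, 640), interpolation=cv.INTER_LINEAR)
img = img.astype(np.float32) / 255.  # scale pixels to [0, 1]; the signature expects float32
img = np.expand_dims(img, axis=0)    # add the batch dimension
# Note: cv.imread returns BGR channel order, while YOLOv5 models are normally fed RGB.

img.shape  # (1, 640, 640, 3); not displayed because it is not the last expression of the cell

# Run the pre-trained model on the sample image
output = yolo_model(x=img)
output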

W0000 00:00:1728305136.505999      96 gpu_timer.cc:114] Skipping the delay kernel, measurement accuracy will be reduced

[<tf.Tensor: shape=(1, 25200, 85), dtype=float32, numpy=
 array([[[5.6672180e-03, 5.1899031e-03, 1.2079368e-02, ...,
          2.6467333e-03, 1.2484483e-03, 7.0888791e-03],
         [7.0290253e-03, 4.4718543e-03, 1.7182404e-02, ...,
          3.9004858e-03, 1.5730320e-03, 7.4830628e-03],
         [7.8687193e-03, 5.0323764e-03, 2.0617142e-02, ...,
          3.9325650e-03, 1.9124983e-03, 5.9388448e-03],
         ...,
         [9.5893282e-01, 9.6126729e-01, 1.1120750e-01, ...,
          1.3551764e-03, 7.7960861e-04, 7.4201624e-04],
         [9.5807046e-01, 9.6105462e-01, 1.1194242e-01, ...,
          1.5190671e-03, 7.4993831e-04, 8.0285326e-04],
         [9.6938431e-01, 9.7186548e-01, 1.9773616e-01, ...,
          2.1896928e-03, 1.2211924e-03, 1.3884176e-03]]], dtype=float32)>]
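
The preprocessing cell above stretches the image to 640×640, which changes its aspect ratio. The reference YOLOv5 pipeline instead scales the image to fit and pads the remainder ("letterboxing"). Here is a minimal sketch of that geometry; `letterbox_params` is a hypothetical helper name, not part of this notebook or of any library.

```python
def letterbox_params(width, height, target=640):
    """Return (scale, new_width, new_height, pad_x, pad_y) for an aspect-preserving resize.

    pad_x and pad_y are the total padding in pixels along each axis;
    split them evenly between both sides to center the image.
    """
    scale = min(target / width, target / height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    return scale, new_width, new_height, target - new_width, target - new_height


scale, new_w, new_h, pad_x, pad_y = letterbox_params(1280, 720)
print(scale, new_w, new_h, pad_x, pad_y)  # 0.5 640 360 0 280
```

With these values the image could be resized to `(new_w, new_h)` with `cv.resize` and padded with `cv.copyMakeBorder`. Predicted boxes would then need the padding subtracted and the scale undone to map back to the original image.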

In [57]:
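predictions = output[0][0]        # drop the batch dimension: (25200, 85)
boxes = predictions[:, :4]        # normalized (x_center, y_center, width, height)
object_conf = predictions[:, 4]   # objectness score
class_probs = predictions[:, 5:]  # 80 per-class scores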
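
Indexing this output is easy to get wrong because of the leading batch dimension: on a `(1, N, 85)` array, `[:, :4]` slices the candidate axis, not the 85 values. The sketch below uses NumPy and a random stand-in array (not real model output) to show the intended shapes. It also shows the per-class confidence that the YOLOv5 reference post-processing uses, objectness multiplied by class probability.

```python
import numpy as np

# Dummy stand-in for the model output: batch of 1, 6 candidates, 85 values each.
rng = np.random.default_rng(0)
raw = rng.random((1, 6, 85)).astype(np.float32)

predictions = raw[0]               # (6, 85): drop the batch dimension first
boxes = predictions[:, :4]         # (6, 4)
objectness = predictions[:, 4]     # (6,)
class_probs = predictions[:, 5:]   # (6, 80)

# Per-class confidence: objectness * class probability.
scores = objectness[:, None] * class_probs  # (6, 80)
class_ids = scores.argmax(axis=1)           # best class per candidate
best_scores = scores.max(axis=1)            # its confidence

print(boxes.shape, scores.shape, class_ids.shape)
```

Thresholding `best_scores` rather than objectness alone filters out candidates that are object-like but not confidently assigned to any class.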

In [58]:
confidence_threshold = 0.5
mask = object_conf > confidence_threshold

# Filter boxes, confidences, and class probabilities
filtered_boxes = boxes[mask]
filtered_confidences = object_conf[mask]
filtered_class_probs = class_probs[mask]
class_ids = tf.argmax(filtered_class_probs, axis=1)
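
Confidence filtering alone usually leaves several overlapping boxes per object, so detection pipelines normally follow it with non-maximum suppression (NMS). Below is a minimal pure-Python sketch of greedy NMS on corner-format boxes. The function names and the 0.45 IoU threshold are assumptions for illustration. TensorFlow also provides `tf.image.non_max_suppression` for the same purpose.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


example_boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300)]
example_scores = [0.9, 0.8, 0.7]
print(nms(example_boxes, example_scores))  # [0, 2]
```

The boxes produced by the model are in center format, so they would need converting to corner format before being passed to a routine like this.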

In [53]:

image_height, image_width = img.shape[1:3]  # img is (batch, height, width, channels)

# Convert YOLOv5 box format (x_center, y_center, width, height) to (x_min, y_min, x_max, y_max)
def convert_box_format(box, img_width, img_height):
    x_center, y_center, width, height = box
    x_min = int((x_center - width / 2) * img_width)
    y_min = int((y_center - height / 2) * img_height)
    x_max = int((x_center + width / 2) * img_width)
    y_max = int((y_center + height / 2) * img_height)
    return [x_min, y_min, x_max, y_max]

pixel_boxes = [convert_box_format(box, image_width, image_height) for box in filtered_boxes.numpy()]
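
Boxes near the image border can extend past it once converted to pixels, which leads to negative or out-of-range coordinates when drawing or cropping. The following variant of the conversion rounds instead of truncating and clamps the result to the image. `to_pixel_box` is a hypothetical name used only for this sketch.

```python
def to_pixel_box(box, img_width, img_height):
    """Convert a normalized (x_center, y_center, width, height) box to integer
    (x_min, y_min, x_max, y_max) pixel coordinates, clamped to the image."""
    x_center, y_center, width, height = box

    def clamp(value, upper):
        return max(0, min(value, upper))

    x_min = round((x_center - width / 2) * img_width)
    y_min = round((y_center - height / 2) * img_height)
    x_max = round((x_center + width / 2) * img_width)
    y_max = round((y_center + height / 2) * img_height)
    return [
        clamp(x_min, img_width - 1),
        clamp(y_min, img_height - 1),
        clamp(x_max, img_width - 1),
        clamp(y_max, img_height - 1),
    ]


print(to_pixel_box((0.5, 0.5, 0.25, 0.5), 640, 640))  # [240, 160, 400, 480]
print(to_pixel_box((0.0, 0.0, 0.5, 0.5), 640, 640))   # [0, 0, 160, 160]
```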


In [56]:

class_names = [f"Class {i}" for i in range(80)]

# Draw on an 8-bit copy: drop the batch dimension and rescale [0, 1] back to [0, 255]
canvas = (img[0] * 255).astype("uint8")

for i, box in enumerate(pixel_boxes):
    x_min, y_min, x_max, y_max = box
    class_id = class_ids[i].numpy()
    confidence = filtered_confidences[i].numpy()

    cv.rectangle(canvas, (x_min, y_min), (x_max, y_max), (255, 0, 0), 2)

    label = f"{class_names[class_id]}: {confidence:.2f}"
    cv.putText(canvas, label, (x_min, y_min - 10), cv.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)

print(canvas.shape)
output_image_path = "/app/notebooks/output_image_with_bboxes.jpg"
success = cv.imwrite(output_image_path, canvas)

if success:
    print(f"Image saved successfully at {output_image_path}")
else:
    print("Error: Image not saved.")

(640, 640, 3)
Image saved successfully at /app/notebooks/output_image_with_bboxes.jpg


<a id='p3'></a>
### Part 3: Transfer Learning on model

<a id='p4'></a>
### Part 4: Training model

<a id='p5'></a>
### Part 5: Evaluation

<a id='p6'></a>
### Part 6: Visualizations of predictions

<a id='p7'></a>
### Part 7: Hyperparameters tuning / Optimization

<a id='p8'></a>
### Part 8: Deployment?