# Object Detection (SSDLite, MobileNetV2, COCO)

> - 🤖 See [full list of Machine Learning Experiments](https://github.com/trekhleb/machine-learning-experiments) on **GitHub**<br/><br/>
> - ▶️ **Interactive Demo**: [try this model and other machine learning experiments in action](https://trekhleb.github.io/machine-learning-experiments/)

## Experiment overview

In this experiment we will use the pre-trained [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) model from the [TensorFlow detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md) to detect objects in photos.

![objects_detection_ssdlite_mobilenet_v2.jpg](../../demos/src/images/objects_detection_ssdlite_mobilenet_v2.jpg)

_This notebook is inspired by [Objects Detection API Demo](https://colab.research.google.com/github/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb)_

## Importing Dependencies

- [tensorflow](https://www.tensorflow.org/) - for developing and training ML models.
- [matplotlib](https://matplotlib.org/) - for plotting the data.
- [numpy](https://numpy.org/) - for linear algebra operations.
- [cv2](https://pypi.org/project/opencv-python/) - for processing the images and drawing object detections on top of them.
- [PIL](https://pypi.org/project/Pillow/2.2.1/) - for convenient image loading.
- [pathlib](https://docs.python.org/3/library/pathlib.html) - for working with model files.
- [math](https://docs.python.org/3/library/math.html) - to do simple math operations while drawing the detection frames.
- [google.protobuf](https://pypi.org/project/protobuf/) - for reading the files in protobuf format.

In [84]:
# Selecting Tensorflow version v2 (the command is relevant for Colab only).
%tensorflow_version 2.x

UsageError: Line magic function `%tensorflow_version` not found.


In [85]:
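import sys

# Machine-specific: add the local TensorFlow install to the import path.
# Adjust or remove this line on other machines.
sys.path.append("C:/Users/Gmun/AppData/Local/Programs/Python/Python310/modelos/SICP/Lib/site-packages/tensorflow")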

import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import pathlib
import cv2 
import math
from PIL import Image
from google.protobuf import text_format
import platform

print('Python version:', platform.python_version())
print('Tensorflow version:', tf.__version__)
print('Keras version:', tf.keras.__version__)

Python version: 3.10.10
Tensorflow version: 2.11.0
Keras version: 2.11.0


## Loading the model

To detect objects we're going to use the [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) model from the [TensorFlow detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md).

The full name of the model is **ssdlite_mobilenet_v2_coco_2018_05_09**.

In [86]:
# Create cache folder.
!mkdir .tmp

mkdir: cannot create directory '.tmp': File exists


In [87]:
# Downloads the model archive, unpacks it, and loads it as a TensorFlow SavedModel.
def load_model(model_name):
    model_url = 'http://download.tensorflow.org/models/object_detection/' + model_name + '.tar.gz'
    
    model_dir = tf.keras.utils.get_file(
        fname=model_name, 
        origin=model_url,
        untar=True,
        cache_dir=pathlib.Path('.tmp').absolute()
    )
    model = tf.saved_model.load(model_dir + '/saved_model')
    
    return model

In [88]:
#MODEL_NAME = 'ssdlite_mobilenet_v2_coco_2018_05_09'

#saved_model = load_model(MODEL_NAME)

In [89]:
def load_model_downloaded(model_dir):
    model = tf.saved_model.load(model_dir)
    return model

In [90]:
# Paths to the downloaded models; the trailing comment is the time each model took to load.
centernet = './Modelos/centernet_resnet101_v1_fpn_512x512_coco17_tpu-8/saved_model'  # 17.6s
efficiendet_d1 = './Modelos/efficientdet_d1_coco17_tpu-32/saved_model'               # 24.6s
ssd = './Modelos/ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model'                  # 9.9s
ssdlite = './Modelos/ssdlite_mobilenet_v2_coco_2018_05_09/saved_model'               # 4.9s


MODEL_DOWNLOADED = ssdlite

saved_model = load_model_downloaded(MODEL_DOWNLOADED)

INFO:tensorflow:Saver not created because there are no variables in the graph to restore
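
The load times noted next to each model path can be measured instead of being typed in by hand. A minimal sketch, assuming a hypothetical helper `timed_load` (not part of the original notebook) that wraps any loader function:

```python
import time

def timed_load(loader, model_dir):
    """Call loader(model_dir) and return (model, seconds_elapsed).

    `timed_load` is a hypothetical helper used for illustration only.
    """
    start = time.perf_counter()
    model = loader(model_dir)
    elapsed = time.perf_counter() - start
    return model, elapsed
```

In this notebook it could be called as `saved_model, seconds = timed_load(load_model_downloaded, MODEL_DOWNLOADED)`.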


In [91]:
# Exploring model signatures.
saved_model.signatures

_SignatureMap({'serving_default': <ConcreteFunction pruned(inputs) at 0x7FD6D505FC70>})

In [92]:
# Loading default model signature.
model = saved_model.signatures['serving_default']

## Loading model labels

Depending on which dataset was used to train the model, we need the matching label set from the [tensorflow models](https://github.com/tensorflow/models/tree/master/research/object_detection/data) repository.

The **ssdlite_mobilenet_v2_coco** model was trained on the [COCO](http://cocodataset.org) dataset. Its label map uses category IDs running up to **90** (the pipeline config sets `num_classes: 90`), although some IDs are unused, as the gaps in the label listing show. We're going to load and explore this list of categories. We need the label file named [mscoco_label_map.pbtxt](https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt).

### Compiling the protobuf label map

The label object structure is defined in the [string_int_label_map.proto](https://github.com/tensorflow/models/tree/master/research/object_detection/protos) file, in [protobuf](https://developers.google.com/protocol-buffers) format.

To convert the `mscoco_label_map.pbtxt` file to a Python dictionary we need to load the `string_int_label_map.proto` file and compile it using `protoc`. Before doing that we need to [install](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md#manual-protobuf-compiler-installation-and-usage) `protoc`.

One way to **install** `protoc` is to download it manually (the archive below is the macOS build):

```
PROTOC_ZIP=protoc-3.7.1-osx-x86_64.zip
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.7.1/$PROTOC_ZIP
sudo unzip -o $PROTOC_ZIP -d .tmp/protoc
rm -f $PROTOC_ZIP
```

After that we may **compile** `proto` files by running:

```
.tmp/protoc/bin/protoc ./protos/*.proto --python_out=.
```

☝🏻 For simplicity, `string_int_label_map.proto` and its compiled version `string_int_label_map_pb2.py` are already in the `protos` directory, so let's just import the compiled package.

In [93]:
from protos import string_int_label_map_pb2
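
If compiling the proto file is not an option, the text format of the label map is simple enough to parse directly. A rough sketch, assuming every entry is an `item { ... }` block containing an `id` and a `display_name` field, as in `mscoco_label_map.pbtxt`. The helper `parse_label_map_text` is hypothetical (it is not part of TensorFlow or protobuf), and it is not a general protobuf text parser:

```python
import re

def parse_label_map_text(text):
    """Parse label map text into an {id: display_name} dictionary.

    Hypothetical helper: assumes flat `item { ... }` blocks without
    nested braces and with double-quoted display names.
    """
    labels = {}
    # Grab the body of every `item { ... }` block.
    for block in re.findall(r'item\s*\{(.*?)\}', text, flags=re.DOTALL):
        id_match = re.search(r'\bid:\s*(\d+)', block)
        name_match = re.search(r'display_name:\s*"([^"]*)"', block)
        # Skip blocks that lack either field.
        if id_match and name_match:
            labels[int(id_match.group(1))] = name_match.group(1)
    return labels
```

The result has the same shape as the dictionary built from the compiled protobuf class, so it could be used as a drop-in fallback for simple label maps.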

### Loading and parsing the labels

In [94]:
# Downloads the label map (if not cached) and converts it to an {id: display_name} dictionary.
def load_labels(labels_name):
    labels_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/' + labels_name
    
    labels_path = tf.keras.utils.get_file(
        fname=labels_name, 
        origin=labels_url,
        cache_dir=pathlib.Path('.tmp').absolute()
    )
    
    labels_map = string_int_label_map_pb2.StringIntLabelMap()
    try:
        # The label map is normally stored in protobuf text format.
        with open(labels_path, 'r') as labels_file:
            text_format.Merge(labels_file.read(), labels_map)
    except (text_format.ParseError, UnicodeDecodeError):
        # Fall back to the binary protobuf format, which must be read as bytes.
        with open(labels_path, 'rb') as labels_file:
            labels_map.ParseFromString(labels_file.read())
    
    # Map each numeric class id to its human-readable name.
    return {item.id: item.display_name for item in labels_map.item}

In [95]:
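# Reads an already downloaded label map and converts it to an {id: display_name} dictionary.
def load_labels_downloaded(labels_name):
    labels_map = string_int_label_map_pb2.StringIntLabelMap()
    try:
        # The label map is normally stored in protobuf text format.
        with open(labels_name, 'r') as labels_file:
            text_format.Merge(labels_file.read(), labels_map)
    except (text_format.ParseError, UnicodeDecodeError):
        # Fall back to the binary protobuf format, which must be read as bytes.
        with open(labels_name, 'rb') as labels_file:
            labels_map.ParseFromString(labels_file.read())
    
    # Map each numeric class id to its human-readable name.
    return {item.id: item.display_name for item in labels_map.item}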

In [96]:
LABELS_NAME = './Label_maps/mscoco_label_map.pbtxt'
labels = load_labels_downloaded(LABELS_NAME)
labels

{1: 'person',
 2: 'bicycle',
 3: 'car',
 4: 'motorcycle',
 5: 'airplane',
 6: 'bus',
 7: 'train',
 8: 'truck',
 9: 'boat',
 10: 'traffic light',
 11: 'fire hydrant',
 13: 'stop sign',
 14: 'parking meter',
 15: 'bench',
 16: 'bird',
 17: 'cat',
 18: 'dog',
 19: 'horse',
 20: 'sheep',
 21: 'cow',
 22: 'elephant',
 23: 'bear',
 24: 'zebra',
 25: 'giraffe',
 27: 'backpack',
 28: 'umbrella',
 31: 'handbag',
 32: 'tie',
 33: 'suitcase',
 34: 'frisbee',
 35: 'skis',
 36: 'snowboard',
 37: 'sports ball',
 38: 'kite',
 39: 'baseball bat',
 40: 'baseball glove',
 41: 'skateboard',
 42: 'surfboard',
 43: 'tennis racket',
 44: 'bottle',
 46: 'wine glass',
 47: 'cup',
 48: 'fork',
 49: 'knife',
 50: 'spoon',
 51: 'bowl',
 52: 'banana',
 53: 'apple',
 54: 'sandwich',
 55: 'orange',
 56: 'broccoli',
 57: 'carrot',
 58: 'hot dog',
 59: 'pizza',
 60: 'donut',
 61: 'cake',
 62: 'chair',
 63: 'couch',
 64: 'potted plant',
 65: 'bed',
 67: 'dining table',
 70: 'toilet',
 72: 'tv',
 73: 'laptop',
 74: 'mo

## Exploring the model

In [97]:
# List model files
!ls -la .tmp/datasets/ssdlite_mobilenet_v2_coco_2018_05_09

total 40852
drwxrwxr-x 3 juan8ahp-ta juan8ahp-ta     4096 abr 13 16:31 .
drwxrwxr-x 3 juan8ahp-ta juan8ahp-ta     4096 may 23 13:46 ..
-rw-rw-r-- 1 juan8ahp-ta juan8ahp-ta       77 abr 13 16:31 checkpoint
-rw-rw-r-- 1 juan8ahp-ta juan8ahp-ta 19911343 abr 13 16:31 frozen_inference_graph.pb
-rw-rw-r-- 1 juan8ahp-ta juan8ahp-ta 18205188 abr 13 16:31 model.ckpt.data-00000-of-00001
-rw-rw-r-- 1 juan8ahp-ta juan8ahp-ta    17703 abr 13 16:31 model.ckpt.index
-rw-rw-r-- 1 juan8ahp-ta juan8ahp-ta  3665866 abr 13 16:31 model.ckpt.meta
-rw-rw-r-- 1 juan8ahp-ta juan8ahp-ta     4199 may 29 20:45 pipeline.config
drwxrwxr-x 2 juan8ahp-ta juan8ahp-ta     4096 abr 13 16:31 saved_model


In [98]:
# Check model pipeline.
!cat ./Modelos/ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config

model {
  ssd {
    num_classes: 90
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    feature_extractor {
      type: "ssd_mobilenet_v2"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.99999989895e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.0299999993294
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.999700009823
          center: true
          scale: true
          epsilon: 0.0010000000475
          train: true
        }
      }
      use_depthwise: true
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold:

In [99]:
model.inputs

[<tf.Tensor 'image_tensor:0' shape=(None, None, None, 3) dtype=uint8>]
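
The signature takes a batch of `uint8` images with shape `(batch, height, width, 3)`. A minimal sketch of preparing a single frame with NumPy, assuming the frame comes from OpenCV in BGR channel order and that the model expects RGB input (an assumption worth verifying for your model). The helper `to_model_input` is hypothetical:

```python
import numpy as np

def to_model_input(frame_bgr):
    """Turn an H x W x 3 BGR frame into a 1 x H x W x 3 RGB uint8 batch.

    Hypothetical helper: the BGR-to-RGB swap assumes an OpenCV frame.
    """
    frame = np.asarray(frame_bgr)
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError('expected an H x W x 3 image')
    # Reverse the channel axis (BGR -> RGB) and copy into contiguous memory.
    rgb = np.ascontiguousarray(frame[..., ::-1])
    # Add the leading batch dimension that the model input requires.
    return rgb.astype(np.uint8)[np.newaxis, ...]
```

The returned array can be passed to `tf.convert_to_tensor` before calling the model.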

In [100]:
model.outputs

[<tf.Tensor 'detection_boxes:0' shape=(None, 100, 4) dtype=float32>,
 <tf.Tensor 'detection_classes:0' shape=(None, 100) dtype=float32>,
 <tf.Tensor 'detection_scores:0' shape=(None, 100) dtype=float32>,
 <tf.Tensor 'num_detections:0' shape=(None,) dtype=float32>]
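
Each row of `detection_boxes` holds four coordinates. In the TensorFlow Object Detection API they are ordered `[ymin, xmin, ymax, xmax]` and normalized to the image size, so they have to be scaled before drawing. A small sketch of that conversion, where `boxes_to_pixels` is a hypothetical helper name:

```python
import numpy as np

def boxes_to_pixels(boxes, image_height, image_width):
    """Scale normalized [ymin, xmin, ymax, xmax] boxes to integer pixels.

    Hypothetical helper: values are truncated toward zero, like int().
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    # y coordinates scale with the height, x coordinates with the width.
    scale = np.array([image_height, image_width, image_height, image_width])
    return (boxes * scale).astype(np.int64)
```

For a 100 x 200 image, the box `[0.25, 0.5, 0.75, 1.0]` maps to rows 25 to 75 and columns 100 to 200.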

## Loading test images

In [101]:
def display_image(image_np):
    plt.figure()
    plt.imshow(image_np)

In [102]:
TEST_IMAGES_DIR_PATH = pathlib.Path('data_prueba_j')
TEST_IMAGE_PATHS = sorted(list(TEST_IMAGES_DIR_PATH.glob('*.jpeg')))  # change the pattern if necessary
TEST_IMAGE_PATHS

[PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 1.00.06 PM (2).jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 1.00.06 PM.jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 1.00.07 PM (1).jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.36 PM.jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.37 PM (2).jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.39 PM (1).jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.39 PM (2).jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.40 PM.jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.25.59 PM.jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.27.28 PM.jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.28.42 PM.jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.29.31 PM (1).jpeg'),
 PosixPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.29.31 PM (2).jpeg'),
 Posi

In [103]:
'''
for image_path in TEST_IMAGE_PATHS:
    image_np = mpimg.imread(image_path)
    display_image(image_np)
'''

'\nfor image_path in TEST_IMAGE_PATHS:\n    image_np = mpimg.imread(image_path)\n    display_image(image_np)\n'

## Running the model

In [104]:
# Removes every detected object that is not a person (COCO class id 1).
def detect_people(detected_people):
    # Boolean mask that is True for each detection classified as a person.
    is_person = detected_people['detection_classes'] == 1

    # Keep only the entries (and box rows) that belong to people.
    detected_people['detection_scores'] = detected_people['detection_scores'][is_person]
    detected_people['detection_classes'] = detected_people['detection_classes'][is_person]
    detected_people['detection_boxes'] = detected_people['detection_boxes'][is_person]
    detected_people['num_detections'] = int(np.count_nonzero(is_person))

    return detected_people
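
Besides filtering by class, it is common to drop low-confidence detections as well. A sketch, assuming the detections dictionary used in this notebook (NumPy arrays under `detection_classes`, `detection_scores` and `detection_boxes`, plus an integer `num_detections`). The helper `filter_detections` and its default `min_score` of 0.5 are illustrative assumptions, not part of the original notebook:

```python
import numpy as np

def filter_detections(detections, class_id=1, min_score=0.5):
    """Return a new dictionary with detections of one class above a score.

    Hypothetical helper: the input dictionary is left unmodified.
    """
    classes = np.asarray(detections['detection_classes'])
    scores = np.asarray(detections['detection_scores'])
    boxes = np.asarray(detections['detection_boxes'])
    # True where the class matches and the score is high enough.
    keep = (classes == class_id) & (scores >= min_score)
    return {
        'detection_classes': classes[keep],
        'detection_scores': scores[keep],
        'detection_boxes': boxes[keep],
        'num_detections': int(keep.sum()),
    }
```

Unlike `detect_people`, this version builds a new dictionary, which makes it easier to compare results before and after filtering.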

In [105]:
def detect_objects_on_image(image, model):

    
    image = np.asarray(image)
    input_tensor = tf.convert_to_tensor(image)
    # Add a batch dimension, since the model expects a batch of images.
    input_tensor = input_tensor[tf.newaxis, ...]

    output_dict = model(input_tensor)

    num_detections = int(output_dict['num_detections'])
    output_dict = {
        key:value[0, :num_detections].numpy() 
        for key,value in output_dict.items()
        if key != 'num_detections'
    }
    output_dict['num_detections'] = num_detections
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)
    
    return detect_people(output_dict)

In [106]:
def draw_detections_on_image(image, detections, labels):
    image_with_detections = image
    # NumPy image arrays are laid out as (height, width, channels).
    height, width, channels = image_with_detections.shape
    
    font = cv2.FONT_HERSHEY_SIMPLEX
    color = (0, 255, 0)
    label_padding = 5

    tamanoTexto = 1.4 # font scale of the label that describes the object class
    
    num_detections = detections['num_detections']
    if num_detections > 0:
        for detection_index in range(num_detections):
            detection_score = detections['detection_scores'][detection_index]
            detection_box = detections['detection_boxes'][detection_index]
            detection_class = detections['detection_classes'][detection_index]
            detection_label = labels[detection_class]
            detection_label_full = detection_label + ' ' + str(math.floor(100 * detection_score)) + '%'
            
            # Boxes are normalized [ymin, xmin, ymax, xmax]; scale them to pixels.
            y1 = int(height * detection_box[0])
            x1 = int(width * detection_box[1])
            y2 = int(height * detection_box[2])
            x2 = int(width * detection_box[3])
                        
            # Detection rectangle.    
            image_with_detections = cv2.rectangle(
                image_with_detections,
                (x1, y1),
                (x2, y2),
                color,
                3
            )
            
            # Label background.
            label_size = cv2.getTextSize(
                detection_label_full,
                font,
                tamanoTexto,
                2
            )
            image_with_detections = cv2.rectangle(
                image_with_detections,
                (x1, y1 - label_size[0][1] - 2 * label_padding),
                (x1 + label_size[0][0] + 2 * label_padding, y1),
                color,
                -1
            )
            
            # Label text.
            cv2.putText(
                image_with_detections,
                detection_label_full,
                (x1 + label_padding, y1 - label_padding),
                font,
                tamanoTexto,
                (0, 0, 0),
                1,
                cv2.LINE_AA
            )
            
    return image_with_detections

In [107]:
# Example of what the detections dictionary looks like.
image_np = np.array(Image.open(TEST_IMAGE_PATHS[1]))
detections = detect_objects_on_image(image_np, model)
detections

{'detection_classes': array([], dtype=int64),
 'detection_scores': array([], dtype=float32),
 'detection_boxes': array([], shape=(0, 4), dtype=float32),
 'num_detections': 0}

In [108]:
# for image_path in TEST_IMAGE_PATHS:
#     image_np = np.array(Image.open(image_path))
#     detections = detect_objects_on_image(image_np, model)
#     image_with_detections = draw_detections_on_image(image_np, detections, labels)
#     plt.figure(figsize=(16, 12))
#     plt.imshow(image_with_detections)


# Define a video capture object for the default camera.
vid = cv2.VideoCapture(0)

# Optional background subtraction (binarization), currently disabled:
# fgbg = cv2.createBackgroundSubtractorMOG2(history=20)
# OpenCL has to be disabled, otherwise the subtractor does not work:
# cv2.ocl.setUseOpenCL(False)

while True:
    # Capture the video frame by frame.
    ret, frame = vid.read()
    if not ret:
        # Stop if no frame could be read (camera unavailable or stream ended).
        break

    # Apply the background subtraction algorithm (disabled):
    # fgmask = fgbg.apply(frame)
    # Copy the threshold to detect the contours (disabled):
    # cv2.imshow('Camara', frame)
    # cv2.imshow('Umbral', fgmask)

    detections = detect_objects_on_image(frame, model)
    image_with_detections = draw_detections_on_image(frame, detections, labels)

    # Display the resulting frame.
    cv2.imshow('frame', image_with_detections)

    # Press 'q' to quit; any other key can be used instead.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# After the loop, release the capture object.
vid.release()
# Destroy all the windows.
cv2.destroyAllWindows()



## Converting the model to web-format

To use the `ssdlite_mobilenet_v2_coco_2018_05_09` model on the web we need to convert it into a format that [tensorflowjs](https://www.tensorflow.org/js) understands. To do so we may use [tfjs-converter](https://github.com/tensorflow/tfjs/tree/master/tfjs-converter) as follows:

```
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_format=tfjs_graph_model \
    ./experiments/objects_detection_ssdlite_mobilenet_v2/.tmp/datasets/ssdlite_mobilenet_v2_coco_2018_05_09/saved_model \
    ./demos/public/models/objects_detection_ssdlite_mobilenet_v2
```

An alternative and easier way would be to use the [@tensorflow-models/coco-ssd](https://www.npmjs.com/package/@tensorflow-models/coco-ssd) npm package. But for the sake of exploration, let's go one level deeper and use the model directly, without wrapper modules.

You can find this experiment in the [Demo app](https://trekhleb.github.io/machine-learning-experiments) and play around with it right in your browser to see how the model performs in real life.