# Objects Detection (SSDLite, MobileNetV2, COCO)

> - 🤖 See [full list of Machine Learning Experiments](https://github.com/trekhleb/machine-learning-experiments) on **GitHub**<br/><br/>
> - ▶️ **Interactive Demo**: [try this model and other machine learning experiments in action](https://trekhleb.github.io/machine-learning-experiments/)

## Experiment overview

In this experiment we will use the pre-trained [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) model from the [TensorFlow detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md) to detect objects in photos.

![objects_detection_ssdlite_mobilenet_v2.jpg](../../demos/src/images/objects_detection_ssdlite_mobilenet_v2.jpg)

_This notebook is inspired by [Objects Detection API Demo](https://colab.research.google.com/github/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb)_

## Importing Dependencies

- [tensorflow](https://www.tensorflow.org/) - for developing and training ML models.
- [matplotlib](https://matplotlib.org/) - for plotting the data.
- [numpy](https://numpy.org/) - for linear algebra operations.
- [cv2](https://pypi.org/project/opencv-python/) - for processing the images and drawing object detections on top of them.
- [PIL](https://pypi.org/project/Pillow/2.2.1/) - for convenient image loading.
- [pathlib](https://docs.python.org/3/library/pathlib.html) - for working with model files.
- [math](https://docs.python.org/3/library/math.html) - to do simple math operations while drawing the detection frames.
- [google.protobuf](https://pypi.org/project/protobuf/) - for reading the files in protobuf format.

In [1]:
# Selecting Tensorflow version v2 (the command is relevant for Colab only).
%tensorflow_version 2.x

UsageError: Line magic function `%tensorflow_version` not found.


In [2]:
import sys
  
# appending a path
sys.path.append("C:/Users/Gmun/AppData/Local/Programs/Python/Python310/modelos/SICP/Lib/site-packages/tensorflow")  

import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import pathlib
import cv2 
import math
from PIL import Image
from google.protobuf import text_format
import platform

print('Python version:', platform.python_version())
print('Tensorflow version:', tf.__version__)
print('Keras version:', tf.keras.__version__)

## Loading the model

To detect objects we're going to use the [ssdlite_mobilenet_v2_coco](http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz) model from the [TensorFlow detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md).

The full name of the model will be **ssdlite_mobilenet_v2_coco_2018_05_09**.

In [None]:
# Create cache folder.
!mkdir .tmp

In [None]:
# Downloads the model archive, unpacks it, and loads it as a TensorFlow SavedModel.
def load_model(model_name):
    model_url = 'http://download.tensorflow.org/models/object_detection/' + model_name + '.tar.gz'
        
    model_dir = tf.keras.utils.get_file(
        fname=model_name, 
        origin=model_url,
        untar=True,
        cache_dir=pathlib.Path('.tmp').absolute()
    )
    model = tf.saved_model.load(model_dir + '/saved_model')
    
    return model

In [None]:
MODEL_NAME = 'ssdlite_mobilenet_v2_coco_2018_05_09'
saved_model = load_model(MODEL_NAME)

Downloading data from http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz
INFO:tensorflow:Saver not created because there are no variables in the graph to restore


In [None]:
# Exploring model signatures.
saved_model.signatures

_SignatureMap({'serving_default': <ConcreteFunction pruned(inputs) at 0x21871FE36D0>})

In [None]:
# Loading default model signature.
model = saved_model.signatures['serving_default']

## Loading model labels

Depending on which dataset was used to train the model, we need to download the matching label set from the [tensorflow models](https://github.com/tensorflow/models/tree/master/research/object_detection/data) repository.

The **ssdlite_mobilenet_v2_coco** model has been trained on the [COCO](http://cocodataset.org) dataset, whose label map assigns object category IDs in the range 1 to **90**. We're going to download and explore this list of categories. We need the label file named [mscoco_label_map.pbtxt](https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt).

### Compiling the protobuf label map

The label object structure is defined in the [string_int_label_map.proto](https://github.com/tensorflow/models/tree/master/research/object_detection/protos) file in [protobuf](https://developers.google.com/protocol-buffers) format.

In order to convert the `mscoco_label_map.pbtxt` file to a Python dictionary we need to load the `string_int_label_map.proto` file and compile it using `protoc`. Before doing that we need to [install](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md#manual-protobuf-compiler-installation-and-usage) `protoc`.

One way to **install** `protoc` is to download it manually:

```
PROTOC_ZIP=protoc-3.7.1-osx-x86_64.zip
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.7.1/$PROTOC_ZIP
sudo unzip -o $PROTOC_ZIP -d .tmp/protoc
rm -f $PROTOC_ZIP
```

After that we may **compile** `proto` files by running:

```
.tmp/protoc/bin/protoc ./protos/*.proto --python_out=.
```

☝🏻 For simplicity, `string_int_label_map.proto` and its compiled version `string_int_label_map_pb2.py` are already in the `protos` directory, so let's just import this compiled package.

In [None]:
from protos import string_int_label_map_pb2

ModuleNotFoundError: No module named 'protos'
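When the compiled `protos` package is not importable, as in the error above, the label map can still be read without protobuf, because the `.pbtxt` file is plain text made of `item { ... }` blocks. The following is only a minimal sketch under that assumption: `parse_label_map` is a hypothetical helper (not part of TensorFlow), and the two sample entries are typed in by hand to mimic the file's layout.

```python
import re


def parse_label_map(pbtxt_text):
    """Return {id: display_name} from label-map text (hypothetical helper)."""
    labels = {}
    # Each entry is assumed to look like:
    # item { name: "/m/01g317" id: 1 display_name: "person" }
    for block in re.findall(r'item\s*\{(.*?)\}', pbtxt_text, flags=re.DOTALL):
        id_match = re.search(r'\bid:\s*(\d+)', block)
        name_match = re.search(r'display_name:\s*"([^"]*)"', block)
        if id_match and name_match:
            labels[int(id_match.group(1))] = name_match.group(1)
    return labels


# Hand-written sample that mimics the layout of the label file.
sample = '''
item {
  name: "/m/01g317"
  id: 1
  display_name: "person"
}
item {
  name: "/m/0199g"
  id: 2
  display_name: "bicycle"
}
'''
print(parse_label_map(sample))
```

This regex approach only handles the flat structure shown here; the protobuf parser remains the more robust choice when it is available.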

### Loading and parsing the labels

In [None]:
def load_labels(labels_name):
    labels_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/' + labels_name
    
    labels_path = tf.keras.utils.get_file(
        fname=labels_name, 
        origin=labels_url,
        cache_dir=pathlib.Path('.tmp').absolute()
    ) 

    # Read the label file and close it as soon as the content is loaded.
    with open(labels_path, 'r') as labels_file:
        labels_string = labels_file.read()
    
    labels_map = string_int_label_map_pb2.StringIntLabelMap()
    try:
        text_format.Merge(labels_string, labels_map)
    except text_format.ParseError:
        labels_map.ParseFromString(labels_string)
    
    labels_dict = {}
    for item in labels_map.item:
        labels_dict[item.id] = item.display_name
    
    return labels_dict

In [None]:
LABELS_NAME = 'mscoco_label_map.pbtxt'
labels = load_labels(LABELS_NAME)
# labels

## Exploring the model

In [None]:
# List model files
!ls -la .tmp/datasets/ssdlite_mobilenet_v2_coco_2018_05_09

'ls' is not recognized as an internal or external command,
operable program or batch file.


In [None]:
# Check model pipeline.
!cat .tmp/datasets/ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config

'cat' is not recognized as an internal or external command,
operable program or batch file.


In [None]:
model.inputs

[<tf.Tensor 'image_tensor:0' shape=(None, None, None, 3) dtype=uint8>]

In [None]:
model.outputs

[<tf.Tensor 'detection_boxes:0' shape=(None, 100, 4) dtype=float32>,
 <tf.Tensor 'detection_classes:0' shape=(None, 100) dtype=float32>,
 <tf.Tensor 'detection_scores:0' shape=(None, 100) dtype=float32>,
 <tf.Tensor 'num_detections:0' shape=(None,) dtype=float32>]
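The output shapes show that every image gets a fixed number of 100 detection rows, with `num_detections` telling how many of them are real. As a rough illustration of how such padded outputs might be post-processed, here is a sketch in plain Python lists; `trim_detections` and the `min_score` threshold are assumptions made for this example, and the values are made up.

```python
def trim_detections(boxes, classes, scores, num_detections, min_score=0.5):
    """Drop padding rows and low-confidence detections (illustrative helper)."""
    kept = []
    # Only the first num_detections rows carry real detections.
    for index in range(int(num_detections)):
        if scores[index] >= min_score:
            kept.append({
                'box': boxes[index],
                'class_id': int(classes[index]),
                'score': scores[index],
            })
    return kept


# Made-up values for a single image, padded to four rows for brevity.
boxes = [[0.1, 0.2, 0.5, 0.6], [0.3, 0.3, 0.9, 0.8],
         [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
classes = [1.0, 18.0, 1.0, 1.0]
scores = [0.91, 0.32, 0.0, 0.0]

detections_sample = trim_detections(boxes, classes, scores, num_detections=2.0)
print(detections_sample)
```

With NumPy arrays the same result would typically be obtained by slicing and boolean masking instead of a loop.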

## Loading test images

In [None]:
def display_image(image_np):
    plt.figure()
    plt.imshow(image_np)

In [None]:
TEST_IMAGES_DIR_PATH = pathlib.Path('data_prueba_j')
TEST_IMAGE_PATHS = sorted(list(TEST_IMAGES_DIR_PATH.glob('*.jpeg')))  # change the pattern if necessary
TEST_IMAGE_PATHS

[WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 1.00.06 PM (2).jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 1.00.06 PM.jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 1.00.07 PM (1).jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.36 PM.jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.37 PM (2).jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.39 PM (1).jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.39 PM (2).jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-27 at 12.57.40 PM.jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.25.59 PM.jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.27.28 PM.jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.28.42 PM.jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.29.31 PM (1).jpeg'),
 WindowsPath('data_prueba_j/WhatsApp Image 2022-07-28 at 9.

In [None]:
"""for image_path in TEST_IMAGE_PATHS:
    image_np = mpimg.imread(image_path)
    display_image(image_np)"""

'for image_path in TEST_IMAGE_PATHS:\n    image_np = mpimg.imread(image_path)\n    display_image(image_np)'

## Running the model

In [None]:
# Detects the object classes, the number of objects, and their coordinates.
def detect_objects_on_image(image, model):
    image = np.asarray(image)
    input_tensor = tf.convert_to_tensor(image)
    # Adding one more dimension since the model expects a batch of images.
    input_tensor = input_tensor[tf.newaxis, ...]

    output_dict = model(input_tensor)

    num_detections = int(output_dict['num_detections'])
    output_dict = {
        key:value[0, :num_detections].numpy() 
        for key,value in output_dict.items()
        if key != 'num_detections'
    }
    output_dict['num_detections'] = num_detections
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

    return detect_people(output_dict)
    # return output_dict

In [None]:
# Removes every detection that is not a person (class 1 in the COCO label map).
def detect_people(detected_people):
    # Boolean mask that is True for each detection classified as a person.
    is_person = detected_people['detection_classes'] == 1

    detected_people['detection_scores'] = detected_people['detection_scores'][is_person]
    detected_people['detection_classes'] = detected_people['detection_classes'][is_person]
    # The mask selects whole rows of the (N, 4) boxes array.
    detected_people['detection_boxes'] = detected_people['detection_boxes'][is_person]
    detected_people['num_detections'] = int(is_person.sum())

    return detected_people

In [None]:
# Draws a bounding box and a label around every detected object.
def draw_detections_on_image(image, detections, labels):
    image_with_detections = image
    # NumPy image arrays are laid out as (rows, columns, channels).
    height, width, channels = image_with_detections.shape
    
    font = cv2.FONT_HERSHEY_SIMPLEX
    color = (0, 255, 0)
    label_padding = 5

    tamanoTexto = 1.4 # font scale of the label that describes the object's class
    
    num_detections = detections['num_detections']
    if num_detections > 0:
        for detection_index in range(num_detections):
            detection_score = detections['detection_scores'][detection_index]
            detection_box = detections['detection_boxes'][detection_index]
            detection_class = detections['detection_classes'][detection_index]
            detection_label = labels[detection_class]
            detection_label_full = detection_label + ' ' + str(math.floor(100 * detection_score)) + '%'
            
            # Boxes are normalized [ymin, xmin, ymax, xmax]; scale them to pixels.
            y1 = int(height * detection_box[0])
            x1 = int(width * detection_box[1])
            y2 = int(height * detection_box[2])
            x2 = int(width * detection_box[3])

            print("y1 = ", y1,"\nx1 = ", x1,"\ny2 = ", y2,"\nx2 = ", x2, "\n")
                        
            # Detection rectangle.    
            image_with_detections = cv2.rectangle(
                image_with_detections,
                (x1, y1),
                (x2, y2),
                color,
                3
            )
            
            # Label background.
            # Measure the text with the same font and thickness that putText uses below.
            label_size = cv2.getTextSize(
                detection_label_full,
                font,
                tamanoTexto,
                1
            )
            image_with_detections = cv2.rectangle(
                image_with_detections,
                (x1, y1 - label_size[0][1] - 2 * label_padding),
                (x1 + label_size[0][0] + 2 * label_padding, y1),
                color,
                -1
            )
            
            # Label text.
            cv2.putText(
                image_with_detections,
                detection_label_full,
                (x1 + label_padding, y1 - label_padding),
                font,
                tamanoTexto,
                (0, 0, 0),
                1,
                cv2.LINE_AA
            )
            
    return image_with_detections

In [None]:
# Example of what the detections dictionary looks like.
image_path = 5

image_np = np.array(Image.open(TEST_IMAGE_PATHS[image_path]))
detections = detect_objects_on_image(image_np, model)
print(detections["detection_boxes"])

"""detections["detection_boxes"] = [ y1         # Lower vertical coord. (ymin)
                                   , x1         # Left horizontal coord. (xmin)
                                   , y2         # Upper vertical coord. (ymax)
                                   , x2]        # Right horizontal coord. (xmax)"""


[[0.35484084 0.41569313 0.9854084  0.72175753]
 [0.7023598  0.00504667 0.9940541  0.35252637]]


'detections["detection_boxes"] = [ y1         # Lower vertical coord. (ymin)\n                                   , x1         # Left horizontal coord. (xmin)\n                                   , y2         # Upper vertical coord. (ymax)\n                                   , x2]        # Right horizontal coord. (xmax)'
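Since the boxes are normalized to the 0..1 range, they have to be scaled by the image size before drawing. A minimal sketch of that conversion follows; `box_to_pixels` is a hypothetical helper, and the 1600 x 1200 image size is an assumption inferred from the pixel values this notebook prints for the first box.

```python
def box_to_pixels(box, image_height, image_width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel corners."""
    ymin, xmin, ymax, xmax = box
    # Vertical values scale with the height, horizontal values with the width.
    x1 = int(xmin * image_width)
    y1 = int(ymin * image_height)
    x2 = int(xmax * image_width)
    y2 = int(ymax * image_height)
    return x1, y1, x2, y2


# First box from the output above; the image size is an assumed value.
first_box = [0.35484084, 0.41569313, 0.9854084, 0.72175753]
print(box_to_pixels(first_box, image_height=1600, image_width=1200))
```

Returning the corners as `(x1, y1, x2, y2)` matches the `(x, y)` point order that OpenCV drawing functions such as `cv2.rectangle` expect.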

In [None]:
# for image_path in TEST_IMAGE_PATHS:
image_np = np.array(Image.open(TEST_IMAGE_PATHS[image_path]))
detections = detect_objects_on_image(image_np, model)
image_with_detections = draw_detections_on_image(image_np, detections, labels)
# plt.figure(figsize=(4, 3))
# plt.imshow(image_with_detections)

y1 =  567 
x1 =  498 
y2 =  1576 
x2 =  866 

y1 =  1123 
x1 =  6 
y2 =  1590 
x2 =  423 



In [None]:
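# Counting function: adjusts the counter when a detection moves across the 250 threshold.
count = 0
prev_pos = None
def count_func(x,y,w,h,count,prev_pos):
    
    # First call: there is no previous position yet, so only store the current one.
    if prev_pos is None:
        prev_pos = [x,y,w,h]
        print("Hello")
        return 0,prev_pos
    # (x+w)/2 is the midpoint of the two coordinates passed as x and w.
    if (x+w)/2 > 250 and prev_pos[0] < 250:
        count += 1
        print("From the if branch")
    elif (x+w)/2 < 250 and prev_pos[0] > 250:
        count -= 1
        print("From the elif branch")
    prev_pos = [x,y,w,h]    
    return count,prev_pos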
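The counting function compares the current midpoint with the first coordinate stored from the previous call. One possible variant, shown here only as a sketch, compares center to center so that both sides of the test use the same quantity. The names `update_count` and `line_x` are hypothetical, and the sequence of centers is made up.

```python
def update_count(center_x, prev_center_x, count, line_x=250):
    """Adjust the count when the center crosses line_x between two frames."""
    if prev_center_x is not None:
        if prev_center_x < line_x <= center_x:
            # Moved from the left side of the line to the right side.
            count += 1
        elif center_x < line_x <= prev_center_x:
            # Moved from the right side of the line back to the left side.
            count -= 1
    return count, center_x


count_example = 0
prev_example = None
# Made-up horizontal centers of one tracked box over five frames.
for center in [100, 200, 260, 300, 240]:
    count_example, prev_example = update_count(center, prev_example, count_example)
    print(center, count_example)
```

This sketch tracks a single box only; with several people in the frame, each detection would need its own previous position.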

In [None]:
# define a video capture object
# vid = cv2.VideoCapture("Video_prueba.mp4")
vid = cv2.VideoCapture(0)
"""
# Binarization
fgbg = cv2.createBackgroundSubtractorMOG2(history=20)
 
# Disable OpenCL; this does not work otherwise
cv2.ocl.setUseOpenCL(False)
"""

write_txt = ""
ciclo = 0

width = 480     # Obtained by running (width, height, channels)
height = 640    # print(image_with_detections.shape)
while True:
    
    # Capture the video frame by frame.
    ret, frame = vid.read()
    # Stop when no frame could be read (camera unavailable or stream ended).
    if not ret:
        break
    
    # Apply the background subtraction algorithm.
    # fgmask = fgbg.apply(frame)
 
    # Copy the threshold mask to detect the contours.
 
    # cv2.imshow('Camara', frame)
    # cv2.imshow('Umbral', fgmask)
    
    detections = detect_objects_on_image(frame, model)
    image_with_detections = draw_detections_on_image(frame, detections, labels)
    
    ciclo += 1

    # String to write to the txt file
    write_txt = write_txt + "###################### Cycle " + str(ciclo)
    write_txt = write_txt + " ############################################\nBoxes:\n"
    
    for i in range (detections["num_detections"]):
        # Frame of [640,480]
        write_txt = write_txt + str(i + 1) + ":     "
        # Lower and upper corner points of the detected person's bounding box
        coords_x_inf = round(detections["detection_boxes"][i][0] * height, 2)
        coords_y_inf = round(detections["detection_boxes"][i][1] * width, 2)
        coords_x_sup = round(detections["detection_boxes"][i][2] * height, 2)
        coords_y_sup = round(detections["detection_boxes"][i][3] * width, 2)
        write_txt = write_txt + "[" + str(coords_x_inf) + ", " + str(coords_y_inf) + "][" + str(coords_x_sup) + ", " + str(coords_y_sup) + "] Scores:" + str(detections["detection_scores"][i]) + "\n"
        
        # Call the counting function once and reuse its result.
        result = count_func(float(coords_x_sup)
                         ,float(coords_y_sup)
                         ,float(coords_x_inf)
                         ,float(coords_y_inf),count,prev_pos)
        print(result)
        print(count, "Prev_pos", prev_pos)
        count, prev_pos = result

        # print(type(int(coords_x_inf)))
    write_txt = write_txt + "\n"
    cv2.imshow('frame', image_with_detections)

    # the 'q' button is set as the
    # quitting button you may use any
    # desired button of your choice
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
  
# After the loop, write the log file and release the capture object.
with open("C:/Users/LENOVO/Desktop/pruebas_detection.txt", "w") as f:
    # print(write_txt)
    f.write("               Frame of 480 x 640\n==============================================================\n")
    f.write(write_txt)
vid.release()
# Destroy all the windows
cv2.destroyAllWindows()

y1 =  258 
x1 =  101 
y2 =  479 
x2 =  591 

(0, [639.46, 443.27, 344.66, 76.14])
0 Prev_pos [640.0, 203.06, 483.21, 104.57]
y1 =  259 
x1 =  98 
y2 =  480 
x2 =  613 

(0, [640.0, 460.15, 346.28, 73.57])
0 Prev_pos [639.46, 443.27, 344.66, 76.14]
y1 =  258 
x1 =  102 
y2 =  480 
x2 =  595 

(0, [640.0, 446.89, 344.22, 76.72])
0 Prev_pos [640.0, 460.15, 346.28, 73.57]
y1 =  259 
x1 =  98 
y2 =  480 
x2 =  618 

(0, [640.0, 463.88, 346.45, 73.91])
0 Prev_pos [640.0, 446.89, 344.22, 76.72]
y1 =  258 
x1 =  98 
y2 =  480 
x2 =  599 

(0, [640.0, 449.27, 344.1, 74.09])
0 Prev_pos [640.0, 463.88, 346.45, 73.91]
y1 =  257 
x1 =  94 
y2 =  480 
x2 =  594 

(0, [640.0, 445.82, 343.96, 71.1])
0 Prev_pos [640.0, 449.27, 344.1, 74.09]
y1 =  260 
x1 =  94 
y2 =  479 
x2 =  614 

(0, [639.69, 460.98, 346.8, 71.02])
0 Prev_pos [640.0, 445.82, 343.96, 71.1]
y1 =  257 
x1 =  96 
y2 =  479 
x2 =  595 

(0, [639.04, 446.36, 343.95, 72.5])
0 Prev_pos [639.69, 460.98, 346.8, 71.02]
y1 =  259 
x1 =  95 
y2
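The capture loop above builds the whole log in one string and writes it only after the loop ends. A possible alternative, sketched below, formats one block per frame and writes it straight away, so memory use stays flat on long runs. `format_cycle` and the temporary file name are hypothetical; the box comes from the printed output above and the score is made up.

```python
import os
import tempfile


def format_cycle(cycle, boxes, scores):
    """Build the text block for one frame (hypothetical helper)."""
    lines = ['Cycle ' + str(cycle), 'Boxes:']
    for index, (box, score) in enumerate(zip(boxes, scores), start=1):
        lines.append(str(index) + ': ' + str(box) + ' score: ' + str(score))
    return '\n'.join(lines) + '\n\n'


# A throwaway path in the system temp folder, used only for this example.
log_path = os.path.join(tempfile.gettempdir(), 'detections_log_example.txt')
with open(log_path, 'w') as log_file:
    # In the real loop this call would sit right after the detection step.
    log_file.write(format_cycle(1, [[258, 101, 479, 591]], [0.87]))
    log_file.write(format_cycle(2, [], []))

with open(log_path) as log_file:
    log_text = log_file.read()
print(log_text)
```

Keeping the file open in a `with` block for the whole capture session also ensures it is closed if the loop exits through an exception.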

## Converting the model to web-format

To use the `ssdlite_mobilenet_v2_coco_2018_05_09` model on the web we need to convert it into a format that [tensorflowjs](https://www.tensorflow.org/js) understands. To do so we may use [tfjs-converter](https://github.com/tensorflow/tfjs/tree/master/tfjs-converter) as follows:

```
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_format=tfjs_graph_model \
   ./experiments/objects_detection_ssdlite_mobilenet_v2/.tmp/datasets/ssdlite_mobilenet_v2_coco_2018_05_09/saved_model \
    ./demos/public/models/objects_detection_ssdlite_mobilenet_v2
```

An alternative and easier way would be to use the [@tensorflow-models/coco-ssd](https://www.npmjs.com/package/@tensorflow-models/coco-ssd) npm package. But just for exploration purposes, let's go one level deeper and use the model directly, without wrapper modules.

You can find this experiment in the [Demo app](https://trekhleb.github.io/machine-learning-experiments) and play around with it right in your browser to see how the model performs in real life.