# Boxes Computation

In this exercise, you will use the TensorFlow Object Detection API to compute bounding boxes and classes for the objects in images.

But first, we need to install a few things.

## I. Installation

We follow the API's installation guide, which can be found [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md).

### I.1. Dependencies

The first thing to do is to install all needed dependencies (if not installed yet):
- `pip install --user Cython`
- `pip install --user contextlib2`
- `pip install --user pillow`
- `pip install --user lxml`
- `pip install --user jupyter`
- `pip install --user matplotlib`
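
To check which of these packages are already installed, a small script along the following lines can help. It is only a sketch: the helper name `find_missing` is ours, and it assumes Python 3.8 or later (for `importlib.metadata`).

```python
from importlib import metadata

# pip names of the dependencies listed above
REQUIRED = ["Cython", "contextlib2", "pillow", "lxml", "jupyter", "matplotlib"]


def find_missing(packages):
    """Return the packages from `packages` that are not installed."""
    missing = []
    for name in packages:
        try:
            # Raises PackageNotFoundError when the distribution is absent
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any name printed here still needs its `pip install --user ...` command.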



### I.2. Models

Now we will download the models (i.e. the architecture and trained weights of the neural networks). They are available in the so-called [detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md).

For this exercise, we will start with the **faster_rcnn_inception_v2_coco** model: download it.

You should get a `.tar.gz` file containing (among other files) `frozen_inference_graph.pb`: this is the file we will use to perform object detection.

Extract it into the `data/models` folder.
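
If you prefer to do the extraction from Python, here is a minimal sketch using the standard `tarfile` module. The helper name `extract_frozen_graph` is ours, and it assumes that the frozen graph should end up directly in `data/models`, as the path used in section II.1 suggests. It copies only that one file out of the archive.

```python
import os
import shutil
import tarfile

GRAPH_NAME = "frozen_inference_graph.pb"


def extract_frozen_graph(archive_path, dest_dir):
    """Copy the frozen graph out of a model archive into dest_dir.

    Returns the path of the extracted file.
    """
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, GRAPH_NAME)
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # The graph usually sits inside a sub-folder of the archive
            if member.isfile() and os.path.basename(member.name) == GRAPH_NAME:
                source = archive.extractfile(member)
                with source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                return target
    raise FileNotFoundError(GRAPH_NAME + " not found in " + archive_path)
```

You would call it as `extract_frozen_graph(<path to the downloaded .tar.gz>, "data/models")`, with the archive path adapted to wherever your browser saved the file.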

We also need to clone the `models` part of the TensorFlow Object Detection API. To do so, open your terminal, and in the **root of the vivadata folder**, clone the repo with the following command:

```
git clone https://github.com/tensorflow/models.git
```

Finally, **do not commit those files** (the downloaded model and the cloned repository).

### I.3. Protobuf

Protobuf (short for Protocol Buffers) is Google's data serialization format; the API uses it for its configuration files.

Now go to `models/research` inside the newly cloned repo (at the root of the vivadata folder), and compile the protobuf definitions:
```
protoc object_detection/protos/*.proto --python_out=.
```

If the `protoc` command is not found, install the protobuf compiler first. On Ubuntu, use `sudo apt-get install protobuf-compiler`.

On macOS, use `brew install protobuf`.
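
The `protoc` command above writes one `*_pb2.py` file next to each `.proto` file. To verify that the compilation worked, you can look for `.proto` files that have no generated counterpart. This is a minimal sketch; the helper name `missing_pb2` is ours.

```python
from pathlib import Path


def missing_pb2(protos_dir):
    """Return the names of .proto files without a generated *_pb2.py file."""
    protos_dir = Path(protos_dir)
    return sorted(
        proto.name
        for proto in protos_dir.glob("*.proto")
        if not (protos_dir / (proto.stem + "_pb2.py")).exists()
    )
```

Calling `missing_pb2("models/research/object_detection/protos")` from the root of the vivadata folder should return an empty list once `protoc` has run successfully.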


## II. Object Detection

### II.1. Setting the paths to the trained graph

First, we store the path to the model we will use in a variable called `PATH_TO_CKPT`: this is the path to the `frozen_inference_graph.pb` file that you downloaded in I.2.

In [1]:
### TODO: define the variable PATH_TO_CKPT
PATH_TO_CKPT = '/home/guillaume/code/GGIML/vivadata-student/data/models/frozen_inference_graph.pb'

Next, you have to set the path to the labels: the model outputs its classes as numbers, but we want human-readable strings. The mapping from numbers to names is in the folder you cloned: `models/research/object_detection/data/mscoco_label_map.pbtxt`.

Put that path into the variable `PATH_TO_LABELS`:

In [2]:
### TODO: define the variable PATH_TO_LABELS
PATH_TO_LABELS = 'models/research/object_detection/data/mscoco_label_map.pbtxt'

Have a look at this file. How many classes are there? Put that value into a variable called `NUM_CLASSES`.

In [3]:
### TODO: define the variable NUM_CLASSES
NUM_CLASSES = 90
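
Rather than counting by hand, you can also read the label map with a few lines of Python. The sketch below uses a simple regular-expression parser of our own (`parse_label_map`), not the API's label map utilities, and assumes each `item { ... }` block contains an `id` and a `display_name` field. The sample text only imitates the file format.

```python
import re


def parse_label_map(text):
    """Map each class id to its display name, given label map text."""
    labels = {}
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        id_match = re.search(r"\bid:\s*(\d+)", block)
        name_match = re.search(r'display_name:\s*"([^"]*)"', block)
        if id_match and name_match:
            labels[int(id_match.group(1))] = name_match.group(1)
    return labels


# Short sample in the same format as the .pbtxt file
sample = """
item {
  name: "/m/01g317"
  id: 1
  display_name: "person"
}
item {
  name: "/m/0199g"
  id: 2
  display_name: "bicycle"
}
"""

labels = parse_label_map(sample)
print("entries:", len(labels), "largest id:", max(labels))
```

On the real file, read the text with `open(PATH_TO_LABELS).read()` and compare `len(labels)` with `max(labels)`. If the ids have gaps, the two numbers differ, which is worth keeping in mind when choosing `NUM_CLASSES`.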

### II.2. Testing object detection

We wrote some utility functions for you, so you only have to put them together to perform object detection.

First, with the following code, you will load the graph with the trained weights you downloaded:

In [4]:
import tensorflow.compat.v1 as tf

# Load the frozen graph (architecture and trained weights)
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

Then, you will have to use the function `run_inference_for_single_image` provided in the `utils.py` file. A version of this function is easy to find in the TensorFlow Object Detection API. Have a look at it and try to understand the big picture.

Then use it on the provided test images: `image1.jpg`, `image2.jpg` and `image3.jpg`.

In [6]:
import numpy as np
import matplotlib.pyplot as plt

def run_inference_for_single_image(image, graph):
    """Computes the prediction of the object detection on a single image

    Parameters:
    image (np.array): An image in shape [height, width, 3]
	graph: a graph object from TensorFlow

    Returns:
    dict: a dict containing the number of detections, their classes, boxes and scores

    """

    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)

            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
 
            # Run inference
            output_dict = sess.run(tensor_dict,
                    feed_dict={image_tensor: np.expand_dims(image, 0)})

            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict[
                'detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
    return output_dict

In [13]:
image1 = tf.keras.preprocessing.image.load_img('../input/image1.jpg')
image2 = tf.keras.preprocessing.image.load_img('../input/image2.jpg')
image3 = tf.keras.preprocessing.image.load_img('../input/image3.jpg')

In [14]:
### TODO: Use run_inference_for_single_image to compute the object detection
output1 = run_inference_for_single_image(image1, detection_graph)
output2 = run_inference_for_single_image(image2, detection_graph)
output3 = run_inference_for_single_image(image3, detection_graph)

In [18]:
output1

{'num_detections': 5,
 'detection_boxes': array([[0.05651296, 0.37624934, 0.960169  , 0.9797819 ],
        [0.0280468 , 0.03250726, 0.86715823, 0.31454262],
        [0.        , 0.7006882 , 0.09992106, 0.7956755 ],
        [0.02265929, 0.7847978 , 0.29472443, 0.9639423 ],
        [0.        , 0.27655414, 0.85714257, 0.5556372 ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        

Now have a look at the output dictionary: can you understand its content? Save the three output dictionaries in pickle format; in the next part of the challenge, we will display and post-process them!
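
As a hint for reading the dictionary: only the first `num_detections` rows are real detections, and the remaining rows are zero padding. Each box is given as `[ymin, xmin, ymax, xmax]` in coordinates normalized to the image size. The sketch below converts the detections to pixel coordinates; the helper name `detections_to_pixels` and the `min_score` threshold of 0.5 are our own choices, not part of the API.

```python
def detections_to_pixels(output_dict, image_width, image_height, min_score=0.5):
    """Return the real detections as a list of dicts with pixel boxes.

    Each box is returned as (xmin, ymin, xmax, ymax) in pixels.
    """
    n = output_dict["num_detections"]
    rows = zip(
        output_dict["detection_boxes"][:n],
        output_dict["detection_classes"][:n],
        output_dict["detection_scores"][:n],
    )
    results = []
    for (ymin, xmin, ymax, xmax), class_id, score in rows:
        if score < min_score:
            continue  # skip low-confidence detections
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            "box": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return results
```

For example, `detections_to_pixels(output1, *image1.size)` works on the first image, since a PIL image stores its size as `(width, height)`.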

In [16]:
!mkdir -p ../output

In [17]:
### TODO: Save the output dicts in pickle
import pickle
with open('../output/output_dicts.pkl', 'wb') as file:
    pickle.dump([output1, output2, output3], file)
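
To make sure the file was written correctly, you can load it back and compare it with what you saved. Here is a minimal sketch of the save and load round trip; the helper names `save_outputs` and `load_outputs` are ours.

```python
import os
import pickle


def save_outputs(outputs, path):
    """Pickle `outputs` to `path`, creating the parent folder if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(outputs, f)


def load_outputs(path):
    """Load the object previously pickled at `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

After `save_outputs([output1, output2, output3], '../output/output_dicts.pkl')`, calling `load_outputs('../output/output_dicts.pkl')` should give back a list of three dictionaries with the same keys as before.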