# Unit 3: Train your Own TensorFlow Image Recognition Model

<img src="img/tensorflow_image_unit3_intro.png" width="700" />

<img src="img/robotignite_logo_text.png" width="700" />

<b>Estimated time to completion:</b> 2.5-10 hours, depending on the training time

<b>Simulated robot:</b> Mira Robot
<br><br>
<b>What will you learn with this unit?</b>
* Label images
* Prepare the package for training
* Train the model and monitor it through TensorBoard
* Use the trained model in a ROS environment

<p style="background:#407EAF;color:white;">**Example 3.1**</p><br>

### Step 0: Intro to Example

So, what is the purpose of this unit? Well, quite simple: What happens if you want to recognise something that is not on the ImageNet model list?<br>
In this example, you are going to learn all of the steps necessary to make Mira Robot recognise **itself**.

### Step 1: Labeling Images

Labeling is one of the most time-consuming tasks of this example, and it is unavoidable if you want to train a model on a custom object. Luckily, you won't have to do it here: the git repository you downloaded in **Unit 2** contains two folders with a large number of already labeled images of the **Mira Robot**.

If you have a look, you will see that you have **two folders.**

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd tf_unit1_pkg/course_tflow_image_student_data;ls

* images_1_labels: Has only **ONE** label [mira_robot]
* images_2_labels: Has **TWO** labels [mira_robot, object]. All of the objects in the scene were labeled with the tag **object**, while all of the Mira Robots were labeled with **mira_robot**.

For this example, we will use the **images_1_labels**.

So, how do you label images? We will use a program called <a href="https://github.com/tzutalin/labelImg">labelImg</a>. This program generates **.xml** files based on the images and how you label them. It makes the process less painful than writing the files by hand.

Let's practice one image:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
python3 labelImg/labelImg.py

You should get something like this if you open the Graphical Interface by clicking on the icon:

<img src="img/font-awesome_desktop.png">

<img src="img/tensorflow_images_unit3_label_img.png" width="600">

You can now upload any image through the IDE. Here, you have an example:

<img src="img/tesnsorflow_image_unit3_clasify_image.png" width="600">

Now, open the image with the GUI and start labeling by clicking **Create RectBox**, or by pressing the **W** key on the keyboard. Draw a box around each object you want to label. Here is an example; try it yourself:

<img src="img/tesnsorflow_image_unit3_clasify_image_2.png" width="600">

<img src="img/tesnsorflow_image_unit3_clasify_image_3.png" width="600">

Once you have everything you want to train labeled, you have to save. This will generate an **.xml** file with the same name as your image. Do not change the names, if you want to make your life easier.

[tesnsorflow_image_unit3_clasify_image.xml](extra_files/tesnsorflow_image_unit3_clasify_image.xml)

Inside it, you have all of the information from the image, as well as the position, size, and label of each box. This will be used for the training of the TensorFlow model to validate the learning.

Each box is stored inside its own **object** tag.
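
To make the file layout concrete, here is a minimal sketch that reads the boxes back out of such a file. The XML below is a hypothetical, simplified annotation written for this illustration (a real labelImg file contains more fields), and `read_boxes` is an illustrative helper, not part of the course scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified annotation with a single labeled box
SAMPLE_XML = """
<annotation>
  <filename>mira_example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>mira_robot</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>360</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return one (label, xmin, ymin, xmax, ymax) tuple per <object> tag."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall('object'):
        bndbox = obj.find('bndbox')
        # Look fields up by tag name instead of by position
        coords = [int(bndbox.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes.append((obj.find('name').text,) + tuple(coords))
    return boxes


print(read_boxes(SAMPLE_XML))  # [('mira_robot', 120, 80, 360, 400)]
```

Reading the fields by tag name keeps the helper working even if the order of the tags inside **object** changes.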

The last thing you have to do is **COPY** **10%** of the images (together with their **.xml** files) into a **test** folder, and the other **90%** into a **train** folder. The exact percentages are up to you. If you are unsure whether the model works, put more images in the **test** folder.<br>
It is **VERY IMPORTANT** that the images in the **test** folder **DON'T APPEAR** in the **train** folder. This guarantees that the trained model is tested with images that it has never seen. Otherwise, the test results won't tell you whether it has learned correctly. It would be like testing students with the exact same exercise that they did yesterday in class: you won't know if they have learned or just have a good memory.
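
If you prefer not to split the files by hand, the split could be sketched as follows (Python 3). The function name, the folder arguments, and the fixed random seed are assumptions made for this illustration; the only thing taken from the text is the 90/10 split with no image shared between the two folders.

```python
import os
import random
import shutil


def split_dataset(source_dir, dest_dir, test_fraction=0.1, seed=0):
    """Copy labeled files from source_dir into dest_dir/train and dest_dir/test."""
    # Every .xml file identifies one labeled sample (image + annotation)
    names = sorted(os.path.splitext(f)[0] for f in os.listdir(source_dir)
                   if f.endswith('.xml'))
    random.Random(seed).shuffle(names)
    n_test = max(1, int(round(len(names) * test_fraction)))
    labeled = set(names)
    test_names = set(names[:n_test])

    for subset in ('train', 'test'):
        os.makedirs(os.path.join(dest_dir, subset), exist_ok=True)

    for file_name in os.listdir(source_dir):
        stem = os.path.splitext(file_name)[0]
        if stem not in labeled:
            continue  # skip unlabeled files and sub-folders
        # A sample goes to exactly one subset, so the two never overlap
        subset = 'test' if stem in test_names else 'train'
        shutil.copy(os.path.join(source_dir, file_name),
                    os.path.join(dest_dir, subset, file_name))
    return sorted(test_names)
```

Because each sample name is assigned to exactly one subset, an image and its **.xml** file always end up together, and never in both folders.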

And that's it. Now, you just have to do the same thing for as many images as you need for the training. To make life a bit easier, you can import a full folder with all of the images inside.

### Step 2: Prepare the Image Data for the TensorFlow training

However, TensorFlow doesn't read **.xml** files directly. Instead, this training pipeline uses a file type called **.record** (TFRecord). So, you have to convert your XML files.<br>
This is divided into **two main steps**:<br>
* Convert XML to CSV
* Convert CSV to RECORD

#### Step 2.1: XML to CSV

You first have to copy the images that we provide you into your working directory package: 

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# Copy the images files to a generic images folder in the scripts directory
roscd tf_unit1_pkg
rm -rf course_tflow_image_student_data
git clone https://bitbucket.org/theconstructcore/course_tflow_image_student_data.git

# We clean any data you might have created previously
rm -rf data
mkdir data
    
# Generate Image Folder with the images wanted
rm -rf images
mkdir images
cp -a course_tflow_image_student_data/images_1_labels/. images

You will use this **data** folder afterwards for the **CSV** and **RECORD** files that are generated.

Now, you have to make the conversion from **XML** to **CSV**. For this, you will use the following python script:

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.1-py}: xml_to_csv.py** </p>

In [None]:
#!/usr/bin/env python
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET


def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    for directory in ['train', 'test']:
        image_path = os.path.join(os.getcwd(), 'images/{}'.format(directory))
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv('data/{}_labels.csv'.format(directory), index=None, encoding='utf-8')
        print('Successfully converted xml to csv.')


main()

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.1-py}: xml_to_csv.py** </p>

You can now execute this script:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# Convert XML to CSV files
roscd tf_unit1_pkg
python scripts/xml_to_csv.py

You should now have two files in the **data** folder: **test_labels.csv** and **train_labels.csv**.
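
Before moving on, it can be worth checking that every box in the CSV lies inside its image. The following is a minimal sketch using only the standard library; `find_bad_rows` and the sample rows are hypothetical, while the column names are the ones written by **xml_to_csv.py**.

```python
import csv
import io


def find_bad_rows(csv_file):
    """Return the filenames whose box is empty or falls outside the image."""
    bad = []
    for row in csv.DictReader(csv_file):
        width, height = int(row['width']), int(row['height'])
        xmin, ymin = int(row['xmin']), int(row['ymin'])
        xmax, ymax = int(row['xmax']), int(row['ymax'])
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        if not inside:
            bad.append(row['filename'])
    return bad


# Hypothetical rows, using the same header that xml_to_csv.py writes
sample = io.StringIO(
    "filename,width,height,class,xmin,ymin,xmax,ymax\n"
    "ok.jpg,640,480,mira_robot,120,80,360,400\n"
    "bad.jpg,640,480,mira_robot,500,80,700,400\n"
)
print(find_bad_rows(sample))  # ['bad.jpg']
```

On the real data, you would pass an open file instead of the sample, for example `open('data/train_labels.csv')`.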

#### Step 2.2: CSV to RECORD

Now to generate the Record files, you need to execute another python script:

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.2-py}: generate_tfrecord_n.py** </p>

In [None]:
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

"""
Usage:
  # From the tf_unit1_pkg folder
  # Create train data:
  python scripts/generate_tfrecord_n.py --image_path_input=images/train --csv_input=data/train_labels.csv  --output_path=data/train.record

  # Create test data:
  python scripts/generate_tfrecord_n.py --image_path_input=images/test --csv_input=data/test_labels.csv  --output_path=data/test.record
"""

import os
import io
import pandas as pd
import numpy as np
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict
from extract_training_lables_csv import extract_training_labels_csv, class_text_to_int
import sys
reload(sys)  
sys.setdefaultencoding('utf8')
    
flags = tf.app.flags
flags.DEFINE_string('image_path_input', '', 'Path to the Images referred to in the CSV')
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path, unique_label_array):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class'], unique_label_array))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    
    path = os.path.join(os.getcwd(), FLAGS.image_path_input)
    examples = pd.read_csv(FLAGS.csv_input)
    unique_label_array = extract_training_labels_csv(examples)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path, unique_label_array)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.2-py}: generate_tfrecord_n.py** </p>

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.2b-py}: extract_training_lables_csv.py** </p>

In [None]:
import pandas as pd
import numpy as np

def extract_training_labels_csv(csv_object):
    """
    We suppose that in the test and train image sets there are all the labels
    """
    unique_label_set = set([])
    d = csv_object.loc[: , "class"]
    for key,value in d.iteritems():
        unique_label_set.add(value)
    
    # We convert to have functions like sort
    unique_label_list = list(unique_label_set)
    unique_label_list.sort()
    return unique_label_list

def class_text_to_int(row_label, unique_label_array):
    """
    The labels will be assigned by alphabetical order
    This is very important for the config file for training
    """
    class_integer = int(unique_label_array.index(row_label) + 1)
    #print ("Label="+str(row_label)+", Index="+str(class_integer))
    return class_integer
    

if __name__ == "__main__":
    print "Opening CSV..."
    csv_input_for_labels = "scripts/dummy1.csv"
    examples = pd.read_csv(csv_input_for_labels)
    print "Opened CSV..."
    unique_label_array = extract_training_labels_csv(examples)
    label_contents = ""
    for lable in unique_label_array:
        print "Generating Index for lable=="+str(lable)
        index = class_text_to_int(lable, unique_label_array)
        print "Label=="+str(lable)+", index ="+str(index)
        label_contents += "item {\n    id : "+str(index)+"\n    name : "+str(lable)+"\n}\n"

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.2b-py}: extract_training_lables_csv.py** </p>

In **extract_training_lables_csv.py**, you extract the labels that you are going to use. In this first example, 3.1, the images only have **one label**, which is **mira_robot**. The script will therefore extract the label **mira_robot** from the CSV files generated previously.
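
The numbering rule used by that script (unique labels, sorted alphabetically, IDs starting at 1) can be sketched without pandas. `build_label_map` is an illustrative name and not part of the course scripts.

```python
def build_label_map(labels):
    """Map each distinct label to a 1-based ID, in alphabetical order."""
    unique_labels = sorted(set(labels))
    return {name: index + 1 for index, name in enumerate(unique_labels)}


print(build_label_map(['mira_robot', 'mira_robot']))        # {'mira_robot': 1}
print(build_label_map(['object', 'mira_robot', 'object']))  # {'mira_robot': 1, 'object': 2}
```

The second call corresponds to the **images_2_labels** case: because the order is alphabetical, **mira_robot** keeps ID 1 and **object** gets ID 2.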

In the **generate_tfrecord_n.py** file, there is one line that you need to know about:

In [None]:
from object_detection.utils import dataset_util

Here you are importing from a folder called **object_detection**. To make this course easier, you are going to download it into your package and set the python path to find it.

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd tf_unit1_pkg
# Download the git models with the object detection module inside
rm -rf models
# We don't clone from git because it's somewhat unstable. You can try it, though: git clone https://github.com/tensorflow/models.git
cp -r course_tflow_image_student_data/tf_models/models ./
# Compile protos messages python modules
cd models/research
protoc object_detection/protos/*.proto --python_out=.
echo "Check proto python files generated"
ls object_detection/protos/*_pb2.py
# We make all the modules inside models/research and also the slim folder inside available anywhere in python interpreter.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
echo $PYTHONPATH
roscd tf_unit1_pkg/scripts

Here you copy the **models** folder from the repo we give you, **course_tflow_image_student_data**, and compile the **protobuf** files to generate a Python module for each one. You don't need to know what they do; just know that you need them to be able to use this TensorFlow library.

OK, you can now execute the Python script **generate_tfrecord_n.py**:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd tf_unit1_pkg
python scripts/generate_tfrecord_n.py --image_path_input=images/train --csv_input=data/train_labels.csv  --output_path=data/train.record
python scripts/generate_tfrecord_n.py --image_path_input=images/test --csv_input=data/test_labels.csv  --output_path=data/test.record

This should generate the **.record** files for the **train** and **test** in the **data** folder.

#### Step 2.3: Extra step: Check that the record files are ok

This step is not necessary for the training, but it is good to know how to do it. Being able to read the contents of a **.record** file lets you check your own files, or other people's, and make sure that everything went OK.

For this, you have to launch a python script:

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.3-py}: tfrecord_inspector.py** </p>

In [None]:
import tensorflow as tf

print "TRAIN.RECORD data ==>"
for example in tf.python_io.tf_record_iterator("data/train.record"):
    result = tf.train.Example.FromString(example)
    print result
    
print "TEST.RECORD data ==>"
for example in tf.python_io.tf_record_iterator("data/test.record"):
    result = tf.train.Example.FromString(example)
    print result

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.3-py}: tfrecord_inspector.py** </p>

It's simply converting the **record** file to a String. You can then save that output into a log file, like this:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# Check the Tf Record has been done
roscd tf_unit1_pkg
rm -f tfrecord_inspector.log
python scripts/tfrecord_inspector.py >> tfrecord_inspector.log

You can then read the **tfrecord_inspector.log** file. Beware that this can be a very big file because it also contains the image data. We recommend that you use VIM to open it: it's the fastest way, and it doesn't open all of the data at once.

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
vim tfrecord_inspector.log

Then type **":q"**, to exit VIM.
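
If the image data makes the log hard to read, one option is to shorten the very long lines before looking at the file. This is only a sketch: `shorten_lines` and the 200-character limit are assumptions made for this illustration, not part of the course scripts.

```python
def shorten_lines(lines, max_chars=200):
    """Yield each line, cutting off anything beyond max_chars characters."""
    for line in lines:
        line = line.rstrip('\n')
        if len(line) > max_chars:
            omitted = len(line) - max_chars
            yield line[:max_chars] + ' ... [%d characters omitted]' % omitted
        else:
            yield line


# Hypothetical log lines: a short field and a very long encoded value
for line in shorten_lines(['key: "image/width"\n', 'x' * 300]):
    print(line)
```

To apply it to the real log, pass the open file, for example `shorten_lines(open('tfrecord_inspector.log', errors='replace'))` in Python 3.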

### Step 3: Copy Model Data for Training

To train, you need a **TensorFlow** model. A model is made up of a series of files that define the different Deep Learning neural network operations. You can find loads of models: some are **faster** than others, some are **more precise**, and they vary depending on their application. Some are just for images, others for sound, still others for stock market data... You name it.

In your case, you have to copy the following model:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# We clean up previous images
rm -rf ./models/research/object_detection/images

cp -a data/. ./models/research/object_detection/data
cp -r images ./models/research/object_detection/
cp -r training ./models/research/object_detection/

# Copy Selected model from the user
cp -r course_tflow_image_student_data/tf_models/ssd_mobilenet_v1_coco_11_06_2017 ./models/research/object_detection/
cp course_tflow_image_student_data/tf_models/ssd_mobilenet_v1_coco.config ./models/research/object_detection/training/

Here you are copying the **ssd_mobilenet_v1_coco.config** file and the **ssd_mobilenet_v1_coco_11_06_2017** model folder (the contents of the **ssd_mobilenet_v1_coco_11_06_2017.tar.gz** file). It contains the binary information of the model, used by TensorFlow. It's more complex than that, but it's not necessary to know for this course.<br>
The **.config** file **IS** important to know because here you will have to change some elements to indicate where to extract the TensorFlow model from, how many labels you have, the basic size of the training images... It configures all of the aspects of the training procedure. We are going to just comment on the essentials. Open the **ssd_mobilenet_v1_coco.config** in the **IDE**:

* Number Of Classes: How many labels you have; in our case, **ONE**, mira_robot.

In [None]:
model {
  ssd {
    num_classes: 1

* fine_tune_checkpoint: Here you state the model file path. In our case, **ssd_mobilenet_v1_coco_11_06_2017/model.ckpt**.

In [None]:
}
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"

* Batch Size: How many images are used in each training step. If you run out of RAM, you have to **LOWER** this value. The minimum is, of course, **1**. But the lower it is, the slower you will train.

In [None]:
train_config: {
  batch_size: 1

* Data Record File Paths and Label file PBTXT: Where to get the training and test image data, and also the labels list (which is extracted from the **object-detection.pbtxt** file that you are going to create now).

In [None]:
train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "training/object-detection.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}

At the end, the **ssd_mobilenet_v1_coco.config** should look like the one in the public git, inside the **tf_models** folder. It is advisable that you always download the files from the official source though, just in case there was an improvement.
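
As a quick way to double-check the values discussed above, you could pull them out of the config text with regular expressions. This is a rough sketch under the assumption that each field appears in the plain `name: value` form shown in the fragments; it is not a real protobuf parser, and `read_config_values` is an illustrative name.

```python
import re


def read_config_values(config_text):
    """Extract the fields discussed above from the pipeline config text."""
    def first(pattern):
        match = re.search(pattern, config_text)
        return match.group(1) if match else None

    return {
        'num_classes': first(r'num_classes:\s*(\d+)'),
        'batch_size': first(r'batch_size:\s*(\d+)'),
        'fine_tune_checkpoint': first(r'fine_tune_checkpoint:\s*"([^"]*)"'),
        'input_paths': re.findall(r'input_path:\s*"([^"]*)"', config_text),
        'label_map_paths': re.findall(r'label_map_path:\s*"([^"]*)"', config_text),
    }


# Shortened config text built from the fragments shown in this step
sample = '''
model { ssd { num_classes: 1 } }
train_config: { batch_size: 1
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt" }
train_input_reader: { tf_record_input_reader { input_path: "data/train.record" }
  label_map_path: "training/object-detection.pbtxt" }
eval_input_reader: { tf_record_input_reader { input_path: "data/test.record" }
  label_map_path: "training/object-detection.pbtxt" }
'''
values = read_config_values(sample)
print(values['num_classes'], values['batch_size'], values['input_paths'])
```

On the real file, read the text with `open(...).read()` and compare the printed values with the ones you intended to set.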

### Step 4: Create the Label List file object-detection.pbtxt

So, let's create this file called **object-detection.pbtxt**. Here you will state the labels for your training and the **ID** associated with it.

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd tf_unit1_pkg
rm -rf training
mkdir training
python scripts/generate_pbtxt_file.py

Let's have a look at this **generate_pbtxt_file.py** file. Of course, you can write this **pbtxt** file by hand, but this script makes it much easier to change the number of labels later:

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.3b-py}: generate_pbtxt_file.py** </p>

In [None]:
#!/usr/bin/env python
import sys
import os
import pandas as pd
from extract_training_lables_csv import extract_training_labels_csv, class_text_to_int


def create_label_contents(csv_input_for_labels):
    """Build the pbtxt text: one item per label found in the CSV."""
    print("Opening CSV...")
    examples = pd.read_csv(csv_input_for_labels)
    print("Opened CSV...")
    unique_label_array = extract_training_labels_csv(examples)
    label_contents = ""
    for label in unique_label_array:
        print("Generating Index for label=="+str(label))
        # IDs start at 1 and follow the alphabetical order of the labels
        index = class_text_to_int(label, unique_label_array)
        label_contents += "item {\n    id : "+str(index)+"\n    name : '"+str(label)+"'\n}\n"

    return label_contents

def generate_pbtxt_files(file_path, csv_input_for_labels):
    """Write the label list extracted from the CSV into file_path."""
    print("Opening file=="+str(file_path))
    file = open(file_path,'w')
    #contents = "item {\nid : 1\nname : 'mira_robot'\n}\nitem {\nid: 2\nname: 'object'\n}"
    print("Start create_label_contents...")
    contents = create_label_contents(csv_input_for_labels)
    print("Done create_label_contents...")
    file.write(contents)
    file.close()
    print("Pbtxt Generated..."+str(file_path))

if __name__ == "__main__":

    file_path = "training/object-detection.pbtxt"
    csv_input_for_labels = "data/train_labels.csv"
    generate_pbtxt_files(file_path, csv_input_for_labels)

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.3b-py}: generate_pbtxt_file.py** </p>

The generated **pbtxt** file follows this structure:

In [None]:
item {
    id : 1
    name : 'label1'
}

item {
  id: 2
  name: 'label2'
}

...

item {
  id: n
  name: 'labeln'
}

In this case, because there is only **ONE** label, you just have to write:

In [None]:
item {
    id : 1
    name : 'mira_robot'
}
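
To confirm that the label file matches what you expect, the items can be read back with a small regular expression. This sketch assumes the simple `item { id : N name : '...' }` layout shown above; `read_label_map` and `ids_are_consecutive` are illustrative helpers, not part of the course scripts.

```python
import re

# Matches one item block in the simple layout used in this unit
ITEM_PATTERN = re.compile(r"item\s*\{\s*id\s*:\s*(\d+)\s*name\s*:\s*'([^']*)'\s*\}")


def read_label_map(pbtxt_text):
    """Return a {id: name} dictionary with one entry per item block."""
    return {int(item_id): name for item_id, name in ITEM_PATTERN.findall(pbtxt_text)}


def ids_are_consecutive(label_map):
    """True when the IDs are exactly 1, 2, ..., n."""
    return sorted(label_map) == list(range(1, len(label_map) + 1))


text = "item {\n    id : 1\n    name : 'mira_robot'\n}\n"
label_map = read_label_map(text)
print(label_map)                       # {1: 'mira_robot'}
print(ids_are_consecutive(label_map))  # True
```

The number of entries in the dictionary should equal the **num_classes** value in the config file.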

### Step 5: It's Time to Train!

Now, it's the moment of truth. We have to start the training process and cross our fingers that everything was set up correctly.

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

Check that the object_detection module is inside the python path.<br>
If you are launching this in a new WebShell, or you are restarting from here, the python path won't have it.<br>
**THIS IS A SOURCE OF ERRORS**, so please make sure to check it.

In [None]:
roscd tf_unit1_pkg
cd models/research
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
echo $PYTHONPATH
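
A quick way to confirm that the path is set correctly is to try the import from Python. This is a minimal sketch; `module_available` is an illustrative helper, and it must be run with the same interpreter, in the same WebShell, that you will use for the training.

```python
def module_available(name):
    """Return True when the module can be imported with the current path."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


for name in ('object_detection', 'nets'):
    # 'nets' is one of the packages inside the slim folder
    print(name + ' available: ' + str(module_available(name)))
```

If either line prints `False`, run the `export PYTHONPATH` command above again from the **models/research** folder.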

### **And now, start the training.**

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd tf_unit1_pkg
cd models/research/object_detection
# train.py needs object_detection on the PYTHONPATH (exported in the previous step)
python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config

If all went well, you should see that it starts to output the step times. That means it's training.<br>

<img src="img/tensorflow_image_unit3_steps.png" width="500" />

If the script gets killed, it usually means that you chose a **batch_size** that was too big for the resources (mainly RAM) you have. Lower it in the **ssd_mobilenet_v1_coco.config** and try again. Using a lot of training images might also kill the process; in that case, remove some of the images, redo the previous steps, and try again.

You can also monitor the system load by executing **top** or **htop** (the latter has more colours). Here you can see the load average and the RAM used.

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
top

<img src="img/tesnorflow_image_unit3_top.png" width="500" />

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
htop

<img src="img/tesnorflow_images_unit3_htop.png" width="500" />

### Step 6: Use TensorBoard to check the training progress

As you did in **Unit 2**, you can start the **TensorBoard** client and monitor the training progress. Just run the following commands in another **WebShell**:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #2</p>
</th>
</tr>
</table>

In [None]:
# Activate tensorboard:
roscd tf_unit1_pkg
cd models/research/object_detection
tensorboard --logdir=training/

Then, connect the TensorBoard client to your browser, as explained in **Unit 2**, and get the IP, like this:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #2</p>
</th>
</tr>
</table>

In [None]:
public_ip

If all was done correctly, you should get something similar to this:

<img src="img/tensorflow_image_unit3_label1_start.png" width="700" />

After about **2 hours**, you should have something similar to the image below. As you can see, the **TotalLoss** has stabilised around the **1.00** value. At that point, the model is considered to have learned and is unlikely to get any better. As a rule, stop the training process when the **TotalLoss** stabilises and maintains its value. Otherwise, you might **overtrain** (overfit) your model, which is not recommended.

<img src="img/tensorflow_image_unit3_label1_end.png" width="700" />
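
What "stabilises" means can be made a little more precise with a simple rule of thumb: compare the average of the most recent loss values with the average of the values just before them. The sketch below is one possible way to do that; the window size, the tolerance, and the sample values are assumptions chosen for illustration, not settings used by TensorFlow or TensorBoard.

```python
def has_plateaued(losses, window=5, tolerance=0.05):
    """True when the mean of the last `window` losses is within `tolerance`
    of the mean of the `window` losses before them."""
    if len(losses) < 2 * window:
        return False
    recent = sum(losses[-window:]) / float(window)
    previous = sum(losses[-2 * window:-window]) / float(window)
    return abs(previous - recent) <= tolerance


# Hypothetical TotalLoss readings
still_learning = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
stabilised = [5.0, 3.0, 2.0] + [1.0] * 10
print(has_plateaued(still_learning))  # False
print(has_plateaued(stabilised))      # True
```

In practice, the loss is noisy, so a larger window usually gives a more reliable answer than a small one.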

<p style="background:#AB0017;color:white;">**WARNING**</p><br>

Please note that **RobotIgnite** has an automatic system that stops your session if you don't use your session for around 30 minutes, so please continue working on your session while you do the training.<br>
If you want a platform to do this training for heavy and complex models that take days, please use our **development platform**, <a href="http://www.theconstructsim.com/rds-ros-development-studio/">ROS Development Studio</a>. You can have it for free if you want a small system, and you can pay if you want full **GPU Support** and more than **8 Cores** to cut your training times.

<img src="img/tensorflow_images_unit3_rds.png" width="700" />

<p style="background:#AB0017;color:white;">**END WARNING**</p><br>

### Step 7: Export Inference Graph

The training process saves **checkpoints** of the model as it runs. From the newest checkpoint, you can export an **inference graph**: a file that you load to make predictions and detect the objects in a scene, in real time. This makes sense because you only have to do the training once, and then reuse that knowledge.

The first thing to do is export the files to your **scripts folder** for ease of use:

In [None]:
roscd tf_unit1_pkg
cd models/research
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
echo $PYTHONPATH

roscd tf_unit1_pkg
python scripts/extract_newest_ckpt_name.py models/research/object_detection/training/ newest_ckpt.txt
index_newest_ckpt=`cat newest_ckpt.txt`
    
cd models/research/object_detection/
rm -rf learned_model
echo "NewestVersion=="$index_newest_ckpt


python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path training/ssd_mobilenet_v1_coco.config \
    --trained_checkpoint_prefix training/model.ckpt-$index_newest_ckpt \
    --output_directory learned_model
ls learned_model

roscd tf_unit1_pkg
rm -rf learned_model
cp -r models/research/object_detection/learned_model ./

Notice the **index_newest_ckpt** variable in the export command. It holds the number of the **last model version** saved by the training. To find that number, the **extract_newest_ckpt_name.py** file was executed:

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.3c-py}: extract_newest_ckpt_name.py** </p>

In [None]:
import sys
from os import path
from os import listdir
from os.path import isfile, join

def extract_newest_ckpt_name(ckpt_folder_path, output_data_file_path):
    """
    It searches inside the ckpt_folder_path for ckpt files, and saves into the
    output_data_file_path the biggest checkpoint number, which is the most recent one.
    """
    files_list = [f for f in listdir(ckpt_folder_path) if isfile(join(ckpt_folder_path, f))]
    print (str(files_list))
    matching = [s for s in files_list if "model.ckpt-" in s]
    print (str(matching))
    meta_files_list = [s for s in matching if ".meta" in s]
    print (str(meta_files_list))
    
    # Get the highest number among the model.ckpt-XXX.meta files
    MAX_index_file_version = 0
    for meta_file in meta_files_list:
        # filename = model.ckpt-XXX, file_extension = .meta
        filename, file_extension = path.splitext(meta_file)
        # The text after the dash is the checkpoint number XXX
        index_file_version = int(filename.split("-")[1])
        if index_file_version > MAX_index_file_version:
            MAX_index_file_version = index_file_version

    print("MAX_INDEX="+str(MAX_index_file_version))

    # Save the number so that the shell can read it back with cat
    print("Opening file=="+str(output_data_file_path))
    file = open(output_data_file_path,'w')
    contents = str(MAX_index_file_version)
    file.write(contents)
    file.close()
    print("Newest checkpoint index saved..."+str(output_data_file_path))

    return None

if __name__ == "__main__":
    """
    python scripts/extract_newest_ckpt_name.py /home/user/simulation_ws/src/tensorflow_image_automatic_learning/models/research/object_detection/training/ /home/user/simulation_ws/src/tensorflow_image_automatic_learning/newest_ckpt.txt
    """
    ckpt_folder_path= sys.argv[1] 
    output_data_file_path= sys.argv[2]
    extract_newest_ckpt_name(ckpt_folder_path, output_data_file_path)

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.3c-py}: extract_newest_ckpt_name.py** </p>

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd tf_unit1_pkg
cd models/research/object_detection
ls training

This **extract_newest_ckpt_name.py** file will look inside the **models/research/object_detection/training** folder and get the **ckpt** file with the highest number. This will be used for the generation of a **frozen model** named **learned_model**, similar to the one you used in **Unit 2**.
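
The core of that search can also be written more compactly with a regular expression. This is an alternative sketch, not the course script: `newest_checkpoint_index` is an illustrative name, and the file names in the example are hypothetical.

```python
import re


def newest_checkpoint_index(file_names):
    """Return the highest XXX among names like model.ckpt-XXX.meta, or None."""
    pattern = re.compile(r'^model\.ckpt-(\d+)\.meta$')
    matches = [pattern.match(name) for name in file_names]
    indices = [int(match.group(1)) for match in matches if match]
    return max(indices) if indices else None


# Hypothetical contents of the training folder
files = ['checkpoint', 'model.ckpt-0.meta', 'model.ckpt-1200.meta', 'model.ckpt-350.index']
print(newest_checkpoint_index(files))  # 1200
```

Returning `None` for an empty folder makes it easy to detect that the training has not saved any checkpoint yet, instead of silently exporting version 0.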

### Step 8: Copy Validation Images

We also copy validation images into the **test_images** folder, inside **object_detection**, to launch a validation script afterwards. This is crucial to see how well our model does with images that it hasn't seen before.

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# Copy the validation images into the test_images dir
roscd tf_unit1_pkg
cp -a course_tflow_image_student_data/validation_images/. models/research/object_detection/test_images

These images haven't been used in the training or in the testing. They don't even contain Mira. Their only purpose is to check how well the model performs in situations it has never seen.
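If you bring your own validation images, they may not follow the **image1.jpg, image2.jpg, ...** naming that the validation script in the next step expects. The following is a minimal sketch of a copy-and-rename helper under that assumption. The function name `copy_as_numbered_images` is hypothetical, and it only renames JPEG files; it does not convert other formats.

```python
import os
import shutil


def copy_as_numbered_images(src_dir, dst_dir, extensions=(".jpg", ".jpeg")):
    """Copy JPEG files from src_dir into dst_dir as image1.jpg, image2.jpg, ..."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    # Sort the names so that the numbering is reproducible
    names = sorted(
        name for name in os.listdir(src_dir)
        if name.lower().endswith(extensions)
    )
    copied = []
    for index, name in enumerate(names, start=1):
        target = os.path.join(dst_dir, "image{}.jpg".format(index))
        shutil.copyfile(os.path.join(src_dir, name), target)
        copied.append(target)
    return copied
```

The destination folder should be empty beforehand, because the validation script counts every file in it to decide how many images to load.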

### Step 9: Launch the Testing Training Script

Now, we have to use these **validation images** to test the model. To do so, you have to launch this script:

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.4-py}: validate_learning.py** </p>

In [None]:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import pandas as pd
from os import listdir
from os.path import isfile, join

from collections import defaultdict
from io import StringIO
#from matplotlib import pyplot as plt
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

from extract_training_lables_csv import extract_training_labels_csv

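# Compare versions numerically: a plain string comparison would rank '1.10.0' below '1.4.0'
from distutils.version import LooseVersion
if LooseVersion(tf.__version__) < LooseVersion('1.4.0'):
  raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')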
  
# /home/user/simulation_ws/src/tensorflow_image_automatic_learning
path_to_learn_pkg= sys.argv[1]
# learned_model
PATH_TO_MODEL_LEARNED= sys.argv[2]
# my_images/validation
PATH_TO_TEST_IMAGES_DIR = sys.argv[3]


research_module_path = os.path.join(path_to_learn_pkg,"models/research")
object_detection_module_path = os.path.join(research_module_path,"object_detection")
sys.path.append(research_module_path)
sys.path.append(object_detection_module_path)

print(sys.path)

from object_detection.utils import ops as utils_ops
from utils import label_map_util
from utils import visualization_utils as vis_util

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = PATH_TO_MODEL_LEARNED + '/frozen_inference_graph.pb'
#final_path_to_ckpt = os.path.join(path_to_learn_pkg,PATH_TO_CKPT)
final_path_to_ckpt = PATH_TO_CKPT

# List of the strings that is used to add correct label for each box. In our case mira_robot
PATH_TO_LABELS = os.path.join(path_to_learn_pkg, 'training/object-detection.pbtxt')

csv_input_for_labels=os.path.join(path_to_learn_pkg,'data/train_labels.csv')
examples = pd.read_csv(csv_input_for_labels)
print("Opened CSV...")
unique_label_array = extract_training_labels_csv(examples)
NUM_CLASSES = len(unique_label_array)


detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(final_path_to_ckpt, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
    
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
      
# Every file in the test images folder is expected to be named image<N>.jpg,
# with N running from 1 to the number of files (image1.jpg, image2.jpg, ...).
# To test the code with your own images, copy them into that folder using the same naming.
#final_path_to_test_img = os.path.join(path_to_learn_pkg,PATH_TO_TEST_IMAGES_DIR)
final_path_to_test_img = PATH_TO_TEST_IMAGES_DIR

numberof_validation_img = len([f for f in listdir(final_path_to_test_img) if isfile(join(final_path_to_test_img, f))])

TEST_IMAGE_PATHS = [ os.path.join(final_path_to_test_img, 'image{}.jpg'.format(i)) for i in range(1, numberof_validation_img+1) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict
  
for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)
  #plt.figure(figsize=IMAGE_SIZE)
  #plt.imshow(image_np)
  imgplot = plt.imshow(image_np)
  plt.show()

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.4-py}: validate_learning.py** </p>

In [None]:
TEST_IMAGE_PATHS = [ os.path.join(final_path_to_test_img, 'image{}.jpg'.format(i)) for i in range(1, 6) ]

This line specifies the range of images from the validation folder that were copied into **test_images**. All of them have to be named **imageN.jpg** (image1.jpg, image2.jpg, and so on). Because the upper bound of `range` is exclusive, images 1 to 5 require **range(1, 5+1)**. In the script above, that upper bound is computed from the number of files in the folder.
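Because the number of images is derived from the file count, a stray file or a wrongly named image leads to a path that does not exist. The sketch below rebuilds the same list of paths and also reports the missing ones. The helper name `numbered_image_paths` is hypothetical and not part of the course scripts.

```python
import os


def numbered_image_paths(images_dir):
    """Build image1.jpg..imageN.jpg paths, N being the number of files in images_dir."""
    file_count = len([
        name for name in os.listdir(images_dir)
        if os.path.isfile(os.path.join(images_dir, name))
    ])
    # range(1, N + 1) yields 1..N because the upper bound is exclusive
    paths = [
        os.path.join(images_dir, "image{}.jpg".format(i))
        for i in range(1, file_count + 1)
    ]
    # Paths that were generated but do not exist point to a naming problem
    missing = [path for path in paths if not os.path.isfile(path)]
    return paths, missing
```

An empty `missing` list means that every file in the folder matches the expected naming.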

Now, you can execute the script:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# Run the validation script with the learned model and the test_images folder
roscd tf_unit1_pkg
pwd_now=$(pwd)
python ./scripts/validate_learning.py $pwd_now $pwd_now"/learned_model" $pwd_now"/models/research/object_detection/test_images"

You now just have to go to the **Graphical Tools**, where the **validation images** appear one by one with their detections. Close the current image to make the next one appear. As you may see, not all of them are correctly classified. That is where adding more images and different modules comes into play, but we won't cover that in this basic course.

<img src="img/font-awesome_desktop.png">
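Since the validation images copied in Step 8 contain no Mira, any confident detection on them is a false positive. The sketch below lists the images that have at least one such detection. It assumes that you collect the dictionaries returned by `run_inference_for_single_image` per image name yourself; the helper name `images_with_detections` and the 0.5 threshold are arbitrary choices for illustration.

```python
def images_with_detections(outputs_by_image, min_score=0.5):
    """Return names of images with at least one detection scoring min_score or more."""
    flagged = []
    for image_name, output_dict in sorted(outputs_by_image.items()):
        # Only the first num_detections entries are valid detections
        count = output_dict['num_detections']
        scores = [float(s) for s in output_dict['detection_scores'][:count]]
        if any(score >= min_score for score in scores):
            flagged.append(image_name)
    return flagged


# Hand-made example data with the same keys as the real output dictionaries
example_outputs = {
    'image1.jpg': {'num_detections': 2, 'detection_scores': [0.91, 0.12]},
    'image2.jpg': {'num_detections': 1, 'detection_scores': [0.20]},
}
print(images_with_detections(example_outputs))
```

Plain lists are used in the example, but the same indexing and slicing works for the NumPy arrays that the real script produces.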

<p style="background:#EE9023;color:white;">**Exercise 3.1**</p>

Try it with your own uploaded validation images. See whether it recognises people, and how it behaves with other robots, animals or objects.

<p style="background:#EE9023;color:white;">**END Exercise 3.1**</p>

### Step 10: Use the Trained Model in ROS

And here comes the **ROS** connection again. We want our robot, Mira, to recognise itself in the virtual world, and perhaps in the **real world** too. For that, the detections have to run in real time.

#### Step 10.1: Create the Python script that combines TensorFlow with ROS

This script is a combination of **validate_learning.py** from the previous step and **image_recognition.py** from **Unit 1**.

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**Python Program {3.5-py}: search_for_mira_robot.py** </p>

In [None]:
#!/usr/bin/env python
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
#from matplotlib import pyplot as plt
#from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

import rospkg
import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import String
from cv_bridge import CvBridge
import cv2

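# Compare versions numerically: a plain string comparison would rank '1.10.0' below '1.4.0'
from distutils.version import LooseVersion
if LooseVersion(tf.__version__) < LooseVersion('1.4.0'):
  raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')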
  
# get an instance of RosPack with the default search paths
rospack = rospkg.RosPack()
# get the file path for the package that contains the learned model

path_to_learn_pkg = rospack.get_path('learn_newobjects_tf_pkg')
research_module_path = os.path.join(path_to_learn_pkg,"scripts/models/research")
object_detection_module_path = os.path.join(research_module_path,"object_detection")
# Both folders are needed: research for 'object_detection.utils', object_detection for 'utils'
sys.path.append(research_module_path)
sys.path.append(object_detection_module_path)

#print(sys.path)

from object_detection.utils import ops as utils_ops
from utils import label_map_util

from utils import visualization_utils as vis_util

# Name of the folder that contains the trained (frozen) model.
MODEL_NAME = 'learned_model'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

scripts_module_path = os.path.join(path_to_learn_pkg,"scripts/")
final_path_to_ckpt = os.path.join(scripts_module_path,PATH_TO_CKPT)

# List of the strings that is used to add correct label for each box. In our case mira_robot
PATH_TO_LABELS = os.path.join(scripts_module_path, 'training/object-detection.pbtxt')

NUM_CLASSES = 1


detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(final_path_to_ckpt, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
    
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
      
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for a single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframing is required to translate the mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict
  

class RosTensorFlow():
    def __init__(self):
        # Flag used to process only every other frame, which lowers the processing load
        self._process_this_frame = True
        self._cv_bridge = CvBridge()

        self._sub = rospy.Subscriber('image', Image, self.callback, queue_size=1)
        self._pub = rospy.Publisher('result', String, queue_size=1)
        self.score_threshold = rospy.get_param('~score_threshold', 0.1)
        self.use_top_k = rospy.get_param('~use_top_k', 5)
        
        

    def callback(self, image_msg):
        if (self._process_this_frame):
            
            image_np = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8")
    
            # Expand dimensions since the model expects images to have shapes: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            output_dict = run_inference_for_single_image(image_np, detection_graph)
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                output_dict['detection_boxes'],
                output_dict['detection_classes'],
                output_dict['detection_scores'],
                category_index,
                instance_masks=output_dict.get('detection_masks'),
                use_normalized_coordinates=True,
                line_thickness=8)
            cv2.imshow("Image window", image_np)
            cv2.waitKey(1)
        else:
            pass
        # Toggle the flag so that every other frame is skipped
        self._process_this_frame = not self._process_this_frame
        
        
        
    def main(self):
        rospy.spin()

if __name__ == '__main__':
    rospy.init_node('search_mira_robot_node')
    tensor = RosTensorFlow()
    tensor.main()

<p style="background:#3B8F10;color:white;" id="import_pb_to_tensorboard">**END Python Program {3.5-py}: search_for_mira_robot.py** </p>
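The script above creates a `result` publisher and reads the `~score_threshold` and `~use_top_k` parameters, but the callback only shows the image. One possible use for them is sketched below: a helper that turns the detection dictionary into a single text line. The function name `summarize_detections` and the `label:score` format are assumptions made for this illustration, not part of the course code.

```python
def summarize_detections(output_dict, category_index, score_threshold=0.1, top_k=5):
    """Build a 'label:score' summary of the best detections as one string."""
    scored = []
    for i in range(output_dict['num_detections']):
        score = float(output_dict['detection_scores'][i])
        if score >= score_threshold:
            class_id = int(output_dict['detection_classes'][i])
            # category_index maps a class id to a dict that holds its name
            name = category_index.get(class_id, {}).get('name', 'unknown')
            scored.append((score, name))
    # Highest scores first, then keep at most top_k entries
    scored.sort(key=lambda item: item[0], reverse=True)
    return ", ".join(
        "{}:{:.2f}".format(name, score) for score, name in scored[:top_k]
    )
```

Inside the callback, the returned string could then be sent with something like `self._pub.publish(String(data=summary))`, using the threshold and top-k values stored in `__init__`.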

#### Step 10.2: Create the launch file for starting the SearchFor MiraRobot

Now, just create the launch file in exactly the same way as you did in Unit 1:

<p style="background:#3B8F10;color:white;" id="start_image_recognition">**Launch File {3.6-launch}: start_search_mira_robot.launch** </p>

<p style="background:#3B8F10;color:white;" id="start_image_recognition">**END Launch File {3.6-launch}: start_search_mira_robot.launch** </p>

And now, you launch it:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
# Launch the search for Mira Robot in the simulated world
roslaunch tf_unit1_pkg start_search_mira_robot.launch

You now have to go to the **Graphical Tools**:

<img src="img/font-awesome_desktop.png">

<img src="img/tensorflow_image_unit3_label1_results1.png">

<img src="img/tensorflow_image_unit3_label1_results2.png">

### Conclusions of Example:

You have probably seen that teaching this model only **ONE** thing has turned it into a generic **object** recogniser, rather than giving it the ability to differentiate between **mira_robot** and **everything else**.<br>
So, that's the next step that **you will have to do** in the following exercise, **3.2**.

<p style="background:#407EAF;color:white;">END **Example 3.1**</p><br>

<p style="background:#EE9023;color:white;">**Exercise 3.2**</p>

So, now you have to make Mira Robot **differentiate** between **mira_robots** and **other objects**.<br>
This means that you will have to train with **TWO** labels. Therefore, you will have to make all of the necessary modifications so that it trains with the label **mira_robot** and the label **object**.<br>
* To make this task less painful, we have already provided a folder with all of the images labeled with two tags. You can find them in the public git **course_tflow_image_student_data/images_2_labels**.
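As a small hint, one of the things that changes with two labels is the label map, which now needs one entry per label. The sketch below builds such a text, assuming the usual `item { id ... name ... }` layout of the object detection label maps. The helper name `build_label_map` is hypothetical. Note also that **search_for_mira_robot.py** hard-codes `NUM_CLASSES = 1`, so that value has to follow the number of labels too.

```python
def build_label_map(labels):
    """Return label-map text with one item per label, ids starting at 1."""
    items = []
    for index, label in enumerate(labels, start=1):
        # Doubled braces produce literal { and } in str.format
        items.append(
            "item {{\n  id: {}\n  name: '{}'\n}}\n".format(index, label)
        )
    return "\n".join(items)


print(build_label_map(['mira_robot', 'object']))
```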

<p style="background:#EE9023;color:white;">**END Exercise 3.2**</p>

<p style="background:green;color:white;">Solution Exercise 3.2</p>

Please try to do it by yourself, and only look at the solution if you get stuck or need some inspiration. You will learn much more if you fight for each exercise.

<img src="img/robotignite_logo_text.png" width="700" />

Follow this link to open the solutions notebook for Unit 3: [solutions_tensofrflow_images_unit3](extra_files/solutions_tensofrflow_images_unit3.ipynb)

The learning process should look something like this:

<img src="img/tensorflow_image_unit3_15hourslearning_labels2.png" width="700" />

This is a learning process of more than 15 hours. But, as you can see, after about the 8th hour, there is no significant improvement.

Your detection process should look something like this:

<img src="img/tensorflow_image_unit3_ex3-2_solution.gif" width="400" />

<p style="background:green;color:white;">END Solution Exercise 3.2</p>