# CS445 Final Project


Introduction
---


Title: Object Detection of Ships using Tensorflow

Authors: Gabriel Vigil, Reece Sharp  


---


  
Note: This project was done on Google Colab because of the ease of working together and its stronger system capabilities.  

Originally this project's scope was smaller and was intended to use only implementations from class, but it soon became apparent that the dataset warranted deep learning techniques. As a result, we opted to apply Google's TensorFlow library to this dataset and analyze the results.  
  
The benefit of this dataset is that it has only one class, which helps streamline object detection training.
  
- Use different pre-trained models (already trained on 90 types of images, including boats)
- Train a model (Possibly faster R-CNN https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN)




# Setting up Tensorflow




In [0]:
#Get tensorflow loaded in, make sure to set runtime settings to GPU
#Note: the numpy version must be changed in order for the training to correctly complete
#You may have to restart the runtime here in order to update these changes, and you can rerun to make sure everything is as it should be
!pip install numpy==1.17.4
!pip install tensorflow==1.15.2
%tensorflow_version 1.x
import tensorflow as tf
device_name = tf.test.gpu_device_name()
print(device_name)

TensorFlow 1.x selected.
/device:GPU:0


Setting up the TensorFlow Models
---

In addition to setting up TensorFlow, we need to clone the TensorFlow models repository.

Once the dataset is downloaded, it needs to be formatted. It is distributed in Pascal VOC format, but for use with TensorFlow it needs to be in TFRecord format. TensorFlow's GitHub repository includes a dataset conversion tool that was created for this purpose: https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pascal_tf_record.py

Note that this tool is written primarily for the yearly Pascal VOC datasets, so it needs to be adjusted for use with anything else.

In [0]:
!git clone https://github.com/tensorflow/models.git

Cloning into 'models'...
remote: Enumerating objects: 28, done.
remote: Counting objects: 100% (28/28), done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 34948 (delta 9), reused 12 (delta 0), pack-reused 34920
Receiving objects: 100% (34948/34948), 512.72 MiB | 31.38 MiB/s, done.
Resolving deltas: 100% (22637/22637), done.


The setup script ships with the TensorFlow models repository cloned from GitHub above. It declares the libraries required to run the models and their associated functions, and the cell below builds and installs them.

In [0]:
import os
os.chdir('/content/models/research/')
!python setup.py build
!python setup.py install

running build
running build_py
copying object_detection/protos/ssd_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/preprocessor_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/argmax_matcher_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/grid_anchor_generator_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/box_coder_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/post_processing_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/eval_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/hyperparams_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/calibration_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/pipeline_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/string_int_label_map_pb2.py -> build/lib/object_detection/p

We also need to append the location of the object detection library to the path so Python can find it, both during the TFRecord creation and later during training.

In [0]:
!echo $PYTHONPATH
os.environ['PYTHONPATH'] = "/tensorflow-1.15.2/python3.6:/env/python:/content/models/research:/content/models/research/slim"
#os.environ['PYTHONPATH'] += ":/content/models/research:/content/models/research/slim"
!echo $PYTHONPATH

/tensorflow-1.15.2/python3.6:/env/python:/content/models/research:/content/models/research/slim
/tensorflow-1.15.2/python3.6:/env/python:/content/models/research:/content/models/research/slim
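
The cell above overwrites `PYTHONPATH` with a hard-coded string. As a minimal sketch of the alternative hinted at in the commented-out line, the helper below appends entries only if they are missing. The function name `append_to_pythonpath` and the `demo_env` dictionary are illustrative and not part of the original notebook.

```python
import os

def append_to_pythonpath(paths, env=None):
    """Append each path to PYTHONPATH once, preserving existing entries."""
    env = os.environ if env is None else env
    entries = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    for path in paths:
        if path not in entries:
            entries.append(path)
    env["PYTHONPATH"] = os.pathsep.join(entries)
    return env["PYTHONPATH"]

# A plain dict stands in for os.environ so the example has no side effects.
demo_env = {"PYTHONPATH": "/env/python"}
result = append_to_pythonpath(
    ["/content/models/research", "/content/models/research/slim"], env=demo_env
)
print(result)
```

Note that changing `PYTHONPATH` this way affects processes started afterwards (such as the `!python ...` cells), not the module search path of the interpreter that is already running.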


The TensorFlow models also include a few .proto files, which are basic definition files. They need to be compiled for the given system before anything can be run. The tool for this is Google's 'protoc' (the protocol buffer compiler). It was uploaded to Google Drive to streamline the setup.
Download location: https://github.com/protocolbuffers/protobuf/releases  
File: protoc-3.10.1-win64.zip



In [0]:
!wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=1HXNY8NN8lKBJPMqGeT-uJawmJwbbT-UA' -O "/content/protoc-3.10.1-win64.zip"

--2020-05-13 00:15:30--  https://docs.google.com/uc?export=download&id=1HXNY8NN8lKBJPMqGeT-uJawmJwbbT-UA
Resolving docs.google.com (docs.google.com)... 74.125.206.102, 74.125.206.139, 74.125.206.100, ...
Connecting to docs.google.com (docs.google.com)|74.125.206.102|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0g-64-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/slmaf0u2rfnhvv23lkmeuk3i7tumi2ah/1589328900000/03054563583731495869/*/1HXNY8NN8lKBJPMqGeT-uJawmJwbbT-UA?e=download [following]
--2020-05-13 00:15:31--  https://doc-0g-64-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/slmaf0u2rfnhvv23lkmeuk3i7tumi2ah/1589328900000/03054563583731495869/*/1HXNY8NN8lKBJPMqGeT-uJawmJwbbT-UA?e=download
Resolving doc-0g-64-docs.googleusercontent.com (doc-0g-64-docs.googleusercontent.com)... 64.233.167.132, 2a00:1450:400c:c0a::84
Connecting to doc-0g-64-docs.googleusercontent.com (doc-0g-64-d

In [0]:
!unzip -q '/content/protoc-3.10.1-win64.zip'

replace include/google/protobuf/type.proto? [y]es, [n]o, [A]ll, [N]one, [r]ename: A


In [0]:
!pwd
#!protoc object_detection/protos/string_int_label_map.proto --python_out=.
!protoc object_detection/protos/*.proto --python_out=.

/content/models/research
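
The `protoc` call above relies on the shell expanding `*.proto`. For illustration only, the same command line can be assembled in Python, which makes the list of compiled files explicit. The helper name `build_protoc_command` is hypothetical, and the sketch only builds the argument list; it does not run the compiler.

```python
import glob
import os

def build_protoc_command(research_dir, protoc="protoc"):
    """Return the argv list that compiles every object_detection .proto file."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    proto_files = sorted(glob.glob(pattern))
    # Paths are made relative to research_dir because the command is meant
    # to be run from that directory, as in the cell above.
    relative = [os.path.relpath(p, research_dir) for p in proto_files]
    return [protoc, *relative, "--python_out=."]

command = build_protoc_command("/content/models/research")
print(command)
# To execute it: subprocess.run(command, cwd="/content/models/research", check=True)
```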


This is a generalized check just to make sure all of the libraries are where they need to be for use with the Tensorflow models. 

In [0]:
os.chdir('/content/models/research/object_detection/builders/')

In [0]:
!python model_builder_test.py

# Setting Up The Dataset


Now the dataset needs to be imported from Kaggle. The dataset is found here: https://www.kaggle.com/tomluther/ships-in-google-earth. We can use Google Colab's filesystem and Kaggle's public API to download a dataset directly into the working directory.



In [0]:
# Colab library to upload files to notebook
os.chdir('/content/')
from google.colab import files

# Install Kaggle library
!pip install -q kaggle

In [0]:
# Make a root dir for the kaggle library to find the public api key
!mkdir /content/.kaggle

import json

token = {"username":"reecesharp","key":"<KAGGLE_API_KEY>"}  # key redacted; paste your own API key here

with open('/content/.kaggle/kaggle.json', 'w') as file:
    json.dump(token, file)

mkdir: cannot create directory ‘/content/.kaggle’: File exists


In [0]:
# Moving the file to the correct dir on google colab, and changing
# permissions so we're the only ones that can see it
!cp /content/.kaggle/kaggle.json ~/.kaggle/kaggle.json

In [0]:
!kaggle config set -n path -v{/content}

- path is now set to: {/content}


In [0]:
!chmod 600 /root/.kaggle/kaggle.json
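
The credential steps above (write `kaggle.json`, copy it, restrict its permissions) can be collapsed into one helper. This is a sketch under assumptions rather than the notebook's method: the function name `write_kaggle_credentials` is made up, and the demo writes placeholder values into a temporary directory instead of `~/.kaggle`.

```python
import json
import os
import stat
import tempfile

def write_kaggle_credentials(config_dir, username, key):
    """Write kaggle.json into config_dir, readable and writable by the owner only."""
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    # Create the file with mode 0o600 from the start instead of chmod-ing later.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump({"username": username, "key": key}, handle)
    # Also tighten the mode in case the file already existed with looser bits.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path

# Demo with placeholder values in a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_path = write_kaggle_credentials(demo_dir, "example-user", "example-key")
print(demo_path, oct(os.stat(demo_path).st_mode & 0o777))
```

In a real session the directory would be `os.path.expanduser("~/.kaggle")`, and the key would be supplied at run time rather than stored in the notebook.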

In [0]:
# Download data for the ship dataset
!kaggle datasets download -d tomluther/ships-in-google-earth -p /content

ships-in-google-earth.zip: Skipping, found more recently modified local copy (use --force to force download)


In [0]:
!unzip /content/ships-in-google-earth.zip

Archive:  /content/ships-in-google-earth.zip
  inflating: tl_data/.floyddata      
  inflating: tl_data/.floydexpt      
  inflating: tl_data/.floydignore    
  inflating: tl_data/test/ImageSets/Layout/test.txt  
  inflating: tl_data/test/ImageSets/Main/boat_test.txt  
  inflating: tl_data/test/ImageSets/Main/test.txt  
  inflating: tl_data/test/ImageSets/Segmentation/test.txt  
  inflating: tl_data/test/JPEGImages/Test1.jpg  
  inflating: tl_data/test/JPEGImages/Test10.jpg  
  inflating: tl_data/test/JPEGImages/Test100.jpg  
  inflating: tl_data/test/JPEGImages/Test11.jpg  
  inflating: tl_data/test/JPEGImages/Test12.jpg  
  inflating: tl_data/test/JPEGImages/Test13.jpg  
  inflating: tl_data/test/JPEGImages/Test14.jpg  
  inflating: tl_data/test/JPEGImages/Test15.jpg  
  inflating: tl_data/test/JPEGImages/Test16.jpg  
  inflating: tl_data/test/JPEGImages/Test17.jpg  
  inflating: tl_data/test/JPEGImages/Test18.jpg  
  inflating: tl_data/test/JPEGImages/Test19.jpg  
  inflating: tl_da

Dataset FileSystem  
---

- ships-in-google-earth  
    - tl_data  
        - test 
            - annotations  
            - ImageSets  
            - JPEGImages  
        - tl_data (Note: this folder is redundant; it is a copy of the parent folder, which makes the archive twice as large for no reason)  
        - training
            - annotations  
            - ImageSets  
            - JPEGImages  

Annotation Fix
---

The 'folder' values set by the dataset author in the annotation .xml files are incorrect. They should reflect the folder structure of the project, that is, where the .jpg belonging to each .xml annotation is found. At the moment they contain values like 'GoogleEarth', 'GoogleEarth2', 'GM', and 'untitled_folder'. Since the .jpg files are always found in the same folder, the value should be standardized to a single name. A script was written to standardize both the training and test annotation files so they can be associated with the correct .jpg files.

In [0]:
#Switch to the location of the object detection model information
os.chdir('/content/models/research/object_detection')

In [0]:
!cat /content/tl_data/training/annotations/GE_1.xml

<annotation>
    <folder>GoogleEarth</folder>
    <filename>GE_1.jpg</filename>
    <size>
        <width>4800</width>
        <height>2908</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>boat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        
        <difficult>0</difficult>
        <bndbox>
            <xmin>2538</xmin>
            <ymin>1090</ymin>
            <xmax>2915</xmax>
            <ymax>1294</ymax>
        </bndbox>
    </object>
    <object>
        <name>boat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        
        <difficult>0</difficult>
        <bndbox>
            <xmin>2508</xmin>
            <ymin>1467</ymin>
            <xmax>2808</xmax>
            <ymax>1606</ymax>
        </bndbox>
    </object>
</annotation>

In [0]:
import xml.etree.ElementTree as ET

for partition in ['training', 'test']:
    print("Editing: ", partition)
    directory = '/content/tl_data/' + partition +'/annotations'
    for file in os.listdir(directory):
        path = os.path.join(directory, file)
        #print(path)
        tree = ET.parse(path)
        root = tree.getroot()
        for folder in root.findall('folder'):
            print("Before:", folder.text, "| After: ", end="")
            folder.text = 'JPEGImages'
            print(folder.text)
        tree.write(path)


Editing:  training
Before: GE | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: GoogleEarth2 | After: JPEGImages
Before: GE | After: JPEGImages
Before: GoogleEarth | After: JPEGImages
Before: GE | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: GE | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: GoogleEarth | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: GoogleEarth | After: JPEGImages
Before: GE | After: JPEGImages
Before: GoogleEarth2 | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: GoogleEarth2 | After: JPEGImages
Before: GE | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: untitled_folder | After: JPEGImages
Before: GE | After: JPEGImages
Before: GE | After: JPEGImages
Before: GE | After: JPEGImages
Before: untitled_fo

In [0]:
!cat /content/tl_data/training/annotations/GE_1.xml

<annotation>
    <folder>JPEGImages</folder>
    <filename>GE_1.jpg</filename>
    <size>
        <width>4800</width>
        <height>2908</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>boat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        
        <difficult>0</difficult>
        <bndbox>
            <xmin>2538</xmin>
            <ymin>1090</ymin>
            <xmax>2915</xmax>
            <ymax>1294</ymax>
        </bndbox>
    </object>
    <object>
        <name>boat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        
        <difficult>0</difficult>
        <bndbox>
            <xmin>2508</xmin>
            <ymin>1467</ymin>
            <xmax>2808</xmax>
            <ymax>1606</ymax>
        </bndbox>
    </object>
</annotation>

This is part of the issue with the formatting of the dataset: with the 'folder' value standardized to 'JPEGImages', the path the conversion script is going to search is '/content/tl_data/{training, test}/JPEGImages/JPEGImages/*.jpg'. The parent 'JPEGImages' directory must match the value set in the script above; the child directory will always be 'JPEGImages'.
  
This double directory then allows create_pascal_tf_record to correctly serialize the dataset into a single file.

In [0]:
!mkdir /content/tl_data/training/JPEGImages/JPEGImages
!mv /content/tl_data/training/JPEGImages/GE* /content/tl_data/training/JPEGImages/JPEGImages/

In [0]:
!mkdir /content/tl_data/test/JPEGImages/JPEGImages
!mv /content/tl_data/test/JPEGImages/Test* /content/tl_data/test/JPEGImages/JPEGImages/
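
The nested directory follows from the order in which the path pieces are joined, as described above. As a small illustration (the function name `expected_image_path` is hypothetical, and the join order is taken from the description in this notebook rather than from the modified conversion script itself):

```python
import posixpath

def expected_image_path(data_dir, folder, filename, image_subdirectory="JPEGImages"):
    """Join the pieces in the described order: data_dir / folder / subdirectory / file."""
    return posixpath.join(data_dir, folder, image_subdirectory, filename)

print(expected_image_path("/content/tl_data/training", "JPEGImages", "GE_1.jpg"))
# -> /content/tl_data/training/JPEGImages/JPEGImages/GE_1.jpg
```

The first 'JPEGImages' comes from the annotation's 'folder' value and the second from the fixed image subdirectory, which is why the images are moved one level deeper.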

The model works with numbers, so a basic label map is used to convert those ID values to their class names.

In [0]:
label_map =  """
item {
  id: 1
  name: 'boat'
}
"""
with open("/content/label_map.pbtxt", "w") as text_file:
    text_file.write(label_map)
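
To make the mapping concrete, here is a minimal sketch that reads a label map of the shape written above into a `{name: id}` dictionary. It is a simple regular-expression reader for illustration, not the parser the Object Detection library uses, and `parse_label_map` is a made-up name.

```python
import re

def parse_label_map(text):
    """Return {name: id} for each `item { id: N  name: '...' }` block."""
    mapping = {}
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        id_match = re.search(r"\bid\s*:\s*(\d+)", block)
        name_match = re.search(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", block)
        if id_match and name_match:
            mapping[name_match.group(1)] = int(id_match.group(1))
    return mapping

label_map_text = """
item {
  id: 1
  name: 'boat'
}
"""
print(parse_label_map(label_map_text))  # -> {'boat': 1}
```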

Generating TFRecords from Pascal VOC
---

After correcting the details in the raw dataset and creating a basic label_map for the model to use, the dataset can now be converted from Pascal VOC to TFRecord.

Note: The conversion script is written for the Pascal VOC challenge datasets, so it needs to be slightly adapted for use with other training datasets, even if they follow the same format described here: http://host.robots.ox.ac.uk/pascal/VOC/
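
As a rough illustration of what the conversion reads from each annotation, the sketch below parses a Pascal VOC file and scales the pixel bounding boxes by the image size, giving coordinates in the range 0 to 1. It covers only this one step and does not write a TFRecord; the function name `normalized_boxes` is an assumption, and the sample XML is a trimmed copy of GE_1.xml.

```python
import xml.etree.ElementTree as ET

def normalized_boxes(xml_text):
    """Return [(name, xmin, ymin, xmax, ymax)] with coordinates scaled to [0, 1]."""
    root = ET.fromstring(xml_text)
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.find("name").text,
            float(box.find("xmin").text) / width,
            float(box.find("ymin").text) / height,
            float(box.find("xmax").text) / width,
            float(box.find("ymax").text) / height,
        ))
    return boxes

sample = """<annotation>
  <size><width>4800</width><height>2908</height><depth>3</depth></size>
  <object>
    <name>boat</name>
    <bndbox><xmin>2538</xmin><ymin>1090</ymin><xmax>2915</xmax><ymax>1294</ymax></bndbox>
  </object>
</annotation>"""
for box in normalized_boxes(sample):
    print(box)
```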

In [0]:
#update file for training #https://drive.google.com/open?id=17W-UK7Le7qdgAoXP6b6D5w9FxUQINoKi
!wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=17W-UK7Le7qdgAoXP6b6D5w9FxUQINoKi' -O "/content/models/research/object_detection/dataset_tools/create_pascal_tf_record.py"

--2020-05-13 00:17:49--  https://docs.google.com/uc?export=download&id=17W-UK7Le7qdgAoXP6b6D5w9FxUQINoKi
Resolving docs.google.com (docs.google.com)... 74.125.206.139, 74.125.206.101, 74.125.206.100, ...
Connecting to docs.google.com (docs.google.com)|74.125.206.139|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0o-64-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/je25ouppuseggfgv0a95b2q4e24f7aud/1589329050000/03054563583731495869/*/17W-UK7Le7qdgAoXP6b6D5w9FxUQINoKi?e=download [following]
--2020-05-13 00:17:49--  https://doc-0o-64-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/je25ouppuseggfgv0a95b2q4e24f7aud/1589329050000/03054563583731495869/*/17W-UK7Le7qdgAoXP6b6D5w9FxUQINoKi?e=download
Resolving doc-0o-64-docs.googleusercontent.com (doc-0o-64-docs.googleusercontent.com)... 64.233.167.132, 2a00:1450:400c:c0a::84
Connecting to doc-0o-64-docs.googleusercontent.com (doc-0o-64-d

In [0]:
# Convert raw training xml data to TFRecords
!python dataset_tools/create_pascal_tf_record.py --data_dir=/content/tl_data/training --output_path=/content/pascal_train.record



W0513 00:19:27.985458 140510598010752 module_wrapper.py:139] From dataset_tools/create_pascal_tf_record.py:159: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

I0513 00:19:27.990636 140510598010752 create_pascal_tf_record.py:164] Reading from PASCAL VOC2007 dataset.

W0513 00:19:27.991322 140510598010752 module_wrapper.py:139] From /content/models/research/object_detection/utils/dataset_util.py:62: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

I0513 00:19:27.993096 140510598010752 create_pascal_tf_record.py:176] On image 0 of 555
  if not xml:
I0513 00:19:34.269020 140510598010752 create_pascal_tf_record.py:176] On image 100 of 555
I0513 00:19:34.379502 140510598010752 create_pascal_tf_record.py:176] On image 200 of 555
I0513 00:19:34.493955 140510598010752 create_pascal_tf_record.py:176] On image 300 of 555
I0513 00:19:34.622373 140510598010752 create_pascal_tf_record.py:176] On image 400 of 555
I0513 00:1

In [0]:
#update file for test #https://drive.google.com/open?id=1wwrKs88gn5ynlMPrq0doskKkBz1p4a9v
!wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=1wwrKs88gn5ynlMPrq0doskKkBz1p4a9v' -O "/content/models/research/object_detection/dataset_tools/create_pascal_tf_record.py"

--2020-05-13 00:19:41--  https://docs.google.com/uc?export=download&id=1wwrKs88gn5ynlMPrq0doskKkBz1p4a9v
Resolving docs.google.com (docs.google.com)... 74.125.206.100, 74.125.206.138, 74.125.206.102, ...
Connecting to docs.google.com (docs.google.com)|74.125.206.100|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0k-64-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/1rlfhifqmnhmogaa8d2nbqq3t51ocibu/1589329125000/03054563583731495869/*/1wwrKs88gn5ynlMPrq0doskKkBz1p4a9v?e=download [following]
--2020-05-13 00:19:41--  https://doc-0k-64-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/1rlfhifqmnhmogaa8d2nbqq3t51ocibu/1589329125000/03054563583731495869/*/1wwrKs88gn5ynlMPrq0doskKkBz1p4a9v?e=download
Resolving doc-0k-64-docs.googleusercontent.com (doc-0k-64-docs.googleusercontent.com)... 64.233.167.132, 2a00:1450:400c:c0a::84
Connecting to doc-0k-64-docs.googleusercontent.com (doc-0k-64-d

In [0]:
#Convert raw test xml data to TFRecords
!python dataset_tools/create_pascal_tf_record.py --data_dir=/content/tl_data/test --output_path=/content/pascal_test.record



W0513 00:19:53.856768 140719410993024 module_wrapper.py:139] From dataset_tools/create_pascal_tf_record.py:159: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

I0513 00:19:53.859265 140719410993024 create_pascal_tf_record.py:164] Reading from PASCAL VOC2007 dataset.

W0513 00:19:53.859947 140719410993024 module_wrapper.py:139] From /content/models/research/object_detection/utils/dataset_util.py:62: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

I0513 00:19:53.860495 140719410993024 create_pascal_tf_record.py:176] On image 0 of 100
  if not xml:


In [0]:
# SSD with Mobilenet v1 configuration
#adapted from https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config
config = """
model {
  ssd {
    num_classes:1  #number of classes to be trained. in my case 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}
train_config: {
  batch_size: 10
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  #num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "/content/pascal_train.record"
  }
  label_map_path: "/content/label_map.pbtxt" #"training/object-detection.pbtxt" #
}
eval_config: {
    # (Optional): Uncomment the line below if you installed the Coco evaluation tools
    # and you want to also run evaluation
    # metrics_set: "coco_detection_metrics"
    # (Optional): Set this to the number of images in your <PATH_TO_IMAGES_FOLDER>/train
    # if you want to also run evaluation
    num_examples: 694
    # Note: The below line limits the evaluation process to 10 evaluations.
    # Remove the below line to evaluate indefinitely.
    max_evals: 10
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "/content/pascal_test.record"
  }
  label_map_path: "/content/label_map.pbtxt" #"training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
  }
  """
#should be written to /content/models/research/object_detection
with open("ssd_mobilenet_v1_coco.config", "w") as text_file:
    text_file.write(config)
print("Written to: ")
!pwd

Written to: 
/content/models/research/object_detection
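
The `exponential_decay_learning_rate` block in the config can be checked with a little arithmetic. The sketch below assumes the schedule is `initial_learning_rate * decay_factor ** (step / decay_steps)`, with the exponent rounded down when staircase decay is on; treating staircase as the default is an assumption here, and the function name is illustrative.

```python
def exponential_decay(step, initial_lr=0.004, decay_steps=800720,
                      decay_factor=0.95, staircase=True):
    """Learning rate after `step` training steps under exponential decay."""
    # Staircase decay only drops the rate once every `decay_steps` steps.
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_factor ** exponent

for step in (0, 25558, 800720):
    print(step, exponential_decay(step), exponential_decay(step, staircase=False))
```

Either way, with `decay_steps: 800720` the learning rate is effectively constant over a run of about 25k steps.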


This downloads the pre-trained model used as the starting point for training, since training a model from scratch would take far longer than fine-tuning a working one.

In [0]:
!wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
!tar -xvf ssd_mobilenet_v1_coco_11_06_2017.tar.gz

--2020-05-13 00:20:12--  http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
Resolving download.tensorflow.org (download.tensorflow.org)... 64.233.167.128, 2a00:1450:400c:c0a::80
Connecting to download.tensorflow.org (download.tensorflow.org)|64.233.167.128|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 128048406 (122M) [application/x-tar]
Saving to: ‘ssd_mobilenet_v1_coco_11_06_2017.tar.gz’


2020-05-13 00:20:13 (180 MB/s) - ‘ssd_mobilenet_v1_coco_11_06_2017.tar.gz’ saved [128048406/128048406]

ssd_mobilenet_v1_coco_11_06_2017/
ssd_mobilenet_v1_coco_11_06_2017/model.ckpt.index
ssd_mobilenet_v1_coco_11_06_2017/model.ckpt.meta
ssd_mobilenet_v1_coco_11_06_2017/frozen_inference_graph.pb
ssd_mobilenet_v1_coco_11_06_2017/model.ckpt.data-00000-of-00001
ssd_mobilenet_v1_coco_11_06_2017/graph.pbtxt


OPTIONAL: The following cells provide the TensorBoard visualization of training in a Google Colab setting.



In [0]:
pip install tensorboardcolab



In [0]:
%load_ext tensorboard

In [0]:
%tensorboard --logdir /content/tl_data/training/

# Using The Outdated Training Function?

Using TensorFlow's old training function to train the model is workable, but it is now regarded as legacy. We ran into NaN errors, which could be avoided by adding a small constant (1e-8). We made no attempt to do this and instead moved to the current training function.
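
The general idea behind the small-constant fix can be shown in isolation. This sketch is not taken from the legacy training code, and where exactly the constant would need to be added there is not covered here; `safe_log` is a hypothetical helper showing how an offset of 1e-8 keeps a logarithm finite at zero.

```python
import math

EPSILON = 1e-8  # the small constant mentioned above

def safe_log(value, eps=EPSILON):
    """Logarithm that stays finite when value is exactly zero."""
    return math.log(value + eps)

print(safe_log(0.0))  # finite, roughly -18.42
print(safe_log(0.5))
```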

In [0]:
#!cp /content/models/research/object_detection/legacy/train.py /content/models/research/object_detection/

In [0]:
#os.chdir('/content/models/research/object_detection/')
#!python train.py --logtostderr --train_dir=/content/tl_data/training/ --pipeline_config_path=/content/models/research/object_detection/ssd_mobilenet_v1_coco.config

# Using The Current Training Function

In this case the up-to-date training method was used to train the model. This training function was more efficient with memory allocation and allowed a larger batch size than the legacy training method.

# Train + Output (First Run)

Now that both TensorFlow and its models are set up, along with a correctly configured dataset, the only thing left to do is train the model.

After allowing this to run for around 25k steps, the model will be more specialized for our dataset. The next step is to package the model and run it on our test set to see how well it did.

In [0]:
!python model_main.py --logtostderr --train_dir=/content/tl_data/training/ --pipeline_config_path=/content/models/research/object_detection/ssd_mobilenet_v1_coco.config

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



W0512 11:15:45.707380 140495344437120 module_wrapper.py:139] From /content/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.



W0512 11:15:45.711732 140495344437120 model_lib.py:629] Forced number of epochs for all eval validations to be 1.

W0512 11:15:45.711966 140495344437120 module_wrapper.py:139] From /content/models/research/object_detection/utils/config_util.py:488: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

INFO:tensorflow:Maybe overwriting train_steps: None
I0512 11:15:45.712178 1404953444

In [0]:
!mkdir /content/checkpoints

In [0]:
#Note: this is a temp dir with a different name for each training session; could change the path to /tmp/* to grab it
!cp  /tmp/tmprokhuwal/* /content/checkpoints/ -r
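
The export step needs a checkpoint prefix such as `model.ckpt-25558`, and the step number differs from run to run. As a sketch, the highest-numbered checkpoint in a directory can be found by scanning for `model.ckpt-N.index` files. The helper name `latest_checkpoint_prefix` is made up for this example; TensorFlow's own `tf.train.latest_checkpoint` serves a similar purpose when the directory contains its `checkpoint` state file.

```python
import os
import re

def latest_checkpoint_prefix(checkpoint_dir):
    """Return the `model.ckpt-N` prefix with the highest step N, or None if absent."""
    steps = []
    for name in os.listdir(checkpoint_dir):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return os.path.join(checkpoint_dir, "model.ckpt-{}".format(max(steps)))

# Only runs where the directory from the cell above exists.
if os.path.isdir("/content/checkpoints"):
    print(latest_checkpoint_prefix("/content/checkpoints"))
```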

In [0]:
!python export_inference_graph.py --input_type image_tensor --pipeline_config_path ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix /content/checkpoints/model.ckpt-25558 --output_directory trained_inference_graph/

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



W0512 16:14:15.038335 140581669259136 module_wrapper.py:139] From export_inference_graph.py:145: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.


W0512 16:14:15.046457 140581669259136 module_wrapper.py:139] From /content/models/research/object_detection/exporter.py:402: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.


W0512 16:14:15.046749 140581669259136 module_wrapper.py:139] From /content/models/research/object_detection/exporter.py:121: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.


W0512 16:14:15.086095 140581669259136

In [0]:
!zip -r boat_finder.zip trained_inference_graph

  adding: trained_inference_graph/ (stored 0%)
  adding: trained_inference_graph/frozen_inference_graph.pb (deflated 9%)
  adding: trained_inference_graph/saved_model/ (stored 0%)
  adding: trained_inference_graph/saved_model/saved_model.pb (deflated 9%)
  adding: trained_inference_graph/saved_model/variables/ (stored 0%)
  adding: trained_inference_graph/pipeline.config (deflated 69%)
  adding: trained_inference_graph/model.ckpt.meta (deflated 93%)
  adding: trained_inference_graph/checkpoint (deflated 42%)
  adding: trained_inference_graph/model.ckpt.index (deflated 67%)
  adding: trained_inference_graph/model.ckpt.data-00000-of-00001 (deflated 7%)


In [0]:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util



### Model preparation variable
MODEL_NAME = 'trained_inference_graph'
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_LABELS = '/content/label_map.pbtxt'
#PATH_TO_LABELS = 'training/object-detection.pbtxt'
NUM_CLASSES = 1 # single class ('boat'), matching num_classes in the training config


### Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')


###Loading label map
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

### Load image into numpy function
def load_image_into_numpy_array(image):
  # Convert to RGB first so grayscale or RGBA images still give a (height, width, 3) uint8 array
  return np.array(image.convert('RGB'), dtype=np.uint8)

###STATING THE PATH TO IMAGES TO BE TESTED
PATH_TO_TEST_IMAGES_DIR = '/content/tl_data/test/JPEGImages/JPEGImages/'
#TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 4) ]
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'Test{}.jpg'.format(i)) for i in range(1, 100) ]
IMAGE_SIZE = (12, 8)

### Function to run inference on a single image which will later be used in an iteration
def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[1], image.shape[2])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: image})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.int64)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict



### Iterate over each image in the test image paths defined above
### NB: the upper bound of range() in TEST_IMAGE_PATHS must equal the number of images in the test folder + 1
for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np_expanded, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=1)
  display(Image.fromarray(image_np))

#Train + Output (Second Run)

In [0]:
!python model_main.py --logtostderr --train_dir=/content/tl_data/training/ --pipeline_config_path=/content/models/research/object_detection/ssd_mobilenet_v1_coco.config



W0513 00:20:37.223889 140251405678464 module_wrapper.py:139] From /content/models/research/object_detection/utils/config_util.py:137: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.



W0513 00:20:37.227893 140251405678464 model_lib.py:686] Forced number of epochs for all eval validations to be 1.

W0513 00:20:37.228059 140251405678464 module_wrapper.py:139] From /content/models/research/object_detection/utils/config_util.py:523: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

INFO:tensorflow:Maybe overwriting train_steps: None
I0513 00:20:37.228179 140251405678464 config_util.py:523] Maybe overwriting train_steps: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0513 00:20:37.228306 140251405678464 config_util.py:523] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0513 00:20:37.228417 140251405678464 config_util.py:523] Maybe overwriting sample_1_of

In [0]:
!mkdir /content/checkpoints

In [0]:
#Note: this is a temp dir with a different name for each training session, could change to /tmp/* to grab it
!cp  /tmp/tmprokhuwal/* /content/checkpoints/ -r
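
Because the temp directory name changes with every training session, the newest checkpoint can also be located programmatically instead of copying a hard-coded path. The following is a minimal sketch using only the standard library; the function name `latest_checkpoint_prefix` and the default search root are assumptions made for illustration, not part of the original notebook.

```python
import glob
import os
import re


def latest_checkpoint_prefix(search_root="/tmp"):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None.

    Hypothetical helper: it scans search_root recursively for checkpoint
    index files, which are named 'model.ckpt-<step>.index'.
    """
    pattern = os.path.join(search_root, "**", "model.ckpt-*.index")
    best_step, best_prefix = -1, None
    for index_path in glob.glob(pattern, recursive=True):
        match = re.search(r"model\.ckpt-(\d+)\.index$", index_path)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            # Drop the '.index' suffix to get the checkpoint prefix
            best_prefix = index_path[:-len(".index")]
    return best_prefix


if __name__ == "__main__":
    print(latest_checkpoint_prefix("/tmp"))
```

The returned prefix is the kind of value that `--trained_checkpoint_prefix` expects in the export step.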

In [0]:
!python export_inference_graph.py --input_type image_tensor --pipeline_config_path ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix /content/checkpoints/model.ckpt-7738 --output_directory trained_inference_graph/



W0513 01:44:31.307078 140642665944960 module_wrapper.py:139] From export_inference_graph.py:145: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.


W0513 01:44:31.314666 140642665944960 module_wrapper.py:139] From /content/models/research/object_detection/exporter.py:419: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.


W0513 01:44:31.314935 140642665944960 module_wrapper.py:139] From /content/models/research/object_detection/exporter.py:138: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.


W0513 01:44:31.354599 140642665944960 module_wrapper.py:139] From /content/models/research/object_detection/core/preprocessor.py:3030: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.


W0513 01:44:31.388779 140642665944960 module_wrapper.py:139] From /content/models/research/object_detection/meta_architectures/ssd_meta_arch.py:600: The name tf.GraphKeys is depreca

In [0]:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util



### Model preparation variable
MODEL_NAME = 'trained_inference_graph'
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_LABELS = '/content/label_map.pbtxt'
#PATH_TO_LABELS = 'training/object-detection.pbtxt'
NUM_CLASSES = 1  # the dataset has a single object class (ships)


### Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')


###Loading label map
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

### Load image into numpy function
def load_image_into_numpy_array(image):
  # Convert to RGB first so grayscale or RGBA images still give a (height, width, 3) uint8 array
  return np.array(image.convert('RGB'), dtype=np.uint8)

###STATING THE PATH TO IMAGES TO BE TESTED
PATH_TO_TEST_IMAGES_DIR = '/content/tl_data/test/JPEGImages/JPEGImages/'
#TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 4) ]
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'Test{}.jpg'.format(i)) for i in range(1, 100) ]
IMAGE_SIZE = (12, 8)

### Function to run inference on a single image which will later be used in an iteration
def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[1], image.shape[2])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: image})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.int64)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict



### Iterate over each image in the test image paths defined above
### NB: the upper bound of range() in TEST_IMAGE_PATHS must equal the number of images in the test folder + 1
for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np_expanded, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=1)
  display(Image.fromarray(image_np))
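
The `output_dict` built in the cell above holds boxes, classes, and scores for every candidate detection, including low-confidence ones. As a minimal sketch, assuming the same dictionary keys, the detections can be filtered by score before any further analysis. The helper name `filter_detections` and the 0.5 default are assumptions chosen for illustration.

```python
import numpy as np


def filter_detections(output_dict, min_score=0.5):
    """Keep only detections whose score is at least min_score.

    Hypothetical helper; expects the keys produced by
    run_inference_for_single_image (masks are ignored here).
    """
    n = int(output_dict['num_detections'])
    scores = np.asarray(output_dict['detection_scores'])[:n]
    keep = scores >= min_score
    return {
        'num_detections': int(keep.sum()),
        'detection_boxes': np.asarray(output_dict['detection_boxes'])[:n][keep],
        'detection_classes': np.asarray(output_dict['detection_classes'])[:n][keep],
        'detection_scores': scores[keep],
    }


if __name__ == "__main__":
    # Made-up values, only to show the shape of the input
    example = {
        'num_detections': 3,
        'detection_boxes': np.array([[0.1, 0.1, 0.4, 0.4],
                                     [0.5, 0.5, 0.9, 0.9],
                                     [0.0, 0.0, 0.2, 0.2]]),
        'detection_classes': np.array([1, 1, 1]),
        'detection_scores': np.array([0.92, 0.31, 0.66]),
    }
    print(filter_detections(example, min_score=0.5))
```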

Regarding the output of this model, which is just one variation of many object detectors, it performed fairly well. Most of the test dataset was correctly analyzed by the model.

#Results

Our tests of a TensorFlow object detection network yielded interesting results. We trained two networks with slightly different hyperparameters and, surprisingly, the two networks produced noticeably different detections on the same data.

Our first network (N1) was trained for 25000 steps with a batch size of 10 and a starting learning rate of .004. Both networks used the same decay schedule, which lowers the learning rate as the step count increases. The second network (N2) was trained for far fewer steps, only 8000, but with a larger batch size of 12 and a smaller starting learning rate of .003.
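
To illustrate what a learning rate that decreases with the step count looks like, here is a minimal sketch of a staircase exponential decay. The exact schedule used in training is set in the pipeline config and is not reproduced here, so `decay_steps` and `decay_factor` below are placeholder assumptions; only the starting rates .004 and .003 come from the runs described above.

```python
def decayed_learning_rate(initial_lr, step, decay_steps, decay_factor):
    """Staircase exponential decay: multiply by decay_factor every decay_steps steps."""
    return initial_lr * decay_factor ** (step // decay_steps)


if __name__ == "__main__":
    # decay_steps and decay_factor are hypothetical values for illustration only
    for name, initial_lr, total_steps in (("N1", 0.004, 25000), ("N2", 0.003, 8000)):
        final_lr = decayed_learning_rate(initial_lr, total_steps,
                                         decay_steps=1000, decay_factor=0.95)
        print(name, "start:", initial_lr, "end:", final_lr)
```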

The results are given as Average Precision and Average Recall, reported as fractions between 0 and 1. Average precision is how often a guess by the network is correct, and average recall is how many of the boats actually present the network correctly found. Both depend on the IoU (intersection over union), which measures how much the network's guessed box overlaps the actual correct box. At IoU=.5, a guess is considered correct if its IoU is .5 or above. IoU=.5:.95 is the average of the metric over IoU thresholds from .5 to .95 in steps of .05. maxDets is how many guesses the network was allowed to make per image. We will mostly be paying attention to the IoU values of .5 and .75 with maxDets at 100 and Area being all.
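
The definitions above can be sketched in a few lines of Python. This is a simplified illustration for a single image and a single IoU threshold, not the COCO evaluation that produced the table (which also ranks guesses by score and averages over thresholds). The function names and the `(ymin, xmin, ymax, xmax)` box layout are assumptions for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(pred_boxes, true_boxes, iou_threshold=0.5):
    """Greedily match each guess to at most one unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for pred in pred_boxes:
        best_iou, best_index = 0.0, None
        for index, truth in enumerate(true_boxes):
            if index in matched:
                continue
            overlap = iou(pred, truth)
            if overlap > best_iou:
                best_iou, best_index = overlap, index
        if best_index is not None and best_iou >= iou_threshold:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(pred_boxes) if pred_boxes else 0.0
    recall = true_positives / len(true_boxes) if true_boxes else 0.0
    return precision, recall


if __name__ == "__main__":
    guesses = [(0, 0, 2, 2), (5, 5, 6, 6)]
    boats = [(0, 0, 2, 2), (10, 10, 12, 12), (20, 20, 21, 21)]
    print(precision_recall(guesses, boats, iou_threshold=0.5))
```

Raising `iou_threshold` from .5 to .75 makes the same guesses harder to count as correct, which is why the IoU=.75 precision in the table is lower than the IoU=.5 precision for both networks.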

| Metric (average) | IoU | Area | maxDets | N1 | N2 |
|---|---|---|---|---|---|
| Precision | 0.50:0.95 | all    | 100 | 0.458 | 0.438 |
| Precision | 0.50      | all    | 100 | 0.822 | 0.840 |
| Precision | 0.75      | all    | 100 | 0.454 | 0.411 |
| Precision | 0.50:0.95 | small  | 100 | 0.269 | 0.294 |
| Precision | 0.50:0.95 | medium | 100 | 0.510 | 0.460 |
| Precision | 0.50:0.95 | large  | 100 | 0.700 | 0.669 |
| Recall    | 0.50:0.95 | all    | 1   | 0.346 | 0.334 |
| Recall    | 0.50:0.95 | all    | 10  | 0.555 | 0.515 |
| Recall    | 0.50:0.95 | all    | 100 | 0.560 | 0.519 |
| Recall    | 0.50:0.95 | small  | 100 | 0.454 | 0.471 |
| Recall    | 0.50:0.95 | medium | 100 | 0.572 | 0.516 |
| Recall    | 0.50:0.95 | large  | 100 | 0.714 | 0.743 |

N1 took around four hours to train. Its average precision was .822 at IoU=.5 and .454 at IoU=.75. Its average recall (maxDets=100) was .560.

N2 took only an hour to train. Its average precision was .840 at IoU=.5 and .411 at IoU=.75. Its average recall (maxDets=100) was .519.

N1 outperformed N2 in most categories. N2 came out ahead in precision at IoU=.5, in precision and recall on small objects, and in recall on large objects. N1 was much more precise in the IoU=.75 test (.454 versus .411), and its overall average recall was higher as well. The fact that N2 beat N1 at IoU=.5, the most lenient measure of a correct guess, was surprising, as N1 had a much longer training time. Because the two runs differ in three settings at once, we can only hypothesize that the smaller learning rate and larger batch size helped N2 reach roughly correct guesses quickly, and that a larger number of training steps matters most for tightly localized, more accurate guesses.
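
The row-by-row comparison can be checked directly against the table. The sketch below copies the N1 and N2 values from the results table and lists which network leads in each row; the variable names are chosen for illustration.

```python
# (metric, IoU, area, maxDets, N1, N2), copied from the results table
ROWS = [
    ("Precision", "0.50:0.95", "all", 100, 0.458, 0.438),
    ("Precision", "0.50", "all", 100, 0.822, 0.840),
    ("Precision", "0.75", "all", 100, 0.454, 0.411),
    ("Precision", "0.50:0.95", "small", 100, 0.269, 0.294),
    ("Precision", "0.50:0.95", "medium", 100, 0.510, 0.460),
    ("Precision", "0.50:0.95", "large", 100, 0.700, 0.669),
    ("Recall", "0.50:0.95", "all", 1, 0.346, 0.334),
    ("Recall", "0.50:0.95", "all", 10, 0.555, 0.515),
    ("Recall", "0.50:0.95", "all", 100, 0.560, 0.519),
    ("Recall", "0.50:0.95", "small", 100, 0.454, 0.471),
    ("Recall", "0.50:0.95", "medium", 100, 0.572, 0.516),
    ("Recall", "0.50:0.95", "large", 100, 0.714, 0.743),
]

# Split the rows by which network has the higher value
n1_leads = [row for row in ROWS if row[4] > row[5]]
n2_leads = [row for row in ROWS if row[5] > row[4]]

if __name__ == "__main__":
    print("N1 leads in", len(n1_leads), "rows; N2 leads in", len(n2_leads), "rows")
    for metric, iou_range, area, max_dets, n1, n2 in n2_leads:
        print("N2 ahead:", metric, "IoU=" + iou_range, "area=" + area,
              "maxDets=" + str(max_dets), "by", round(n2 - n1, 3))
```

With these values N1 leads in eight of the twelve rows and N2 in four.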

Visually, N2's guesses were much larger and wilder. They include land guessed as boats and boxes that include a boat's wake along with the boat. N1 was much more accurate at boxing just the boat itself.