# **Creating TFRecords**
To train the multibox detector on your own data, you first need to convert that data into tfrecords.

### **Arguments**
* **dataset**: A list of dictionaries where each entry includes one or multiple bounding boxes. **IMPORTANT** Entries to the list must be specified in the dictionary format shown below.
* **dataset_name**: The name to save the dataset as.
* **output_directory**: The directory to save the tfrecords file to.
* **num_shards**: Number of files to split **dataset** into.
* **num_threads**: Number of threads to use while converting **dataset** into tfrecords.
* **shuffle**: Bool specifying whether to shuffle the data before writing to tfrecords or not.

### **Outputs**
Saves the tfrecords file(s) under the name **dataset_name** in the directory specified by **output_directory**.
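As a rough illustration of how a dataset might be divided among worker threads, the sketch below computes contiguous index ranges. The helper `thread_spacings` is hypothetical and is not taken from `create_tfrecords.py`; it only reproduces the shape of the `spacings` list printed in the log of the example run in this section.

```python
def thread_spacings(num_items, num_threads):
    """Split range(num_items) into num_threads contiguous [start, end) ranges.

    Hypothetical helper: an assumption about how work could be divided,
    not the actual create_tfrecords implementation.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # Integer arithmetic keeps the boundaries exact, so every item is
    # covered exactly once.
    bounds = [i * num_items // num_threads for i in range(num_threads + 1)]
    return [[bounds[i], bounds[i + 1]] for i in range(num_threads)]


print(thread_spacings(1, 1))   # [[0, 1]]
print(thread_spacings(10, 3))  # [[0, 3], [3, 6], [6, 10]]
```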

In [1]:
import create_tfrecords

## This is a sample list entry for the dataset
## All list elements in the dictionary accept multiple values to specify multiple bboxes.
data = {'filename': 'sample_1',
        'id': '1',
        'class': {
            'label': [0],
            'text': b'mouse',
            },
        'object': {
            # Each of the bbox coords (xmax, xmin, ymax, ymin) must be floats
            'bbox': {
                'xmax': [100.0], 
                'xmin': [0.0],
                'ymax': [100.0],
                'ymin': [0.0],
                'label': [0],
                'count': 1
                }
            }
        }

# Input to create_tfrecords.create must be a list of entries
dataset = [data]

# Write the dataset to tfrecords
create_tfrecords.create(dataset,
                        dataset_name = 'sample',
                        output_directory = './',
                        num_shards = 1,
                        num_threads = 1,
                        shuffle = True
                       )

Launching 1 threads for spacings: [[0, 1]]



Instructions for updating:
Use tf.gfile.GFile.


Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/andrew_work/miniconda3/envs/draft_mars_dev/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/andrew_work/miniconda3/envs/draft_mars_dev/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/andrew_work/MARS_Developer/multibox_detection/create_tfrecords.py", line 241, in _process_image_files_batch
    image_buffer, height, width = _process_image(filename, coder)
  File "/home/andrew_work/MARS_Developer/multibox_detection/create_tfrecords.py", line 179, in _process_image
    image_data = tf.gfile.FastGFile(filename, 'r').read()
  File "/home/andrew_work/miniconda3/envs/draft_mars_dev/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 122, in read
    self._preread_check()
  File "/home/andrew_work/miniconda3/envs/draft_mars_dev/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py",

2021-02-19 13:00:42.545207: Finished writing all 1 images in data set.
0 examples failed.


[]
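Writing the nested dictionary by hand gets error-prone once an image has several boxes. The sketch below builds one entry in the format used by the sample in this section. The helper `make_entry` is hypothetical, and repeating the class label once per box is an assumption based on the note that list elements accept multiple values.

```python
def make_entry(filename, image_id, boxes, label, text):
    """Build one dataset entry from (xmin, ymin, xmax, ymax) boxes.

    Hypothetical helper; the key layout mirrors the sample entry.
    """
    if not boxes:
        raise ValueError("each entry needs at least one bounding box")
    xmins, ymins, xmaxs, ymaxs = zip(*boxes)
    return {
        'filename': filename,
        'id': str(image_id),
        'class': {
            # Assumption: one class label per box.
            'label': [label] * len(boxes),
            'text': text,
        },
        'object': {
            'bbox': {
                # Coordinates are cast to float, as the sample requires.
                'xmax': [float(v) for v in xmaxs],
                'xmin': [float(v) for v in xmins],
                'ymax': [float(v) for v in ymaxs],
                'ymin': [float(v) for v in ymins],
                'label': [label] * len(boxes),
                'count': len(boxes),
            }
        },
    }


entry = make_entry('sample_2', 2, [(0, 0, 100, 100), (10, 20, 50, 60)], 0, b'mouse')
dataset = [entry]  # the dataset passed on for conversion is a list of entries
```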

# **Training MARS Multibox Detector**

### Required Arguments
* **tfrecords**: path to the binary file(s) that contain your training set. If multiple files, separate with commas.
* **logdir**: path to directory where summary and checkpoint files will be stored.
* **cfg**: path to training configuration file.
* **bbox_priors**: path to the bounding box priors pickle file.
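If the record paths arrive as a single comma-separated string (for example, from a command line), they can be turned into the list form used in the example cell. The helper `split_paths` below is a hypothetical convenience and not part of `train.py`.

```python
def split_paths(paths):
    """Turn 'a.tfrecords, b.tfrecords' into ['a.tfrecords', 'b.tfrecords']."""
    # Strip whitespace and drop empty pieces left by trailing commas.
    return [p.strip() for p in paths.split(',') if p.strip()]


tfrecords_path = split_paths('train-00000-of-00002, train-00001-of-00002')
```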

### Optional Arguments
* **pretrained_model**: Path to a pretrained Inception-v3 model from which to continue training.
* **trainable_scopes**: Comma-separated list of scopes used to filter the set of variables to train.
* **use_moving_averages**: If True, the moving averages will be used for each model variable from the pretrained network.
* **restore_moving_averages**: If True, the moving averages themselves will be restored from the pretrained network.
* **max_number_of_steps**: The maximum number of iterations to run.
* **batch_size**: The batch size.

The example below assumes the current working directory is the `multibox_detection` directory, which contains `train.py`. To train the multibox detector with the required arguments, replace the placeholder paths below with your own.

In [None]:
import train
import pickle
import config
import numpy as np

# Paths
cfg_path = 'absolute/path/to/your/config/file.yaml'
priors_path = 'absolute/path/to/your/priors/file.pkl'
tfrecords_path = ['absolute/path/to/your/tfrecords/file(s)'] # Separate multiple training data files with ','
logdir_path = 'absolute/path/to/your/logging/directory'

# Configuration file
cfg = config.parse_config_file(cfg_path)

# Load priors into numpy array
with open(priors_path, 'rb') as f:
    bbox_priors = pickle.load(f, encoding='latin1')
bbox_priors = np.array(bbox_priors).astype(np.float32)

# Call training function
train.train(
        tfrecords=tfrecords_path,
        bbox_priors=bbox_priors,
        logdir=logdir_path,
        cfg=cfg
    )
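The `encoding='latin1'` argument used when loading the priors lets Python 3 read pickles written by Python 2, and it is harmless for pickles written by Python 3. The sketch below round-trips a small priors list through an in-memory buffer and checks its shape. The helper `load_priors` is hypothetical, and the four-values-per-prior layout is an assumption rather than something stated in this notebook.

```python
import io
import pickle


def load_priors(file_obj):
    """Load priors from a binary file object and check their basic shape."""
    priors = pickle.load(file_obj, encoding='latin1')
    for prior in priors:
        # Assumption: each prior is one box described by four numbers.
        if len(prior) != 4:
            raise ValueError("expected 4 values per prior, got %d" % len(prior))
    return priors


# Made-up priors, written with protocol 2 (the highest one Python 2 can read).
example_priors = [[0.1, 0.1, 0.9, 0.9], [0.25, 0.4, 0.75, 0.6]]
buffer = io.BytesIO()
pickle.dump(example_priors, buffer, protocol=2)
buffer.seek(0)

loaded = load_priors(buffer)
```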

## Visualizing Training Data
### Required Arguments
* **tfrecords**: path to the binary file(s) that contain your training set. If multiple files, separate with commas.
* **cfg**: path to training configuration file.
### Outputs
* Displays each training image with its corresponding bounding box drawn over it - cycle through by hitting the enter key.

In [None]:
from visualize_inputs import visualize
from config import parse_config_file

# Paths
cfg_path = 'absolute/path/to/your/config/file.yaml'
tfrecords_path = ['absolute/path/to/your/tfrecords/file(s)'] # Separate multiple training data files with ','

# Parse config file
cfg = parse_config_file(cfg_path)

# Call visualization function - hit enter to cycle through the images
visualize(
    tfrecords=tfrecords_path,
    cfg=cfg
)

## Generating Priors Files
### Required Arguments
* **dataset**: path to a tfrecord or pkl file from which to generate aspect ratios and priors.
### Optional Arguments
* **aspect_ratios**: list of hand-defined aspect ratios to use in generating the priors.
### Outputs
* **(dataset)-priors.pkl**: pkl file containing the priors generated from the training set where (dataset) is the file name of the dataset provided.

**OR**

* **priors_hand_gen.pkl**: pkl file containing the hand-generated priors.

Priors are generated from the training data only if the `dataset` argument is provided. If the `aspect_ratios` argument is provided instead, priors are generated from the hand-defined aspect ratios. Provide `aspect_ratios` in the same format as in the example below.

In [None]:
from priors_generator import generate_priors_from_data
import tensorflow as tf
tf.compat.v1.enable_eager_execution()

# Example call generating priors from the dataset
dataset = 'absolute/path/to/your/tfrecords/file'
aspect_ratios = None

p_1 = generate_priors_from_data(
            dataset=dataset,
            aspect_ratios=aspect_ratios
        )

# Example call generating priors from hand-defined aspect ratios
dataset = None
aspect_ratios = [1, 2, 3, 4, 5, 6, 1./2, 1./3, 1./4, 1./5]

p_2 = generate_priors_from_data(
            dataset=dataset,
            aspect_ratios=aspect_ratios
        )
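To give a feel for what an aspect ratio means for a prior box, the sketch below turns one ratio into a centered box in normalized coordinates. This is an illustration only: `centered_prior`, its `scale` parameter, and the `[xmin, ymin, xmax, ymax]` layout are assumptions, and the code does not reproduce what `priors_generator` actually computes.

```python
import math


def centered_prior(aspect_ratio, scale=0.5, cx=0.5, cy=0.5):
    """Return [xmin, ymin, xmax, ymax] for a box with width/height = aspect_ratio.

    Hypothetical helper: the box area stays at scale**2 regardless of ratio,
    and coordinates are clipped to the unit square.
    """
    width = scale * math.sqrt(aspect_ratio)
    height = scale / math.sqrt(aspect_ratio)
    box = [cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2]
    return [min(max(v, 0.0), 1.0) for v in box]


aspect_ratios = [1, 2, 3, 4, 5, 6, 1./2, 1./3, 1./4, 1./5]
sketch_priors = [centered_prior(ar, scale=0.2) for ar in aspect_ratios]
```

Ratios above 1 give wide boxes and ratios below 1 give tall ones, which is why the hand-defined list includes both `4` and `1./4`.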

# Evaluating MARS Multibox Detector

### Required Arguments
* **tfrecords**: path to the binary file that contains the image data from your validation set. If multiple files, separate with commas.
* **summary_dir**: path to directory where summary and checkpoint files will be stored - be sure to include a `/` at the end of the path.
* **checkpoint_path**: Either a path to a specific model, or a path to a directory where checkpoint files are stored. If a directory, the latest model will be tested against.
* **priors**: path to the bounding box priors pickle file.
* **config**: path to validation configuration file.

### Optional Arguments
* **max_iterations**: Maximum number of iterations to run.

The example below assumes the current working directory is the `multibox_detection` directory, which contains `evaluation/eval.py`. To evaluate the multibox detector with the required arguments, replace the placeholder paths below with your own.

### Outputs
* **(summary_dir)/cocoEval.pkl**: pickle file that stores evaluation data used to compute the precision-recall (PR) curves for various Intersection over Union bounds `[IoU > 0.5, IoU > 0.75, IoU > 0.85, IoU > 0.9, IoU > 0.95]`.
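For readers unfamiliar with Intersection over Union, the sketch below computes it for two boxes and checks the result against the IoU bounds listed above. The function names and the `[xmin, ymin, xmax, ymax]` box layout are assumptions chosen for illustration; the evaluation code itself may organize this differently.

```python
IOU_BOUNDS = [0.5, 0.75, 0.85, 0.9, 0.95]


def iou(box_a, box_b):
    """Intersection over Union of two [xmin, ymin, xmax, ymax] boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Overlap width and height; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def bounds_passed(value, bounds=IOU_BOUNDS):
    """Return the bounds that a given IoU value exceeds."""
    return [b for b in bounds if value > b]


overlap = iou([0, 0, 10, 10], [0, 0, 10, 8])  # 80 / 100 = 0.8
```

A prediction with an IoU of 0.8 counts as a match for the `IoU > 0.5` and `IoU > 0.75` curves but not for the stricter ones.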

In [None]:
from config import parse_config_file
import pickle
import numpy as np
import os
import sys
sys.path.append('../')
sys.path.append('./evaluation/')
from evaluation.eval import eval

# Paths
cfg_path = 'absolute/path/to/your/config/file.yaml'
priors_path = 'absolute/path/to/your/priors/file.pkl'
tfrecords_path = ['absolute/path/to/your/tfrecords/file(s)'] # Separate multiple validation data files with ','
summary_dir = 'absolute/path/to/your/summary/directory/' # Keep the trailing '/'
checkpoint_path = 'absolute/path/to/your/model/training/checkpoint'

# Parse config file
cfg = parse_config_file(cfg_path)

# Load priors into numpy array
with open(priors_path, 'rb') as f:
    bbox_priors = pickle.load(f, encoding='latin1')
bbox_priors = np.array(bbox_priors).astype(np.float32)

# Call evaluation function
eval(
    tfrecords=tfrecords_path,
    bbox_priors=bbox_priors,
    summary_dir=summary_dir,
    checkpoint_path=checkpoint_path,
    max_iterations = 1,
    cfg=cfg
) 

## Visualizing Validation Performance
### Required Arguments
* **cocoEval**: path to the directory containing the cocoEval.pkl file described above - don't forget the `/` at the end.
* **save_name**: path to save the precision-recall curve to.
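The trailing `/` matters because the example builds file paths by string concatenation. As an alternative sketch, `os.path.join` produces the same path whether or not the directory ends with a separator. The helper `cocoeval_path` is hypothetical and not part of the evaluation code.

```python
import os


def cocoeval_path(summary_dir):
    """Path to cocoEval.pkl inside summary_dir, with or without a trailing '/'."""
    return os.path.join(summary_dir, 'cocoEval.pkl')


with_slash = cocoeval_path('path/to/summary/')
without_slash = cocoeval_path('path/to/summary')
```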

### Outputs
* **(save_name).pdf**: pdf containing the PR curve described above.
* **(save_name).png**: png version of the pdf.

In [None]:
import sys
import pickle
sys.path.append('./evaluation/')
from prcurve_separate import pr_curve

# Paths
summary_dir = 'absolute/path/to/your/summary/directory/' # Same summary dir as in 'Evaluating MARS Multibox Detector'; keep the trailing '/'

# Load cocoEval file
save_name = summary_dir + 'pr_curve'
with open(summary_dir + 'cocoEval.pkl', 'rb') as fp:
    cocoEval = pickle.load(fp)

# Make Precision-Recall curve
pr_curve(cocoEval, save_name)
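As background on what a precision-recall curve contains, the sketch below computes precision and recall points from a list of scored detections. It is a generic illustration under assumed inputs (a score plus a true-positive flag per detection) and is not the computation performed by `pr_curve` or stored in `cocoEval.pkl`.

```python
def precision_recall(scored_matches, num_ground_truth):
    """Return (precisions, recalls) after each detection, highest score first.

    scored_matches: list of (score, is_true_positive) pairs.
    num_ground_truth: number of ground-truth boxes in the evaluated set.
    """
    ordered = sorted(scored_matches, key=lambda m: m[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ordered, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    return precisions, recalls


# Made-up detections: two correct and one incorrect, with two true boxes.
precisions, recalls = precision_recall(
    [(0.9, True), (0.7, True), (0.8, False)], num_ground_truth=2)
```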

# Testing MARS Multibox Detector

### Required Arguments
* **tfrecords**: path to the binary file that contains your testing set.
* **cfg**: path to testing configuration file.
* **bbox_priors**: numpy array of bbox priors
* **checkpoint_path**: Either a path to a specific model, or a path to a directory where checkpoint files are stored. If a directory, the latest model will be tested against.
* **save_dir**: path to directory where you would like to store the json file containing the test results.

### Optional Arguments
* **max_iterations**: Maximum number of iterations to run. Set to 0 to run on all records.
* **max_detections**: Maximum number of detections to store per image.
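Capping the number of detections stored per image can be pictured as keeping only the highest-scoring boxes. The sketch below shows that idea; the helper `keep_top_detections`, its `limit` parameter, and the `score` key are assumptions for illustration and do not describe how `detect.py` stores its results.

```python
def keep_top_detections(detections, limit):
    """Keep at most `limit` detections, highest score first."""
    ranked = sorted(detections, key=lambda d: d['score'], reverse=True)
    return ranked[:limit]


# Made-up detections for a single image.
detections = [
    {'score': 0.20, 'bbox': [0.1, 0.1, 0.4, 0.4]},
    {'score': 0.95, 'bbox': [0.2, 0.2, 0.8, 0.8]},
    {'score': 0.60, 'bbox': [0.3, 0.1, 0.9, 0.5]},
]
top_two = keep_top_detections(detections, limit=2)
```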

The example below assumes the current working directory is the `multibox_detection` directory, which contains `detect.py`.

### Outputs
* **results**: JSON file storing the results for the testing data.

In [None]:
import detect
import pickle
from config import parse_config_file
import numpy as np

# Paths
cfg_path = 'absolute/path/to/your/config/file.yaml'
priors_path = 'absolute/path/to/your/priors/file.pkl'
tfrecords_path = ['absolute/path/to/your/tfrecords/file(s)'] # Separate multiple testing data files with ','
save_dir = 'absolute/path/to/your/save/directory'
checkpoint_path = 'absolute/path/to/your/model/training/checkpoint'

# Parse configuration file 
cfg = parse_config_file(cfg_path)

# Load in bbox priors
with open(priors_path, 'rb') as f:
    # latin1 encoding, as in the other examples in this notebook
    bbox_priors = pickle.load(f, encoding='latin1')
bbox_priors = np.array(bbox_priors).astype(np.float32)

# Run detection
detect.detect(
    tfrecords=tfrecords_path,
    bbox_priors=bbox_priors,
    checkpoint_path=checkpoint_path,
    save_dir = save_dir,
    max_detections = 100,
    max_iterations = 0,
    cfg=cfg
)

## Visualizing Test Performance
### Required Arguments
* **tfrecords**: path to the binary file that contains your testing set.
* **config**: path to testing configuration file.
* **priors**: path to the bounding box priors pickle file.
* **checkpoint_path**: Either a path to a specific model, or a path to a directory where checkpoint files are stored. If a directory, the latest model will be tested against.

### Outputs
* Displays the testing images with their predicted bounding boxes drawn over them - cycle through by hitting the enter key.

In [None]:
import visualize_detect
from config import parse_config_file
import pickle
import numpy as np

# Paths
cfg_path = 'absolute/path/to/your/config/file.yaml'
priors_path = 'absolute/path/to/your/priors/file.pkl'
tfrecords_path = ['absolute/path/to/your/tfrecords/file(s)'] # Separate multiple testing data files with ','
checkpoint_path = 'absolute/path/to/your/model/training/checkpoint'

# Parse config file
cfg = parse_config_file(cfg_path)

# Load priors into numpy array
with open(priors_path, "rb") as f:
    bbox_priors = pickle.load(f, encoding="latin1")
bbox_priors = np.array(bbox_priors).astype(np.float32)
  
# Image + predicted bounding box. Press enter key to cycle through
visualize_detect.detect_visualize(
    tfrecords=tfrecords_path,
    bbox_priors=bbox_priors,
    checkpoint_path=checkpoint_path,
    cfg=cfg
)