# Understanding TensorFlow Object Detection Configuration

The MobileNet model pipeline (and, presumably, every other model in the TensorFlow Object Detection API) is driven by a protobuf text (.pbtxt) file. This notebook shows how to read that file with the utilities that ship with TensorFlow.

Why does that matter? I found that 90% of my errors were configuration issues, typically a file that could not be found. That leads to two questions: "what is it looking for?" and "relative to which path?". Use the TF utilities as much as possible; you'll find they usually handle cases you have not thought of yet.
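
To make the "relative to which path" question concrete, here is a minimal sketch using only the standard library. It assumes relative paths are resolved against a base directory in the usual way (for an opened file, that base is the current working directory). The helper name `resolve_from` and the example paths are hypothetical.

```python
import os

def resolve_from(base_dir, path):
    """Return the normalized location that `path` refers to when read from base_dir."""
    # Absolute paths ignore base_dir; relative ones are joined onto it.
    if os.path.isabs(path):
        return os.path.normpath(path)
    return os.path.normpath(os.path.join(base_dir, path))

# The same relative path points at different files depending on where you start.
print(resolve_from("/project", "code/data/train.record"))
print(resolve_from("/project/notebooks", "code/data/train.record"))
```

Printing the resolved path before launching training is a cheap way to see which file a relative entry in the config will actually hit.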


In [None]:
import os
import sys
import tensorflow as tf

In [None]:
# This is needed because tensorflow/models is cloned under ./code.
# If that doesn't sound familiar, see the notebook TrainModel_Step1_Local,
# which sets up the project, including cloning and compiling the
# tensorflow/models repo. We are using the utilities found in that repo.

cwd = os.getcwd()
models = os.path.join(cwd, 'code/models/research/')
slim = os.path.join(cwd, 'code/models/research/slim')
sys.path.append(models)
sys.path.append(slim)

from code.cfa_utils.example_utils import feature_obj_detect
from code.models.research.object_detection.utils import config_util

## GLOBALS

In [None]:
CODE_DIR = os.path.join(cwd, 'code')
TF_TRAIN_CONFIG = os.path.join(CODE_DIR, 'sagemaker_mobilenet_v1_ssd_retrain.config')

## TensorFlow Utilities

### tf.io
These utilities are not domain specific, i.e. they are not tied to object detection.

#### tf.io.gfile
File I/O utilities: probably everything you'll need for working with files and directories (but not os.path-style path manipulation).

In [None]:
# file exists utility
tf.io.gfile.exists(TF_TRAIN_CONFIG)
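
When a config references several files, it helps to check them all in one pass. The sketch below is a hypothetical helper (`find_missing` is not part of TensorFlow) that takes the existence check as a parameter. It defaults to `os.path.exists` so it runs anywhere; in this notebook you would pass `tf.io.gfile.exists` instead.

```python
import os

def find_missing(paths, exists=os.path.exists):
    """Return the subset of paths for which the exists() check fails."""
    return [p for p in paths if not exists(p)]

# Hypothetical usage with one real directory and one made-up file name.
# In the notebook: find_missing([TF_TRAIN_CONFIG], exists=tf.io.gfile.exists)
candidates = [os.getcwd(), os.path.join(os.getcwd(), "no_such_file.config")]
print("missing:", find_missing(candidates))
```

Passing the check in as an argument keeps the helper independent of TensorFlow, which also makes it easy to test.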

### object_detection/utils
These utilities are specific to object detection.

#### Hint
The printed configuration is hard to read. The underlying pbtxt file (on GitHub) is much easier to follow. Its main elements are:
- model
- train_config
- train_input_reader
- eval_config
- eval_input_reader
- graph_rewriter
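
The element names in the file do not match the dictionary keys used in the rest of this notebook one-to-one. The mapping below is a sketch assembled from the key names that appear in this notebook's code cells, not from the library's documentation, so treat it as an assumption. `ELEMENT_TO_KEYS` and `missing_keys` are hypothetical names.

```python
# Config-file element -> key(s) this notebook uses for it in the config dictionary.
ELEMENT_TO_KEYS = {
    "model": ["model"],
    "train_config": ["train_config"],
    "train_input_reader": ["train_input_config"],
    "eval_config": ["eval_config"],
    "eval_input_reader": ["eval_input_configs", "eval_input_config"],
    "graph_rewriter": ["graph_rewriter_config"],
}

def missing_keys(config_dict):
    """List the expected keys that are absent from config_dict, in mapping order."""
    expected = [k for keys in ELEMENT_TO_KEYS.values() for k in keys]
    return [k for k in expected if k not in config_dict]

# Stand-in dictionary; a real one would come from config_util.
print(missing_keys({"model": None, "train_config": None}))
```

Running a check like this right after loading the config shows which sections are absent before any code tries to index them.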

In [None]:
# get the training pipeline parameters
pipeline_config_dict = config_util.get_configs_from_pipeline_file(TF_TRAIN_CONFIG)
print(pipeline_config_dict.keys())

In [None]:
model_config = pipeline_config_dict['model']
train_config = pipeline_config_dict['train_config']
train_input_config = pipeline_config_dict['train_input_config']
eval_config = pipeline_config_dict['eval_config']
# !! note the inconsistent plural: 'eval_input_configs' holds a list of eval input readers
eval_input_configs = pipeline_config_dict['eval_input_configs']
# .get() avoids a KeyError (assumption: the key may be absent when the
# config file has no graph_rewriter block)
graph_rewriter_config = pipeline_config_dict.get('graph_rewriter_config')

In [None]:
print("train_input_config:", type(train_input_config))
print("                   ", train_input_config)

# input_path can list more than one path
tf_record_input_reader = train_input_config.tf_record_input_reader
print("tf_record_input_reader:", type(tf_record_input_reader))
print("                       ", tf_record_input_reader.input_path)

### Summary 
This repeats the steps above, but here is the basic code to read the input sources and verify that they exist.

In [None]:
def check_input_data_existence(pipeline_config_dict):
    """Print whether each train/eval input path listed in the config exists."""
    input_keys = ['train_input_config', 'eval_input_config']
    for input_key in input_keys:
        print("checking inputs for:", input_key)
        input_config = pipeline_config_dict[input_key]
        # input_path may hold several paths, so check each one
        path_list = input_config.tf_record_input_reader.input_path
        for p in path_list:
            exists = tf.io.gfile.exists(p)
            print("path:", exists, p)

In [None]:
pipeline_config_dict = config_util.get_configs_from_pipeline_file(TF_TRAIN_CONFIG)  # pipeline config dict

check_input_data_existence(pipeline_config_dict)
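
A variant that returns the missing paths instead of printing them is easier to reuse, for example to stop early before training starts. This is a sketch under assumptions: it reads the plural `eval_input_configs` key as a list of readers, takes the existence check as a parameter (pass `tf.io.gfile.exists` in the notebook), and uses `SimpleNamespace` stand-ins in place of real config objects so it runs without the models repo. `collect_missing_inputs`, `fake_reader`, and the file names are hypothetical.

```python
import os
from types import SimpleNamespace

def collect_missing_inputs(config_dict, exists=os.path.exists):
    """Return (key, path) pairs for every input path that fails the exists() check."""
    readers = [("train_input_config", config_dict["train_input_config"])]
    # Assumption: 'eval_input_configs' is a list of eval input readers.
    for reader in config_dict.get("eval_input_configs", []):
        readers.append(("eval_input_configs", reader))
    missing = []
    for key, reader in readers:
        for path in reader.tf_record_input_reader.input_path:
            if not exists(path):
                missing.append((key, path))
    return missing

def fake_reader(*paths):
    """Stand-in for an input reader config that carries a list of input paths."""
    return SimpleNamespace(tf_record_input_reader=SimpleNamespace(input_path=list(paths)))

configs = {
    "train_input_config": fake_reader("train.record"),
    "eval_input_configs": [fake_reader("eval.record")],
}
# Pretend that only train.record exists.
print(collect_missing_inputs(configs, exists=lambda p: p == "train.record"))
```

An empty result means every listed input was found; anything else names the config section and the path to fix.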