# TI Sitara x AWS NEO Image Classification Example

1. [Introduction](#Introduction)
2. [Prerequisites and Preprocessing](#Prerequisites-and-Preprocessing)
   1. [Permissions and environment variables](#Permissions-and-environment-variables)
   2. [Data preparation](#Data-preparation)
3. [Training the MobileNet V2 model](#Training-the-MobileNet-V2-model)
4. [Compiling the model using NEO](#Compiling-the-model-using-NEO)
5. [Deploy the trained model to TI Sitara for inference](#Deploy-the-trained-model-to-TI-Sitara-for-inference)

## Introduction

This notebook demonstrates how to train a TensorFlow MobileNet V2 model in Amazon SageMaker and compile it with the Neo API to optimize it for the TI Sitara AM57xx. Finally, we deploy the compiled model to the device and run inference using the Neo Deep Learning Runtime (DLR).

To get started, we need to set up the environment with a few prerequisite steps for permissions, configuration, and so on.

## Prerequisites and Preprocessing

### Permissions and environment variables

Here we set up the linkage and authentication to AWS services. We need two things:

* The IAM role that gives training and hosting access to your data. It is obtained automatically from the role used to start the notebook.
* The S3 bucket that you want to use for training and model data.

In [None]:
import boto3
import sagemaker
import time
from sagemaker.utils import name_from_base
from sagemaker import get_execution_role 

In [None]:
role = get_execution_role() 
sess = sagemaker.Session()
region = sess.boto_region_name

bucket = sess.default_bucket()

### Data preparation

We will use the [Caltech101](http://www.vision.caltech.edu/Image_Datasets/Caltech101) dataset in this example. It contains pictures of objects belonging to 101 categories, with about 40 to 800 images per category; most categories have about 50 images. The dataset was collected in September 2003 by Fei-Fei Li, Marco Andreetto, and Marc'Aurelio Ranzato. Each image is roughly 300 x 200 pixels.

In [None]:
!wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz

In [None]:
!tar -xf 101_ObjectCategories.tar.gz

In this example, we use `.tfrecord` as input. The TFRecord format is a simple format for storing a sequence of binary records.
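
To make "a sequence of binary records" concrete: on disk, each record is framed as an 8-byte little-endian length, a 4-byte masked CRC32C of that length, the payload bytes, and a 4-byte masked CRC32C of the payload. The sketch below writes and reads that framing using only the standard library. The helper names (`crc32c`, `masked_crc`, `frame_record`, `read_records`) are illustrative and not part of TensorFlow; in this notebook `tf.io.TFRecordWriter` does all of this for us.

```python
import io
import struct

def _build_crc32c_table():
    # Lookup table for the reflected Castagnoli polynomial used by CRC32C.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table

_CRC32C_TABLE = _build_crc32c_table()

def crc32c(data):
    """Plain CRC32C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

def masked_crc(data):
    """CRC32C rotated right by 15 bits plus a constant, as stored in the file."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF

def frame_record(payload):
    """Wrap one payload in the length/CRC framing."""
    length = struct.pack('<Q', len(payload))
    return (length + struct.pack('<I', masked_crc(length))
            + payload + struct.pack('<I', masked_crc(payload)))

def read_records(stream):
    """Yield payloads from a binary stream, checking both checksums."""
    while True:
        header = stream.read(12)
        if len(header) < 12:
            return
        (length,) = struct.unpack('<Q', header[:8])
        (length_crc,) = struct.unpack('<I', header[8:])
        if length_crc != masked_crc(header[:8]):
            raise ValueError('corrupt length field')
        payload = stream.read(length)
        (payload_crc,) = struct.unpack('<I', stream.read(4))
        if payload_crc != masked_crc(payload):
            raise ValueError('corrupt payload')
        yield payload

# Two stand-in payloads; in the notebook each payload is a serialized tf.train.Example.
buffer = io.BytesIO(frame_record(b'first') + frame_record(b'second record'))
records = list(read_records(buffer))
print(records)
```

Each record therefore costs 16 bytes of framing on top of the serialized `tf.train.Example`.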

In [None]:
import tensorflow as tf
import glob, imageio, shutil, os

# Gather file paths to all images
data_dir = '101_ObjectCategories'
object_dirs = glob.glob(data_dir + '/*')

# Map each category name (the directory name) to its list of image paths
objects = {}
for d in object_dirs:
    objects[os.path.basename(d)] = glob.glob(d + '/*.jpg')

# Create an integer label for each object category (sorted so labels are reproducible)
categories = sorted(objects.keys())
category_labels = {name: i for i, name in enumerate(categories)}

def int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

# Create a directory to store our TFRecords
os.makedirs('tfrecords', exist_ok=True)

object_names = list(objects.keys())
# Write all categories into one training file and one validation file
train_writer = tf.io.TFRecordWriter('tfrecords/' + 'train.tfrecord')
valid_writer = tf.io.TFRecordWriter('tfrecords/' + 'valid.tfrecord')
for o in object_names:
    # Write each image of this category into the train or validation file
    num_images = len(objects[o])
    for index in range(num_images):
        i = objects[o][index]
        # Let's make 80% train and leave 20% for validation
        if index < num_images * 0.8:
            writer = train_writer
        else:
            writer = valid_writer
        image = imageio.imread(i)
        shape = image.shape
        if len(shape) != 3:
            # Skip grayscale images, which have no channel dimension
            continue
        label = category_labels[o]
        # Create features dict for this image
        features = {
            'height' : int64_feature(shape[0]),
            'width' : int64_feature(shape[1]),
            'depth' : int64_feature(shape[2]),
            'image' : bytes_feature(image.tobytes()),
            'label' : int64_feature(int(label))
        }
        # Create Example out of this image and write it to the TFRecord
        example = tf.train.Example(features=tf.train.Features(feature=features))
        writer.write(example.SerializeToString())
train_writer.close()
valid_writer.close()
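
The split above is positional and per category: an image goes to the training file when its index is below 80% of the category size. A small sketch of how that rule divides categories of different sizes follows; `split_counts` is a hypothetical helper written for illustration, and the counts are taken before grayscale images are skipped.

```python
def split_counts(num_images, train_fraction=0.8):
    """Return (train, valid) counts under the rule `index < num_images * train_fraction`."""
    train = sum(1 for index in range(num_images)
                if index < num_images * train_fraction)
    return train, num_images - train

# Category sizes chosen for illustration only
for n in (5, 10, 50):
    print(n, split_counts(n))
```

Because the rule uses the position in the file listing rather than a random shuffle, the same images land in the same split as long as the listing order does not change.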

Upload the data to an S3 bucket.

In [None]:
inputs = sess.upload_data(path='tfrecords', key_prefix='data/DEMO-caltech101')

## Training the MobileNet V2 model

Create a training job using the ``sagemaker.tensorflow.TensorFlow`` estimator.

In [None]:
from sagemaker.tensorflow import TensorFlow
import os

source_dir = os.path.join(os.getcwd(), 'source_dir')
estimator = TensorFlow(entry_point='mobilenet_v2_training.py',
                       source_dir=source_dir,
                       role=role,
                       framework_version='1.12.0',
                       hyperparameters={'throttle_secs': 30},
                       training_steps=1000, 
                       evaluation_steps=100,
                       train_instance_count=2, 
                       train_instance_type='ml.p3.2xlarge', 
                       base_job_name='demo-ti-neo'
                      )

estimator.fit(inputs)

## Compiling the model using NEO

Now the model is ready to be compiled by Neo and optimized for our hardware of choice. We use the estimator's ``compile_model`` method to do this. For this example, our target hardware is ``'sitara_am57x'``. You can change this to another supported target if you prefer.

The ``input_shape`` is the definition for the model's input tensor and ``output_path`` is where the compiled model will be stored in S3. **Important: if the following command results in a permission error, scroll up and locate the value of the execution role returned by ``get_execution_role()``. The role must have access to the S3 bucket specified in ``output_path``.**

After compilation, the compiled model is stored in the S3 bucket as a tarball.

In [None]:
output_path = 's3://{}/tf-mobilenet/output'.format(bucket)
compiled_mobilenet = estimator.compile_model(target_instance_family='sitara_am57x', 
                                             input_shape={'input':[1,224,224,3]},
                                             role=role,
                                             framework='tensorflow',
                                             framework_version='1.12.0',
                                             output_path=output_path,
                                             compile_max_run=15 * 60
                                            )
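
`output_path` is a single `s3://` URI, while `boto3` calls such as `download_file` take the bucket name and the object key separately. A minimal sketch of splitting one into the other with the standard library; `split_s3_uri` is a hypothetical helper and `example-bucket` is a placeholder name.

```python
from urllib.parse import urlparse

def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError('not an S3 URI: ' + uri)
    return parsed.netloc, parsed.path.lstrip('/')

# Placeholder bucket name; the key prefix mirrors `output_path` above
bucket_name, key_prefix = split_s3_uri('s3://example-bucket/tf-mobilenet/output')
print(bucket_name, key_prefix)
```

The key prefix obtained this way is the part that precedes the tarball's file name when downloading the compiled model later.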

## Deploy the trained model to TI Sitara for inference

***Please execute the following on a Sitara device.***

### Download compiled model from S3 to device

After compilation using Neo, we have an optimized model stored in the S3 bucket. Now download the compiled model and deploy it on the device using the Neo DLR.

Before we can access the S3 bucket that holds the compiled model, we need to configure the AWS account credentials on the device. Please refer to [AWS Security Credentials](https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html) for more information.

In [None]:
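import boto3
# `prefix` matches `output_path` from the compilation step; `bucket` is the bucket used there
prefix = 'tf-mobilenet'
s3 = boto3.client('s3')
object_path = prefix+'/output/mobilenet_v2_1.0_224-sitara_am57x.tar.gz'
s3.download_file(bucket, object_path, 'mobilenet_v2.tar.gz')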

In [None]:
!mkdir -p ./mobilenet_v2
!tar xvf mobilenet_v2.tar.gz -C ./mobilenet_v2
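
If shell commands are inconvenient on the device, the same extraction can be sketched in Python with the standard `tarfile` module. Note that `tar -C` needs the target directory to exist, so the sketch creates it first. `extract_model` is a hypothetical helper, and the demo builds a stand-in archive rather than using the real compiled model.

```python
import os
import tarfile
import tempfile

def extract_model(archive_path, target_dir):
    """Create target_dir if needed, unpack the .tar.gz into it, and list the result."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, 'r:gz') as archive:
        archive.extractall(target_dir)
    return sorted(os.listdir(target_dir))

# Self-contained demo with a stand-in archive (not the real compiled model)
with tempfile.TemporaryDirectory() as workdir:
    dummy = os.path.join(workdir, 'model.bin')
    with open(dummy, 'wb') as f:
        f.write(b'stand-in for compiled model artifacts')
    archive_path = os.path.join(workdir, 'mobilenet_v2.tar.gz')
    with tarfile.open(archive_path, 'w:gz') as archive:
        archive.add(dummy, arcname='model.bin')
    extracted = extract_model(archive_path, os.path.join(workdir, 'mobilenet_v2'))
    print(extracted)
```

On the device, the call would be `extract_model('mobilenet_v2.tar.gz', './mobilenet_v2')`.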

### Use DLR to read compiled model

In [None]:
from __future__ import print_function
from dlr import DLRModel
import numpy as np
import time

In [None]:
# Load the model
model_path = "./mobilenet_v2"

In [None]:
device = 'cpu'
model = DLRModel(model_path, device)

### Download an image to prepare for predictions

In [None]:
# Download a test image
!wget -O /tmp/test.jpg http://www.vision.caltech.edu/Image_Datasets/Caltech256/images/080.frog/080_0001.jpg
file_name = '/tmp/test.jpg'
# Display the test image
from IPython.display import Image
Image(file_name)

### Image pre-process

MobileNet V2 has a fixed per-image input shape of (224, 224, 3), matching the ``input_shape`` used at compile time. To utilize all 4 EVEs on the Sitara AM57xx, a batch size of at least 4 is required.

In [None]:
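import PIL.Image
image = PIL.Image.open(file_name)

# Resize to the model's input resolution
image = np.asarray(image.resize((224, 224)))

# Normalize pixel values from [0, 255] to [-1, 1]
image = image*2/255.0 - 1

# Replicate the image along a new batch axis so that all 4 EVEs are used
batch_size = 4
image = np.concatenate([image[np.newaxis, :, :]]*batch_size)

print(image.shape)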
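
The normalization step maps 8-bit pixel values linearly from [0, 255] to [-1, 1], and stacking four copies of a (224, 224, 3) image along a new leading axis gives a (4, 224, 224, 3) batch. A plain-Python sketch of that arithmetic, with hypothetical helper names `normalize` and `denormalize`:

```python
def normalize(pixel):
    """Map an 8-bit pixel value in [0, 255] to [-1, 1]."""
    return pixel * 2 / 255.0 - 1

def denormalize(value):
    """Inverse mapping, from [-1, 1] back to [0, 255]."""
    return (value + 1) * 255.0 / 2

# The darkest and brightest pixel values hit the ends of the range
print(normalize(0), normalize(255))

# Shape of the batch built by concatenating `batch_size` images on a new axis
image_shape = (224, 224, 3)
batch_size = 4
batch_shape = (batch_size,) + image_shape
print(batch_shape)
```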

### Inference

In [None]:
# Map the model's input name to the image batch
input_data = {'input': image}

print('Testing inference on mobilenet_v2...')
start_time = time.time()
out = model.run(input_data)  # inputs are passed as a dict keyed by input name
elapsed = time.time() - start_time
index = np.argmax(out[0][0,:])
prob = np.amax(out[0][0,:])
print('inference time is ' + str(elapsed/batch_size) + ' seconds')
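
The cell above times a single call, which may include one-time setup costs. A common way to get a steadier number is to discard a few warm-up runs and average over several repeats. The sketch below shows the idea with a hypothetical helper `average_latency` and a stand-in workload; on the device you would pass `lambda: model.run(input_data)` instead of `fake_inference`.

```python
import time

def average_latency(run_once, repeats=10, warmup=2):
    """Call run_once `warmup` times untimed, then return the mean time of `repeats` calls."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    return (time.perf_counter() - start) / repeats

def fake_inference():
    # Stand-in workload; not a real model call
    return sum(i * i for i in range(10000))

latency = average_latency(fake_inference)
print('average latency: %.6f seconds per call' % latency)
```

Dividing the result by `batch_size` gives the per-image figure reported in the cell above.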

In [None]:
# Load names for ImageNet classes
object_categories = {}
with open("imagenet1000_clsidx_to_labels.txt", "r") as f:
    for line in f:
        # Split on the first ':' only, in case the label text itself contains ':'
        key, val = line.strip().split(':', 1)
        object_categories[key.strip()] = val.strip()

print("Result: label - " + object_categories[str(index)] + " probability - " + str(prob))
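
Looking at only the single best class can hide near-ties, so it is often useful to print the few highest-scoring classes. A minimal sketch with a hypothetical `top_k` helper; the scores and labels below are made-up stand-ins for `out[0][0,:]` and `object_categories`.

```python
def top_k(scores, labels, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels.get(str(i), 'unknown'), scores[i]) for i in ranked[:k]]

# Made-up values for illustration only
scores = [0.05, 0.70, 0.25]
labels = {'0': 'cat', '1': 'frog', '2': 'toad'}
for name, score in top_k(scores, labels, k=2):
    print(name, score)
```

With the real model output, the call would be `top_k(out[0][0,:].tolist(), object_categories)`.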