### Update TensorFlow Version to 1.6.0

In [None]:
!pip install https://anaconda.org/intel/tensorflow/1.6.0/download/tensorflow-1.6.0-cp36-cp36m-linux_x86_64.whl --user 

# Part 1: Data Wrangling with Breeds on CPU

# Objective

Understand ways to find a data set and to prepare a data set for machine learning and training.

## Activities 
**In this section of the training you will**
- Transfer a data set from the shared location on the server to your current directory. 
- View your initial data
- Clean and normalize the data set
- Organize the data into training and testing groups 


# Find a Data set

### Research Existing Data Sets

Artificial intelligence projects depend upon data. When beginning a project, data scientists look for existing data sets that are similar to or match the given problem. This saves time and money, and leverages the work of others, building upon the body of knowledge for all future projects. 

Typically you begin with a search engine query. For this project, we were looking for a data set with an unencumbered license.  

This project starts with the Oxford-IIIT Pet Dataset (http://www.robots.ox.ac.uk/~vgg/data/pets/), a 37-category pet data set with roughly 200 images for each class. The images have large variations in scale, pose, and lighting. All images have an associated ground truth annotation of breed, head region of interest (ROI), and pixel-level trimap segmentation.


### Background
"The pet images were downloaded from Catster* and Dogster*, two social web sites dedicated to the collection and discussion of images of pets, from Flickr* groups, and from Google Images*. People uploading images to Catster and Dogster provide the breed information as well, and the Flickr groups are specific to each breed, which simplifies tagging. For each of the 37 breeds, about 2,000 – 2,500 images were downloaded from these data sources to form a pool of candidates for inclusion in the dataset. From this candidate list, images were dropped if any of the following conditions applied, as judged by the annotators: (i) the image was gray scale, (ii) another image portraying the same animal existed (which happens frequently in Flickr), (iii) the illumination was poor, (iv) the pet was not centered in the image, or (v) the pet was wearing clothes. The most common problem in all the data sources, however, was found to be errors in the breed labels. Thus labels were reviewed by the human annotators and fixed whenever possible. When fixing was not possible, for instance because the pet was a cross breed, the image was dropped.”

From *Cats and Dogs*, http://www.robots.ox.ac.uk/~vgg/publications/2012/parkhi12a/parkhi12a.pdf

# Fetch Your Data
![Fetch Data](assets/part1_1.jpg)

### Activity 
Click the cell below and then click **Run**.

In [None]:
!rm -rf breeds/
!mkdir -p breeds
!rsync -r --progress /data/aidata/breeds/original/ breeds/

!echo "Done."

<br>

# View the Baseline Data

Take a look at the images in your data set. This gives you some idea as to how much cleaning and normalizing will be required. 


![View and Understand Your Data](assets/part1_2.jpg)

### Activity

In the cell below, update the display_images function by changing the **numOfImages** parameter to a number from 1 to 5. Click **Save**, and then click **Run**.
 
*Hint: The display_images function draws an N x N grid of pet images, where N is **numOfImages**. The default value is left as **?**. Replace the **?** character with a number from 1 to 5; for example, **numOfImages = 3**.*


In [None]:
import os
import glob
import re
import random
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline

def get_category(file):
    m = re.search(r"\d", file, re.IGNORECASE)
    if m:
        return file[:m.start() - 1].lower().split("/")[1]

def display_images(file_names, numOfImages = ?):
    indices = random.sample(range(len(file_names)), numOfImages * numOfImages)
    train_images = [file_names[i] for i in indices]
    
    fig, axes = plt.subplots(nrows=numOfImages,ncols=numOfImages, figsize=(15,15), sharex=True, sharey=True, frameon=False)
    for i,ax in enumerate(axes.flat):
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        curr_i = train_images[i]
        imgplot = mpimg.imread(curr_i)
        ax.imshow(imgplot)
        ax.text(10,20,get_category(curr_i), fontdict={"backgroundcolor": "black","color": "white" })
        ax.axis('off')
    plt.tight_layout(h_pad=0, w_pad=0)    
    
    
display_images(glob.glob('breeds/*.jpg'))

print("Done.")

<br>

# Clean and Normalize the Data
Existing image recognition data sets often include images of multiple dimensions, color photos mixed with black-and-white photos, and sometimes line art alongside photos. File names may follow multiple formats, and the subject matter within the images may be single, multiple, in profile, straight-on face, back of head, or surrounded by a complex background. 
Cleaning and normalizing the data means fixing the inconsistencies so that machine processing can occur with minimal errors. Data cleaning is often tedious and requires a significant time commitment. 
Data preprocessing techniques include:
1. Data cleaning: Eliminates noise and resolves inconsistencies in the data.
2. Data integration: Migrates data from different sources into one coherent source, such as a data warehouse.
3. Data transformation: Standardizes or normalizes any form of data.
4. Data reduction: Reduces the size of the data by aggregating it.

Another name for this effort is extract, transform, and load (ETL).
This project required the team to normalize the file dimensions and file names, and to create the data layout expected by the framework. 

It is common for the data cleanup tasks to be paired with framework and topology selection because different topologies expect different data layouts and formats. When experimenting with different topologies it might be necessary to keep several copies of the data in various formats. Multiple copies of data sets can take up a lot of space, so make sure you have plenty of storage and processing capability.

![Clean and Normalize the Data](assets/part1_3.jpg)

## Activity
The code in the next cell performs some of the cleanup tasks. Review the code and notice that it is removing corrupt files, files with the wrong format, and files with incorrect metadata.

Click the cell below and then click **Run**.

In [None]:
import cv2

for file in glob.glob("breeds/*"):
    if not file.endswith(".jpg"):
        #Not ending in .jpg
        print("Deleting (.mat): " + file)
        os.remove(os.path.join(os.getcwd(), file))
    else: 
        flags = cv2.IMREAD_COLOR
        im = cv2.imread(file, flags)
        
        if im is None:
            #Can't read in image
            print("Deleting (None): " + file)
            os.remove(os.path.join(os.getcwd(), file))
            continue
        elif len(im.shape) != 3:
            #Wrong number of dimensions
            print("Deleting (len != 3): " + file)
            os.remove(os.path.join(os.getcwd(), file))
            continue
        elif im.shape[2] != 3:
            #Wrong amount of channels
            print("Deleting (shape[2] != 3): " + file)
            os.remove(os.path.join(os.getcwd(), file))
            continue
            
        with open(os.path.join(os.getcwd(), file), 'rb') as f:
            check_chars = f.read()
        if check_chars[-2:] != b'\xff\xd9':
            #Wrong ending metadata for jpg standard
            print('Deleting (xd9): ' + file)
            os.remove(os.path.join(os.getcwd(), file))
        elif check_chars[:4] != b'\xff\xd8\xff\xe0':
            #Wrong Start Marker / JFIF Marker metadata for jpg standard
            print('Deleting (xd8/xe0): ' + file)
            os.remove(os.path.join(os.getcwd(), file))
        elif check_chars[6:10] != b'JFIF':
            #Wrong Identifier metadata for jpg standard
            print('Deleting (JFIF): ' + file)
            os.remove(os.path.join(os.getcwd(), file))
        elif "beagle_116.jpg" in file or "chihuahua_121.jpg" in file:
            #Using EXIF Data to determine this
            print('Deleting (corrupt jpeg data): ', file)
            os.remove(os.path.join(os.getcwd(), file))  


print('Done.')
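
The byte-level checks in the cell above can be isolated into a small helper. The sketch below is illustrative only: `looks_like_jfif` is a hypothetical name that is not part of the notebook, and it assumes the same strict rule as the cell, namely that a file must start with the SOI and APP0 markers, carry the `JFIF` identifier at bytes 6 to 9, and end with the EOI marker.

```python
def looks_like_jfif(data):
    """Return True if the bytes pass the same three checks as the cleanup cell.

    Hypothetical helper for illustration. Valid JPEG files that use an
    Exif (APP1) header instead of JFIF (APP0) are rejected by this rule too.
    """
    starts_ok = data[:4] == b'\xff\xd8\xff\xe0'   # SOI marker followed by APP0
    ident_ok = data[6:10] == b'JFIF'              # JFIF identifier string
    ends_ok = data[-2:] == b'\xff\xd9'            # EOI marker
    return starts_ok and ident_ok and ends_ok


# A made-up byte string shaped like a minimal JFIF header plus an EOI marker.
sample = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00' + b'\x00' * 8 + b'\xff\xd9'
print(looks_like_jfif(sample))
print(looks_like_jfif(b'not a jpeg'))
```

Keeping the rule in one function makes it easier to relax later, for example if files with Exif headers should be kept.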

# Augment Your Data

Most of the time you’re cleaning data and removing noise. Since our app needs to work with images of wet, muddy, or injured animals, or perhaps blurry images because the animal is running away in fear, we actually need to ADD noise to the data set. 

We decided to add image noise by building a small program to flip, flop, blur, and extract color channels from the images in the data set. These actions expanded our training data set by 6x.
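
To make these operations concrete, here is a toy sketch on a 2x2 image stored as nested lists of (R, G, B) tuples. It is not the program the team used: the function names are hypothetical, blur is omitted, and a real pipeline would use an imaging library. In ImageMagick terms, flop mirrors left to right and flip mirrors top to bottom.

```python
def flop(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip(image):
    """Mirror the rows top to bottom."""
    return image[::-1]


def extract_channel(image, channel):
    """Keep one color channel (0=R, 1=G, 2=B) as a single-channel image."""
    return [[pixel[channel] for pixel in row] for row in image]


# A made-up 2x2 RGB image.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (10, 20, 30)]]

variants = [flop(tiny), flip(tiny)] + [extract_channel(tiny, c) for c in range(3)]
print(len(variants), "variants derived from one image")
```

Each original image yields one new image per operation, which is how a handful of simple transforms multiplies the size of the training set.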

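The augmentation commands in the cell below are commented out. The active code resizes every image to 224x224 on a black background, using a process pool to spread the work across all available processors.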

In [None]:
#%%bash

#echo "Start resizing to 227x227"
#parallel -j 200 convert {} -resize 227x227 -filter spline -unsharp 0x6+0.5+0 -background black -gravity center -extent 227x227  {} ::: *.jpg
#echo "Resizing done"

#mkdir flop
#echo "Start augmentation 1"
#parallel -j 200 convert {} -flop flop/{.}-flop.jpg ::: *.jpg
#echo "Finish augmetation 1"

#mkdir flip
#echo "Start augmentation 2"
#parallel -j 200 convert {} -transverse -rotate 90 flip/{.}-flip.jpg ::: *.jpg
#echo "Finish augmetation 2"

#mkdir blur
#echo "Start augmentation 3"
#parallel -j 200 convert {} -blur 0x1 blur/{.}-blur.jpg ::: *.jpg
#echo "Finish augmetation 3"

#mkdir red
#echo "Start augmentation 4"
#parallel -j 200 convert {} -channel R -separate red/{.}-red.jpg ::: *.jpg
#echo "Finish augmetation 4"

#mkdir blue
#echo "Start augmentation 5"
#parallel -j 200 convert {} -channel B -separate blue/{.}-blue.jpg ::: *.jpg
#echo "Finish augmetation 5"

#mkdir green
#echo "Start augmentation 6"
#parallel -j 200 convert {} -channel G -separate green/{.}-green.jpg ::: *.jpg
#echo "Finish augmetation 6"

#echo "Copying augmented data to main folder"
#cp flop/* flip/* blur/* red/* blue/* green/* .

#echo "Augmentation done"

from multiprocessing import Pool
from PIL import Image
import sys

def resize_image(file, size=224):
    black_background = Image.new('RGB', (size, size), "black")
    img = Image.open(file)
    img.thumbnail((size,size))
    x, y = img.size
    black_background.paste(img, (int((size - x) / 2), int((size - y) / 2)))
    black_background.save(file)
    return black_background
  
files = glob.glob("breeds/*")
with Pool() as pool:
    # imap yields results as they finish, so progress is reported while work is still running
    for i, _ in enumerate(pool.imap(resize_image, files)):
        if i % 10 == 0:
            sys.stdout.write('\r{0} out of {1} processed'.format(i+1, len(files)))
        
sys.stdout.write('\n')
sys.stdout.flush()

display_images(glob.glob('breeds/*.jpg'))

print("Done.")

<br>
<br>

# Organize Data for Consumption by TensorFlow*

The framework you choose for your project determines how you need to organize your data. After extensive experimentation we selected TensorFlow for this project. This section describes how to organize your data layers.

We are splitting the images into training and validation sets, with 80 percent of the images used for training and 20 percent used for validation.  Our data needs to be organized in a specific manner: each image goes in a folder named after the category it belongs to.  

We'll create a train and a validation folder.  Within those folders, we'll have directories with each category name and then the respective images within their category folder.
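
The split itself is simple arithmetic on a sorted list of file names. The sketch below shows the idea for a single category; `split_train_validation` and the beagle file names are hypothetical and only illustrate the 80/20 cut.

```python
import math


def split_train_validation(files, train_ratio=0.8):
    """Split one category's files into (train, validation) lists."""
    ordered = sorted(files)
    cut = math.floor(len(ordered) * train_ratio)  # index of the first validation file
    return ordered[:cut], ordered[cut:]


# Hypothetical file names for a single breed.
beagle_files = ["beagle_%d.jpg" % i for i in range(1, 11)]
train, validation = split_train_validation(beagle_files)
print(len(train), "train files,", len(validation), "validation files")
```

Splitting per category, rather than across the whole data set at once, keeps every breed represented in both folders.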

This next cell of code creates the data layout as expected.

![Organize Data for Consumption by Framework](assets/part1_4.jpg)

### Activity 
In the cell below, set the **train_ratio** to **0.8** and then click **Run**.

*Hint: We set the train_ratio = ? to a value between 0 and 1 to define our train and validation split*.

In [None]:
import os
import re
import errno
import math
import sys

def get_category(file):
    m = re.search(r"\d", file, re.IGNORECASE)
    if m:
        return file[:m.start() - 1].lower()

def make_sure_path_exists(path):
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise

train_ratio = ?
        
file_names = os.listdir('breeds')
category_names = [ get_category(file) for file in file_names]
category_names = [ name for name in category_names if name is not None ]
category_names = sorted(list(set(category_names)))
for category in category_names:
    make_sure_path_exists("breeds/train/" + str(category))
    make_sure_path_exists("breeds/validation/" + str(category)) 

   
for idx, category in enumerate(category_names):
    category_list = []
    for file in file_names:
        # Compare exact category names so one breed name cannot match inside another
        if get_category(file) == category:
            category_list.append(file)
    
    category_list = sorted(category_list)
    split_ratio = math.floor(len(category_list) * train_ratio)
    train_list = category_list[:split_ratio]
    validation_list = category_list[split_ratio:]
    for i, file in enumerate(train_list):
        os.rename("breeds/" + file, "breeds/train/" + str(category) + "/" + file)
        if i % 10 == 0:
            sys.stdout.write('\r>> Moving train image %d to category folder %s' % (i+1, category))
            sys.stdout.flush()
        
    sys.stdout.write('\n')
    sys.stdout.flush()        
        
    for i, file in enumerate(validation_list):
        os.rename("breeds/" + file, "breeds/validation/" + str(category) + "/" + file)
        if i % 10 == 0:
            sys.stdout.write('\r>> Moving validation image %d to category folder %s' % (i+1, category))
            sys.stdout.flush()
                
    sys.stdout.write('\n')
    sys.stdout.flush()      

print("Done.")

<br>
<br>
 
# Confirm Folder Structure is Correct

We now have train and validation folders, each holding 37 breed folders with the pictures of that breed inside.

![Confirm Folder Structure is Correct](assets/part1_5.jpg)

### Activity 
Click the cell below and then click **Run**.

In [None]:
for root, dirs, files in os.walk("breeds"):
    level = root.replace(os.getcwd(), '').count(os.sep)
    print('{0}{1}/'.format('    ' * level, os.path.basename(root)))
    for f in files[:5]:
        print('{0}{1}'.format('    ' * (level + 1), f))
print("Done.")

# Optimize Data for Ingestion

### Data Input/Output
A TFRecords file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired. See Data IO (Python Functions), https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details

### Standard TensorFlow* Format
Another approach is to convert whatever data you have into a supported format. This approach makes it easier to mix and match data sets and network architectures. The recommended format for TensorFlow is a TFRecords file containing tf.train.Example protocol buffers (which contain Features as a field). You write a little program that gets your data, stuffs it in an Example protocol buffer, serializes the protocol buffer to a string, and then writes the string to a TFRecords file using the tf.python_io.TFRecordWriter. See https://www.tensorflow.org/versions/r1.0/programmers_guide/reading_data#file_formats

![Optimize Data for Ingestion](assets/part1_6.jpg)

### Activity

When creating a TFRecord file you can split the dataset into shards.  This can be especially beneficial if you have a particularly large dataset and don't want to end up with a single 1+GB file.  

Below, we split the files of each data set into the number of shards given by **\_NUM\_SHARDS**.
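
As a rough illustration of how files are divided among shards, the sketch below computes the index range and file name for each shard with the same ceiling division and naming pattern the conversion cell uses. The helper names `shard_ranges` and `shard_filename` are hypothetical, and the counts are made up.

```python
import math


def shard_ranges(num_files, num_shards):
    """Return (start, end) index pairs, one per shard, with end exclusive."""
    per_shard = int(math.ceil(num_files / float(num_shards)))
    return [(shard * per_shard, min((shard + 1) * per_shard, num_files))
            for shard in range(num_shards)]


def shard_filename(split_name, shard_id, num_shards, prefix="breeds"):
    """Build a shard file name such as breeds_train_00000-of-00003.tfrecord."""
    return '%s_%s_%05d-of-%05d.tfrecord' % (prefix, split_name, shard_id, num_shards)


# Ten files spread over three shards.
for shard_id, (start, end) in enumerate(shard_ranges(10, 3)):
    print(shard_filename("train", shard_id, 3), "covers indices", start, "to", end - 1)
```

Because the per-shard count is rounded up, the last shard can hold fewer files than the others.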

In the cell below, set **\_NUM\_SHARDS** to a value between **1** and **5** and then click **Run**.

In [None]:
import math
import os
import sys
import tensorflow as tf

_NUM_SHARDS = ?
_SHARD_NAME = "breeds"
LABELS_FILENAME = 'labels.txt'

class ImageReader(object):
    def __init__(self):
        # Initializes function that decodes RGB JPEG data.
        self._decode_jpeg_data = tf.placeholder(dtype=tf.string)
        self._decode_jpeg = tf.image.decode_jpeg(self._decode_jpeg_data, channels=3)

    def read_image_dims(self, sess, image_data):
        image = self.decode_jpeg(sess, image_data)
        return image.shape[0], image.shape[1]

    def decode_jpeg(self, sess, image_data):
        image = sess.run(self._decode_jpeg,
                         feed_dict={self._decode_jpeg_data: image_data})
        assert len(image.shape) == 3
        assert image.shape[2] == 3
        return image

def int64_feature(values):
    if not isinstance(values, (tuple, list)):
        values = [values]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def bytes_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values]))
    
def image_to_tfexample(image_data, image_format, height, width, class_id):
    return tf.train.Example(features=tf.train.Features(feature={
      'image/encoded': bytes_feature(image_data),
      'image/format': bytes_feature(image_format),
      'image/class/label': int64_feature(class_id),
      'image/height': int64_feature(height),
      'image/width': int64_feature(width),
    }))

def write_label_file(labels_to_class_names, dataset_dir,
                     filename=LABELS_FILENAME):
    labels_filename = os.path.join(dataset_dir, filename)
    with tf.gfile.Open(labels_filename, 'w') as f:
        for label in labels_to_class_names:
            class_name = labels_to_class_names[label]
            f.write('%d:%s\n' % (label, class_name))


def _get_filenames_and_classes(dataset_dir, sorted_dir):
    breeds_root = os.path.join(dataset_dir, sorted_dir)
    directories = []
    class_names = []
    for filename in os.listdir(breeds_root):
        path = os.path.join(breeds_root, filename)
        if os.path.isdir(path):
            directories.append(path)
            class_names.append(filename)

    photo_filenames = []
    for directory in directories:
        for filename in os.listdir(directory):
            path = os.path.join(directory, filename)
            photo_filenames.append(path)

    return photo_filenames, sorted(class_names)


def _get_dataset_filename(dataset_dir, split_name, shard_id):
    output_filename = _SHARD_NAME + '_%s_%05d-of-%05d.tfrecord' % (
      split_name, shard_id, _NUM_SHARDS)
    return os.path.join(dataset_dir, output_filename)


def _convert_dataset(split_name, filenames, class_names_to_ids, dataset_dir):
    assert split_name in ['train', 'validation']

    num_per_shard = int(math.ceil(len(filenames) / float(_NUM_SHARDS)))

    with tf.Graph().as_default():
        image_reader = ImageReader()

        with tf.Session('') as sess:

            for shard_id in range(_NUM_SHARDS):
                output_filename = _get_dataset_filename(
                    dataset_dir, split_name, shard_id)

                with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:
                    start_ndx = shard_id * num_per_shard
                    end_ndx = min((shard_id+1) * num_per_shard, len(filenames))
                    for i in range(start_ndx, end_ndx):
                        sys.stdout.write('\r>> Converting image %d/%d shard %d' % (
                            i+1, len(filenames), shard_id))
                        sys.stdout.flush()

                        # Read the filename:
                        image_data = tf.gfile.FastGFile(filenames[i], 'rb').read()
                        height, width = image_reader.read_image_dims(sess, image_data)

                        class_name = os.path.basename(os.path.dirname(filenames[i]))
                        class_id = class_names_to_ids[class_name]

                        example = image_to_tfexample(
                            image_data, b'jpg', height, width, class_id)
                        tfrecord_writer.write(example.SerializeToString())

    sys.stdout.write('\n')
    sys.stdout.flush()


def _dataset_exists(dataset_dir):
    for split_name in ['train', 'validation']:
        for shard_id in range(_NUM_SHARDS):
            output_filename = _get_dataset_filename(
              dataset_dir, split_name, shard_id)
            if not tf.gfile.Exists(output_filename):
                return False
    return True

print("Done.")

### Activity
TensorFlow requires separate data sets for training and validation, and the data must be stored in two separate records. Why separate image sets for training and validation? To detect *overfitting*, which would go unnoticed if you trained and tested on the same images. You train on one set, then test on a different set to validate that the machine is truly learning to recognize the images rather than memorizing them. 

Our records will contain the words **train** and **validation** in their path to distinguish between the two. We used the common ratio of 80 percent train and 20 percent test/validation to split the data set.

In the cell below, set the first parameter of the two **\_convert\_dataset** function calls to **"train"** and **"validation"**, and then click **Run**.

*Hint: Look at the filenames being passed into each **\_convert\_dataset** call and make sure the label you put in place of the **?????** matches them.*

In [None]:
import random

def run(dataset_dir):
    if not tf.gfile.Exists(dataset_dir):
        tf.gfile.MakeDirs(dataset_dir)

    if _dataset_exists(dataset_dir):
        print('Dataset files already exist. Exiting without re-creating them.')
        return

    train_photo_filenames, class_names = _get_filenames_and_classes(dataset_dir, "train")
    validation_photo_filenames, class_names = _get_filenames_and_classes(dataset_dir, "validation")
    class_names_to_ids = dict(zip(class_names, range(len(class_names))))
    
    # First, convert the training and validation sets.
    _convert_dataset(?????, train_photo_filenames, class_names_to_ids,
                   dataset_dir)
    _convert_dataset(????????, validation_photo_filenames, class_names_to_ids,
                   dataset_dir)

    # Finally, write the labels file:
    labels_to_class_names = dict(zip(range(len(class_names)), class_names))
    write_label_file(labels_to_class_names, dataset_dir)

    print('\nFinished converting the Breeds dataset!')

run('breeds')
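
The `write_label_file` function stores one `id:name` pair per line in `labels.txt`. Reading that file back into a dictionary could look like the sketch below; `parse_labels` is a hypothetical helper and the three breed names are sample data, not the full label list.

```python
def parse_labels(text):
    """Turn 'id:name' lines into a {id: name} dictionary."""
    labels = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label_id, name = line.split(':', 1)
        labels[int(label_id)] = name
    return labels


# Sample content in the same format that write_label_file produces.
sample_text = "0:abyssinian\n1:american_bulldog\n2:beagle\n"
print(parse_labels(sample_text))
```

A mapping like this is what lets you turn a predicted class index back into a breed name at inference time.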

<br>

### After All of This Data Wrangling We Can Actually Begin the Training Process

When we started this project, we always had an edge device in mind as our ultimate deployment platform. To that end we always considered three things when selecting our topology or network: time to train, size, and inference speed. 

**Time to Train:** Depending on the number of layers and computation required, a network can take a significantly shorter or longer time to train. Computation time and programmer time are costly resources, so we wanted short training times.  

**Size:** Since we're targeting an edge device and an Intel® Movidius™ Neural Compute Stick, we must consider the size of the network that is allowed in memory as well as which networks are supported.

**Inference Speed:** Typically the deeper and larger the network, the slower the inference speed. In our use case we are working with a live video stream; we want at least 10 frames per second on inference.

At this point we're going to continue with the TensorFlow framework plus the GoogLeNet Inception* v1 topology/network since we're currently working on a simpler dataset.


![GoogLeNet](assets/googlenet.png)

# Part 2: Training CatVsDog with TensorFlow and GoogLeNet Inception* v1 on CPU

# Objective 
Understand the stages of preparing for training using the TensorFlow framework and a GoogLeNet Inception v1 topology. You will initiate training, view a completed graph, and learn about the relationship between accuracy and loss.

# Activities 
**In this section of the training you will**
- Download pretrained model
- Clone TensorFlow/models Github* repo
- Modify/add files within repo to add our dataset
- Initiate training and review live training logs

### Pretrained Models
"Neural nets work best when they have many parameters, making them powerful function approximators. However, this means they must be trained on very large datasets. Because training models from scratch can be a very computationally intensive process requiring days or even weeks, we are using a pre-trained model provided by Google. This CNN has been trained on the ILSVRC-2012-CLS image classification dataset." Adapted from https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models.

![Download pre-trained model](assets/part2_1.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!wget http://download.tensorflow.org/models/inception_v1_2016_08_28.tar.gz
!tar xf inception_v1_2016_08_28.tar.gz
!rm -rf checkpoints
!mkdir checkpoints
!mv inception_v1.ckpt checkpoints
!rm inception_v1_2016_08_28.tar.gz
!echo "Done."

### TensorFlow/Models

The TensorFlow team provides convenient wrappers around much of the functionality needed when training with TensorFlow.  Below, we're going to clone the repo that contains them so that we have access to those wrappers.

![Clone TensorFlow Models Repo](assets/part2_2.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!git clone --depth 1 https://github.com/tensorflow/models
!echo "Done."    

### Adding to the Slim Datasets

To use the wrappers, we need to modify some existing code in the repo and add a new file.  Below we're overwriting the **dataset_factory.py** file with a slightly modified version that adds a Python* import statement and knows about our breeds dataset.  We're also copying over **breeds.py**, since it contains information specific to our dataset that is used by the **dataset_factory**.  The cell also overwrites **train_image_classifier.py** with the modified copy supplied with this training.

![Modify Repo Scripts](assets/part2_3.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!cp breeds.py models/research/slim/datasets/breeds.py
!cp dataset_factory_modified.py models/research/slim/datasets/dataset_factory.py
!cp train_image_classifier_modified.py models/research/slim/train_image_classifier.py
!echo "Done."

# Start Training

Let’s start training with TensorFlow.

CPUs, including Intel® Xeon Phi™ processors, achieve optimal performance when TensorFlow is built from source with all of the instructions supported by the target CPU.

Beyond using the latest instruction sets, Intel has added support for the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) to TensorFlow. While the name is not completely accurate, these optimizations are often simply referred to as MKL or *TensorFlow with MKL*. *TensorFlow with Intel MKL-DNN* contains details on the Intel® MKL optimizations.

The two configurations listed below are used to optimize CPU performance by adjusting the thread pools.

- **intra_op_parallelism_threads**: Nodes that can use multiple threads to parallelize their execution will schedule the individual pieces into this pool.
- **inter_op_parallelism_threads**: All ready nodes are scheduled in this pool.

These options are set on a tf.ConfigProto, which is passed to tf.Session through its config argument. In this training they are supplied to the training script through the --inter_op and --intra_op flags. If either option is unset or set to zero, it defaults to the number of logical CPU cores. Testing has shown that the default is effective for systems ranging from one CPU with 4 cores to multiple CPUs with 70+ combined logical cores. A common alternative optimization is to set the number of threads in both pools equal to the number of physical cores rather than logical cores.

Intel MKL uses the following environment variables to tune performance:

**KMP_BLOCKTIME** - Sets the time, in milliseconds, that a thread should wait, after completing the execution of a parallel region, before sleeping.

**KMP_AFFINITY** - Enables the runtime library to bind threads to physical processing units.

**KMP_SETTINGS** - Enables (true) or disables (false) the printing of OpenMP* runtime library environment variables during program execution.

**OMP_NUM_THREADS** - Specifies the number of threads to use.

See *Optimizing for CPU*, https://www.tensorflow.org/performance/performance_guide#optimizing_for_cpu.
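
One practical detail is that environment variables are always strings, so numeric settings have to be converted before they are assigned. The sketch below builds the four settings as a dictionary and applies them; `mkl_environment` is a hypothetical helper, and the values 12 and 1 are simply the ones this training uses, not general recommendations.

```python
import os


def mkl_environment(omp_num_threads, kmp_blocktime):
    """Build the four settings as strings; os.environ rejects non-string values."""
    return {
        "KMP_BLOCKTIME": str(kmp_blocktime),
        "KMP_AFFINITY": "granularity=fine,compact,1,0",
        "KMP_SETTINGS": "1",
        "OMP_NUM_THREADS": str(omp_num_threads),
    }


settings = mkl_environment(omp_num_threads=12, kmp_blocktime=1)
os.environ.update(settings)
for name in sorted(settings):
    print(name, "=", os.environ[name])
```

Collecting the settings in one place also makes it easy to log exactly which configuration a training run used.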

**Best Settings for Intel® Xeon Processor - 5th Generation  (2 Socket -- 44 Cores)**
![Tensorflow Optimization](assets/tf_optimize.png)

![Optimize Performance for CPU](assets/part2_4.jpg)

### Activity
In the cell below, update **OMP_NUM_THREADS** to **"12"**, **KMP_BLOCKTIME** to **"1"**, and then click **Run**.

In [None]:
import os
import tensorflow as tf

os.environ["KMP_BLOCKTIME"] = ?
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"
os.environ["KMP_SETTINGS"] = "1"
os.environ["OMP_NUM_THREADS"] = ??
print("Done.")

### Fine-Tuning a Model from an Existing Checkpoint

"Rather than training from scratch, we'll often want to start from a pre-trained model and fine-tune it. To indicate a checkpoint from which to fine-tune, we'll call training with the --checkpoint_path flag and assign it an absolute path to a checkpoint file.

When fine-tuning a model, we need to be careful about restoring checkpoint weights. In particular, when we fine-tune a model on a new task with a different number of output labels, we wont be able restore the final logits (classifier) layer. For this, we'll use the --checkpoint_exclude_scopes flag. This flag hinders certain variables from being loaded. When fine-tuning on a classification task using a different number of classes than the trained model, the new model will have a final 'logits' layer whose dimensions differ from the pre-trained model. For example, if fine-tuning an ImageNet-trained model on Flowers, the pre-trained logits layer will have dimensions [2048 x 1001] but our new logits layer will have dimensions [2048 x 5]. Consequently, this flag indicates to TF-Slim to avoid loading these weights from the checkpoint.

Keep in mind that warm-starting from a checkpoint affects the model's weights only during the initialization of the model. Once a model has started training, a new checkpoint will be created in --train_dir. If the fine-tuning training is stopped and restarted, this new checkpoint will be the one from which weights are restored and not the --checkpoint_path. Consequently, the flags --checkpoint_path and --checkpoint_exclude_scopes are only used during the 0-th global step (model initialization). Typically for fine-tuning one only want train a sub-set of layers, so the flag --trainable_scopes allows to specify which subsets of layers should trained, the rest would remain frozen." See https://github.com/tensorflow/models/tree/master/research/slim#fine-tuning-a-model-from-an-existing-checkpoint.
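
The effect of --checkpoint_exclude_scopes can be pictured as a prefix filter over variable names. The sketch below is illustrative only: it works on plain strings, the variable names are made up in the style of an Inception v1 graph, and `variables_to_restore` is a hypothetical helper rather than the TF-Slim implementation.

```python
def variables_to_restore(variable_names, exclude_scopes):
    """Drop every variable whose name starts with one of the excluded scopes."""
    scopes = [scope.strip() for scope in exclude_scopes.split(',') if scope.strip()]
    return [name for name in variable_names
            if not any(name.startswith(scope) for scope in scopes)]


# Made-up variable names for illustration.
names = [
    "InceptionV1/Conv2d_1a_7x7/weights",
    "InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/weights",
    "InceptionV1/Logits/Conv2d_0c_1x1/weights",
    "InceptionV1/Logits/Conv2d_0c_1x1/biases",
]
for name in variables_to_restore(names, "InceptionV1/Logits"):
    print("restore:", name)
```

Everything outside the excluded scope is loaded from the checkpoint, while the final classifier layer starts from fresh weights sized for the new number of classes.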

![Fine-Tune a Model](assets/part2_5.jpg)

### Activity
In the cell below, update the **max_number_of_steps** parameter to a number between **500** and **1500**, the **intra_op** parameter to the number **12** and then click **Run**.

In [None]:
!rm -rf train_dir
!mkdir train_dir

!python models/research/slim/train_image_classifier.py \
    --train_dir=train_dir \
    --dataset_name=breeds \
    --dataset_split_name=train \
    --clone_on_cpu=true \
    --dataset_dir=breeds \
    --model_name=inception_v1 \
    --checkpoint_path=checkpoints/inception_v1.ckpt \
    --checkpoint_exclude_scopes=InceptionV1/Logits \
    --trainable_scopes=InceptionV1/Logits \
    --max_number_of_steps=???? \
    --learning_rate=0.01 \
    --batch_size=32 \
    --save_interval_secs=60 \
    --save_summaries_secs=60 \
    --inter_op=2 \
    --intra_op=??

!echo "Done."

# Part 3: Evaluate, Freeze and Test Your Training Results

### Evaluate Your Latest Training Checkpoint

Earlier we created a TFRecord file with our validation images.  Below, we'll be using our validation set to determine our accuracy by running the eval_image_classifier script.  It will give us the Accuracy and Recall for Top 5.
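
For intuition about the two metrics, the sketch below computes top-1 accuracy and top-5 recall from raw class scores in plain Python. It is not the evaluation script's code; the function names and the score values are made up for illustration.

```python
def top_k_hit(scores, true_label, k):
    """True if true_label is among the k highest-scoring class indices."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label in ranked[:k]


def accuracy_and_recall_at_k(all_scores, labels, k=5):
    """Return (top-1 accuracy, top-k recall) over a batch of predictions."""
    n = len(labels)
    top1 = sum(top_k_hit(s, y, 1) for s, y in zip(all_scores, labels)) / n
    topk = sum(top_k_hit(s, y, k) for s, y in zip(all_scores, labels)) / n
    return top1, topk


# Two made-up predictions over six classes.
scores = [[0.10, 0.50, 0.10, 0.10, 0.10, 0.10],   # true class 1 ranks first
          [0.30, 0.25, 0.20, 0.15, 0.06, 0.04]]   # true class 3 ranks fourth
print(accuracy_and_recall_at_k(scores, labels=[1, 3], k=5))
```

Top-5 recall is always at least as high as top-1 accuracy, because a correct first guess also counts as a top-5 hit.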

![Evaluate Your Checkpoint](assets/part3_1.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!rm -rf eval_dir
!mkdir eval_dir
!python models/research/slim/eval_image_classifier.py \
    --checkpoint_path=$(ls -t train_dir/model.ckpt* | head -1 | rev | cut -d '.' -f2- | rev) \
    --eval_dir=eval_dir \
    --dataset_dir=breeds \
    --dataset_name=breeds \
    --dataset_split_name=validation \
    --model_name=inception_v1

!echo "Done."    

### Export Your Inference Graph of Inception v1

We want to export the inference graph of Inception v1 so we can use it later to create a frozen graph (.pb) file.  Below, we'll run the export_inference_graph script, which takes the Inception v1 model and our dataset and writes the graph definition to a .pb file.  Passing in our dataset is important because it ensures the final layer is created with 37 categories rather than the 1000 from ImageNet.

![Export Inference Graph](assets/part3_2.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!python models/research/slim/export_inference_graph.py \
    --alsologtostderr \
    --model_name=inception_v1 \
    --image_size=224 \
    --batch_size=1 \
    --output_file=train_dir/inception_v1_inf_graph.pb \
    --dataset_name=breeds
    
!echo "Done."    

### Clone the Main TensorFlow Repo

We're cloning the main tensorflow/tensorflow repository since it contains the script to create a frozen graph.

![Clone TensorFlow Repo](assets/part3_3.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!git clone --depth 1 https://github.com/tensorflow/tensorflow.git
    
!echo "Done."    

### Freeze Your Graph

Freezing your graph takes the inference graph definition we created above and the latest checkpoint file created during training, and merges the two into a single file.  This gives you the graph definition and the weights together in one convenient file for deployment.
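
The cells in this notebook locate the latest checkpoint with `ls -t train_dir/model.ckpt* | head -1 | rev | cut -d '.' -f2- | rev`: list the checkpoint files newest first, take the first one, and drop everything after the last dot so that only the checkpoint prefix remains. The string-handling part can be sketched in Python as follows. The function name `checkpoint_prefix` and the step number 1000 in the file names are hypothetical, used only for illustration.

```python
import os

def checkpoint_prefix(filename):
    """Drop the final dot-separated field (for example '.index' or '.meta')."""
    prefix, _extension = os.path.splitext(filename)
    return prefix

# Hypothetical file names of the kind a training run writes to train_dir.
examples = [
    "train_dir/model.ckpt-1000.index",
    "train_dir/model.ckpt-1000.meta",
    "train_dir/model.ckpt-1000.data-00000-of-00001",
]

prefixes = [checkpoint_prefix(name) for name in examples]
for name, prefix in zip(examples, prefixes):
    print(name, "->", prefix)
```

All files belonging to one checkpoint share the same prefix, so it does not matter which of them happens to be the newest file in the listing.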

![Freeze Your Graph](assets/part3_4.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!python tensorflow/tensorflow/python/tools/freeze_graph.py \
    --clear_devices=true \
    --input_graph=train_dir/inception_v1_inf_graph.pb \
    --input_checkpoint=$(ls -t train_dir/model.ckpt* | head -1 | rev | cut -d '.' -f2- | rev) \
    --input_binary=true \
    --output_graph=train_dir/frozen_inception_v1.pb \
    --output_node_names=InceptionV1/Logits/Predictions/Reshape_1
    
!echo "Done."    

### Look at a Sample Image

We're going to use this image to run through the network and see the results.

![Display a Sample Image](assets/part3_5.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
from PIL import Image

Image.open('breeds/train/maine_coon/Maine_Coon_100.jpg')

### Inference on an Image

We can use the newly created frozen graph file to test a sample image.  We're using the label_image script, which takes an image, a frozen graph, and a labels.txt file, and displays the top five probabilities for the given image.
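
As a rough illustration of the final step, pairing the network's output probabilities with the label names and keeping the five highest, here is a minimal plain-Python sketch. It is not the label_image script itself; the function name `top_k_labels`, the short label list, and the probability values are all made up for the example.

```python
# Illustrative top-k selection; labels and probabilities are hypothetical.

def top_k_labels(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(probabilities, labels), key=lambda pair: pair[0], reverse=True)
    return [(label, prob) for prob, label in ranked[:k]]

sample_labels = ["beagle", "bengal", "maine_coon", "persian", "pug", "ragdoll", "siamese"]
sample_probs = [0.01, 0.06, 0.62, 0.12, 0.02, 0.14, 0.03]

top_five = top_k_labels(sample_probs, sample_labels, k=5)
for label, prob in top_five:
    print(label, prob)
```

The order of `sample_labels` must match the order of the network's output classes, which is the role labels.txt plays in the cell below.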

![Inference on Image](assets/part3_6.jpg)

### Activity
Click the cell below and then click **Run**.

In [None]:
!python tensorflow/tensorflow/examples/label_image/label_image.py \
    --image=breeds/train/maine_coon/Maine_Coon_100.jpg \
    --input_layer=input \
    --input_height=224 \
    --input_width=224 \
    --output_layer=InceptionV1/Logits/Predictions/Reshape_1 \
    --graph=train_dir/frozen_inception_v1.pb \
    --labels=breeds/labels.txt
    
print("Done.")    

### Summary

- Getting your dataset
- Sorting your dataset
- Generating TFRecord files
- Learning about fine-tuning and checkpoints
- Training on your dataset from a fine-tuning checkpoint
- Evaluating your training
- Creating a frozen graph
- Using a frozen graph to test image classification

# Part 4: Additional Fine Tuning (Optional)

### Fine Tuning the Entire Network

We previously fine-tuned only the final layer of the network.  Now we're going to allow all of the layers in the network to be trained, but with a much lower learning rate.  This lets the network make small adjustments to the remaining weights, which so far are unchanged from the ImageNet checkpoint.  We don't want to train too long, though, or we might start to overfit, so we'll limit the steps to about 500-1500.
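
To see why the lower learning rate matters, consider a single weight update. The sketch below uses a plain gradient-descent step as a simplifying assumption; the optimizer used by the training script may differ, and the weight and gradient values are made up. The two learning rates are the ones used in this notebook: 0.01 for the last-layer run and 0.0001 for the full-network run.

```python
# Simplified illustration: one plain gradient-descent step.
# The real training script may use a different optimizer.

def sgd_step(weight, gradient, learning_rate):
    """Move the weight against the gradient, scaled by the learning rate."""
    return weight - learning_rate * gradient

weight, gradient = 0.5, 2.0  # hypothetical values

last_layer_step = sgd_step(weight, gradient, 0.01)
full_network_step = sgd_step(weight, gradient, 0.0001)

print("change at lr=0.01:  ", abs(last_layer_step - weight))
print("change at lr=0.0001:", abs(full_network_step - weight))
```

For the same gradient, the update at the lower rate is one hundred times smaller, so the pre-trained weights are nudged rather than overwritten.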

### Activity
In the cell below, update the **max_number_of_steps** parameter to a number between **500** and **1500**, the **learning_rate** to **0.0001** and then click **Run**.

In [None]:
!python models/research/slim/train_image_classifier.py \
    --train_dir=train_dir/all \
    --dataset_name=breeds \
    --dataset_split_name=train \
    --clone_on_cpu=true \
    --dataset_dir=breeds \
    --model_name=inception_v1 \
    --checkpoint_path=train_dir \
    --max_number_of_steps=???? \
    --learning_rate=???? \
    --learning_rate_decay_type=fixed \
    --batch_size=32 \
    --save_interval_secs=60 \
    --save_summaries_secs=60 \
    --inter_op=2 \
    --intra_op=12

!echo "Done."

### Activity
Click the cell below and then click **Run**.

In [None]:
!python models/research/slim/eval_image_classifier.py \
    --checkpoint_path=$(ls -t train_dir/all/model.ckpt* | head -1 | rev | cut -d '.' -f2- | rev) \
    --eval_dir=eval_dir/all \
    --dataset_dir=breeds \
    --dataset_name=breeds \
    --dataset_split_name=validation \
    --model_name=inception_v1

!echo "Done."    

### Activity
Click the cell below and then click **Run**.

In [None]:
!python tensorflow/tensorflow/python/tools/freeze_graph.py \
    --clear_devices=true \
    --input_graph=train_dir/inception_v1_inf_graph.pb \
    --input_checkpoint=$(ls -t train_dir/all/model.ckpt* | head -1 | rev | cut -d '.' -f2- | rev) \
    --input_binary=true \
    --output_graph=train_dir/all/frozen_inception_v1.pb \
    --output_node_names=InceptionV1/Logits/Predictions/Reshape_1
    
!echo "Done."    

### Activity
Click the cell below and then click **Run**.

In [None]:
from PIL import Image

Image.open('breeds/train/maine_coon/Maine_Coon_100.jpg')

### Activity
Click the cell below and then click **Run**.

In [None]:
!python tensorflow/tensorflow/examples/label_image/label_image.py \
    --image=breeds/train/maine_coon/Maine_Coon_100.jpg \
    --input_layer=input \
    --input_height=224 \
    --input_width=224 \
    --output_layer=InceptionV1/Logits/Predictions/Reshape_1 \
    --graph=train_dir/all/frozen_inception_v1.pb \
    --labels=breeds/labels.txt
    
print("Done.")    

### Resources

TensorFlow* Optimizations on Modern Intel® Architecture, https://software.intel.com/en-us/articles/tensorflow-optimizations-on-modern-intel-architecture

Intel Optimized TensorFlow Wheel Now Available, https://software.intel.com/en-us/articles/intel-optimized-tensorflow-wheel-now-available

Build and Install TensorFlow* on Intel® Architecture, https://software.intel.com/en-us/articles/build-and-install-tensorflow-on-intel-architecture

TensorFlow, https://www.tensorflow.org/


### Case Studies

Manufacturing Package Fault Detection Using Deep Learning, https://software.intel.com/en-us/articles/manufacturing-package-fault-detection-using-deep-learning

Automatic Defect Inspection Using Deep Learning for Solar Farm, https://software.intel.com/en-us/articles/automatic-defect-inspection-using-deep-learning-for-solar-farm


**Notices**

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

This sample source code is released under the Intel Sample Source Code License Agreement.

Intel, the Intel logo, Intel Xeon Phi, Movidius, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. 

*Other names and brands may be claimed as the property of others.

© 2018 Intel Corporation