Image Recognition
===

This notebook creates a convolutional neural network to classify images from the MNIST, Fashion-MNIST, or CIFAR-10 datasets.

In [1]:
# Tensorflow and numpy to create the neural network
import tensorflow as tf
import numpy as np

# Matplotlib to plot info to show our results
import matplotlib.pyplot as plt

# OS to load files and save checkpoints
import os

# Fashion-MNIST loader, taken from the public tf.keras API rather than the private tensorflow.python package
fashion_mnist = tf.keras.datasets.fashion_mnist

%matplotlib inline

Loading the data
---

This code will load the dataset that you'll use to train and test the model.

The provided code loads the MNIST or CIFAR data from files; you'll need to add the code that processes it into a format your neural network can use.
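
As a rough illustration of what "processing" means here, the sketch below reshapes flattened images into the (batch, height, width, channels) layout and scales pixel values to the 0-1 range. The random `flat_images` array and the helper name `to_network_input` are stand-ins invented for this example, not part of the notebook.

```python
import numpy as np

# Hypothetical stand-in for a loaded dataset: 4 flattened 28x28 grayscale images
# with pixel values in 0..255 (the real data would come from the loader).
flat_images = np.random.randint(0, 256, size=(4, 28 * 28)).astype(np.uint8)

def to_network_input(flat, height, width, channels):
    """Reshape flat rows to (batch, height, width, channels) and scale to [0, 1]."""
    images = np.reshape(flat, (-1, height, width, channels))
    return images.astype("float32") / 255.0

batch = to_network_input(flat_images, 28, 28, 1)
print(batch.shape, batch.dtype)
```

The `-1` in the reshape lets NumPy infer the number of images, so the same helper works for a training set and an evaluation set of different sizes.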

MNIST
---

Run this cell to load mnist data.

In [2]:
# Load MNIST data from tf examples

'''image_height = 28
image_width = 28

color_channels = 1

model_name = "mnist"

mnist = tf.contrib.learn.datasets.load_dataset("mnist")

train_data = mnist.train.images
train_labels = np.asarray(mnist.train.labels, dtype=np.int32)

eval_data = mnist.test.images
eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)

category_names = list(map(str, range(10)))

# TODO: Process mnist data
train_data = np.reshape(train_data, (-1, image_height, image_width, color_channels))

print(train_data.shape)

eval_data = np.reshape(eval_data, (-1, image_height, image_width, color_channels))'''


In [3]:
image_height = 28
image_width = 28

color_channels = 1
model_name = "mnist_fashion"

((train_data, train_labels),(eval_data, eval_labels)) = fashion_mnist.load_data()

train_data = np.reshape(train_data, (-1, image_height, image_width, color_channels))

eval_data = np.reshape(eval_data, (-1, image_height, image_width, color_channels))

# Scale pixel values from 0-255 down to 0-1
train_data = train_data.astype("float32")/255.0

eval_data = eval_data.astype("float32")/255.0
category_names = ("top", "trouser", "pullover", "dress", "coat", "sandal", "shirt", "sneaker", "bag", "ankle boot")

CIFAR-10
---

Run this cell to load cifar-10 data

In [4]:
# Load cifar data from file

image_height = 32
image_width = 32

color_channels = 3

model_name = "cifar"

def unpickle(file):
    """Load one pickled CIFAR-10 batch file into a dict with bytes keys."""
    import pickle
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch

cifar_path = './cifar-10-data/'

train_data = np.array([])
train_labels = np.array([])

# Load all the data batches.
for i in range(1,6):
    data_batch = unpickle(cifar_path + 'data_batch_' + str(i))
    train_data = np.append(train_data, data_batch[b'data'])
    train_labels = np.append(train_labels, data_batch[b'labels'])


# Load the eval batch.
eval_batch = unpickle(cifar_path + 'test_batch')

eval_data = eval_batch[b'data']
eval_labels = eval_batch[b'labels'] 

# Load the english category names.
category_names_bytes = unpickle(cifar_path + 'batches.meta')[b'label_names']
category_names = list(map(lambda x: x.decode("utf-8"), category_names_bytes))

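# TODO: Process Cifar data
def process_data(data):
    # Scale pixel values from 0-255 down to 0-1
    float_data = np.array(data, dtype=np.float32) / 255.0
    # Each row stores the red, green, then blue plane, so split into (N, channels, height, width)
    reshaped_data = np.reshape(float_data, (-1, color_channels, image_height, image_width))
    # Move channels last to get (N, height, width, channels), the layout the network expects
    transposed_data = np.transpose(reshaped_data, [0, 2, 3, 1])
    return transposed_data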

train_data = process_data(train_data) 
eval_data = process_data(eval_data)
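
To see why the CIFAR processing reshapes to (N, channels, height, width) before transposing, here is a small sketch on a made-up 2x2 image. It assumes the CIFAR-10 row layout of all red values, then all green, then all blue; the pixel values themselves are invented for illustration.

```python
import numpy as np

height, width, channels = 2, 2, 3
# One tiny fake image stored channel-major, as CIFAR-10 rows are:
# all red values, then all green, then all blue.
row = np.array([[10, 11, 12, 13,    # red plane
                 20, 21, 22, 23,    # green plane
                 30, 31, 32, 33]])  # blue plane

planes = np.reshape(row, (-1, channels, height, width))  # (N, C, H, W)
image = np.transpose(planes, [0, 2, 3, 1])               # (N, H, W, C)

print(image[0, 0, 0])  # RGB of the top-left pixel
print(image[0, 1, 1])  # RGB of the bottom-right pixel

# Reshaping straight to (N, H, W, C) would mix three red values into one pixel.
naive = np.reshape(row, (-1, height, width, channels))
print(naive[0, 0, 0])
```

The shapes come out the same either way, so a wrong reshape raises no error; only the pixel contents reveal the mistake.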


Once the data is processed, you have a few variables for the data itself and info about its shape:

### Model Info

- **image_height, image_width** - The height and width of the processed images
- **color_channels** - the number of color channels in the image. This will be either 1 for grayscale or 3 for RGB.
- **model_name** - "mnist", "mnist_fashion", or "cifar". If you need to handle anything differently based on the model, check this variable.
- **category_names** - strings for each category name (used to print out labels when testing results)

### Training Data

- **train_data** - the training data images
- **train_labels** - the labels for the training data - the "answer key"

### Evaluation Data

- **eval_data** - Image data for evaluation. A different set of images to test your network's effectiveness.
- **eval_labels** - the answer key for evaluation data.
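
Before building the network, it can help to confirm that these variables agree with each other. The sketch below is one possible sanity check; `check_dataset` and the random stand-in arrays are hypothetical names introduced for this example.

```python
import numpy as np

def check_dataset(data, labels, height, width, channels, num_classes):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    data = np.asarray(data)
    labels = np.asarray(labels)
    if data.shape[1:] != (height, width, channels):
        problems.append("unexpected image shape: %s" % (data.shape[1:],))
    if data.shape[0] != labels.shape[0]:
        problems.append("image/label count mismatch: %d vs %d" % (data.shape[0], labels.shape[0]))
    if data.size and (data.min() < 0.0 or data.max() > 1.0):
        problems.append("pixel values fall outside [0, 1]")
    if labels.size and (labels.min() < 0 or labels.max() >= num_classes):
        problems.append("labels fall outside 0..%d" % (num_classes - 1))
    return problems

# Hypothetical stand-in data; in the notebook you would pass train_data, train_labels, etc.
fake_data = np.random.rand(8, 28, 28, 1).astype("float32")
fake_labels = np.arange(8) % 10
print(check_dataset(fake_data, fake_labels, 28, 28, 1, 10))
```

Running the same check on both the training pair and the evaluation pair catches the common slip of assigning one set's images to the other.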

Building the Neural Network Model
--

Next, you'll build a neural network with the following architecture:

- An input placeholder that takes one or more images.
- 1st Convolutional layer with 32 filters and a kernel size of 5x5 and same padding
- 1st Pooling layer with a 2x2 pool size and stride of 2
- 2nd Convolutional layer with 64 filters and a kernel size of 5x5 and same padding
- 2nd Pooling layer with a 2x2 pool size and stride of 2
- Flatten the pooling layer
- A fully connected layer with 1024 units
- A dropout layer with a rate of 0.4
- An output layer with an output size equal to the number of labels.
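
The spatial sizes implied by this architecture can be worked out by hand. The sketch below does the arithmetic for the listed layers, assuming stride-1 convolutions with same padding and pooling without padding; the function names are made up for this example.

```python
def conv_same(size):
    # "same" padding with stride 1 keeps the spatial size unchanged
    return size

def pool_valid(size, pool=2, stride=2):
    # pooling without padding: floor((size - pool) / stride) + 1
    return (size - pool) // stride + 1

def flattened_size(image_size, filters=(32, 64)):
    # Apply one conv + pool block per entry in `filters`, then flatten
    size = image_size
    for _ in filters:
        size = pool_valid(conv_same(size))
    return size * size * filters[-1]

print(flattened_size(28))  # 28 -> 14 -> 7, so 7 * 7 * 64
print(flattened_size(32))  # 32 -> 16 -> 8, so 8 * 8 * 64
```

The flattened size is the number of inputs each of the 1024 dense units receives, so it largely determines how many weights the fully connected layer has.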

In [5]:
# The neural network (differs from the spec above: 6x6 kernels, 3x3 pools, a third conv/pool block, dropout rate 0.1)
class ConvNet:
    def __init__(self,image_height, image_width, channels, num_classes):
        self.input_layer = tf.placeholder(dtype = tf.float32, shape=[None, image_height, image_width, channels], 
                                          name="inputs") # All of the images
        print(self.input_layer.shape)
        conv_layer_1 = tf.layers.conv2d(self.input_layer, filters = 32, kernel_size=[6,6], padding="same", 
                                        activation=tf.nn.relu) # first conv layer
        print(conv_layer_1.shape)
        
        pooling_layer_1 = tf.layers.max_pooling2d(conv_layer_1, pool_size=[3,3], strides=2)  # first pooling layer
        print(pooling_layer_1.shape)
        
        conv_layer_2 = tf.layers.conv2d(pooling_layer_1, filters = 64, kernel_size=[6,6], padding="same", 
                                        activation=tf.nn.relu)  # Second conv layer
        print(conv_layer_2.shape)
        
        pooling_layer_2 = tf.layers.max_pooling2d(conv_layer_2, pool_size=[3, 3], strides=2) # Second pooling layer
        print(pooling_layer_2.shape) 
        
        conv_layer_3 = tf.layers.conv2d(pooling_layer_2, filters = 96, kernel_size=[6,6], padding="same", 
                                        activation=tf.nn.relu)  # Third conv layer
        print(conv_layer_3.shape)
        
        pooling_layer_3 = tf.layers.max_pooling2d(conv_layer_3, pool_size=[3, 3], strides=2) # Third pooling layer
        print(pooling_layer_3.shape) 
        
        flattened_pooling = tf.layers.flatten(pooling_layer_3) # flatten the pooled features for the dense layer
        dense_layer = tf.layers.dense(flattened_pooling, 1024, activation=tf.nn.relu) # fully connected layer
        print(dense_layer.shape)
        # Dropout randomly zeroes a fraction (rate) of the dense layer's activations.
        # Note: training=True keeps dropout active on every run, including evaluation.
        dropout = tf.layers.dropout(dense_layer, rate=0.1, training=True)
        outputs = tf.layers.dense(dropout, num_classes) # logits: one unnormalized score per class
        print(outputs.shape)
        self.choice = tf.argmax(outputs, axis=1) # index of the highest-scoring class
        self.probabilities = tf.nn.softmax(outputs) # class probabilities for each image
        self.labels = tf.placeholder(dtype=tf.float32, name="labels") # integer class labels, fed as floats
        self.accuracy, self.accuracy_op = tf.metrics.accuracy(self.labels, self.choice) # streaming accuracy and its update op
        one_hot_labels = tf.one_hot(indices=tf.cast(self.labels, dtype=tf.int32), depth=num_classes) # one-hot encode the labels
        self.loss = tf.losses.softmax_cross_entropy(onehot_labels=one_hot_labels, logits=outputs) # cross-entropy loss
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-2) # plain gradient descent
        self.train_operation = optimizer.minimize(loss=self.loss, global_step=tf.train.get_global_step()) # one training step
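
The last few lines of the network turn raw output scores into a prediction and a loss. As a rough NumPy sketch of that arithmetic (not TensorFlow's implementation), the example below computes softmax probabilities, one-hot labels, and the batch-averaged cross-entropy for two made-up rows of logits.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

def one_hot(labels, depth):
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(depth)[np.asarray(labels, dtype=int)]

def cross_entropy(logits, labels):
    probs = softmax(logits)
    targets = one_hot(labels, logits.shape[1])
    # Mean over the batch of -log(probability assigned to the true class)
    return float(np.mean(-np.sum(targets * np.log(probs), axis=1)))

# Made-up scores for a batch of 2 images and 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = [0, 1]
print(softmax(logits).argmax(axis=1))  # predicted class per image
print(cross_entropy(logits, labels))   # lower means more confident and correct
```

A network that spreads its probability evenly over `k` classes scores a loss of `log(k)`, which is a handy baseline when reading early training output.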
        

The Training Process
---

The cells below will set up and run the training process.

- Set up initial values for batch size, training length.
- Process data into batched datasets to feed into the network.
- Run through batches of training data, update weights, save checkpoints.
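
The batching step can be pictured without TensorFlow. The generator below is a simplified sketch of shuffle, batch, and repeat; it is an illustration only and differs from `tf.data` in details (for example, it always yields full batches and starts a new pass instead of emitting a short final batch). The name `batches` and the toy arrays are invented for this example.

```python
import numpy as np

def batches(data, labels, batch_size, steps, seed=0):
    """Yield `steps` shuffled mini-batches, reshuffling after each pass over the data."""
    rng = np.random.RandomState(seed)
    count = len(labels)
    order = rng.permutation(count)
    position = 0
    for _ in range(steps):
        if position + batch_size > count:
            # Not enough examples left for a full batch: start a new pass
            order = rng.permutation(count)
            position = 0
        picked = order[position:position + batch_size]
        position += batch_size
        yield data[picked], labels[picked]

# Hypothetical toy data: 10 examples whose label equals their single feature
toy_data = np.arange(10).reshape(10, 1)
toy_labels = np.arange(10)
for images, answers in batches(toy_data, toy_labels, batch_size=4, steps=5):
    print(answers)
```

Indexing the images and labels with the same `picked` array is what keeps each image paired with its own label after shuffling.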

In [6]:
# TODO: initialize variables
training_steps = 20000
batch_size = 64
path = "./" + model_name + "-cnn/"
load_checkpoint = False
performance_graph = np.array([]) 

In [7]:
# TODO: implement the training loop
tf.reset_default_graph()

# Shuffle the training set, group it into batches, and repeat indefinitely
dataset = tf.data.Dataset.from_tensor_slices((train_data, train_labels))
dataset = dataset.shuffle(buffer_size=train_labels.shape[0])
dataset = dataset.batch(batch_size)
dataset = dataset.repeat()

dataset_iterator = dataset.make_initializable_iterator()
next_element = dataset_iterator.get_next()  # op that yields the next (images, labels) batch
    
cnn = ConvNet(image_height,image_width,color_channels,10)

saver = tf.train.Saver(max_to_keep=2)

# Create the checkpoint folder if it doesn't already exist
if not os.path.exists(path):
    os.makedirs(path)
with tf.Session() as sess:
    
    if load_checkpoint:
        checkpoint = tf.train.get_checkpoint_state(path)
        saver.restore(sess, checkpoint.model_checkpoint_path) # resume from the latest checkpoint
    else:
        sess.run(tf.global_variables_initializer()) # otherwise start from freshly initialized weights

    sess.run(tf.local_variables_initializer()) # local variables hold the accuracy metric's counters
    sess.run(dataset_iterator.initializer) # initialize the data iterator
    for step in range(training_steps):
        current_batch = sess.run(next_element)  # Gets current batch of data
        
        batch_inputs = current_batch[0] # gets the images out of the data
        batch_labels = current_batch[1] # gets the labels of the images out of the data
        
        # Run one training step and update the streaming accuracy, feeding both placeholders
        sess.run((cnn.train_operation, cnn.accuracy_op), feed_dict={cnn.input_layer:batch_inputs, cnn.labels:batch_labels})
        if step % 10 == 0:
            # Record the running accuracy every 10 steps for the performance graph
            performance_graph = np.append(performance_graph, sess.run(cnn.accuracy))
        if step % 1000 == 0 and step > 0: # print accuracy every 1000 times and saves
            current_acc = sess.run(cnn.accuracy)
            
            print("Accuracy at step " + str(step) + ": " + str(current_acc))
            print("Saving checkpoint")
            saver.save(sess, path + model_name, step)
        
    print("Saving final checkpoint for training session.")
    saver.save(sess, path + model_name, step)

Instructions for updating:
Colocations handled automatically by placer.
(?, 32, 32, 3)
Instructions for updating:
Use keras.layers.conv2d instead.
(?, 32, 32, 32)
Instructions for updating:
Use keras.layers.max_pooling2d instead.
(?, 15, 15, 32)
(?, 15, 15, 64)
(?, 7, 7, 64)
(?, 7, 7, 96)
(?, 3, 3, 96)
Instructions for updating:
Use keras.layers.flatten instead.
Instructions for updating:
Use keras.layers.dense instead.
(?, 1024)
Instructions for updating:
Use keras.layers.dropout instead.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
(?, 10)
Instructions for updating:
Use tf.cast instead.


UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node conv2d/Conv2D (defined at <ipython-input-5-46b40680c040>:8) ]]

Caused by op 'conv2d/Conv2D', defined at:
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelapp.py", line 505, in start
    self.io_loop.start()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\platform\asyncio.py", line 148, in start
    self.asyncio_loop.run_forever()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\asyncio\base_events.py", line 539, in run_forever
    self._run_once()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\asyncio\base_events.py", line 1775, in _run_once
    handle._run()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\asyncio\events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\ioloop.py", line 690, in <lambda>
    lambda f: self._run_callback(functools.partial(callback, future))
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\ioloop.py", line 743, in _run_callback
    ret = callback()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 787, in inner
    self.run()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 748, in run
    yielded = self.gen.send(value)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 378, in dispatch_queue
    yield self.process_one()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 225, in wrapper
    runner = Runner(result, future, yielded)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 714, in __init__
    self.run()
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 748, in run
    yielded = self.gen.send(value)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 365, in process_one
    yield gen.maybe_future(dispatch(*args))
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 272, in dispatch_shell
    yield gen.maybe_future(handler(stream, idents, msg))
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 542, in execute_request
    user_expressions, allow_stdin,
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\ipkernel.py", line 294, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\zmqshell.py", line 536, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2854, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2880, in _run_cell
    return runner(coro)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\async_helpers.py", line 68, in _pseudo_sync_runner
    coro.send(None)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 3057, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 3248, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 3325, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-4ae015fc6d43>", line 13, in <module>
    cnn = ConvNet(image_height,image_width,color_channels,10)
  File "<ipython-input-5-46b40680c040>", line 8, in __init__
    activation=tf.nn.relu) # first conv layer
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\layers\convolutional.py", line 424, in conv2d
    return layer.apply(inputs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1227, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\layers\base.py", line 530, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 554, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\layers\convolutional.py", line 194, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 966, in __call__
    return self.conv_op(inp, filter)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 591, in __call__
    return self.call(inp, filter)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 208, in __call__
    name=self.name)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1026, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node conv2d/Conv2D (defined at <ipython-input-5-46b40680c040>:8) ]]


Evaluating Performance
---

These cells will evaluate the performance of your network!

In [None]:
# TODO: Display graph of performance over time

plt.figure().set_facecolor('white') 
plt.xlabel("Steps/10") 
plt.ylabel("Accuracy") 
plt.plot(performance_graph) 
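
When reading this plot, keep in mind that a streaming accuracy metric reports the average over everything seen since its counters were last reset, not the accuracy of the latest batch. The pure-Python class below is a hypothetical sketch of that bookkeeping, with invented predictions and labels.

```python
class RunningAccuracy:
    """Keeps cumulative counts, similar in spirit to a streaming accuracy metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, labels):
        # Add this batch's hits and size to the running totals
        for predicted, actual in zip(predictions, labels):
            self.correct += int(predicted == actual)
            self.total += 1

    def value(self):
        return self.correct / self.total if self.total else 0.0

metric = RunningAccuracy()
metric.update([1, 2, 3, 4], [0, 0, 0, 4])   # early batch: 1 of 4 correct
print(metric.value())
metric.update([1, 2, 3, 4], [1, 2, 3, 4])   # later batch: 4 of 4 correct
print(metric.value())
```

Because early, less accurate batches stay in the average, a curve built this way tends to lag behind the model's current per-batch accuracy.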

In [None]:
# TODO: Run through the evaluation data set, check accuracy of model
with tf.Session() as sess:
    checkpoint = tf.train.get_checkpoint_state(path)
    saver.restore(sess, checkpoint.model_checkpoint_path)
    sess.run(tf.local_variables_initializer())  # reset the accuracy counters
    for image, label in zip(eval_data, eval_labels):
        # Feed one image and its label at a time, each wrapped in a batch of size 1
        sess.run(cnn.accuracy_op, feed_dict={cnn.input_layer:[image], cnn.labels:[label]})
    print(sess.run(cnn.accuracy))
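
Evaluating one image per call is simple but slow on large sets. A possible alternative is to evaluate in chunks, sketched below with a hypothetical `predict` callable standing in for the network; in this notebook it might wrap something like `sess.run(cnn.choice, feed_dict={cnn.input_layer: chunk})`, which is an assumption rather than tested code.

```python
import numpy as np

def evaluate_in_batches(predict, data, labels, batch_size=256):
    """Accuracy of `predict` over the whole set, processed `batch_size` items at a time."""
    labels = np.asarray(labels)
    correct = 0
    for start in range(0, len(data), batch_size):
        chunk = data[start:start + batch_size]
        guesses = predict(chunk)
        correct += int(np.sum(guesses == labels[start:start + batch_size]))
    return correct / len(data)

# Hypothetical toy set: the "image" is a single number and its label is that number mod 10
toy_data = np.arange(1000).reshape(1000, 1)
toy_labels = toy_data[:, 0] % 10

def perfect(chunk):
    # Stand-in model that always recovers the right class
    return chunk[:, 0] % 10

def always_zero(chunk):
    # Stand-in model that guesses class 0 for everything
    return np.zeros(len(chunk), dtype=int)

print(evaluate_in_batches(perfect, toy_data, toy_labels))
print(evaluate_in_batches(always_zero, toy_data, toy_labels))
```

Counting correct guesses and dividing once at the end keeps the result exact even when the last chunk is smaller than `batch_size`.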
        

In [None]:

# TODO: Get a random set of images and make guesses for each
with tf.Session() as sess:
    checkpoint = tf.train.get_checkpoint_state(path)
    saver.restore(sess,checkpoint.model_checkpoint_path)
    
    indexes = np.random.choice(len(eval_data), 10, replace=False)
    
    rows = 5
    cols = 2
    
    fig, axes = plt.subplots(rows, cols, figsize=(5,5))
    fig.patch.set_facecolor('white')
    image_count = 0
    
    for idx in indexes:
        image_count += 1
        sub = plt.subplot(rows,cols,image_count)
        img = eval_data[idx]
        if model_name == "mnist" or model_name == "mnist_fashion":
            img = img.reshape(28, 28)
        plt.imshow(img)
        guess = sess.run(cnn.choice, feed_dict={cnn.input_layer:[eval_data[idx]]})
        if model_name == "mnist" or model_name == "mnist_fashion":
            guess_name = str(guess[0])
            actual_name = str(eval_labels[idx])
        else:
            guess_name = category_names[guess[0]]
            actual_name = category_names[eval_labels[idx]]
        sub.set_title("G: " + guess_name + " A: " + actual_name)
    plt.tight_layout()