# The Exercise

Work in pairs. Using the starter code, train a neural network whose input is a cropped 81x81 color image and whose output is the probability that the center pixel of that image is part of a traffic light.

# Topics to Discuss

## Architecture
    Any combination of convolutional, fully connected, and pooling layers is possible as long as the output has the right size: a single value per input image. H and W can be reduced with pooling or with strided convolutions.
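
To check that a stack of layers ends at the right size, it can help to trace H and W by hand. The sketch below assumes `'SAME'` padding, as used by the starter code's `conv_layer`, where a layer with stride s maps a side of length n to ceil(n / s). The helper names `same_padding_output` and `trace_sides` are illustrative and not part of the starter code.

```python
import math

def same_padding_output(side, stride):
    """Side length after a 'SAME'-padded conv or pool with the given stride."""
    return math.ceil(side / stride)

def trace_sides(input_side, strides):
    """Return the side length after each layer in a stack of 'SAME'-padded layers."""
    sides = []
    side = input_side
    for stride in strides:
        side = same_padding_output(side, stride)
        sides.append(side)
    return sides

# Strides of the seven active conv layers in the starter model
# (size2 = 2, so the downsampling layers use stride 2*size2 = 4)
starter_strides = [1, 4, 1, 4, 1, 1, 1]
sides = trace_sides(81, starter_strides)

# conv10 has 24*size = 48 filters, so this is the input width of the first fc layer
final_channels = 48
flattened = sides[-1] * sides[-1] * final_channels
print(sides, flattened)
```

For the starter model this traces 81 to 21 to 6, so `fc11` receives 6 * 6 * 48 = 1728 values per image.
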
## Learning Rate
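    Decreasing the learning rate over time can help: large steps make fast progress early on, while smaller steps later let the loss settle instead of bouncing around a minimum.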
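
One common way to decrease the learning rate is exponential decay. The sketch below is plain Python so the schedule can be inspected without building a graph; `decay_steps=500` and `decay_rate=0.5` are made-up values for illustration, not settings from the starter code.

```python
def decayed_learning_rate(base_lr, step, decay_steps, decay_rate):
    """Exponential decay: the rate is multiplied by decay_rate every decay_steps steps."""
    return base_lr * decay_rate ** (step / decay_steps)

# Hypothetical schedule: start at 0.0005 and halve the rate every 500 iterations
for step in (0, 250, 500, 1000):
    print(step, decayed_learning_rate(0.0005, step, decay_steps=500, decay_rate=0.5))
```

In TensorFlow 1.x, `tf.train.exponential_decay` computes the same formula inside the graph. It needs a `global_step` variable that is also passed to the optimizer's `minimize` call so that the step counter advances.
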
## Loss
    What is the appropriate loss for binary classification? Sigmoid cross entropy, computed on the network's raw output (the logit) rather than on the sigmoid probability.
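
To see why the loss is computed from the logit, the sketch below compares the textbook formula with the numerically stable form that the TensorFlow documentation gives for `tf.nn.sigmoid_cross_entropy_with_logits`, namely max(x, 0) - x*z + log(1 + exp(-|x|)). The function names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def naive_loss(logit, label):
    """Textbook binary cross entropy. Breaks down when the sigmoid saturates to 0 or 1."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def stable_loss(logit, label):
    """Algebraically equal to naive_loss, but never takes the log of 0."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# The two forms agree for moderate logits
for logit, label in [(2.0, 1), (-1.5, 0), (0.0, 1)]:
    print(logit, label, naive_loss(logit, label), stable_loss(logit, label))

# For a confident wrong answer the stable form still returns a finite loss,
# whereas naive_loss would evaluate log(0) because sigmoid(100.0) rounds to 1.0
print(stable_loss(100.0, 0))
```
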
## Overfitting
    How can we tell if our net is overfitting? The training loss keeps decreasing while the validation loss stalls or rises, so the gap between the two grows.
    What can we do about it? Regularization and data augmentation. There are also more advanced options such as batchnorm and dropout.
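
The divergence check can be made concrete by recording both losses at every evaluation. The sketch below is one possible approach, not part of the starter code: it computes the gap between validation and training loss and applies a simple early-stopping rule. The loss values are invented for illustration.

```python
def generalization_gaps(train_losses, val_losses):
    """Validation loss minus training loss at each evaluation point."""
    return [v - t for t, v in zip(train_losses, val_losses)]

def should_stop_early(val_losses, patience):
    """True when the best validation loss is at least `patience` evaluations old."""
    if not val_losses:
        return False
    best_index = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_index >= patience

# Made-up histories: training loss keeps falling, validation loss turns around
train_history = [0.60, 0.45, 0.35, 0.28, 0.22]
val_history = [0.62, 0.50, 0.47, 0.49, 0.53]

print(generalization_gaps(train_history, val_history))
print(should_stop_early(val_history, patience=2))
```

A growing gap together with a validation loss that no longer improves suggests stopping training, or adding regularization or augmentation before training further.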

In [1]:
#########################
######## Imports ########
#########################

import numpy as np
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib notebook

In [2]:
#########################
#### Neural Net Class ###
#########################
class TLNet:

    def __init__(self, out_dir, data_dir=None, crop_size=81):
        
        #########################
        #### Initialize Net #####
        #########################
        
        self.out_dir = out_dir
        if not os.path.exists(out_dir):
            os.makedirs(out_dir)
        self.data_dir = data_dir
        self.crop_size = crop_size
        # Tensorflow placeholders serve as input pipes that we can fill later with data from any dataset we
        # wish (as long as its examples are the expected shape)
        self.input_data = tf.placeholder(tf.uint8, shape=[None, crop_size, crop_size, 3], name="input_imgs")
        self.labels = tf.placeholder(tf.uint8, [None, 1], name="labels")

        self.model_out = self.model(self.input_data)
        
        # We use a sigmoid on the model output so that its predictions will be between 0 and 1,
        # representing the probability that the input image IS a positive example
        self.prediction = tf.nn.sigmoid(self.model_out, name="prediction")
        
        self.loss = self.calc_loss(self.model_out, tf.cast(self.labels, tf.float32))
        
        self.opt = None
        
        # Where to save the checkpoint
        self.model_path = os.path.join(self.out_dir, "TLNet.ckpt")


    def model(self, images):
        
        #########################
        ##### Architecture ######
        #########################
        
        # Cast uint8 images to float32 and normalize so their values are between 0 and 1
        # Normalization is very important in training neural nets
        images = tf.cast(images, tf.float32)
        images *= (1. / 256)
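        size=2   # multiplier for the number of filters in each conv layer
        size2=2  # multiplier for the stride of the downsampling convs (stride = 2*size2)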
        
        # Layers of the net
        conv1 = self.conv_layer(images, 12*size, 'conv1', kernel=5, stride=1)
        conv2 = self.conv_layer(conv1, 24*size, 'conv2', kernel=5, stride=2*size2)
        conv3 = self.conv_layer(conv2, 24*size, 'conv3', kernel=3, stride=1)
        conv4 = self.conv_layer(conv3, 48*size, 'conv4', kernel=3, stride=2*size2)
        conv5 = self.conv_layer(conv4, 48*size, 'conv5', kernel=3, stride=1)
#         conv6 = self.conv_layer(conv5, 96*size, 'conv6', kernel=3, stride=2*size2)
#         conv7 = self.conv_layer(conv6, 96*size, 'conv7', kernel=3, stride=1)
#         conv8 = self.conv_layer(conv7, 192*size, 'conv8', kernel=3, stride=2*size2)
        conv9 = self.conv_layer(conv5, 64*size, 'conv9', kernel=3, stride=1)
        conv10 = self.conv_layer(conv9, 24*size, 'conv10', kernel=3, stride=1)
                
        fc11 = self.fc_layer(conv10, output_neurons=48, name='fc11', doRelu=True)
        fc12 = self.fc_layer(fc11, output_neurons=1, name='fc12', doRelu=False)
    
        output = tf.identity(fc12, name='output')

        return output

    def calc_loss(self, model_out, label):
        
        #########################
        ######### Loss ##########
        #########################
        
        # Sigmoid cross entropy is the go-to loss for yes/no classification tasks
        loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=model_out, labels=label))

        return loss

    def save(self, sess, model_path):
        
        # Save the current model
        saver = tf.train.Saver()
        save_path = saver.save(sess, model_path)
        return save_path

    def restore(self, sess, model_path):
        
        # Restore previously trained model
        saver = tf.train.Saver()
        saver.restore(sess, model_path)

    def train(self, iters, learning_rate, batch_size, restore=False, ckpt_path=None):
        
        #########################
        ####### Train Net #######
        #########################
        
        # Prep data
                
        dataset_train = self.prep_data(os.path.join(self.data_dir, 'train', 'data1.bin'), 
                                 os.path.join(self.data_dir, 'train', 'labels1.bin'), 
                                 batch_size=batch_size, crop_size=self.crop_size)
        dataset_val = self.prep_data(os.path.join(self.data_dir, 'val', 'data1.bin'), 
                                os.path.join(self.data_dir, 'val', 'labels1.bin'), 
                                batch_size=batch_size, crop_size=self.crop_size)

        train_iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes)
        batch_of_train_images = train_iterator.get_next()
        train_iterator = train_iterator.make_initializer(dataset_train)

        val_iterator = tf.data.Iterator.from_structure(dataset_val.output_types, dataset_val.output_shapes)
        batch_of_val_images = val_iterator.get_next()
        val_iterator = val_iterator.make_initializer(dataset_val)
        
        if not self.opt:
            
            # Define optimizer
            self.opt = tf.train.AdamOptimizer(learning_rate).minimize(self.loss)

        with tf.Session() as sess:

            # Initialize variables and data iterators
            sess.run(train_iterator)
            sess.run(val_iterator)
            sess.run(tf.global_variables_initializer())

            if restore:
                # Restore trained weights and biases from checkpoint
                self.restore(sess, ckpt_path)

                print ("Resuming from previously saved checkpoint ...")
                
            # Define losses and accuracy
            train_loss = 0.
            val_loss = 0.
            val_accuracy = 0.

            # Train
            for i in range(iters):

                try:
                    # Get a batch of images and labels from the training data generator
                    train_batch = sess.run(batch_of_train_images)
                    train_images = train_batch[0]
                    train_labels = train_batch[1]
                    
                    # Run the optimizer (this is where the real training is occurring)
                    _, out_loss = sess.run([self.opt, self.loss], feed_dict=
                                           {self.input_data: train_images, self.labels: train_labels})
                    
                    # Update the loss average over all of training
                    train_loss += np.mean(out_loss)

                    if (i+1) % 10 == 0:
                        # Print loss statement
                        print ("Iter: " + str(i+1) + ", Current train loss: %f" % (train_loss / (i+1)))

                    if (i+1) % 100 == 0:
                        # Save checkpoint
                        print ("Saving checkpoint")
                        save_path = self.save(sess, self.model_path)

                    if (i+1) % 100 == 0:
                        # Evaluate the net on the validation set
                        print ("Performing Evaluation")
                        
                        val_batch = sess.run(batch_of_val_images)
                        val_images = val_batch[0]
                        val_labels = val_batch[1]

                        out_prediction, out_val_loss = sess.run([self.prediction, self.loss],
                                               feed_dict={self.input_data: val_images, self.labels: val_labels})
                        
                        val_loss += np.mean(out_val_loss)
                        print ("Current loss on val set: %f" % (val_loss / ((i+1) / 100.)))
                        
                        val_accuracy += np.mean(np.round(out_prediction) == val_labels)
                        print ("Current accuracy on val set (percent of examples labeled correctly): " + str(float(val_accuracy / ((i+1) / 100))))


                except tf.errors.OutOfRangeError:
                    pass

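            # Save the final weights; this also defines save_path when iters < 100,
            # in which case no checkpoint was written inside the loop
            save_path = self.save(sess, self.model_path)
            print ("Model saved at " + save_path)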

    def evaluate(self, ckpt_path, batch_size, crop_size=81):
        
        #########################
        ## Evaluate Trained Net #
        #########################
                
        dataset_val = self.prep_data(os.path.join(self.data_dir, 'val', 'data1.bin'),
                                     os.path.join(self.data_dir, 'val', 'labels1.bin'),
                                     batch_size=batch_size, crop_size=self.crop_size)

        val_iterator = tf.data.Iterator.from_structure(dataset_val.output_types, dataset_val.output_shapes)

        batch_of_images = val_iterator.get_next()

        input_images_test = batch_of_images[0]
        input_labels_test = batch_of_images[1]

        val_iterator = val_iterator.make_initializer(dataset_val)

        with tf.Session() as sess:

            # Initialize
            sess.run(val_iterator)
            sess.run(tf.global_variables_initializer())

            self.restore(sess, ckpt_path)

            batch_out = sess.run(batch_of_images)
            val_images = batch_out[0]
            val_labels = batch_out[1]

            print ("resuming from previously saved checkpoint")

            predict_result, out_loss = sess.run([self.prediction, self.loss], feed_dict={self.input_data: val_images, self.labels: val_labels})

            print ("Loss on prediction was:")
            print (np.mean(out_loss))
            print ("Prediction was:")
            print (np.squeeze(predict_result))
            print ("Label was:")
            print (np.squeeze(val_labels))
            print ("Accuracy: %f" % np.mean(np.round(predict_result) == val_labels))
    
    def predict(self, ckpt_path, img, crop_size=81):
        
        # FUNCTION
        # Given an image and a checkpoint path, restore a trained model and run it on a single image

        # INPUT
        # **ckpt_path: filepath to model checkpoint
        # **img: numpy array of shape (crop_size, crop_size, 3) and dtype uint8

        # OUTPUT
        # **prediction between 0 and 1 of how likely it is to be a traffic light

        # Reshape the image into a batch of one. The image stays uint8: the model itself casts it
        # to float32 and normalizes it, so it must not be normalized again here.
        img = img.reshape(1,crop_size,crop_size,3)
        
        with tf.Session() as sess:

            # Initialize
            sess.run(tf.global_variables_initializer())

            self.restore(sess, ckpt_path)

            print ("resuming from previously saved checkpoint")

            # Run inference (note that the session does not need to be fed a label because
            # we are not calculating the loss or running the optimizer)
            predict_result = sess.run([self.prediction], feed_dict={self.input_data: img})

            return float(np.squeeze(predict_result))

    
    #########################
    ######## Layers #########
    #########################
    
    def fc_layer(self, input_tensor, output_neurons, name, doRelu=True):
        with tf.variable_scope(name):
                          
            shape = input_tensor.get_shape().as_list()
            dim = 1
            for d in shape[1:]:
                dim *= d
            x = tf.reshape(input_tensor, [-1, dim])
                          
            activation = None
            if doRelu:
                activation = tf.nn.relu
            fc = tf.layers.dense(x, output_neurons, activation=activation)

            return fc


    def conv_layer(self, input_tensor, output_channels, name, kernel=3, stride=1, doRelu=True):
        with tf.variable_scope(name):
            strides = (stride,stride)
            conv = tf.layers.conv2d(input_tensor, filters=output_channels, kernel_size=kernel, strides=strides, padding='SAME', data_format='channels_last', kernel_initializer=tf.keras.initializers.glorot_normal())

            if doRelu:
                conv = tf.nn.relu(conv)

            return conv

    def avg_pool(self, input_tensor, name, stride=2):
        return tf.nn.avg_pool(input_tensor, ksize=[1, 2, 2, 1], strides=[1, stride, stride, 1], padding='SAME', name=name)

    def max_pool(self, input_tensor, name, stride=2):
        return tf.nn.max_pool(input_tensor, ksize=[1, 2, 2, 1], strides=[1, stride, stride, 1], padding='SAME', name=name)

    #########################
    ##### Import Data #######
    #########################    

    def prep_data(self, g_data, g_label, batch_size, crop_size=81):

        # FUNCTION
        # Given binary files, imports data as tensorflow dataset objects

        # INPUT
        # **g_data: filepath to binary data file
        # **g_label: filepath to binary label file
        # **batch_size: the number of examples the dataset will supply at each iteration
        # **crop_size: size of the input image (assumed to be square)

        # OUTPUT
        # **tensorflow dataset object

        filename_dataset = tf.data.Dataset.list_files(g_data)

        image_dataset = filename_dataset.map(lambda x: tf.decode_raw(tf.read_file(x), tf.uint8))
        image_dataset = image_dataset.map(lambda x: tf.reshape(x, [-1, crop_size, crop_size, 3]))
        image_dataset = image_dataset.flat_map(lambda x: tf.data.Dataset.from_tensor_slices(x))
        image_dataset = image_dataset.batch(batch_size, drop_remainder=True)

        filename_dataset = tf.data.Dataset.list_files(g_label)

        label_dataset = filename_dataset.map(lambda x: tf.decode_raw(tf.read_file(x), tf.uint8))
        label_dataset = label_dataset.map(lambda x: tf.reshape(x, [-1, 1]))
        label_dataset = label_dataset.flat_map(lambda x: tf.data.Dataset.from_tensor_slices(x))
        label_dataset = label_dataset.batch(batch_size, drop_remainder=True)

        full_dataset = tf.data.Dataset.zip((image_dataset, label_dataset))
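        # Shuffle (whole batches, since the data is already batched), repeat indefinitely,
        # and prefetch last so that loading overlaps with training
        full_dataset = full_dataset.shuffle(buffer_size=batch_size*10)
        full_dataset = full_dataset.repeat()
        full_dataset = full_dataset.prefetch(buffer_size=batch_size*5)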

        return full_dataset
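
`prep_data` reads headerless raw bytes, so a file whose size does not match `crop_size` only shows up as a reshape error at run time. The sketch below checks sizes up front. It assumes the layout implied by the reshape calls above: each example is `crop_size * crop_size * 3` uint8 bytes and each label is a single byte. The function names are hypothetical.

```python
def examples_in_file(num_bytes, crop_size=81, channels=3):
    """Number of examples in a raw uint8 data file of the given size in bytes."""
    bytes_per_example = crop_size * crop_size * channels
    if num_bytes % bytes_per_example != 0:
        raise ValueError("file size %d is not a multiple of %d bytes per example"
                         % (num_bytes, bytes_per_example))
    return num_bytes // bytes_per_example

def check_pair(data_bytes, label_bytes, crop_size=81):
    """Verify that a data file and a label file describe the same number of examples."""
    n_images = examples_in_file(data_bytes, crop_size)
    if n_images != label_bytes:  # one byte per label
        raise ValueError("%d images but %d labels" % (n_images, label_bytes))
    return n_images

# Example with made-up sizes: 5 images of 81*81*3 bytes and 5 one-byte labels
print(check_pair(5 * 81 * 81 * 3, 5))
```

With real files, the byte counts could come from `os.path.getsize` applied to `data1.bin` and `labels1.bin`.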

# Build and Train Model

In [3]:
tf.reset_default_graph()

In [4]:


data_dir = r'C:\Users\RENT\m\CityScapes\data_dir'
out_dir = r'C:\Users\RENT\m\CityScapes\output'
ckpt_path = r'C:\Users\RENT\m\CityScapes\output\TLNet.ckpt'
model = TLNet(data_dir=data_dir, out_dir=out_dir, crop_size=81)


In [5]:

# Signature: train(iters, learning_rate, batch_size, restore=False, ckpt_path=None)
model.train(iters=1000, learning_rate=.0005, batch_size=24*2, restore=False, ckpt_path=ckpt_path)


Iter: 10, Current train loss: 0.663870
Iter: 20, Current train loss: 0.649088
Iter: 30, Current train loss: 0.634894
Iter: 40, Current train loss: 0.613644
Iter: 50, Current train loss: 0.607394
Iter: 60, Current train loss: 0.611693
Iter: 70, Current train loss: 0.609833
Iter: 80, Current train loss: 0.599338
Iter: 90, Current train loss: 0.588259
Iter: 100, Current train loss: 0.584676
Saving checkpoint
Performing Evaluation
Current loss on val set: 0.420714
Current accuracy on val set (percent of examples labeled correctly): 0.7708333333333334
Iter: 110, Current train loss: 0.583886
Iter: 120
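
Note that the `Current train loss` printed by `train` is `train_loss / (i+1)`, the mean over every iteration so far, so it lags behind the loss of recent batches. The sketch below shows one alternative, a mean over a sliding window, next to the cumulative mean. `LossTracker` is a hypothetical helper, not part of the starter code.

```python
from collections import deque

class LossTracker:
    """Tracks the cumulative mean and the mean over the most recent `window` losses."""

    def __init__(self, window):
        self.total = 0.0
        self.count = 0
        self.recent = deque(maxlen=window)  # old entries fall off automatically

    def update(self, loss):
        self.total += loss
        self.count += 1
        self.recent.append(loss)

    @property
    def cumulative_mean(self):
        return self.total / self.count if self.count else 0.0

    @property
    def windowed_mean(self):
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

# Made-up per-batch losses that improve quickly
tracker = LossTracker(window=2)
for batch_loss in [1.0, 0.5, 0.25]:
    tracker.update(batch_loss)

print(tracker.cumulative_mean, tracker.windowed_mean)
```

The windowed mean reacts faster when the loss is changing, which makes it easier to compare against the validation loss measured at the same iteration.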

In [6]:
model.evaluate(ckpt_path=ckpt_path, batch_size=200)


INFO:tensorflow:Restoring parameters from C:\Users\RENT\m\CityScapes\output\TLNet.ckpt
resuming from previously saved checkpoint
Loss on prediction was:
0.32579038
Prediction was:
[9.94965792e-01 3.30614448e-01 8.69884610e-01 1.68733686e-01
 9.78450954e-01 1.46060109e-01 6.25239193e-01 8.78840983e-02
 9.52152789e-01 3.78055423e-01 9.48817968e-01 6.73235655e-01
 6.55926764e-01 1.59432560e-01 9.95313764e-01 5.50065041e-02
 9.80191588e-01 2.79084742e-02 9.78257716e-01 2.67519057e-02
 9.95723605e-01 9.12732780e-02 8.96021366e-01 2.21678853e-01
 2.74720609e-01 1.62198126e-01 1.65425599e-01 2.83115804e-01
 9.57064807e-01 3.37624431e-01 1.96381807e-02 3.27995002e-01
 7.30727375e-01 1.86159015e-02 7.75111139e-01 2.51118749e-01
 5.52193224e-01 8.28776956e-02 9.90196109e-01 1.28346354e-01
 8.55542302e-01 8.05739760e-01 4.02074337e-01 2.11368293e-01
 9.95648026e-01 4.98700440e-02 9.95713770e-01 7.74019718e-01
 9.91321445e-01 2.35939026e-01 9.99160647e-01 1.00343019e-01
 9.92553890e-01 2.51507699e
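
`evaluate` turns these probabilities into decisions with `np.round`, which acts as a threshold at 0.5 (exact ties round to the nearest even value). Accuracy alone can hide how the errors are split, so the sketch below also computes precision and recall for an explicit threshold. The probabilities and labels are invented for illustration and are not taken from the output above.

```python
def binarize(probabilities, threshold=0.5):
    """Convert probabilities to 0/1 decisions."""
    return [1 if p >= threshold else 0 for p in probabilities]

def summarize(probabilities, labels, threshold=0.5):
    """Return (accuracy, precision, recall) at the given threshold."""
    predicted = binarize(probabilities, threshold)
    pairs = list(zip(predicted, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical predictions and ground-truth labels
example_probs = [0.99, 0.33, 0.87, 0.17, 0.62, 0.09]
example_labels = [1, 0, 1, 0, 0, 1]
print(summarize(example_probs, example_labels))
```

Raising the threshold usually trades recall for precision, which may matter if false traffic-light detections are costlier than missed ones.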

# Run Inference

In [7]:
fake_image = np.zeros((81,81,3), dtype=np.uint8)
model.predict(ckpt_path, fake_image)

INFO:tensorflow:Restoring parameters from C:\Users\RENT\m\CityScapes\output\TLNet.ckpt
resuming from previously saved checkpoint


0.13355004787445068
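
To run `predict` on a real pixel instead of a blank image, the 81x81 crop has to be centered on that pixel. The sketch below computes the slice bounds for such a crop and checks whether it fits inside the frame. The helper names and the 1024x2048 frame size are assumptions for illustration.

```python
def crop_window(center_row, center_col, crop_size=81):
    """Return (top, bottom, left, right) slice bounds of a crop centered on a pixel."""
    if crop_size % 2 == 0:
        raise ValueError("crop_size must be odd so that the crop has a center pixel")
    half = crop_size // 2
    return (center_row - half, center_row + half + 1,
            center_col - half, center_col + half + 1)

def fits_inside(window, height, width):
    """True when the crop lies fully inside an image of the given size."""
    top, bottom, left, right = window
    return top >= 0 and left >= 0 and bottom <= height and right <= width

# Hypothetical frame size; pixels closer than 40 px to a border need padding
frame_height, frame_width = 1024, 2048
for row, col in [(512, 1024), (10, 500)]:
    window = crop_window(row, col)
    print((row, col), window, fits_inside(window, frame_height, frame_width))
```

For a NumPy image `frame`, slicing `frame[top:bottom, left:right]` with bounds that fit yields the `(81, 81, 3)` array that `predict` expects.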