### Deep Neural Network using TensorFlow 

Here we will learn to implement the same using tensorflow.

- First we will try to see limitation of small data size for label classification

- We will then use large dataset and change neurons in each hidden layers to improve accuracy

- Lastly, we will introduce drop outs to regularize our network



In [1]:
import tensorflow as tf
import numpy as np
import h5py

#### First run your cat vs non-cat dataset that you have used before

In [17]:
secondRUN=1

In [3]:
# prepare your data
# 1(a) load data from .h5 file and convert to numpy array
# training set
train_dataset = h5py.File('datasets/cat-non-cat/train_catvnoncat.h5', "r")
train_x = np.array (train_dataset["train_set_x"][:])
train_y = np.array( train_dataset["train_set_y"][:])
# testing set
test_dataset = h5py.File('datasets/cat-non-cat/test_catvnoncat.h5', "r")
test_x = np.array (test_dataset["test_set_x"][:])
test_y = np.array( test_dataset["test_set_y"][:])
# class list
classes = np.array(test_dataset["list_classes"][:])

#flatten
train_X= train_x.reshape(train_x.shape[0], -1)/255.
test_X= test_x.reshape(test_x.shape[0], -1)/255.

# warning: check dimensions and be sure of it!!!
print('classes are:', classes)
print('train_x:', train_X.shape)
print('train_y (training labels):', train_y.shape)

classes are: [b'non-cat' b'cat']
train_x: (209, 12288)
train_y (training labels): (209,)


### Skip this in your first run 

**MNIST** dataset consists of 60,000 training data and 10,000 test data 

In [18]:
if secondRUN:
    mnist = tf.keras.datasets.mnist
    (x_train, train_y),(x_test, test_y) = mnist.load_data()
    train_X, test_X = x_train.reshape(x_train.shape[0], -1) / 255.0, x_test.reshape(x_test.shape[0], -1) / 255.0
    print(train_X.shape)
    print(test_X.shape)

(60000, 784)
(10000, 784)


**Run from here**

- Try changing the number of neuronal connections in each hiddden layes
- For MNIST dataset (n_output = 2)

In [31]:
# recall your network but now increase number of neurons here!!! (the accuracy should rise up)
n_hidden1 = 200    # 512
n_hidden2 = 100    # 256
n_hidden3 = 50     #  64
n_inputs = train_X.shape[1]
samples=train_y.shape[0]
n_output = len(classes)
if secondRUN:
    n_output= 10

print(samples)
print(n_output)

60000
10


**Tensorflow placeholder**

A placeholder is simply a variable that we will assign data to at a later step. It allows us to create our operations and build our computation graph, without needing the data. In TensorFlow terminology, we then feed data into the graph through these placeholders.

In [32]:
X = tf.placeholder(tf.float32, shape=(None,n_inputs), name="X")
# y = tf.placeholder(tf.float32, shape=(None), name="y")
y = tf.placeholder(tf.int32, shape=(None), name="y")
keep_prob = tf.placeholder(tf.float32, name="keep_prob")
X.shape

TensorShape([Dimension(None), Dimension(784)])

**Neuron layer function**

This function will help us to design our layers. It will need parameters: {inputs, number of neurons, the activation function, and the name of the layer}.

Recall,
For layer $l$, the linear part is: $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ (weighted sum of inputs plus biases)


In [33]:
def neuron_layer (A, n_neurons, name, activation=None):
    with tf.name_scope(name):
        n_inputs = int(A.get_shape()[1])
        stddev = 2/np.sqrt(n_inputs+n_neurons)
        
        # Outputs random values from a truncated normal distribution. (Note: this will allow faster convergence)... 
        # you can play with this and see how it affects ...
        init = tf.truncated_normal((n_inputs,n_neurons), stddev = stddev)
        W = tf.Variable(init, name="kernel")
        b = tf.Variable(tf.zeros([n_neurons]), name ="bias")
        
        Z = tf.matmul(A, W)+b
        if activation is not None:
            return activation (X)
        else:
            return Z

In [34]:
with tf.name_scope("dnn"):
    hiddden1 = neuron_layer(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    hiddden2 = neuron_layer(hiddden1, n_hidden2, name="hidden2", activation=tf.nn.relu)
    hiddden3 = neuron_layer(hiddden2, n_hidden3, name="hidden3", activation=tf.nn.relu)
    
    logits = neuron_layer(hiddden3, n_output, name="outputs")

In [35]:
# with drop-out
# with tf.name_scope("dnn"):
#     hiddden1 = neuron_layer(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
#     hiddden2 = neuron_layer(hiddden1, n_hidden2, name="hidden2", activation=tf.nn.relu)
#     hiddden3 = neuron_layer(hiddden2, n_hidden3, name="hidden3", activation=tf.nn.relu)
# #     hiddden3 = tf.nn.dropout(hiddden3, keep_prob)
#     drop_out = tf.nn.dropout(hiddden3, 0.25)  # DROP-OUT here
#     logits = neuron_layer(drop_out, n_output, name="outputs")

**Define loss or compute cost**

Here we have used, [sigmoid cross entropy loss](https://www.tensorflow.org/api_docs/python/tf/nn/sigmoid_cross_entropy_with_logits)

In [36]:
with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    l = tf.reduce_mean(xentropy, name = "loss")

**Optimization**


In [37]:
learning_rate = 0.005
with tf.name_scope("train"):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    training_op=optimizer.minimize(l)

**Prediction**

In [38]:
with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

In [39]:
init = tf.global_variables_initializer()
save = tf.train.Saver()

In [40]:
with tf.name_scope("summaries"):
    tf.summary.scalar("loss", l)
    merged = tf.summary.merge_all()

In [41]:
def shuffle_batch(X, y, batch_size):
    rnd_idx = np.random.permutation(len(X))
    n_batches = len(X) // batch_size
    for batch_idx in np.array_split(rnd_idx, n_batches):
        X_batch, y_batch = X[batch_idx], y[batch_idx]
        yield X_batch, y_batch

In [43]:
# Execute
n_epochs = 2000
batch_size = 256
# number of training samples is critical for training your deep neural network
with tf.Session() as sess:
    init.run()
    
    for epoch in range(n_epochs):
#         for iteration in range (train_X.shape[0]// batch_size):
        for X_batch, y_batch in shuffle_batch (train_X, train_y, batch_size):
            _, loss_val = sess.run([training_op, l], feed_dict={X: X_batch, y: y_batch })
        if epoch % 10 == 0:    
            acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
            acc_val = accuracy.eval(feed_dict={X: test_X, y: test_y})
            print("epoch:", epoch, "loss:",loss_val , "Training (batch) accuracy:", acc_train, "Val accuracy:", acc_val)
            
#     save_path= saver.save(sess, "myFirstModel.ckpt")

epoch: 0 loss: 1.4478667 Training (batch) accuracy: 0.69921875 Val accuracy: 0.6627
epoch: 10 loss: 0.6228868 Training (batch) accuracy: 0.82421875 Val accuracy: 0.87
epoch: 20 loss: 0.39652473 Training (batch) accuracy: 0.8984375 Val accuracy: 0.8854
epoch: 30 loss: 0.39734223 Training (batch) accuracy: 0.8984375 Val accuracy: 0.8933
epoch: 40 loss: 0.3940316 Training (batch) accuracy: 0.91796875 Val accuracy: 0.8976
epoch: 50 loss: 0.4544785 Training (batch) accuracy: 0.87109375 Val accuracy: 0.9014
epoch: 60 loss: 0.42724705 Training (batch) accuracy: 0.8828125 Val accuracy: 0.9046
epoch: 70 loss: 0.4051661 Training (batch) accuracy: 0.890625 Val accuracy: 0.9065
epoch: 80 loss: 0.31898707 Training (batch) accuracy: 0.91796875 Val accuracy: 0.9088
epoch: 90 loss: 0.40013003 Training (batch) accuracy: 0.890625 Val accuracy: 0.9097
epoch: 100 loss: 0.3063572 Training (batch) accuracy: 0.921875 Val accuracy: 0.9112
epoch: 110 loss: 0.4346578 Training (batch) accuracy: 0.87890625 Val ac

epoch: 960 loss: 0.27997655 Training (batch) accuracy: 0.90625 Val accuracy: 0.9235
epoch: 970 loss: 0.27361807 Training (batch) accuracy: 0.9375 Val accuracy: 0.9236
epoch: 980 loss: 0.26678193 Training (batch) accuracy: 0.9453125 Val accuracy: 0.9237
epoch: 990 loss: 0.29656076 Training (batch) accuracy: 0.8984375 Val accuracy: 0.9239
epoch: 1000 loss: 0.2304095 Training (batch) accuracy: 0.93359375 Val accuracy: 0.9237
epoch: 1010 loss: 0.2303172 Training (batch) accuracy: 0.921875 Val accuracy: 0.9236
epoch: 1020 loss: 0.3135133 Training (batch) accuracy: 0.94921875 Val accuracy: 0.9237
epoch: 1030 loss: 0.29263908 Training (batch) accuracy: 0.93359375 Val accuracy: 0.9237
epoch: 1040 loss: 0.2622586 Training (batch) accuracy: 0.921875 Val accuracy: 0.9238
epoch: 1050 loss: 0.31181186 Training (batch) accuracy: 0.921875 Val accuracy: 0.9238
epoch: 1060 loss: 0.2960021 Training (batch) accuracy: 0.921875 Val accuracy: 0.9237
epoch: 1070 loss: 0.32274312 Training (batch) accuracy: 0.

epoch: 1910 loss: 0.24972343 Training (batch) accuracy: 0.9296875 Val accuracy: 0.9249
epoch: 1920 loss: 0.28432736 Training (batch) accuracy: 0.9296875 Val accuracy: 0.9249
epoch: 1930 loss: 0.23483682 Training (batch) accuracy: 0.9296875 Val accuracy: 0.925
epoch: 1940 loss: 0.31743768 Training (batch) accuracy: 0.9375 Val accuracy: 0.9249
epoch: 1950 loss: 0.30538252 Training (batch) accuracy: 0.93359375 Val accuracy: 0.9249
epoch: 1960 loss: 0.28561026 Training (batch) accuracy: 0.92578125 Val accuracy: 0.9249
epoch: 1970 loss: 0.26672268 Training (batch) accuracy: 0.94921875 Val accuracy: 0.925
epoch: 1980 loss: 0.260058 Training (batch) accuracy: 0.9140625 Val accuracy: 0.9247
epoch: 1990 loss: 0.2039333 Training (batch) accuracy: 0.94921875 Val accuracy: 0.925


### Exploring training on larger datasets

### Tasks:
    [0] Use MNIST data to train your network (play with different network architectures!!!)
    [1] Split the data into train-val set and retrain your model also computing validation accuracy
    [2] Make deeper network to achieve higher performance of your training model
    [3] Visualize your loss and accuracy on tensorboard
    [4] Add droppout layer and retrain your model. Comment on the difference that you have seen in the training function

In [None]:
# help for loading MNIST data
import numpy as np
data = np.load('datasets/MNIST_data/mnist.npz')
x_train= data['x_train']
y_train = data['y_train']
x_test = data['x_test']
y_test = data['y_test']

# Note if this does not work then use below
# mnist = tf.keras.datasets.mnist
# (x_train, y_train), (x_test, y_test) = mnist.load_data()