# Deep neural network using TensorFlow

In this tutorial I want to implement the DNN that I explained **here**, this time using TensorFlow. TensorFlow is a widely used framework (in really simple words: a bunch of commands for quickly writing deep learning programs), so this is a good opportunity to grasp the basics directly in a project. **Here** is a cheatsheet for TensorFlow.

Let's use the same dataset used for logistic regression.

In [109]:
import pandas as pd
db=pd.read_csv("abalone.data", header=None)
db.head()

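|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | M | 0.455 | 0.365 | 0.095 | 0.514 | 0.2245 | 0.101 | 0.15 | 15 |
| 1 | M | 0.35 | 0.265 | 0.09 | 0.2255 | 0.0995 | 0.0485 | 0.07 | 7 |
| 2 | F | 0.53 | 0.42 | 0.135 | 0.677 | 0.2565 | 0.1415 | 0.21 | 9 |
| 3 | M | 0.44 | 0.365 | 0.125 | 0.516 | 0.2155 | 0.114 | 0.155 | 10 |
| 4 | I | 0.33 | 0.255 | 0.08 | 0.205 | 0.0895 | 0.0395 | 0.055 | 7 |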


## Data preprocessing

We want to classify, based on the information in the database (columns 1-8), whether the abalone is Male, Female or Infant (column 0). So let's split the database into examples and labels, convert them to NumPy vectors/matrices so that we can iterate through them, and then encode every element of the label vector as a 3-element (k=3) one-hot vector.

In [110]:
import numpy as np
#.values returns the underlying NumPy array (as_matrix() was removed in pandas 1.0)
labels=db[0].values
examples=db.drop(0, axis=1).values
zeros=np.zeros((labels.shape[0],3))

#encode the labels
for idx, item in enumerate(labels):
    if item=='I':
        zeros[idx]=np.array([1,0,0])
    elif item=='M':
        zeros[idx]=np.array([0,1,0])
    else:
        zeros[idx]=np.array([0,0,1])
labels=zeros
print("Examples shape:", examples.shape)
print("Labels shape:", labels.shape)

Examples shape: (4177, 8)
Labels shape: (4177, 3)
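
The same encoding can probably be written without the explicit loop. This is only a minimal sketch, assuming the class order used above (I, M, F) and a small made-up `sex` array standing in for the real label column:

```python
import numpy as np

# Hypothetical sample of the sex column; the real values come from db[0].
sex = np.array(["M", "M", "F", "M", "I"])

# Same class order as the loop: I -> index 0, M -> index 1, anything else (F) -> index 2.
class_index = np.where(sex == "I", 0, np.where(sex == "M", 1, 2))

# Picking rows of the 3x3 identity matrix yields the one-hot vectors.
one_hot = np.eye(3)[class_index]

print(one_hot)
```

Indexing the identity matrix with the class indices gives one row per example, so the result has the same `(N, 3)` shape as the `labels` matrix built with the loop.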


The last thing to do before defining the graph is to shuffle the examples and then split the dataset into train and test examples.

In [111]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(examples, labels, test_size=0.3, shuffle=True)
print("Train examples shape:", x_train.shape)
print("Test examples shape:", x_test.shape)

Train examples shape: (2923, 8)
Test examples shape: (1254, 8)
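
To make the shuffle-and-split step less of a black box, here is a rough NumPy-only sketch of the idea. The function `shuffle_split` and its `seed` parameter are my own illustrative names, not part of scikit-learn, and the arrays are zero-filled placeholders with the same shapes as the abalone data. Rounding the test size up gives the same shapes as printed above.

```python
import numpy as np

def shuffle_split(x, y, test_fraction=0.3, seed=0):
    """Shuffle the rows of x and y together, then split off a test set."""
    n = x.shape[0]
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)                 # random ordering of the row indices
    n_test = int(np.ceil(n * test_fraction))   # test size rounded up
    test_idx, train_idx = order[:n_test], order[n_test:]
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]

# Placeholder arrays with the same shapes as the abalone examples and labels.
x = np.zeros((4177, 8))
y = np.zeros((4177, 3))
x_tr, x_te, y_tr, y_te = shuffle_split(x, y)
print("Train examples shape:", x_tr.shape)
print("Test examples shape:", x_te.shape)
```

The important detail is that the same permutation is applied to both examples and labels, so every example stays paired with its own label.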


# The model

## Define the graph

Now we can define the number of layers and the number of neurons that our model is going to have. First of all, we have examples with **8 features** (8 columns), so the **input layer** is going to have 8 neurons. Secondly, every label is a 3-dimensional vector (k=3), so the **output layer** is going to have k neurons, in other words **3 neurons**. Now we have to decide how many hidden layers to use (I'm thinking 3 to keep it easy) and how many neurons to put in each hidden layer (I'm thinking 20, 30 and 40).
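
To get a feeling for the size of this network, here is a small sketch in plain Python that counts the weights and biases implied by these layer sizes. It assumes fully connected layers with one bias per neuron; the list `layer_sizes` is just my own shorthand for the architecture.

```python
# Layer widths from the text: 8 inputs, hidden layers of 20, 30 and 40, 3 outputs.
layer_sizes = [8, 20, 30, 40, 3]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_weights = n_in * n_out   # one weight per (input neuron, output neuron) pair
    n_biases = n_out           # one bias per output neuron
    total += n_weights + n_biases
    print(f"{n_in:>2} -> {n_out:<2}: weights {n_weights}, biases {n_biases}")

print("Trainable parameters:", total)
```

Each weight matrix has shape `[n_in, n_out]`, which is why a `[N, n_in]` batch multiplied by it gives a `[N, n_out]` output.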

In [112]:
import tensorflow as tf
tf.reset_default_graph()

#input layer (8 neurons)
examples=tf.placeholder(tf.float32, shape=[None,8])

#first hidden layer (variables initialized with uniform distribution)
w1=tf.get_variable("h1_weights", [8,20])
b1=tf.get_variable("h1_biases", [20])
logit1=tf.add(tf.matmul(examples,w1), b1)
output1=tf.sigmoid(logit1) #output1 is a [N,20] matrix

#second hidden layer (variables initialized with a gaussian distribution )
initW2=tf.random_normal((20,30), mean=0, stddev=1)
initB2=tf.random_normal((1,30), mean=0, stddev=1)
w2=tf.get_variable("h2_weights", initializer=initW2)
b2=tf.get_variable("h2_biases", initializer=initB2)
logit2=tf.add(tf.matmul(output1,w2), b2)
output2=tf.sigmoid(logit2) #output2 is a [N,30] matrix

#third hidden layer
initW3=tf.random_normal((30,40), mean=0, stddev=1)
initB3=tf.random_normal((1,40), mean=0, stddev=1)
w3=tf.get_variable("h3_weights", initializer=initW3)
b3=tf.get_variable("h3_biases", initializer=initB3)
logit3=tf.add(tf.matmul(output2,w3), b3)
output3=tf.sigmoid(logit3) #output3 is a [N,40] matrix

#output layer(3 neurons)
initWOut=tf.random_normal((40,3), mean=0, stddev=1)
initBOut=tf.random_normal((1,3), mean=0, stddev=1)
wout=tf.get_variable("out_weights", initializer=initWOut)
bout=tf.get_variable("out_biases", initializer=initBOut)
logitout=tf.add(tf.matmul(output3,wout),bout)
output=tf.nn.softmax(logitout)

#placeholder to feed the labels
labels=tf.placeholder(tf.float32, shape=[None,3])

#calculate cost (I added 1e-9=0.000000001 to avoid log(0)=-infinity)
exampleError=-tf.reduce_sum(labels * tf.log(output + 1e-9)+ (1-labels) * tf.log(1-output+ 1e-9), 1)
error=tf.reduce_mean(exampleError,0)

#define optimizer for backpropagation
learning_rate=0.0001
opt=tf.train.GradientDescentOptimizer(learning_rate)
gradient=opt.minimize(error)

#define the operation to measure the accuracy
#for every example's output vector, return the index of the highest value; if it matches the index of the label
#return 1 (correct prediction), else 0
pred=tf.equal(tf.argmax(labels,1), tf.argmax(output,1))
#calculate, over all the examples, how many are 1: more ones, more correct predictions, more accuracy!
accuracy=tf.reduce_mean(tf.cast(pred, tf.float32)) #it is a number between 0-1
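
To see what the cost and accuracy operations compute, here is a NumPy-only sketch of the same formulas on a tiny batch. The logits and labels are made-up numbers, and the helper names `softmax`, `cost` and `accuracy` are mine, so take it as an illustration and not as what TensorFlow does internally.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cost(labels, output, eps=1e-9):
    # Per-example sum over the 3 classes, then the mean over the batch,
    # mirroring the tf.reduce_sum / tf.reduce_mean pair.
    per_example = -np.sum(labels * np.log(output + eps)
                          + (1 - labels) * np.log(1 - output + eps), axis=1)
    return per_example.mean()

def accuracy(labels, output):
    # Fraction of rows where the predicted class equals the labelled class.
    return np.mean(np.argmax(labels, axis=1) == np.argmax(output, axis=1))

# Made-up logits and one-hot labels for a batch of 4 examples.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3],
                   [0.1, 0.2, 3.0],
                   [1.0, 0.9, 0.8]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]], dtype=float)

output = softmax(logits)
print("cost:", cost(labels, output))
print("accuracy:", accuracy(labels, output))
```

In this toy batch the first three examples are predicted correctly and the last one is not, so the accuracy is 3 out of 4.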

Before running the graph, I'm going to define a series of statements that we will use later to show, with **tensorboard**, the values of different variables such as the distribution of the weights, the error and the distribution of the output in each layer.

In [113]:
tf.summary.histogram('h1_weight', w1)
tf.summary.histogram('h1_output', output1)
tf.summary.histogram('h2_weight', w2)
tf.summary.histogram('h2_output', output2)
tf.summary.histogram('h3_weight', w3)
tf.summary.histogram('h3_output', output3)
tf.summary.histogram('output_weight', wout)
tf.summary.histogram('output', output)

tf.summary.scalar('error', error)
tf.summary.scalar('accuracy', accuracy)

<tf.Tensor 'accuracy:0' shape=() dtype=string>

## Run the graph

Now that we have defined the graph, we have to:

- initialize the variables,
- group the summaries defined above, and
- save the graph structure and the values inside the graph (weights, biases, error values...) to a file.

In [114]:
init=tf.global_variables_initializer()
summ=tf.summary.merge_all()
#no session exists yet at this point, so pass the default graph instead of sess.graph
file=tf.summary.FileWriter('summaries/', tf.get_default_graph())

Finally, we have to define the epochs and batches, and then we can run the graph!

We divide the train dataset into batches of 100 examples. We are going to feed 100 examples at a time instead of all together, to reduce the computation required at each step. We start with the first 100 examples (from 0 to 99) and tell TensorFlow to pass the first 100 examples x_train\[batch_start:batch_end\] through the placeholder named `examples` and the first 100 y_train labels through the placeholder `labels`.
The `sess.run` call then asks TensorFlow to compute the `gradient` and `error` operations defined before. Running `gradient` updates the weights, while we use the value of `error` to see how the error changes over time.

In [115]:
N=x_train.shape[0] #take number of examples in the train dataset
epoches=1000
batches=int(N/100) #100 examples in each batch

#run the graph
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epoches):
        batch_start=0
        batch_end=100
        ep_error=0 #error for the epoch
        for batch in range(batches):
            feed={examples:x_train[batch_start:batch_end], labels:y_train[batch_start:batch_end]}
            #ask to compute the gradient and error variables
            _,err=sess.run([gradient,error], feed_dict=feed)
            batch_start+=100
            batch_end +=100
            ep_error+=err #error for the batch

        if epoch%10==0:
            feed={examples:x_test, labels:y_test}
            acc, summary=sess.run([accuracy,summ], feed)
            #add the values of the graph to the file and assign it to their epoch
            file.add_summary(summary, epoch)

        if epoch%50==0:
            print("Epoch",epoch,"| Error", ep_error, "| Accuracy", acc)

print("Training finished!")

Epoch 0 | Error 111.56911993026733 | Accuracy 0.32535884
Epoch 50 | Error 78.18855547904968 | Accuracy 0.13716109
Epoch 100 | Error 59.14495897293091 | Accuracy 0.14593302
Epoch 150 | Error 55.31067454814911 | Accuracy 0.3676236
Epoch 200 | Error 55.01319134235382 | Accuracy 0.42424244
Epoch 250 | Error 54.83446550369263 | Accuracy 0.47767144
Epoch 300 | Error 54.661794781684875 | Accuracy 0.4952153
Epoch 350 | Error 54.49092781543732 | Accuracy 0.5
Epoch 400 | Error 54.32077193260193 | Accuracy 0.50717705
Epoch 450 | Error 54.15148341655731 | Accuracy 0.52073365
Epoch 500 | Error 53.983920216560364 | Accuracy 0.5350877
Epoch 550 | Error 53.819246768951416 | Accuracy 0.53748006
Epoch 600 | Error 53.658419370651245 | Accuracy 0.5366826
Epoch 650 | Error 53.501829385757446 | Accuracy 0.5334928
Epoch 700 | Error 53.34944772720337 | Accuracy 0.52870816
Epoch 750 | Error 53.20088732242584 | Accuracy 0.52711326
Epoch 800 | Error 53.05561935901642 | Accuracy 0.52551836
Epoch 850 | Error 52.91
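
The batch slicing used in the training loop above can be sketched on its own, without TensorFlow. The generator `batch_bounds` is a hypothetical helper of mine. It only reproduces the `batch_start`/`batch_end` arithmetic for the 2923 training examples, and it shows that a few examples at the end never make it into a batch.

```python
def batch_bounds(n_examples, batch_size=100):
    """Yield (start, end) slice bounds for every full batch."""
    n_batches = n_examples // batch_size   # same result as int(N/100) in the loop
    for b in range(n_batches):
        start = b * batch_size
        yield start, start + batch_size

bounds = list(batch_bounds(2923))   # 2923 training examples, as printed earlier
print(len(bounds), "batches")
print("first:", bounds[0], "last:", bounds[-1])
print("examples never used:", 2923 - bounds[-1][1])
```

Since Python slices exclude the end index, `x_train[0:100]` covers examples 0 to 99, and the next batch starts exactly at 100.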

Every 10 epochs we compute the accuracy and save the values of the graph to the file, which we can later visualize with tensorboard: run the command `tensorboard --logdir=path/summaries`, then open the browser and go to `localhost:portNumber` (tensorboard prints the port when it starts; 6006 is its default).