## TensorFlow with an example

Thanks to @edureka on youtube.com, I prepared the following notebook as a quick hands-on example of a neural network model for a classification problem. The code uses the TensorFlow 1.x graph API (`tf.placeholder`, `tf.Session`).

In [1]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

In [2]:
df = pd.read_csv('../data/Sonar.csv')

In [3]:
df.head()

Unnamed: 0,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,...,V52,V53,V54,V55,V56,V57,V58,V59,V60,Class
0,0.02,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,...,0.0027,0.0065,0.0159,0.0072,0.0167,0.018,0.0084,0.009,0.0032,1
1,0.0453,0.0523,0.0843,0.0689,0.1183,0.2583,0.2156,0.3481,0.3337,0.2872,...,0.0084,0.0089,0.0048,0.0094,0.0191,0.014,0.0049,0.0052,0.0044,1
2,0.0262,0.0582,0.1099,0.1083,0.0974,0.228,0.2431,0.3771,0.5598,0.6194,...,0.0232,0.0166,0.0095,0.018,0.0244,0.0316,0.0164,0.0095,0.0078,1
3,0.01,0.0171,0.0623,0.0205,0.0205,0.0368,0.1098,0.1276,0.0598,0.1264,...,0.0121,0.0036,0.015,0.0085,0.0073,0.005,0.0044,0.004,0.0117,1
4,0.0762,0.0666,0.0481,0.0394,0.059,0.0649,0.1209,0.2467,0.3564,0.4459,...,0.0031,0.0054,0.0105,0.011,0.0015,0.0072,0.0048,0.0107,0.0094,1


**Preparing the samples and labels:**

Note the `.values` applied to `df`: for later use in TensorFlow, `X` should be a NumPy array, not a pandas DataFrame.

In [4]:
X = df.drop('Class', axis = 1).values
y = df['Class']

In [5]:
def one_hot_encode(labels):
    # labels must be integers in 0 .. n_unique_labels - 1
    labels = np.asarray(labels)
    n_labels = len(labels)
    n_unique_labels = len(np.unique(labels))
    encoded = np.zeros((n_labels, n_unique_labels))
    # set column labels[i] of row i to 1
    encoded[np.arange(n_labels), labels] = 1
    return encoded

In [6]:
y_o = one_hot_encode(y)
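To see what the encoding produces, here is a minimal NumPy sketch on a made-up label vector. The function name `one_hot` and the toy labels are illustrative only and are not taken from `Sonar.csv`. The sketch assumes the labels are integers starting at 0.

```python
import numpy as np

def one_hot(labels):
    """Return an (n_samples, n_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    n_classes = len(np.unique(labels))
    encoded = np.zeros((len(labels), n_classes))
    # Row i gets a 1 in column labels[i].
    encoded[np.arange(len(labels)), labels] = 1
    return encoded

toy_labels = [1, 0, 1, 1]          # hypothetical labels
toy_encoded = one_hot(toy_labels)
print(toy_encoded)
# [[0. 1.]
#  [1. 0.]
#  [0. 1.]
#  [0. 1.]]
```

Taking `np.argmax` along axis 1 recovers the original integer labels, which is how the predicted and actual classes are compared later in the notebook.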

In [7]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y_o, test_size = 0.2, random_state = 0)

In [8]:
learning_rate = 0.3
training_epochs = 1500
cost_history = np.empty(shape=[1], dtype = float)
n_class = 2
model_path = "./dp_model/dp_solar_model"

We use a neural network with 6 layers (an input layer, 4 hidden layers, and an output layer), with the following dimensions:

In [9]:
n_0 = X.shape[1] # this is equal to the number of features per sample
n_1 = 50 # size of hidden layer 1
n_2 = 50 # size of hidden layer 2
n_3 = 50 # size of hidden layer 3
n_4 = 50 # size of hidden layer 4
n_5 = len(np.unique(y)) # this is equal to the number of classes 

In [10]:
x_ = tf.placeholder(tf.float32, shape=[None, n_0], name = 'x_')
yreal_ = tf.placeholder(tf.float32, shape=[None, n_5], name = 'yreal_')

Overall, we have 6 layers, indexed 0, 1, 2, 3, 4, 5. The first layer is denoted by $\boldsymbol{a}^{(0)}$ and the final layer by $\boldsymbol{a}^{(L)}$, where $L = 5$. For a single sample written as a column vector, the linear map of each layer $l$ is: $$\boldsymbol{a}^{(l)} = \mathrm{W}^{(l)} \boldsymbol{a}^{(l - 1)} + \boldsymbol{b}^{(l)}.$$

Although it is convenient to write the linear map as above for one sample, with many samples stacked as rows we rewrite the formula for the first layer as follows:
\begin{eqnarray}
\boldsymbol{a}^{(1)} = \boldsymbol{a}^{(0)} \mathrm{W}^{(1)} + \boldsymbol{b}^{(1)}
\end{eqnarray}
where the dimension of $\mathrm{W}^{(1)}$ is $n_0 \times n_1$.

Since $\boldsymbol{a}^{(0)}$ is indeed the input $(x)_{N \times n_0}$, where $N$ is the number of samples, the above formula reads as: 
\begin{eqnarray}
\boldsymbol{a}^{(1)} = \boldsymbol{x} \mathrm{W}^{(1)} + \boldsymbol{b}^{(1)}
\end{eqnarray}
Verify that the dimension of $\boldsymbol{a}^{(1)}$ is $N \times n_1$.
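The dimension check can also be done numerically. The sketch below is an illustration only: it uses random NumPy arrays with a made-up batch size `N = 5`, and it confirms both the $N \times n_1$ shape and that the row form $\boldsymbol{x} \mathrm{W}^{(1)}$ agrees with the column form once the weight matrix is transposed.

```python
import numpy as np

N, n_0, n_1 = 5, 60, 50            # N is a made-up batch size; n_0 and n_1 match the notebook
rng = np.random.default_rng(0)

x = rng.normal(size=(N, n_0))      # one sample per row
W1 = rng.normal(size=(n_0, n_1))
b1 = rng.normal(size=(n_1,))       # broadcast across the N rows

row_form = x @ W1 + b1                         # samples as rows
column_form = W1.T @ x.T + b1[:, None]         # samples as columns

print(row_form.shape)                          # (5, 50)
print(np.allclose(row_form, column_form.T))    # True
```

The two conventions differ only by a transpose, so storing the weights as $n_0 \times n_1$ is what allows the whole batch to go through a single matrix product.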

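The formulas above omit the activation function. In the code, the linear map of layer $l$ is stored in $\boldsymbol{z}^{(l)}$ and an activation function $g$ (sigmoid or ReLU below) is then applied to it:
\begin{eqnarray}
\boldsymbol{z}^{(l)} &=& \boldsymbol{a}^{(l - 1)} \mathrm{W}^{(l)} + \boldsymbol{b}^{(l)} \\
\boldsymbol{a}^{(l)} &=& g\left(\boldsymbol{z}^{(l)}\right)
\end{eqnarray}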


Therefore the dimensions of the weight matrices are as follows:

$\mathrm{W}^{(1)}$ is $n_0 \times n_1$

$\mathrm{W}^{(2)}$ is $n_1 \times n_2$

$\mathrm{W}^{(3)}$ is $n_2 \times n_3$

$\mathrm{W}^{(4)}$ is $n_3 \times n_4$

$\mathrm{W}^{(5)}$ is $n_4 \times n_5$
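The chain of shapes listed above can be generated from the layer sizes. This is a hedged NumPy sketch, not the TensorFlow code of the notebook: the variable names (`layer_sizes`, `weights`, `biases`) and the batch of 8 random samples are assumptions made for illustration, and the activations are left out because they do not change any shape.

```python
import numpy as np

layer_sizes = [60, 50, 50, 50, 50, 2]   # n_0 ... n_5 as in the notebook
rng = np.random.default_rng(0)

# Each weight matrix has shape (n_{l-1}, n_l); each bias has shape (n_l,)
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.normal(size=(n_out,)) for n_out in layer_sizes[1:]]

a = rng.normal(size=(8, layer_sizes[0]))   # 8 made-up samples
shapes = [a.shape]
for W, b in zip(weights, biases):
    a = a @ W + b          # linear map only; activations omitted
    shapes.append(a.shape)

print([W.shape for W in weights])
# [(60, 50), (50, 50), (50, 50), (50, 50), (50, 2)]
print(shapes)
# [(8, 60), (8, 50), (8, 50), (8, 50), (8, 50), (8, 2)]
```

Each matrix product is defined only because the column count of one weight matrix equals the row count of the next.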

In [11]:
w = {
    '1' : tf.Variable(tf.truncated_normal([n_0, n_1])), 
    '2' : tf.Variable(tf.truncated_normal([n_1, n_2])),
    '3' : tf.Variable(tf.truncated_normal([n_2, n_3])),
    '4' : tf.Variable(tf.truncated_normal([n_3, n_4])),
    '5' : tf.Variable(tf.truncated_normal([n_4, n_5]))
}
b = {
    '1' : tf.Variable(tf.truncated_normal([n_1])),
    '2' : tf.Variable(tf.truncated_normal([n_2])),
    '3' : tf.Variable(tf.truncated_normal([n_3])),
    '4' : tf.Variable(tf.truncated_normal([n_4])),
    '5' : tf.Variable(tf.truncated_normal([n_5]))
}

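The next step is to define the hypothesis function of the neural network, that is, the formula for the output layer. It is obtained by applying $\boldsymbol{z}^{(l)} = \boldsymbol{a}^{(l - 1)} \mathrm{W}^{(l)} + \boldsymbol{b}^{(l)}$ layer after layer, with an activation function applied to $\boldsymbol{z}^{(l)}$ in each hidden layer. The output layer returns the linear values (logits) without an activation.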

In [12]:
with tf.variable_scope('layer_1'):
    z_1 = tf.add(tf.matmul(x_, w['1']), b['1'])
    a_1 = tf.nn.sigmoid(z_1)

with tf.variable_scope('layer_2'):
    z_2 = tf.add(tf.matmul(a_1, w['2']), b['2'])
    a_2 = tf.nn.sigmoid(z_2)

with tf.variable_scope('layer_3'):
    z_3 = tf.add(tf.matmul(a_2, w['3']), b['3'])
    a_3 = tf.nn.sigmoid(z_3)

with tf.variable_scope('layer_4'):
    z_4 = tf.add(tf.matmul(a_3, w['4']), b['4'])
    a_4 = tf.nn.relu(z_4)

with tf.variable_scope('output_layer'):
    # no activation here: ypred_ holds the logits, and softmax is applied inside the cost function
    ypred_ = tf.add(tf.matmul(a_4, w['5']), b['5'])

In [13]:
cost_function_ = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=ypred_, labels=yreal_))
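To make the cost function less opaque, here is a NumPy sketch of the quantity it computes: the mean, over samples, of the cross-entropy between the one-hot labels and the softmax of the logits. The function name `softmax_cross_entropy` and the toy inputs are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

labels = np.array([[0.0, 1.0], [1.0, 0.0]])

# Equal logits give a 50/50 prediction, so the cost is ln(2).
uninformative = softmax_cross_entropy(np.zeros((2, 2)), labels)
print(round(uninformative, 4))    # 0.6931

# Logits that favour the correct class give a lower cost.
confident = softmax_cross_entropy(np.array([[-2.0, 2.0], [2.0, -2.0]]), labels)
print(confident < uninformative)  # True
```

With two classes, a cost close to $\ln 2 \approx 0.693$ is what a model predicting roughly equal probabilities for both classes would produce, which is a useful reference point when reading the training log.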

In [14]:
with tf.variable_scope('logging'):
    tf.summary.scalar('cost', cost_function_)
    summary = tf.summary.merge_all()

In [15]:
init = tf.global_variables_initializer()

In [16]:
training_step_ = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function_)

In [17]:
Xtrain.shape

(166, 60)

In [18]:
cost_history = []
accuracy_history = []

In [19]:
correct_prediction_ = tf.equal(tf.argmax(yreal_, 1), tf.argmax(ypred_, 1))

In [20]:
accuracy_ = tf.reduce_mean(tf.cast(correct_prediction_, tf.float32))
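The two cells above define accuracy as the fraction of rows where the arg-max of the one-hot label matches the arg-max of the network output. A NumPy sketch of the same computation, on made-up labels and logits, looks like this:

```python
import numpy as np

y_true = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])                    # one-hot labels (made up)
logits = np.array([[2.0, -1.0], [0.5, 0.1], [-0.3, 0.8], [1.2, 0.4]])  # made-up network outputs

# Compare the index of the largest entry in each row.
correct = np.argmax(y_true, axis=1) == np.argmax(logits, axis=1)
accuracy = correct.astype(np.float32).mean()

print(correct)    # [ True False  True  True]
print(accuracy)   # 0.75
```

Because only the arg-max matters, the logits do not need to be passed through softmax before computing accuracy.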

In [21]:
with tf.Session() as sess:
    sess.run(init)
    training_writer = tf.summary.FileWriter("../logs/training", sess.graph)
    for epoch in range(training_epochs):
        sess.run(training_step_, feed_dict={x_: Xtrain, yreal_: ytrain})
        cost, training_summary = sess.run([cost_function_, summary], feed_dict={x_: Xtrain, yreal_: ytrain})
        training_writer.add_summary(training_summary, epoch)
        accuracy = sess.run(accuracy_, feed_dict={x_: Xtrain, yreal_: ytrain})
        cost_history = np.append(cost_history, cost)
        # store the evaluated value, not the accuracy_ tensor
        accuracy_history = np.append(accuracy_history, accuracy)
        if epoch % 10 == 0:
            print("epoch: ", epoch, " - cost", cost, "- accuracy ", accuracy)
    training_writer.close()
    # accuracy on the test set:
    test_accuracy = sess.run(accuracy_, feed_dict={x_: Xtest, yreal_: ytest})
    print("test accuracy: ", test_accuracy)
    ypredLabel_ = tf.argmax(ypred_, 1)
    ypredLabel = sess.run(ypredLabel_, feed_dict={x_: Xtest})

epoch:  0  - cost 56.95342 - accuracy  0.45783132
epoch:  10  - cost 0.6973416 - accuracy  0.45783132
epoch:  20  - cost 0.68456984 - accuracy  0.57831323
epoch:  30  - cost 0.68279636 - accuracy  0.57831323
epoch:  40  - cost 0.681412 - accuracy  0.56626505
epoch:  50  - cost 0.679464 - accuracy  0.58433735
epoch:  60  - cost 0.6957744 - accuracy  0.4698795
epoch:  70  - cost 0.6887726 - accuracy  0.5421687
epoch:  80  - cost 0.68841714 - accuracy  0.5421687
epoch:  90  - cost 0.6881763 - accuracy  0.5421687
epoch:  100  - cost 0.68789387 - accuracy  0.5421687
epoch:  110  - cost 0.68756396 - accuracy  0.5421687
epoch:  120  - cost 0.6871912 - accuracy  0.5421687
epoch:  130  - cost 0.6867595 - accuracy  0.5421687
epoch:  140  - cost 0.6862539 - accuracy  0.5421687
epoch:  150  - cost 0.6856283 - accuracy  0.5421687
epoch:  160  - cost 0.68488014 - accuracy  0.5421687
epoch:  170  - cost 0.68027896 - accuracy  0.5421687
epoch:  180  - cost 0.67010134 - accuracy  0.5421687
epoch:  190 

Now we compare the predicted labels with the actual labels of the test set:

In [22]:
ytestLabel = np.argmax(ytest, 1)

In [23]:
pd.DataFrame(np.transpose([ypredLabel, ytestLabel]), columns=['Prediction', 'Actual'])

Unnamed: 0,Prediction,Actual
0,0,1
1,1,1
2,0,1
3,1,1
4,0,0
5,1,1
6,0,0
7,0,0
8,1,1
9,0,0
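As a quick sanity check on the rows displayed above, the agreement between the two columns can be computed directly. The arrays below are copied from those ten rows in the order passed to `np.transpose` (first `ypredLabel`, then `ytestLabel`); the names `shown_pred` and `shown_true` are illustrative. The result covers only the displayed rows and should not be read as the accuracy on the full test set.

```python
import numpy as np

# The ten displayed rows: first column is ypredLabel, second column is ytestLabel.
shown_pred = np.array([0, 1, 0, 1, 0, 1, 0, 0, 1, 0])
shown_true = np.array([1, 1, 1, 1, 0, 1, 0, 0, 1, 0])

agreement = float((shown_pred == shown_true).mean())
mismatched_rows = np.flatnonzero(shown_pred != shown_true).tolist()

print(agreement)         # 0.8
print(mismatched_rows)   # [0, 2]
```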
