We will replicate the work we did in Hw1P3 using a neural network. 

We first import TensorFlow and the other dependencies.

In [1]:
import tensorflow as tf
import scipy.io as sio
import numpy as np
#import matplotlib.pyplot as plt

from datetime import datetime

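Then we load the matrix A1P3 from the MATLAB file. Each column of A1P3 is one sample: the first row holds its label and the remaining rows hold its features. We store the feature vectors in trainX and the labels in trainY.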

In [2]:
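# Import data: row 0 of A1P3 holds the labels, the remaining rows hold the features
A = sio.loadmat('/Users/f002bpv/Documents/MATLAB/engg177/hw1/A1P3.mat')['A1P3']
trainX = A[1:,:].T                # one row per sample, one column per feature
trainY = A[0,:].T.reshape(-1,1)   # column vector of labels (-1 infers the sample count)
#trainY = np.concatenate([trainY,abs(trainY-1)],axis=1)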
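To make the slicing concrete, here is a small sketch of the same split on a toy matrix stored as plain Python lists. The toy values and the helper name `split_labels_and_features` are made up for illustration; the notebook itself does this with NumPy slicing.

```python
def split_labels_and_features(matrix):
    """Split a matrix given as a list of rows, with one sample per column.

    Row 0 holds the labels; each remaining row holds one feature.
    Returns (features, labels) with one entry per sample.
    """
    labels = [[label] for label in matrix[0]]                  # mirrors trainY
    features = [list(column) for column in zip(*matrix[1:])]   # mirrors trainX
    return features, labels


# Toy matrix: 3 samples (columns), 1 label row plus 2 feature rows.
toy = [
    [0, 1, 1],        # labels
    [0.1, 0.2, 0.3],  # feature 1
    [1.0, 2.0, 3.0],  # feature 2
]
toy_x, toy_y = split_labels_and_features(toy)
print(toy_x)  # [[0.1, 1.0], [0.2, 2.0], [0.3, 3.0]]
print(toy_y)  # [[0], [1], [1]]
```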

As a sanity check, we inspect the shapes of these objects.

In [3]:
A.shape

(5, 10000)

In [4]:
trainX.shape

(10000, 4)

In [5]:
trainY.shape

(10000, 1)

To make the code easier to reuse, we derive the model dimensions from the data instead of hard-coding them.

In [6]:
#Set Model parameters
num_features = trainX.shape[1]
num_labels = trainY.shape[1]
num_samples = trainX.shape[0]

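We then set the training parameters. Note that training_epochs counts iterations of the training loop (steps) rather than passes over the data, and epoch_size is the number of samples drawn per step (the mini-batch size). We'll use an exponentially decaying learning rate.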

In [8]:
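# Training parameters
training_epochs = 10000  # number of loop iterations (steps)
epoch_size = 100         # samples drawn per training step (mini-batch size)
display_step = 500       # evaluate and print every display_step steps
global_step = tf.Variable(0, trainable=False)  # incremented by each training step
learning_rate = tf.train.exponential_decay(learning_rate=0.01,
                                           global_step=global_step,
                                           decay_steps=num_samples,
                                           decay_rate=0.95,
                                           staircase=False)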
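With staircase=False, the decayed rate follows the formula learning_rate * decay_rate ^ (global_step / decay_steps). The sketch below reproduces that formula in plain Python so the schedule can be inspected without building a graph. The function name `decayed_learning_rate` is ours, and the values are copied from the cell above, assuming num_samples is 10000 as the shapes suggest.

```python
def decayed_learning_rate(base_rate, global_step, decay_steps, decay_rate):
    """Continuous (staircase=False) exponential decay of the learning rate."""
    return base_rate * decay_rate ** (global_step / float(decay_steps))


# Inspect the schedule at the start, the middle, and after 10000 training steps.
for step in (0, 5000, 10000):
    print(step, decayed_learning_rate(0.01, step, 10000, 0.95))
```

Because decay_steps equals the number of samples, the rate only shrinks by a factor of 0.95 per 10000 training steps, so it falls from 0.01 to roughly 0.0095 over the whole run.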

Finally, we set a directory that specifies where our summaries will be written.

In [9]:
summaries_dir = "/Users/f002bpv/tmp/summary_logs/" + datetime.now().strftime('%Y%m%d_%H_%M_%S')

We're now ready to start coding the neural network. We'll start by opening an interactive session. 

In [10]:
# Open interactive Session
sess = tf.InteractiveSession()

First, we define the placeholder variables that will eventually hold values taken from trainX and trainY.

In [11]:
x_ = tf.placeholder(tf.float32, [None, num_features])
y_ = tf.placeholder(tf.float32, [None, num_labels])

This helper attaches summary ops (mean, standard deviation, maximum, minimum, and a histogram) to a tensor. We will call it on the weight and bias variables so that we can later visualize how they change during training.

In [12]:
# Add summary ops to collect data
def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    with tf.name_scope('summaries'):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)
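For reference, the scalar statistics recorded by this helper can be computed by hand as in the following sketch. It is plain Python rather than TensorFlow, and `summary_statistics` is a name we introduce here. Note that the standard deviation divides by the number of elements, matching the reduce_mean in the code above.

```python
import math


def summary_statistics(values):
    """Return the scalar statistics that variable_summaries records."""
    mean = sum(values) / float(len(values))
    # Population standard deviation: square root of the mean squared deviation.
    stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {'mean': mean, 'stddev': stddev,
            'max': max(values), 'min': min(values)}


stats = summary_statistics([1.0, 2.0, 3.0, 4.0])
print(stats['mean'], stats['max'], stats['min'])  # 2.5 4.0 1.0
```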

This function defines a layer of a neural network. In this case, we will build a degenerate NN with only one layer, which has only one neuron. This function also generates a number of summaries.

In [13]:
# Define a layer
def nn_layer(input_tensor, input_dim, output_dim, layer_name, act=tf.nn.relu):
    """Reusable code for making a simple neural net layer.

    It does a matrix multiply, bias add, and then applies the activation
    function `act` (ReLU by default) to nonlinearize.
    It also sets up name scoping so that the resultant graph is easy to read,
    and adds a number of summary ops.
    """
    # Adding a name scope ensures logical grouping of the layers in the graph.
    with tf.name_scope(layer_name):
        # This Variable will hold the state of the weights for the layer
        with tf.name_scope('weights'):
            weights = tf.Variable(
                tf.random_normal([input_dim, output_dim], 
                                 mean=0.0, stddev=1.0, dtype=tf.float32))
            variable_summaries(weights)
        with tf.name_scope('biases'):
            biases = tf.Variable(
                tf.random_normal([output_dim], mean=0.0, stddev=1.0, 
                                 dtype=tf.float32))
            variable_summaries(biases)
        with tf.name_scope('Wx_plus_b'):
            mul_obj = tf.matmul(input_tensor, weights)
            preactivate = tf.add(mul_obj, biases)
            tf.summary.histogram('pre_activations', preactivate)
        activations = act(preactivate, name='activation')
        tf.summary.histogram('activations', activations)
        return activations

Here we create a single neuron, using a sigmoidal activation function, which we store as F_OP.

In [14]:
# Define the (single-neuron) layer
F_OP = nn_layer(x_, num_features, num_labels, 'layer1', act=tf.nn.sigmoid)
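Stripped of the graph machinery, the neuron computes the sigmoid of a weighted sum of its inputs plus a bias. The following plain-Python sketch shows that forward pass for a single sample; the names `sigmoid` and `neuron_output` and the example values are ours, not part of the notebook.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x, weights, bias):
    """Forward pass of one sigmoid neuron: sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


# With all-zero weights and bias the pre-activation is 0, so the output is 0.5.
print(neuron_output([0.5, -1.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0], 0.0))  # 0.5
```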

Given F_OP, we need a means of evaluating loss. We'll use $\ell_2$ loss. If $\hat{y}$ stores the outputs of the activation function for each data point in the training set, and $y$ stores their expected classes, this error is given by:

$$\frac{1}{2}\sum\limits_{i=1}^{10000} (\hat{y}_i-y_i)^2$$

In [15]:
# Define loss
# tf.subtract replaces tf.sub, which was removed in TensorFlow 1.0
with tf.name_scope('l2_loss'):
    l2_loss = tf.nn.l2_loss(tf.subtract(F_OP, y_))
tf.summary.scalar('l2_loss', l2_loss)

<tf.Tensor 'l2_loss_1:0' shape=() dtype=string>
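As a quick check of the formula, here is the same loss written out in plain Python. The function name mirrors the TensorFlow op, but this is only a sketch for hand-checking small cases, and the example values are made up.

```python
def l2_loss(predictions, targets):
    """Half the sum of squared differences between predictions and targets."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(predictions, targets))


# One sample is off by 0.5 and the other is exact: 0.5 * 0.5**2 = 0.125.
print(l2_loss([0.5, 1.0], [0.0, 1.0]))  # 0.125
```

During training the loss is evaluated only on the samples fed to the graph at that step, so the sum runs over the mini-batch rather than over all 10000 samples.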

Now that we've defined the activation and the loss, we need to define the learning procedure that will adjust the weights and bias so as to minimize the loss. We'll use gradient descent to accomplish this.

In [16]:
# Define Optimization Procedure
with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(
      l2_loss, global_step = global_step)
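TensorFlow derives the gradients automatically, but for a single sigmoid neuron under the $\ell_2$ loss they are easy to write by hand: the derivative of the loss with respect to weight $w_j$ is $\sum_i (\hat{y}_i - y_i)\,\hat{y}_i(1-\hat{y}_i)\,x_{ij}$. The sketch below performs one such update in plain Python on a made-up two-sample data set. All names and values are illustrative assumptions, not the notebook's code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(xs, weights, bias):
    """Neuron outputs for a list of samples."""
    return [sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            for x in xs]


def l2_loss(predictions, targets):
    return 0.5 * sum((p - t) ** 2 for p, t in zip(predictions, targets))


def gradient_step(xs, ys, weights, bias, rate):
    """One gradient-descent update of a sigmoid neuron under the l2 loss."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y, p in zip(xs, ys, predict(xs, weights, bias)):
        delta = (p - y) * p * (1.0 - p)  # derivative of the loss w.r.t. w.x + b
        grad_w = [g + delta * xi for g, xi in zip(grad_w, x)]
        grad_b += delta
    new_weights = [w - rate * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - rate * grad_b


# Toy data: two samples with two features each.
xs, ys = [[0.0, 0.0], [1.0, 1.0]], [0.0, 1.0]
weights, bias = [0.0, 0.0], 0.0
loss_before = l2_loss(predict(xs, weights, bias), ys)
weights, bias = gradient_step(xs, ys, weights, bias, rate=0.1)
loss_after = l2_loss(predict(xs, weights, bias), ys)
print(loss_before, loss_after)  # the loss decreases after the step
```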


We also need a way to evaluate the trained neuron F_OP. To do so, we define a vector correct_prediction, which checks whether the rounded output of F_OP agrees with the label of each sample it is fed, and accuracy, which is the fraction of samples on which they agree.

In [17]:
# Define 'Accuracy'
with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
        correct_prediction = tf.equal(tf.round(F_OP), y_)
    with tf.name_scope('accuracy'):
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
tf.summary.scalar('accuracy', accuracy)


<tf.Tensor 'accuracy_1:0' shape=() dtype=string>
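Rounding the sigmoid output amounts to thresholding it at 0.5, so an output of 0.6 is read as class 1 and an output of 0.4 as class 0. A plain-Python sketch of the same metric follows; the function name `accuracy` and the example outputs are ours.

```python
def accuracy(outputs, labels):
    """Fraction of samples whose rounded output equals the label."""
    correct = [round(o) == y for o, y in zip(outputs, labels)]
    return sum(correct) / float(len(correct))


# The third sample (0.6 rounds to 1, label 0) is the only mistake.
print(accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0]))  # 0.75
```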

In order to record our results for later evaluation, we combine all of the summaries and produce writer objects.

In [18]:
# Merge all the summaries and write them out under summaries_dir
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(summaries_dir + '/train',
                                      sess.graph)
test_writer = tf.summary.FileWriter(summaries_dir + '/test')



Now that we're ready to go, we initialize all of the variables.

In [19]:
# Initialize Variables
tf.global_variables_initializer().run()

Finally, we run the training loop. Every display_step steps we evaluate the neuron on the full training set and print its accuracy; on every other step we take one gradient step on a randomly drawn mini-batch.

In [20]:
# Fit the Training Data
for epoch in range(training_epochs):
    if epoch % display_step == 0:  # Record summaries and accuracy on the full training set
        # rate is fetched here but never printed; the line below shows the accuracy only
        summary, acc, rate = sess.run([merged, accuracy, learning_rate], feed_dict={x_:trainX, y_:trainY})
        test_writer.add_summary(summary, epoch)
        print('Accuracy and Training Rate at step %s: %s' % (epoch, acc))
    else:  # Train on a random mini-batch (drawn with replacement) and record its summaries
        i = np.random.randint(0, high=num_samples, size=epoch_size)
        summary, _ = sess.run([merged, train_step], feed_dict={x_:trainX[i,:], y_:trainY[i,:]})
        train_writer.add_summary(summary, epoch)

# Evaluate Performance
# The value printed is the accuracy on the full training set; no held-out test set is used
print("final cost on test set: %s" %str(sess.run(accuracy, feed_dict={x_:trainX, y_:trainY})))

Accuracy and Training Rate at step 0: 0.343
Accuracy and Training Rate at step 500: 0.9034
Accuracy and Training Rate at step 1000: 0.953
Accuracy and Training Rate at step 1500: 0.9677
Accuracy and Training Rate at step 2000: 0.975
Accuracy and Training Rate at step 2500: 0.9755
Accuracy and Training Rate at step 3000: 0.982
Accuracy and Training Rate at step 3500: 0.9813
Accuracy and Training Rate at step 4000: 0.9867
Accuracy and Training Rate at step 4500: 0.9891
Accuracy and Training Rate at step 5000: 0.9886
Accuracy and Training Rate at step 5500: 0.9895
Accuracy and Training Rate at step 6000: 0.9924
Accuracy and Training Rate at step 6500: 0.9881
Accuracy and Training Rate at step 7000: 0.9882
Accuracy and Training Rate at step 7500: 0.9907
Accuracy and Training Rate at step 8000: 0.9896
Accuracy and Training Rate at step 8500: 0.9893
Accuracy and Training Rate at step 9000: 0.9927
Accuracy and Training Rate at step 9500: 0.9907
final cost on test set: 0.993
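One detail of the loop is worth spelling out: steps that evaluate do not run train_step, so they neither update the weights nor advance global_step. The following sketch counts the two kinds of step using the values of training_epochs and display_step set earlier (repeated here so the snippet runs on its own).

```python
training_epochs = 10000
display_step = 500

# Steps on which the loop evaluates instead of training.
eval_steps = [s for s in range(training_epochs) if s % display_step == 0]
num_train_steps = training_epochs - len(eval_steps)
print(len(eval_steps), num_train_steps)  # 20 9980
```

So the neuron receives 9980 gradient updates, each on 100 samples, and the learning-rate schedule is evaluated at a global_step of at most 9980.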


Having finished our computation, we close the session and exit. 

In [21]:
sess.close()