## Getting started with TensorFlow!

This tutorial is adapted from "Introduction to TensorFlow" in the Udacity Deep Learning Nanodegree. We start by importing the usual suspects.

In [1]:
import tensorflow as tf
import pandas as pd
import numpy as np

In TensorFlow, data isn't stored as plain integers, floats, or strings. Instead, these values are wrapped in objects called tensors (hence the module name!). In the example below, the number 1 is an int32 tensor with 0 dimensions, while the list ```[1, 2, 3, 4]``` is an int32 tensor with 1 dimension.

In [12]:
dim_zero_constant = tf.constant(1)
dim_one_constant = tf.constant([1,2,3,4])

with tf.Session() as sess:
    dim_zero_output = sess.run(dim_zero_constant)
    dim_one_output = sess.run(dim_one_constant)
    print(dim_zero_output)
    print(dim_one_output)

1
[1 2 3 4]
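
To make "number of dimensions" concrete without starting a session, here is a small plain-Python sketch. The helper `rank` is hypothetical (it is not a TensorFlow function); it simply counts nesting levels, which mirrors the idea that a bare number has 0 dimensions and a flat list has 1.

```python
def rank(value):
    """Count how many levels of list nesting wrap the innermost element.

    Hypothetical helper for illustration only, not part of TensorFlow.
    """
    depth = 0
    while isinstance(value, (list, tuple)):
        depth += 1
        if not value:  # empty list: nothing deeper to inspect
            break
        value = value[0]
    return depth


scalar = 1
vector = [1, 2, 3, 4]
matrix = [[1, 2], [3, 4]]

print(rank(scalar))  # 0
print(rank(vector))  # 1
print(rank(matrix))  # 2
```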


The TensorFlow API is built around the idea of a computational graph, which is just a way of describing a mathematical process as a set of connected operations. Our job is to build the graph, which then gets run inside a "TensorFlow Session": we create the session with ```tf.Session()``` and evaluate parts of the graph with ```sess.run()```. Essentially, the session is in charge of allocating the operations to the CPU or GPU, depending on your setup.

<img src="image/computational_graph.png" style="height: 75%;width: 75%; position: relative; right: 5%">
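
The build-then-run split can be imitated in a few lines of plain Python. This is only a conceptual sketch: the names `Node`, `constant`, `add`, and `run` are made up for illustration and are not how TensorFlow is implemented.

```python
class Node:
    """One vertex of a toy computational graph (illustrative only)."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const" or "add"
        self.inputs = inputs  # upstream nodes this node depends on
        self.value = value    # only used by constant nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", inputs=(a, b))


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        left, right = (run(n) for n in node.inputs)
        return left + right
    raise ValueError("unknown op: {}".format(node.op))


# Building the graph performs no arithmetic yet.
total = add(constant(1), add(constant(2), constant(3)))

# Only run() walks the graph and produces a number.
result = run(total)
print(result)  # 6
```

The point of the sketch is that `total` is a description of work, not a value; nothing is computed until `run()` is called, much like `sess.run()`.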

Okay, that's great, but quite often you'll be working with tensors that aren't constant. For this you use ```tf.placeholder()``` in combination with the ```feed_dict``` argument of ```sess.run()```. When I first started with TensorFlow - which admittedly was **not** very long ago, so trust me when I say that we're probably in the same boat - I found them quite confusing. Try not to overthink this: a placeholder is nothing more than a variable that we will assign data to later on in the code. It allows us to build up our operations and create the computational graph without needing the data immediately. When we run the session, we pass in a dictionary that maps each placeholder to its data, or in TensorFlow terms, we *feed a data dictionary to the placeholder*.

And that's basically it. So in the example below, we tell TensorFlow that ```x``` will at some point in the future be of type ```tf.string```. Then when we run the session, we feed the placeholder its data - the string "Hello World!" - through ```feed_dict```, and that's that! This data can then be used to perform whatever operation is necessary.

In [3]:
x = tf.placeholder(tf.string)

with tf.Session() as sess:
    output = sess.run(x, feed_dict = {x: "Hello World!"})
    print(output)

Hello World!
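
If the mechanics still feel abstract, the following plain-Python sketch mimics them under simplifying assumptions. `Placeholder`, `Add`, and `run` are hypothetical stand-ins, not TensorFlow classes; the only idea carried over is that the placeholder object itself is the dictionary key, and its value is looked up at run time.

```python
class Placeholder:
    """A named slot in the graph that has no value until run time."""

    def __init__(self, name):
        self.name = name


class Add:
    """An operation node that adds its two inputs when evaluated."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed_dict=None):
    """Evaluate a node, looking up placeholder values in feed_dict."""
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise KeyError("no value fed for placeholder {!r}".format(node.name))
        return feed_dict[node]
    if isinstance(node, Add):
        return run(node.left, feed_dict) + run(node.right, feed_dict)
    return node  # plain Python values act as constants


x = Placeholder("x")
y = Placeholder("y")
total = Add(x, Add(y, 10))  # graph is built without any data

# The same graph can be run many times with different data.
first = run(total, feed_dict={x: 1, y: 2})
second = run(total, feed_dict={x: 5, y: 5})
print(first, second)  # 13 20
```

Forgetting to feed a placeholder raises an error in this sketch, which is also roughly what happens in TensorFlow when a required placeholder is missing from ```feed_dict```.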


## Moving beyond the basics

Now let's get to the heart of TensorFlow - building neural networks. We've already seen ```tf.constant()``` and ```tf.placeholder()``` objects, but these can't be directly modified, and thus aren't appropriate if you want to update weights and biases like you would when training neural networks. This is where ```tf.Variable()``` comes into play. Now, if you're like me and aren't super familiar or comfortable with building your scripts up from function definitions, don't fret. Take some time to read through the docstrings below and it'll all make sense shortly.

In [4]:
def get_weights(n_features, n_labels):
    
    """
    Return TensorFlow weights
    :param n_features: Number of features
    :param n_labels: Number of labels
    :return: TensorFlow weights
    
    Notes:
    tf.truncated_normal() returns a tensor with random values from a truncated normal distribution
    """
    return tf.Variable(tf.truncated_normal((n_features, n_labels)))
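
As a rough illustration of what "truncated" means here: TensorFlow's documentation describes ```tf.truncated_normal()``` as re-drawing any sample that falls more than two standard deviations from the mean. The standard-library sketch below mimics that behaviour; `truncated_normal_sample` and `truncated_normal_matrix` are hypothetical helpers, not TensorFlow functions, and the seed is only there to make the example repeatable.

```python
import random


def truncated_normal_sample(mean, stddev, rng):
    """Draw from a normal distribution, re-drawing values beyond 2 stddevs."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            return value


def truncated_normal_matrix(n_rows, n_cols, mean=0.0, stddev=1.0, seed=None):
    """Build an n_rows x n_cols nested list of truncated normal samples."""
    rng = random.Random(seed)
    return [
        [truncated_normal_sample(mean, stddev, rng) for _ in range(n_cols)]
        for _ in range(n_rows)
    ]


# Same shape as the weights in this tutorial: (n_features, n_labels)
weights = truncated_normal_matrix(784, 3, seed=0)

print(len(weights), len(weights[0]))
print(all(abs(w) <= 2.0 for row in weights for w in row))
```

Keeping the initial weights away from extreme values is a common reason to prefer this over an untruncated normal distribution.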

In [5]:
def get_biases(n_labels):
    """
    Return TensorFlow bias
    :param n_labels: Number of labels
    :return: TensorFlow bias
    
    Notes:
    tf.Variable() creates a tensor with an initial value that can be modified, much like a Python variable
    tf.zeros() returns a tensor with all zeros
    We can set the bias to zero because the randomized weights already prevent the model from getting stuck
    """
    
    return tf.Variable(tf.zeros(n_labels))
    

In [6]:
def linear(input, w, b):
    """
    Return linear function in TensorFlow
    :param input: TensorFlow input, i.e. the x variable
    :param w: TensorFlow weights
    :param b: TensorFlow bias
    :return: TensorFlow linear combination equal to xw + b
    
    Notes:
    tf.matmul() computes the matrix product of two matrices - don't forget order matters!
    """
    
    return tf.add(tf.matmul(input, w), b)
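
To see why the order in ```tf.matmul()``` matters, it helps to work through xW + b by hand on tiny matrices. The sketch below uses nested Python lists instead of tensors; the `matmul` and `linear` functions here are simplified stand-ins written for this example, and the numbers are arbitrary.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def linear(x, w, bias):
    """Compute xW + b, adding the bias vector to every row of xW."""
    return [
        [value + bias[j] for j, value in enumerate(row)]
        for row in matmul(x, w)
    ]


# 2 samples with 3 features each: shape (2, 3)
x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

# 3 features mapped to 2 labels: shape (3, 2)
w = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# One bias per label: shape (2,)
b = [0.5, -0.5]

logits = linear(x, w, b)
print(logits)  # [[4.5, 4.5], [10.5, 10.5]]
```

The result has one row per sample and one column per label. Swapping the arguments to get Wx would produce a (3, 3) matrix instead, which no longer lines up with the bias.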

In [7]:
from tensorflow.examples.tutorials.mnist import input_data

In [8]:
def mnist_features_labels(n_labels):
    """
    Gets the first n labels from the MNIST dataset
    :param n_labels: Number of labels to use
    :return: Tuple of feature list and label list
    
    Notes:
    read_data_sets is deprecated and will be removed in the future!
    """
    
    mnist_features = []
    mnist_labels = []
    
    mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot = True)
    
    # In order to make this run faster, only look at first 10,000 images
    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        
        # Only add features and labels if it's for the first n labels - note n can be 1 to 10
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])
            
    return mnist_features, mnist_labels
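
The filtering step relies on the labels being one-hot encoded: slicing a label to its first n entries leaves an all-zero vector whenever the digit is n or higher, so ```.any()``` is false and the sample is skipped. Here is a minimal sketch of the same idea with plain lists; `keep_first_n_labels` and the placeholder feature strings are invented for illustration.

```python
def keep_first_n_labels(features, one_hot_labels, n_labels):
    """Keep only samples whose class is among the first n_labels classes."""
    kept_features, kept_labels = [], []
    for feature, label in zip(features, one_hot_labels):
        head = label[:n_labels]
        if any(head):  # a 1 survives the slice only if the class index < n_labels
            kept_features.append(feature)
            kept_labels.append(head)
    return kept_features, kept_labels


# Stand-ins for images; real MNIST features are vectors of 784 pixel values.
features = ["img_of_0", "img_of_2", "img_of_7", "img_of_1"]
labels = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # digit 0
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],  # digit 2
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],  # digit 7
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # digit 1
]

kept_features, kept_labels = keep_first_n_labels(features, labels, n_labels=3)
print(kept_features)  # ['img_of_0', 'img_of_2', 'img_of_1']
print(kept_labels)    # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```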

In [9]:
# Number of features is 28 x 28 image = 784 features, i.e. one feature for each pixel
n_features = 784

# Number of labels
n_labels = 3

# Features and labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)


# Weights and biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)

# Linear combination xW + b
logits = linear(features, w, b)

# Training data
train_features, train_labels = mnist_features_labels(n_labels)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /datasets/ud730/mnist\train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /datasets/ud730/mnist\train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting /datasets/ud730/mnist\t10k-images-idx3-ubyte.gz
Extracting /datasets/ud730/mnist\t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


In [10]:
# Build the rest of the graph once, before the session starts.
# Creating these ops inside the training loop would add new nodes to the graph on every epoch.

# Softmax
prediction = tf.nn.softmax(logits)

# Cross entropy to quantify how far off the predictions are
cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), axis = 1)

# Training loss
loss = tf.reduce_mean(cross_entropy)

# Learning rate
learning_rate = 0.8

# Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

# Run the computational graph
with tf.Session() as sess:
    
    # Initialize weights and biases
    sess.run(tf.global_variables_initializer())
    
    for epoch in range(100):
        # One gradient descent step on the full training set; also fetch the loss
        _, l = sess.run(
            [optimizer, loss],
            feed_dict = {features: train_features, labels: train_labels})

        print("Loss = {}".format(l))

Loss = 10.412758827209473
Loss = 3.8213024139404297
Loss = 2.5183959007263184
Loss = 1.9152885675430298
Loss = 1.7444946765899658
Loss = 1.0473048686981201
Loss = 0.855830729007721
Loss = 0.6647284030914307
Loss = 0.5960330367088318
Loss = 0.5474634766578674
Loss = 0.5090176463127136
Loss = 0.47640350461006165
Loss = 0.4482499957084656
Loss = 0.4237080216407776
Loss = 0.4021487236022949
Loss = 0.3830606937408447
Loss = 0.36602139472961426
Loss = 0.3506879210472107
Loss = 0.3367922902107239
Loss = 0.32413098216056824
Loss = 0.3125483989715576
Loss = 0.301920622587204
Loss = 0.2921425700187683
Loss = 0.28312066197395325
Loss = 0.27477189898490906
Loss = 0.2670232653617859
Loss = 0.25981199741363525
Loss = 0.25308412313461304
Loss = 0.24679362773895264
Loss = 0.240900456905365
Loss = 0.23536986112594604
Loss = 0.23017048835754395
Loss = 0.22527460753917694
Loss = 0.22065706551074982
Loss = 0.21629507839679718
Loss = 0.2121679186820984
Loss = 0.20825672149658203
Loss = 0.204544335603714
...
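
To get a feel for why the loss keeps shrinking, here is the same softmax and cross-entropy arithmetic for a single sample, written with only the standard library. It is a simplified sketch: the logits are made-up numbers, and the gradient step is applied directly to the logits rather than to weights and biases as the optimizer does above.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(z - highest) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, probabilities):
    """-sum(label * log(p)); only the true class contributes to the sum."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probabilities))


logits = [2.0, 1.0, 0.1]  # arbitrary example scores for 3 labels
label = [1, 0, 0]         # the true class is the first one
learning_rate = 0.8

probs = softmax(logits)
loss_before = cross_entropy(label, probs)

# For softmax followed by cross entropy, the gradient of the loss
# with respect to the logits is simply (probabilities - label).
gradient = [p - y for p, y in zip(probs, label)]
updated_logits = [z - learning_rate * g for z, g in zip(logits, gradient)]

loss_after = cross_entropy(label, softmax(updated_logits))

print("Loss before: {:.3f}".format(loss_before))
print("Loss after:  {:.3f}".format(loss_after))
```

Each step nudges the score of the correct label up and the others down, so the predicted probability of the true class rises and the loss falls.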

In [30]:
import math

def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: Batch size
    :param features: List of features
    :param labels: List of labels
    :return: List of batches, each a [features, labels] pair
    """
    
    assert len(features) == len(labels)
    
    output_batches = []
    
    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)
    return output_batches
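
A quick way to check the batching logic is to run it on a handful of dummy samples. The sketch below repeats a compact version of the function so that it runs on its own; the example feature and label values are placeholders invented for the demonstration.

```python
def batches(batch_size, features, labels):
    """Split features and labels into consecutive [features, labels] batches."""
    assert len(features) == len(labels)
    return [
        [features[start:start + batch_size], labels[start:start + batch_size]]
        for start in range(0, len(features), batch_size)
    ]


# 4 samples, 2 features each, with one label per sample
example_features = [["F11", "F12"], ["F21", "F22"], ["F31", "F32"], ["F41", "F42"]]
example_labels = [["L1"], ["L2"], ["L3"], ["L4"]]

example_batches = batches(3, example_features, example_labels)

for batch_features, batch_labels in example_batches:
    print(len(batch_features), batch_labels)
```

With 4 samples and a batch size of 3, the last batch holds the single leftover sample, so batches are not guaranteed to be the same size.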

In [31]:
def print_epoch_stats(epoch_i, sess, last_features, last_labels):
    """
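    Print cost and validation accuracy of an epoch
    :param epoch_i: Epoch number
    :param sess: Active TensorFlow session
    :param last_features: Features to evaluate the cost on
    :param last_labels: Labels to evaluate the cost on
    
    Notes:
    Relies on cost, accuracy, features, labels, valid_features and valid_labels being defined outside this function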
    """
    current_cost = sess.run(
        cost,
        feed_dict = {features: last_features, labels: last_labels})
    
    valid_accuracy = sess.run(
        accuracy,
        feed_dict = {features:  valid_features, labels: valid_labels})
    
    print("Epoch: {:<4} - Cost: {:<8.3} Accuracy: {:5.3}".format(
        epoch_i,
        current_cost,
        valid_accuracy))