<h1>Using TensorFlow With GPU</h1>
<p>This assumes you have an NVIDIA GPU with CUDA Compute Capability 3.0 or above.</p>
<p>Build TensorFlow from <a href="https://www.tensorflow.org/versions/master/get_started/os_setup.html#source">source</a> and configure it using the following command:</p>

In [None]:
%%bash
# cd to tensorflow root, do the following... The unofficial setting lets you use 3.0 GPUs instead of minimum 3.5
# TF_UNOFFICIAL_SETTING=1 ./configure

<p>Note that the command above has interactive prompts you need to fill out, so you can't run it from within this notebook. Once it is configured, build the pip package and install it like this:</p>

In [None]:
%%bash
# Make sure you're using python 2.7
python --version
# install TensorFlow with GPU capability (from the pip package you built from source)
# see: https://www.tensorflow.org/versions/master/get_started/os_setup.html#create-pip

# ====================== UNCOMMENT THIS LINE ===================== #
# pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-py2-none-any.whl
#
# Note: the name of the .whl may change in the future...
#
# Or if you're doing it from another machine...
# ====================== UNCOMMENT THIS LINE ===================== #
# pip install tensorflow
#
# Or, if that doesn't work
# ====================== UNCOMMENT THIS LINE ===================== #
# conda install -c https://conda.anaconda.org/jjhelmus tensorflow

<p>Now your code can run on the GPU from your IPython notebook. Sick, huh? Next, import the libraries and define the device names used below:</p>

In [4]:
import numpy as np
import tensorflow as tf
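CPU = "/cpu:0"
# On a machine with a working GPU build, uncomment the next line and delete the fallback below it.
#GPU = "/gpu:0"
GPU = "/cpu:0"  # fallback: "GPU" points at the CPU until the line above is enabled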

<h2>Computational Graphs with TensorFlow</h2>
<p>TensorFlow uses graphs to define computations. You create constants and operations, then use a <code>Session()</code> object to run them; TensorFlow automatically handles the overhead of allocating resources and calling external libraries for you. When the session is closed, its resources are freed.</p>

In [None]:
# some constants and an operation (variables...)
matrix1 = tf.constant([[3.,3.]]) # 1 x 2 matrix
matrix2 = tf.constant([[2.],[2.]]) # 2 x 1 matrix
product = tf.matmul(matrix1, matrix2) # (1 x 2) * (2 x 1)

# create/run the session
sess = tf.Session()
result = sess.run(product)

print(result)

#close session
sess.close()
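
As a sanity check on the value the session should print, the same (1 x 2) * (2 x 1) product can be worked out in plain Python without TensorFlow. This is only a rough sketch; the `matmul` helper here is made up for illustration and is not the TensorFlow op.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

matrix1 = [[3.0, 3.0]]      # 1 x 2 matrix
matrix2 = [[2.0], [2.0]]    # 2 x 1 matrix
product = matmul(matrix1, matrix2)
print(product)              # [[12.0]]
```

The result is a 1 x 1 matrix, which is the shape to expect from `tf.matmul` on these inputs as well.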

<p>The same thing using a <code>with</code> block, which closes the session for you:</p>

In [None]:
with tf.Session() as sess:
    result = sess.run(product)
    print(result)

<p>You can also pin the computation to a specific device (GPU or CPU):</p>

In [None]:
def run_on_dev(dev="/gpu:0"):
    with tf.Session() as sess:
        with tf.device(dev):
            A = tf.constant([[3.,3.]])
            B = tf.constant([[2.],[2.]])
            product = tf.matmul(A, B)
            result = sess.run(product)
            print(result)

dev1 = CPU
dev2 = GPU
run_on_dev(dev1)
run_on_dev(dev2)

## Interactive Session

In [None]:
# create interactive session
sess = tf.InteractiveSession()
x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# initialize x
x.initializer.run()

# add an op to subtract a from x
sub = tf.sub(x, a)
print(sub.eval())

sess.close()

### Data Types

In [None]:
# floats
# print(tf.float32)
# print(tf.float64)

# ints
# print(tf.int64)
# print(tf.int32)
# print(tf.int16)
# print(tf.int8)
# print(tf.uint8)

# other
# print(tf.string)
# print(tf.bool)
# print(tf.complex64)

# quantized
# print(tf.qint32)
# print(tf.qint8)
# print(tf.quint8)

### Device Allocation & Logging

In [None]:
def dev_log(dev):
    with tf.device(dev):
        a = tf.constant([1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0], shape=[9,1], name='a')
        b = tf.constant([1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0], shape=[1,9], name='b')
        c = tf.matmul(a, b)
    # run
    with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess: 
        print(sess.run(c))

In [None]:
# run on CPU
dev_log(CPU)

# run on GPU
dev_log(GPU)

In [None]:
def big_tensor_multiply(dev,dim=1000):
    with tf.device(dev):
        a = tf.constant(np.random.rand(dim,dim), shape=[dim,dim], name='a')
        b = tf.constant(np.random.rand(dim,dim), shape=[dim,dim], name='b')
        c = tf.matmul(a, b)
        d = tf.matrix_inverse(c)
    # allow_soft_placement lets TensorFlow fall back to another device when an op can't run on the requested one
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
        print(sess.run(d))

In [None]:
big_tensor_multiply(GPU)

In [None]:
def use_multiple_devices(devices):
    for d in devices:
        print("Using device: " + d)
        big_tensor_multiply(d)            

In [None]:
use_multiple_devices([GPU,CPU])

## Using Variables

In [None]:
def counter_step(step):
    # deliberately overblown counter held in a TensorFlow variable
    state = tf.Variable(0, name="counter")
    
    # increment amount (named `one`, but it holds `step`)
    one = tf.constant(step)
    
    # new value = state + step
    new_val = tf.add(state, one)
    
    # op that assigns the new value back to state
    update = tf.assign(state, new_val)

    # op that initializes all variables
    init_operation = tf.initialize_all_variables()

    # run graph
    with tf.Session() as sess:
        sess.run(init_operation)
        print(sess.run(state))
        for _ in range(3):
            sess.run(update)
            print(sess.run(state))

In [None]:
counter_step(4)

## Simple Regression Example

In [None]:
x_data = np.random.rand(100).astype("float32")
y_data = x_data * 0.1 + 0.3

# Weights
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

# Bias
b = tf.Variable(tf.zeros([1]))

# Layer function
y = W * x_data + b

# MSE
loss = tf.reduce_mean(tf.square(y - y_data))

# SGD
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# initialize variables...
init = tf.initialize_all_variables()

with tf.Session() as sess:
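    # Note: tf.device only affects ops created inside the block. The graph above
    # was built outside it, so those ops keep their default placement.
    with tf.device(GPU):
        sess.run(init)
        for step in range(201):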
            sess.run(train)
            if step % 20 == 0:
                print(step, sess.run(W), sess.run(b))
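
To see roughly what the optimizer is doing on each `sess.run(train)`, here is a plain-Python sketch of gradient descent on the same mean squared error. It is not how TensorFlow implements it. The evenly spaced `x_data` and the fixed starting point for `W` are assumptions made so the run is deterministic; the notebook draws both at random.

```python
# Evenly spaced inputs stand in for np.random.rand(100) (assumption, for repeatability).
x_data = [i / 100.0 for i in range(100)]
y_data = [0.1 * x + 0.3 for x in x_data]

W, b = -1.0, 0.0        # arbitrary starting point (assumption)
learning_rate = 0.5     # same rate as the notebook
n = float(len(x_data))

for step in range(201):
    # residuals of the current fit
    errors = [W * x + b - y for x, y in zip(x_data, y_data)]
    # gradients of mean((W*x + b - y)^2) with respect to W and b
    grad_W = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / n
    grad_b = 2.0 * sum(errors) / n
    # one gradient descent update
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(round(W, 3), round(b, 3))
```

After 201 steps the fit should land very close to the true values W = 0.1 and b = 0.3 used to generate `y_data`.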

<h2>MNIST example: Handwritten Digit Recognition</h2>
<p>First, download and load the MNIST dataset. This code will be reused later as well.</p>

In [None]:
import input_data # comes from the file provided in the tutorial...
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# mnist.train -- training data
# mnist.test -- testing data
# mnist.train.images -- training images
# mnist.train.labels -- training labels

In [None]:
print("Here's what the training data looks like:\n")
print(mnist.train.images)
print("\nNum images: " + str(len(mnist.train.images)))
print("Num labels: " + str(len(mnist.train.labels)))

<h2>Model Parameters</h2>
<p>In this example we will use only a single-layer model.</p>

In [None]:
# create an input vector for flattened images...
x = tf.placeholder(tf.float32, [None, 784])

# weight matrix 784 x 10 
W = tf.Variable(tf.zeros([784,10]))

# Biases
b = tf.Variable(tf.zeros([10]))

### Softmax Regression

In [None]:
# y = output = softmax(x W + b)
y = tf.nn.softmax(tf.matmul(x, W) + b)
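
For intuition about what the softmax does to a row of scores, here is a small plain-Python sketch. The `softmax` helper and the example scores are made up for illustration; this is not TensorFlow's implementation.

```python
import math

def softmax(logits):
    """Softmax of a list of scores; subtracting the max avoids overflow in exp."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])   # [0.659, 0.242, 0.099]
```

The outputs are positive and sum to 1, so each row of `y` can be read as a probability distribution over the 10 digit classes.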

### Cross-Entropy Output
General form of the cross-entropy loss function, where $y$ is the predicted distribution and $y^{\prime}$ is the true (one-hot) distribution:
\begin{align}
H_{y^{\prime}}\left(y\right) &= -\sum_i y^{\prime}_i \log\left(y_i\right)
\end{align}

In [None]:
# OP: Truth value
t = tf.placeholder(tf.float32, [None, 10])

# OP: Loss function
cross_entropy = -tf.reduce_sum(t * tf.log(y))
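
A quick worked example of the loss for a single image, in plain Python. The `cross_entropy` helper and the two prediction vectors are made up for illustration. Skipping terms where the truth value is zero is a choice made in this sketch (it sidesteps `log(0)`), not something the TensorFlow expression above does.

```python
import math

def cross_entropy(truth, predicted):
    """-sum_i truth_i * log(predicted_i) for one example."""
    return -sum(t * math.log(p) for t, p in zip(truth, predicted) if t > 0)

truth = [0.0, 1.0, 0.0]           # one-hot: the correct class is index 1
confident = [0.05, 0.90, 0.05]    # puts most of the mass on the right class
unsure = [0.40, 0.30, 0.30]       # spreads the mass around

print(round(cross_entropy(truth, confident), 4))   # 0.1054
print(round(cross_entropy(truth, unsure), 4))      # 1.204
```

The loss shrinks as the predicted probability of the true class approaches 1. The notebook's version also sums over every example in the batch.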

### Backpropagation Training

In [None]:
# OP: GD optimization
learning_rate = 0.01
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

### Initialization

In [None]:
# OP: initialize all variables
init = tf.initialize_all_variables()

### Training & Testing

In [None]:
# 1000 training iterations (each "epoch" here is one mini-batch step, not a full pass over the data)
def train_test(dev):
    batch_size = 100
    num_epochs = 1000

    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        with tf.device(dev):
            # Run the session
            sess.run(init)
            for i in range(num_epochs):
                # periodic print out
                if i % (num_epochs/10.0) == 0: print("Epoch: " + str(i) + "...")
                batch_inputs, truth_values = mnist.train.next_batch(batch_size)
                sess.run(train_step, feed_dict={x: batch_inputs, t: truth_values})

            # OP: compare truth values to predictions
            correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(t, 1))

            # OP: calculate accuracy
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

            # RUN: print the accuracy
            test_result = sess.run(accuracy, feed_dict={x: mnist.test.images, t: mnist.test.labels})
            print("Accuracy on Test set: " + str(test_result))

In [None]:
train_test(GPU)
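
The accuracy calculation inside `train_test` boils down to: take the argmax of each prediction and each one-hot label, compare them, and average the matches. A plain-Python sketch with made-up predictions and labels (four examples, three classes); the `argmax` helper is hypothetical:

```python
def argmax(values):
    """Index of the largest entry (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# Made-up predictions and one-hot labels.
predictions = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
labels = [[1, 0, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0]]

# True where the predicted class matches the labeled class
correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
# cast the booleans to floats and take the mean
accuracy = sum(1.0 if c else 0.0 for c in correct) / len(correct)

print(correct)    # [True, True, False, True]
print(accuracy)   # 0.75
```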

# Using a Deeper Model: MNIST
But first, a recap of the single-layer model, refactored into functions...

In [5]:
import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [6]:
# Input & Truth Vector
def get_placeholders():
    x = tf.placeholder("float",shape=[None,784])
    t = tf.placeholder("float",shape=[None,10])
    return (x, t)

# Weights & Bias
def get_model_params():
    W = tf.Variable(tf.zeros([784,10]))
    b = tf.Variable(tf.zeros([10]))
    return (W, b)

# Softmax Layer
def get_softmax_layer(x, W, b):
    return tf.nn.softmax(tf.matmul(x, W) + b)

# Cost Function
def get_cross_entropy_function(t, y):
    return -tf.reduce_sum(t * tf.log(y))

# Training module
def get_training_module(learning_rate, cost_function):
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)

# Test the model
def do_test_model(inputs, outputs, truth):
    correct_prediction = tf.equal(tf.argmax(outputs,1), tf.argmax(truth,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    test_accuracy = accuracy.eval(feed_dict={inputs: mnist.test.images, truth: mnist.test.labels})
    print("Test Accuracy: " + str(test_accuracy))

# Training iterations
def do_train_model(training_algo, input_values, truth_values, batch_size, num_epochs):
    for i in range(num_epochs):
        if i % (num_epochs/10) == 0: print("Epoch " + str(i) + "...")
        batch = mnist.train.next_batch(batch_size)
        training_algo.run(feed_dict={input_values: batch[0], truth_values: batch[1]})

# Train/test
def do_train_test(learning_rate, batch_size, num_epochs):
    x, t = get_placeholders()
    W, b = get_model_params()
    y = get_softmax_layer(x, W, b)
    ce = get_cross_entropy_function(t, y)
    training_algo = get_training_module(learning_rate, ce)
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        do_train_model(training_algo, x, t, batch_size, num_epochs)
        do_test_model(x, y, t)

In [7]:
learning_rate = 0.01
batch_size = 100
num_epochs = 1000
do_train_test(learning_rate, batch_size, num_epochs)

Epoch 0...
Epoch 100...
Epoch 200...
Epoch 300...
Epoch 400...
Epoch 500...
Epoch 600...
Epoch 700...
Epoch 800...
Epoch 900...
Test Accuracy: 0.9165


## Convolutional Model
Parameters and placeholders...

In [8]:
import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Inputs and truth placeholder
x, t = get_placeholders()

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


Helper functions we'll need for weight initialization, convolution, and pooling...

In [9]:
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
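
To make the pooling step concrete, here is a plain-Python sketch of 2x2 max pooling with stride 2 on a single small image. The `pool_2x2` name and the 4x4 input are made up for illustration. It handles only one 2-D image with even height and width, whereas `tf.nn.max_pool` works on a 4-D batch.

```python
def pool_2x2(image):
    """2x2 max pooling, stride 2, on a 2-D list with even height and width."""
    return [[max(image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1])
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]

image = [[1, 3, 2, 0],
         [4, 2, 1, 1],
         [0, 0, 5, 6],
         [1, 2, 7, 8]]
print(pool_2x2(image))   # [[4, 2], [2, 8]]
```

Each output pixel is the largest value in its 2x2 block, so height and width are both halved.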

### First Convolutional Layer
This will create a convolution layer with 32 filters, each being a 5x5 pixel patch. The weight shape will therefore be [5,5,1,32], which indicates the size of our filters (5x5), the number of input channels (1), and the number of output channels (32). There is also one bias term per output channel, i.e. a 32-dim vector of bias terms.
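
As a quick bit of arithmetic, the number of trainable values in this first layer follows directly from those shapes. The variable names below are just for illustration:

```python
filter_height, filter_width = 5, 5
in_channels, out_channels = 1, 32

weights = filter_height * filter_width * in_channels * out_channels
biases = out_channels   # one bias per output channel
print(weights, biases, weights + biases)   # 800 32 832
```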

In [10]:
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

To apply this layer, we reshape `x` to a 4d tensor, with the 2nd and 3rd dimensions corresponding to the image width and height, and the final dimension to the number of color channels (1 for greyscale). The -1 in the first dimension lets TensorFlow infer the batch size.

In [11]:
x_image = tf.reshape(x, [-1, 28, 28, 1])

Convolve x_image with the weight tensor, add the bias, apply ReLU, and finally max pool...

In [12]:
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

### Second Convolutional Layer
64 features for each 5x5 patch, taking the 32 channels from the first layer as input.

In [13]:
# Weight & Bias
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

# ReLU activation function & Max-pooling layer
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

### Densely Connected Layer
Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.
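
Where the 7x7 comes from: with 'SAME' padding the output size along each spatial axis is ceil(input / stride), so the stride-1 convolutions keep the size and each stride-2 pool halves it. A small sketch of that arithmetic; the `same_output_size` helper is made up for illustration:

```python
import math

def same_output_size(size, stride):
    """Output length along one spatial axis under 'SAME' padding."""
    return int(math.ceil(size / float(stride)))

size = 28   # MNIST images are 28 x 28
for layer in ("conv1", "pool1", "conv2", "pool2"):
    stride = 1 if layer.startswith("conv") else 2
    size = same_output_size(size, stride)
    print(layer, size)   # conv1 28, pool1 14, conv2 14, pool2 7

flat = size * size * 64   # 64 feature maps after the second conv layer
print(flat)               # 3136
```

That 3136 is the `7 * 7 * 64` used for the first dimension of the fully-connected weight matrix.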

In [14]:
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

# flatten the pooled feature maps and apply the fully-connected layer
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
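
A rough plain-Python sketch of what dropout does to a vector of activations: each value is kept with probability `keep_prob` and scaled by 1/`keep_prob`, otherwise it is zeroed. The `dropout` helper, the seed, and the all-ones input are assumptions for illustration; this is not TensorFlow's implementation.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaling survivors by 1/keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)          # fixed seed so the run is repeatable
activations = [1.0] * 10000     # made-up activations
dropped = dropout(activations, 0.5, rng)

kept = sum(1 for v in dropped if v != 0.0)
mean = sum(dropped) / len(dropped)
print(kept, mean)               # roughly half are kept; the mean stays near 1.0
```

The scaling keeps the expected activation unchanged. Training below feeds `keep_prob` as 0.5, and evaluation feeds 1.0, which turns dropout off.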

### Readout Layer
Finally, we add a softmax layer, just like for the one layer softmax regression above.

In [15]:
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = get_softmax_layer(h_fc1_drop, W_fc2, b_fc2)

## Train and Test

In [16]:
# Cross Entropy loss function
ce = get_cross_entropy_function(t, y_conv)

# Function to optimize
training_algo = tf.train.AdamOptimizer(1e-4).minimize(ce)

# Correct Prediction
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(t, 1))

# Accuracy calculation
accuracy_calc = tf.reduce_mean(tf.cast(correct_prediction, "float"))

Training method...

In [17]:
# Run the Training...
def run_convolutional_mnist_model(training_algo, accuracy_calc, batch_size):
    config=tf.ConfigProto(allow_soft_placement=True)
    #config.gpu_options.allocator_type = 'BFC'
    with tf.Session(config=config) as sess:
        sess.run(tf.initialize_all_variables())
        for i in range(2000):
            batch = mnist.train.next_batch(batch_size)
            if i % 100 == 0:
                train_accuracy = accuracy_calc.eval(feed_dict={x:batch[0], t:batch[1], keep_prob:1.0})
                print("Epoch: %d, Training Accuracy: %g" % (i, train_accuracy))
            training_algo.run(feed_dict={x:batch[0], t:batch[1], keep_prob:0.5})
            
        test_acc = 0
        #test_acc = accuracy_calc.eval(feed_dict={x:mnist.test.images, t:mnist.test.labels, keep_prob:1.0})
        #print("Test Accuracy: %g" % test_acc)

### Run the training...

In [None]:
run_convolutional_mnist_model(training_algo, accuracy_calc, 100)

