# Hello friendly folks at Northwestern! 

This notebook contains a walkthrough to help you get started with TensorFlow. We'll work on a few exercises for...

* Linear Regression, using low-level TensorFlow.  
* Logistic Regression, using low-level TensorFlow.  
* Deep Neural Network, using low-level TensorFlow.
* The above, with Canned Estimators.
* Practical stuff: training models on (possibly large) amounts of structured data.
* Custom Estimators.

Of course, Deep Learning is a wide and rich field, and TensorFlow can do much more than the above. Here are some recent articles you can check out:

* https://research.googleblog.com/2017/06/supercharge-your-computer-vision-models.html
* https://research.googleblog.com/2017/07/building-your-own-neural-machine.html
* https://magenta.tensorflow.org (for many projects using TensorFlow for art & music).

# Installation
Before you begin, please make sure you have TensorFlow version 1.3.0rc0 (or higher) installed on your machine, where *"rc"* means *"release candidate"*. We'll be working with the Datasets API, which we're currently developing to make it easy to efficiently train models on large amounts of data (say, that are too big to fit into memory).


# Learning more

Here are some short videos and a couple of book recommendations.

* Machine Learning Recipes: https://goo.gl/uRR7r4
* Hands-On Machine Learning with Scikit-Learn and TensorFlow: http://shop.oreilly.com/product/0636920052289.do
* Deep Learning with Python: https://www.manning.com/books/deep-learning-with-python

You can follow TensorFlow on Twitter, if you like, at https://twitter.com/tensorflow

## Imports
If you can successfully run this cell, then your machine is properly configured for this tutorial. If the only lines causing problems are *import pylab* and the *%matplotlib inline* line after it, you can safely comment them out and skip the later cells that use them.

In [None]:
# Python 2 & 3 compatibility
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import collections
import os

import numpy as np
import pandas as pd

import pylab
%matplotlib inline

import tensorflow as tf
print('You are running TensorFlow version:', tf.__version__)

from tensorflow.examples.tutorials.mnist import input_data

# 1) Linear Regression (with low-level code)

Let's start our journey by taking a look at lower-level TensorFlow code, to get a sense for how things work under the hood. Fear not, intrepid reader - you need not write graph-level code in practice unless you'd like to.

### Let's generate some data

This function will create a noisy dataset that's roughly linear, according to the equation *y = mx + b + noise*. As you might expect, we'll then try to find the best fit line. Of course, there's a closed form solution to this - but we'll use gradient descent as an exercise.
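
For reference, the closed-form solution mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration of ordinary least squares for a single feature, not part of the TensorFlow exercise; the function name `fit_line_closed_form` and the tiny noise-free dataset are made up for this example.

```python
def fit_line_closed_form(xs, ys):
    """Ordinary least squares for y = m*x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The best-fit line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noise-free points on y = 0.1x + 0.3,
# so the fit should recover roughly m = 0.1 and b = 0.3.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.1 * x + 0.3 for x in xs]
m_hat, b_hat = fit_line_closed_form(xs, ys)
print("Closed form m: %f, b: %f" % (m_hat, b_hat))
```

With noisy data the recovered values will be close to, but not exactly, the true slope and intercept.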

In [None]:
def make_noisy_data(m=0.1, b=0.3, n=100):
    x = np.random.rand(n).astype(np.float32)
    noise = np.random.normal(scale=0.01, size=len(x))
    y = m * x + b + noise
    return x, y

Create training and testing data.

In [None]:
x_train, y_train = make_noisy_data()
x_test, y_test = make_noisy_data()

In [None]:
# Uncomment the following lines to plot the data.
# Isn't it beautiful?
# Training data is shown in blue, and testing data in green.
# pylab.plot(x_train, y_train, 'b.')
# pylab.plot(x_test, y_test, 'g.')

### Prepare the graph

In [None]:
# The following line (as you might imagine) clears the graph.
# Why do we need it? Jupyter Notebooks maintain state.
# If you run this Notebook twice (and forget to reset it), 
# this line will restore everything to a clean state for you.
tf.reset_default_graph()

In [None]:
# You can think of a Session as an 'execution environment' for a graph.
# We won't need this until we're ready to run the graph, but
# I'll create it now, just for kicks.
sess = tf.Session()

In [None]:
# Path to a log directory.
# As written, it will be created in the same directory as this notebook.
# Later, we'll use TensorBoard to visualize data stored 
# in this directory - and it will be awesome.

# Tip:
# If you have trouble with TensorBoard, delete
# the log directory, and re-run the notebook.
LOGDIR = './graphs'

Define placeholders for data we'll feed to the graph.

In [None]:
# You can think of a 'Placeholder' as a promise. It's a value we 
# promise to provide when we execute the graph.
# A lot of this code is for display purposes.
# - 'tf.name_scope' nests our placeholders under an 'input' block
# - name='x-input' gives TensorBoard a display name for this node.
# shape=[None] means x_placeholder is a one dimensional array of any length. 
# - this is so we can feed a 'batch' of data later, for example,
# - for stochastic gradient descent, or to make predictions.
with tf.name_scope('input'):
    x_placeholder = tf.placeholder(shape=[None], dtype=tf.float32, name='x-input')
    y_placeholder = tf.placeholder(shape=[None], dtype=tf.float32, name='y-input')

### Define our model.

Here, we'll use a linear model (i.e., *y = mx + b*).

In [None]:
with tf.name_scope('model'):
    m = tf.Variable(tf.random_normal([1]), name='m')
    b = tf.Variable(tf.random_normal([1]), name='b')
    # This is the same as y = tf.add(tf.multiply(m, x_placeholder), b), 
    # but looks nicer
    y = m * x_placeholder + b

### The Loss and Optimizer

Define a loss function (*mean squared error*) and an optimizer (*vanilla gradient descent*).
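
To see what the optimizer does on our behalf, here is a rough plain-Python sketch of mean squared error and one vanilla gradient descent update, with the gradients worked out by hand. The helper names (`mse`, `gradient_step`) and the small noise-free dataset are assumptions made for illustration; TensorFlow derives the gradients automatically from the graph instead.

```python
def mse(m, b, xs, ys):
    """Mean squared error of the line y = m*x + b on the data."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / float(len(xs))


def gradient_step(m, b, xs, ys, learning_rate):
    """One step of vanilla gradient descent on the MSE loss."""
    n = float(len(xs))
    errors = [m * x + b - y for x, y in zip(xs, ys)]
    # d(loss)/dm = 2 * mean(error * x),  d(loss)/db = 2 * mean(error)
    grad_m = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    return m - learning_rate * grad_m, b - learning_rate * grad_b


# Hypothetical noise-free data on y = 0.1x + 0.3
xs = [i / 10.0 for i in range(11)]
ys = [0.1 * x + 0.3 for x in xs]

m_est, b_est = 0.0, 0.0
initial_loss = mse(m_est, b_est, xs, ys)
for _ in range(201):
    m_est, b_est = gradient_step(m_est, b_est, xs, ys, learning_rate=0.5)
final_loss = mse(m_est, b_est, xs, ys)
print("m: %f, b: %f, loss: %g" % (m_est, b_est, final_loss))
```

The learning rate and step count mirror the values used in this section, but on other data a rate of 0.5 may well be too large.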

In [None]:
LEARNING_RATE = 0.5 # a magic number!
# as you gain experience with Deep Learning,
# you will become proficient in picking proper
# values (or just stop worrying about it)

with tf.name_scope('training'):
    with tf.name_scope('loss'):
        loss = tf.reduce_mean(tf.square(y - y_placeholder))
    with tf.name_scope('optimizer'):
        optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE)
        train = optimizer.minimize(loss) 

### Log values in TensorBoard
Later, we'll get this for free.

In [None]:
# Write the graph
writer = tf.summary.FileWriter(LOGDIR)
writer.add_graph(sess.graph)

# Attach summaries to Tensors (for TensorBoard visualization)
tf.summary.histogram('m', m)
tf.summary.histogram('b', b)
tf.summary.scalar('loss', loss)

# This op will calculate our summary data when run
summary_op = tf.summary.merge_all()

### Initialize variables
At this point, our graph is complete - and we're nearly ready to begin training. First, variables must be initialized. Don't forget this line - the fate of the universe is uncertain, if you do.

In [None]:
# Notice we're running this line with our session.
# All the TensorFlow code prior to this point has
# served to define the graph.
sess.run(tf.global_variables_initializer())

In [None]:
# 'm' and 'b' were initialized to random values
# let's see what these were
initial_vals = sess.run([m, b])
print ("Initial values for m: %f, b: %f" % (initial_vals[0], initial_vals[1]))

### Train
Here, we'll iteratively update the values for 'm' and 'b' using gradient descent.

In [None]:
TRAIN_STEPS = 201

for step in range(TRAIN_STEPS):
        
    # Session will run two ops:
    # - summary_op prepares summary data we'll write to disk in a moment
    # - train will use the optimizer to adjust our variables to reduce loss
    summary_result, _ = sess.run([summary_op, train], 
                                  feed_dict={x_placeholder: x_train, 
                                             y_placeholder: y_train})
    # write the summary data to disk
    writer.add_summary(summary_result, step)
    
    # Print the current values every 20 steps to watch training happen in real time.
    if step % 20 == 0:
        vals = sess.run([m, b])
        print("Step: %d, m: %f, b: %f" % (step, vals[0], vals[1]))
    
# close the writer when we're finished using it
writer.close()

### See the learned values for 'm' and 'b'

In [None]:
# If things worked properly, 'm' should be about 0.1, 
# and 'b' should be about 0.3
# (+/- a bit, because we added noise when we generated the data)
print ("Learned values for m: %f, b: %f" % (sess.run(m), sess.run(b)))

### Use the trained model to make a prediction

In [None]:
# Use the trained model to make a prediction!
# Remember that x_placeholder must be a vector, hence [2] not just 2.
# We expect the result to be (about): 2 * 0.1 + 0.3 + noise ~= 0.5
sess.run(y, feed_dict={x_placeholder: [2]})

### Start TensorBoard

Let's see what we got for all that work logging variables.

Start TensorBoard by running this command from a terminal.

```$ tensorboard --logdir=graphs```

Note: first ```cd``` into the directory that contains this notebook. If you are running TensorFlow in a *virtualenv* and you have opened a new terminal window, be sure to start the *virtualenv* again before running TensorBoard.

After you have run this command, open TensorBoard by pointing your browser to *http://localhost:6006*. Then, click on the tabs for 'scalars', 'distributions', 'histograms', and 'graphs' to learn more.

If you run into trouble, delete LOGDIR (to clear information from previous runs), then re-run this notebook, and restart TensorBoard.

# 2) Logistic Regression, using low-level code

We will now use a linear model to classify handwritten digits from the MNIST dataset.
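
The model in this section produces one score (a "logit") per digit class, and the loss compares the softmax of those scores with a one-hot label. As a rough sketch of that computation for a single example, here is a plain-Python version; the function names and example logits are made up for illustration, and TensorFlow's own op is a batched, optimized equivalent rather than this code.

```python
import math


def log_softmax(logits):
    """Log of the softmax probabilities, computed in a numerically stable way."""
    largest = max(logits)
    shifted = [z - largest for z in logits]  # shifting does not change softmax
    log_total = math.log(sum(math.exp(z) for z in shifted))
    return [z - log_total for z in shifted]


def softmax(logits):
    return [math.exp(lp) for lp in log_softmax(logits)]


def cross_entropy(one_hot_label, logits):
    """Cross entropy between a one-hot label and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(one_hot_label, log_softmax(logits)))


# One-hot label for the digit 5 (index 5 is set)
label = [0.0] * 5 + [1.0] + [0.0] * 4

# With identical logits every class gets probability 1/10
uniform_loss = cross_entropy(label, [0.0] * 10)

# Hypothetical logits that favor the correct class give a smaller loss
confident_logits = [0.0] * 10
confident_logits[5] = 5.0
confident_loss = cross_entropy(label, confident_logits)

print("uniform: %f, confident: %f" % (uniform_loss, confident_loss))
```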

In [None]:
tf.reset_default_graph()
sess = tf.Session()

In [None]:
# Import the MNIST dataset. 
# It will be downloaded to '/tmp/data' if you don't already have a local copy.
mnist = input_data.read_data_sets('/tmp/data', one_hot=True)

In [None]:
# Uncomment these lines to understand the format of the dataset.

# 1. There are 55k, 5k, and 10k examples in train, validation, and test.
# print ('Train, validation, test: %d, %d, %d' % 
#       (len(mnist.train.images), len(mnist.validation.images), len(mnist.test.images)))

# 2. The format of the labels is 'one-hot'.
# The fifth image happens to be a '6'.
# This is represented as '[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]'
# print (mnist.train.labels[4])

# You can find the index of the label, like this:
# print (np.argmax(mnist.train.labels[4]))

# 3. An image is a 'flattened' array of 28*28 = 784 pixels.
# print (len(mnist.train.images[4]))

# 4. To display an image, first reshape it to 28x28.
# pylab.imshow(mnist.train.images[4].reshape((28,28)), cmap=pylab.cm.gray_r)   
# pylab.title('Label: %d' % np.argmax(mnist.train.labels[4])) 

In [None]:
NUM_CLASSES = 10
NUM_PIXELS = 28 * 28
TRAIN_STEPS = 2000
BATCH_SIZE = 100
LEARNING_RATE = 0.5

In [None]:
# Define inputs
images = tf.placeholder(dtype=tf.float32, shape=[None, NUM_PIXELS])
labels = tf.placeholder(dtype=tf.float32, shape=[None, NUM_CLASSES])

In [None]:
# Define model
W = tf.Variable(tf.truncated_normal([NUM_PIXELS, NUM_CLASSES]))
b = tf.Variable(tf.zeros([NUM_CLASSES]))
y = tf.matmul(images, W) + b

In [None]:
# Define loss and optimizer
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=labels))
train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss) 

In [None]:
# Initialize variables after the model is defined
sess.run(tf.global_variables_initializer())

In [None]:
# Train the model
for i in range(TRAIN_STEPS):
    batch_images, batch_labels = mnist.train.next_batch(BATCH_SIZE)
    sess.run(train_step, feed_dict={images: batch_images, labels: batch_labels})

In [None]:
# Evaluate the trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
                                  
print("Accuracy %f" % sess.run(accuracy, feed_dict={images: mnist.test.images, 
                                                    labels: mnist.test.labels}))

A method to make predictions on a single image
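
Both evaluation and prediction boil down to picking the class with the largest logit and comparing it with the index set in the one-hot label. A minimal plain-Python sketch of that idea follows; the helper names and the three-class batch are hypothetical.

```python
def argmax(values):
    """Index of the largest entry (the first one, if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits_batch, one_hot_labels):
    """Fraction of examples whose predicted class matches the label."""
    correct = sum(1 for logits, label in zip(logits_batch, one_hot_labels)
                  if argmax(logits) == argmax(label))
    return correct / float(len(logits_batch))


# A made-up batch of three examples with three classes each
logits_batch = [[0.1, 2.0, -1.0],
                [1.5, 0.2, 0.3],
                [0.0, 0.1, 0.9]]
one_hot_labels = [[0, 1, 0],
                  [1, 0, 0],
                  [0, 1, 0]]

batch_accuracy = accuracy(logits_batch, one_hot_labels)
print("Accuracy: %f" % batch_accuracy)
```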

In [None]:
def predict(i):
    image = mnist.test.images[i]
    actual_label = np.argmax(mnist.test.labels[i])
    prediction = tf.argmax(y, 1)
    # sess.run returns one prediction per image we feed; take the first.
    predicted_label = sess.run(prediction, feed_dict={images: [image]})[0]
    print("Predicted: %d, actual: %d" % (predicted_label, actual_label))
    pylab.imshow(image.reshape((28, 28)), cmap=pylab.cm.gray_r)
    return predicted_label, actual_label

predict(5)

# 3) A Deep Neural Network, using low-level TensorFlow code.

Using the magic of automatic differentiation, we will now write a Deep Neural Network to classify handwritten digits. If this seems like a big jump from Linear and Logistic Regression - keep in mind, the goal of this exercise is to show you that the part of the code that does the *"hard work"* (training the model) is nearly identical. We need only change the code to specify the model (a stack of fully connected layers instead of y = Wx + b). Once that's done, we can train the DNN using TensorFlow in the *same* way we train the Linear / Logistic models.
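
To make "a stack of fully connected layers" concrete, here is a tiny plain-Python sketch of the forward pass for one example. The layer sizes, weights, and the pure-Python `fc_layer` helper are made up for illustration; the TensorFlow version in this section does the same arithmetic with matrix ops on whole batches.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def fc_layer(inputs, weights, biases, activation=None):
    """Fully connected layer for one example.

    inputs:  list of size_in floats
    weights: size_in x size_out nested list
    biases:  list of size_out floats
    """
    outputs = [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        for j in range(len(biases))
    ]
    return activation(outputs) if activation else outputs


# Hypothetical network: 3 inputs -> 2 hidden units (ReLU) -> 2 logits
inputs = [1.0, -2.0, 0.5]
w1 = [[0.5, -1.0], [0.25, 0.5], [1.0, 1.0]]
b1 = [0.0, 0.1]
w2 = [[2.0, -1.0], [3.0, 4.0]]
b2 = [0.1, 0.2]

hidden = fc_layer(inputs, w1, b1, activation=relu)
logits = fc_layer(hidden, w2, b2)  # no activation on the output layer
print(hidden, logits)
```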

In [None]:
tf.reset_default_graph()
sess = tf.Session()

In [None]:
# number of neurons in each hidden layer
HIDDEN1_SIZE = 500
HIDDEN2_SIZE = 250

NUM_CLASSES = 10
NUM_PIXELS = 28 * 28

# experiment with the number of training steps to 
# see the effect
TRAIN_STEPS = 2000
BATCH_SIZE = 100

# we're using a different learning rate than the previous
# section, and a new optimizer
LEARNING_RATE = 0.001

In [None]:
# Define inputs
with tf.name_scope('input'):
    images = tf.placeholder(tf.float32, [None, NUM_PIXELS], name="pixels")
    labels = tf.placeholder(tf.float32, [None, NUM_CLASSES], name="labels")

In [None]:
# Method to create a fully connected layer
def fc_layer(input, size_out, name="fc", activation=None):
    with tf.name_scope(name):
        size_in = int(input.shape[1])
        w = tf.Variable(tf.truncated_normal([size_in, size_out], stddev=0.1), name="weights")
        b = tf.Variable(tf.constant(0.1, shape=[size_out]), name="bias")
        wx_plus_b = tf.matmul(input, w) + b
        if activation: return activation(wx_plus_b)
        return wx_plus_b
    
# The way we initialize variables has an effect on how quickly 
# training converges. We may experiment with different strategies later.
# w = tf.Variable(tf.truncated_normal(shape=[size_in, size_out], stddev=1.0 / math.sqrt(float(size_in))))

In [None]:
# Define the model

# First, we'll create two fully connected layers, with ReLU activations
fc1 = fc_layer(images, HIDDEN1_SIZE, "fc1", activation=tf.nn.relu)
fc2 = fc_layer(fc1, HIDDEN2_SIZE, "fc2", activation=tf.nn.relu)

# Next, we'll apply Dropout to the second layer
# This can help prevent overfitting, and I've added it here
# for illustration. You can comment this out, if you like.
dropped = tf.nn.dropout(fc2, keep_prob=0.9)

# Finally, we'll calculate logits. This will be
# the input to our Softmax function. Notice we 
# don't apply an activation at this layer.
# If you've commented out the dropout layer,
# switch the input here to 'fc2'.
y = fc_layer(dropped, NUM_CLASSES, name="output")

In [None]:
# Define loss and an optimizer
with tf.name_scope("loss"):
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=labels))
    tf.summary.scalar('loss', loss)

with tf.name_scope("optimizer"):
    # Whereas in the previous section we used a vanilla GradientDescentOptimizer
    # here, we're using Adam. This is a single line of code change, and more
    # importantly, TensorFlow will still automatically analyze our graph
    # and determine how to adjust the variables to decrease the loss.
    train = tf.train.AdamOptimizer(LEARNING_RATE).minimize(loss)

In [None]:
# Define evaluation
with tf.name_scope("evaluation"):
    # the next two lines are identical to the previous section.
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    tf.summary.scalar('accuracy', accuracy)    

In [None]:
# Set up logging.
# We'll use a second FileWriter to summarize accuracy on
# the test set. This will let us display it nicely in TensorBoard.
train_writer = tf.summary.FileWriter(os.path.join(LOGDIR, "train"))
train_writer.add_graph(sess.graph)
test_writer = tf.summary.FileWriter(os.path.join(LOGDIR, "test"))
summary_op = tf.summary.merge_all()

In [None]:
sess.run(tf.global_variables_initializer())

In [None]:
for step in range(TRAIN_STEPS):
    batch_xs, batch_ys = mnist.train.next_batch(BATCH_SIZE)
    summary_result, _ = sess.run([summary_op, train], 
                                    feed_dict={images: batch_xs, labels: batch_ys})

    train_writer.add_summary(summary_result, step)
    train_writer.add_run_metadata(tf.RunMetadata(), 'step%03d' % step)
    
    # calculate accuracy on the test set, every 100 steps.
    # we're using the entire test set here, so this will be a bit slow
    if step % 100 == 0:
        summary_result, acc = sess.run([summary_op, accuracy], 
                                       feed_dict={images: mnist.test.images, 
                                                  labels: mnist.test.labels})
        test_writer.add_summary(summary_result, step)
        test_writer.add_run_metadata(tf.RunMetadata(), 'step%03d' % step)
        print ("test accuracy: %f at step %d" % (acc, step))


print("Accuracy %f" % sess.run(accuracy, 
                               feed_dict={images: mnist.test.images,
                                          labels: mnist.test.labels}))
train_writer.close()
test_writer.close()

# 4) Linear Regression with a Canned Estimator  

Now let's begin working with higher-level code. We will again train a linear regression model, in just a few lines.

### Input Pipeline

In [None]:
x_dict = {'x': x_train}
train_input = tf.estimator.inputs.numpy_input_fn(x_dict, y_train,
                                                 shuffle=True,
                                                 num_epochs=None) # repeat forever

### Describe input feature usage

In [None]:
features = [tf.feature_column.numeric_column('x')]

### Build and train the model
After you run this next block, you should see an output line in the logs resembling:

```WARNING:tensorflow:Using temporary folder as model directory: /var/folders/sf/j86k2fg96m96w2hmwlsdrvr8006h_5/T/tmpSkPFHV```

You can then start TensorBoard, pointing to that directory like this:

```$ tensorboard --logdir=/var/folders/sf/j86k2fg96m96w2hmwlsdrvr8006h_5/T/tmpSkPFHV```

In [None]:
estimator = tf.estimator.LinearRegressor(features)
estimator.train(train_input, steps=1000)

### Predict

In [None]:
data_source = tf.estimator.inputs.numpy_input_fn({'x': x_test}, shuffle=False)

predictions = list(estimator.predict(data_source))
preds = [p['predictions'][0] for p in predictions]

#for y in predictions:
#    print(y['predictions'])
#predictions

In [None]:
pylab.scatter(x_train, y_train)
pylab.plot(x_test, np.array(preds), 'g')

Hopefully, this felt easier than the lower-level code above.

# 5) Training Models on (possibly large amounts of) Structured Data

### Download the dataset

Here, we'll work with the "Adult dataset" from the U.S. Census Bureau. Our task will be to predict whether an individual makes more than $50,000 a year based on attributes such as education, hours of work per week, etc. More about this dataset is [here](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/old.adult.names).

### You can adapt this code for a problem you care about

Our goal here is to demonstrate how to work with data you might represent in a CSV file. Hopefully, you can adapt this code to a problem you care about. 

### What if I have *lots* of data?

The code presented here can be adapted to any CSV dataset that fits in memory (using the *pandas input function*) or a dataset of pretty much any size (using the *Datasets API*, below) - which contains logic to efficiently read it from disk. When you're training large models using GPUs, you want to be sure your input pipeline doesn't bottleneck (or starve) the GPU. The Datasets API handles this for you.
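
The core idea behind reading data that doesn't fit in memory is to stream it: read a line at a time and hand out small batches. The following plain-Python generator is only a sketch of that idea, not the Datasets API itself (which also handles shuffling, prefetching, and feeding the graph). The function name, column names, and sample rows are made up for illustration.

```python
import csv
import io


def csv_batches(file_obj, column_names, batch_size):
    """Yield lists of row dicts, reading the file one line at a time."""
    reader = csv.reader(file_obj, skipinitialspace=True)
    batch = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        batch.append(dict(zip(column_names, row)))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


# Made-up rows standing in for a large file on disk
sample = io.StringIO(u"39, Bachelors, no\n50, Bachelors, no\n38, HS-grad, yes\n")
batches = list(csv_batches(sample, ['age', 'education', 'label'], batch_size=2))
for batch in batches:
    print(batch)
```

Swapping `io.StringIO` for `open(path)` would stream a real file the same way, since only one batch is held in memory at a time.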

In [None]:
census_train_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
census_train_path = tf.contrib.keras.utils.get_file('census.train', census_train_url)

census_test_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test'
census_test_path = tf.contrib.keras.utils.get_file('census.test', census_test_url)

### Load the data

In [None]:
column_names = [
  'age', 'workclass', 'fnlwgt', 'education', 'education-num',
  'marital-status', 'occupation', 'relationship', 'race', 'sex',
  'capital-gain', 'capital-loss', 'hours-per-week', 'native-country',
  'income'
]

census_train = pd.read_csv(census_train_path, index_col=False, names=column_names) 
# adult.test begins with a non-data header line, so skip it.
census_test = pd.read_csv(census_test_path, index_col=False, names=column_names, skiprows=1) 

census_train_label = census_train.pop('income') == " >50K" 
# Labels in adult.test carry a trailing period (" >50K.").
census_test_label = census_test.pop('income') == " >50K."

In [None]:
census_train.head(10)

In [None]:
census_train_label[:20]

### Input pipeline

In [None]:
train_input = tf.estimator.inputs.pandas_input_fn(
    census_train, 
    census_train_label,
    shuffle=True, 
    batch_size = 32, # process 32 examples at a time
    num_epochs=None,
)

In [None]:
test_input = tf.estimator.inputs.pandas_input_fn(
    census_test, 
    census_test_label, 
    shuffle=True, 
    num_epochs=1)

In [None]:
features, labels = train_input()
features

### Feature description
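
Two of the feature columns used in this section transform raw values into bucket indices: a bucketized column maps a number to the range it falls in, and a hash-bucket column maps a string to one of a fixed number of buckets. Here is a rough plain-Python sketch of both ideas. The helper names are hypothetical, and `hashlib.md5` is only a stand-in: TensorFlow uses its own hash function, so the bucket numbers will differ.

```python
import bisect
import hashlib


def bucketize(value, boundaries):
    """Index of the bucket that value falls into.

    With sorted boundaries [b0, b1, ...], bucket 0 is (-inf, b0),
    bucket 1 is [b0, b1), and so on, up to [b_last, +inf).
    """
    return bisect.bisect_right(boundaries, value)


def hash_bucket(text, num_buckets):
    """Map a string to a stable bucket index in [0, num_buckets)."""
    digest = hashlib.md5(text.encode('utf-8')).hexdigest()
    return int(digest, 16) % num_buckets


boundaries = list(range(25))
education_bucket = bucketize(13, boundaries)
country_bucket = hash_bucket('United-States', 1000)
print(education_bucket, country_bucket)
```

Hashing needs no vocabulary up front, at the cost of occasional collisions where two different strings share a bucket.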

In [None]:
features = [
    tf.feature_column.numeric_column('hours-per-week'),
    tf.feature_column.bucketized_column(tf.feature_column.numeric_column('education-num'), list(range(25))),
    # Values in the CSV keep their leading space and capitalization (e.g. ' Male').
    tf.feature_column.categorical_column_with_vocabulary_list('sex', [' Male', ' Female']),
    tf.feature_column.categorical_column_with_hash_bucket('native-country', 1000),
]

In [None]:
estimator = tf.estimator.LinearClassifier(features, model_dir='census/linear',n_classes=2)

In [None]:
estimator.train(train_input, steps=5000)

### Evaluate the model

In [None]:
estimator.evaluate(test_input)

In [None]:
predictions = [p for p in estimator.predict(test_input)]
print (predictions[0]["probabilities"])

## DNN model

### Update input pre-processing
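
A DNN needs dense inputs, so categorical features are wrapped in either an indicator column (a one-hot vector, one slot per category) or an embedding column (a short, learned vector per category). The plain-Python sketch below shows the difference in shape only. The helper names are made up, and the embedding table is filled with random numbers here, whereas in a real model its values are learned during training.

```python
import random


def one_hot(index, depth):
    """Indicator-style encoding: all zeros except a one at index."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def make_embedding_table(num_buckets, dimension, seed=0):
    """One small random vector per bucket (a stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(dimension)]
            for _ in range(num_buckets)]


def embed(index, table):
    """Embedding lookup: just select the row for this bucket."""
    return table[index]


bucket = 3  # a hypothetical bucket index for one category
indicator = one_hot(bucket, 1000)
table = make_embedding_table(num_buckets=1000, dimension=10)
embedding = embed(bucket, table)
print(len(indicator), len(embedding))
```

With 1000 buckets, the embedding hands the network 10 numbers per example instead of 1000, which is why it suits high-cardinality features.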

In [None]:
features = [
    tf.feature_column.numeric_column('education-num'),
    tf.feature_column.numeric_column('hours-per-week'),
    tf.feature_column.numeric_column('age'),
    tf.feature_column.indicator_column(
        # Values in the CSV keep their leading space and capitalization (e.g. ' Male').
        tf.feature_column.categorical_column_with_vocabulary_list('sex', [' Male', ' Female'])),
    tf.feature_column.embedding_column(  # now using embedding!
        tf.feature_column.categorical_column_with_hash_bucket('native-country', 1000), 10)
]

In [None]:
estimator = tf.estimator.DNNClassifier(hidden_units=[20,20], 
                                       feature_columns=features, 
                                       n_classes=2, 
                                       model_dir='census/dnn')

In [None]:
estimator.train(train_input, steps=5000)

In [None]:
estimator.evaluate(test_input)

## Custom Input Pipeline using Datasets API

### Read the data

In [None]:
def census_input_fn(path):
    def input_fn():    
        dataset = (
            tf.contrib.data.TextLineDataset(path)
                .map(csv_decoder)
                .shuffle(buffer_size=100)
                .batch(32)
                .repeat())

        columns = dataset.make_one_shot_iterator().get_next()
        income = tf.equal(columns.pop('income')," >50K") 
        return columns, income
    return input_fn

In [None]:
csv_defaults = collections.OrderedDict([
  ('age',[0]),
  ('workclass',['']),
  ('fnlwgt',[0]),
  ('education',['']),
  ('education-num',[0]),
  ('marital-status',['']),
  ('occupation',['']),
  ('relationship',['']),
  ('race',['']),
  ('sex',['']),
  ('capital-gain',[0]),
  ('capital-loss',[0]),
  ('hours-per-week',[0]),
  ('native-country',['']),
  ('income',['']),
])

In [None]:
def csv_decoder(line):
  # Convert to a list: in Python 3, dict.values() returns a view rather than a list.
  parsed = tf.decode_csv(line, list(csv_defaults.values()))
  return dict(zip(csv_defaults.keys(), parsed))

### Try the input function

In [None]:
tf.reset_default_graph()
census_input = census_input_fn(census_train_path)
training_batch = census_input()

In [None]:
with tf.Session() as sess:
    features, high_income = sess.run(training_batch)

In [None]:
print(features['education'])

In [None]:
print(features['age'])

In [None]:
print(high_income)

# 6) A Custom Estimator for a CNN

In [None]:
train,test = tf.contrib.keras.datasets.mnist.load_data()
x_train,y_train = train 
x_test,y_test = test

mnist_train_input = tf.estimator.inputs.numpy_input_fn({'x':np.array(x_train, dtype=np.float32)},
                                                       np.array(y_train,dtype=np.int32),
                                                       shuffle=True,
                                                       num_epochs=None)

mnist_test_input = tf.estimator.inputs.numpy_input_fn({'x':np.array(x_test, dtype=np.float32)},
                                                      np.array(y_test,dtype=np.int32),
                                                      shuffle=True,
                                                      num_epochs=1)


### tf.estimator.LinearClassifier

In [None]:
estimator = tf.estimator.LinearClassifier([tf.feature_column.numeric_column('x',shape=784)], 
                                          n_classes=10,
                                          model_dir="mnist/linear")
estimator.train(mnist_train_input, steps = 10000)

In [None]:
estimator.evaluate(mnist_test_input)

### Examine the results with [TensorBoard](http://0.0.0.0:6006)
$> tensorboard --logdir mnist/DNN

In [None]:
estimator = tf.estimator.DNNClassifier(hidden_units=[256],
                                       feature_columns=[tf.feature_column.numeric_column('x',shape=784)], 
                                       n_classes=10,
                                       model_dir="mnist/DNN")
estimator.train(mnist_train_input, steps = 10000)

In [None]:
estimator.evaluate(mnist_test_input)

In [None]:
# Parameters
BATCH_SIZE = 128
STEPS = 10000

## A Custom Model
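
The CNN in this section flattens its last pooling output to `7 * 7 * 64` values. As a quick sanity check on where those numbers come from, here is the shape arithmetic in plain Python. The helper names are hypothetical, and the pooling formula assumes 'valid' padding, which is the default for `tf.layers.max_pooling2d`.

```python
def conv_same_output_size(size):
    """With padding='same' and stride 1, a convolution keeps the spatial size."""
    return size


def pool_output_size(size, pool=2, stride=2):
    """Output size of 'valid' pooling: floor((size - pool) / stride) + 1."""
    return (size - pool) // stride + 1


size = 28                                              # MNIST images are 28x28
size = pool_output_size(conv_same_output_size(size))   # conv1 + pool1 -> 14
size = pool_output_size(conv_same_output_size(size))   # conv2 + pool2 -> 7

filters_in_last_conv = 64
flat_units = size * size * filters_in_last_conv
print("Flattened size: %d" % flat_units)
```

If you change the image size, the pooling, or the number of filters, the reshape before the dense layer has to change to match.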

In [None]:
def build_cnn(input_layer, mode):
    with tf.name_scope("conv1"):  
      conv1 = tf.layers.conv2d(inputs=input_layer,filters=32, kernel_size=[5, 5],
                               padding='same', activation=tf.nn.relu)

    with tf.name_scope("pool1"):  
      pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    with tf.name_scope("conv2"):  
      conv2 = tf.layers.conv2d(inputs=pool1,filters=64, kernel_size=[5, 5],
                               padding='same', activation=tf.nn.relu)

    with tf.name_scope("pool2"):  
      pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    with tf.name_scope("dense"):  
      pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
      dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

    with tf.name_scope("dropout"):  
      is_training_mode = mode == tf.estimator.ModeKeys.TRAIN
      dropout = tf.layers.dropout(inputs=dense, rate=0.4, training=is_training_mode)

    logits = tf.layers.dense(inputs=dropout, units=10)

    return logits


In [None]:
def model_fn(features, labels, mode):
  # Describing the model
  input_layer = tf.reshape(features['x'], [-1, 28, 28, 1])
    
  tf.summary.image('mnist_input',input_layer)
    
  logits = build_cnn(input_layer, mode)
 
  # Generate Predictions
  classes = tf.argmax(input=logits, axis=1)
  predictions = {
      'classes': classes,
      'probabilities': tf.nn.softmax(logits, name='softmax_tensor')
  }

  if mode == tf.estimator.ModeKeys.PREDICT:
    # Return an EstimatorSpec object
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

  with tf.name_scope('loss'):
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
  
  loss = tf.reduce_sum(loss)
  tf.summary.scalar('loss', loss)
    
  with tf.name_scope('accuracy'):
    accuracy = tf.cast(tf.equal(tf.cast(classes,tf.int32),labels),tf.float32)
  accuracy = tf.reduce_mean(accuracy)
  tf.summary.scalar('accuracy', accuracy)

  # Configure the Training Op (for TRAIN mode)
  if mode == tf.estimator.ModeKeys.TRAIN:
    train_op = tf.contrib.layers.optimize_loss(
        loss=loss,
        global_step=tf.train.get_global_step(),
        learning_rate=1e-4,
        optimizer='Adam')

    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions,
                                      loss=loss, train_op=train_op)

  # Configure the accuracy metric for evaluation
  eval_metric_ops = {
      'accuracy': tf.metrics.accuracy(
          labels=labels,
          predictions=classes)
  }

  return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions,
                                    loss=loss, eval_metric_ops=eval_metric_ops)


## Run config

In [None]:
# create estimator
run_config = tf.contrib.learn.RunConfig(model_dir='mnist/CNN')
estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)

# train for 1000 steps
# this is too few
estimator.train(input_fn=mnist_train_input, steps=1000)

# evaluate
estimator.evaluate(input_fn=mnist_test_input)

# predict
preds = estimator.predict(input_fn=mnist_test_input)

## Distributed TensorFlow: using experiments

In [None]:
# Run an experiment
from tensorflow.contrib.learn.python.learn import learn_runner

# Enable TensorFlow logs
tf.logging.set_verbosity(tf.logging.INFO)

In [None]:
# create experiment
def experiment_fn(run_config, hparams):
  # create estimator
  estimator = tf.estimator.Estimator(model_fn=model_fn,
                                     config=run_config)
  return tf.contrib.learn.Experiment(
      estimator,
      train_input_fn=mnist_train_input,
      eval_input_fn=mnist_test_input,
      train_steps=STEPS
  )

# run experiment
learn_runner.run(experiment_fn,
    run_config=run_config)

### Examine the results with [TensorBoard](http://0.0.0.0:6006)
$> tensorboard --logdir mnist/CNN