In [1]:
import tensorflow as tf

In [2]:
import numpy as np

We will need the functions below to visualize the TensorFlow graph in Jupyter.

In [3]:
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                # tensor_content is a bytes field, so convert the message to bytes
                tensor.tensor_content = tf.compat.as_bytes("<stripped %d bytes>" % size)
    return strip_def


def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

A computational graph is a series of TensorFlow operations arranged into a graph of nodes. 

Let's build a simple computational graph. 

Each node takes zero or more tensors as inputs and produces a tensor as an output. 

One type of node is a constant. Like all TensorFlow constants, it takes no inputs, and it outputs a value it stores internally. We can create two floating point Tensors node1 and node2 as follows:


In [4]:
node1 = tf.constant(3.0, tf.float32)
node2 = tf.constant(4.0) # also tf.float32 implicitly
print(node1, node2)

(<tf.Tensor 'Const:0' shape=() dtype=float32>, <tf.Tensor 'Const_1:0' shape=() dtype=float32>)


Notice that printing the nodes does not output the values 3.0 and 4.0 as you might expect. Instead, they are nodes that, when evaluated, would produce 3.0 and 4.0, respectively. To actually evaluate the nodes, we must run the computational graph within a session. A session encapsulates the control and state of the TensorFlow runtime.

In [5]:
sess = tf.InteractiveSession()
print(sess.run([node1, node2]))

[3.0, 4.0]


We can build more complicated computations by combining Tensor nodes with operations (operations are also nodes). For example, we can add our two constant nodes and produce a new graph as follows:

In [6]:
node3 = tf.add(node1, node2)
print("node3: ", node3)
print("sess.run(node3): ",sess.run(node3))

('node3: ', <tf.Tensor 'Add:0' shape=() dtype=float32>)
('sess.run(node3): ', 7.0)


In [7]:
show_graph(tf.get_default_graph().as_graph_def())

As it stands, this graph is not especially interesting because it always produces a constant result. A graph can be parameterized to accept external inputs, known as placeholders. A placeholder is a promise to provide a value later.

In [8]:
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b  # + provides a shortcut for tf.add(a, b)

We can evaluate this graph with multiple inputs by using the feed_dict parameter to specify Tensors that provide concrete values to these placeholders:

In [9]:
print(sess.run(adder_node, {a: 3, b:4.5}))
print(sess.run(adder_node, {a: [1,3], b: [2, 4]}))

7.5
[ 3.  7.]


We can make the computational graph more complex by adding another operation. For example,

In [10]:
add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b:4.5}))

22.5
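
To make the idea of deferred evaluation concrete, here is a minimal pure-Python sketch of a graph with constants, placeholders and operations. It is only an analogy: the class names (`Constant`, `Placeholder`, `Add`, `Multiply`) and the `run` helper are hypothetical and are not TensorFlow's API or implementation. Building the nodes computes nothing; values appear only when `run` is called with a feed dictionary.

```python
class Constant:
    """A node with no inputs that outputs a stored value."""
    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value


class Placeholder:
    """A promise to provide a value later, through the feed dictionary."""
    def evaluate(self, feed):
        return feed[self]  # raises KeyError if no value was fed


class Add:
    """An operation node whose inputs are two other nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)


class Multiply:
    """An operation node that multiplies the outputs of two nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)


def run(node, feed=None):
    """Hypothetical stand-in for sess.run: evaluate a node given fed values."""
    return node.evaluate(feed or {})


# Building the graph: nothing is computed here.
a = Placeholder()
b = Placeholder()
adder = Add(a, b)
add_and_triple = Multiply(adder, Constant(3.0))

# Running the graph: values are computed only now.
result = run(add_and_triple, {a: 3, b: 4.5})
print(result)  # 22.5
```

The same graph object can be run again with different fed values, which is what makes placeholders useful for feeding data into a model.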


In [11]:
show_graph(tf.get_default_graph().as_graph_def())

In machine learning we typically want a model that can take arbitrary inputs, such as the one above. To make the model trainable, we need to be able to modify the graph so that it produces new outputs for the same input. Variables allow us to add trainable parameters to a graph. They are constructed with a type and an initial value.

# Linear regression
Let us reset the current graph, because we won't need it anymore.

In [12]:
from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.InteractiveSession()

In [13]:
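# dtype must be passed by keyword: the second positional argument of tf.Variable is `trainable`
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)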
x = tf.placeholder(tf.float32)
linear_model = W * x + b

Constants are initialized when you call tf.constant, and their value can never change. By contrast, variables are not initialized when you call tf.Variable. To initialize all the variables in a TensorFlow program, you must explicitly call a special operation as follows:

In [14]:
show_graph(tf.get_default_graph().as_graph_def())

In [15]:
init = tf.global_variables_initializer()
sess.run(init)

In [16]:
show_graph(tf.get_default_graph().as_graph_def())

It is important to realize init is a handle to the TensorFlow sub-graph that initializes all the global variables. Until we call sess.run, the variables are uninitialized.

Since x is a placeholder, we can evaluate linear_model for several values of x simultaneously as follows:

In [17]:
print(sess.run(linear_model, {x:[1,2,3,4]}))

[ 0.          0.30000001  0.60000002  0.90000004]


We've created a model, but we don't know how good it is yet. To evaluate the model on training data, we need a y placeholder to provide the desired values, and we need to write a loss function.

A loss function measures how far apart the current model is from the provided data. We'll use a standard loss model for linear regression, which sums the squares of the deltas between the current model and the provided data. linear_model - y creates a vector where each element is the corresponding example's error delta. We call tf.square to square that error. Then, we sum all the squared errors to create a single scalar that abstracts the error of all examples using tf.reduce_sum:

In [18]:
y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
print(sess.run(squared_deltas, {x:[1,2,3,4], y:[0,-1,-2,-3]}))


[  0.           1.68999982   6.75999928  15.21000099]


In [19]:
show_graph(tf.get_default_graph().as_graph_def())

In [20]:
loss = tf.reduce_sum(squared_deltas)
print(sess.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))

23.66
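
As a sanity check on the value above, the same sum of squared deltas can be computed by hand. This is a plain-Python sketch with no TensorFlow involved; the helper name `sum_squared_error` is an illustrative assumption, not part of any library.

```python
def sum_squared_error(W, b, xs, ys):
    """Sum of squared deltas between the linear model W*x + b and the targets."""
    predictions = [W * x + b for x in xs]
    deltas = [p - y for p, y in zip(predictions, ys)]
    return sum(d * d for d in deltas)


x_data = [1, 2, 3, 4]
y_data = [0, -1, -2, -3]

# Initial parameters from the notebook: W = 0.3, b = -0.3
# deltas are 0, 1.3, 2.6, 3.9; squares are 0, 1.69, 6.76, 15.21
loss_value = sum_squared_error(0.3, -0.3, x_data, y_data)
print(round(loss_value, 2))  # 23.66
```

The function can score any candidate pair of parameters, which is what an optimizer exploits when it searches for better values.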


In [21]:
show_graph(tf.get_default_graph().as_graph_def())

We could improve this manually by reassigning W and b to the optimal values of -1 and 1, for which the model fits the data exactly. A variable is initialized to the value provided to tf.Variable but can be changed using operations like tf.assign. We can change W and b accordingly:

In [22]:
fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])
sess.run([fixW, fixb])

[array([-1.], dtype=float32), array([ 1.], dtype=float32)]

In [23]:
print(sess.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))

0.0


In [24]:
show_graph(tf.get_default_graph().as_graph_def())

# tf.train API

A complete discussion of machine learning is out of the scope of this tutorial. However, TensorFlow provides optimizers that slowly change each variable in order to minimize the loss function. The simplest optimizer is gradient descent. It modifies each variable according to the magnitude of the derivative of loss with respect to that variable. In general, computing symbolic derivatives manually is tedious and error-prone. Consequently, TensorFlow can automatically produce derivatives given only a description of the model using the function tf.gradients. For simplicity, optimizers typically do this for you. For example,

In [25]:
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

In [26]:
show_graph(tf.get_default_graph().as_graph_def())

In [27]:
sess.run(init) # reset values to incorrect defaults.
for i in range(1000):
  sess.run(train, {x:[1,2,3,4], y:[0,-1,-2,-3]})
  if (i%100)==1: print(sess.run([W, b]))

print("Final:", sess.run([W, b]))

[array([-0.39679998], dtype=float32), array([-0.49552], dtype=float32)]
[array([-0.8445884], dtype=float32), array([ 0.54307097], dtype=float32)]
[array([-0.95341456], dtype=float32), array([ 0.86303312], dtype=float32)]
[array([-0.98603582], dtype=float32), array([ 0.95894355], dtype=float32)]
[array([-0.99581414], dtype=float32), array([ 0.98769313], dtype=float32)]
[array([-0.99874526], dtype=float32), array([ 0.99631089], dtype=float32)]
[array([-0.99962395], dtype=float32), array([ 0.99889427], dtype=float32)]
[array([-0.99988729], dtype=float32), array([ 0.9996686], dtype=float32)]
[array([-0.9999662], dtype=float32), array([ 0.99990064], dtype=float32)]
[array([-0.99998981], dtype=float32), array([ 0.99997008], dtype=float32)]
('Final:', [array([-0.9999969], dtype=float32), array([ 0.99999082], dtype=float32)])
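
To show what one optimizer step amounts to for this model, here is a plain-Python sketch of gradient descent with hand-derived gradients. It is an illustration under stated assumptions, not how TensorFlow implements `GradientDescentOptimizer`: the function name `gradient_descent` is hypothetical, and the derivatives are written out manually for the loss sum((W*x + b - y)^2) instead of being produced automatically.

```python
def gradient_descent(xs, ys, W=0.3, b=-0.3, learning_rate=0.01, steps=1000):
    """Minimize sum((W*x + b - y)^2) with plain gradient descent."""
    history = []
    for _ in range(steps):
        errors = [W * x + b - y for x, y in zip(xs, ys)]
        # d(loss)/dW = sum(2 * error * x), d(loss)/db = sum(2 * error)
        grad_W = sum(2 * e * x for e, x in zip(errors, xs))
        grad_b = sum(2 * e for e in errors)
        # Move each parameter a small step against its gradient.
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
        history.append((W, b))
    return W, b, history


final_W, final_b, history = gradient_descent([1, 2, 3, 4], [0, -1, -2, -3])
print(history[1])        # parameters after two steps
print(final_W, final_b)  # close to -1 and 1
```

With the same starting values and learning rate as the notebook, the parameters after two steps agree with the first printed line above (up to floating-point precision), and the final values approach W = -1 and b = 1.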


Evaluation

In [28]:
x_train = [1,2,3,4]
y_train = [0,-1,-2,-3]
curr_W, curr_b, curr_loss  = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

W: [-0.9999969] b: [ 0.99999082] loss: 5.69997e-11


# tf.contrib.learn

tf.contrib.learn is a high-level TensorFlow library that simplifies the mechanics of machine learning.

To define a custom model that works with tf.contrib.learn, we need to use tf.contrib.learn.Estimator.


Instead of sub-classing Estimator, we simply provide Estimator a function model_fn that tells tf.contrib.learn how it can evaluate predictions, training steps, and loss. The code is as follows:

In [29]:
ops.reset_default_graph()
sess = tf.InteractiveSession()

In [30]:
def model(features, labels, mode):
  # Build a linear model and predict values
  W = tf.get_variable("W", [1], dtype=tf.float64)
  b = tf.get_variable("b", [1], dtype=tf.float64)
  y = W*features['x'] + b
  # Loss sub-graph
  loss = tf.reduce_sum(tf.square(y - labels))
  # Training sub-graph
  global_step = tf.train.get_global_step()
  optimizer = tf.train.GradientDescentOptimizer(0.01)
  train = tf.group(optimizer.minimize(loss),
                   tf.assign_add(global_step, 1))
  # ModelFnOps connects subgraphs we built to the
  # appropriate functionality.
  return tf.contrib.learn.ModelFnOps(
      mode=mode, predictions=y,
      loss=loss,
      train_op=train)
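
The division of labour between a model function and the object that drives it can be sketched in plain Python. This is a toy analogue only: `ToyEstimator`, `toy_model_fn` and the `state` dictionary are hypothetical names invented for illustration, and nothing here reflects how tf.contrib.learn is actually implemented. The point is that the model function only describes predictions, loss and a training step, while the estimator decides when to call them.

```python
def toy_model_fn(features, labels, state):
    """Describe predictions, loss and one training step for a linear model."""
    W, b = state["W"], state["b"]
    predictions = [W * x + b for x in features["x"]]
    errors = [p - y for p, y in zip(predictions, labels)]
    loss = sum(e * e for e in errors)

    def train_op(learning_rate=0.01):
        # One gradient descent step, plus the global step increment.
        state["W"] -= learning_rate * sum(2 * e * x for e, x in zip(errors, features["x"]))
        state["b"] -= learning_rate * sum(2 * e for e in errors)
        state["global_step"] += 1

    return {"predictions": predictions, "loss": loss, "train_op": train_op}


class ToyEstimator:
    """Hypothetical driver that owns the state and the training loop."""
    def __init__(self, model_fn):
        self.model_fn = model_fn
        self.state = {"W": 0.0, "b": 0.0, "global_step": 0}

    def fit(self, features, labels, steps):
        for _ in range(steps):
            self.model_fn(features, labels, self.state)["train_op"]()
        return self

    def evaluate(self, features, labels):
        ops = self.model_fn(features, labels, self.state)
        return {"loss": ops["loss"], "global_step": self.state["global_step"]}


toy = ToyEstimator(toy_model_fn)
toy.fit({"x": [1., 2., 3., 4.]}, [0., -1., -2., -3.], steps=1000)
metrics = toy.evaluate({"x": [1., 2., 3., 4.]}, [0., -1., -2., -3.])
print(metrics)
```

Because the training loop lives in the driver, the same model function can be reused for fitting and for evaluation without modification.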

In [31]:
estimator = tf.contrib.learn.Estimator(model_fn=model)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_tf_random_seed': None, '_task_type': None, '_environment': 'local', '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f55034f6110>, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_task_id': 0, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_evaluation_master': '', '_keep_checkpoint_every_n_hours': 10000, '_master': ''}


In [32]:
# define our data set
x = np.array([1., 2., 3., 4.])
y = np.array([0., -1., -2., -3.])

In [33]:
input_fn = tf.contrib.learn.io.numpy_input_fn({"x": x}, y, 4, num_epochs=1000)

In [34]:
# train
estimator.fit(input_fn=input_fn, steps=1000)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmp03B8v7/model.ckpt.
INFO:tensorflow:loss = 27.2766532376, step = 1
INFO:tensorflow:global_step/sec: 1049.54
INFO:tensorflow:loss = 0.0404314733066, step = 101
INFO:tensorflow:global_step/sec: 1627.68
INFO:tensorflow:loss = 0.00347074509316, step = 201
INFO:tensorflow:global_step/sec: 1631.14
INFO:tensorflow:loss = 0.000177336507735, step = 301
INFO:tensorflow:global_step/sec: 1485.34
INFO:tensorflow:loss = 0.000105954684508, step = 401
INFO:tensorflow:global_step/sec: 1656.18
INFO:tensorflow:loss = 5.13863812648e-06, step = 501
INFO:tensorflow:global_step/sec: 1569.29
INFO:tensorflow:loss = 7.59447228565e-07, step = 601
INFO:tensorflow:global_step/sec: 1646.68
INFO:tensorflow:loss = 5.53801360607e-08, step = 701
INFO:tensorflow:global_step/sec: 1728.61
INFO:tensorflow:loss = 4.62455548089e-09, step = 801
INFO:tensorflow:global_step/sec: 1748.19
INFO:tensorflow:loss = 4.61378244361e-10, step

Estimator(params=None)

In [35]:
# evaluate our model
print(estimator.evaluate(input_fn=input_fn, steps=10))

INFO:tensorflow:Starting evaluation at 2017-03-14-20:34:12
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Evaluation [10/10]
INFO:tensorflow:Finished evaluation at 2017-03-14-20:34:12
INFO:tensorflow:Saving dict for global step 1000: global_step = 1000, loss = 3.48098e-11
{'loss': 3.4809836e-11, 'global_step': 1000}
