copied from https://www.oreilly.com/learning/hello-tensorflow

In [1]:
import tensorflow as tf

At this point TensorFlow has already started managing a lot of state for us. There's already an implicit default graph, for example. Internally, the default graph lives in the `_default_graph_stack`, but we don't have access to that directly. We use `tf.get_default_graph()`.

In [2]:
graph = tf.get_default_graph()

The nodes of the TensorFlow graph are called “operations,” or “ops.” We can see what operations are in the graph with `graph.get_operations()`.

In [3]:
graph.get_operations()

[]

Currently, there isn't anything in the graph. We’ll need to put everything we want TensorFlow to compute into that graph. Let's start with a simple constant input value of one.

In [4]:
input_value = tf.constant(1.0)

That constant now lives as a node, an operation, in the graph. The Python variable name `input_value` refers indirectly to that operation, but we can also find the operation in the default graph.

In [5]:
operations = graph.get_operations()
operations

[<tensorflow.python.framework.ops.Operation at 0x118037390>]

In [6]:
operations[0].node_def

name: "Const"
op: "Const"
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "value"
  value {
    tensor {
      dtype: DT_FLOAT
      tensor_shape {
      }
      float_val: 1.0
    }
  }
}

TensorFlow can do a lot of great things, but it can only work with what's been explicitly given to it. This is true even for a single constant.

If we inspect our `input_value`, we see it is a constant 32-bit float tensor of no dimension: just one number.

In [7]:
input_value

<tf.Tensor 'Const:0' shape=() dtype=float32>

Note that this doesn't tell us what that number is. To evaluate `input_value` and get a numerical value out, we need to create a “session” where graph operations can be evaluated and then explicitly ask to evaluate or “run” `input_value`. (Unless told otherwise, the session uses the default graph.)

In [8]:
sess = tf.Session()
sess.run(input_value)

1.0

Now that we have a session with a simple graph, let's build a neuron with just one parameter, or weight. Often, even simple neurons also have a bias term and a non-identity activation function, but we'll leave these out.

The neuron's weight isn't going to be constant; we expect it to change in order to learn based on the “true” input and output we use for training. The weight will be a TensorFlow variable. We'll give that variable a starting value of 0.8.

In [9]:
weight = tf.Variable(0.8)

You might expect that adding a variable would add one operation to the graph, but in fact that one line adds four operations. We can check all the operation names:

In [10]:
for op in graph.get_operations(): 
    print(op.name)

Const
Variable/initial_value
Variable
Variable/Assign
Variable/read


We won't want to follow every operation individually for long, but it will be nice to see at least one that feels like a real computation.

In [11]:
output_value = weight * input_value

Now there are six operations in the graph, and the last one is that multiplication.

In [12]:
op = graph.get_operations()[-1]
op.name

'mul'

In [13]:
for op_input in op.inputs: 
    print(op_input)

Tensor("Variable/read:0", shape=(), dtype=float32)
Tensor("Const:0", shape=(), dtype=float32)


This shows how the multiplication operation tracks where its inputs come from: they come from other operations in the graph. To understand a whole graph, following references this way quickly becomes tedious for humans. TensorBoard graph visualization is designed to help.
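
To make the idea of following references concrete, here is a toy sketch in plain Python. It does not use TensorFlow: `ToyOp` and `upstream_names` are hypothetical names invented for this illustration, and the wiring of `Variable/read` to `Variable` is an assumption about how the variable's ops connect.

```python
class ToyOp:
    """A hypothetical stand-in for a graph operation: a name plus its input ops."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def upstream_names(op):
    """Return the names of every op that `op` depends on, in depth-first order."""
    names = []
    for parent in op.inputs:
        names.append(parent.name)
        names.extend(upstream_names(parent))
    return names


# Rebuild a small part of the graph from the text by hand (assumed wiring).
const = ToyOp('Const')
variable = ToyOp('Variable')
read = ToyOp('Variable/read', inputs=[variable])
mul = ToyOp('mul', inputs=[read, const])

print(upstream_names(mul))  # ['Variable/read', 'Variable', 'Const']
```

Even in this tiny graph, answering "what does `mul` depend on?" takes a recursive walk, which is the bookkeeping a graph diagram does for us.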

How do we find out what the product is? We have to “run” the `output_value` operation. But that operation depends on a variable: `weight`. We told TensorFlow that the initial value of `weight` should be 0.8, but the value hasn't yet been set in the current session. The `tf.initialize_all_variables()` function generates an operation which will initialize all our variables (in this case just one) and then we can run that operation.

In [14]:
init = tf.initialize_all_variables()
sess.run(init)

The result of `tf.initialize_all_variables()` will include initializers for all the variables currently in the graph, so if you add more variables you'll want to use `tf.initialize_all_variables()` again; a stale init wouldn't include the new variables.

Now we're ready to run the `output_value` operation.

In [15]:
sess.run(output_value)

0.80000001
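
The trailing digits are not an arithmetic error. The tensors here are 32-bit floats, and 0.8 has no exact binary representation at that width. A minimal sketch in plain Python (no TensorFlow needed; the helper name `to_float32` is made up for this illustration) rounds 0.8 to the nearest 32-bit float and prints it back:

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack('f', struct.pack('f', value))[0]


stored = to_float32(0.8)
print(repr(stored))     # 0.800000011920929
print('%.8f' % stored)  # 0.80000001
```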

Up to this point, the graph has been simple, but it would already be nice to see it represented in a diagram. We'll use TensorBoard to generate that diagram. TensorBoard reads the name field that is stored inside each operation (quite distinct from Python variable names). We can set these TensorFlow names explicitly and, at the same time, switch to more conventional Python variable names. Using `tf.mul` here is equivalent to our earlier use of just `*` for multiplication, but it lets us set the name for the operation.

In [16]:
x = tf.constant(1.0, name='input')
w = tf.Variable(0.8, name='weight')
y = tf.mul(w, x, name='output')

TensorBoard works by looking at a directory of output created from TensorFlow sessions. We can write this output with a `SummaryWriter`, and if we do nothing aside from creating one with a graph, it will just write out that graph.

The first argument when creating the `SummaryWriter` is an output directory name, which will be created if it doesn't exist.

In [17]:
summary_writer = tf.train.SummaryWriter('log_simple_graph', sess.graph)

Now, at the command line, we can start up TensorBoard:
`tensorboard --logdir=log_simple_graph`

TensorBoard runs as a local web app on port 6006. (“6006” is “goog” upside-down.) If you open `localhost:6006/#graphs` in a browser, you should see a diagram of the graph you created in TensorFlow.

Now that we’ve built our neuron, how does it learn? We set up an input value of 1.0. Let's say the correct output value is zero. That is, we have a very simple “training set” of just one example with one feature, which has the value one, and one label, which is zero. We want the neuron to learn the function taking one to zero.

Currently, the system takes the input one and returns 0.8, which is not correct. We need a way to measure how wrong the system is. We'll call that measure of wrongness the “loss” and give our system the goal of minimizing the loss. A loss that can be negative could be pushed down without limit, so minimizing it would be meaningless. Instead, let's make the loss the square of the difference between the current output and the desired output, which is never negative and is zero only when the output is exactly right.

In [18]:
y_ = tf.constant(0.0)
loss = (y - y_)**2
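
As a quick sanity check on that choice of loss, the same formula can be tried in plain Python. This is only a sketch outside TensorFlow; `squared_error` is a hypothetical helper name, not part of the notebook.

```python
def squared_error(output, target):
    """Squared difference between the current output and the desired output."""
    return (output - target) ** 2


# Overshooting and undershooting by the same amount are penalized equally,
# and the loss is zero only when the output matches the target.
for output in (0.8, 0.0, -0.8):
    print(output, squared_error(output, 0.0))
```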

So far, nothing in the graph does any learning. For that, we need an optimizer. We'll use a gradient descent optimizer so that we can update the weight based on the derivative of the loss. The optimizer takes a learning rate to moderate the size of the updates, which we'll set at 0.025.

In [19]:
optim = tf.train.GradientDescentOptimizer(learning_rate=0.025)

The optimizer is remarkably clever. It can automatically work out and apply the appropriate gradients through a whole network, carrying out the backward step for learning.

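Let's see what the gradient looks like for our simple example. `compute_gradients` returns one (gradient, variable) pair for each variable in the graph. The first pair belongs to the unnamed variable we created earlier, which the loss doesn't depend on, so the gradient we want is in the second pair, at index 1.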

In [20]:
grads_and_vars = optim.compute_gradients(loss)
sess.run(tf.initialize_all_variables())
sess.run(grads_and_vars[1][0])

1.6

Why is the value of the gradient 1.6? Our loss is error squared, and the derivative of that is two times the error. Currently the system says 0.8 instead of 0, so the error is 0.8, and two times 0.8 is 1.6. It's working!
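
The same number can be checked by hand in plain Python. The sketch below is not how TensorFlow computes gradients internally; it compares the chain-rule formula with a finite-difference estimate, and the function names are invented for this illustration.

```python
def loss(w, x=1.0, target=0.0):
    """Squared error of the one-weight neuron: (w * x - target) ** 2."""
    return (w * x - target) ** 2


def analytic_gradient(w, x=1.0, target=0.0):
    """Chain rule: d/dw (w*x - target)^2 = 2 * (w*x - target) * x."""
    return 2.0 * (w * x - target) * x


def numeric_gradient(w, eps=1e-6):
    """Central-difference estimate of d(loss)/dw, as an independent check."""
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)


print(analytic_gradient(0.8))  # two times the error of 0.8
print(numeric_gradient(0.8))   # should agree to several decimal places
```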

For more complex systems, it will be very nice indeed that TensorFlow calculates and then applies these gradients for us automatically.

Let's apply the gradient, finishing the backpropagation.

In [21]:
sess.run(optim.apply_gradients(grads_and_vars))
sess.run(w)

0.75999999

The weight decreased by 0.04 because the optimizer subtracted the gradient times the learning rate, 1.6 * 0.025, pushing the weight in the right direction.
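
Written out as plain arithmetic, one gradient descent update looks like the sketch below. The variable names are chosen for this illustration only, and the values are the ones from the text.

```python
learning_rate = 0.025
w = 0.8

# Gradient of (w * 1.0 - 0.0) ** 2 with respect to w.
gradient = 2.0 * w

# Gradient descent moves the weight against the gradient.
w_new = w - learning_rate * gradient

print('step size: %.3f' % (learning_rate * gradient))
print('new weight: %.2f' % w_new)
```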

Instead of hand-holding the optimizer like this, we can make one operation that calculates and applies the gradients: the `train_step`.

In [22]:
train_step = tf.train.GradientDescentOptimizer(0.025).minimize(loss)
for i in range(100):
    sess.run(train_step)
    
sess.run(y)

0.0044996012
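
That result can be reproduced without TensorFlow. The weight was already 0.76 when the loop started, because the session had applied one update by hand, and each further step multiplies the weight by 1 - 2 * 0.025 = 0.95. The sketch below assumes the same input (1.0), target (0.0), and learning rate as the text; `train` is a hypothetical helper name.

```python
def train(w, steps, learning_rate=0.025, x=1.0, target=0.0):
    """Run plain gradient descent on (w * x - target) ** 2 and return the weight."""
    for _ in range(steps):
        gradient = 2.0 * (w * x - target) * x
        w = w - learning_rate * gradient
    return w


w_after = train(0.76, 100)
closed_form = 0.76 * 0.95 ** 100  # each step scales the weight by 0.95

print('%.6f' % w_after)
print('%.6f' % closed_form)
```

Because the input is 1.0, the output `y` equals the weight, so this is also the value the neuron predicts after training.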

We may be interested in what's happening during training. Say we want to follow what our system is predicting at every training step. We could print from inside the training loop.

In [23]:
sess.run(tf.initialize_all_variables())
for i in range(4):
    print('before step {}, y is {}'.format(i, sess.run(y)))
    sess.run(train_step)

before step 0, y is 0.800000011920929
before step 1, y is 0.7599999904632568
before step 2, y is 0.722000002861023
before step 3, y is 0.6858999729156494


This works, but there are some problems. It's hard to understand a list of numbers. A plot would be better. And even with only one value to monitor, there's too much output to read. We're likely to want to monitor many things. It would be nice to record everything in some organized way.

Luckily, the same system that we used earlier to visualize the graph also has just the mechanisms we need.

We instrument the computation graph by adding operations that summarize its state. Here, we'll create an operation that reports the current value of `y`, the neuron's current output.

In [24]:
summary_y = tf.scalar_summary('output', y)

When you run a summary operation, it returns a serialized protocol buffer string that can be written to a log directory with a `SummaryWriter`.

In [29]:
summary_writer = tf.train.SummaryWriter('log_simple_graph')
sess.run(tf.initialize_all_variables())
for i in range(100):
    summary_str = sess.run(summary_y)
    summary_writer.add_summary(summary_str, i)
    sess.run(train_step)

Now, after running `tensorboard --logdir=log_simple_graph`, you get an interactive plot at `localhost:6006/#events`.