**Eager execution** is an imperative interface to TensorFlow that allows a more dynamic programming experience. It greatly simplifies how you write and debug models by softening the strict separation between the *definition* of operations and their *execution* that the standard TensorFlow interface imposes.

Eager execution is a flexible machine learning platform for research and experimentation, providing:
- An intuitive interface: structure your code naturally and use Python data structures. Quickly iterate on small models and small data.
- Easier debugging: call ops directly to inspect running models and test changes. Use standard Python debugging tools for immediate error reporting.
- Natural control flow: use Python control flow instead of graph control flow, simplifying the specification of dynamic models.

# Static and dynamic graph computation

To understand the need for eager execution, consider a simple example:

In [0]:
import tensorflow as tf
a = tf.constant(3.0)
b = a + 2.0
print(b)

If you have never worked with the low-level components of TensorFlow, you probably expected `print` to show the value of `b` (5.0) at this point. Instead, it shows only a symbolic description of the tensor, because `b` is a node in a computation graph that has not been executed yet. To fetch its value, we have to run the operation inside a `Session` object:

In [0]:
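# Running the graph inside a session is what actually computes b.
# The context manager also closes the session when the block exits.
with tf.Session() as sess:
  print(sess.run(b))  # 5.0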
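
To make the define-then-run behaviour concrete without depending on TensorFlow, here is a minimal pure-Python sketch of deferred evaluation. The names `Node`, `constant`, and `run` are hypothetical and only mimic the idea; they are not TensorFlow APIs, and a real graph runtime does far more than this.

```python
class Node:
    """A symbolic operation: it records how to compute a value, but does not compute it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        # Wrap plain Python numbers so that `node + 2.0` builds a new graph node.
        other = other if isinstance(other, Node) else constant(other)
        return Node("add", (self, other))

    def __repr__(self):
        return "Node(op={!r})".format(self.op)


def constant(value):
    return Node("const", value=value)


def run(node):
    """Evaluate the graph rooted at `node` (the role played by `Session.run`)."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        return run(node.inputs[0]) + run(node.inputs[1])
    raise ValueError("unknown op: {}".format(node.op))


a = constant(3.0)
b = a + 2.0
print(b)       # Node(op='add'): a description of the computation, not a number
print(run(b))  # 5.0: the value is computed only when the graph is run
```

With eager execution, the first `print` would already show the value, because each operation is executed as soon as it is called.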

# Enabling eager execution

Eager execution was introduced experimentally in TensorFlow v1.5, but to use all of its features we need to install the latest version (v1.7rc0 as of this writing).

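Eager is enabled with a single line. It must run at program startup, before any other TensorFlow operation is created, so restart the Python runtime first if you executed the graph-mode cells above: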

In [0]:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
import numpy as np
tfe.enable_eager_execution()

## Variables and gradients with eager execution

### Eager variables and NumPy compatibility

Before going straight to the model definition, let us see how tensors, variables, and gradients are handled under eager execution. Eager comes with its own implementation of variables (`tfe.Variable`, used in the gradient examples below), which are initialized as soon as they are created.

Eager execution also works nicely with NumPy. NumPy operations accept `tf.Tensor` arguments. TensorFlow math operations convert Python objects and NumPy arrays to `tf.Tensor` objects. The `tf.Tensor.numpy` method returns the object's value as a NumPy `ndarray`:

In [0]:
a = tf.constant([[1, 2],
                 [3, 4]])
print(a)

In [0]:
# Broadcasting support
b = tf.add(a, 1)
print(b)

In [0]:
# Operator overloading is supported
print(a * b)

In [0]:
# Use NumPy values
import numpy as np

c = np.multiply(a, b)
print(c)

In [0]:
# Obtain numpy value from a tensor:
print(a.numpy())

### Dynamic control flow

A major benefit of eager execution is that all the functionality of the host language is available while your model is executing. For example, it is easy to write FizzBuzz:

In [0]:
def fizzbuzz(max_num):
    counter = tf.constant(0)
    max_num = tf.convert_to_tensor(max_num)
    # Count from 1 to max_num inclusive, so that 0 is not reported as 'FizzBuzz'
    for num in range(1, max_num.numpy() + 1):
        num = tf.constant(num)
        if int(num % 3) == 0 and int(num % 5) == 0:
            print('FizzBuzz')
        elif int(num % 3) == 0:
            print('Fizz')
        elif int(num % 5) == 0:
            print('Buzz')
        else:
            print(num)
        counter += 1
    return counter

This has conditionals that depend on tensor values and it prints these values at runtime.

### Computing gradients in eager

Automatic differentiation is useful for implementing machine learning algorithms such as backpropagation for training neural networks. During eager execution, use `tf.GradientTape` (accessed as `tfe.GradientTape` in the code below) to trace operations for computing gradients later.

`tf.GradientTape` is an opt-in feature, so that no tracing overhead is paid when gradients are not needed. Since different operations can occur during each call, all forward-pass operations are recorded to a "tape". To compute the gradient, the tape is played backwards and then discarded. By default, a particular `tf.GradientTape` can compute only one gradient; subsequent calls throw a runtime error.

In [0]:
w = tfe.Variable([[1.0]])
with tfe.GradientTape() as tape:
    loss = w * w

# d(w * w)/dw = 2 * w, which is 2.0 at w = 1.0
grad = tape.gradient(loss, [w])
print(grad)
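
The record-then-replay idea can be sketched in a few lines of plain Python. This is only an illustration under strong assumptions: it supports a single operation (multiplication) on scalars, and the `Value` and `Tape` classes are hypothetical stand-ins, not how TensorFlow implements its tape.

```python
class Value:
    """Hypothetical scalar wrapper; stands in for a tensor or variable."""

    def __init__(self, data):
        self.data = data


class Tape:
    """Hypothetical tape that records multiplications during the forward pass."""

    def __init__(self):
        self.records = []  # (output, left, right) triples in forward order

    def mul(self, x, y):
        out = Value(x.data * y.data)
        self.records.append((out, x, y))
        return out

    def gradient(self, target, sources):
        if self.records is None:
            raise RuntimeError("this tape has already been used")
        grads = {id(target): 1.0}
        # Play the tape backwards, applying the product rule to each record.
        for out, x, y in reversed(self.records):
            g = grads.get(id(out), 0.0)
            grads[id(x)] = grads.get(id(x), 0.0) + g * y.data
            grads[id(y)] = grads.get(id(y), 0.0) + g * x.data
        self.records = None  # discard the tape after a single use
        return [grads.get(id(s), 0.0) for s in sources]


w = Value(1.0)
tape = Tape()
loss = tape.mul(w, w)
print(tape.gradient(loss, [w]))  # [2.0], since d(w*w)/dw = 2*w
```

Because the tape stores only what actually ran, ordinary Python control flow in the forward pass needs no special handling: whichever branch executed is the one that gets differentiated.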

Here's an example of tf.GradientTape that records forward-pass operations to train a simple model:

In [0]:

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 1000
training_inputs = tf.random_normal([NUM_EXAMPLES])
noise = tf.random_normal([NUM_EXAMPLES])
training_outputs = training_inputs * 3 + 2 + noise

# `inputs` rather than `input`, to avoid shadowing the Python built-in
def prediction(inputs, weight, bias):
  return inputs * weight + bias

# A loss function using mean-squared error
def loss(weights, biases):
  error = prediction(training_inputs, weights, biases) - training_outputs
  return tf.reduce_mean(tf.square(error))

# Return the derivative of loss with respect to weight and bias
def grad(weights, biases):
  with tfe.GradientTape() as tape:
    loss_value = loss(weights, biases)
  return tape.gradient(loss_value, [weights, biases])

train_steps = 200
learning_rate = 0.01
# Start with arbitrary values for W and B on the same batch of data
W = tfe.Variable(5.)
B = tfe.Variable(10.)

print("Initial loss: {:.3f}".format(loss(W, B)))

for i in range(train_steps):
  dW, dB = grad(W, B)
  W.assign_sub(dW * learning_rate)
  B.assign_sub(dB * learning_rate)
  if i % 20 == 0:
    print("Loss at step {:03d}: {:.3f}".format(i, loss(W, B)))

print("Final loss: {:.3f}".format(loss(W, B)))
print("W = {}, B = {}".format(W.numpy(), B.numpy()))
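
The gradients returned by the tape for this model can be cross-checked by hand. For the mean-squared error `loss = mean((w * x + b - y) ** 2)`, the derivatives are `d loss / d w = mean(2 * (w * x + b - y) * x)` and `d loss / d b = mean(2 * (w * x + b - y))`. The following dependency-free sketch runs the same kind of training loop with these analytic gradients. The seed and the use of Python's `random` module are assumptions made for illustration, so the numbers will not match the TensorFlow run exactly.

```python
import random

random.seed(0)  # arbitrary seed, chosen only to make the run repeatable

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 1000
xs = [random.gauss(0.0, 1.0) for _ in range(NUM_EXAMPLES)]
ys = [3.0 * x + 2.0 + random.gauss(0.0, 1.0) for x in xs]


def loss(w, b):
    """Mean-squared error of the linear model w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / NUM_EXAMPLES


def grad(w, b):
    """Analytic derivatives of the loss with respect to w and b."""
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    dw = sum(2.0 * e * x for e, x in zip(errors, xs)) / NUM_EXAMPLES
    db = sum(2.0 * e for e in errors) / NUM_EXAMPLES
    return dw, db


w, b = 5.0, 10.0
initial_loss = loss(w, b)
for _ in range(200):
    dw, db = grad(w, b)
    w -= 0.01 * dw
    b -= 0.01 * db

print("Initial loss: {:.3f}".format(initial_loss))
print("Final loss: {:.3f}".format(loss(w, b)))
print("w = {:.3f}, b = {:.3f}".format(w, b))
```

The point of the tape is that these derivative formulas never have to be written out: changing the model or the loss requires no change to the gradient code.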

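`tfe.Variable` objects store mutable `tf.Tensor` values accessed during training to make automatic differentiation easier. The parameters of a model can be encapsulated in classes as variables, as in the following `tf.keras.Model` subclass, which is trained with a built-in optimizer instead of manual updates: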

In [0]:

class Model(tf.keras.Model):
  def __init__(self):
    super(Model, self).__init__()
    self.W = tfe.Variable(5., name='weight')
    self.B = tfe.Variable(10., name='bias')
  def call(self, inputs):
    return inputs * self.W + self.B

# A toy dataset of points around 3 * x + 2
NUM_EXAMPLES = 2000
training_inputs = tf.random_normal([NUM_EXAMPLES])
noise = tf.random_normal([NUM_EXAMPLES])
training_outputs = training_inputs * 3 + 2 + noise

# The loss function to be optimized
def loss(model, inputs, targets):
  error = model(inputs) - targets
  return tf.reduce_mean(tf.square(error))

def grad(model, inputs, targets):
  with tfe.GradientTape() as tape:
    loss_value = loss(model, inputs, targets)
  return tape.gradient(loss_value, [model.W, model.B])

# Define:
# 1. A model.
# 2. Derivatives of a loss function with respect to model parameters.
# 3. A strategy for updating the variables based on the derivatives.
model = Model()
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

print("Initial loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))

# Training loop
for i in range(300):
  grads = grad(model, training_inputs, training_outputs)
  optimizer.apply_gradients(zip(grads, [model.W, model.B]),
                            global_step=tf.train.get_or_create_global_step())
  if i % 20 == 0:
    print("Loss at step {:03d}: {:.3f}".format(i, loss(model, training_inputs, training_outputs)))

print("Final loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))
print("W = {}, B = {}".format(model.W.numpy(), model.B.numpy()))
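
Conceptually, the update that plain gradient descent applies to each `(gradient, variable)` pair is the same `assign_sub` step written by hand in the earlier loop. The sketch below illustrates that idea in plain Python. `Variable` and `SimpleGradientDescent` are hypothetical classes written for this example, and the gradient values passed in are made up; this is not the implementation of `tf.train.GradientDescentOptimizer`.

```python
class Variable:
    """Hypothetical stand-in for a mutable model parameter."""

    def __init__(self, value):
        self.value = value


class SimpleGradientDescent:
    """Hypothetical optimizer: subtracts learning_rate * gradient from each variable."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.step = 0  # plays the role of the global step counter

    def apply_gradients(self, grads_and_vars):
        for gradient, variable in grads_and_vars:
            variable.value -= self.learning_rate * gradient
        self.step += 1


W = Variable(5.0)
B = Variable(10.0)
optimizer = SimpleGradientDescent(learning_rate=0.01)

# Made-up gradient values, paired with their variables as in the loop above
optimizer.apply_gradients(zip([4.0, 16.0], [W, B]))
print(W.value, B.value, optimizer.step)
```

Keeping the update rule behind an `apply_gradients` interface is what allows the training loop to stay unchanged when a different update strategy is swapped in.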