# Eager Execution

- TensorFlow's eager execution is an imperative programming environment that evaluates operations immediately, without building graphs: operations return concrete values instead of constructing a computational graph to run later. This makes it easy to get started with TensorFlow and debug models, and it reduces boilerplate as well. To follow along with this guide, run the code samples below in an interactive `python` interpreter.

- Eager execution is a flexible machine learning platform for research and experimentation, providing:
    - *An intuitive interface* - Structure your code naturally and use Python data structures. Quickly iterate on small models and small data.
    - *Easier debugging* - Call ops directly to inspect running models and test changes. Use standard Python debugging tools for immediate error reporting.
    - *Natural control flow* - Use Python control flow instead of graph control flow, simplifying the specification of dynamic models.
    
- Eager execution supports most TensorFlow operations and GPU acceleration. For a collection of examples running in eager execution, see: [tensorflow/contrib/eager/python/examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples)

- **Note**: Some models may experience increased overhead with eager execution enabled. Performance improvements are ongoing, but please [file a bug](https://github.com/tensorflow/tensorflow/issues) if you find a problem, and share your benchmarks.

## Setup and basic usage

- To start eager execution, add `tf.enable_eager_execution()` to the beginning of the program or console session. Do not add this operation to other modules that the program calls.

In [1]:
from __future__ import absolute_import, division, print_function

import tensorflow as tf

tf.enable_eager_execution()

- Now you can run TensorFlow operations and the results will return immediately:

In [2]:
tf.executing_eagerly()

True

In [3]:
x = [[2.]]
m = tf.matmul(x, x)
print("hello, {}".format(m))

hello, [[4.]]


- Enabling eager execution changes how TensorFlow operations behave - now they immediately evaluate and return their values to Python. `tf.Tensor` objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there isn't a computational graph to build and run later in a session, it's easy to inspect results using `print()` or a debugger. Evaluating, printing, and checking tensor values does not break the flow for computing gradients.

- Eager execution works nicely with NumPy. NumPy operations accept `tf.Tensor` arguments. TensorFlow [math operations](https://www.tensorflow.org/api_guides/python/math_ops) convert Python objects and NumPy arrays to `tf.Tensor` objects. The `tf.Tensor.numpy` method returns the object's value as a NumPy `ndarray`.

In [4]:
a = tf.constant([[1, 2],
                [3, 4]])
print(a)

tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32)


In [5]:
# Broadcasting support
b = tf.add(a, 1)
print(b)

tf.Tensor(
[[2 3]
 [4 5]], shape=(2, 2), dtype=int32)


In [6]:
# Operator overloading is supported
print(a * b)

tf.Tensor(
[[ 2  6]
 [12 20]], shape=(2, 2), dtype=int32)


In [7]:
# Use NumPy values
import numpy as np

c = np.multiply(a, b)
print(c)

[[ 2  6]
 [12 20]]


In [8]:
# Obtain numpy value from a tensor
print(a.numpy())

[[1 2]
 [3 4]]


- The `tf.contrib.eager` module contains symbols available to both eager and graph execution environments and is useful for writing code to [work with graphs](https://www.tensorflow.org/guide/eager#work_with_graphs):

In [9]:
tfe = tf.contrib.eager

## Dynamic control flow

- A major benefit of eager execution is that all the functionality of the host language is available while your model is executing. So, for example, it is easy to write [fizzbuzz](https://en.wikipedia.org/wiki/Fizz_buzz):

In [10]:
def fizzbuzz(max_num):
    counter = tf.constant(0)
    max_num = tf.convert_to_tensor(max_num)
    for num in range(1, max_num.numpy()+1):
        num = tf.constant(num)
        if int(num % 3) == 0 and int(num % 5) == 0:
            print('FizzBuzz')
        elif int(num % 3) == 0:
            print('Fizz')
        elif int(num % 5) == 0:
            print('Buzz')
        else:
            print(num.numpy())
        counter += 1

In [11]:
fizzbuzz(15)

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz


- This has conditionals that depend on tensor values and it prints these values at runtime.
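
- As a further sketch of tensor-dependent control flow, the hypothetical helper `collatz_steps` below (not part of the original guide) keeps looping until a tensor value reaches 1. It assumes a TensorFlow build that provides `tf.compat.v1`; the guard only switches eager execution on when it is not already active.

```python
import tensorflow as tf

# Assumption: TF 1.x needs eager execution enabled explicitly, while
# TF 2.x already runs eagerly, so this guard does nothing there.
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()


def collatz_steps(start):
    """Count Collatz steps until the value reaches 1 (hypothetical helper)."""
    n = tf.constant(start)
    steps = 0
    # The loop condition reads a concrete tensor value on every iteration.
    while int(n) != 1:
        if int(n % 2) == 0:
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))
```

- The number of iterations is not known in advance; it depends on values computed while the loop runs, which is the case that graph control flow makes awkward.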

### Build a model

- Many machine learning models are represented by composing layers. When using TensorFlow with eager execution you can either write your own layers or use a layer provided in the `tf.keras.layers` package.

- While you can use any Python object to represent a layer, TensorFlow has `tf.keras.layers.Layer` as a convenient base class. Inherit from it to implement your own layer:

In [12]:
class MySimpleLayer(tf.keras.layers.Layer):
    def __init__(self, output_units):
        super(MySimpleLayer, self).__init__()
        self.output_units = output_units
        
    def build(self, input_shape):
        # The build method gets called the first time your layer is used.
        # Creating variables in build() allows you to make their shape depend
        # on the input shape and hence removes the need for the user to specify
        # full shapes. It is possible to create variables during __init__() if
        # you already know their full shapes.
        self.kernel = self.add_variable("kernel", [input_shape[-1], self.output_units])
    
    def call(self, input):
        # Override call() instead of __call__ so we can perform some bookkeeping.
        return tf.matmul(input, self.kernel)

- In practice, use the `tf.keras.layers.Dense` layer instead of `MySimpleLayer` above, as it has a superset of its functionality (it can also add a bias).
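
- To see the deferred `build()` behavior described in the comments above, the sketch below defines a hypothetical `LazyLinear` layer that mirrors `MySimpleLayer`. It uses `add_weight` rather than `add_variable`, on the assumption that your TensorFlow version provides it, and checks the layer's variables before and after the first call.

```python
import tensorflow as tf

# Assumption: enable eager execution only if it is not already active.
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()


class LazyLinear(tf.keras.layers.Layer):
    """Hypothetical layer mirroring MySimpleLayer, built on first use."""

    def __init__(self, output_units):
        super().__init__()
        self.output_units = output_units

    def build(self, input_shape):
        # Runs on the first call; the kernel's first dimension is taken
        # from the last dimension of the input.
        self.kernel = self.add_weight(
            name="kernel",
            shape=[int(input_shape[-1]), self.output_units])

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)


layer = LazyLinear(5)
vars_before = len(layer.variables)   # nothing has been created yet
out = layer(tf.ones([2, 3]))         # first call triggers build()
vars_after = len(layer.variables)
kernel_shape = tuple(int(d) for d in layer.kernel.shape)
out_shape = tuple(int(d) for d in out.shape)
print(vars_before, vars_after, kernel_shape, out_shape)
```

- The kernel shape is never written by the caller; it follows from the first input the layer sees.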

- When composing layers into models you can use `tf.keras.Sequential` to represent models which are a linear stack of layers. It is easy to use for basic models:

In [13]:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=(784,)), # must declare input shape
    tf.keras.layers.Dense(10)
])

- Alternatively, organize models in classes by inheriting from `tf.keras.Model`. This is a container for layers that is a layer itself, allowing `tf.keras.Model` objects to contain other `tf.keras.Model` objects.

In [14]:
class MNISTModel(tf.keras.Model):
    def __init__(self):
        super(MNISTModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(units=10)
        self.dense2 = tf.keras.layers.Dense(units=10)
        
    def call(self, input):
        '''Run the model.'''
        result = self.dense1(input)
        result = self.dense2(result)
        result = self.dense2(result) # reuse variables from dense2 layer
        return result
        
model = MNISTModel()

- It's not required to set an input shape for the `tf.keras.Model` class since the parameters are set the first time input is passed to the layer.

- `tf.keras.layers` classes create and contain their own model variables that are tied to the lifetime of their layer objects. To share layer variables, share their objects.
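
- A minimal sketch of what "share their objects" means in practice. The variable names are illustrative, and the guard assumes a TensorFlow build that provides `tf.compat.v1`.

```python
import tensorflow as tf

# Assumption: enable eager execution only if it is not already active.
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()

shared = tf.keras.layers.Dense(4)
x = tf.ones([2, 3])

# Calling the same object twice reuses one kernel and one bias.
first = shared(x)
second = shared(x)
shared_var_count = len(shared.trainable_variables)
same_output = bool(tf.reduce_all(tf.equal(first, second)))

# A second Dense object creates and owns its own variables.
separate = tf.keras.layers.Dense(4)
separate(x)
distinct_kernels = shared.kernel is not separate.kernel

print(shared_var_count, same_output, distinct_kernels)
```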

## Eager training

### Computing gradients

- [Automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation) is useful for implementing machine learning algorithms such as [backpropagation](https://en.wikipedia.org/wiki/Backpropagation) for training neural networks. During eager execution, use `tf.GradientTape` to trace operations for computing gradients later.

- `tf.GradientTape` is an opt-in feature to provide maximal performance when not tracing. Since different operations can occur during each call, all forward-pass operations get recorded to a "tape". To compute the gradient, play the tape backwards and then discard it. A particular `tf.GradientTape` can only compute one gradient; subsequent calls throw a runtime error.

In [18]:
w = tf.Variable([1.0])
with tf.GradientTape() as tape:
    loss = w * w
    
grad = tape.gradient(loss, w)
print(grad) # => tf.Tensor([2.], shape=(1,), dtype=float32)

tf.Tensor([2.], shape=(1,), dtype=float32)
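
- The record-then-replay idea can be sketched without TensorFlow. The toy classes `Value` and `ToyTape` below are hypothetical and handle only scalar addition and multiplication; they mimic the one-shot behavior described above and are not how `tf.GradientTape` is implemented.

```python
class Value:
    """A scalar that a ToyTape can track (illustrative only)."""

    def __init__(self, data):
        self.data = float(data)


class ToyTape:
    """Toy reverse-mode tape for scalar multiply and add."""

    def __init__(self):
        self.records = []   # forward-pass operations, in execution order
        self.used = False

    def mul(self, a, b):
        out = Value(a.data * b.data)
        # Local derivatives: d(out)/da = b and d(out)/db = a.
        self.records.append((out, [(a, b.data), (b, a.data)]))
        return out

    def add(self, a, b):
        out = Value(a.data + b.data)
        self.records.append((out, [(a, 1.0), (b, 1.0)]))
        return out

    def gradient(self, target, source):
        if self.used:
            raise RuntimeError("this toy tape can compute only one gradient")
        grads = {id(target): 1.0}
        # Play the tape backwards, pushing gradients to each op's inputs.
        for out, inputs in reversed(self.records):
            upstream = grads.get(id(out), 0.0)
            for inp, local in inputs:
                grads[id(inp)] = grads.get(id(inp), 0.0) + upstream * local
        result = grads.get(id(source), 0.0)
        self.records = []   # discard the tape after one use
        self.used = True
        return result


w = Value(1.0)
tape = ToyTape()
loss = tape.mul(w, w)
grad = tape.gradient(loss, w)
print(grad)  # 2.0
```

- Because each operation is appended after its inputs were produced, walking the list in reverse visits every result before the values it was computed from, which is what lets a single backward pass accumulate the gradient.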


### Train a model

- The following example creates a multi-layer model that classifies the standard MNIST handwritten digits. It demonstrates the optimizer and layer APIs to build trainable graphs in an eager execution environment.

In [19]:
# Fetch and format the mnist data
(mnist_images, mnist_labels), _ = tf.keras.datasets.mnist.load_data()

dataset = tf.data.Dataset.from_tensor_slices(
    (tf.cast(mnist_images[...,tf.newaxis]/255, tf.float32),
    tf.cast(mnist_labels,tf.int64)))
dataset = dataset.shuffle(1000).batch(32)
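
- To check what this pipeline yields without downloading MNIST, the sketch below runs the same steps on random stand-in arrays. The names `fake_images` and `fake_labels` and the sample count of 100 are assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf

# Assumption: enable eager execution only if it is not already active.
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()

# Random stand-ins shaped like MNIST: 28x28 uint8 images, labels 0-9.
fake_images = np.random.randint(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = np.random.randint(0, 10, size=(100,), dtype=np.int64)

fake_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.cast(fake_images[..., np.newaxis] / 255, tf.float32),
     tf.cast(fake_labels, tf.int64)))
fake_dataset = fake_dataset.shuffle(100).batch(32)

# In eager mode a dataset is a plain Python iterable of concrete batches.
batch_shapes = [tuple(int(d) for d in images.shape)
                for images, _ in fake_dataset]
print(batch_shapes)
```

- The trailing axis added with `np.newaxis` is the channel dimension that the convolution layers expect, and the last batch is smaller because 100 is not a multiple of 32.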

In [20]:
# Build the model
mnist_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, [3,3], activation='relu'),
    tf.keras.layers.Conv2D(16, [3,3], activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10)
])

- Even without training, call the model and inspect the output in eager execution:

In [22]:
for images, labels in dataset.take(1):
    print("Logits: ", mnist_model(images[0:1]).numpy())

Logits:  [[ 0.02328178  0.06322408 -0.05037579  0.00277662  0.00567407  0.08586673
  -0.02721465 -0.00211906  0.02003275  0.00239374]]


- While Keras models have a built-in training loop (using the `fit` method), sometimes you need more customization. Here's an example of a training loop implemented with eager execution:

In [23]:
optimizer = tf.train.AdamOptimizer()

loss_history=[]

In [28]:
for (batch, (images, labels)) in enumerate(dataset.take(400)):
    if batch % 10 == 0:
        print('.', end='')
    with tf.GradientTape() as tape:
        logits = mnist_model(images, training=True)
        loss_value = tf.losses.sparse_softmax_cross_entropy(labels, logits)
        
    loss_history.append(loss_value.numpy())
    grads = tape.gradient(loss_value, mnist_model.trainable_variables)
    optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables),
                             global_step=tf.train.get_or_create_global_step())

........................................

In [29]:
import matplotlib.pyplot as plt

plt.plot(loss_history)
plt.xlabel('Batch #')
plt.ylabel('Loss [entropy]')

Text(0, 0.5, 'Loss [entropy]')
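
- The same record, differentiate, update pattern can be tried without the MNIST download. The sketch below is an assumption-laden miniature: it fits a line to synthetic data and applies plain gradient descent with `assign_sub` in place of an optimizer object. The data, learning rate, and step count are chosen only for illustration.

```python
import tensorflow as tf

# Assumption: enable eager execution only if it is not already active.
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()

# Synthetic data for y = 3x + 2.
xs = tf.constant([[0.0], [1.0], [2.0], [3.0]])
ys = 3.0 * xs + 2.0

w = tf.Variable(0.0)
b = tf.Variable(0.0)
learning_rate = 0.05
toy_losses = []

for step in range(500):
    # Record the forward pass so the tape can differentiate it.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(w * xs + b - ys))
    toy_losses.append(float(loss.numpy()))
    dw, db = tape.gradient(loss, [w, b])
    # Plain gradient descent step on each variable.
    w.assign_sub(learning_rate * dw)
    b.assign_sub(learning_rate * db)

print(toy_losses[0], toy_losses[-1])
print(float(w.numpy()), float(b.numpy()))
```

- A fresh tape is created on every iteration, which fits the one-gradient-per-tape rule described earlier.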