# Eager Execution Basics and Gradient Tape

This is the first [tutorial](https://www.tensorflow.org/tutorials/eager/eager_basics) in the research and experimentation section of the [tutorials](https://www.tensorflow.org/tutorials/eager/). Eager execution is now the default in version 2 of TensorFlow, but I can still learn something by working through it.

In [1]:
import tensorflow as tf
tf.enable_eager_execution()  # Needed on TensorFlow 1.x only; 2.x runs eagerly by default

**Tensors**  
Up until recently, I didn't know what a 'tensor' was in TensorFlow. I just assumed it was something ML related and moved on. Turns out, it is a multi-dimensional array. I used something similar, NumPy's ndarray, multiple times across multiple classes. Tensors can reside in GPU memory (I wish crypto would crash so I could pick one up at a normal price).  

The differences between tensors and ndarrays:  
* Tensors can be backed by accelerator memory (GPU/TPU)  
* Tensors are immutable
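
To make the immutability point concrete, here is a small sketch. It assumes TensorFlow 2.x (where eager execution is already on), and the variable names are my own. A NumPy array accepts item assignment, an eager tensor raises a `TypeError`, and `tf.tensor_scatter_nd_update` returns a new tensor rather than changing the old one.

```python
import numpy as np
import tensorflow as tf  # assumption: TensorFlow 2.x, eager by default

# NumPy arrays can be changed in place.
arr = np.ones(3)
arr[0] = 5.0

# Eager tensors reject item assignment.
t = tf.constant([1.0, 1.0, 1.0])
try:
    t[0] = 5.0
    assignment_failed = False
except TypeError:
    assignment_failed = True

# "Updating" a tensor means building a new one; the original is untouched.
updated = tf.tensor_scatter_nd_update(t, indices=[[0]], updates=[5.0])

print(arr)
print(assignment_failed)
print(updated.numpy(), t.numpy())
```

If you need a value that changes over time (model weights, for example), `tf.Variable` is the usual tool rather than a plain tensor.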

In [6]:
#Built in methods
print(tf.add(1,2)) #Notice the type returned. You have the value but also a shape and data type
print(tf.add([1,2],[3,4]))
print(tf.square(5))
print(tf.reduce_sum([1,2,3]))
print(tf.encode_base64("hello world"))
#Operator overloading is also supported
print(tf.square(2) + tf.square(3))

tf.Tensor(3, shape=(), dtype=int32)
tf.Tensor([4 6], shape=(2,), dtype=int32)
tf.Tensor(25, shape=(), dtype=int32)
tf.Tensor(6, shape=(), dtype=int32)
tf.Tensor(b'aGVsbG8gd29ybGQ', shape=(), dtype=string)
tf.Tensor(13, shape=(), dtype=int32)


In [7]:
x = tf.matmul([[1]],[[2,3]])
print(x.shape)
print(x.dtype)

(1, 2)
<dtype: 'int32'>


**Conversion between NumPy and Tensors**  
TensorFlow operations automatically convert NumPy ndarrays to Tensors.  
NumPy operations automatically convert Tensors to NumPy ndarrays.  

In [9]:
import numpy as np
ndarray = np.ones([3,3])
print("TensorFlow operations convert numpy arrays to Tensors automatically")
tensor = tf.multiply(ndarray,42)
print(tensor)
print("And NumPy operations convert Tensors to numpy arrays automatically")
print(np.add(tensor, 1))
print("The .numpy() method explicitly converts a Tensor to a numpy array")
print(tensor.numpy())

TensorFlow operations convert numpy arrays to Tensors automatically
tf.Tensor(
[[42. 42. 42.]
 [42. 42. 42.]
 [42. 42. 42.]], shape=(3, 3), dtype=float64)
And NumPy operations convert Tensors to numpy arrays automatically
[[43. 43. 43.]
 [43. 43. 43.]
 [43. 43. 43.]]
The .numpy() method explicitly converts a Tensor to a numpy array
[[42. 42. 42.]
 [42. 42. 42.]
 [42. 42. 42.]]


**GPU Acceleration**  
I don't have a GPU to run on but if I did I could set the Tensor to execute on it.

In [10]:
# No GPU on this machine, so both checks print False
x = tf.random_uniform([3, 3])

print("Is there a GPU available: ")
print(tf.test.is_gpu_available())

print("Is the Tensor on GPU #0:  ")
print(x.device.endswith('GPU:0'))

Is there a GPU available: 
False
Is the Tensor on GPU #0:  
False
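
For reference, explicit placement might look like the sketch below. It assumes TensorFlow 2.1 or later (for `tf.config.list_physical_devices`), and `device_name` and `product` are names I made up. It falls back to the CPU when no GPU is found, so it runs on a machine like mine too.

```python
import tensorflow as tf  # assumption: TensorFlow 2.1 or later

# Pick the first GPU if one is visible, otherwise stay on the CPU.
gpus = tf.config.list_physical_devices("GPU")
device_name = "/GPU:0" if gpus else "/CPU:0"

# Ops created inside this block are placed on the chosen device.
with tf.device(device_name):
    x = tf.random.uniform([3, 3])
    product = tf.matmul(x, x)

print("Running on:", product.device)
```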


Since I don't have a GPU to work with, I am going to skip the section on them; I can't show the speedup.

## Datasets  
I have always loaded data from a CSV using Pandas or used the built-in datasets, so this section was all new to me.

**Eager vs Graph**  
Creating a dataset is the same with or without eager execution. The difference comes when you iterate over the elements.
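
A rough side-by-side of the two iteration styles, as I understand them. This sketch assumes TensorFlow 2.x and uses the `tf.compat.v1` helpers to imitate graph mode inside a temporary `tf.Graph`; the variable names are mine.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x

# Eager: the dataset is a plain Python iterable.
ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])
eager_values = [int(x.numpy()) for x in ds]

# Graph: build an iterator op first, then pull values through a session.
with tf.Graph().as_default():
    graph_ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])
    iterator = tf.compat.v1.data.make_one_shot_iterator(graph_ds)
    next_element = iterator.get_next()
    with tf.compat.v1.Session() as sess:
        graph_values = [int(sess.run(next_element)) for _ in range(3)]

print(eager_values, graph_values)
```

Both lists hold the same values; the eager version just skips the iterator and session plumbing.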

In [17]:
ds_tensors = tf.data.Dataset.from_tensor_slices([1,2,3,4,5,6])
#Create a temporary text file to read line by line
import tempfile
_, filename = tempfile.mkstemp()

with open(filename, 'w') as f:
    f.write("""Line 1
Line 2
Line 3
    """)
    
ds_file = tf.data.TextLineDataset(filename)

**Transformations/Iterations**  

In [18]:
ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2)
ds_file = ds_file.batch(2)

In [19]:
print('Elements of ds_tensors')
for x in ds_tensors:
    print(x)

print("\nElements in ds_file")
for x in ds_file:
    print(x)

Elements of ds_tensors
tf.Tensor([1 9], shape=(2,), dtype=int32)
tf.Tensor([ 4 25], shape=(2,), dtype=int32)
tf.Tensor([36 16], shape=(2,), dtype=int32)

Elements in ds_file
tf.Tensor([b'Line 1' b'Line 2'], shape=(2,), dtype=string)
tf.Tensor([b'Line 3' b'    '], shape=(2,), dtype=string)
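
The squares above come out only lightly scrambled because `shuffle(2)` uses a buffer of two elements. Below is a simplified pure-Python model of a shuffle buffer, written to illustrate the idea; it is not TensorFlow's implementation, and `buffered_shuffle` is a hypothetical helper of my own. It assumes `buffer_size` is at least 1.

```python
import random
from itertools import islice


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    source = iter(items)
    # Fill the buffer with the first buffer_size items.
    buffer = list(islice(source, buffer_size))
    for incoming in source:
        # Emit a random buffered item and put the next input in its slot.
        slot = rng.randrange(len(buffer))
        yield buffer[slot]
        buffer[slot] = incoming
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


squares = [1, 4, 9, 16, 25, 36]
print(list(buffered_shuffle(squares, buffer_size=2)))
```

With a buffer of two, the element emitted at position k can only come from the first k + 2 inputs, so items never travel far from where they started. The order in the output above (1, 9, 4, 25, 36, 16) is one that this model can produce.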


# Gradient Tape

Since these are short I am going to roll the [Automatic Differentiation](https://www.tensorflow.org/tutorials/eager/automatic_differentiation) into this same notebook.  

TensorFlow provides *tf.GradientTape* for automatic differentiation (computing the gradient of a computation with respect to its input variables).
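
To get a feel for what "recording on a tape" means, here is a toy reverse-mode tape in plain Python. It is only a sketch of the general idea and makes no claim about how TensorFlow is built; `Value` and `ToyTape` are hypothetical names of my own, and it handles scalars only.

```python
class Value:
    """A scalar wrapper so each intermediate result has its own identity."""

    def __init__(self, data):
        self.data = data


class ToyTape:
    """Records every operation so gradients can be replayed in reverse."""

    def __init__(self):
        # Each record: (output, [(input, d_output/d_input), ...])
        self.records = []

    def add(self, a, b):
        out = Value(a.data + b.data)
        self.records.append((out, [(a, 1.0), (b, 1.0)]))
        return out

    def mul(self, a, b):
        out = Value(a.data * b.data)
        self.records.append((out, [(a, b.data), (b, a.data)]))
        return out

    def gradient(self, target, source):
        grads = {id(target): 1.0}
        # Walk the recorded operations backwards, applying the chain rule.
        for out, inputs in reversed(self.records):
            upstream = grads.get(id(out))
            if upstream is None:
                continue  # this operation does not feed into the target
            for inp, local in inputs:
                grads[id(inp)] = grads.get(id(inp), 0.0) + upstream * local
        return grads.get(id(source), 0.0)


tape = ToyTape()
x = Value(3.0)
y = tape.mul(x, x)  # x^2
z = tape.mul(y, y)  # x^4

print(tape.gradient(z, x))  # 4 * x^3 evaluated at x = 3
print(tape.gradient(y, x))  # 2 * x evaluated at x = 3
```

The forward pass stores each operation together with its local derivatives, and `gradient` replays that list in reverse order.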

In [44]:
x = tf.ones((2,2))

with tf.GradientTape() as t:
    t.watch(x)
    y = tf.reduce_sum(x) #4.0
    z = tf.multiply(y,y) # 16.0

#Derivative of z with respect to the original input tensor x
dz_dx = t.gradient(z, x) # [[8,8],[8,8]]

for i in [0,1]:
    for j in [0,1]:
        assert dz_dx[i][j].numpy() == 8.0
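
A quick hand check of why every entry of the gradient is 8. Here $s$ is my own shorthand for the value the code calls `y`, the sum of the four entries of `x`:

$$
s = \sum_{i,j} x_{ij} = 4, \qquad z = s^2 = 16, \qquad \frac{\partial z}{\partial x_{ij}} = \frac{dz}{ds}\,\frac{\partial s}{\partial x_{ij}} = 2s \cdot 1 = 8 .
$$

Each entry of `x` contributes to the sum with weight 1, so all four partial derivatives are the same.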

In [33]:
x = tf.ones((2,2))

with tf.GradientTape() as t:
    t.watch(x)
    y = tf.reduce_sum(x)
    z = tf.multiply(y,y)

# Use the tape to compute the derivative of z w/r/t the intermediate value y
dz_dy = t.gradient(z,y)
assert dz_dy.numpy() == 8.0

tf.Tensor(
[[1. 1.]
 [1. 1.]], shape=(2, 2), dtype=float32)
tf.Tensor(4.0, shape=(), dtype=float32)
tf.Tensor(16.0, shape=(), dtype=float32)


By default, the resources held by a *GradientTape* are released as soon as its *gradient* method is called. To compute more than one gradient over the same computation, create the tape with *persistent=True* and delete it when you are done.

In [27]:
x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as t:
    t.watch(x)
    y = x * x
    z = y * y
dz_dx = t.gradient(z, x) # 108 => (4 * x^3 as x = 3)
dy_dx = t.gradient(y, x) # 6.0
del t
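
For contrast, this is what I would expect to happen without `persistent=True`. The sketch assumes TensorFlow 2.x and reuses the same computation as the cell above; `second_call_error` is my own name.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x

x = tf.constant(3.0)
with tf.GradientTape() as t:  # persistent defaults to False
    t.watch(x)
    y = x * x
    z = y * y

dz_dx = t.gradient(z, x)  # the first call works as usual

# A second call on a non-persistent tape raises a RuntimeError.
try:
    t.gradient(y, x)
    second_call_error = None
except RuntimeError as err:
    second_call_error = err

print(dz_dx.numpy())
print(type(second_call_error).__name__)
```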

**Recording control flow**  
Since the tapes record operations as they are executed, Python control flow (if/while) is handled naturally.

In [28]:
def f(x, y):
  output = 1.0
  for i in range(y):
    if i > 1 and i < 5:
      output = tf.multiply(output, x)
  return output

def grad(x, y):
  with tf.GradientTape() as t:
    t.watch(x)
    out = f(x, y)
  return t.gradient(out, x)

x = tf.convert_to_tensor(2.0)

assert grad(x, 6).numpy() == 12.0
assert grad(x, 5).numpy() == 12.0
assert grad(x, 4).numpy() == 4.0
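
The asserted values can be checked by hand: `f` multiplies by `x` only when `i` is 2, 3 or 4, so it computes `x**n`, where `n` is the number of such iterations, and the gradient is `n * x**(n - 1)`. The plain-Python sketch below does that bookkeeping; `multiplications` and `expected_grad` are hypothetical helpers of my own, not part of the tutorial.

```python
def multiplications(y):
    """Count how many loop iterations in f actually multiply by x."""
    return sum(1 for i in range(y) if 1 < i < 5)


def expected_grad(x, y):
    """Derivative of x**n with respect to x, where n depends on y."""
    n = multiplications(y)
    # n == 0 means the output never touches x; 0.0 stands in for that case.
    return n * x ** (n - 1) if n > 0 else 0.0


for y in (6, 5, 4):
    print(y, multiplications(y), expected_grad(2.0, y))
```

For `y = 6` and `y = 5` the loop multiplies three times (gradient `3 * 2**2 = 12`), and for `y = 4` it multiplies twice (gradient `2 * 2 = 4`).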

**Higher-order gradients**  
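Operations inside of the *GradientTape* context manager are recorded for automatic differentiation. If a gradient is computed inside that context, the gradient computation is recorded as well, which is what makes higher-order gradients possible.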

In [29]:
x = tf.Variable(1.0)  # Create a TensorFlow variable initialized to 1.0

with tf.GradientTape() as t:
  with tf.GradientTape() as t2:
    y = x * x * x
  # Compute the gradient inside the 't' context manager
  # which means the gradient computation is differentiable as well.
  dy_dx = t2.gradient(y, x)
d2y_dx2 = t.gradient(dy_dx, x)

assert dy_dx.numpy() == 3.0
assert d2y_dx2.numpy() == 6.0
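
As a sanity check that does not depend on TensorFlow, the same two derivatives can be approximated with central finite differences. This is a plain-Python sketch; the step size `h` is an arbitrary choice of mine.

```python
def f(x):
    return x * x * x


x0, h = 1.0, 1e-3

# Central difference for the first derivative (exact value: 3 * x0**2).
first = (f(x0 + h) - f(x0 - h)) / (2 * h)

# Central difference for the second derivative (exact value: 6 * x0).
second = (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / (h * h)

print(first, second)
```

Both numbers land very close to the 3.0 and 6.0 that the nested tapes report.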