<sup>This notebook is adapted from code in *Deep Learning with Python*, 2nd edition, by François Chollet.</sup>

<sup>See the original code in the book's companion [GitHub](https://github.com/fchollet/deep-learning-with-python-notebooks) repository. Find the book here: [Amazon Link](https://www.amazon.com/Learning-Python-Second-Fran-C3-A7ois-Chollet-dp-1617296864/dp/1617296864/)</sup>



## The gears of neural networks: tensor operations

These code snippets from François Chollet are simplified, pseudocode-style illustrations of how tensor operations work conceptually. Under the hood, TensorFlow runs optimized code that performs the same operations.

### Element-wise operations

The naive\_add and naive\_relu functions implement element-wise addition and the relu activation with explicit Python loops, to make the idea of an element-wise operation concrete.

The last two code cells in this section time the equivalent NumPy code against the naive Python code to show how much faster the vectorized NumPy operations are.

In [None]:
import numpy as np
import time

In [None]:
def naive_relu(x):
    assert len(x.shape) == 2
    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = max(x[i, j], 0)
    return x

In [None]:
def naive_add(x, y):
    assert len(x.shape) == 2
    assert x.shape == y.shape
    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[i, j]
    return x

In [None]:
x = np.random.random((20, 100))
y = np.random.random((20, 100))

t0 = time.time()
for _ in range(1000):
    z = x + y
    z = np.maximum(z, 0.)
print("Took: {0:.2f} s".format(time.time() - t0))

Took: 0.01 s


In [None]:
t0 = time.time()
for _ in range(1000):
    z = naive_add(x, y)
    z = naive_relu(z)
print("Took: {0:.2f} s".format(time.time() - t0))

Took: 2.79 s
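
The timings above only compare speed. As a minimal sketch, the check below confirms that a loop-based version and the vectorized NumPy version also produce the same values. The helper `naive_add_relu` is a hypothetical name that combines the two naive functions into one loop for brevity, and the inputs are shifted so that relu actually clips some entries.

```python
import numpy as np

def naive_add_relu(x, y):
    """Loop-based relu(x + y) for 2D arrays (hypothetical helper for this check)."""
    assert x.ndim == 2 and x.shape == y.shape
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = max(x[i, j] + y[i, j], 0.0)
    return out

rng = np.random.default_rng(0)
# Shift the values so that some sums are negative and relu clips them to 0.
x = rng.random((20, 100)) - 0.5
y = rng.random((20, 100)) - 0.5

vectorized = np.maximum(x + y, 0.0)
looped = naive_add_relu(x, y)

print(np.allclose(vectorized, looped))  # True
```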


### Broadcasting

When an operation combines two arrays of different shapes, NumPy will "broadcast" the smaller array, where the shapes are compatible, so that it matches the shape of the larger array.

Conceptually, this has two steps:
* axes are added to the smaller tensor to match the ndim of the larger tensor
* the smaller tensor is repeated along the new axes to match the shape of the larger tensor


The expand\_dims function adds the new axis. Indexing with np.newaxis is an equivalent way to do it, shown below.

In [None]:
X = np.random.random((32, 10))
y_original = np.random.random((10,))
y_original.shape

(10,)

In [None]:
y_expanded = np.expand_dims(y_original, axis=0)
y_expanded.shape

(1, 10)

In [None]:
# np.newaxis is an equivalent alternative to expand_dims
# index the original vector with it

y = y_original[np.newaxis, :]  # put np.newaxis where you want the new axis
y.shape

(1, 10)

In [None]:
Y = np.concatenate([y] * 32, axis=0)
Y.shape

(32, 10)

In [None]:
def naive_add_matrix_and_vector(x, y):
    assert len(x.shape) == 2
    assert len(y.shape) == 1
    assert x.shape[1] == y.shape[0]
    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[j]
    return x

In [None]:
result = naive_add_matrix_and_vector(X, y_original)
result.shape

(32, 10)
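
The two conceptual steps listed earlier in this section can be checked against NumPy's built-in broadcasting. The sketch below builds the repeated array explicitly and compares the result with the implicit version. The variable names are illustrative only. Note that NumPy does not actually materialize the repeated array when it broadcasts; `np.broadcast_to` exposes the same idea as a read-only view.

```python
import numpy as np

X = np.random.random((32, 10))
y = np.random.random((10,))

# Step 1: add a leading axis so y has the same ndim as X.
y_row = y[np.newaxis, :]                    # shape (1, 10)
# Step 2: repeat along the new axis to match the shape of X.
Y = np.repeat(y_row, X.shape[0], axis=0)    # shape (32, 10)

explicit = X + Y   # element-wise add of two same-shape arrays
implicit = X + y   # NumPy broadcasts y automatically

# A broadcast view of y with the shape of X; no data is copied.
view = np.broadcast_to(y, X.shape)

print(Y.shape, view.shape)
print(np.array_equal(explicit, implicit))  # True
```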

### Tensor product

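The functions below are naive, loop-based implementations of the tensor product (dot product) for vectors and matrices. The first cell is one more broadcasting example: np.maximum is applied to two tensors of different shapes.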

In [None]:
import numpy as np
x = np.random.random((64, 3, 32, 10))
y = np.random.random((32, 10))
z = np.maximum(x, y)

In [None]:
x = np.random.random((32,))
y = np.random.random((32,))
z = np.dot(x, y)

In [None]:
def naive_vector_dot(x, y):
    assert len(x.shape) == 1
    assert len(y.shape) == 1
    assert x.shape[0] == y.shape[0]
    z = 0.
    for i in range(x.shape[0]):
        z += x[i] * y[i]
    return z

In [None]:
def naive_matrix_vector_dot(x, y):
    assert len(x.shape) == 2
    assert len(y.shape) == 1
    assert x.shape[1] == y.shape[0]
    z = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            z[i] += x[i, j] * y[j]
    return z

In [None]:
# Alternative version that reuses naive_vector_dot for each row of x
def naive_matrix_vector_dot(x, y):
    z = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        z[i] = naive_vector_dot(x[i, :], y)
    return z

In [None]:
def naive_matrix_dot(x, y):
    assert len(x.shape) == 2
    assert len(y.shape) == 2
    assert x.shape[1] == y.shape[0]
    z = np.zeros((x.shape[0], y.shape[1]))
    for i in range(x.shape[0]):
        for j in range(y.shape[1]):
            row_x = x[i, :]
            column_y = y[:, j]
            z[i, j] = naive_vector_dot(row_x, column_y)
    return z
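
As a quick sanity check, a loop-based matrix product can be compared with NumPy's built-in one. The sketch below repeats a compact version of the function above, with the inner dot product written as a third loop so the block is self-contained. It also shows the shape rule: a `(4, 3)` matrix times a `(3, 5)` matrix gives a `(4, 5)` result.

```python
import numpy as np

def naive_matrix_dot(x, y):
    """Loop-based matrix product, mirroring the function above."""
    assert x.ndim == 2 and y.ndim == 2
    assert x.shape[1] == y.shape[0]
    z = np.zeros((x.shape[0], y.shape[1]))
    for i in range(x.shape[0]):
        for j in range(y.shape[1]):
            for k in range(x.shape[1]):
                z[i, j] += x[i, k] * y[k, j]
    return z

rng = np.random.default_rng(0)
a = rng.random((4, 3))
b = rng.random((3, 5))

looped = naive_matrix_dot(a, b)
builtin = a @ b   # same as np.dot(a, b) for 2D arrays

print(looped.shape)                   # (4, 5)
print(np.allclose(looped, builtin))   # True
```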

### Tensor reshaping

In [None]:
# Assumes train_images is the MNIST training array of shape (60000, 28, 28), loaded beforehand
train_images = train_images.reshape((60000, 28 * 28))

In [None]:
x = np.array([[0., 1.],
             [2., 3.],
             [4., 5.]])
x.shape

In [None]:
x = x.reshape((6, 1))
x

In [None]:
x = np.zeros((300, 20))
x = np.transpose(x)
x.shape
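
Reshaping and transposing can produce the same shape but arrange the values differently. A small sketch, reusing the 3 by 2 example array from above: `reshape` keeps the elements in their original row-by-row order, while transposition exchanges rows and columns.

```python
import numpy as np

x = np.array([[0., 1.],
              [2., 3.],
              [4., 5.]])          # shape (3, 2)

reshaped = x.reshape((2, 3))      # [[0, 1, 2], [3, 4, 5]]: same row-major order
transposed = np.transpose(x)      # [[0, 2, 4], [1, 3, 5]]: rows become columns

print(reshaped.shape, transposed.shape)       # (2, 3) (2, 3)
print(np.array_equal(reshaped, transposed))   # False
```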

### The gradient tape in TensorFlow

TensorFlow implements automatic differentiation by means of computation graphs. A neural network is a composition of differentiable tensor operations, so its gradients can be derived with the chain rule. Specifying the forward pass is therefore enough to define how the gradients are computed.

The GradientTape functionality in TensorFlow records the tensor operations that run inside its scope.

In [None]:
import tensorflow as tf
x = tf.Variable(0.)
with tf.GradientTape() as tape:
    y = 2 * x + 3
grad_of_y_wrt_x = tape.gradient(y, x)

print(grad_of_y_wrt_x)

tf.Tensor(2.0, shape=(), dtype=float32)


In [None]:
x = tf.Variable(tf.random.uniform((2, 2)))
with tf.GradientTape() as tape:
    y = 2 * x + 3
grad_of_y_wrt_x = tape.gradient(y, x)

print(grad_of_y_wrt_x)

tf.Tensor(
[[2. 2.]
 [2. 2.]], shape=(2, 2), dtype=float32)


In [None]:
W = tf.Variable(tf.random.uniform((2, 2)))
b = tf.Variable(tf.zeros((2,)))
x = tf.random.uniform((2, 2))
with tf.GradientTape() as tape:
    y = tf.matmul(x, W) + b
grad_of_y_wrt_W_and_b = tape.gradient(y, [W, b])

print('W=', W)
print('b=', b)
print('x=', x)
print('\nGrad:', grad_of_y_wrt_W_and_b)

W= <tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[0.5864936, 0.5392288],
       [0.0373162, 0.5145111]], dtype=float32)>
b= <tf.Variable 'Variable:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>
x= tf.Tensor(
[[0.7531737  0.55128396]
 [0.40519285 0.30589557]], shape=(2, 2), dtype=float32)

Grad: [<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[1.1583666, 1.1583666],
       [0.8571795, 0.8571795]], dtype=float32)>, <tf.Tensor: shape=(2,), dtype=float32, numpy=array([2., 2.], dtype=float32)>]
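
The values in the last output follow a pattern: each row of the gradient with respect to `W` repeats one number, and the gradient with respect to `b` is all 2s. When the target is not a scalar, `tape.gradient` computes the gradient of the sum of the target's elements. The sketch below reproduces that pattern in plain NumPy with finite differences, so it runs without TensorFlow. The helper names `summed_output` and `finite_difference` are hypothetical, and the random values differ from the ones printed above.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((2, 2))
b = np.zeros(2)
x = rng.random((2, 2))

def summed_output(W, b):
    """Scalar being differentiated: the sum of all entries of x @ W + b."""
    return (x @ W + b).sum()

def finite_difference(f, param, eps=1e-6):
    """Central-difference estimate of df/dparam, one entry at a time."""
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        plus, minus = param.copy(), param.copy()
        plus[idx] += eps
        minus[idx] -= eps
        grad[idx] = (f(plus) - f(minus)) / (2 * eps)
    return grad

grad_W = finite_difference(lambda w: summed_output(w, b), W)
grad_b = finite_difference(lambda v: summed_output(W, v), b)

# Analytic values: row i of dW repeats the sum of column i of x,
# and every entry of db equals the number of rows of x.
expected_W = np.repeat(x.sum(axis=0)[:, np.newaxis], W.shape[1], axis=1)
expected_b = np.full(b.shape, x.shape[0], dtype=float)

print(np.allclose(grad_W, expected_W, atol=1e-5))  # True
print(np.allclose(grad_b, expected_b, atol=1e-5))  # True
```

Applied to the `x` printed above, the column sums are about 1.158 and 0.857, which matches the recorded gradient for `W`.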
