# Lab 4: Forward propagation with vectorized implementation

Vectorized implementations express a computation as matrix operations, which NumPy passes to optimized compiled routines that can process many elements at once. This is why they are usually much faster than non-vectorized implementations, which handle one element at a time (e.g. in a Python for loop). Running the same matrix operations on a GPU can speed them up further, because GPUs are built to apply the same arithmetic to many values in parallel.
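
As a minimal sketch of the difference, the cell below computes the same sum of products twice: once with an explicit loop and once with a single NumPy call. The values of `w` and `x` are made up for illustration and are not part of the lab data.

```python
import numpy as np

# Hypothetical values chosen only for illustration
w = np.array([1.0, -2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])

# Non-vectorized: accumulate one product at a time in a Python loop
total = 0.0
for i in range(len(w)):
    total += w[i] * x[i]

# Vectorized: a single NumPy call computes the same sum of products
vectorized_total = np.dot(w, x)

print(total, vectorized_total)
```

Both approaches give the same number; the vectorized call simply moves the loop out of Python and into NumPy.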

In [2]:
import numpy as np
# import matplotlib.pyplot as plt
# plt.style.use('./deeplearning.mplstyle')

# import tensorflow as tf

from lib.lab_utils_common import dlc, sigmoid
# from lib.lab_coffee_utils import load_coffee_data, plt_roast, plt_prob, plt_layer, plt_network, plt_output_unit

# import logging
# logging.getLogger("tensorflow").setLevel(logging.ERROR)
# tf.autograph.set_verbosity(0)

## Non-vectorized implementation (from previous labs)

### Setup data

In [27]:
# input values (1 training example with 2 features)
x = np.array([200, 17]) # 1D array (non-vectorized)

# weights for layer 1: 
# 2 input features per unit = 2 rows 
# 3 units in the layer = 3 cols
weights1 = np.array([[1, -3, 5], 
                [-2, 4, -6]]) 

# biases for layer 1:
# always 1 bias per unit = 1 row
# 3 units in the layer = 3 cols
biases1 = np.array([-1, 1, 2]) # 1D array (non-vectorized)

### Setup functions

In [28]:
# define the activation function g
# This should be a sigmoid function because we are doing classification.
# sigmoid is implemented in lab_utils_common; we simply rename the function here
g = sigmoid

# for a single layer, combine functions and activations
# W is a matrix of weights for this layer; rows = features for this layer (number of outputs from the previous layer); cols = units in this layer
def my_dense(a_in, W, b):
    # print("\nentering my dense with variables:")
    # print(a_in) # This has two rows instead of 1!!! 
    # print(W)
    # print(b)

    unit_count = W.shape[1]
    
    # make dummy array to accumulate results
    a_out = np.zeros(unit_count)

    # each loop represents one unit/function of the layer
    for j in range(unit_count):
        # print(f"on unit {j}")
        
        weights_for_this_unit = W[:,j] # accesses the jth column of the weights matrix... the weights for the unit j
        
        z = np.dot( weights_for_this_unit, a_in) + b[j]  # analogous to f(x) = wx + b
        
        a_out[j] = g(z) # sigmoid because doing classification; NB: you could pass g in as a parameter to this function
        # print(a_out)
    
    return a_out
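
The loop-based layer is defined above but never called, so here is a self-contained sketch that runs it on the setup data. Because `lib.lab_utils_common` is not shown in this lab, the `sigmoid` below is a stand-in, and its clipping of `z` to [-500, 500] is an assumption about the course helper. The `g` parameter is the optional variation mentioned in the comment above.

```python
import numpy as np

def sigmoid(z):
    # Stand-in for lib.lab_utils_common.sigmoid (assumption: the course
    # helper also clips z before exponentiating)
    z = np.clip(z, -500, 500)
    return 1.0 / (1.0 + np.exp(-z))

def my_dense(a_in, W, b, g=sigmoid):
    # Same per-unit loop as above, with the activation passed in as g
    a_out = np.zeros(W.shape[1])
    for j in range(W.shape[1]):
        a_out[j] = g(np.dot(W[:, j], a_in) + b[j])
    return a_out

x = np.array([200, 17])              # 1D input, 2 features
weights1 = np.array([[1, -3, 5],
                     [-2, 4, -6]])   # 2 features x 3 units
biases1 = np.array([-1, 1, 2])       # 1D biases, one per unit

a1 = my_dense(x, weights1, biases1)
print(a1.shape)  # (3,)
print(a1)
```

The result is a 1D array with one activation per unit, in contrast to the 2D result of the vectorized version.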

## Vectorized implementation

A major difference between the vectorized and non-vectorized versions is:
* non-vectorized: Use 1D array when only 1 row of data.
* vectorized: Use 2D array regardless of how many rows.
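
A quick way to see the difference is to compare shapes. The sketch below uses the same input values as the lab; the names `x_1d` and `X_2d` are made up for this illustration.

```python
import numpy as np

x_1d = np.array([200, 17])      # 1D array: a single row of data
X_2d = np.array([[200, 17]])    # 2D array that happens to have one row

print(x_1d.shape)  # (2,)
print(X_2d.shape)  # (1, 2)

# reshape(1, -1) turns a 1D array into a single-row 2D array
X_from_1d = x_1d.reshape(1, -1)
print(np.array_equal(X_from_1d, X_2d))  # True
```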

### Setup data

In [29]:
# input values (1 training example with 2 features)
XT = np.array([[200, 17]]) # 2D array (vectorized); the T marks this as the transposed layout (one example per row), which we type in directly rather than computing with .T

# weights for layer 1: 
# 2 input features per unit = 2 rows 
# 3 units in the layer = 3 cols
weights1 = np.array([[1, -3, 5], 
                [-2, 4, -6]])          # same 2D array for vectorized and non-vectorized approaches

# biases for layer 1:
# always 1 bias per unit = 1 row
# 3 units in the layer = 3 cols
biases1 = np.array([[-1, 1, 2]]) # 2D array (vectorized)

### Setup functions

In [35]:
# use capital letters to represent matrices; lowercase to represent vectors and scalars
# ... in a vectorized approach almost everything will be a matrix so we can do parallelized matrix math
def my_dense_vectorized(A_in, W, B):

    # no for-loop required in the vectorized implementation because the calculations are done in parallel (not sequentially)
    # This means that all units/functions of the layer are calculated in a single matrix operation
    
    Z = np.matmul(A_in, W) + B  # matmul is a NumPy function that does matrix multiplication 
    print(f"Z values in this layer: {Z}")
    
    A_out = g(Z) # function g applied element-wise

    return A_out


result = my_dense_vectorized(XT, weights1, biases1)
print(f"output from this layer: {result}")

Z values in this layer: [[ 165 -531  900]]
output from this layer: [[1.00e+000 7.12e-218 1.00e+000]]
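
The same matrix expression handles more than one training example without any change. The sketch below is self-contained and adds a second, made-up row to the input. The `sigmoid` here is a stand-in for the course helper, and its clipping to [-500, 500] is an assumption; it would be consistent with the middle value printed above, since e^-500 is about 7.12e-218.

```python
import numpy as np

def sigmoid(z):
    # Stand-in for the course helper (assumption, including the clipping)
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

W = np.array([[1, -3, 5],
              [-2, 4, -6]])      # 2 features x 3 units
B = np.array([[-1, 1, 2]])       # shape (1, 3)

# Hypothetical batch: row 0 is the lab example, row 1 is made up
X_batch = np.array([[200, 17],
                    [1, 2]])

Z = np.matmul(X_batch, W) + B    # B is broadcast across both rows
A_out = sigmoid(Z)

print(Z)
print(A_out.shape)  # (2, 3)
```

Each row of `A_out` holds the three unit activations for one example, so the output has one row per input row.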


## Matrix Math

This section is not linked to the examples above.

In [22]:
# matrix of input values (could be activation values from a previous layer)
A = np.array([[1, -1, 0.1],
             [2, -2, 0.2]])

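# We must transpose A in order to multiply it with W:
# A is 2x3 and W (defined in the next cell) is 2x4, so the inner dimensions do not match;
# A.T is 3x2, and its 2 columns line up with W's 2 rows.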


# manual transpose
# AT = np.array([
#     [1, 2],
#     [-1, -2],
#     [0.1, 0.2],
# ])

# auto transpose
AT = A.T

print(AT)

[[ 1.   2. ]
 [-1.  -2. ]
 [ 0.1  0.2]]


In [20]:
# matrix of weights for one layer
W = np.array([[3, 5, 7, 9],
             [4, 6, 8, 0]])
print(W)

[[3 5 7 9]
 [4 6 8 0]]


In [21]:
# Use matmul to calculate the matrix product of AT and W;
# each entry of Z is the dot product of one row of AT with one column of W

Z = np.matmul(AT, W)
print(Z)

[[ 11.   17.   23.    9. ]
 [-11.  -17.  -23.   -9. ]
 [  1.1   1.7   2.3   0.9]]
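
To see what `matmul` is doing, the sketch below rebuilds the same result one entry at a time. It reuses the values of `AT` and `W` from above; `Z_loop` is a name introduced only for this check.

```python
import numpy as np

AT = np.array([[1, 2],
               [-1, -2],
               [0.1, 0.2]])      # 3x2
W = np.array([[3, 5, 7, 9],
              [4, 6, 8, 0]])     # 2x4

Z = np.matmul(AT, W)             # 3x4

# Entry (i, j) is row i of AT dotted with column j of W
Z_loop = np.zeros((AT.shape[0], W.shape[1]))
for i in range(AT.shape[0]):
    for j in range(W.shape[1]):
        Z_loop[i, j] = np.dot(AT[i, :], W[:, j])

print(Z.shape)                 # (3, 4)
print(np.allclose(Z, Z_loop))  # True
```

The result has as many rows as `AT` and as many columns as `W`, which is why the inner dimensions (here, 2) have to match.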
