**YOUR NAMES HERE**

Spring 2025

CS 444: Deep Learning

Project 1: Deep Neural Networks 

#### Week 1: VGG4 and building a deep learning library

In [80]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

plt.style.use(['seaborn-v0_8-colorblind', 'seaborn-v0_8-darkgrid'])
plt.rcParams.update({'font.size': 18})

np.set_printoptions(suppress=True, precision=4)

# Automatically reload your external source code
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Task 2: Implement familiar neural network layers: Dense, Dropout, Flatten, MaxPool2D, Conv2D

Now that the CIFAR10 and MNIST data loading and preprocessing pipeline is ready, implement and test the layers that will be assembled to create a deep VGG neural network.

There are many methods to implement here, but they are mostly small: either one line or just a few lines of code. *You implemented all these methods in CS343 in NumPy, so it should be helpful to have your CS343 CNN project open. The new element here is writing them in TensorFlow rather than NumPy.*

**NOTE:**
- For reasons that will become clear very quickly, it is **critical** to have your code run fast and on the GPU. This means **you must write 100% TensorFlow code** in `layers.py`. If you use any NumPy, your neural network may not work or may run too slowly to do anything useful!
- Ignore the methods that have (Week 2) or (Week 3) in the docstrings for now. You will implement those in a few weeks.

In [81]:
from layers import Layer

### 2a. Implement and test `Layer` class methods

These methods are in the `Layer` class in `layers.py`:
- Constructor
- Get/set methods: `get_name`, `get_act_fun_name`, `get_prev_layer_or_block`, `get_wts`, `get_b`, `has_wts`, `get_mode`, `set_mode`
- `compute_net_activation(net_in)`
- `__call__(x)`. Does the forward pass through the layer. Uses the functional API.
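
As a way to check the activation tests by hand, here is a minimal plain-Python sketch of the three activation functions used in this section (linear, ReLU, softmax). It is only a reference for the arithmetic: the function name `net_activation` and the list-of-lists input are assumptions made for illustration, and the version in `layers.py` still needs to be written in TensorFlow.

```python
import math


def net_activation(net_in, act_fun_name):
    """Apply an activation to a list of rows shaped (batch, units).

    Hypothetical helper for hand-checking values; not the layers.py API.
    """
    if act_fun_name == 'linear':
        # Linear activation passes the net input through unchanged
        return [row[:] for row in net_in]
    if act_fun_name == 'relu':
        # ReLU clamps negative values to zero
        return [[max(0.0, v) for v in row] for row in net_in]
    if act_fun_name == 'softmax':
        out = []
        for row in net_in:
            shift = max(row)  # subtract the row max for numerical stability
            exps = [math.exp(v - shift) for v in row]
            total = sum(exps)
            out.append([e / total for e in exps])
        return out
    raise ValueError(f'Unknown activation function: {act_fun_name}')


sample = [[-0.8321, -1.1737, 0.1416]]
relu_out = net_activation(sample, 'relu')
softmax_out = net_activation(sample, 'softmax')
```

Each softmax row sums to 1, which is a quick sanity check to run on your own output.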

#### Test: Basic `Layer` methods

**NOTE:** You should never instantiate `Layer` objects, but we do so here only to test your code.

In [82]:
# Test hidden_0
hidden_0 = Layer('Dense_0', activation='linear', prev_layer_or_block=None)
print(f'The layer is called {hidden_0.get_name()} and has {hidden_0.get_act_fun_name()} activation.')
print('You should see:')
print('The layer is called Dense_0 and has linear activation.\nThe previous layer is None')
print(f'The previous layer is {hidden_0.get_prev_layer_or_block()} and should be None.')
print(f'The layer is training?\n{hidden_0.get_mode()}. You should see')
print("<tf.Variable 'Variable:0' shape=() dtype=bool, numpy=False>")
print(f'Does the layer have wts? {hidden_0.has_wts()}. It should be False.')
print('Setting the network to training mode.')
hidden_0.set_mode(True)
print(f'The layer is training?\n{hidden_0.get_mode()}. You should see')
print("<tf.Variable 'Variable:0' shape=() dtype=bool, numpy=True>")


The layer is called Dense_0 and has linear activation.
You should see:
The layer is called Dense_0 and has linear activation.
The previous layer is None
The previous layer is None and should be None.
The layer is training?
<tf.Variable 'Variable:0' shape=() dtype=bool, numpy=False>. You should see
<tf.Variable 'Variable:0' shape=() dtype=bool, numpy=False>
Does the layer have wts? False. It should be False.
Setting the network to training mode.
The layer is training?
<tf.Variable 'Variable:0' shape=() dtype=bool, numpy=True>. You should see
<tf.Variable 'Variable:0' shape=() dtype=bool, numpy=True>


In [83]:
# Test hidden_1
hidden_1 = Layer('Dense_1', activation='relu', prev_layer_or_block=hidden_0)
print(f'The layer is called {hidden_1.get_name()} and has {hidden_1.get_act_fun_name()} activation.')
print('You should see:')
print('The layer is called Dense_1 and has relu activation.')
print(f'The previous layer is {hidden_1.get_prev_layer_or_block().get_name()} and should be Dense_0.')

The layer is called Dense_1 and has relu activation.
You should see:
The layer is called Dense_1 and has relu activation.
The previous layer is Dense_0 and should be Dense_0.


#### Test: computing net activation

In [84]:
tf.random.set_seed(0)
x_test_input = tf.random.uniform(shape=(2, 3), minval=-2, maxval=2)
print('Your activations from Dense_0:')
test_acts = hidden_0.compute_net_activation(x_test_input)
print(test_acts.numpy())
print('they should be:')
print('''[[-0.8321 -1.1737  0.1416]
 [ 0.245  -0.3333  1.2313]]''')
print('Your activations from Dense_1:')
test_acts = hidden_1.compute_net_activation(x_test_input)
print(test_acts.numpy())
print('they should be:')
print('''[[0.     0.     0.1416]
 [0.245  0.     1.2313]]''')
print('Your activations from Dense_1 (after changing act fun to softmax):')
hidden_1.act_fun_name = 'softmax'
test_acts = hidden_1.compute_net_activation(x_test_input)
print(test_acts.numpy())
print('they should be:')
print('''[[0.2295 0.163  0.6075]
 [0.2357 0.1322 0.6321]]''')

Your activations from Dense_0:
[[-0.8321 -1.1737  0.1416]
 [ 0.245  -0.3333  1.2313]]
they should be:
[[-0.8321 -1.1737  0.1416]
 [ 0.245  -0.3333  1.2313]]
Your activations from Dense_1:
[[0.     0.     0.1416]
 [0.245  0.     1.2313]]
they should be:
[[0.     0.     0.1416]
 [0.245  0.     1.2313]]
Your activations from Dense_1 (after changing act fun to softmax):
[[0.2295 0.163  0.6075]
 [0.2357 0.1322 0.6321]]
they should be:
[[0.2295 0.163  0.6075]
 [0.2357 0.1322 0.6321]]


### 2b. Implement and test `Dense` layer class methods

These methods are in the `Dense` class in `layers.py`:
- Constructor
- `has_wts`
- `init_params(input_shape)`
- `compute_net_input(x)`.

In [85]:
from layers import Dense

#### Test: Dense layer basics

In [86]:
tf.random.set_seed(1)
# x_test_input2 = tf.random.uniform(shape=(2, 7), minval=-2, maxval=2)
layer_0 = Layer('Layer_0', activation='linear', prev_layer_or_block=None)
hidden_2 = Dense('Dense_2', 7, activation='linear', prev_layer_or_block=layer_0, wt_scale=1e-1)
print(f'Dense layer is called {hidden_2.get_name()} and has previous layer {hidden_2.get_prev_layer_or_block().get_name()}')
print('You should see:')
print('Dense layer is called Dense_2 and has previous layer Layer_0')
print(f'Does the layer have weights? {hidden_2.has_wts()}. It should be True.')

print('----------Initializing wts and biases Test1/2----------')
hidden_2.init_params(input_shape=(2, 3))
print(f'Your wts:\n{hidden_2.get_wts().numpy()} They should be:')
print('''[[-0.1101  0.1546  0.0384 -0.088  -0.1225 -0.0981  0.0088]
 [-0.0203 -0.0558 -0.0721 -0.0626 -0.0715 -0.0348 -0.0336]
 [ 0.0183  0.1109  0.128  -0.0021 -0.032   0.0373  0.0253]]''')
print(f'Your biases:\n{hidden_2.get_b().numpy()} They should be:')
print('''[ 0.0403 -0.1088 -0.0063  0.1337  0.0712 -0.0489 -0.0764]''')
print('----------Initializing wts and biases Test2/2----------')
hidden_2.init_params(input_shape=(4, 5, 2, 3))
print(f'Your wts:\n{hidden_2.get_wts().numpy()} They should be:')
print('''[[-0.0457 -0.0407  0.0729 -0.0893  0.0313  0.0994 -0.1784]
 [-0.0522  0.0981 -0.0676  0.1146  0.0206 -0.0197  0.0538]
 [ 0.0764 -0.0836  0.0336  0.156   0.0723  0.1075 -0.0266]]''')
print(f'Your biases:\n{hidden_2.get_b().numpy()} They should be:')
print('''[ 0.1694  0.012  -0.1158  0.0173 -0.0714  0.069  -0.1091]''')

Dense layer is called Dense_2 and has previous layer Layer_0
You should see:
Dense layer is called Dense_2 and has previous layer Layer_0
Does the layer have weights? True. It should be True.
----------Initializing wts and biases Test1/2----------
Your wts:
[[-0.1101  0.1546  0.0384 -0.088  -0.1225 -0.0981  0.0088]
 [-0.0203 -0.0558 -0.0721 -0.0626 -0.0715 -0.0348 -0.0336]
 [ 0.0183  0.1109  0.128  -0.0021 -0.032   0.0373  0.0253]] They should be:
[[-0.1101  0.1546  0.0384 -0.088  -0.1225 -0.0981  0.0088]
 [-0.0203 -0.0558 -0.0721 -0.0626 -0.0715 -0.0348 -0.0336]
 [ 0.0183  0.1109  0.128  -0.0021 -0.032   0.0373  0.0253]]
Your biases:
[ 0.0403 -0.1088 -0.0063  0.1337  0.0712 -0.0489 -0.0764] They should be:
[ 0.0403 -0.1088 -0.0063  0.1337  0.0712 -0.0489 -0.0764]
----------Initializing wts and biases Test2/2----------
Your wts:
[[-0.0457 -0.0407  0.0729 -0.0893  0.0313  0.0994 -0.1784]
 [-0.0522  0.0981 -0.0676  0.1146  0.0206 -0.0197  0.0538]
 [ 0.0764 -0.0836  0.0336  0.156   0.0723

#### Test: Dense layer forward pass 1/2

This test calls the rest of your `Dense` layer methods via your `__call__` implementation in `Layer`.

In [87]:
tf.random.set_seed(1)
x_test_input2 = tf.random.uniform(shape=(4, 3), minval=-2, maxval=2)
print('Your activations from Dense_2 w/ linear:')
hidden_2.act_fun_name = 'linear'
test_acts = hidden_2(x_test_input2)
print(test_acts.numpy())
print('they should be:')
print('''[[ 0.1868  0.1802 -0.3044  0.4026 -0.0424 -0.0396  0.2024]
 [ 0.2684 -0.1067 -0.0595  0.0342 -0.0556  0.1206 -0.1223]
 [ 0.1449 -0.1445  0.0619 -0.0825  0.029   0.3322 -0.4797]
 [ 0.1567  0.0102 -0.1068  0.1138 -0.014   0.1491 -0.1667]]''')
print('Your activations from Dense_2 w/ relu:')
hidden_2.act_fun_name = 'relu'
test_acts = hidden_2(x_test_input2)
print(test_acts.numpy())
print('they should be:')
print('''[[0.1868 0.1802 0.     0.4026 0.     0.     0.2024]
 [0.2684 0.     0.     0.0342 0.     0.1206 0.    ]
 [0.1449 0.     0.0619 0.     0.029  0.3322 0.    ]
 [0.1567 0.0102 0.     0.1138 0.     0.1491 0.    ]]''')

Your activations from Dense_2 w/ linear:
[[ 0.1868  0.1802 -0.3044  0.4026 -0.0424 -0.0396  0.2024]
 [ 0.2684 -0.1067 -0.0595  0.0342 -0.0556  0.1206 -0.1223]
 [ 0.1449 -0.1445  0.0619 -0.0825  0.029   0.3322 -0.4797]
 [ 0.1567  0.0102 -0.1068  0.1138 -0.014   0.1491 -0.1667]]
they should be:
[[ 0.1868  0.1802 -0.3044  0.4026 -0.0424 -0.0396  0.2024]
 [ 0.2684 -0.1067 -0.0595  0.0342 -0.0556  0.1206 -0.1223]
 [ 0.1449 -0.1445  0.0619 -0.0825  0.029   0.3322 -0.4797]
 [ 0.1567  0.0102 -0.1068  0.1138 -0.014   0.1491 -0.1667]]
Your activations from Dense_2 w/ relu:
[[0.1868 0.1802 0.     0.4026 0.     0.     0.2024]
 [0.2684 0.     0.     0.0342 0.     0.1206 0.    ]
 [0.1449 0.     0.0619 0.     0.029  0.3322 0.    ]
 [0.1567 0.0102 0.     0.1138 0.     0.1491 0.    ]]
they should be:
[[0.1868 0.1802 0.     0.4026 0.     0.     0.2024]
 [0.2684 0.     0.     0.0342 0.     0.1206 0.    ]
 [0.1449 0.     0.0619 0.     0.029  0.3322 0.    ]
 [0.1567 0.0102 0.     0.1138 0.     0.1491 0.   

#### Test: Dense layer forward pass 2/2

This tests lazy initialization.
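
Lazy initialization means the layer creates its weights the first time it sees data, then reuses them on every later call. The plain-Python sketch below illustrates the pattern only. The class `LazyDense`, its `num_inits` counter, and the Gaussian weight draw are assumptions for illustration, not the `layers.py` design, and real layer code must use TensorFlow.

```python
import random


class LazyDense:
    """Hypothetical stand-in for a layer that creates its weights on first use."""

    def __init__(self, units, wt_scale=1e-1):
        self.units = units
        self.wt_scale = wt_scale
        self.wts = None   # created lazily
        self.b = None     # created lazily
        self.num_inits = 0

    def init_params(self, input_shape):
        # Number of input features is taken from the last dimension of the input shape
        num_feats = input_shape[-1]
        self.wts = [[random.gauss(0, self.wt_scale) for _ in range(self.units)]
                    for _ in range(num_feats)]
        self.b = [random.gauss(0, self.wt_scale) for _ in range(self.units)]
        self.num_inits += 1

    def __call__(self, x):
        # Only initialize when the parameters do not exist yet
        if self.wts is None:
            self.init_params((len(x), len(x[0])))
        # net_in = x @ wts + b, written out with plain lists
        return [[sum(xi * w for xi, w in zip(row, col)) + bj
                 for col, bj in zip(zip(*self.wts), self.b)]
                for row in x]


lazy_layer = LazyDense(units=7)
x_demo = [[0.5, -1.0, 2.0] for _ in range(4)]
first_acts = lazy_layer(x_demo)
for _ in range(5):
    later_acts = lazy_layer(x_demo)
```

If the weights were re-created on every call, repeated forward passes on the same input would give different activations, which is what the test below guards against.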

In [88]:
tf.random.set_seed(1)
x_test_input2 = tf.random.uniform(shape=(4, 3), minval=-2, maxval=2)
print('Your activations from Dense_2 w/ relu:')
for i in range(5):
    test_acts = hidden_2(x_test_input2)
print(test_acts.numpy())
print('they should be:')
print('''[[0.1868 0.1802 0.     0.4026 0.     0.     0.2024]
 [0.2684 0.     0.     0.0342 0.     0.1206 0.    ]
 [0.1449 0.     0.0619 0.     0.029  0.3322 0.    ]
 [0.1567 0.0102 0.     0.1138 0.     0.1491 0.    ]]''')

Your activations from Dense_2 w/ relu:
[[0.1868 0.1802 0.     0.4026 0.     0.     0.2024]
 [0.2684 0.     0.     0.0342 0.     0.1206 0.    ]
 [0.1449 0.     0.0619 0.     0.029  0.3322 0.    ]
 [0.1567 0.0102 0.     0.1138 0.     0.1491 0.    ]]
they should be:
[[0.1868 0.1802 0.     0.4026 0.     0.     0.2024]
 [0.2684 0.     0.     0.0342 0.     0.1206 0.    ]
 [0.1449 0.     0.0619 0.     0.029  0.3322 0.    ]
 [0.1567 0.0102 0.     0.1138 0.     0.1491 0.    ]]


#### Test: `__str__` and output shape

In [89]:
print(hidden_2)
print('The above should print:')
print('Dense layer output(Dense_2) shape: [4, 7]')

Dense layer output(Dense_2) shape: [4, 7]
The above should print:
Dense layer output(Dense_2) shape: [4, 7]


### 2c. Implement and test `Dropout` layer class methods

Please complete the `Dropout` class then test your implementation below.

**Note:** If you did not learn about dropout layers in CS343, please see the supplementary video on the notes website. Please come talk to me if you have any questions.
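
For reference, here is a minimal plain-Python sketch of inverted dropout, which appears to be the variant the test below expects: during training each value is dropped with probability `rate` and the survivors are scaled by `1 / (1 - rate)`, while outside of training the input passes through unchanged. The expected values in the test are consistent with this scaling (for example, 0.4229 / 0.3 ≈ 1.4097). The function name and the `rng` parameter are assumptions for illustration, and your `Dropout` class must use TensorFlow ops instead.

```python
import random


def dropout(x, rate, is_training, rng=random):
    """Inverted dropout on a list of rows; hypothetical helper for illustration."""
    if not is_training:
        # Dropout is the identity function when not training
        return [row[:] for row in x]
    keep_prob = 1.0 - rate
    # Keep each value with probability keep_prob and rescale it; otherwise zero it out
    return [[v / keep_prob if rng.random() < keep_prob else 0.0 for v in row]
            for row in x]


x_demo = [[1.5111, 0.4229], [-0.4197, -1.036]]
eval_out = dropout(x_demo, rate=0.7, is_training=False)
train_out = dropout(x_demo, rate=0.7, is_training=True, rng=random.Random(0))
```

Rescaling during training keeps the expected value of each activation the same in both modes, so no adjustment is needed at prediction time.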

In [90]:
from layers import Dropout

#### Test: `Dropout` layer

In [91]:
tf.random.set_seed(0)
print('-------------------------Test1/2-------------------------')
x_test_input_drop = tf.random.normal(shape=(4, 5))
dummy_layer = Layer('Blarg', activation='linear', prev_layer_or_block=None)
drop = Dropout('Dropout_Layer', rate=0.7, prev_layer_or_block=dummy_layer)
drop.set_mode(True)
drop_net_in = drop.compute_net_input(x_test_input_drop)
print(f'The layer preceding your dropout layer is {drop.get_prev_layer_or_block().get_name()}. It should be Blarg.')
print(f'Your Dropout net_in is:\n{drop_net_in}')
print('It should be EITHER:')
print('''[[ 0.      1.4097 -0.     -0.     -0.    ]
 [ 0.     -0.0466  0.      0.      0.    ]
 [-0.     -0.      2.6454 -0.     -3.1994]
 [-0.     -1.2027 -0.      1.0128  1.7384]]''')
print('or:')
print('''[[ 0.      0.     -1.399  -3.4535 -0.    ]
 [ 0.     -0.      3.9629  0.      0.    ]
 [-2.3524 -0.      0.     -2.325  -0.    ]
 [-3.0023 -0.     -0.      0.      0.    ]]''')
print('-------------------------Test2/2-------------------------')
drop.set_mode(False)
drop_net_act = drop(x_test_input_drop)
print(f'Your Dropout net_act is:\n{drop_net_act}')
print('It should be:')
print('''[[ 1.5111  0.4229 -0.4197 -1.036  -1.2368]
 [ 0.4703 -0.014   1.1889  0.6025  0.5997]
 [-0.7057 -0.433   0.7936 -0.6975 -0.9598]
 [-0.9007 -0.3608 -0.2238  0.3038  0.5215]]''')

-------------------------Test1/2-------------------------
The layer preceding your dropout layer is Blarg. It should be Blarg.
Your Dropout net_in is:
[[ 0.      1.4097 -0.     -0.     -0.    ]
 [ 0.     -0.0466  0.      0.      0.    ]
 [-0.     -0.      2.6454 -0.     -3.1994]
 [-0.     -1.2027 -0.      1.0128  1.7384]]
It should be EITHER:
[[ 0.      1.4097 -0.     -0.     -0.    ]
 [ 0.     -0.0466  0.      0.      0.    ]
 [-0.     -0.      2.6454 -0.     -3.1994]
 [-0.     -1.2027 -0.      1.0128  1.7384]]
or:
[[ 0.      0.     -1.399  -3.4535 -0.    ]
 [ 0.     -0.      3.9629  0.      0.    ]
 [-2.3524 -0.      0.     -2.325  -0.    ]
 [-3.0023 -0.     -0.      0.      0.    ]]
-------------------------Test2/2-------------------------
Your Dropout net_act is:
[[ 1.5111  0.4229 -0.4197 -1.036  -1.2368]
 [ 0.4703 -0.014   1.1889  0.6025  0.5997]
 [-0.7057 -0.433   0.7936 -0.6975 -0.9598]
 [-0.9007 -0.3608 -0.2238  0.3038  0.5215]]
It should be:
[[ 1.5111  0.4229 -0.4197 -1.036  -

#### Test: `Dropout` layer `__str__`

In [92]:
print(drop)
print('The above should print:')
print('Dropout layer output(Dropout_Layer) shape: [4, 5]')

Dropout layer output(Dropout_Layer) shape: [4, 5]
The above should print:
Dropout layer output(Dropout_Layer) shape: [4, 5]


### 2d. Implement and test `Flatten` layer class methods

In [93]:
from layers import Flatten

#### Test: `Flatten` forward pass

In [94]:
tf.random.set_seed(0)
x_test_input_flat = tf.random.normal(shape=(2, 1, 2, 3))
flat = Flatten('pancake', prev_layer_or_block=None)
flat_acts = flat(x_test_input_flat)
print(f'Your Flatten layer net_act is:\n{flat_acts}')
print('It should be:')
print('''[[ 1.5111  0.4229 -0.4197 -1.036  -1.2368  0.4703]
 [-0.014   1.1889  0.6025  0.5997 -0.7057 -0.433 ]]''')

Your Flatten layer net_act is:
[[ 1.5111  0.4229 -0.4197 -1.036  -1.2368  0.4703]
 [-0.014   1.1889  0.6025  0.5997 -0.7057 -0.433 ]]
It should be:
[[ 1.5111  0.4229 -0.4197 -1.036  -1.2368  0.4703]
 [-0.014   1.1889  0.6025  0.5997 -0.7057 -0.433 ]]


#### Test: `Flatten` layer `__str__`

In [95]:
print(flat)
print('The above should print:')
print('Flatten layer output(pancake) shape: [2, 6]')

Flatten layer output(pancake) shape: [2, 6]
The above should print:
Flatten layer output(pancake) shape: [2, 6]


### 2d. Implement and test `MaxPool2D` layer class methods

In [96]:
from layers import MaxPool2D

#### Test: `MaxPool2D` forward pass

In [97]:
tf.random.set_seed(0)
x_test_input_pool = tf.random.normal(shape=(2, 8, 8, 3))
print('Test 1/2...')
pool1 = MaxPool2D('swimming_pool', pool_size=(2, 2), strides=4, prev_layer_or_block=None)
print(f'Layer has weights? {pool1.has_wts()}. Should be False.')
pool_acts = pool1(x_test_input_pool)
print(f'Your MaxPool2D layer net_act is:\n{pool_acts}')
print('It should be:')
print('''[[[[1.5111 0.4229 0.4703]
   [0.7936 1.0278 0.3279]]

  [[2.0615 0.1364 1.2597]
   [0.6657 1.2569 0.4754]]]


 [[[0.711  0.4818 0.8706]
   [0.1162 0.268  0.8781]]

  [[1.6651 1.3796 1.2738]
   [2.2517 1.7184 0.3346]]]]''')

Test 1/2...
Layer has weights? False. Should be False.
Your MaxPool2D layer net_act is:
[[[[1.5111 0.4229 0.4703]
   [0.7936 1.0278 0.3279]]

  [[2.0615 0.1364 1.2597]
   [0.6657 1.2569 0.4754]]]


 [[[0.711  0.4818 0.8706]
   [0.1162 0.268  0.8781]]

  [[1.6651 1.3796 1.2738]
   [2.2517 1.7184 0.3346]]]]
It should be:
[[[[1.5111 0.4229 0.4703]
   [0.7936 1.0278 0.3279]]

  [[2.0615 0.1364 1.2597]
   [0.6657 1.2569 0.4754]]]


 [[[0.711  0.4818 0.8706]
   [0.1162 0.268  0.8781]]

  [[1.6651 1.3796 1.2738]
   [2.2517 1.7184 0.3346]]]]


In [98]:
tf.random.set_seed(1)
x_test_input_pool = tf.random.normal(shape=(1, 9, 9, 2))
print('Test 2/2...')
pool2 = MaxPool2D('swimming_pool', pool_size=(3, 3), strides=4, prev_layer_or_block=None)
pool_acts = pool2(x_test_input_pool)
print(f'Your MaxPool2D layer net_act is:\n{pool_acts}')
print('It should be:')
print('''[[[[2.1463 1.5458]
   [1.2249 0.0586]]

  [[1.6981 1.328 ]
   [2.4762 1.23  ]]]]''')

Test 2/2...
Your MaxPool2D layer net_act is:
[[[[2.1463 1.5458]
   [1.2249 0.0586]]

  [[1.6981 1.328 ]
   [2.4762 1.23  ]]]]
It should be:
[[[[2.1463 1.5458]
   [1.2249 0.0586]]

  [[1.6981 1.328 ]
   [2.4762 1.23  ]]]]


#### Test: `MaxPool2D` layer `__str__`

In [99]:
print(pool1)
print('The above should print:')
print('MaxPool2D layer output(swimming_pool) shape: [2, 2, 2, 3]')
print()
print(pool2)
print('The above should print:')
print('MaxPool2D layer output(swimming_pool) shape: [1, 2, 2, 2]')

MaxPool2D layer output(swimming_pool) shape: [2, 2, 2, 3]
The above should print:
MaxPool2D layer output(swimming_pool) shape: [2, 2, 2, 3]

MaxPool2D layer output(swimming_pool) shape: [1, 2, 2, 2]
The above should print:
MaxPool2D layer output(swimming_pool) shape: [1, 2, 2, 2]
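
The output shapes printed above can be checked by hand. The sketch below assumes no padding ("valid" pooling) and the same stride along both spatial axes, which is consistent with the two test cases. The helper name `pool_output_shape` is hypothetical.

```python
def pool_output_shape(input_shape, pool_size, strides):
    """Output shape of 2D max pooling, assuming valid padding (an assumption)."""
    batch, height, width, chans = input_shape
    # Count how many windows fit when sliding by `strides` without padding
    out_h = (height - pool_size[0]) // strides + 1
    out_w = (width - pool_size[1]) // strides + 1
    return [batch, out_h, out_w, chans]


shape1 = pool_output_shape((2, 8, 8, 3), pool_size=(2, 2), strides=4)
shape2 = pool_output_shape((1, 9, 9, 2), pool_size=(3, 3), strides=4)
```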


### 2e. Implement and test `Conv2D` layer class methods

These methods are in the `Conv2D` class in `layers.py`:
- Constructor
- `has_wts`
- `init_params(input_shape)`
- `compute_net_input(x)`.

In [100]:
from layers import Conv2D

#### Test: Conv2D layer basics

In [101]:
tf.random.set_seed(2)
x_test_input2 = tf.random.uniform(shape=(1, 2, 2, 1), minval=-2, maxval=2)
conv = Conv2D('convoluted', 5, kernel_size=(2, 2), activation='relu', prev_layer_or_block=None, wt_scale=1e-2)
print(f'Conv2D layer is called {conv.get_name()} and has previous layer {conv.get_prev_layer_or_block()}')
print('You should see:')
print('Conv2D layer is called convoluted and has previous layer None')
print(f'Does the layer have weights? {conv.has_wts()}. It should be True.')

print('----------Initializing wts and biases Test1/2----------')
conv.init_params(input_shape=x_test_input2.shape)
print(f'Your wts:\n{conv.get_wts().numpy()} They should be:')
print('''[[[[-0.0004  0.0097 -0.0111  0.003   0.0077]]

  [[ 0.0082  0.0284  0.0118 -0.0047 -0.0143]]]


 [[[ 0.0049 -0.0014  0.0105  0.0115  0.0049]]

  [[-0.0026  0.0006  0.0071 -0.0024  0.0181]]]]''')
print(f'Your biases:\n{conv.get_b().numpy()} They should be:')
print('''[ 0.0104  0.0062  0.0153 -0.0236 -0.0161]''')


Conv2D layer is called convoluted and has previous layer None
You should see:
Conv2D layer is called convoluted and has previous layer None
Does the layer have weights? True. It should be True.
----------Initializing wts and biases Test1/2----------
Your wts:
[[[[-0.0004  0.0097 -0.0111  0.003   0.0077]]

  [[ 0.0082  0.0284  0.0118 -0.0047 -0.0143]]]


 [[[ 0.0049 -0.0014  0.0105  0.0115  0.0049]]

  [[-0.0026  0.0006  0.0071 -0.0024  0.0181]]]] They should be:
[[[[-0.0004  0.0097 -0.0111  0.003   0.0077]]

  [[ 0.0082  0.0284  0.0118 -0.0047 -0.0143]]]


 [[[ 0.0049 -0.0014  0.0105  0.0115  0.0049]]

  [[-0.0026  0.0006  0.0071 -0.0024  0.0181]]]]
Your biases:
[ 0.0104  0.0062  0.0153 -0.0236 -0.0161] They should be:
[ 0.0104  0.0062  0.0153 -0.0236 -0.0161]


In [102]:
tf.random.set_seed(1)
print('----------Initializing wts and biases Test2/2----------')
conv.init_params(input_shape=(2, 5, 5, 2))
print(f'Your wts:\n{conv.get_wts().numpy()} They should be:')
print('''[[[[-0.011   0.0155  0.0038 -0.0088 -0.0122]
   [-0.0098  0.0009 -0.002  -0.0056 -0.0072]]

  [[-0.0063 -0.0072 -0.0035 -0.0034  0.0018]
   [ 0.0111  0.0128 -0.0002 -0.0032  0.0037]]]


 [[[ 0.0025  0.0064  0.0215 -0.0083 -0.009 ]
   [ 0.0139  0.0122  0.0006 -0.0049 -0.0082]]

  [[-0.0019 -0.0039 -0.0066 -0.0098  0.0039]
   [-0.0104 -0.0156 -0.0016 -0.0036 -0.002 ]]]]''')
print(f'Your biases:\n{conv.get_b().numpy()} They should be:')
print('''[ 0.004  -0.0109 -0.0006  0.0134  0.0071]''')

----------Initializing wts and biases Test2/2----------
Your wts:
[[[[-0.011   0.0155  0.0038 -0.0088 -0.0122]
   [-0.0098  0.0009 -0.002  -0.0056 -0.0072]]

  [[-0.0063 -0.0072 -0.0035 -0.0034  0.0018]
   [ 0.0111  0.0128 -0.0002 -0.0032  0.0037]]]


 [[[ 0.0025  0.0064  0.0215 -0.0083 -0.009 ]
   [ 0.0139  0.0122  0.0006 -0.0049 -0.0082]]

  [[-0.0019 -0.0039 -0.0066 -0.0098  0.0039]
   [-0.0104 -0.0156 -0.0016 -0.0036 -0.002 ]]]] They should be:
[[[[-0.011   0.0155  0.0038 -0.0088 -0.0122]
   [-0.0098  0.0009 -0.002  -0.0056 -0.0072]]

  [[-0.0063 -0.0072 -0.0035 -0.0034  0.0018]
   [ 0.0111  0.0128 -0.0002 -0.0032  0.0037]]]


 [[[ 0.0025  0.0064  0.0215 -0.0083 -0.009 ]
   [ 0.0139  0.0122  0.0006 -0.0049 -0.0082]]

  [[-0.0019 -0.0039 -0.0066 -0.0098  0.0039]
   [-0.0104 -0.0156 -0.0016 -0.0036 -0.002 ]]]]
Your biases:
[ 0.004  -0.0109 -0.0006  0.0134  0.0071] They should be:
[ 0.004  -0.0109 -0.0006  0.0134  0.0071]


#### Test: Conv2D layer forward pass 1/2

This tests your `Conv2D` layer methods via your `__call__` implementation in `Layer`.

In [103]:
tf.random.set_seed(1)
conv = Conv2D('convoluted', 3, kernel_size=(3, 3), activation='relu', prev_layer_or_block=None)
x_test_input2 = tf.random.uniform(shape=(1, 3, 3, 4), minval=-2, maxval=2)
print('Your activations from Conv2D w/ ReLU:')
test_acts = conv(x_test_input2)
print(test_acts.numpy())
print('they should be:')
print('''[[[[0.0027 0.0049 0.0064]
   [0.     0.     0.0073]
   [0.0081 0.0044 0.0006]]

  [[0.     0.     0.    ]
   [0.     0.0007 0.    ]
   [0.0044 0.     0.    ]]

  [[0.     0.     0.    ]
   [0.     0.0002 0.0082]
   [0.     0.     0.    ]]]]''')

Your activations from Conv2D w/ ReLU:
[[[[0.0027 0.0049 0.0064]
   [0.     0.     0.0073]
   [0.0081 0.0044 0.0006]]

  [[0.     0.     0.    ]
   [0.     0.0007 0.    ]
   [0.0044 0.     0.    ]]

  [[0.     0.     0.    ]
   [0.     0.0002 0.0082]
   [0.     0.     0.    ]]]]
they should be:
[[[[0.0027 0.0049 0.0064]
   [0.     0.     0.0073]
   [0.0081 0.0044 0.0006]]

  [[0.     0.     0.    ]
   [0.     0.0007 0.    ]
   [0.0044 0.     0.    ]]

  [[0.     0.     0.    ]
   [0.     0.0002 0.0082]
   [0.     0.     0.    ]]]]


#### Test: Conv2D layer forward pass 2/2

This tests lazy initialization.

In [104]:
tf.random.set_seed(1)
conv = Conv2D('convoluted', 3, kernel_size=(3, 3), activation='linear', prev_layer_or_block=None)
x_test_input2 = tf.random.uniform(shape=(1, 3, 3, 4), minval=-2, maxval=2)
print('Your activations from Conv2D w/ linear:')
for i in range(5):
    test_acts = conv(x_test_input2)
print(test_acts.numpy())
print('they should be:')
print('''[[[[ 0.0027  0.0049  0.0064]
   [-0.0061  0.      0.0073]
   [ 0.0081  0.0044  0.0006]]

  [[-0.0024 -0.0052 -0.01  ]
   [-0.0005  0.0007 -0.001 ]
   [ 0.0044 -0.0066 -0.004 ]]

  [[-0.0054 -0.0025 -0.005 ]
   [-0.0033  0.0002  0.0082]
   [-0.006  -0.0028 -0.0062]]]]''')

Your activations from Conv2D w/ linear:
[[[[ 0.0027  0.0049  0.0064]
   [-0.0061  0.      0.0073]
   [ 0.0081  0.0044  0.0006]]

  [[-0.0024 -0.0052 -0.01  ]
   [-0.0005  0.0007 -0.001 ]
   [ 0.0044 -0.0066 -0.004 ]]

  [[-0.0054 -0.0025 -0.005 ]
   [-0.0033  0.0002  0.0082]
   [-0.006  -0.0028 -0.0062]]]]
they should be:
[[[[ 0.0027  0.0049  0.0064]
   [-0.0061  0.      0.0073]
   [ 0.0081  0.0044  0.0006]]

  [[-0.0024 -0.0052 -0.01  ]
   [-0.0005  0.0007 -0.001 ]
   [ 0.0044 -0.0066 -0.004 ]]

  [[-0.0054 -0.0025 -0.005 ]
   [-0.0033  0.0002  0.0082]
   [-0.006  -0.0028 -0.0062]]]]


#### Test: `__str__` and output shape

In [105]:
print(conv)
print('The above should print:')
print('Conv2D layer output(convoluted) shape: [1, 3, 3, 3]')

Conv2D layer output(convoluted) shape: [1, 3, 3, 3]
The above should print:
Conv2D layer output(convoluted) shape: [1, 3, 3, 3]
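
The shapes in the `Conv2D` tests above can be reproduced with simple bookkeeping. The sketch below assumes a stride of 1 and "same" padding, so the spatial size is unchanged; that matches the `[1, 3, 3, 3]` output for a `(1, 3, 3, 4)` input with a 3x3 kernel. The helper name `conv2d_shapes` is hypothetical.

```python
def conv2d_shapes(input_shape, num_filters, kernel_size):
    """Weight, bias, and output shapes for a conv layer.

    Assumes stride 1 and 'same' padding (an assumption based on the test output).
    """
    batch, height, width, in_chans = input_shape
    # One kernel of size (k_h, k_w, in_chans) per filter
    wts_shape = (kernel_size[0], kernel_size[1], in_chans, num_filters)
    # One bias per filter
    b_shape = (num_filters,)
    out_shape = [batch, height, width, num_filters]
    num_params = kernel_size[0] * kernel_size[1] * in_chans * num_filters + num_filters
    return wts_shape, b_shape, out_shape, num_params


wts_shape, b_shape, out_shape, num_params = conv2d_shapes((1, 3, 3, 4), 3, (3, 3))
```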


## Task 3: Build the `DeepNetwork` and `VGG4` classes

With the layers implemented, now let's tackle the network itself. This is the last step before we can start training on some data!

We will divide up the job of creating a deep network into parent and child classes:
- `DeepNetwork` class (located in `network.py`): Serves as the parent class for VGG and all the neural networks that we develop this semester.
- `VGG4` class (located in `vgg_nets.py`): The minimal code necessary to define the `VGG4` architecture (i.e. what makes it unique vs other potential neural networks).

This division of labor will allow us to rapidly build many neural networks (`VGG4`, `VGG6`, `VGG9`, etc.) with minimal added code. All the "boilerplate" code that every network needs, like the `fit` method, will reside in the parent and be reused automatically without massive amounts of error-prone copy-pasting!
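
The parent/child split can be illustrated with a toy example. In this sketch the parent owns shared logic (here, a `predict` method) and the child only supplies the forward pass. The classes `BaseNet` and `TinyNet` are hypothetical stand-ins, not the actual `DeepNetwork` or `VGG4` code, and they use plain Python lists rather than tensors.

```python
class BaseNet:
    """Hypothetical parent: holds logic shared by every network."""

    def __init__(self, input_feats_shape):
        self.input_feats_shape = input_feats_shape

    def __call__(self, x):
        raise NotImplementedError('Child classes define the forward pass.')

    def predict(self, x):
        # Shared logic: run the child's forward pass, then take the argmax of each row
        net_act = self(x)
        return [max(range(len(row)), key=row.__getitem__) for row in net_act]


class TinyNet(BaseNet):
    """Hypothetical child: only defines what is unique to this architecture."""

    def __call__(self, x):
        # Toy forward pass: class 0 score is the feature sum, class 1 score is its negative
        return [[sum(row), -sum(row)] for row in x]


toy_net = TinyNet(input_feats_shape=(2,))
toy_preds = toy_net.predict([[1.0, 2.0], [-3.0, 0.5]])
```

Adding another architecture then only requires a new child class with its own `__call__`; `predict` is inherited unchanged.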

In [106]:
from layers import Dense
from network import DeepNetwork

### 3a. Implement `DeepNetwork` part 1/2

To help see the big picture, start implementing and testing only the following methods in `DeepNetwork`:

- `DeepNetwork` constructor
- `compile(loss, optimizer, lr, beta_1, print_summary)`: Just add the optimizer to the existing implementation.

#### Test: Constructor and `compile`

This code will be used for most forthcoming tests.

In [107]:
def create_test_net():
    # Build fake layers for testing
    test_layer1 = Dense('dense1', 3, prev_layer_or_block=None)
    test_layer2 = Dense('dense2', 4, prev_layer_or_block=test_layer1)
    test_layer_out = Dense('dense_out', 5, prev_layer_or_block=test_layer2)
    # Build fake net for testing
    test_net = DeepNetwork(input_feats_shape=(2,), reg=1.0)
    test_net.test_layer1 = test_layer1
    test_net.test_layer2 = test_layer2
    test_net.output_layer = test_layer_out

    def __call__(self, x):
        net_act = self.test_layer1(x)
        net_act = self.test_layer2(net_act)
        net_act = self.output_layer(net_act)
        return net_act

    setattr(DeepNetwork, '__call__', __call__)
    return test_net

test_net = create_test_net()
test_net.compile(lr=0.1)

---------------------------------------------------------------------------
Dense layer output(dense_out) shape: [1, 5]
Dense layer output(dense2) shape: [1, 4]
Dense layer output(dense1) shape: [1, 3]
---------------------------------------------------------------------------


Executing the above cell should print out:


```
---------------------------------------------------------------------------
Dense layer output(dense_out) shape: [1, 5]
Dense layer output(dense2) shape: [1, 4]
Dense layer output(dense1) shape: [1, 3]
---------------------------------------------------------------------------
```

In [108]:
print(f'Network reg is {test_net.reg} and should be 1.0.')
print(f'Input feature shape is {test_net.input_feats_shape} and should be (2,).')
print(f'Optimizer is {test_net.opt.name} and should be adam.')
print(f'Optimizer lr is {float(test_net.opt.learning_rate):.3f} and should be 0.100.')
print(f'Number of parameters discovered in network is {len(test_net.all_net_params)} and should be 6.')

Network reg is 1.0 and should be 1.0.
Input feature shape is (2,) and should be (2,).
Optimizer is adam and should be adam.
Optimizer lr is 0.100 and should be 0.100.
Number of parameters discovered in network is 6 and should be 6.


### 3b. Implement `VGG4` network

Here is an overview of the layers in the `VGG4` network:

Conv2D → Conv2D → MaxPool2D → Flatten → Dense → Dropout → Dense

Since the parent class `DeepNetwork` will handle training, getting predictions, and other tasks, all that needs to be done in the `VGG4` class (in `vgg_nets.py`) is implement:
- constructor: Where you create and configure network layers and assign them to instance variables.
- `__call__(x)`: Performs a forward pass through your `VGG4` network with data samples `x`. If your `VGG4` network is called `vgg` and the data is called `data`, recall that you would call the `__call__` method like this: `vgg(data)`.
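
Before writing the constructor, it can help to trace the expected shape after each layer. The sketch below does this bookkeeping in plain Python. It assumes the conv layers preserve spatial size and that the pooling layer halves height and width (2x2 window, stride 2). The default values `filters=64` and `dense_units=128` are inferred from the layer summary shown in the test below, and the function name `vgg4_shapes` is hypothetical.

```python
def vgg4_shapes(input_feats_shape, C, filters=64, dense_units=128):
    """Trace per-layer output shapes for one sample (batch size 1).

    Assumes 'same' conv padding and 2x2 pooling with stride 2 (assumptions).
    """
    height, width, _ = input_feats_shape
    shapes = []
    shapes.append(('Conv2D', [1, height, width, filters]))
    shapes.append(('Conv2D', [1, height, width, filters]))
    # Pooling halves each spatial dimension
    height, width = height // 2, width // 2
    shapes.append(('MaxPool2D', [1, height, width, filters]))
    # Flatten merges all non-batch dimensions
    shapes.append(('Flatten', [1, height * width * filters]))
    shapes.append(('Dense', [1, dense_units]))
    shapes.append(('Dropout', [1, dense_units]))
    shapes.append(('Dense', [1, C]))
    return shapes


cifar_shapes = vgg4_shapes((32, 32, 3), C=4)
small_shapes = vgg4_shapes((8, 8, 3), C=4, filters=5, dense_units=10)
```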

In [109]:
from vgg_nets import VGG4

#### Test: `VGG4` forward pass shapes and architecture

In [110]:
test_vgg1 = VGG4(C=4, input_feats_shape=(32, 32, 3))
print('My beautiful VGG4 network!')
test_vgg1.compile()

My beautiful VGG4 network!
---------------------------------------------------------------------------
Dense layer output(output) shape: [1, 4]
Dropout layer output(dropout1) shape: [1, 128]
Dense layer output(dense1) shape: [1, 128]
Flatten layer output(flat) shape: [1, 16384]
MaxPool2D layer output(maxpool1) shape: [1, 16, 16, 64]
Conv2D layer output(conv2) shape: [1, 32, 32, 64]
Conv2D layer output(conv1) shape: [1, 32, 32, 64]
---------------------------------------------------------------------------


The above cell should print out:

```
My beautiful VGG4 network!
---------------------------------------------------------------------------
Dense layer output(output) shape: [1, 4]
Dropout layer output(dropout1) shape: [1, 128]
Dense layer output(dense1) shape: [1, 128]
Flatten layer output(flat) shape: [1, 16384]
MaxPool2D layer output(maxpool1) shape: [1, 16, 16, 64]
Conv2D layer output(conv2) shape: [1, 32, 32, 64]
Conv2D layer output(conv1) shape: [1, 32, 32, 64]
---------------------------------------------------------------------------
```

The network layer names in the parentheses probably will be different (*your chosen layer names*) and that's ok — it should have no bearing on the functionality of your network.

#### Test: `VGG4` forward pass output

Be cautious about small errors in the activations — any discrepancy may suggest a potential bug in your code.

In [111]:
tf.random.set_seed(0)
test_x = tf.random.normal(shape=(2, 8, 8, 3))
test_vgg2 = VGG4(C=4, input_feats_shape=(8, 8, 3), filters=5, dense_units=10, wt_scale=1e-1)
test_vgg2.compile()
test_net_act_out = test_vgg2(test_x)
print(f'Your output activations after the forward pass are:\n{test_net_act_out}')



---------------------------------------------------------------------------
Dense layer output(output) shape: [1, 4]
Dropout layer output(dropout1) shape: [1, 10]
Dense layer output(dense1) shape: [1, 10]
Flatten layer output(flat) shape: [1, 80]
MaxPool2D layer output(maxpool1) shape: [1, 4, 4, 5]
Conv2D layer output(conv2) shape: [1, 8, 8, 5]
Conv2D layer output(conv1) shape: [1, 8, 8, 5]
---------------------------------------------------------------------------
Your output activations after the forward pass are:
[[0.2589 0.2687 0.2406 0.2318]
 [0.2661 0.2601 0.2397 0.2342]]


The above cell should print out:

```
---------------------------------------------------------------------------
Dense layer output(output) shape: [1, 4]
Dropout layer output(dropout1) shape: [1, 10]
Dense layer output(dense1) shape: [1, 10]
Flatten layer output(flat) shape: [1, 80]
MaxPool2D layer output(maxpool1) shape: [1, 4, 4, 5]
Conv2D layer output(conv2) shape: [1, 8, 8, 5]
Conv2D layer output(conv1) shape: [1, 8, 8, 5]
---------------------------------------------------------------------------
Your output activations after the forward pass are:
[[0.2589 0.2687 0.2406 0.2318]
 [0.2661 0.2601 0.2397 0.2342]]
```

### 3c. Implement `DeepNetwork` part 2/2

Now that the `VGG4` network has been built and tested, let's make it functional so that we can train it with some data!

Implement the following methods in `DeepNetwork` to finish up the class:
- `set_layer_training_mode(is_training)`: Configures each net layer to operate in training mode or non-training mode.
- `accuracy(y_true, y_pred)`: Compute the proportion of predicted classes `y_pred` that match the true classes `y_true`.
- `predict(x, output_layer_net_act)`: Perform the forward pass on data samples `x` and predict their classes.
- `loss(out_net_act, y, eps)`: Compute the general cross-entropy loss. **See equation below.**
- `update_params(tape, loss)`: Perform one "step" of backprop — update the network weights and biases based on gradients recorded on the gradient tape.
- `train_step(x_batch, y_batch)`: Do one "step" of training: a forward pass followed by a backward pass.
- `test_step(x_batch, y_batch)`: Do one "step" of testing/prediction.
- `fit(x, y, x_val, y_val, batch_size, max_epochs, val_every, verbose)`: Train the deep neural network using training and (optionally) validation data.

Except for one case noted in `fit`, **your code should be implemented in 100% TensorFlow** in `DeepNetwork`!
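As a reminder of how gradient recording and a parameter update fit together in TensorFlow, here is a minimal sketch on a toy linear model. It is not the `DeepNetwork` implementation: the variables `wts` and `b`, the squared-error loss, and the manual SGD update via `assign_sub` are stand-ins chosen for illustration. Your own `update_params` should follow the docstring in your project code.

```python
import tensorflow as tf

# Hypothetical stand-ins for one layer's parameters (not the project's classes).
wts = tf.Variable([[0.5], [-0.5]])
b = tf.Variable([0.0])
params = [wts, b]
lr = 0.1

x = tf.constant([[1.0, 2.0]])
y = tf.constant([[1.0]])

# Forward pass: operations on the variables are recorded on the tape.
with tf.GradientTape() as tape:
    pred = x @ wts + b
    loss = tf.reduce_mean((pred - y) ** 2)

# Backward pass: gradients of the loss with respect to each parameter.
grads = tape.gradient(loss, params)

# Plain gradient descent update, written out by hand for clarity.
for param, grad in zip(params, grads):
    param.assign_sub(lr * grad)

print(loss.numpy(), wts.numpy().ravel(), b.numpy())
```

The forward pass must happen inside the `with` block. Otherwise the tape records nothing and `tape.gradient` returns `None` for every parameter.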

#### General cross-entropy loss

Here is a refresher on the equation for the general cross-entropy loss $L$ with int-coded classes $y_i$ and output layer net acts $z_i$ for sample $i$. You should implement this equation in the `loss` method.

$$
L = -\frac{1}{N} \sum_{i=1}^N \log \left (z_{i, y_i} + \epsilon \right )
$$

The only new element in the above equation is the addition of $\epsilon$, a very small fudge factor that prevents taking the log of 0 in rare cases. $N$ is the number of samples.

**NOTE:** You already implemented this in CS343, so you can adapt your code. But remember, the code should be written in TensorFlow rather than NumPy here.
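One possible way to select $z_{i, y_i}$ for every sample without NumPy or a Python loop is `tf.gather` with `batch_dims=1`. The sketch below is a hedged illustration. The function name `cross_entropy_sketch` and the default `eps` value are assumptions, not the signature required by `DeepNetwork.loss`.

```python
import tensorflow as tf

def cross_entropy_sketch(out_net_act, y, eps=1e-16):
    """Illustrative cross-entropy; name and default eps are assumptions."""
    # For each row i, pick the activation in column y[i], i.e. z_{i, y_i}.
    correct_class_act = tf.gather(out_net_act, y, axis=1, batch_dims=1)
    # Negative mean log of the correct-class activations.
    return -tf.reduce_mean(tf.math.log(correct_class_act + eps))

# Two samples, three classes; each row sums to 1 like a softmax output.
acts = tf.constant([[0.7, 0.2, 0.1],
                    [0.1, 0.3, 0.6]])
labels = tf.constant([0, 2])

loss = cross_entropy_sketch(acts, labels, eps=0.0)
print(loss.numpy())  # -(ln 0.7 + ln 0.6) / 2, roughly 0.4338
```

Multiplying by a one-hot encoding of `y` and summing across classes would select the same values. The gather version simply avoids building the one-hot matrix.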

#### Test: `set_layer_training_mode`

**NOTE:** The following tests go back to using the simpler network test code from Task 3a.

In [112]:
test_net = create_test_net()
test_net.compile()

test_net.set_layer_training_mode(False)
should_be_false = [test_net.test_layer1.get_mode(),
                   test_net.test_layer2.get_mode(),
                   test_net.output_layer.get_mode()]

print(f'All net layers should NOT be in training mode. Are they in training mode? {tf.reduce_any(should_be_false)}')

test_net.set_layer_training_mode(True)
should_be_true = [test_net.test_layer1.get_mode(),
                  test_net.test_layer2.get_mode(),
                  test_net.output_layer.get_mode()]

print(f'All net layers should be in training mode. Are they in training mode? {tf.reduce_all(should_be_true)}')

test_net.set_layer_training_mode(False)
should_be_false = [test_net.test_layer1.get_mode(),
                   test_net.test_layer2.get_mode(),
                   test_net.output_layer.get_mode()]

print(f'All net layers should NOT be in training mode. Are they in training mode? {tf.reduce_any(should_be_false)}')

---------------------------------------------------------------------------
Dense layer output(dense_out) shape: [1, 5]
Dense layer output(dense2) shape: [1, 4]
Dense layer output(dense1) shape: [1, 3]
---------------------------------------------------------------------------
All net layers should NOT be in training mode. Are they in training mode? False
All net layers should be in training mode. Are they in training mode? True
All net layers should NOT be in training mode. Are they in training mode? False


#### Test: `accuracy` and `loss`

In [79]:
tf.random.set_seed(0)
test_y_true = tf.constant([1, 0, 0, 1, 2, 1, 1, 0, 0, 1, 2])
test_y_pred = tf.constant([1, 0, 2, 1, 0, 1, 1, 0, 0, 1, 0])
test_acc = test_net.accuracy(test_y_true, test_y_pred)
print(f'Your test acc is {test_acc:.4f} and it should be 0.7273.')

test_net_acts = tf.random.uniform(shape=(2, 5))
test_y = tf.constant([0, 2])
test_loss = test_net.loss(test_net_acts, test_y, eps=1e-1)
print(f'Your test loss is {test_loss:.4f} and it should be 0.4215.')


Your test acc is 0.7273 and it should be 0.7273.
Your test loss is 0.4215 and it should be 0.4215.
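The expected accuracy of 0.7273 can be verified by hand: 8 of the 11 predictions match the true classes, and 8/11 ≈ 0.7273. The sketch below reproduces that arithmetic in TensorFlow. It is a check on the test values, not necessarily how your `accuracy` method must be written.

```python
import tensorflow as tf

y_true = tf.constant([1, 0, 0, 1, 2, 1, 1, 0, 0, 1, 2])
y_pred = tf.constant([1, 0, 2, 1, 0, 1, 1, 0, 0, 1, 0])

# 1.0 where the prediction matches the true class, 0.0 otherwise.
matches = tf.cast(tf.equal(y_true, y_pred), tf.float32)
num_correct = tf.reduce_sum(matches)
acc = tf.reduce_mean(matches)

print(f'{int(num_correct)} of {y_true.shape[0]} correct, acc = {acc:.4f}')
```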


#### Test: `update_params` and `train_step`

In [119]:
tf.random.set_seed(0)  # Make sure everyone's wts/biases are the same
test_net = create_test_net()
test_net.compile(lr=10.0)  # this is an insanely high lr just for testing :)

tf.random.set_seed(1)
test_x = tf.random.uniform(shape=(3, 2))
test_y_true = tf.constant([3, 0, 2])

print(28*'-', 'Before train step', 28*'-')
wts_0 = test_net.test_layer1.wts.numpy()
b_2 = test_net.output_layer.b.numpy()
print(f'1st layer wts:\n{wts_0}')
print('and they should be:')
print('''[[ 0.0015  0.0004 -0.0004]
 [-0.001  -0.0012  0.0005]]''')
print(f'Output layer bias:\n{b_2}')
print('and they should be:')
print('''[-0.0012 -0.0007 -0.0002  0.0019 -0.0005]''')

loss = test_net.train_step(test_x, test_y_true)

print(28*'-', 'After train step', 28*'-')
wts_0 = test_net.test_layer1.wts.numpy()
b_2 = test_net.output_layer.b.numpy()
print(f'1st layer wts:\n{wts_0}')
print('and they should be:')
print('''[[-9.9775 -9.9253  9.9247]
 [ 9.9693  9.9732 -9.9327]]''')
print(f'Output layer bias:\n{b_2}')
print('and they should be:')
print('''[-0.0012 -0.0007 -0.0002 10.0019 -0.0005]''')

print()
print(f'The loss on the test mini-batch is {loss.numpy():.4f} and should be 26.6442')

---------------------------------------------------------------------------
Dense layer output(dense_out) shape: [1, 5]
Dense layer output(dense2) shape: [1, 4]
Dense layer output(dense1) shape: [1, 3]
---------------------------------------------------------------------------
---------------------------- Before train step ----------------------------
1st layer wts:
[[ 0.0015  0.0004 -0.0004]
 [-0.001  -0.0012  0.0005]]
and they should be:
[[ 0.0015  0.0004 -0.0004]
 [-0.001  -0.0012  0.0005]]
Output layer bias:
[-0.0012 -0.0007 -0.0002  0.0019 -0.0005]
and they should be:
[-0.0012 -0.0007 -0.0002  0.0019 -0.0005]
---------------------------- After train step ----------------------------
1st layer wts:
[[-9.9775 -9.9253  9.9247]
 [ 9.9693  9.9732 -9.9327]]
and they should be:
[[-9.9775 -9.9253  9.9247]
 [ 9.9693  9.9732 -9.9327]]
Output layer bias:
[-0.0012 -0.0007 -0.0002 10.0019 -0.0005]
and they should be:
[-0.0012 -0.0007 -0.0002 10.0019 -0.0005]

The loss on the test mini-batch is

#### Test: `test_step`

In [120]:
tf.random.set_seed(0)  # Make sure everyone's wts/biases are the same
test_net = create_test_net()
test_net.compile()

tf.random.set_seed(1)
test_x = tf.random.uniform(shape=(5, 2))
test_y_true = tf.constant([3, 1, 2, 1, 0])

test_acc, test_loss = test_net.test_step(test_x, test_y_true)

print(f'Your acc is {test_acc:.3f} and it should be 0.200')
print(f'Your loss is {test_loss:.3f} and it should be 30.723')


---------------------------------------------------------------------------
Dense layer output(dense_out) shape: [1, 5]
Dense layer output(dense2) shape: [1, 4]
Dense layer output(dense1) shape: [1, 3]
---------------------------------------------------------------------------
Your acc is 0.200 and it should be 0.200
Your loss is 30.723 and it should be 30.723


**NOTE:** Your `fit` method will be tested in the next task!