# CS1470/2470 HW2: Testing Notebook


We love unit tests.

---

Due to the scale of this assignment, we've included a notebook for unit testing the individual 
components that you will code. Feel free to edit this notebook however you want (e.g. adding print
statements to inspect shapes and intermediate values). Please note that your outputs may differ 
slightly from the expected outputs, both in shape (due to broadcasting) and in value (due to 
floating-point rounding); use your best judgement if differences do appear, and if you still 
aren't sure, feel free to post on Ed!
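
If you'd rather not eyeball the numbers, a tolerance-based comparison can help. The cell below is only a sketch: `roughly_equal` is a hypothetical helper (it is not part of Beras or the autograder), and the tolerances are arbitrary defaults you may want to loosen or tighten.

```python
import numpy as np

def roughly_equal(actual, expected, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: compare values up to a tolerance, ignoring size-1 axes."""
    actual = np.squeeze(np.asarray(actual, dtype=float))
    expected = np.squeeze(np.asarray(expected, dtype=float))
    if actual.shape != expected.shape:
        return False
    return bool(np.allclose(actual, expected, rtol=rtol, atol=atol))

# Same values, but one side has an extra leading axis and float32 rounding noise
expected = [-1.0499999, 1.0499999]
actual = np.array([[-1.05, 1.05]], dtype=np.float32)
print(roughly_equal(actual, expected))
```

Squeezing hides extra size-1 dimensions, so use it only when you have already decided that such a shape difference is harmless.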

Many of the functions you will write feed into one another, which can make debugging really 
frustrating. Hopefully these tests will help you pinpoint exactly where your model is correct 
or incorrect, and thus lead to a lot less hair pulling! :D 

In [None]:
%load_ext autoreload
%autoreload 1
%aimport Beras, assignment

import tensorflow as tf
import numpy as np

### Dense Testing

In [None]:
from Beras.layers import Dense


# Define inputs for testing
in_size, out_size = 10, 5
inputs = np.arange(-5, 5).reshape((1, in_size))

# Define implementation
student_implementation = Dense(in_size, out_size, initializer="zero")
outputs = student_implementation(inputs)

print("Dense Input:")
print(inputs)
print("\nDense Output:")
print(outputs)
print("\nWeight Gradients:")
print(student_implementation.weight_gradients())
print("\nWeight Gradient Shapes:")
print(student_implementation.weight_gradients()[0].shape)
print(student_implementation.weight_gradients()[1].shape)
print("\nInput Gradients:")
print(student_implementation.input_gradients())
print("\nInput Gradient Shapes:")
print(student_implementation.input_gradients()[0].shape)

**> Expected Output** (double-click this cell to reveal)
<!--
```
Dense Input:
[[-5 -4 -3 -2 -1  0  1  2  3  4]]

Dense Output:
[[0. 0. 0. 0. 0.]]

Weight Gradients:
[Variable([[[-5., -5., -5., -5., -5.],
           [-4., -4., -4., -4., -4.],
           [-3., -3., -3., -3., -3.],
           [-2., -2., -2., -2., -2.],
           [-1., -1., -1., -1., -1.],
           [ 0.,  0.,  0.,  0.,  0.],
           [ 1.,  1.,  1.,  1.,  1.],
           [ 2.,  2.,  2.,  2.,  2.],
           [ 3.,  3.,  3.,  3.,  3.],
           [ 4.,  4.,  4.,  4.,  4.]]]), Variable([[1., 1., 1., 1., 1.]])]

Weight Gradient Shapes:
(1, 10, 5)
(1, 5)

Input Gradients:
[Variable([[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]])]

Input Gradient Shapes:
(10, 5)
```-->
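
To see where the expected shapes come from, here is a plain NumPy sketch of the quantities this test prints. It is written independently of Beras: the variable names and the per-sample layout of the weight gradient are assumptions inferred from the expected output, so your `Dense` may organize things differently.

```python
import numpy as np

in_size, out_size = 10, 5
x = np.arange(-5, 5, dtype=float).reshape(1, in_size)  # (batch, in)
W = np.zeros((in_size, out_size))                       # what a "zero" initializer would give
b = np.zeros((1, out_size))

y = x @ W + b                                           # forward pass, (batch, out)

# Local gradients of y with respect to each quantity (assumed layout)
dy_dW = x[:, :, None] * np.ones((1, 1, out_size))       # (batch, in, out); entry [n, i, j] = x[n, i]
dy_db = np.ones((1, out_size))                          # (1, out)
dy_dx = W                                               # (in, out)

print(y.shape, dy_dW.shape, dy_db.shape, dy_dx.shape)
```

With all-zero weights the output and the input gradient are both zero, which is why only the weight gradient carries the input values.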

### Metrics Testing

In [None]:
from Beras.metrics import CategoricalAccuracy

# Define inputs for testing
y_pred = [[0.1, 0.9, 0.8], [0.05, 0.95, 0]]
y_true = [[0, 0, 1], [0, 1, 0]]

# Define implementations for comparison
student_implementation = CategoricalAccuracy() 
keras_implementation = tf.keras.metrics.CategoricalAccuracy()

print("Inputs:")

print("Predictions")
print(y_pred)

print("Labels")
print(y_true)

print()

print("Outputs:")
print("Student Implementation")
print(student_implementation(y_true, y_pred))
print("Keras Implementation")
print(keras_implementation(y_true, y_pred).numpy())
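
This test has no hidden expected output, so here is a small NumPy sketch of what categorical accuracy computes: the fraction of rows where the argmax of the prediction matches the argmax of the one-hot label. The function name is made up for illustration and is not the Beras API.

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    """Illustrative only: fraction of rows whose predicted class matches the label."""
    true_idx = np.argmax(y_true, axis=-1)
    pred_idx = np.argmax(y_pred, axis=-1)
    return float(np.mean(true_idx == pred_idx))

y_pred = [[0.1, 0.9, 0.8], [0.05, 0.95, 0]]
y_true = [[0, 0, 1], [0, 1, 0]]
print(categorical_accuracy(y_true, y_pred))  # 0.5
```

For the inputs in this test, the first row is misclassified (predicted class 1, label class 2) and the second is correct, so both implementations should report 0.5.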



### Activations Testing

#### LeakyReLU

In [None]:
from Beras.activations import LeakyReLU

# Define inputs for testing (batch size of 2, each a 3 x 4 array)
inputs = np.arange(-12, 12, dtype=np.float64).reshape(2, 3, 4)

# Define implementation
student_implementation = LeakyReLU()
outputs = student_implementation(inputs)

print("Activation Input:")
print(inputs)
print("\nActivation Output:")
print(outputs)
print(f"Shape of {outputs.shape}")
print("\nInput Gradients:")
print(np.array(student_implementation.input_gradients()))
print("\nInput Gradient Shapes:")
print(np.array(student_implementation.input_gradients()).shape)
print("\nCompose to Input:")
print(student_implementation.compose_to_input(outputs))
print("\nCompose to Input Shapes:")
print(np.array(student_implementation.compose_to_input(outputs)).shape)


**> Expected Output**
<!--
```
Activation Input:
[[[-12. -11. -10.  -9.]
  [ -8.  -7.  -6.  -5.]
  [ -4.  -3.  -2.  -1.]]

 [[  0.   1.   2.   3.]
  [  4.   5.   6.   7.]
  [  8.   9.  10.  11.]]]

Activation Output:
[[[-3.6 -3.3 -3.  -2.7]
  [-2.4 -2.1 -1.8 -1.5]
  [-1.2 -0.9 -0.6 -0.3]]

 [[ 0.   1.   2.   3. ]
  [ 4.   5.   6.   7. ]
  [ 8.   9.  10.  11. ]]]
Shape of (2, 3, 4)

Input Gradients:
[[[[0.3 0.3 0.3 0.3]
   [0.3 0.3 0.3 0.3]
   [0.3 0.3 0.3 0.3]]

  [[0.  1.  1.  1. ]
   [1.  1.  1.  1. ]
   [1.  1.  1.  1. ]]]]

Input Gradient Shapes:
(1, 2, 3, 4)

Compose to Input:
[[[-1.08 -0.99 -0.9  -0.81]
  [-0.72 -0.63 -0.54 -0.45]
  [-0.36 -0.27 -0.18 -0.09]]

 [[ 0.    1.    2.    3.  ]
  [ 4.    5.    6.    7.  ]
  [ 8.    9.   10.   11.  ]]]

Compose to Input Shapes:
(2, 3, 4)
```-->
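
If your numbers disagree, it may help to compare against a standalone NumPy version. The sketch below assumes a slope of 0.3 for negative inputs and a gradient of 0 at exactly `x == 0`, both inferred from the expected output above; the function names are illustrative, not the Beras API.

```python
import numpy as np

def leaky_relu(x, alpha=0.3):
    """x for positive inputs, alpha * x otherwise (alpha=0.3 is an assumed default)."""
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.3):
    """Elementwise derivative; 0 at x == 0 to match the expected output above."""
    return np.where(x > 0, 1.0, np.where(x < 0, alpha, 0.0))

x = np.arange(-12, 12, dtype=np.float64).reshape(2, 3, 4)
out = leaky_relu(x)
grad = leaky_relu_grad(x)
composed = out * grad  # elementwise product of an upstream value with the local gradient

print(out[0, 0], grad[1, 0], composed[0, 0])
```

The extra leading dimension in the expected gradient shape `(1, 2, 3, 4)` comes from wrapping a list of one gradient array in `np.array`, not from the activation itself.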

#### Softmax (2470 Only)

In [None]:
from Beras.activations import Softmax

# Define inputs for testing (batch size of 4, with 4 elements each)
inputs = np.arange(-8, 8, dtype=np.float64).reshape(4, 4)

# Define implementation
student_implementation = Softmax()
outputs = student_implementation(inputs)

print("Input:")
print(inputs)
print("\nActivation Outputs:")
print(outputs)
print(f"Shape of {outputs.shape}")
print("\nInput Gradients:")
print(np.array(student_implementation.input_gradients()))
print("\nInput Gradient Shapes:")
print(np.array(student_implementation.input_gradients()).shape)

**> Expected Output**
<!--
```
Input:
[[-8. -7. -6. -5.]
 [-4. -3. -2. -1.]
 [ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]]

Activation Outputs:
[[0.0320586  0.08714432 0.23688282 0.64391426]
 [0.0320586  0.08714432 0.23688282 0.64391426]
 [0.0320586  0.08714432 0.23688282 0.64391426]
 [0.0320586  0.08714432 0.23688282 0.64391426]]
Shape of (4, 4)

Input Gradients:
[[[[ 0.03103085 -0.00279373 -0.00759413 -0.02064299]
   [-0.00279373  0.07955019 -0.02064299 -0.05611347]
   [-0.00759413 -0.02064299  0.18076935 -0.15253222]
   [-0.02064299 -0.05611347 -0.15253222  0.22928869]]

  [[ 0.03103085 -0.00279373 -0.00759413 -0.02064299]
   [-0.00279373  0.07955019 -0.02064299 -0.05611347]
   [-0.00759413 -0.02064299  0.18076935 -0.15253222]
   [-0.02064299 -0.05611347 -0.15253222  0.22928869]]

  [[ 0.03103085 -0.00279373 -0.00759413 -0.02064299]
   [-0.00279373  0.07955019 -0.02064299 -0.05611347]
   [-0.00759413 -0.02064299  0.18076935 -0.15253222]
   [-0.02064299 -0.05611347 -0.15253222  0.22928869]]

  [[ 0.03103085 -0.00279373 -0.00759413 -0.02064299]
   [-0.00279373  0.07955019 -0.02064299 -0.05611347]
   [-0.00759413 -0.02064299  0.18076935 -0.15253222]
   [-0.02064299 -0.05611347 -0.15253222  0.22928869]]]]

Input Gradient Shapes:
(1, 4, 4, 4)
```-->
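
Every row of the input differs from the previous one by a constant shift, and softmax is invariant to such shifts, which is why all four output rows are identical. A standalone NumPy sketch of softmax and its per-sample Jacobian, `J[b, i, j] = s[b, i] * (delta_ij - s[b, j])`, is below; the function names are illustrative and this is not necessarily how Beras structures it.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax; subtracting the row max is a common numerical-stability trick."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

def softmax_jacobian(s):
    """s: (batch, n) softmax outputs -> (batch, n, n) Jacobians, diag(s) - s s^T per sample."""
    diag = s[:, :, None] * np.eye(s.shape[-1])
    outer = s[:, :, None] * s[:, None, :]
    return diag - outer

x = np.arange(-8, 8, dtype=np.float64).reshape(4, 4)
s = softmax(x)
J = softmax_jacobian(s)
print(s[0])
print(J.shape)
```

As with LeakyReLU, the leading `1` in the expected shape `(1, 4, 4, 4)` comes from converting a one-element list of gradients to an array.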

### Optimizers Testing

#### BasicOptimizer

In [None]:
from Beras.optimizers import BasicOptimizer
from Beras import Tensor

# Define inputs
learning_rate = 0.05

weights = Tensor([-1., 1.])
grads = Tensor([0.5, -0.5])

# Define implementation
student_implementation = BasicOptimizer(learning_rate)

print("Inputs:")
print(f"Weights: {weights}")
print(f"Grads: {grads}")
print("\nOutputs t=1:")
student_implementation.apply_gradients([weights], [grads])
print(f"Weights: {weights}")
print("\nOutputs t=2:")
student_implementation.apply_gradients([weights], [grads])
print(f"Weights: {weights}")

**> Expected Output**
<!--
```
Inputs:
Weights: [-1.  1.]
Grads: [ 0.5 -0.5]

Outputs t=1:
Weights: [-1.025  1.025]

Outputs t=2:
Weights: [-1.05  1.05]
```-->
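
The update being tested is plain gradient descent, `w <- w - learning_rate * g`, applied in place so the caller's weights change. A minimal NumPy sketch (using arrays in place of Beras `Tensor`s, with a made-up function name):

```python
import numpy as np

def sgd_step(weights, grads, learning_rate):
    """Illustrative only: update each weight array in place."""
    for w, g in zip(weights, grads):
        w -= learning_rate * g  # in-place, so the caller sees the change

w = np.array([-1.0, 1.0])
g = np.array([0.5, -0.5])

sgd_step([w], [g], 0.05)
print(w)  # after t=1
sgd_step([w], [g], 0.05)
print(w)  # after t=2
```

If your weights print unchanged after `apply_gradients`, check whether you are rebinding a local variable instead of modifying the existing array.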

#### RMSProp

In [None]:
from Beras.optimizers import RMSProp
from Beras import Tensor  # imported here too so this cell can run on its own

# Define inputs
learning_rate = 0.05
beta = 0.8
epsilon = 1e-7

weights = Tensor([-1., 1.])
grads = Tensor([0.5, -0.5])

# Define implementation
student_implementation = RMSProp(learning_rate, beta=beta, epsilon=epsilon)

print("Inputs:")
print(f"Weights: {weights}")
print(f"Grads: {grads}")
print("\nOutputs t=1:")
student_implementation.apply_gradients([weights], [grads])
print(f"Weights: {weights}")
print(f"Running Average: {list(student_implementation.v.items())}")
print("\nOutputs t=2:")
student_implementation.apply_gradients([weights], [grads])
print(f"Weights: {weights}")
print(f"Running Average: {list(student_implementation.v.items())}")

**> Expected Output**
<!--
```
Inputs:
Weights: [-1.  1.]
Grads: [ 0.5 -0.5]

Outputs t=1:
Weights: [-1.11180335  1.11180335]
Running Average: [(0, Tensor([0.05, 0.05]))]

Outputs t=2:
Weights: [-1.19513665  1.19513665]
Running Average: [(0, Tensor([0.09, 0.09]))]
```-->
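
For reference, here is one way to write a single RMSProp step in NumPy. It is a sketch under assumptions: the function name is made up, state is passed explicitly rather than stored in a dict like `student_implementation.v`, and `epsilon` is added outside the square root, which is the placement that reproduces the expected values above.

```python
import numpy as np

def rmsprop_step(w, g, v, learning_rate=0.05, beta=0.8, epsilon=1e-7):
    """Illustrative only: return the updated weights and running average of g**2."""
    v = beta * v + (1 - beta) * g**2
    w = w - learning_rate * g / (np.sqrt(v) + epsilon)
    return w, v

w = np.array([-1.0, 1.0])
g = np.array([0.5, -0.5])
v = np.zeros_like(w)

for t in (1, 2):
    w, v = rmsprop_step(w, g, v)
    print(f"t={t}: weights={w}, running average={v}")
```

Because the gradient is the same at both steps, the running average grows from 0.05 to 0.09 and the second step is smaller than the first.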

#### Adam

In [None]:
from Beras.optimizers import Adam
from Beras import Tensor  # imported here too so this cell can run on its own

# Define inputs
learning_rate = 0.05
beta_1 = 0.8
beta_2 = 0.888
epsilon = 1e-6

weights = [Tensor([-1., 1.])]
grads = [Tensor([0.5, -0.5])]

# Define implementation
student_implementation = Adam(learning_rate, beta_1=beta_1, beta_2=beta_2, epsilon=epsilon, amsgrad=False)

print("Inputs:")
print(f"Weights: {weights}")
print(f"Grads: {grads}")
print("\nOutputs t=1:")
student_implementation.apply_gradients(weights, grads)
print(f"Weights: {weights}")
print(f"First moment: {list(student_implementation.m.items())}")
print(f"Second moment: {list(student_implementation.v.items())}")
print(f"Expected first moment: {list(student_implementation.m_hat.items())}")
print(f"Expected second moment: {list(student_implementation.v_hat.items())}")
print("\nOutputs t=2:")
student_implementation.apply_gradients(weights, grads)
print(f"Weights: {weights}")
print(f"First moment: {list(student_implementation.m.items())}")
print(f"Second moment: {list(student_implementation.v.items())}")
print(f"Expected first moment: {list(student_implementation.m_hat.items())}")
print(f"Expected second moment: {list(student_implementation.v_hat.items())}")

**> Expected Output**
<!--
```
Inputs:
Weights: [Tensor([-1.,  1.])]
Grads: [Tensor([ 0.5, -0.5])]

Outputs t=1:
Weights: [Tensor([-1.0499999,  1.0499999])]
First moment: [(0, Tensor([ 0.1, -0.1]))]
Second moment: [(0, Tensor([0.028, 0.028]))]
Expected first moment: [(0, Tensor([ 0.5, -0.5]))]
Expected second moment: [(0, Tensor([0.25, 0.25]))]

Outputs t=2:
Weights: [Tensor([-1.0999998,  1.0999998])]
First moment: [(0, Tensor([ 0.18, -0.18]))]
Second moment: [(0, Tensor([0.052864, 0.052864]))]
Expected first moment: [(0, Tensor([ 0.5, -0.5]))]
Expected second moment: [(0, Tensor([0.25, 0.25]))]
```-->
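
In this test the lines labelled "Expected first moment" and "Expected second moment" print `m_hat` and `v_hat`, the bias-corrected moments. A NumPy sketch of one Adam step (without AMSGrad) is below; the function name and the explicit state passing are assumptions for illustration, not the Beras interface.

```python
import numpy as np

def adam_step(w, g, m, v, t, learning_rate=0.05, beta_1=0.8, beta_2=0.888, epsilon=1e-6):
    """Illustrative only: one Adam update at 1-indexed time step t."""
    m = beta_1 * m + (1 - beta_1) * g        # first moment
    v = beta_2 * v + (1 - beta_2) * g**2     # second moment
    m_hat = m / (1 - beta_1**t)              # bias-corrected first moment
    v_hat = v / (1 - beta_2**t)              # bias-corrected second moment
    w = w - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return w, m, v, m_hat, v_hat

w = np.array([-1.0, 1.0])
g = np.array([0.5, -0.5])
m = np.zeros_like(w)
v = np.zeros_like(w)

for t in (1, 2):
    w, m, v, m_hat, v_hat = adam_step(w, g, m, v, t)
    print(f"t={t}: weights={w}, m={m}, v={v}, m_hat={m_hat}, v_hat={v_hat}")
```

With a constant gradient, bias correction recovers the gradient and its square exactly, which is why `m_hat` and `v_hat` stay at 0.5 and 0.25 across both steps while `m` and `v` keep changing.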