## NAME(s):               
## NETID(s): 

# Homework 01: Neural Networks
This assignment walks you through the basics of neural network construction. You may work in groups of up to 3 people. 

The goal is to give you a better understanding of neural networks without relying on external packages. You will build extremely simplified versions of some PyTorch layers and functions.

**For your own good, do not use AI assistance to complete this assignment. You will be tested on this material in the first exam.** Using AI to complete this will not ultimately help you prepare for this upcoming evaluation. You also risk losing (all) points if your work is clearly AI-generated. 

If you get stuck, I recommend trying to work these problems out by hand. Drawing a picture or performing the arithmetic on paper can make things clearer. The lecture notes will also help you here. If you get really, *really* stuck, send me or a TA an email! 

### Submission
Export the completed notebook as a PDF and submit it on CMS before the deadline. **You are responsible for making sure that all cells are run and that the cell output is clearly visible prior to submission**. Double-check your work before you click submit!

## 1. Matrix multiplication (25 points)
Implement the dot product, matrix multiplication, and transpose in plain Python: **no NumPy/PyTorch allowed.** Pay attention to the types indicated in the docstrings. Don't worry too much about time or memory complexity (unless you want to): I just care that your implementation works.
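
For reference, here are the standard definitions of the three operations, restated as a hand-checkable sketch. The small numeric examples are illustrative and are not part of the tests below. For vectors $a, b$ of length $n$, an $n \times m$ matrix $A$, and an $m \times p$ matrix $B$:

$$ a \cdot b = \sum_{i=1}^{n} a_i b_i, \qquad (AB)_{ij} = \sum_{k=1}^{m} A_{ik} B_{kj}, \qquad (A^\intercal)_{ij} = A_{ji} $$

For example:

$$ [1, 2, 3] \cdot [4, 5, 6] = 4 + 10 + 18 = 32 $$

$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^\intercal = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} $$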

In [None]:
def dot(a, b):
    """
    Dot product between two vectors 

    Args: 
        a: an n-dim vector (list, 1d np array, 1d tensor...)
        b: an n-dim vector (list, 1d np array, 1d tensor...)
    
    Returns: 
        c: Dot product of a and b (float)
    """
    # Your code here
    ...


def mm(A, B): 
    """
    Matrix multiply.

    Args: 
        A: n x m matrix (list of lists, np array, tensor...)
        B: m x p matrix (list of lists, np array, tensor...)

    Returns:
        C: A @ B 
    """
    # Your code here
    # Use the dot product function above to help you
    ...

def transpose(A): 
    """
    Transpose a matrix. (You will need this later.) 

    Args: 
        A: an n x m matrix A (list of lists, np array, tensor...)

    Returns:
        A^T: an m x n matrix, with rows and columns of A swapped.
    """
    # Your code here
    ...

In [None]:
# Some tests for the functions above to help you
# Feel free to add more if you like

import numpy as np


a_test = np.random.randn(10)
b_test = np.random.randn(10)
true_dot = (a_test * b_test).sum()
your_dot = dot(a_test.tolist(), b_test.tolist())

assert np.allclose(true_dot, your_dot)

A_test = np.random.randn(4, 10)
B_test = np.random.randn(10, 3)

true_mm = A_test @ B_test 
your_mm = mm(A_test.tolist(), B_test.tolist())

assert np.allclose(true_mm, your_mm)

assert np.allclose(A_test.T, transpose(A_test.tolist()))

## 2. Sigmoid and ReLU (25 points)

Complete the functions below. I have imported `exp` for you, but you should not use any other imports here.
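
As a reminder, the standard definitions are given below; check them against the lecture notes. Both functions are applied elementwise when the input is a list or a list-of-lists.

$$ \sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \mathrm{ReLU}(x) = \max(0, x) $$

A few values you can verify by hand: $\sigma(0) = 0.5$, $\mathrm{ReLU}(-1) = 0$, and $\mathrm{ReLU}(2) = 2$.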


In [None]:
from math import exp

def sigmoid(x):
    """
    The sigmoid function. 

    Args: 
        x: a float, list of floats, list-of-lists of floats, np array, tensor, ...
    Returns: 
        sigmoid(x)
    """
    # Your code here
    ...

def relu(x):
    """
    The ReLU (rectified linear unit) function. 
    
    Args:
        x: a float, list of floats, list-of-lists of floats, np array, tensor...
    Returns: 
        ReLU(x)
    """
    # Your code here
    ...

In [None]:
# Some additional tests to help you
import numpy as np
import torch
import torch.nn.functional as F


# F.sigmoid and F.relu expect tensors, so the reference inputs are
# wrapped with torch.tensor / torch.from_numpy before the call.
assert np.allclose(F.sigmoid(torch.tensor(1.0)), sigmoid(1.0))

test_a = np.random.randn(10)
test_A = np.random.randn(10, 10)

true_sigmoid = F.sigmoid(torch.from_numpy(test_a))
your_sigmoid = sigmoid(test_a.tolist())

assert np.allclose(true_sigmoid, your_sigmoid)

true_sigmoid = F.sigmoid(torch.from_numpy(test_A))
your_sigmoid = sigmoid(test_A)

assert np.allclose(true_sigmoid, your_sigmoid)

assert np.allclose(F.relu(torch.tensor(-1.0)), relu(-1.0))
assert np.allclose(F.relu(torch.tensor(1.0)), relu(1.0))

true_relu = F.relu(torch.from_numpy(test_a))
your_relu = relu(test_a)

assert np.allclose(true_relu, your_relu)

true_relu = F.relu(torch.from_numpy(test_A))
your_relu = relu(test_A)

assert np.allclose(true_relu, your_relu)

## 3. Building a Fully Connected Layer (25 points)
Here, implement a single linear layer: 
$$ y = xW^\intercal + b $$
where $x$ is your input, $W$ is your weight matrix, and $b$ is your bias vector. 
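
To make the shapes concrete, here is a small hand-worked instance of the formula. All values are illustrative assumptions (2 input features, 3 output features), and the arithmetic is written out term by term rather than through a general implementation.

```python
# Hand-worked instance of y = xW^T + b with in_features=2, out_features=3.
# All values are illustrative assumptions, not taken from the assignment.
x = [[1.0, 2.0]]                 # shape (1, 2): one input row
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]                 # shape (3, 2): out_features x in_features
b = [0.5, -0.5, 0.0]             # shape (3,): one bias per output feature

# Output feature j is (row of x) . (row j of W) + b[j], written out by hand.
y = [[
    1.0 * 1.0 + 2.0 * 0.0 + 0.5,
    1.0 * 0.0 + 2.0 * 1.0 + -0.5,
    1.0 * 1.0 + 2.0 * 1.0 + 0.0,
]]

print(y)  # [[1.5, 1.5, 3.0]]
```

Note that $W$ is stored as `out_features x in_features`, which is why the formula uses $W^\intercal$.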

You may not use any PyTorch or NumPy here: for matrix multiplication, use the `mm` function you wrote earlier.

In [None]:
class LinearLayer:
    """
    A mock fully-connected linear layer: 
    y = xW^T + b
    """
    def __init__(self, in_features, out_features, weight=None, bias=None):
        """
        Initialize a layer that maps in_features inputs to out_features
        outputs. If a weight matrix or bias vector is provided, use it.
        Otherwise, set all weights and biases to 0.0.

        Args: 
            in_features: int
            out_features: int
            weight: Optional out_features x in_features matrix
                (list-of-lists). 
            bias: Optional bias vector of size out_features. 
        """
        self.in_features = in_features
        self.out_features = out_features

        # Your code here!
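        # Compare against None explicitly: truth-testing an array or tensor
        # raises an error, and an empty list would be silently ignored.
        if weight is not None:
            self.weight = weight
        else:
            ...

        if bias is not None:
            self.bias = bias
        else:
            ...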
    
    def forward(self, x):
        """
        Args: 
            x: Input array-like with shape (*, self.in_features)
        
        Returns: 
            y: Output array-like with shape (*, self.out_features)
        """
        # Your code here!
        ...


Now, demonstrate that your layer behaves identically to PyTorch's `F.linear` in the cell below (check the PyTorch docs to see how to use this functional version of `nn.Linear`). You can do this by passing the same weights and bias to both implementations and feeding them identical input.
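
As a starting point, the reference side of that comparison might look like the sketch below. The input, weight, and bias values are illustrative assumptions. Wiring in your own `LinearLayer` and comparing the two outputs is left to you.

```python
import torch
import torch.nn.functional as F

# Illustrative values (assumptions): in_features=2, out_features=3.
x = torch.tensor([[1.0, 2.0]])            # shape (1, 2)
weight = torch.tensor([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 1.0]])       # shape (3, 2) = (out_features, in_features)
bias = torch.tensor([0.5, -0.5, 0.0])     # shape (3,)

# F.linear computes x @ weight.T + bias.
reference = F.linear(x, weight, bias)

# Convert to nested Python lists for comparison with a list-based layer.
reference_list = reference.tolist()
print(reference_list)  # [[1.5, 1.5, 3.0]]
```

Calling `.tolist()` on the tensors gives plain nested lists that you can pass to your own layer.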

In [None]:
# Your code here

## 4. Multilayer Perceptron (25 points)

Construct an MLP using the classes/functions you've written above. The MLP should have the following structure:

- Linear layer with 8 input features, 16 output features
- ReLU
- Linear layer with ? input features, 32 output features
- ReLU
- Linear layer with ? input features, 16 output features
- ReLU
- Linear layer with ? input features, 8 output features
- Sigmoid 

You should be able to determine the values of the "?"s above. 

As with all prior sections, do not use any imports or NumPy/PyTorch functions. What you've already written will be sufficient.
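
A forward pass through a stack of layers is repeated function application: the output of one step becomes the input of the next. The sketch below shows only that chaining pattern. The names `run_pipeline`, `double_all`, and `add_one` are hypothetical stand-ins rather than part of the assignment, and this is not the MLP itself.

```python
def run_pipeline(steps, x):
    """Apply each callable in `steps` to the running value, in order."""
    for step in steps:
        x = step(x)
    return x


# Hypothetical stand-in steps, used only to show the chaining pattern.
def double_all(values):
    return [2.0 * v for v in values]


def add_one(values):
    return [v + 1.0 for v in values]


result = run_pipeline([double_all, add_one, double_all], [1.0, -2.0])
print(result)  # [6.0, -6.0]
```

In your MLP, the steps would be the `forward` methods of your layers and the activation functions from Section 2.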


In [None]:
class MLP: 
    def __init__(self):
        """
        An MLP with layers and activation functions as described
        above. 
        """
        ...
    
    def forward(self, x):
        """
        Implements the forward pass of the MLP. 
        """
        ...