<img align="center" src="figures/course.png" width="800">

# 16720 (B) Neural Networks for Recognition - Assignment 3

Instructor: Kris Kitani | TAs: Qichen (Lead), Paritosh, Rawal, Yan, Zen, Wen-Hsuan

## Q5 Image Compression with Autoencoders [Extra Credit] (25 points)

**Note: We recommend finishing Question 7 (PyTorch) before attempting this question.**

**Please include all write-up answers below in theory.ipynb. For questions that require code, include a screenshot of your code in theory.ipynb to receive points.**

An autoencoder is a neural network trained to copy its input to its output, though it is usually only allowed to do so approximately. This is typically achieved by restricting the number of hidden nodes inside the autoencoder; in other words, the autoencoder is forced to learn to *represent* the data with this limited number of hidden nodes. This is a useful way of learning compressed representations.
In this section, we will continue using the NIST36 dataset you have from the previous questions.

### Q5.1 Building the Autoencoder

#### Q5.1.1 (10 points Code)

Because autoencoders are difficult to train, we switch to the $\text{ReLU}(x) = \max(x, 0)$ activation function, which is provided for you in **q2.ipynb**. Implement an autoencoder with the following layers:
1. 1024 to 32 dimensions, followed by a ReLU
2. 32 to 32 dimensions, followed by a ReLU
3. 32 to 32 dimensions, followed by a ReLU
4. 32 to 1024 dimensions, followed by a sigmoid (this normalizes the image output for us)

The loss function is the total squared error between the output image and the input image (ideally, the two are identical).
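
As a rough illustration of the loss described above, the following sketch computes the total squared error between a small batch of inputs and reconstructions, together with its gradient with respect to the reconstruction. The function name `total_squared_error` and the toy arrays are assumptions made for this example; they are not part of the provided starter code.

```python
import numpy as np

def total_squared_error(x, y):
    """Return the summed squared error and its gradient w.r.t. y.

    x -- original images, shape (N, D)
    y -- reconstructions, same shape as x
    """
    diff = y - x
    loss = np.sum(diff ** 2)   # summed over every pixel of every image
    grad_y = 2.0 * diff        # d/dy of (y - x)^2
    return loss, grad_y

# Toy data (hypothetical): two "images" with two pixels each.
x = np.array([[0.0, 1.0], [0.5, 0.5]])
y = np.array([[0.1, 0.8], [0.5, 0.0]])
loss, grad_y = total_squared_error(x, y)
print(loss)
print(grad_y)
```

Because the error is summed rather than averaged, the loss value scales with both the batch size and the number of pixels.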

<font color="red">**Please include a screenshot of your code in theory.ipynb.**</font>

In [None]:
import numpy as np
import scipy.io
from ipynb.fs.defs.q2 import *
from collections import Counter

train_data = scipy.io.loadmat('data/nist36_train.mat')
valid_data = scipy.io.loadmat('data/nist36_valid.mat')

# we don't need labels now!
train_x = train_data['train_data']
valid_x = valid_data['valid_data']

max_iters = 100
# pick a batch size, initial learning rate
batch_size = None
learning_rate = None
# YOUR CODE HERE
raise NotImplementedError()
hidden_size = 32
lr_rate = 20

batches = get_random_batches(train_x,np.ones((train_x.shape[0],1)),batch_size)
batch_num = len(batches)

params = Counter()

# initialize layers here
# YOUR CODE HERE
raise NotImplementedError()

#### Q5.1.2 (5 points Autograder)

To help even more with convergence speed, we will implement [momentum](http://cs231n.github.io/neural-networks-3/#sgd). Now, instead of updating $W = W - \alpha \frac{\partial J}{\partial W}$, we will use the update rules $M_W = 0.9 M_W - \alpha \frac{\partial J}{\partial W}$ and $W = W + M_W$. To implement this, populate the parameters dictionary with zero-initialized momentum accumulators, one for each parameter. Then simply perform both update equations for every batch.
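
To make the two update equations concrete, here is a small sketch that applies them to a toy scalar objective $J(w) = \frac{1}{2}w^2$, whose gradient is simply $w$. The objective, the starting point, and the learning rate are assumptions chosen for illustration; they are not taken from the assignment's settings.

```python
# Toy objective J(w) = 0.5 * w**2, so dJ/dw = w (assumed for illustration).
w = 1.0          # parameter
m_w = 0.0        # zero-initialized momentum accumulator
alpha = 0.1      # learning rate (arbitrary choice for this sketch)
momentum = 0.9   # momentum factor from the update rule

history = [w]
for step in range(3):
    grad_w = w                              # gradient of the toy objective
    m_w = momentum * m_w - alpha * grad_w   # M_W = 0.9 * M_W - alpha * dJ/dW
    w = w + m_w                             # W = W + M_W
    history.append(w)

print(history)
```

In this toy run the step taken grows in magnitude from 0.1 to 0.18 to 0.234, because the accumulator keeps building up while the gradient points in the same direction.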

In [None]:
def momentum_update(w: np.ndarray, m_w: np.ndarray, grad_w: np.ndarray, alpha: float, momentum: float=0.9):
    '''
    Momentum update
    
    [input]
    * w -- parameters
    * m_w -- the momentum of the parameters
    * grad_w -- parameter gradients
    * alpha -- learning rate
    * momentum -- the momentum factor
    
    [output]
    * w -- updated parameters
    * m_w -- updated momentum of the parameters
    '''
    # YOUR CODE HERE
    raise NotImplementedError()

    return w, m_w

### Q5.2 Training the Autoencoder (3 Points Code+WriteUp)
 
Using the provided default settings, train the network for 100 epochs. What do you observe in the plotted training loss curve as it progresses?


<font color="red">**Please include your write-up answer and a screenshot of your code in theory.ipynb.**</font>

In [None]:
# should look like your previous training loops
for itr in range(max_iters):
    total_loss = 0
    for xb,_ in batches:
        
        # training loop can be exactly the same as q2!
        # your loss is now squared error
        # delta is the d/dx of (x-y)^2
        # to implement momentum
        #   just use 'm_'+name variables
        #   to keep a saved value across iterations
        #   params is a Counter(), which returns a 0 if an element is missing
        #   so you should be able to write your loop without any special conditions
        # YOUR CODE HERE
        raise NotImplementedError()
    if itr % 2 == 0:
        print("itr: {:02d} \t loss: {:.2f}".format(itr,total_loss))
    if itr % lr_rate == lr_rate-1:
        learning_rate *= 0.9

### Q5.3 Evaluating the Autoencoder

#### Q5.3.1 (4 Points Code+WriteUp)
 
Now let’s evaluate how well the autoencoder has been trained. Select 5 of the 36 classes in your dataset and, for each selected class, include 2 validation images and their reconstructions in your report. What differences do you observe between the reconstructed validation images and the originals?


<font color="red">**Please include your write-up answer and a screenshot of your code in theory.ipynb.**</font>

In [None]:
import matplotlib.pyplot as plt
# YOUR CODE HERE
raise NotImplementedError()

#### Q5.3.2 (3 Points Code+WriteUp)
 
Let’s evaluate the reconstruction quality using Peak Signal-to-Noise Ratio (PSNR). PSNR is defined as
\begin{align}
\text{PSNR} = 20 \times \log_{10}(\text{MAX}_I) - 10\times \log_{10}(\text{MSE})
\end{align}
where $\text{MAX}_I$ is the maximum possible pixel value of the image, and $\text{MSE}$ (mean squared error) is computed across all pixels. You may use [skimage.metrics.peak\_signal\_noise\_ratio](https://scikit-image.org/docs/stable/api/skimage.metrics.html#skimage.metrics.peak_signal_noise_ratio) (named `compare_psnr` in `skimage.measure` in older scikit-image releases) for convenience. Report the average PSNR you get from the autoencoder across all validation images.
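
The definition above can be checked by hand on a small example. The sketch below computes PSNR directly from the formula with NumPy, assuming pixel values lie in $[0, 1]$ so that $\text{MAX}_I = 1$. The helper name `psnr_from_definition` and the toy images are made up for illustration.

```python
import numpy as np

def psnr_from_definition(original, reconstruction, max_i=1.0):
    """PSNR = 20*log10(MAX_I) - 10*log10(MSE), with MSE taken over all pixels."""
    mse = np.mean((original - reconstruction) ** 2)
    return 20.0 * np.log10(max_i) - 10.0 * np.log10(mse)

# Toy 4x4 "images" (hypothetical): every pixel is off by 0.1, so MSE = 0.01.
original = np.zeros((4, 4))
reconstruction = np.full((4, 4), 0.1)
value = psnr_from_definition(original, reconstruction)
print(round(value, 2))
```

With an MSE of 0.01 and $\text{MAX}_I = 1$, the formula gives roughly 20 dB. If the two images are identical, the MSE is zero and the expression diverges; this sketch does not handle that case.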

<font color="red">**Please include your write-up answer in theory.ipynb.**</font>

In [None]:
from skimage.metrics import peak_signal_noise_ratio as psnr
# evaluate PSNR

# YOUR CODE HERE
raise NotImplementedError()
