# A7.1 Autoencoder for Classification


We have talked in lecture about how an autoencoder nonlinearly reduces the dimensionality of data.  In this assignment you will 
1. load an autoencoder network already trained on the MNIST data,
2. apply it to the MNIST training set to obtain the outputs of the units in the bottleneck layer as a new representation of each training set image with a greatly reduced dimensionality,
3. train a fully-connected classification network on this new representation, and
4. report the percent of training and testing images correctly classified, and compare it with the accuracy you get with the original images.

Download [nn_torch.zip](https://www.cs.colostate.edu/~anderson/cs445/notebooks/nn_torch.zip) and extract the files.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas
import pickle
import gzip
import torch
import neuralnetworks_torch as nntorch

First, let's load the MNIST data. You may download it here: [mnist.pkl.gz](http://deeplearning.net/data/mnist/mnist.pkl.gz).

In [2]:
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

Xtrain = train_set[0]
Ttrain = train_set[1]

Xtest = test_set[0]
Ttest = test_set[1]

Xtrain.shape, Ttrain.shape, Xtest.shape, Ttest.shape

((50000, 784), (50000,), (10000, 784), (10000,))

To load the network saved in Lecture Notes 21, run the following code.  This loads the saved torch neural network that was trained on a GPU.  It loads the state of that net (its weights) into a new net of the same structure but allocated on the CPU.

First download [mnist_autoencoder.pt](https://www.cs.colostate.edu/~anderson/cs445/notebooks/mnist_autoencoder.pt).

In [3]:
n_in = Xtrain.shape[1]
n_hiddens_per_layer = [500, 100, 50, 50, 20, 50, 50, 100, 500]
nnet_autoencoder = nntorch.NeuralNetwork(n_in, n_hiddens_per_layer, n_in, device='cpu')
nnet_autoencoder.standardize = ''

nnet_autoencoder.load_state_dict(torch.load('mnist_autoencoder.pt', map_location=torch.device('cpu')))

<All keys matched successfully>

To get the outputs of the units in the middle hidden layer, call the `use_to_middle` function implemented for you in `neuralnetworks_torch`.

In [4]:
Xtrain_reduced = nnet_autoencoder.use_to_middle(Xtrain)
Xtrain_reduced.shape

(50000, 20)

And while we are here, let's get the reduced representation of `Xtest` also.

In [5]:
Xtest_reduced = nnet_autoencoder.use_to_middle(Xtest)
Xtest_reduced.shape

(10000, 20)
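
Conceptually, the reduced representation is just the activation of the 20-unit bottleneck layer after a forward pass through the first half of the autoencoder. The following is a minimal NumPy sketch of that idea, not the `neuralnetworks_torch` implementation: the weights are random stand-ins for the trained ones, `tanh` is an assumed activation function, and `forward_to_middle` is a hypothetical helper name.

```python
import numpy as np

def forward_to_middle(X, weights, biases):
    """Propagate X through the given layers and return the last layer's output."""
    Z = X
    for W, b in zip(weights, biases):
        Z = np.tanh(Z @ W + b)  # assumed activation; the real network may differ
    return Z

rng = np.random.default_rng(0)
n_in = 784
# First half of n_hiddens_per_layer, ending at the 20-unit bottleneck.
encoder_sizes = [500, 100, 50, 50, 20]

# Random weights stand in for the trained autoencoder's weights.
weights, biases = [], []
n_prev = n_in
for n_units in encoder_sizes:
    weights.append(rng.normal(0, 1 / np.sqrt(n_prev), size=(n_prev, n_units)))
    biases.append(np.zeros(n_units))
    n_prev = n_units

X_demo = rng.uniform(0, 1, size=(5, n_in))  # five fake "images"
X_demo_reduced = forward_to_middle(X_demo, weights, biases)
print(X_demo_reduced.shape)  # (5, 20)
```

Each 784-pixel image is mapped to 20 numbers, which is the shape reported for `Xtrain_reduced` and `Xtest_reduced` above.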

## Requirement

Your job is now to
1. train one fully-connected classifier using `Xtrain_reduced` and `Ttrain` and test it with `Xtest_reduced` and `Ttest`, and
2. train a second fully-connected classifier using `Xtrain` and `Ttrain` and test it with `Xtest` and `Ttest`.

Try to find parameters (hidden network structure, number of epochs, and learning rate) for which the classifier given the reduced representation does almost as well as the other classifier with the original data. Discuss your results.
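
One way to organize this parameter search is a small grid search over the three settings. The sketch below is only an outline under stated assumptions: `grid_search` and `train_and_score` are hypothetical names, and the placeholder scorer returns made-up numbers purely so the loop can run. In practice `train_and_score` would build a classifier, train it, and return its percent correct, ideally scored on the validation data that the loading cell returns as `valid_set` rather than on the test set.

```python
from itertools import product

def grid_search(train_and_score, hidden_structures, epoch_counts, learning_rates):
    """Try every combination of settings and return results sorted by score."""
    results = []
    for hiddens, n_epochs, learning_rate in product(hidden_structures,
                                                    epoch_counts,
                                                    learning_rates):
        train_pct, score_pct = train_and_score(hiddens, n_epochs, learning_rate)
        results.append({'hiddens': hiddens,
                        'n_epochs': n_epochs,
                        'learning_rate': learning_rate,
                        'train': train_pct,
                        'score': score_pct})
    # Best-scoring combination first.
    return sorted(results, key=lambda r: r['score'], reverse=True)

def placeholder_train_and_score(hiddens, n_epochs, learning_rate):
    """Stand-in only: returns fake scores so the search loop can be exercised.
    Replace the body with real training and evaluation code."""
    fake_score = float(sum(hiddens))
    return fake_score, fake_score

results = grid_search(placeholder_train_and_score,
                      hidden_structures=[[10], [20, 20]],
                      epoch_counts=[5],
                      learning_rates=[0.1, 0.01])
for r in results:
    print(r)
```

Keeping every combination in one list makes it easy to report which settings were tried when discussing the results.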

Here is an example for part of Step 1.  It shows a brief training session (small number of epochs and simple hidden layer structure) using the reduced data.

## Reduced Neural Network

In [6]:
n_in = Xtrain_reduced.shape[1]
reduced_classifier = nntorch.NeuralNetwork_Classifier(n_in, [200,200], 10, device='cuda')

n_epochs = 100
reduced_classifier.train(Xtrain_reduced, Ttrain, n_epochs, 0.01, method='adam', standardize='')

def percent_correct(Predicted, Target):
    return 100 * np.mean(Predicted == Target)

Classes, _ = reduced_classifier.use(Xtrain_reduced)

print(f'% Correct  Train Reduced {percent_correct(Classes, Ttrain):.2f}')

Classes, _ = reduced_classifier.use(Xtest_reduced)

print(f'% Correct  Test Reduced {percent_correct(Classes, Ttest):.2f}')



Epoch 10: RMSE 0.412
Epoch 20: RMSE 0.347
Epoch 30: RMSE 0.293
Epoch 40: RMSE 0.253
Epoch 50: RMSE 0.219
Epoch 60: RMSE 0.188
Epoch 70: RMSE 0.159
Epoch 80: RMSE 0.135
Epoch 90: RMSE 0.115
Epoch 100: RMSE 0.100
% Correct  Train Reduced 97.11
% Correct  Test Reduced 96.60
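
`percent_correct` compares the two arrays element by element, so both arrays need the same shape. If a classifier returned its predicted classes as a column vector of shape `(N, 1)` while the targets are flat with shape `(N,)`, NumPy would broadcast the comparison to an `(N, N)` matrix and the reported percentage would be wrong. Whether that can happen with `neuralnetworks_torch` is not stated here, so the following is only a defensive variant; `percent_correct_flat` is a hypothetical name.

```python
import numpy as np

def percent_correct_flat(Predicted, Target):
    """Percent of matching entries after flattening both arrays to 1-D."""
    Predicted = np.asarray(Predicted).reshape(-1)
    Target = np.asarray(Target).reshape(-1)
    return 100 * np.mean(Predicted == Target)

predicted_column = np.array([[3], [1], [4], [1]])  # shape (4, 1)
target_flat = np.array([3, 1, 4, 2])               # shape (4,)

# Without flattening, the comparison broadcasts to a (4, 4) matrix.
naive = 100 * np.mean(predicted_column == target_flat)
safe = percent_correct_flat(predicted_column, target_flat)

print(naive)  # 25.0
print(safe)   # 75.0
```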


## Original Neural Network

In [7]:
n_in = Xtrain.shape[1]
original_classifier = nntorch.NeuralNetwork_Classifier(n_in, [50,30], 10, device='cuda')

n_epochs = 100
original_classifier.train(Xtrain, Ttrain, n_epochs, 0.01, method='adam', standardize='')

# Redefined here so this cell can be run on its own.
def percent_correct(Predicted, Target):
    return 100 * np.mean(Predicted == Target)

Classes, _ = original_classifier.use(Xtrain)

print(f'% Correct  Train Original {percent_correct(Classes, Ttrain):.2f}')

Classes, _ = original_classifier.use(Xtest)

print(f'% Correct  Test Original {percent_correct(Classes, Ttest):.2f}')

Epoch 10: RMSE 0.824
Epoch 20: RMSE 0.414
Epoch 30: RMSE 0.288
Epoch 40: RMSE 0.227
Epoch 50: RMSE 0.187
Epoch 60: RMSE 0.157
Epoch 70: RMSE 0.132
Epoch 80: RMSE 0.113
Epoch 90: RMSE 0.096
Epoch 100: RMSE 0.082
% Correct  Train Original 97.84
% Correct  Test Original 96.01


## Results
Neural network parameters when trained on the original (unreduced) data:

Hidden Layers: [50,30]

Learning Rate: 0.01

Training Accuracy: 97.84%

Testing Accuracy: 96.01%

Number of Epochs: 100

Neural network parameters when trained on the reduced data:

Hidden Layers: [200,200]

Learning Rate: 0.01

Training Accuracy: 97.11%

Testing Accuracy: 96.60%

Number of Epochs: 100

When the network was trained on the reduced data set, larger hidden layers ([200,200] rather than [50,30]) were needed to reach about the same accuracy as the network trained on the original images (96.60% versus 96.01% on the test set).
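
To put the two hidden-layer structures in perspective, the number of trainable weights in each classifier can be counted. The sketch below assumes every layer is fully connected with one bias per unit, which may not match the library's internals exactly; `count_weights` is an illustrative helper, not part of `neuralnetworks_torch`.

```python
def count_weights(n_in, n_hiddens_per_layer, n_out):
    """Count weights and biases in a fully-connected network."""
    sizes = [n_in] + list(n_hiddens_per_layer) + [n_out]
    # Each layer has (inputs + 1 bias) * units parameters.
    return sum((n_from + 1) * n_to for n_from, n_to in zip(sizes[:-1], sizes[1:]))

n_reduced = count_weights(20, [200, 200], 10)   # classifier on the 20-D representation
n_original = count_weights(784, [50, 30], 10)   # classifier on the 784-pixel images

print(n_reduced)   # 46410
print(n_original)  # 41090
```

Under these assumptions the two classifiers have a similar number of weights, even though the network trained on the reduced data has much wider hidden layers.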


## Extra Credit

For 1 point of extra credit, repeat this assignment using a second data set, one that we have not used in class before. This will require you to train a new autoencoder net to use for this part.
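
As a rough illustration of what training a new autoencoder involves, here is a minimal NumPy sketch with a single `tanh` bottleneck layer trained by full-batch gradient descent to reconstruct its own input. It is not the `neuralnetworks_torch` API and is far simpler than the nine-hidden-layer network used above; the function names, the synthetic data, and all settings are assumptions chosen only for the example.

```python
import numpy as np

def train_autoencoder(X, n_bottleneck, n_epochs=500, learning_rate=0.01, seed=0):
    """Fit a one-hidden-layer autoencoder; the target is the input itself."""
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    W_enc = rng.normal(0, 0.1, size=(n_in, n_bottleneck))
    b_enc = np.zeros(n_bottleneck)
    W_dec = rng.normal(0, 0.1, size=(n_bottleneck, n_in))
    b_dec = np.zeros(n_in)
    errors = []
    for _ in range(n_epochs):
        Z = np.tanh(X @ W_enc + b_enc)   # bottleneck activations
        Y = Z @ W_dec + b_dec            # linear reconstruction of X
        errors.append(float(np.mean((Y - X) ** 2)))
        # Gradient of half the squared error, averaged over samples.
        delta_out = (Y - X) / n_samples
        delta_hidden = (delta_out @ W_dec.T) * (1 - Z ** 2)
        W_dec -= learning_rate * (Z.T @ delta_out)
        b_dec -= learning_rate * delta_out.sum(axis=0)
        W_enc -= learning_rate * (X.T @ delta_hidden)
        b_enc -= learning_rate * delta_hidden.sum(axis=0)
    return (W_enc, b_enc), (W_dec, b_dec), errors

def encode(X, encoder):
    """Return the bottleneck representation of X."""
    W_enc, b_enc = encoder
    return np.tanh(X @ W_enc + b_enc)

# Synthetic 8-D data that really lies near a 3-D subspace, then standardized.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 3))
X_demo = latent @ rng.normal(size=(3, 8))
X_demo = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

encoder, decoder, errors = train_autoencoder(X_demo, n_bottleneck=3)
X_demo_reduced = encode(X_demo, encoder)
print(X_demo_reduced.shape)   # (200, 3)
print(errors[0], errors[-1])  # reconstruction error at the start and at the end
```

The same pattern applies to a new data set: train the network with the inputs as targets, then feed the bottleneck outputs to the classifier.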