# Autoencoders: a feature representation technique

## Introduction:


In this notebook, we will explore how to design an autoencoder class and apply it to a feature representation task.

But first, make sure you have the [training](https://d396qusza40orc.cloudfront.net/phoenixassets/course1-for-students/image_test_data.zip) and [testing](https://d396qusza40orc.cloudfront.net/phoenixassets/course1-for-students/image_train_data.zip) data. It is the CIFAR-10 dataset restricted to 4 classes, courtesy of [Prof. Carlos](https://www.coursera.org/learn/ml-foundations). Note that the archive names are swapped on purpose: the archive called `image_test_data` has more samples, so it is used as the training set here.

[Autoencoders](https://github.com/HFTrader/DeepLearningBook/blob/master/files/Chap14.pdf) are a special type of neural network whose architecture is required to learn to reproduce its own input, i.e.

input --> (deepNeuralNetwork) --> input

One might ask: why waste resources learning hidden layers when we could simply use the identity function $$f: \mathbb{R}^n \to \mathbb{R}^n,\ f(x) = x$$?

The answer is that the architecture can be designed so that the hidden layers have fewer units than the input and are therefore forced to "represent" it compactly. Autoencoders can thus be used as a dimensionality reduction technique, and since they are neural networks they can reduce dimensionality *non-linearly*, unlike PCA, which is a linear technique. For a thorough review, see [this chapter on GitHub](https://github.com/HFTrader/DeepLearningBook/blob/master/files/Chap14.pdf).
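
For comparison, PCA itself can be read as a *linear* autoencoder: the encoder projects onto the top principal directions and the decoder maps back with the transpose. The sketch below is only an illustration in NumPy; the helper name `pca_encode_decode` and the toy data are assumptions made for this example, not part of the notebook.

```python
import numpy as np

def pca_encode_decode(X, k):
    """Project X onto its top-k principal directions and map it back."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                       # shape (k, n_features)
    code = np.dot(Xc, components.T)           # linear "encoder": (n_samples, k)
    recon = np.dot(code, components) + mean   # linear "decoder": (n_samples, n_features)
    return code, recon

rng = np.random.RandomState(0)
# Toy data: 200 samples lying exactly in a 2-D subspace of a 5-D space
X = np.dot(rng.randn(200, 2), rng.randn(2, 5))
code, recon = pca_encode_decode(X, k=2)
print(code.shape)              # (200, 2)
print(np.allclose(X, recon))   # True: 2 linear components suffice for this toy data
```

An autoencoder replaces the two linear maps with stacks of non-linear layers, which is what lets it capture structure that no linear projection can.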

## Setting-up the notebook

* First: Write a simple AutoEncoder class using TensorFlow
* Second: Import all the required packages and datasets
* Third: Train the autoencoder 
* Fourth: Autoencode the input and run a simple linear classifier
* Fifth: Wrap up and notes

### Write a simple AutoEncoder class using TensorFlow

In [None]:
import time

import numpy as np
import tensorflow as tf  # written against the TensorFlow 1.x graph API


class AutoEncoder(object):
    """
    A deep learning autoencoder class implemented using TensorFlow

    https://github.com/nonlocal
    """
    def __init__(self, input_shape=(None, 3072), hidden_units=(1024, 512, 256)):
        """
        Initialize the class:

            Parameters
            ----------
            input_shape : tuple, of the form (None, int)
                The second element in the tuple is the number of "features" in a sample input
            hidden_units: sequence of int
                The number of hidden units in each "encoding" layer, outermost first

            Returns
            -------

        """
        _, self.n_inputs = input_shape
        # Layer sizes from the input down to the innermost code, e.g. [3072, 1024, 512, 256]
        self._new_units = [self.n_inputs] + list(hidden_units)
        self._n_hidden = len(hidden_units)
        self.weights = {}
        self.biases = {}
        self.sess = None  # set by train(); this session holds the trained variables

        for i in range(self._n_hidden):
            n_in, n_out = self._new_units[i], self._new_units[i + 1]
            w_name = "W_encode{:d}".format(i + 1)
            be_name = "B_encode{:d}".format(i + 1)
            bd_name = "B_decode{:d}".format(i + 1)
            self.weights[w_name] = tf.Variable(tf.random_normal([n_in, n_out]), name=w_name)
            self.biases[be_name] = tf.Variable(tf.random_normal([n_out]), name=be_name)
            # Tied weights: decoding layer i reuses W_encode{i} transposed, so it only
            # needs its own bias, sized like that layer's output (n_in)
            self.biases[bd_name] = tf.Variable(tf.random_normal([n_in]), name=bd_name)

    def encode(self, arr):
        """
        Encode the given input sample(s)
            Parameters
            ----------
            arr : array (of array)
                The input to encode

            Returns
            -------
            code: array (of array)
                The "code" for the given input
        """
        code = arr
        for i in range(1, self._n_hidden + 1):
            w = self.weights["W_encode{:d}".format(i)]
            b = self.biases["B_encode{:d}".format(i)]
            code = tf.nn.tanh(tf.add(tf.matmul(code, w), b))
        return code

    def decode(self, code):
        """
        Decode the given code(s)
            Parameters
            ----------
            code : array (of array)
                The code to decode

            Returns
            -------
            recon: array (of array)
                The reconstructed input
        """
        recon = code
        # Walk the encoding layers in reverse order, using the transposed weights
        for i in range(self._n_hidden, 0, -1):
            w = self.weights["W_encode{:d}".format(i)]
            b = self.biases["B_decode{:d}".format(i)]
            recon = tf.nn.tanh(tf.add(tf.matmul(recon, tf.transpose(w)), b))
        return recon

    def train(self, arr, batch_size=50, epochs=10, learning_rate=0.001):
        """
        Train the autoencoder.
        The error function is MSE, trained with AdamOptimizer
            Parameters
            ----------
            arr : array of shape (n_samples, n_inputs), np.float32
                The input to autoencode
            batch_size : int, optional
                The number of samples in each mini-batch
            epochs : int, optional
                The number of passes to make over the dataset
            learning_rate : float, optional
                The learning rate for Adam Optimizer

            Returns
            -------
            A status string; the session is kept in self.sess

            Note
            ----
            MSE is not recommended: it tends to blur the images.
            The output layer is tanh, so it can only produce values in [-1, 1].
        """
        x = tf.placeholder(tf.float32, [None, self.n_inputs])
        code = self.encode(x)
        y = self.decode(code)
        cost = tf.reduce_mean(tf.square(y - x))  # mean squared reconstruction error
        train_step = tf.train.AdamOptimizer(learning_rate).minimize(cost)

        n_samp, _ = arr.shape
        # Create the initializer after minimize() so Adam's own variables are included
        init = tf.global_variables_initializer()
        self.sess = tf.Session()
        self.sess.run(init)

        report = "Time: {:f}, \t Epochs completed: {:d}, \t MSE: {}"
        for i in range(epochs):
            start = time.time()
            for j in range(n_samp // batch_size):
                batch_xs = arr[j * batch_size:(j + 1) * batch_size]
                self.sess.run(train_step, feed_dict={x: batch_xs})
            # MSE on the last mini-batch of this epoch
            mse = self.sess.run(cost, feed_dict={x: batch_xs})
            print(report.format(time.time() - start, i + 1, mse))
        print("Final parameters:")
        print(report.format(time.time() - start, i + 1, mse))
        return "Model Trained."

In [None]:
help(AutoEncoder)
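
The training loop in `train` walks through the data in consecutive mini-batches and uses `n_samp // batch_size` batches per epoch, which means any leftover samples at the end are skipped. A minimal NumPy sketch of that indexing (the generator `iterate_minibatches` and the toy array are assumptions for illustration, not part of the class):

```python
import numpy as np

def iterate_minibatches(arr, batch_size):
    """Yield consecutive full mini-batches; leftover samples are skipped."""
    n_samp = arr.shape[0]
    for j in range(n_samp // batch_size):
        yield arr[j * batch_size:(j + 1) * batch_size]

data = np.arange(23 * 3).reshape(23, 3)   # 23 toy samples with 3 features each
batches = list(iterate_minibatches(data, batch_size=5))
print(len(batches))        # 4 full batches; the last 3 samples are not used
print(batches[0].shape)    # (5, 3)
```

Shuffling the samples before each epoch is a common refinement, but the class above does not do it.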

The class above needs some explanation. Let's go over each method:
* init:
    This method creates a neural network with 3 encoding layers, and hence 3 decoding layers, in addition to the input layer. More specifically, here is the architecture (number of hidden units):

    * layers structure:
    
    input-> layer1-> layer2-> layer3->layer4->layer5->layer6

    * layer functions:
    
    input-> encode-> encode-> encode-> decode -> decode -> decode

    * number of units per layer:
    
    3072 -> 1024 -> 512 -> 256 -> 512 -> 1024 -> 3072
    
The weights are defined so that each decoding layer uses the transpose of the corresponding encoding layer's weight matrix, i.e. symmetric or "tied" weights.


* decode:

    For decoding layers, we use the tanh function as the activation of the units.
    

* encode:

    For encoding layers, the activation function is tanh
    

* train:

    To use the encoding and decoding functions of the autoencoder, we first need to train it. Here it is trained with MSE as the cost and the Adam optimizer as the training algorithm. Other error functions are advisable for images with high contrast and resolution: MSE tends to average the neighbourhood of a region, which blurs the reconstruction, so it is not recommended for this purpose; cross-entropy is a good choice. It is HIGHLY recommended to train the model using Adam optimization. Maybe I am wrong, but in my experience so far the error has been observed to decrease monotonically at every step.
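
To make the tied-weight idea concrete, here is a small NumPy sketch of the forward pass described above, using toy layer sizes in place of 3072 -> 1024 -> 512 -> 256. The layer sizes, the 0.1 weight scale and the zero biases are assumptions chosen for the illustration; the class itself uses `tf.random_normal` and the sizes listed above.

```python
import numpy as np

rng = np.random.RandomState(0)
layer_sizes = [8, 4, 2]   # toy stand-in for 3072 -> 1024 -> 512 -> 256

# One weight matrix and one bias per encoding layer. Each decoding layer
# reuses the matching weight matrix transposed and has its own bias.
pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
weights = [rng.randn(n_in, n_out) * 0.1 for n_in, n_out in pairs]
enc_biases = [np.zeros(n_out) for _, n_out in pairs]
dec_biases = [np.zeros(n_in) for n_in, _ in pairs]

def encode(x):
    for W, b in zip(weights, enc_biases):
        x = np.tanh(np.dot(x, W) + b)
    return x

def decode(code):
    # Undo the encoding layers in reverse order, with transposed weights
    for W, b in zip(reversed(weights), reversed(dec_biases)):
        code = np.tanh(np.dot(code, W.T) + b)
    return code

x = rng.uniform(-1, 1, size=(5, 8))   # 5 samples already scaled to the tanh range
code = encode(x)
recon = decode(code)
mse = np.mean((recon - x) ** 2)       # the reconstruction error that training minimizes
print(code.shape, recon.shape)        # (5, 2) (5, 8)
```

Because the last decoding layer is a tanh, every reconstructed value lies in [-1, 1], so inputs outside that range can never be reproduced exactly.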

### Import all the required packages and datasets

In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
import graphlab

In [None]:
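# The archive names are swapped: "image_test_data" has more samples than "image_train_data",
# so it is loaded as the training set. Make sure the training set is the larger one.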
image_train = graphlab.SFrame("image_test_data/")
image_test = graphlab.SFrame("image_train_data/")

### Let's explore the dataset

In [None]:
image_test.shape, image_train.shape

In [None]:
image_train

The SFrame has 5 columns:

In [None]:
image_train.column_names()

id: the ID of the image

image: the actual image of the sample

label: the class label of the sample, one of {cat, dog, automobile, bird}

deep_features: features extracted from AlexNet

image_array: raw pixel values of the sample

In [None]:
#convert the SFrame to pandas.DataFrame
image_train = image_train.to_dataframe()

In [None]:
image_train

In [None]:
#drop unnecessary columns

In [None]:
image_train.drop(['id', 'image'], axis=1,inplace=True)

In [None]:
image_train

## Train the autoencoder

In [None]:
ae = AutoEncoder()

In [None]:
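# Convert each image to float32 and rescale it to [-1, 1], the range the tanh output layer
# can reproduce (this assumes the raw pixel values lie in 0-255)
image_train.image_array = image_train.image_array.apply(lambda x: np.array(x, np.float32) / 127.5 - 1.0)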
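
Note that a pandas column whose cells each hold an array is a 1-D object array, not a 2-D matrix, whereas `train` unpacks `arr.shape` into two values. A minimal sketch of the stacking step on a toy frame (the frame `toy` and its values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy frame: 3 "images" of 4 pixels each, stored one array per cell
toy = pd.DataFrame({
    "label": ["cat", "dog", "bird"],
    "image_array": [np.arange(4, dtype=np.float32) + i for i in range(3)],
})

col = toy["image_array"].values
print(col.shape, col.dtype)   # (3,) object: one array per row, not a matrix

# Stack the per-row arrays into a single (n_samples, n_features) matrix
X = np.vstack(toy["image_array"].tolist())
print(X.shape, X.dtype)       # (3, 4) float32
```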

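In [None]:
# Stack the per-image arrays into one (n_samples, 3072) float32 matrix
X_train = np.vstack(image_train.image_array.tolist())
status = ae.train(X_train)

In [None]:
# Reuse the session that train() stored on the object: it holds the trained weights.
# Running tf.global_variables_initializer() in a new session would reset them to random values.
sess = ae.sess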

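This will take a long, long time! With tied weights, each weight matrix is shared between an encoding layer and its decoding layer, so the number of parameters to be learned is:

In [None]:
# shared weight matrices + encoding biases + decoding biases
(3072*1024 + 1024*512 + 512*256) + (1024 + 512 + 256) + (512 + 1024 + 3072)

About 3.8 million! Brace yourself.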
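
The total depends on whether the decoding layers share the encoding layers' weight matrices (tied) or have their own (untied). As a rough check, here is a small sketch that counts both cases for the layer sizes used in this notebook; `count_parameters` is a hypothetical helper, not part of the notebook's code.

```python
def count_parameters(layer_sizes, tied=True):
    """Count weights and biases of a symmetric autoencoder."""
    pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    n_weights = sum(n_in * n_out for n_in, n_out in pairs)
    n_enc_biases = sum(n_out for _, n_out in pairs)   # one bias per encoding unit
    n_dec_biases = sum(n_in for n_in, _ in pairs)     # one bias per decoding unit
    if not tied:
        n_weights *= 2   # the decoder gets its own copy of every matrix
    return n_weights + n_enc_biases + n_dec_biases

sizes = [3072, 1024, 512, 256]
print(count_parameters(sizes, tied=True))    # 3807488
print(count_parameters(sizes, tied=False))   # 7608576
```

Either way, the 3072 x 1024 matrix of the first layer accounts for most of the parameters.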

In [None]:
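# en"code" the image pixels: build the encoding op once and run it on all images together
x_in = tf.placeholder(tf.float32, [None, 3072])
codes = sess.run(ae.encode(x_in), feed_dict={x_in: np.vstack(image_train.image_array.tolist())})
image_train['ae_features'] = list(codes)  # one 256-dimensional code per image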


Next, train a simple classifier on 'ae_features'.

First, convert the DataFrame back to an SFrame:

In [None]:
image_train = graphlab.SFrame(image_train)

In [None]:
type(image_train)

In [None]:
ae_feature_model = graphlab.logistic_classifier.create(image_train,
                                                         features=['ae_features'],
                                                         target='label')

## Wrap up:

One interesting thing I observed was that the Adam optimizer was *always* reducing the error. I have no idea why; I have to check it out. Also, it takes a long, long time to learn several million parameters, so it's better to leave it running overnight, with a smaller batch size.
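
For reference, the Adam update combines a running mean of the gradient with a running mean of its square, and scales each step by their ratio. In general Adam does not guarantee that the error decreases at every step, so the behaviour observed here is worth checking rather than taking for granted. Below is a minimal NumPy sketch of the update rule on a toy quadratic; the function `adam_minimize`, the toy loss and the hyperparameter values are assumptions for illustration and are unrelated to TensorFlow's implementation details.

```python
import numpy as np

def adam_minimize(grad_fn, theta, steps=200, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run plain Adam updates on theta using gradients from grad_fn."""
    m = np.zeros_like(theta)   # running mean of the gradient
    v = np.zeros_like(theta)   # running mean of the squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)   # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

def loss(theta):
    return np.sum(theta ** 2)

def grad(theta):
    return 2 * theta

theta0 = np.array([3.0, -2.0])
theta = adam_minimize(grad, theta0)
print(loss(theta0), loss(theta))   # the loss after training is far below the initial one
```

Logging the cost after every mini-batch, rather than once per epoch, would show whether the decrease is really monotonic.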