# Overview
___

The work so far implements a small modular framework that takes the data, trains different classifiers and autoencoders, and outputs performance figures, weights and network topology. The main class lives in clfpipeline.py and is composed of readers, preprocessors and classifiers. The classifiers currently available are softmax regression and an MLP, plus sparse and stacked autoencoders. The process is walked through below, but it is useful to review the overall structure first.


__Most of the classes, modules and functions are well documented.__


# Description of the repository structure
___
Short overview of the contents of the __uvnn__ directory:

1. clfpipeline.py: the main class, which contains the classifiers, readers and preprocessors described above.
2. __classifiers__: package consisting of classifiers and autoencoders:
    * __nn directory__: base classes for MLPs. It includes training with SGD (constant step size) and evaluating errors.
    * __autoencoder_sparse.py__: sparse autoencoder with one hidden layer, which regularizes the activations of the hidden layer.
    * __mlp.py__: multilayer perceptron with any number of hidden layers and a choice of activation functions.
    * __mlp_2layer and mlp_3layer__: explicit implementations of 2- and 3-layer networks ($Wx + b_1, Ux + b_2$) written without loops over the layers. They are kept because they may make backpropagation and the gradients easier to debug and understand.
    * __softmax.py__: softmax regression.
    * __misc.py__: some helper functions.

3. __output__: folder for network weights and benchmarks; each dataset has its own subfolder.
4. __input__: folder from which the program reads the data. It is not included in the repository because the files are usually large.
5. __utils/__: folder for pipeline modules. It contains:
    * __readers.py__: CSV data reader for classifiers. Any additional readers will go here.
    * __preprocessors.py__: preprocessors for the classifier pipeline. Their main job is to preprocess the data and provide train, test and validation splits. So far there is only one basic preprocessor, which subtracts the mean and then divides by the standard deviation. There is also AutoEncoderPP, a preprocessor for autoencoders.
6. __config/__: config files (they cover almost everything: method, network parameters, datasets, preprocessors and readers). A simple example file, sample.yaml, gives an idea of the format; loading configs from file is not implemented yet.
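
The basic preprocessing described in item 5 (mean subtraction, division by the standard deviation, then a train/validation/test split) can be sketched in plain Python as follows. This is only an illustration: the function names `standardize` and `split_indices`, the split fractions and the fixed seed are assumptions and are not taken from the repository's preprocessors.

```python
import math
import random


def standardize(rows):
    """Subtract each column's mean, then divide by its standard deviation."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    means = [sum(row[j] for row in rows) / n_rows for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((row[j] - means[j]) ** 2 for row in rows) / n_rows
        # Guard against constant columns (e.g. pixels that never change).
        stds.append(math.sqrt(var) or 1.0)
    return [[(row[j] - means[j]) / stds[j] for j in range(n_cols)]
            for row in rows]


def split_indices(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle sample indices and cut them into train/val/test parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_val = int(n_samples * val_frac)
    n_test = int(n_samples * test_frac)
    val = idx[:n_val]
    test = idx[n_val:n_val + n_test]
    train = idx[n_val + n_test:]
    return train, val, test


X = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
X_std = standardize(X)  # second column is constant, so it becomes all zeros
train_idx, val_idx, test_idx = split_indices(10, val_frac=0.2, test_frac=0.2)
```

Computing the statistics on the full data set keeps the sketch short; a real pipeline would usually compute them on the training split only and reuse them for the other splits.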


In [1]:
# Set up and load modules; nothing special here
%matplotlib inline
%load_ext autoreload
%autoreload 2
%load_ext autotime

import seaborn as sns
sns.set(color_codes=True)
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams['savefig.dpi'] = 100

from IPython.display import display

from utils.readers import CsvReader
from clfpipeline import Clfpipeline
from classifiers.mlp import MLP
from classifiers.misc import merge_dicts


# MLP classifier example

The MLP uses the softmax function to turn the final layer into class probabilities. The loss to optimize is the average cross-entropy, and the activation function can be set to either tanh or sigmoid.
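
As a rough illustration of that output layer and loss, here is a minimal plain-Python sketch of softmax and the average cross-entropy. The names `softmax` and `avg_cross_entropy` are hypothetical helpers written for this example and are not the functions used in classifiers/mlp.py.

```python
import math


def softmax(scores):
    """Turn a list of raw output-layer scores into class probabilities."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def avg_cross_entropy(score_rows, labels):
    """Mean of -log p(correct class) over all samples."""
    losses = [-math.log(softmax(scores)[label])
              for scores, label in zip(score_rows, labels)]
    return sum(losses) / len(losses)


scores = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]  # two samples, three classes
labels = [0, 2]                              # index of the correct class
probs = softmax(scores[0])
loss = avg_cross_entropy(scores, labels)
```

For the second sample all scores are equal, so every class gets probability 1/3 and its loss is log 3.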

In [2]:
# Set up a CsvReader for the pipeline object; we pass the filenames of the inputs and targets

csv_reader = CsvReader(has_header=False, 
                       fn='input/trunc_mnist/trunc_mnist20x20_inputs.csv', 
                       fn_labels='input/trunc_mnist/trunc_mnist20x20_targets.csv')

# Set up the main object 'pipe', which preprocesses the data before training
pipe = Clfpipeline(csv_reader, load_now=True)
# Keep references to the inputs and targets for later cells
digits_input = pipe.X
digits_targets = pipe.y
n_samples = pipe.X_train.shape[0]

Now set the parameters for the classifier and train it. A detailed description of each parameter will be kept in the GitHub README.
Note that these parameters are not optimal.

In [3]:
from classifiers.mlp import MLP
conf_clf = {'dims':(400, 50, 20, 15, 10), 
            'alpha':0.001,                
            'reg':0.01,
            'activation':'tanh'}

conf_train = {'batchsize':32,
              'costevery': 50,
               'nepoch': 500,
              'acc_batch':False,
              'opt':'rmsprop',
              'tolerance':0.01,
              'loss_metric':'accuracy'}
fullconf = merge_dicts(conf_clf, conf_train)
pipe.set_classifier(MLP(**conf_clf))
pipe.train(**conf_train)
pipe.plot()
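
The training configuration above selects 'opt': 'rmsprop'. The framework's optimizer code is not shown in this notebook, so the following is only a generic sketch of the standard RMSprop update in plain Python. The function name `rmsprop_step`, the decay rate 0.9 and the `eps` value are assumptions, not values taken from the repository.

```python
import math


def rmsprop_step(weights, grads, cache, alpha=0.001, decay=0.9, eps=1e-8):
    """Apply one RMSprop update in place and return (weights, cache)."""
    for i, g in enumerate(grads):
        # Running average of squared gradients, one entry per weight.
        cache[i] = decay * cache[i] + (1.0 - decay) * g * g
        # Scale the step by the root of that running average.
        weights[i] -= alpha * g / (math.sqrt(cache[i]) + eps)
    return weights, cache


# Toy problem: minimise f(w) = w0^2 + w1^2, whose gradient is 2 * w.
w = [1.0, -2.0]
cache = [0.0, 0.0]
for _ in range(1000):
    grads = [2.0 * wi for wi in w]
    rmsprop_step(w, grads, cache, alpha=0.01)
```

Because the step is divided by the running gradient magnitude, each weight moves by roughly `alpha` per update regardless of the raw gradient scale, which is the main difference from constant-step SGD.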

Now we can call pipe.save_weights if we want to save the weights and topology.

In [4]:
pipe.save_weights(algo_name="AE_EXAMPLE", dataset_name="TRUNCMNIST", confs=fullconf, folder="output/truncmnist/")

# Sparse Autoencoder Example

The sparse autoencoder uses the average squared error as its loss function; the activation function is sigmoid.
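
To make the loss concrete, here is a plain-Python sketch of a reconstruction error combined with the usual KL-divergence sparsity penalty. It assumes that the 'beta' and 'ro' parameters in the configs below are the penalty weight and the target mean activation, as in the textbook formulation; the repository's exact scaling and the helper names used here are not confirmed by this notebook.

```python
import math


def kl_sparsity_penalty(hidden_rows, ro):
    """Sum over hidden units of KL(ro || mean activation of that unit)."""
    n_rows = len(hidden_rows)
    n_hidden = len(hidden_rows[0])
    penalty = 0.0
    for j in range(n_hidden):
        # Mean sigmoid activation of unit j over the batch, in (0, 1).
        ro_hat = sum(row[j] for row in hidden_rows) / n_rows
        penalty += (ro * math.log(ro / ro_hat)
                    + (1.0 - ro) * math.log((1.0 - ro) / (1.0 - ro_hat)))
    return penalty


def mean_squared_error(outputs, targets):
    """Squared reconstruction error per sample, averaged over samples."""
    per_sample = [sum((o - t) ** 2 for o, t in zip(out, tgt))
                  for out, tgt in zip(outputs, targets)]
    return sum(per_sample) / len(per_sample)


def sparse_ae_loss(outputs, targets, hidden_rows, beta, ro):
    """Reconstruction error plus the weighted sparsity penalty."""
    return (mean_squared_error(outputs, targets)
            + beta * kl_sparsity_penalty(hidden_rows, ro))


hidden = [[0.05, 0.5], [0.05, 0.5]]  # unit 0 is on target, unit 1 is too active
outputs = [[1.0, 0.0], [0.0, 1.0]]
targets = [[0.0, 0.0], [0.0, 1.0]]
penalty = kl_sparsity_penalty(hidden, ro=0.05)
loss_no_sparsity = sparse_ae_loss(outputs, targets, hidden, beta=0, ro=0.05)
loss_sparse = sparse_ae_loss(outputs, targets, hidden, beta=3, ro=0.05)
```

With beta set to 0 the penalty term vanishes and the loss reduces to the plain squared error.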

In [5]:
from classifiers.autoencoder_sparse import AutoEncoderSparse
from utils.preprocessors import AutoEncoderPP
csv_reader = CsvReader(has_header=False, fn='input/trunc_mnist/trunc_mnist20x20_inputs.csv')
pipe = Clfpipeline(csv_reader, PreProc=AutoEncoderPP, load_now=True)

conf_clf = {'dims':(400, 100, 400), 
            'alpha':0.001,
            #'alpha':0.000095, 
            'reg':0.0000,
            'beta':0,
            'ro':0.05}

conf_train = {'batchsize':-1,
              'costevery': 1,
               'nepoch': 500,
                'acc_batch':True,
                'opt':'rmsprop',
                'loss_metric':'MSE',
                 'tolerance':0.01}
fullconf = merge_dicts(conf_clf, conf_train)

#aec = AutoEncoderSparse(**conf_clf)
#aec.grad_check(pipe.X[0], pipe.X[0])
pipe.set_classifier(AutoEncoderSparse(**conf_clf))
pipe.train(**conf_train)
pipe.plot()

## Sparse AE on full MNIST

In [6]:
from utils.preprocessors import MnistPP
csv_reader = CsvReader(fn='input/kaggle_set/train.csv',has_header=True, label_pos=0)
#csv_reader = CsvReader(has_header=False, fn='input/trunc_mnist/trunc_mnist20x20_inputs.csv')
pipe = Clfpipeline(csv_reader, PreProc=MnistPP, load_now=True)

In [7]:
from classifiers.autoencoder_sparse import AutoEncoderSparse
from utils.preprocessors import AutoEncoderPP
conf_clf = {'dims':(784, 196, 784), 
            'alpha':0.001,
            #'alpha':0.000095, 
            'reg':0.0000,
            'beta':3,
            'ro':0.1}
conf_train = {'batchsize':600,
              'costevery': 100,
               'nepoch': 500000,
                'acc_batch':True,
                'opt':'rmsprop',
                'loss_metric':'MSE',
                 'tolerance':-1}
fullconf = merge_dicts(conf_clf, conf_train)
#aec = AutoEncoderSparse(**conf_clf)
#aec.grad_check(pipe.X[0], pipe.X[0])
pipe.set_classifier(AutoEncoderSparse(**conf_clf))
pipe.train(**conf_train)
pipe.plot()


### We can visualize how the sparse autoencoder reconstructs digits
The first row is the input; the second row is the output of the AE.


In [8]:
from classifiers.misc import vis_images_truncmnist
vis_images_truncmnist(pipe.preprocessor.X, pipe.classifier, 28, 28, 10)


In [9]:
cur_input = pipe.preprocessor.X
print(cur_input.shape)

In [10]:
ae_hidden = pipe.classifier.predict_hidden(cur_input)

In [11]:
kaggle_mnst_output = pipe.y
print(ae_hidden.shape)

In [12]:
# Rows of W are the hidden units; drop the last column (presumably the bias term)
W = pipe.classifier.get_weights()[0][:, :-1]
plt.figure(figsize=(10, 10))
# Show the weights of each hidden unit as a 28x28 image on a 16x16 grid
for n in range(min(len(W), 16 * 16)):
    ax = plt.subplot(16, 16, n + 1)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    plt.imshow(W[n].reshape(28, 28))

# Stacked autoencoder on truncated MNIST
Let's build and train a stacked autoencoder with 2 hidden layers and a softmax output layer to classify the truncated MNIST digits.
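
The cell below trains the hidden layers one at a time: each stage is a sparse autoencoder that reconstructs its own input, and the next stage is trained on the hidden activations of the previous one. A small sketch of how the stage shapes follow from the network dimensions (the helper name `layerwise_dims` is made up for this example):

```python
def layerwise_dims(dims):
    """Return one (n_in, n_hidden, n_in) triple per pretraining stage.

    The final entry of `dims` is the number of classes; that layer is
    trained as a softmax classifier rather than as an autoencoder.
    """
    return [(dims[i], dims[i + 1], dims[i]) for i in range(len(dims) - 2)]


stages = layerwise_dims((400, 100, 30, 10))
# stages == [(400, 100, 400), (100, 30, 100)]
```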

In [14]:
from classifiers.autoencoder_sparse import AutoEncoderSparse
from utils.readers import DummyReader

csv_reader = CsvReader(has_header=False, fn='input/trunc_mnist/trunc_mnist20x20_inputs.csv')
pipe = Clfpipeline(csv_reader, PreProc=AutoEncoderPP, load_now=True)
cur_input = pipe.X
dims = (400, 100, 30, 10)


Ws = []  # pretrained weight matrices, one per stacked layer
for i in range(len(dims) - 2):
    # Each stage is a sparse AE that reconstructs its own input
    sparse_dims = (dims[i], dims[i + 1], dims[i])
    print(sparse_dims)
    conf_sparse_clf = {'dims': sparse_dims,
                       'alpha':0.001, 
                       'reg':0.0000,
                       'beta':0,
                       'ro':0.1}
    conf_sparse_train = {'batchsize':-1,
                          'costevery': 1,
                          'nepoch': 1000,
                          'acc_batch':True,
                          'opt':'rmsprop',
                          'loss_metric':'MSE',
                          'tolerance':0.1}
    ae_sparse = AutoEncoderSparse(**conf_sparse_clf)
    pipe.set_classifier(ae_sparse)
    print('Starting training of sparse autoencoder #%d' % (i + 1))
    print('dims are (%d %d %d)' % sparse_dims)
    pipe.train(**conf_sparse_train)
    print('Finished training')
    pipe.plot()
    # Extract W from the first layer of the sparse AE (last column dropped)
    W = ae_sparse.get_weights()[0][:, :-1]
    Ws.append(W)
    cur_input = ae_sparse.predict_hidden(cur_input)
    print('cur_input shape is %s' % (cur_input.shape,))
    
    # Now build a new pipeline for the next autoencoder, whose input
    # will be the hidden activations of the current one
    pipe = Clfpipeline(DummyReader(X=cur_input, y=cur_input), PreProc=AutoEncoderPP)

    
# train last stacked layer with softmax

    


#### Training softmax regression on the features from the last autoencoder

In [15]:
cur_input_cent = cur_input - np.mean(cur_input, axis=0)

In [16]:
sr_conf = {'dims':(30, 10),
          'alpha':0.001,
          'reg':0,
          'activation':'sigmoid'
          }
sr_train_conf = {'batchsize':700,
                'costevery':200,
                'nepoch':100000,
                'acc_batch':False,
                'opt':'rmsprop',
                'loss_metric':'accuracy',
                'tolerance':-10}

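# cur_input comes from the truncated MNIST inputs, so use the matching labels
# (digits_targets, loaded in In [2]) instead of the Kaggle labels
sr_pipe = Clfpipeline(DummyReader(X=cur_input_cent, y=digits_targets), load_now=True)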
sr_pipe.set_classifier(MLP(**sr_conf))
sr_pipe.train(**sr_train_conf)

# Extract the weights of the last (softmax) layer
last_W = sr_pipe.classifier.get_weights()[0][:, :-1]
Ws.append(last_W)
print('we have %d pretrained weights now' % len(Ws))




## Fine-tune the MLP

In [17]:
# Now build the MLP and initialize its weights from the pretrained sparse AEs
csv_reader = CsvReader(has_header=False, 
                       fn='input/trunc_mnist/trunc_mnist20x20_inputs.csv', 
                       fn_labels='input/trunc_mnist/trunc_mnist20x20_targets.csv')

# Set up the main object 'pipe', which preprocesses the data before training
pipe = Clfpipeline(csv_reader, load_now=True)
conf_clf = {'dims':dims, 
            'alpha':0.001,                
            'reg':0.01,
            'activation':'sigmoid',
            'init_weights':Ws
           }

conf_train = {'batchsize':32,
              'costevery': 50,
               'nepoch': 500,
              'acc_batch':False,
              'opt':'rmsprop',
              'tolerance':-10,
              'loss_metric':'accuracy'}
fullconf = merge_dicts(conf_clf, conf_train)
pipe.set_classifier(MLP(**conf_clf))
pipe.train(**conf_train)
pipe.plot()
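
Before passing pretrained matrices as 'init_weights', it can help to confirm that their shapes match 'dims'. The sketch below assumes an (n_out, n_in) layout with the bias column already removed, which is inferred from the visualization cell where each row of W is reshaped into one image; the helper names are hypothetical and not part of the framework.

```python
def expected_weight_shapes(dims):
    """Shapes of the layer matrices for `dims`, assuming (n_out, n_in) layout."""
    return [(dims[i + 1], dims[i]) for i in range(len(dims) - 1)]


def check_init_weights(dims, shapes):
    """Raise ValueError if the pretrained shapes do not fit the network."""
    expected = expected_weight_shapes(dims)
    if list(shapes) != expected:
        raise ValueError('expected shapes %s, got %s' % (expected, list(shapes)))
    return True


dims = (400, 100, 30, 10)
# In the notebook these would come from [W.shape for W in Ws].
pretrained_shapes = [(100, 400), (30, 100), (10, 30)]
ok = check_init_weights(dims, pretrained_shapes)
```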
