## Introduction

In the previous notebook, I started looking into the possibility of replacing the PCA part of the platform pipeline with an autoencoder approach in order to leverage GPU computing for a potential speed-up. In this notebook I will look into two different kinds of autoencoder: the variational autoencoder and the convolutional autoencoder.

## Configurations

This notebook was run entirely on a Google Cloud Compute Engine VM instance with an NVIDIA Tesla P100 GPU with 16 GB of memory. Other important dependencies are as follows:

1. **TensorFlow**: 1.5
2. **CUDA**: 9.0
3. **cuDNN**: 7.0.4
4. **numpy**: 1.14.0
5. **sklearn**: 0.19.1
6. **xgboost**: 0.7

## Loading relevant modules

In [1]:
from AutoNets import Autoencoder,Transformer
import numpy as np
import tensorflow as tf
from sklearn.decomposition import PCA
import xgboost as xgb
import pickle
import time
import h5py
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedShuffleSplit
import sklearn


## Building a few helper functions

In [2]:
def read_data(file_path):
    with h5py.File(file_path,'r') as f:
        data=f.get('dataset_1')
        Array=np.array(data)
    return Array

def pca(n_components,target):
    model=PCA(n_components=n_components)
    model.fit(target)
    PCs=model.transform(target)
    return PCs

def min_max_scale(data):
    Max=data.max()
    Min=data.min()
    return (data-Min)/(Max-Min)


def data_split(data,label,test_ratio=0.2):
    sss=StratifiedShuffleSplit(n_splits=1,test_size=test_ratio,random_state=0)
    for train_index,test_index in sss.split(data,label):
        return train_index,test_index

## Loading data

The data used in this analysis are the same as in the previous notebook, with two modifications. First, only 1000 samples are used, to avoid memory issues. Second, only the activations from the layer named block2_pool, the second pooling layer in the network, are used, since the bottleneck of the current pipeline is the slow dimensionality reduction of the very high-dimensional activations from the earlier layers. The input data in the experiments below therefore have shape (1000,401408), or (1000,56,56,128) in the convolutional case.
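
The two shapes quoted above describe the same data, and the flattened length follows directly from the block2_pool dimensions. The short sketch below checks that arithmetic and gives a rough idea of why the subset is limited to 1000 samples. The 4 bytes per value is an assumption (float32 storage); the actual dtype of the HDF5 file is not stated in this notebook.

```python
# Shapes quoted in the text above.
N_SAMPLES = 1000
CONV_SHAPE = (56, 56, 128)  # height, width, channels of block2_pool

# Flattened feature length is the product of the three dimensions.
flat_dim = 1
for d in CONV_SHAPE:
    flat_dim *= d

BYTES_PER_VALUE = 4  # assumption: activations stored as float32

total_bytes = N_SAMPLES * flat_dim * BYTES_PER_VALUE
print("flattened feature length:", flat_dim)
print("approx. size of the subset: {:.2f} GiB".format(total_bytes / 2.0**30))
```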

In [3]:
vgg_features=read_data('block2_poll_train_features.h5')# load block2_pool_train_features
vgg_features=vgg_features[:1000,:]# Take a subset
label=np.load('imagenet_train_label.pk')
label=label[:1000]

## PCA

### Second Pooling Layer

In [4]:
# Time the speed of PCA
start=time.time()
PCs_block2_pool=pca(128,vgg_features)
print('Run Time:{} seconds'.format(time.time()-start))

Run Time:23.85962963104248 seconds
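
The same `start=time.time()` / `print(...)` pattern is repeated for every approach in this notebook. As a hedged sketch, it could be wrapped in a small context manager so that each timed block is measured the same way. The name `timer` and its `label` argument are hypothetical helpers, not part of the original code, and the workload below is only a stand-in for a call such as `pca(128,vgg_features)`.

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(label, clock=time.perf_counter):
    """Time the enclosed block and print the elapsed wall-clock seconds."""
    result = {}
    start = clock()
    try:
        yield result
    finally:
        # Store the elapsed time so the caller can reuse it after the block.
        result["seconds"] = clock() - start
        print("{} Run Time:{} seconds".format(label, result["seconds"]))

# Stand-in workload; in the notebook the body would be the PCA or
# autoencoder call being timed.
with timer("demo") as t:
    total = sum(i * i for i in range(100000))
```

The elapsed time stays available in `t["seconds"]`, which makes it easier to collect the numbers for a summary table.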


In [5]:
train_index,test_index=data_split(PCs_block2_pool,label)# Train Test split

In [6]:
pickle.dump(PCs_block2_pool,open('PCs_block2_pool.pk','wb'))

In [7]:
# create a xgboost classifier and fit it with train data
model=xgb.XGBClassifier()
model.fit(PCs_block2_pool[train_index,:],label[train_index])

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, objective='multi:softprob', random_state=0,
       reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
       silent=True, subsample=1)

In [8]:
# make predictions using test data and check the accuracy
predictions=model.predict(PCs_block2_pool[test_index,:])
accuracy_score(label[test_index],predictions)

0.355

**Thoughts**: PCA performance is consistent with its performance in the previous experiments.

## Variational Autoencoder

### Second Pooling Layer

In [9]:
# Time the speed of Variational Autoencoder
start=time.time()
# build a one-hidden-layer variational autoencoder
auto=Autoencoder(401408,1,1,[128],[401408],mode='Variation',activation=tf.nn.elu)
vgg_features=auto.scale_data(vgg_features,method='Normal')
model=auto.build(lr=1e-4)
auto.train(vgg_features,10,32,model,return_pc=False)
model=Transformer(vgg_features,mode='Variation')
encodings,_=model.transform()
print('Run Time:{} seconds'.format(time.time()-start))

Number of epchs: 1/10 Loss: 376449.65625
Number of epchs: 2/10 Loss: 383916.1875
Number of epchs: 3/10 Loss: 358984.78125
Number of epchs: 4/10 Loss: 346379.71875
Number of epchs: 5/10 Loss: 348259.75
Number of epchs: 6/10 Loss: 347537.09375
Number of epchs: 7/10 Loss: 342321.96875
Number of epchs: 8/10 Loss: 336006.40625
Number of epchs: 9/10 Loss: 329707.8125
Number of epchs: 10/10 Loss: 324625.03125
INFO:tensorflow:Restoring parameters from ./saved_model/auto
Model trained and saved
INFO:tensorflow:Restoring parameters from ./saved_model/auto
Run Time:52.07313561439514 seconds


In [10]:
train_index,test_index=data_split(encodings,label)# train test split

In [11]:
# create a xgboost classifier and fit it with train data
model2=xgb.XGBClassifier()
model2.fit(encodings[train_index,:],label[train_index])

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, objective='multi:softprob', random_state=0,
       reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
       silent=True, subsample=1)

In [12]:
# make predictions using test data and check the accuracy
predictions=model2.predict(encodings[test_index,:])
accuracy_score(label[test_index],predictions)

0.41

**Thoughts**: In terms of classification accuracy, the variational autoencoder is the best of all the approaches I have tried so far. However, it also uses the largest number of parameters, which significantly slows down saving the trained model. This in turn slows down obtaining the desired encodings, because my implementation first saves the model and then performs inference from the restored model. A way to work around this issue is to perform inference within the same TensorFlow session as the training. At the cost of more GPU memory, this eliminates the need to save the model. This method is illustrated in the next section.
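
To give a rough sense of why saving this model is slow, the sketch below counts the parameters of a one-hidden-layer VAE with the dimensions used above. The layer layout is an assumption, since the internals of `AutoNets.Autoencoder` are not shown here: two dense encoder heads (mean and log-variance) mapping 401408 inputs to 128 units each, and one dense decoder layer mapping 128 units back to 401408. The 4 bytes per parameter (float32) is also an assumption.

```python
INPUT_DIM = 401408  # flattened block2_pool activation
LATENT_DIM = 128    # size of the encoding

def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Assumed layout: two encoder heads (mean, log-variance) and one decoder layer.
encoder_params = 2 * dense_params(INPUT_DIM, LATENT_DIM)
decoder_params = dense_params(LATENT_DIM, INPUT_DIM)
vae_params = encoder_params + decoder_params

# For comparison: a 128-component PCA projection matrix plus the mean vector.
pca_params = INPUT_DIM * LATENT_DIM + INPUT_DIM

BYTES_PER_PARAM = 4  # assumption: float32 weights
print("VAE parameters:", vae_params)
print("VAE weights only: {:.0f} MiB".format(vae_params * BYTES_PER_PARAM / 2.0**20))
print("PCA parameters:", pca_params)
```

Under these assumptions the weights alone come to several hundred MiB, and a checkpoint may be larger still if optimizer state is saved with it.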

## Variational Autoencoder with In-Session Inference

In [4]:
# Time the speed of Variational Autoencoder
start=time.time()
auto=Autoencoder(401408,1,1,[128],[401408],mode='Variation',activation=tf.nn.elu)
vgg_features=auto.scale_data(vgg_features,method='Normal')
model=auto.build(lr=1e-4)
encodings=auto.train(vgg_features,10,32,model,return_pc=True)
print('Run Time:{} seconds'.format(time.time()-start))

Number of epchs: 1/10 Loss: 418679.375
Number of epchs: 2/10 Loss: 401375.78125
Number of epchs: 3/10 Loss: 384766.15625
Number of epchs: 4/10 Loss: 373281.15625
Number of epchs: 5/10 Loss: 369464.46875
Number of epchs: 6/10 Loss: 356834.71875
Number of epchs: 7/10 Loss: 351872.34375
Number of epchs: 8/10 Loss: 349648.375
Number of epchs: 9/10 Loss: 354882.8125
Number of epchs: 10/10 Loss: 344756.40625
Run Time:20.920249700546265 seconds


In [5]:
train_index,test_index=data_split(encodings,label)# train test split

In [6]:
# create a xgboost classifier and fit it with train data
model2=xgb.XGBClassifier()
model2.fit(encodings[train_index,:],label[train_index])

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, objective='multi:softprob', random_state=0,
       reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
       silent=True, subsample=1)

In [7]:
# make predictions using test data and check the accuracy
predictions=model2.predict(encodings[test_index,:])
accuracy_score(label[test_index],predictions)

0.37

**Thoughts**: By eliminating the model-saving step altogether, a considerable amount of time is saved, and the VAE in this case is actually slightly faster than the PCA approach. However, this approach uses more GPU memory, which might cause out-of-memory errors in certain situations.

## Convolutional Autoencoder

In [4]:
vgg_features=vgg_features.reshape(-1,56,56,128)# reshape each sample to a 3-dimensional tensor (56,56,128)

### Second Pooling Layer

In [10]:
# Time the speed of Convolutional Autoencoder
start=time.time()
auto=Autoencoder([56,56,128],3,3,[64,32,4],[32,64,128],ksize=3,stride=2,mode='Convolution',activation=tf.nn.relu)
model=auto.build(lr=1e-3)
auto.train(vgg_features,10,32,model,return_pc=False)
model=Transformer(vgg_features,mode='Convolution')
encodings,_=model.transform()
print('Run Time:{} seconds'.format(time.time()-start))

Number of epchs: 1/10 Loss: 385214.34375
Number of epchs: 2/10 Loss: 370110.21875
Number of epchs: 3/10 Loss: 362324.84375
Number of epchs: 4/10 Loss: 357767.03125
Number of epchs: 5/10 Loss: 353236.9375
Number of epchs: 6/10 Loss: 343601.6875
Number of epchs: 7/10 Loss: 337408.65625
Number of epchs: 8/10 Loss: 333473.84375
Number of epchs: 9/10 Loss: 331280.0625
Number of epchs: 10/10 Loss: 330185.03125
Model trained and saved
INFO:tensorflow:Restoring parameters from ./saved_model/auto
Run Time:12.177278757095337 seconds


In [11]:
encodings=np.reshape(encodings,[1000,7*7*4])

In [12]:
train_index,test_index=data_split(encodings,label)# train test split

In [13]:
# create a xgboost classifier and fit it with train data
model2=xgb.XGBClassifier()
model2.fit(encodings[train_index,:],label[train_index])

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, objective='multi:softprob', random_state=0,
       reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
       silent=True, subsample=1)

In [14]:
# make predictions using test data and check the accuracy
predictions=model2.predict(encodings[test_index,:])
accuracy_score(label[test_index],predictions)

0.26

**Thoughts**: The convolutional autoencoder has a sizable speed advantage over PCA. However, in my experiments it was hard to reach the same level of encoding quality as PCA, as measured by classification accuracy.
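
The `7*7*4` used when reshaping the encodings above can be traced through the encoder. The sketch below assumes that `[64,32,4]` are the filter counts of the three encoder layers and that each layer uses stride 2 with 'SAME' padding, so that each layer halves the spatial size (rounding up). Both points are assumptions about `AutoNets`, which is not shown in this notebook; the function names are illustrative only.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a strided convolution with 'SAME' padding."""
    return int(math.ceil(size / float(stride)))

def encoder_output_shape(input_shape, filters, stride):
    """Follow (height, width, channels) through a stack of strided conv layers."""
    height, width, channels = input_shape
    for n_filters in filters:
        height = same_padding_output(height, stride)
        width = same_padding_output(width, stride)
        channels = n_filters
    return (height, width, channels)

shape = encoder_output_shape((56, 56, 128), filters=[64, 32, 4], stride=2)
encoding_length = shape[0] * shape[1] * shape[2]
print(shape, encoding_length)
```

Under these assumptions the flattened encoding has 196 values per sample, compared with 128 for the PCA and VAE encodings, so the comparison is not made at exactly the same dimensionality.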

## Summary

From the above experiments, we can see that the variational autoencoder offers the highest-quality encodings, but its speed advantage over PCA is small with in-session inference (20.92 s versus 23.86 s) and disappears when the model is saved and restored (52.07 s). The convolutional autoencoder, on the other hand, offers a sizable speed boost, but the quality of its encodings is below that of PCA. A summary table of the above experiments is presented below.



**Performance on second pooling layer activations**

| Model | Time | Classification accuracy |
|------|------|------|
| PCA | 23.86 s | 0.355 |
| VAE | 52.07 s | 0.41 |
| VAE with in-session inference | 20.92 s | 0.37 |
| Convolutional autoencoder | 12.18 s | 0.26 |
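
To put these numbers in perspective, the sketch below computes each approach's speed-up relative to PCA and a rough binomial standard error for each accuracy. The times and accuracies are copied from the cell outputs above. The test-set size of 200 follows from `test_ratio=0.2` applied to 1000 samples. The standard-error formula is a simple approximation added here for illustration, not part of the original analysis.

```python
import math

# (run time in seconds, accuracy), copied from the cell outputs above.
results = {
    "PCA": (23.86, 0.355),
    "VAE": (52.07, 0.41),
    "VAE with in-session inference": (20.92, 0.37),
    "Convolutional autoencoder": (12.18, 0.26),
}

N_TEST = 200  # 20% of the 1000 samples, as set by data_split's test_ratio

def speedup_over_pca(name):
    """Return how many times faster than PCA the given approach ran."""
    return results["PCA"][0] / results[name][0]

def accuracy_std_error(acc, n=N_TEST):
    """Binomial standard error of an accuracy measured on n test samples."""
    return math.sqrt(acc * (1.0 - acc) / n)

for name, (seconds, acc) in results.items():
    print("{:<32s} speed-up {:.2f}x  accuracy {:.3f} +/- {:.3f}".format(
        name, speedup_over_pca(name), acc, accuracy_std_error(acc)))
```

With only 200 test samples the standard error is roughly 0.03, so differences of a few points between approaches should be read with some caution.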

