## 1. Import Libraries
* pickle - Used for importing created features
* numpy - Used for working with arrays
* TensorFlow - For creating deep neural network graphs and later processing them
* Keras ImageDataGenerator - Used for randomly changing input data for more robust learning

In [1]:
import pickle
import numpy as np
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator

Using TensorFlow backend.


## 2. Import the data
The data is already pre-processed. Here we use a mixture of horizontal and vertical data, which gave me better classification results than using only horizontal or only vertical data. Performance is comparable (but not better :/) when horizontal, vertical, and summed horizontal-plus-vertical images are all combined, so leaving out the summed images keeps our network smaller at no cost.
* Provide the path to the pickle files
* Load the data and convert it into numpy arrays/matrices

In [2]:
# Load the pickled feature dictionaries; each file is closed after reading
with open("train.pickle","rb") as pickle_train:
    trainX = pickle.load(pickle_train)
with open("valid.pickle","rb") as pickle_valid:
    validX = pickle.load(pickle_valid)

X_train = np.array(trainX['xtrain'],dtype=np.float32)
X_valid = np.array(validX['xvalid'],dtype=np.float32)

y_train = np.array(trainX['ytrain'],dtype=np.int32)
y_valid = np.array(validX['yvalid'],dtype=np.int32)

datagen = ImageDataGenerator(
    rotation_range=20,  # randomly rotate images in the range (degrees, 0 to 180)
    width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
    horizontal_flip=True,  # randomly flip images horizontally
    vertical_flip=False)  # do not flip images vertically
    
datagen.fit(X_train)
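
As a rough sketch of the file layout the loading code expects, the snippet below round-trips a tiny stand-in dictionary through pickle. The keys `xtrain` and `ytrain` come from the cell above; the array shape, the random values, and the in-memory buffer are assumptions for illustration only, not the real data.

```python
import io
import pickle
import numpy as np

# Hypothetical stand-in for train.pickle: 4 single-channel 75 x 75 images with labels
rng = np.random.default_rng(0)
fake_train = {
    "xtrain": rng.random((4, 75, 75, 1)).tolist(),
    "ytrain": [0, 1, 1, 0],
}

# Round-trip through an in-memory buffer instead of a file on disk
buffer = io.BytesIO()
pickle.dump(fake_train, buffer)
buffer.seek(0)
loaded = pickle.load(buffer)

# Same conversion as in the notebook cell
X_demo = np.array(loaded["xtrain"], dtype=np.float32)
y_demo = np.array(loaded["ytrain"], dtype=np.int32)
print(X_demo.shape, y_demo.shape)
```

The 4-D shape (samples, height, width, channels) is assumed here because `ImageDataGenerator.fit` works on image batches of that rank.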

## 3. Global Variables
Create some global variables/parameters used by our network.
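* image_size - height and width of the images, here 75 x 75
* n_class - number of classes for the one-hot encoding (iceberg or ship)
* n_layer1 - number of units in each hidden layer, here 6144
* encoder_output - number of features produced by the encoder, here 32
* learning_rate - step size used by the optimizer, here 9e-5
* batch_size - number of samples supplied to the network per training step. I use 32 because it creates medium-sized tensors; with a bigger batch size (64, 128, etc.) the tensors get large and become problematic to train on my laptop.
* epochs - number of times the network sees the whole dataset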

In [3]:
image_size = 75
n_class = 2
n_layer1 = 6144
encoder_output = 32
learning_rate = 9e-5
batch_size = 32
epochs = 1000
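
To make these numbers concrete, here is a small sketch of how they translate into input size and training steps. The training-set size `n_samples` is a hypothetical value chosen for illustration; the real count depends on the pickle files.

```python
image_size = 75
batch_size = 32
epochs = 1000
n_samples = 1000  # hypothetical training-set size, not taken from the notebook

# Each image is flattened into one vector before it enters the network
inputs_per_image = image_size * image_size

# Same integer division as in the training loop: a partial last batch is dropped
n_batches = int(n_samples / batch_size)
total_updates = n_batches * epochs

print(inputs_per_image, n_batches, total_updates)
```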

## 4. Design the Graph
----------------------------
### 4.1 Methods
Some helper functions that are used repeatedly while creating the graph:
* weight_variable: initializes weights. It uses Xavier initialization, which typically provides better starting weights than naive random initialization and tends to converge faster.
    * inputs: the shape of the tensor and a name for the variable
* bias_variable: creates a bias variable, initialized to a small constant value (0.01).
    * inputs: the size (matching the layer's weights) and a name for the bias

In [4]:
def weight_variable(shape,nm):
  initial = tf.contrib.layers.xavier_initializer()
  return tf.get_variable(nm,shape=shape,initializer=initial)

def bias_variable(shape,nm):
  # Name the variable itself (not only its constant initializer) so it is easy to find in the graph
  initial = tf.constant(0.01, shape=shape)
  return tf.Variable(initial, name=nm)
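
For intuition, the sketch below reproduces the idea behind these helpers in plain NumPy. It assumes the uniform Glorot/Xavier rule, with weights drawn from [-limit, limit] where limit = sqrt(6 / (fan_in + fan_out)), which I understand to be the default behaviour of `xavier_initializer`. The function name `xavier_uniform` and the reduced layer width of 64 are made up for this example.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Draw a (fan_in, fan_out) weight matrix with the uniform Xavier/Glorot rule."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    weights = rng.uniform(-limit, limit, size=(fan_in, fan_out))
    return weights.astype(np.float32), limit

rng = np.random.default_rng(0)
# 5625 inputs as in the notebook; 64 outputs is a reduced, hypothetical width
W_demo, limit = xavier_uniform(75 * 75, 64, rng)

# Bias counterpart: a small positive constant, as in bias_variable
b_demo = np.full((64,), 0.01, dtype=np.float32)

print(W_demo.shape, float(limit), float(W_demo.var()))
```

The variance of such a uniform draw is limit squared over 3, i.e. 2 / (fan_in + fan_out), so wider layers start with proportionally smaller weights.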

### 4.2 Placeholder
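Input and target placeholders. Because this is an autoencoder, both hold flattened 75 x 75 images, and during training the target `y_` is fed the same batch as the input `x`.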

In [5]:
x = tf.placeholder(tf.float32, shape=[None, image_size * image_size])
y_ = tf.placeholder(tf.float32, shape=[None, image_size * image_size])

### 4.3 Encoder
Two fully connected layers followed by an output layer of size 32. The encoder should compress each image into the 32 most useful features. The output of the encoder is passed on to the decoder to recreate the input.

In [6]:
W_e1 = weight_variable([image_size * image_size, n_layer1],"W_e1")
b_e1 = bias_variable([n_layer1],"b_e1")

e_layer1 = tf.nn.relu(tf.add(tf.matmul(x, W_e1), b_e1),name='e_l1')

W_e2 = weight_variable([n_layer1, n_layer1],"W_e2")
b_e2 = bias_variable([n_layer1],"b_e2")

e_layer2 = tf.nn.relu(tf.add(tf.matmul(e_layer1, W_e2), b_e2),name='e_l2')

W_e3 = weight_variable([n_layer1, encoder_output],"W_e3")
b_e3 = bias_variable([encoder_output],"b_e3")

e_out = tf.nn.relu(tf.add(tf.matmul(e_layer2, W_e3), b_e3),name='e_l3')
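
The encoder's data flow can be sketched in NumPy as three dense-plus-ReLU steps. The layer sizes below are deliberately tiny, hypothetical stand-ins for 5625, 6144, and 32, and the weights are random rather than trained, so only the shapes are meaningful.

```python
import numpy as np

def dense_relu(inputs, W, b):
    """One fully connected layer followed by ReLU, mirroring relu(matmul(x, W) + b)."""
    return np.maximum(inputs @ W + b, 0.0)

rng = np.random.default_rng(0)
n_in, n_hidden, n_code = 25, 16, 4  # hypothetical small sizes for illustration
batch = rng.random((3, n_in))       # 3 flattened "images"

W1, b1 = rng.normal(0, 0.1, (n_in, n_hidden)), np.full(n_hidden, 0.01)
W2, b2 = rng.normal(0, 0.1, (n_hidden, n_hidden)), np.full(n_hidden, 0.01)
W3, b3 = rng.normal(0, 0.1, (n_hidden, n_code)), np.full(n_code, 0.01)

# Each sample is squeezed from n_in values down to n_code features
code = dense_relu(dense_relu(dense_relu(batch, W1, b1), W2, b2), W3, b3)
print(code.shape)
```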

### 4.4 Decoder
The decoder receives the output of the encoder and recreates the image. It consists of two fully connected layers followed by an output layer with as many units as the input image has pixels.

In [7]:
W_d1 = weight_variable([encoder_output, n_layer1],"W_d1")
b_d1 = bias_variable([n_layer1],"b_d1")

d_layer1 = tf.nn.relu(tf.add(tf.matmul(e_out, W_d1), b_d1),name='d_l1')

W_d2 = weight_variable([n_layer1, n_layer1],"W_d2")
b_d2 = bias_variable([n_layer1],"b_d2")

d_layer2 = tf.nn.relu(tf.add(tf.matmul(d_layer1, W_d2), b_d2),name='d_l2')

W_d3 = weight_variable([n_layer1, image_size * image_size],"W_d3")
b_d3 = bias_variable([image_size * image_size],"b_d3")

d_out = tf.nn.relu(tf.add(tf.matmul(d_layer2, W_d3), b_d3),name='d_l3')
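
As a back-of-the-envelope check on how large this encoder/decoder pair is, the sketch below counts the trainable parameters implied by the layer sizes used above. The 4 bytes per parameter assumes float32 storage, and the figure covers the parameters only, not activations or optimizer state.

```python
# Layer widths from input to reconstruction, taken from the global variables
layer_sizes = [75 * 75, 6144, 6144, 32, 6144, 6144, 75 * 75]

# One weight matrix per consecutive pair of layers, one bias vector per non-input layer
n_weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
n_biases = sum(layer_sizes[1:])
n_params = n_weights + n_biases

size_mb = n_params * 4 / 1e6  # float32 = 4 bytes per parameter
print(n_params, round(size_mb))
```

That comes to about 145 million parameters, or roughly 580 MB in float32. Adam additionally keeps two moment estimates per parameter, so the memory needed during training is several times that.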

### 4.5 Loss and Optimization
* loss - Mean of the squared difference between the input image and the output image
* train_optimizer - AdamOptimizer with learning rate 9e-5 (the `learning_rate` global)

In [8]:
loss = tf.reduce_mean(tf.square(y_ - d_out))

train_optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
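
The loss is an ordinary mean squared error over every pixel of every image in the batch. A minimal NumPy sketch with made-up 2 x 2 values shows the arithmetic; the helper name `mse` is only for this example.

```python
import numpy as np

def mse(target, reconstruction):
    """Mean of the element-wise squared differences, like reduce_mean(square(y_ - d_out))."""
    return np.mean(np.square(target - reconstruction))

target = np.array([[0.0, 1.0], [2.0, 3.0]])
reconstruction = np.array([[0.0, 0.0], [2.0, 1.0]])

# Squared differences are 0, 1, 0, 4, so the mean is 5 / 4
print(mse(target, reconstruction))  # 1.25
```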

## 5. Visualization
Plot the loss and encoder distribution.

In [9]:
tf.summary.scalar(name='Loss', tensor=loss)
tf.summary.histogram(name='Encoder_Distribution', values=e_out)
summary_ae = tf.summary.merge_all()

## 6. Training
Train the model on our data and track the loss.

In [10]:
global_step = 0

def flat(images):
    # Flatten a batch of images into vectors of length image_size * image_size
    return np.reshape(images, (-1, image_size * image_size))

with tf.Session() as sess:
    tf_saver = tf.train.Saver()
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter(logdir='./Iceberg/Autoencoder', graph=sess.graph)
    for e in range(epochs):
        n_batches = int(X_train.shape[0] / batch_size)
        print("Epoch:{} ".format(e))
        gen_train = datagen.flow(X_train, y_train,
                                           batch_size=batch_size)
        for batch in range(n_batches):
            batch_data = next(gen_train)
            # Autoencoder: the augmented batch is both the input and the target
            feed = {x: flat(batch_data[0]), y_: flat(batch_data[0])}
            sess.run(train_optimizer, feed_dict=feed)
            if(batch % 8 == 0):
                batch_loss, summary = sess.run([loss, summary_ae], feed_dict=feed)
                # Write the summary so TensorBoard can actually plot it
                writer.add_summary(summary, global_step=global_step)
                print(" - Batch no: {}, train loss: {}".format(batch,batch_loss))
            global_step += 1
        batch_loss, summary = sess.run([loss, summary_ae], feed_dict={x: flat(X_valid), y_: flat(X_valid)})
        print("-Validation loss after epoc: {} ".format(batch_loss))
    writer.close()
    #Save your model
    tf_saver.save(sess, save_path='./Iceberg/Autoencoder/SavedModel')

Epoch:0 
 - Batch no: 0, train loss: 1.201899766921997
 - Batch no: 8, train loss: 0.9124476909637451
 - Batch no: 16, train loss: 0.6679865121841431
 - Batch no: 24, train loss: 0.7741681933403015
-Validation loss after epoc: 0.6996757984161377 
Epoch:1 
 - Batch no: 0, train loss: 0.7932273149490356
 - Batch no: 8, train loss: 0.7835976481437683
 - Batch no: 16, train loss: 0.773699164390564
 - Batch no: 24, train loss: 1.0435341596603394
-Validation loss after epoc: 0.6788281798362732 
Epoch:2 
 - Batch no: 0, train loss: 0.6922950744628906
 - Batch no: 8, train loss: 0.9551734924316406
 - Batch no: 16, train loss: 0.6900110840797424
 - Batch no: 24, train loss: 0.6741445660591125
-Validation loss after epoc: 0.6700567007064819 
Epoch:3 
 - Batch no: 0, train loss: 0.8280214071273804
 - Batch no: 8, train loss: 0.7751855850219727
 - Batch no: 16, train loss: 0.6818187832832336
 - Batch no: 24, train loss: 0.6612483859062195
-Validation loss after epoc: 0.6622953414916992 
Epoch:4 
 