`Course Instructor`: **John Chiasson**

`Author (TA)`: **Ruthvik Vaila**

# References
* This notebook shows how to use `nn_classifierclass.py`, a wrapper around `Keras` (which is itself a wrapper around `Tensorflow`), to quickly train fully connected neural networks. Some of the links below are not strictly necessary for this notebook, but they are good reads.
* [Neural Nets](http://neuralnetworksanddeeplearning.com/chap3.html)
* [Randombackprop](https://github.com/xuexue/randombp/blob/master/randombp.py)
* [Randombackprop](https://github.com/sangyi92/feedback_alignment/blob/master/RFA.ipynb)
* [Backprop](http://blog.aloni.org/posts/backprop-with-tensorflow/)
* [Initializers](https://towardsdatascience.com/hyper-parameters-in-action-part-ii-weight-initializers-35aee1a28404)
* [Dropout](https://github.com/pinae/TensorFlow-MNIST-example/blob/master/fully-connected.py)
* [Softmax](https://stackoverflow.com/questions/34240703/what-is-logits-softmax-and-softmax-cross-entropy-with-logits)
* [SoftmaxLogits](https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits)
* Tested on `Python 3.7.5` with `Tensorflow 1.15.0` and `Keras 2.2.4`. 
* Tested on `Python 2.7.17` with `Tensorflow 1.15.3` and `Keras 2.2.4`.

# Imports

In [1]:
import sys
sys.version

'3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)]'

In [2]:
import h5py, sys, os, time, pickle, gzip
GPU = False
if not GPU:
    # hide all GPUs from TensorFlow; this must be set before TensorFlow is imported
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import numpy as np
import matplotlib.pyplot as plt
from keras import backend
import IPython
from tensorflow.python.client import device_lib
import nn_classifierclass as cls
%load_ext tensorboard

Using TensorFlow backend.

In [3]:
device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 5712067084846844192]

# Load Data

```
# IGNORE THIS CELL FOR NOW
filename = 'data/emnist_train_x.h5'
with h5py.File(filename, 'r') as hf:
    train_x = hf['pool1_spike_features'][:]

filename = 'data/emnist_test_x.h5'
with h5py.File(filename, 'r') as hf:
    test_x = hf['pool1_spike_features'][:]

print('Train data shape:{}'.format(train_x.shape))
print('Test data shape:{}'.format(test_x.shape))

filename = 'data/emnist_train_y.pkl'
filehandle = open(filename, 'rb')
train_y = pickle.load(filehandle)
filehandle.close()

filename = 'data/emnist_test_y.pkl'
filehandle = open(filename, 'rb')
test_y = pickle.load(filehandle)
filehandle.close()

print('Train labels shape:{}'.format(train_y.shape))
print('Test labels shape:{}'.format(test_y.shape))
```

In [4]:
filename = 'data/mnist.pkl.gz'
# the context manager closes the file even if loading fails
with gzip.open(filename, 'rb') as filehandle:
    train_data, val_data, test_data = pickle.load(filehandle, encoding='latin1')
train_x, train_y = train_data
print('Train data shape:{} and labels shape:{}'.format(train_x.shape, train_y.shape))
val_x, val_y = val_data
print('Valid data shape:{} and labels shape:{}'.format(val_x.shape, val_y.shape))
# combine train and validation data; the classifier splits off a validation set internally (see val_frac)
train_x = np.concatenate([train_x, val_x], axis=0)
train_y = np.concatenate([train_y, val_y], axis=0)
print('Train data shape:{}'.format(train_x.shape))
print('Train labels shape:{}'.format(train_y.shape))
test_x, test_y = test_data
print('Test data shape:{}'.format(test_x.shape))
print('Test labels shape:{}'.format(test_y.shape))

Train data shape:(50000, 784) and labels shape:(50000,)
Valid data shape:(10000, 784) and labels shape:(10000,)
Train data shape:(60000, 784)
Train labels shape:(60000,)
Test data shape:(10000, 784)
Test labels shape:(10000,)


# Set up a NN classifier and train with `Keras`

In [5]:
n_classes = 10
n_hidden = 1 #number of hidden layers
network_structure = [train_x.shape[1],1500,n_classes]
#activation_fns = ['sigmoid']*(n_hidden)+['softmax']
#activation_fns = ['tanh']*(n_hidden)+['softmax']
activation_fns = ['relu']*(n_hidden)+['softmax']
#activation_fns = ['swish']*(n_hidden)+['softmax']
#activation_fns = ['softmax']
#sys.exit()
#weight_init = 'he_uniform'
weight_init = 'glorot_uniform'
eta_drop_type = 'plateau'
lmbda = 0.000
batch_size = 32  # mini-batch size
eta = 0.005

log_path = r'./logs'+''.join(['/']+activation_fns+['-',weight_init,'-',eta_drop_type,str(-lmbda)])+'/eta'+str(-eta)+'/'
print(log_path)

./logs/relusoftmax-glorot_uniform-plateau-0.0/eta-0.005/
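
The `log_path` expression above relies on `str(-lmbda)` and `str(-eta)` to supply the hyphens in the directory name (for example, `str(-0.005)` is `'-0.005'`). A more explicit way to build the same string is sketched below. The helper name `make_log_path` is hypothetical and is not part of `nn_classifierclass.py`.

```python
def make_log_path(activation_fns, weight_init, eta_drop_type, lmbda, eta, root='./logs'):
    """Build the log directory name used in this notebook (hypothetical helper)."""
    # e.g. 'relusoftmax-glorot_uniform-plateau-0.0'
    run_name = '{}-{}-{}-{}'.format(''.join(activation_fns), weight_init,
                                    eta_drop_type, lmbda)
    # e.g. './logs/<run_name>/eta-0.005/'
    return '{}/{}/eta-{}/'.format(root, run_name, eta)


example_path = make_log_path(['relu', 'softmax'], 'glorot_uniform', 'plateau', 0.0, 0.005)
print(example_path)
```

For the settings used in this cell, this should give the same directory as the printed `log_path`.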


In [6]:
repeats = 1
all_histories = []
for repeat in range(repeats):
    print('Repeat:{}'.format(repeat))
    backend.clear_session()
    neural_net = cls.Classifier(train_data=(train_x,train_y),
                                test_data=(test_x,test_y),
                                network_structure=network_structure,activation_fns=activation_fns,
                                epochs=3,eta=eta,lmbda=lmbda,verbose=1,plots=False,optimizer='adam',
                                eta_decay_factor=1.007,patience=8,eta_drop_type=eta_drop_type,
                                epochs_drop=1, val_frac=0.09,drop_out=0.0,ip_lyr_drop_out=0.0,
                                leaky_alpha=0.1,leaky_relu=False,weight_init=weight_init,
                                bias_init=0.1,batch_size=batch_size,log_path=log_path)
    neural_net.keras_fcn_classifier()
    all_histories.append(neural_net.history)

Repeat:0

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 1500)              1177500   
_________________________________________________________________
activation_1 (Activation)    (None, 1500)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                15010     
_________________________________________________________________
activation_2 (Activation)    (None, 10)                0         
Total params: 1,192,510
Trainable params: 1,192,510
Non-trainable params: 0
_________________________________________________________________
None

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where

Train on 54600 samples, validate on 5400 samples

Epoch 1/3

Epoch 00001: val_acc improved from -inf to 0.96481, saving model to weights.bes
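
The numbers in the output above can be checked by hand. A dense layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases, and `val_frac=0.09` of the 60000 combined training samples accounts for the 5400 validation samples. The sketch below assumes the validation set size is simply `val_frac` times the training set size, rounded to an integer; how `nn_classifierclass.py` actually performs the split is not shown in this notebook.

```python
def dense_param_counts(structure):
    """Number of weights plus biases for each dense layer of a fully connected net."""
    return [n_in * n_out + n_out
            for n_in, n_out in zip(structure[:-1], structure[1:])]


counts = dense_param_counts([784, 1500, 10])
print('params per layer: {}, total: {}'.format(counts, sum(counts)))

n_total = 60000                      # train + validation samples combined
n_val = int(round(0.09 * n_total))   # assumed rule: val_frac * n_total
n_train = n_total - n_val
print('train on {} samples, validate on {} samples'.format(n_train, n_val))
```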

# Tensorboard Visualization
* This cell works **only** in Linux. Do not run in Windows!
* For Windows instructions see the **Issues.pdf** file.

In [7]:
# %tensorboard --logdir {log_path}

In [8]:
log_path

'./logs/relusoftmax-glorot_uniform-plateau-0.0/eta-0.005/'

In [9]:
best_accuracies = [item['best_test_acc'] for item in all_histories]
last_accuracies = [item['last_test_acc'] for item in all_histories]
best_accuracies = np.array(best_accuracies)*100
print('best test:{}, mean test:{}, std test:{}'.format(
    best_accuracies.max(), best_accuracies.mean(), best_accuracies.std()))

last_accuracies = np.array(last_accuracies)*100
print('best final test:{}, mean final test:{}, std final test:{}'.format(
    last_accuracies.max(), last_accuracies.mean(), last_accuracies.std()))

best test:97.47, mean test:54.12074801295177, std test:43.34925198704823
best final test:97.47, mean final test:54.12074801295177, std final test:43.34925198704823


# Save the results to pkl

In [10]:
picklefile = log_path + 'results.pkl'
# the context manager closes the file even if pickling fails
with open(picklefile, 'wb') as output1:
    pickle.dump(all_histories, output1)
print('pickle file written to:{}'.format(picklefile))

pickle file written to:./logs/relusoftmax-glorot_uniform-plateau-0.0/eta-0.005/results.pkl


## Verify the written data

In [11]:
with open(log_path + 'results.pkl', 'rb') as picklefile:
    loaded_spike_count_record_all_histories = pickle.load(picklefile)
#loaded_spike_count_record_all_histories
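
The cell above only reloads the file. To actually verify the round trip, the loaded object can be compared with the one that was written. The sketch below is self-contained: it uses a temporary directory and a stand-in `histories` list. The two keys are the ones used earlier in this notebook, but the values are made-up placeholders, not results.

```python
import os
import pickle
import tempfile

# stand-in for all_histories; the values are placeholders, not real results
histories = [{'best_test_acc': 0.5, 'last_test_acc': 0.25}]

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'results.pkl')

with open(path, 'wb') as f:
    pickle.dump(histories, f)
with open(path, 'rb') as f:
    loaded = pickle.load(f)

# plain Python containers compare element by element
round_trip_ok = (loaded == histories)
print('round trip ok: {}'.format(round_trip_ok))
```

If the history dictionaries hold `NumPy` arrays, `==` on the containers is ambiguous; in that case compare the entries one by one, for example with `np.array_equal`.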

# Set up a NN classifier and train with `NumPy`
* Note that this method trains only with sigmoid hidden neurons and a `softmax` output layer (as the comments in the cell below note), and it cannot log data like the `keras` version above.
* To add more activation functions, add the corresponding functions to `numpy_fcn.py` and add `if else` conditions that choose the activation function based on user input.
* `nn_classifierclass.numpy_fcn_classifier` is a wrapper around Michael Nielsen's bare-bones `NumPy` implementation.
* Runs only on the `CPU`.
* The code in `numpy_fcn.py` was written by [Michael Nielsen](http://neuralnetworksanddeeplearning.com/chap2.html). It is slow because it was written to make the backpropagation implementation easy to understand, NOT for speed or for vectorized mini-batch computation on the CPU. Specifically, see the following lines in `numpy_fcn.py`. **WARNING:** This part runs very slowly with the `EMNIST` dataset; use only `MNIST` for it.

```
# numpy_fcn.py, lines 200-202
for mini_batch in mini_batches:
    self.update_mini_batch(
        mini_batch, eta, lmbda, len(training_data))
```
* In the lines below we loop over each sample in the mini-batch; this loop can be vectorized instead. Use the same logic as in the homework problem on a simple neural network that separates spirally distributed data points.

```
# numpy_fcn.py, lines 246-253
for x, y in mini_batch:
    delta_nabla_b, delta_nabla_w = self.backprop(x, y)
    nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
    nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
self.weights = [(1-eta*(lmbda/n))*w-(eta/len(mini_batch))*nw
                for w, nw in zip(self.weights, nabla_w)]
self.biases = [b-(eta/len(mini_batch))*nb
               for b, nb in zip(self.biases, nabla_b)]
```
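
A minimal sketch of what a vectorized version of this per-sample loop could look like is given below. It assumes sigmoid hidden layers, a softmax output layer and a cross-entropy cost, so that the output-layer error is `a - y`. The function name `backprop_batch` and the one-sample-per-column layout are assumptions made for illustration; this is not the code in `numpy_fcn.py`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    # column-wise softmax, shifted by the column maximum for numerical stability
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)


def backprop_batch(weights, biases, X, Y):
    """Gradients summed over a whole mini-batch in one pass.

    X: (n_inputs, m) array, one sample per column.
    Y: (n_classes, m) array of one-hot labels.
    """
    # forward pass for all m samples at once
    activations = [X]
    a = X
    for w, b in zip(weights[:-1], biases[:-1]):
        a = sigmoid(w.dot(a) + b)   # b has shape (n, 1) and broadcasts over columns
        activations.append(a)
    activations.append(softmax(weights[-1].dot(a) + biases[-1]))

    # backward pass
    nabla_w = [None] * len(weights)
    nabla_b = [None] * len(biases)
    delta = activations[-1] - Y     # output error, shape (n_classes, m)
    for l in range(len(weights) - 1, -1, -1):
        nabla_w[l] = delta.dot(activations[l].T)       # matrix product sums over samples
        nabla_b[l] = delta.sum(axis=1, keepdims=True)
        if l > 0:
            a_prev = activations[l]                    # sigmoid output of layer l
            delta = weights[l].T.dot(delta) * a_prev * (1.0 - a_prev)
    return nabla_b, nabla_w


# small random network and mini-batch to compare against the per-sample loop
rng = np.random.RandomState(0)
sizes = [4, 5, 3]
weights = [rng.randn(n_out, n_in) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.randn(n_out, 1) for n_out in sizes[1:]]
X = rng.randn(4, 8)
Y = np.eye(3)[:, rng.randint(0, 3, size=8)]

nabla_b, nabla_w = backprop_batch(weights, biases, X, Y)

# same sums, accumulated one sample (one column) at a time
loop_w = [np.zeros_like(w) for w in weights]
for i in range(X.shape[1]):
    _, dw = backprop_batch(weights, biases, X[:, i:i + 1], Y[:, i:i + 1])
    loop_w = [lw + d for lw, d in zip(loop_w, dw)]

print(all(np.allclose(v, s) for v, s in zip(nabla_w, loop_w)))
```

Because `nabla_b` and `nabla_w` are already summed over the mini-batch, the weight and bias update shown above can be applied to them unchanged.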

In [12]:
n_classes = 47
n_hidden = 1
network_structure = [train_x.shape[1],1500,n_classes] ##will be ignored, only sigmoid neurons
#activation_fns = ['sigmoid']*(n_hidden)+['softmax']
#activation_fns = ['tanh']*(n_hidden)+['softmax']
#activation_fns = ['relu']*(n_hidden)+['softmax']
activation_fns = ['swish']*(n_hidden)+['softmax'] #will be ignored, only sigmoid neurons and last layer is softmax
#activation_fns = ['softmax']
#sys.exit()
#weight_init = 'he_uniform'
weight_init = 'glorot_uniform'
eta_drop_type = 'plateau'
lmbda = 0.000
batch_size = 32
eta = 0.005

log_path = '/home/visionteam/tf_tutorials/logs/'+''.\
join(activation_fns+['-',weight_init,'-',eta_drop_type,str(-lmbda)])+'/eta'+str(-eta)
print(log_path)
repeats = 1
all_histories = []
for repeat in range(repeats):
    print('Repeat:{}'.format(repeat))
    backend.clear_session()
    neural_net = cls.Classifier(train_data=(train_x,train_y),
                                test_data=(test_x,test_y),
                                network_structure=network_structure,activation_fns=activation_fns,
                                epochs=1,eta=eta,lmbda=lmbda,verbose=1,plots=False,optimizer='adam',
                                eta_decay_factor=1.007,patience=8,eta_drop_type=eta_drop_type,
                                epochs_drop=1, val_frac=0.09,drop_out=0.0,ip_lyr_drop_out=0.0,
                                leaky_alpha=0.1,leaky_relu=False,weight_init=weight_init,
                                bias_init=0.1,batch_size=batch_size,log_path=log_path)
    neural_net.numpy_fcn_classifier()
    all_histories.append(neural_net.history)

/home/visionteam/tf_tutorials/logs/swishsoftmax-glorot_uniform-plateau-0.0/eta-0.005
Repeat:0


[                                                                        ] N/A%

Eta for epoch: 0 is: 0.005
Epoch 0 progress





Epoch 0 training complete
Cost on training data: -0.000325125536201707
Accuracy on training data: 0.8438095238095238 / 54600, 0.8438095238095238
Cost on evaluation data: -0.0003229550875535708
Accuracy on evaluation data: 0.8494444444444444 
saved network to:fcn_best_validation_network_rand_epochs_1_lambda_0.0
test accuracy at the end:0.8545
best validation accuracy is:0.8494444444444444
test accuracy for weights corresponding to the best validation accuracy:0.8545
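
One way to follow the earlier suggestion about supporting more activation functions in `numpy_fcn.py` is a lookup table from the user-supplied name to a function and its derivative, which plays the same role as a chain of `if else` conditions. The sketch below is only an illustration; the names `ACTIVATIONS` and `get_activation` are hypothetical and do not exist in `numpy_fcn.py`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_prime(z):
    return 1.0 - np.tanh(z) ** 2


def relu(z):
    return np.maximum(z, 0.0)


def relu_prime(z):
    # derivative is 1 for z > 0 and 0 otherwise (0 is chosen at z == 0)
    return (z > 0).astype(float)


# name -> (activation, derivative of the activation)
ACTIVATIONS = {
    'sigmoid': (sigmoid, sigmoid_prime),
    'tanh': (np.tanh, tanh_prime),
    'relu': (relu, relu_prime),
}


def get_activation(name):
    """Return (function, derivative) for a user-supplied activation name."""
    try:
        return ACTIVATIONS[name]
    except KeyError:
        raise ValueError('unknown activation: {}'.format(name))


f, f_prime = get_activation('relu')
z = np.array([-1.0, 0.0, 2.0])
print(f(z), f_prime(z))
```

Backpropagation needs the derivative wherever the sigmoid derivative is currently used, which is why each entry stores the function and its derivative together.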


# Restart the notebook to free up the `GPU` and `RAM`.

In [13]:
IPython.Application.instance().kernel.do_shutdown(True) #automatically restarts kernel

{'status': 'ok', 'restart': True}

# Exercises
* Write a method `keras_cnn_classifier` in `nn_classifierclass.py` that can train a Convolutional Neural Network.
* Write a class `Vectorized_Network` in `numpy_fcn.py` so that we have a vectorized implementation; see the `NumPy` section above for tips on where the modifications should be made.
* Plot various metrics such as `training cost vs epochs` and `training accuracy vs epochs`. These metrics can be found in the attribute `self.history` of the class in `nn_classifierclass.py`.