<h4 align="right">3rd of February 2020</h4>
<h1 align="center"> <font color="blue"> <b>
Neural Networks and Deep Learning (CIE 555)</b></font> </h1>
<h2 align="center">Lab 2 (part 2): Keras</h2> <br>

### Lab outline:
    
* Set up your environment for eager execution
* Define the main ingredients: a Keras model, an optimizer and a loss function
* Feed data to the training routine
* Write a simple training loop that does backprop on the model’s weights
* Make predictions on the test set 


### <u>**Part 1: Keras**</u>
Keras is a high-level deep learning API that lets you easily build, train, evaluate,
and execute all sorts of neural networks. Its documentation (or specification) is
available at https://keras.io.

#### Using Keras with eager execution
Eager execution is a way to train a Keras model without building a graph. Operations return concrete values rather than symbolic tensors, so you can inspect what goes into and comes out of an operation simply by printing a variable’s contents. This is an important advantage in model development and debugging.

You can use eager execution with Keras as long as you use the TensorFlow implementation. This guide gives an outline of the workflow by way of a simple example.



In [1]:
import numpy as np
import tensorflow as tf
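# TensorFlow 1.x: eager execution must be enabled explicitly at program start
# (in TensorFlow 2.x it is on by default)
tf.enable_eager_execution()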

Enabling eager execution changes how TensorFlow operations behave—now they immediately evaluate and return their values to Python. tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there isn't a computational graph to build and run later in a session, it's easy to inspect results using print() or a debugger. Evaluating, printing, and checking tensor values does not break the flow for computing gradients.

Eager execution works nicely with NumPy. NumPy operations accept tf.Tensor arguments. The TensorFlow tf.math operations convert Python objects and NumPy arrays to tf.Tensor objects. The tf.Tensor.numpy method returns the object's value as a NumPy ndarray.

In [0]:
A = tf.constant([[1, 2], [3, 4]])
B = tf.constant([[5, 6], [7, 8]])

C = tf.matmul(A, B)

In [3]:
print(C)
print(C.numpy())

tf.Tensor(
[[19 22]
 [43 50]], shape=(2, 2), dtype=int32)
[[19 22]
 [43 50]]


### <u>**Part 2: Building an MLP for the Iris dataset**</u>

In [0]:
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split

# custom Iris data loader that serves random mini-batches
class DataLoader():
    def __init__(self):
        
        # load the Iris dataset from scikit-learn
        iris = load_iris()
        
        # one-hot encode the integer class labels (0, 1, 2) as 3-element vectors
        enc = OneHotEncoder()
        
        # Note: the features are used unscaled in this lab. To standardize them
        # (zero mean, unit standard deviation), compute x' = (x - mu) / sigma with
        # mu and sigma estimated on the training set only, then apply the same
        # transformation to the test set and to any new examples before prediction.
        # fit_transform returns a sparse matrix; toarray() converts it to a dense array
        labels = enc.fit_transform(iris.target.reshape(-1,1)).toarray().astype(np.float32)
        
        # hold out 20% of the samples as a test set; train_test_split shuffles
        # the data points, and random_state fixes the shuffle for reproducibility
        xtrain, xtest, ytrain, ytest = train_test_split(iris.data, labels, test_size = 0.2, random_state = 1)
        
        # store the splits as instance attributes
        self.train_data = xtrain
        self.train_labels = np.asarray(ytrain, dtype=np.int32) 
        self.eval_data = xtest
        self.eval_labels = np.asarray(ytest, dtype=np.int32) 

        # original integer class labels, recovered from the one-hot vectors
        self.eval_labels_ori = np.asarray(enc.inverse_transform(self.eval_labels), dtype=np.int32)
        self.train_labels_ori = np.asarray(enc.inverse_transform(self.train_labels), dtype=np.int32)
    
    def get_batch(self, batch_size):
        # draw batch_size random row indices (with replacement) from the training set
        index = np.random.randint(0, np.shape(self.train_data)[0], batch_size)
        return self.train_data[index, :], self.train_labels[index]
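
The comment in the loader describes feature standardization, although the loader itself feeds the raw features to the model. As a minimal sketch of what that step could look like, the NumPy snippet below estimates the mean and standard deviation on the training split only and reuses them for the test split. The helper names `fit_standardizer` and `standardize` are hypothetical, and the toy arrays merely stand in for `xtrain` and `xtest`.

```python
import numpy as np

def fit_standardizer(x_train):
    """Return the per-feature mean and standard deviation of the training set."""
    mu = x_train.mean(axis=0)
    sigma = x_train.std(axis=0)
    # guard against constant features to avoid division by zero
    sigma = np.where(sigma == 0.0, 1.0, sigma)
    return mu, sigma

def standardize(x, mu, sigma):
    """Apply x' = (x - mu) / sigma with precomputed statistics."""
    return (x - mu) / sigma

# made-up data standing in for xtrain / xtest (4 features, like Iris)
rng = np.random.RandomState(0)
x_train = rng.normal(loc=5.0, scale=2.0, size=(120, 4))
x_test = rng.normal(loc=5.0, scale=2.0, size=(30, 4))

mu, sigma = fit_standardizer(x_train)
x_train_std = standardize(x_train, mu, sigma)
x_test_std = standardize(x_test, mu, sigma)  # reuse the training statistics

print(np.abs(x_train_std.mean(axis=0)).max())  # close to zero
print(x_train_std.std(axis=0))                 # close to one per feature
```

The test split is deliberately transformed with the training statistics, so its mean and standard deviation will be near, but not exactly, zero and one.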

In [0]:
# Define Multi Layer Perceptron Class
class MLP(tf.keras.Model):
    
    def __init__(self):
        super().__init__()
        # hidden layer: maps the 4 input features to 5 units with ReLU activation
        self.dense1 = tf.keras.layers.Dense(units=5, activation=tf.nn.relu)
        # output layer: maps the 5 hidden units to 3 class probabilities via softmax
        self.dense2 = tf.keras.layers.Dense(units=3, activation=tf.nn.softmax)
        
    # forward pass     
    def call(self, inputs):
        x = self.dense1(inputs)
        x = self.dense2(x)
        return x
    
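    # predict the class index for each sample
    # (note: this overrides the built-in tf.keras.Model.predict)
    def predict(self, inputs):
        # the model outputs softmax probabilities; argmax picks the most likely class
        probs = self(inputs)
        return tf.argmax(probs, axis=-1)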
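
To make the shapes in the forward pass concrete, here is a rough NumPy sketch of the same 4 → 5 → 3 architecture. The weights are random placeholders (not the trained Keras weights), and the names `W1`, `b1`, `W2`, `b2` and `forward` are illustrative assumptions rather than part of the lab code.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # subtract the row-wise max for numerical stability
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.RandomState(0)
# hypothetical randomly initialised parameters: 4 features -> 5 hidden -> 3 classes
W1, b1 = rng.normal(scale=0.5, size=(4, 5)), np.zeros(5)
W2, b2 = rng.normal(scale=0.5, size=(5, 3)), np.zeros(3)

def forward(x):
    h = relu(x @ W1 + b1)          # hidden layer, shape (batch, 5)
    return softmax(h @ W2 + b2)    # class probabilities, shape (batch, 3)

x = rng.uniform(0.0, 8.0, size=(2, 4))  # a batch of two made-up samples
probs = forward(x)
pred = probs.argmax(axis=-1)            # predicted class index per sample

print(probs.shape)         # (2, 3)
print(probs.sum(axis=-1))  # each row sums to 1
print(pred)
```

Each `Dense` layer is a matrix multiplication plus a bias followed by an activation, and the final `argmax` mirrors what the `predict` method does.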

In [0]:
# Seed the pseudorandom number generators. Setting the same seed values
# before generating random data reproduces the same data and results,
# which makes the experiment reproducible.
tf.set_random_seed(2)
np.random.seed(2)

In [0]:
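# Set the hyperparameters
num_batches = 1000     # number of training steps (one mini-batch per step)
batch_size = 50        # number of samples per mini-batch
learning_rate = 0.001  # step size for the Adam optimizer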

In [0]:
# create the MLP model
model = MLP()

# load the data
data_loader = DataLoader()

# select the optimizer and assign its learning rate
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)

In [9]:
# define your training loop
for batch_index in range(num_batches):

    # get batch from the loader
    X, y = data_loader.get_batch(batch_size)
    
    # tf.GradientTape provides automatic differentiation: it computes the gradient
    # of a computation with respect to its input variables. TensorFlow "records"
    # every operation executed inside the context of a tf.GradientTape onto a "tape".
    with tf.GradientTape() as tape:
        
        # (1) forward pass with the data points from the batch we loaded before
        y_logit_pred = model(tf.convert_to_tensor(X))
        
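        # (2) calculate the loss
        #     The cross-entropy between the one-hot labels and the predicted
        #     categorical distribution is called the "cross-entropy loss".
        #     Note: tf.losses.softmax_cross_entropy expects unnormalized logits and
        #     applies softmax internally. Because dense2 already applies softmax,
        #     softmax is effectively applied twice here; removing the activation
        #     on dense2 would avoid that.
        loss = tf.losses.softmax_cross_entropy(y, y_logit_pred)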
        
        # we print the batch num and loss value
        print("batch %d: loss %f" % (batch_index, loss.numpy()))
    # (3) compute the gradients of the loss with respect to the model's variables.
    #     This call sits outside the tape context, so the gradient computation
    #     itself is not recorded, which saves memory.
    grads = tape.gradient(loss, model.variables)

    # (4) Ask the optimizer to apply the processed gradients.
    optimizer.apply_gradients(grads_and_vars=zip(grads, model.variables))

    

batch 0: loss 1.262261
batch 1: loss 1.186012
batch 2: loss 1.164775
batch 3: loss 1.242380
batch 4: loss 1.220835
batch 5: loss 1.243268
batch 6: loss 1.240545
batch 7: loss 1.124558
batch 8: loss 1.336698
batch 9: loss 1.201456
batch 10: loss 1.258068
batch 11: loss 1.179017
batch 12: loss 1.200622
batch 13: loss 1.159011
batch 14: loss 1.255297
batch 15: loss 1.217777
batch 16: loss 1.233192
batch 17: loss 1.269191
batch 18: loss 1.233290
batch 19: loss 1.245303
batch 20: loss 1.268214
batch 21: loss 1.321350
batch 22: loss 1.261555
batch 23: loss 1.224177
batch 24: loss 1.208028
batch 25: loss 1.117901
batch 26: loss 1.072644
batch 27: loss 1.302055
batch 28: loss 1.251744
batch 29: loss 1.228238
batch 30: loss 1.273488
batch 31: loss 1.288287
batch 32: loss 1.210652
batch 33: loss 1.174688
batch 34: loss 1.238774
batch 35: loss 1.197765
batch 36: loss 1.142623
batch 37: loss 1.163979
batch 38: loss 1.217734
batch 39: loss 1.102479
batch 40: loss 1.159961
batch 41: loss 1.192984
ba
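
The losses above hover just above ln 3 ≈ 1.10, the value for a uniform guess over three classes. One thing worth checking when reading them is how the loss interacts with the model's output layer: `tf.losses.softmax_cross_entropy` applies softmax to its second argument, while the model's last layer already outputs softmax probabilities. The NumPy sketch below (an illustration under that assumption, not the TensorFlow implementation) shows that when probabilities are passed in place of logits for a three-class problem, the loss cannot drop below -ln(e / (e + 2)) ≈ 0.55, even for a perfectly confident, correct prediction.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_from_logits(onehot, logits):
    """Mean cross-entropy; softmax is applied to the logits internally."""
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return float(-(onehot * log_probs).sum(axis=-1).mean())

onehot = np.array([[1.0, 0.0, 0.0]])
logits = np.array([[10.0, -5.0, -5.0]])  # a very confident, correct prediction

# logits passed directly: the loss is close to zero
loss_logits = cross_entropy_from_logits(onehot, logits)

# probabilities passed as if they were logits: softmax is applied twice
loss_double = cross_entropy_from_logits(onehot, softmax(logits))

# lower bound on the double-softmax loss with 3 classes
floor = -np.log(np.e / (np.e + 2.0))

print(loss_logits)
print(loss_double)
print(floor)
```

If the output layer returned raw scores instead (no activation on the last `Dense` layer), the same loss function could approach zero as the predictions improve, and `argmax` would still select the same class.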

In [10]:
# predict on the training samples and compute the accuracy
num_train_samples = np.shape(data_loader.train_labels)[0]
y_pred = model.predict(data_loader.train_data).numpy()
print("train accuracy: %f" % (sum(y_pred == data_loader.train_labels_ori.squeeze()) / num_train_samples))

# predict on the test samples and compute the accuracy
num_eval_samples = np.shape(data_loader.eval_labels)[0]
y_pred = model.predict(data_loader.eval_data).numpy()
print("test accuracy: %f" % (sum(y_pred == data_loader.eval_labels_ori.squeeze()) / num_eval_samples))

train accuracy: 0.983333
test accuracy: 0.933333
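
The accuracy computation above relies on `squeeze()` because the labels recovered by `inverse_transform` form a column vector of shape `(N, 1)`, while the predictions have shape `(N,)`. A small sketch of an equivalent helper, assuming a hypothetical function name `accuracy` and made-up labels, is:

```python
import numpy as np

def accuracy(y_pred, y_true):
    """Fraction of predictions that match the true labels."""
    # flatten both arrays so that (N, 1) and (N,) inputs compare element-wise
    y_pred = np.asarray(y_pred).ravel()
    y_true = np.asarray(y_true).ravel()
    return float(np.mean(y_pred == y_true))

y_true = np.array([[0], [1], [2], [1], [0]])  # column vector, like the *_labels_ori arrays
y_pred = np.array([0, 1, 1, 1, 0])            # flat vector, like the model predictions

print(accuracy(y_pred, y_true))  # 0.8
```

Without the flattening step, comparing an `(N, 1)` array with an `(N,)` array broadcasts to an `(N, N)` matrix, which would give a misleading count.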
