# Lab 1: Multi-Layer Perceptrons

In this lab, we aim to classify images of human faces as male or female. We have at our disposal a dataset of 200 images of celebrity faces and their associated labels (0 for female and 1 for male).

We use a face detector from the dlib library to estimate the location of 68 (x, y) coordinates that map to specific facial regions. The image below visualizes what each of these coordinates maps to:


![](face_feature_extraction.png)

We then use TensorFlow to define and train a multi-layer perceptron (MLP) graph that classifies images using the features visualized above. Using the code below, try to apply the following changes:
    
1 - Change the complexity of the 2-layer MLP by increasing or decreasing the number of neurons in each layer.

2 - Try increasing the number of layers used. This should increase the "depth" of the MLP. To do so, you must change the definition of the MLP function in "multilayer_perceptron" and the weights/biases allocation function "allocate_weights_and_biases".

3 - Try changing how the weights and biases are initialized. What would happen if you initialized all parameters to zero?

4 - Try increasing or decreasing the learning rate and the number of training epochs. How does this affect the "fitting" to the training data?
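
As a hint for change 3, the sketch below computes the first-layer gradient of a simplified MLP in NumPy. It is an illustration under stated assumptions, not the lab's TensorFlow graph: it uses a single hidden layer, no biases, random toy inputs, and a hypothetical helper `hidden_weight_gradient`. With all-zero weights no error signal reaches the first layer, while a small random initialization gives it a non-zero gradient.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_weight_gradient(W1, W2, x, y):
    """Gradient of the mean softmax cross entropy w.r.t. W1 (hypothetical helper)."""
    h = sigmoid(x @ W1)                                   # hidden activations
    logits = h @ W2                                       # output logits
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)                  # softmax probabilities
    d_logits = (p - y) / len(x)                           # dLoss / dlogits
    d_h = d_logits @ W2.T                                 # backpropagate into the hidden layer
    d_z = d_h * h * (1.0 - h)                             # sigmoid derivative
    return x.T @ d_z                                      # dLoss / dW1

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))                               # toy inputs, not the face features
y = np.eye(2)[[0, 1, 1, 0, 1]]                            # toy one-hot labels

grad_zero = hidden_weight_gradient(np.zeros((4, 3)), np.zeros((3, 2)), x, y)
grad_rand = hidden_weight_gradient(rng.normal(scale=0.01, size=(4, 3)),
                                   rng.normal(scale=0.01, size=(3, 2)), x, y)

print("zero init gradient is all zeros:  ", np.allclose(grad_zero, 0.0))
print("random init gradient is all zeros:", np.allclose(grad_rand, 0.0))
```

In this simplified setting, hidden units that start out identical also receive identical updates, so they have no way of learning different features.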


### Import APIs to be used 

In [1]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import lab3_data as import_data
import numpy as np

Using TensorFlow backend.


### Load CelebA data and create train and test splits (Train: 100 examples, Test: 100 examples)

In [2]:
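def get_data():
    # X: facial landmark features, y: labels (0 for female, 1 for male)
    X, y = import_data.extract_features_labels()
    # one-hot encode the labels: column 0 is y, column 1 is 1 - y
    Y = np.array([y, -(y - 1)]).T
    # first 100 examples for training, the remainder for testing
    tr_X, tr_Y = X[:100], Y[:100]
    te_X, te_Y = X[100:], Y[100:]

    return tr_X, tr_Y, te_X, te_Y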
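
The expression `np.array([y, -(y - 1)]).T` in `get_data` turns the 0/1 labels into two-column one-hot vectors. The sketch below applies it to a small made-up label array. The values are hypothetical, and it assumes `y` behaves like a NumPy array of 0s and 1s, which the return type of `extract_features_labels` is not confirmed to be.

```python
import numpy as np

# hypothetical labels: 0 for female, 1 for male
y = np.array([0, 1, 1, 0])

# same expression as in get_data(): column 0 is y, column 1 is 1 - y
Y = np.array([y, -(y - 1)]).T

print(Y)
print(Y.shape)
```

Each row contains exactly one 1, which matches the two-class `Y` placeholder used by the MLP.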

### Allocate memory for weights and biases for all MLP layers
You can try changing the number of neurons to increase or decrease the complexity of the MLP.

In [3]:
def allocate_weights_and_biases():

    # define the number of neurons in each hidden layer
    n_hidden_1 = 2048  # 1st layer number of neurons
    n_hidden_2 = 2048  # 2nd layer number of neurons

    # inputs placeholders
    X = tf.placeholder("float", [None, 68, 2])
    Y = tf.placeholder("float", [None, 2])  # 2 output classes
    
    # flatten image features into one vector (i.e. reshape image feature matrix into a vector)
    images_flat = tf.keras.layers.Flatten()(X)  
    
    # weights and biases are initialized from a normal distribution with a specified standard deviation stddev
    stddev = 0.01
    
    # define trainable variables for the weights and biases in the graph
    weights = {
        'hidden_layer1': tf.Variable(tf.random_normal([68 * 2, n_hidden_1], stddev=stddev)),
        'hidden_layer2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], stddev=stddev)),
        'out': tf.Variable(tf.random_normal([n_hidden_2, 2], stddev=stddev))
    }

    biases = {
        'bias_layer1': tf.Variable(tf.random_normal([n_hidden_1], stddev=stddev)),
        'bias_layer2': tf.Variable(tf.random_normal([n_hidden_2], stddev=stddev)),
        'out': tf.Variable(tf.random_normal([2], stddev=stddev))
    }
    
    return weights, biases, X, Y, images_flat
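
To get a feel for how the number of neurons affects model complexity, the sketch below counts the weights and biases of a fully connected network from its layer widths. `count_parameters` is a hypothetical helper, and the smaller 256-neuron configuration is only an example choice, not a recommendation from the lab.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network (hypothetical helper)."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out   # weight matrix plus bias vector
    return total

# widths used in allocate_weights_and_biases: 136 inputs, 2048, 2048, 2 outputs
default_sizes = [68 * 2, 2048, 2048, 2]
# an arbitrary smaller configuration for comparison
smaller_sizes = [68 * 2, 256, 256, 2]

print(count_parameters(default_sizes))   # 4481026
print(count_parameters(smaller_sizes))   # 101378
```

With the default widths the network has roughly 4.5 million parameters, compared with only 100 training examples.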
    

### Define how the weights and biases are used for inferring classes from inputs (i.e. define MLP function)

You can add more layers to the MLP to fit more complicated functions. Adding more layers requires more learnable weights and biases, which need to be defined in "allocate_weights_and_biases" first.

In [4]:
# Create model
def multilayer_perceptron():
        
    weights, biases, X, Y, images_flat = allocate_weights_and_biases()

    # Hidden fully connected layer 1
    layer_1 = tf.add(tf.matmul(images_flat, weights['hidden_layer1']), biases['bias_layer1'])
    layer_1 = tf.math.sigmoid(layer_1)

    # Hidden fully connected layer 2
    layer_2 = tf.add(tf.matmul(layer_1, weights['hidden_layer2']), biases['bias_layer2'])
    layer_2 = tf.math.sigmoid(layer_2)
    
    # Output fully connected layer
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']

    return out_layer, X, Y
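
The structure of `multilayer_perceptron` generalizes to any depth: every hidden layer is a matrix multiplication, a bias addition and a sigmoid, and the output layer returns raw logits. The NumPy sketch below shows this pattern for three hidden layers. It is not the TensorFlow graph used in this lab, and the helper names `init_layers` and `mlp_forward` as well as the layer widths are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layers(layer_sizes, stddev=0.01, seed=0):
    """Create one (weights, biases) pair per layer, drawn from a normal distribution."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=stddev, size=(n_in, n_out)),
             rng.normal(scale=stddev, size=n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def mlp_forward(x, layers):
    """Forward pass: sigmoid hidden layers followed by a linear output layer."""
    h = x.reshape(len(x), -1)            # flatten (n, 68, 2) into (n, 136)
    for W, b in layers[:-1]:
        h = sigmoid(h @ W + b)           # hidden fully connected layer
    W_out, b_out = layers[-1]
    return h @ W_out + b_out             # logits, no activation

# three hidden layers of 64 neurons each (arbitrary widths)
layers = init_layers([68 * 2, 64, 64, 64, 2])
x = np.zeros((5, 68, 2))                 # placeholder batch of 5 feature arrays
logits = mlp_forward(x, layers)
print(logits.shape)
```

In the TensorFlow version, the same change means adding one more entry to the `weights` and `biases` dictionaries and one more `tf.add(tf.matmul(...))` / `tf.math.sigmoid` pair.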



### Define graph training operation
The loss function (i.e. the value to minimize) is defined as the cross entropy between the predicted classes and the class ground truth. The train operation is then included within the graph as a weight/bias update operation.
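
As a rough illustration of this loss, the NumPy sketch below computes the mean softmax cross entropy for two hand-picked sets of logits. `softmax_cross_entropy` is a hypothetical helper written for this example. It mirrors the idea behind `tf.nn.softmax_cross_entropy_with_logits` followed by `tf.reduce_mean`, but it is not TensorFlow's implementation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

labels = np.array([[1.0, 0.0], [0.0, 1.0]])

# equal logits give probability 0.5 to each class
uninformative = np.zeros((2, 2))
# large logits on the correct class give probabilities close to 1
confident = np.array([[5.0, -5.0], [-5.0, 5.0]])

print(softmax_cross_entropy(uninformative, labels))   # ln(2), about 0.693
print(softmax_cross_entropy(confident, labels))       # close to 0
```

A two-class model that assigns 0.5 to both classes has a loss of ln(2), about 0.693, which is close to the cost values printed during the first training epochs.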

Try changing the learning rate. How would setting a low or high learning rate affect the "fitting" to the training set?
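
To build intuition, the toy sketch below runs plain gradient descent on the one-dimensional function f(w) = w**2 with three learning rates. This is a deliberately simplified assumption: the lab uses the Adam optimizer on a much larger loss surface, so the numbers do not transfer directly, and `gradient_descent` is a hypothetical helper.

```python
def gradient_descent(learning_rate, steps=50, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent and return the final |w|."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2.0 * w    # the gradient of w**2 is 2w
    return abs(w)

too_low = gradient_descent(0.001)   # barely moves away from the start
moderate = gradient_descent(0.1)    # approaches the minimum at w = 0
too_high = gradient_descent(1.1)    # overshoots further on every step

print(too_low, moderate, too_high)
```

A very low rate leaves the parameter almost unchanged after 50 steps, a moderate rate converges, and a rate that is too high moves away from the minimum.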

In [5]:
# learning parameters
learning_rate = 0.00001
training_epochs = 500

# display training accuracy every ..
display_accuracy_step = 2

    
training_images, training_labels, test_images, test_labels = get_data()
logits, X, Y = multilayer_perceptron()

# define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)

# define training graph operation
train_op = optimizer.minimize(loss_op)

# graph operation to initialize all variables
init = tf.global_variables_initializer()

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.



### Run graph for specified number of epochs.

After the graph is defined, different operations in the graph can be run by specifying them in the sess.run() function.
A session is a wrapper for running graphs. Outputs can also be acquired from the graph by including them in the list of operations passed to sess.run().

In [6]:
with tf.Session() as sess:

    # define the evaluation ops once, before the training loop, so that
    # no new nodes are added to the graph at every epoch
    pred = tf.nn.softmax(logits)  # apply softmax to output logits
    # the inferred class is the one with the highest predicted probability
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(Y, 1))
    # accuracy is the fraction of correctly classified examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    # run graph weights/biases initialization op
    sess.run(init)
    # begin training loop ..
    for epoch in range(training_epochs):
        # run optimization operation (backprop) and cost operation (to get loss value)
        _, cost = sess.run([train_op, loss_op], feed_dict={X: training_images,
                                                           Y: training_labels})

        # display the loss at every epoch
        print("Epoch:", '%04d' % (epoch + 1), "cost={:.9f}".format(cost))

        # display training accuracy every display_accuracy_step epochs
        if epoch % display_accuracy_step == 0:
            print("Accuracy: {:.3f}".format(accuracy.eval({X: training_images, Y: training_labels})))

    print("Optimization Finished!")

    # run test accuracy operation ..
    print("Test Accuracy:", accuracy.eval({X: test_images, Y: test_labels}))



Epoch: 0001 cost=0.754594326
Accuracy: 0.470
Epoch: 0002 cost=0.739417732
Epoch: 0003 cost=0.726330817
Accuracy: 0.470
Epoch: 0004 cost=0.715396881
Epoch: 0005 cost=0.706636131
Accuracy: 0.470
Epoch: 0006 cost=0.700015306
Epoch: 0007 cost=0.695433736
Accuracy: 0.530
Epoch: 0008 cost=0.692710161
Epoch: 0009 cost=0.691573620
Accuracy: 0.530
Epoch: 0010 cost=0.691665947
Epoch: 0011 cost=0.692568362
Accuracy: 0.530
Epoch: 0012 cost=0.693852067
Epoch: 0013 cost=0.695138216
Accuracy: 0.530
Epoch: 0014 cost=0.696145177
Epoch: 0015 cost=0.696707308
Accuracy: 0.530
Epoch: 0016 cost=0.696768880
Epoch: 0017 cost=0.696360409
Accuracy: 0.530
Epoch: 0018 cost=0.695570290
Epoch: 0019 cost=0.694518209
Accuracy: 0.530
Epoch: 0020 cost=0.693332195
Epoch: 0021 cost=0.692131937
Accuracy: 0.530
Epoch: 0022 cost=0.691016853
Epoch: 0023 cost=0.690058470
Accuracy: 0.530
Epoch: 0024 cost=0.689297736
Epoch: 0025 cost=0.688744903
Accuracy: 0.530
Epoch: 0026 cost=0.688383341
Epoch: 0027 cost=0.688175678
Accuracy:

Accuracy: 0.690
Epoch: 0226 cost=0.559713364
Epoch: 0227 cost=0.558966160
Accuracy: 0.690
Epoch: 0228 cost=0.558206201
Epoch: 0229 cost=0.557438135
Accuracy: 0.680
Epoch: 0230 cost=0.556679249
Epoch: 0231 cost=0.555933833
Accuracy: 0.680
Epoch: 0232 cost=0.555192053
Epoch: 0233 cost=0.554444253
Accuracy: 0.680
Epoch: 0234 cost=0.553691268
Epoch: 0235 cost=0.552938938
Accuracy: 0.690
Epoch: 0236 cost=0.552190006
Epoch: 0237 cost=0.551442862
Accuracy: 0.690
Epoch: 0238 cost=0.550695062
Epoch: 0239 cost=0.549945354
Accuracy: 0.690
Epoch: 0240 cost=0.549192667
Epoch: 0241 cost=0.548439205
Accuracy: 0.690
Epoch: 0242 cost=0.547687173
Epoch: 0243 cost=0.546939909
Accuracy: 0.690
Epoch: 0244 cost=0.546196222
Epoch: 0245 cost=0.545453012
Accuracy: 0.690
Epoch: 0246 cost=0.544707894
Epoch: 0247 cost=0.543961048
Accuracy: 0.690
Epoch: 0248 cost=0.543213844
Epoch: 0249 cost=0.542466640
Accuracy: 0.690
Epoch: 0250 cost=0.541719139
Epoch: 0251 cost=0.540971935
Accuracy: 0.690
Epoch: 0252 cost=0.540

Accuracy: 0.810
Epoch: 0450 cost=0.425067544
Epoch: 0451 cost=0.424528569
Accuracy: 0.800
Epoch: 0452 cost=0.424001783
Epoch: 0453 cost=0.423487544
Accuracy: 0.800
Epoch: 0454 cost=0.422982663
Epoch: 0455 cost=0.422479272
Accuracy: 0.800
Epoch: 0456 cost=0.421972930
Epoch: 0457 cost=0.421461016
Accuracy: 0.800
Epoch: 0458 cost=0.420947343
Epoch: 0459 cost=0.420435667
Accuracy: 0.810
Epoch: 0460 cost=0.419928581
Epoch: 0461 cost=0.419428051
Accuracy: 0.820
Epoch: 0462 cost=0.418938071
Epoch: 0463 cost=0.418457597
Accuracy: 0.820
Epoch: 0464 cost=0.417993695
Epoch: 0465 cost=0.417538911
Accuracy: 0.810
Epoch: 0466 cost=0.417084157
Epoch: 0467 cost=0.416597515
Accuracy: 0.810
Epoch: 0468 cost=0.416058123
Epoch: 0469 cost=0.415453732
Accuracy: 0.820
Epoch: 0470 cost=0.414821774
Epoch: 0471 cost=0.414212614
Accuracy: 0.810
Epoch: 0472 cost=0.413663298
Epoch: 0473 cost=0.413174540
Accuracy: 0.810
Epoch: 0474 cost=0.412725359
Epoch: 0475 cost=0.412291974
Accuracy: 0.810
Epoch: 0476 cost=0.411