1.
Let's load the best model's graph and get a handle on all the important operations we will need. Note that instead of creating a new softmax output layer, we will just reuse the existing one, since it already has the number of outputs we need (5 classes). We will reinitialize its parameters before training.

In [1]:
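import numpy as np
import tensorflow as tf

def reset_graph(seed=42):
    # Start from an empty default graph and make runs reproducible.
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
print('done')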


done


In [2]:
import tensorflow as tf
tf.reset_default_graph()

restore_saver = tf.train.import_meta_graph("./model/Team11_HW2.ckpt.meta")

X = tf.get_default_graph().get_tensor_by_name("X:0")
y = tf.get_default_graph().get_tensor_by_name("y:0")
loss = tf.get_default_graph().get_tensor_by_name("loss:0")
Y_proba = tf.get_default_graph().get_tensor_by_name("Y_proba:0")
logits = Y_proba.op.inputs[0]
accuracy = tf.get_default_graph().get_tensor_by_name("accuracy:0")
print('done')

done


To freeze the lower layers, we will exclude their variables from the optimizer's list of trainable variables, keeping only the output layer's trainable variables:

In [3]:
learning_rate = 0.01

output_layer_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="logits")
optimizer = tf.train.AdamOptimizer(learning_rate, name="Adam2")
training_op = optimizer.minimize(loss, var_list=output_layer_vars)
print('done')

done
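
To make the effect of `var_list` concrete, here is a minimal plain-Python sketch of a gradient step that only touches selected parameters. It is not TensorFlow's implementation: the helper `sgd_step`, the parameter names and the gradient values are all hypothetical, and plain gradient descent stands in for Adam.

```python
# Conceptual sketch only: gradient descent over a dict of scalar "variables".
# All names and values here are made up for illustration.
def sgd_step(params, grads, var_list, learning_rate=0.01):
    """Return a copy of params in which only the names in var_list are updated."""
    updated = dict(params)
    for name in var_list:
        updated[name] = params[name] - learning_rate * grads[name]
    return updated

params = {"hidden1/kernel": 0.5, "hidden5/kernel": -0.3, "logits/kernel": 0.8}
grads = {"hidden1/kernel": 1.0, "hidden5/kernel": 2.0, "logits/kernel": 4.0}

# Keep only the output layer's variables, mirroring scope="logits" above.
output_layer_names = [name for name in params if name.startswith("logits")]
new_params = sgd_step(params, grads, output_layer_names)

# Every parameter outside var_list keeps its original value, i.e. it is frozen.
frozen_unchanged = all(
    new_params[name] == params[name]
    for name in params if name not in output_layer_names
)
print(new_params, frozen_unchanged)
```

The frozen parameters still take part in the forward pass; they are simply never updated.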


In [4]:
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32), name="accuracy")

init = tf.global_variables_initializer()
five_frozen_saver = tf.train.Saver()
print('done')

done


2.
Let's create the training, validation and test sets. We need to subtract 5 from the labels because TensorFlow expects integers from 0 to n_classes-1.

In [5]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/")
X_train2_full = mnist.train.images[mnist.train.labels >= 5]
y_train2_full = mnist.train.labels[mnist.train.labels >= 5] - 5
X_valid2_full = mnist.validation.images[mnist.validation.labels >= 5]
y_valid2_full = mnist.validation.labels[mnist.validation.labels >= 5] - 5
X_test2 = mnist.test.images[mnist.test.labels >= 5]
y_test2 = mnist.test.labels[mnist.test.labels >= 5] - 5
print('done')

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
done
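
The boolean-mask filtering and label shift used above can be checked on a tiny array. The arrays in this sketch are made up and merely stand in for the MNIST images and labels.

```python
import numpy as np

# Toy labels standing in for mnist.train.labels (made-up values).
labels = np.array([0, 5, 9, 3, 7, 6, 8, 1])
# Fake 2-pixel "images", one row per label.
images = np.arange(len(labels) * 2).reshape(len(labels), 2)

mask = labels >= 5            # boolean mask selecting digits 5 to 9
X_subset = images[mask]
y_subset = labels[mask] - 5   # shift 5..9 down to 0..4

print(y_subset)  # prints [0 4 2 1 3]
```

After the shift, every label lies in the range 0 to n_classes-1, which is what the loss expects.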


Also, for the purpose of this exercise, we want to keep only 100 instances per class in the training set (and let's keep only 30 instances per class in the validation set). Let's create a small function to do that:

In [6]:
def sample_n_instances_per_class(X, y, n=100):
    Xs, ys = [], []
    for label in np.unique(y):
        idx = (y == label)
        Xc = X[idx][:n]
        yc = y[idx][:n]
        Xs.append(Xc)
        ys.append(yc)
    return np.concatenate(Xs), np.concatenate(ys)
print('done')

done
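
As a quick sanity check, the function can be run on a small synthetic dataset. The toy arrays below are assumptions for illustration only; the function body is repeated so the sketch runs on its own.

```python
import numpy as np

def sample_n_instances_per_class(X, y, n=100):
    # Same logic as the function above, repeated so this sketch is self-contained.
    Xs, ys = [], []
    for label in np.unique(y):
        idx = (y == label)
        Xs.append(X[idx][:n])
        ys.append(y[idx][:n])
    return np.concatenate(Xs), np.concatenate(ys)

# Made-up data: 12 instances, 3 classes with 4 instances each.
y_toy = np.array([0, 1, 2] * 4)
X_toy = np.arange(12).reshape(12, 1)

X_small, y_small = sample_n_instances_per_class(X_toy, y_toy, n=2)
counts = np.bincount(y_small)
print(y_small, counts)
```

Note that the result is grouped by class rather than shuffled. This does not matter for the training code in this notebook, which draws a new random permutation of the indices at every epoch.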


In [7]:
import numpy as np
X_train2, y_train2 = sample_n_instances_per_class(X_train2_full, y_train2_full, n=100)
X_valid2, y_valid2 = sample_n_instances_per_class(X_valid2_full, y_valid2_full, n=30)
print('done')

done


Now let's train the model. This is the same training code as earlier, using early stopping, except for the initialization: we first initialize all the variables, then we restore the best model trained earlier (on digits 0 to 4), and finally we reinitialize the output layer variables.

In [8]:
import time

n_epochs = 1000
batch_size = 20

max_checks_without_progress = 20
checks_without_progress = 0
best_loss = np.inf
with tf.Session() as sess:
    init.run()
    restore_saver.restore(sess, "./model/Team11_HW2.ckpt")
    
    for var in output_layer_vars:
        var.initializer.run()

    t0 = time.time()
        
    for epoch in range(n_epochs):
        rnd_idx = np.random.permutation(len(X_train2))
        for rnd_indices in np.array_split(rnd_idx, len(X_train2) // batch_size):
            X_batch, y_batch = X_train2[rnd_indices], y_train2[rnd_indices]
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val, acc_val = sess.run([loss, accuracy], feed_dict={X: X_valid2, y: y_valid2})
        if loss_val < best_loss:
            save_path = five_frozen_saver.save(sess, "./my_mnist_model_5_to_9_five_frozen")
            best_loss = loss_val
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress > max_checks_without_progress:
                print("Early stopping!")
                break
        print("{}\tValidation loss: {:.6f}\tBest loss: {:.6f}\tAccuracy: {:.2f}%".format(
            epoch, loss_val, best_loss, acc_val * 100))

    t1 = time.time()
    print("Total training time: {:.1f}s".format(t1 - t0))

with tf.Session() as sess:
    five_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_five_frozen")
    acc_test = accuracy.eval(feed_dict={X: X_test2, y: y_test2})
    print("Final test accuracy: {:.2f}%".format(acc_test * 100))

INFO:tensorflow:Restoring parameters from ./model/Team11_HW2.ckpt
0	Validation loss: 13.550391	Best loss: 13.550391	Accuracy: 8.67%
1	Validation loss: 9.117629	Best loss: 9.117629	Accuracy: 10.00%
2	Validation loss: 4.515644	Best loss: 4.515644	Accuracy: 13.33%
3	Validation loss: 2.239404	Best loss: 2.239404	Accuracy: 26.00%
4	Validation loss: 1.521001	Best loss: 1.521001	Accuracy: 43.33%
5	Validation loss: 1.395050	Best loss: 1.395050	Accuracy: 45.33%
6	Validation loss: 1.329043	Best loss: 1.329043	Accuracy: 51.33%
7	Validation loss: 1.220435	Best loss: 1.220435	Accuracy: 50.67%
8	Validation loss: 1.393382	Best loss: 1.220435	Accuracy: 46.00%
9	Validation loss: 1.328711	Best loss: 1.220435	Accuracy: 49.33%
10	Validation loss: 1.285341	Best loss: 1.220435	Accuracy: 50.67%
11	Validation loss: 1.347284	Best loss: 1.220435	Accuracy: 46.00%
12	Validation loss: 1.315521	Best loss: 1.220435	Accuracy: 44.00%
13	Validation loss: 1.296335	Best loss: 1.220435	Accuracy: 42.00%
14	Validation loss:
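
The early-stopping logic of the training loop can be isolated from TensorFlow. The sketch below replays it over a made-up list of validation losses, with a smaller patience than the notebook's 20 so that the stop is easy to follow; `best_epoch` and `stopped_at` are extra bookkeeping variables added for illustration.

```python
# Made-up validation losses, one per epoch, for illustration only.
val_losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]

max_checks_without_progress = 2   # the notebook uses 20
checks_without_progress = 0
best_loss = float("inf")
best_epoch = None
stopped_at = None

for epoch, loss_val in enumerate(val_losses):
    if loss_val < best_loss:
        best_loss = loss_val          # the notebook saves a checkpoint here
        best_epoch = epoch
        checks_without_progress = 0
    else:
        checks_without_progress += 1
        if checks_without_progress > max_checks_without_progress:
            stopped_at = epoch
            break

print(best_epoch, best_loss, stopped_at)  # prints 1 0.8 4
```

Because the comparison is a strict `>`, training stops after `max_checks_without_progress + 1` consecutive epochs without improvement. In this toy run the lower loss at epoch 5 is never reached, which is the usual trade-off of early stopping.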

3.
Let's start by getting a handle on the output of the last frozen layer:

In [9]:
hidden5_out = tf.get_default_graph().get_tensor_by_name("hidden5_out:0")
print('done')

done


Now let's train the model using roughly the same code as earlier. The difference is that we compute the output of the top frozen layer at the beginning (both for the training set and the validation set), and we cache it. This makes training roughly 1.5 to 3 times faster in this example (this may vary greatly, depending on your system):

In [10]:
import time

n_epochs = 1000
batch_size = 20

max_checks_without_progress = 20
checks_without_progress = 0
best_loss = np.inf

with tf.Session() as sess:
    init.run()
    restore_saver.restore(sess, "./model/Team11_HW2.ckpt")
    for var in output_layer_vars:
        var.initializer.run()

    t0 = time.time()
    # X_train2 has shape (500, 784): 100 instances for each of the 5 classes
    hidden5_train = hidden5_out.eval(feed_dict={X: X_train2, y: y_train2})
    hidden5_valid = hidden5_out.eval(feed_dict={X: X_valid2, y: y_valid2})
    print(hidden5_train.shape)  # prints (500, 5) in the run shown below
    for epoch in range(n_epochs):
        rnd_idx = np.random.permutation(len(X_train2))  # shuffled indices 0..499
        for rnd_indices in np.array_split(rnd_idx, len(X_train2) // batch_size):
            h5_batch, y_batch = hidden5_train[rnd_indices], y_train2[rnd_indices]
            sess.run(training_op, feed_dict={hidden5_out: h5_batch, y: y_batch})
        loss_val, acc_val = sess.run([loss, accuracy], feed_dict={hidden5_out: hidden5_valid, y: y_valid2})
        if loss_val < best_loss:
            save_path = five_frozen_saver.save(sess, "./my_mnist_model_5_to_9_five_frozen")
            best_loss = loss_val
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress > max_checks_without_progress:
                print("Early stopping!")
                break
        print("{}\tValidation loss: {:.6f}\tBest loss: {:.6f}\tAccuracy: {:.2f}%".format(
            epoch, loss_val, best_loss, acc_val * 100))

    t1 = time.time()
    print("Total training time: {:.1f}s".format(t1 - t0))

with tf.Session() as sess:
    five_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_five_frozen")
    acc_test = accuracy.eval(feed_dict={X: X_test2, y: y_test2})
    print("Final test accuracy: {:.2f}%".format(acc_test * 100))

INFO:tensorflow:Restoring parameters from ./model/Team11_HW2.ckpt
(500, 5)
0	Validation loss: 14.038921	Best loss: 14.038921	Accuracy: 9.33%
1	Validation loss: 9.362429	Best loss: 9.362429	Accuracy: 10.67%
2	Validation loss: 5.394629	Best loss: 5.394629	Accuracy: 12.00%
3	Validation loss: 2.811015	Best loss: 2.811015	Accuracy: 24.00%
4	Validation loss: 1.718167	Best loss: 1.718167	Accuracy: 37.33%
5	Validation loss: 1.598576	Best loss: 1.598576	Accuracy: 38.00%
6	Validation loss: 1.550299	Best loss: 1.550299	Accuracy: 41.33%
7	Validation loss: 1.505520	Best loss: 1.505520	Accuracy: 40.00%
8	Validation loss: 1.488877	Best loss: 1.488877	Accuracy: 44.67%
9	Validation loss: 1.479374	Best loss: 1.479374	Accuracy: 42.00%
10	Validation loss: 1.446867	Best loss: 1.446867	Accuracy: 44.00%
11	Validation loss: 1.433239	Best loss: 1.433239	Accuracy: 42.00%
12	Validation loss: 1.427489	Best loss: 1.427489	Accuracy: 46.67%
13	Validation loss: 1.390036	Best loss: 1.390036	Accuracy: 45.33%
14	Validat
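
Caching works because the frozen layers' output for a given input never changes during training, so it only needs to be computed once. The plain-Python sketch below counts how often a stand-in for the frozen layers is evaluated with and without caching. The function `frozen_layers` and the toy inputs are hypothetical.

```python
# Conceptual sketch: count how often the frozen part of a network is evaluated.
# frozen_layers is a hypothetical stand-in for the frozen layers' forward pass.
calls = {"frozen": 0}

def frozen_layers(x):
    calls["frozen"] += 1
    return 2.0 * x + 1.0   # any fixed function; its parameters never change

inputs = [0.0, 1.0, 2.0, 3.0]
n_epochs = 5

# Without caching: the frozen layers run for every instance at every epoch.
for _ in range(n_epochs):
    for x in inputs:
        frozen_layers(x)
calls_without_cache = calls["frozen"]

# With caching: run the frozen layers once, then reuse the stored outputs.
calls["frozen"] = 0
cached = [frozen_layers(x) for x in inputs]
for _ in range(n_epochs):
    for h in cached:
        pass   # the trainable output layer would consume h here
calls_with_cache = calls["frozen"]

print(calls_without_cache, calls_with_cache)  # prints 20 4
```

The saving grows with the number of epochs. The actual speed-up in wall-clock time also depends on how expensive the frozen layers are relative to the rest of the training step.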

4.
Let's load the best model again, but this time we will create a new softmax output layer on top of the 4th hidden layer:

In [15]:
reset_graph()

n_outputs = 5

restore_saver = tf.train.import_meta_graph("./model/Team11_HW2.ckpt.meta")
he_init = tf.contrib.layers.variance_scaling_initializer()
X = tf.get_default_graph().get_tensor_by_name("X:0")
y = tf.get_default_graph().get_tensor_by_name("y:0")

hidden4_out = tf.get_default_graph().get_tensor_by_name("hidden4_out:0")
logits = tf.layers.dense(hidden4_out, n_outputs, kernel_initializer=he_init, name="new_logits")
Y_proba = tf.nn.softmax(logits)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
loss = tf.reduce_mean(xentropy)
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32), name="accuracy")
print('done')

done
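
For reference, here is what the loss and accuracy nodes compute, written as a plain-Python sketch for single instances. The logits and labels are made up, and the accuracy uses a simple argmax comparison, so its handling of ties may differ from `tf.nn.in_top_k`.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_xentropy(logits, label):
    # Cross-entropy for one instance with an integer class label.
    return -math.log(softmax(logits)[label])

# Made-up logits for two instances over n_outputs = 5 classes.
batch_logits = [[0.0, 0.0, 0.0, 0.0, 0.0], [2.0, 0.5, 0.1, 0.0, -1.0]]
batch_labels = [3, 0]

losses = [sparse_xentropy(row, label)
          for row, label in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)

# Top-1 accuracy: the label must be the index of the largest logit.
correct = [max(range(len(row)), key=row.__getitem__) == label
           for row, label in zip(batch_logits, batch_labels)]
print(losses[0], mean_loss, correct)
```

The first instance has uniform logits, so its loss is ln(5), about 1.609. That is the loss of a classifier that assigns equal probability to all 5 classes, which makes it a useful reference point when reading validation losses.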


And now let's create the training operation. We want to freeze all the layers except for the new output layer:

In [16]:
learning_rate = 0.01

output_layer_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="new_logits")
optimizer = tf.train.AdamOptimizer(learning_rate, name="Adam2")
training_op = optimizer.minimize(loss, var_list=output_layer_vars)

init = tf.global_variables_initializer()
four_frozen_saver = tf.train.Saver()
print('done')

done


And once again we train the model with the same code as earlier. Note: we could of course write a function once and use it multiple times, rather than copying almost the same training code over and over again, but as we keep tweaking the code slightly, the function would need multiple arguments and if statements, and it would have to be at the beginning of the notebook, where it would not make much sense to readers. In short it would be very confusing, so we're better off with copy & paste.

In [20]:
n_epochs = 1000
batch_size = 20

max_checks_without_progress = 20
checks_without_progress = 0
best_loss = np.inf

with tf.Session() as sess:
    init.run()
    restore_saver.restore(sess, "./model/Team11_HW2.ckpt")
        
    for epoch in range(n_epochs):
        rnd_idx = np.random.permutation(len(X_train2))
        for rnd_indices in np.array_split(rnd_idx, len(X_train2) // batch_size):
            X_batch, y_batch = X_train2[rnd_indices], y_train2[rnd_indices]
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val, acc_val = sess.run([loss, accuracy], feed_dict={X: X_valid2, y: y_valid2})
        if loss_val < best_loss:
            save_path = four_frozen_saver.save(sess, "./my_mnist_model_5_to_9_four_frozen")
            best_loss = loss_val
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress > max_checks_without_progress:
                print("Early stopping!")
                break
        print("{}\tValidation loss: {:.6f}\tBest loss: {:.6f}\tAccuracy: {:.2f}%".format(
            epoch, loss_val, best_loss, acc_val * 100))

with tf.Session() as sess:
    four_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_four_frozen")
    acc_test = accuracy.eval(feed_dict={X: X_test2, y: y_test2})
    print("Final test accuracy: {:.2f}%".format(acc_test * 100))

INFO:tensorflow:Restoring parameters from ./model/Team11_HW2.ckpt
0	Validation loss: 3.427775	Best loss: 3.427775	Accuracy: 42.00%
1	Validation loss: 2.353230	Best loss: 2.353230	Accuracy: 45.33%
2	Validation loss: 2.158878	Best loss: 2.158878	Accuracy: 30.67%
3	Validation loss: 1.997038	Best loss: 1.997038	Accuracy: 46.00%
4	Validation loss: 1.787396	Best loss: 1.787396	Accuracy: 44.67%
5	Validation loss: 1.785813	Best loss: 1.785813	Accuracy: 44.67%
6	Validation loss: 1.742076	Best loss: 1.742076	Accuracy: 42.00%
7	Validation loss: 2.210415	Best loss: 1.742076	Accuracy: 44.00%
8	Validation loss: 2.076468	Best loss: 1.742076	Accuracy: 42.00%
9	Validation loss: 1.830659	Best loss: 1.742076	Accuracy: 44.67%
10	Validation loss: 2.353256	Best loss: 1.742076	Accuracy: 40.00%
11	Validation loss: 1.969716	Best loss: 1.742076	Accuracy: 50.67%
12	Validation loss: 2.580928	Best loss: 1.742076	Accuracy: 40.67%
13	Validation loss: 2.333932	Best loss: 1.742076	Accuracy: 43.33%
14	Validation loss: 

5.
Now unfreeze the top two hidden layers and continue training: can you get the model to perform even better?

In [23]:
learning_rate = 0.01

unfrozen_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="hidden[34]|new_logits")
optimizer = tf.train.AdamOptimizer(learning_rate, name="Adam3")
training_op = optimizer.minimize(loss, var_list=unfrozen_vars)

init = tf.global_variables_initializer()
two_frozen_saver = tf.train.Saver()
print('done')

done
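
The `scope` argument of `tf.get_collection` is treated as a regular expression and matched against the start of each item's name with `re.match`, which is why a pattern such as `"hidden[34]|new_logits"` selects several layers at once. The sketch below applies the same pattern to a list of hypothetical variable names; the real names in the restored graph may differ.

```python
import re

# Hypothetical variable names, following the "<layer>/kernel:0" pattern.
var_names = [
    "hidden1/kernel:0", "hidden2/kernel:0", "hidden3/kernel:0",
    "hidden3/bias:0", "hidden4/kernel:0", "new_logits/kernel:0",
]

scope = "hidden[34]|new_logits"
# re.match anchors at the start of the string, like the scope filter.
selected = [name for name in var_names if re.match(scope, name)]
print(selected)
```

Only the variables of `hidden3`, `hidden4` and `new_logits` are selected, so `hidden1` and `hidden2` stay frozen.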


In [24]:
n_epochs = 1000
batch_size = 20

max_checks_without_progress = 20
checks_without_progress = 0
best_loss = np.inf

with tf.Session() as sess:
    init.run()
    four_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_four_frozen")
        
    for epoch in range(n_epochs):
        rnd_idx = np.random.permutation(len(X_train2))
        for rnd_indices in np.array_split(rnd_idx, len(X_train2) // batch_size):
            X_batch, y_batch = X_train2[rnd_indices], y_train2[rnd_indices]
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val, acc_val = sess.run([loss, accuracy], feed_dict={X: X_valid2, y: y_valid2})
        if loss_val < best_loss:
            save_path = two_frozen_saver.save(sess, "./my_mnist_model_5_to_9_two_frozen")
            best_loss = loss_val
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress > max_checks_without_progress:
                print("Early stopping!")
                break
        print("{}\tValidation loss: {:.6f}\tBest loss: {:.6f}\tAccuracy: {:.2f}%".format(
            epoch, loss_val, best_loss, acc_val * 100))

with tf.Session() as sess:
    two_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_two_frozen")
    acc_test = accuracy.eval(feed_dict={X: X_test2, y: y_test2})
    print("Final test accuracy: {:.2f}%".format(acc_test * 100))

INFO:tensorflow:Restoring parameters from ./my_mnist_model_5_to_9_four_frozen
0	Validation loss: 2.687472	Best loss: 2.687472	Accuracy: 38.67%
1	Validation loss: 1.485615	Best loss: 1.485615	Accuracy: 47.33%
2	Validation loss: 1.375690	Best loss: 1.375690	Accuracy: 49.33%
3	Validation loss: 1.192660	Best loss: 1.192660	Accuracy: 50.67%
4	Validation loss: 1.313740	Best loss: 1.192660	Accuracy: 50.67%
5	Validation loss: 1.125055	Best loss: 1.125055	Accuracy: 50.67%
6	Validation loss: 1.179728	Best loss: 1.125055	Accuracy: 58.00%
7	Validation loss: 1.257175	Best loss: 1.125055	Accuracy: 56.00%
8	Validation loss: 1.178493	Best loss: 1.125055	Accuracy: 54.67%
9	Validation loss: 1.230717	Best loss: 1.125055	Accuracy: 48.00%
10	Validation loss: 1.187102	Best loss: 1.125055	Accuracy: 52.67%
11	Validation loss: 1.205531	Best loss: 1.125055	Accuracy: 52.00%
12	Validation loss: 1.198220	Best loss: 1.125055	Accuracy: 51.33%
13	Validation loss: 1.273223	Best loss: 1.125055	Accuracy: 45.33%
14	Valid

Next, let's unfreeze all the layers (by calling `minimize()` without a `var_list`) and continue training from the previous checkpoint:

In [25]:
learning_rate = 0.01

optimizer = tf.train.AdamOptimizer(learning_rate, name="Adam4")
training_op = optimizer.minimize(loss)

init = tf.global_variables_initializer()
no_frozen_saver = tf.train.Saver()
print('done')

done


In [26]:
n_epochs = 1000
batch_size = 20

max_checks_without_progress = 20
checks_without_progress = 0
best_loss = np.inf

with tf.Session() as sess:
    init.run()
    two_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_two_frozen")
        
    for epoch in range(n_epochs):
        rnd_idx = np.random.permutation(len(X_train2))
        for rnd_indices in np.array_split(rnd_idx, len(X_train2) // batch_size):
            X_batch, y_batch = X_train2[rnd_indices], y_train2[rnd_indices]
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val, acc_val = sess.run([loss, accuracy], feed_dict={X: X_valid2, y: y_valid2})
        if loss_val < best_loss:
            save_path = no_frozen_saver.save(sess, "./my_mnist_model_5_to_9_no_frozen")
            best_loss = loss_val
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress > max_checks_without_progress:
                print("Early stopping!")
                break
        print("{}\tValidation loss: {:.6f}\tBest loss: {:.6f}\tAccuracy: {:.2f}%".format(
            epoch, loss_val, best_loss, acc_val * 100))

with tf.Session() as sess:
    no_frozen_saver.restore(sess, "./my_mnist_model_5_to_9_no_frozen")
    acc_test = accuracy.eval(feed_dict={X: X_test2, y: y_test2})
    print("Final test accuracy: {:.2f}%".format(acc_test * 100))

INFO:tensorflow:Restoring parameters from ./my_mnist_model_5_to_9_two_frozen
0	Validation loss: 0.883460	Best loss: 0.883460	Accuracy: 69.33%
1	Validation loss: 0.508830	Best loss: 0.508830	Accuracy: 82.00%
2	Validation loss: 0.542565	Best loss: 0.508830	Accuracy: 80.67%
3	Validation loss: 0.559748	Best loss: 0.508830	Accuracy: 85.33%
4	Validation loss: 0.762251	Best loss: 0.508830	Accuracy: 77.33%
5	Validation loss: 0.634756	Best loss: 0.508830	Accuracy: 86.00%
6	Validation loss: 0.562701	Best loss: 0.508830	Accuracy: 86.00%
7	Validation loss: 0.461003	Best loss: 0.461003	Accuracy: 86.00%
8	Validation loss: 0.464231	Best loss: 0.461003	Accuracy: 90.67%
9	Validation loss: 0.399830	Best loss: 0.399830	Accuracy: 89.33%
10	Validation loss: 0.399738	Best loss: 0.399738	Accuracy: 90.00%
11	Validation loss: 0.420987	Best loss: 0.399738	Accuracy: 90.00%
12	Validation loss: 0.399378	Best loss: 0.399378	Accuracy: 90.00%
13	Validation loss: 0.638286	Best loss: 0.399378	Accuracy: 84.00%
14	Valida

Let's compare that to a DNN trained from scratch:

In [34]:
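# DNNClassifier is not defined in this notebook (hence the NameError below);
# it must be defined or imported before this cell can run.
dnn_clf_5_to_9 = DNNClassifier(n_hidden_layers=4, random_state=42)
dnn_clf_5_to_9.fit(X_train2, y_train2, n_epochs=1000, X_valid=X_valid2, y_valid=y_valid2)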

NameError: name 'DNNClassifier' is not defined

In [None]:
from sklearn.metrics import accuracy_score

y_pred = dnn_clf_5_to_9.predict(X_test2)
accuracy_score(y_test2, y_pred)