# M2177.003100 Deep Learning <br> Assignment #1 Part 3: Playing with Neural Networks by TensorFlow

Copyright (C) Data Science & AI Laboratory, Seoul National University. This material is for educational uses only. Some contents are based on the material provided by other paper/book authors and may be copyrighted by them. 

Previously in `Assignment2-1_Data_Curation.ipynb`, we created a pickle with formatted datasets for training, development and testing on the [notMNIST dataset](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html).

The goal of this assignment is to progressively train deeper and more accurate models using TensorFlow.

**Note**: certain details are missing or ambiguous on purpose, in order to test your knowledge of the related materials. However, if you really feel that something essential is missing and you cannot proceed to the next step, contact the teaching staff with a clear description of your problem.

### Submitting your work:
<font color=red>**DO NOT clear the final outputs**</font> so that TAs can grade both your code and results.  
Once you have completed **parts 1-3**, run the *CollectSubmission.sh* script with your **student number** as the input argument. <br>
This will produce a compressed file called *[Your student number].tar.gz*. Please submit this file on ETL. &nbsp;&nbsp; (Usage: ./*CollectSubmission.sh* &nbsp; 20\*\*-\*\*\*\*\*)

## Load datasets

First reload the data we generated in `Assignment2-1_Data_Curation.ipynb`.

In [1]:
# These are all the modules we'll be using later. Make sure you can import them
# before proceeding further.
from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import cPickle as pickle
from six.moves import range
import os

#configuration for gpu usage
conf = tf.ConfigProto()
# you can modify below as you want
#conf.gpu_options.per_process_gpu_memory_fraction = 0.4
#conf.gpu_options.allow_growth = True
#os.environ['CUDA_VISIBLE_DEVICES']='0'
print(tf.__version__)

1.12.0


In [2]:
pickle_file = 'data/notMNIST.pickle'

with open(pickle_file, 'rb') as f:
    save = pickle.load(f)
    train_dataset = save['train_dataset']
    train_labels = save['train_labels']
    valid_dataset = save['valid_dataset']
    valid_labels = save['valid_labels']
    test_dataset = save['test_dataset']
    test_labels = save['test_labels']
    del save  # hint to help gc free up memory
    print('Training set', train_dataset.shape, train_labels.shape)
    print('Validation set', valid_dataset.shape, valid_labels.shape)
    print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 28, 28) (200000,)
Validation set (10000, 28, 28) (10000,)
Test set (10000, 28, 28) (10000,)


Reformat the data into a shape better suited to the models we're going to train:
- data as a flat matrix,
- labels as float 1-hot encodings.

In [3]:
image_size = 28
num_labels = 10

def reformat(dataset, labels):
    dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)
    # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...]
    labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)
    return dataset, labels
train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 784) (200000, 10)
Validation set (10000, 784) (10000, 10)
Test set (10000, 784) (10000, 10)
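
As a quick check of the broadcasting trick in `reformat`, the sketch below applies the same comparison to three made-up labels and then recovers the class indices with `np.argmax`. The tiny `labels` array and `num_classes = 4` are illustrative assumptions, not values from the dataset.

```python
import numpy as np

num_classes = 4                      # illustrative; the notebook uses num_labels = 10
labels = np.array([2, 0, 3])         # made-up integer class labels

# labels[:, None] has shape (3, 1); np.arange(num_classes) has shape (4,).
# Broadcasting the comparison yields a (3, 4) boolean matrix with exactly
# one True per row, which is then cast to float32.
one_hot = (np.arange(num_classes) == labels[:, None]).astype(np.float32)
print(one_hot)

# argmax along axis 1 inverts the encoding and returns the class indices.
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```

The same `np.argmax(..., 1)` call is what the `accuracy` helper later in this notebook uses to compare predictions against one-hot labels.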


## TensorFlow tutorial: Fully Connected Network

We're first going to train a **fully connected network** with *1 hidden layer* with *1024 units* using stochastic gradient descent (SGD).

TensorFlow works like this:
* First you describe the computation that you want to see performed: what the inputs, the variables, and the operations look like. These get created as nodes over a computation graph. This description is all contained within the block below:

      with graph.as_default():
          ...

* Then you can run the operations on this graph as many times as you want by calling `session.run()`, passing it the outputs to fetch from the graph; their values are returned to you. This runtime operation is all contained in the block below:

      with tf.Session(graph=graph) as session:
          ...

Let's load all the data into TensorFlow and build the computation graph corresponding to our training:

In [4]:
batch_size = 128
nn_hidden = 1024

graph = tf.Graph()
with graph.as_default():
    # Input data. For the training data, we use a placeholder that will be fed
    # at run time with a training minibatch.
    tf_dataset = tf.placeholder(tf.float32,
                                      shape=(None, image_size * image_size))
    tf_labels = tf.placeholder(tf.float32, shape=(None, num_labels))
    
    # Parameters. 
    w1 = tf.Variable(tf.truncated_normal([image_size * image_size, nn_hidden]))
    b1 = tf.Variable(tf.zeros([nn_hidden]))
    w2 = tf.Variable(tf.truncated_normal([nn_hidden, num_labels]))
    b2 = tf.Variable(tf.zeros([num_labels]))
    
    # Training computation.
    hidden = tf.tanh(tf.matmul(tf_dataset, w1) + b1)
    logits = tf.matmul(hidden, w2) + b2
    
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=tf_labels, logits=logits))
    
    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
  
    # Predictions for the training, validation, and test data.
    prediction = tf.nn.softmax(logits)

Let's run this computation and iterate:

In [5]:
num_steps = 10000

def accuracy(predictions, labels):
    return (100.0 * np.sum(np.equal(np.argmax(predictions, 1), np.argmax(labels, 1)))
          / predictions.shape[0])

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print("Initialized")
    for step in range(num_steps):
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        print('offset', offset)
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict_train={tf_dataset: batch_data, tf_labels: batch_labels}
        _, l, predictions = session.run([optimizer, loss, prediction], feed_dict=feed_dict_train)
        if (step % 1000 == 0):
            print("Minibatch loss at step %d: %f" % (step, l))
            print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            valid_prediction = session.run(logits, feed_dict={tf_dataset: valid_dataset})
            print("Validation accuracy: %.1f%%" % accuracy(valid_prediction, valid_labels))
                  
    test_prediction = session.run(prediction, feed_dict={tf_dataset: test_dataset})
    print("Test accuracy: %.1f%%" % accuracy(test_prediction, test_labels))
    saver = tf.train.Saver()
    saver.save(session, "./model_checkpoints/my_model_final")

Initialized
offset 0
Minibatch loss at step 0: 40.760921
Minibatch accuracy: 8.6%
Validation accuracy: 24.3%
offset 128
offset 256
[... intermediate `offset` lines omitted; the captured output is truncated ...]
offset 41984
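
The offset arithmetic in the training loop can be checked in isolation. The sketch below reproduces the formula with the training-set size and batch size used above, and shows that the offset wraps around as it approaches the end of the training set, so every slice stays in bounds. The helper name `minibatch_offsets` is made up for this illustration.

```python
def minibatch_offsets(num_examples, batch_size, num_steps):
    """Return the start index of each minibatch, wrapping around the dataset."""
    return [(step * batch_size) % (num_examples - batch_size)
            for step in range(num_steps)]

offsets = minibatch_offsets(num_examples=200000, batch_size=128, num_steps=10000)

print(offsets[:3])                    # start of the first pass
print(offsets[1561], offsets[1562])   # last offset before the wrap, first after
print(max(offsets) + 128 <= 200000)   # every slice ends inside the dataset
```

Because the modulus `200000 - 128` is not a multiple of the batch size, the second pass starts at a shifted offset rather than repeating exactly the same minibatches.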

So far, you have built the model in a naive way. However, TensorFlow provides a module named `tf.layers` for your convenience.

From now on, build the same model as above using the `tf.layers` module.

In [None]:
graph_l=tf.Graph()
with graph_l.as_default():
    tf_dataset_l=tf.placeholder(tf.float32, shape=(None, image_size * image_size))
    tf_labels_l=tf.placeholder(tf.float32, shape=(None, num_labels))
    
    #neural network consists of two lines
    dense = tf.layers.dense(tf_dataset_l, nn_hidden, activation=tf.tanh)
    logits_l = tf.layers.dense(dense, num_labels, activation=None)
    
    #Loss
    loss_l = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=tf_labels_l, logits=logits_l))
    
    #Optimizer
    optimizer_l = tf.train.GradientDescentOptimizer(0.5).minimize(loss_l)
    
    #Predictions for the training
    prediction_l = tf.nn.softmax(logits_l)

In [None]:
with tf.Session(graph=graph_l, config=conf) as session_l:
    tf.global_variables_initializer().run()
    print("Initialized")
    for step in range(num_steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :].astype(float)
        feed_dict_l = {tf_dataset_l: batch_data, tf_labels_l: batch_labels}
        _, l_l, predictions_l = session_l.run([optimizer_l, loss_l, prediction_l], feed_dict=feed_dict_l)
        if(step % 1000 == 0):
            print("Minibatch loss at step %d: %f" % (step, l_l))
            # Labels are not needed to compute predictions, so only the inputs are fed.
            feed_dict_val_l = {tf_dataset_l: valid_dataset}
            valid_prediction_l = session_l.run(prediction_l, feed_dict=feed_dict_val_l)
            print("Validation accuracy: %.1f%%" % accuracy(valid_prediction_l, valid_labels))

    feed_dict_test_l = {tf_dataset_l: test_dataset}
    test_prediction_l = session_l.run(prediction_l, feed_dict=feed_dict_test_l)
    print("Test accuracy: %.1f%%" % accuracy(test_prediction_l, test_labels))
    saver = tf.train.Saver()
    saver.save(session_l, "./model_checkpoints/my_model_final_using_layers")

---
Problem 1
-------

**Describe below** why there is a difference in accuracy between the graph built with the layers module and the graph built in the naive way. **Explain simply.**





---

Describe here

---
Problem 2
-------

Try to get the best performance you can using a multi-layer model! (It doesn't matter whether you implement it in a naive way or using the layers module. HOWEVER, you CANNOT use other types of layers, such as convolutional layers.)

The best reported test accuracy using a deep network is [97.1%](http://yaroslavvb.blogspot.kr/2011/09/notmnist-dataset.html?showComment=1391023266211#c8758720086795711595). You may use the techniques below.

1. Experiment with different hyperparameters: num_steps, learning rate, etc.
2. We used a fixed learning rate epsilon for gradient descent. Implement an annealing schedule for the gradient descent learning rate ([more info](http://cs231n.github.io/neural-networks-3/#anneal)). *Hint*. Try using `tf.train.exponential_decay`.    
3. We used a tanh activation function for our hidden layer. Experiment with other activation functions included in TensorFlow.
4. Extend the network to multiple hidden layers. Experiment with the layer sizes. Adding another hidden layer means you will need to adjust the code. 
5. Introduce and tune a regularization method (e.g. L2 regularization) for your model. Remember that L2 regularization amounts to adding a penalty on the norm of the weights to the loss. In TensorFlow, you can compute the L2 loss for a tensor `t` using `tf.nn.l2_loss(t)`. The right amount of regularization should improve your validation / test accuracy.
6. Introduce dropout on the hidden layer of the neural network. Remember: dropout should only be applied during training, not evaluation; otherwise your evaluation results would be stochastic as well. TensorFlow provides `tf.nn.dropout()` for that, but you have to make sure it's only active during training.
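
For item 2, the schedule documented for `tf.train.exponential_decay` is `learning_rate * decay_rate ^ (global_step / decay_steps)`, with integer division when `staircase=True`. The pure-Python sketch below evaluates that formula so you can see how a candidate schedule behaves before wiring it into a graph. The helper name `decayed_learning_rate` and the numbers (0.5, 1000 steps, 0.9) are illustrative assumptions, not recommended settings.

```python
def decayed_learning_rate(base_lr, step, decay_steps, decay_rate, staircase=False):
    """Evaluate base_lr * decay_rate ** (step / decay_steps)."""
    if staircase:
        exponent = step // decay_steps          # decay in discrete jumps
    else:
        exponent = float(step) / decay_steps    # decay continuously
    return base_lr * decay_rate ** exponent

for step in (0, 500, 1000, 5000, 10000):
    smooth = decayed_learning_rate(0.5, step, decay_steps=1000, decay_rate=0.9)
    stepped = decayed_learning_rate(0.5, step, 1000, 0.9, staircase=True)
    print("step %5d: smooth %.4f, staircase %.4f" % (step, smooth, stepped))
```

Printing a few values like this is a cheap way to confirm that the learning rate does not shrink to almost nothing long before `num_steps` is reached.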

**Evaluation:** You will get full credit if your best test accuracy exceeds <font color=red>$93\%$</font>. Save your best-performing model as `my_model_final` using a saver. (Refer to the cell above.)
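
The dropout hint in item 6 hinges on dropout being active only during training. The NumPy sketch below illustrates the inverted-dropout scheme, in which kept activations are scaled by `1 / keep_prob` (the same scaling `tf.nn.dropout` applies), with an explicit `training` flag. The function name, the flag, and the toy activations are assumptions for illustration only. In a TF 1.x graph, one common way to get the same effect is to feed `keep_prob` through a placeholder and set it to 1.0 at evaluation time.

```python
import numpy as np

def dropout(activations, keep_prob, training, rng):
    """Inverted dropout: zero units at random in training, identity otherwise."""
    if not training:
        return activations
    # Keep each unit with probability keep_prob, then rescale the survivors
    # so the expected activation matches the evaluation-time value.
    mask = rng.uniform(size=activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.RandomState(0)
toy_hidden = np.ones((4, 5), dtype=np.float32)

train_out = dropout(toy_hidden, keep_prob=0.5, training=True, rng=rng)
eval_out = dropout(toy_hidden, keep_prob=0.5, training=False, rng=rng)
print(train_out)  # each entry is either 0 or 1 / keep_prob
print(eval_out)   # unchanged, so evaluation stays deterministic
```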

---

In [None]:
print(__doc__)
""" TODO """

batch_size = 500

graph = tf.Graph()
with graph.as_default():
    # Input data. For the training data, we use a placeholder that will be fed
    # at run time with a training minibatch.
    tf_dataset = tf.placeholder(tf.float32,
                                      shape=(None, image_size * image_size))
    tf_labels = tf.placeholder(tf.float32, shape=(None, num_labels))
    
    # Parameters. 
    w1 = tf.Variable(tf.truncated_normal([image_size * image_size, nn_hidden]))
    b1 = tf.Variable(tf.zeros([nn_hidden]))
    w2 = tf.Variable(tf.truncated_normal([nn_hidden, num_labels]))
    b2 = tf.Variable(tf.zeros([num_labels]))
    
    # Training computation.
    hidden = tf.tanh(tf.matmul(tf_dataset, w1) + b1)
    logits = tf.matmul(hidden, w2) + b2
    
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=tf_labels, logits=logits))
    
    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
  
    # Predictions for the training, validation, and test data.
    prediction = tf.nn.softmax(logits)