Deep Learning
=============

Assignment 3
------------

Previously in `2_fullyconnected.ipynb`, you trained a logistic regression and a neural network model.

The goal of this assignment is to explore regularization techniques.

In [1]:
import numpy as np
import tensorflow as tf
import pickle

First reload the data we generated in `1_notmnist.ipynb`.

In [9]:
pickle_file = '../../data/notMNIST_sanitized.pickle'

with open(pickle_file, 'rb') as f:
    save = pickle.load(f)
    train_dataset = save['train_dataset_sanitized']
    train_labels = save['train_labels_sanitized']
    valid_dataset = save['valid_dataset']
    valid_labels = save['valid_labels']
    test_dataset = save['test_dataset']
    test_labels = save['test_labels']
    del save # hint to help gc free up memory
    print('Training_set',train_dataset.shape, train_labels.shape)
    print('Validation_set',valid_dataset.shape, valid_labels.shape)
    print('Test_set',test_dataset.shape, test_labels.shape)

Training_set (192407, 28, 28) (192407,)
Validation_set (10000, 28, 28) (10000,)
Test_set (10000, 28, 28) (10000,)


Reformat into a shape that's more adapted to the models we're going to train:
- data as a flat matrix,
- labels as float 1-hot encodings.

In [10]:
image_size = 28
num_labels = 10

In [11]:
def reformat(dataset, labels):
    """Flatten each image into a row vector and one-hot encode the labels."""
    dataset = dataset.reshape(dataset.shape[0], -1).astype(np.float32)
    # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...]
    labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32) 
    return dataset, labels

train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training_set',train_dataset.shape, train_labels.shape)
print('Validation_set',valid_dataset.shape, valid_labels.shape)
print('Test_set',test_dataset.shape, test_labels.shape)

Training_set (192407, 784) (192407, 10)
Validation_set (10000, 784) (10000, 10)
Test_set (10000, 784) (10000, 10)
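
The one-hot step in `reformat` relies on NumPy broadcasting: comparing a row of class indices against a column of labels gives a boolean matrix with exactly one `True` per row. A minimal sketch on a toy label vector follows; `num_classes = 4` and `toy_labels` are illustrative values, not taken from the dataset.

```python
import numpy as np

num_classes = 4                     # illustrative; the notebook uses num_labels = 10
toy_labels = np.array([0, 2, 3])    # hypothetical integer class labels

# Shape (4,) compared with shape (3, 1) broadcasts to a (3, 4) boolean matrix.
one_hot = (np.arange(num_classes) == toy_labels[:, None]).astype(np.float32)

print(one_hot.shape)
print(one_hot)
```

Taking `argmax` along axis 1 of the result recovers the original integer labels, which is what `accuracy` does later on.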


In [12]:
def accuracy(predictions, labels):
    """Return the classification accuracy as a percentage."""
#     correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(labels, 1)) # bool
#     print(correct_prediction)
#     return tf.reduce_mean(tf.cast(correct_prediction, "float")) # tf.cast: converts bool to another type
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
          / predictions.shape[0])
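
As a quick sanity check of `accuracy`, the sketch below applies the same function to four made-up prediction rows, one of which is wrong. The toy arrays are assumptions for illustration only.

```python
import numpy as np

def accuracy(predictions, labels):
    """Percentage of rows whose argmax matches between predictions and labels."""
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])

# Hypothetical softmax outputs for 4 samples and 3 classes.
toy_predictions = np.array([[0.8, 0.1, 0.1],
                            [0.2, 0.7, 0.1],
                            [0.3, 0.5, 0.2],   # predicted class 1, true class 2
                            [0.1, 0.2, 0.7]])
toy_labels = np.array([[1, 0, 0],
                       [0, 1, 0],
                       [0, 0, 1],
                       [0, 0, 1]], dtype=np.float32)

score = accuracy(toy_predictions, toy_labels)
print("%.1f%%" % score)  # 75.0%
```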

---
Problem 1
---------

Introduce and tune L2 regularization for both logistic and neural network models. Remember that L2 amounts to adding a penalty on the norm of the weights to the loss. In TensorFlow, you can compute the L2 loss for a tensor `t` using `nn.l2_loss(t)`. The right amount of regularization should improve your validation / test accuracy.

---
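
To make the penalty concrete: `tf.nn.l2_loss(t)` computes `sum(t ** 2) / 2`, and the regularized objective is the mean cross-entropy plus `lam` times that quantity. The NumPy sketch below reproduces the arithmetic on made-up numbers; `regularized_loss`, `ce` and `w` are hypothetical names and values used only for illustration.

```python
import numpy as np

def l2_loss(t):
    """Same quantity as tf.nn.l2_loss: half the sum of squared entries."""
    return np.sum(np.square(t)) / 2.0

def regularized_loss(per_example_ce, weights, lam):
    """Mean cross-entropy plus lam times the L2 penalty on the weights."""
    return np.mean(per_example_ce) + lam * l2_loss(weights)

ce = np.array([0.5, 1.5])                  # hypothetical per-example losses
w = np.array([[1.0, -2.0], [0.0, 3.0]])    # hypothetical weight matrix

total = regularized_loss(ce, w, lam=0.01)
# Adding the scalar penalty inside the mean gives the same value.
total_inside_mean = np.mean(ce + 0.01 * l2_loss(w))
print(total, total_inside_mean)
```

Because the penalty is a scalar, it does not matter whether it is added before or after averaging the per-example losses; the cells in this notebook add it inside `tf.reduce_mean`.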

#### multinomial logistic regression

In [10]:
batch_size = 128 # mini-batch size
lam = 0.01 # L2 regularization strength

graph = tf.Graph()
with graph.as_default():
    # Input data. For the training data, we use a placeholder that will be fed
    # at run time with a training minibatch.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variables
    weights = tf.Variable(tf.truncated_normal([image_size * image_size, num_labels]))
    biases = tf.Variable(tf.zeros([num_labels]))
    
    # Train computation
    logits = tf.matmul(tf_train_dataset, weights) + biases
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits) +
        lam * tf.nn.l2_loss(weights)) # add the L2 regularization term
    
    # Optimizer
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
    
    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_prediction = tf.nn.softmax(tf.matmul(tf_valid_dataset, weights) + biases)
    test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)

In [11]:
num_steps = 5000

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print("Initialized")
    for step in range(num_steps):
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {tf_train_dataset:batch_data, tf_train_labels:batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % 200 == 0:
            print("Minibatch loss at step %d: %f" % (step, l))
            print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(valid_prediction.eval(), valid_labels))
            print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))

Initialized
Minibatch loss at step 0: 45.672272
Minibatch accuracy: 10.9%
Validation accuracy: 11.4%
Test accuracy: 11.5%
Minibatch loss at step 200: 27.652506
Minibatch accuracy: 58.6%
Validation accuracy: 54.4%
Test accuracy: 59.9%
Minibatch loss at step 400: 21.828186
Minibatch accuracy: 67.2%
Validation accuracy: 64.9%
Test accuracy: 71.6%
Minibatch loss at step 600: 17.386768
Minibatch accuracy: 68.8%
Validation accuracy: 69.2%
Test accuracy: 76.3%
Minibatch loss at step 800: 14.664577
Minibatch accuracy: 68.0%
Validation accuracy: 71.4%
Test accuracy: 78.8%
Minibatch loss at step 1000: 11.444107
Minibatch accuracy: 75.8%
Validation accuracy: 72.8%
Test accuracy: 80.4%
Minibatch loss at step 1200: 9.809155
Minibatch accuracy: 75.0%
Validation accuracy: 73.8%
Test accuracy: 81.6%
Minibatch loss at step 1400: 7.634736
Minibatch accuracy: 79.7%
Validation accuracy: 75.0%
Test accuracy: 82.7%
Minibatch loss at step 1600: 6.642901
Minibatch accuracy: 76.6%
Validation accuracy: 75.8%
Te

#### 1-hidden layer neural network

In [16]:
import math

batch_size = 5000
hidden1_units = 1024
n_samples = train_dataset.shape[0]

with tf.name_scope('ANN') as scope:
    # Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variable
    weights1 = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden1_units],
                           stddev=1.0 / math.sqrt(float(image_size * image_size))))
    biases1 = tf.Variable(tf.zeros([hidden1_units]))
    
    # hidden layer 1
    hidden1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights1) + biases1)
    
    weights2 = tf.Variable(
        tf.truncated_normal([hidden1_units, num_labels],
                           stddev=1.0 / math.sqrt(float(hidden1_units))))
    biases2 = tf.Variable(tf.zeros([num_labels]))

    # Train computation
    logits = tf.matmul(hidden1, weights2) + biases2
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits) +
        lam * tf.nn.l2_loss(weights1) + lam * tf.nn.l2_loss(weights2)) # add the L2 regularization terms
     
    # Optimizer
    global_step = tf.Variable(0, name='global_step', trainable=False) # holds the global training step count
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss, global_step=global_step)
    
    # Predictions for the training
    train_prediction = tf.nn.softmax(logits)
    
    # validation
    hidden_valid = tf.nn.relu(tf.matmul(valid_dataset, weights1) + biases1)
    valid_prediction = tf.nn.softmax(tf.matmul(hidden_valid, weights2) + biases2)
    
    # test
    hidden_test = tf.nn.relu(tf.matmul(test_dataset, weights1) + biases1)
    test_prediction = tf.nn.softmax(tf.matmul(hidden_test, weights2) + biases2)

In [17]:
n_steps = 10

with tf.Session() as session:
    
    tf.global_variables_initializer().run() # initialize variables
    print("Initialized")
    for step in range(n_steps):
        # One pass over the training set per step
        for i in range(n_samples // batch_size):
            # Slice out the minibatch data and labels
            batch_data = train_dataset[i*batch_size:(i+1)*batch_size, :]
            batch_labels = train_labels[i*batch_size:(i+1)*batch_size, :]
            # Prepare a dictionary telling the session where to feed the minibatch.
            feed_dict = {tf_train_dataset:batch_data, tf_train_labels:batch_labels}
            # Run the computations. 
            _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        print("Loss at step %d: %f" % (step, l))
        print("Train accuracy: %.1f%%" % accuracy(predictions, batch_labels))
        print("Validation accuracy: %.1f%%" % accuracy(valid_prediction.eval(), valid_labels))
        print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))

Initialized
Loss at step 0: 4.978971
Train accuracy: 75.4%
Validation accuracy: 77.0%
Test accuracy: 84.2%
Loss at step 1: 4.616375
Train accuracy: 78.1%
Validation accuracy: 79.1%
Test accuracy: 86.4%
Loss at step 2: 4.408500
Train accuracy: 79.1%
Validation accuracy: 79.9%
Test accuracy: 87.3%
Loss at step 3: 4.242886
Train accuracy: 79.8%
Validation accuracy: 80.5%
Test accuracy: 87.9%
Loss at step 4: 4.096069
Train accuracy: 80.2%
Validation accuracy: 81.0%
Test accuracy: 88.3%
Loss at step 5: 3.960530
Train accuracy: 80.6%
Validation accuracy: 81.4%
Test accuracy: 88.5%
Loss at step 6: 3.833129
Train accuracy: 81.0%
Validation accuracy: 81.6%
Test accuracy: 88.6%
Loss at step 7: 3.712298
Train accuracy: 81.2%
Validation accuracy: 81.8%
Test accuracy: 88.8%
Loss at step 8: 3.597161
Train accuracy: 81.4%
Validation accuracy: 82.1%
Test accuracy: 89.0%
Loss at step 9: 3.487132
Train accuracy: 81.6%
Validation accuracy: 82.3%
Test accuracy: 89.1%


The training and validation accuracies stay fairly close to each other.

---
Problem 2
---------
Let's demonstrate an extreme case of overfitting. Restrict your training data to just a few batches. What happens?

---
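
One way to restrict training to a few batches is to cycle the minibatch offset over a fixed number of batches instead of over the whole training set. The sketch below shows only that index arithmetic; `batch_offset` and `num_batches` are hypothetical names introduced here. The cells that follow take a simpler route and train repeatedly on a single batch of 8000 samples.

```python
def batch_offset(step, batch_size, num_batches):
    """Start index of the minibatch for this step, cycling over num_batches batches."""
    return (step % num_batches) * batch_size

batch_size = 128    # same minibatch size as the logistic regression cell
num_batches = 3     # illustrative: only 3 * 128 = 384 samples are ever seen

offsets = [batch_offset(step, batch_size, num_batches) for step in range(7)]
print(offsets)  # [0, 128, 256, 0, 128, 256, 0]
```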

In [29]:
import math

batch_size = 8000
hidden1_units = 1024

with tf.name_scope('ANN') as scope:
    # Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variable
    weights1 = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden1_units],
                           stddev=1.0 / math.sqrt(float(image_size * image_size))))
    biases1 = tf.Variable(tf.zeros([hidden1_units]))
    
    # hidden layer 1
    hidden1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights1) + biases1)
    
    weights2 = tf.Variable(
        tf.truncated_normal([hidden1_units, num_labels],
                           stddev=1.0 / math.sqrt(float(hidden1_units))))
    biases2 = tf.Variable(tf.zeros([num_labels]))

    # Train computation
    logits = tf.matmul(hidden1, weights2) + biases2
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))
     
    # Optimizer
    global_step = tf.Variable(0, name='global_step', trainable=False) # holds the global training step count
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss, global_step=global_step)
    
    # Predictions for the training
    train_prediction = tf.nn.softmax(logits)
    
    # validation
    hidden_valid = tf.nn.relu(tf.matmul(valid_dataset, weights1) + biases1)
    valid_prediction = tf.nn.softmax(tf.matmul(hidden_valid, weights2) + biases2)
    
    # test
    hidden_test = tf.nn.relu(tf.matmul(test_dataset, weights1) + biases1)
    test_prediction = tf.nn.softmax(tf.matmul(hidden_test, weights2) + biases2)

In [30]:
num_steps = 500

with tf.Session() as session:
    
    tf.global_variables_initializer().run() # initialize variables
    print("Initialized")
    for step in range(num_steps):
        # Always train on the same first batch of data and labels
        batch_data = train_dataset[:batch_size, :]
        batch_labels = train_labels[:batch_size, :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        feed_dict = {tf_train_dataset:batch_data, tf_train_labels:batch_labels}
        # Run the computations. 
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % 50 == 0:
            print("Loss at step %d: %f" % (step, l))
            print("Train accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(valid_prediction.eval(), valid_labels))
            print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))

Initialized
Loss at step 0: 2.299004
Train accuracy: 13.5%
Validation accuracy: 59.4%
Test accuracy: 65.0%
Loss at step 50: 0.528100
Train accuracy: 85.0%
Validation accuracy: 82.6%
Test accuracy: 89.7%
Loss at step 100: 0.431688
Train accuracy: 87.9%
Validation accuracy: 83.3%
Test accuracy: 90.5%
Loss at step 150: 0.358668
Train accuracy: 90.0%
Validation accuracy: 83.6%
Test accuracy: 90.6%
Loss at step 200: 0.298744
Train accuracy: 92.7%
Validation accuracy: 83.7%
Test accuracy: 90.5%
Loss at step 250: 0.245042
Train accuracy: 94.6%
Validation accuracy: 83.6%
Test accuracy: 90.6%
Loss at step 300: 0.198047
Train accuracy: 95.9%
Validation accuracy: 83.6%
Test accuracy: 90.7%
Loss at step 350: 0.162373
Train accuracy: 97.3%
Validation accuracy: 83.6%
Test accuracy: 90.5%
Loss at step 400: 0.128728
Train accuracy: 98.2%
Validation accuracy: 83.7%
Test accuracy: 90.6%
Loss at step 450: 0.102029
Train accuracy: 98.9%
Validation accuracy: 83.7%
Test accuracy: 90.6%


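With so little training data, the training accuracy climbs very high while the validation accuracy stalls well below it: a clear case of overfitting.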

---
Problem 3
---------
Introduce Dropout on the hidden layer of the neural network. Remember: Dropout should only be introduced during training, not evaluation, otherwise your evaluation results would be stochastic as well. TensorFlow provides `nn.dropout()` for that, but you have to make sure it's only inserted during training.

What happens to our extreme overfitting case?

---
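
For intuition, here is a NumPy sketch of what dropout does, assuming the "inverted dropout" behaviour of `tf.nn.dropout`: each unit is kept with probability `keep_prob` and the survivors are scaled by `1 / keep_prob`, so the expected activation is unchanged. The `dropout` helper below is a hypothetical stand-in written for illustration, not the TensorFlow implementation.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Zero each entry with probability 1 - keep_prob and rescale the rest."""
    if keep_prob >= 1.0:
        return x                                 # evaluation: identity, fully deterministic
    mask = rng.random(x.shape) < keep_prob       # True where the unit is kept
    return x * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((4, 1000), dtype=np.float32)    # hypothetical hidden activations

train_out = dropout(hidden, 0.5, rng)   # training: random zeros, survivors doubled
eval_out = dropout(hidden, 1.0, rng)    # evaluation: unchanged

print(sorted(set(np.unique(train_out).tolist())))
print(float(train_out.mean()))          # close to 1.0 on average
```

This is why the training loop feeds a keep probability of 0.5 while the validation and test predictions must see no dropout at all.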

In [21]:
import math

batch_size = 8000
hidden1_units = 1024
lam = 0.005

with tf.name_scope('ANN') as scope:
    # Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variable
    weights1 = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden1_units],
                           stddev=1.0 / math.sqrt(float(image_size * image_size))))
    biases1 = tf.Variable(tf.zeros([hidden1_units]))
    
    # hidden layer 1
    hidden1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights1) + biases1)
    
    # Add dropout on the hidden layer
    keep_pro = tf.placeholder(tf.float32)
    hidden1_drop = tf.nn.dropout(hidden1, keep_pro)
    
    weights2 = tf.Variable(
        tf.truncated_normal([hidden1_units, num_labels],
                           stddev=1.0 / math.sqrt(float(hidden1_units))))
    biases2 = tf.Variable(tf.zeros([num_labels]))
    
    # Train computation
    logits = tf.matmul(hidden1_drop, weights2) + biases2
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))
     
    # Optimizer
    global_step = tf.Variable(0, name='global_step', trainable=False) # holds the global training step count
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss, global_step=global_step)
    
    # Predictions for the training
    train_prediction = tf.nn.softmax(logits)
    
    # validation
    hidden_valid = tf.nn.relu(tf.matmul(valid_dataset, weights1) + biases1)
    valid_prediction = tf.nn.softmax(tf.matmul(hidden_valid, weights2) + biases2)
    
    # test
    hidden_test = tf.nn.relu(tf.matmul(test_dataset, weights1) + biases1)
    test_prediction = tf.nn.softmax(tf.matmul(hidden_test, weights2) + biases2)

In [22]:
n_steps = 100 # number of passes over the training set

with tf.Session() as session:
    
    tf.global_variables_initializer().run() # initialize variables
    print("Initialized")
    for step in range(n_steps):
        # One pass over the training set per step
        for i in range(n_samples // batch_size):
            # Slice out the minibatch data and labels
            batch_data = train_dataset[i*batch_size:(i+1)*batch_size, :]
            batch_labels = train_labels[i*batch_size:(i+1)*batch_size, :]
            # Prepare a dictionary telling the session where to feed the minibatch.
            feed_dict = {tf_train_dataset:batch_data, tf_train_labels:batch_labels, keep_pro:0.5} # 0.5
            # Run the computations. 
            _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % 10 == 0:
            print("Loss at step %d: %f" % (step, l))
            print("Train accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval({tf_train_dataset:batch_data, tf_train_labels:batch_labels, keep_pro:1}), valid_labels)) # 1
            print("Test accuracy: %.1f%%" % accuracy(
                test_prediction.eval({tf_train_dataset:batch_data, tf_train_labels:batch_labels, keep_pro:1}), test_labels)) # 1

Initialized
Loss at step 0: 0.719336
Train accuracy: 79.8%
Validation accuracy: 81.6%
Test accuracy: 88.6%
Loss at step 10: 0.544086
Train accuracy: 84.7%
Validation accuracy: 85.5%
Test accuracy: 92.2%
Loss at step 20: 0.489057
Train accuracy: 86.2%
Validation accuracy: 87.0%
Test accuracy: 93.2%
Loss at step 30: 0.446530
Train accuracy: 87.2%
Validation accuracy: 87.9%
Test accuracy: 93.9%
Loss at step 40: 0.422733
Train accuracy: 88.2%
Validation accuracy: 88.4%
Test accuracy: 94.4%
Loss at step 50: 0.403037
Train accuracy: 88.5%
Validation accuracy: 88.9%
Test accuracy: 94.7%
Loss at step 60: 0.379670
Train accuracy: 89.1%
Validation accuracy: 89.2%
Test accuracy: 94.9%
Loss at step 70: 0.376045
Train accuracy: 89.4%
Validation accuracy: 89.2%
Test accuracy: 95.0%
Loss at step 80: 0.363014
Train accuracy: 89.9%
Validation accuracy: 89.4%
Test accuracy: 95.2%
Loss at step 90: 0.346940
Train accuracy: 90.1%
Validation accuracy: 89.6%
Test accuracy: 95.3%


---
Problem 4
---------

Try to get the best performance you can using a multi-layer model! The best reported test accuracy using a deep network is [97.1%](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html?showComment=1391023266211#c8758720086795711595).

One avenue you can explore is to add multiple layers.

Another one is to use learning rate decay:

    global_step = tf.Variable(0)  # count the number of steps taken.
    learning_rate = tf.train.exponential_decay(0.5, global_step, ...)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
 
---
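
The schedule computed by `tf.train.exponential_decay` with its default `staircase=False` is `learning_rate * decay_rate ** (global_step / decay_steps)`. The plain-Python sketch below reproduces that formula so the effect of a given setting can be checked before training; the function is a hypothetical re-implementation for illustration, not the TensorFlow op.

```python
def exponential_decay(base_lr, global_step, decay_steps, decay_rate, staircase=False):
    """Learning rate after global_step updates under exponential decay."""
    if staircase:
        exponent = global_step // decay_steps   # decay in discrete jumps
    else:
        exponent = global_step / decay_steps    # decay continuously
    return base_lr * decay_rate ** exponent

# Settings used in the cell below: 0.6 initial rate, decay 0.98 every 1000 steps.
for step in (0, 1000, 3800):
    print(step, exponential_decay(0.6, step, 1000, 0.98))
```

With a batch size of 10000 there are 19 updates per pass over the 192407 training samples, so 200 passes amount to 3800 updates and the rate only falls from 0.6 to about 0.56. A smaller `decay_steps` or `decay_rate` would be needed for the decay to have a visible effect.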


In [31]:
import math
lam = 0.002

batch_size = 10000
hidden1_units = 1024 # number of units in the first hidden layer
hidden2_units = 512 # number of units in the second hidden layer

with tf.name_scope('ANN') as scope:
    # Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # 1 layer
    weights1 = tf.Variable(
        tf.truncated_normal([image_size * image_size, hidden1_units],
                           stddev=1.0 / math.sqrt(float(image_size * image_size))))
    biases1 = tf.Variable(tf.zeros([hidden1_units]))
    
    hidden1 = tf.nn.relu(tf.matmul(tf_train_dataset, weights1) + biases1)
    
    # 2 layer
    weights2 = tf.Variable(
        tf.truncated_normal([hidden1_units, hidden2_units],
                           stddev=1.0 / math.sqrt(float(hidden1_units))))
    biases2 = tf.Variable(tf.zeros([hidden2_units]))
    
    hidden2 = tf.nn.relu(tf.matmul(hidden1, weights2) + biases2)
    
    # 3 layer
    weights3 = tf.Variable(
        tf.truncated_normal([hidden2_units, num_labels],
                           stddev=1.0 / math.sqrt(float(hidden2_units))))
    biases3 = tf.Variable(tf.zeros([num_labels]))
    
    # Add dropout to reduce overfitting
    keep_pro = tf.placeholder(tf.float32)
    hidden2_drop = tf.nn.dropout(hidden2, keep_pro)
    
    # Train computation
    logits = tf.matmul(hidden2_drop, weights3) + biases3
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits) + 
        lam * tf.nn.l2_loss(weights1) +
        lam * tf.nn.l2_loss(weights2) +
        lam * tf.nn.l2_loss(weights3)) # add the L2 regularization terms
     
    # Optimizer
    global_step = tf.Variable(0)
    learning_rate = tf.train.exponential_decay(0.6, global_step, 1000, 0.98)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
    
    # Predictions for the training
    train_prediction = tf.nn.softmax(logits)
    
    # validation
    hidden_valid_1 = tf.nn.relu(tf.matmul(valid_dataset, weights1) + biases1)
    hidden_valid_2 = tf.nn.relu(tf.matmul(hidden_valid_1, weights2) + biases2)
    valid_prediction = tf.nn.softmax(tf.matmul(hidden_valid_2, weights3) + biases3)
    
    # test
    hidden_test_1 = tf.nn.relu(tf.matmul(test_dataset, weights1) + biases1)
    hidden_test_2 = tf.nn.relu(tf.matmul(hidden_test_1, weights2) + biases2)
    test_prediction = tf.nn.softmax(tf.matmul(hidden_test_2, weights3) + biases3)

In [None]:
n_steps = 200 # number of passes over the training set
n_samples = train_dataset.shape[0]

with tf.Session() as session:
    
    tf.global_variables_initializer().run() # initialize variables
    print("Initialized")
    for step in range(n_steps):
        for i in range(n_samples // batch_size):
            # Slice out the minibatch data and labels
            batch_data = train_dataset[i*batch_size:(i+1)*batch_size, :]
            batch_labels = train_labels[i*batch_size:(i+1)*batch_size, :]
            # Prepare a dictionary telling the session where to feed the minibatch.
            feed_dict = {tf_train_dataset:batch_data, tf_train_labels:batch_labels, keep_pro:0.5} # 0.5
            # Run the computations. 
            _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if step % 10 == 0:
            print("Loss at step %d: %f" % (step, l))
            print("Train accuracy: %.1f%%" % accuracy(predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(
                valid_prediction.eval({tf_train_dataset:batch_data, tf_train_labels:batch_labels, keep_pro:1}), valid_labels)) # 1
            print("Test accuracy: %.1f%%" % accuracy(
                test_prediction.eval({tf_train_dataset:batch_data, tf_train_labels:batch_labels, keep_pro:1}), test_labels)) # 1

Initialized
Loss at step 0: 2.058116
Train accuracy: 69.3%
Validation accuracy: 67.7%
Test accuracy: 73.2%
Loss at step 10: 1.265482
Train accuracy: 85.6%
Validation accuracy: 85.2%
Test accuracy: 91.9%
Loss at step 20: 0.969415
Train accuracy: 87.2%
Validation accuracy: 86.8%
Test accuracy: 93.1%
Loss at step 30: 0.792380
Train accuracy: 88.0%
Validation accuracy: 87.7%
Test accuracy: 93.9%
Loss at step 40: 0.676623
Train accuracy: 89.0%
Validation accuracy: 88.1%
Test accuracy: 94.0%
Loss at step 50: 0.600525
Train accuracy: 89.3%
Validation accuracy: 88.4%
Test accuracy: 94.4%
Loss at step 60: 0.557666
Train accuracy: 89.6%
Validation accuracy: 88.5%
Test accuracy: 94.3%
Loss at step 70: 0.519929
Train accuracy: 90.2%
Validation accuracy: 88.9%
Test accuracy: 94.7%
Loss at step 80: 0.546584
Train accuracy: 88.7%
Validation accuracy: 85.9%
Test accuracy: 91.6%
Loss at step 90: 0.482159
Train accuracy: 91.0%
Validation accuracy: 89.4%
Test accuracy: 95.0%
Loss at step 100: 0.482473
Tr