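This workshop contains code that trains a fully connected deep neural network on MNIST. The main changes from the previous notebook are:
* We have switched from a linear classifier to a deep neural network.

* We have added code to display the graph and summary data in TensorBoard.

* We are using AdamOptimizer instead of GradientDescentOptimizer.

* We are using Dropout.

An important takeaway: the code that trains the model is the same as in the previous workshop, even though the model is more complex.

Try this workshop by running the cells, or by uncommenting code.

When you have finished this notebook, use TensorBoard to view the results.

Although this is a simple model, it can reach more than 97% accuracy on MNIST.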

In [22]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import math
import os

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

### Initialization and data preparation

In [23]:
tf.reset_default_graph()
sess = tf.Session()

In [24]:
# Directory where the TensorBoard logs are written
LOGDIR = './graphs'

In [25]:
# Download (if needed) and load the MNIST data
mnist = input_data.read_data_sets('/tmp/data', one_hot=True)

Extracting /tmp/data\train-images-idx3-ubyte.gz
Extracting /tmp/data\train-labels-idx1-ubyte.gz
Extracting /tmp/data\t10k-images-idx3-ubyte.gz
Extracting /tmp/data\t10k-labels-idx1-ubyte.gz


In [26]:
# Number of neurons in each hidden layer
HIDDEN1_SIZE = 500
HIDDEN2_SIZE = 250

NUM_CLASSES = 10
NUM_PIXELS = 28 * 28

# Number of training steps
# Number of examples in each training batch
TRAIN_STEPS = 2000
BATCH_SIZE = 100

# Learning rate, used with the new optimizer (Adam)
LEARNING_RATE = 0.001
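
As a quick check on what these settings mean, the arithmetic below estimates how many passes over the training set the run makes. It assumes the default split produced by `input_data.read_data_sets`, which leaves 55,000 training examples; `NUM_TRAIN_EXAMPLES` is a name introduced here for illustration and is not defined in the notebook.

```python
# Assumption: 55,000 training examples (the default MNIST split from
# read_data_sets, after 5,000 examples are held out for validation).
NUM_TRAIN_EXAMPLES = 55000  # hypothetical name, not part of the notebook

TRAIN_STEPS = 2000
BATCH_SIZE = 100

# Each step consumes one batch, so the total is steps * batch size.
examples_seen = TRAIN_STEPS * BATCH_SIZE
epochs = examples_seen / NUM_TRAIN_EXAMPLES

print("examples seen: %d" % examples_seen)
print("approximate epochs: %.2f" % epochs)
```

Under that assumption the run covers only a few epochs, which is one reason the whole notebook trains quickly.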

### Define the model

In [27]:
# Placeholders for the input images and their one-hot labels
with tf.name_scope('input'):
    images = tf.placeholder(tf.float32, [None, NUM_PIXELS], name="pixels")
    labels = tf.placeholder(tf.float32, [None, NUM_CLASSES], name="labels")

In [28]:
# Function to create a fully connected layer
def fc_layer(input, size_out, name="fc", activation=None):
    with tf.name_scope(name):
        size_in = int(input.shape[1])
        w = tf.Variable(tf.truncated_normal([size_in, size_out], stddev=0.1), name="weights")
        b = tf.Variable(tf.constant(0.1, shape=[size_out]), name="bias")
        wx_plus_b = tf.matmul(input, w) + b
        if activation: return activation(wx_plus_b)
        return wx_plus_b
    
# The way we initialize variables has an effect on how quickly
# training converges. We may explore different strategies later.
# w = tf.Variable(tf.truncated_normal(shape=[size_in, size_out], stddev=1.0 / math.sqrt(float(size_in))))
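
The commented-out line above scales the initial standard deviation by the layer's fan-in instead of using a fixed 0.1. Here is a minimal sketch of what that rule gives for the layer sizes in this notebook; `scaled_stddev` and `layer_inputs` are names introduced here for illustration only.

```python
import math

def scaled_stddev(size_in):
    """Standard deviation 1 / sqrt(fan-in), as in the commented-out initializer."""
    return 1.0 / math.sqrt(float(size_in))

# Fan-in of each layer in this notebook: pixels -> fc1 -> fc2 -> output.
layer_inputs = [("fc1", 28 * 28), ("fc2", 500), ("output", 250)]

for name, size_in in layer_inputs:
    print("%s: fan-in %d, scaled stddev %.4f (fixed value in fc_layer: 0.1)"
          % (name, size_in, scaled_stddev(size_in)))
```

For all three layers the scaled value comes out smaller than 0.1, so switching strategies would start training from smaller weights.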

In [29]:
# Define the model

# First, we'll create two fully connected layers, with ReLU activations
fc1 = fc_layer(images, HIDDEN1_SIZE, "fc1", activation=tf.nn.relu)
fc2 = fc_layer(fc1, HIDDEN2_SIZE, "fc2", activation=tf.nn.relu)

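# Next, we'll apply Dropout to the second layer.
# This can help prevent overfitting, and I've added it here
# for illustration. You can comment this out, if you like.
# Note that keep_prob is fixed, so dropout is also active
# when accuracy is evaluated on the test set below.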

dropped = tf.nn.dropout(fc2, keep_prob=0.9)

# Finally, we'll calculate logits. This will be
# the input to our Softmax function. Notice we 
# don't apply an activation at this layer.
# If you've commented out the dropout layer,
# switch the input here to 'fc2'.
y = fc_layer(dropped, NUM_CLASSES, name="output")
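
To make the dropout step concrete, here is a minimal pure-Python sketch of inverted dropout, the scheme `tf.nn.dropout` follows: each activation is kept with probability `keep_prob` and scaled by `1 / keep_prob`, otherwise it is set to zero. The function name `dropout`, the all-ones input, and the fixed seed are assumptions made for illustration.

```python
import random

def dropout(activations, keep_prob, rng):
    """Inverted dropout: keep each value with probability keep_prob and
    scale kept values by 1 / keep_prob so the expected sum is unchanged."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10000  # made-up activations, all equal to 1
dropped = dropout(activations, keep_prob=0.9, rng=rng)

kept_fraction = sum(1 for d in dropped if d != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print("fraction kept: %.3f" % kept_fraction)
print("mean after dropout: %.3f (mean before: 1.000)" % mean_after)
```

The scaling is why the layer's average output stays roughly the same with and without dropout, even though about one in ten units is zeroed.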

In [30]:
# Define the loss function and the optimizer
with tf.name_scope("loss"):
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=y, labels=labels))
    tf.summary.scalar('loss', loss)

with tf.name_scope("optimizer"):
    # Whereas the previous notebook used a vanilla GradientDescentOptimizer,
    # here we're using Adam. This is a single-line code change, and more
    # importantly, TensorFlow will still automatically analyze our graph
    # and determine how to adjust the variables to decrease the loss.
    train = tf.train.AdamOptimizer(LEARNING_RATE).minimize(loss)
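
The two pieces of this cell can be sketched in plain Python. The first function computes softmax cross-entropy from logits for a single example; the second performs one Adam update on a scalar, following the textbook form of the algorithm. TensorFlow's implementation differs in details such as how epsilon is applied, so treat this as an illustration rather than the library's code. The names `softmax_cross_entropy` and `adam_step`, the example logits, and the toy objective are all assumptions.

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross-entropy between the labels and softmax(logits) for one example,
    using the log-sum-exp trick for numerical stability."""
    biggest = max(logits)
    log_sum_exp = biggest + math.log(sum(math.exp(z - biggest) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(t * lp for t, lp in zip(labels, log_probs))

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.
    m and v are running averages of the gradient and squared gradient;
    t is the 1-based step count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Loss for one made-up example whose true class is index 1.
loss_value = softmax_cross_entropy([2.0, 1.0, 0.1], [0.0, 1.0, 0.0])
print("loss: %.4f" % loss_value)

# Toy objective f(x) = x^2 (gradient 2x), minimized with Adam from x = 1.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.001)
print("x after 2000 Adam steps: %.4f" % x)
```

Unlike plain gradient descent, the Adam step size depends on the running averages `m` and `v` as well as on the learning rate, which is why the same `minimize(loss)` call can behave quite differently between the two optimizers.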

In [31]:
# Define the evaluation ops
with tf.name_scope("evaluation"):
    # These three lines are identical to the previous notebook.
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    tf.summary.scalar('accuracy', accuracy)    
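
The evaluation ops compare the index of the largest logit with the index of the 1 in the one-hot label, then average the matches. A minimal pure-Python sketch of the same calculation follows; the helper names `argmax` and `accuracy_of` and the three example rows are made up for illustration.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy_of(logits_batch, labels_batch):
    """Fraction of examples whose highest logit matches the one-hot label."""
    correct = [1.0 if argmax(lg) == argmax(lb) else 0.0
               for lg, lb in zip(logits_batch, labels_batch)]
    return sum(correct) / len(correct)

# Three made-up examples over three classes (illustrative values only).
logits_batch = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0], [1.5, 1.0, 0.2]]
labels_batch = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

batch_accuracy = accuracy_of(logits_batch, labels_batch)
print("accuracy: %.3f" % batch_accuracy)
```

Because only the position of the largest logit matters, accuracy can be computed without applying softmax first.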

In [32]:
# Set up logging.
# We'll use a second FileWriter to summarize accuracy on
# the test set. This will let us display it nicely in TensorBoard.
train_writer = tf.summary.FileWriter(os.path.join(LOGDIR, "train"))
train_writer.add_graph(sess.graph)
test_writer = tf.summary.FileWriter(os.path.join(LOGDIR, "test"))
summary_op = tf.summary.merge_all()

In [33]:
sess.run(tf.global_variables_initializer())

In [34]:
for step in range(TRAIN_STEPS):
    batch_xs, batch_ys = mnist.train.next_batch(BATCH_SIZE)
    summary_result, _ = sess.run([summary_op, train], 
                                    feed_dict={images: batch_xs, labels: batch_ys})

    train_writer.add_summary(summary_result, step)
    train_writer.add_run_metadata(tf.RunMetadata(), 'step%03d' % step)
    
    # calculate accuracy on the test set, every 100 steps.
    # we're using the entire test set here, so this will be a bit slow
    if step % 100 == 0:
        summary_result, acc = sess.run([summary_op, accuracy], 
                                       feed_dict={images: mnist.test.images, 
                                                  labels: mnist.test.labels})
        test_writer.add_summary(summary_result, step)
        test_writer.add_run_metadata(tf.RunMetadata(), 'step%03d' % step)
        print("test accuracy: %f at step %d" % (acc, step))


print("Accuracy %f" % sess.run(accuracy, 
                               feed_dict={images: mnist.test.images,
                                          labels: mnist.test.labels}))
train_writer.close()
test_writer.close()

test accuracy: 0.216100 at step 0
test accuracy: 0.924500 at step 100
test accuracy: 0.936900 at step 200
test accuracy: 0.956500 at step 300
test accuracy: 0.955400 at step 400
test accuracy: 0.958800 at step 500
test accuracy: 0.967100 at step 600
test accuracy: 0.964900 at step 700
test accuracy: 0.967200 at step 800
test accuracy: 0.970600 at step 900
test accuracy: 0.970000 at step 1000
test accuracy: 0.965500 at step 1100
test accuracy: 0.974000 at step 1200
test accuracy: 0.972600 at step 1300
test accuracy: 0.973100 at step 1400
test accuracy: 0.970400 at step 1500
test accuracy: 0.972700 at step 1600
test accuracy: 0.977600 at step 1700
test accuracy: 0.977400 at step 1800
test accuracy: 0.972900 at step 1900
Accuracy 0.977300
