## Dropout: Using Gluon
This chapter shows how to apply dropout when training and testing a deep learning model with Gluon.

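### Define the model and add dropout layers
With Gluon, defining the model becomes much simpler. We only need to add a gluon.nn.Dropout layer after each fully connected layer and specify the probability of dropping an element. In general, we recommend a smaller drop probability for layers closer to the input. In this experiment, we set the drop probability to 0.2 after the first fully connected layer and to 0.5 after the second.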

In practice, this article uses TensorFlow in place of Gluon, for example tf.nn.dropout.
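
The relationship between a Gluon-style drop probability and the keep probability expected by tf.nn.dropout in TensorFlow 1.x can be sketched without any framework. The `dropout` helper below is a hypothetical pure-Python stand-in written for illustration, not TensorFlow's implementation; it assumes the usual inverted-dropout rule in which kept elements are scaled by 1 / keep_prob.

```python
import random


def dropout(values, keep_prob, rng):
    """Inverted dropout on a list of floats (illustrative helper, not a TF API).

    keep_prob is the probability of KEEPING an element.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    # Kept elements are scaled by 1 / keep_prob so the expected value is unchanged.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10000

# A drop probability of 0.2 corresponds to keep_prob = 1 - 0.2 = 0.8.
dropped = dropout(activations, keep_prob=1 - 0.2, rng=rng)
zero_fraction = dropped.count(0.0) / float(len(dropped))
mean_after = sum(dropped) / len(dropped)

# keep_prob = 1 keeps every element, so the layer acts as an identity.
unchanged = dropout(activations, keep_prob=1.0, rng=rng)
```

With enough elements, roughly one fifth of `dropped` is zero while its mean stays close to the original mean, which is why no rescaling is needed once dropout is switched off.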

In [1]:
import tensorflow as tf

num_inputs = 28*28
num_outputs = 10

num_hidden1 = 256
num_hidden2 = 256

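# Unlike Gluon, the second argument of tf.nn.dropout is keep_prob (the
# probability of keeping a unit), so a smaller value drops more units.
# A value of 1 keeps every unit, so dropout is effectively disabled here;
# the drop probabilities 0.2 and 0.5 from the text would correspond to 0.8 and 0.5.
drop_prob1 = 1
drop_prob2 = 1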

def net(X):
    h1 = tf.contrib.layers.fully_connected(X, num_hidden1, activation_fn=tf.nn.relu, scope='hidden_1')
    h1 = tf.nn.dropout(h1, drop_prob1)
    h2 = tf.contrib.layers.fully_connected(h1, num_hidden2, activation_fn=tf.nn.relu, scope='hidden_2')
    h2 = tf.nn.dropout(h2, drop_prob2)
    h3 = tf.contrib.layers.fully_connected(h2, num_outputs, activation_fn=None, scope='output')
    return h3
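
Note that `net` calls tf.nn.dropout unconditionally, so any keep probability below 1 would also be applied when the same graph is used to compute test accuracy. One possible way to handle this is to derive the keep probability from a training flag and feed it into the graph. The helper name `keep_prob_for` below is an assumption made for illustration and does not come from TensorFlow or from the original code.

```python
def keep_prob_for(drop_prob, is_training):
    """Translate a Gluon-style drop probability into a keep probability.

    Outside training, dropout is switched off by returning 1.0.
    (Hypothetical helper, not part of any library.)
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    return 1.0 - drop_prob if is_training else 1.0


# Drop probabilities from the text: 0.2 after layer 1, 0.5 after layer 2.
train_keep = [keep_prob_for(p, is_training=True) for p in (0.2, 0.5)]
test_keep = [keep_prob_for(p, is_training=False) for p in (0.2, 0.5)]
```

The resulting values could be passed to the graph through a placeholder. TensorFlow 1.x also provides tf.layers.dropout, which takes a drop rate and a `training` argument and may be a closer match to Gluon's convention.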


### Load the data and train
This is no different from before.

In [2]:
import sys

sys.path.append('../utils')
import utils

data_dir = '../data/fashion_mnist'
train_images, train_labels, test_images, test_labels = utils.load_data_fashion_mnist(data_dir, one_hot=True)
print(train_images.shape)
print(train_labels.shape)

from tensorflow.contrib.learn.python.learn.datasets.mnist import DataSet
train_dataset = DataSet(train_images, train_labels, one_hot=True)

Extracting ../data/fashion_mnist/train-images-idx3-ubyte.gz
Extracting ../data/fashion_mnist/train-labels-idx1-ubyte.gz
Extracting ../data/fashion_mnist/t10k-images-idx3-ubyte.gz
Extracting ../data/fashion_mnist/t10k-labels-idx1-ubyte.gz
(60000, 28, 28, 1)
(60000, 10)
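
The printed shapes show that each image is stored as a 28 x 28 x 1 array and each label as a one-hot vector over 10 classes; the training loop later flattens every image into 28*28 = 784 inputs. A toy sketch of both transformations follows. The functions `one_hot` and `flatten` are illustrative names, not the implementation behind `utils.load_data_fashion_mnist`.

```python
def one_hot(label, num_classes):
    """Return a list with a single 1 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    row = [0] * num_classes
    row[label] = 1
    return row


def flatten(image):
    """Flatten a nested height x width x channel list into one flat list."""
    return [pixel for row in image for cell in row for pixel in cell]


labels = [9, 0, 3]
encoded = [one_hot(label, 10) for label in labels]

tiny_image = [[[1], [2]], [[3], [4]]]  # shape (2, 2, 1) instead of (28, 28, 1)
flat = flatten(tiny_image)             # length 2 * 2 * 1 = 4
```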


In [3]:
import numpy as np
learning_rate = 1e-1
max_steps = 10000
batch_size = 32
train_loss = 0.0
train_acc = 0.0


tf.reset_default_graph()

input_placeholder = tf.placeholder(tf.float32, [None, num_inputs])
gt_placeholder = tf.placeholder(tf.int64, [None, num_outputs])
logits = net(input_placeholder)
loss = tf.losses.softmax_cross_entropy(logits=logits,  onehot_labels=gt_placeholder)

acc = utils.accuracy(logits , gt_placeholder)
var_list = tf.trainable_variables()
for var in var_list:
    print(var.op.name)
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)


for step in range(max_steps):
    data, label = train_dataset.next_batch(batch_size)
    # Flatten each image in the batch into a vector of num_inputs pixels.
    feed_dict = {input_placeholder: data.reshape((-1, num_inputs)), gt_placeholder: label}
    loss_, acc_, _ = sess.run([loss, acc, train_op], feed_dict=feed_dict)
    if step % 1000 == 0:
        # These values are measured on the current mini-batch only.
        print('step %d, train loss %f' % (step, loss_))
        print('step %d, train acc %f' % (step, acc_))
# Evaluate on the full test set; test pixels are divided by 255 here.
test_acc = sess.run(acc, feed_dict={input_placeholder: np.squeeze(test_images).reshape((-1, num_inputs)) / 255.0 , gt_placeholder: test_labels})
print('step %d, test acc %f' % (step, test_acc))


hidden_1/weights
hidden_1/biases
hidden_2/weights
hidden_2/biases
output/weights
output/biases
step 0, train loss 2.389092
step 0, train acc 0.062500
step 1000, train loss 0.294750
step 1000, train acc 0.906250
step 2000, train loss 0.467379
step 2000, train acc 0.843750
step 3000, train loss 0.101088
step 3000, train acc 1.000000
step 4000, train loss 0.423964
step 4000, train acc 0.875000
step 5000, train loss 0.294006
step 5000, train acc 0.906250
step 6000, train loss 0.416803
step 6000, train acc 0.812500
step 7000, train loss 0.111409
step 7000, train acc 0.968750
step 8000, train loss 0.302880
step 8000, train acc 0.843750
step 9000, train loss 0.109781
step 9000, train acc 0.968750
step 9999, test acc 0.875800
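
To help read the log above, here is a framework-free sketch of the two reported quantities. It assumes the loss is the softmax cross-entropy averaged over the batch and that `utils.accuracy` (whose source is not shown here) returns the fraction of examples whose arg-max logit matches the one-hot label. Under that assumption every training accuracy in the log is a multiple of 1/32, because it is measured on a single batch of 32 examples (for instance 0.0625 = 2/32 and 0.90625 = 29/32).

```python
import math


def softmax_cross_entropy(logits, onehot):
    """Cross-entropy between softmax(logits) and a one-hot label (one example)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -sum_i y_i * log(softmax_i), with log(softmax_i) = z_i - log_sum_exp
    return sum(y * (log_sum_exp - z) for z, y in zip(logits, onehot))


def accuracy(batch_logits, batch_onehot):
    """Fraction of examples whose arg-max logit equals the labelled class."""
    correct = sum(
        1
        for logits, onehot in zip(batch_logits, batch_onehot)
        if logits.index(max(logits)) == onehot.index(max(onehot))
    )
    return correct / float(len(batch_logits))


batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
batch_onehot = [[1, 0, 0], [0, 0, 1], [1, 0, 0]]

losses = [softmax_cross_entropy(l, y) for l, y in zip(batch_logits, batch_onehot)]
mean_loss = sum(losses) / len(losses)
batch_acc = accuracy(batch_logits, batch_onehot)  # two of three are correct

# With 10 classes and uninformative (all-equal) logits the loss is ln(10).
uniform_loss = softmax_cross_entropy([0.0] * 10, [1] + [0] * 9)
```

The value of `uniform_loss` is ln(10), about 2.303, which is close to the loss of 2.389 printed at step 0, when the network has not yet learned anything.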
