# Dropout and Optimizers

Dropout is similar in spirit to bagging and is used to reduce overfitting. Different optimizers also affect how the model converges.
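To make the remark about optimizers concrete, here is a minimal pure-Python sketch (not the notebook's TensorFlow code) that compares plain gradient descent with Adam on a hypothetical one-dimensional objective f(w) = (w - 3)^2. The function names, the objective, and the step counts are illustrative assumptions. The Adam update follows the commonly published formulation with the usual defaults (beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8), which may differ in small details from `tf.train.AdamOptimizer`.

```python
import math

def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2 (an assumed example).
    return 2.0 * (w - 3.0)

def sgd(w, lr, steps):
    # Plain gradient descent: the step size is proportional to the gradient.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam keeps running averages of the gradient (m) and its square (v).
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for the first moment
        v_hat = v / (1 - beta2 ** t)   # bias correction for the second moment
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

sgd_first = sgd(0.0, lr=0.001, steps=1)
adam_first = adam(0.0, lr=0.001, steps=1)
w_sgd = sgd(0.0, lr=0.001, steps=100)
w_adam = adam(0.0, lr=0.001, steps=100)
print(sgd_first, adam_first)
print(w_sgd, w_adam)
```

With the same learning rate, the first gradient-descent step scales with the gradient, while Adam's first step has a size close to the learning rate itself. This is one reason the two optimizers can follow different convergence paths.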

In [1]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

In [2]:
mnist = input_data.read_data_sets("MNIST",one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST\train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST\train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting MNIST\t10k-images-idx3-ubyte.gz
Extracting MNIST\t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


In [3]:
batch_size = 50
n_batchs = mnist.train.num_examples // batch_size

In [4]:
n_batchs

1100
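The value 1100 comes from integer division. A small sketch of the arithmetic, where `num_examples` is hard-coded as an assumption inferred from the output above (1100 batches of 50) rather than read from the dataset, and the batch size of 64 is a hypothetical second case:

```python
num_examples = 55000  # assumed: inferred from 1100 batches of size 50
batch_size = 50

# Floor division gives the number of full batches per epoch.
n_batches = num_examples // batch_size
leftover = num_examples % batch_size
print(n_batches, leftover)

# With a batch size that does not divide evenly, `//` ignores the partial batch.
full_batches_64, leftover_64 = divmod(num_examples, 64)
print(full_batches_64, leftover_64)
```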

## Define the input layer

Two placeholders for the data (input images and labels), plus one for the dropout keep probability.

In [5]:
x = tf.placeholder(shape=[None, 784], dtype=tf.float32)
y = tf.placeholder(shape=[None, 10], dtype=tf.float32)
keep_prob = tf.placeholder(tf.float32)

In [6]:
x,y,keep_prob

(<tf.Tensor 'Placeholder:0' shape=(?, 784) dtype=float32>,
 <tf.Tensor 'Placeholder_1:0' shape=(?, 10) dtype=float32>,
 <tf.Tensor 'Placeholder_2:0' shape=<unknown> dtype=float32>)

## Define the hidden layers

In [7]:
w1 = tf.Variable(tf.zeros([784,1024]))
a1 = tf.nn.sigmoid(tf.matmul(x,w1))
o1 = tf.nn.dropout(a1,keep_prob)

In [8]:
w1,a1,o1

(<tf.Variable 'Variable:0' shape=(784, 1024) dtype=float32_ref>,
 <tf.Tensor 'Sigmoid:0' shape=(?, 1024) dtype=float32>,
 <tf.Tensor 'dropout/mul:0' shape=(?, 1024) dtype=float32>)
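`tf.nn.dropout(a1, keep_prob)` keeps each element with probability `keep_prob`, scales the kept elements by `1 / keep_prob`, and sets the rest to zero. The following is a minimal pure-Python sketch of that behavior, not TensorFlow's implementation. The helper name `dropout`, the seeded `rng`, and the sample size are assumptions made for illustration.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale survivors by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)           # fixed seed so the run is repeatable
activations = [0.5] * 10000      # hypothetical layer output
dropped = dropout(activations, keep_prob=0.5, rng=rng)

# Scaling by 1/keep_prob keeps the expected activation unchanged.
mean_after = sum(dropped) / len(dropped)
print(mean_after)

# keep_prob = 1.0 disables dropout, which is what evaluation should use.
unchanged = dropout(activations, keep_prob=1.0, rng=rng)
```

Because of the rescaling, the average activation stays close to its original value, so no extra correction is needed when dropout is switched off at evaluation time.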

In [9]:
w2 = tf.Variable(tf.zeros([1024,512]))
a2 = tf.nn.sigmoid(tf.matmul(o1,w2))
o2 = tf.nn.dropout(a2,keep_prob)

In [10]:
w2,a2,o2

(<tf.Variable 'Variable_1:0' shape=(1024, 512) dtype=float32_ref>,
 <tf.Tensor 'Sigmoid_1:0' shape=(?, 512) dtype=float32>,
 <tf.Tensor 'dropout_1/mul:0' shape=(?, 512) dtype=float32>)

In [11]:
w3 = tf.Variable(tf.zeros([512,128]))
a3 = tf.nn.sigmoid(tf.matmul(o2,w3))
o3 = tf.nn.dropout(a3,keep_prob)

In [12]:
w3,a3,o3

(<tf.Variable 'Variable_2:0' shape=(512, 128) dtype=float32_ref>,
 <tf.Tensor 'Sigmoid_2:0' shape=(?, 128) dtype=float32>,
 <tf.Tensor 'dropout_2/mul:0' shape=(?, 128) dtype=float32>)
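The hidden layers above have no bias term and start from all-zero weights, so before any training step every hidden activation equals sigmoid(0) = 0.5 regardless of the input. A minimal pure-Python sketch of one such layer for a single example, using hypothetical small dimensions (3 inputs, 4 units) in place of 784 and 1024:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_sigmoid(inputs, weights):
    """Compute sigmoid(inputs @ weights) for one example; weights is [n_in][n_out]."""
    n_out = len(weights[0])
    return [
        sigmoid(sum(x * row[j] for x, row in zip(inputs, weights)))
        for j in range(n_out)
    ]

inputs = [0.2, 0.7, 0.1]                     # hypothetical 3-feature example
zero_weights = [[0.0] * 4 for _ in inputs]   # 3 x 4 zeros, mirroring tf.zeros
hidden = dense_sigmoid(inputs, zero_weights)
print(hidden)
```

A common alternative is a small random initialization (for example with `tf.truncated_normal`), so that units in a layer do not all start out identical. That is not what this notebook does.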

## Define the output layer

In [13]:
w4 = tf.Variable(tf.zeros([128, 10]))
prediction = tf.nn.softmax(tf.matmul(o3,w4))

In [14]:
w4, prediction

(<tf.Variable 'Variable_3:0' shape=(128, 10) dtype=float32_ref>,
 <tf.Tensor 'Softmax:0' shape=(?, 10) dtype=float32>)
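`tf.nn.softmax` turns the 10 output scores into probabilities that sum to 1. The sketch below is a pure-Python version for a single example. The helper name and the sample scores are assumptions for illustration. With all-zero weights the scores are all zero, so the initial prediction is uniform.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)                            # subtract the max to avoid overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

uniform = softmax([0.0] * 10)    # scores produced by zero weights
peaked = softmax([2.0, 1.0, 0.1])
print(uniform)
print(peaked)
```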

## Define the loss function and optimizer

In [15]:
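# NOTE: `prediction` is already a softmax output, so softmax is applied twice here; the outputs recorded below come from this version.
# Passing the pre-softmax tf.matmul(o3, w4) as `logits` would avoid the double softmax.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)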

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.
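`softmax_cross_entropy_with_logits` applies softmax to its `logits` argument internally. Since `prediction` is itself a softmax output, the loss is computed on a doubly normalized vector and cannot approach 0. The pure-Python sketch below (helper names are assumptions) computes the lowest value this loss can reach for 10 classes, which is ln(e + 9) - 1, about 1.461. The recorded losses of about 1.49 are consistent with that floor.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_with_logits(logits, one_hot):
    # Mirrors the idea of the TensorFlow op: softmax first, then cross-entropy.
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

label = [1.0] + [0.0] * 9
perfect_prediction = label[:]    # best possible softmax output: a one-hot vector

# Feeding probabilities where logits are expected applies softmax a second time.
loss_floor = cross_entropy_with_logits(perfect_prediction, label)
closed_form = math.log(math.e + 9) - 1
print(loss_floor, closed_form)
```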



## Compute accuracy

In [16]:
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

In [17]:
correct_prediction, accuracy

(<tf.Tensor 'Equal:0' shape=(?,) dtype=bool>,
 <tf.Tensor 'Mean_1:0' shape=() dtype=float32>)
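The accuracy is the fraction of examples whose highest-probability class matches the class marked in the one-hot label. A minimal pure-Python sketch of the same computation on four hypothetical examples with three classes (the data and helper names are made up for illustration):

```python
def argmax(row):
    # Index of the largest entry (the first one if there are ties).
    return max(range(len(row)), key=row.__getitem__)

def accuracy(predictions, labels):
    # Compare predicted and true class per example, then average the matches.
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)

predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
acc = accuracy(predictions, labels)
print(acc)  # 0.5: the first two examples are right, the last two are wrong
```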

## Initialize variables and train the model

In [18]:
import os

# Hide all GPUs so that TensorFlow runs on the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

In [19]:
init = tf.global_variables_initializer()

In [20]:
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(50):
        for batch in range(n_batchs):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Training step: keep each hidden activation with probability 0.5.
            sess.run([train_step], feed_dict={x:batch_x, y:batch_y, keep_prob:0.5})
        # Evaluate on the test set once per epoch with dropout disabled (keep_prob = 1.0).
        acc, loss_value = sess.run([accuracy, loss], feed_dict= {x:mnist.test.images, y: mnist.test.labels, keep_prob:1.0})
        print("Iter: ", epoch, "Loss: ", loss_value, "Accuracy: ", acc)

Iter:  0 Loss:  2.1350589 Accuracy:  0.318
Iter:  1 Loss:  2.0367603 Accuracy:  0.4641
Iter:  2 Loss:  1.600498 Accuracy:  0.9046
Iter:  3 Loss:  1.5266411 Accuracy:  0.9355
Iter:  4 Loss:  1.5152434 Accuracy:  0.946
Iter:  5 Loss:  1.5085855 Accuracy:  0.9529
Iter:  6 Loss:  1.5073667 Accuracy:  0.9536
Iter:  7 Loss:  1.5028167 Accuracy:  0.9585
Iter:  8 Loss:  1.4998456 Accuracy:  0.9617
Iter:  9 Loss:  1.5011758 Accuracy:  0.9602
Iter:  10 Loss:  1.4959947 Accuracy:  0.9654
Iter:  11 Loss:  1.4951199 Accuracy:  0.9661
Iter:  12 Loss:  1.4946299 Accuracy:  0.9668
Iter:  13 Loss:  1.4935352 Accuracy:  0.9677
Iter:  14 Loss:  1.4922739 Accuracy:  0.9688
Iter:  15 Loss:  1.4937623 Accuracy:  0.9669
Iter:  16 Loss:  1.4897847 Accuracy:  0.9718
Iter:  17 Loss:  1.4907577 Accuracy:  0.9704
Iter:  18 Loss:  1.490104 Accuracy:  0.9707
Iter:  19 Loss:  1.4901439 Accuracy:  0.9712
Iter:  20 Loss:  1.4893572 Accuracy:  0.9717
Iter:  21 Loss:  1.4894102 Accuracy:  0.9715
Iter:  22 Loss:  1.48695