# Dropout

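Dropout prevents a model from overfitting by temporarily disabling a fraction of its neurons during training.
It is controlled by `keep_prob`, the probability that each neuron stays active; a value of 1.0 keeps every neuron.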
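
To make the mechanism concrete, here is a minimal pure-Python sketch of "inverted" dropout, assuming the behavior of `tf.nn.dropout`: each value is kept with probability `keep_prob` and scaled by `1 / keep_prob`, otherwise it is set to zero. The function name `dropout` and the `rng` parameter are illustrative, not part of any library.

```python
import random


def dropout(activations, keep_prob, rng=random):
    """Illustrative inverted dropout over a flat list of activations."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    dropped = []
    for value in activations:
        if rng.random() < keep_prob:
            # Kept neuron: scale up so the expected value is unchanged.
            dropped.append(value / keep_prob)
        else:
            # Disabled neuron: contributes nothing for this step.
            dropped.append(0.0)
    return dropped


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(dropout([1.0, 2.0, 3.0, 4.0], keep_prob=0.7, rng=rng))
```

Because the kept values are scaled up, the expected output matches the input, which is why evaluation can simply run with `keep_prob` set to 1.0.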

## Usage

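1. Define a placeholder: `keep_prob = tf.placeholder(tf.float32)`
2. Apply dropout to a layer's output: `L1_drop = tf.nn.dropout(L1, keep_prob)`
3. Feed a value when running a step: `sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys, keep_prob: 1.0})`

`keep_prob: 1.0` is equivalent to not using dropout. To enable dropout, set it to a value between 0 and 1.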
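
The choice of value can be sketched as two small helpers. Both names (`keep_prob_for` and `keep_prob_to_rate`) are hypothetical and not part of TensorFlow. The first picks the value to feed depending on whether the step is a training step. The second converts to the `rate` argument that TensorFlow's deprecation notice for `tf.nn.dropout` asks for (`rate = 1 - keep_prob`).

```python
def keep_prob_for(training, train_keep_prob=0.7):
    """Return the keep probability to feed: dropout only applies while training."""
    if not 0.0 < train_keep_prob <= 1.0:
        raise ValueError("train_keep_prob must be in (0, 1]")
    return train_keep_prob if training else 1.0


def keep_prob_to_rate(keep_prob):
    """Convert a keep probability to a drop rate (rate = 1 - keep_prob)."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return 1.0 - keep_prob


print(keep_prob_for(training=True), keep_prob_for(training=False))
print(keep_prob_to_rate(1.0))
```

The default of 0.7 is only an example value; a suitable value depends on the network and the data.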

## Analysis of results

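The network is deliberately defined to be larger than necessary. MNIST digit classification has only 60000 training images, and each input is only 28x28, so the task does not need thousands of neurons.

This artificially creates overfitting. The experimental results follow.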
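
To give a rough sense of how oversized the network is, the sketch below counts its trainable parameters. The layer widths are taken from the network defined in the notebook code later in this document (784 inputs, hidden layers of 2000, 2000, and 1000 units, 10 outputs). The helper name `count_parameters` is illustrative.

```python
# Layer widths: input, three hidden layers, output.
layer_sizes = [784, 2000, 2000, 1000, 10]


def count_parameters(sizes):
    """Count weights plus biases of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


n_params = count_parameters(layer_sizes)
n_train_images = 60000  # training-set size quoted in the text
print("parameters:", n_params)
print("parameters per training image:", n_params / n_train_images)
```

With far more parameters than training images, the model has enough capacity to memorize much of the training set, which is the effect the experiment relies on.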

**Without dropout**

```
Iter 27,Testing Accuracy 0.9731,Training Accuracy 0.9954364
Iter 28,Testing Accuracy 0.9733,Training Accuracy 0.9955091
Iter 29,Testing Accuracy 0.9731,Training Accuracy 0.9955636
Iter 30,Testing Accuracy 0.9732,Training Accuracy 0.9956545
```
Training accuracy is above 99%, while test accuracy is about 97%. The gap is modest, probably because neither the network nor the task is very complex.

**With dropout**

```
Iter 27,Testing Accuracy 0.9692,Training Accuracy 0.9762545
Iter 28,Testing Accuracy 0.9689,Training Accuracy 0.9759273
Iter 29,Testing Accuracy 0.9697,Training Accuracy 0.97687274
Iter 30,Testing Accuracy 0.9713,Training Accuracy 0.9779091
```
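Compared with the previous run, training accuracy drops to 97.8% while test accuracy stays at about 97%. The gap between the two narrows from about 2.2 to about 0.7 percentage points, which indicates that dropout reduces overfitting.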
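
The comparison can be made explicit by computing the gap between training and test accuracy for each run. The numbers are the Iter 30 values from the two logs above; the helper name `generalization_gap` is illustrative.

```python
# Final-epoch (Iter 30) accuracies copied from the two logs above.
runs = {
    "without dropout": {"test": 0.9732, "train": 0.9956545},
    "with dropout": {"test": 0.9713, "train": 0.9779091},
}


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus test accuracy; larger means more overfitting."""
    return train_acc - test_acc


gaps = {
    name: generalization_gap(acc["train"], acc["test"])
    for name, acc in runs.items()
}
for name, gap in gaps.items():
    print(f"{name}: gap = {gap * 100:.2f} percentage points")
```

A smaller gap alone does not mean a better model: in these logs the test accuracy with dropout is slightly lower at Iter 30, so the benefit here is reduced overfitting rather than higher test accuracy.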


In [1]:
```python
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
```

In [2]:
```python
# Load the dataset
mnist = input_data.read_data_sets("../dataset/mnist", one_hot=True)
```

```
Extracting ../dataset/mnist/train-images-idx3-ubyte.gz
Extracting ../dataset/mnist/train-labels-idx1-ubyte.gz
Extracting ../dataset/mnist/t10k-images-idx3-ubyte.gz
Extracting ../dataset/mnist/t10k-labels-idx1-ubyte.gz
```


In [3]:
```python
# Size of each batch
batch_size = 100
# Total number of batches per epoch
n_batch = mnist.train.num_examples // batch_size

# Define the placeholders: inputs, labels, and the dropout keep probability
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)

# Build the network: three tanh hidden layers, each followed by dropout
W1 = tf.Variable(tf.truncated_normal([784, 2000], stddev=0.1))
b1 = tf.Variable(tf.zeros([2000]) + 0.1)
L1 = tf.nn.tanh(tf.matmul(x, W1) + b1)
L1_drop = tf.nn.dropout(L1, keep_prob)

W2 = tf.Variable(tf.truncated_normal([2000, 2000], stddev=0.1))
b2 = tf.Variable(tf.zeros([2000]) + 0.1)
L2 = tf.nn.tanh(tf.matmul(L1_drop, W2) + b2)
L2_drop = tf.nn.dropout(L2, keep_prob)

W3 = tf.Variable(tf.truncated_normal([2000, 1000], stddev=0.1))
b3 = tf.Variable(tf.zeros([1000]) + 0.1)
L3 = tf.nn.tanh(tf.matmul(L2_drop, W3) + b3)
L3_drop = tf.nn.dropout(L3, keep_prob)

W4 = tf.Variable(tf.truncated_normal([1000, 10], stddev=0.1))
b4 = tf.Variable(tf.zeros([10]) + 0.1)
prediction = tf.nn.softmax(tf.matmul(L3_drop, W4) + b4)

# Quadratic cost (disabled)
# loss = tf.reduce_mean(tf.square(y-prediction))
# Cross-entropy cost.
# Note: `prediction` is already a softmax output, so softmax is applied twice here.
# The code is kept in this form because the logged results were produced with it.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=prediction))
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

# Initialize variables
init = tf.global_variables_initializer()

# Store the per-example results in a boolean tensor;
# argmax returns the index of the largest value along the given axis
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(31):
        for batch in range(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # keep_prob 1.0 during training: dropout is disabled in this run
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys, keep_prob: 1.0})

        # Evaluate with dropout disabled (keep_prob 1.0)
        test_acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0})
        train_acc = sess.run(accuracy, feed_dict={x: mnist.train.images, y: mnist.train.labels, keep_prob: 1.0})
        print("Iter " + str(epoch) + ",Testing Accuracy " + str(test_acc) + ",Training Accuracy " + str(train_acc))
```

```
W0805 10:39:18.253728 140401451972352 deprecation.py:506] From <ipython-input-3-0cfecddb3bba>:15: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
W0805 10:39:18.311351 140401451972352 deprecation.py:323] From <ipython-input-3-0cfecddb3bba>:33: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.

Iter 0,Testing Accuracy 0.9467,Training Accuracy 0.95614547
Iter 1,Testing Accuracy 0.9582,Training Accuracy 0.9743091
Iter 2,Testing Accuracy 0.9632,Training Accuracy 0.98212725
Iter 3,Testing Accuracy 0.9654,Training Accuracy 0.98605454
Iter 4,Testing Accuracy 0.9662,Training Accuracy 0.9888
Iter 5,Testing Accuracy 0.9673,Training Accuracy 0.9902727
Iter 6,Testing Accuracy 0.9681,Training Accuracy 0.9912182
Iter 7,Testing Accuracy 0.9688,Training Accuracy 0.9918182
Iter 8,Testing Accuracy 0.97,Training Accuracy 0.99225456
Iter 9,Testing Accuracy 0.9704,Training Accuracy 0.9926
Iter 10,Testing Accuracy 0.9695,Training Accuracy 0.99305457
Iter 11,Testing Accuracy 0.9708,Training Accuracy 0.9934
Iter 12,Testing Accuracy 0.9702,Training Accuracy 0.99365455
Iter 13,Testing Accuracy 0.9718,Training Accuracy 0.99381816
Iter 14,Testing Accuracy 0.9708,Training Accuracy 0.99398184
Iter 15,Testing Accuracy 0.9722,Training Accuracy 0.9941091
Iter 16,Testing Accuracy 0.9719,Training Accuracy 0.9
```
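
One detail of the loss in this cell is worth illustrating: `prediction` is already a softmax output, and `softmax_cross_entropy_with_logits` applies softmax to its `logits` argument again. The pure-Python sketch below uses hypothetical logit values to show what a second softmax does to a distribution. It is an illustration only, not a measurement of this network.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical pre-softmax scores
once = softmax(logits)    # what `prediction` holds
twice = softmax(once)     # what the loss effectively sees

print("softmax once: ", [round(p, 3) for p in once])
print("softmax twice:", [round(p, 3) for p in twice])
```

The second softmax keeps the largest entry in place, so the accuracy computed from `argmax` is unaffected, but it flattens the distribution the loss is computed on. Passing the pre-softmax values as `logits` is the usual way to avoid this.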

In [4]:
```python
# Size of each batch
batch_size = 100
# Total number of batches per epoch
n_batch = mnist.train.num_examples // batch_size

# Define the placeholders: inputs, labels, and the dropout keep probability
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)

# Build the network: three tanh hidden layers, each followed by dropout
W1 = tf.Variable(tf.truncated_normal([784, 2000], stddev=0.1))
b1 = tf.Variable(tf.zeros([2000]) + 0.1)
L1 = tf.nn.tanh(tf.matmul(x, W1) + b1)
L1_drop = tf.nn.dropout(L1, keep_prob)

W2 = tf.Variable(tf.truncated_normal([2000, 2000], stddev=0.1))
b2 = tf.Variable(tf.zeros([2000]) + 0.1)
L2 = tf.nn.tanh(tf.matmul(L1_drop, W2) + b2)
L2_drop = tf.nn.dropout(L2, keep_prob)

W3 = tf.Variable(tf.truncated_normal([2000, 1000], stddev=0.1))
b3 = tf.Variable(tf.zeros([1000]) + 0.1)
L3 = tf.nn.tanh(tf.matmul(L2_drop, W3) + b3)
L3_drop = tf.nn.dropout(L3, keep_prob)

W4 = tf.Variable(tf.truncated_normal([1000, 10], stddev=0.1))
b4 = tf.Variable(tf.zeros([10]) + 0.1)
prediction = tf.nn.softmax(tf.matmul(L3_drop, W4) + b4)

# Quadratic cost (disabled)
# loss = tf.reduce_mean(tf.square(y-prediction))
# Cross-entropy cost.
# Note: `prediction` is already a softmax output, so softmax is applied twice here.
# The code is kept in this form because the logged results were produced with it.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=prediction))
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

# Initialize variables
init = tf.global_variables_initializer()

# Store the per-example results in a boolean tensor;
# argmax returns the index of the largest value along the given axis
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(31):
        for batch in range(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # keep_prob 0.7 during training: each neuron is kept with probability 0.7
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys, keep_prob: 0.7})

        # Evaluate with dropout disabled (keep_prob 1.0)
        test_acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0})
        train_acc = sess.run(accuracy, feed_dict={x: mnist.train.images, y: mnist.train.labels, keep_prob: 1.0})
        print("Iter " + str(epoch) + ",Testing Accuracy " + str(test_acc) + ",Training Accuracy " + str(train_acc))
```

```
Iter 0,Testing Accuracy 0.9178,Training Accuracy 0.9146909
Iter 1,Testing Accuracy 0.929,Training Accuracy 0.92932725
Iter 2,Testing Accuracy 0.9384,Training Accuracy 0.9358909
Iter 3,Testing Accuracy 0.9395,Training Accuracy 0.94014543
Iter 4,Testing Accuracy 0.9426,Training Accuracy 0.94463634
Iter 5,Testing Accuracy 0.9453,Training Accuracy 0.94796365
Iter 6,Testing Accuracy 0.9481,Training Accuracy 0.9521818
Iter 7,Testing Accuracy 0.9495,Training Accuracy 0.95376366
Iter 8,Testing Accuracy 0.9517,Training Accuracy 0.9557091
Iter 9,Testing Accuracy 0.9534,Training Accuracy 0.95854545
Iter 10,Testing Accuracy 0.9555,Training Accuracy 0.9603091
Iter 11,Testing Accuracy 0.9584,Training Accuracy 0.9615091
Iter 12,Testing Accuracy 0.958,Training Accuracy 0.9626909
Iter 13,Testing Accuracy 0.9583,Training Accuracy 0.9638364
Iter 14,Testing Accuracy 0.9609,Training Accuracy 0.96590906
Iter 15,Testing Accuracy 0.9612,Training Accuracy 0.96763635
Iter 16,Testing Accuracy 0.9629,Training Acc
```