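## Convolutions in TensorFlow

Applying a CNN to MNIST

(This version does not use tf.layers. Defining the layers by hand is more verbose, but it is easier to understand.)

The network uses two convolutional layers, each followed by a ReLU and a max-pooling layer, and ends with two fully connected layers. The convolutional layers use strides of [1, 1, 1, 1].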

<img src="./picture/mnist_convnet.png">

In [1]:
""" Using convolutional net on MNIST dataset of handwritten digits
MNIST dataset: http://yann.lecun.com/exdb/mnist/
CS 20: "TensorFlow for Deep Learning Research"
cs20.stanford.edu
Chip Huyen (chiphuyen@cs.stanford.edu)
Lecture 07
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import time 

import tensorflow as tf
import numpy as np
import utils

### Convolutional layer

The convolution and the ReLU are usually defined together, so we first define a general-purpose convolutional layer function.

#### The convolution op in TensorFlow:  
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', dilations=[1, 1, 1, 1], name=None)  
Parameters:  
Input: Batch size (N) * Height (H) * Width (W) * Channels (C)  
Filter: Height * Width * Input Channels * Output Channels (e.g. [5, 5, 3, 64])  
Strides: 4-element 1-D tensor, the stride in each dimension (often [1, 1, 1, 1] or [1, 2, 2, 1])  
Padding: A string, either "SAME" or "VALID"  
Dilations (dilation factors): Defaults to [1, 1, 1, 1]. 1-D tensor of length 4. The dilation factor for each dimension of input. If set to k > 1, there will be k-1 skipped cells between each filter element on that dimension. Dilations in the batch and depth dimensions must be 1.  
Data_format: defaults to NHWC  
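
To make the parameter list concrete, the following is a minimal sketch of the shape bookkeeping only, not of the convolution itself. The helper name `conv2d_output_shape` is hypothetical (it is not part of TensorFlow), and it assumes the NHWC layout and the "SAME"/"VALID" output-size rules described in the TensorFlow documentation.

```python
import math

def conv2d_output_shape(input_shape, filter_shape, strides, padding,
                        dilations=(1, 1, 1, 1)):
    """Shape bookkeeping for an NHWC input and an [H, W, in, out] filter.

    Hypothetical helper for illustration; it only computes shapes.
    """
    n, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in, f_out = filter_shape
    if in_c != f_in:
        raise ValueError("input channels must match the filter's input channels")
    # A dilation factor d spreads the filter taps out, so the filter
    # covers (f - 1) * d + 1 input cells along that dimension.
    eff_h = (f_h - 1) * dilations[1] + 1
    eff_w = (f_w - 1) * dilations[2] + 1
    if padding == "SAME":
        out_h = math.ceil(in_h / strides[1])
        out_w = math.ceil(in_w / strides[2])
    elif padding == "VALID":
        out_h = math.ceil((in_h - eff_h + 1) / strides[1])
        out_w = math.ceil((in_w - eff_w + 1) / strides[2])
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")
    return (n, out_h, out_w, f_out)

# A batch of 128 MNIST images through a 5x5 filter bank with 32 outputs.
print(conv2d_output_shape((128, 28, 28, 1), (5, 5, 1, 32), (1, 1, 1, 1), "SAME"))   # (128, 28, 28, 32)
print(conv2d_output_shape((128, 28, 28, 1), (5, 5, 1, 32), (1, 1, 1, 1), "VALID"))  # (128, 24, 24, 32)
```

With "SAME" padding and stride 1 the spatial size is preserved, while "VALID" shrinks it by the filter size minus one.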


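tf.get_variable():  
Returns the existing variable with that name if one has already been created in the current variable scope (the scope below is opened with reuse=tf.AUTO_REUSE); otherwise it creates a new one.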
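
The get-or-create behaviour can be sketched in plain Python. This is only an illustration of the idea under the assumption that variables are looked up by "scope/name"; the `VariableStore` class is hypothetical and is not how TensorFlow implements it internally.

```python
class VariableStore:
    """Hypothetical stand-in for a variable store keyed by 'scope/name'."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, shape):
        key = scope + "/" + name
        if key in self._vars:
            existing = self._vars[key]
            if existing["shape"] != list(shape):
                raise ValueError("shape mismatch for " + key)
            return existing            # already created: hand back the same object
        created = {"name": key, "shape": list(shape)}
        self._vars[key] = created      # first request: create it and remember it
        return created

store = VariableStore()
k1 = store.get_variable("conv1", "kernel", [5, 5, 1, 32])
k2 = store.get_variable("conv1", "kernel", [5, 5, 1, 32])
k3 = store.get_variable("conv2", "kernel", [5, 5, 32, 64])
print(k1 is k2)  # True: same scope and name, so the variable is shared
print(k1 is k3)  # False: a different scope gives a different variable
```

This is why building the same layer twice under the same scope name shares its weights instead of creating a second copy.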

In [2]:
def conv_relu(inputs, filters, k_size, stride, padding, scope_name):
    with tf.variable_scope(scope_name, reuse=tf.AUTO_REUSE) as scope:
        in_channels = inputs.shape[-1]
        kernel = tf.get_variable('kernel', [k_size, k_size, in_channels, filters],
                                initializer=tf.random_normal_initializer())
        biases = tf.get_variable('biases', [filters], initializer=tf.random_normal_initializer())
        conv = tf.nn.conv2d(inputs, kernel, strides=[1, stride, stride, 1], padding=padding)
    return tf.nn.relu(conv + biases, name=scope.name)

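It is important to keep track of the output size at every layer. Along one spatial dimension the output size is:

$$(W - F + 2P) / S + 1$$

where:  
the input volume size (W)  
the receptive field size of filter (F)  
the stride with which they are applied (S)  
the amount of zero padding used (P) on the border  

Example: for a $7 \times 7$ input, a $3 \times 3$ filter, stride 1 and padding 0, the output is $5 \times 5$, since $(7 - 3 + 2 \times 0) / 1 + 1 = 5$.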
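
The formula is easy to check with a few lines of Python. The function name `conv_output_size` is an illustrative choice, and the second call assumes a padding of 2 on each side, which is what "SAME" padding amounts to for a 5x5 filter at stride 1.

```python
def conv_output_size(w, f, s, p):
    """Output size along one spatial dimension: (W - F + 2P) / S + 1."""
    span = w - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("the filter does not tile the padded input evenly")
    return span // s + 1

print(conv_output_size(w=7, f=3, s=1, p=0))   # 5
print(conv_output_size(w=28, f=5, s=1, p=2))  # 28
```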

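### Pooling Layer

Pooling is a downsampling step: it reduces the spatial dimensions while preserving the most salient features as far as possible. The most common pooling method is max pooling.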
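
As a small illustration of what max pooling computes, here is a pure-Python sketch on a single 2-D grid with no padding. The name `max_pool_2d` is hypothetical, and batch and channel dimensions are left out for brevity.

```python
def max_pool_2d(image, ksize, stride):
    """Max pooling over a 2-D list of numbers with no padding ('VALID')."""
    height, width = len(image), len(image[0])
    out_h = (height - ksize) // stride + 1
    out_w = (width - ksize) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the ksize x ksize window whose top-left corner is
            # at (i * stride, j * stride) and keep only its maximum.
            window = [image[i * stride + di][j * stride + dj]
                      for di in range(ksize) for dj in range(ksize)]
            row.append(max(window))
        pooled.append(row)
    return pooled

image = [[1, 3, 2, 4],
         [5, 6, 1, 2],
         [7, 2, 9, 0],
         [1, 8, 3, 4]]
print(max_pool_2d(image, ksize=2, stride=2))  # [[6, 4], [8, 9]]
```

A 2x2 window with stride 2 halves each spatial dimension and keeps the strongest response in every window.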

TensorFlow provides a dedicated pooling op, tf.nn.max_pool. As before, we first wrap it in a pooling layer function.

tf.nn.max_pool(
    value,
    ksize,
    strides,
    padding,
    data_format='NHWC',
    name=None
)  
Args:  
value: A 4-D Tensor of the format specified by data_format.   
ksize: A 1-D int Tensor of 4 elements. The size of the window for each dimension of the input tensor.  
strides: A 1-D int Tensor of 4 elements. The stride of the sliding window for each dimension of the input tensor.  
padding: A string, either 'VALID' or 'SAME'. The padding algorithm.  
data_format: A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.  
name: Optional name for the operation.  
Returns:  
A Tensor of format specified by data_format. The max pooled output tensor.   


In [3]:
def maxpool(inputs, ksize, stride, padding='VALID', scope_name='pool'):
    with tf.variable_scope(scope_name, reuse=tf.AUTO_REUSE) as scope:
        pool = tf.nn.max_pool(inputs,
                             ksize=[1, ksize, ksize, 1],
                             strides=[1, stride, stride, 1],
                             padding=padding)
    return pool

As before, we keep track of the output size, which is given by:

$$(W - K + 2P) / S + 1$$

where:  
the input volume size (W)  
the pooling size (K)  
the stride with which they are applied (S)  
the amount of zero padding used (P) on the border  
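
Applying this formula layer by layer gives the size of the flattened feature vector that feeds the first fully connected layer. The sketch below uses the layer settings of this notebook's model (5x5 convolutions at stride 1 with 'SAME' padding, 2x2 pooling at stride 2 with 'VALID' padding, 64 filters in the second convolution) and assumes that 'SAME' padding for a 5x5 filter at stride 1 corresponds to P = 2. The helper name `output_size` is illustrative.

```python
def output_size(w, k, s, p):
    """(W - K + 2P) / S + 1 for one spatial dimension."""
    return (w - k + 2 * p) // s + 1

size = 28                                 # MNIST images are 28 x 28
size = output_size(size, k=5, s=1, p=2)   # conv1, 5x5, 'SAME'  -> 28
size = output_size(size, k=2, s=2, p=0)   # pool1, 2x2, 'VALID' -> 14
size = output_size(size, k=5, s=1, p=2)   # conv2, 5x5, 'SAME'  -> 14
size = output_size(size, k=2, s=2, p=0)   # pool2, 2x2, 'VALID' -> 7
feature_dim = size * size * 64            # conv2 produces 64 channels
print(size, feature_dim)  # 7 3136
```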

### Fully connected layer
This is just the standard fully connected layer:  

fc = tf.matmul(pool2, w) + b
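
Written out without any library, the same computation is a matrix product followed by a bias add. This is a minimal sketch on nested lists; the name `dense_forward` and the toy numbers are assumptions made for illustration.

```python
def dense_forward(inputs, weights, biases):
    """inputs: [batch, in_dim], weights: [in_dim, out_dim], biases: [out_dim]."""
    outputs = []
    for row in inputs:
        out_row = []
        for j, b in enumerate(biases):
            # Dot product of one input row with column j of the weights.
            total = sum(x * weights[i][j] for i, x in enumerate(row))
            out_row.append(total + b)
        outputs.append(out_row)
    return outputs

x = [[1.0, 2.0, 3.0]]            # one example with 3 features
w = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]                 # 3 inputs -> 2 outputs
b = [0.5, -0.5]
print(dense_forward(x, w, b))    # [[4.5, 4.5]]
```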

In [4]:
def fully_connected(inputs, out_dim, scope_name='fc'):
    with tf.variable_scope(scope_name, reuse=tf.AUTO_REUSE) as scope:
        in_dim = inputs.shape[-1]
        w = tf.get_variable('weights', [in_dim, out_dim], 
                           initializer=tf.truncated_normal_initializer())
        b = tf.get_variable('biases', [out_dim],
                           initializer=tf.constant_initializer(0.0))
        out = tf.matmul(inputs, w) + b
    return out

### Dropout

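Each value in the tensor is kept with probability keep_prob and all other values are set to 0. Kept values are scaled up by 1 / keep_prob so that the expected sum is unchanged.  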
tf.nn.dropout(  
    x,  
    keep_prob,  
    noise_shape=None,  
    seed=None,  
    name=None  
)  
Args:  
x: A floating point tensor.  
keep_prob: A scalar Tensor with the same type as x. The probability that each element is kept.  
noise_shape: A 1-D Tensor of type int32, representing the shape for randomly generated keep/drop flags.  
seed: A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.  
name: A name for this operation (optional).  
Returns:  
A Tensor of the same shape of x.  
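
The behaviour described above can be sketched in plain Python as follows. This is an illustration under the assumption that survivors are scaled by 1 / keep_prob, as the TensorFlow documentation for `tf.nn.dropout` with `keep_prob` states; the function here is a hypothetical stand-in and works on a flat list rather than a tensor.

```python
import random

def dropout(values, keep_prob, seed=None):
    """Keep each value with probability keep_prob; scale survivors by 1 / keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = random.Random(seed)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

activations = [1.0] * 10000
dropped = dropout(activations, keep_prob=0.75, seed=0)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_value = sum(dropped) / len(dropped)
# Roughly 75% of the values survive, and the mean stays close to 1.0
# because each survivor is scaled up by 1 / 0.75.
print(round(kept_fraction, 2), round(mean_value, 2))
```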

## Putting the model together

In [5]:
class ConvNet(object):
    #Step 1 : set hyperparameters (build() calls this method explicitly)
    def _init_(self):
        self.lr = 0.001
        self.batch_size = 128
        self.keep_prob = tf.constant(0.75)
        self.gstep = tf.Variable(0, dtype=tf.int32,
                                trainable=False, name='global_step')
        self.n_classes = 10
        self.skip_step = 20
        self.n_test = 10000
        self.training = True
    
    #Step 2 : input data
    def get_data(self):
        with tf.name_scope('data'):
            #from tensorflow.examples.tutorials.mnist import input_data
            #mnist = input_data.read_data_sets('/MNIST_DATA', one_hot = True)
            train_data, test_data = utils.get_mnist_dataset(self.batch_size)
            
            iterator = tf.data.Iterator.from_structure(train_data.output_types, 
                                                      train_data.output_shapes)
            img, self.label = iterator.get_next()
             # reshape the image to make it work with tf.nn.conv2d
            self.img = tf.reshape(img, shape=[-1, 28, 28, 1])
            
             # initializer for train_data
            self.train_init = iterator.make_initializer(train_data) 
             # initializer for test_data
            self.test_init = iterator.make_initializer(test_data) 
            
    #Step 3 : build the model
    def inference(self):
        conv1 = conv_relu(inputs=self.img, 
                         filters=32, 
                         k_size = 5, 
                         stride=1, 
                         padding='SAME', 
                         scope_name='conv1')
        pool1 = maxpool(inputs=conv1,
                        ksize=2, 
                        stride=2, 
                        padding='VALID', 
                        scope_name='pool')
        conv2 = conv_relu(inputs=pool1,
                        filters=64,
                        k_size=5,
                        stride=1,
                        padding='SAME',
                        scope_name='conv2')
        pool2 = maxpool(conv2, 2, 2, 'VALID', 'pool2')
        #flatten pool2 so that the fully connected layer can consume it
        feature_dim = pool2.shape[1] * pool2.shape[2] * pool2.shape[3]
        pool2 = tf.reshape(pool2, [-1, feature_dim])
        
        fc = fully_connected(inputs=pool2, 
                            out_dim=1024, 
                            scope_name='fc')
        #apply dropout once to reduce overfitting
        dropout = tf.nn.dropout(x=tf.nn.relu(fc),
                                keep_prob=self.keep_prob,
                                name='relu_dropout')
        #second fully connected layer
        self.logits = fully_connected(dropout, self.n_classes, 'logits')
        
    #Step 4 : define the loss function
    #use softmax cross entropy with logits as the loss function
    #compute mean cross entropy, softmax is applied internally
    def loss(self):
        with tf.name_scope('loss'):
            entropy = tf.nn.softmax_cross_entropy_with_logits(labels=self.label, 
                                                             logits=self.logits)
            self.loss = tf.reduce_mean(entropy, name='loss')
            
    #Step 5 : define the optimizer
    #use the Adam optimizer to minimize the loss
    def optimize(self):
        self.opt = tf.train.AdamOptimizer(self.lr).minimize(self.loss, global_step=self.gstep)
        
            
    #Step 6 : compute accuracy
    #count the number of correct predictions in a batch
    def eval(self):
        with tf.name_scope('predict'):
            preds = tf.nn.softmax(self.logits)
            correct_preds = tf.equal(tf.argmax(preds, 1), tf.argmax(self.label, 1))
            self.accuracy = tf.reduce_sum(tf.cast(correct_preds, tf.float32))
            
    #Step 7 : visualize with tensorboard
    #Create summaries to write on TensorBoard
    def summary(self):
        with tf.name_scope('summaries'):
            tf.summary.scalar('loss', self.loss)
            tf.summary.scalar('accuracy', self.accuracy)
            tf.summary.histogram('histogram', self.loss)
            self.summary_op = tf.summary.merge_all()
            
        
    #put all the pieces together
    def build(self):
        self._init_()
        self.get_data()
        self.inference()
        self.loss()
        self.optimize()
        self.eval()
        self.summary()
 

    #Training
    #first define a single training epoch
    def train_one_epoch(self, sess, saver, init, writer, epoch, step):
        start_time = time.time()
        sess.run(init)
        self.training = True
        total_loss = 0
        n_batches = 0
        try:
            while True:
                _, l, summaries = sess.run([self.opt, self.loss, self.summary_op])
                writer.add_summary(summaries, global_step = step)
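                #print the loss every skip_step steps
                if (step + 1) % self.skip_step == 0:
                    print('Loss at step {0}: {1}'.format(step, l))
                step += 1
                #accumulate the batch loss (adding a constant 1 here would make the reported average always 1.0)
                total_loss += l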
                n_batches += 1
        except tf.errors.OutOfRangeError:
            pass
        saver.save(sess, 'checkpoints/convnet_mnist/mnist-convnet', step)
        print('Average loss at epoch {0}: {1}'.format(epoch, total_loss/n_batches))
        print('Took: {0} seconds'.format(time.time() - start_time))
        return step
    
    #evaluate accuracy once over the test set
    def eval_once(self, sess, init, writer, epoch, step):
        start_time = time.time()
        sess.run(init)
        self.training = False
        total_correct_preds = 0
        try:
            while True:
                accuracy_batch, summaries = sess.run([self.accuracy, self.summary_op])
                writer.add_summary(summaries, global_step = step)
                total_correct_preds += accuracy_batch
        except tf.errors.OutOfRangeError:
            pass
            
        print('Accuracy at epoch {0}: {1} '.format(epoch, total_correct_preds/self.n_test))
        print('Took: {0} seconds'.format(time.time() - start_time))
    
    #The train function alternates between training one epoch and evaluating
    def train(self, n_epochs):
        utils.safe_mkdir('checkpoints')
        utils.safe_mkdir('checkpoints/convnet_mnist')
        writer = tf.summary.FileWriter('./graphs/convnet', tf.get_default_graph())
        
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            saver = tf.train.Saver()
            ckpt = tf.train.get_checkpoint_state(os.path.dirname('checkpoints/convnet_mnist/checkpoint'))
            
            #if a checkpoint exists, restore the saved variables
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
                
            step = self.gstep.eval()
            
            for epoch in range(n_epochs):
                step = self.train_one_epoch(sess, saver, self.train_init, writer, epoch, step)
                self.eval_once(sess, self.test_init, writer, epoch, step)
        writer.close()
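
The `loss` and `eval` methods of `ConvNet` above rely on two small computations: softmax cross entropy on the logits, and counting how many predictions in a batch match the one-hot labels. The following is a plain-Python sketch of both for illustration. The function names and the toy logits are assumptions, and the log-sum-exp shift is a common way to keep the computation numerically stable, not a claim about TensorFlow's internals.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross entropy between a one-hot label and softmax(logits), for one example."""
    shift = max(logits)  # subtract the max before exponentiating, for stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(one_hot_label, log_probs))

def count_correct(batch_logits, batch_labels):
    """Number of examples whose largest logit sits at the labelled class."""
    def argmax(values):
        return max(range(len(values)), key=lambda i: values[i])
    return sum(argmax(l) == argmax(y) for l, y in zip(batch_logits, batch_labels))

batch_logits = [[2.0, 1.0, 0.1], [0.2, 0.3, 3.0], [1.5, 0.2, 0.1]]
batch_labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

losses = [softmax_cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels)]
print(sum(losses) / len(losses))                  # mean cross entropy over the batch
print(count_correct(batch_logits, batch_labels))  # 2
```

Summing the per-batch counts and dividing by the number of test examples gives the accuracy that `eval_once` prints.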

In [6]:
if __name__ == '__main__':
    model = ConvNet()
    model.build()
    model.train(n_epochs=30)

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

Loss at step 19: 22725.66796875
Loss at step 39: 10053.849609375
Loss at step 59: 6540.755859375
Loss at step 79: 5954.5361328125
Loss at step 99: 3885.522705078125
Loss at step 119: 3472.58837890625
Loss at step 139: 2871.009521484375
Loss at step 159: 1993.1129150390625
Loss at step 179: 2132.438232421875
Loss at step 199: 2661.772216796875
Loss at step 219: 2204.09619140625
Loss at step 239: 1850.6143798828125
Loss at step 259: 1916.8681640625
Loss at step 279: 663.19287109375
Loss at step 299: 1371.8922119140625
Loss at step 319: 917.9982299804688
Loss at step 339: 762.3031005859375
Loss at step 359: 1035.0814208984375
Loss at step 379: 758.4645385742188
Loss at step 399: 1118.3355712890625
Loss at step 419: 790.3795166015625
Average loss at epoch 0: 1.0
Took: 99.80129194259644 seconds
Accuracy a

Loss at step 3759: 110.52007293701172
Loss at step 3779: 22.517333984375
Loss at step 3799: 56.43974304199219
Loss at step 3819: 84.72364807128906
Loss at step 3839: 79.27716064453125
Loss at step 3859: 2.6389198303222656
Average loss at epoch 8: 1.0
Took: 92.86406707763672 seconds
Accuracy at epoch 8: 0.9634 
Took: 5.631995916366577 seconds
Loss at step 3879: 8.198116302490234
Loss at step 3899: 16.56704330444336
Loss at step 3919: 27.444305419921875
Loss at step 3939: 38.54176712036133
Loss at step 3959: 16.897443771362305
Loss at step 3979: 6.151445388793945
Loss at step 3999: 3.4584617614746094
Loss at step 4019: 94.47312927246094
Loss at step 4039: 26.588287353515625
Loss at step 4059: 0.0
Loss at step 4079: 28.846712112426758
Loss at step 4099: 33.18427658081055
Loss at step 4119: 26.835031509399414
Loss at step 4139: 26.467370986938477
Loss at step 4159: 8.624481201171875
Loss at step 4179: 42.34376525878906
Loss at step 4199: 10.636581420898438
Loss at step 4219: 51.45255661010

Average loss at epoch 17: 1.0
Took: 92.82866907119751 seconds
Accuracy at epoch 17: 0.9745 
Took: 5.597985029220581 seconds
Loss at step 7759: 8.407346725463867
Loss at step 7779: 0.0
Loss at step 7799: 17.826385498046875
Loss at step 7819: 45.348236083984375
Loss at step 7839: 3.0579490661621094
Loss at step 7859: 0.0
Loss at step 7879: 38.969600677490234
Loss at step 7899: 31.420196533203125
Loss at step 7919: 0.0
Loss at step 7939: 19.144752502441406
Loss at step 7959: 17.492549896240234
Loss at step 7979: 0.0
Loss at step 7999: 8.784873962402344
Loss at step 8019: 0.0
Loss at step 8039: 33.79181671142578
Loss at step 8059: 0.0
Loss at step 8079: 0.0
Loss at step 8099: 11.764053344726562
Loss at step 8119: 3.9808349609375
Loss at step 8139: 0.0
Loss at step 8159: 14.672096252441406
Average loss at epoch 18: 1.0
Took: 92.50471258163452 seconds
Accuracy at epoch 18: 0.9746 
Took: 5.606974124908447 seconds
Loss at step 8179: 46.80978775024414
Loss at step 8199: 20.297801971435547
Loss 

Average loss at epoch 27: 1.0
Took: 92.59288167953491 seconds
Accuracy at epoch 27: 0.9813 
Took: 5.600987911224365 seconds
Loss at step 12059: 0.0
Loss at step 12079: 0.0
Loss at step 12099: 0.0
Loss at step 12119: 7.424171447753906
Loss at step 12139: 1.0775260925292969
Loss at step 12159: 3.5080795288085938
Loss at step 12179: 6.86175537109375
Loss at step 12199: 27.41316032409668
Loss at step 12219: 0.0
Loss at step 12239: 1.0388069152832031
Loss at step 12259: 37.439544677734375
Loss at step 12279: 30.647640228271484
Loss at step 12299: 48.15747833251953
Loss at step 12319: 0.0
Loss at step 12339: 15.432842254638672
Loss at step 12359: 0.0
Loss at step 12379: 5.017585754394531
Loss at step 12399: 3.95086669921875
Loss at step 12419: 0.0
Loss at step 12439: 0.0
Loss at step 12459: 0.0
Average loss at epoch 28: 1.0
Took: 92.47008109092712 seconds
Accuracy at epoch 28: 0.9787 
Took: 5.592982769012451 seconds
Loss at step 12479: 11.001678466796875
Loss at step 12499: 0.0
Loss at step 