<a href="https://colab.research.google.com/github/morbosohex/Workflow/blob/master/CNNs_in_TensorFlow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. Convolutional Layers



### Convolution Layers

The figure below shows an example of a convolution with a 3x3 filter and a stride of 1.

![image.png](https://upload-images.jianshu.io/upload_images/12735209-6903c153ff580bfb.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

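The convolution of each 3x3 section is calculated against the weights `[[1, 0, 1], [0, 1, 0], [1, 0, 1]]`, and a bias is then added to create the convolved feature on the right. In this case, the bias is zero.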
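
To make the arithmetic concrete, here is a minimal pure-Python sketch of the same operation: a 3x3 filter slid over a square image with a stride of 1 and no padding. The 5x5 `image` values are an assumption chosen for illustration (they may not match the figure), and `convolve2d` is a hypothetical helper, not a TensorFlow function.

```python
# Weights from the text above; the bias is zero.
weights = [[1, 0, 1],
           [0, 1, 0],
           [1, 0, 1]]
bias = 0

# Assumed 5x5 binary input image (illustrative values only).
image = [[1, 1, 1, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 0],
         [0, 1, 1, 0, 0]]


def convolve2d(image, weights, bias=0, stride=1):
    """Slide square `weights` over a square `image` without padding."""
    k = len(weights)
    out_size = (len(image) - k) // stride + 1
    feature = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Multiply the k x k patch by the weights element-wise and sum.
            total = sum(image[i * stride + di][j * stride + dj] * weights[di][dj]
                        for di in range(k) for dj in range(k))
            row.append(total + bias)
        feature.append(row)
    return feature


convolved = convolve2d(image, weights, bias)
for row in convolved:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

Each output value is the sum of the five image cells that line up with the ones in the filter, so a 5x5 input shrinks to a 3x3 feature map.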


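### Convolution Layers in TensorFlow
Let's look at how to implement a convolutional layer in TensorFlow.

TensorFlow provides the `tf.nn.conv2d()`, `tf.nn.bias_add()`, and `tf.nn.relu()` functions for building your own convolutional layers.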
```python
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.truncated_normal)

# output depth
k_output = 64

# image dimensions
image_width = 10
image_height = 10
color_channels = 3

# convolution filter dimensions
filter_size_width = 5
filter_size_height = 5

# input/image
input = tf.placeholder(
    tf.float32,
    shape=[None, image_height, image_width, color_channels])

# weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# apply convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# apply activation function
conv_layer = tf.nn.relu(conv_layer)
```

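The code above uses the `tf.nn.conv2d()` function to compute the convolution, with `weight` as the filter and `[1, 2, 2, 1]` as the strides.

- TensorFlow uses one stride for each dimension of `input`: `[batch, input_height, input_width, input_channels]`.
- We generally set the strides for `batch` and `input_channels` (the first and fourth elements of the `strides` array) to 1. This ensures that the model uses every batch and every input channel. (If some batches or channels should not be used, it is better to remove them from the data set than to skip them with a stride.)
- That leaves only the `input_height` and `input_width` strides to choose. The example code applies a `5x5` filter to `input` with a stride of 2, so with `'SAME'` padding the 10x10 input becomes a 5x5x64 output. Strides are usually square, with `height = width`. When someone says they are using a stride of 2, they usually mean `tf.nn.conv2d(x, W, strides=[1, 2, 2, 1])`.

The `tf.nn.bias_add()` function adds a 1-D bias to the last dimension of a tensor. (Note: `tf.add()` and `+` also work here through broadcasting; `tf.nn.bias_add()` is the special case that requires the bias to be 1-D.)

The `tf.nn.relu()` function applies the ReLU activation function to the layer.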
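
As a rough illustration of what these last two steps do to the values, the sketch below adds a per-channel bias along the last dimension and then applies ReLU, using plain Python lists instead of tensors. The helper names `bias_add` and `relu` and the sample values are assumptions for this example; they only mimic the behavior of the TensorFlow functions on a single `[height, width, channels]` feature map.

```python
def bias_add(feature_map, bias):
    """Add bias[c] to every value in channel c (the last dimension)."""
    return [[[value + bias[c] for c, value in enumerate(pixel)]
             for pixel in row]
            for row in feature_map]


def relu(feature_map):
    """Replace negative values with zero, element-wise."""
    return [[[max(0.0, value) for value in pixel]
             for pixel in row]
            for row in feature_map]


# A 1x2 feature map with 3 channels (illustrative values).
feature_map = [[[1.0, -2.0, 0.5],
                [-1.0, 3.0, -0.5]]]
bias = [0.5, 1.0, -1.0]  # one bias per output channel

activated = relu(bias_add(feature_map, bias))
print(activated)
# [[[1.5, 0.0, 0.0], [0.0, 4.0, 0.0]]]
```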

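### Quiz: Build a Convolutional Layer in TensorFlow

Now let's build a convolutional layer in TensorFlow. In the exercise below, you are asked to set up the shapes of the convolution filter, its weights, and its bias. This is in many ways the trickiest part of using CNNs in TensorFlow. Once you understand how to set the dimensions of these attributes, applying CNNs becomes far more straightforward.

Instructions
- Finish off each TODO in the `conv2d` function.
- Set up the strides, padding, filter weight (`F_W`), and filter bias (`F_b`) such that the output shape is (1, 2, 2, 3). Note that `F_W` and `F_b` should be TensorFlow variables.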



```python
"""
Setup the strides, padding and filter weight/bias such that
the output shape is (1, 2, 2, 3).
"""
import tensorflow as tf
import numpy as np

# `tf.nn.conv2d` requires the input be 4D (batch_size, height, width, depth)
# (1, 4, 4, 1)
x = np.array([
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3]], dtype=np.float32).reshape((1, 4, 4, 1))
X = tf.constant(x)


def conv2d(input):
    # Filter (weights and bias)
    # The shape of the filter weight is (height, width, input_depth, output_depth)
    # The shape of the filter bias is (output_depth,)
    # TODO: Define the filter weights `F_W` and filter bias `F_b`.
    # NOTE: Remember to wrap them in `tf.Variable`, they are trainable parameters after all.
    F_W = tf.Variable(tf.truncated_normal([2,
                                           2,
                                           1,
                                           3]))
    F_b = tf.Variable(tf.zeros(3))
    # TODO: Set the stride for each dimension (batch_size, height, width, depth)
    strides = [1, 2, 2, 1]
    # TODO: set the padding, either 'VALID' or 'SAME'.
    padding = 'SAME'
    # https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#conv2d
    # `tf.nn.conv2d` does not include the bias computation so we have to add it ourselves after.
    return tf.nn.conv2d(input, F_W, strides, padding) + F_b

out = conv2d(X)
print(out)
```

Output:

```
Tensor("add:0", shape=(1, 2, 2, 3), dtype=float32)
```


# 2. Max Pooling Layers

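### Max Pooling Layers in TensorFlow

The figure below shows an example of max pooling with a 2x2 filter and a stride of 2. The four colored 2x2 regions mark each application of the filter, which finds the maximum value in that region.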

![image.png](https://upload-images.jianshu.io/upload_images/12735209-56d8be13c357beb5.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

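For example, `[[1, 0], [4, 6]]` becomes `6`, because `6` is the maximum value in this set. Similarly, `[[2, 3], [6, 8]]` becomes `8`.

Conceptually, the benefit of max pooling is that it reduces the size of the input and lets the neural network focus on only the most important elements. It does this by keeping only the maximum value of each filtered region and discarding the rest.

TensorFlow provides the `tf.nn.max_pool()` function to apply max pooling to your convolutional layers.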

```python
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
# apply max pooling
conv_layer = tf.nn.max_pool(
    conv_layer,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')
```

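The `tf.nn.max_pool()` function performs max pooling, with the `ksize` parameter as the size of the filter and the `strides` parameter as the length of the stride. In practice, 2x2 filters with a stride of 2x2 are common.

The `ksize` and `strides` parameters are structured as 4-element lists, with each element corresponding to a dimension of the input tensor (`[batch, height, width, channels]`). For both `ksize` and `strides`, the batch and channel elements are typically set to 1.

### Quiz: Use Max Pooling Layers in TensorFlow

Instructions
- Finish off each TODO in the `maxpool` function.
- Set up the strides, padding, and ksize such that the output shape after pooling is (1, 2, 2, 1).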



```python
"""
Set the values to `strides` and `ksize` such that
the output shape after pooling is (1, 2, 2, 1).
"""
import tensorflow as tf
import numpy as np

# `tf.nn.max_pool` requires the input be 4D (batch_size, height, width, depth)
# (1, 4, 4, 1)
x = np.array([
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3]], dtype=np.float32).reshape((1, 4, 4, 1))
X = tf.constant(x)

def maxpool(input):
    # TODO: Set the ksize (filter size) for each dimension (batch_size, height, width, depth)
    ksize = [1, 2, 2, 1]
    # TODO: Set the stride for each dimension (batch_size, height, width, depth)
    strides = [1, 2, 2, 1]
    # TODO: set the padding, either 'VALID' or 'SAME'.
    padding = "SAME"
    # https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#max_pool
    return tf.nn.max_pool(input, ksize, strides, padding)

out = maxpool(X)
print(out)
```

Output:

```
Tensor("MaxPool:0", shape=(1, 2, 2, 1), dtype=float32)
```
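
Printing `out` only shows the symbolic tensor and its shape. As a sketch of the values that a 2x2 filter with a stride of 2 would keep for this input, the pure-Python helper below (the name `max_pool_2d` is hypothetical) takes the maximum of each non-overlapping region of the same 4x4 matrix. It assumes the input size is divisible by the stride, so no padding is involved.

```python
x = [[0, 1, 0.5, 10],
     [2, 2.5, 1, -8],
     [4, 0, 5, 6],
     [15, 1, 2, 3]]


def max_pool_2d(matrix, k=2, stride=2):
    """Return the maximum of each k x k window, moving `stride` cells at a time."""
    out_rows = (len(matrix) - k) // stride + 1
    out_cols = (len(matrix[0]) - k) // stride + 1
    return [[max(matrix[i * stride + di][j * stride + dj]
                 for di in range(k) for dj in range(k))
             for j in range(out_cols)]
            for i in range(out_rows)]


pooled = max_pool_2d(x)
print(pooled)
# [[2.5, 10], [15, 6]]
```

The 16 input values are reduced to 4, one per region, which matches the (1, 2, 2, 1) output shape once the batch and channel dimensions are added back.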


# 3. CNN in TensorFlow

![image.png](https://upload-images.jianshu.io/upload_images/12735209-6fec1b6786f533a5.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

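This network follows the classic CNN structure: a mix of convolutional layers and max pooling, followed by fully connected layers.

### Dataset
Here we import the MNIST dataset and use a convenient TensorFlow function to batch, scale, and one-hot encode the data.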

```python
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

import tensorflow as tf

# parameters
learning_rate = 0.00001
epochs = 10
batch_size = 128

# number of samples to calculate validation and accuracy
# decrease this if you're running out of memory
test_valid_size = 256

# network Parameters
n_classes = 10  # MNIST total classes (0-9 digits)
dropout = 0.75  # dropout (probability to keep units)
```
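
For reference, `one_hot=True` means that each label is stored as a vector of length `n_classes` with a single 1 at the index of the digit. The sketch below shows the idea in plain Python; the `one_hot` helper is hypothetical and is not part of the MNIST loader.

```python
def one_hot(label, n_classes=10):
    """Return a list of n_classes floats with 1.0 at index `label`."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]


encoded = one_hot(3)
print(encoded)
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Taking the index of the largest entry recovers the original digit.
decoded = encoded.index(max(encoded))
print(decoded)
# 3
```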

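### Weights and Biases
The code below builds two layers that each pair a convolution with max pooling, followed by a fully connected layer and an output layer. We begin by defining the necessary weights and biases.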

```python
# store weights & biases
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}
```
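
The shapes above determine how many trainable parameters the network has, and the `7*7*64` input size of `wd1` corresponds to the 7x7x64 output of the second pooling step. The sketch below counts the parameters from plain shape lists that mirror the dictionaries above; the names `weight_shapes`, `bias_shapes`, and `weight_params` are introduced only for this example.

```python
from math import prod  # Python 3.8+

n_classes = 10

# Shapes copied from the `weights` and `biases` dictionaries.
weight_shapes = {
    'wc1': [5, 5, 1, 32],
    'wc2': [5, 5, 32, 64],
    'wd1': [7 * 7 * 64, 1024],
    'out': [1024, n_classes]}
bias_shapes = {
    'bc1': [32],
    'bc2': [64],
    'bd1': [1024],
    'out': [n_classes]}

# The number of parameters in a tensor is the product of its dimensions.
weight_params = {name: prod(shape) for name, shape in weight_shapes.items()}
bias_params = sum(prod(shape) for shape in bias_shapes.values())
total_params = sum(weight_params.values()) + bias_params

print(weight_params)
# {'wc1': 800, 'wc2': 51200, 'wd1': 3211264, 'out': 10240}
print(total_params)
# 3274634
```

Almost all of the parameters sit in the fully connected weight `wd1`, which is typical for this kind of small CNN.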

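### Convolution Layers
Recall that TensorFlow provides the `tf.nn.conv2d()`, `tf.nn.bias_add()`, and `tf.nn.relu()` functions for building your own convolutional layers. To simplify the code, we define a helper function below.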

```python
def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    x = tf.nn.relu(x)
    return x
```

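The `conv2d` function computes the convolution against the weights `W`, then adds the bias `b`, and finally applies the ReLU activation function.

### Max Pooling Layers

TensorFlow provides the `tf.nn.max_pool()` function to apply max pooling to your convolutional layers. To simplify the code, we define a helper function below.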

```python
def maxpool2d(x, k=2):
    return tf.nn.max_pool(
        x,
        ksize=[1,k,k,1],
        strides=[1,k,k,1],
        padding="SAME")
```

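The `maxpool2d` function applies max pooling to the layer `x` using a filter of size `k`.

### Model
The transformation of each layer to new dimensions is shown in the comments. For example, the convolution step of the first layer reshapes the images from 28x28x1 to 28x28x32. The next step applies max pooling, turning each sample into 14x14x32. The layers are applied in order from `conv1` to `out`, producing predictions for 10 classes.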

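#### Tips
The formulas below give the output size for `'VALID'` padding. With `'SAME'` padding, which the helper functions above use, the output size depends only on the stride: `out_height = ceil(float(in_height) / float(strides[1]))`, and likewise for the width.
```
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
```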
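
The following sketch applies the output-size rules to the network in this section, using only the standard library. The `output_size` helper is hypothetical; it uses the formula from the tip above, which applies to `'VALID'` padding, and for `'SAME'` padding (which `conv2d` and `maxpool2d` use) the TensorFlow rule `ceil(in_size / stride)`.

```python
from math import ceil


def output_size(in_size, filter_size, stride, padding):
    """Output height or width of a convolution or pooling step."""
    if padding == 'SAME':
        return ceil(in_size / stride)
    if padding == 'VALID':
        return ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


size = 28  # MNIST images are 28x28
trace = [size]
# (filter size, stride) for conv1, pool1, conv2, pool2, all with 'SAME' padding
for filter_size, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
    size = output_size(size, filter_size, stride, 'SAME')
    trace.append(size)

print(trace)
# [28, 28, 14, 14, 7]

# For comparison, a 5x5 'VALID' convolution with stride 1 would shrink 28 to 24.
print(output_size(28, 5, 1, 'VALID'))
# 24
```

The final size of 7 is why the fully connected weight `wd1` expects `7*7*64` inputs.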

```python
def conv_net(x, weights, biases, dropout):
    # Layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)

    # Layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)

    # Fully connected layer - 7*7*64 to 1024
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output Layer - class prediction - 1024 to 10
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
```

### Session
```python
# tf Graph input
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

# Model
logits = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: dropout})

            # Calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                x: mnist.validation.images[:test_valid_size],
                y: mnist.validation.labels[:test_valid_size],
                keep_prob: 1.})

            print('Epoch {:>2}, Batch {:>3} - '
                  'Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                epoch + 1,
                batch + 1,
                loss,
                valid_acc))

    # Calculate Test Accuracy
    test_acc = sess.run(accuracy, feed_dict={
        x: mnist.test.images[:test_valid_size],
        y: mnist.test.labels[:test_valid_size],
        keep_prob: 1.})
    print('Testing Accuracy: {}'.format(test_acc))
```

