<a href="https://colab.research.google.com/github/morbosohex/Workflow/blob/master/intro_to_cnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The optimizer is specified when you compile the model (in Step 7 of the notebook). 

'sgd' : SGD

'rmsprop' : RMSprop

'adagrad' : Adagrad

'adadelta' : Adadelta

'adam' : Adam

'adamax' : Adamax

'nadam' : Nadam

'tfoptimizer' : TFOptimi

# Convolutional Layers in Keras

In [0]:
from keras.layers import Conv2D

Using TensorFlow backend.


### Arguments

- `filters` - the number of filters
- `kernel_size` - the height and width of the convolution window
- `strides` - the stride of the convolution; if you don't specify anything, `strides` is set to 1
- `padding` - one of `'valid'` or `'same'`; if you don't specify anything, `padding` is set to `'valid'`
- `activation` - typically `'relu'`; if you don't specify anything, no activation is applied. You are strongly encouraged to add a ReLU activation function to every convolutional layer in your networks.

NOTE: It is possible to represent both kernel_size and strides as either a number or a tuple.


When using your convolutional layer as the first layer (appearing after the input layer) in a model, you must provide an additional input_shape argument:

- `input_shape` - a tuple specifying the height, width, and depth (in that order) of the input.

NOTE: Do not include the input_shape argument if the convolutional layer is not the first layer in your network.

## Example #1

Say I'm constructing a CNN, and my input layer accepts grayscale images that are 200 by 200 pixels (corresponding to a 3D array with height 200, width 200, and depth 1). Then, say I'd like the next layer to be a convolutional layer with 16 filters, each with a width and height of 2. When performing the convolution, I'd like the filter to jump two pixels at a time, that is, to use a stride of 2. I also don't want the filter to extend outside of the image boundaries; in other words, I don't want to pad the image with zeros. Then, to construct this convolutional layer, I would use the following line of code:

In [0]:
Conv2D(filters=16, 
       kernel_size=2,
       strides=2,
       padding='valid',
       activation='relu',
       input_shape=(200,200,1))


<keras.layers.convolutional.Conv2D at 0x7fe918b7a4a8>

## Example #2

Say I'd like the next layer in my CNN to be a convolutional layer that takes the layer constructed in Example 1 as input. Say I'd like my new layer to have 32 filters, each with a height and width of 3. When performing the convolution, I'd like the filter to jump 1 pixel at a time, that is, to use a stride of 1. I want the convolutional layer to see all regions of the previous layer, and so I don't mind if the filter hangs over the edge of the previous layer when it's performing the convolution; in other words, the input is padded with zeros. Then, to construct this convolutional layer, I would use the following line of code:

In [0]:
Conv2D(filters=32,
      kernel_size=3,
      strides=1,
      padding='same',
      activation='relu')

<keras.layers.convolutional.Conv2D at 0x7fe918b7a908>

## Example #3

If you look up code online, it is also common to see convolutional layers in Keras in this format:

In [0]:
Conv2D(64, (2,2), activation='relu')

<keras.layers.convolutional.Conv2D at 0x7fe918b7a438>

In this case, there are 64 filters, each with a size of 2x2, and the layer has a ReLU activation function. The other arguments in the layer use the default values, so the convolution uses a stride of 1, and the padding has been set to 'valid'.

# Dimensionality

As with any other neural network, we create a CNN in Keras by first creating a `Sequential` model and then adding layers with the `.add()` method.

The shape of a convolutional layer is the value listed under Output Shape in the output of `model.summary()`. In the summary below, `None` corresponds to the batch size, and the convolutional layer has a height of 100, a width of 100, and a depth of 16.

In [0]:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(filters=16,
                 kernel_size=2,
                 strides=2, 
                 padding='valid',
                 activation='relu',
                 input_shape=(200,200,1)))
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_12 (Conv2D)           (None, 100, 100, 16)      80        
Total params: 80
Trainable params: 80
Non-trainable params: 0
_________________________________________________________________


# Formula: Number of Parameters in a Convolutional Layer

The number of parameters in a convolutional layer depends on the supplied values of filters, kernel_size, and input_shape. Let's define a few variables:

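- `K` - the number of filters in the convolutional layer
- `F` - the height and width of the convolutional filters
- `D_in` - the depth of the previous layer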

Notice that `K = filters`, and `F = kernel_size`. Likewise, `D_in` is the last value in the input_shape tuple.

Each filter has `F*F*D_in` weights.

With `K` filters, the layer has `K*F*F*D_in` weights in total.

Since there is one bias term per filter, the convolutional layer has `K` biases. Thus, the _number of parameters_ in the convolutional layer is given by `K*F*F*D_in + K`.
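The formula can be checked with a few lines of plain Python. This is a minimal sketch that assumes square filters; the helper name `conv2d_param_count` is made up for illustration and is not part of Keras.

```python
def conv2d_param_count(filters, kernel_size, input_depth):
    """Return K*F*F*D_in + K for a convolutional layer with square filters."""
    weights = filters * kernel_size * kernel_size * input_depth  # K*F*F*D_in
    biases = filters                                             # one bias per filter
    return weights + biases


# Layer from Example #1: 16 filters of size 2x2 on a grayscale input (depth 1).
print(conv2d_param_count(filters=16, kernel_size=2, input_depth=1))  # 80
```

The result, 80, agrees with the Param # column printed by `model.summary()` in the Dimensionality section.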

# Formula: Shape of a Convolutional Layer

The shape of a convolutional layer depends on the supplied values of `kernel_size`, `input_shape`, `padding`, and `strides`. Let's define a few variables:

- `K` - the number of filters in the convolutional layer
- `F` - the height and width of the convolutional filters
- `S` - the stride of the convolution
- `H_in` - the height of the previous layer
- `W_in` - the width of the previous layer

Notice that `K = filters`, `F = kernel_size`, and `S = strides`. Likewise, `H_in` and `W_in` are the first and second value of the input_shape tuple, respectively.

The **depth** of the convolutional layer will always equal the number of filters `K`.


If `padding = 'same'`, then the spatial dimensions of the convolutional layer are the following:

- height = ceil(float(`H_in`) / float(`S`))
- width = ceil(float(`W_in`) / float(`S`))

If `padding = 'valid'`, then the spatial dimensions of the convolutional layer are the following:

- height = ceil(float(`H_in` - `F` + 1) / float(`S`))
- width = ceil(float(`W_in` - `F` + 1) / float(`S`))
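As a rough sketch, the two cases can be written as one small Python function. The name `conv2d_output_shape` is hypothetical, and the function assumes a square filter and the same stride in both directions.

```python
import math


def conv2d_output_shape(h_in, w_in, filters, kernel_size, strides=1, padding="valid"):
    """Return (height, width, depth) of a convolutional layer's output."""
    if padding == "same":
        height = math.ceil(h_in / strides)
        width = math.ceil(w_in / strides)
    elif padding == "valid":
        height = math.ceil((h_in - kernel_size + 1) / strides)
        width = math.ceil((w_in - kernel_size + 1) / strides)
    else:
        raise ValueError("padding must be 'valid' or 'same'")
    # The depth always equals the number of filters K.
    return (height, width, filters)


# Layer from Example #1: 200x200 input, 16 filters, 2x2 kernel, stride 2, no padding.
print(conv2d_output_shape(200, 200, 16, 2, strides=2, padding="valid"))  # (100, 100, 16)
```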



# Quiz

In [0]:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=3, strides=2, padding='same', 
    activation='relu', input_shape=(128, 128, 3)))
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_13 (Conv2D)           (None, 64, 64, 32)        896       
Total params: 896
Trainable params: 896
Non-trainable params: 0
_________________________________________________________________


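#### How many parameters does the convolutional layer have?

- `K = 32`
- `F = 3`
- `D_in = 3`
- number of weights = `K*F*F*D_in = 32*3*3*3 = 864`
- number of biases = `K = 32`
- total = `864 + 32 = 896`, which matches the Param # column in the summary above

#### What is the depth of the convolutional layer?

The depth equals the number of filters: `K = 32`.

#### What is the width of the convolutional layer?

With `padding='same'`, `S = 2`, and `W_in = 128`:

width = ceil(float(`W_in`) / float(`S`)) = ceil(128 / 2) = 64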

# Pooling Layer

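A pooling layer takes as input the stack of feature maps produced by the different filters of a convolutional layer. A convolutional layer with many filters produces many feature maps, so the dimensionality can become very large; the role of the pooling layer is to reduce it.

There are two types:

- Max pooling layer

It takes a stack of feature maps as input.
As with convolution, you choose a window size and a stride, and the window slides horizontally and vertically across each feature map.

The value of each node in the max pooling layer is the maximum pixel value contained in the **window**.

The output is a stack with the same number of feature maps, but each feature map has a smaller width and height.

- Global average pooling layer

For this type, you specify neither a window size nor a stride.

It computes the mean of all nodes in each feature map. The result is a vector with one entry per feature map.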
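To make the two pooling types concrete, here is a minimal NumPy sketch. It is not how Keras implements pooling; the function names are made up, and the input is assumed to be a single stack of feature maps with shape (height, width, depth) and no padding.

```python
import numpy as np


def max_pool_2d(feature_maps, pool_size=2, stride=2):
    """Max pooling over an array of shape (height, width, depth), no padding."""
    h, w, d = feature_maps.shape
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    pooled = np.empty((out_h, out_w, d), dtype=feature_maps.dtype)
    for i in range(out_h):
        for j in range(out_w):
            rows = slice(i * stride, i * stride + pool_size)
            cols = slice(j * stride, j * stride + pool_size)
            # Keep the largest value in the window, separately for each feature map.
            pooled[i, j, :] = feature_maps[rows, cols, :].max(axis=(0, 1))
    return pooled


def global_average_pool(feature_maps):
    """Average each feature map down to one number: (H, W, D) -> (D,)."""
    return feature_maps.mean(axis=(0, 1))


maps = np.arange(32, dtype=float).reshape(4, 4, 2)  # two 4x4 feature maps
print(max_pool_2d(maps).shape)          # (2, 2, 2)
print(global_average_pool(maps).shape)  # (2,)
```

Max pooling keeps the number of feature maps and shrinks their width and height, while global average pooling collapses each map to a single value.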



## Max Pooling Layers in Keras

In [0]:
# Import the class used to create max pooling layers
from keras.layers import MaxPooling2D

```MaxPooling2D(pool_size,strides,padding)```

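Arguments

- `pool_size` - the height and width of the pooling window
- `strides` - the stride of the window; if you don't specify anything, `strides` defaults to `pool_size`
- `padding` - one of `'valid'` or `'same'`; if you don't specify anything, no padding is applied (`'valid'`)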


### Example

Say I'm constructing a CNN, and I'd like to reduce the dimensionality of a convolutional layer by following it with a max pooling layer. Say the convolutional layer has size (100, 100, 15), and I'd like the max pooling layer to have size (50, 50, 15). I can do this by using a 2x2 window in my max pooling layer, with a stride of 2, which could be constructed in the following line of code:



In [0]:
MaxPooling2D(pool_size=2, strides=2)


<keras.layers.pooling.MaxPooling2D at 0x7f314e57b860>

In [0]:
 MaxPooling2D(pool_size=2, strides=1)

<keras.layers.pooling.MaxPooling2D at 0x7f314e57b9e8>
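The two layers above differ only in their stride, which changes the output size considerably. The sketch below computes the output height or width for the default `'valid'` padding; `pool_output_size` is a hypothetical helper, not a Keras function.

```python
def pool_output_size(input_size, pool_size, stride=None):
    """Output height or width of a max pooling layer with 'valid' padding."""
    if stride is None:
        stride = pool_size  # Keras uses pool_size when strides is not given
    return (input_size - pool_size) // stride + 1


print(pool_output_size(100, pool_size=2, stride=2))  # 50
print(pool_output_size(100, pool_size=2, stride=1))  # 99
```

With a stride of 2 a (100, 100, 15) input shrinks to (50, 50, 15), whereas a stride of 1 only reduces it to (99, 99, 15).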

## Checking the Dimensions of the Max Pooling Layer

In [0]:
from keras.layers import MaxPooling2D
from keras.models import Sequential

model = Sequential()
model.add(MaxPooling2D(pool_size=2,strides=2,input_shape=(100,100,15)))
model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
max_pooling2d_3 (MaxPooling2 (None, 50, 50, 15)        0         
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________


# CNN for Image Classification

In [0]:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(filters=16,kernel_size=2,padding='same',activation='relu',
                 input_shape=(32,32,3)))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=32,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=64,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Flatten())
model.add(Dense(500,activation='relu'))
model.add(Dense(10,activation='softmax'))


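The network begins with a sequence of three convolutional layers, each followed by a max pooling layer. These first six layers take the input array of image pixels and convert it into an array from which the spatial information has been squeezed out, so that only information encoding the content of the image remains. The array is then flattened into a vector in the seventh layer of the CNN. It is followed by two dense layers that further elucidate the content of the image. The final layer has one entry for each object class in the dataset and uses a softmax activation function, so it returns probabilities.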
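To see how the spatial information is squeezed out, the shapes can be traced by hand. The sketch below uses plain Python rather than Keras and relies on two facts about the model above: a `Conv2D` layer with `padding='same'` and stride 1 keeps the height and width, and `MaxPooling2D(pool_size=2)` halves them.

```python
shape = (32, 32, 3)  # input_shape of the first Conv2D layer
trace = [("input", shape)]

for filters in (16, 32, 64):
    # Conv2D with padding='same' and stride 1: same height/width, depth = filters.
    shape = (shape[0], shape[1], filters)
    trace.append(("conv", shape))
    # MaxPooling2D(pool_size=2): height and width are halved, depth is unchanged.
    shape = (shape[0] // 2, shape[1] // 2, shape[2])
    trace.append(("pool", shape))

flattened = shape[0] * shape[1] * shape[2]  # length of the vector after Flatten

for name, layer_shape in trace:
    print(name, layer_shape)
print("flatten", flattened)  # flatten 1024
```

The array goes from 32x32x3 to 4x4x64: the depth grows while the width and height shrink, and the flattened vector has 1024 entries.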

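### Things to Remember
- Always add a ReLU activation function to the `Conv2D` layers in your CNN. With the exception of the final layer in the network, `Dense` layers should also have a ReLU activation function.

- When constructing a network for classification, the final layer in the network should be a `Dense` layer with a softmax activation function. The number of nodes in the final layer should equal the total number of classes in the dataset.

Convolutional layers generally increase the depth.

Pooling layers generally reduce the width and height.

After several rounds of convolution and pooling, the result is an array with a large depth and small spatial dimensions. Each of its feature maps represents a different learned feature, for example whether the image contains wheels, eyes, or legs. The fully connected layers then use these features to predict which class the image belongs to.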

# Image Augmentation in Keras
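For classification, we only care whether a given object is present in the image.

The network should therefore ignore irrelevant information:

The size, angle, and position of the object are irrelevant. Augmenting the training set with randomly scaled, rotated, and shifted copies of the images helps the network learn this invariance.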
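As an illustration of the idea, the sketch below produces a randomly flipped and shifted copy of an image with plain NumPy. It is a stand-in for the concept, not the Keras augmentation API; the function name `augment` and the `max_shift` parameter are assumptions, and `np.roll` wraps pixels around the border instead of padding.

```python
import numpy as np


def augment(image, rng, max_shift=4):
    """Return a randomly flipped and shifted copy of an (H, W, C) image."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal flip
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Shift vertically by dy and horizontally by dx (pixels wrap around).
    return np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # random stand-in for a real training image
augmented = augment(image, rng)
print(augmented.shape)  # (32, 32, 3)
```

The augmented copy contains the same content at a different position, so the label stays the same while the pixel array changes.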



# Transfer learning

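Transfer learning involves taking a pre-trained neural network and adapting it to a new, different dataset.

Depending on both:

- the size of the new dataset, and
- the similarity of the new dataset to the original dataset

the approach to transfer learning will be different. There are four main cases:

- the new dataset is small, and the new data is similar to the original training data
- the new dataset is small, and the new data is different from the original training data
- the new dataset is large, and the new data is similar to the original training data
- the new dataset is large, and the new data is different from the original training data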

![image.png](https://upload-images.jianshu.io/upload_images/12735209-81c4505578431b4d.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

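A large dataset might have one million images. A small dataset might have two thousand images. The dividing line between a large dataset and a small dataset is somewhat subjective. Overfitting is a concern when using transfer learning with a small dataset.

Images of dogs and images of wolves would be considered similar; the images share common characteristics. A dataset of flower images would be different from a dataset of dog images.

Each of the four transfer learning cases has its own approach. In the following sections, we will look at each case one by one.

## Demonstration Network
To explain how each situation works, we will start with a generic pre-trained convolutional neural network and explain how to adjust the network for each case. Our example network contains three convolutional layers and three fully connected layers: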

![image.png](https://upload-images.jianshu.io/upload_images/12735209-527314469c1c4536.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

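Here is a generalized overview of what the convolutional neural network does:

- the first layer will detect edges in the image
- the second layer will detect shapes
- the third convolutional layer detects higher-level features

Each transfer learning case will use the pre-trained convolutional neural network in a different way.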

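## Case 1: Small Dataset, Similar Data

![Case 1: small dataset with similar data](https://upload-images.jianshu.io/upload_images/12735209-d8e548e3b1ff5c93.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

If the new dataset is small and similar to the original training data:

- slice off the end of the neural network
- add a new fully connected layer that matches the number of classes in the new dataset
- randomize the weights of the new fully connected layer; freeze all the weights from the pre-trained network
- train the network to update the weights of the new fully connected layer

To avoid overfitting on the small dataset, the weights of the original network are held constant rather than re-trained.

Since the datasets are similar, images from each dataset will have similar higher-level features. Therefore, most or all of the pre-trained neural network layers already contain relevant information about the new dataset and should be kept.

Here's how to visualize this approach: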

![image.png](https://upload-images.jianshu.io/upload_images/12735209-350a3d3d4f327829.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
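A minimal Keras sketch of the Case 1 recipe follows. The network built by `build_pretrained_stand_in` is a hypothetical stand-in with random weights that only mirrors the three-convolutional, three-dense structure of the demonstration network; in practice the weights would come from a real pre-trained model. The layer names and `num_classes = 5` are assumptions made for illustration.

```python
import keras
from keras import layers


def build_pretrained_stand_in():
    """Hypothetical stand-in for a pre-trained net: 3 conv + 3 dense layers."""
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(8, 3, activation="relu", name="conv1")(inputs)
    x = layers.Conv2D(16, 3, activation="relu", name="conv2")(x)
    x = layers.Conv2D(32, 3, activation="relu", name="conv3")(x)
    x = layers.Flatten(name="flatten")(x)
    x = layers.Dense(64, activation="relu", name="fc1")(x)
    x = layers.Dense(64, activation="relu", name="fc2")(x)
    outputs = layers.Dense(1000, activation="softmax", name="fc3")(x)
    return keras.Model(inputs, outputs, name="pretrained")


pretrained = build_pretrained_stand_in()

# Freeze all of the pre-trained weights.
for layer in pretrained.layers:
    layer.trainable = False

# Slice off the old output layer and attach a new, randomly initialized one.
num_classes = 5  # assumed number of classes in the new dataset
features = pretrained.get_layer("fc2").output
new_outputs = layers.Dense(num_classes, activation="softmax", name="new_fc")(features)
model = keras.Model(pretrained.input, new_outputs)

model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
print(len(model.trainable_weights))  # 2: kernel and bias of the new layer
```

Only the new layer's kernel and bias are trainable, so training on the small dataset cannot disturb the frozen features.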


## Case 2: Small Dataset, Different Data

![image.png](https://upload-images.jianshu.io/upload_images/12735209-7775bd3a0d223a25.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

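If the new dataset is small and different from the original training data:

- slice off all but the first few pre-trained layers, keeping only the layers near the beginning of the network
- add to the remaining pre-trained layers a new fully connected layer that matches the number of classes in the new dataset
- randomize the weights of the new fully connected layer; freeze all the weights from the pre-trained network
- train the network to update the weights of the new fully connected layer

Because the dataset is small, overfitting is still a concern. To combat overfitting, the weights of the original neural network are held constant, as in the first case.

But the original training set and the new dataset do not share higher-level features. In this case, the new network will use only the layers containing lower-level features.

Here's how to visualize this approach: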

![image.png](https://upload-images.jianshu.io/upload_images/12735209-9090d6bf51b75ed9.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
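The Case 2 recipe might look like this in Keras. As before, `build_pretrained_stand_in` builds a hypothetical randomly initialized stand-in for a pre-trained network. Cutting after `conv1` and inserting a `GlobalAveragePooling2D` layer before the new classifier are illustrative choices, not something the recipe prescribes.

```python
import keras
from keras import layers


def build_pretrained_stand_in():
    """Hypothetical stand-in for a pre-trained net: 3 conv + 3 dense layers."""
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(8, 3, activation="relu", name="conv1")(inputs)
    x = layers.Conv2D(16, 3, activation="relu", name="conv2")(x)
    x = layers.Conv2D(32, 3, activation="relu", name="conv3")(x)
    x = layers.Flatten(name="flatten")(x)
    x = layers.Dense(64, activation="relu", name="fc1")(x)
    x = layers.Dense(64, activation="relu", name="fc2")(x)
    outputs = layers.Dense(1000, activation="softmax", name="fc3")(x)
    return keras.Model(inputs, outputs, name="pretrained")


pretrained = build_pretrained_stand_in()
for layer in pretrained.layers:
    layer.trainable = False  # keep the pre-trained weights fixed

# Keep only the low-level feature detector and discard everything after it.
low_level = pretrained.get_layer("conv1").output
x = layers.GlobalAveragePooling2D(name="gap")(low_level)  # assumed way to get a vector
num_classes = 5  # assumed number of classes in the new dataset
new_outputs = layers.Dense(num_classes, activation="softmax", name="new_fc")(x)
model = keras.Model(pretrained.input, new_outputs)

model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
print([layer.name for layer in model.layers][1:])  # ['conv1', 'gap', 'new_fc']
```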

## Case 3: Large Dataset, Similar Data

![image.png](https://upload-images.jianshu.io/upload_images/12735209-cfab7b4676fded90.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

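If the new dataset is large and similar to the original training data:

- remove the last fully connected layer and replace it with a layer matching the number of classes in the new dataset
- randomly initialize the weights in the new fully connected layer
- initialize the rest of the weights using the pre-trained weights
- re-train the entire neural network

Overfitting is not as much of a concern when training on a large dataset; therefore, you can re-train all of the weights.

Because the original training set and the new dataset share higher-level features, the entire neural network is used as well.

Here's how to visualize this approach: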

![image.png](https://upload-images.jianshu.io/upload_images/12735209-3a5eb409b409538b.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
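For Case 3 nothing is frozen: the new output layer starts from random weights, every other layer starts from the pre-trained weights, and the whole network is trained. The sketch below again uses a hypothetical randomly initialized stand-in (`build_pretrained_stand_in`) in place of a real pre-trained model, with an assumed `num_classes = 5`.

```python
import keras
from keras import layers


def build_pretrained_stand_in():
    """Hypothetical stand-in for a pre-trained net: 3 conv + 3 dense layers."""
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(8, 3, activation="relu", name="conv1")(inputs)
    x = layers.Conv2D(16, 3, activation="relu", name="conv2")(x)
    x = layers.Conv2D(32, 3, activation="relu", name="conv3")(x)
    x = layers.Flatten(name="flatten")(x)
    x = layers.Dense(64, activation="relu", name="fc1")(x)
    x = layers.Dense(64, activation="relu", name="fc2")(x)
    outputs = layers.Dense(1000, activation="softmax", name="fc3")(x)
    return keras.Model(inputs, outputs, name="pretrained")


pretrained = build_pretrained_stand_in()

# Replace only the last fully connected layer; all other layers keep their weights.
num_classes = 5  # assumed number of classes in the new dataset
features = pretrained.get_layer("fc2").output
new_outputs = layers.Dense(num_classes, activation="softmax", name="new_fc")(features)
model = keras.Model(pretrained.input, new_outputs)

# Every layer stays trainable, so fitting the model re-trains the entire network.
model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
print(len(model.trainable_weights))  # 12: kernel and bias for each of the 6 layers
```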

## Case 4: Large Dataset, Different Data

![image.png](https://upload-images.jianshu.io/upload_images/12735209-569c178fad5bcbbd.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

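If the new dataset is large and different from the original training data:

- remove the last fully connected layer and replace it with a layer matching the number of classes in the new dataset
- re-train the network from scratch with randomly initialized weights
- alternatively, you could use the same strategy as in the "large and similar" data case


Even though the dataset is different from the training data, initializing the weights from the pre-trained network might make training faster. So this case can be handled exactly like the case with a large, similar dataset.

If using the pre-trained network as a starting point does not produce a successful model, another option is to randomly initialize the convolutional neural network weights and train the network from scratch.

Here's how to visualize this approach: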

![image.png](https://upload-images.jianshu.io/upload_images/12735209-407d8ebc3a1de52a.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

# Transfer Learning in Keras

## 1. Load Dog Dataset




In [0]:
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/valid')
test_files, test_targets = load_dataset('dogImages/test')

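# load ordered list of dog names; assumes folders are named like '001.Affenpinscher',
# so the breed name starts after 'dogImages/train/' (16 characters) plus 'NNN.' (4 more)
dog_names = [item[20:-1] for item in sorted(glob('dogImages/train/*/'))]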

# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % str(len(train_files) + len(valid_files) + len(test_files)))
print('There are %d training dog images.' % len(train_files))
print('There are %d validation dog images.' % len(valid_files))
print('There are %d test dog images.'% len(test_files))
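The call to `np_utils.to_categorical` above turns each integer label into a one-hot row of length 133. A rough NumPy equivalent is sketched below for illustration; `one_hot` is a hypothetical helper, not the Keras implementation.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows of length num_classes."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0  # set one entry per row
    return encoded


targets = one_hot(np.array([0, 2, 132]), 133)
print(targets.shape)  # (3, 133)
```

Each row contains a single 1 at the index of the dog breed, which is the format `categorical_crossentropy` expects.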

## 2. Visualize the First 12 Training Images

In [0]:
import cv2
import matplotlib.pyplot as plt
%matplotlib inline

def visualize_img(img_path, ax):
    img = cv2.imread(img_path)
    ax.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    
fig = plt.figure(figsize=(20, 10))
for i in range(12):
    ax = fig.add_subplot(3, 4, i + 1, xticks=[], yticks=[])
    visualize_img(train_files[i], ax)

## 3. Obtain the VGG-16 Bottleneck Features

In [0]:
bottleneck_features = np.load('bottleneck_features/DogVGG16Data.npz')
train_vgg16 = bottleneck_features['train']
valid_vgg16 = bottleneck_features['valid']
test_vgg16 = bottleneck_features['test']

## 4. Define a Model Architecture (Model 1)
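How is the input shape `(7, 7, 512)` of the new fully connected layer obtained?

Answer: inspect the output of the pre-trained model at the point where it is truncated and check its shape, as follows.

---
#### 4.1 Import the pre-trained model (full model)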
```
from keras.applications.vgg16 import VGG16
model = VGG16()
model.summary()
```

####  4.2 Check the output shape of the unmodified pre-trained model
```
model.predict(img_input).shape
```

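For this network, `model.predict` returns a 1000-dimensional probability vector containing the predicted probability that the image belongs to each of the 1000 ImageNet categories. The output obtained by passing `img_input` through the model has shape `(8, 1000)`. The first value, 8, merely indicates that 8 images were passed through the network.

#### 4.3 Import the pre-trained model without its fully connected layers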
```
from keras.applications.vgg16 import VGG16
model = VGG16(include_top=False)
model.summary()
```

#### 4.4 Get the output shape of the truncated model
```
print(model.predict(img_input).shape)
```
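Now the network stored in `model` is a truncated version of the VGG-16 network in which the final three fully connected layers have been removed. In this case, `model.predict` returns a 3D array (with dimensions 7×7×512) corresponding to the final max pooling layer of VGG-16. The output obtained by passing `img_input` through the model has shape `(8, 7, 7, 512)`. The first value, 8, merely indicates that 8 images were passed through the network.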
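The 7×7 spatial size can also be derived by hand. The sketch below assumes the images in `img_input` are 224×224 pixels, the standard VGG-16 input size (the notes do not state the size), and uses the fact that VGG-16 contains five 2×2 max pooling layers with stride 2, while its convolutions preserve height and width.

```python
input_size = 224     # assumed height and width of the images in img_input
num_pool_layers = 5  # VGG-16 has five max pooling layers, each halving H and W

size = input_size
for _ in range(num_pool_layers):
    size //= 2       # 224 -> 112 -> 56 -> 28 -> 14 -> 7

final_depth = 512    # number of filters in VGG-16's last convolutional block
print((size, size, final_depth))  # (7, 7, 512)
```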

---


In [0]:
from keras.layers import Dense, Flatten
from keras.models import Sequential

model = Sequential()
model.add(Flatten(input_shape=(7, 7, 512)))
model.add(Dense(133, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', 
                  metrics=['accuracy'])
model.summary()

## 5. Define another Model Architecture (Model 2)

In [0]:
from keras.layers import GlobalAveragePooling2D

model = Sequential()
model.add(GlobalAveragePooling2D(input_shape=(7, 7, 512)))
model.add(Dense(133, activation='softmax'))
model.summary()
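One practical difference between Model 1 and Model 2 is the number of trainable parameters. The counts below follow from the standard dense-layer formula (inputs × outputs + outputs) and can be compared with the Param # column printed by `model.summary()`.

```python
height, width, depth = 7, 7, 512  # shape of the VGG-16 bottleneck features
num_classes = 133                 # number of dog breeds

# Model 1: Flatten produces 7*7*512 = 25088 inputs for the Dense layer.
model1_params = height * width * depth * num_classes + num_classes

# Model 2: GlobalAveragePooling2D produces one value per feature map (512 inputs).
model2_params = depth * num_classes + num_classes

print(model1_params)  # 3336837
print(model2_params)  # 68229
```

Global average pooling has no parameters of its own and cuts the classifier to roughly one forty-ninth of the size of the flattened version, which lowers the risk of overfitting on a small dataset.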

## 6. Compile the Model (Model 2)


In [0]:
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', 
                  metrics=['accuracy'])

## 7. Train the Model (Model 2)

In [0]:
from keras.callbacks import ModelCheckpoint   

# train the model
checkpointer = ModelCheckpoint(filepath='dogvgg16.weights.best.hdf5', verbose=1, 
                               save_best_only=True)
model.fit(train_vgg16, train_targets, epochs=20, validation_data=(valid_vgg16, valid_targets), 
          callbacks=[checkpointer], verbose=1, shuffle=True)

## 8. Load the Model with the Best Validation Accuracy (Model 2)

In [0]:
# load the weights that yielded the best validation accuracy
model.load_weights('dogvgg16.weights.best.hdf5')

## 9. Calculate Classification Accuracy on Test Set (Model 2)

In [0]:
# get index of predicted dog breed for each image in test set
vgg16_predictions = [np.argmax(model.predict(np.expand_dims(feature, axis=0))) 
                     for feature in test_vgg16]

# report test accuracy
test_accuracy = 100*np.sum(np.array(vgg16_predictions)==
                           np.argmax(test_targets, axis=1))/len(vgg16_predictions)
print('\nTest accuracy: %.4f%%' % test_accuracy)
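The accuracy computation can also be written in vectorized form, for example on the result of a single `model.predict` call over the whole test array. The sketch below uses small made-up arrays in place of real predictions; `accuracy_percent` is a hypothetical helper.

```python
import numpy as np


def accuracy_percent(probabilities, one_hot_targets):
    """Percentage of rows whose most probable class matches the one-hot target."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(one_hot_targets, axis=1)
    return 100.0 * np.mean(predicted == actual)


# Made-up predictions for four samples and two classes.
probs = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
targets = np.array([[0, 1], [1, 0], [1, 0], [0, 1]])
print(accuracy_percent(probs, targets))  # 50.0
```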