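# Mathematical Foundations of Neural Networks
A first look at the Keras library by solving the MNIST problem

The example is worked through in the following steps:
1. Load the MNIST dataset bundled with Keras
2. Define the network architecture
3. Compile the network
4. Prepare the image data
5. Prepare the labels
6. Train the network and evaluate it on the test set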

In [1]:
# Load the data

from keras.datasets import mnist

(train_images, train_labels),(test_images, test_labels) = mnist.load_data()

# (train_images, train_labels): the training set
# (test_images, test_labels): the test set

# Show the size of both datasets
print(train_images.shape)
print(len(train_labels))
print(test_images.shape)
print(len(test_labels))


(60000, 28, 28)
60000
(10000, 28, 28)
10000


In [2]:
# Build the network

from keras import models
from keras import layers

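# Choose how to build the model; the two main options are: 1. Sequential models; 2. the Functional API
network = models.Sequential()
# The name "Sequential" may at first suggest a simple linear model,
# but a Sequential model can express quite complex neural networks,
# including fully connected networks, convolutional networks (CNNs), and recurrent networks (RNNs).
# It is better understood as "stacking": a deep network is built by stacking many layers.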

network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))

network.add(layers.Dense(10, activation='softmax'))


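Today, most deep learning consists of **chaining simple layers together** to carry out a form of **progressive data distillation**.
The example contains two Dense layers. The first Dense layer is built as follows: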
```python
network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))
```
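
As a rough illustration of what such a layer computes: a Dense layer with a `relu` activation applies `relu(W · x + b)` to each input vector. The sketch below uses plain Python and tiny made-up weights. The names `dense_relu`, `weights`, and `bias` are illustrative assumptions, not Keras internals. It also counts the trainable parameters of a layer that maps 28*28 inputs to 512 units.

```python
def relu(value):
    """Rectified linear unit: negative values become zero."""
    return max(0.0, value)


def dense_relu(inputs, weights, bias):
    """Compute relu(W . x + b) for one input vector.

    weights holds one row per output unit; bias holds one value per unit.
    """
    outputs = []
    for row, b in zip(weights, bias):
        pre_activation = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(relu(pre_activation))
    return outputs


# Toy layer: 3 inputs -> 2 units, with made-up weights.
x = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.5],   # unit 0
           [-1.0, 0.0, 0.0]]   # unit 1
bias = [0.5, 0.0]
toy_output = dense_relu(x, weights, bias)

# Parameter count of the layer in the text: one weight per (input, unit)
# pair plus one bias per unit.
first_layer_params = 28 * 28 * 512 + 512
```

In the toy layer, the second unit has a negative pre-activation, so `relu` sets its output to zero.
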
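The second (and last) layer is a 10-way softmax layer: it returns an array of 10 probability scores that sum to 1, each being the probability that the image belongs to one of the 10 digit classes.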
```python
network.add(layers.Dense(10, activation='softmax'))
```
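
To make the "10 probabilities that sum to 1" concrete, here is a minimal sketch of the softmax function in plain Python. The raw scores are made up for illustration, and the helper name `softmax` is our own. Keras computes this inside the layer.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; it does not
    # change the result.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Ten made-up raw scores, one per digit class 0-9.
raw_scores = [0.1, 0.3, 2.5, 0.0, -1.2, 0.4, 0.2, 4.0, 0.1, -0.5]
probabilities = softmax(raw_scores)

# The predicted digit is the class with the highest probability.
predicted_digit = probabilities.index(max(probabilities))
```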

To make the network ready for training, we need to pick three more things as part of the **compilation** step:

- **A loss function**: this is how the network measures how good a job it is doing on its training data, and thus how it steers itself in the right direction.
- **An optimizer**: this is the mechanism through which the network updates itself based on the data it sees and its loss function.
- **Metrics to monitor during training and testing**: here we only care about accuracy (the fraction of the images that were correctly classified).
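
As an illustration of how a loss function "measures how good a job" the network is doing, the sketch below computes categorical cross-entropy for a single sample in plain Python. It uses three classes instead of ten for brevity. The function name and the clipping constant `eps` are illustrative assumptions, not the Keras implementation.

```python
import math


def categorical_crossentropy(one_hot_target, predicted_probs, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Clip so that log(0) can never occur; eps is an illustrative choice.
    return -sum(
        t * math.log(min(max(p, eps), 1.0))
        for t, p in zip(one_hot_target, predicted_probs)
    )


target = [0, 0, 1]                    # the true class is index 2
confident_right = [0.05, 0.05, 0.90]  # most probability on the true class
confident_wrong = [0.90, 0.05, 0.05]  # most probability on a wrong class

loss_right = categorical_crossentropy(target, confident_right)
loss_wrong = categorical_crossentropy(target, confident_wrong)
```

The loss is small when the network puts most of its probability on the correct class and large when it is confidently wrong, which is the signal the optimizer uses.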



In [3]:
# Compilation step
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
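
For intuition about what an optimizer such as `rmsprop` does, the following is a simplified sketch of the basic RMSprop update rule for a single scalar weight. It is an assumption-laden illustration: the function `rmsprop_step`, its hyperparameter values, and the toy loss are ours, and the Keras optimizer has further options and may differ in detail.

```python
import math


def rmsprop_step(weight, grad, avg_sq_grad,
                 learning_rate=0.001, rho=0.9, eps=1e-7):
    """One simplified RMSprop update for a single scalar weight."""
    # Keep a running average of squared gradients.
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
    # Scale the step by the root of that running average.
    weight = weight - learning_rate * grad / (math.sqrt(avg_sq_grad) + eps)
    return weight, avg_sq_grad


# Minimise the toy loss f(w) = w**2, whose gradient is 2 * w.
w, state = 5.0, 0.0
for _ in range(2000):
    w, state = rmsprop_step(w, 2.0 * w, state, learning_rate=0.01)
```

After the loop, `w` has moved from 5.0 to a value close to the minimum at 0.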

In [4]:
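# Preprocess the image data and the labels

# Prepare the image data
# Reshape the data into the shape the network expects and scale it so that all values lie in the [0, 1] interval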

train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32')/255

test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32')/255


# Prepare the labels by one-hot encoding them
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
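
The two preprocessing steps above can be sketched on toy data in plain Python. The helpers `flatten_and_scale` and `one_hot` are hypothetical stand-ins that mimic the effect of `reshape` plus division by 255 and of `to_categorical`. They are not how Keras or NumPy implement them.

```python
def flatten_and_scale(image):
    """Flatten a 2D grid of 0-255 pixel values into one list of floats in [0, 1]."""
    return [pixel / 255 for row in image for pixel in row]


def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1.0 at position `label`."""
    encoded = [0.0] * num_classes
    encoded[label] = 1.0
    return encoded


# A made-up 2x2 "image" standing in for a 28x28 MNIST digit.
tiny_image = [[0, 255],
              [51, 102]]
flat = flatten_and_scale(tiny_image)   # 4 values here, 784 for a real digit
encoded_label = one_hot(3)             # the label 3 becomes a 10-element vector
```

One-hot labels have the same length as the 10-way softmax output, which is what the `categorical_crossentropy` loss compares them against.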


In [5]:
# Train the network; in Keras this is done by calling the network's fit method

network.fit(train_images, train_labels, epochs=5, batch_size=128)

# Evaluate the model's performance on the test set

test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc', test_acc)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
test_acc 0.9807000160217285
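
The `test_acc` value is the fraction of test images whose predicted class matches the true class. A minimal sketch of that computation in plain Python, using made-up predictions over three classes, is shown below. The helper `accuracy` is illustrative and is not the Keras metric implementation.

```python
def accuracy(predicted_probs, one_hot_labels):
    """Fraction of samples whose most probable class matches the true class."""
    correct = 0
    for probs, label in zip(predicted_probs, one_hot_labels):
        predicted_class = probs.index(max(probs))
        true_class = label.index(max(label))
        if predicted_class == true_class:
            correct += 1
    return correct / len(one_hot_labels)


# Made-up predictions for four samples over three classes.
predictions = [[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1],
               [0.3, 0.3, 0.4],
               [0.6, 0.3, 0.1]]
labels = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1],
          [0, 1, 0]]
toy_accuracy = accuracy(predictions, labels)
```

Here three of the four toy samples are classified correctly, so `toy_accuracy` is 0.75.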
