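Recurrent neural networks (RNNs) are mainly used to extract temporal and semantic information from data and to build deep representations of it. They are widely applied in speech recognition, language modeling, machine translation, and time-series analysis.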

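Take text-sequence prediction as an example. Predicting the next word of a sentence generally requires the current word and the words before it, because the words in a sentence are not independent of one another. If the current word is "is" and the previous word is "sky", the next word is very likely "blue". By learning from large amounts of sequence data, an RNN remembers earlier information and uses it to infer later outputs.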

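For this reason, the hidden-layer nodes in an RNN are connected to one another: the current hidden state is determined jointly by the current input and the hidden-layer output (state) from the previous time step.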
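
The recurrence described above can be sketched in plain NumPy. This is only an illustration, not how Keras implements `SimpleRNN` internally: the function name `rnn_forward`, the weight names `W_x`, `W_h`, and `b`, the tanh activation, and the random weights are all assumptions made for the example.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a simple RNN over one sequence and return every hidden state.

    inputs: array of shape (time_steps, input_size)
    W_x:    input-to-hidden weights, shape (input_size, hidden_size)
    W_h:    hidden-to-hidden weights, shape (hidden_size, hidden_size)
    b:      hidden bias, shape (hidden_size,)
    """
    h = np.zeros(W_h.shape[0])  # initial hidden state
    states = []
    for x_t in inputs:
        # The new state depends on the current input and the previous state.
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
seq = rng.normal(size=(28, 28))             # 28 time steps of 28 features
W_x = rng.normal(scale=0.1, size=(28, 50))  # hypothetical weights
W_h = rng.normal(scale=0.1, size=(50, 50))
b = np.zeros(50)

hidden_states = rnn_forward(seq, W_x, W_h, b)
print(hidden_states.shape)  # (28, 50)
```

For classification, only the last row of `hidden_states` would typically be passed on to the following layers.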

### Imports

In [1]:
import numpy as np
# np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import SimpleRNN, Activation, Dense
from keras.optimizers import Adam

Using TensorFlow backend.


### Load the dataset

In [2]:
# X_train shape (60000, 28, 28), y_train shape (60000,); X_test shape (10000, 28, 28), y_test shape (10000,)
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

(60000, 28, 28) (60000,)
(10000, 28, 28) (10000,)


### Preprocess the data

In [3]:
# data pre-processing
X_train = X_train.reshape(-1, 28, 28) / 255. # normalize
X_test = X_test.reshape(-1, 28, 28) / 255. # normalize
y_train = np_utils.to_categorical(y_train, num_classes=10)
y_test = np_utils.to_categorical(y_test, num_classes=10)
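
As a rough picture of what the two preprocessing steps do, the sketch below scales pixel values into [0, 1] and builds one-hot label vectors by hand. The helper `one_hot` is a hypothetical stand-in written for this example, not the Keras `to_categorical` implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1 at each label index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

pixels = np.array([[0, 128, 255]], dtype="uint8")
scaled = pixels / 255.0                       # values now lie in [0, 1]
targets = one_hot([3, 0, 9], num_classes=10)  # one row per label

print(scaled.min(), scaled.max())
print(targets.shape)  # (3, 10)
```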

### Build the model
`Recurrent` is an abstract base class; do not use it directly in a model. Use one of its subclasses instead: LSTM, GRU, or SimpleRNN.

In [16]:
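'''
To use an RNN, we treat each image as sequential data.
Each row is one input unit, so the input size is INPUT_SIZE = 28.
Row 1 is fed in first, then row 2, row 3, row 4, ..., up to row 28.
One image is therefore one sequence, so the number of time steps is TIME_STEPS = 28.
'''
TIME_STEPS = 28 # number of time steps to read; one row per step takes 28 reads, i.e. the image height
INPUT_SIZE = 28 # same as the width of the image
BATCH_SIZE = 50 # batch size
BATCH_INDEX = 0 # start index of the current batch
OUTPUT_SIZE = 10 # number of classes
CELL_SIZE = 50 # output dimension of the hidden (recurrent) layer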

# build RNN model
model = Sequential()
# RNN cell
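model.add(SimpleRNN( # fully connected RNN whose output is fed back into the input
    input_shape=(TIME_STEPS, INPUT_SIZE), # Or:input_dim=INPUT_SIZE, input_length=TIME_STEPS,
    units=CELL_SIZE, # output dimension
    unroll=True,
#     unroll defaults to False. If True, the recurrent layer is unrolled; otherwise a symbolic loop is used.
#     Older Keras documentation notes that with the TensorFlow backend the network is always unrolled, so this argument does nothing there.
#     Unrolling uses more memory but can speed up the RNN. It is only suitable for short sequences.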
))
# output layer
model.add(Dense(30))
model.add(Activation('sigmoid'))
model.add(Dense(OUTPUT_SIZE))
model.add(Activation('softmax'))

In [10]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_2 (SimpleRNN)     (None, 50)                3950      
_________________________________________________________________
dense_3 (Dense)              (None, 30)                1530      
_________________________________________________________________
activation_3 (Activation)    (None, 30)                0         
_________________________________________________________________
dense_4 (Dense)              (None, 10)                310       
_________________________________________________________________
activation_4 (Activation)    (None, 10)                0         
Total params: 5,790
Trainable params: 5,790
Non-trainable params: 0
_________________________________________________________________
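
The parameter counts in the summary above can be checked by hand. The sketch below assumes the usual layout of a simple recurrent layer (input-to-hidden weights, hidden-to-hidden weights, one bias vector) and of a dense layer (weight matrix plus bias). The helper names are made up for this example.

```python
def simple_rnn_params(input_size, units):
    # input-to-hidden weights + hidden-to-hidden weights + bias
    return input_size * units + units * units + units

def dense_params(inputs, units):
    # weight matrix + bias
    return inputs * units + units

rnn = simple_rnn_params(28, 50)
hidden = dense_params(50, 30)
output = dense_params(30, 10)
print(rnn, hidden, output, rnn + hidden + output)  # 3950 1530 310 5790
```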


### Compile the model

In [11]:
LR = 0.001
# optimizer
adam = Adam(LR)
model.compile(optimizer=adam,loss='categorical_crossentropy',metrics=['accuracy'])
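
To make the `categorical_crossentropy` loss and the `accuracy` metric more concrete, here is a small NumPy sketch of both for one-hot targets. It is a simplified illustration with made-up numbers, not the Keras implementation; the `eps` clipping value is an assumption.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]])

loss = categorical_crossentropy(y_true, y_pred)
# Accuracy: fraction of samples whose most probable class matches the target.
accuracy = float(np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1)))
print(loss, accuracy)
```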

### Train the model

In [15]:
# batch training
for step in range(4001):
    # data shape = (batch_num, steps, inputs/outputs)
    X_batch = X_train[BATCH_INDEX:BATCH_INDEX+BATCH_SIZE,:,:]
    Y_batch = y_train[BATCH_INDEX:BATCH_INDEX+BATCH_SIZE,:]
    cost = model.train_on_batch(X_batch, Y_batch)
    BATCH_INDEX += BATCH_SIZE
    BATCH_INDEX = 0 if BATCH_INDEX >= X_train.shape[0] else BATCH_INDEX
    if step % 500 == 0: # every 500 steps, print the test-set loss and accuracy
        cost, accuracy = model.evaluate(X_test, y_test, batch_size=y_test.shape[0], verbose=False)
        print('test cost: ', cost, 'test accuracy: ', accuracy)
        
# model.fit(X_train, y_train, epochs=2, batch_size=BATCH_SIZE)    
# cost, accuracy = model.evaluate(X_test, y_test)
# print('test cost: ', cost, ' test accuracy: ', accuracy)

print('Over!')

test cost:  0.228961184621 test accuracy:  0.937900006771
test cost:  0.173115715384 test accuracy:  0.95300000906
test cost:  0.187467262149 test accuracy:  0.947600007057
test cost:  0.197130635381 test accuracy:  0.948099970818
test cost:  0.186225309968 test accuracy:  0.95039999485
test cost:  0.177072882652 test accuracy:  0.952499985695
test cost:  0.189168646932 test accuracy:  0.949100017548
test cost:  0.175225347281 test accuracy:  0.95450001955
test cost:  0.190782010555 test accuracy:  0.948300004005
Over!
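
The way `BATCH_INDEX` advances and wraps around in the loop above can be isolated in a few lines of plain Python. The generator `batch_starts` is a hypothetical helper written for this sketch; it assumes the 60,000 training samples and batch size of 50 used in this notebook.

```python
def batch_starts(num_samples, batch_size, steps):
    """Yield the start index of each mini-batch, wrapping to 0 at the end of the data."""
    index = 0
    for _ in range(steps):
        yield index
        index += batch_size
        if index >= num_samples:
            index = 0

starts = list(batch_starts(num_samples=60000, batch_size=50, steps=4001))
print(starts[:3], starts[1199], starts[1200])  # [0, 50, 100] 59950 0
print(4001 * 50 / 60000)  # roughly 3.33 passes over the training set
```

So one pass over the training set takes 1,200 steps, and the 4,001 steps of the loop cover the data a little more than three times.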


In [3]:
from keras.models import load_model
model = load_model('cifar10.h5')

In [4]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_3 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
activation_5 (Activation)    (None, 32, 32, 32)        0         
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 16, 16, 32)        0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 16, 16, 64)        18496     
_________________________________________________________________
activation_6 (Activation)    (None, 16, 16, 64)        0         
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 8, 8, 64)          0         
__________

In [11]:
from keras.datasets import cifar10
from keras.utils import np_utils
(x_train,y_train), (x_test, y_test) = cifar10.load_data()
x_test = x_test.astype('float32')/255
y_test = np_utils.to_categorical(y_test, 10).reshape(y_test.shape[0],-1)
a, b = model.evaluate(x_test, y_test)



In [12]:
print(a,b)

1.12066130123 0.6284
