# Advanced RNN - 2
- Objective: try several neural network architectures that combine convolutional and recurrent layers

## Load Dataset
- CIFAR-10 dataset
- source: https://www.cs.toronto.edu/~kriz/cifar.html
<img src="https://image.slidesharecdn.com/pycon2015-150913033231-lva1-app6892/95/pycon-2015-48-638.jpg?cb=1442115225" style="width: 500px"/>

In [91]:
import numpy as np

from sklearn.metrics import accuracy_score
from keras.datasets import cifar10
from keras.utils import to_categorical

In [92]:
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

In [93]:
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

In [94]:
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

(50000, 32, 32, 3)
(10000, 32, 32, 3)
(50000, 10)
(10000, 10)
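
`to_categorical` turns the integer class labels into one-hot rows, which is why the label arrays above have a second dimension of 10. A minimal NumPy sketch of the same idea, using a small made-up label array (`one_hot` and `labels` are illustrative names, not part of the notebook):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels).reshape(-1)   # accept shape (n,) or (n, 1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

labels = np.array([[3], [0], [9]])            # same (n, 1) layout as the raw CIFAR-10 labels
encoded = one_hot(labels, num_classes=10)
print(encoded.shape)            # (3, 10)
print(encoded.argmax(axis=1))   # [3 0 9]
```

Taking `argmax` along the class axis recovers the original integer labels.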


## 1. CNN-RNN
- Apply convolution and pooling first, then feed the resulting feature maps to a recurrent layer
- Similar to the structure used in image captioning

<img src="https://cdn-images-1.medium.com/max/1600/1*vzFwXFJOrg6WRGNsYYT6qg.png" style="width: 600px"/>

In [95]:
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, LSTM, Input, Activation, Reshape, concatenate
from keras import optimizers

In [96]:
model = Sequential()

In [97]:
model.add(Conv2D(input_shape = (X_train.shape[1], X_train.shape[2], X_train.shape[3]), filters = 50, kernel_size = (3,3), strides = (1,1), padding = 'same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))

In [98]:
print(model.output_shape)

(None, 16, 16, 50)


In [99]:
model.add(Reshape(target_shape = (16*16, 50)))
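
The `Reshape` layer is what turns an image-like feature map into a sequence the LSTM can read. A rough NumPy sketch of the same operation on one sample, assuming row-major ordering (the array contents are made up):

```python
import numpy as np

# Stand-in for one pooled feature map: 16 x 16 positions, 50 channels.
feature_map = np.arange(16 * 16 * 50, dtype="float32").reshape(16, 16, 50)

# Row-major reshape: the 256 spatial positions become time steps,
# and the 50 channels become the features seen at each step.
sequence = feature_map.reshape(16 * 16, 50)
print(sequence.shape)  # (256, 50)

# Time step t corresponds to spatial position (t // 16, t % 16).
t = 37
same = np.array_equal(sequence[t], feature_map[t // 16, t % 16])
print(same)  # True
```

So the LSTM scans the feature map row by row, one spatial position per time step.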

In [100]:
model.add(LSTM(50, return_sequences = False))

In [101]:
model.add(Dense(10))
model.add(Activation('softmax'))

In [102]:
adam = optimizers.Adam(lr = 0.001)
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])

In [103]:
model.summary()  # summary() prints the table itself; wrapping it in print() also prints "None"

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_8 (Conv2D)            (None, 32, 32, 50)        1400      
_________________________________________________________________
activation_18 (Activation)   (None, 32, 32, 50)        0         
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 16, 16, 50)        0         
_________________________________________________________________
reshape_6 (Reshape)          (None, 256, 50)           0         
_________________________________________________________________
lstm_13 (LSTM)               (None, 50)                20200     
_________________________________________________________________
dense_18 (Dense)             (None, 10)                510       
_________________________________________________________________
activation_19 (Activation)   (None, 10)                0         
Total para
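
The per-layer parameter counts in the summary can be checked by hand. A small sketch, assuming the usual formulas for convolutional, LSTM, and dense layers with biases (the helper names are illustrative):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel of size kernel_h x kernel_w x in_channels plus one bias per filter.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

conv = conv2d_params(3, 3, 3, 50)   # 3x3 kernel over 3 input channels, 50 filters
lstm = lstm_params(50, 50)          # 50 features per step, 50 hidden units
dense = dense_params(50, 10)        # 50 LSTM outputs to 10 classes
print(conv, lstm, dense)            # 1400 20200 510
```

These match the `Param #` column for the `Conv2D`, `LSTM`, and `Dense` rows; the activation, pooling, and reshape layers have no trainable parameters.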

In [105]:
%%time
history = model.fit(X_train, y_train, epochs = 100, batch_size = 100, verbose = 0)

Wall time: 3h 54min 45s


In [106]:
results = model.evaluate(X_test, y_test)



In [107]:
print('Test Accuracy: ', results[1])

Test Accuracy:  0.5927
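
`model.evaluate` returns the loss followed by each compiled metric, so `results[1]` is the accuracy. The same quantity can be computed from one-hot labels and predicted probabilities by comparing argmax indices. A sketch with made-up arrays, where `probs` stands in for what `model.predict(X_test)` would return:

```python
import numpy as np

# Made-up one-hot labels and predicted class probabilities: 4 samples, 3 classes.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],   # predicted class 1, true class 2
                  [0.2, 0.5, 0.3]])

true_classes = y_true.argmax(axis=1)
pred_classes = probs.argmax(axis=1)
accuracy = float((true_classes == pred_classes).mean())
print(accuracy)  # 0.75
```

`accuracy_score(true_classes, pred_classes)` from `sklearn.metrics`, which the notebook imports, should give the same value on these class indices.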


## 2. CNN-RNN-2
- Run the convolution-pooling branch and the recurrent branch independently on the same input, then concatenate their outputs
- Similar to the structure used in visual question answering

<img src="https://camo.githubusercontent.com/828817c970da406d2d83dc9a5c03fb120231e2a2/687474703a2f2f692e696d6775722e636f6d2f56627149525a7a2e706e67" style="width: 800px"/>

In [34]:
input_layer = Input(shape = (X_train.shape[1], X_train.shape[2], X_train.shape[3]))
conv_layer = Conv2D(filters = 50, kernel_size = (3,3), strides = (1,1), padding = 'same')(input_layer)
activation_layer = Activation('relu')(conv_layer)
pooling_layer = MaxPooling2D(pool_size = (2,2), padding = 'same')(activation_layer)
flatten = Flatten()(pooling_layer)
dense_layer_1 = Dense(100)(flatten)

In [35]:
reshape = Reshape(target_shape = (X_train.shape[1]*X_train.shape[2], X_train.shape[3]))(input_layer)
lstm_layer = LSTM(50, return_sequences = False)(reshape)
dense_layer_2 = Dense(100)(lstm_layer)

In [36]:
merged_layer = concatenate([dense_layer_1, dense_layer_2])
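
`concatenate` joins the two 100-unit vectors along the feature axis, so the merged layer has 200 units; it does not add them element-wise. A NumPy sketch of the difference, with made-up arrays standing in for the two branch outputs:

```python
import numpy as np

batch = 4
cnn_features = np.ones((batch, 100), dtype="float32")        # stand-in for dense_layer_1
rnn_features = np.full((batch, 100), 2.0, dtype="float32")   # stand-in for dense_layer_2

# Concatenation keeps both feature sets side by side.
merged = np.concatenate([cnn_features, rnn_features], axis=-1)
# Element-wise addition would instead mix them into 100 values.
summed = cnn_features + rnn_features

print(merged.shape)  # (4, 200)
print(summed.shape)  # (4, 100)
```

With concatenation, the final `Dense` layer learns separate weights for each branch's features.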

In [37]:
output_layer = Dense(10, activation = 'softmax')(merged_layer)

In [42]:
model = Model(inputs = input_layer, outputs = output_layer)

In [43]:
adam = optimizers.Adam(lr = 0.001)
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])

In [44]:
model.summary()  # summary() prints the table itself; wrapping it in print() also prints "None"

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_4 (InputLayer)             (None, 32, 32, 3)     0                                            
____________________________________________________________________________________________________
conv2d_6 (Conv2D)                (None, 32, 32, 50)    1400        input_4[0][0]                    
____________________________________________________________________________________________________
activation_8 (Activation)        (None, 32, 32, 50)    0           conv2d_6[0][0]                   
____________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)   (None, 16, 16, 50)    0           activation_8[0][0]               
___________________________________________________________________________________________

In [None]:
%%time
history = model.fit(X_train, y_train, epochs = 10, batch_size = 100, verbose = 0)

In [46]:
results = model.evaluate(X_test, y_test)



In [47]:
print(results)  # [test loss, test accuracy]

[14.50628567199707, 0.10000000000000001]
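
An accuracy of 0.10 on ten balanced classes is what uniform guessing would give, so the model in this run performs no better than chance. The notebook does not diagnose the cause. One common thing to try, offered here as an assumption rather than a confirmed fix, is scaling the raw pixel values from [0, 255] to [0, 1] before training, since the LSTM branch reads the raw pixels directly. A minimal sketch on made-up data (`scale_pixels` is an illustrative helper, not part of the notebook):

```python
import numpy as np

def scale_pixels(images):
    """Convert uint8 images in [0, 255] to float32 in [0, 1]."""
    return images.astype("float32") / 255.0

# Made-up batch with the same layout as CIFAR-10 images: (n, 32, 32, 3), uint8.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

scaled = scale_pixels(fake_images)
print(scaled.shape, scaled.dtype)
```

In the notebook this would be applied as `X_train = scale_pixels(X_train)` and `X_test = scale_pixels(X_test)` before calling `model.fit`.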
