# Advanced RNN - 2
- Objective: try several neural network architectures that combine convolutional (CNN) and recurrent (RNN) layers

## Load Dataset
- CIFAR-10 dataset
- source: https://www.cs.toronto.edu/~kriz/cifar.html
<img src="https://image.slidesharecdn.com/pycon2015-150913033231-lva1-app6892/95/pycon-2015-48-638.jpg?cb=1442115225" style="width: 500px"/>

In [1]:
import numpy as np

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

In [2]:
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [3]:
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

In [4]:
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

(50000, 32, 32, 3)
(10000, 32, 32, 3)
(50000, 10)
(10000, 10)
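The label shapes above change from the raw `(N, 1)` integer layout to `(N, 10)` because `to_categorical` one-hot encodes each class index. A minimal NumPy sketch of the same transformation follows; `one_hot` is a hypothetical helper written for illustration, not part of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical NumPy stand-in for to_categorical (illustration only)."""
    labels = np.asarray(labels).reshape(-1)           # (N, 1) -> (N,)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0  # set one entry per row
    return encoded

# Toy labels in the same (N, 1) layout as the CIFAR-10 label arrays
toy_labels = np.array([[3], [0], [9]])
toy_one_hot = one_hot(toy_labels, num_classes=10)
print(toy_one_hot.shape)  # (3, 10)
```

Each row contains a single 1 at the position of its class index, which is the format `categorical_crossentropy` expects.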


## 1. CNN-RNN
- Perform the convolution and pooling operations first, then feed the resulting feature map to a recurrent layer
- Similar to the structure used in image captioning

<img src="https://cdn-images-1.medium.com/max/1600/1*vzFwXFJOrg6WRGNsYYT6qg.png" style="width: 600px"/>

In [6]:
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, LSTM, Input, Activation, Reshape, concatenate
from tensorflow.keras import optimizers

In [7]:
model = Sequential()

In [8]:
model.add(Conv2D(input_shape = (X_train.shape[1], X_train.shape[2], X_train.shape[3]), filters = 50, kernel_size = (3,3), strides = (1,1), padding = 'same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))

In [9]:
print(model.output_shape)

(None, 16, 16, 50)
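The `(None, 16, 16, 50)` shape follows from the layer settings: a stride-1 convolution with `padding = 'same'` keeps the 32x32 spatial size, and a 2x2 max pooling halves it. The sketch below reproduces that arithmetic; the helper names are hypothetical, and the pooling formula assumes the Keras default `padding = 'valid'` with a stride equal to the pool size.

```python
import math

def conv_same_output(size, stride=1):
    """Spatial size after a convolution with padding='same'."""
    return math.ceil(size / stride)

def pool_valid_output(size, pool=2, stride=None):
    """Spatial size after pooling with padding='valid' (stride defaults to pool size)."""
    stride = pool if stride is None else stride
    return (size - pool) // stride + 1

height = 32                                   # CIFAR-10 images are 32x32
after_conv = conv_same_output(height, stride=1)
after_pool = pool_valid_output(after_conv, pool=2)
print(after_conv, after_pool)  # 32 16
```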


In [10]:
model.add(Reshape(target_shape = (16*16, 50)))
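The `Reshape` layer turns each 16x16 feature map into a sequence of 256 time steps with 50 features each, so the LSTM reads the spatial positions row by row. A small NumPy sketch of the same row-major reshape, using a toy array in place of the real pooled activations:

```python
import numpy as np

# Toy feature map with the pooled layout: (batch, height, width, channels)
feature_map = np.arange(1 * 16 * 16 * 50).reshape(1, 16, 16, 50)

# Same target shape as Reshape(target_shape = (16*16, 50)), plus the batch axis
sequence = feature_map.reshape(1, 16 * 16, 50)
print(sequence.shape)  # (1, 256, 50)

# Time step t corresponds to spatial position (row, col) with t = row * 16 + col
row, col = 2, 5
t = row * 16 + col
same_vector = np.array_equal(sequence[0, t], feature_map[0, row, col])
print(same_vector)  # True
```

Each time step therefore carries the 50 filter responses at one spatial location.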

In [11]:
model.add(LSTM(50, return_sequences = False))

In [12]:
model.add(Dense(10))
model.add(Activation('softmax'))

In [13]:
adam = optimizers.Adam(learning_rate = 0.001)  # 'lr' is a deprecated alias in tf.keras
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])

In [14]:
model.summary()  # summary() prints the table itself; wrapping it in print() also prints 'None'

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 32, 32, 50)        1400      
_________________________________________________________________
activation (Activation)      (None, 32, 32, 50)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 50)        0         
_________________________________________________________________
reshape (Reshape)            (None, 256, 50)           0         
_________________________________________________________________
lstm (LSTM)                  (None, 50)                20200     
_________________________________________________________________
dense (Dense)                (None, 10)                510       
_________________________________________________________________
activation_1 (Activation)    (None, 10)                0
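The parameter counts in the summary can be checked by hand with the standard formulas for each layer type. The sketch below uses hypothetical helper functions; the total is computed here and does not appear in the output shown above.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights per filter plus one bias, times the number of filters."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def lstm_params(input_dim, units):
    """Four gate blocks, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """One weight per input plus one bias, per output unit."""
    return (input_dim + 1) * units

conv_count = conv2d_params(3, 3, 3, 50)    # 3x3 kernel over 3 colour channels
lstm_count = lstm_params(50, 50)           # 50 features per time step, 50 units
dense_count = dense_params(50, 10)         # 50 LSTM outputs to 10 classes
print(conv_count, lstm_count, dense_count)  # 1400 20200 510
print(conv_count + lstm_count + dense_count)
```

The three values match the `Param #` column of the summary.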

In [None]:
%%time
history = model.fit(X_train, y_train, epochs = 100, batch_size = 100, verbose = 1, validation_split=0.2)
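With `validation_split = 0.2`, Keras holds out the last 20% of the training arrays (taken before any shuffling) for validation and trains on the rest. The sketch below works out the resulting sizes; it is plain arithmetic, and the exact rounding Keras applies internally is an assumption.

```python
import math

n_samples = 50000          # rows in X_train
validation_split = 0.2
batch_size = 100

n_val = int(n_samples * validation_split)   # samples held out for validation
n_train = n_samples - n_val                 # samples used for gradient updates
steps_per_epoch = math.ceil(n_train / batch_size)
print(n_train, n_val, steps_per_epoch)  # 40000 10000 400
```

Each of the 100 epochs therefore runs about 400 weight updates, which is why this cell takes a long time to execute.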

In [106]:
results = model.evaluate(X_test, y_test)



In [107]:
print('Test Accuracy: ', results[1])

Test Accuracy:  0.5927


## 2. CNN-RNN-2
- Perform the convolution-pooling operations and the recurrent operations independently on the same input, then concatenate their results
- Similar to the structure used in visual question answering

<img src="https://camo.githubusercontent.com/828817c970da406d2d83dc9a5c03fb120231e2a2/687474703a2f2f692e696d6775722e636f6d2f56627149525a7a2e706e67" style="width: 800px"/>

In [17]:
input_layer = Input(shape = (X_train.shape[1], X_train.shape[2], X_train.shape[3]))
conv_layer = Conv2D(filters = 50, kernel_size = (3,3), strides = (1,1), padding = 'same')(input_layer)
activation_layer = Activation('relu')(conv_layer)
pooling_layer = MaxPooling2D(pool_size = (2,2), padding = 'same')(activation_layer)
flatten = Flatten()(pooling_layer)
dense_layer_1 = Dense(100)(flatten)

In [18]:
reshape = Reshape(target_shape = (X_train.shape[1]*X_train.shape[2], X_train.shape[3]))(input_layer)
lstm_layer = LSTM(50, return_sequences = False)(reshape)
dense_layer_2 = Dense(100)(lstm_layer)

In [19]:
merged_layer = concatenate([dense_layer_1, dense_layer_2])
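`concatenate` joins the two 100-dimensional branch outputs along the feature axis, producing a 200-dimensional vector, rather than adding them element-wise. A NumPy sketch with toy values in place of the real activations shows the difference:

```python
import numpy as np

batch = 4
cnn_features = np.ones((batch, 100))        # stands in for dense_layer_1 output
rnn_features = np.full((batch, 100), 2.0)   # stands in for dense_layer_2 output

merged = np.concatenate([cnn_features, rnn_features], axis=-1)
summed = cnn_features + rnn_features        # element-wise sum, for comparison

print(merged.shape)  # (4, 200)
print(summed.shape)  # (4, 100)

# The following Dense(10) layer sees 200 inputs: 200 weights + 1 bias per class
output_params = (merged.shape[1] + 1) * 10
print(output_params)  # 2010
```

Concatenation keeps both branches' features separate, so the output layer can weight them independently.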

In [20]:
output_layer = Dense(10, activation = 'softmax')(merged_layer)

In [21]:
model = Model(inputs = input_layer, outputs = output_layer)

In [22]:
adam = optimizers.Adam(learning_rate = 0.001)  # 'lr' is a deprecated alias in tf.keras
model.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])

In [23]:
model.summary()  # summary() prints the table itself; wrapping it in print() also prints 'None'

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 32, 32, 3)]  0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 50)   1400        input_1[0][0]                    
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 32, 32, 50)   0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 16, 16, 50)   0           activation_2[0][0]               
______________________________________________________________________________________________
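The printed summary stops after the pooling layer. The parameter counts of the remaining layers can be estimated from the layer definitions above; the figures below are computed from the standard formulas with hypothetical helper functions, not copied from notebook output.

```python
def dense_params(input_dim, units):
    """One weight per input plus one bias, per output unit."""
    return (input_dim + 1) * units

def lstm_params(input_dim, units):
    """Four gate blocks, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

conv_count = 1400                              # from the summary rows shown above
flat_dim = 16 * 16 * 50                        # Flatten() of the pooled feature map
dense_1_count = dense_params(flat_dim, 100)    # CNN branch
lstm_count = lstm_params(3, 50)                # 1024 time steps of 3 colour values
dense_2_count = dense_params(50, 100)          # RNN branch
output_count = dense_params(200, 10)           # after concatenating 100 + 100

total = conv_count + dense_1_count + lstm_count + dense_2_count + output_count
print(dense_1_count, lstm_count, dense_2_count, output_count, total)
```

Almost all of the parameters sit in the first dense layer of the CNN branch, because it is fully connected to the 12,800-value flattened feature map.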

In [None]:
%%time
history = model.fit(X_train, y_train, epochs = 10, batch_size = 100, verbose = 0)

In [46]:
results = model.evaluate(X_test, y_test)



In [47]:
print(results)  # [test loss, test accuracy]

[14.50628567199707, 0.10000000000000001]
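An accuracy of 0.10 on ten balanced classes is chance level, and the loss of about 14.5 is what a collapsed model produces when it assigns essentially all probability to a single class for every input. The sketch below reproduces both numbers under two assumptions: the test set is perfectly balanced, and probabilities are clipped at `1e-7` before the logarithm (the default Keras epsilon, as far as I know).

```python
import numpy as np

num_classes = 10
n_samples = 10000
eps = 1e-7                                  # assumed clipping value

# Balanced labels: each class appears n_samples / num_classes times
y_true = np.arange(n_samples) % num_classes

# Collapsed model: probability 1 on class 0 for every sample
probs = np.zeros((n_samples, num_classes))
probs[:, 0] = 1.0
probs = np.clip(probs, eps, 1 - eps)

accuracy = np.mean(np.argmax(probs, axis=1) == y_true)
loss = np.mean(-np.log(probs[np.arange(n_samples), y_true]))
print([loss, accuracy])
```

The result is close to the pair printed above, which suggests this model did not learn within its 10 epochs. One frequent cause with image data is feeding raw 0-255 pixel values to the network; the notebook does not confirm this, so scaling the inputs to [0, 1] is only something to try.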
