## Differences between Sequential and functional API models in Keras

### This notebook presents the differences in the number of parameters between functional API and Sequential models.

#### As you can see below, the output shapes of the layers differ between the two models, which leads to completely different parameter counts, training times, and accuracy.

This is the next step of my investigation into this topic. I noticed the issue earlier; this time I feed both models from a generator to rule out any other factors that might cause it. The following step is to build an equivalent of such a model (even without training) in Keras and try to understand what is *under the hood* of this problem.
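One thing that can be checked without training anything is the shape arithmetic. The functional model in this notebook passes `padding='same'` to every `Conv2D` layer, while the Sequential model relies on the Keras default, `padding='valid'`. The sketch below is plain Python, not Keras; the helper names (`conv_out`, `pool_out`, `final_size`) are made up for illustration, and it assumes stride-1 convolutions and 2x2 max pooling with floor division. It traces the side length of the feature map through the same layer stack under both padding modes.

```python
def conv_out(size, kernel, padding):
    """Side length after a stride-1 convolution along one axis."""
    if padding == "same":
        return size            # zero-padding keeps the size unchanged
    return size - kernel + 1   # 'valid': no padding, the border is lost


def pool_out(size, pool=2):
    """Side length after non-overlapping max pooling (floor division)."""
    return size // pool


# (layer kind, kernel or pool size) for the stack used in this notebook;
# Dropout layers are omitted because they do not change shapes.
LAYERS = (
    [("conv", 7)] * 2 + [("pool", 2)]
    + [("conv", 5)] * 2 + [("pool", 2)]
    + [("conv", 3)] * 8 + [("pool", 2)]
)


def final_size(size, padding):
    """Trace one spatial axis through every layer in LAYERS."""
    for kind, k in LAYERS:
        size = conv_out(size, k, padding) if kind == "conv" else pool_out(size)
    return size


valid_side = final_size(120, "valid")       # Sequential model (default padding)
same_side = final_size(120, "same")         # functional model (padding='same')
flat_valid = valid_side * valid_side * 512  # inputs to the first Dense layer
flat_same = same_side * same_side * 512

print(valid_side, flat_valid)  # 3 4608
print(same_side, flat_same)    # 15 115200
```

Under these assumptions the first `Dense(4096)` layer receives 4608 inputs in the Sequential model and 115200 inputs in the functional one, which suggests the gap in parameter counts comes from the padding argument rather than from the API used to build the model.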

In [1]:
import tensorflow as tf
from tensorflow.python.client import device_lib
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 5451369871618886748, name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 356122624
 locality {
   bus_id: 1
 }
 incarnation: 1778908455212714976
 physical_device_desc: "device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7"]

In [2]:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model, load_model
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input
from keras.callbacks import ModelCheckpoint
from keras.utils.vis_utils import plot_model
from sklearn.utils import shuffle
from keras import backend as K
import matplotlib.pyplot as plt

Using TensorFlow backend.


In [3]:
width, height = 120, 120
train_data_dir = '../zdjeciax4_gen/train'
test_data_dir = '../zdjeciax4_gen/test'
nb_train_samples = 10468
nb_test_samples = 2618
epochs = 20
batch_size = 4

input_shape = (width, height, 1)

## Sequential model

In [4]:
model = Sequential()
model.add(Conv2D(64, kernel_size=(7, 7), input_shape=input_shape, activation='relu'))
model.add(Conv2D(64, kernel_size=(7, 7), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (5, 5), activation='relu'))
model.add(Conv2D(128, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(Dropout(0.25))

model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dense(6, activation='softmax'))

Instructions for updating:
keep_dims is deprecated, use keepdims instead


## Functional API model

In [5]:
visible = Input(shape=input_shape)

conv1 = Conv2D(64, kernel_size=(7, 7), padding='same', activation='relu')(visible)
conv2 = Conv2D(64, kernel_size=(7, 7), padding='same', activation='relu')(conv1)
pool1 = MaxPooling2D((2, 2))(conv2)

conv3 = Conv2D(128, kernel_size=(5, 5), padding='same', activation='relu')(pool1)
conv4 = Conv2D(128, kernel_size=(5, 5), padding='same', activation='relu')(conv3)
pool2 = MaxPooling2D((2, 2))(conv4)

conv5 = Conv2D(256, kernel_size=(3, 3), padding='same', activation='relu')(pool2)
conv6 = Conv2D(256, kernel_size=(3, 3), padding='same', activation='relu')(conv5)
conv7 = Conv2D(256, kernel_size=(3, 3), padding='same', activation='relu')(conv6)
conv8 = Conv2D(256, kernel_size=(3, 3), padding='same', activation='relu')(conv7)
dropout1 = Dropout(0.25)(conv8)

conv9 = Conv2D(512, kernel_size=(3, 3), padding='same', activation='relu')(dropout1)
conv10 = Conv2D(512, kernel_size=(3, 3), padding='same', activation='relu')(conv9)
conv11 = Conv2D(512, kernel_size=(3, 3), padding='same', activation='relu')(conv10)
conv12 = Conv2D(512, kernel_size=(3, 3), padding='same', activation='relu')(conv11)
pool4 = MaxPooling2D((2, 2))(conv12)

flat1 = Flatten()(pool4)
hidden1 = Dense(4096, activation='relu')(flat1)
dropout2 = Dropout(0.5)(hidden1)
hidden2 = Dense(4096, activation='relu')(dropout2)
dropout3 = Dropout(0.5)(hidden2)
hidden3 = Dense(128, activation='relu')(dropout3)
predictions = Dense(6, activation='softmax')(hidden3)
modelAPI = Model(inputs=visible, outputs=predictions)
modelAPI.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Instructions for updating:
keep_dims is deprecated, use keepdims instead
Instructions for updating:
keep_dims is deprecated, use keepdims instead


In [6]:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [7]:
modelAPI.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [8]:
# summarize model and apply checkpoints
print(model.summary())
filepath = 'SEQ2-{epoch:02d}-{loss:.4f}.h5'
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 114, 114, 64)      3200      
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 108, 108, 64)      200768    
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 54, 54, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 50, 50, 128)       204928    
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 46, 46, 128)       409728    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 23, 23, 128)       0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 21, 21, 256)       295168    
__________
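The `Param #` column above can be reproduced by hand. A `Conv2D` layer has one weight per kernel position and input channel for each filter, plus one bias per filter. The short check below is plain Python with a hypothetical helper name (`conv2d_params`), and it compares the formula against the rows visible in the summary.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


# (kernel size, input channels, filters, Param # shown in the summary)
rows = [
    (7, 1, 64, 3200),       # conv2d_1
    (7, 64, 64, 200768),    # conv2d_2
    (5, 64, 128, 204928),   # conv2d_3
    (5, 128, 128, 409728),  # conv2d_4
    (3, 128, 256, 295168),  # conv2d_5
]

computed = [conv2d_params(k, cin, f) for k, cin, f, _ in rows]
for (k, cin, f, expected), got in zip(rows, computed):
    print(k, cin, f, got, got == expected)
```

The formula contains no spatial dimension, so the padding mode and the output shape do not affect the parameter count of a convolutional layer. Any difference in the total therefore has to come from layers whose size depends on the feature-map shape, such as the first `Dense` layer after `Flatten`.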

In [9]:
# summarize model and apply checkpoints
print(modelAPI.summary())
filepathAPI = 'API2-{epoch:02d}-{loss:.4f}.h5'
checkpointAPI = ModelCheckpoint(filepathAPI, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_listAPI = [checkpointAPI]

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 120, 120, 1)       0         
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 120, 120, 64)      3200      
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 120, 120, 64)      200768    
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 60, 60, 64)        0         
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 60, 60, 128)       204928    
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 60, 60, 128)       409728    
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 30, 30, 128)       0         
__________

In [10]:
train_datagen = ImageDataGenerator(
    rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)

In [11]:
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(width, height),
    color_mode='grayscale',
    batch_size=batch_size,
    class_mode='categorical',
    classes=['1', '2', '3', '4', '5', '6'],
    shuffle=True,
    seed=2018)

test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(width, height),
    batch_size=batch_size,
    color_mode='grayscale',
    class_mode='categorical',
    classes=['1', '2', '3', '4', '5', '6'],
    shuffle=True,
    seed=2018)

Found 10468 images belonging to 6 classes.
Found 2618 images belonging to 6 classes.


In [None]:
history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    verbose=1,
    callbacks=callbacks_list,
    shuffle=True,
    validation_data=test_generator,
    validation_steps=nb_test_samples // batch_size)

In [12]:
historyAPI = modelAPI.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    verbose=1,
    callbacks=callbacks_listAPI,
    max_queue_size=2,
    shuffle=True,
    validation_data=test_generator,
    validation_steps=nb_test_samples // batch_size)

Epoch 1/20


ResourceExhaustedError: OOM when allocating tensor with shape[115200,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: training/Adam/mul_123 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam_2/beta_2/read, training/Adam/Variable_56/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'training/Adam/mul_123', defined at:
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.5/dist-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python3.5/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelapp.py", line 478, in start
    self.io_loop.start()
  File "/usr/local/lib/python3.5/dist-packages/zmq/eventloop/ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "/usr/local/lib/python3.5/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python3.5/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python3.5/dist-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python3.5/dist-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelbase.py", line 281, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelbase.py", line 232, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelbase.py", line 397, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python3.5/dist-packages/ipykernel/ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python3.5/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-12-16e54444fdec>", line 10, in <module>
    validation_steps=nb_test_samples // batch_size)
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 2016, in fit_generator
    self._make_train_function()
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 990, in _make_train_function
    loss=self.total_loss)
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/optimizers.py", line 433, in get_updates
    v_t = (self.beta_2 * v) + (1. - self.beta_2) * K.square(g)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variables.py", line 775, in _run_op
    return getattr(ops.Tensor, operator)(a._AsTensor(), *args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 907, in binary_op_wrapper
    return func(x, y, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 1130, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 2747, in _mul
    "Mul", x=x, y=y, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3069, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1579, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[115200,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: training/Adam/mul_123 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam_2/beta_2/read, training/Adam/Variable_56/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
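The shape in the error message, `[115200, 4096]`, is the shape of the kernel of the first `Dense(4096)` layer in the functional model (115200 = 15 * 15 * 512). A rough estimate of what that costs is sketched below. It is plain Python with a hypothetical helper name (`dense_params`); it assumes float32 values (4 bytes each), takes the GPU `memory_limit` from the device listing at the top of this notebook, and uses 3 * 3 * 512 inputs for the Sequential model, which is what `padding='valid'` arithmetic gives for that stack.

```python
BYTES_PER_FLOAT32 = 4


def dense_params(inputs, units):
    """Kernel weights plus one bias per unit."""
    return inputs * units + units


seq_dense = dense_params(3 * 3 * 512, 4096)    # Sequential model, first Dense layer
api_dense = dense_params(15 * 15 * 512, 4096)  # functional model, first Dense layer

# One tensor of the shape named in the OOM message.
oom_tensor_bytes = 115200 * 4096 * BYTES_PER_FLOAT32
gpu_memory_limit = 356122624  # bytes, as reported by device_lib.list_local_devices()

print(seq_dense)                               # 18878464
print(api_dense)                               # 471863296
print(oom_tensor_bytes / 2**30)                # 1.7578125 (GiB)
print(oom_tensor_bytes > gpu_memory_limit)     # True
```

A single tensor of that shape is larger than the memory TensorFlow reported as available on the GPU, and Adam keeps two extra tensors of the same shape per weight (the first and second moment estimates; the failing op is the `beta_2 * v` update), so the out-of-memory error is expected for this layer size.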



In [None]:
score = model.evaluate_generator(test_generator)
print("Accuracy: %.2f%%" % (score[1] * 100))
print("Test loss:", score[0])
print("Test accuracy", score[1])

In [None]:
scoreAPI = modelAPI.evaluate_generator(test_generator)
print("Accuracy: %.2f%%" % (scoreAPI[1] * 100))
print("Test loss:", scoreAPI[0])
print("Test accuracy", scoreAPI[1])

In [None]:
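# Save the Sequential model under its own file names so that the
# functional-API export does not overwrite it.
model_json = model.to_json()
with open('SEQ2_model.json', 'w') as json_file:
    json_file.write(model_json)

model.save_weights('SEQ2_weights.h5')
print('Saved model weights')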

In [None]:
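# Save the functional-API model under its own file names so that it
# does not overwrite the Sequential export.
modelAPI_json = modelAPI.to_json()
with open('API2_model.json', 'w') as json_file:
    json_file.write(modelAPI_json)

modelAPI.save_weights('API2_weights.h5')
print('Saved modelAPI weights')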