# Advanced Keras Tutorial 2

This notebook covers:

1) Input from a more complicated data generator. To see a real difference, a more complicated model is needed, so this part includes an example of data-parallel training on multiple GPUs.

2) Transfer learning.

In [1]:
import os
import keras
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Using TensorFlow backend.


In [4]:
keras.__version__, tf.__version__

('2.0.8', '1.3.0')

In [6]:
from subprocess import check_output
print(check_output(["ls", "./"]).decode("utf8"))


mnist_advanced_2.ipynb
mnist_advanced.ipynb
mnist_quick_start.ipynb
README.md



The code below is borrowed from the Keras examples: https://github.com/fchollet/keras/tree/master/examples

In [9]:
'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=0,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Test loss: 0.0254829914389
Test accuracy: 0.9918
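To get a feel for how small this convnet is, the layer shapes and parameter counts can be worked out by hand. The sketch below is plain Python and does not call Keras; the helper functions and layer labels are made up for illustration, and it assumes the Keras defaults used above (3x3 `'valid'` convolutions, 2x2 max pooling).

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square 2D convolution."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return inputs * units + units


# Track the spatial size of the feature map through the network.
side = 28
side -= 2    # first 3x3 'valid' convolution: 28 -> 26
side -= 2    # second 3x3 'valid' convolution: 26 -> 24
side //= 2   # 2x2 max pooling: 24 -> 12
flat = side * side * 64  # Flatten() output size with 64 channels

# (label, parameter count) pairs; the labels are illustrative only.
layer_params = [
    ("conv 3x3, 32 filters", conv2d_params(3, 1, 32)),
    ("conv 3x3, 64 filters", conv2d_params(3, 32, 64)),
    ("dense 128", dense_params(flat, 128)),
    ("dense 10", dense_params(128, 10)),
]
total_params = sum(count for _, count in layer_params)

for label, count in layer_params:
    print(label, count)
print("flattened features:", flat)
print("total parameters:", total_params)
```

Almost all of the parameters sit in the first dense layer, which is one reason a model like this is unlikely to keep several GPUs busy.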


# Make a parallel model

As a preface, note that your model may not run any faster on multiple GPUs if you are not actually GPU-bound. This can happen when you feed data through a generator whose creation is CPU- or I/O-bound, or when your model is not particularly complex and you are memory-bound when transferring data to the GPU.
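One rough way to check whether the input pipeline is the bottleneck is to time the generator on its own, without any model attached. The sketch below uses only NumPy and random data; `batch_generator` and `samples_per_second` are hypothetical helpers written for this illustration, not Keras functions.

```python
import time
import numpy as np


def batch_generator(x, y, batch_size):
    """Yield shuffled (x, y) batches forever, reshuffling every pass."""
    n = len(x)
    while True:
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield x[idx], y[idx]


def samples_per_second(generator, num_batches=20):
    """Measure how many samples per second the generator alone can produce."""
    start = time.time()
    seen = 0
    for _ in range(num_batches):
        xb, yb = next(generator)
        seen += len(xb)
    elapsed = time.time() - start
    return seen / max(elapsed, 1e-9)


# Random stand-in data with MNIST-like shapes (an assumption for the demo).
demo_x = np.random.rand(512, 28, 28, 1).astype('float32')
demo_y = np.random.randint(0, 10, size=512)

gen = batch_generator(demo_x, demo_y, batch_size=128)
rate = samples_per_second(gen)
print('generator throughput: %.0f samples/s' % rate)
```

If this rate is lower than what a single GPU consumes during training, adding more GPUs is unlikely to help until the generator itself gets faster.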

https://github.com/fchollet/keras/blob/3dd3e8331677e68e7dec6ed4a1cbf16b7ef19f7f/keras/utils/training_utils.py#L56-L75

In [None]:
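from keras.applications import Xception
from keras.utils.training_utils import multi_gpu_model
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

num_samples = 1000  # use a small subset to keep the demo light
num_classes = 10    # CIFAR-10 has 10 classes
scale = 3           # Xception rejects inputs smaller than 71x71, so upsample 32 -> 96
height = width = 32 * scale

def prepare(x, y):
    # Nearest-neighbour upsampling, scaling to [0, 1], one-hot labels.
    x = x[:num_samples].repeat(scale, axis=1).repeat(scale, axis=2)
    return (x.astype('float32') / 255,
            keras.utils.to_categorical(y[:num_samples], num_classes))

x_train, y_train = prepare(x_train, y_train)
x_test, y_test = prepare(x_test, y_test)

# Instantiate the base model
# (here, we do it on CPU, which is optional).
with tf.device('/cpu:0'):
    model = Xception(weights=None,
                     input_shape=(height, width, 3),
                     classes=num_classes)
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop')

# To replicate the model on 2 GPUs (this assumes that your machine has
# 2 available GPUs), uncomment the lines below and call `fit` on
# `parallel_model` instead of `model`.
#parallel_model = multi_gpu_model(model, gpus=2)
#parallel_model.compile(loss='categorical_crossentropy',
#                       optimizer='rmsprop')

# With `parallel_model`, this `fit` call would be distributed on 2 GPUs:
# since the batch size is 128, each GPU would process 64 samples.
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=0,
          validation_data=(x_test, y_test))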

Downloading data from http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

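As a rough illustration of what data parallelism does with each batch: the `multi_gpu_model` docstring describes splitting the inputs into sub-batches, applying a copy of the model to each sub-batch on its own GPU, and concatenating the results. The NumPy sketch below imitates only that batch bookkeeping on the CPU; `fake_model` and `data_parallel_predict` are hypothetical stand-ins, and no GPU or Keras code is involved.

```python
import numpy as np


def fake_model(x, weights):
    """Stand-in for one model replica: a dense layer followed by softmax."""
    logits = x.dot(weights)
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


def data_parallel_predict(x, weights, num_devices):
    """Split the batch, run every sub-batch through a replica, merge outputs."""
    sub_batches = np.array_split(x, num_devices)
    outputs = [fake_model(sb, weights) for sb in sub_batches]
    return np.concatenate(outputs, axis=0)


rng = np.random.RandomState(0)
batch = rng.rand(128, 20)    # one batch of 128 samples with 20 features
weights = rng.rand(20, 10)   # the same weights are shared by all replicas

sub_batch_sizes = [len(sb) for sb in np.array_split(batch, 2)]
parallel_out = data_parallel_predict(batch, weights, num_devices=2)
single_out = fake_model(batch, weights)

print('sub-batch sizes:', sub_batch_sizes)
print('same result as one device:', np.allclose(parallel_out, single_out))
```

With a batch of 128 and 2 devices, each replica sees 64 samples, and the merged output matches what a single device would produce. The gain on real hardware comes only from the replicas running at the same time.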

