# Comparison of the Train Time and Inference Time of MNIST MLP and CNN on CPUs and GPUs with `keras`

The purpose of this notebook is to determine the relative speeds of training MNIST on CPUs and GPUs, as well as the relative speeds of inference. We will do this both for Convolutional Neural Network (CNN) and for Multi-Layer Perceptron (MLP) implementations of MNIST. It is similar to one of the experiments described in [this paper](https://arxiv.org/pdf/1904.08986.pdf) (arXiv:1904.08986v1 \[physics.data-an\]). The next step in this process will be to transform the code in this notebook into bare `tensorflow` code and compare the runtime between that and this `keras` implementation, and again between CPUs and GPUs in bare `tensorflow`. Then we will set the GPU implementation of MNIST up as a service. Ultimately, we look forward to running MNIST on TPUs and comparing runtime again.

The code for the MLP implementation of MNIST with `keras` was pulled from [this GitHub repository](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py), and the code for the CNN implementation was pulled from [this file](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) in the same repository. I significantly changed both so that they would be more comparable.

## Imports
We import all the necessary classes and set the global constants for the program. `NUM_CLASSES` is the number of categories to train MNIST on. `NUM_EPOCHS` is the number of epochs to train for. `IMG_EDGE` is the side length of one of the (square) images, making the total pixel count `IMG_EDGE ** 2`. `INFERENCE_TIME_THRESHOLD` is the time in seconds after which the inference trials for a given batch size stop being repeated.

In [2]:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.optimizers import RMSprop, Adadelta
from keras.utils import to_categorical
from keras import backend as K
import tensorflow as tf
from time import time
import random
import numpy as np

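# Disable deprecation warnings
try:
    from tensorflow.python.util import deprecation
    deprecation._PRINT_DEPRECATION_WARNINGS = False
except (ImportError, AttributeError):
    # The private module or flag may be missing in other TensorFlow versions
    print("Could not disable deprecation warnings")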

# Set global constants
NUM_CLASSES = 10
NUM_EPOCHS = 5
IMG_EDGE = 28
INFERENCE_TIME_THRESHOLD = 10  # seconds

Using TensorFlow backend.


## Creating MNIST

Having imported all the necessary modules and defined the constants, we need to implement MNIST in `keras`. Since we plan to implement MNIST with MLP and with CNN, we will first create a parent class holding the common functions of both. Later, we will define two subclasses, one for MLP and one for CNN. The parent class has several functions:
- `_load` and `_finish_load` load the default `keras` MNIST dataset in whatever form MLP or CNN wants the data to be in.
- `_get_batch_sizes` sets a list of all the batch sizes to test. Because CNN uses so much memory, we will need to test smaller batch sizes or else the machine will crash. However, for the MLP implementation, we can train on very high batch sizes. Hence the batch sizes will be different for each case.
- `_create` creates the MNIST model with a specific batch size.
- `_load_inferences` loads several randomly generated images to be inferred on.
- `_train` trains the MNIST model, keeping track of the time it takes to do so, and returns that time. The time it returns is actually the time to train _per sample_: the total time divided by the number of epochs times the number of training data points.
- `_predict` runs inference on a set of randomly generated images as large as the training set, using the batch size the model was trained with, and returns the time per inference along with the number of trials. It repeats the trial until `INFERENCE_TIME_THRESHOLD` seconds have elapsed to reduce uncertainty.
- `get_data` runs all of the above functions in order to get the train times and inference times for the given machine type and implementation (MLP or CNN) for all the batch sizes.
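
To make the batch-size schedule concrete, here is a small standalone sketch of the same logic `_get_batch_sizes` uses. The function name `batch_size_schedule` is ours, introduced only for illustration; the two calls use the `start_power` and `end_power` values that the CNN and MLP subclasses set further down.

```python
def batch_size_schedule(start_power, end_power):
    """Return the multiples 1-9 of each power of ten, plus the final power itself."""
    sizes = []
    for i in range(start_power, end_power):
        # e.g. i = 1 contributes 10, 20, ..., 90
        sizes += list(range(10**i, 10**(i + 1), 10**i))
    sizes.append(10**end_power)
    return sizes

cnn_sizes = batch_size_schedule(0, 2)   # 1..9, 10..90, 100
mlp_sizes = batch_size_schedule(0, 4)   # same pattern up to 10000
print(cnn_sizes)
print(len(mlp_sizes), mlp_sizes[-1])
```

So the CNN is tested on 19 batch sizes and the MLP on 37, spaced roughly evenly on a logarithmic axis.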

In [7]:
class MNIST:
    def __init__(self, machine):
        self.machine = machine
        self.model = None
        (self.x_train, self.y_train), (self.x_test, self.y_test) = mnist.load_data()
        self.start_power = 0
        self.end_power = 0
        self.batch_sizes = []  # Filled in by _get_batch_sizes, or when restoring from a file
        self.train_times = []
        self.inference_times = []
        
    def _load(self):
        # To be overridden by subclasses
        pass
    
    def _load_inferences(self):
        # To be overridden by subclasses
        pass
    
    def _get_batch_sizes(self):
        self.batch_sizes = []
        for i in range(self.start_power, self.end_power):
            self.batch_sizes += list(range(10**i, 10**(i+1), 10**i))
        self.batch_sizes += [10**self.end_power]
    
    def _finish_load(self):
        self.x_train = self.x_train.astype('float32')
        self.x_test = self.x_test.astype('float32')
        self.x_train /= 255
        self.x_test /= 255
        print('Train dataset size:', self.x_train.shape[0])
        print('Test dataset size:', self.x_test.shape[0])

        # convert class vectors to binary class matrices
        self.y_train = to_categorical(self.y_train, NUM_CLASSES)
        self.y_test = to_categorical(self.y_test, NUM_CLASSES)
    
    def _create(self):
        # To be overridden by subclasses
        pass
    
    def _train(self, batch_size):
        start_time = time()
        history = self.model.fit(self.x_train, self.y_train, batch_size=batch_size, epochs=NUM_EPOCHS, verbose=1,
                            validation_data=(self.x_test, self.y_test))
        end_time = time()
        train_time = (end_time - start_time) / (NUM_EPOCHS * self.x_train.shape[0])

        #loss, accuracy = self.model.evaluate(self.x_test, self.y_test, verbose=0)
        return train_time
    
    def _predict(self, batch_size):
        inference_time = 0
        start_inference = time()
        inference_num = 0
        while True: # Do multiple trials
            inputs = self._load_inferences()

            start_time = time()
            self.model.predict(inputs, batch_size=batch_size)
            end_time = time()
            
            inference_time += end_time - start_time
            inference_num += 1
            if end_time - start_inference > INFERENCE_TIME_THRESHOLD:
                print("Done inference")
                break
        return inference_time / (inference_num * self.x_train.shape[0]), inference_num
    
    def get_data(self):
        self.max_train = 0
        self.max_inference = 0
        
        self._load()
        self._get_batch_sizes()
        for batch_size in self.batch_sizes:
            self._create()
            
            train_time = self._train(batch_size)
            inference_time, inference_num = self._predict(batch_size)
            print('\n','Batch size:', batch_size, '\tTrain time:', train_time, '\tInference time', inference_time, '(%s)'%inference_num)
            print('+'*100)
            self.train_times.append(train_time)
            self.inference_times.append(inference_time)
            
            K.clear_session()  # Clean up memory
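
The timing loop in `_predict` is a time-boxed measurement: it keeps repeating the workload until a time budget is spent, accumulates only the time spent inside the workload, and normalizes by the number of items processed. The following is a minimal sketch of that pattern in isolation. The name `time_boxed` and the dummy workload are hypothetical, and `time.perf_counter` is swapped in for `time.time` because it is better suited to measuring short intervals.

```python
from time import perf_counter

def time_boxed(workload, n_items, budget_s):
    """Repeat `workload` until `budget_s` seconds have passed since the start.

    Returns (seconds per item, number of trials). Only the time spent inside
    `workload` is accumulated; the budget is checked against total elapsed time.
    """
    accumulated = 0.0
    trials = 0
    started = perf_counter()
    while True:
        t0 = perf_counter()
        workload()
        t1 = perf_counter()
        accumulated += t1 - t0
        trials += 1
        if t1 - started > budget_s:
            break
    return accumulated / (trials * n_items), trials

# Hypothetical stand-in for model.predict: sum 10,000 numbers per call
per_item, trials = time_boxed(lambda: sum(range(10_000)), n_items=10_000, budget_s=0.05)
print(per_item, trials)
```

Because the loop always runs at least once, a slow configuration still yields one measurement, which is why the trial count printed next to each inference time can be as low as 1.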

Now that we have created our superclass, we may create two subclasses, one for MLP and one for CNN. They each handle the data and create the model differently, but in all other respects, `keras` allows us to treat them similarly, hence the superclass functions.

In [8]:
class MNIST_MLP(MNIST):
    def _load(self):
        self.x_train = self.x_train.reshape(60000, IMG_EDGE**2)
        self.x_test = self.x_test.reshape(10000, IMG_EDGE**2)
        
        self.start_power = 0
        self.end_power = 4
        
        self._finish_load()
        
    def _load_inferences(self):
        return np.random.rand(self.x_train.shape[0], IMG_EDGE**2)
    
    def _create(self):
        with tf.device(self.machine):
            self.model = Sequential()
            self.model.add(Dense(512, activation='relu', input_shape=(IMG_EDGE**2,)))
            self.model.add(Dropout(0.2))
            self.model.add(Dense(512, activation='relu'))
            self.model.add(Dropout(0.2))
            self.model.add(Dense(NUM_CLASSES, activation='softmax'))

            self.model.compile(loss='categorical_crossentropy',
                          optimizer=RMSprop(),
                          metrics=['accuracy'])

In [9]:
class MNIST_CNN(MNIST):
    def _load(self):
        if K.image_data_format() == 'channels_first':
            self.x_train = self.x_train.reshape(self.x_train.shape[0], 1, IMG_EDGE, IMG_EDGE)
            self.x_test = self.x_test.reshape(self.x_test.shape[0], 1, IMG_EDGE, IMG_EDGE)
            self.input_shape = (1, IMG_EDGE, IMG_EDGE)
        else:
            self.x_train = self.x_train.reshape(self.x_train.shape[0], IMG_EDGE, IMG_EDGE, 1)
            self.x_test = self.x_test.reshape(self.x_test.shape[0], IMG_EDGE, IMG_EDGE, 1)
            self.input_shape = (IMG_EDGE, IMG_EDGE, 1)
        
        self._finish_load()
        
        self.start_power = 0
        self.end_power = 2  # Smaller because CNN takes up more memory
        
    def _load_inferences(self):
        # Match whichever layout _load chose (channels first or channels last)
        return np.random.rand(self.x_train.shape[0], *self.input_shape)
    
    def _create(self):
        with tf.device(self.machine):
            self.model = Sequential()
            self.model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=self.input_shape))
            self.model.add(Conv2D(64, (3, 3), activation='relu'))
            self.model.add(MaxPooling2D(pool_size=(2, 2)))
            self.model.add(Dropout(0.25))
            self.model.add(Flatten())
            self.model.add(Dense(128, activation='relu'))
            self.model.add(Dropout(0.5))
            self.model.add(Dense(NUM_CLASSES, activation='softmax'))

            self.model.compile(loss='categorical_crossentropy',
                          optimizer=Adadelta(),
                          metrics=['accuracy'])
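
To get a feel for how the two models differ in size, we can count their trainable parameters by hand from the layer definitions above. This is a back-of-the-envelope sketch: the helper names are ours, and it assumes the `keras` defaults for `Conv2D` (no padding, stride 1) and `MaxPooling2D` (stride equal to the pool size).

```python
def dense_params(n_in, n_out):
    return n_in * n_out + n_out  # weights + biases

def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out + c_out  # k x k kernels + biases

IMG_EDGE, NUM_CLASSES = 28, 10

# MLP: 784 -> 512 -> 512 -> 10
mlp_total = (dense_params(IMG_EDGE**2, 512)
             + dense_params(512, 512)
             + dense_params(512, NUM_CLASSES))

# CNN: two unpadded 3x3 convolutions shrink 28 -> 26 -> 24, then 2x2 pooling halves to 12
edge = (IMG_EDGE - 2 - 2) // 2
flat = edge * edge * 64
cnn_total = (conv_params(3, 1, 32)
             + conv_params(3, 32, 64)
             + dense_params(flat, 128)
             + dense_params(128, NUM_CLASSES))

# Values each convolutional layer outputs per image (before pooling)
conv_activations = 26 * 26 * 32 + 24 * 24 * 64

print(mlp_total, cnn_total, flat, conv_activations)
```

The CNN has roughly twice as many parameters as the MLP, but the bigger difference is in the intermediate activations: the two convolutional layers produce tens of thousands of values per image, compared with 512 per hidden layer in the MLP. This is consistent with the CNN needing smaller batch sizes to fit in memory.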

## Gathering Data
Now that we have defined all the methods we need to gather data on the train time and inference time of MNIST on different machines with different implementations, all we need to do is call the functions. `get_data` will generate our lists of train times and inference times for every batch size. This will take several hours.

In [25]:
print("MNIST MLP")
print()
mlp_cpu = MNIST_MLP('/cpu:0')
mlp_gpu = MNIST_MLP('/gpu:0')

print()
print("TRAIN ON CPUS")
print()
print('+'*100)
mlp_cpu.get_data()

print()
print("TRAIN ON GPUS")
print()
print('+'*100)
mlp_gpu.get_data()

print()
print('+'*47, "DONE", '+'*47)

MNIST MLP


TRAIN ON CPUS

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train dataset size: 60000
Test dataset size: 10000
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 1 	Train time: 0.00463285360733668 	Inference time 0.0005607285221417745 (1)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 2 	Train time: 0.0030870140480995178 	Inference time 0.00039501887957255045 (1)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 3 	Train time: 0.0019886515307426453 	Inference time 0.00030668368736902875

Epoch 5/5
Done inference

 Batch size: 9 	Train time: 0.0008144742250442505 	Inference time 0.00012006571491559346 (2)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 10 	Train time: 0.0008023828832308452 	Inference time 0.0001123758614063263 (2)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 20 	Train time: 0.0004529981780052185 	Inference time 7.745010852813721e-05 (2)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 30 	Train time: 0.0003847236299

Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 90 	Train time: 0.0002142064627011617 	Inference time 5.1415328184763594e-05 (3)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 100 	Train time: 0.000197178529103597 	Inference time 5.233177741368612e-05 (3)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 200 	Train time: 0.0001714441657066345 	Inference time 4.712630377875434e-05 (3)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size:

Done inference

 Batch size: 800 	Train time: 0.00014415589888890583 	Inference time 4.344169994195302e-05 (4)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 900 	Train time: 0.00013813772281010946 	Inference time 4.3009133140246076e-05 (4)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 1000 	Train time: 0.00013527613162994385 	Inference time 4.3336531519889835e-05 (4)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 2000 	Train time: 0.000131144909

Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 8000 	Train time: 0.0001395345679918925 	Inference time 4.52654759089152e-05 (3)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 9000 	Train time: 0.00013844428618748982 	Inference time 4.459146062533061e-05 (4)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 10000 	Train time: 0.00013860533952713013 	Inference time 4.4684839248657224e-05 (3)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

TRAIN ON GPUS

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train dataset 

Done inference

 Batch size: 6 	Train time: 0.0007073731883366903 	Inference time 0.00010258510112762451 (2)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 7 	Train time: 0.000611465265750885 	Inference time 9.054846366246541e-05 (2)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 8 	Train time: 0.0005274283957481384 	Inference time 7.262385686238607e-05 (2)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 9 	Train time: 0.0004222261953353882 	Infere

Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 60 	Train time: 6.55035392443339e-05 	Inference time 1.2837229172388712e-05 (8)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 70 	Train time: 5.7511791388193766e-05 	Inference time 1.172607938448588e-05 (8)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 80 	Train time: 5.007208585739136e-05 	Inference time 1.1071547865867615e-05 (8)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 90 	Train

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 600 	Train time: 1.0306185881296794e-05 	Inference time 3.6934071116977266e-06 (12)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 700 	Train time: 9.374789396921794e-06 	Inference time 3.716956575711568e-06 (12)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 800 	Train time: 8.592690626780192e-06 	Inference time 3.686127879402854e-06 (11)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Ep

Done inference

 Batch size: 5000 	Train time: 5.650022029876709e-06 	Inference time 3.910848727593055e-06 (13)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 6000 	Train time: 6.9256997108459475e-06 	Inference time 4.416551854875353e-06 (12)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 7000 	Train time: 6.716055075327555e-06 	Inference time 7.853390552379466e-06 (9)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Done inference

 Batch size: 8000 	Train time: 5.322608153025
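
Since `_train` reports time per sample, the wall-clock time of a run can be recovered by multiplying back by the number of epochs and training samples. The sketch below does this for three of the per-sample values printed in the log above; the helper name `total_train_seconds` is ours.

```python
NUM_EPOCHS = 5
N_TRAIN = 60000

def total_train_seconds(per_sample_time):
    """Invert the normalization in _train: per-sample time -> total seconds."""
    return per_sample_time * NUM_EPOCHS * N_TRAIN

# Per-sample train times copied from the log output
cpu_bs1 = total_train_seconds(0.00463285360733668)      # CPU, batch size 1
cpu_bs1000 = total_train_seconds(0.00013527613162994385)  # CPU, batch size 1000
gpu_bs800 = total_train_seconds(8.592690626780192e-06)    # GPU, batch size 800

print(round(cpu_bs1 / 60, 1), "min")
print(round(cpu_bs1000, 1), "s")
print(round(gpu_bs800, 1), "s")
```

The smallest batch sizes dominate the total runtime: a single CPU run at batch size 1 takes about 23 minutes, while the large-batch runs finish in under a minute each.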

Occasionally, the CNN implementation of MNIST will run out of memory, so we now back up the train times and inference times for MLP MNIST in case this happens and we are forced to restart the notebook.

In [26]:
# Every list must line up index-for-index before anything is written
assert len(mlp_cpu.batch_sizes) == len(mlp_gpu.batch_sizes) == len(mlp_cpu.train_times) == len(mlp_cpu.inference_times) \
                 == len(mlp_gpu.train_times) == len(mlp_gpu.inference_times)

# The context manager closes the file even if an assertion fails mid-loop
with open("backup.txt", 'w') as backup:
    for i in range(len(mlp_cpu.batch_sizes)):
        assert mlp_cpu.batch_sizes[i] == mlp_gpu.batch_sizes[i]
        backup.write(str(mlp_cpu.batch_sizes[i]) + '|' +
                     str(mlp_cpu.train_times[i]) + '|' +
                     str(mlp_cpu.inference_times[i]) + '|' +
                     str(mlp_gpu.train_times[i]) + '|' +
                     str(mlp_gpu.inference_times[i]) + '|' + '\n')
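
Each line of `backup.txt` holds five pipe-separated values followed by a trailing `|`. A minimal sketch of how such a file could be read back is shown below. The function `parse_backup`, the field names, and the numbers in the round-trip check are all made up for illustration; only the line layout comes from the cell above.

```python
import csv
import os
import tempfile

FIELDS = ["batch_size", "cpu_train", "cpu_inference", "gpu_train", "gpu_inference"]

def parse_backup(path):
    """Read pipe-delimited rows; the trailing '|' yields one empty field, which is dropped."""
    rows = []
    with open(path, newline="") as f:
        for record in csv.reader(f, delimiter="|"):
            if not record:
                continue  # skip blank lines
            values = record[:len(FIELDS)]
            row = {"batch_size": int(values[0])}
            row.update({k: float(v) for k, v in zip(FIELDS[1:], values[1:])})
            rows.append(row)
    return rows

# Round trip on a temporary file with made-up numbers
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "backup.txt")
    with open(path, "w") as f:
        f.write("1|0.004|0.0005|0.003|0.0004|\n")
        f.write("2|0.003|0.0004|0.002|0.0003|\n")
    rows = parse_backup(path)

print(rows[0])
```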

Now we are ready to get the data for the CNN implementations of MNIST.

In [10]:
print("MNIST_CNN")
cnn_cpu = MNIST_CNN('/cpu:0')
cnn_gpu = MNIST_CNN('/gpu:0')

print()
print("TRAIN ON CPUS")
print()
print('+'*100)
cnn_cpu.get_data()

print()
print("TRAIN ON GPUS")
print()
print('+'*100)
cnn_gpu.get_data()

print()
print('+'*47, "DONE", '+'*47)

MNIST_CNN

TRAIN ON CPUS

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Train dataset size: 60000
Test dataset size: 10000


KeyboardInterrupt: 

In [14]:
with open("MLPdata.txt", 'r') as f:
    lines = f.read().split('\n')
mlp_cpu = MNIST_MLP("/cpu:0")
mlp_gpu = MNIST_MLP("/gpu:0")
mlp_cpu.batch_sizes, mlp_gpu.batch_sizes = [], []

# Assumed file layout: the CPU rows come first, then the GPU rows,
# and each block starts again at batch size 1
first = False
for line in lines:
    if line == '': continue
    batch_size, train_time, inference_time = line.split('|')
    if batch_size == '1':
        first = not first
    target = mlp_cpu if first else mlp_gpu
    target.batch_sizes.append(int(batch_size))
    target.train_times.append(float(train_time))
    target.inference_times.append(float(inference_time))

cnn_cpu = MNIST_CNN("/cpu:0")
cnn_cpu.batch_sizes = []
with open("CNNdata.txt", 'r') as f:
    lines = f.read().split('\n')
for line in lines:
    if line == '': continue
    batch_size, train_time, inference_time = line.split('|')
    cnn_cpu.batch_sizes.append(int(batch_size))
    cnn_cpu.train_times.append(float(train_time))
    cnn_cpu.inference_times.append(float(inference_time))

## Plotting the Data
Now we wish to compare train time and inference time between CPUs and GPUs. We will make two plots, one for train time and one for inference time. Each plot will have both MLP and CNN on it, so they can be compared or viewed separately. First, we import the required modules.

In general, MLP is plotted with cool colors and CNN is plotted with warm colors.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x_mlp = np.array(mlp_cpu.batch_sizes)
x_cnn = np.array(cnn_cpu.batch_sizes)

Then we plot the data we have gathered and obtain a graph of train times

In [None]:
plt.scatter(x_mlp, mlp_cpu.train_times, c='b', alpha = 0.5)
plt.scatter(x_mlp, mlp_gpu.train_times, c='r', alpha = 0.5, marker='s')
#plt.scatter(x_cnn, cnn_cpu.train_times, c='y', alpha = 0.5, marker='^')
#plt.scatter(x_cnn, cnn_gpu.train_times, c='m', alpha = 0.5, marker='v')
plt.xlabel('Batch size')
plt.ylabel('Train time (s)')
plt.xscale('log')
plt.yscale('log')
plt.axis([1, 10000, 0.001, 2])
plt.legend(['MLP CPU', 'MLP GPU', 'CNN CPU', 'CNN GPU'])
plt.show()

and a graph of inference times.

In [None]:
plt.scatter(x_mlp, mlp_cpu.inference_times, c='b', alpha = 0.5)
plt.scatter(x_mlp, mlp_gpu.inference_times, c='r', alpha = 0.5, marker='s')
#plt.scatter(x_cnn, cnn_cpu.inference_times, c='y', alpha = 0.5, marker='^')
#plt.scatter(x_cnn, cnn_gpu.inference_times, c='m', alpha = 0.5, marker='v')
plt.xlabel('Batch size')
plt.ylabel('Inference time (s)')
plt.xscale('log')
plt.yscale('log')
plt.axis([1, 10000, 0.0003, 1])
plt.legend(['MLP CPU', 'MLP GPU', 'CNN CPU', 'CNN GPU'])
plt.show()

We can also plot the performance gain from using GPUs over CPUs, expressed as the CPU time as a percentage of the GPU time, so that 100% (the dashed line) means no gain. This is train time

In [None]:
def get_improvement(cpu_times, gpu_times):
    # CPU time as a percentage of GPU time, element by element; 100 means equal speed
    return np.asarray(cpu_times) / np.asarray(gpu_times) * 100

gain_train_mlp = get_improvement(mlp_cpu.train_times, mlp_gpu.train_times)
gain_train_cnn = get_improvement(cnn_cpu.train_times, cnn_gpu.train_times)

plt.scatter(x_mlp, gain_train_mlp, c='k', alpha = 0.5, marker = 'd')
plt.scatter(x_cnn, gain_train_cnn, c='c', alpha = 0.5, marker = '>')
plt.xlabel('Batch size')
plt.ylabel('Train speed gain by using GPUs (%)')
plt.xscale('log')
plt.yscale('linear')
plt.legend(['MLP', 'CNN'])
plt.axhline(100, linestyle='--', linewidth=1, color='k')
plt.show()

and this is inference time.

In [None]:
gain_inference_mlp = get_improvement(mlp_cpu.inference_times, mlp_gpu.inference_times)
gain_inference_cnn = get_improvement(cnn_cpu.inference_times, cnn_gpu.inference_times)

plt.scatter(x_mlp, gain_inference_mlp, c='k', alpha = 0.5, marker='d')
plt.scatter(x_cnn, gain_inference_cnn, c='c', alpha = 0.5, marker='>')
plt.xlabel('Batch size')
plt.ylabel('Inference speed gain by using GPUs (%)')
plt.xscale('log')
plt.yscale('linear')
plt.legend(['MLP', 'CNN'])
plt.axhline(100, linestyle='--', linewidth=1, color='k')
plt.show()
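
As a worked example of the gain metric, we can apply the same formula as `get_improvement` to a single batch size. The sketch below uses the MLP values at batch size 800, copied from the CPU and GPU logs above; the helper name `gain_percent` is ours.

```python
def gain_percent(cpu_time, gpu_time):
    """CPU time as a percentage of GPU time; 100 means no difference."""
    return cpu_time / gpu_time * 100

# Per-sample times at batch size 800, copied from the MLP log output
cpu_train, gpu_train = 0.00014415589888890583, 8.592690626780192e-06
cpu_infer, gpu_infer = 4.344169994195302e-05, 3.686127879402854e-06

train_gain = gain_percent(cpu_train, gpu_train)
infer_gain = gain_percent(cpu_infer, gpu_infer)
print(round(train_gain), round(infer_gain))
```

At this batch size the GPU trains roughly 17 times faster and infers roughly 12 times faster than the CPU, so both points sit far above the dashed 100% line.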

The collection of the above data concludes this experiment.