# FloydHub vs AWS vs GCE

This notebook visualizes benchmark results across different cloud service instances running Keras with TensorFlow as the backend.

These are the instances we are comparing:

- GPU: 4-core vCPU, NVIDIA K80, 12 GB VRAM, 60 GB RAM
- CPU: 2-core vCPU, 8 GB RAM

Note: plotly does not save a local PNG file; I still have to find a workaround.

#### Table of contents

- [MNIST MLP](#MLP)
- [MNIST CNN](#CNN)
- [CIFAR-10 CNN](#CNN2)
- [IMDB Bi-dir LSTM](#BI)
- [IMDB Fasttext](#FAST)
- [LSTM text gen](#LSTM)

In [6]:
# Dependencies
import plotly
import numpy as np
import pandas as pd
from plotly import __version__
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

print(__version__)  # requires version >= 1.9.0
init_notebook_mode(connected=True)

2.2.1


In [7]:
# Supporting function

def from_csv_to_time(csv_file):
    """Take a CSV metrics file and return the total training time in minutes."""
    df = pd.read_csv(csv_file)
    # Sum the 'elapsed' column (seconds) and convert to minutes
    total_seconds = df['elapsed'].sum()
    # np.asscalar was removed from recent NumPy releases; a plain float works everywhere
    return float(total_seconds) / 60
    
    
def plot_total_time(fh_time, gce_time, title):
    """Plot the total training time comparison.

    fh_time: FloydHub times in minutes, ordered [CPU, GPU].
    gce_time: GCE times in minutes, ordered [GPU].
    """
    trace1 = go.Bar(
        x=['CPU', 'GPU'],
        y=fh_time,
        name='FloydHub'
    )
    # trace2 = go.Bar(
    #     x=['CPU', 'GPU'],
    #     y=[12, 18, 29],
    #     name='AWS'
    # )
    trace2 = go.Bar(
        x=['GPU'],
        y=gce_time,
        name='Google Compute Engine'
    )

    # data = [trace1, trace2, trace3]
    data = [trace1, trace2]
    layout = go.Layout(
        barmode='group',
        title=title,
        yaxis={'title': 'Time in minutes'}
    )


    fig = go.Figure(data=data, layout=layout)
    iplot(fig, filename='grouped-bar')
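
To make the log format that `from_csv_to_time` expects concrete, here is a minimal sketch of the same computation on an in-memory CSV. It uses the standard-library `csv` module instead of pandas so it runs without extra dependencies. The sample rows and the `epoch` and `loss` columns are hypothetical; only the `elapsed` column (taken to be seconds, since the notebook divides by 60 to get minutes) comes from the code above.

```python
import csv
import io

# Hypothetical metrics log: one row per epoch, 'elapsed' in seconds.
SAMPLE_CSV = """epoch,elapsed,loss
0,90.0,0.25
1,60.0,0.10
2,30.0,0.08
"""


def total_minutes(csv_text, column="elapsed"):
    """Sum one numeric column of CSV text and convert seconds to minutes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total_seconds = sum(float(row[column]) for row in reader)
    return total_seconds / 60


sample_minutes = total_minutes(SAMPLE_CSV)
print(sample_minutes)  # 3.0 (180 seconds in total)
```

The real logs under `../logs/` may carry other columns; the helper only needs `elapsed` to be present.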

## 1. MNIST MLP
<a name="MLP"></a>

FC(784, 512)[ReLU][Dropout 0.2] -> FC(512, 512)[ReLU][Dropout 0.2] -> FC(512, 10)[Softmax]
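
As a rough sanity check on the model size, the layer shapes above can be turned into a parameter count. This is a sketch assuming each FC layer is a standard dense layer with a bias term; `dense_params` and `mlp_layers` are illustrative names, not part of the benchmark code.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# (inputs, outputs) for each FC layer listed above
mlp_layers = [(784, 512), (512, 512), (512, 10)]
mlp_params = sum(dense_params(n_in, n_out) for n_in, n_out in mlp_layers)
print(mlp_params)  # 669706
```

Dropout and the activations add no trainable parameters, so they do not appear in the count.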

In [8]:
# Extract info from CSV
fh_mnist_mlp_file = ['../logs/fh/cpu/mnist_mlp_tensorflow.csv',
                     '../logs/fh/gpu/mnist_mlp_tensorflow.csv']

gce_mnist_mlp_file = ['../logs/gce/gpu/mnist_mlp_tensorflow.csv']

fh_time = []
for csv_file in fh_mnist_mlp_file:
    fh_time.append(from_csv_to_time(csv_file))

gce_time = []
for csv_file in gce_mnist_mlp_file:
    gce_time.append(from_csv_to_time(csv_file))
    
# Plot
plot_total_time(fh_time, gce_time, title="MNIST-MLP")

## 2. MNIST ConvNet
<a name="CNN"></a>

Conv(32,3,3)[ReLU] -> Conv(64,3,3)[ReLU] -> MaxPool(2,2)[Dropout 0.25] ->
FC(_, 128)[ReLU][Dropout 0.5] -> FC(128, 10)[Softmax]
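
The `_` in `FC(_, 128)` is the flattened size of the last pooling output. The sketch below works it out under assumptions that are not stated in this notebook: 28x28 single-channel MNIST input, stride-1 convolutions without padding, and non-overlapping 2x2 pooling. With different padding the numbers would change.

```python
def conv_out(size, kernel):
    """Output side length of a stride-1 convolution without padding (assumed)."""
    return size - kernel + 1


side = 28                   # assumed MNIST input: 28x28x1
side = conv_out(side, 3)    # after Conv(32,3,3): 26
side = conv_out(side, 3)    # after Conv(64,3,3): 24
side = side // 2            # after MaxPool(2,2): 12
flat_features = side * side * 64

conv_params = (3 * 3 * 1 * 32 + 32) + (3 * 3 * 32 * 64 + 64)
fc_params = (flat_features * 128 + 128) + (128 * 10 + 10)
cnn_params = conv_params + fc_params
print(flat_features, cnn_params)  # 9216 1199882
```

Under these assumptions almost all of the parameters sit in the first FC layer, so the convolutions dominate compute while the dense layer dominates memory.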

In [9]:
# Extract info from CSV
fh_mnist_cnn_file = ['../logs/fh/cpu/mnist_cnn_tensorflow.csv',
                     '../logs/fh/gpu/mnist_cnn_tensorflow.csv']

gce_mnist_cnn_file = ['../logs/gce/gpu/mnist_cnn_tensorflow.csv']

fh_time = []
for csv_file in fh_mnist_cnn_file:
    fh_time.append(from_csv_to_time(csv_file))

gce_time = []
for csv_file in gce_mnist_cnn_file:
    gce_time.append(from_csv_to_time(csv_file))
    
# Plot
plot_total_time(fh_time, gce_time, title="MNIST-CNN")

## 3. CIFAR-10 CNN
<a name="CNN2"></a>

Conv(32,3,3)[ReLU] -> Conv(32,3,3)[ReLU] -> MaxPool(2,2)[Dropout 0.25] ->
Conv(64,3,3)[ReLU] -> Conv(64,3,3)[ReLU] -> MaxPool(2,2)[Dropout 0.25] ->
FC(_, 512)[ReLU][Dropout 0.5] -> FC(512, 10)[Softmax]

In [11]:
# Extract info from CSV
fh_cifar10_cnn_file = ['../logs/fh/cpu/cifar10_cnn_tensorflow.csv',
                       '../logs/fh/gpu/cifar10_cnn_tensorflow.csv']

gce_cifar10_cnn_file = ['../logs/gce/gpu/cifar10_cnn_tensorflow.csv']

fh_time = []
for csv_file in fh_cifar10_cnn_file:
    fh_time.append(from_csv_to_time(csv_file))

gce_time = []
for csv_file in gce_cifar10_cnn_file:
    gce_time.append(from_csv_to_time(csv_file))
    
# Plot
plot_total_time(fh_time, gce_time, title="CIFAR-10")

## 4. IMDB Bi-dir LSTM
<a name="BI"></a>

Embedding(20000, 128) -> LSTM()[tanh][Dropout 0.5] -> FC(64,1)[Sigmoid]

In [12]:
# Extract info from CSV
fh_imdb_bilstm_file = ['../logs/fh/cpu/imdb_bidirectional_lstm_tensorflow.csv',
                       '../logs/fh/gpu/imdb_bidirectional_lstm_tensorflow.csv']

gce_imdb_bilstm_file = ['../logs/gce/gpu/imdb_bidirectional_lstm_tensorflow.csv']

fh_time = []
for csv_file in fh_imdb_bilstm_file:
    fh_time.append(from_csv_to_time(csv_file))

gce_time = []
for csv_file in gce_imdb_bilstm_file:
    gce_time.append(from_csv_to_time(csv_file))
    
# Plot
plot_total_time(fh_time, gce_time, title="IMDB Bi-dir LSTM")

## 5. IMDB Fasttext
<a name="FAST"></a>

Embedding(20000, 50) -> GlobalAveragePooling1D() -> FC(50, 1)[Sigmoid]

In [13]:
# Extract info from CSV
fh_imdb_fasttext_file = ['../logs/fh/cpu/imdb_fasttext_tensorflow.csv',
                         '../logs/fh/gpu/imdb_fasttext_tensorflow.csv']

gce_imdb_fasttext_file = ['../logs/gce/gpu/imdb_fasttext_tensorflow.csv']

fh_time = []
for csv_file in fh_imdb_fasttext_file:
    fh_time.append(from_csv_to_time(csv_file))

gce_time = []
for csv_file in gce_imdb_fasttext_file:
    gce_time.append(from_csv_to_time(csv_file))
    
# Plot
plot_total_time(fh_time, gce_time, title="IMDB Fasttext")

## 6. LSTM text gen
<a name="LSTM"></a>

LSTM -> FC()[softmax]

In [14]:
# Extract info from CSV
fh_lstm_textgen_file = ['../logs/fh/cpu/lstm_text_generation_tensorflow.csv',
                        '../logs/fh/gpu/lstm_text_generation_tensorflow.csv']

gce_lstm_textgen_file = ['../logs/gce/gpu/lstm_text_generation_tensorflow.csv']

fh_time = []
for csv_file in fh_lstm_textgen_file:
    fh_time.append(from_csv_to_time(csv_file))

gce_time = []
for csv_file in gce_lstm_textgen_file:
    gce_time.append(from_csv_to_time(csv_file))
    
# Plot
plot_total_time(fh_time, gce_time, title="LSTM text generation")

## Summary

At the moment, the FloydHub GPU instance seems slightly faster than GCE's K80. AWS results are coming soon.
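
One way to put a number on "a bit faster" is the ratio of total training times. The sketch below shows the calculation only; the two timings are hypothetical placeholders, not measurements from these logs, and `relative_speedup` is an illustrative helper.

```python
def relative_speedup(baseline_minutes, candidate_minutes):
    """How many times faster the candidate is than the baseline."""
    if candidate_minutes <= 0:
        raise ValueError("candidate_minutes must be positive")
    return baseline_minutes / candidate_minutes


# Hypothetical placeholder timings in minutes, NOT results from this notebook.
gce_gpu_minutes = 12.0
fh_gpu_minutes = 10.0

speedup = relative_speedup(gce_gpu_minutes, fh_gpu_minutes)
print("FloydHub GPU is {:.2f}x the speed of GCE GPU".format(speedup))
```

In the notebook, the inputs would be `gce_time[0]` and `fh_time[1]` from any of the cells above, since `fh_time` is ordered CPU then GPU.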