## Use MLflow to Experiment with a Keras Network Model: Binary Classification for Movie Reviews

This notebook is based on:
* Jules Damji's ([@2twitme](https://twitter.com/2twitme)) blog [How to Use MLflow to Experiment a Keras Network Model: Binary Classification for Movie Reviews](https://databricks.com/blog/2018/08/23/how-to-use-mlflow-to-experiment-a-keras-network-model-binary-classification-for-movie-reviews.html) 
* [jsd-mlflow-examples](https://github.com/dmatrix/jsd-mlflow-examples)
* Francois Chollet's book: [_Deep Learning with Python_](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff)


Prerequisites:
* Refer to [MLflow Quick Start: Model Training and Logging](https://docs.databricks.com/spark/latest/mllib/mlflow.html) to set up an MLflow tracking server on a Linux instance.

In [2]:
import mlflow

# Set global variable for output_dir
output_dir = ""

In [3]:
from keras.datasets import imdb
import numpy as np

"""
Part of this module is derived and borrowed heavily from Francois Chollet's book Deep Learning with Python. The original code
can be found at: https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb

We'll be working with the "IMDB dataset", a set of 50,000 highly polarized reviews from the Internet Movie Database. They are split into 
25,000 reviews for training and 25,000 reviews for testing, each set consisting of 50% negative and 50% positive reviews.
Why do we have these two separate training and test sets? You should never test a machine learning model on the same data 
that you used to train it! Just because a model performs well on its training data doesn't mean that it will perform well 
on data it has never seen, and what you actually care about is your model's performance on new data (since you already 
know the labels of your training data -- obviously you don't need your model to predict those). For instance, it is 
possible that your model could end up merely memorizing a mapping between your training samples and their targets -- 
which would be completely useless for the task of predicting targets for data never seen before.

Just like the MNIST dataset, the IMDB dataset comes packaged with Keras. It has already been preprocessed: the reviews 
(sequences of words) have been turned into sequences of integers, where each integer stands for a specific word in a dictionary.

The following code loads the dataset (when you run it for the first time, about 80MB of data will be downloaded to your 
machine).
"""

class KIMDB_Data_Utils():

    def __init__(self):
        return

    def fetch_imdb_data(self, num_words=10000):
        """
        :param num_words: keep only the top num_words (10,000 by default) most frequently occurring words in the training data. Rare words will be discarded
        :return: two tuples, (train_data, train_labels) and (test_data, test_labels). train_data and test_data are lists of reviews, each review being a list of word indices (encoding a sequence of words). train_labels and test_labels are lists of 0s and 1s, where 0 stands for "negative" and 1 stands for "positive"
        """
        (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=num_words)

        return (train_data, train_labels), (test_data, test_labels)

    def decode_review(self, train_data, index=0):
        """
        Return a decoded review
        :param train_data: list of encoded reviews (each a list of word indices)
        :param index: position in train_data of the review to decode
        :return: the review as a string of words
        """
        # word_index is a dictionary mapping words to an integer index
        word_index = imdb.get_word_index()
        # We reverse it, mapping integer indices to words
        reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
        # We decode the review; note that our indices were offset by 3
        # because 0, 1 and 2 are reserved indices for "padding", "start of sequence", and "unknown".
        decoded_review = ' '.join([reverse_word_index.get(i - 3, '?') for i in train_data[index]])

        return decoded_review

    def prepare_vectorized_sequences(self, sequences, dimension=10000):
        """
        We cannot feed lists of integers into a neural network; we have to turn our lists into tensors. One way is to convert the
        sequences into tensors using NumPy. Here we one-hot encode our lists into vectors of 0s and 1s. For instance, the sequence
        [3, 5] becomes a 10,000-dimensional vector that is all zeros except for indices 3 and 5, which are ones. We can then use a
        Dense layer, capable of handling floating-point vector data, as the first layer in our network.
        :param sequences: list of sequences (lists of word indices) to convert
        :param dimension: length of each one-hot vector (the vocabulary size)
        :return: NumPy array of shape (len(sequences), dimension) holding the one-hot-encoded vectors
        """
        # Create an all-zero matrix of shape (len(sequences), dimension)
        results = np.zeros((len(sequences), dimension))
        for i, sequence in enumerate(sequences):
            results[i, sequence] = 1.  # set specific indices of results[i] to 1s

        return results

    def prepare_vectorized_labels(self, labels):
        """
        Labels are scalars, so we can simply convert them to a NumPy array of type float32
        :param labels: label data
        :return: NumPy array of float32 labels
        """
        return np.asarray(labels).astype('float32')
#
# Test the functions
#
if __name__ == '__main__':
    # create a class handle
    kdata_cls = KIMDB_Data_Utils()
    (train_data, train_labels), (test_data, test_labels) = kdata_cls.fetch_imdb_data(num_words=10000)
    print(train_data[0])
    print(len(train_data))
    decoded = kdata_cls.decode_review(train_data)
    print(decoded)
    x_train = kdata_cls.prepare_vectorized_sequences(train_data)
    x_test = kdata_cls.prepare_vectorized_sequences(test_data)
    print(x_train[0])
    print(x_test[0])
    y_train = kdata_cls.prepare_vectorized_labels(train_labels)
    y_test = kdata_cls.prepare_vectorized_labels(test_labels)
    print(y_train)
    print(y_test)
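
To make the index offset in `decode_review` and the one-hot encoding in `prepare_vectorized_sequences` concrete, here is a minimal, self-contained sketch on toy data. The three-word `toy_word_index`, the helper names `decode` and `vectorize`, and the 8-dimensional vectors are assumptions chosen for illustration; the real IMDB word index and the 10,000-dimensional vectors used in this notebook are much larger.

```python
import numpy as np

# Hypothetical toy vocabulary standing in for imdb.get_word_index()
toy_word_index = {"the": 1, "movie": 2, "great": 3}
reverse_word_index = {index: word for word, index in toy_word_index.items()}

def decode(review):
    # Encoded indices are offset by 3 because 0, 1 and 2 are reserved
    return " ".join(reverse_word_index.get(i - 3, "?") for i in review)

def vectorize(sequences, dimension):
    # One row per sequence; set the columns named by each sequence to 1
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.0
    return results

encoded_review = [1, 4, 5, 6]  # 1 is the reserved "start of sequence" index
decoded = decode(encoded_review)
one_hot = vectorize([[3, 5], encoded_review], dimension=8)

print(decoded)
print(one_hot)
```

Reserved indices have no entry in the reversed dictionary, so they decode to `?`. Note also that a word occurring several times in a review still produces a single 1 in its column: this encoding records which words are present, not how often or in what order.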

In [4]:
from keras import models
from keras import layers

class KModel():

    def __init__(self):
        return

    def build_basic_model(self):

        """
        Build the baseline model with an input layer, one hidden layer, and an output layer, with
        16, 16, and 1 units respectively. The input and hidden layers use relu activations;
        the output layer uses sigmoid
        :return: a Keras network model
        """

        base_model = models.Sequential()
        base_model.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
        base_model.add(layers.Dense(16, activation='relu'))
        base_model.add(layers.Dense(1, activation='sigmoid'))

        return base_model

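    def build_experimental_model(self, hidden_layers=1, output=16, activation='relu'):
        """
        Build an experimental model with a configurable number of hidden layers
        :param hidden_layers: number of hidden Dense layers added after the input layer
        :param output: number of units in the input layer and in each hidden layer
        :param activation: activation function for the input and hidden layers
        :return: a Keras network model
        """
        exp_model = models.Sequential()
        # add the input layer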
        exp_model.add(layers.Dense(output, activation=activation, input_shape=(10000,)))
        # add hidden layers
        for i in range(0, hidden_layers):
            exp_model.add(layers.Dense(output, activation=activation))
        # add output layer
        exp_model.add(layers.Dense(1, activation='sigmoid'))

        return exp_model

if __name__ == '__main__':

    mmaker = KModel()
    model = mmaker.build_basic_model()
    model.summary()

    custom_model = mmaker.build_experimental_model(3, 32, 'tanh')
    custom_model.summary()
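
As a sanity check on what `model.summary()` reports, the number of trainable parameters in a stack of Dense layers can be worked out by hand: each layer holds one weight per input-unit pair plus one bias per unit. The sketch below does this arithmetic in plain Python for the two models built above; `dense_param_count` is a hypothetical helper written for this illustration, not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by the unit count of each layer.
    """
    total = 0
    for inputs, units in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * units + units  # weight matrix plus one bias per unit
    return total

# Baseline model: 10,000 inputs -> 16 -> 16 -> 1
baseline_params = dense_param_count([10000, 16, 16, 1])

# build_experimental_model(3, 32, 'tanh'): a 32-unit input layer,
# three 32-unit hidden layers, and a single output unit
experimental_params = dense_param_count([10000, 32, 32, 32, 32, 1])

print(baseline_params, experimental_params)
```

Almost all of the parameters sit in the first layer, because it connects the 10,000-dimensional input to every unit. Doubling `output` therefore roughly doubles the model size, while adding hidden layers changes it only slightly.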

In [5]:
import matplotlib.pyplot as plt

class KPlot():
    
    def __init__(self):
        return

    def plot_loss_graph(self, history, title):
        """
        Generate a matplotlib graph of the training and validation loss
        :param history: History object returned by model.fit()
        :param title: title of the graph
        :return: the matplotlib figure
        """

        loss = history.history['loss']
        val_loss = history.history['val_loss']

        epochs = range(1, len(loss) + 1)

        fig, ax = plt.subplots()

        # "bo" is for "blue dot"
        ax.plot(epochs, loss, 'bo', label='Training loss')
        # "b" is for "solid blue line"
        ax.plot(epochs, val_loss, 'b', label='Validation loss')
        ax.set_title(title)
        ax.set_xlabel('Epochs')
        ax.set_ylabel('Loss')
        ax.legend()

        return fig

    def plot_accuracy_graph(self, history, title):
        """
        Generate a matplotlib graph of the training and validation accuracy
        :param history: History object returned by model.fit()
        :param title: title of the graph
        :return: the matplotlib figure
        """

        acc = history.history['binary_accuracy']
        val_acc = history.history['val_binary_accuracy']

        epochs = range(1, len(acc) + 1)

        fig, ax = plt.subplots()

        ax.plot(epochs, acc, 'bo', label='Training acc')
        ax.plot(epochs, val_acc, 'b', label='Validation acc')
        ax.set_title(title)
        ax.set_xlabel('Epochs')
        ax.set_ylabel('Accuracy')
        ax.legend()

        return fig

In [6]:
# Simple container class for the experiment parameters
class Params(object):
    def __init__(self, hidden_layers, epochs, output, loss):
        self.hidden_layers = hidden_layers
        self.epochs = epochs
        self.output = output
        self.loss = loss

# Configure args
#args = Params(1, 20, 16, "binary_crossentropy")
#args = Params(3, 30, 16, "binary_crossentropy")
#args = Params(3, 20, 32, "mse")
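
The configurations above switch between two loss functions, `binary_crossentropy` and `mse`. As a rough illustration of how they differ, the sketch below computes both by hand in plain Python on made-up labels and predictions. The function names, the clipping constant `eps`, and the toy values are assumptions for this example; the Keras implementations operate on tensors and may differ in numerical details.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between labels and predictions."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

labels = [1.0, 0.0, 1.0, 0.0]     # hypothetical review labels
confident = [0.9, 0.1, 0.8, 0.2]  # mostly correct predictions
wrong = [0.1, 0.9, 0.2, 0.8]      # confidently wrong predictions

for name, preds in [("confident", confident), ("wrong", wrong)]:
    print(name, binary_crossentropy(labels, preds), mean_squared_error(labels, preds))
```

Both losses are small for the good predictions and larger for the bad ones. The difference is in how hard they punish confident mistakes: the squared error for a single sample can never exceed 1, whereas cross-entropy grows without bound as a wrong prediction approaches 0 or 1.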

In [7]:
import os
import sys
import mlflow
import mlflow.keras
import tensorflow as tf
import tempfile

from keras import optimizers
from keras import metrics

class KTrain():

    def __init__(self):
        return

    def compile_and_fit_model(self, model, x_train, y_train, epochs=20, batch_size=512, loss='binary_crossentropy',
                              optimizer='rmsprop', lr=0.0001, metrics=metrics.binary_accuracy,
                              verbose=1, save_model=0):
        #
        # hold out the first 10,000 samples for validation and train on the rest
        #
        x_val = x_train[:10000]
        partial_x_train = x_train[10000:]

        y_val = y_train[:10000]
        partial_y_train = y_train[10000:]

        if optimizer == 'rmsprop':
            opt = optimizers.RMSprop(lr=lr)
        else:
            # any other optimizer (a name or an instance) is passed to Keras unchanged; lr is ignored
            opt = optimizer
        model.compile(optimizer=opt,
                      loss=loss,
                      metrics=[metrics])
        #
        # fit the model: use part of the training data and use validation for unseen data
        #
        history = model.fit(partial_x_train,
                            partial_y_train,
                            epochs=epochs,
                            batch_size=batch_size,
                            verbose=verbose,
                            validation_data=(x_val, y_val))

        if save_model:
            # requires an active MLflow run; its ID names the output directory
            run_uuid = mlflow.active_run().info.run_uuid
            pathdir = "keras_models/" + run_uuid
            model_dir = self.get_directory_path(pathdir, False)
            self.keras_save_model(model, model_dir)

        return history

    def keras_save_model(self, model, model_dir='/tmp'):
        """
        Save the Keras model locally in the MLflow Keras model format
        :param model: trained Keras model
        :param model_dir: directory to save the model to
        """
        mlflow.keras.save_model(model, model_dir)
        print("Keras model saved locally at %s" % model_dir)

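    def evaulate_model(self, model, x_test, y_test):
        """
        Evaluate the model on unseen test data
        :param model: trained Keras model
        :param x_test: vectorized test sequences
        :param y_test: vectorized test labels
        :return: list containing the test loss followed by the compiled metric values
        """

        return model.evaluate(x_test, y_test)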

    def get_binary_loss(self, hist):
        # final-epoch training loss
        return hist.history['loss'][-1]

    def get_binary_acc(self, hist):
        # final-epoch training accuracy
        return hist.history['binary_accuracy'][-1]

    def get_validation_loss(self, hist):
        # final-epoch validation loss
        return hist.history['val_loss'][-1]

    def get_validation_acc(self, hist):
        # final-epoch validation accuracy
        return hist.history['val_binary_accuracy'][-1]


    def print_metrics(self, hist):

        acc_value = self.get_binary_acc(hist)
        loss_value = self.get_binary_loss(hist)

        val_acc_value = self.get_validation_acc(hist)

        val_loss_value = self.get_validation_loss(hist)

        print("Final metrics: binary_loss:%6.4f" % loss_value)
        print("Final metrics: binary_accuracy:%6.4f" % acc_value)
        print("Final metrics: validation_binary_loss:%6.4f" % val_loss_value)
        print("Final metrics: validation_binary_accuracy:%6.4f" % val_acc_value)

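    def get_directory_path(self, dir_name, create_dir=True):
        """
        Return the absolute path of dir_name under the current working directory
        :param dir_name: directory name, relative to the current working directory
        :param create_dir: if True, create the directory (and any missing parents)
        :return: the absolute directory path
        """
        cwd = os.getcwd()

        dir_path = os.path.join(cwd, dir_name)
        if create_dir:
            os.makedirs(dir_path, exist_ok=True)

        return dir_path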

    def train_models(self, args, base_line=True):
        """
        Train the model and log the parameters, metrics, artifacts, and model to MLflow
        :param args: Params instance holding hidden_layers, epochs, output, and loss
        :param base_line: if True (default), build the baseline model; otherwise build the experimental model from args
        """
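        # NOTE: experiment_id=11 is specific to the original tracking server; use the ID of an existing experiment on yours
        with mlflow.start_run(experiment_id=11):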
            # Create TensorFlow Session
            sess = tf.InteractiveSession()


            #
            # initialize some classes
            #
            kdata_cls = KIMDB_Data_Utils()
            ktrain_cls = KTrain()
            kplot_cls = KPlot()

            #
            # get IMDB Data
            #
            (train_data, train_labels), (test_data, test_labels) = kdata_cls.fetch_imdb_data()

            #
            # prepare and vectorize data
            #
            x_train = kdata_cls.prepare_vectorized_sequences(train_data)
            x_test = kdata_cls.prepare_vectorized_sequences(test_data)

            y_train = kdata_cls.prepare_vectorized_labels(train_labels)
            y_test = kdata_cls.prepare_vectorized_labels(test_labels)

            image_dir = ktrain_cls.get_directory_path("images")
            model_dir = ktrain_cls.get_directory_path("models")

            graph_label_loss = 'Baseline Model: Training and Validation Loss'
            graph_label_acc = 'Baseline Model: Training and Validation Accuracy'
            graph_image_loss_png = os.path.join(image_dir,'baseline_loss.png')
            graph_image_acc_png = os.path.join(image_dir, 'baseline_accuracy.png')

            if not base_line:
                graph_label_loss = 'Experimental: Training and Validation Loss'
                graph_label_acc = 'Experimental Model: Training and Validation Accuracy'
                graph_image_loss_png = os.path.join(image_dir, 'experimental_loss.png')
                graph_image_acc_png = os.path.join(image_dir,'experimental_accuracy.png')

            kmodel = KModel()
            if base_line:
                print("Baseline Model:")
                model = kmodel.build_basic_model()
            else:
                print("Experiment Model:")
                model = kmodel.build_experimental_model(args.hidden_layers, args.output)

            history = ktrain_cls.compile_and_fit_model(model, x_train, y_train, epochs=args.epochs, loss=args.loss)
            model.summary()
            ktrain_cls.print_metrics(history)

            figure_loss = kplot_cls.plot_loss_graph(history, graph_label_loss)
            figure_loss.savefig(graph_image_loss_png )

            figure_acc = kplot_cls.plot_accuracy_graph(history, graph_label_acc)
            figure_acc.savefig(graph_image_acc_png)

            results = ktrain_cls.evaulate_model(model, x_test, y_test)

            print("Test loss and accuracy:")
            print(results)

            print()
            print("Predictions Results:")
            predictions = model.predict(x_test)
            print(predictions)

            # print out current run_uuid
            run_uuid = mlflow.active_run().info.run_uuid
            print("MLflow Run ID: %s" % run_uuid)
            
            # log parameters
            mlflow.log_param("hidden_layers", args.hidden_layers)
            mlflow.log_param("output", args.output)
            mlflow.log_param("epochs", args.epochs)
            mlflow.log_param("loss_function", args.loss)
            
            # log metrics
            mlflow.log_metric("binary_loss", ktrain_cls.get_binary_loss(history))
            mlflow.log_metric("binary_acc",  ktrain_cls.get_binary_acc(history))
            mlflow.log_metric("validation_loss", ktrain_cls.get_validation_loss(history))
            mlflow.log_metric("validation_acc", ktrain_cls.get_validation_acc(history))
            mlflow.log_metric("average_loss", results[0])
            mlflow.log_metric("average_acc", results[1])
            
            # log artifacts
            mlflow.log_artifacts(image_dir, "images")
                        
            # log model
            mlflow.keras.log_model(model, "models")

            # save model locally
            pathdir = "keras_models/" + run_uuid
            model_dir = self.get_directory_path(pathdir, False)
            ktrain_cls.keras_save_model(model, model_dir)

            # Write out the TensorFlow graph
            output_dir = tempfile.mkdtemp()
            print("Writing TensorFlow events locally to %s\n" % output_dir)
            writer = tf.summary.FileWriter(output_dir, graph=sess.graph)
            # close the writer so the event file is flushed to disk before it is uploaded
            writer.close()
            print("Uploading TensorFlow events as a run artifact.")
            mlflow.log_artifacts(output_dir, artifact_path="events")

        print("loss function used:", args.loss)

        # Close TensorFlow Session
        sess.close()

  
def runReviews(args, flag):
    
    if flag:
        print("Using Default Baseline parameters")
    else:
        print("Using Experimental parameters")

    print("hidden_layers:", args.hidden_layers)
    print("output:", args.output)
    print("epochs:", args.epochs)
    print("loss:", args.loss)

    # train_models() returns nothing, so there is no result to keep
    KTrain().train_models(args, flag)
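
Two small ideas carry most of the bookkeeping in `KTrain`: holding out the first slice of the training data for validation, and reading the last-epoch value of each metric from the training history. The sketch below reproduces both on toy data with NumPy only. `holdout_split`, the 10-sample arrays, and the `toy_history` dictionary (a stand-in for `history.history`) are assumptions made for this illustration.

```python
import numpy as np

def holdout_split(x, y, n_val):
    """Reserve the first n_val samples for validation and train on the rest."""
    return (x[n_val:], y[n_val:]), (x[:n_val], y[:n_val])

# Toy stand-ins for x_train and y_train (10 samples, 4 features each)
x = np.arange(40, dtype="float32").reshape(10, 4)
y = np.array([0, 1] * 5, dtype="float32")

(partial_x, partial_y), (x_val, y_val) = holdout_split(x, y, n_val=3)
print(partial_x.shape, x_val.shape)

# A plain dict standing in for history.history, with one value per epoch
toy_history = {"loss": [0.60, 0.41, 0.33], "val_loss": [0.55, 0.45, 0.47]}
final_metrics = {name: values[-1] for name, values in toy_history.items()}
print(final_metrics)
```

In the toy history the training loss keeps falling while the validation loss turns upward in the last epoch. That divergence is the usual sign of overfitting, and it is why the notebook logs validation metrics separately from training metrics.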

In [8]:
args = Params(1, 20, 16, "binary_crossentropy")
runReviews(args, True)

In [9]:
args = Params(3, 30, 16, "binary_crossentropy")
runReviews(args, False)

In [10]:
args = Params(3, 30, 32, "mse")
runReviews(args, False)

In [11]:
# Update this manually: use the directory printed after `Writing TensorFlow events locally to...` in the output of the run you want to inspect
output_dir = "/tmp/tmp2npaqjhu"
dbutils.tensorboard.start(output_dir)

In [12]:
dbutils.tensorboard.stop()