## From interactive programming to production-ready code

### Imports

In [1]:
from luigi.parameter import IntParameter, Parameter
from luigi import LocalTarget, Task
import luigi
import json
import keras
from keras.models import load_model

Using TensorFlow backend.


## Task No.1: Check for existing dataset

The provided Docker image already contains the dataset. This task just checks that everything needed is in the right place.

*Output*: A folder containing the images

In [2]:
class DatasetExists(Task):
    
    dataset_name = Parameter(default="../../fruits")

    def output(self):
        return LocalTarget("%s" % self.dataset_name)
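
`DatasetExists` has no `run()` method, so it counts as complete as soon as its output target, the dataset folder, exists. A stricter check could also look for the `Training` and `Test` subfolders that the later tasks read from. The following is a minimal sketch using only the standard library; `missing_subfolders` is a hypothetical helper and not part of Luigi or of the task above.

```python
import os
import tempfile


def missing_subfolders(dataset_path, required=("Training", "Test")):
    """Return the required subfolders that are absent under dataset_path.

    The names "Training" and "Test" are taken from the later tasks;
    the helper itself is an illustration only.
    """
    return [name for name in required
            if not os.path.isdir(os.path.join(dataset_path, name))]


# Demonstration on a throwaway directory instead of the real dataset.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "Training"))
    print(missing_subfolders(root))  # only "Test" is missing here
```

One possible way to wire this in would be to override the task's `complete()` method so that it returns `not missing_subfolders(self.dataset_name)`.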

In [3]:
luigi.build([DatasetExists()], local_scheduler=True, no_lock=True)

DEBUG: Checking if DatasetExists(dataset_name=../../fruits) is complete
INFO: Informed scheduler that task   DatasetExists_______fruits_7c796e5355   has status   DONE
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
INFO: Worker Worker(salt=673766696, workers=1, host=dd55a1764ecf, username=root, pid=907) was stopped. Shutting down Keep-Alive thread
INFO: 
===== Luigi Execution Summary =====

Scheduled 1 tasks of which:
* 1 complete ones were encountered:
    - 1 DatasetExists(dataset_name=../../fruits)

Did not run any tasks
This progress looks :) because there were no failed tasks or missing dependencies

===== Luigi Execution Summary =====



True

## Task No.2: Create a preprocessing configuration

The configuration for the deep-learning model is essentially the Keras ImageDataGenerator. For the sake of simplicity, the generator settings themselves are not parameterized in this task, but it conveys the idea of how that could be done.

*Input*: Nothing required <br>
*Output*: A pickled ImageDataGenerator

In [4]:
class Configure(Task):
    
    config_name = Parameter(default="standard")

    def output(self):
        return LocalTarget("configurations/%s.pickle" % self.config_name, format=luigi.format.Nop)

    def run(self):
        import pickle
        self.output().makedirs()
        generator = keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
        with self.output().open("wb") as f:
            pickle.dump(generator, f)
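
To make the idea of parameterizing the configuration more concrete, one option would be to map each `config_name` to a set of `ImageDataGenerator` keyword arguments. This is only a sketch: the `GENERATOR_KWARGS` table, the `"augmented"` entry and the `generator_kwargs` helper are assumptions for illustration and do not appear in the task above.

```python
# Hypothetical lookup table: config name -> ImageDataGenerator keyword arguments.
GENERATOR_KWARGS = {
    # Matches the generator created in Configure.run().
    "standard": {"rescale": 1.0 / 255},
    # Made-up example of a second configuration with light augmentation.
    "augmented": {"rescale": 1.0 / 255,
                  "horizontal_flip": True,
                  "rotation_range": 15},
}


def generator_kwargs(config_name):
    """Return a copy of the keyword arguments registered for config_name."""
    try:
        return dict(GENERATOR_KWARGS[config_name])
    except KeyError:
        raise ValueError("unknown config_name: %r" % config_name)


print(sorted(generator_kwargs("augmented")))
```

Inside `run()`, the generator could then be built with `keras.preprocessing.image.ImageDataGenerator(**generator_kwargs(self.config_name))`, so that each configuration name produces its own pickle file.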

In [5]:
luigi.build([Configure()], local_scheduler=True, no_lock=True)

DEBUG: Checking if Configure(config_name=standard) is complete
INFO: Informed scheduler that task   Configure_standard_6a58a4195c   has status   DONE
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
INFO: Worker Worker(salt=417099903, workers=1, host=dd55a1764ecf, username=root, pid=907) was stopped. Shutting down Keep-Alive thread
INFO: 
===== Luigi Execution Summary =====

Scheduled 1 tasks of which:
* 1 complete ones were encountered:
    - 1 Configure(config_name=standard)

Did not run any tasks
This progress looks :) because there were no failed tasks or missing dependencies

===== Luigi Execution Summary =====



True

## Task No.3: Run the baseline validation

This task runs the baseline validation and saves the result to a file. As before, flexibility can be greatly enhanced by also versioning the baseline validation.

*Input*: DatasetExists, Configure <br>
*Output*: A JSON-File containing the baseline accuracy

**This should be implemented by yourself ;-)**

In [6]:
class BaselineValidation(Task):
    
    dataset_name = Parameter(default="../../fruits")
    config_name = Parameter(default="standard")

    validation_set = "Test"
    img_height = 100
    img_width = 100
    baseline_name = "eval.json"

    def requires(self):
        yield DatasetExists(self.dataset_name)
        yield Configure(self.config_name)

    def output(self):
        pass

    def run(self):
        pass
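
The notebook leaves open what the baseline actually is. As a starting point, the sketch below assumes a majority-class baseline: the accuracy obtained by always predicting the most frequent class in the validation folder. It relies on the one-subfolder-per-class layout that `flow_from_directory` expects. The helper names and the `"accuracy"` JSON key are hypothetical choices, not something the exercise prescribes.

```python
import json
import os
import tempfile


def count_images_per_class(split_path):
    """Count the files in each class subfolder of e.g. <dataset>/Test."""
    counts = {}
    for entry in sorted(os.listdir(split_path)):
        class_dir = os.path.join(split_path, entry)
        if os.path.isdir(class_dir):
            counts[entry] = len(os.listdir(class_dir))
    return counts


def majority_class_accuracy(class_counts):
    """Accuracy of always predicting the most frequent class."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("no samples found")
    return max(class_counts.values()) / total


def write_baseline(split_path, out_path):
    """Compute the assumed baseline and store it as JSON."""
    accuracy = majority_class_accuracy(count_images_per_class(split_path))
    with open(out_path, "w") as f:
        json.dump({"accuracy": accuracy}, f)
    return accuracy


# Demonstration with made-up counts rather than the real dataset.
print(majority_class_accuracy({"Apple": 3, "Banana": 1}))
```

In the task, `output()` could return a `LocalTarget` built from `baseline_name`, and `run()` could call `write_baseline` with the validation folder and `self.output().path`.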

## Task No.4: Train the deep learning model

Task No.4 trains a Keras model and persists it to the filesystem.

*Input*: DatasetExists, Configure <br>
*Output*: A .h5 file representing the model architecture and its weights

In [7]:
import pickle

def build_generator(config_path, 
                    dataset_path, 
                    data_set_type,
                    img_height=100, img_width=100):
    
    with open(config_path, "rb") as f:
        generator = pickle.load(f)
    path = "%s/%s" % (dataset_path, data_set_type)
    return generator.flow_from_directory(path,
                                         target_size=(img_height, 
                                                      img_width),
                                         color_mode='rgb')

def define_model(input_shape, num_classes):
    model = keras.models.Sequential()
    model.add(keras.layers.Conv2D(filters=4, kernel_size=(2, 2), strides=1,
                                  activation='relu', input_shape=input_shape))
    model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(keras.layers.Conv2D(filters=4, kernel_size=(2, 2), strides=1, activation='relu'))
    model.add(keras.layers.BatchNormalization())
    model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(keras.layers.Dropout(rate=0.25))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units=8, activation='relu'))
    model.add(keras.layers.Dropout(rate=0.5))
    model.add(keras.layers.Dense(units=num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer=keras.optimizers.Adadelta(),
                  metrics=['accuracy'])
    return model
    
class TrainModel(Task):
    
    dataset_name = Parameter(default="../../fruits")
    config_name = Parameter(default="standard")
    model_version = IntParameter(default=1)
    model_name = Parameter(default="fruits")
    
    training_set = "Training"
    epochs = 2

    def requires(self):
        yield DatasetExists(self.dataset_name)
        yield Configure(self.config_name)

    def output(self):
        return LocalTarget("model/%d/%s.h5" % (self.model_version, self.model_name))

    def run(self):
        self.output().makedirs()
        dataset = self.input()[0].path
        config = self.input()[1].path
        training_data = build_generator(config, dataset, self.training_set)
        input_shape = training_data.image_shape
        num_classes = len(training_data.class_indices)
        model = define_model(input_shape, num_classes)
        steps_per_epoch = training_data.samples // training_data.batch_size
        model.fit_generator(training_data,
                            steps_per_epoch=steps_per_epoch,
                            epochs=self.epochs,
                            verbose=2)
        model.save(self.output().path)
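
The `steps_per_epoch` line in `run()` uses floor division, so it counts only the full batches that fit into the training set. The short sketch below contrasts this with rounding up; the sample count is made up, and the batch size of 32 is the default of `flow_from_directory`, which `build_generator` does not override.

```python
import math


def steps_floor(samples, batch_size):
    """Number of full batches, as computed in TrainModel.run()."""
    return samples // batch_size


def steps_ceil(samples, batch_size):
    """Number of batches needed to cover every sample once."""
    return math.ceil(samples / batch_size)


samples, batch_size = 1000, 32  # illustrative numbers, not the real dataset
print(steps_floor(samples, batch_size), steps_ceil(samples, batch_size))
```

The two values differ only when the number of samples is not a multiple of the batch size.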

In [8]:
luigi.build([TrainModel()], local_scheduler=True, no_lock=True)

DEBUG: Checking if TrainModel(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits) is complete
INFO: Informed scheduler that task   TrainModel_standard_______fruits_fruits_70c43a64fd   has status   DONE
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
INFO: Worker Worker(salt=626370923, workers=1, host=dd55a1764ecf, username=root, pid=907) was stopped. Shutting down Keep-Alive thread
INFO: 
===== Luigi Execution Summary =====

Scheduled 1 tasks of which:
* 1 complete ones were encountered:
    - 1 TrainModel(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits)

Did not run any tasks
This progress looks :) because there were no failed tasks or missing dependencies

===== Luigi Execution Summary =====



True

## Task No.5: Evaluate the model

The last task evaluates our model and, if it surpasses the baseline accuracy, saves the evaluation results to the filesystem. Let the task crash if the model does not perform well enough. It's worth an exception!

*Input*: DatasetExists, Configure, TrainModel, BaselineValidation<br>
*Output*: A JSON file containing the evaluation results

**This should be implemented by yourself ;-)**

In [9]:
class Evaluate(Task):
    
    dataset_name = Parameter(default="../../fruits")
    config_name = Parameter(default="standard")
    model_version = IntParameter(default=1)
    model_name = Parameter(default="fruits")

    validation_set = "Test"

    def requires(self):
        yield TrainModel(self.dataset_name, 
                         self.config_name,
                         self.model_version,
                         self.model_name)
        yield BaselineValidation(self.dataset_name,
                                 self.config_name)
        yield DatasetExists(self.dataset_name)
        yield Configure(self.config_name)

    def output(self):
        pass

    def run(self):
        pass
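
The core of this task is the comparison against the baseline. The sketch below shows one way to express it, assuming the baseline file is a JSON object with an `"accuracy"` key. That key, the `ModelBelowBaseline` exception and the `save_if_better` helper are hypothetical names chosen for illustration. Computing `model_accuracy` itself, for example with the Keras model and the validation generator, is left out.

```python
import json
import os
import tempfile


class ModelBelowBaseline(Exception):
    """Hypothetical exception for a model that does not beat the baseline."""


def save_if_better(model_accuracy, baseline_path, result_path):
    """Write the evaluation result only if the model surpasses the baseline."""
    with open(baseline_path) as f:
        baseline = json.load(f)["accuracy"]  # assumed file layout
    if model_accuracy <= baseline:
        # Raising before anything is written leaves no output file behind.
        raise ModelBelowBaseline(
            "accuracy %.3f does not surpass baseline %.3f"
            % (model_accuracy, baseline))
    with open(result_path, "w") as f:
        json.dump({"accuracy": model_accuracy, "baseline": baseline}, f)
    return result_path


# Demonstration with made-up numbers in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    baseline_file = os.path.join(tmp, "eval.json")
    with open(baseline_file, "w") as f:
        json.dump({"accuracy": 0.5}, f)
    result_file = save_if_better(0.8, baseline_file,
                                 os.path.join(tmp, "result.json"))
    print(os.path.basename(result_file))
```

Because the exception is raised before the result file is written, a task built this way would fail in Luigi and its output would stay missing, so downstream tasks would not run.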

## Surprise Task No.6: Deploy to TensorFlow-Serving

The Keras model is performing well. Let's deploy it to TensorFlow Serving.

It can be loaded with TensorFlow Serving using the following command:

    tensorflow_model_server --model_name="keras_model" --model_base_path="serving/keras_model"

*Input*: TrainModel, Evaluate <br>
*Output*: The TensorFlow graph and its weights

In [10]:
import tensorflow as tf
import keras
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import builder

class Export(Task):
    dataset_name = Parameter(default="../../fruits")
    config_name = Parameter(default="standard")
    model_version = IntParameter(default=1)
    model_name = Parameter(default="fruits")

    def requires(self):
        yield TrainModel(self.dataset_name,
                         self.config_name,
                         self.model_version,
                         self.model_name)

    def output(self):
        return LocalTarget("../6-models/%s/%d" % (self.model_name,
                                                  self.model_version))

    def run(self):
        self.output().makedirs()
        model_path = self.input()[0].path
        model = keras.models.load_model(model_path)
        tensor_info_input = tf.saved_model.utils.build_tensor_info(model.input)
        tensor_info_output = tf.saved_model.utils.build_tensor_info(model.output)
        prediction_signature = (
            tf.saved_model.signature_def_utils.build_signature_def(
                inputs={'input': tensor_info_input},
                outputs={'prediction': tensor_info_output},
                method_name=signature_constants.PREDICT_METHOD_NAME))

        export_path = self.output().path
        tf_builder = builder.SavedModelBuilder(export_path)
        with keras.backend.get_session() as sess:
            tf_builder.add_meta_graph_and_variables(
                sess=sess,
                tags=[tag_constants.SERVING],
                signature_def_map={
                    signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature
                }
            )
            tf_builder.save()


In [13]:
luigi.build([Export()], local_scheduler=True, no_lock=True)

DEBUG: Checking if Export(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits) is complete
DEBUG: Checking if TrainModel(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits) is complete
INFO: Informed scheduler that task   Export_standard_______fruits_fruits_70c43a64fd   has status   PENDING
INFO: Informed scheduler that task   TrainModel_standard_______fruits_fruits_70c43a64fd   has status   DONE
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Pending tasks: 1
INFO: [pid 907] Worker Worker(salt=171444472, workers=1, host=dd55a1764ecf, username=root, pid=907) running   Export(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits)


INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: b'../6-models/fruits/1/saved_model.pb'


INFO: [pid 907] Worker Worker(salt=171444472, workers=1, host=dd55a1764ecf, username=root, pid=907) done      Export(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits)
DEBUG: 1 running tasks, waiting for next task to finish
INFO: Informed scheduler that task   Export_standard_______fruits_fruits_70c43a64fd   has status   DONE
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
INFO: Worker Worker(salt=171444472, workers=1, host=dd55a1764ecf, username=root, pid=907) was stopped. Shutting down Keep-Alive thread
INFO: 
===== Luigi Execution Summary =====

Scheduled 2 tasks of which:
* 1 complete ones were encountered:
    - 1 TrainModel(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits)
* 1 ran successfully:
    - 1 Export(dataset_name=../../fruits, config_name=standard, model_version=1, model_name=fruits)

This progress looks :) because there were no failed tasks or missing 

True
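
The export path ends in the model version (`../6-models/fruits/1` in the log above), which matches the layout TensorFlow Serving expects: numeric version subfolders below the model base path, of which the highest is served by default. The sketch below mimics that selection with the standard library only; `latest_version` is a hypothetical helper, and the directory names in the demonstration are made up.

```python
import os
import tempfile


def latest_version(model_base_path):
    """Return the highest numeric version folder, or None if there is none."""
    versions = [int(name) for name in os.listdir(model_base_path)
                if name.isdigit()
                and os.path.isdir(os.path.join(model_base_path, name))]
    return max(versions) if versions else None


# Demonstration on a throwaway directory with two exported versions.
with tempfile.TemporaryDirectory() as base:
    for name in ("1", "2"):
        os.mkdir(os.path.join(base, name))
    print(latest_version(base))
```

Running `Export` with a higher `model_version` would add a new subfolder next to the existing one instead of overwriting it.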