# TensorFlow-Keras Classifier and Transfer Learning Using Serverless Functions

```mlrun``` is an open-source Python package that provides a framework for running machine learning tasks transparently in multiple, scalable, runtime environments.  ```mlrun``` provides tracking of code, metadata, inputs, outputs and the results of machine learning pipelines. 

In this notebook we'll compose a pipeline that deploys a classifier model and uses it as the input to an evaluation, inference, or retraining step.


1. [Setup](#Setup)
2. [Utilities](#utilities)
3. [Components](#Components)
     * [acquire](#acquire)
     * [transform](#transform)
     * [tsdb-ingest](#tsdb-ingest)
     * [tsdb-query](#tsdb-query)
     * [split](#split)
     * [train](#train)
     * [test](#test)
     * [feature-map](#feature-map)
     * [retrain](#transfer%20learning)
4. [Test](#testing)
5. [Compose](#image)
6. [Run](#run)

## Setup

The following will reinstall the latest development version of ```mlrun```:

In [8]:
# !pip uninstall -y mlrun
# !pip install git+https://github.com/mlrun/mlrun.git@development

Uninstalling mlrun-0.3.3:
  Successfully uninstalled mlrun-0.3.3
Collecting git+https://github.com/mlrun/mlrun.git@development
  Cloning https://github.com/mlrun/mlrun.git (to revision development) to /tmp/pip-req-build-85ncuvz0
Branch development set up to track remote branch development from origin.
Switched to a new branch 'development'
Collecting sqlalchemy==1.3.11 (from mlrun==0.3.3)
  Downloading https://files.pythonhosted.org/packages/34/5c/0e1d7ad0ca52544bb12f9cb8d5cc454af45821c92160ffedd38db0a317f6/SQLAlchemy-1.3.11.tar.gz (6.0MB)
Building wheels for collected packages: mlrun, sqlalchemy
  Running setup.py bdist_wheel for mlrun ... done
  Stored in directory: /tmp/pip-ephem-wheel-cache-j28v02iw/wheels/ce/82/2f/a98d204a5dd1b27fa2a685cd11e705f1690d8f7ce2d8c08c9a
  Running setup.py bdist_wheel for sqlalchemy ... done
  Stored in directory: /igz/.cache/pip/wheels/a3/67/7d/6c4110

Install the KubeFlow pipelines package ```kfp```. For more information see the **[KubeFlow documentation on nuclio](https://www.kubeflow.org/docs/components/misc/nuclio/)** and  **[Kubeflow pipelines and nuclio](https://github.com/kubeflow/pipelines/tree/master/components/nuclio)**. To log the estimated machine learning models we'll use ```joblib```'s [```dump``` and ```load```](https://joblib.readthedocs.io/en/latest/persistence.html#persistence).

In [9]:
# !pip install -U kfp joblib seaborn tensorflow==1.14 keras

Requirement already up-to-date: kfp in /User/.pythonlibs/lib/python3.6/site-packages (0.1.37)
Requirement already up-to-date: joblib in /User/.pythonlibs/lib/python3.6/site-packages (0.14.1)
Requirement already up-to-date: seaborn in /conda/lib/python3.6/site-packages (0.9.0)
Requirement already up-to-date: tensorflow==1.14 in /User/.pythonlibs/lib/python3.6/site-packages (1.14.0)
Requirement already up-to-date: keras in /User/.pythonlibs/lib/python3.6/site-packages (2.3.1)


<a id="nuclio-code-section"></a>
# Nuclio code section

### nuclio's _**ignore**_ notation

You'll write all the code that gets packaged for execution between the tags ```# nuclio: ignore```, meaning ignore all the code here and above, and ```# nuclio: end-code```, meaning ignore everything after this annotation.  Methods in this code section can be called separately if designed as such (```acquire```, ```split```, ```train```, ```test```), or, as you'll discover below, they are most often "chained" together to form a pipeline where the output of one stage serves as the input to the next. The **[docs](https://github.com/nuclio/nuclio-jupyter#creating-and-debugging-functions-using-nuclio-magic)** also suggest another approach: we can use ```# nuclio: start``` at the first relevant code cell instead of marking all the cells above with ```# nuclio: ignore```.

See the **[nuclio-jupyter](https://github.com/nuclio/nuclio-jupyter)** repo for further information on these and many other **[nuclio magic commands](https://github.com/nuclio/nuclio-jupyter#creating-and-debugging-functions-using-nuclio-magic)** that make it easy to transform a Jupyter notebook environment into a platform for developing production-quality, machine learning systems.

The ```nuclio-jupyter``` package provides methods for automatically generating and deploying nuclio serverless functions from code, repositories or Jupyter notebooks. **_If you have never run nuclio functions in your notebooks, please uncomment and run the following_**: ```!pip install nuclio-jupyter```

The following two lines _**should be in the same cell**_ and mark the start of your machine learning coding section:

In [10]:
# nuclio: ignore
import nuclio 

<a id="function-dependencies"></a>
### function dependencies

The installs made in the section **[Setup](#Setup)** covered the Jupyter environment within which this notebook runs.  However, we need to ensure that all the dependencies our nuclio function relies upon (such as ```matplotlib```, ```sklearn```, ```lightgbm```), will be available when that code is wrapped up into a nuclio function _**on some presently unknown runtime**_.   Within the nuclio code section we can ensure these dependencies get built into the function with the ```%nuclio cmd``` magic command.

In [11]:
%nuclio cmd -c pip install -U matplotlib seaborn tensorflow==1.14.0 keras scikit-learn pandas numpy joblib pyarrow

We'll use a standard base image here; however, the build step can be shortened by preparing images with pre-installed packages.

In [12]:
%nuclio config spec.build.baseImage = "python:3.6-jessie"

%nuclio: setting spec.build.baseImage to 'python:3.6-jessie'


In [13]:
import warnings
warnings.simplefilter(action="ignore", category=FutureWarning)

In [14]:
import time
from io import BytesIO
from os import path, makedirs, getenv
from pathlib import Path
from urllib.request import urlretrieve
from typing import IO, AnyStr, TypeVar, Union, List, Tuple

import keras
from keras.models import Sequential
from keras.layers import Dense

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import (classification_report, 
                             confusion_matrix, 
                             accuracy_score,
                             f1_score,
                             precision_score,
                             recall_score,
                             roc_curve)
import joblib

import matplotlib.pyplot as plt
from matplotlib.figure import Figure
import matplotlib.lines as mlines
from mpl_toolkits.mplot3d import Axes3D

import seaborn as sns

import pandas as pd
import numpy as np
import pyarrow.parquet as pq
import pyarrow as pa
from pyarrow import Table
import v3io_frames as v3f

from mlrun.artifacts import ChartArtifact, TableArtifact, PlotArtifact
from mlrun.execution import MLClientCtx
from mlrun.datastore import DataItem

Using TensorFlow backend.


In [2]:
target_path = "/User/projects/paysim/data"
src_path = target_path
srcname = "PS_20174392719_1491204439457_log.csv.zip"
destname = "paysim.parquet"

### Components

These are the methods that we'll be using to compose a pipeline.

#### **tables**

In [None]:
def get_context_table(ctxtable: DataItem) -> pd.DataFrame:
    """Get table from context.
    
    Convenience function to retrieve a table via a blob.
    
    :param ctxtable: The table saved in the context, 
            which needs to be deserialized.
        
    In this demonstration tables are stored in parquet format and passed
    between steps as blobs.  We could also pass folder or file names
    in the context, which may be faster.
    
    Returns a pandas DataFrame.
    """
    blob = BytesIO(ctxtable.get())
    return pd.read_parquet(blob, engine="pyarrow")

In [None]:
def log_context_table(
    context: MLClientCtx,
    target: str, 
    name: str,
    table: pd.DataFrame
) -> None:
    """Log a table through the context.
    
    The table is written as a parquet file, and its target
    path is saved in the context.
    
    :param context: The context.
    :param target: Location (folder) of our DataItem.
    :param name: Name of the object in the context.
    :param table: The object we wish to store.
    """
    context.logger.info(f"writing {name}")
    pq.write_table(
        pa.Table.from_pandas(table),
        path.join(target, name))    
    context.log_artifact(name, target_path=path.join(target, name))

#### **plots**

In [None]:
def plot_time_density(
    context: MLClientCtx,
    artifact_key: str,
    time_series: np.ndarray,
    title: str = "Time Series",
    xlabel: str = "time",
    ylabel: str = "density",
    figsize: Tuple[int, int] = (12, 4), # pass a matplotlib plot definition class
    color: str = "#756bb1" # could be Union some color class...
) -> None:
    """Plot density of data points per time interval.

    :param context: The context.
    :param artifact_key: The plot's key in the context.
    :param time_series: The time-series whose density we wish to plot.
    :param title: Plot title.
    :param xlabel: X-axis label.
    :param ylabel: Y-axis label.
    :param figsize: Matplotlib figsize.
    :param color: Matplotlib color of the density curve.
    """
    plt.figure(figsize=figsize)
    # plot the series passed in, not a notebook-level variable
    sns.distplot(time_series, color=color)
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    context.log_artifact(PlotArtifact(artifact_key, body=plt.gcf()))

In [None]:
def plot_validation(
    context: MLClientCtx,
    artifact_key: str,
    train_loss: np.ndarray, 
    valid_loss: np.ndarray, 
    title : str = "training validation results",
    xlabel: str = "epoch",
    ylabel: str = "logloss",
    fmt: str = "png"):
    """Plot train and validation loss curves.
    
    These curves represent the training round losses from the training
    and validation sets. The actual type of loss curve depends on the 
    algorithm and selected metrics.

    :param context: The context.
    :param artifact_key: The plot's key in the context.
    :param train_loss: Vector of loss metric estimates for training set.
    :param valid_loss: Vector of loss metric estimates for validation set.
    :param title: Plot title.
    :param xlabel: X-axis label.
    :param ylabel: Y-axis label.
    :param fmt: The file image format (png, jpg, ...), and the saved file extension.
    """
    plt.plot(train_loss)
    plt.plot(valid_loss)
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.legend(["train", "valid"])
    context.log_artifact(PlotArtifact(artifact_key, body=plt.gcf()))

In [None]:
def plot_roc(
    context: MLClientCtx,
    artifact_key: str,
    ytest: np.ndarray,
    ypred: np.ndarray,
    title: str = "roc curve",
    xlabel: str = "false positive rate",
    ylabel: str = "true positive rate",
    fmt: str = "png"
) -> None:
    """Plot an ROC curve.
    
    :param context: The context.
    :param artifact_key: The plot's key in the context.
    :param ytest: Ground-truth labels.
    :param ypred: Predictions given a test sample and 
                an estimated model.
    :param title: Plot title.
    :param xlabel: X-axis label (not tick labels).
    :param ylabel: Y-axis label (not tick labels).
    :param fmt: The file image format (png, jpg, ...), and 
                the saved file extension.
    """
    fpr_xg, tpr_xg, _ = roc_curve(ytest, ypred)

    plt.plot([0, 1], [0, 1], "k--")
    plt.plot(fpr_xg, tpr_xg, label="tf-keras")
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)
    plt.legend(loc="best")
    context.log_artifact(PlotArtifact(artifact_key, body=plt.gcf()))
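
As a rough illustration of what the curve drawn by ```plot_roc``` represents, the sketch below computes ROC points in plain Python: each distinct score is used as a threshold, and the false-positive and true-positive rates are counted at that threshold. It is a simplified stand-in, not scikit-learn's ```roc_curve``` (which also returns the thresholds and, by default, drops some intermediate points). The function name ```roc_points``` and the sample data are made up for this example.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (fpr, tpr) pairs, from the strictest threshold to the loosest."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    points = [(0.0, 0.0)]  # a threshold above every score predicts no positives
    for threshold in sorted(set(scores), reverse=True):
        # count samples scored at or above the threshold, split by true label
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# hypothetical labels and scores, for illustration only
points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
for fpr, tpr in points:
    print(f"fpr={fpr:.2f} tpr={tpr:.2f}")
```

The curve always starts at (0, 0) and ends at (1, 1); the diagonal drawn by ```plt.plot([0, 1], [0, 1], "k--")``` is the reference for a classifier that scores at random.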

In [None]:
def plot_feature_importances(
    context: MLClientCtx,
    artifact_key: str,
    feature_imps: pd.DataFrame,
    title: str = "feature importances",
    xlabel: str = "freq",
    ylabel: str = "feature",
    fmt: str = "png"
) -> None:
    """Generate Feature Importances Chart.
    
    :param context: The context.
    :param artifact_key: The plot's key in the context.
    :param feature_imps: Feature importances table, with the columns
                named by xlabel and ylabel.
    :param title: Plot title.
    :param xlabel: X-axis label (not tick labels).
    :param ylabel: Y-axis label (not tick labels).
    :param fmt: The file image format (png, jpg, ...), and 
                the saved file extension.
    """
    plt.figure(figsize=(20, 10))
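    # xlabel and ylabel double as the column names of feature_imps
    sns.barplot(x=xlabel, y=ylabel, data=feature_imps)
    plt.title(title)
    plt.tight_layout()
    fig = plt.gcf()
    
    # pass the figure itself, not the plt.gcf function object
    context.log_artifact(PlotArtifact(artifact_key, body=fig))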

#### **files**

This function would be used to acquire a remote archive (csv, tar, zip,...) and deposit it as a parquet file for performance. No further transformation is undertaken here.

In [None]:
def arc_to_parquet(
    context: MLClientCtx,
    archive_url: Union[str, Path, IO[AnyStr]],
    header: Union[None, List[str]],
    name: str = "original",
    target_path: str = "content",
    chunksize: int = 10_000
) -> None:
    """Open a file/object archive and save as a parquet file.
    
    :param context: The context.
    :param archive_url: Any valid string path consistent with the path variable
            of pandas.read_csv, including strings as file paths or urls, 
            pathlib.Path objects, etc.
    :param header: Column names.
    :param name: (default "original"). Destination file name; ".parquet"
            is appended when missing.
    :param target_path: Destination folder of table.
    :param chunksize: (default=10_000). Number of rows retrieved per iteration. 
    """
    makedirs(target_path, exist_ok=True)
    context.logger.info("verified directories")
   
    if not name.endswith(".parquet"):
        name += ".parquet"
    dest_path = path.join(target_path , name)
    
    if not path.isfile(dest_path):
        context.logger.info("destination file does not exist, downloading")
        pqwriter = None
        for i, df in enumerate(pd.read_csv(archive_url, chunksize=chunksize, names=header)):
            table = pa.Table.from_pandas(df)
            if i == 0:
                pqwriter = pq.ParquetWriter(dest_path, table.schema)
            pqwriter.write_table(table)

        if pqwriter:
            pqwriter.close()

    context.logger.info(f"saved table to {dest_path}")
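
The chunked read in ```arc_to_parquet``` keeps memory use bounded by handling ```chunksize``` rows at a time. A minimal sketch of that idea using only the standard library is shown below; the helper ```iter_chunks``` and the in-memory CSV are hypothetical stand-ins for ```pd.read_csv(..., chunksize=...)``` and the remote archive, and nothing is written to parquet here.

```python
import csv
import io
from itertools import islice
from typing import Iterator, List


def iter_chunks(rows: Iterator[List[str]],
                chunksize: int) -> Iterator[List[List[str]]]:
    """Yield successive lists of at most `chunksize` rows."""
    while True:
        chunk = list(islice(rows, chunksize))
        if not chunk:
            return
        yield chunk


# hypothetical in-memory CSV with 25 rows, standing in for the archive
sample_csv = io.StringIO("\n".join(f"{i},{i * 2}" for i in range(25)))
reader = csv.reader(sample_csv)

# in arc_to_parquet each chunk would be converted and appended to the file
chunk_sizes = [len(chunk) for chunk in iter_chunks(reader, chunksize=10)]
print(chunk_sizes)
```

Only the final chunk can be shorter than ```chunksize```, which is why the writer in ```arc_to_parquet``` is created once, on the first chunk, and reused for the rest.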

#### **feature engineering**

This is a highly specific example; many generic feature engineering algorithms could be added here.

In our example, the only time information in the raw data is a ```step``` variable, which represents a time period of 1 hour.  There are 743 unique steps in the data, which is approximately 1 month.  Here we will translate (map) the step values into unique ```DateTime```s for ingestion into the time series database. We will then save separate ```labels``` and ```features``` objects in the context for use elsewhere. By logging these tables into the context we can expose them as outputs and make them available to another step, or even several steps.

TODO: we may want to do other stuff here, or wait til training graph.

In [None]:
def extract_features_labels(
    context: MLClientCtx,
    target_path: str = '',
    name: str = '',
    labels_column: str = 'labels'
) -> None:
    """Extract features and labels from raw data.
    
    :param context: The context.
    :param target_path: The path for source, labels and features.
    :param name: The data source name.
    :param labels_column: Column holding ground-truth labels.
    """
    src_filepath = path.join(target_path, name)
    if not path.isfile(src_filepath):
        msg = 'data has not been downloaded yet or there was a problem'
        context.logger.info(msg)
        raise Exception(msg)
    
    # load the raw table that was verified above
    raw = pd.read_parquet(src_filepath)

    # probably a leak
    raw.drop('isFlaggedFraud', axis=1, inplace=True)
    
    # preprocess index by inventing a month and giving each row a unique time
    # while still preserving the 'step' category and its distribution.
    df = pd.DataFrame(
        {"hours": pd.date_range("2019-01-01", freq="1H", periods=744)})
    # Create the mappings ```dict``` where each unique step is mapped to a start time
    time_mappings = {}
    for a,b in zip(range(744), df.hours):
        time_mappings[a] = b
    raw.step = raw.step.map(time_mappings)
    steps = []
    for (i, g) in enumerate(raw.groupby('step')):
        step = g[1].step.values
        for r in range(1, step.shape[0]):
            step[r] = step[r-1] + np.timedelta64(1,'ms')
        steps.append(step)
    s = np.concatenate(steps, axis=0)
     # does the new vector have the right shape?
    assert s.shape == (raw.shape[0],)
    raw['step'] = s 
    # there are as many groups as rows after the change:
    assert raw.groupby('step').ngroups == raw.shape[0]

    # now set the index, extract labels, and save.
    raw.set_index('step', inplace=True)
    labels = raw.pop(labels_column)

    # name the column "labels" whatever labels_column was called
    labels = labels.to_frame(name="labels")
    labels["step"] = s
    labels.set_index('step', inplace=True)
    
    log_context_table(context, target_path, 'features.parquet', raw)
    log_context_table(context, target_path, 'labels.parquet', labels)
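
The step-to-timestamp mapping in ```extract_features_labels``` can be sketched with the standard library alone: every hourly step becomes a start time, and each further row within the same step is pushed 1 ms later so that all timestamps are unique. This is an illustrative variant, not the notebook's code; the helper ```unique_timestamps``` is hypothetical, and unlike the loop above it does not assume the rows arrive sorted by step.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List


def unique_timestamps(steps: List[int], start: datetime) -> List[datetime]:
    """Map hourly step numbers to timestamps, offsetting repeats by 1 ms each."""
    seen: Dict[int, int] = defaultdict(int)  # rows already emitted per step
    stamps = []
    for step in steps:
        base = start + timedelta(hours=step)
        stamps.append(base + timedelta(milliseconds=seen[step]))
        seen[step] += 1
    return stamps


# hypothetical step column: three rows in hour 1, two rows in hour 2
steps = [1, 1, 1, 2, 2]
stamps = unique_timestamps(steps, datetime(2019, 1, 1))
for step, stamp in zip(steps, stamps):
    print(step, stamp.isoformat(timespec="milliseconds"))
```

An hour holds 3,600,000 milliseconds, so the offsets stay inside their own hour as long as no step has more rows than that.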

#### **tsdb ingress**

In [14]:
import v3io_frames as v3f
client = v3f.Client("framesd:8081", container="users")

# Relative path to the TSDB table within the parent platform data container
tsdb_table = path.join(getenv("V3IO_USERNAME"), "projects/paysim/tsdb_tbl")

In [15]:
if path.isdir(tsdb_table):
    print("found existing table, deleting")
    client.delete("tsdb", tsdb_table)

client.create(backend="tsdb", table=tsdb_table, attrs={"rate": "1/s"})

In [None]:
timeproc = []
# group rows by position so that each chunk holds at most 1,000 rows
for g, chunk in features.groupby(np.arange(len(features)) // 1_000):
    start = time.time()
    client.write(backend="tsdb", table=tsdb_table, dfs=chunk)
    timeproc.append(int(time.time()-start))

#### **partitioning**

Data partitioning covers splitting into train, test, and validation sets, cross validation, and so on. A partitioning step should not only split the data but also take into account its distribution, parallelization, and similar concerns.

TODO: Perhaps use tensorflow dataset...

In [None]:
def splitter(
    context: MLClientCtx,
    features: DataItem,
    labels: DataItem,
    target_path: str = "",
    test_size: float = 0.1,
    train_val_split: float = 0.75,
    random_state: int = 1,
    sample: int = -1
) -> None:
    """Split raw data into train, validation and test sets.
    
    The file loaded at this stage is the raw data file that has been
    downloaded in a previous step (as a parquet file).  Here it is read
    and split into train, validation and test sets. The context is 
    updated with the target_path.
    
    :param context: The context.
    :param features: DataItem in the context holding the features table.
    :param labels: DataItem in the context holding the labels table.
    :param target_path: Data storage location.
    :param test_size: (defaults=0.1) Set test set size, and leave the
            remainder for the second split into train and validation sets.
    :param train_val_split: (defaults=0.75) Once the test set has been
            removed the training set gets this proportion.
    :param random_state: (default 1). Seed used by the scikit-learn random
            number generator in the method train_test_split.
    :param sample: (default -1, all rows). Intended to select the first n
            rows or a random sample, for exploring the system or for
            debugging; check the balance of the resulting sets when
            sampling at random. Not applied by the code below.
            
    Outputs
        The following outputs are saved at the target path:
        xtrain, ytrain (Tuple[pd.DataFrame, pd.DataFrame]): Training set.
        xvalid, yvalid (Tuple[pd.DataFrame, pd.DataFrame]): Validation set.
        xtest, ytest (Tuple[pd.DataFrame, pd.DataFrame]): Test set.
    """
    
    features = get_context_table(features)
    labels = get_context_table(labels)
    
    # split twice to get training, validation and test sets.
    context.logger.info("splitting into train-valid-test data sets")
    x, xtest, y, ytest = train_test_split(features,
                                          labels, 
                                          train_size=1-test_size, 
                                          test_size=test_size, 
                                          random_state=random_state)
    
    xtrain, xvalid, ytrain, yvalid = train_test_split(x, 
                                                      y, 
                                                      train_size=train_val_split, 
                                                      test_size=1-train_val_split,
                                                      random_state=random_state)    

    # save and log all the intermediate tables
    log_context_table(context, target_path, "xtrain.parquet", xtrain)
    log_context_table(context, target_path, "xvalid.parquet", xvalid)
    log_context_table(context, target_path, "xtest.parquet", xtest)
    # the label splits are already single-column DataFrames
    log_context_table(context, target_path, "ytrain.parquet", ytrain)
    log_context_table(context, target_path, "yvalid.parquet", yvalid)
    log_context_table(context, target_path, "ytest.parquet", ytest)
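
Because ```splitter``` splits twice, ```train_val_split``` applies to what remains after the test set is removed, not to the full data set. The short sketch below works out the resulting overall fractions; the helper ```split_fractions``` is a hypothetical name used only for this illustration.

```python
from typing import Tuple


def split_fractions(test_size: float = 0.1,
                    train_val_split: float = 0.75) -> Tuple[float, float, float]:
    """Return the overall (train, valid, test) fractions of a two-stage split."""
    remainder = 1.0 - test_size            # rows left after the first split
    train = remainder * train_val_split    # share of all rows used for training
    valid = remainder * (1.0 - train_val_split)
    return train, valid, test_size


train_frac, valid_frac, test_frac = split_fractions()
print(f"train={train_frac:.3f} valid={valid_frac:.3f} test={test_frac:.3f}")
```

With the defaults, roughly 67.5% of the rows go to training, 22.5% to validation, and 10% to testing.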

#### ```train```

TODO

for more detail on the other parameters available and their default values.


In [None]:
def train(context: MLClientCtx,
          xtrain: DataItem,
          ytrain: DataItem,
          xvalid: DataItem,
          yvalid: DataItem,
          silent: bool = False,
          random_state: int = 1,
          model_target: str = "",
          model_name: str = "model.defaultname.pickle",
          losses_target = "",
          losses_name = "",
          num_leaves: int = 31,
          learning_rate: float = 0.1,
    ) -> None:
    """Train and save a LightGBM model.
    
    :param context: The function's context.
    :param xtrain: DataItem in context representing 2D array 
            (obs, features)  of features. 
    :param ytrain: DataItem in the context representing 
            ground-truth labels. 
    :param xvalid: See xtrain, for validation set.
    :param yvalid: See ytrain, for validation set.
    :param silent : (default False) Show metrics for 
            training/validation steps.
    :param random_state : Random number generator seed.
    :param model_target : Destination path for model artifact.
    :param model_name : Destination name for model artifact.
        
    Also included for demonstration are a randomly selected sample
    of LightGBM parameters:
    :param num_leaves : (Default is 31).  In the LightGBM model
            controls complexity.
    :param learning_rate : Step size at each iteration, constant.
    """
    context.logger.info("read tables")
    xtrain = get_context_table(xtrain)
    ytrain = get_context_table(ytrain)
    xvalid = get_context_table(xvalid)
    yvalid = get_context_table(yvalid)
    
    context.logger.info(f"training input {xtrain.shape[0]} rows")
    context.logger.info("starting train")
    
    # <insert model>: build and fit the classifier here and bind it to
    # the name lgb_clf, which is serialized below
    
    # pickle/serialize the model at target
    if not path.isdir(model_target):
        makedirs(model_target)
    file_path = path.join(model_target, model_name)
    joblib.dump(lgb_clf, open(file_path, "wb"))
    context.log_artifact("model_dir",
                         target_path=model_target,
                         labels={"framework": "lgbmboost"})

#### ```test```

In addition to a ```model```, the first step (```load```) created test features and labels that we can retrieve and pass on to the ```test``` method.  An ROC plot is built from the test set and made available for display.

TODO: don't expect much change here

In [None]:
def test(context: MLClientCtx,
         model_dir: DataItem, 
         xtest: DataItem,
         ytest: DataItem,
         fmt:str = "png", 
         target_path:str = "",
         model_name: str = "lightgbm.model.pickle"):
    """Load model and predict.
    
    :param context: The context.
    :param model_dir: Contains the model's path.
    :param xtest: (NxM), N is sample size and M the number of features
            of the test set.
    :param ytest: 1D (N,1) Array of ground-truth labels.
    :param fmt: (Default is "png"). The image format.
    :param target_path: Unused. 
    :param model_name: Name of the serialized model file in model_dir.
        
    """
    print(str(model_dir))
    modelpath = path.join(str(model_dir), model_name)
    lgbm_model = joblib.load(
        open(modelpath, "rb"))
    
    xtest = get_context_table(xtest)
    ytest = get_context_table(ytest)
    context.logger.info(f"test input {xtest.shape[0]} rows")
    
    ypred = lgbm_model.predict(xtest)
    
    acc = accuracy_score(ytest, ypred)
    
    context.logger.info(f"type: {type(acc)}   value: {acc}")
    context.log_result("accuracy", float(acc))
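
One point to watch when a Keras model is inserted into ```train```: if the network ends in a sigmoid unit (an assumption, since the model is not defined in this notebook), ```predict``` returns probabilities rather than class labels, and ```accuracy_score``` rejects a mix of binary targets and continuous predictions. A minimal sketch of the missing thresholding step in plain Python follows; ```to_labels```, ```accuracy```, and the sample values are hypothetical.

```python
from typing import List, Sequence


def to_labels(probabilities: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Convert predicted probabilities into 0/1 class labels."""
    return [1 if p >= threshold else 0 for p in probabilities]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the ground truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# hypothetical ground truth and model outputs
ytest_sample = [0, 1, 1, 0]
yprob_sample = [0.2, 0.7, 0.4, 0.1]

ypred_sample = to_labels(yprob_sample)
acc_sample = accuracy(ytest_sample, ypred_sample)
print(ypred_sample, acc_sample)
```

The 0.5 cut-off is only a conventional default; for an imbalanced problem such as fraud detection a different threshold may be more appropriate.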

#### ```importance```

Need to replace this for specific model type.

In [None]:
def importance(
    context: MLClientCtx,
    model_dir: DataItem,
    xtest: DataItem,
    title: str = "Model Features",
    xlabel:str = "",
    ylabel:str = "",
    fmt:str = "png", 
    target_path:str = "",
    model_name: str = "model.pickle"
)-> None:
    """Display estimated feature importances.
    
    :param context: The context.
    :param model_dir: Contains the model's path.
    :param xtest: (NxM), N is sample size and M the number of features
            of the test set.
    :param title: (Defaults to "Model Features"). Plot title.
    :param xlabel: Plot x-axis label.
    :param ylabel: Plot y-axis label.
    :param fmt: (Default is "png"). The image format.
    :param target_path: Unused.
    :param model_name: Name of the model file used to generate the feature
        importance vector.
    """
    modelpath = path.join(str(model_dir), model_name)
    model = joblib.load(
        open(modelpath, "rb"))
    
    xtest = get_context_table(xtest)
    
    # create a feature importance table with desired labels
    zipped = zip(model.feature_importances_, xtest.columns)
    
    feature_imp = pd.DataFrame(
        sorted(zipped), columns=["freq","feature"]
    ).sort_values(by="freq", ascending=False)
    # log_context_table writes parquet, so name the file accordingly
    log_context_table(context, target_path, "feature-importances-table.parquet", feature_imp)

#### **end of nuclio function definition**

In [None]:
# nuclio: end-code

<a id="testing"></a>
## Testing locally

The function can be run locally and debugged/tested before deployment:

In [None]:
from mlrun import code_to_function, mount_v3io, new_function, new_model_server, mlconf
%env MLRUN_DBPATH=/User/mlrun
mlconf.dbpath = "/User/mlrun"

<a id="image"></a>
### Create a deployment image

Once debugged you can create a reusable image, and then deploy it for testing. In the following line we are converting the code block between the ```#nuclio: ignore``` and ```#nuclio: end-code``` to be run as a KubeJob.  Next we build an image named ```mlrun/mlrunlgb:latest```.  _**It is important to ensure that this image has been built at least once, and that you have access to it.**_

In [28]:
lgbm_job = code_to_function(runtime="job").apply(mount_v3io())

lgbm_job.build(image="mlrun/mlrunlgb:latest")

[mlrun] 2019-12-18 20:19:24,195 building image (mlrun/mlrunlgb:latest)
FROM python:3.6-jessie
WORKDIR /run
RUN pip install -U matplotlib seaborn sklearn lightgbm kfp joblib pyarrow
RUN pip install mlrun
ENV PYTHONPATH /run
[mlrun] 2019-12-18 20:19:24,197 using in-cluster config.
[mlrun] 2019-12-18 20:19:24,217 Pod mlrun-build-2j6tx created
..
INFO[0000] Resolved base name python:3.6-jessie to python:3.6-jessie 
INFO[0000] Resolved base name python:3.6-jessie to python:3.6-jessie 
INFO[0000] Downloading base image python:3.6-jessie     
INFO[0000] Error while retrieving image from cache: getting file info: stat /cache/sha256:0318d80cb241983eda20b905d77fa0bfb06e29e5aabf075c7941ea687f1c125a: no such file or directory 
INFO[0000] Downloading base image python:3.6-jessie     
INFO[0000] Built cross stage deps: map[]                
INFO[0000] Downloading base image python:3.6-jessie     
INFO[0001] Error while retrievin

<mlrun.runtimes.kubejob.KubejobRuntime at 0x7fabc92761d0>

While debugging, and _**after you have run**_ ```build``` _**at least once**_, you can comment out the last cell so that the build process isn't started needlessly.  The code can be injected into the job using the following line:

In [None]:
# lgbm_job.with_code()

<a id="pipeline"></a>
### Create a KubeFlow Pipeline from our functions

Our pipeline will consist of two steps rather than three: ```load``` and ```train```.  We'll drop the ```test``` step
here, since at the end of this deployment we can test the system with API requests.

For complete details on KubeFlow Pipelines please refer to the following docs:
1. **[KubeFlow pipelines](https://www.kubeflow.org/docs/pipelines/)**.
2. **[kfp.dsl Python package](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#module-kfp.dsl)**.

Please note that the model server file name in the ```new_model_server``` function call below should be identical in every respect to the name of the model server notebook.

In [29]:
import kfp
from kfp import dsl

In [30]:
@dsl.pipeline(
    name="TF-Keras Classifier Training Pipeline - Paysim",
    description="Shows how to use mlrun/kfp."
)
def tfkeras_pipeline(
   learning_rate = [0.1, 0.3]
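):

    # <insert pipeline>: the steps defined here must include the train_step
    # whose "model_dir" output is passed to the serving function below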

    # define a nuclio-serving function, generated from a notebook file
    srvfn = new_model_server(
        "paysim-serving", 
        model_class="TFKerasClassifier", 
        filename="model-server.ipynb")
    
    # deploy the model serving function with inputs from the training stage
    deploy = srvfn.with_v3io("User", "~/").deploy_step(project="refactor-demos", 
                                                       models={"tfkeras_v1_joblib": train_step.outputs["model_dir"]})

<a id="compile the pipeline"></a>
### compile the pipeline

We can compile our KubeFlow pipeline and produce a yaml description of the pipeline workflow:

In [31]:
makedirs("/User/projects/tfkeras/yaml", exist_ok=True)
kfp.compiler.Compiler().compile(tfkeras_pipeline, "/User/projects/tfkeras/yaml/mlrunpipe.yaml")



In [32]:
client = kfp.Client(namespace="default-tenant")

Finally, the following lines will run the pipeline as a job:

In [33]:
arguments = {
    "learning_rate": [ 0.1, 0.3]
}

run_result = client.create_run_from_pipeline_func(
    tfkeras_pipeline, 
    arguments, 
    run_name="tfkeras 1",
    experiment_name="tfkeras_tsdb")