# Train and deploy on Kubeflow from Notebooks

This notebook introduces Kubeflow Fairing for training and deploying a model to Kubeflow on Google Kubernetes Engine (GKE), and Kubeflow Pipelines for building a simple pipeline and running it on GKE. It demonstrates how to:
 
* Train an XGBoost model in a local notebook.
* Use Kubeflow Fairing to train an XGBoost model remotely on Kubeflow.
  * For simplicity, code-generated synthetic data is used. To use actual data, refer to the `ames-xgboost-buld-train-deploy` notebook, which shows how to attach a PVC and read data from it.
  * The append builder is used to rapidly build a Docker image.
* Use Kubeflow Fairing to deploy a trained model to Kubeflow, and call the deployed endpoint for predictions.
* Use a simple pipeline to train a model on GKE.

To learn more about how to run this notebook locally, see the guide to [training and deploying on GCP from a local notebook][gcp-local-notebook].

[gcp-local-notebook]: https://kubeflow.org/docs/fairing/gcp-local-notebook/

## Set up your notebook for training an XGBoost model

Import the libraries required to train this model.

In [2]:
import demo_util
from pathlib import Path
import os
fairing_code = os.path.join(Path.home(), "fairing")
demo_util.notebook_setup(fairing_code)


INFO:root:Adding /home/jovyan/fairing to path


In [3]:
# fairing:include-cell
import fire
import joblib
import logging
import nbconvert
import os
import pathlib
import sys
from pathlib import Path
import pandas as pd
import pprint
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from xgboost import XGBRegressor
from importlib import reload
from sklearn.datasets import make_regression


In [4]:
# Imports not to be included in the built docker image
import kfp
import kfp.components as comp
import kfp.gcp as gcp
import kfp.dsl as dsl
import kfp.compiler as compiler
from kubernetes import client as k8s_client
import fairing   
from fairing.builders import append
from fairing.deployers import job
import fairing_util

In [5]:
# fairing:include-cell
def read_synthetic_input(test_size=0.25):
    """generate synthetic data and split it into train and test."""
    # generate regression dataset
    X, y = make_regression(n_samples=200, n_features=5, noise=0.1)
    train_X, test_X, train_y, test_y = train_test_split(X,
                                                      y,
                                                      test_size=test_size,
                                                      shuffle=False)

    imputer = SimpleImputer()
    train_X = imputer.fit_transform(train_X)
    test_X = imputer.transform(test_X)

    return (train_X, train_y), (test_X, test_y)
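
Because `shuffle=False` is passed, `train_test_split` keeps the rows in their generated order: the leading rows become the training set and the trailing rows the test set. The following is a minimal pure-Python sketch of such an ordered split, not scikit-learn's implementation. The helper name `ordered_split` is hypothetical, and the rounding rule (test size rounded up) is an assumption modelled on scikit-learn's behaviour.

```python
import math


def ordered_split(rows, test_size=0.25):
    """Split rows without shuffling: leading rows train, trailing rows test."""
    # Assumption: a fractional test size is rounded up to a whole row count.
    n_test = math.ceil(len(rows) * test_size)
    n_train = len(rows) - n_test
    return rows[:n_train], rows[n_train:]


# 200 placeholder rows stand in for the 200 generated samples.
train_rows, test_rows = ordered_split(list(range(200)), test_size=0.25)
print(len(train_rows), len(test_rows))
```

With 200 samples and `test_size=0.25`, this yields 150 training rows and 50 test rows.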


In [6]:
# fairing:include-cell
def train_model(train_X,
                train_y,
                test_X,
                test_y,
                n_estimators,
                learning_rate):
    """Train the model using XGBRegressor."""
    model = XGBRegressor(n_estimators=n_estimators, learning_rate=learning_rate)

    model.fit(train_X,
            train_y,
            early_stopping_rounds=40,
            eval_set=[(test_X, test_y)])

    print("Best RMSE on eval: %.2f with %d rounds" %
          (model.best_score, model.best_iteration + 1))
    return model

def eval_model(model, test_X, test_y):
    """Evaluate the model performance."""
    predictions = model.predict(test_X)
    logging.info("mean_absolute_error=%.2f", mean_absolute_error(predictions, test_y))

def save_model(model, model_file):
    """Save XGBoost model for serving."""
    joblib.dump(model, model_file)
    logging.info("Model export success: %s", model_file)

Define various constants

## Define Train and Predict functions

In [13]:
# fairing:include-cell
class HousingServe(object):
    
    def __init__(self, model_file=None):
        self.n_estimators = 50
        self.learning_rate = 0.1
        if not model_file:
            if "MODEL_FILE" in os.environ:
                print("model_file not supplied; checking environment variable")
                model_file = os.getenv("MODEL_FILE")
            else:
                print("model_file not supplied; using the default")
                model_file = "mockup-model.dat"
        
        self.model_file = model_file
        print("model_file={0}".format(self.model_file))
        
        self.model = None

    def train(self):
        (train_X, train_y), (test_X, test_y) = read_synthetic_input()
        model = train_model(train_X,
                          train_y,
                          test_X,
                          test_y,
                          self.n_estimators,
                          self.learning_rate)

        eval_model(model, test_X, test_y)
        save_model(model, self.model_file)

    def predict(self, X, feature_names):
        """Predict using the model for given ndarray."""
        if not self.model:
            self.model = joblib.load(self.model_file)
        # Do any preprocessing
        prediction = self.model.predict(data=X)
        # Do any postprocessing
        return [[prediction.item(0), prediction.item(0)]]
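
The constructor resolves the model path in a fixed order: an explicit `model_file` argument wins, then the `MODEL_FILE` environment variable, then the default `mockup-model.dat`. The sketch below isolates that lookup so it can be checked without training anything. The function name `resolve_model_file`, the `environ` parameter, and the example path `/mnt/models/model.dat` are hypothetical and exist only for illustration.

```python
import os


def resolve_model_file(model_file=None, environ=None, default="mockup-model.dat"):
    """Return the model path: explicit argument, then MODEL_FILE, then default."""
    # Passing a dict as `environ` keeps the example independent of the real environment.
    environ = os.environ if environ is None else environ
    if model_file:
        return model_file
    if "MODEL_FILE" in environ:
        return environ["MODEL_FILE"]
    return default


print(resolve_model_file("custom.dat", environ={}))
print(resolve_model_file(environ={"MODEL_FILE": "/mnt/models/model.dat"}))
print(resolve_model_file(environ={}))
```

Setting `MODEL_FILE` on the serving container is therefore one way to point a deployed `HousingServe` at a model stored outside the image.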

## Train your Model Locally

* Train your model locally inside your notebook

In [14]:
HousingServe(model_file="mockup-model.dat").train()

model_file=mockup-model.dat
[0]	validation_0-rmse:97.625
Will train until validation_0-rmse hasn't improved in 40 rounds.
[1]	validation_0-rmse:92.9346
[2]	validation_0-rmse:88.4163
[3]	validation_0-rmse:84.9513
[4]	validation_0-rmse:81.4807
[5]	validation_0-rmse:78.0301
[6]	validation_0-rmse:74.3916
[7]	validation_0-rmse:72.6324
[8]	validation_0-rmse:70.0073
[9]	validation_0-rmse:67.4423
[10]	validation_0-rmse:66.0759
[11]	validation_0-rmse:63.7281
[12]	validation_0-rmse:61.7721
[13]	validation_0-rmse:59.8362
[14]	validation_0-rmse:58.0936
[15]	validation_0-rmse:56.2871
[16]	validation_0-rmse:54.6282
[17]	validation_0-rmse:53.242
[18]	validation_0-rmse:51.9367
[19]	validation_0-rmse:50.4069
[20]	validation_0-rmse:49.4686
[21]	validation_0-rmse:48.2332
[22]	validation_0-rmse:47.4084
[23]	validation_0-rmse:46.8214
[24]	validation_0-rmse:46.1743
[25]	validation_0-rmse:45.2428
[26]	validation_0-rmse:44.6314
[27]	validation_0-rmse:43.7469
[28]	validation_0-rmse:42.8601
[29]	validation_0-rm

INFO:root:mean_absolute_error=25.64
INFO:root:Model export success: mockup-model.dat


Best RMSE on eval: %.2f with %d rounds 32.798336 50
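
The log line `Will train until validation_0-rmse hasn't improved in 40 rounds` reflects `early_stopping_rounds=40`: training stops once the evaluation metric has gone that many rounds without improving. The following is a simplified pure-Python sketch of the idea, not XGBoost's implementation. The function name `best_round` and the RMSE values are hypothetical.

```python
def best_round(scores, patience):
    """Return (index, score) of the best round, stopping early.

    Lower scores are better. The scan stops once `patience` rounds
    have passed since the best score was seen.
    """
    best_index, best_score = 0, scores[0]
    for index, score in enumerate(scores[1:], start=1):
        if score < best_score:
            best_index, best_score = index, score
        elif index - best_index >= patience:
            break  # no improvement for `patience` rounds
    return best_index, best_score


# Hypothetical RMSE values, not taken from the run above.
rmse = [5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0]
print(best_round(rmse, patience=2))
```

In this sketch the final value `2.0` is never reached, because the scan stops two rounds after the best score at index 2. In the run above the metric kept improving, so all 50 rounds (`n_estimators=50`) were used.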


## Predict locally

* Run prediction inside the notebook using the newly trained model

In [16]:
(train_X, train_y), (test_X, test_y) = read_synthetic_input()

HousingServe().predict(test_X, None)

model_file not supplied; using the default
model_file=mockup-model.dat


[[-37.04857635498047, -37.04857635498047]]

## Use Fairing to Launch a K8s Job to train your model

### Set up Kubeflow Fairing for training and predictions

Import the `fairing` library and configure the environment that your training or prediction job will run in.

In [17]:
# Set up Google Container Registry (GCR) for storing output containers.
# You can use any Docker container registry instead of GCR.
GCP_PROJECT = fairing.cloud.gcp.guess_project_name()
print(GCP_PROJECT)
DOCKER_REGISTRY = 'gcr.io/{}/fairing-job'.format(GCP_PROJECT)
print(DOCKER_REGISTRY)
PY_VERSION = ".".join([str(x) for x in sys.version_info[0:3]])
BASE_IMAGE = 'python:{}'.format(PY_VERSION)
# You can use the Dockerfile in this repo to build and use the base_image.
base_image = "gcr.io/kubeflow-images-public/xgboost-fairing-example-base:v-20190612"


zahrakubeflowcodelab
gcr.io/zahrakubeflowcodelab/fairing-job
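
The registry path and the Python base image tag are plain strings derived from the project ID and the interpreter version. The sketch below reproduces that naming without calling Fairing or GCP, which can help when targeting a different project or registry. The function name `image_names` and the project ID `my-project` are placeholders, not values from this notebook.

```python
import sys


def image_names(project, repository="fairing-job", version_info=sys.version_info):
    """Build the registry path and a matching python base image tag."""
    registry = "gcr.io/{}/{}".format(project, repository)
    py_version = ".".join(str(part) for part in version_info[0:3])
    python_image = "python:{}".format(py_version)
    return registry, python_image


# "my-project" is a placeholder project ID; (3, 6, 8) is an example version.
registry, python_image = image_names("my-project", version_info=(3, 6, 8))
print(registry)
print(python_image)
```

Note that the cell above defines `BASE_IMAGE` from the interpreter version but passes the prebuilt `base_image` to the builder, so the version-derived tag is only an alternative.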


## Use fairing to build the docker image

* This uses the append builder to rapidly build docker images

In [18]:
preprocessor = fairing_util.ConvertNotebookPreprocessorWithFire("HousingServe")

if not preprocessor.input_files:
    preprocessor.input_files = set()
input_files=["ames.py", "deployment/update_model_job.yaml", "update_model.py"]
preprocessor.input_files =  set([os.path.normpath(f) for f in input_files])
preprocessor.preprocess()
builder = append.append.AppendBuilder(registry=DOCKER_REGISTRY,
                                      base_image=base_image, preprocessor=preprocessor)
builder.build()


INFO:root:Creating docker context: /tmp/fairing_context_de6bgft2
INFO:root:Loading Docker credentials for repository 'gcr.io/kubeflow-images-public/xgboost-fairing-example-base:v-20190612'
INFO:root:Invoking 'docker-credential-gcloud' to obtain Docker credentials.
INFO:root:Successfully obtained Docker credentials.
INFO:root:Loading Docker credentials for repository 'gcr.io/zahrakubeflowcodelab/fairing-job/fairing-job:6F63F28C'
INFO:root:Invoking 'docker-credential-gcloud' to obtain Docker credentials.
INFO:root:Successfully obtained Docker credentials.
INFO:root:Layer sha256:2f1ee468081da0ca09360c50281ed261d8b3fb01f664262c3f278d8619eb4e9a exists, skipping
INFO:root:Layer sha256:90a7e2cb4d7460e55f83c6e47f9f8d089895ee6e1cc51ae5c23eab3bdcb70363 exists, skipping
INFO:root:Layer sha256:b893ca5fa31bb87be0d3fa3a403dac7ca12c955d6fd522fd35e3260dbd0e99da exists, skipping
INFO:root:Layer sha256:eed14867f5ee443ad7efc89d0d4392683799a413244feec120f43074bc2d43ef exists, skipping
INFO:root:Layer sha2

## Launch the K8s Job

* Use pod mutators to attach credentials (and, if needed, a PVC) to the pod

In [19]:
pod_spec = builder.generate_pod_spec()
NAMESPACE = "user1"
train_deployer = job.job.Job(namespace=NAMESPACE, 
                             cleanup=False,
                             pod_spec_mutators=[
                             fairing.cloud.gcp.add_gcp_credentials_if_exists])

# Add command line arguments
pod_spec.containers[0].command.extend(["train"])
result = train_deployer.deploy(pod_spec)

INFO:fairing.kubernetes.manager:Pod started running True


model_file not supplied; using the default
model_file=mockup-model.dat
[0]	validation_0-rmse:90.6249
Will train until validation_0-rmse hasn't improved in 40 rounds.
[1]	validation_0-rmse:85.3672
[2]	validation_0-rmse:80.6077
[3]	validation_0-rmse:75.9867
[4]	validation_0-rmse:72.15
[5]	validation_0-rmse:68.4247
[6]	validation_0-rmse:65.4166
[7]	validation_0-rmse:62.7606
[8]	validation_0-rmse:60.1438
[9]	validation_0-rmse:57.9401
[10]	validation_0-rmse:55.8747
[11]	validation_0-rmse:53.957
[12]	validation_0-rmse:52.2249
[13]	validation_0-rmse:50.556
[14]	validation_0-rmse:49.2282
[15]	validation_0-rmse:47.8585
[16]	validation_0-rmse:46.6933
[17]	validation_0-rmse:45.5335
[18]	validation_0-rmse:44.3206
[19]	validation_0-rmse:43.2371
[20]	validation_0-rmse:42.5117
[21]	validation_0-rmse:41.6298
[22]	validation_0-rmse:40.9242
[23]	validation_0-rmse:40.1302
[24]	validation_0-rmse:39.4707
[25]	validation_0-rmse:38.8031
[26]	validation_0-rmse:38.3108
[27]	validation_0-rmse:37.689
[28]	valida
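
Appending `"train"` to the container command selects which method of `HousingServe` runs inside the pod. The preprocessor name suggests that the generated entry point is driven by Python Fire, which maps command-line words to method names; that is an assumption here, since the generated file is not shown. The sketch below imitates the dispatch with plain `getattr`. `Demo` and `dispatch` are hypothetical names, and real Fire also handles arguments, flags, and help text that this omits.

```python
class Demo:
    """Stand-in for a class exposed on the command line."""

    def train(self):
        return "training"

    def predict(self):
        return "predicting"


def dispatch(target_cls, argv):
    """Instantiate the class and call the method named by the first word."""
    if not argv:
        raise SystemExit("expected a method name, e.g. 'train'")
    method = getattr(target_cls(), argv[0], None)
    if not callable(method):
        raise SystemExit("unknown command: {}".format(argv[0]))
    return method()


print(dispatch(Demo, ["train"]))
```

Under that assumption, the pod's command ends in `train`, which is why the job log above matches the output of the local `train()` call.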

In [20]:
!kubectl get jobs -l fairing-id={train_deployer.job_id} -o yaml

apiVersion: v1
items:
- apiVersion: batch/v1
  kind: Job
  metadata:
    creationTimestamp: "2019-06-12T20:21:53Z"
    generateName: fairing-job-
    labels:
      fairing-deployer: job
      fairing-id: b7955e0a-8d4f-11e9-9207-96ec34699c76
    name: fairing-job-t429t
    namespace: user1
    resourceVersion: "7556018"
    selfLink: /apis/batch/v1/namespaces/user1/jobs/fairing-job-t429t
    uid: b7b87f19-8d4f-11e9-b008-42010a8e01a5
  spec:
    backoffLimit: 0
    completions: 1
    parallelism: 1
    selector:
      matchLabels:
        controller-uid: b7b87f19-8d4f-11e9-b008-42010a8e01a5
    template:
      metadata:
        creationTimestamp: null
        labels:
          controller-uid: b7b87f19-8d4f-11e9-b008-42010a8e01a5
          fairing-deployer: job
          fairing-id: b7955e0a-8d4f-11e9-9207-96ec34699c76
          job-name: fairing-job-t429t
        name: fairing-deployer
      spec:
        containers:
        - command:
          - python

## Deploy the trained model to Kubeflow for predictions

In [21]:
from fairing.deployers import serving
import fairing_util
pod_spec = builder.generate_pod_spec()
#pvc_mutator = fairing_util.add_pvc_mutator("kubeflow-gcfs", "/mnt/kubeflow-gcfs")

module_name = os.path.splitext(preprocessor.executable.name)[0]
deployer = serving.serving.Serving(module_name + ".HousingServe",
                                   service_type="ClusterIP",
                                   labels={"app": "mockup"})
    
#pvc_mutator(None, pod_spec, deployer.namespace)
#pod_spec.containers[0].env.append({"name": "MODEL_FILE", "value": model_file})
url = deployer.deploy(pod_spec)

INFO:root:Cluster endpoint: http://fairing-service-jjgxd.user1.svc.cluster.local


In [22]:
!kubectl get deploy -o yaml {deployer.deployment.metadata.name}

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-06-12T20:22:27Z"
  generateName: fairing-deployer-
  generation: 1
  labels:
    app: mockup
    fairing-deployer: serving
    fairing-id: cbc0e610-8d4f-11e9-9207-96ec34699c76
  name: fairing-deployer-cltbb
  namespace: user1
  resourceVersion: "7556174"
  selfLink: /apis/extensions/v1beta1/namespaces/user1/deployments/fairing-deployer-cltbb
  uid: cbc54e8f-8d4f-11e9-b008-42010a8e01a5
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mockup
      fairing-deployer: serving
      fairing-id: cbc0e610-8d4f-11e9-9207-96ec34699c76
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: mockup
        fairing-deployer: serv

## Call the prediction endpoint

Create a test dataset, then call the endpoint on Kubeflow for predictions.

In [23]:
(train_X, train_y), (test_X, test_y) = read_synthetic_input()


In [24]:
full_url = url + ":5000/predict"
result = fairing_util.predict_nparray(full_url, test_X)
pprint.pprint(result.content)

(b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Inter'
 b'nal Server Error</title>\n<h1>Internal Server Error</h1>\n<p>The server en'
 b'countered an internal error and was unable to complete your request. Either '
 b'the server is overloaded or there is an error in the application.</p>\n')
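
The response recorded above is an HTTP 500 error page rather than a prediction, so it helps to check the status code before reading the body. A minimal sketch, assuming `result` behaves like a `requests.Response` (it exposes `.content` above; `.status_code` is an assumption). The helper names `predict_url` and `summarize_response` are hypothetical.

```python
def predict_url(base_url, port=5000, path="/predict"):
    """Append the serving port and path to the cluster endpoint."""
    return "{}:{}{}".format(base_url.rstrip("/"), port, path)


def summarize_response(status_code, content, limit=80):
    """Return a one-line summary that separates success from failure."""
    if 200 <= status_code < 300:
        return "OK: {}".format(content[:limit])
    return "HTTP {}: request failed, inspect the serving pod logs".format(status_code)


print(predict_url("http://fairing-service-jjgxd.user1.svc.cluster.local"))
print(summarize_response(500, b"<title>500 Internal Server Error</title>"))
```

In the notebook this could be used as `summarize_response(result.status_code, result.content)`, so a failed request is reported as such instead of being printed as if it were a prediction.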


## Clean up the prediction endpoint

Delete the prediction endpoint created by this notebook.

In [33]:
# !kubectl delete service -l app=mockup
# !kubectl delete deploy -l app=mockup

## Build a simple one-step pipeline

In [25]:
EXPERIMENT_NAME = 'MockupModel'

#### Define the pipeline
The pipeline function must be decorated with the `@dsl.pipeline` decorator.

In [26]:
@dsl.pipeline(
   name='Training pipeline',
   description='A pipeline that trains an XGBoost model on synthetic data.'
)
def train_pipeline(
   ):      
    command=["python", preprocessor.executable.name, "train"]
    train_op = dsl.ContainerOp(
            name="train", 
            image=builder.image_tag,        
            command=command,
            ).apply(
                gcp.use_gcp_secret('user-gcp-sa'),
            )
    train_op.container.working_dir = "/app"

#### Compile the pipeline

In [27]:
pipeline_func = train_pipeline
pipeline_filename = pipeline_func.__name__ + '.pipeline.zip'
compiler.Compiler().compile(pipeline_func, pipeline_filename)

#### Submit the pipeline for execution

In [28]:
# Specify pipeline argument values
arguments = {}

# Get or create an experiment and submit a pipeline run
client = kfp.Client()
experiment = client.create_experiment(EXPERIMENT_NAME)

# Submit a pipeline run
run_name = pipeline_func.__name__ + ' run'
run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, arguments)

# The link in the cell output leads to the run information page.
# (Note: a bug in JupyterLab modifies the URL, which makes the link stop working.)

INFO:root:Creating experiment MockupModel.
