# Train and deploy on Kubeflow from Notebooks

This notebook shows how to use Kubeflow Fairing to train and deploy a model to Kubeflow on Google Kubernetes Engine (GKE) and Google Cloud ML Engine. It demonstrates how to:
 
* Train an XGBoost model in a local notebook,
* Use Kubeflow Fairing to train an XGBoost model remotely on Kubeflow,
  * Data is read from a PVC
  * The append builder is used to rapidly build a Docker image
* Use Kubeflow Fairing to deploy a trained model to Kubeflow, and
* Call the deployed endpoint for predictions.

To learn more about how to run this notebook locally, see the guide to [training and deploying on GCP from a local notebook][gcp-local-notebook].

[gcp-local-notebook]: https://kubeflow.org/docs/fairing/gcp-local-notebook/

## Set up your notebook for training an XGBoost model

Install and import the libraries required to train this model.

In [25]:
!pip3 install joblib
!pip3 install scikit-learn
!pip3 install fire

You are using pip version 19.0.1, however version 19.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [26]:
# Override fairing path; 
# do this before importing anything else
import logging
import os
from pathlib import Path
import sys
fairing_code = os.path.join(Path.home(), "git_jlewi-kubecon-demo", "fairing")

if os.path.exists(fairing_code):    
    logging.info("Adding %s to path", fairing_code)
    sys.path = [fairing_code] + sys.path
    
import fairing

INFO:root:Adding /home/jovyan/git_jlewi-kubecon-demo/fairing to path


In [27]:
# fairing:include-cell
import ames
import argparse
import fire
import logging
import nbconvert
import os
import joblib
import sys
from pathlib import Path
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from xgboost import XGBRegressor
from importlib import reload

In [28]:
# Imports not to be included in the built docker image
from fairing.builders import append
from fairing.deployers import job
import fairing_util

In [29]:
logging.basicConfig(format='%(message)s')
logging.getLogger().setLevel(logging.INFO)

In [30]:
# Copy the data to the PVC
import shutil
nfs_path = "/mnt/kubeflow-gcfs/data/ames_dataset"
model_dir = "/mnt/kubeflow-gcfs/models"
train_data = os.path.join(nfs_path, "train.csv")
model_file = os.path.join(model_dir, "trained_ames_model.dat")
if not os.path.exists(nfs_path):
    shutil.copytree("ames_dataset", nfs_path)
    
if not os.path.exists(model_dir):
    os.makedirs(model_dir)

In [31]:
# Base image is built from the Dockerfile in the repo
# Can be the same image as your notebook
base_image = "gcr.io/code-search-demo/kubecon-demo/notebook:v20190517-300d2f2-dirty-d1a703"

In [32]:
!gcloud auth configure-docker --quiet
!gcloud auth activate-service-account --key-file=${GOOGLE_APPLICATION_CREDENTIALS} --quiet

`docker` and `docker-credential-gcloud` need to be in the same PATH in order to work correctly together.
gcloud's Docker credential helper can be configured but it will not work until this is corrected.
gcloud credential helpers already registered correctly.
Activated service account credentials for: [label-issues-0409-user@code-search-demo.iam.gserviceaccount.com]


In [33]:
# fairing:include-cell
class HousingServe(object):    
    def __init__(self, model_file=None):
        self.n_estimators = 50
        self.learning_rate = 0.1
        if not model_file:
            print("model_file not supplied; checking environment variable")
            model_file = os.getenv("MODEL_FILE")
        
        self.model_file = model_file
        print("model_file={0}".format(self.model_file))
        
        self.model = None
                

    def train(self, train_input, model_file):
        (train_X, train_y), (test_X, test_y) = ames.read_input(train_input)
        model = ames.train_model(train_X,
                                 train_y,
                                 test_X,
                                 test_y,
                                 self.n_estimators,
                                 self.learning_rate)

        ames.eval_model(model, test_X, test_y)
        ames.save_model(model, model_file)

    def predict(self, X, feature_names):
        """Predict using the model for given ndarray."""
        if not self.model:
            print("Loading model {0}".format(self.model_file))
            self.model = joblib.load(self.model_file)
        # Do any preprocessing
        prediction = self.model.predict(data=X)
        # Do any postprocessing
        return [[prediction.item(0), prediction.item(0)]]

## Train your model locally

* Train your model locally inside your notebook

In [34]:
HousingServe().train(train_data, model_file="/tmp/trained_model.dat")

model_file not supplied; checking environment variable
model_file=None
[0]	validation_0-rmse:177514
Will train until validation_0-rmse hasn't improved in 40 rounds.
[1]	validation_0-rmse:161858
[2]	validation_0-rmse:147237
[3]	validation_0-rmse:134132
[4]	validation_0-rmse:122224
[5]	validation_0-rmse:111538
[6]	validation_0-rmse:102142
[7]	validation_0-rmse:93392.3
[8]	validation_0-rmse:85824.6
[9]	validation_0-rmse:79667.6
[10]	validation_0-rmse:73463.4
[11]	validation_0-rmse:68059.4
[12]	validation_0-rmse:63350.5
[13]	validation_0-rmse:59732.1
[14]	validation_0-rmse:56260.7
[15]	validation_0-rmse:53392.6
[16]	validation_0-rmse:50770.8
[17]	validation_0-rmse:48107.8
[18]	validation_0-rmse:45923.9
[19]	validation_0-rmse:44154.2
[20]	validation_0-rmse:42488.1
[21]	validation_0-rmse:41263.3
[22]	validation_0-rmse:40212.8
[23]	validation_0-rmse:39089.1
[24]	validation_0-rmse:37691.1
[25]	validation_0-rmse:36875.2
[26]	validation_0-rmse:36276.2
[27]	validation_0-rmse:35444.1
[28]	validati

INFO:root:Best RMSE on eval: 28787.72 with 50 rounds
INFO:root:mean_absolute_error=18173.15
INFO:root:Model export success: /tmp/trained_model.dat


## Use Fairing to launch a K8s Job to train your model

### Set up Kubeflow Fairing for training and predictions

Import the `fairing` library and configure the environment that your training or prediction job will run in.

In [35]:
import os
import fairing

# Set up Google Container Registry (GCR) for storing output containers
# You can use any Docker container registry instead of GCR
GCP_PROJECT = fairing.cloud.gcp.guess_project_name()
DOCKER_REGISTRY = 'gcr.io/{}/fairing-job'.format(GCP_PROJECT)
PY_VERSION = ".".join([str(x) for x in sys.version_info[0:3]])
BASE_IMAGE = 'python:{}'.format(PY_VERSION)

## Use Fairing to build the Docker image

* This uses the append builder to rapidly build Docker images
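
The build log below shows files being added to a gzipped tar context under `/app/`. As a rough illustration of why this can be fast (the files are packaged as one extra layer on top of the base image, rather than rebuilt through a full `docker build`), the sketch below packs files into an in-memory tarball under an `app/` prefix using only the standard library. The function names `build_context_layer` and `list_layer` and the sample file contents are hypothetical; this is not Fairing's implementation.

```python
import io
import tarfile


def build_context_layer(files, dest_prefix="app"):
    """Pack {relative_name: bytes} into a gzipped tar archive under dest_prefix/."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, content in sorted(files.items()):
            info = tarfile.TarInfo(name="{}/{}".format(dest_prefix, name))
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def list_layer(layer_bytes):
    """Return the member paths stored in a gzipped tar archive."""
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:gz") as tar:
        return tar.getnames()


# Hypothetical file contents standing in for the notebook's real inputs.
layer = build_context_layer({
    "ames.py": b"# training helpers\n",
    "xgboost-train-deploy-low-level-apis.py": b"# converted notebook\n",
})
print(list_layer(layer))
```

Only the small archive of changed files has to be produced and uploaded; the base image layers are reused as they are.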

In [36]:
import pathlib

In [37]:
import fairing_util
reload(fairing_util)
preprocessor = fairing_util.ConvertNotebookPreprocessorWithFire("HousingServe")

# Include ames.py in the build context alongside the converted notebook
input_files = ["ames.py"]
preprocessor.input_files = set(os.path.normpath(f) for f in input_files)
preprocessor.preprocess()
builder = append.append.AppendBuilder(registry=DOCKER_REGISTRY,
                                      base_image=base_image, preprocessor=preprocessor)
builder.build()


INFO:root:Creating docker context: /tmp/fairing.context.tar.gz
INFO:root:Adding files to context: [PosixPath('xgboost-train-deploy-low-level-apis.py'), 'ames.py']
INFO:root:Context: /tmp/fairing.context.tar.gz, Adding /home/jovyan/git_jlewi-kubecon-demo/fairing/fairing/__init__.py at /app/fairing/__init__.py
INFO:root:Context: /tmp/fairing.context.tar.gz, Adding /home/jovyan/git_jlewi-kubecon-demo/fairing/fairing/runtime_config.py at /app/fairing/runtime_config.py
INFO:root:Context: /tmp/fairing.context.tar.gz, Adding xgboost-train-deploy-low-level-apis.py at /app/xgboost-train-deploy-low-level-apis.py
INFO:root:Context: /tmp/fairing.context.tar.gz, Adding ames.py at /app/ames.py
INFO:root:Loading Docker credentials for repository 'gcr.io/code-search-demo/kubecon-demo/notebook:v20190517-300d2f2-dirty-d1a703'
INFO:root:Invoking 'docker-credential-gcloud' to obtain Docker credentials.
INFO:root:Successfully obtained Docker credentials.
INFO:root:Loading Docker credentials for repository 

## Launch the K8s Job

* Use pod mutators to attach a PVC and credentials to the pod
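
A pod mutator is a callable applied to the pod spec before it is submitted. The call `pvc_mutator(None, pod_spec, deployer.namespace)` later in this notebook suggests a `(kube_manager, pod_spec, namespace)` signature. The sketch below shows the general shape such a mutator could take, using small dataclasses as stand-ins for the Kubernetes client models so that it runs without a cluster. `make_pvc_mutator`, `PodSpec`, and `Container` are hypothetical names; the real `fairing_util.add_pvc_mutator` is not shown in this notebook and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str
    volume_mounts: List[dict] = field(default_factory=list)


@dataclass
class PodSpec:
    containers: List[Container]
    volumes: List[dict] = field(default_factory=list)


def make_pvc_mutator(pvc_name, mount_path):
    """Return a mutator that mounts the named PVC into every container."""
    volume_name = "{}-volume".format(pvc_name)  # hypothetical naming scheme

    def mutator(kube_manager, pod_spec, namespace):
        # kube_manager and namespace are unused here; they are kept so the
        # callable matches the signature assumed above.
        pod_spec.volumes.append({
            "name": volume_name,
            "persistentVolumeClaim": {"claimName": pvc_name},
        })
        for container in pod_spec.containers:
            container.volume_mounts.append(
                {"name": volume_name, "mountPath": mount_path})

    return mutator


spec = PodSpec(containers=[Container(name="model")])
make_pvc_mutator("kubeflow-gcfs", "/mnt/kubeflow-gcfs")(None, spec, "kubeflow")
print(spec.volumes)
print(spec.containers[0].volume_mounts)
```

Returning a closure keeps the PVC name and mount path out of the deployer, which only needs a list of callables with a common signature.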

In [38]:
import fairing_util
reload(fairing_util)

pod_spec = builder.generate_pod_spec()
pvc_mutator = fairing_util.add_pvc_mutator("kubeflow-gcfs", "/mnt/kubeflow-gcfs")
deployer = job.job.Job(namespace="kubeflow",
                       cleanup=False,
                       pod_spec_mutators=[
                           fairing.cloud.gcp.add_gcp_credentials_if_exists,
                           pvc_mutator])

# Add command line arguments
pod_spec.containers[0].command.extend(["train", train_data, model_file])
result = deployer.deploy(pod_spec)

INFO:fairing.kubernetes.manager:Pod started running True


model_file not supplied; checking environment variable
model_file=None
[0]	validation_0-rmse:177514
Will train until validation_0-rmse hasn't improved in 40 rounds.
[1]	validation_0-rmse:161858
[2]	validation_0-rmse:147237
[3]	validation_0-rmse:134132
[4]	validation_0-rmse:122224
[5]	validation_0-rmse:111538
[6]	validation_0-rmse:102142
[7]	validation_0-rmse:93392.3
[8]	validation_0-rmse:85824.6
[9]	validation_0-rmse:79667.6
[10]	validation_0-rmse:73463.4
[11]	validation_0-rmse:68059.4
[12]	validation_0-rmse:63350.5
[13]	validation_0-rmse:59732.1
[14]	validation_0-rmse:56260.7
[15]	validation_0-rmse:53392.6
[16]	validation_0-rmse:50770.8
[17]	validation_0-rmse:48107.8
[18]	validation_0-rmse:45923.9
[19]	validation_0-rmse:44154.2
[20]	validation_0-rmse:42488.1
[21]	validation_0-rmse:41263.3
[22]	validation_0-rmse:40212.8
[23]	validation_0-rmse:39089.1
[24]	validation_0-rmse:37691.1
[25]	validation_0-rmse:36875.2
[26]	validation_0-rmse:36276.2
[27]	validation_0-rmse:35444.1
[28]	validati
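
The preprocessor name (`ConvertNotebookPreprocessorWithFire`) and the `fire` import suggest that the container entry point hands `HousingServe` to Python Fire, so the arguments appended to the command (`train`, the data path, the model path) would select a method and supply its positional parameters. That is consistent with the log above, where the constructor runs without a `model_file`. The sketch below imitates that dispatch with the standard library only; `dispatch` and `EchoTrainer` are hypothetical, and Fire itself does considerably more (flags, type conversion, help text).

```python
def dispatch(component, argv):
    """Instantiate component, then call the method named by argv[0] with the rest."""
    instance = component()
    method = getattr(instance, argv[0])
    return method(*argv[1:])


class EchoTrainer:
    """Hypothetical stand-in for HousingServe that only records its arguments."""

    def train(self, train_input, model_file):
        return {"train_input": train_input, "model_file": model_file}


result = dispatch(EchoTrainer, [
    "train",
    "/mnt/kubeflow-gcfs/data/ames_dataset/train.csv",
    "/mnt/kubeflow-gcfs/models/trained_ames_model.dat",
])
print(result)
```

Under this reading, the job's command line is equivalent to the local call `HousingServe().train(train_data, model_file)`, with the paths pointing at the mounted PVC.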

In [39]:
!kubectl get jobs -o yaml fairing-job-9kl8f

apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2019-05-17T20:21:35Z"
  generateName: fairing-job-
  labels:
    controller-uid: 5dff10cf-78e1-11e9-8964-42010a8e00ff
    fairing-deployer: job
    fairing-id: 5df3e36c-78e1-11e9-b05d-0a580a000143
    job-name: fairing-job-9kl8f
  name: fairing-job-9kl8f
  namespace: kubeflow
  resourceVersion: "13374444"
  selfLink: /apis/batch/v1/namespaces/kubeflow/jobs/fairing-job-9kl8f
  uid: 5dff10cf-78e1-11e9-8964-42010a8e00ff
spec:
  backoffLimit: 6
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: 5dff10cf-78e1-11e9-8964-42010a8e00ff
  template:
    metadata:
      creationTimestamp: null
      labels:
        controller-uid: 5dff10cf-78e1-11e9-8964-42010a8e00ff
        fairing-deployer: job
        fairing-id: 5df3e36c-78e1-11e9-b05d-0a580a000143
        job-name: fairing-job-9kl8f
      name: fairing-deployer
    spec:
      containers:
      - command:
        

## Deploy the trained model to Kubeflow for predictions

In [40]:
from fairing.deployers import serving
import fairing_util
pod_spec = builder.generate_pod_spec()
pvc_mutator = fairing_util.add_pvc_mutator("kubeflow-gcfs", "/mnt/kubeflow-gcfs")
deployer = serving.serving.Serving("xgboost-train-deploy-low-level-apis.HousingServe",
                                   service_type="ClusterIP",
                                   labels={"app": "ames"})
    
pvc_mutator(None, pod_spec, deployer.namespace)
pod_spec.containers[0].env.append({"name": "MODEL_FILE", "value": model_file})
url = deployer.deploy(pod_spec)

INFO:root:Cluster endpoint: http://fairing-service-xjg5h.kubeflow.svc.cluster.local


In [41]:
!kubectl get deploy -o yaml {deployer.deployment.metadata.name}

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-05-18T01:33:33Z"
  generateName: fairing-deployer-
  generation: 1
  labels:
    app: ames
    fairing-deployer: serving
    fairing-id: f2e10f1a-790c-11e9-a0a6-0a580a000143
  name: fairing-deployer-z4wvg
  namespace: kubeflow
  resourceVersion: "13451916"
  selfLink: /apis/extensions/v1beta1/namespaces/kubeflow/deployments/fairing-deployer-z4wvg
  uid: f2e2af0c-790c-11e9-8964-42010a8e00ff
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: ames
      fairing-deployer: serving
      fairing-id: f2e10f1a-790c-11e9-a0a6-0a580a000143
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: ames
        fairing-deployer: ser

## Call the prediction endpoint

Create a test dataset, then call the endpoint on Kubeflow for predictions.

In [42]:
(train_X, train_y), (test_X, test_y) = ames.read_input("ames_dataset/train.csv")

In [44]:
import pprint
# The prediction service is reached on port 5000
full_url = url + ":5000/predict"
result = fairing_util.predict_nparray(full_url, test_X)
pprint.pprint(result.content)

(b'{"data":{"names":["t:0","t:1"],"tensor":{"shape":[1,2],"values":[165164.875,'
 b'165164.875]}},"meta":{}}\n')
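
The response body above carries the prediction as a flat `values` list plus a `shape`, a layout that resembles the Seldon Core tensor payload. A minimal sketch for turning it back into rows, assuming that layout and a two-dimensional shape; `decode_tensor_response` is a hypothetical helper:

```python
import json


def decode_tensor_response(body):
    """Parse a JSON body with data.tensor.shape/values into a list of rows."""
    payload = json.loads(body)
    tensor = payload["data"]["tensor"]
    rows, cols = tensor["shape"]  # assumes a 2-D tensor
    values = tensor["values"]
    if len(values) != rows * cols:
        raise ValueError("tensor values do not match the declared shape")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


# The bytes printed by the cell above.
body = (b'{"data":{"names":["t:0","t:1"],"tensor":{"shape":[1,2],"values":[165164.875,'
        b'165164.875]}},"meta":{}}\n')
print(decode_tensor_response(body))
```

Both columns hold the same number because `HousingServe.predict` returns `prediction.item(0)` twice.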


## Clean up the prediction endpoint

Delete the prediction endpoint created by this notebook.

In [None]:
# !kubectl delete service -l app=ames
# !kubectl delete deploy -l app=ames