## Kubeflow pipelines

This notebook walks through using Kubeflow Pipelines with the Python 3 interpreter (command line) to preprocess, train, tune, and deploy the babyweight model.

### 1. Create cluster

In [1]:
%%bash
gcloud config set compute/zone us-central1-b
gcloud container clusters create lakpipeline \
  --zone us-central1-b \
  --scopes cloud-platform \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --machine-type n1-standard-2 \
  --num-nodes 4
kubectl create clusterrolebinding ml-pipeline-admin-binding --clusterrole=cluster-admin --user=$(gcloud config get-value account)

NAME         LOCATION       MASTER_VERSION  MASTER_IP      MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
lakpipeline  us-central1-b  1.9.7-gke.11    35.224.160.49  n1-standard-2  1.9.7-gke.11  4          RUNNING


Updated property [compute/zone].
client [kubectl]. To install, run
  $ gcloud components install kubectl

This will enable the autorepair feature for nodes. Please see
https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more
information on node autorepairs.

Creating cluster lakpipeline...
..........................................................................................................................................done.
Created [https://container.googleapis.com/v1/projects/qwiklabs-gcp-71241bf9054616ac/zones/us-central1-b/clusters/lakpipeline].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-b/lakpipeline?project=qwiklabs-gcp-71241bf9054616ac
kubeconfig entry generated for lakpipeline.
bash: line 9: kubectl: command not found


In [3]:
%%bash
gcloud components install kubectl
#kubectl create clusterrolebinding ml-pipeline-admin-binding --clusterrole=cluster-admin --user=$(gcloud config get-value account)



Your current Cloud SDK version is: 212.0.0
Installing components from version: 212.0.0

+------------------------------------------------------------------+
|               These components will be installed.                |
+---------------------+---------------------+----------------------+
|         Name        |       Version       |         Size         |
+---------------------+---------------------+----------------------+
| kubectl             |               1.9.7 |             14.9 MiB |
| kubectl             |                     |                      |
+---------------------+---------------------+----------------------+

For the latest full release notes, please visit:
  https://cloud.google.com/sdk/release_notes

Do you want to continue (Y/n)?  Please enter 'y' or 'n':  
#= Creating update staging area                             =#
#= Installing: kubectl                                      =#
#= Installing: kubectl                                      =#
#= Creating ba

In [4]:
%%bash
kubectl create clusterrolebinding ml-pipeline-admin-binding --clusterrole=cluster-admin --user=$(gcloud config get-value account)

Error from server (Forbidden): clusterrolebindings.rbac.authorization.k8s.io is forbidden: User "105650655507479386988" cannot create clusterrolebindings.rbac.authorization.k8s.io at the cluster scope: Required "container.clusterRoleBindings.create" permission.


Go to the [Google Kubernetes Engine section of the GCP console](https://console.cloud.google.com/kubernetes) and make sure that the cluster has started and is ready. This takes about 3 minutes.
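The readiness check can also be done from the notebook by reading the `STATUS` column of the table that `gcloud` prints. The following is a minimal sketch, assuming the whitespace-separated table layout shown in the cluster-creation output and no empty columns; the helper name `cluster_status` is hypothetical and not part of any Google tool.

```python
def cluster_status(table_text, cluster_name):
    """Return the STATUS column for cluster_name, or None if it is not listed."""
    lines = [line for line in table_text.splitlines() if line.strip()]
    if not lines:
        return None
    header = lines[0].split()
    if "NAME" not in header or "STATUS" not in header:
        return None
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    for line in lines[1:]:
        fields = line.split()
        # Assumption: every column is non-empty, so fields line up with the header.
        if len(fields) == len(header) and fields[name_idx] == cluster_name:
            return fields[status_idx]
    return None


# Sample copied from the cluster-creation output earlier in this notebook.
sample = """\
NAME         LOCATION       MASTER_VERSION  MASTER_IP      MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
lakpipeline  us-central1-b  1.9.7-gke.11    35.224.160.49  n1-standard-2  1.9.7-gke.11  4          RUNNING
"""
print(cluster_status(sample, "lakpipeline"))  # prints RUNNING
```

The cluster is ready for the next step once the returned status is `RUNNING`.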

### 2. Deploy Kubeflow pipeline to cluster

In [5]:
%%bash
PIPELINE_VERSION=0.1.2
kubectl create -f https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml

job "deploy-ml-pipeline-lj45s" created


Error from server (Forbidden): error when creating "https://storage.googleapis.com/ml-pipeline/release/0.1.2/bootstrapper.yaml": clusterroles.rbac.authorization.k8s.io is forbidden: User "105650655507479386988" cannot create clusterroles.rbac.authorization.k8s.io at the cluster scope: Required "container.clusterRoles.create" permission.
Error from server (Forbidden): error when creating "https://storage.googleapis.com/ml-pipeline/release/0.1.2/bootstrapper.yaml": clusterrolebindings.rbac.authorization.k8s.io is forbidden: User "105650655507479386988" cannot create clusterrolebindings.rbac.authorization.k8s.io at the cluster scope: Required "container.clusterRoleBindings.create" permission.
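Each `Forbidden` error above names the permission that the account lacks. The sketch below pulls those permission names out of captured `kubectl` output so they can be compared against the account's roles. It assumes the `Required "..." permission` wording seen in the errors above; the helper name `missing_permissions` is hypothetical.

```python
import re


def missing_permissions(kubectl_output):
    """Return the sorted, de-duplicated permissions named in Forbidden errors."""
    pattern = re.compile(r'Required "([^"]+)" permission')
    return sorted(set(pattern.findall(kubectl_output)))


# Tails of the two error messages shown in the output above.
sample = (
    "cannot create clusterroles.rbac.authorization.k8s.io at the cluster scope: "
    'Required "container.clusterRoles.create" permission.\n'
    "cannot create clusterrolebindings.rbac.authorization.k8s.io at the cluster scope: "
    'Required "container.clusterRoleBindings.create" permission.\n'
)
for permission in missing_permissions(sample):
    print(permission)
```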


These are the important parts of `bootstrapper.yaml`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mlpipeline-deploy-admin
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
- nonResourceURLs:
  - '*'
  verbs:
  - '*'

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: mlpipeline-admin-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mlpipeline-deploy-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: default

---
apiVersion: batch/v1
kind: Job
metadata:
  generateName: deploy-ml-pipeline-
spec:
  backoffLimit: 1
  template:
    metadata:
      name: deploy-ml-pipeline
    spec:
      containers:
      - name: deploy
        image: gcr.io/ml-pipeline/bootstrapper:0.1.2
        imagePullPolicy: 'Always'
        # Additional parameter available:
        args: [
          # "--namespace", "foo",
          # "--report_usage", "false",
          # "--uninstall",
        ]
      restartPolicy: Never
```

### 3. Install local interpreter

In [None]:
%%bash
PIPELINE_VERSION=0.1.2
pip install python-dateutil https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/kfp.tar.gz --upgrade

After the `pip install` completes, always <b>Reset Session</b> so that the new package gets picked up.
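To confirm that the package is visible after the session reset, a quick check along these lines can help. It is a minimal sketch that assumes the SDK is imported under the module name `kfp`; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the pipelines SDK installed above is importable as "kfp".
print("kfp importable:", is_installed("kfp"))
```

If this reports `False` right after the install, the session most likely still needs to be reset.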

### 4. Set up port forward

In [None]:
%%bash
export NAMESPACE=kubeflow
kubectl port-forward -n ${NAMESPACE} $(kubectl get pods -n ${NAMESPACE} --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}') 8085:80

Now visit https://8085-dot-4972031-dot-devshell.appspot.com/pipeline
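If the page does not load, it is worth checking that the forwarded port is actually accepting connections. The sketch below tests a local TCP port; it assumes the port-forward is running on the same machine as this check and uses port 8085 from the command above. The helper name `port_is_open` is hypothetical. Because `kubectl port-forward` keeps running in the foreground, run the check from a separate cell or terminal.

```python
import socket


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


print("port 8085 open:", port_is_open(8085))
```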

### 5. Do the DSL compile

In [12]:
%%bash
OUTDIR=pipelines/dsl
rm -rf $OUTDIR
mkdir -p $OUTDIR
dsl-compile --py pipelines/mlp_babyweight.py --output $OUTDIR/mlp_babyweight.tar.gz
ls -l $OUTDIR

total 4
-rw-r--r-- 1 root root 970 Nov 14  2018 mlp_babyweight.tar.gz
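The compiled package is an ordinary gzip-compressed tar archive, so its contents can be listed with the standard library before uploading. This is a minimal sketch; the helper name `list_archive` is hypothetical, and the member names inside the archive are not shown in this notebook, so none are assumed here.

```python
import os
import tarfile


def list_archive(path):
    """Return (member name, size in bytes) pairs for a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as tar:
        return [(member.name, member.size) for member in tar.getmembers()]


# Path taken from the dsl-compile command above.
package = "pipelines/dsl/mlp_babyweight.tar.gz"
if os.path.exists(package):
    for name, size in list_archive(package):
        print(name, size)
```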


#### Inspect pipeline 

In [None]:
%%bash
ls pipelines

In [None]:
%%bash
cat pipelines/mlp_babyweight.py

### 6. Upload and execute pipeline

Start by navigating to https://8085-dot-4972031-dot-devshell.appspot.com (the same address as in the port-forward step). Then create an experiment, upload the pipeline compiled above, and run it once.

In [1]:
# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.