## Kubeflow Pipelines

This notebook walks through using Kubeflow Pipelines from the Python 3 interpreter (command line) to preprocess, train, tune, and deploy the babyweight model.


### 1. Start Hosted Pipelines and Notebook

To try out this notebook, first launch Kubeflow Hosted Pipelines and an AI Platform Notebooks instance.
Follow the instructions in this [README.md](pipelines/README.md) file.

### 2. Install necessary packages

In [1]:
#%pip install --quiet kfp python-dateutil --upgrade

Make sure to *restart the kernel* to pick up the new packages (look for the restart button in the ribbon of icons above this notebook).
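
After restarting the kernel, it can help to confirm which package versions were actually picked up. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper name, not part of kfp.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Packages named in the %pip install cell of this notebook.
for name in ("kfp", "python-dateutil"):
    print(name, installed_version(name))
```

A `None` next to a package name suggests the install did not land in the environment the kernel is using.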

In [2]:
#%%bash
#cd pipelines
#./setup_auth.sh kfpdemo us-central1-a cluster-1 default

### 3. Connect to the Hosted Pipelines

Visit https://console.cloud.google.com/ai-platform/pipelines/clusters
and get the hostname for your cluster. You can find it by clicking on the Settings icon.
Alternatively, click on the Open Pipelines Dashboard link and copy the hostname from the URL.
Then change the settings in the following cell.
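
If you copy the full URL from the Open Pipelines Dashboard link, only the hostname part is needed for `PIPELINES_HOST`. Here is a minimal sketch using only the standard library; `host_from_dashboard_url` is a hypothetical helper, and the `/#/pipelines` path in the example is illustrative rather than taken from a real dashboard.

```python
from urllib.parse import urlparse


def host_from_dashboard_url(url):
    """Extract the bare hostname from a dashboard URL; a bare hostname passes through."""
    url = url.strip()
    # urlparse only fills in the network location when a scheme or leading // is present.
    if "://" not in url:
        url = "//" + url
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"could not find a hostname in {url!r}")
    return host


example = "https://67ec0276503ce122-dot-us-central1.pipelines.googleusercontent.com/#/pipelines"
print(host_from_dashboard_url(example))
```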

In [3]:
# CHANGE THESE
PIPELINES_HOST='67ec0276503ce122-dot-us-central1.pipelines.googleusercontent.com'
PROJECT='youtubelist-256522'
BUCKET='cesar-pipelines-kfp'

In [4]:
import kfp
import os
client = kfp.Client(host=PIPELINES_HOST)
client.list_pipelines()

{'next_page_token': None,
 'pipelines': [{'created_at': datetime.datetime(2021, 3, 12, 6, 9, 50, tzinfo=tzlocal()),
                'default_version': {'code_source_url': None,
                                    'created_at': datetime.datetime(2021, 3, 12, 6, 9, 50, tzinfo=tzlocal()),
                                    'id': '97bd0db1-01bd-4a0a-85a4-3a200d3b66d8',
                                    'name': '[Demo] XGBoost - Iterative model '
                                            'training',
                                    'package_url': None,
                                    'parameters': None,
                                    'resource_references': [{'key': {'id': '97bd0db1-01bd-4a0a-85a4-3a200d3b66d8',
                                                                     'type': 'PIPELINE'},
                                                             'name': None,
                                                             'relationship': 'OWNER'}]},
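
The printed response is verbose. The following is a minimal sketch for listing just the name and ID of each pipeline, assuming the response object exposes a `pipelines` list whose items have `name` and `id` attributes, as the printed fields suggest. `summarize_pipelines` is a hypothetical helper, not part of kfp.

```python
def summarize_pipelines(response):
    """Return (name, id) pairs from a list_pipelines()-style response object."""
    # Assumption: `pipelines` may be None or absent when nothing is registered yet.
    pipelines = getattr(response, "pipelines", None) or []
    return [(p.name, p.id) for p in pipelines]


# Usage against a live cluster (requires the `client` created earlier):
# for name, pipeline_id in summarize_pipelines(client.list_pipelines()):
#     print(pipeline_id, name)
```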
             

### 4. [Optional] Build Docker containers

I have made my containers public (see https://cloud.google.com/container-registry/docs/access-control for how to do this), so you can simply use my images and skip this step.

In [5]:
#%%bash
#cd pipelines/containers
#bash build_all.sh

In [6]:
#!docker image rm -f 84

Check that the Docker images work properly by running one of them locally:

In [7]:
#!docker run -t gcr.io/ai-analytics-solutions/babyweight-pipeline-bqtocsv:latest --project $PROJECT  --bucket $BUCKET --mode local
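
The same check can be assembled in Python, which avoids shell quoting problems when `PROJECT` or `BUCKET` are unset. This is a minimal sketch: the image name and flags are copied from the command above, `bqtocsv_check_command` is a hypothetical helper, and `my-project` and `my-bucket` are placeholders.

```python
import shlex


def bqtocsv_check_command(
    project,
    bucket,
    image="gcr.io/ai-analytics-solutions/babyweight-pipeline-bqtocsv:latest",
):
    """Build the argv list for the local check of the bqtocsv image."""
    if not project or not bucket:
        raise ValueError("project and bucket must both be set")
    return ["docker", "run", "-t", image,
            "--project", project, "--bucket", bucket, "--mode", "local"]


cmd = bqtocsv_check_command("my-project", "my-bucket")  # placeholder values
print(shlex.join(cmd))  # shlex.join requires Python 3.8+
```

On a machine with Docker installed, the list can be passed to `subprocess.run(cmd, check=True)` to actually run the container.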

### 5. Upload and execute pipeline

Submit the pipeline to the Kubeflow Pipelines cluster and start a run.

In [8]:
from pipelines.containers.pipeline import mlp_babyweight

# Parameters passed to the pipeline function.
args = {
    'project': PROJECT,
    'bucket': BUCKET,
}
#os.environ['HPARAM_JOB'] = 'babyweight_210311_191208'

# Submit a run of the full preprocess_train_and_deploy pipeline.
pipeline = client.create_run_from_pipeline_func(mlp_babyweight.preprocess_train_and_deploy, args)

#os.environ['HPARAM_JOB'] = 'babyweight_200207_231639' # change to the job from the completed step
#pipeline = client.create_run_from_pipeline_func(mlp_babyweight.train_and_deploy, args)
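
Setting `os.environ['HPARAM_JOB']` directly leaves the value in place for every later cell. The following is a minimal sketch of scoping it to a single submission instead, assuming the pipeline definition reads `HPARAM_JOB` from the environment when the run is created (the commented-out lines suggest this, but the notebook does not confirm it). `temporary_env` is a hypothetical helper.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(name, value):
    """Set an environment variable inside a with-block, then restore the old state."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


# Usage against a live cluster (requires `client`, `mlp_babyweight`, and `args`):
# with temporary_env('HPARAM_JOB', 'babyweight_200207_231639'):
#     pipeline = client.create_run_from_pipeline_func(mlp_babyweight.train_and_deploy, args)
```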

In [4]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

In [12]:
%%bash
cd pipelines
./setup_auth.sh kfpdemo us-central1-a cluster-1 default

secret/user-gcp-sa created


Fetching cluster endpoint and auth data.
kubeconfig entry generated for cluster-1.
created key [8f0efd32546d7a441e488188704559a001a085da] of type [json] as [application_default_credentials.json] for [kfpdemo@youtubelist-256522.iam.gserviceaccount.com]
W0311 10:08:09.874882   10725 helpers.go:553] --dry-run is deprecated and can be replaced with --dry-run=client.


In [5]:
#!docker run -t gcr.io/youtubelist-256522/babyweight-pipeline-traintuned:latest babyweight_210311_191208 cesar-pipelines-kfp

In [6]:
#%%bash
#docker run -t gcr.io/youtubelist-256522/babyweight-pipeline-deploycmle:latest gs://cesar-pipelines-kfp/babyweight/traintuned babyweight mlp

In [10]:
#!chmod +rwx pipelines/containers/traintuned/train.sh

In [11]:
#!./pipelines/containers/traintuned/train.sh babyweight_210308_033518 cesar-pipelines-kfp

In [18]:
!docker image rm -f 98

Untagged: gcr.io/youtubelist-256522/babyweight-pipeline-deploycmle:latest
Untagged: gcr.io/youtubelist-256522/babyweight-pipeline-deploycmle@sha256:28dae94738c36e2991cd36fcff00bb611ed5e920ee87f9e68ebb9be449a26c3f
Deleted: sha256:987f557e42229667cbf6dd0d38fb1182b199c24c0ab68eb6a6d22293841105b9
