## Kubeflow pipelines

This notebook walks through the steps of using Kubeflow Pipelines from the Python 3 interpreter (command line) to preprocess, train, tune, and deploy the babyweight model.

### 1. Start Hosted Pipelines

Create a hosted Kubeflow Pipelines instance:

* Navigate to https://console.cloud.google.com/marketplace/details/google-cloud-ai-platform/kubeflow-pipelines
* Make sure your GCP project is selected in the dropdown at the top.
* Click CONFIGURE
* Change the App Instance Name to “kfpdemo”
* Click the Create Cluster button and wait 2-3 minutes for the cluster to be created.
* Click Deploy
* Navigate to https://console.cloud.google.com/ai-platform/pipelines/clusters
* Click on the HOSTED PIPELINES DASHBOARD LINK for kfpdemo on the cluster that you just started.


### 2. Launch AI Platform notebook

Create a Notebooks instance:
* Navigate to https://console.cloud.google.com/ai-platform/notebooks/instances
* Click on +New Instance and create a TensorFlow 2.x notebook
* Name the instance kfpdemo
* Click Customize 
  * In Machine Configuration, change it to n1-standard-2
  * In Permissions, set the notebook to be single-user and provide your GCP login email
  * Click Create
* Click on the URL for Open JupyterLab
* Open a Terminal
* Type:
    ```git clone https://github.com/GoogleCloudPlatform/training-data-analyst```
* On the left-hand side menu, navigate to this notebook (training-data-analyst/courses/machine_learning/deepdive/06_structured/7_pipelines.ipynb)


### 3. Install necessary packages

In [2]:
%pip install --quiet kfp python-dateutil --upgrade

ERROR: tensorflow-probability 0.9.0 has requirement cloudpickle>=1.2.2, but you'll have cloudpickle 1.1.1 which is incompatible.
Note: you may need to restart the kernel to use updated packages.


Make sure to *restart the kernel* to pick up the new packages (look for the restart button in the ribbon of icons above this notebook).
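
After the restart, it can help to confirm which package versions the kernel actually sees. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper name, not part of the original notebook.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the two packages installed in the cell above.
for name in ("kfp", "python-dateutil"):
    print(name, installed_version(name))
```

A result of `None` for `kfp` would suggest that the install did not succeed or that the kernel has not been restarted yet.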

### 4. Connect to the Hosted Pipelines

Visit https://console.cloud.google.com/ai-platform/pipelines/clusters
and get the hostname for your cluster. You can find it by clicking the Settings icon.
Alternatively, click the Open Pipelines Dashboard link and copy the hostname from the URL.
Then change the settings in the following cell.

In [13]:
# CHANGE THESE
PIPELINES_HOST='447cdd24f70c9541-dot-us-central1.notebooks.googleusercontent.com'
PROJECT='ai-analytics-solutions'
BUCKET='ai-analytics-solutions-kfpdemo'
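
If you copy the full dashboard URL from the browser rather than the bare hostname, the hostname can be extracted with the standard library. This is only a sketch: `host_from_dashboard_url` is a hypothetical helper, and the `/#/pipelines` suffix in the example URL is an assumption about what the dashboard address looks like.

```python
from urllib.parse import urlparse


def host_from_dashboard_url(url):
    """Return only the hostname part of a dashboard URL (hypothetical helper)."""
    # urlparse fills in the network location only when a scheme or a
    # leading '//' is present, so add '//' for bare hostnames.
    if "://" not in url:
        url = "//" + url.lstrip("/")
    return urlparse(url).hostname


# Example URL: the hostname is the sample value from the cell above,
# the path and fragment are assumed for illustration.
example_url = "https://447cdd24f70c9541-dot-us-central1.notebooks.googleusercontent.com/#/pipelines"
print(host_from_dashboard_url(example_url))
```

The returned string is the kind of value expected in `PIPELINES_HOST`.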

In [10]:
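import kfp

# Connect to the hosted Kubeflow Pipelines endpoint configured above
client = kfp.Client(host=PIPELINES_HOST)
# client.list_pipelines()  # uncomment to check that the connection works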

### 5. Upload and execute pipeline

Submit the pipeline to the Kubeflow Pipelines cluster and start a run:

In [14]:
from pipelines import mlp_babyweight

pipeline = client.create_run_from_pipeline_func(mlp_babyweight.train_and_deploy, 
                                                arguments={'project': PROJECT, 'bucket': BUCKET})
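
To follow the run in the dashboard, you can build a link to its details page. The sketch below rests on two assumptions that are not confirmed by this notebook: that the object returned by `create_run_from_pipeline_func` exposes a `run_id` attribute, and that the dashboard serves run details under `/#/runs/details/<run_id>`. `run_details_url` is a hypothetical helper, and the run ID in the example is made up.

```python
def run_details_url(host, run_id):
    """Build a dashboard link for a run (hypothetical helper, assumed URL layout)."""
    # Accept either a bare hostname or a full URL with a scheme.
    if "://" not in host:
        host = "https://" + host
    return f"{host.rstrip('/')}/#/runs/details/{run_id}"


# Illustration with placeholder values; in the notebook this would be
# run_details_url(PIPELINES_HOST, pipeline.run_id), assuming run_id exists.
print(run_details_url("example-host.example.com", "example-run-id"))
```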

In [12]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.