# Deploy and run a Kubeflow Pipeline from outside the Kubeflow cluster: non-IAP version

This notebook shows how to deploy and run a [Kubeflow](https://kubeflow.org) Pipeline from outside the Kubeflow cluster.  

In contrast to the [`kfp_remote_deploy.ipynb`](kfp_remote_deploy.ipynb) notebook, this example does not require that the cluster has been set up to use Google Cloud Platform's [Identity-Aware Proxy (IAP)](https://cloud.google.com/iap/). 
To keep the example simple, we assume a cluster running on [Kubernetes Engine](https://cloud.google.com/kubernetes-engine/) (GKE). However, the same basic approach should also work for Kubeflow installed on other conformant Kubernetes clusters.

## Setup and configuration


### Deploy a Kubeflow cluster on GKE

Deploy a [Kubeflow](https://kubeflow.org) cluster on GKE.  The [launcher web app](https://deploy.kubeflow.cloud/#/deploy) is recommended.

Once the GKE cluster is up and running, visit https://console.cloud.google.com/kubernetes/list and click on the **Connect** button to the right of your Kubeflow cluster. Copy the given command-line access snippet, which should look like:

```
gcloud container clusters get-credentials <your-cluster-name> --zone <your-zone> --project <your-project>
```
You'll need this for the next step.


## Running the notebook example

### Run the example in the ML Engine Notebook


Visit [https://console.cloud.google.com/mlengine/notebooks/instances](https://console.cloud.google.com/mlengine/notebooks/instances) and create a **NEW INSTANCE** (or you can use an existing instance if you prefer).

Once the instance is up and running, click on **OPEN JUPYTERLAB**, and upload this notebook.

Under **File** > **New**, start up a new **Terminal** tab.  In the terminal window, run the `gcloud container clusters get-credentials ...` command described above.
This configures `kubectl` with the credentials it needs to connect to your Kubeflow cluster.

Then, port-forward to the Kubeflow dashboard by running this command in the terminal window:

```
kubectl port-forward --namespace kubeflow $(kubectl get pod --namespace kubeflow --selector="service=ambassador" \
  --output jsonpath='{.items[0].metadata.name}') 8889:80
```

(Port 8080 is the usual convention, but it is already taken in the ML Engine notebook environment, so we use `8889` instead.)

### Run the example in a local Jupyter installation

To run this notebook locally, you'll need Jupyter and Python 3 installed.  

Then, in your local environment, run the `gcloud container clusters get-credentials ...` command described above.
This configures `kubectl` with the credentials it needs to connect to your Kubeflow cluster.

Then, port-forward to the Kubeflow dashboard by running:

```
kubectl port-forward --namespace kubeflow $(kubectl get pod --namespace kubeflow --selector="service=ambassador" \
  --output jsonpath='{.items[0].metadata.name}') 8889:80
```
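
In either environment, it can help to confirm that the port-forward is actually listening before running the cells below. The following is a minimal sketch, not part of the original example: `port_is_open` is a hypothetical helper name, and it only checks that something accepts TCP connections on the local port, not that the Pipelines service behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# 8889 is the local port used in the port-forward command above.
print(port_is_open("localhost", 8889))
```

If this prints `False`, check that the `kubectl port-forward` command is still running in its terminal.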


### The code

Now we're ready to run the example code.

First, install the Kubeflow Pipelines SDK. If you get import errors, **you may need to restart your notebook kernel after you do the installation**.    
Make sure you're using Python 3. 
If you're running this notebook locally within a Conda environment, you may need to change `pip3` to `pip`.

In [None]:
!pip3 install https://storage.googleapis.com/ml-pipeline/release/0.1.12/kfp.tar.gz --upgrade
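
After the install, a quick check like the one below can tell you whether the kernel is running Python 3 and whether the `kfp` package can be located. This is only a sketch: `check_environment` is a hypothetical helper, and finding the module does not guarantee that an already-running kernel has picked up the newly installed version, so a kernel restart may still be needed.

```python
import importlib.util
import sys


def check_environment(module_name: str = "kfp") -> dict:
    """Report whether this is Python 3 and whether a top-level module can be found."""
    return {
        "python3": sys.version_info.major == 3,
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


print(check_environment())
```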

Next, do some imports:

In [None]:
import datetime

import kfp
import kfp.compiler as compiler
import kfp.dsl as dsl

from google.cloud import storage

Define a (very) simple example pipeline to run:

In [None]:
@dsl.pipeline(
  name='Sequential',
  description='A pipeline with two sequential steps.'
)
def sequential_pipeline(filename='gs://ml-pipeline-playground/shakespeare1.txt'):
  """A pipeline with two sequential steps."""

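  # Step 1: write the input filename to a file and expose it as the 'newfile' output.
  op1 = dsl.ContainerOp(
      name='getfilename',
      image='library/bash:4.4.23',
      command=['sh', '-c'],
      arguments=['echo "%s" > /tmp/results.txt' % filename],
      file_outputs={'newfile': '/tmp/results.txt'})
  # Step 2: echo step 1's output. Consuming op1.outputs makes this step run after op1.
  op2 = dsl.ContainerOp(
      name='echo',
      image='library/bash:4.4.23',
      command=['sh', '-c'],
      arguments=['echo "%s"' % op1.outputs['newfile']])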

Next we'll create an instance of the Kubeflow Pipelines client. It will connect via the port-forward connection you set up.

In [None]:
# Timestamp-based suffix, used later to give each pipeline run a unique name.
ts = int(datetime.datetime.now(datetime.timezone.utc).timestamp() * 100000)
client = kfp.Client(host='localhost:8889/pipeline')

Compile the pipeline, and create a Pipelines `Experiment`. (If you're running in an ML Engine notebook, the generated 'experiment' link will not work.)

In [None]:
compiler.Compiler().compile(sequential_pipeline, '/tmp/sequential.tar.gz')
exp = client.create_experiment(name='sequential')
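
To see what the compile step produced, you can list the contents of the generated archive. The sketch below uses only the standard library; `list_package_members` is a hypothetical helper name. With this SDK version the archive is expected to hold a YAML workflow definition, but treat that as an assumption rather than something this notebook guarantees.

```python
import os
import tarfile


def list_package_members(package_path: str) -> list:
    """Return the names of the files stored in a .tar.gz archive."""
    with tarfile.open(package_path, "r:gz") as archive:
        return archive.getnames()


# Path written by the compile step above; skipped if the file does not exist yet.
package_path = "/tmp/sequential.tar.gz"
if os.path.exists(package_path):
    print(list_package_members(package_path))
```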

Finally, run the pipeline.
If you're running this example in an ML Engine notebook, the generated 'run' link will not work. Instead, to view the Kubeflow and Kubeflow Pipelines dashboards, you can set up an additional port-forward from your local machine (a bit more info is [here](https://www.kubeflow.org/docs/other-guides/accessing-uis/)), or from the GCP [cloud shell](https://cloud.google.com/shell/docs/using-web-preview) using its 'Web Preview' feature.

In [None]:
res = client.run_pipeline(exp.id, 'sequential_' + str(ts),
                          '/tmp/sequential.tar.gz')
print(res)
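
Because the dashboard links may not be usable from the notebook, one option is to poll for the run's status from Python. The sketch below shows only a generic polling loop. `wait_until_done` and the `fetch_status` callable are hypothetical, and the status strings in the usage note are assumptions; how you actually obtain a run's status depends on the SDK version you have installed.

```python
import time


def wait_until_done(fetch_status, terminal_states, timeout_s=600.0, interval_s=5.0):
    """Poll fetch_status() until it returns a terminal state or the timeout expires.

    fetch_status is a caller-supplied (hypothetical) function returning a status string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("run did not finish within %s seconds" % timeout_s)
        time.sleep(interval_s)
```

For example, if you had a function `get_status()` that returned the current state of the run, `wait_until_done(get_status, {"Succeeded", "Failed"})` would block until one of those (assumed) states is reported.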

Copyright 2019, Google, LLC.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.