## Launcher notebook

This notebook shows how to launch the flights_model.ipynb notebook, either on a Deep Learning VM or on Kubeflow Pipelines.

In [7]:
# change these to try this notebook out
BUCKET = 'cloud-training-demos-ml'
PROJECT = 'cloud-training-demos'
REGION = 'us-central1'

In [8]:
import os
os.environ['BUCKET'] = BUCKET
os.environ['PROJECT'] = PROJECT
os.environ['REGION'] = REGION

## Launch "locally" using Python

Make sure the notebook runs as intended when executed through papermill.

In [None]:
# Install papermill package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install papermill

In [94]:
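def run_flights_model(input_notebook: str,
                      output_notebook: str,
                      develop_mode: bool,
                      bucket: str,
                      project: str) -> None:
  """Execute input_notebook with papermill and write the result to output_notebook."""
  # execute notebook using papermill
  import papermill as pm
  pm.execute_notebook(
    input_notebook,
    output_notebook,
    # these values are injected into the executed copy of the notebook
    parameters={'BUCKET': bucket, 'PROJECT': project, 'DEVELOP_MODE': develop_mode}
  )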

In [49]:
%%bash
rm -rf kfp_output
mkdir kfp_output

In [50]:
run_flights_model('flights_model.ipynb',
                  os.path.join('kfp_output', 'flights_model.ipynb'),
                  develop_mode=True, bucket=BUCKET, project=PROJECT)

Input Notebook:  flights_model.ipynb
Output Notebook: kfp_output/flights_model.ipynb


In [82]:
!ls kfp_output

flights_model.ipynb
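
One way to confirm that the papermill run above went as intended is to scan the output notebook for error outputs. The sketch below is not part of the original notebook: `find_error_cells` is a hypothetical helper, and it assumes only the standard notebook JSON layout (`cells`, `cell_type`, `outputs`, `output_type`).

```python
import json

def find_error_cells(notebook_path):
    """Return the indices of code cells whose outputs contain an error."""
    with open(notebook_path, encoding='utf-8') as f:
        nb = json.load(f)
    failed = []
    for index, cell in enumerate(nb.get('cells', [])):
        # markdown and raw cells have no outputs to inspect
        if cell.get('cell_type') != 'code':
            continue
        if any(out.get('output_type') == 'error' for out in cell.get('outputs', [])):
            failed.append(index)
    return failed
```

Calling `find_error_cells(os.path.join('kfp_output', 'flights_model.ipynb'))` should return an empty list when every code cell ran cleanly.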


## Submit to Deep Learning VM to execute

This launches a GPU-equipped virtual machine in the cloud and executes the notebook on it.
Edit the params.yaml below so that BUCKET and PROJECT match your own.

In [144]:
%%writefile params.yaml
---
BUCKET: cloud-training-demos-ml
PROJECT: cloud-training-demos
DEVELOP_MODE: False

Overwriting params.yaml
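
Instead of editing params.yaml by hand, the file contents could be generated from the BUCKET and PROJECT variables set at the top of this notebook. This is a minimal sketch, not the author's method: `render_params` is a hypothetical helper, and it assumes the bucket and project names contain no characters that need YAML quoting.

```python
def render_params(bucket, project, develop_mode):
    """Build the text of params.yaml from Python values."""
    lines = [
        '---',
        'BUCKET: {}'.format(bucket),
        'PROJECT: {}'.format(project),
        'DEVELOP_MODE: {}'.format(bool(develop_mode)),
    ]
    return '\n'.join(lines) + '\n'

# Example with the values used in this notebook; replace them with your own.
params_text = render_params('cloud-training-demos-ml', 'cloud-training-demos', False)
print(params_text, end='')
```

Writing `params_text` to params.yaml would then keep the file in sync with the variables used elsewhere in the notebook.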


In [146]:
%%bash
gcloud storage rm --recursive gs://$BUCKET/flights/notebook
gcloud storage cp *.ipynb params.yaml gs://$BUCKET/flights/notebook

Removing gs://cloud-training-demos-ml/flights/notebook/flights_model.ipynb#1543619551322929...
Removing gs://cloud-training-demos-ml/flights/notebook/flights_small.ipynb#1543619551422089...
Removing gs://cloud-training-demos-ml/flights/notebook/launcher.ipynb#1543619551539081...
Removing gs://cloud-training-demos-ml/flights/notebook/params.yaml#1543619551633910...
/ [4/4 objects] 100% Done
Operation completed over 4 objects.
Copying file://flights_model.ipynb [Content-Type=application/octet-stream]...
/ [1 files][ 27.9 KiB/ 27.9 KiB]

In [147]:
%%bash

set -x

GCS_FOLDER=gs://$BUCKET/flights/notebook

export IMAGE_FAMILY="tf-latest-cu100" # or any other Deep Learning VM image family you need
export ZONE="us-west1-b"
export INSTANCE_NAME="notebookexecutor"
export INSTANCE_TYPE="n1-standard-4"
#export INPUT_NOTEBOOK="$GCS_FOLDER/flights_small.ipynb"
export INPUT_NOTEBOOK="$GCS_FOLDER/flights_model.ipynb"
export OUTPUT_NOTEBOOK_DIR=$GCS_FOLDER
export PARAMS="$GCS_FOLDER/params.yaml"
export LAUNCHER_SCRIPT=https://raw.githubusercontent.com/b0noI/gcp-notebook-executor/master/notebook_executor.sh

gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator='type=nvidia-tesla-p100,count=1' \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --scopes=https://www.googleapis.com/auth/cloud-platform \
        --metadata="parameters_file=$PARAMS,input_notebook=$INPUT_NOTEBOOK,output_notebook=$OUTPUT_NOTEBOOK_DIR,startup-script-url=$LAUNCHER_SCRIPT"


NAME              ZONE        MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP     STATUS
notebookexecutor  us-west1-b  n1-standard-4               10.138.0.2   35.247.113.103  RUNNING


+ GCS_FOLDER=gs://cloud-training-demos-ml/flights/notebook
+ export IMAGE_FAMILY=tf-latest-cu100
+ IMAGE_FAMILY=tf-latest-cu100
+ export ZONE=us-west1-b
+ ZONE=us-west1-b
+ export INSTANCE_NAME=notebookexecutor
+ INSTANCE_NAME=notebookexecutor
+ export INSTANCE_TYPE=n1-standard-4
+ INSTANCE_TYPE=n1-standard-4
+ export INPUT_NOTEBOOK=gs://cloud-training-demos-ml/flights/notebook/flights_model.ipynb
+ INPUT_NOTEBOOK=gs://cloud-training-demos-ml/flights/notebook/flights_model.ipynb
+ export OUTPUT_NOTEBOOK_DIR=gs://cloud-training-demos-ml/flights/notebook
+ OUTPUT_NOTEBOOK_DIR=gs://cloud-training-demos-ml/flights/notebook
+ export PARAMS=gs://cloud-training-demos-ml/flights/notebook/params.yaml
+ PARAMS=gs://cloud-training-demos-ml/flights/notebook/params.yaml
+ export LAUNCHER_SCRIPT=https://raw.githubusercontent.com/b0noI/gcp-notebook-executor/master/notebook_executor.sh
+ LAUNCHER_SCRIPT=https://raw.githubusercontent.com/b0noI/gcp-notebook-executor/master/notebook_executor.sh
+ gcloud 
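
The `--metadata` flag in the cell above takes a single comma-separated list of `key=value` pairs, which is easy to mistype. The sketch below shows one way to assemble that string from a dictionary. `build_metadata` is a hypothetical helper, not part of gcloud or of the original notebook, and it simply rejects values that contain a comma rather than trying to escape them.

```python
def build_metadata(entries):
    """Join key/value pairs into the comma-separated form passed to --metadata."""
    parts = []
    for key, value in entries.items():
        value = str(value)
        # a comma inside a value would be read as the start of the next pair
        if ',' in value:
            raise ValueError('metadata value for {!r} contains a comma'.format(key))
        parts.append('{}={}'.format(key, value))
    return ','.join(parts)

gcs_folder = 'gs://cloud-training-demos-ml/flights/notebook'
metadata = build_metadata({
    'parameters_file': gcs_folder + '/params.yaml',
    'input_notebook': gcs_folder + '/flights_model.ipynb',
    'output_notebook': gcs_folder,
    'startup-script-url': 'https://raw.githubusercontent.com/b0noI/gcp-notebook-executor/master/notebook_executor.sh',
})
print(metadata)
```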

In [105]:
%%bash
gcloud storage mv gs://$BUCKET/flights/notebook/notebook.ipynb gs://$BUCKET/flights/notebook/flights_model_dlvm.ipynb
gcloud storage ls gs://$BUCKET/flights/notebook

gs://cloud-training-demos-ml/flights/notebook/flights_small.ipynb
gs://cloud-training-demos-ml/flights/notebook/notebook.ipynb
gs://cloud-training-demos-ml/flights/notebook/params.yaml


## Submit to Kubeflow Pipelines to run as a component

The submitnotebook Docker image has the latest versions of TensorFlow and papermill installed. It invokes papermill on the supplied notebook.
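
The image's entrypoint is not shown in this notebook. As a rough illustration only, the command it runs might look like the sketch below, which builds a papermill command line from the same three arguments the pipeline passes in (input notebook, output notebook, parameters file). `papermill_command` is a hypothetical helper. The `-f` option is papermill's flag for a YAML parameters file.

```python
def papermill_command(input_notebook, output_notebook, params_file):
    """Return a papermill CLI invocation as an argument list."""
    # -f points papermill at a YAML file of parameters
    return ['papermill', input_notebook, output_notebook, '-f', params_file]

cmd = papermill_command('flights_model.ipynb',
                        'flights_model_out.ipynb',
                        'params.yaml')
print(' '.join(cmd))
```

A list like this could be handed to `subprocess.run(cmd, check=True)` so that a failing notebook also fails the container.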

In [None]:
import sys
!{sys.executable} -m pip install https://storage.googleapis.com/ml-pipeline/release/0.1.3-rc.2/kfp.tar.gz --upgrade

### Create the pipeline

In [132]:
import kfp.components as comp
import kfp.dsl as dsl

# a single-op pipeline that runs the flights notebook in a pod
@dsl.pipeline(
   name='FlightsPipeline',
   description='Trains, deploys flights model'
)
def flights_pipeline(
   inputnb=dsl.PipelineParam('inputnb'),
   outputnb=dsl.PipelineParam('outputnb'),
   params=dsl.PipelineParam('params')
):
    notebookop = dsl.ContainerOp(
      name='flightsmodel',
      # image needs to be a compile-time string
      image='gcr.io/cloud-training-demos/submitnotebook:latest',
      arguments=[
        inputnb,
        outputnb,
        params
      ]
    )

# compile the pipeline
pipeline_func = flights_pipeline
pipeline_filename = pipeline_func.__name__ + '.tar.gz'
import kfp.compiler as compiler
compiler.Compiler().compile(pipeline_func, pipeline_filename)
print(pipeline_filename)

flights_pipeline.tar.gz


In [133]:
!ls *.tar.gz

flights_pipeline.tar.gz


### Run the pipeline

In [121]:
%%writefile params.yaml
---
BUCKET: cloud-training-demos-ml
PROJECT: cloud-training-demos
DEVELOP_MODE: False

Overwriting params.yaml


In [None]:
%%bash
gcloud storage rm --recursive gs://$BUCKET/flights/notebook
gcloud storage cp *.ipynb params.yaml gs://$BUCKET/flights/notebook

In [148]:
!gcloud storage ls gs://$BUCKET/flights/notebook

gs://cloud-training-demos-ml/flights/notebook/flights_model.ipynb
gs://cloud-training-demos-ml/flights/notebook/flights_small.ipynb
gs://cloud-training-demos-ml/flights/notebook/launcher.ipynb
gs://cloud-training-demos-ml/flights/notebook/params.yaml


In [143]:
#Specify pipeline argument values
GCSDIR='gs://{}/flights/notebook'.format(BUCKET)
arguments = {
    'inputnb': '{}/flights_model.ipynb'.format(GCSDIR),
    'outputnb': '{}/flights_model_kfp.ipynb'.format(GCSDIR),
    'params': '{}/params.yaml'.format(GCSDIR),
}

#Get or create an experiment and submit a pipeline run
import kfp
client = kfp.Client()
list_experiments_response = client.list_experiments()
experiments = list_experiments_response.experiments
if not experiments:
    #The user does not have any experiments available. Creating a new one
    experiment = client.create_experiment('Flight pipeline experiment')
else:
    experiment = experiments[-1] #Using the last experiment

#Submit a pipeline run
from datetime import datetime
run_name = 'Flight pipeline {}'.format(datetime.now().strftime("%Y%m%d %H%M%S"))
run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, params=arguments)
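
The cell above reuses whichever experiment happens to be last in the list. A variation, sketched here under assumptions, is to look the experiment up by name. `pick_experiment` is a hypothetical helper, and it assumes each experiment object exposes a `name` attribute alongside the `id` used above. The sample objects are stand-ins for what `client.list_experiments().experiments` would return.

```python
from types import SimpleNamespace

def pick_experiment(experiments, name):
    """Return the first experiment with the given name, or None if absent."""
    # experiments may be None when the user has not created any yet
    for experiment in experiments or []:
        if experiment.name == name:
            return experiment
    return None

# Stand-in objects for illustration only.
sample = [
    SimpleNamespace(id='1', name='Default'),
    SimpleNamespace(id='2', name='Flight pipeline experiment'),
]
chosen = pick_experiment(sample, 'Flight pipeline experiment')
print(chosen.id)  # prints 2
```

When the helper returns None, the `client.create_experiment` call from the cell above would create the experiment.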

In [149]:
!gcloud storage ls gs://$BUCKET/flights/notebook

gs://cloud-training-demos-ml/flights/notebook/flights_model.ipynb
gs://cloud-training-demos-ml/flights/notebook/flights_model_kfp.ipynb
gs://cloud-training-demos-ml/flights/notebook/flights_small.ipynb
gs://cloud-training-demos-ml/flights/notebook/launcher.ipynb
gs://cloud-training-demos-ml/flights/notebook/notebook.ipynb
gs://cloud-training-demos-ml/flights/notebook/params.yaml
