# Deploying the Pipeline
This notebook assumes that the required artifacts have already been created, built, and committed. Here we deploy only a new pipeline.

## Environment Setup
**NOTE:** Set `PROJECT_ID` to the ID of your own project.  

In [None]:
PROJECT_ID = 'mmlops3'
PREFIX = PROJECT_ID
REGION = 'us-central1'
JOB_DIR_ROOT='gs://{}-artifact-store/jobs'.format(PREFIX)
NAMESPACE='kubeflow'
ZONE='us-central1-a'
ARTIFACT_STORE_URI='gs://{}-artifact-store'.format(PREFIX)
GCS_STAGING_PATH='{}/staging'.format(ARTIFACT_STORE_URI)
GKE_CLUSTER_NAME='{}-cluster'.format(PREFIX)


!gcloud container clusters get-credentials $GKE_CLUSTER_NAME --zone $ZONE
HOST_TEMP=!(kubectl describe configmap inverse-proxy-config -n $NAMESPACE | grep "googleusercontent.com")
INVERSE_PROXY_HOSTNAME=HOST_TEMP[0]

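The `grep` above can return an empty list (for example if the config map is missing), and the matched line may carry surrounding whitespace or a label. A minimal sketch of a more defensive extraction follows; `extract_hostname` is a hypothetical helper and the sample line is made up for illustration, not real cluster output.

```python
def extract_hostname(lines, marker="googleusercontent.com"):
    """Return the first whitespace-delimited token that contains `marker`."""
    for line in lines:
        for token in line.split():
            if marker in token:
                return token
    # Fail loudly instead of letting an IndexError surface later.
    raise ValueError(f"no hostname containing {marker!r} found")


# Hypothetical grep output; in the notebook this would be HOST_TEMP.
sample_output = ["  abc123-dot-us-central1.pipelines.googleusercontent.com  "]
print(extract_hostname(sample_output))
```

In the notebook this could replace the indexing step, as in `INVERSE_PROXY_HOSTNAME = extract_hostname(HOST_TEMP)`.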

## Deploying the pipeline
Select a pipeline name and ensure it is not already in use at the allocated hostname; otherwise the upload fails with a 500 error. Then deploy the pipeline.
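One way to avoid a name clash is to bump the version suffix until the name is free. The sketch below assumes the names of already-deployed pipelines are available as a plain list (for example, copied from the pipeline list in the UI); `next_free_name` is a hypothetical helper, not part of the `kfp` tooling.

```python
import re


def next_free_name(name, existing_names):
    """Return `name` if unused, otherwise bump its trailing `_v<N>` until free."""
    existing = set(existing_names)
    match = re.fullmatch(r"(.*_v)(\d+)", name)
    if match is None:
        # No version suffix yet: start numbering after an implicit version 0.
        stem, version = name + "_v", 0
    else:
        stem, version = match.group(1), int(match.group(2))
    candidate = name
    while candidate in existing:
        version += 1
        candidate = f"{stem}{version}"
    return candidate


# Assumed list of names that are already taken (illustrative only).
taken = ["covertype_classifier_training_v0"]
print(next_free_name("covertype_classifier_training_v0", taken))
```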

In [None]:
PIPELINE_NAME='covertype_classifier_training_v0'

!kfp --endpoint {INVERSE_PROXY_HOSTNAME} pipeline upload -p {PIPELINE_NAME} covertype_training_pipeline.yaml

The following command lists the pipelines deployed at the given hostname. The list should include `covertype_classifier_training_v0`, and it also lets us copy the pipeline ID.

In [None]:
!kfp --endpoint {INVERSE_PROXY_HOSTNAME} pipeline list

### Viewing the pipeline
The deployed pipeline can be viewed through the Kubeflow Pipeline UI given at the URL below. 

In [None]:
print('https://{}'.format(INVERSE_PROXY_HOSTNAME))

## Run Experiment 
Now that the pipeline is deployed, we run an experiment. This triggers a pipeline run that pulls the data from BigQuery and splits it, trains the models, evaluates them, and deploys the best-performing model. The experiment takes approximately an hour to execute and results in a deployed model that can be queried through GCP's AI Platform Prediction service.

**NOTE:** Change the PIPELINE_ID to reflect the ID copied from above.  

In [None]:
PIPELINE_ID='d99c0b38-cef2-4056-a6aa-fed4f35dd06c'

EXPERIMENT_NAME='Covertype_Classifier_Training_V0'
RUN_ID='Run_001'
SOURCE_TABLE='covertype_dataset.covertype'
DATASET_ID='splits'
EVALUATION_METRIC='accuracy'
EVALUATION_METRIC_THRESHOLD='0.69'
MODEL_ID='covertype_classifier_V0'
VERSION_ID='v01'
REPLACE_EXISTING_VERSION=True

In [None]:
!kfp --endpoint {INVERSE_PROXY_HOSTNAME} run submit \
-e {EXPERIMENT_NAME} \
-r {RUN_ID} \
-p {PIPELINE_ID} \
project_id={PROJECT_ID} \
gcs_root={GCS_STAGING_PATH} \
region={REGION} \
source_table_name={SOURCE_TABLE} \
dataset_id={DATASET_ID} \
evaluation_metric_name={EVALUATION_METRIC} \
evaluation_metric_threshold={EVALUATION_METRIC_THRESHOLD} \
model_id={MODEL_ID} \
version_id={VERSION_ID} \
replace_existing_version={REPLACE_EXISTING_VERSION}

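The submit command passes each pipeline parameter as a `key=value` argument. As a sketch of how the same command line could be assembled in plain Python, the hypothetical helper `build_run_submit_args` below builds the argument list from a dictionary; the endpoint value is a placeholder, and the flags are the ones used in the cell above.

```python
import shlex


def build_run_submit_args(endpoint, experiment, run_name, pipeline_id, params):
    """Assemble the `kfp run submit` argument list from a parameter dict."""
    args = [
        "kfp", "--endpoint", endpoint, "run", "submit",
        "-e", experiment, "-r", run_name, "-p", pipeline_id,
    ]
    # Each pipeline parameter becomes one key=value argument;
    # a Python bool such as True is rendered as the string "True".
    args += [f"{key}={value}" for key, value in params.items()]
    return args


args = build_run_submit_args(
    endpoint="example.googleusercontent.com",  # placeholder hostname
    experiment="Covertype_Classifier_Training_V0",
    run_name="Run_001",
    pipeline_id="d99c0b38-cef2-4056-a6aa-fed4f35dd06c",
    params={
        "project_id": "mmlops3",
        "evaluation_metric_threshold": "0.69",
        "replace_existing_version": True,
    },
)
print(" ".join(shlex.quote(arg) for arg in args))
```

The resulting list could be handed to `subprocess.run(args, check=True)` when running outside a notebook, which avoids relying on shell interpolation of the variables.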
## Testing model
To test the model, we can use the AI Platform Prediction API to request a prediction from a JSON input. Alternatively, we can use the prediction UI and enter *{"instances":[[2395,0,0,60,6,1170,218,238,156,1054,"Cache","C2717"]]}* in the test case window.

We write a prediction JSON file containing two data points; the correct cover types are 6 and 1, respectively.
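The `--json-instances` flag of `gcloud ai-platform predict` reads newline-delimited JSON, one instance per line, whereas the prediction UI takes a single `{"instances": [...]}` object. The sketch below shows how both forms could be generated from the same Python lists; the helper names are hypothetical, and the field count of 12 is an assumption taken from the example rows in this notebook.

```python
import json


def to_json_lines(instances, expected_fields=12):
    """Serialize instances as newline-delimited JSON, one instance per line."""
    for i, instance in enumerate(instances):
        if len(instance) != expected_fields:
            raise ValueError(
                f"instance {i} has {len(instance)} fields, expected {expected_fields}"
            )
    return "".join(json.dumps(instance) + "\n" for instance in instances)


def to_request_body(instances):
    """Wrap instances in the {"instances": [...]} form used by the prediction UI."""
    return json.dumps({"instances": instances})


instances = [
    [3366, 122, 15, 789, 127, 2881, 244, 227, 107, 2437, "Commanche", "C8772"],
    [2791, 340, 15, 30, 10, 3906, 188, 217, 168, 5401, "Rawah", "C7745"],
]
print(to_json_lines(instances), end="")
print(to_request_body(instances))
```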

In [None]:
%%writefile predict.json
[3366,122,15,789,127,2881,244,227,107,2437,"Commanche","C8772"]
[2791,340,15,30,10,3906,188,217,168,5401,"Rawah","C7745"]

In [None]:
INPUT_DATA_FILE="./predict.json"

!gcloud ai-platform predict --model {MODEL_ID} \
  --version {VERSION_ID} \
  --json-instances {INPUT_DATA_FILE}