# Compile and Upload the TFX Pipeline

This Notebook helps you to compile the **TFX Pipeline** to a **KFP package**. This will creat an **Argo YAML** file in a **.tar.gz** package. We perform the following steps:
1. Build a custom container image that include our modules
2. Compile TFX Pipeline using CLI
3. Deploy the compiled pipeline to KFP 

## 1. Build Container Image

The pipeline uses a custom docker image, which is a derivative of the [tensorflow/tfx:0.15.0](https://hub.docker.com/r/tensorflow/tfx) image, as a runtime execution environment for the pipeline's components. The same image is also used as a a training image used by **AI Platform Training**.

The custom image modifies the base image by:
 * Downgrading from Tensoflow v2.0 to v1.15 (since AI Platform Prediction is not supporting TF v2.0 yet).
 * Adding the `modules` folder, which includes the **train.py** and **transform.py** code files required by the **Trainer** and **Transform** components, as well as the implementation code for the custom **AccuracyModelValidator** component.


In [1]:
PROJECT_ID='ksalama-ocado' # Set your GCP project Id
IMAGE_NAME='tfx-image'
TAG='latest'
TFX_IMAGE='gcr.io/{}/{}:{}'.format(PROJECT_ID, IMAGE_NAME, TAG)

!gcloud builds submit --tag $TFX_IMAGE ./ml-pipeline

Creating temporary tarball archive of 17 file(s) totalling 51.6 KiB before compression.
Uploading tarball of [./ml-pipeline] to [gs://ksalama-ocado_cloudbuild/source/1581501568.68-f78934fd77e24b148d32e17cc86596c8.tgz]
Created [https://cloudbuild.googleapis.com/v1/projects/ksalama-ocado/builds/880884a9-43e1-4f23-a228-0323e1ce5f86].
Logs are available at [https://console.cloud.google.com/gcr/builds/880884a9-43e1-4f23-a228-0323e1ce5f86?project=621539573576].
----------------------------- REMOTE BUILD OUTPUT ------------------------------
starting build "880884a9-43e1-4f23-a228-0323e1ce5f86"

FETCHSOURCE
Fetching storage object: gs://ksalama-ocado_cloudbuild/source/1581501568.68-f78934fd77e24b148d32e17cc86596c8.tgz#1581501569098225
Copying gs://ksalama-ocado_cloudbuild/source/1581501568.68-f78934fd77e24b148d32e17cc86596c8.tgz#1581501569098225...
/ [1 files][ 11.8 KiB/ 11.8 KiB]                                                
Operation completed over 1 objects/11.8 KiB.                     

## 2. Compile TFX Pipeline using CLI

In [None]:
!tfx pipeline --help

In [2]:
%%bash 

export PROJECT_ID=$(gcloud config get-value core/project) # Set your GCP project Id

export IMAGE_NAME=tfx-image
export TAG=latest
export TFX_IMAGE=gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${TAG}

export PREFIX=ksalama-mlops-dev # Set your prefix
export NAMESPACE=kfp # Set your namespace
export GCP_REGION=europe-west1 # Set your region
export ZONE=europe-west1-b # Set your zone

export ARTIFACT_STORE_URI=gs://${PREFIX}-artifact-store
export GCS_STAGING_PATH=${ARTIFACT_STORE_URI}/staging
export GKE_CLUSTER_NAME=${PREFIX}-cluster
export DATASET_NAME=sample_datasets # Set your BigQuery Dataset
    
export PIPELINE_NAME=tfx-census-classifier-ct
export RUNTIME_VERSION=1.15
export PYTHON_VERSION=3.7

tfx pipeline compile \
    --engine=kubeflow \
    --pipeline_path=ml-pipeline/pipeline.py 

CLI
Compiling pipeline
Pipeline compiled successfully.
Pipeline package path: /home/tfx-workshop/03-tfx-kfp-gcp/tfx-census-classifier-ct.tar.gz


## 3. Deploy the Compiled Pipeline to KFP

In [None]:
!kfp pipeline --help

In [3]:
%%bash

export NAMESPACE=kfp # Set your namespac
export PREFIX=ksalama-mlops-dev # Set your prefix
export GKE_CLUSTER_NAME=${PREFIX}-cluster
export ZONE=europe-west1-b # Set your zone

gcloud container clusters get-credentials ${GKE_CLUSTER_NAM}E --zone ${ZONE}
export INVERSE_PROXY_HOSTNAME=$(kubectl describe configmap inverse-proxy-config -n ${NAMESPACE} | grep "googleusercontent.com")

kfp --namespace=${NAMESPACE} --endpoint=${INVERSE_PROXY_HOSTNAME} \
    pipeline upload \
    --pipeline-name='[TFX] Census Classification CT' \
    tfx-census-classifier-ct.tar.gz

Pipeline Details
------------------
ID           d9495d19-3e84-43c3-9066-e667d292228c
Name         [TFX] Census Classification CT
Description
Uploaded at  2020-02-12T10:12:17+00:00
+------------------+----------------------------------------------------------------+
| Parameter Name   | Default Value                                                  |
| pipeline-root    | gs://ksalama-mlops-dev-artifact-store/tfx-census-classifier-ct |
+------------------+----------------------------------------------------------------+


Fetching cluster endpoint and auth data.
ERROR: (gcloud.container.clusters.get-credentials) ResponseError: code=404, message=Not found: projects/ksalama-ocado/zones/europe-west1-b/clusters/E.
No cluster named 'E' in ksalama-ocado.
Pipeline d9495d19-3e84-43c3-9066-e667d292228c has been submitted

