# Overview

In this section, we'll see how to create and use a custom container for a Vertex AI training job.

To create a custom container for Vertex AI pipeline model training:
1. Create a Python model trainer module from the provided `model_sample.py` file. You will need to change its variables (for example, the project and bucket information) for it to run properly.
2. Save your code as `model.py` in the `model/trainer` directory beneath the current working directory for this notebook.
3. Make sure you set the Project ID correctly in the Python script.
4. Create the Dockerfile in the `model/` directory for your custom training container, using the `gcr.io/deeplearning-platform-release/tf2-cpu.2-6` base container image.
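
Step 4 can be sketched as a small Python helper that renders and writes the Dockerfile. This is a minimal sketch, not the Dockerfile shipped with this material: the `COPY` layout, the `keras-tuner` install line, and the `trainer.model` entrypoint are assumptions based on the directory layout in steps 2 and 4, and `render_dockerfile` and `write_dockerfile` are hypothetical helper names.

```python
from pathlib import Path

# Base image named in step 4 above.
BASE_IMAGE = "gcr.io/deeplearning-platform-release/tf2-cpu.2-6"


def render_dockerfile(base_image: str = BASE_IMAGE,
                      trainer_module: str = "trainer.model") -> str:
    """Return Dockerfile text that copies trainer/ and runs it as a module."""
    lines = [
        f"FROM {base_image}",
        "WORKDIR /",
        # Assumption: the build context is model/, so trainer/ sits next to the Dockerfile.
        "COPY trainer /trainer",
        # Assumption: keras-tuner is not in the base image; adjust to match your imports.
        "RUN pip install --no-cache-dir keras-tuner",
        f'ENTRYPOINT ["python", "-m", "{trainer_module}"]',
    ]
    return "\n".join(lines) + "\n"


def write_dockerfile(model_dir: str = "model") -> Path:
    """Write the rendered Dockerfile into model_dir and return its path."""
    target = Path(model_dir) / "Dockerfile"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_dockerfile())
    return target
```

Calling `write_dockerfile("model")` from the notebook's working directory would create `model/Dockerfile`; review the generated file before building.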

Once you have prepared the custom container Python module code and Dockerfile, you can build and test the custom container.

Optionally, you can then try out this custom container in a training pipeline.

In [1]:
!pip3 install google-cloud-aiplatform --user --quiet


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.0[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


To create the container and run this training job, you first need to copy the training data into your own Google Cloud Storage bucket. Then update the corresponding variables in the `model.py` script to point to the proper location and region.

The training source data can be downloaded from this repository as `area_cover_dataset.csv`.
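
One way to copy the downloaded file into your bucket from Python is sketched below. It assumes the `google-cloud-storage` client library is installed and that you are authenticated; `split_gcs_uri` and `upload_dataset` are hypothetical helper names, not part of this repository.

```python
from typing import Tuple


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket/path/to/file' into ('bucket', 'path/to/file')."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, blob = uri[len("gs://"):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return bucket, blob


def upload_dataset(local_path: str, gcs_uri: str, project: str) -> None:
    """Upload a local file to the given gs:// location."""
    # Imported here so split_gcs_uri works even without the library installed.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_uri(gcs_uri)
    client = storage.Client(project=project)
    client.bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)
```

For example, `upload_dataset("area_cover_dataset.csv", DATASET_PATH, PROJECT_ID)` would upload the file to the location that the variables cell in this notebook defines.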

You may need to set up Container Registry in your GCP project if you do not have an existing one to use.


In [None]:
# These are the modules used by the training job in the model.py script
import tensorflow 
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
import keras_tuner 
from google.cloud import aiplatform

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import numpy
import pandas
import json, os

In [None]:
# These are the variables you will need to change - both here and in your model.py script
REGION = "us-central1"
PROJECT_ID = !(gcloud config get-value core/project)
PROJECT_ID = PROJECT_ID[0]
MODEL_PATH = "gs://" + PROJECT_ID + "-bucket/model"
DATASET_PATH = "gs://" + PROJECT_ID + "-bucket/area_cover_dataset.csv"
PIPELINE_ROOT = "gs://" + PROJECT_ID + "-bucket"
MODEL_ARTIFACTS_LOCATION = "gs://" + PROJECT_ID + "-bucket/"
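
The cell above assumes a bucket named `<project-id>-bucket`. As a minimal sketch, the same paths can be derived in one place and guarded against an empty project ID; `derive_paths` is a hypothetical helper, and the empty-string check is an assumption about how a misconfigured `gcloud` might surface.

```python
def derive_paths(project_id: str) -> dict:
    """Build the gs:// locations used in this notebook from a project ID."""
    project_id = project_id.strip()
    if not project_id:
        raise ValueError(
            "project_id is empty; check `gcloud config get-value core/project`"
        )
    # Assumption: the bucket follows the '<project-id>-bucket' naming used above.
    bucket = f"gs://{project_id}-bucket"
    return {
        "MODEL_PATH": f"{bucket}/model",
        "DATASET_PATH": f"{bucket}/area_cover_dataset.csv",
        "PIPELINE_ROOT": bucket,
        "MODEL_ARTIFACTS_LOCATION": f"{bucket}/",
    }
```

If your bucket uses a different name, change the `bucket` line rather than each path individually.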

Once you have updated all of your variables, you're ready to build the container, test it, and then run the training job.

In [None]:
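# Build the container image and tag it with the following gcr.io URI.
# Note: the build context here is ~/model; adjust the path if your model/ directory lives elsewhere.
IMAGE_URI="gcr.io/{}/tensorflow:latest".format(PROJECT_ID)
!docker build ~/model/. -t $IMAGE_URI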

In [None]:
# Run the docker image locally to test it
!docker run $IMAGE_URI

In [None]:
# Push the docker image to the Google container registry
!docker push $IMAGE_URI
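
If you want to run the same build, test, and push steps outside a notebook, they can be sketched with `subprocess`. This is a sketch under assumptions: `docker_commands` and `build_test_push` are hypothetical helpers, and the `--rm` flag (which removes the test container after it exits) is an addition that the cells above do not use.

```python
import subprocess
from typing import List


def docker_commands(image_uri: str, context_dir: str) -> List[List[str]]:
    """Return the build, run, and push commands as argument lists."""
    return [
        ["docker", "build", "-t", image_uri, context_dir],
        # --rm is an addition: it cleans up the container after the local test run.
        ["docker", "run", "--rm", image_uri],
        ["docker", "push", image_uri],
    ]


def build_test_push(image_uri: str, context_dir: str) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in docker_commands(image_uri, context_dir):
        subprocess.run(cmd, check=True)
```

With `check=True`, a failed build or local test run raises `CalledProcessError`, so a broken image is not pushed.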

You can navigate to the Container Registry to confirm that the image was created successfully. This is also where you can get its URI.

The image can now be used as part of a training job or pipeline; see the example below. It may make more sense to come back to this once you've completed the basic pipelines 101 section.

In [None]:
# Install the Kubeflow Pipelines SDK and Google Cloud Pipeline Components for building Vertex AI pipelines
!pip3 install kfp google_cloud_pipeline_components

In [None]:
# Import the libraries required for Vertex AI pipelines
import kfp
from kfp.v2 import compiler
from google.cloud import aiplatform
from google_cloud_pipeline_components import aiplatform as gcc_aip

If you want to test running the pipeline yourself:
* Make sure to update `container_uri` to the URI of the custom container that you created in the previous steps.
* You will also want to update the `base_output_dir` location.

In [None]:
# Define the Vertex AI pipeline
@kfp.dsl.pipeline(name="vertex-ai-pipeline",
                  pipeline_root=PIPELINE_ROOT)
def pipeline(
    bucket: str = MODEL_PATH,
    project: str = PROJECT_ID,
    gcp_region: str = REGION,
    container_uri: str = "gcr.io/uki-mlops-dev-demo/tensorflow@sha256:e3f9f2c4bc1879b864f2931416d7c6d6a78a36d7493222a98ff39afc679a8f81",
):
    
    training_op = gcc_aip.CustomContainerTrainingJobRunOp(
        display_name="forestcover-train",
        container_uri=container_uri,
        project=project,
        location=gcp_region,
        staging_bucket=bucket,
        base_output_dir="gs://uki-mlops-dev-demo-bucket",
        training_fraction_split=0.8,
        validation_fraction_split=0.1,
        test_fraction_split=0.1,
        model_serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-6:latest",
        model_display_name="forestcover",
        machine_type="n1-standard-4",
    )   
    
    create_endpoint_op = gcc_aip.EndpointCreateOp(
        project=project,
        display_name = "forestcover-endpoint",
    )
    
    model_deploy_op = gcc_aip.ModelDeployOp(
        endpoint=create_endpoint_op.outputs["endpoint"],
        model=training_op.outputs["model"],
        deployed_model_display_name="forestcover",
        dedicated_resources_machine_type="n1-standard-4",
        dedicated_resources_min_replica_count=1,
        dedicated_resources_max_replica_count=1,   
    )

In [None]:
# Compile the Vertex AI pipeline
compiler.Compiler().compile(
    pipeline_func=pipeline, package_path="pipeline.json"
)

You can use a timestamp to give each pipeline run a unique job ID, which helps when debugging pipeline runs.

In [None]:
from datetime import datetime

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
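
As a small sketch, the timestamped job ID can be wrapped in a helper so that each run gets a fresh value. `make_job_id` is a hypothetical name, and using UTC instead of local time is a design choice made here, not something the notebook requires.

```python
from datetime import datetime, timezone
from typing import Optional


def make_job_id(prefix: str, now: Optional[datetime] = None) -> str:
    """Return '<prefix>-<YYYYmmddHHMMSS>' for the given (or current UTC) time."""
    # Assumption: UTC keeps IDs comparable across machines in different time zones.
    moment = now if now is not None else datetime.now(timezone.utc)
    stamp = moment.strftime("%Y%m%d%H%M%S")
    return f"{prefix}-{stamp}"
```

For example, `make_job_id("forest-train-pipeline")` produces an ID in the same shape as the `job_id` built from `TIMESTAMP` in this notebook.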

In [None]:
# Create the Vertex AI Pipeline job object
pipeline_job = aiplatform.PipelineJob(
    display_name="forest-cover",
    template_path="pipeline.json",
    job_id="forest-train-pipeline-{0}".format(TIMESTAMP),
    parameter_values={
        "project": PROJECT_ID,
        "bucket": MODEL_PATH,
        "gcp_region": REGION,
        "container_uri": "gcr.io/uki-mlops-dev-demo/tensorflow@sha256:e3f9f2c4bc1879b864f2931416d7c6d6a78a36d7493222a98ff39afc679a8f81"
    },
    enable_caching=True,  
    
)

In [None]:
# Run the Vertex AI pipeline job
pipeline_job.run()

Now, if you navigate to the Pipelines UI, you'll see the pipeline job running. When it has finished, you'll also see the endpoint created and the model successfully deployed to it.
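
You can also check for the endpoint from Python instead of the UI. The sketch below assumes the `forestcover-endpoint` display name used in the pipeline definition and that you are authenticated; `endpoint_filter` and `find_endpoints` are hypothetical helper names.

```python
def endpoint_filter(display_name: str) -> str:
    """Build a list filter that matches an endpoint by display name."""
    return f'display_name="{display_name}"'


def find_endpoints(project: str, location: str,
                   display_name: str = "forestcover-endpoint") -> list:
    """Return matching Vertex AI endpoints, newest first."""
    # Imported here so endpoint_filter works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    return aiplatform.Endpoint.list(
        filter=endpoint_filter(display_name),
        order_by="create_time desc",
    )
```

Calling `find_endpoints(PROJECT_ID, REGION)` after the run should return a non-empty list if the pipeline created the endpoint; an empty list suggests the run has not finished or has failed.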