# End-to-end MLOps & Gen AI Event-driven Solution

## Background / overview

In this notebook, we will build an end-to-end, event-driven solution that combines MLOps and Gen AI.

We will focus on a computer vision (CV) use case to identify objects in images. Extending the CV use case we implemented in Chapter 15, we will build our CV model via an MLOps pipeline that incorporates the majority of the main topics from the earlier chapters in this book, such as data preparation, model training, deployment, inference, and evaluation. 

The novel approach we are using here is to utilize generative AI to generate the data that will be used to test and evaluate our model.

## Prerequisites
**Note:** This notebook and repository are supporting artifacts for the "Google Machine Learning and Generative AI for Solutions Architects" book. The book describes the concepts associated with this notebook, and for some of the activities, the book contains instructions that should be performed before running the steps in the notebooks. Each top-level folder in this repo is associated with a chapter in the book. Please ensure that you have read the relevant chapter sections before performing the activities in this notebook.

**There are also important generic prerequisite steps outlined [here](https://github.com/PacktPublishing/Google-Machine-Learning-for-Solutions-Architects/blob/main/Prerequisite-steps/Prerequisites.ipynb).**


**Attention:** The code in this notebook creates Google Cloud resources that can incur costs.

Refer to the Google Cloud pricing documentation for details.

For example:

* [Vertex AI Pricing](https://cloud.google.com/vertex-ai/pricing)
* [Google Cloud Storage Pricing](https://cloud.google.com/storage/pricing)
* [Cloud Functions Pricing](https://cloud.google.com/functions/pricing)
* [Eventarc Pricing](https://cloud.google.com/eventarc/pricing)

## Solution Architecture

Let's begin with the MLOps pipeline architecture.

### MLOps pipeline architecture

The following is the architecture of the MLOps pipeline we will build in this notebook:

![MLOps](images/cpt-18-mlops-pipeline.png)

In the MLOps pipeline, the process works as follows: 

1. The first step in our pipeline, the model training step, is invoked. While the MLOps pipeline we built for our tabular Titanic dataset in Chapter 11 started with distinct data preprocessing steps using Serverless Spark in Dataproc, in this chapter's pipeline the data ingestion and preparation steps are handled directly in the code of our model training job. Also, in this case, we are using the built-in CIFAR-10 image dataset in TensorFlow/Keras rather than fetching a dataset from an external source. Vertex AI Pipelines starts the model training process by submitting a model training job to the Vertex AI Training service.

1. In order to execute our custom training job, the Vertex AI Training service fetches our custom Docker container from Google Artifact Registry. 

1. When our model has been trained, the trained model artifacts are saved in Google Cloud Storage.  

1. The model training job status is complete.  

1. The next step in our pipeline—the model import step—is invoked. This is an intermediate step that prepares the model metadata to be referenced in later components of our pipeline. The relevant metadata in this case consists of the location of the model artifacts in Google Cloud Storage and the specification of the Docker container image in Google Artifact Registry that will be used to serve our model.  

1. The next step in our pipeline—the model upload step—is invoked. This step references the metadata from the model import step.  

1. The model metadata is used to register the model in Vertex AI Model Registry. This makes it easy to deploy our model for serving traffic in Vertex AI.  

1. The model upload job status is complete.  

1. The next step in our pipeline—the endpoint creation step—is invoked.  

1. An endpoint is created in the Vertex AI Prediction service. This endpoint will be used to host our model.  

1. The endpoint creation job status is complete.  

1. The next step in our pipeline - the model deployment step - is invoked.  

1. Our model is deployed to our endpoint in the Vertex AI Prediction service. This step references the metadata of the endpoint that has just been created by our pipeline, as well as the metadata of our model in the Vertex AI Model Registry.   

1. The model deployment job status is complete.  
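
The ordering of the steps above can be sketched as a small dependency graph. This is only an illustration using Python's standard library, not the pipeline definition itself: the step names are hypothetical labels, and the dependencies simply mirror the sequence in the numbered list.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each key maps to the set of steps it depends on.
PIPELINE_STEPS = {
    "train_model": set(),
    "import_model": {"train_model"},
    "upload_model": {"import_model"},
    "create_endpoint": {"upload_model"},
    "deploy_model": {"upload_model", "create_endpoint"},
}


def execution_order(steps):
    """Return an execution order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(PIPELINE_STEPS)
print(" -> ".join(order))
```

The deployment step lists two dependencies because it references both the registered model and the newly created endpoint.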

### End-to-end solution architecture

The solution architecture is shown in the following diagram:

![end-to-end](images/cpt-18-end-to-end.png)

Our MLOps pipeline is simplified in the top-left corner of the diagram. It still implements all of the same steps we discussed in the previous section, but the diagram is simplified so we can focus our discussion on the broader, end-to-end solution. In this context, the MLOps pipeline is represented as a single step in the overall process. 

The following set of steps describes the architecture:

1. Our MLOps pipeline trains and deploys our CV model. 

1. When the MLOps pipeline completes, it publishes a message to a Pub/Sub topic we created for that purpose. 

1. Eventarc detects that a message has been published to the Pub/Sub topic. 

1. Eventarc triggers the Cloud Function we’ve created to generate an image. 

1. The code in our image generation function makes a call to the Imagen API with a prompt to generate an image containing one of the types of objects our model was trained to recognize (i.e., a type of object supported by the CIFAR-10 dataset). 

1. Imagen generates an image and returns it to our function. 

1. Our function stores the new image in GCS. 

1. GCS emits an event indicating that a new object has been uploaded to our bucket. Eventarc detects this event. 

1. Eventarc invokes our next Cloud Function and passes the GCS event metadata to our function. This metadata includes details such as the identifiers of the bucket and the object in question. 

1. Our prediction function takes the details regarding the bucket and the object in question from the event metadata and uses those details to fetch the newly created object (i.e., the newly generated image from Imagen). 

1. Our prediction function then performs some preprocessing on the image to transform it into a format that is expected by our model (i.e., similar to the format of the CIFAR-10 data the model was trained on). Our function then sends the transformed data as a prediction request to the Vertex AI endpoint that hosts our model. 

1. Our model predicts what type of object is in the image, and sends a prediction response to our Cloud Function. 

1. Our Cloud Function saves the prediction response in GCS. 

When the process has been completed, you can view the generated image and the resulting prediction from our model in GCS. 

Notice that all of the steps in the solution are implemented automatically and without the need to provision any servers. This is a fully serverless, event-driven solution architecture. 
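
To make the event flow concrete, here is a toy, in-memory simulation of the chain. It does not call any Google Cloud API: `EventBus` and `FakeBucket` are hypothetical stand-ins for Eventarc and GCS, and the handlers only mimic what the two Cloud Functions do.

```python
from collections import defaultdict


class EventBus:
    """Toy stand-in for Eventarc: routes named events to subscribed handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)


class FakeBucket:
    """Toy stand-in for a GCS bucket that emits an event on every write."""

    def __init__(self, bus):
        self.objects = {}
        self._bus = bus

    def write(self, name, content):
        self.objects[name] = content
        self._bus.emit("object_finalized", {"name": name})


bus = EventBus()
bucket = FakeBucket(bus)


def generate_image(_message):
    # Placeholder for the Imagen call; stores fake image bytes.
    bucket.write("data/generated.png", b"fake-image-bytes")


def predict(event):
    # Ignore objects outside the data folder so our own output does not re-trigger us.
    if not event["name"].startswith("data/"):
        return
    bucket.write(f"predictions/{event['name']}.json", '{"label": "ship"}')


bus.subscribe("pipeline_completed", generate_image)
bus.subscribe("object_finalized", predict)

# Publishing one "pipeline completed" message drives the whole chain.
bus.emit("pipeline_completed", {"status": "done"})
print(sorted(bucket.objects))
```

Note that the prediction handler writes into the same bucket it listens to, so the folder check is what keeps the chain from looping.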

An interesting side-effect of this solution is that, although the primary intention is to test our newly trained model on generated data, this solution could also be applied to the inverse use case. That is, if we are confident that our model has been trained effectively and provides consistently accurate results, we could use it to evaluate the quality of the generated data. For example, if our model predicts that the generated data contains a particular type of object with a probability of 99.8%, we can interpret this as a reflection of the quality of the generated data. 
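
As a rough sketch of that inverse use case, a helper could accept a generated image only when the model agrees with the prompt with high confidence. The function name, the dictionary keys, and the 0.9 threshold are all illustrative assumptions rather than part of the solution built in this notebook.

```python
def generated_image_passes(prediction, expected_label, min_probability=0.9):
    """Return True when the model agrees with the prompt with enough confidence.

    `prediction` is a dict such as {"label": "ship", "probability": 0.998}.
    The 0.9 default threshold is an arbitrary illustrative value.
    """
    return (
        prediction["label"] == expected_label
        and prediction["probability"] >= min_probability
    )


print(generated_image_passes({"label": "ship", "probability": 0.998}, "ship"))
print(generated_image_passes({"label": "truck", "probability": 0.998}, "ship"))
```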

Now that we’ve discussed the various steps in the process, let’s start building it! 

# Initial setup

In this section, we set up all of the baseline requirements to build our solution.

## Install required packages

We will use the following libraries in this notebook:

* [The Vertex AI Python SDK](https://cloud.google.com/python/docs/reference/aiplatform/latest)
* [Kubeflow Pipelines (KFP)](https://www.kubeflow.org/docs/components/pipelines/v1/sdk/sdk-overview/)
* [Google Cloud Pipeline Components (GCPC)](https://cloud.google.com/vertex-ai/docs/pipelines/components-introduction)

**Note:** Sometimes the `pip` installation commands display warnings or errors regarding dependencies. In Chapter 14, I explain how to create custom Conda kernels to avoid dependency conflicts. However, if you have been following the instructions in the book that accompanies this repo (creating new Vertex AI notebooks where relevant, etc.) then our activities are not affected by any dependency conflicts, and you can ignore any pip dependency errors.

In [None]:
! python -m pip install --upgrade pip

In [None]:
! pip3 install --quiet --user --upgrade google-cloud-aiplatform kfp google-cloud-pipeline-components

*The pip installation commands sometimes report various errors. Those errors usually do not affect the activities in this notebook, and you can ignore them.*


## Restart the kernel

The code in the next cell will restart the kernel, which is sometimes required after installing/upgrading packages.

**When prompted, click OK to restart the kernel.**

The sleep command simply prevents further cells from executing before the kernel restarts.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)


In [None]:
import time
time.sleep(10)

# (Wait for kernel to restart before proceeding...)

## Import required libraries

In [None]:
# General
from datetime import datetime
from google.cloud import aiplatform

# Kubeflow Pipelines (KFP)
import kfp
from kfp import compiler, dsl
from kfp.dsl import component, Input, Output, Artifact

# Google Cloud Pipeline Components (GCPC)
from google_cloud_pipeline_components.v1 import dataset, custom_job
from google_cloud_pipeline_components.v1.model import ModelUploadOp
from google_cloud_pipeline_components.types import artifact_types
from google_cloud_pipeline_components.v1.endpoint import EndpointCreateOp, ModelDeployOp

## Set Google Cloud resource variables

The following code will set variables specific to your Google Cloud resources that will be used in this notebook, such as the Project ID, Region, and GCS Bucket.

**Note: This notebook is intended to execute in a Vertex AI Workbench Notebook, in which case the API calls issued in this notebook are authenticated according to the permissions (e.g., service account) assigned to the Vertex AI Workbench Notebook.**

We will use the `gcloud` command to get the Project ID details from the local Google Cloud project, and assign the results to the PROJECT_ID variable. If, for any reason, PROJECT_ID is not set, you can set it manually or change it, if preferred.

We also use a default bucket name for most of the examples and activities in this book, which has the format: `{PROJECT_ID}-aiml-sa-bucket`. You can change the bucket name if preferred.

Also, we're defaulting to the **us-central1** region, but you can optionally replace this with your [preferred region](https://cloud.google.com/about/locations).

In [None]:
PROJECT_ID_DETAILS = !gcloud config get-value project
PROJECT_ID = PROJECT_ID_DETAILS[0]  # The project ID is item 0 in the list returned by the gcloud command
BUCKET=f"{PROJECT_ID}-aiml-sa-bucket" # Optional: replace with your preferred bucket name, which must be a unique name.
REGION="us-central1" # Optional: replace with your preferred region (See: https://cloud.google.com/about/locations) 
print(f"Project ID: {PROJECT_ID}")
print(f"Bucket Name: {BUCKET}")

## Create bucket

The following code will create the bucket if it doesn't already exist.

If you get an error saying that it already exists, that's fine; you can ignore it and continue with the rest of the steps, unless you want to use a different bucket.

In [None]:
!gsutil mb -l {REGION} gs://{BUCKET}

# Begin implementation

Now that we have performed the prerequisite steps for this activity, it's time to implement the activity.

## Enable services

We will use the following Google Cloud services and APIs in this solution:
* [Cloud Functions](https://cloud.google.com/functions?hl=en)
* [Cloud Run](https://cloud.google.com/run?hl=en) (Cloud Functions run on Cloud Run)
* [Eventarc](https://cloud.google.com/eventarc/docs)
* [Pub/Sub](https://cloud.google.com/pubsub?hl=en)

In order to use those services, we need to enable their APIs in our GCP project. The following command enables them.

In [None]:
! gcloud services enable cloudfunctions.googleapis.com run.googleapis.com eventarc.googleapis.com pubsub.googleapis.com

### Verify that they are enabled

The following command lists all enabled APIs in our GCP project, and filters for the ones we want to use in this solution. Ensure that each of the relevant APIs appears in the output.

In [None]:
! gcloud services list --enabled | grep -E 'functions|run|event|pub'
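
If you prefer a programmatic check over reading the output by eye, a small helper along these lines could report which required APIs are missing. This is a sketch: the sample lines below are a hypothetical excerpt of the listing, and in a notebook you would pass in the lines captured from the `gcloud services list --enabled` command instead.

```python
REQUIRED_SERVICES = [
    "cloudfunctions.googleapis.com",
    "run.googleapis.com",
    "eventarc.googleapis.com",
    "pubsub.googleapis.com",
]


def missing_services(listing_lines, required=REQUIRED_SERVICES):
    """Return the required services that do not appear in the listing output."""
    # The first whitespace-separated token on each line is the service name.
    enabled = {line.split()[0] for line in listing_lines if line.strip()}
    return [name for name in required if name not in enabled]


# Hypothetical sample of the listing output (Cloud Run is deliberately absent).
sample_output = [
    "NAME                           TITLE",
    "cloudfunctions.googleapis.com  Cloud Functions API",
    "eventarc.googleapis.com        Eventarc API",
    "pubsub.googleapis.com          Cloud Pub/Sub API",
]
print(missing_services(sample_output))
```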

## Define constants
In this section, we define all of the constants that will be referenced throughout the rest of the notebook.

**REPLACE THE REGION AND BUCKET DETAILS WITH YOUR DETAILS.**

In [None]:
# Core constants
BUCKET_URI = f"gs://{BUCKET}"
BUCKET_DIR = "chapter-18"
APPLICATION_DIR = "mlops-images-app" # Local parent directory for our pipeline resources
TRAINER_DIR = f"{APPLICATION_DIR}/trainer" # Local directory for training resources
APP_NAME="mlops-images" # Base name for our pipeline application

# Cloud Function constants
GEN_FUNCTION="image_gen_function"
PREDICT_FUNCTION="image_predict_function"
DATA_FOLDER_PATH = f"{BUCKET_DIR}-data"
PRED_FOLDER_PATH = f"{BUCKET_DIR}-predictions"
PROMPT = "A ship on the ocean. It is fully visible." # We can use any object from CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html)
TOPIC_NAME="pipeline_completed_notifications"

# Pipeline constants
PIPELINE_NAME = "mlops-images-pipeline" # Name of our pipeline
PIPELINE_ROOT = f"{BUCKET_URI}/pipelines" # (See: https://www.kubeflow.org/docs/components/pipelines/v1/overview/pipeline-root/)
MODEL_NAME = "mlops-images" # Name of our model
EXPERIMENT_NAME = "aiml-sa-images-experiment" # Vertex AI "Experiment" name for metadata tracking

# Training constants
TRAIN_REPO_NAME=f'{APP_NAME}-training' # Name of repository in which we will store our custom training image
TRAIN_IMAGE_URI = f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{TRAIN_REPO_NAME}/{APP_NAME}-train:latest"
MODEL_URI = f"{BUCKET_URI}/models/chapter-18/cv" # Where to store our trained model

# Hyperparameters for training
BATCH_SIZE: int = 4
EPOCHS: int = 30
LEARNING_RATE: float = 0.001

# Arguments to pass to our training job
TRAINING_ARGS=[
    "--project_id",
    PROJECT_ID,
    "--bucket_name",
    BUCKET,
    "--model_path",
    MODEL_URI,
    "--batch_size",
    str(BATCH_SIZE),
    "--epochs",
    str(EPOCHS),
    "--learning_rate",
    str(LEARNING_RATE),
]

# Worker pool spec (see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/CustomJobSpec#workerpoolspec)
WORKER_POOL_SPEC = [
    {
        "machine_spec": {
            "machine_type": "n1-standard-4",
        },
        "replica_count": 1,
        "container_spec": {
            "image_uri": TRAIN_IMAGE_URI,
            "args": TRAINING_ARGS
        },
    }
]

# Serving constants
SERVING_IMAGE_URI = "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-12:latest" # (See: https://cloud.google.com/vertex-ai/docs/predictions/pre-built-containers)
ENDPOINT_NAME = "mlops-endpoint" # Name of endpoint on which to serve our trained model
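
For context on `TRAINING_ARGS`: the flags are passed to the training container as command-line arguments, so the training script needs to parse them. A minimal sketch of such a parser is shown below. The function name and the default values are assumptions for illustration, and the example values are placeholders standing in for the notebook's constants.

```python
import argparse


def parse_training_args(argv):
    """Parse the flags passed to the training container (hypothetical parser)."""
    parser = argparse.ArgumentParser(description="CIFAR-10 training job")
    parser.add_argument("--project_id", required=True)
    parser.add_argument("--bucket_name", required=True)
    parser.add_argument("--model_path", required=True)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    return parser.parse_args(argv)


# Placeholder values standing in for the notebook's constants.
example_args = [
    "--project_id", "my-project",
    "--bucket_name", "my-project-aiml-sa-bucket",
    "--model_path", "gs://my-project-aiml-sa-bucket/models/chapter-18/cv",
    "--batch_size", "4",
    "--epochs", "30",
    "--learning_rate", "0.001",
]
args = parse_training_args(example_args)
print(args.batch_size, args.epochs, args.learning_rate)
```

Because every argument arrives as a string, the numeric hyperparameters are converted with `type=int` and `type=float`, which is why the constants cell wraps them in `str(...)`.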

### Create local directories
We will use the following local directories during the activities in this notebook.

In [None]:
# make a source directory to save the code
!mkdir -p $APPLICATION_DIR
!mkdir -p $TRAINER_DIR
!mkdir -p $GEN_FUNCTION
!mkdir -p $PREDICT_FUNCTION

### Set project ID for gcloud
The following command sets our project ID for using gcloud commands in this notebook.

In [None]:
! gcloud config set project $PROJECT_ID --quiet

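### Create the Pub/Sub topic
The following command creates the Pub/Sub topic to which our MLOps pipeline will publish a message when it completes.

In [None]: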
! gcloud pubsub topics create $TOPIC_NAME

### Initialize the Vertex AI SDK client

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

# Create Cloud Functions


## Create the image generation function

First, we'll create the function that uses Imagen to generate an image and stores it in GCS. (Later, we'll write another function that takes the newly generated image from GCS and sends it in an inference request to the computer vision model that we will train later in this notebook.)

In this section we are not directly executing the code in this notebook; we are writing/saving the code to local files that will be uploaded to be executed by Google Cloud Functions.

### Create our requirements.txt file
The requirements.txt file is a convenient way to specify all of the packages that we want to install in the container image for our function. Cloud Build installs these packages when it builds the image during deployment.

In this case, we will install:
* [Google Cloud Functions Framework](https://cloud.google.com/functions/docs/functions-framework)
* [The Vertex Generative AI Python SDK](https://pypi.org/project/vertexai/)
* [Pillow](https://pypi.org/project/pillow/)
* [Python Client for Google Cloud Storage](https://cloud.google.com/python/docs/reference/storage/latest)

In [None]:
%%writefile {GEN_FUNCTION}/requirements.txt
functions-framework==3.*
vertexai
Pillow
google-cloud-storage

### Create our function code 

This is the code our function will run when invoked. It will perform the following steps:
1. Import required libraries.
1. Set local variables from [environment variables](https://cloud.google.com/functions/docs/configuring/env-var). (We specified the values of these variables earlier in this notebook.)
1. Implement a function that is invoked by Google Cloud Events (this is specified by using the `@functions_framework.cloud_event` decorator). This function does the following:
   * Set up the Vertex AI environment using our specified project and region.
   * Load the Imagen `"imagegeneration@006"` image generation model.
   * Generate an image. See [documentation here](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#vertex-ai-sdk-for-python) for details regarding the specified parameters and values.
   * Store the image in GCS.
   * Return the status.

In [None]:
%%writefile {GEN_FUNCTION}/main.py
import functions_framework
from PIL import Image
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel
from google.cloud import storage
import os

# Environment variables
PROJECT_ID = os.environ.get("PROJECT_ID")
REGION = os.environ.get("REGION")
BUCKET = os.environ.get("BUCKET")
DATA_FOLDER_PATH = os.environ.get("DATA_FOLDER_PATH")
PROMPT = os.environ.get("PROMPT")

BLOB_NAME = "func_generated_image.png"

@functions_framework.cloud_event  # Decorator for Cloud Events 
def generate_and_store_image(cloud_event):
    try:
        # Initialize the Vertex AI environment using our specified project and region
        vertexai.init(project=PROJECT_ID, location=REGION)
        
        # Load the Imagen `"imagegeneration@006"` image generation model
        model = ImageGenerationModel.from_pretrained("imagegeneration@006")

        # Generate an image
        images = model.generate_images(
            prompt=PROMPT,
            number_of_images=1,
            language="en",
            aspect_ratio="1:1",
            safety_filter_level="block_some",
            person_generation="allow_adult",
        )

        # Store the image data in GCS
        image_data = images[0]._image_bytes
        storage_client = storage.Client(project=PROJECT_ID)
        bucket = storage_client.bucket(BUCKET)
        blob_path = f"{DATA_FOLDER_PATH}/{BLOB_NAME}"
        blob = bucket.blob(blob_path)
        blob.upload_from_string(image_data, content_type="image/png")

        # Return status
        return "Image generated and stored successfully!", 200

    except Exception as e:
        print(f"Error: {e}")
        return f"An error occurred: {e}", 500

## Deploy the image generation function

Until this point, we have simply saved our function code locally in our Jupyter Notebook. In this section, we will deploy our code to the Google Cloud Functions service, and specify the trigger that will cause the function to be invoked.

In the next cell, we will use the `gcloud functions deploy` command to deploy our Cloud Function. The command will perform all of the following steps on our behalf:

1. Package the code and any dependencies to be deployed to Cloud Functions. (This consists of the `main.py` and `requirements.txt` files we created above.)
1. Trigger Cloud Build to build a container image for our function. The build process includes:
   * Fetching the base container image based on our chosen runtime (in our case, Python 3.12).
   * Copying our function code into the container.
   * Installing dependencies (as specified in the `requirements.txt` file).
   * Configuring the function entry point (the function to be executed when the Cloud Function is triggered).

The resulting container image is stored in Google Artifact Registry.

The variables and flags in the command are as follows:
* `{GEN_FUNCTION}`: The name of the Cloud Function to deploy. 
* `--region {REGION}`: The region in which to deploy our Cloud Function.
* `--runtime python312`: Our desired function runtime version; in this case, Python 3.12.
* `--memory 512`: The desired amount of memory to use for running our function; in this case, 512 MB.
* `--trigger-topic {TOPIC_NAME}`: The `--trigger-topic` flag specifies that we want our function to be triggered every time a message is published to a specific Pub/Sub topic. The `{TOPIC_NAME}` specifies the name of the topic.
* `--entry-point generate_and_store_image`: Specifies the function within our code that should be executed on each invocation. (In this case, it's our `generate_and_store_image` function.)
* `--source {GEN_FUNCTION}`: The name of the local directory in our Jupyter Notebook that contains the required code files.
* `--gen2`: This specifies that we want to deploy a 2nd-generation Cloud Function. See further details [here](https://cloud.google.com/functions/docs/concepts/version-comparison).
* `--no-allow-unauthenticated`: Require that the request is authenticated (i.e., do not allow unauthenticated requests).
* `--set-env-vars`: This allows us to set environment variables. See further details [here](https://cloud.google.com/functions/docs/configuring/env-var).
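
Because the deploy command is a long one-liner, it can help to see how its parts fit together. The sketch below assembles the same flags as a Python argument list and prints the resulting shell command. The helper names are hypothetical and the values are examples. The comma check reflects the fact that `--set-env-vars` separates variables with commas, so a value containing a comma would need different handling.

```python
import shlex


def format_env_vars(env_vars):
    """Join a dict into the KEY=VALUE,KEY=VALUE form used by --set-env-vars."""
    for key, value in env_vars.items():
        if "," in value:
            # A comma would be read as a separator between variables.
            raise ValueError(f"Value for {key} contains a comma: {value!r}")
    return ",".join(f"{key}={value}" for key, value in env_vars.items())


def build_deploy_command(name, region, topic, entry_point, source, env_vars):
    """Return the deploy command as an argument list."""
    return [
        "gcloud", "functions", "deploy", name,
        "--region", region,
        "--runtime", "python312",
        "--memory", "512",
        "--trigger-topic", topic,
        "--entry-point", entry_point,
        "--source", source,
        "--gen2",
        "--no-allow-unauthenticated",
        "--set-env-vars", format_env_vars(env_vars),
    ]


command = build_deploy_command(
    name="image_gen_function",
    region="us-central1",
    topic="pipeline_completed_notifications",
    entry_point="generate_and_store_image",
    source="image_gen_function",
    env_vars={"REGION": "us-central1", "PROMPT": "A ship on the ocean."},
)
print(shlex.join(command))
```

`shlex.join` quotes the environment variable string because it contains spaces, which matches the reason the notebook's command wraps that argument in double quotes.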

Note: we use the `%%capture output` Jupyter magic to capture the output because the build process generates a lot of output messages. See further details [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cellmagic-capture).

In [None]:
%%capture output
! gcloud functions deploy {GEN_FUNCTION} --region {REGION} --runtime python312 --memory 512 --trigger-topic {TOPIC_NAME} --entry-point generate_and_store_image --source {GEN_FUNCTION} --gen2 --no-allow-unauthenticated --set-env-vars "PROJECT_ID={PROJECT_ID},REGION={REGION},DATA_FOLDER_PATH={DATA_FOLDER_PATH},BUCKET={BUCKET},PROMPT={PROMPT}"

### A note on Eventarc

In our solution, Eventarc is used behind the scenes to detect our trigger events and to invoke our functions. Eventarc is designed to capture events from various Google Cloud services and route them to appropriate destinations. In this case, we do not need to explicitly configure Eventarc triggers because they will automatically be set up as follows:

* When we deploy our Cloud Function with the `--trigger-topic` flag in the `gcloud functions deploy` command, we're telling Google Cloud to invoke our function every time a message is published to the specified topic. 

* Behind the scenes, Google Cloud sets up an Eventarc trigger that listens for messages being published to your Pub/Sub topic.

* When a message is published to the relevant Pub/Sub topic, Eventarc's trigger detects it and automatically invokes our Cloud Function, passing the Pub/Sub message data as an argument to the function.
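
Our image generation function does not read the contents of the message; it only needs to be triggered. If you did want to inspect the message, the payload of a Pub/Sub event carries the message body as a base64-encoded string. The sketch below decodes such a payload. The sample dictionary is a hypothetical, trimmed-down example of the event's `data` attribute.

```python
import base64
import json


def decode_pubsub_message(event_data):
    """Return the decoded text of a Pub/Sub message carried in an event payload."""
    encoded = event_data["message"].get("data", "")
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical payload shaped like the `data` attribute of a Pub/Sub CloudEvent.
sample_event_data = {
    "message": {
        "data": base64.b64encode(json.dumps({"pipeline": "done"}).encode()).decode(),
        "messageId": "1234567890",
    }
}
print(decode_pubsub_message(sample_event_data))
```

Inside a Cloud Function, the same helper would be called with `cloud_event.data`.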

## Verify that the function has been deployed successfully

The following command checks the logs of the Cloud Function we deployed. If all went well, you should just see a few lines that show the function startup messages; you should not see the word "error".

You can also check the deployed Cloud Function details by navigating to `Cloud Functions` in the Google Cloud console.

In [None]:
! gcloud functions logs read {GEN_FUNCTION}

## Create the image prediction function

Next, we'll create the function that will take the image generated by our previous function and send it to our Computer Vision model (which we will train later in this notebook). 

Again, in this section we are not directly executing the code in this notebook; we are writing/saving the code to local files that will be uploaded to be executed by Google Cloud Functions.

### Create our requirements.txt file
Just as we did for our image generation function above, we need to create a requirements.txt file that specifies all of the dependencies that need to be installed for our function to execute correctly. 

In this case, we will install:
* [Google Cloud Functions Framework](https://cloud.google.com/functions/docs/functions-framework)
* [Pillow](https://pypi.org/project/pillow/)
* [NumPy](https://numpy.org/)
* [The Vertex AI Python SDK](https://cloud.google.com/python/docs/reference/aiplatform/latest)
* [Python Client for Google Cloud Storage](https://cloud.google.com/python/docs/reference/storage/latest)

In [None]:
%%writefile {PREDICT_FUNCTION}/requirements.txt
functions-framework==3.*
Pillow
numpy
google-cloud-aiplatform
google-cloud-storage

### Create our function code 

This is the code our function will run when invoked. It will perform the following steps:
1. Import required libraries.
1. Set local variables from [environment variables](https://cloud.google.com/functions/docs/configuring/env-var). (We specified the values of these variables earlier in this notebook.)
1. Create a GCS client that will be used to fetch and write data from/to GCS.
1. Implement a function that is invoked by Google Cloud Events (this is specified by using the `@functions_framework.cloud_event` decorator). In this case, the event is a GCS event that is generated when an object is uploaded or changed in our specified bucket. This function does the following:
   * Extract data from the GCS event.
   * Download and open the relevant image (this is the image that was generated by our previous image generation function).
   * Perform some image preprocessing steps. Remember that our Computer Vision model is trained on the CIFAR-10 dataset, so it expects to see images in that format. Our image generation function used Imagen to generate an image, and that image is not in the same format as the CIFAR-10 dataset, so we need to transform our image to a compatible format for our model. Specifically, we resize the image, convert it to RGB, then convert it to a NumPy array, and normalize the resulting array. At that point, it is ready to send to our Computer Vision model.
   * Prepare a prediction request payload for our Computer Vision model.
   * Get our Vertex AI endpoint resource name from the environment variable.
   * Send the inference request using the request payload.
   * Extract the predictions from the response and find the highest probability class.
   * Format the prediction results in JSON format. This is just an optional, standardized way to represent the outputs.
   * Store the prediction results in GCS.
   * Return the status.
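
To illustrate the shape of the request payload, here is a dependency-free sketch of the normalization step. It works on raw RGB bytes and nested lists instead of Pillow and NumPy, and it skips the resize step, so treat it as an illustration of the data layout rather than a replacement for the function code. The helper name is hypothetical.

```python
def pixels_to_instance(rgb_bytes, width=32, height=32):
    """Convert raw RGB bytes into a [height][width][3] nested list scaled to 0..1."""
    expected = width * height * 3
    if len(rgb_bytes) != expected:
        raise ValueError(f"Expected {expected} bytes, got {len(rgb_bytes)}")
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            start = (y * width + x) * 3
            # Scale each 0..255 channel value into the 0..1 range.
            row.append([value / 255.0 for value in rgb_bytes[start:start + 3]])
        rows.append(row)
    return rows


# A synthetic all-white 32x32 image stands in for the resized Imagen output.
white_image = bytes([255]) * (32 * 32 * 3)
request_payload = {"instances": [pixels_to_instance(white_image)]}

instance = request_payload["instances"][0]
print(len(instance), len(instance[0]), len(instance[0][0]))
```

The outer list under `"instances"` is the batch dimension, so a single image is sent as a batch of one.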

In [None]:
%%writefile {PREDICT_FUNCTION}/main.py
import functions_framework
from PIL import Image
import json
import numpy as np
from google.cloud import aiplatform
from google.cloud import storage
from io import BytesIO
import os

PROJECT_ID = os.environ.get("PROJECT_ID")
REGION = os.environ.get("REGION")
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME")
DATA_FOLDER_PATH = os.environ.get("DATA_FOLDER_PATH")
PRED_FOLDER_PATH = os.environ.get("PRED_FOLDER_PATH")

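# Create a GCS client
storage_client = storage.Client()

# Initialize the Vertex AI SDK with our project and region
aiplatform.init(project=PROJECT_ID, location=REGION)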

@functions_framework.cloud_event 
def predict(cloud_event):
    try:
        # Extract data from the GCS event
        data = cloud_event.data
        bucket_name = data["bucket"]
        object_name = data["name"]
        image_path = f"gs://{bucket_name}/{object_name}"
        
        if not object_name.startswith(DATA_FOLDER_PATH):
            print(f"Skipping object outside of target folder: {object_name}")
            return "Object not in target folder", 200
        else: 
            # Get a reference to the GCS bucket and blob
            bucket = storage_client.bucket(bucket_name)
            blob = bucket.blob(object_name)

            # Download the image data as bytes
            image_bytes = blob.download_as_bytes()

            # Load the image using Pillow's Image.open
            image = Image.open(BytesIO(image_bytes))

            # Image preprocessing 
            resized_image = image.resize((32, 32), resample=Image.BILINEAR)
            rgb_image = resized_image.convert('RGB')
            image_array = np.array(rgb_image)
            normalized_image = image_array / 255.0

            # Create the request payload
            request_payload = {
                "instances": [normalized_image.tolist()]  # Batch of one image with shape [32, 32, 3]
            }

            # Get the endpoint resource name
            mlops_endpoint_list = aiplatform.Endpoint.list(
                filter=f'display_name={ENDPOINT_NAME}', order_by='create_time desc'
            )
            new_mlops_endpoint = mlops_endpoint_list[0]
            endpoint_resource_name = new_mlops_endpoint.resource_name
            print(endpoint_resource_name)

            # Send the inference request using the request payload
            response = aiplatform.Endpoint(endpoint_resource_name).predict(
                instances=request_payload["instances"]
            )

            # Extract predictions and find the highest probability class
            predictions = response.predictions[0]
            class_index = predictions.index(max(predictions))
            class_probability = max(predictions)

            # CIFAR-10 class labels
            class_labels = [
                "airplane",
                "automobile",
                "bird",
                "cat",
                "deer",
                "dog",
                "frog",
                "horse",
                "ship",
                "truck",
            ]
            predicted_label = class_labels[class_index]

            # Format the prediction results in JSON format
            prediction_result = json.dumps({"Predicted class": predicted_label, "probability": class_probability})

            # Upload prediction results to GCS
            result_blob_name = (
                f"{PRED_FOLDER_PATH}/{object_name}-prediction.txt"  # Store results in a subfolder with the image name
            )
            result_blob = bucket.blob(result_blob_name)
            result_blob.upload_from_string(prediction_result)

            print(f"Prediction results uploaded to: gs://{bucket_name}/{result_blob_name}")

            return prediction_result, 200
    except Exception as e:
        print(f"Error: {e}")
        return json.dumps({"error": str(e)}), 500  # Return error response with status code 500
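
The post-processing step in the function above can be exercised locally without an endpoint. The sketch below applies the same idea (pick the highest-probability class and format it as JSON) to a made-up list of probabilities. The helper name and the sample values are illustrative assumptions.

```python
import json

CIFAR10_LABELS = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def summarize_prediction(probabilities, labels=CIFAR10_LABELS):
    """Return a JSON string naming the most probable class and its probability."""
    if len(probabilities) != len(labels):
        raise ValueError("Expected one probability per class label")
    # Index of the largest probability.
    class_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return json.dumps(
        {"Predicted class": labels[class_index], "probability": probabilities[class_index]}
    )


# Made-up model output for a single image.
sample_probabilities = [0.01, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.97, 0.01]
print(summarize_prediction(sample_probabilities))
```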


## Deploy the image prediction function

In the next cell, we will use the `gcloud functions deploy` command to deploy our image prediction Cloud Function. Again, the command will perform all of the following steps on our behalf:

1. Package the code and any dependencies to be deployed to Cloud Functions. (This consists of the `main.py` and `requirements.txt` files we created above.)
1. Trigger Cloud Build to build a container image for our function. The build process includes:
   * Fetching the base container image based on our chosen runtime (in our case, Python 3.12).
   * Copying our function code into the container.
   * Installing dependencies (as specified in the `requirements.txt` file).
   * Configuring the function entry point (the function to be executed when the Cloud Function is triggered).

Again, the resulting container image is stored in Google Artifact Registry.

For the most part, the flags and variables used in the command are similar to the ones we used to deploy our image generation function earlier in this notebook. Apart from some slightly different environment variables (based on the needs of our function), the main difference is the following flag: `--trigger-bucket {BUCKET}`

When we deploy a 2nd generation Cloud Function with the `--trigger-bucket` flag, we're specifying that we want the function to be triggered by events in a specific GCS bucket. In our case, we're specifying the bucket in the `{BUCKET}` variable.

Again, Eventarc is automatically configured and used behind the scenes to handle the triggering mechanism. This trigger is configured to:
* Listen for the specified event type (in our case, the `google.storage.object.finalize` event for object creation/finalization).
* Filter events based on the bucket we provided.
* Deliver the event data to our Cloud Function.

When a matching event occurs in our GCS bucket (e.g., a new object is uploaded), Eventarc captures the event, filters it, and then invokes our Cloud Function, passing the event data as a parameter.
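
Because the trigger fires for every object finalized in the bucket, a function usually needs to inspect the object name itself to decide whether to act. The following is a rough sketch of that check, not the code of our function: `should_process` and the sample payloads are hypothetical, and the payloads only mimic the `bucket` and `name` fields carried in the data of a GCS finalize event.

```python
def should_process(event_data, data_folder_path):
    """Decide whether a storage event refers to an object under the data folder."""
    object_name = event_data.get("name", "")
    # Only react to objects under the data folder, so that files written
    # elsewhere in the same bucket (such as prediction results) are ignored.
    return object_name.startswith(f"{data_folder_path}/")


# Hypothetical payloads shaped like the data of a GCS "object finalized" event
image_event = {"bucket": "my-bucket", "name": "chapter-18-data/image.png"}
result_event = {
    "bucket": "my-bucket",
    "name": "chapter-18-predictions/chapter-18-data/image.png-prediction.txt",
}

print(should_process(image_event, "chapter-18-data"))
print(should_process(result_event, "chapter-18-data"))
```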

Note: again, we use the `%%capture output` Jupyter magic to capture the output because the build process generates a lot of output messages. See further details [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html#cellmagic-capture).

In [None]:
%%capture output
! gcloud functions deploy {PREDICT_FUNCTION} --region {REGION} --runtime python312 --trigger-bucket {BUCKET} --entry-point predict --source {PREDICT_FUNCTION} --gen2 --no-allow-unauthenticated --memory 512 --set-env-vars "PROJECT_ID={PROJECT_ID},REGION={REGION},DATA_FOLDER_PATH={DATA_FOLDER_PATH},PRED_FOLDER_PATH={PRED_FOLDER_PATH},ENDPOINT_NAME={ENDPOINT_NAME}"

## Verify that the function has been deployed successfully

The following command checks the logs of the Cloud Function we deployed. If all went well, you should just see a few lines that show the function startup messages; you should not see the word "error".

You can also check the deployed Cloud Function details by navigating to `Cloud Functions` in the Google Cloud console.

In [None]:
! gcloud functions logs read {PREDICT_FUNCTION}

# Create custom training job
In this section, we will create our custom training job to train our Computer Vision model. It will consist of the following steps:
1. Create a Google Artifact Registry repository to host our custom container image.
2. Create our custom training script.
3. Create a Dockerfile that will specify how to build our custom container image. 
4. Build our custom container image.
5. Push our custom container image to Google Artifact Registry so that we can use it in subsequent steps in our pipeline.

## Create Google Artifact Registry repository

Our custom training component in our pipeline will run in a container on the Vertex AI Training service. In this section, we will create the Google Artifact Registry repository in which we can store our custom container image that we will build in later steps in this notebook.

In [None]:
!gcloud artifacts repositories create $TRAIN_REPO_NAME --repository-format=docker \
--location=$REGION --description="Train repo for MLOps images workload"

## Define the code for our training job

The following code will create a file that contains the code for our custom training job. 

The code performs the following processing steps:

1. Imports required libraries and sets initial variable values based on arguments passed to the script (the arguments are described below).
2. Reads in and prepares the dataset.
3. Trains our Keras Sequential convolutional neural network (CNN) model for Computer Vision.
4. Saves the model artifacts to GCS.
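
To make the data preparation step more concrete, here is a small pure-Python sketch of the two transformations the training script applies: scaling pixel values into the range 0 to 1 and one-hot encoding the class labels. The script itself does this on arrays with TensorFlow/Keras utilities; `normalize_pixels` and `one_hot` are hypothetical helpers used only for illustration.

```python
def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0-255) into the range 0.0-1.0."""
    return [p / 255.0 for p in pixels]


def one_hot(label, num_classes=10):
    """Return a one-hot vector with a 1.0 at the position of the class label."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


scaled = normalize_pixels([0, 255])
encoded = one_hot(3)  # class index 3 of 10
print(scaled)
print(encoded)
```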

In [None]:
%%writefile {TRAINER_DIR}/train.py

import os
import tensorflow as tf
from tensorflow.keras import layers, models, datasets, optimizers, utils
import numpy as np
import argparse
from google.cloud import storage
    
# Define the CNN model
def create_model():
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    return model

def train_model(args):
    # Input arguments
    project_id = args.project_id
    bucket_name = args.bucket_name
    model_path = args.model_path
    batch_size = args.batch_size
    epochs = args.epochs
    learning_rate = args.learning_rate
    
    ### DATA PREPARATION SECTION ###
    
    (x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0    
    # Convert class vectors to binary class matrices
    y_train = utils.to_categorical(y_train, 10)
    y_test = utils.to_categorical(y_test, 10)
    
    ### MODEL TRAINING AND EVALUATION SECTION ###

    if tf.config.list_physical_devices('GPU'):
        device = '/GPU:0'
    else:
        device = '/CPU:0'
    
    with tf.device(device):
        net = create_model()

    # Compile the model
    net.compile(optimizer=optimizers.SGD(learning_rate=learning_rate, momentum=0.9),
                loss='categorical_crossentropy',
                metrics=['accuracy'])

    # Train the network
    history = net.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
                      validation_data=(x_test, y_test))

    # Save the trained model to the location given by --model_path
    net.save(model_path)
    
    # Return the trained model
    return net  

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Train a CNN model on the CIFAR-10 dataset')
    
    parser.add_argument('--project_id', type=str, help='GCP Project ID')
    parser.add_argument('--bucket_name', type=str, help='GCP Bucket ID')
    parser.add_argument('--model_path', type=str, help='Path to save the trained model')
    parser.add_argument('--batch_size', type=int, default=4, help='Batch size')
    parser.add_argument('--epochs', type=int, default=20, help='Number of epochs')
    parser.add_argument('--learning_rate', type=float, default=0.001, help='Learning rate')

    args = parser.parse_args()

    train_model(args)
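
The argument handling at the end of the script can be exercised on its own, which is a quick way to confirm the defaults before building the container. This is a sketch that mirrors the parser defined above; `build_parser` and the values in `sample_argv` are hypothetical and chosen only for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the same arguments as the training script."""
    parser = argparse.ArgumentParser(description="Train a CNN model on the CIFAR-10 dataset")
    parser.add_argument("--project_id", type=str, help="GCP Project ID")
    parser.add_argument("--bucket_name", type=str, help="GCP Bucket ID")
    parser.add_argument("--model_path", type=str, help="Path to save the trained model")
    parser.add_argument("--batch_size", type=int, default=4, help="Batch size")
    parser.add_argument("--epochs", type=int, default=20, help="Number of epochs")
    parser.add_argument("--learning_rate", type=float, default=0.001, help="Learning rate")
    return parser


# Hypothetical argument list, in the form a container would receive it
sample_argv = [
    "--project_id", "my-project",
    "--bucket_name", "my-bucket",
    "--model_path", "gs://my-bucket/model",
    "--epochs", "5",
]
args = build_parser().parse_args(sample_argv)
print(args.epochs, args.batch_size, args.learning_rate)
```

Arguments that are not supplied (here `--batch_size` and `--learning_rate`) fall back to the defaults declared in the parser.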

### Create the requirements.txt file
Just as we did for our Cloud Functions above, we need to create a requirements.txt file that specifies all of the dependencies that need to be installed for our training code to execute correctly. 

In this case, we will install:
* [The Vertex AI Python SDK](https://cloud.google.com/python/docs/reference/aiplatform/latest)
* [TensorFlow](https://www.tensorflow.org/)
* [NumPy](https://numpy.org/)
* argparse (used by our training script to parse its command-line arguments)
* [Python Client for Google Cloud Storage](https://cloud.google.com/python/docs/reference/storage/latest)

In [None]:
%%writefile {APPLICATION_DIR}/requirements.txt
google-cloud-aiplatform
tensorflow>=2.0.0
numpy
argparse
google-cloud-storage

## Create the Dockerfile for our custom training container

The [Dockerfile](https://docs.docker.com/engine/reference/builder/) specifies how to build our custom container image.

This Dockerfile specifies that we want to:
1. Use Vertex AI [prebuilt container for custom training](https://cloud.google.com/vertex-ai/docs/training/pre-built-containers) as a base image.
2. Install the required dependencies specified in our requirements.txt file.
3. Copy our custom training script to the container image.
4. Run our custom training script when the container starts up.

In [None]:
%%writefile {APPLICATION_DIR}/Dockerfile

# Use a Vertex AI prebuilt TensorFlow training container as the parent image
FROM us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-12.py310:latest

WORKDIR /

COPY requirements.txt /requirements.txt

# Install any needed packages specified in requirements.txt
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt

# Copies the trainer code to the Docker image.
COPY trainer /trainer

# Sets up the entry point to invoke the trainer.
ENTRYPOINT ["python", "-m", "trainer.train"]

### Build our custom training image

The steps required to build our image are:

1. Change directory to our application directory.
2. Build the Docker image.
3. Push the image to our Google Artifact Registry.
4. Change directory back to our parent application directory.

In [None]:
cd $APPLICATION_DIR

In [None]:
! gcloud auth configure-docker us-central1-docker.pkg.dev --quiet

In [None]:
! docker build ./ -t $TRAIN_IMAGE_URI --quiet

### Push our custom image to Google Artifact Registry

In [None]:
! docker push $TRAIN_IMAGE_URI

In [None]:
cd ..
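
For reference, images stored in Artifact Registry are addressed by a URI that combines the regional registry host, the project, the repository, and the image name. The sketch below shows how such a URI is composed; the helper `artifact_registry_image_uri` and all of the example values are hypothetical, and the actual `TRAIN_IMAGE_URI` used in this notebook is the one defined earlier.

```python
def artifact_registry_image_uri(region, project_id, repository, image, tag="latest"):
    """Compose a Docker image URI in the Artifact Registry naming scheme."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


# Hypothetical values, used only to show the shape of the URI
uri = artifact_registry_image_uri("us-central1", "my-project", "my-train-repo", "mlops-train")
print(uri)
```

The host portion of the URI must match the registry host configured with `gcloud auth configure-docker`, otherwise the push will not be authenticated.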

# Define component to notify when pipeline completes

When our pipeline completes, we will publish a message to the Pub/Sub topic we created earlier in this notebook. That event will cause our image generation Cloud Function to be invoked, which will then kick off the rest of the processes in our solution.

The following pipeline component will publish the message to the Pub/Sub topic when our pipeline completes.

In [None]:
@component(packages_to_install=["google-cloud-pubsub"], base_image="python:3.12")
def publish_message(project_id: str, topic_name: str, pipeline_run_id: str = None):
    """Publishes a message to a Pub/Sub topic with the pipeline run ID."""
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_name)
    
    if pipeline_run_id is None:
        # No run ID was passed in; components run in isolation, so fall back to a placeholder
        pipeline_run_id = "unknown"
        print("No pipeline run ID was provided.")
    else:
        print(f"Received pipeline run ID as argument: {pipeline_run_id}")

    # Create a message with the run ID (you can customize the message)
    message = f"Pipeline with run ID '{pipeline_run_id}' completed."

    # Publish the message
    data = message.encode("utf-8")
    future = publisher.publish(topic_path, data)
    print(f"Published message ID: {future.result()}")
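
On the receiving side, an event-driven function does not see the published bytes directly: the Pub/Sub payload arrives base64-encoded inside the event's `message` field. The following sketch simulates that round trip locally, without calling Pub/Sub. The three helper functions are hypothetical, and the envelope is reduced to the single field needed for the illustration.

```python
import base64


def encode_for_publish(message):
    """Encode a text message to bytes, as done before publishing."""
    return message.encode("utf-8")


def wrap_as_pubsub_event(data):
    """Mimic the envelope in which Pub/Sub data reaches an event-driven function."""
    return {"message": {"data": base64.b64encode(data).decode("ascii")}}


def decode_from_event(event):
    """Recover the original text from the base64-encoded event payload."""
    return base64.b64decode(event["message"]["data"]).decode("utf-8")


event = wrap_as_pubsub_event(encode_for_publish("Pipeline completed"))
decoded = decode_from_event(event)
print(decoded)
```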

# Define our Vertex AI Pipeline

Now it's time to define the sequence of steps in our MLOps pipeline.

In this section, we will use the Kubeflow Pipelines SDK and Google Cloud Pipeline Components to define our MLOps pipeline.

We begin by specifying all of the required variables in our pipeline, and populating their values from the constants we defined earlier in our notebook. We then specify the following components in our pipeline:

1. [CustomTrainingJobOp](https://cloud.google.com/vertex-ai/docs/pipelines/customjob-component#customjobop) to perform our custom model training step.
1. [importer](https://www.kubeflow.org/docs/components/pipelines/v2/components/importer-component/) to import our [UnmanagedContainerModel](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.UnmanagedContainerModel) object.
1. [ModelUploadOp](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.0.0/api/v1/model.html#v1.model.ModelUploadOp) to upload our Model artifact into Vertex AI Model Registry.
1. [EndpointCreateOp](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.0.0/api/v1/endpoint.html#v1.endpoint.EndpointCreateOp) to create a Vertex AI [Endpoint](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints).
1. [ModelDeployOp](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.0.0/api/v1/endpoint.html#v1.endpoint.ModelDeployOp) to deploy our Google Cloud Vertex AI Model to an Endpoint, creating a [DeployedModel](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints#deployedmodel) object within it.
1. `publish_message` (our custom component defined above) to publish a message to our Pub/Sub topic when our pipeline completes.

In [None]:
@dsl.pipeline(name=PIPELINE_NAME, description="MLOps pipeline for custom data preprocessing, model training, and deployment.")
def pipeline(
    bucket_name: str = BUCKET,
    display_name: str = PIPELINE_NAME,
    model_path: str = MODEL_URI,
    model_name: str = MODEL_NAME,
    project_id: str = PROJECT_ID,
    location: str = REGION,
    worker_pool_specs: list = WORKER_POOL_SPEC,
    base_output_directory: str = PIPELINE_ROOT,
    serving_image_uri: str = SERVING_IMAGE_URI,
    endpoint_name: str = ENDPOINT_NAME,
    topic_name: str = TOPIC_NAME
):
    
    # Train model
    model_training_op = custom_job.CustomTrainingJobOp(
        project=project_id,
        location=location,
        display_name="train-mlops-model",
        worker_pool_specs = worker_pool_specs,
    )
    
    importer_op = dsl.importer(
        artifact_uri=model_path,
        artifact_class=artifact_types.UnmanagedContainerModel,
        metadata={
            "containerSpec": {
                "imageUri": serving_image_uri,
            },
        },
    ).after(model_training_op)

    model_upload_op = ModelUploadOp(
        project=project_id,
        display_name=model_name,
        unmanaged_container_model=importer_op.outputs["artifact"],
    ).after(importer_op)

    endpoint_create_op = EndpointCreateOp(
        project=project_id,
        display_name=endpoint_name,
    ).after(model_upload_op)

    model_deploy_op = ModelDeployOp(
        endpoint=endpoint_create_op.outputs["endpoint"],
        model=model_upload_op.outputs["model"],
        deployed_model_display_name=model_name,
        dedicated_resources_machine_type="n1-standard-16",
        dedicated_resources_min_replica_count=1,
        dedicated_resources_max_replica_count=1,
    ).after(endpoint_create_op)
    
    publish_completion_message_op = publish_message(
        project_id=project_id, 
        topic_name=topic_name,
        pipeline_run_id=dsl.PIPELINE_JOB_ID_PLACEHOLDER,
    ).after(model_deploy_op)
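
The `.after()` calls and the output-to-input connections in the pipeline above define a dependency graph, and the order in which tasks run follows from that graph. The sketch below reproduces the resulting order with Python's standard `graphlib` module. It is only an illustration of the ordering and does not use the Kubeflow Pipelines SDK; the `dependencies` dictionary is a hand-written summary of the pipeline definition.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts
dependencies = {
    "model_training_op": set(),
    "importer_op": {"model_training_op"},
    "model_upload_op": {"importer_op"},
    "endpoint_create_op": {"model_upload_op"},
    "model_deploy_op": {"endpoint_create_op", "model_upload_op"},
    "publish_completion_message_op": {"model_deploy_op"},
}

# A valid execution order that respects every dependency
execution_order = list(TopologicalSorter(dependencies).static_order())
for step, task in enumerate(execution_order, start=1):
    print(step, task)
```

Because every task depends on the one before it, the graph admits only one valid order, which is why the notification is always published last.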

### Compile our pipeline into a YAML file

Now that we have defined our pipeline structure, we need to compile it into YAML format in order to run it in Vertex AI Pipelines.

In [None]:
compiler.Compiler().compile(pipeline, 'mlops-pipeline.yaml')

## Submit and run our pipeline in Vertex AI Pipelines

Now we're ready to use the Vertex AI Python SDK to submit and run our pipeline in Vertex AI Pipelines.

The parameters, artifacts, and metrics produced from the pipeline run are automatically captured into Vertex AI Experiments as an experiment run. We will discuss the concept of Vertex AI Experiments in more detail in later chapters in the book. The output of the following cell will provide a link at which you can watch your pipeline as it progresses through each of the steps.

In [None]:
pipeline = aiplatform.PipelineJob(display_name=PIPELINE_NAME, template_path='mlops-pipeline.yaml', enable_caching=False)

pipeline.submit(experiment=EXPERIMENT_NAME)

### Wait for the pipeline to complete
The following call blocks until our pipeline execution finishes, periodically printing its status. If all goes to plan, you will eventually see a message saying "PipelineJob run completed".

In [None]:
pipeline.wait()

## When the pipeline has completed, you can view the generated image and the prediction outputs in the bucket you specified in GCS.

If you used the conventions suggested in this notebook, you will find the generated image and the prediction outputs in your bucket at the following paths:

* Generated image: `chapter-18-data/`
* Prediction results: `chapter-18-predictions/chapter-18-data/`
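
The nested prediction path follows from how the prediction function names its output: it prefixes the full object name of the image (including its folder) with the prediction folder. A minimal sketch of that naming rule is shown below; `prediction_blob_name` is a hypothetical helper and `generated-image.png` is a made-up file name.

```python
def prediction_blob_name(pred_folder_path, object_name):
    """Build the result object name the same way the prediction function does."""
    return f"{pred_folder_path}/{object_name}-prediction.txt"


# Hypothetical image object name under the data folder
name = prediction_blob_name("chapter-18-predictions", "chapter-18-data/generated-image.png")
print(name)
```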

## Great job!! You have officially built an end-to-end event-driven solution that combines MLOps & Gen AI on Google Cloud!!!

# Cleaning up

When you no longer need the resources created by this notebook, you can delete them as follows.

**Note: if you do not delete the resources, you will continue to pay for them**

**If you want to delete the resources, set the `clean_up` parameter to `True`.**

In [None]:
clean_up = False

## Delete Vertex AI resources

In [None]:
from google.api_core import exceptions as gcp_exceptions
if clean_up:  
    try:
        endpoint_list = aiplatform.Endpoint.list(filter=f'display_name="{ENDPOINT_NAME}"')
        if endpoint_list:
            endpoint = endpoint_list[0]  # Assuming only one endpoint with that name

            # Undeploy all models (if any)
            try:
                endpoint.undeploy_all()
                print(f"Undeployed all models from endpoint: {ENDPOINT_NAME}")
            except gcp_exceptions.NotFound:
                print(f"No models found to undeploy from endpoint: {ENDPOINT_NAME}")
            except Exception as e:  # Catching general errors for better debugging
                print(f"Unexpected error while undeploying models: {e}")

            # Delete endpoint
            try:
                endpoint.delete()
                print(f"Deleted endpoint: {ENDPOINT_NAME}")
            except Exception as e:
                print(f"Error deleting endpoint: {e}")
        else:
            print(f"No endpoint found matching: {ENDPOINT_NAME}")
    except gcp_exceptions.NotFound:
        print(f"Endpoint not found: {ENDPOINT_NAME}")

    # Delete models
    try:
        model_list = aiplatform.Model.list(filter=f'display_name="{MODEL_NAME}"')
        if model_list:
            for model in model_list:
                print(f"Deleting model: {model.display_name}")
                model.delete()
        else:
            print(f"No models found matching: {MODEL_NAME}")
    except gcp_exceptions.NotFound:
        print(f"Model not found: {MODEL_NAME}")

    # Delete pipeline
    try:
        pipeline.delete()
    except gcp_exceptions.NotFound:
        print(f"Pipeline not found: {pipeline.name}")

else:
    print("clean_up parameter is set to False.")

## Delete artifact repository and Pub/Sub topic

In [None]:
if clean_up:
    # Delete the pubsub topic    
    ! gcloud pubsub topics delete $TOPIC_NAME
    
    # Delete the Artifact repository
    ! gcloud artifacts repositories delete $TRAIN_REPO_NAME --location=$REGION --quiet
else:
    print("clean_up parameter is set to False")

## Delete Cloud Functions 

In [None]:
%%capture output
if clean_up:
    # Delete the Cloud Functions
    ! gcloud functions delete {GEN_FUNCTION} --quiet
    ! gcloud functions delete {PREDICT_FUNCTION} --quiet
else:
    print("clean_up parameter is set to False")

## Delete Lineage Metadata

**WARNING: THE FOLLOWING CODE WILL DELETE ALL CONTEXTS, EXECUTIONS, AND ARTIFACTS.**

If you want to delete those resources, set the `delete_metadata` parameter to `True`.

In [None]:
delete_metadata = False

In [None]:
if delete_metadata: 
    
    # Delete the artifacts
    try:
        artifacts = aiplatform.Artifact.list()
        # To delete specific artifacts, you can filter your Artifact.list() like below:
        # artifacts = aiplatform.Artifact.list(filter='schema_title="system.Model"') # Deletes all artifacts with schema title as "system.Model"

        if not artifacts:
            print("No Artifacts found in the project and region.")
        else:
            for artifact in artifacts:
                try:
                    artifact.delete()
                    print(f"Deleted Artifact: {artifact.resource_name}")
                except gcp_exceptions.FailedPrecondition as e:
                    print(f"Failed to delete Artifact {artifact.resource_name}: {e}")
                    # Handle specific precondition failures if needed
                except Exception as e:
                    print(f"Unexpected error deleting Artifact {artifact.resource_name}: {e}")

    except Exception as e:
        print(f"Error listing or deleting Artifacts: {e}") 
    
    # Delete the contexts
    try:
        contexts = aiplatform.Context.list()
        if not contexts:
            print("No Contexts found in the project and region.")
        else:
            for context in contexts:
                try:
                    context.delete()
                    print(f"Deleted Context: {context.name}")
                except gcp_exceptions.FailedPrecondition as e:
                    print(f"Failed to delete Context {context.name}: {e}")
                    # Handle specific precondition failures (e.g., Context in use)
                except Exception as e:  # Catching general errors for better debugging
                    print(f"Unexpected error while deleting Context {context.name}: {e}")
    except Exception as e:
        print(f"Error listing or deleting Contexts: {e}")

    # Delete the executions
    try:
        executions = aiplatform.Execution.list()
        if not executions:
            print("No Executions found in the project and region.")
        else:
            for execution in executions:
                try:
                    execution.delete()
                    print(f"Deleted Execution: {execution.name}")
                except gcp_exceptions.FailedPrecondition as e:
                    print(f"Failed to delete Execution {execution.name}: {e}")
                    # Handle specific precondition failures if needed
                except Exception as e:
                    print(f"Unexpected error deleting Execution {execution.name}: {e}")
    except Exception as e:
        print(f"Error listing or deleting Executions: {e}")   

    # Delete the experiments
    try:
        # List all experiments 
        experiments = aiplatform.Experiment.list() 
    
        if not experiments:
            print("No experiments found in the project and region.")
        else:
            # Delete each experiment
            for experiment in experiments:
                try:
                    experiment.delete()
                    print(f"Deleted experiment: {experiment.name}")
                except gcp_exceptions.FailedPrecondition as e:
                    print(f"Failed to delete experiment {experiment.name}: {e}")
                    # Handle specific precondition failures if needed (e.g., experiment runs still exist)
                except gcp_exceptions.NotFound:
                    print(f"Experiment {experiment.name} not found, likely already deleted.")
                except Exception as e:  # Catching general errors for better debugging
                    print(f"Unexpected error while deleting experiment {experiment.name}: {e}")

    except Exception as e:
        print(f"Error listing or deleting experiments: {e}")

else:
    print("delete_metadata parameter is set to False.")

## Delete GCS Bucket
The bucket can be reused throughout multiple activities in the book. Sometimes, activities in certain chapters make use of artifacts from previous chapters that are stored in the GCS bucket.

I highly recommend **not deleting the bucket** unless you will be performing no further activities in the book. For this reason, there's a separate `delete_bucket` variable to specify if you want to delete the bucket.

If you want to delete the bucket, set the `delete_bucket` parameter to `True`.

In [None]:
delete_bucket = False

In [None]:
if delete_bucket:
    # Delete the bucket
    ! gcloud storage rm --recursive gs://$BUCKET
else:
    print("delete_bucket parameter is set to False")