## ML pipeline
An ML pipeline is a portable and extensible description of an MLOps workflow as a series of steps called pipeline tasks. Each task performs a specific step in the workflow to train and/or deploy an ML model.

An ML pipeline is a directed acyclic graph (DAG) of containerized pipeline tasks that are interconnected using input-output dependencies. Each task can be authored either in Python or as a prebuilt container image.

Define the pipeline as a DAG using either the Kubeflow Pipelines SDK or the TFX SDK, compile it to YAML as an intermediate representation, and then run the pipeline. By default, pipeline tasks run in parallel. You can link the tasks to execute them in series.

Consider an ML pipeline with the following steps:

* **Prepare data**: Prepare or preprocess training data.
    - Input (from tasks within the same ML pipeline): None.
    - Output: Prepared or preprocessed training data.

* **Train model**: Use the prepared training data to train a model.
    - Input: Prepared or preprocessed training data from pipeline task Prepare data.
    - Output: Trained model.

* **Evaluate model**: Evaluate the trained model.
    - Input: Trained model from pipeline task Train model.

* **Deploy**: Deploy the trained model for predictions.
    - Input: Trained model from pipeline task Train model.

When you compile your ML pipeline, the pipelines SDK you're using (Kubeflow Pipelines or TFX) analyzes the data dependencies between these tasks and creates the following workflow DAG:

* **Prepare data** doesn't rely on other tasks within the same ML pipeline for inputs. Therefore, it can be the first step in the ML pipeline, or run concurrently with other tasks.
* **Train model** relies on **Prepare data** for inputs. Therefore, it occurs after **Prepare data**.
* **Evaluate model** and **Deploy** both depend on the trained model. Therefore, they can run concurrently, but after **Train model**.

When you run your ML pipeline, Vertex AI Pipelines executes these tasks in the sequence described in the DAG.
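
The dependency analysis described above can be illustrated with a minimal sketch that uses only the Python standard library. It isn't how the Kubeflow Pipelines or TFX compiler is implemented; the task names and the `execution_stages` helper are hypothetical and exist only to show how input-output dependencies determine which tasks can run concurrently.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks whose outputs it consumes.
dependencies = {
    "prepare_data": set(),
    "train_model": {"prepare_data"},
    "evaluate_model": {"train_model"},
    "deploy": {"train_model"},
}


def execution_stages(deps):
    """Group tasks into stages; tasks in the same stage don't depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        # Every task returned here has all of its inputs available.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = execution_stages(dependencies)
for number, tasks in enumerate(stages, start=1):
    print(f"Stage {number}: {', '.join(tasks)}")
```

In this sketch, `evaluate_model` and `deploy` land in the same stage because both only need the output of `train_model`.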

### Pipeline tasks and components
A pipeline task is an instantiation of a pipeline component with specific inputs. While defining an ML pipeline, you can interconnect multiple tasks to form a DAG, by routing the outputs of one pipeline task to the inputs for the next pipeline task in the ML workflow. You can also use the inputs for the ML pipeline as the inputs for a pipeline task.

#### Pipeline component
A pipeline component is a self-contained set of code that performs a specific step of an ML workflow. A component typically consists of the following:
- **Inputs**: A component might have one or more input parameters and artifacts.
- **Outputs**: Every component has one or more output parameters or artifacts.
- **Logic**: This is the component's executable code. For containerized components, the logic also contains the definition of the environment, or container image, where the component runs.

Components are the basis of defining tasks in an ML pipeline. To define pipeline tasks, you can either use predefined Google Cloud Pipeline Components or create your own custom components. Use predefined Google Cloud Pipeline Components if you want to use features of Vertex AI, such as AutoML, in your pipeline.

#### Pipeline task
A pipeline task is the instantiation of a pipeline component and performs a specific step in your ML workflow. You can author ML pipeline tasks either using Python or as prebuilt container images.

Within a task, you can build on the on-demand compute capabilities of Vertex AI with Kubernetes to scalably execute your code, or delegate your workload to another execution engine, such as BigQuery, Dataflow, or Dataproc Serverless.

### Lifecycle of an ML pipeline
From definition to execution and monitoring, the lifecycle of an ML pipeline comprises the following high-level stages:
- **Define**: The process of defining an ML pipeline and its tasks is also called building a pipeline. In this stage, you need to perform the following steps:
    - Choose an ML framework: Vertex AI Pipelines supports ML pipelines defined using the TFX or Kubeflow Pipelines framework.
    - Define the pipeline tasks and configure the pipeline.
- **Compile**: In this stage, you need to perform the following steps:
    - Generate your ML pipeline definition in a compiled YAML file for intermediate representation, which you can use to run your ML pipeline.
    - Optional: You can upload the compiled YAML file as a pipeline template to a repository and reuse it to create ML pipeline runs.
- **Run**: Create an execution instance of your ML pipeline using the compiled YAML file or a pipeline template. The execution instance of a pipeline definition is called a pipeline run.
    - You can create a one-time occurrence of a pipeline run or use the scheduler API to create recurring pipeline runs from the same ML pipeline definition. You can also clone an existing pipeline run.
- **Monitor, visualize, and analyze runs**: After you create a pipeline run, you can do the following to monitor the performance, status, and costs of pipeline runs:
    - Configure email notifications for pipeline failures.
    - Use Cloud Logging to create log entries for monitoring events.
    - Visualize, analyze, and compare pipeline runs.
    - Use Cloud Billing export to BigQuery to analyze pipeline run costs.

- **Optional: stop or delete pipeline runs**: There is no restriction on how long you can keep a pipeline run active. You can optionally do the following:
    - Stop a pipeline run.
    - Pause or resume a pipeline run schedule.
    - Delete an existing pipeline template, pipeline run, or pipeline run schedule.

### Pipeline run
A pipeline run is an execution instance of your ML pipeline definition. Each pipeline run is identified by a unique run name. Using Vertex AI Pipelines, you can create an ML pipeline run in the following ways:
- Use the compiled YAML definition of a pipeline
- Use a pipeline template from the Template Gallery

### Track the lineage of ML artifacts
A pipeline run contains several artifacts and parameters, including pipeline metadata. To understand changes in the performance or accuracy of your ML system, you need to analyze the metadata and the lineage of ML artifacts from your ML pipeline runs. The lineage of an ML artifact includes all the factors that contributed to its creation, along with metadata and references to artifacts derived from it.

Lineage graphs help you analyze upstream root cause and downstream impact. Each pipeline run produces a lineage graph of parameters and artifacts that are input into the run, materialized within the run, and output from the run. Metadata that composes this lineage graph is stored in Vertex ML Metadata. This metadata can also be synced to Dataplex.

- **Use Vertex ML Metadata to track pipeline artifact lineage**
  When you run a pipeline using Vertex AI Pipelines, all parameters and artifact metadata consumed and generated by the pipeline are stored in Vertex ML Metadata. Vertex ML Metadata is a managed implementation of the ML Metadata library in TensorFlow, and supports registering and writing custom metadata schemas. When you create a pipeline run in Vertex AI Pipelines, metadata from the pipeline run is stored in the default metadata store for the project and region where you execute the pipeline.

- **Use Dataplex to track pipeline artifact lineage**
  Dataplex is a global and cross-project data fabric integrated with multiple systems within Google Cloud, such as Vertex AI, BigQuery, and Cloud Composer. Within Dataplex, you can search for a pipeline artifact and view its lineage graph. Note that to prevent artifact conflicts, any resource catalogued in Dataplex is identified with a fully qualified name (FQN).
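
To make "upstream root cause" and "downstream impact" concrete, the following toy sketch walks a small lineage graph in both directions. It doesn't call Vertex ML Metadata or Dataplex; the edge list, node names, and helper functions are hypothetical and only illustrate the kind of traversal that a lineage graph supports.

```python
from collections import defaultdict

# Hypothetical lineage edges: (source, derived). Nodes are artifacts or executions.
edges = [
    ("raw_data", "prepare_data_run"),
    ("prepare_data_run", "training_data"),
    ("training_data", "train_model_run"),
    ("train_model_run", "model"),
    ("model", "deploy_run"),
    ("deploy_run", "endpoint"),
]


def build_index(edge_list):
    """Index the edges so the graph can be walked in either direction."""
    upstream, downstream = defaultdict(set), defaultdict(set)
    for source, target in edge_list:
        downstream[source].add(target)
        upstream[target].add(source)
    return upstream, downstream


def reachable(start, neighbors):
    """Return every node reachable from `start` by following `neighbors`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


upstream, downstream = build_index(edges)
model_causes = reachable("model", upstream)            # Upstream root cause.
data_impact = reachable("training_data", downstream)   # Downstream impact.
print(sorted(model_causes))
print(sorted(data_impact))
```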

## Interfaces to define a pipeline

### Kubeflow Pipelines (KFP) SDK
Use KFP for all use cases where you don't need to use TensorFlow Extended to process huge amounts of structured or text data. Vertex AI Pipelines supports KFP SDK v1.8 or later.

When you use the KFP SDK, you can define your ML workflow by building custom components and also by reusing prebuilt components, such as the Google Cloud Pipeline Components. Vertex AI Pipelines supports Google Cloud Pipeline Components SDK v2 or later.

### TensorFlow Extended (TFX) SDK
Use TFX if you use TensorFlow Extended in your ML workflow to process terabytes of structured or text data. Vertex AI Pipelines supports TFX SDK v0.30.0 or later.

## Interfaces to run a pipeline

### REST API
To create a pipeline run using REST, use the `Pipelines` service API. This API uses the `projects.locations.pipelineJobs` REST resource.
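
As a rough illustration, the following sketch builds the collection URL for the `projects.locations.pipelineJobs` resource. It assumes the regional endpoint form `LOCATION-aiplatform.googleapis.com`; the helper name and the example project and region are hypothetical. The sketch only formats a string and doesn't send a request or handle authentication.

```python
def pipeline_jobs_url(project: str, location: str, api_version: str = "v1") -> str:
    """Build the collection URL for the projects.locations.pipelineJobs resource."""
    # Assumption: requests go to the regional Vertex AI endpoint.
    host = f"{location}-aiplatform.googleapis.com"
    parent = f"projects/{project}/locations/{location}"
    return f"https://{host}/{api_version}/{parent}/pipelineJobs"


url = pipeline_jobs_url("my-project", "us-central1")
print(url)
```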

### SDK Clients
Vertex AI Pipelines lets you create pipeline runs using the Vertex AI SDK for Python or client libraries.

**Vertex AI SDK for Python**  
The Vertex AI SDK for Python (`aiplatform`) is the recommended SDK for programmatically working with the `Pipelines` service API. To create and manage pipeline runs, use the `google.cloud.aiplatform.PipelineJob` class.

**Client libraries**  
Client libraries are programmatically generated API client (GAPIC) SDKs. Vertex AI Pipelines supports the following client libraries:
- Python (`aiplatform v1` and `v1beta1`)

### Google Cloud console (GUI)
Google Cloud console is the recommended way for reviewing and monitoring your pipeline runs. You can also perform other tasks using the Google Cloud console, such as creating, deleting and cloning pipeline runs, accessing the Template Gallery, and retrieving the billing label for a pipeline run.

## Build a pipeline

### Kubeflow Pipelines DSL package
The `kfp.dsl` package contains the domain-specific language (DSL) that you can use to define and interact with pipelines and components.

Kubeflow pipeline components are factory functions that create pipeline steps. Each component describes its inputs, outputs, and implementation. For example, in the code sample below, `ImageDatasetCreateOp` is a component, and `ds_op` is the pipeline step that it creates.

Components are used to create pipeline steps. When a pipeline runs, steps are executed as the data they depend on becomes available. For example, a training component could take a CSV file as an input and use it to train a model.

In [None]:
import kfp
from google.cloud import aiplatform
from google_cloud_pipeline_components.v1.dataset import ImageDatasetCreateOp
from google_cloud_pipeline_components.v1.automl.training_job import AutoMLImageTrainingJobRunOp
from google_cloud_pipeline_components.v1.endpoint import EndpointCreateOp, ModelDeployOp

# The Google Cloud project that this pipeline runs in.
project_id = PROJECT_ID

# Specify a Cloud Storage URI that your pipelines service account can access.
# The artifacts of your pipeline runs are stored within the pipeline root.
pipeline_root_path = PIPELINE_ROOT

# Define the workflow of the pipeline.
@kfp.dsl.pipeline(
    name="automl-image-training-v2",
    pipeline_root=pipeline_root_path)
def pipeline(project_id: str):
    # The first step of your workflow is a dataset generator.
    # This step takes a Google Cloud Pipeline Component, providing the necessary
    # input arguments, and uses the Python variable `ds_op` to define its
    # output. Note that here the `ds_op` only stores the definition of the
    # output but not the actual returned object from the execution. The value
    # of the object is not accessible at the dsl.pipeline level, and can only be
    # retrieved by providing it as the input to a downstream component.
    ds_op = ImageDatasetCreateOp(
        project=project_id,
        display_name="flowers",
        gcs_source="gs://cloud-samples-data/vision/automl_classification/flowers/all_data_v2.csv",
        import_schema_uri=aiplatform.schema.dataset.ioformat.image.single_label_classification,
    )

    # The second step is a model training component. It takes the dataset
    # outputted from the first step, supplies it as an input argument to the
    # component (see `dataset=ds_op.outputs["dataset"]`), and will put its
    # outputs into `training_job_run_op`.
    training_job_run_op = AutoMLImageTrainingJobRunOp(
        project=project_id,
        display_name="train-iris-automl-mbsdk-1",
        prediction_type="classification",
        model_type="CLOUD",
        dataset=ds_op.outputs["dataset"],
        model_display_name="iris-classification-model-mbsdk",
        training_fraction_split=0.6,
        validation_fraction_split=0.2,
        test_fraction_split=0.2,
        budget_milli_node_hours=8000,
    )

    # The third and fourth steps deploy the model.
    create_endpoint_op = EndpointCreateOp(
        project=project_id,
        display_name = "create-endpoint",
    )

    model_deploy_op = ModelDeployOp(
        model=training_job_run_op.outputs["model"],
        endpoint=create_endpoint_op.outputs['endpoint'],
        automatic_resources_min_replica_count=1,
        automatic_resources_max_replica_count=1,
    )

> The pipeline root can be set as an argument of the `@kfp.dsl.pipeline` decorator on the pipeline function, or it can be set when you create the pipeline run, for example through the `pipeline_root` argument of `PipelineJob`.

### Compile your pipeline into a YAML file
After the workflow of your pipeline is defined, you can proceed to compile the pipeline into YAML format. The YAML file includes all the information for executing your pipeline on Vertex AI Pipelines.

In [None]:
from kfp import compiler

compiler.Compiler().compile(
    pipeline_func=pipeline,     # The name of your pipeline's function.
    # The path to where to store your compiled pipeline.
    package_path='image_classif_pipeline.yaml'
)

### Submit your pipeline run
After the workflow of your pipeline is compiled into the YAML format, you can use the Vertex AI Python client to submit and run your pipeline.

In [None]:
import google.cloud.aiplatform as aip

# Before initializing, make sure to set the GOOGLE_APPLICATION_CREDENTIALS
# environment variable to the path of your service account key file.
aip.init(
    project=project_id,
    location=PROJECT_REGION,  # The region that this pipeline runs in.
)

# Prepare the pipeline job
job = aip.PipelineJob(
    display_name="automl-image-training-v2",
    template_path="image_classif_pipeline.yaml",
    pipeline_root=pipeline_root_path,
    parameter_values={
        'project_id': project_id
    }
)

job.submit()

In the preceding example:
- A Kubeflow pipeline is defined as a Python function. The function is annotated with the `@kfp.dsl.pipeline` decorator, which specifies the pipeline's name and root path. The pipeline root path is the location where the pipeline's artifacts are stored.
- The pipeline's workflow steps are created using the Google Cloud Pipeline Components. By using the outputs of a component as an input of another component, you define the pipeline's workflow as a graph. For example: `training_job_run_op` depends on the `dataset` output of `ds_op`.
- You compile the pipeline using `kfp.compiler.Compiler`.
- You create a pipeline run on Vertex AI Pipelines using the Vertex AI Python client. When you run a pipeline, you can override the pipeline name and the pipeline root path. Pipeline runs can be grouped using the pipeline name. Overriding the pipeline name can help you distinguish between production and experimental pipeline runs.

## Building Kubeflow pipelines
Use the following process to build a pipeline.
1. Design your pipeline as a series of components. To promote reusability, each component should have a single responsibility. Whenever possible, design your pipeline to reuse proven components such as the Google Cloud Pipeline Components.
2. Build any custom components that are required to implement your ML workflow using Kubeflow Pipelines SDK. Components are self-contained sets of code that perform a step in your ML workflow. Use the following options to create your pipeline components.
    - Package your component's code as a container image. This option lets you include code in your pipeline that was written in any language that can be packaged as a container image.
    - Implement your component's code as a standalone Python function and use the Kubeflow Pipelines SDK to package your function as a component. This option makes it easier to build Python-based components.
3. Build your pipeline as a Python function.
4. Use the Kubeflow Pipelines SDK compiler to compile your pipeline.
5. Run your pipeline using Google Cloud console or Python.

> If you have a pipeline template or definition that references a container with security vulnerabilities, do the following:
> 1. Install the latest patched version of the SDK.
> 2. Rebuild and recompile your pipeline template or definition.
> 3. Re-upload the template or definition to Artifact Registry or Cloud Storage.

## Run a pipeline 
Vertex AI Pipelines lets you run machine learning (ML) pipelines that were built using the Kubeflow Pipelines SDK or TensorFlow Extended in a serverless manner. You can also create pipeline runs using prebuilt templates in the **Template Gallery**.

### Create a pipeline run
**Set up authentication**  
To set up authentication, create a service account key and set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of the key file:

`GOOGLE_APPLICATION_CREDENTIALS="[PATH]"`

Replace *[PATH]* with the path of the JSON file that contains your service account key.

#### Run a pipeline
Running a Vertex AI `PipelineJob` requires you to create a `PipelineJob` object, and then invoke the `submit` method.

**Special input types supported by KFP**
While creating a pipeline run, you can also pass the following placeholders supported by the KFP SDK as inputs:
- `{{$.pipeline_job_name_placeholder}}`
- `{{$.pipeline_job_resource_name_placeholder}}`
- `{{$.pipeline_job_id_placeholder}}`
- `{{$.pipeline_task_name_placeholder}}`
- `{{$.pipeline_task_id_placeholder}}`
- `{{$.pipeline_job_create_time_utc_placeholder}}`
- `{{$.pipeline_root_placeholder}}`

In [None]:
from google.cloud import aiplatform

job = aiplatform.PipelineJob(display_name=DISPLAY_NAME,
                             template_path=COMPILED_PIPELINE_PATH,
                             job_id=JOB_ID,
                             pipeline_root=PIPELINE_ROOT_PATH,
                             parameter_values=PIPELINE_PARAMETERS,
                             enable_caching=ENABLE_CACHING,
                             encryption_spec_key_name=CMEK,
                             labels=LABELS,
                             credentials=CREDENTIALS,
                             project=PROJECT_ID,
                             location=LOCATION,
                             failure_policy=FAILURE_POLICY)

job.submit(service_account=SERVICE_ACCOUNT,
           network=NETWORK)

Replace the following:
- ***DISPLAY_NAME***: The name of the pipeline. This name shows up in the Google Cloud console.
- ***COMPILED_PIPELINE_PATH***: The path to your compiled pipeline YAML file. It can be a local path or a Cloud Storage URI.  
  Optional: To specify a particular version of a compiled pipeline, include the version tag in any one of the following formats:
  - ***`COMPILED_PIPELINE_PATH:TAG`***, where ***TAG*** is the version tag.
  - ***`COMPILED_PIPELINE_PATH@SHA256_TAG`***, where ***SHA256_TAG*** is the `sha256` hash value of the pipeline version.
- ***JOB_ID***: (optional) A unique identifier for this pipeline run. If the job ID is not specified, Vertex AI Pipelines creates a job ID for you using the pipeline name and the timestamp of when the pipeline run was started.
- ***PIPELINE_ROOT_PATH***: (optional) To override the pipeline root path specified in the pipeline definition, specify a path that your pipeline job can access, such as a Cloud Storage bucket URI.
- ***PIPELINE_PARAMETERS***: (optional) The pipeline parameters to pass to this run. For example, create a `dict()` with the parameter names as the dictionary keys and the parameter values as the dictionary values.
- ***ENABLE_CACHING***: (optional) Specifies if this pipeline run uses execution caching. Execution caching reduces costs by skipping pipeline tasks where the output is known for the current set of inputs. If the enable caching argument is not specified, execution caching is used in this pipeline run.
- ***CMEK***: (optional) The name of the customer-managed encryption key that you want to use for this pipeline run.
- ***LABELS***: (optional) The user defined labels to organize this `PipelineJob`.  
  Vertex AI Pipelines automatically attaches the following label to a pipeline run:  
  `vertex-ai-pipelines-run-billing-id: pipeline_run_id`  
  where `pipeline_run_id` is the unique ID of the pipeline run.  
  This label connects the usage of Google Cloud resources generated by the pipeline run in billing reports.
- ***CREDENTIALS***: (optional) Custom credentials to use to create this `PipelineJob`. Overrides credentials set in `aiplatform.init`.
- ***PROJECT_ID***: (optional) The Google Cloud project that you want to run the pipeline in. If you don't set this parameter, the project set in `aiplatform.init` is used.
- ***LOCATION***: (optional) The region that you want to run the pipeline in. If you don't set this parameter, the default location set in `aiplatform.init` is used.
- ***FAILURE_POLICY***: (optional) Specify the failure policy for the entire pipeline. The following configurations are available:
  - To configure the pipeline to fail after one task fails, enter `fast`. Tasks that are already scheduled continue running until they are completed.
  - To configure the pipeline to continue scheduling tasks after one task fails, enter `slow`. The pipeline continues to run until all tasks have been executed.
  If you don't set this parameter, the failure policy configuration is set to `slow`, by default.
- ***SERVICE_ACCOUNT***: (optional) The name of the service account to use for this pipeline run. If you don't specify a service account, Vertex AI Pipelines runs your pipeline using the default Compute Engine service account.
- ***NETWORK***: (optional) The name of the VPC peered network to use for this pipeline run.

The output of `job.submit()` includes a link that opens the pipeline run in the Google Cloud console.

## Configure execution caching 
When Vertex AI Pipelines runs a pipeline, it checks whether an execution with the same interface (cache key) as each pipeline step already exists in Vertex ML Metadata.

The step's interface is defined as the combination of the following:
- The pipeline step's inputs. These inputs include the input parameters' value (if any) and the input artifact id (if any).
- The pipeline step's output definition. This output definition includes output parameter definition (name, if any) and output artifact definition (name, if any).
- The component's specification. This specification includes the image, commands, arguments and environment variables being used, as well as the order of the commands and arguments.

Additionally, only the pipelines with the same pipeline name will share the cache.

If there is a matching execution in Vertex ML Metadata, the outputs of that execution are used and the step is skipped. This helps to reduce costs by skipping computations that were completed in a previous pipeline run.
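
The idea of a cache key can be sketched as a hash over the step's interface. This is only an illustration of the concept: the actual key format used by Vertex AI Pipelines isn't described here, and the `cache_key` function, its field names, and the example component specification are hypothetical.

```python
import hashlib
import json


def cache_key(pipeline_name, inputs, output_definitions, component_spec):
    """Hash a step's interface into a deterministic key (illustrative only)."""
    interface = {
        "pipeline_name": pipeline_name,
        "inputs": inputs,
        "output_definitions": output_definitions,
        "component_spec": component_spec,
    }
    # sort_keys makes the result independent of dict insertion order, while the
    # order of list items (commands and arguments) still affects the key.
    serialized = json.dumps(interface, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


spec = {
    "image": "python:3.10",
    "command": ["python", "train.py"],
    "args": ["--epochs", "5"],
    "env": {},
}
first = cache_key("demo", {"epochs": 5}, ["model"], spec)
same = cache_key("demo", {"epochs": 5}, ["model"], spec)
changed_input = cache_key("demo", {"epochs": 6}, ["model"], spec)
reordered_args = cache_key("demo", {"epochs": 5}, ["model"], {**spec, "args": ["5", "--epochs"]})
other_pipeline = cache_key("demo-experimental", {"epochs": 5}, ["model"], spec)

print(first == same)            # Same interface, so a cache hit is possible.
print(first == changed_input)   # A different input value changes the key.
```

In this sketch, a changed input value, a reordered argument list, or a different pipeline name each produce a different key, which mirrors the rules listed above.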

You can turn off execution caching at task level by setting the following:
`eval_task.set_caching_options(False)`

You can also turn off execution caching for an entire pipeline job. When you run a pipeline using `PipelineJob()`, set the `enable_caching` argument to `False` so that no step in the pipeline job uses caching:

`enable_caching=False`

> The `enable_caching` argument accepts the following values:
> - `True`: Enable the current run to use cached results from previous runs.
> - `False`: Disable the current run's use of cached results from previous runs.
> - `None`: Defer to the caching option of each pipeline component in the pipeline definition.
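
The three values can be read as a simple resolution rule between the job-level setting and the task-level setting. The following sketch expresses that reading; the `effective_caching` function is hypothetical, and it assumes that an explicit `True` or `False` on the job applies to every task in the run.

```python
from typing import Optional


def effective_caching(job_enable_caching: Optional[bool], task_caching: bool) -> bool:
    """Resolve whether a task uses cached results in this run (illustrative only)."""
    if job_enable_caching is None:
        # Defer to the caching option set on the task in the pipeline definition.
        return task_caching
    # Assumption: an explicit job-level value applies to every task.
    return job_enable_caching


print(effective_caching(None, task_caching=False))
print(effective_caching(False, task_caching=True))
```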

> **Important**: Pipeline components should be built to be deterministic. A given set of inputs should always produce the same output. Depending on their interface, non-deterministic pipeline components can be unexpectedly skipped due to execution caching.

The following limitations apply to this feature:
- The cached result doesn't have a time-to-live (TTL) and can be reused as long as the entry isn't deleted from Vertex ML Metadata. If the entry is deleted from Vertex ML Metadata, the task reruns to regenerate the result.

## Configure retries for a pipeline task 
You can specify whether a pipeline task must be rerun if it fails, by configuring the retries for that task. You can set the number of attempts to rerun the task on failure and the delay between subsequent retries.

Use the following code sample to configure retries for a pipeline task named `train_op` by using the `set_retry` method in the Kubeflow Pipelines SDK:

In [None]:
from kfp import dsl

@dsl.pipeline(name='custom-container-pipeline')
def pipeline():
  generate = generate_op()
  train = (
    train_op(
      training_data=generate.outputs['training_data'],
      test_data=generate.outputs['test_data'],
      config_file=generate.outputs['config_file'])
    .set_retry(
      num_retries=NUMBER_OF_RETRIES,

      # Optional. How long to wait after the task fails before retrying. Default is '0s'.
      backoff_duration='BACKOFF_DURATION',

      # Optional. The factor by which the backoff duration is multiplied for each subsequent retry. Default is 2.0.
      backoff_factor=BACKOFF_FACTOR,

      # Optional. The maximum backoff duration between subsequent retries. Default is '3600s'.
      backoff_max_duration='BACKOFF_MAX_DURATION'
    )
  )

> **Caution**: You can't pass output parameters from other pipeline tasks or pipeline input parameters as parameter values for the set_retry method. These values must be available when you compile the pipeline.
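
To see how the retry parameters interact, the following sketch computes the wait before each retry. It assumes plain exponential backoff, where the wait before retry *n* is the backoff duration multiplied by the backoff factor *n* times and capped at the maximum duration. The exact schedule that the service applies may differ, and the `retry_delays` helper is hypothetical.

```python
def retry_delays(num_retries, backoff_duration=0.0, backoff_factor=2.0,
                 backoff_max_duration=3600.0):
    """Return the wait in seconds before each retry, assuming exponential backoff."""
    delays = []
    for attempt in range(num_retries):
        # The first retry waits backoff_duration; each later retry multiplies it.
        delay = backoff_duration * (backoff_factor ** attempt)
        delays.append(min(delay, backoff_max_duration))
    return delays


delays = retry_delays(num_retries=5, backoff_duration=60.0,
                      backoff_factor=2.0, backoff_max_duration=600.0)
print(delays)
```

With these example values, the last retry is capped by the maximum duration instead of waiting 960 seconds.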

## Specify the machine configuration for a pipeline step
By setting the machine type parameters on the pipeline step, you can manage the requirements of each step in your pipeline. For example, if you have two training steps, one that trains on a huge data file and another that trains on a small data file, you can allocate more memory and CPU to the first task and fewer resources to the second task.

By default, the component runs as a Vertex AI `CustomJob` on an **e2-standard-4** machine, with 4 vCPUs and 16 GB of memory.

> **Note**: If you want to specify the disk space in the machine configuration, you must create a custom training job from a component by requesting Google Cloud machine resources instead.

In [None]:
from kfp import dsl


@dsl.pipeline(name='custom-container-pipeline')
def pipeline():
  generate = generate_op()
  train = (
      train_op(
          training_data=generate.outputs['training_data'],
          test_data=generate.outputs['test_data'],
          config_file=generate.outputs['config_file'])
      .set_cpu_limit('CPU_LIMIT')
      .set_memory_limit('MEMORY_LIMIT')
      .add_node_selector_constraint(SELECTOR_CONSTRAINT)
      .set_accelerator_limit(ACCELERATOR_LIMIT))

Replace the following:
- ***CPU_LIMIT***: The maximum CPU limit for this operator. This string value can be a number (integer value for number of CPUs), or a number followed by "m", which means 1/1000. You can specify at most 96 CPUs.
- ***MEMORY_LIMIT***: The maximum memory limit for this operator. This string value can be a number, or a number followed by "K" (kilobyte), "M" (megabyte), or "G" (gigabyte). At most 624GB is supported.
  **Note**: Vertex AI Pipelines doesn't support calling `set_memory_request` on an operator; use `set_memory_limit` to request a specific memory amount.
- ***SELECTOR_CONSTRAINT***: Each constraint is a key-value pair label. For the container to be eligible to run on a node, the node must have each constraint as a label. For example: `'cloud.google.com/gke-accelerator'`, `'NVIDIA_TESLA_T4'`  
  Available constraints: `NVIDIA_H100_MEGA_80GB` (includes GPUDirect-TCPXO), `NVIDIA_H100_80GB`, `NVIDIA_A100_80GB`, `NVIDIA_TESLA_A100` (NVIDIA A100 40GB), `NVIDIA_TESLA_P4`, `NVIDIA_TESLA_P100`, `NVIDIA_TESLA_T4`, `NVIDIA_TESLA_V100`, `NVIDIA_L4`, `TPU_V2`, `TPU_V3`
- ***ACCELERATOR_LIMIT***: The accelerator (GPU or TPU) limit for the operator. You can specify a positive integer.

`CustomJob` supports specific machine types that limit you to a maximum of 96 CPUs and 624GB of memory. Based on the CPU, memory, and accelerator configuration that you specify, Vertex AI Pipelines automatically selects the closest match from the supported machine types.
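
Before compiling a pipeline, it can help to check limit strings against the documented maximums. The following sketch parses the CPU and memory formats described above and rejects values over 96 CPUs or 624 GB. The helper names are hypothetical, and two details are assumptions: the suffixes are treated as decimal units, and a bare memory number is treated as bytes.

```python
MAX_CPUS = 96
MAX_MEMORY_GB = 624


def parse_cpu_limit(value: str) -> float:
    """Convert a CPU limit such as '8' or '500m' to a number of CPUs."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000  # "m" means 1/1000 of a CPU.
    return float(value)


def parse_memory_limit_gb(value: str) -> float:
    """Convert a memory limit such as '16G' or '512M' to gigabytes."""
    factors = {"K": 1e-6, "M": 1e-3, "G": 1.0}  # Assumption: decimal units.
    suffix = value[-1]
    if suffix in factors:
        return float(value[:-1]) * factors[suffix]
    return float(value) / 1e9  # Assumption: a bare number is a byte count.


def validate_limits(cpu_limit: str, memory_limit: str):
    """Return (cpus, memory_gb) or raise ValueError if a documented maximum is exceeded."""
    cpus = parse_cpu_limit(cpu_limit)
    memory_gb = parse_memory_limit_gb(memory_limit)
    if cpus > MAX_CPUS:
        raise ValueError(f"CPU limit {cpu_limit!r} exceeds {MAX_CPUS} CPUs")
    if memory_gb > MAX_MEMORY_GB:
        raise ValueError(f"Memory limit {memory_limit!r} exceeds {MAX_MEMORY_GB} GB")
    return cpus, memory_gb


cpus, memory_gb = validate_limits("500m", "16G")
print(cpus, memory_gb)
```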

## Request Google Cloud machine resources with Vertex AI Pipelines 

### Create a custom training job from a component using Vertex AI Pipelines
The following sample shows how to use the `create_custom_training_job_from_component` method to transform a Python component into a custom training job with user-defined Google Cloud machine resources, and then run the compiled pipeline on Vertex AI Pipelines:

In [None]:

import kfp
from kfp import dsl
from google_cloud_pipeline_components.v1.custom_job import create_custom_training_job_from_component

# Create a Python component
@dsl.component
def my_python_component():
  import time
  time.sleep(1)

# Convert the above component into a custom training job
custom_training_job = create_custom_training_job_from_component(
    my_python_component,
    display_name = 'DISPLAY_NAME',
    machine_type = 'MACHINE_TYPE',
    accelerator_type='ACCELERATOR_TYPE',
    accelerator_count='ACCELERATOR_COUNT',
    boot_disk_type='BOOT_DISK_TYPE',
    boot_disk_size_gb='BOOT_DISK_SIZE',
    network='NETWORK',
    reserved_ip_ranges='RESERVED_IP_RANGES',
    nfs_mounts='NFS_MOUNTS',
    persistent_resource_id='PERSISTENT_RESOURCE_ID'
)

# Define a pipeline that runs the custom training job
@dsl.pipeline(
  name="resource-spec-request",
  description="A simple pipeline that requests a Google Cloud machine resource",
  pipeline_root='PIPELINE_ROOT',
)
def pipeline():
  training_job_task = custom_training_job(
      project='PROJECT_ID',
      location='LOCATION',
  ).set_display_name('training-job-task')

Replace the following:
- ***DISPLAY_NAME***: The name of the custom job. If you don't specify the name, the component name is used, by default.
- ***MACHINE_TYPE***: The type of the machine for running the custom job—for example, `e2-standard-4`. If you specified a TPU as the `accelerator_type`, set this to `cloud-tpu`.
- ***ACCELERATOR_TYPE***: The type of accelerator attached to the machine.
- ***ACCELERATOR_COUNT***: The number of accelerators attached to the machine running the custom job. Default is `1`.
- ***BOOT_DISK_TYPE***: The type of boot disk.
- ***BOOT_DISK_SIZE***: The size of the boot disk in GB.
- ***NETWORK***: If the custom job is peered to a Compute Engine network that has private services access configured, specify the full name of the network.
- ***RESERVED_IP_RANGES***: A list of names for the reserved IP ranges under the VPC network used to deploy the custom job.
- ***NFS_MOUNTS***: A list of NFS mount resources in JSON dict format.
- ***PERSISTENT_RESOURCE_ID*** (preview): The ID of the persistent resource to run the pipeline. If you specify a persistent resource, the pipeline runs on existing machines associated with the persistent resource, instead of on-demand and short-lived machine resources. Note that the network and CMEK configuration for the pipeline must match the configuration specified for the persistent resource.
- ***PIPELINE_ROOT***: Specify a Cloud Storage URI that your pipelines service account can access. The artifacts of your pipeline runs are stored within the pipeline root.

## Configure secrets with Secret Manager
You can use Secret Manager's Python client with Vertex AI Pipelines to access secrets stored on Secret Manager.
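
The Secret Manager client identifies a secret version by its full resource name. The following sketch only formats that name and doesn't call the Secret Manager API. The helper name and the example values are hypothetical, and the default relies on the `latest` alias, which refers to the most recently created version of a secret.

```python
def secret_version_name(project_id: str, secret_id: str, version_id: str = "latest") -> str:
    """Build the resource name of a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"


name = secret_version_name("my-project", "db-password")
print(name)
```

Pinning a numeric `version_id` instead of `latest` keeps a pipeline run reproducible when the secret is rotated.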

### Build and run a pipeline with Python function-based components
1. Grant the service account that runs the pipeline permission to access secrets in Secret Manager.
2. Using the Kubeflow Pipelines SDK, build a simple pipeline with one task.

In [None]:
from kfp import compiler
from kfp import dsl

# A simple component that prints a secret stored in Secret Manager
# Be sure to specify "google-cloud-secret-manager" as one of packages_to_install
@dsl.component(
    packages_to_install=['google-cloud-secret-manager']
)
def print_secret_op(project_id: str, secret_id: str, version_id: str) -> str:
    from google.cloud import secretmanager

    secret_client = secretmanager.SecretManagerServiceClient()
    secret_name = f'projects/{project_id}/secrets/{secret_id}/versions/{version_id}'
    response = secret_client.access_secret_version(request={"name": secret_name})
    payload = response.payload.data.decode("UTF-8")
    answer = "The secret is: {}".format(payload)
    print(answer)
    return answer

# A simple pipeline that contains a single print_secret task
@dsl.pipeline(
    name='secret-manager-demo-pipeline')
def secret_manager_demo_pipeline(project_id: str, secret_id: str, version_id: str):
    print_secret_task = print_secret_op(project_id, secret_id, version_id)

# Compile the pipeline
compiler.Compiler().compile(pipeline_func=secret_manager_demo_pipeline,
                            package_path='secret_manager_demo_pipeline.yaml')


3. Run the pipeline using the Vertex AI SDK.

In [None]:
from google.cloud import aiplatform

parameter_values = {
    "project_id": PROJECT_ID,
    "secret_id": SECRET_ID,     # Id (name) of secret
    "version_id": VERSION_ID    # The version name of the secret
}

aiplatform.init(
    project=PROJECT_ID,
    location=REGION,
)

job = aiplatform.PipelineJob(
    display_name='test-secret-manager-pipeline',
    template_path='secret_manager_demo_pipeline.yaml',
    pipeline_root=PIPELINE_ROOT,
    enable_caching=False,
    parameter_values=parameter_values
)

job.submit(
    service_account=SERVICE_ACCOUNT
)
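
The `print_secret_op` component above addresses a secret version by its full resource name, `projects/PROJECT_ID/secrets/SECRET_ID/versions/VERSION_ID`. As a minimal sketch, the helper below builds that name and rejects malformed parts before any API call is made. The function name `secret_version_name` and its validation rules are illustrative assumptions, not part of the Secret Manager client.

```python
def secret_version_name(project_id: str, secret_id: str, version_id: str) -> str:
    """Build the resource name of a secret version (hypothetical helper)."""
    parts = {"project_id": project_id, "secret_id": secret_id, "version_id": version_id}
    for label, value in parts.items():
        # Assumed check: each part must be non-empty and must not contain "/".
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"


# Placeholder identifiers, for illustration only.
print(secret_version_name("my-project", "my-secret", "1"))
```

A check like this could run inside the component before `access_secret_version` is called, so that a mistyped pipeline parameter fails with a clear message.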

## Output HTML and Markdown
Vertex AI Pipelines provides a set of predefined visualization types for evaluating the result of a pipeline job (for example, `Metrics` and `ClassificationMetrics`). However, there are many cases where a custom visualization is needed. Vertex AI Pipelines provides two main ways to output custom visualization artifacts: Markdown and HTML files.

### Import required dependencies

In [None]:
from kfp import dsl
from kfp.dsl import (
    Output,
    HTML,
    Markdown
)

### Output HTML
To export an HTML file, define a component with the `Output[HTML]` artifact. You must also write the HTML content to the artifact's path.

In [None]:
@dsl.component
def html_visualization(html_artifact: Output[HTML]):
    public_url = 'https://user-images.githubusercontent.com/37026441/140434086-d9e1099b-82c7-4df8-ae25-83fda2929088.png'
    html_content = \
      '<html><head></head><body><h1>Global Feature Importance</h1>\n<img src="{}" width="97%"/></body></html>'.format(public_url)
    with open(html_artifact.path, 'w') as f:
        f.write(html_content)
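
The HTML written to the artifact can also be generated from data that the component computes. As a rough sketch using only the Python standard library, the hypothetical helper `render_metrics_html` below turns a dictionary of metric values into an HTML table and writes it to a file. Inside a component, the target path would be `html_artifact.path`. The helper name, the metric names, and the values are made up for illustration.

```python
import html
import os
import tempfile


def render_metrics_html(title: str, metrics: dict) -> str:
    """Return a small HTML page with one table row per metric (hypothetical helper)."""
    rows = "".join(
        f"<tr><td>{html.escape(str(name))}</td><td>{html.escape(str(value))}</td></tr>"
        for name, value in metrics.items()
    )
    return (
        f"<html><head></head><body><h1>{html.escape(title)}</h1>"
        f"<table><tr><th>Metric</th><th>Value</th></tr>{rows}</table></body></html>"
    )


# Illustrative values only; a real component would compute these.
page = render_metrics_html("Evaluation metrics", {"accuracy": 0.93, "f1 <macro>": 0.9})

# Stand-in for html_artifact.path, which the pipeline runtime provides.
output_path = os.path.join(tempfile.mkdtemp(), "metrics.html")
with open(output_path, "w") as f:
    f.write(page)
```

Escaping each value with `html.escape` keeps characters such as `<` in metric names from being interpreted as markup.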

### Output Markdown
To export a Markdown file, define a component with the `Output[Markdown]` artifact. You must also write the Markdown content to the artifact's path.

In [None]:
@dsl.component
def markdown_visualization(markdown_artifact: Output[Markdown]):
    import urllib.request

    with urllib.request.urlopen('https://gist.githubusercontent.com/zijianjoy/a288d582e477f8021a1fcffcfd9a1803/raw/68519f72abb59152d92cf891b4719cd95c40e4b6/table_visualization.md') as table:
        markdown_content = table.read().decode('utf-8')
        with open(markdown_artifact.path, 'w') as f:
            f.write(markdown_content)
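
The component above downloads its Markdown from a URL, so it needs network access at run time. As an alternative sketch, the Markdown can be built inside the component from its own data. The hypothetical helper `render_markdown_table` below is an assumption for illustration, as are the model names and scores.

```python
def render_markdown_table(headers: list, rows: list) -> str:
    """Return a Markdown table built from headers and rows (hypothetical helper)."""

    def line(cells):
        # Escape pipes so a cell value cannot split into extra columns.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    separator = "| " + " | ".join("---" for _ in headers) + " |"
    return "\n".join([line(headers), separator] + [line(r) for r in rows]) + "\n"


markdown_content = render_markdown_table(
    ["Model", "Accuracy"],
    [["baseline", 0.81], ["candidate", 0.88]],  # illustrative values only
)
print(markdown_content)
```

The resulting string could then be written to `markdown_artifact.path` in the same way as the downloaded content.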

### Create your pipeline
After you have defined your component with the HTML or Markdown artifact, create and run a pipeline that uses the component.

In [None]:
@dsl.pipeline(
    name='metrics-visualization-pipeline')
def metrics_visualization_pipeline():
    html_visualization_op = html_visualization()
    markdown_visualization_op = markdown_visualization()