# Vertex AI > Pipelines - Retrain All 05a through 05i
## IN ACTIVE DEVELOPMENT - NOT COMPLETE

Notebooks `05a` through `05i` each demonstrated a method of using Vertex AI > Training.  This notebook creates a Vertex AI > Pipelines job that retrains all of these methods in a single pipeline, showcasing a combination of custom pipeline components and pre-built components for Vertex AI.


Introduction: https://cloud.google.com/vertex-ai/docs/pipelines/components-introduction
Component List: https://cloud.google.com/vertex-ai/docs/pipelines/gcpc-list
SDK Reference: https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/
- [google_cloud_pipeline_components.aiplatform.CustomContainerTrainingJobRunOp](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/google_cloud_pipeline_components.aiplatform.html#google_cloud_pipeline_components.aiplatform.CustomContainerTrainingJobRunOp)
- [google_cloud_pipeline_components.aiplatform.CustomPythonPackageTrainingJobRunOp](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/google_cloud_pipeline_components.aiplatform.html#google_cloud_pipeline_components.aiplatform.CustomPythonPackageTrainingJobRunOp)
- [google_cloud_pipeline_components.experimental.custom_job](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/google_cloud_pipeline_components.experimental.custom_job.html)
- [google_cloud_pipeline_components.experimental.hyperparameter_tuning_job.*](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/google_cloud_pipeline_components.experimental.hyperparameter_tuning_job.html)
- [google_cloud_pipeline_components.v1.custom_job.*](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/google_cloud_pipeline_components.v1.custom_job.html#module-google_cloud_pipeline_components.v1.custom_job)
- [google_cloud_pipeline_components.v1.hyperparameter_tuning_job](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.18/google_cloud_pipeline_components.v1.hyperparameter_tuning_job.html#module-google_cloud_pipeline_components.v1.hyperparameter_tuning_job)

The idea is to:
- create a component for each of 05a - 05i
    - each component should run and create a new version of its model
- gather metrics from all components and pick the best model from the run
- deploy the best model to an existing endpoint
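
The "pick best from run" step in the list above could be sketched in plain Python as follows. This is only an illustration: the metric name `auprc`, the metric values, and the `pick_best` helper are assumptions, not part of the notebook, and the real pipeline would read metrics from each component's output.

```python
from typing import Dict


def pick_best(metrics_by_model: Dict[str, Dict[str, float]], metric: str = "auprc") -> str:
    """Return the name of the model with the highest value for `metric`."""
    if not metrics_by_model:
        raise ValueError("no models to compare")
    return max(metrics_by_model, key=lambda name: metrics_by_model[name][metric])


# Hypothetical metrics, one entry per retrained notebook method.
run_metrics = {
    "05a": {"auprc": 0.81},
    "05b": {"auprc": 0.84},
    "05c": {"auprc": 0.79},
}
best = pick_best(run_metrics)
print(best)
```

A higher-is-better metric is assumed; for a loss-style metric, `min` would replace `max`.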

### Conceptual Flow & Workflow
<p align="center">
  <img alt="Conceptual Flow" src="../architectures/slides/05tools_pipe_arch.png" width="45%">
&nbsp; &nbsp; &nbsp; &nbsp;
  <img alt="Workflow" src="../architectures/slides/05tools_pipe_console.png" width="45%">
</p>

---
## Setup

### Package Installs (if needed)

This notebook uses the Python Clients for
- Google Service Usage
    - to enable APIs (Artifact Registry)
- Artifact Registry
    - to create a repository for storing custom Python packages in a GCP Project

The cells below check whether the required Python libraries are installed.  If a library is missing, the cell prints a message and runs the associated pip install command.  These installs must complete before continuing with this notebook.

In [21]:
try:
    import google.cloud.service_usage_v1
except ImportError:
    print('You need to pip install google-cloud-service-usage')
    !pip install google-cloud-service-usage -q

In [22]:
try:
    import google.cloud.artifactregistry_v1
except ImportError:
    print('You need to pip install google-cloud-artifact-registry')
    !pip install google-cloud-artifact-registry -q
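
The two checks above follow the same pattern, so they could be collapsed into one helper. The sketch below is an alternative, not the notebook's method: `missing_packages` and the `REQUIRED` mapping are hypothetical names, and it only reports missing packages rather than installing them.

```python
import importlib.util


def missing_packages(required):
    """Return pip package names whose import module cannot be found.

    `required` maps an importable module name to the pip package providing it.
    """
    missing = []
    for module_name, pip_name in required.items():
        try:
            found = importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (such as `google.cloud`) is itself absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


REQUIRED = {
    "google.cloud.service_usage_v1": "google-cloud-service-usage",
    "google.cloud.artifactregistry_v1": "google-cloud-artifact-registry",
}
for pip_name in missing_packages(REQUIRED):
    print(f"You need to pip install {pip_name}")
```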

### Environment

inputs:

In [1]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [2]:
REGION = 'us-central1'
EXPERIMENT = 'pipelines'
SERIES = '05'

# source data
BQ_PROJECT = PROJECT_ID
BQ_DATASET = 'fraud'
BQ_TABLE = 'fraud_prepped'

# Model Training
VAR_TARGET = 'Class'
VAR_OMIT = 'transaction_id' # add more variables to the string with space delimiters

packages:

In [23]:
import os
import shutil

from google.cloud import aiplatform
from google.cloud import bigquery
from datetime import datetime

from google.cloud import service_usage_v1
from google.cloud import artifactregistry_v1

from typing import NamedTuple

from kfp import dsl
from kfp.v2 import dsl as dsl2
from kfp.v2 import compiler

clients:

In [24]:
aiplatform.init(project=PROJECT_ID, location=REGION)
bq = bigquery.Client()

su_client = service_usage_v1.ServiceUsageClient()
ar_client = artifactregistry_v1.ArtifactRegistryClient()

parameters:

In [5]:
TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
BUCKET = PROJECT_ID
URI = f"gs://{BUCKET}/{SERIES}/{EXPERIMENT}/pipelines"
DIR = f"temp/{EXPERIMENT}"

In [6]:
SERVICE_ACCOUNT = !gcloud config list --format='value(core.account)' 
SERVICE_ACCOUNT = SERVICE_ACCOUNT[0]
SERVICE_ACCOUNT

'1026793852137-compute@developer.gserviceaccount.com'

List the service account's current roles:

In [7]:
!gcloud projects get-iam-policy $PROJECT_ID --filter="bindings.members:$SERVICE_ACCOUNT" --format='table(bindings.role)' --flatten="bindings[].members"

ROLE
roles/bigquery.admin
roles/owner
roles/run.admin
roles/storage.objectAdmin





>Note: If the resulting list is missing [roles/storage.objectAdmin](https://cloud.google.com/storage/docs/access-control/iam-roles) then [revisit the setup notebook](../00%20-%20Setup/00%20-%20Environment%20Setup.ipynb#permissions) and add this permission to the service account with the provided instructions.
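
The check described in the note could also be done programmatically. The sketch below assumes the role list is available as a list of text lines (in a notebook, the output of a `!gcloud ...` command can be assigned to a variable); `missing_roles` is a hypothetical helper, and it does not account for broad roles such as `roles/owner` that already include the needed permissions.

```python
def missing_roles(role_lines, required):
    """Return the required roles that do not appear in the gcloud table output."""
    granted = {line.strip() for line in role_lines if line.strip().startswith("roles/")}
    return [role for role in required if role not in granted]


# Output lines copied from the cell above.
role_lines = [
    "ROLE",
    "roles/bigquery.admin",
    "roles/owner",
    "roles/run.admin",
    "roles/storage.objectAdmin",
]
print(missing_roles(role_lines, ["roles/storage.objectAdmin"]))
```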

environment:

In [37]:
!rm -rf {DIR}
!mkdir -p {DIR}

### Enable APIs

Using Cloud Build and Artifact Registry requires enabling these APIs for the Google Cloud Project.

There are two options for enabling them.  This notebook uses option 2.
 1. Use the APIs & Services page in the console: https://console.cloud.google.com/apis
     - `+ Enable APIs and Services`
     - Search for Cloud Build and Enable
     - Search for Artifact Registry and Enable
 2. Use [Google Service Usage](https://cloud.google.com/service-usage/docs) API from Python
     - [Python Client For Service Usage](https://github.com/googleapis/python-service-usage)
     - [Python Client Library Documentation](https://cloud.google.com/python/docs/reference/serviceusage/latest)
     
The following code cells use the Service Usage Client to:
- get the state of the service
- if 'DISABLED':
    - Try enabling the service and return the state after trying
- if 'ENABLED' print the state for confirmation

#### Artifact Registry

In [25]:
artifactregistry = su_client.get_service(
    request = service_usage_v1.GetServiceRequest(
        name = f'projects/{PROJECT_ID}/services/artifactregistry.googleapis.com'
    )
).state.name


if artifactregistry == 'DISABLED':
    print(f'Artifact Registry is currently {artifactregistry} for project: {PROJECT_ID}')
    print(f'Trying to Enable...')
    operation = su_client.enable_service(
        request = service_usage_v1.EnableServiceRequest(
            name = f'projects/{PROJECT_ID}/services/artifactregistry.googleapis.com'
        )
    )
    response = operation.result()
    if response.service.state.name == 'ENABLED':
        print(f'Artifact Registry is now enabled for project: {PROJECT_ID}')
    else:
        print(response)
else:
    print(f'Artifact Registry already enabled for project: {PROJECT_ID}')

Artifact Registry already enabled for project: statmike-mlops-349915
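
The same get-then-enable logic could be wrapped in a function and reused for other services, such as the Cloud Build API mentioned above. This is a minimal sketch: `ensure_service_enabled` is a hypothetical helper, and it assumes `client` behaves like `service_usage_v1.ServiceUsageClient`, whose methods accept a plain dict in place of a request object.

```python
def ensure_service_enabled(client, project_id, service):
    """Return the final state name of `service`, enabling it first if disabled.

    `client` is expected to behave like `service_usage_v1.ServiceUsageClient`.
    """
    name = f"projects/{project_id}/services/{service}"
    state = client.get_service(request={"name": name}).state.name
    if state == "DISABLED":
        operation = client.enable_service(request={"name": name})
        # result() blocks until the long-running operation finishes.
        state = operation.result().service.state.name
    return state
```

In this notebook it might be called as `ensure_service_enabled(su_client, PROJECT_ID, 'cloudbuild.googleapis.com')`, using the client created in the clients cell.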


#### Setup Artifact Registry

Artifact Registry organizes artifacts into repositories.  Each repository contains packages and is designated to hold a particular package format: Docker images, Python packages, and [others](https://cloud.google.com/artifact-registry/docs/supported-formats#package).

##### List Repositories

This list may be empty if no repositories have been created for this project.

In [32]:
for repo in ar_client.list_repositories(parent = f'projects/{PROJECT_ID}/locations/{REGION}'):
    print(repo.format_.name, repo.name)

DOCKER projects/statmike-mlops-349915/locations/us-central1/repositories/statmike-mlops-349915
DOCKER projects/statmike-mlops-349915/locations/us-central1/repositories/statmike-mlops-349915-docker
PYTHON projects/statmike-mlops-349915/locations/us-central1/repositories/statmike-mlops-349915-python


#### Create/Retrieve Docker Image Repository

Create an Artifact Registry repository to hold Docker images created by this notebook.  First, check whether a previous run already created it and, if so, retrieve it.  Otherwise, create it.

In [27]:
docker_repo = None
for repo in ar_client.list_repositories(parent = f'projects/{PROJECT_ID}/locations/{REGION}'):
    if repo.name.split('/')[-1] == PROJECT_ID:
        docker_repo = repo
        print(f'Retrieved existing repo: {docker_repo.name}')

if not docker_repo:
    operation = ar_client.create_repository(
        request = artifactregistry_v1.CreateRepositoryRequest(
            parent = f'projects/{PROJECT_ID}/locations/{REGION}',
            repository_id = f'{PROJECT_ID}',
            repository = artifactregistry_v1.Repository(
                description = f'A repository for the {PROJECT_ID} project that holds docker images.',
                name = f'{PROJECT_ID}',
                format_ = artifactregistry_v1.Repository.Format.DOCKER,
                labels = {'series': SERIES, 'experiment': EXPERIMENT}
            )
        )
    )
    print('Creating Repository ...')
    docker_repo = operation.result()
    print(f'Completed creating repo: {docker_repo.name}')

Retrieved existing repo: projects/statmike-mlops-349915/locations/us-central1/repositories/statmike-mlops-349915


In [33]:
print(docker_repo.format_.name, docker_repo.name)

DOCKER projects/statmike-mlops-349915/locations/us-central1/repositories/statmike-mlops-349915


In [34]:
REPOSITORY = f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{docker_repo.name.split('/')[-1]}"
REPOSITORY

'us-central1-docker.pkg.dev/statmike-mlops-349915/statmike-mlops-349915'
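
Images pushed to this repository are addressed by appending an image name and tag to the `REPOSITORY` path. The sketch below shows that composition; the helper `image_uri` and the image name `05_pipelines_trainer` are hypothetical and only illustrate the format.

```python
def image_uri(repository, image, tag="latest"):
    """Join an Artifact Registry Docker repository path, image name, and tag."""
    return f"{repository.rstrip('/')}/{image}:{tag}"


# REPOSITORY value shown above; the image name below is hypothetical.
repository = "us-central1-docker.pkg.dev/statmike-mlops-349915/statmike-mlops-349915"
print(image_uri(repository, "05_pipelines_trainer"))
```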

---
## Remove Resources
See the notebook "99 - Cleanup".