In [None]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Getting Started with Vertex AI Turbo Templates

This notebook sets up infrastructure to run production-ready pipelines on Google Cloud. Follow this three-part notebook series to get started in a local Jupyter notebook or in [Vertex AI Workbench](https://cloud.google.com/vertex-ai-notebooks):

1. **Infrastructure Setup - this notebook**
1. [Run Pipelines](./02_run_pipelines.ipynb)
1. [Infrastructure Clean Up](./02_run_pipelines.ipynb)


**Prerequisites:**

- [Google Cloud SDK (gcloud)](https://cloud.google.com/sdk/docs/quickstart)
- Make
- [Terraform](https://www.terraform.io)

**For Vertex AI Workbench users**: Uncomment and execute the following cell to install Terraform.

In [None]:
# ! bash ./scripts/install_terraform.sh

## Authenticate

#### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)
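
Before running the next cell, it can help to sanity-check the value you are about to assign to `VERTEX_PROJECT_ID`. The sketch below assumes the commonly documented project ID rules (6 to 30 characters; lowercase letters, digits, and hyphens; starting with a letter; not ending with a hyphen). The helper name `is_valid_project_id` is hypothetical and not part of the template.

```python
import re

# Assumption: project IDs are 6-30 characters, use lowercase letters, digits
# and hyphens, start with a letter and do not end with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(project_id: str) -> bool:
    """Return True if project_id matches the assumed format rules."""
    return _PROJECT_ID_RE.fullmatch(project_id) is not None


if __name__ == "__main__":
    for candidate in ("my-project-id", "My_Project", "abc"):
        print(candidate, is_valid_project_id(candidate))
```

This only checks the format; it does not confirm that the project exists or that you have access to it.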

In [None]:
VERTEX_PROJECT_ID = "my-project-id"
! gcloud auth login
! gcloud config set project {VERTEX_PROJECT_ID}

## Clone Code

**If you haven't cloned the template yet:** Uncomment and execute the following cell to clone the code.

In [None]:
#! git clone -b develop https://github.com/teamdatatonic/vertex-pipelines-end-to-end-samples

Switch to the folder into which the template code was cloned:

In [None]:
%cd vertex-pipelines-end-to-end-samples/

Configure your code by setting the following variables:
- `VERTEX_PROJECT_ID` - the project ID, as set above
- `VERTEX_LOCATION` - the Google Cloud location in which to deploy resources (e.g. `europe-west2`)
- `RESOURCE_SUFFIX` - a suffix (e.g. `<your name>`) that makes it possible to run concurrent pipelines in the same Google Cloud project. When working in a team, change it to avoid overwriting each other's resources during development

In [None]:
%%writefile .env.sh
#!/bin/bash
VERTEX_PROJECT_ID=my-project-id
VERTEX_LOCATION=europe-west2
RESOURCE_SUFFIX=default
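
Once `.env.sh` has been written, a quick check that the three required variables are non-empty can catch typos early. This is a minimal sketch, not part of the template: `parse_env` and `missing_required` are hypothetical helpers, and the parser only handles plain `KEY=value` lines (it does not expand `${...}` references or handle quoting).

```python
REQUIRED = ("VERTEX_PROJECT_ID", "VERTEX_LOCATION", "RESOURCE_SUFFIX")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # The shebang and comment lines both start with '#'.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_required(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]


if __name__ == "__main__":
    sample = (
        "#!/bin/bash\n"
        "VERTEX_PROJECT_ID=my-project-id\n"
        "VERTEX_LOCATION=europe-west2\n"
        "RESOURCE_SUFFIX=default\n"
    )
    print(missing_required(parse_env(sample)))
```

In the notebook you could pass the file contents instead of `sample`, for example `parse_env(open(".env.sh").read())`.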

For most use cases you won't need to change the following variables unless you've modified the Terraform code.

In [None]:
%%writefile -a .env.sh
# Optional
VERTEX_CMEK_IDENTIFIER=
VERTEX_NETWORK=
# Leave as-is
VERTEX_SA_EMAIL=vertex-pipelines@${VERTEX_PROJECT_ID}.iam.gserviceaccount.com
VERTEX_PIPELINE_ROOT=gs://${VERTEX_PROJECT_ID}-pl-root
CONTAINER_IMAGE_REGISTRY=${VERTEX_LOCATION}-docker.pkg.dev/${VERTEX_PROJECT_ID}/vertex-images
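
To see what the "leave as-is" values resolve to for your project, the same string templates can be reproduced in Python. This is an illustrative sketch only: `derive_settings` is a hypothetical helper that mirrors the three shell assignments above and is not used by the template itself.

```python
def derive_settings(project_id, location):
    """Mirror the derived values that .env.sh builds from the project ID and location."""
    return {
        "VERTEX_SA_EMAIL": f"vertex-pipelines@{project_id}.iam.gserviceaccount.com",
        "VERTEX_PIPELINE_ROOT": f"gs://{project_id}-pl-root",
        "CONTAINER_IMAGE_REGISTRY": f"{location}-docker.pkg.dev/{project_id}/vertex-images",
    }


if __name__ == "__main__":
    for name, value in derive_settings("my-project-id", "europe-west2").items():
        print(f"{name}={value}")
```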

## Deploy Infrastructure


The cloud infrastructure is managed using Terraform and is defined in the [`terraform`](terraform) directory. There are three Terraform modules defined in [`terraform/modules`](terraform/modules):

- `cloudfunction` - deploys a (Pub/Sub-triggered) Cloud Function from local source code
- `scheduled_pipelines` - deploys Cloud Scheduler jobs that will trigger Vertex Pipeline runs (via the above Cloud Function)
- `vertex_deployment` - deploys Cloud infrastructure required for running Vertex Pipelines, including enabling APIs, creating buckets, Artifact Registry repos, service accounts, and IAM permissions.

**Enable APIs**:

In [None]:
! gcloud services enable cloudresourcemanager.googleapis.com serviceusage.googleapis.com

**Create Cloud Storage bucket:**

Store the [Terraform state files](https://developer.hashicorp.com/terraform/language/state/remote) in the bucket `[project-id]-tfstate`:

In [None]:
! source .env.sh && gsutil mb -l $VERTEX_LOCATION -p $VERTEX_PROJECT_ID gs://$VERTEX_PROJECT_ID-tfstate
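
Re-running the cell above fails if the bucket already exists. One possible way to make the step repeatable is to check for the bucket first, sketched below under the assumption that `gsutil ls -b` exits with a non-zero code when the bucket is missing or inaccessible. The helper names are hypothetical, and the `run` parameter exists only so the logic can be exercised without calling `gsutil`.

```python
import subprocess


def tfstate_bucket(project_id):
    """Bucket URL used for Terraform state, following the naming above."""
    return f"gs://{project_id}-tfstate"


def ensure_tfstate_bucket(project_id, location, run=subprocess.run):
    """Create the state bucket unless it already exists.

    Returns True if the bucket was created, False if it was already there.
    """
    bucket = tfstate_bucket(project_id)
    # Assumption: a zero exit code from `gsutil ls -b` means the bucket exists.
    probe = run(["gsutil", "ls", "-b", bucket], capture_output=True)
    if probe.returncode == 0:
        return False
    run(["gsutil", "mb", "-l", location, "-p", project_id, bucket], check=True)
    return True
```

Calling `ensure_tfstate_bucket("my-project-id", "europe-west2")` would then run the same `gsutil mb` command as the cell above, but only when needed.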

**Deploy:**

In [None]:
! make deploy auto-approve=true

You've successfully deployed a `dev` environment! 🎉 
Continue with [this notebook](./02_run_pipelines.ipynb) to run your first Vertex AI pipeline in the deployed project.

**Note:** To deploy separate cloud environments, run `make deploy env=dev`, replacing `dev` with `test` or `prod` as needed.
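
If you script deployments to several environments, a small helper can guard against typos in the environment name. This is a sketch only: `deploy_command` is hypothetical, and combining `env=` with `auto-approve=true` in a single invocation is an assumption based on the two `make` variables used in this notebook.

```python
ENVIRONMENTS = ("dev", "test", "prod")


def deploy_command(env="dev", auto_approve=False):
    """Build the make invocation for one environment as an argument list."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment {env!r}; expected one of {ENVIRONMENTS}")
    command = ["make", "deploy", f"env={env}"]
    if auto_approve:
        # Assumption: both make variables can be passed together.
        command.append("auto-approve=true")
    return command


if __name__ == "__main__":
    print(" ".join(deploy_command("test")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.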