In [None]:
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Get started with Model Garden Terraform Deployment


<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fopen-models%2Fget_started_with_model_garden_terraform_deployment.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb">
      <img width="32px" src="https://storage.googleapis.com/github-repo/generative-ai/logos/GitHub_Invertocat_Dark.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>

<p>
<b>Share to:</b>

<a href="https://www.linkedin.com/sharing/share-offsite/?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb" target="_blank">
  <img width="20px" src="https://upload.wikimedia.org/wikipedia/commons/8/81/LinkedIn_icon.svg" alt="LinkedIn logo">
</a>

<a href="https://bsky.app/intent/compose?text=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb" target="_blank">
  <img width="20px" src="https://upload.wikimedia.org/wikipedia/commons/7/7a/Bluesky_Logo.svg" alt="Bluesky logo">
</a>

<a href="https://twitter.com/intent/tweet?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb" target="_blank">
  <img width="20px" src="https://upload.wikimedia.org/wikipedia/commons/5/5a/X_icon_2.svg" alt="X logo">
</a>

<a href="https://reddit.com/submit?url=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb" target="_blank">
  <img width="20px" src="https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png" alt="Reddit logo">
</a>

<a href="https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/GoogleCloudPlatform/generative-ai/blob/main/open-models/get_started_with_model_garden_terraform_deployment.ipynb" target="_blank">
  <img width="20px" src="https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg" alt="Facebook logo">
</a>
</p>

| Authors |
| --- |
| [Ivan Nardini](https://github.com/inardini) |
| [Eliza Huang](https://github.com/lizzij) |

## Overview

Deploying open models on Vertex AI through Terraform provides a
powerful infrastructure-as-code approach to manage your model
deployments. Instead of clicking through the UI or writing custom
API calls, you can define your entire Model Garden deployment in
configuration files that are version-controlled, repeatable, and
easily automated.

The [Vertex AI Model Garden Terraform resource](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/vertex_ai_endpoint_with_model_garden_deployment#example-usage---vertex-ai-deploy-basic) simplifies deploying
state-of-the-art open models by providing a declarative
configuration interface. With Terraform, you can deploy models from
both the curated Model Garden catalog and Hugging Face Hub, manage
compute resources, and maintain consistent deployments across
environments, all through simple configuration files.

This tutorial shows how to use Terraform to deploy open models from
Vertex AI Model Garden.

You will learn how to:

- Set up Terraform for Vertex AI Model Garden deployments
- Find models that you can deploy
- Deploy your first Model Garden model using Terraform
- Handle advanced configurations including custom machine types,
accelerators, and replicas
- Deploy models from Hugging Face Hub
- Troubleshoot common deployment errors

## Get started

### Prerequisites

Before you begin, ensure you have:

1. A Google Cloud project with billing enabled
2. The Vertex AI API enabled
3. Sufficient IAM permissions (Vertex AI Administrator or Editor role)

### Install Vertex AI SDK and other required packages

In [None]:
%pip install --upgrade --force-reinstall --quiet 'google-cloud-aiplatform>=1.93.1' 'openai' 'google-auth' 'requests' 'huggingface_hub'

### Install Terraform

If you don't have Terraform installed, download and install it from [terraform.io](https://www.terraform.io/downloads).

In [None]:
# Download and install the Terraform CLI (Linux amd64 build)
! wget https://releases.hashicorp.com/terraform/1.13.3/terraform_1.13.3_linux_amd64.zip
! unzip terraform_1.13.3_linux_amd64.zip
! sudo mv terraform /usr/local/bin/

# Verify installation
! terraform version
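If you would rather check from Python whether a `terraform` binary is already on the `PATH` before downloading it, a minimal sketch using only the standard library could look like this. The helper name `find_binary` is illustrative and not part of any SDK.

```python
import shutil
import subprocess
from typing import Optional


def find_binary(name: str = "terraform") -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


terraform_path = find_binary()
if terraform_path is None:
    print("terraform not found on PATH; run the install cell above.")
else:
    # `terraform version` prints the installed CLI version.
    result = subprocess.run(
        [terraform_path, "version"], capture_output=True, text=True, check=False
    )
    print(result.stdout.strip())
```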

### Authenticate your notebook environment (Colab only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
# Use the environment variable if the user doesn't provide Project ID.
import os

import vertexai

# fmt: off
PROJECT_ID = "[your-project-id]"  # @param {type: "string", placeholder: "[your-project-id]", isTemplate: true}
# fmt: on
if not PROJECT_ID or PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

vertexai.init(project=PROJECT_ID, location=LOCATION)

## Create a Terraform workspace

Create a new directory for your Terraform configurations.

In [None]:
# Create a directory for your Terraform project
! rm -rf ./model-garden-terraform && mkdir ./model-garden-terraform

## Find the models that you can deploy

In Vertex AI Model Garden, you can discover and deploy a wide range of open-source models. Many models are directly supported with optimized configurations for Vertex AI deployment.

To find deployable models, you can:

1. **Use the Model Garden UI**: Browse models at [console.cloud.google.com/vertex-ai/model-garden](https://console.cloud.google.com/vertex-ai/model-garden)
2. **Use the Vertex AI SDK**: Run `model_garden.list_deployable_models()` to see available models programmatically
3. **Check the documentation**: Review the [Model Garden documentation](https://cloud.google.com/vertex-ai/docs/start/explore-models)

For this tutorial, we'll deploy models from the curated Model Garden catalog and Hugging Face Hub. Common model families include:

- **Gemma** models: `publishers/google/models/gemma3@gemma-3-1b-it`
- **Llama** models: `publishers/meta/models/llama3-2@llama-3.2-1b-instruct`
- **PaliGemma** models: `publishers/google/models/paligemma@paligemma-224-float32`
- **Hugging Face** models: Any model ID from the Hugging Face Hub (e.g., `Qwen/Qwen3-0.6B`)

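The catalog identifiers listed above appear to follow a `publishers/<publisher>/models/<model>@<version>` pattern, while Hugging Face IDs look like `<org>/<repo>`. As a rough illustration, the sketch below tells the two apart so you know which Terraform argument (`publisher_model_name` or `hugging_face_model_id`) a given string belongs in. The helper `describe_model_id` and its regular expression are assumptions made for this example, inferred only from the identifiers shown in this tutorial.

```python
import re

# Pattern inferred from the examples in this tutorial (an assumption).
_PUBLISHER_RE = re.compile(
    r"^publishers/(?P<publisher>[^/]+)/models/(?P<model>[^/@]+)@(?P<version>.+)$"
)


def describe_model_id(model_id: str) -> dict:
    """Classify a model identifier and split it into its parts."""
    match = _PUBLISHER_RE.match(model_id)
    if match:
        return {"argument": "publisher_model_name", **match.groupdict()}
    if model_id.count("/") == 1:
        org, repo = model_id.split("/")
        return {"argument": "hugging_face_model_id", "org": org, "repo": repo}
    raise ValueError(f"Unrecognized model identifier: {model_id!r}")


print(describe_model_id("publishers/google/models/gemma3@gemma-3-1b-it"))
print(describe_model_id("Qwen/Qwen3-0.6B"))
```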
## Deploy your first Model Garden model

Let's deploy a Gemma model using Terraform with a basic configuration.


In [None]:
! rm -rf ./model-garden-terraform/01-basic-deployment && mkdir ./model-garden-terraform/01-basic-deployment

In [None]:
basic_deploy_config = """
# Configure the Terraform provider
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "7.5.0"
    }
  }
}

# Configure the Google Cloud provider
provider "google" {
  project = var.project_id
  region  = var.region
}

# Define variables
variable "project_id" {
  description = "Google Cloud Project ID"
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "us-central1"
}

# Deploy a Gemma model to Vertex AI
resource "google_vertex_ai_endpoint_with_model_garden_deployment" "gemma_deployment" {
  publisher_model_name = "publishers/google/models/gemma3@gemma-3-1b-it"
  location             = var.region

  model_config {
    accept_eula = true
  }
}

# Output the endpoint information
output "endpoint_id" {
  description = "The ID of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.gemma_deployment.id
}

output "endpoint_name" {
  description = "The name of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.gemma_deployment.deployed_model_display_name
}
"""

with open("./model-garden-terraform/01-basic-deployment/main.tf", "w") as f:
    f.write(basic_deploy_config)

### Create a variables file

Create a `terraform.tfvars` file to set your project-specific values.

In [None]:
deploy_vars = f"""
project_id="{PROJECT_ID}"
region="{LOCATION}"
"""

with open("./model-garden-terraform/01-basic-deployment/terraform.tfvars", "w") as f:
    f.write(deploy_vars)
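Interpolating values into an f-string works for simple IDs, but a value containing a quote or backslash would produce an invalid file. Terraform also loads a `terraform.tfvars.json` file automatically, so one alternative sketch is to let `json.dumps` handle the quoting. The function name and the example values below are illustrative; if you try this, use one of the two file formats rather than defining the same variable in both.

```python
import json
import tempfile
from pathlib import Path


def write_tfvars_json(directory: str, variables: dict) -> Path:
    """Write `variables` to <directory>/terraform.tfvars.json and return the path."""
    path = Path(directory) / "terraform.tfvars.json"
    path.write_text(json.dumps(variables, indent=2) + "\n")
    return path


# Demonstrate in a temporary directory so nothing in the workspace is touched.
with tempfile.TemporaryDirectory() as tmp:
    written = write_tfvars_json(
        tmp, {"project_id": "my-project", "region": "us-central1"}
    )
    print(written.read_text())
```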

### Initialize and deploy

Run the following Terraform commands to deploy your model.

Terraform will show you the planned changes and ask for confirmation. Type `yes` to proceed with the deployment.

> **Note**: The deployment typically takes 10-15 minutes depending on the model size and compute resources.

In [None]:
# Initialize the working directory, preview the plan, then apply it
! cd ./model-garden-terraform/01-basic-deployment && terraform init && terraform plan && terraform apply
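The confirmation prompt needs interactive input, which is not always available (for example in scheduled or CI runs). A possible non-interactive variant, sketched here with `subprocess`, uses the Terraform CLI's `-chdir` global option together with `-input=false` and `-auto-approve`. The helper names are made up for this example, and `apply_noninteractive` is only defined, not called, because it would start a real deployment.

```python
import subprocess
from typing import List


def terraform_command(workdir: str, *args: str) -> List[str]:
    """Build a terraform invocation that runs inside `workdir`."""
    return ["terraform", f"-chdir={workdir}", *args]


def apply_noninteractive(workdir: str) -> None:
    """Run init and apply without prompting; raises if a step fails."""
    subprocess.run(terraform_command(workdir, "init", "-input=false"), check=True)
    subprocess.run(
        terraform_command(workdir, "apply", "-input=false", "-auto-approve"),
        check=True,
    )


# Show the apply command that would be executed.
print(
    " ".join(
        terraform_command(
            "./model-garden-terraform/01-basic-deployment",
            "apply",
            "-input=false",
            "-auto-approve",
        )
    )
)
```

Skipping the prompt also skips your chance to review the plan, so this fits best once the configuration has already been reviewed.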

### Verify the deployment

After deployment completes, you can verify the endpoint in the Google Cloud Console or use the following command.

In [None]:
# Use Terraform to show the outputs
! cd ./model-garden-terraform/01-basic-deployment && terraform output
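To consume the outputs from Python rather than reading them off the screen, `terraform output -json` prints an object that maps each output name to its `sensitive`, `type`, and `value` fields. The sketch below flattens that into a plain dictionary. The sample payload is hand-written for illustration (the project and endpoint numbers are placeholders), and `parse_terraform_outputs` is a hypothetical helper.

```python
import json


def parse_terraform_outputs(output_json: str) -> dict:
    """Map each Terraform output name to its value."""
    raw = json.loads(output_json)
    return {name: entry["value"] for name, entry in raw.items()}


# Hand-written sample in the shape produced by `terraform output -json`.
sample = json.dumps(
    {
        "endpoint_id": {
            "sensitive": False,
            "type": "string",
            "value": "projects/my-project/locations/us-central1/endpoints/1234567890",
        },
        "endpoint_name": {
            "sensitive": False,
            "type": "string",
            "value": "example-display-name",
        },
    }
)

outputs = parse_terraform_outputs(sample)
print(outputs["endpoint_id"])
```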

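You can also list endpoints with the gcloud CLI. In a notebook cell, the leading `!` runs the command in a shell, and `$LOCATION` and `$PROJECT_ID` expand to the Python variables defined earlier.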

```bash
# Get the endpoint information
! gcloud ai endpoints list --region=$LOCATION --project=$PROJECT_ID
```

### Generate predictions

After deploying your model, you can generate predictions using the Vertex AI API or SDK.

#### Using Vertex AI SDK for Python

In [None]:
from google.cloud import aiplatform

# Initialize Vertex AI
aiplatform.init(project=PROJECT_ID, location=LOCATION)

# Set endpoint id
ENDPOINT_ID = ! cd ./model-garden-terraform/01-basic-deployment && terraform output -raw endpoint_id
ENDPOINT_ID = ENDPOINT_ID[0]

# Get the endpoint
endpoint = aiplatform.Endpoint(ENDPOINT_ID)

# Generate prediction
response = endpoint.predict(
    instances=[
        {"prompt": "Tell me a joke about AI", "temperature": 0.7, "max_tokens": 35}
    ],
    use_dedicated_endpoint=True,
)
print(response.predictions[0])

#### Using OpenAI SDK (for compatible models)

In [None]:
import google.auth
import openai
from google.auth.transport.requests import Request

# Get credentials
creds, _ = google.auth.default()
auth_req = Request()
creds.refresh(auth_req)

# Get the dedicated endpoint domain name
endpoint_url = f"https://{endpoint.gca_resource.dedicated_endpoint_dns}/v1beta1/{endpoint.resource_name}"

client = openai.OpenAI(base_url=endpoint_url, api_key=creds.token)

# Generate prediction
response = client.chat.completions.create(
    model="",
    messages=[{"role": "user", "content": "Tell me a joke about AI"}],
    temperature=0.7,
    max_tokens=35,
)

print(response.choices[0].message.content)

## Deploy Hugging Face models

Terraform also supports deploying models directly from the Hugging Face Hub using the `hugging_face_model_id` parameter.

### Basic Hugging Face deployment

Deploy a Qwen model from Hugging Face.

In [None]:
! rm -rf ./model-garden-terraform/02-deploy-hf && mkdir ./model-garden-terraform/02-deploy-hf

In [None]:
deploy_hf_config = """
# Configure the Terraform provider
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google-beta"
      version = "7.5.0"
    }
  }
}

# Configure the Google Cloud provider
provider "google" {
  project = var.project_id
  region  = var.region
}

# Define variables
variable "project_id" {
  description = "Google Cloud Project ID"
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "us-central1"
}

# Deploy Qwen model from Hugging Face
resource "google_vertex_ai_endpoint_with_model_garden_deployment" "qwen_deployment" {
  hugging_face_model_id = "Qwen/Qwen2.5-0.5B"
  location              = var.region

  model_config {
    accept_eula = true
  }
}

# Output the endpoint information
output "endpoint_id" {
  description = "The ID of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.qwen_deployment.id
}

output "endpoint_name" {
  description = "The name of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.qwen_deployment.deployed_model_display_name
}
"""

with open("./model-garden-terraform/02-deploy-hf/main.tf", "w") as f:
    f.write(deploy_hf_config)

### Set the variables

Use the same variables you used earlier to deploy Gemma.

In [None]:
! cp ./model-garden-terraform/01-basic-deployment/terraform.tfvars ./model-garden-terraform/02-deploy-hf/terraform.tfvars

### Deploy the Hugging Face model

In [None]:
! nohup bash -c "cd ./model-garden-terraform/02-deploy-hf && terraform init && terraform plan && terraform apply -auto-approve" > ./model-garden-terraform/02-deploy-hf/terraform.log 2>&1 &

In [None]:
# Follow the log while Terraform runs; interrupt the cell once the apply finishes
! tail -f ./model-garden-terraform/02-deploy-hf/terraform.log
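Because the apply runs in the background, `tail -f` keeps following the log until you interrupt it. As an alternative, you could inspect the log from Python. The sketch below assumes Terraform's usual messages (`Apply complete!` on success and `Error:` on failure) appear in the log; treat those marker strings, and the helper names, as assumptions of this example.

```python
from pathlib import Path


def apply_status(log_text: str) -> str:
    """Classify a terraform apply log as 'complete', 'failed', or 'running'."""
    if "Apply complete!" in log_text:
        return "complete"
    if "Error:" in log_text:
        return "failed"
    return "running"


def read_status(log_path: str) -> str:
    """Return the status for a log file, or 'not started' if it does not exist."""
    path = Path(log_path)
    if not path.exists():
        return "not started"
    return apply_status(path.read_text())


print(read_status("./model-garden-terraform/02-deploy-hf/terraform.log"))
```

You could call `read_status` in a loop with `time.sleep` between checks to wait for the deployment without blocking on `tail`.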

## Advanced scenarios

The Terraform resource supports advanced deployment configurations including custom machine types, accelerators, replica counts, and more.

### Deploy with custom compute resources

Here's an example deploying a PaliGemma model with specific machine types and GPU accelerators.


#### Configuration options

Inside `deploy_config`, the `dedicated_resources` block supports the following options:

- **machine_spec**: Defines the compute resources
  - `machine_type`: Machine type (e.g., `g2-standard-16`, `n1-standard-4`)
  - `accelerator_type`: GPU type (e.g., `NVIDIA_L4`, `NVIDIA_TESLA_T4`)
  - `accelerator_count`: Number of GPUs per replica

- **min_replica_count**: Minimum number of replicas (for autoscaling)
- **max_replica_count**: Maximum number of replicas (for autoscaling)


> **Note**: This deployment uses a `g2-standard-16` machine with NVIDIA L4 GPU. Make sure you have sufficient quota for these resources in your project.

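To make the nesting of these options concrete, here is a small sketch that renders a `deploy_config` block from Python values, in the same spirit as the string templates used in this notebook. The function `render_deploy_config` is hypothetical, and its validation rules are this example's own sanity checks, not behavior of the Terraform provider.

```python
def render_deploy_config(
    machine_type: str,
    accelerator_type: str,
    accelerator_count: int = 1,
    min_replica_count: int = 1,
    max_replica_count=None,
) -> str:
    """Render a deploy_config block; max_replica_count is omitted when None."""
    if accelerator_count < 1 or min_replica_count < 1:
        raise ValueError("accelerator_count and min_replica_count must be >= 1")
    if max_replica_count is not None and max_replica_count < min_replica_count:
        raise ValueError("max_replica_count must be >= min_replica_count")

    lines = [
        "deploy_config {",
        "  dedicated_resources {",
        "    machine_spec {",
        f'      machine_type      = "{machine_type}"',
        f'      accelerator_type  = "{accelerator_type}"',
        f"      accelerator_count = {accelerator_count}",
        "    }",
        f"    min_replica_count = {min_replica_count}",
    ]
    if max_replica_count is not None:
        lines.append(f"    max_replica_count = {max_replica_count}")
    lines += ["  }", "}"]
    return "\n".join(lines)


print(render_deploy_config("g2-standard-16", "NVIDIA_L4", max_replica_count=3))
```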
In [None]:
! rm -rf ./model-garden-terraform/03-deployment-with-config && mkdir ./model-garden-terraform/03-deployment-with-config

In [None]:
deploy_with_config = """
# Configure the Terraform provider
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "7.5.0"
    }
  }
}

# Configure the Google Cloud provider
provider "google" {
  project = var.project_id
  region  = var.region
}

# Define variables
variable "project_id" {
  description = "Google Cloud Project ID"
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "us-central1"
}

# Deploy PaliGemma with custom compute resources
resource "google_vertex_ai_endpoint_with_model_garden_deployment" "paligemma_deployment" {
  publisher_model_name = "publishers/google/models/paligemma@paligemma-224-float32"
  location             = var.region

  model_config {
    accept_eula = true
  }

  deploy_config {
    dedicated_resources {
      machine_spec {
        machine_type      = "g2-standard-16"
        accelerator_type  = "NVIDIA_L4"
        accelerator_count = 1
      }
      min_replica_count = 1
      max_replica_count = 3
    }
  }
}

# Output the endpoint information
output "endpoint_id" {
  description = "The ID of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.paligemma_deployment.id
}

output "endpoint_name" {
  description = "The name of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.paligemma_deployment.deployed_model_display_name
}
"""

with open("./model-garden-terraform/03-deployment-with-config/main.tf", "w") as f:
    f.write(deploy_with_config)

#### Set the variables

Use the same variables you used earlier to deploy Gemma.

In [None]:
! cp ./model-garden-terraform/01-basic-deployment/terraform.tfvars ./model-garden-terraform/03-deployment-with-config/terraform.tfvars

#### Deploy the model


In [None]:
! nohup bash -c "cd ./model-garden-terraform/03-deployment-with-config && terraform init && terraform plan && terraform apply -auto-approve" > ./model-garden-terraform/03-deployment-with-config/terraform.log 2>&1 &

#### Check the deployment status

In [None]:
# Show the endpoint outputs (populated once the background apply has finished)
! cd ./model-garden-terraform/03-deployment-with-config && terraform output

### Deploy multiple models

You can deploy multiple models in the same Terraform configuration. This example deploys both a Gemma text model and a PaliGemma vision model.

#### Append multiple models to the main module

Here, the two previous deployments are combined into a single `main.tf`.

In [None]:
! rm -rf ./model-garden-terraform/04-multiple-models && mkdir ./model-garden-terraform/04-multiple-models

In [None]:
deploy_multiple_models_config = """
# Configure the Terraform provider
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "7.5.0"
    }
  }
}

# Configure the Google Cloud provider
provider "google" {
  project = var.project_id
  region  = var.region
}

# Define variables
variable "project_id" {
  description = "Google Cloud Project ID"
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "us-central1"
}

# Deploy Gemma model
resource "google_vertex_ai_endpoint_with_model_garden_deployment" "gemma" {
  publisher_model_name = "publishers/google/models/gemma3@gemma-3-1b-it"
  location             = var.region

  model_config {
    accept_eula = true
  }
}

# Deploy PaliGemma model with custom resources
resource "google_vertex_ai_endpoint_with_model_garden_deployment" "paligemma" {
  publisher_model_name = "publishers/google/models/paligemma@paligemma-224-float32"
  location             = var.region

  model_config {
    accept_eula = true
  }

  deploy_config {
    dedicated_resources {
      machine_spec {
        machine_type      = "g2-standard-16"
        accelerator_type  = "NVIDIA_L4"
        accelerator_count = 1
      }
      min_replica_count = 1
    }
  }
}

# Output the Gemma endpoint information
output "gemma_endpoint_id" {
  description = "The ID of the Gemma endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.gemma.id
}

output "gemma_endpoint_name" {
  description = "The name of the Gemma endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.gemma.deployed_model_display_name
}

# Output the PaliGemma endpoint information
output "paligemma_endpoint_id" {
  description = "The ID of the PaliGemma endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.paligemma.id
}

output "paligemma_endpoint_name" {
  description = "The name of the PaliGemma endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.paligemma.deployed_model_display_name
}
"""

with open("./model-garden-terraform/04-multiple-models/main.tf", "w") as f:
    f.write(deploy_multiple_models_config)

#### Set the variables

Use the same variables you used earlier to deploy Gemma.

In [None]:
! cp ./model-garden-terraform/01-basic-deployment/terraform.tfvars ./model-garden-terraform/04-multiple-models/terraform.tfvars

#### Deploy both models

> **Note**: Deploying multiple models will take longer (20-30 minutes total). Each model deploys to its own endpoint.


In [None]:
! nohup bash -c "cd ./model-garden-terraform/04-multiple-models && terraform init && terraform plan && terraform apply -auto-approve" > ./model-garden-terraform/04-multiple-models/terraform.log 2>&1 &

In [None]:
# Follow the log while Terraform runs; interrupt the cell once the apply finishes
! tail -f ./model-garden-terraform/04-multiple-models/terraform.log

#### Check the deployment status

In [None]:
! cd ./model-garden-terraform/04-multiple-models && terraform output
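With two models in one configuration, the output names carry a model prefix (`gemma_endpoint_id`, `paligemma_endpoint_id`, and so on). If you load the outputs into a plain name-to-value dictionary, for example from `terraform output -json`, a sketch like the following groups them per model. The helper `group_outputs_by_model` and the sample values are illustrative placeholders.

```python
from collections import defaultdict


def group_outputs_by_model(outputs: dict) -> dict:
    """Group '<model>_endpoint_<field>' outputs into {model: {endpoint_<field>: value}}."""
    grouped = defaultdict(dict)
    for name, value in outputs.items():
        model, separator, field = name.partition("_endpoint_")
        if separator:  # skip outputs that do not follow the naming scheme
            grouped[model][f"endpoint_{field}"] = value
    return dict(grouped)


# Placeholder values in the shape of this configuration's outputs.
sample_outputs = {
    "gemma_endpoint_id": "projects/my-project/locations/us-central1/endpoints/111",
    "gemma_endpoint_name": "gemma-display-name",
    "paligemma_endpoint_id": "projects/my-project/locations/us-central1/endpoints/222",
    "paligemma_endpoint_name": "paligemma-display-name",
}

for model, fields in group_outputs_by_model(sample_outputs).items():
    print(model, fields["endpoint_id"])
```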

### Deploy gated Hugging Face models

For gated models that require authentication, you'll need to provide a Hugging Face access token. This example deploys Meta's Llama model from Hugging Face.

#### Create the Terraform configuration

This configuration adds a sensitive `hugging_face_token` variable and passes it to the `hugging_face_access_token` argument in `model_config`.

In [None]:
! rm -rf ./model-garden-terraform/05-gated-hf && mkdir -p ./model-garden-terraform/05-gated-hf

In [None]:
hf_gated_deploy_config = """
# Configure the Terraform provider
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google-beta"
      version = "7.5.0"
    }
  }
}

# Configure the Google Cloud provider
provider "google" {
  project = var.project_id
  region  = var.region
}

# Define variables
variable "project_id" {
  description = "Google Cloud Project ID"
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "us-central1"
}

variable "hugging_face_token" {
  description = "Hugging Face access token for gated models"
  type        = string
  sensitive   = true
}

# Deploy a gated Hugging Face model (Meta Llama)
resource "google_vertex_ai_endpoint_with_model_garden_deployment" "llama_deployment" {
  hugging_face_model_id = "meta-llama/Llama-3.2-1B"
  location              = var.region

  model_config {
    accept_eula               = true
    hugging_face_access_token = var.hugging_face_token
  }
}

# Output the endpoint information
output "endpoint_id" {
  description = "The ID of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.llama_deployment.id
}

output "endpoint_name" {
  description = "The name of the deployed endpoint"
  value       = google_vertex_ai_endpoint_with_model_garden_deployment.llama_deployment.deployed_model_display_name
}
"""

with open("./model-garden-terraform/05-gated-hf/main.tf", "w") as f:
    f.write(hf_gated_deploy_config)

#### Create a variables file

Create a `terraform.tfvars` file to set your project-specific values. Here it also sets `hugging_face_token` from the `HF_TOKEN` Python variable.

In [None]:
# Authenticate with Hugging Face
from huggingface_hub import interpreter_login

# Get it from: https://huggingface.co/settings/tokens
interpreter_login()

In [None]:
# Set your Hugging Face access token
from huggingface_hub import get_token

HF_TOKEN = get_token()

In [None]:
deploy_vars = f"""
project_id="{PROJECT_ID}"
region="{LOCATION}"
hugging_face_token="{HF_TOKEN}"
"""

with open("./model-garden-terraform/05-gated-hf/terraform.tfvars", "w") as f:
    f.write(deploy_vars)
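Writing the token into `terraform.tfvars` leaves it on disk in plain text. Terraform also reads input variables from environment variables named `TF_VAR_<variable name>`, so an alternative sketch is to pass the token through the process environment instead. The helper `terraform_env` is hypothetical, and the token shown is a dummy value.

```python
import os
from typing import Mapping, Optional


def terraform_env(hf_token: str, base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with the Hugging Face token set for Terraform."""
    env = dict(os.environ if base_env is None else base_env)
    # Terraform maps TF_VAR_hugging_face_token to var.hugging_face_token.
    env["TF_VAR_hugging_face_token"] = hf_token
    return env


# Dummy token for demonstration; print only the variable names, never the value.
demo_env = terraform_env("hf_dummy_token", base_env={})
print(sorted(demo_env))
```

You could then pass the result as `env=` to `subprocess.run([...], env=terraform_env(HF_TOKEN))` and omit `hugging_face_token` from the variables file.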

#### Deploy the gated model

In [None]:
! nohup bash -c "cd ./model-garden-terraform/05-gated-hf && terraform init && terraform plan && terraform apply -auto-approve" > ./model-garden-terraform/05-gated-hf/terraform.log 2>&1 &

In [None]:
# Follow the log while Terraform runs; interrupt the cell once the apply finishes
! tail -f ./model-garden-terraform/05-gated-hf/terraform.log

## Cleaning up

To avoid incurring unnecessary charges, clean up the resources when you're done.

### Destroy specific resources

To destroy only specific resources, pass `-target` to `terraform destroy`. Terraform will show you the resources that will be deleted and ask for confirmation. Type `yes` to proceed.


In [None]:
# Destroy a specific endpoint
! cd ./model-garden-terraform/01-basic-deployment && terraform destroy -target=google_vertex_ai_endpoint_with_model_garden_deployment.gemma_deployment

### Destroy all resources

In [None]:
import os

def list_subfolders(folder_path):
    """Lists all subfolders in a given folder path."""
    return [
        os.path.join(folder_path, d)
        for d in os.listdir(folder_path)
        if os.path.isdir(os.path.join(folder_path, d))
    ]


# Root folder that contains one subfolder per Terraform deployment
folder_to_check = "./model-garden-terraform"
subfolders = list_subfolders(folder_to_check)

# Delete model
for folder in subfolders:
    print(f"Destroying model in {folder}...")
    ! cd {folder} && terraform destroy -auto-approve
    print(f"Destroyed model in {folder}!\n")
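Some subfolders may never have been applied, in which case there is nothing to destroy. With the default local backend, Terraform keeps its state in a `terraform.tfstate` file inside the working directory, so one way to narrow the loop is to visit only folders that contain that file. This is a sketch under that assumption; the helper name is illustrative, and the demo runs against a temporary directory.

```python
import os
import tempfile


def folders_with_state(root: str) -> list:
    """Return subfolders of `root` that contain a local terraform.tfstate file."""
    return sorted(
        os.path.join(root, name)
        for name in os.listdir(root)
        if os.path.isfile(os.path.join(root, name, "terraform.tfstate"))
    )


# Demonstrate with a throwaway directory tree.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "applied"))
    os.makedirs(os.path.join(demo_root, "never-applied"))
    open(os.path.join(demo_root, "applied", "terraform.tfstate"), "w").close()
    demo_result = [os.path.basename(p) for p in folders_with_state(demo_root)]
    print(demo_result)
```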

### Verify cleanup

After destruction, verify that resources were deleted.

In [None]:
# Verify Terraform state is clean
! cd ./model-garden-terraform/01-basic-deployment && terraform show
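For a programmatic check, `terraform show -json` emits the state as JSON, with managed resources listed under `values.root_module.resources` (and under `child_modules` for nested modules). The sketch below collects the resource addresses from such a document; an empty list suggests nothing is left in the state. The sample payloads are hand-written for illustration, and `resource_addresses` is a hypothetical helper.

```python
import json


def resource_addresses(show_json: str) -> list:
    """List resource addresses found in `terraform show -json` output."""
    state = json.loads(show_json)

    def walk(module: dict) -> list:
        addresses = [res["address"] for res in module.get("resources", [])]
        for child in module.get("child_modules", []):
            addresses.extend(walk(child))
        return addresses

    return walk(state.get("values", {}).get("root_module", {}))


# Hand-written samples: one state with a resource, one empty state.
with_resource = json.dumps(
    {
        "format_version": "1.0",
        "values": {
            "root_module": {
                "resources": [
                    {
                        "address": "google_vertex_ai_endpoint_with_model_garden_deployment.gemma_deployment"
                    }
                ]
            }
        },
    }
)
empty_state = json.dumps({"format_version": "1.0"})

print(resource_addresses(with_resource))
print(resource_addresses(empty_state))
```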

Again, you can also use the gcloud CLI.

```bash
# Check for remaining endpoints
! gcloud ai endpoints list --region=$LOCATION --project=$PROJECT_ID
```

## Best practices

When using Terraform for Model Garden deployments:

1. **Use version control**: Store your Terraform configurations in Git to track changes and enable collaboration
2. **Use remote state**: Configure remote state storage (e.g., Google Cloud Storage) for team environments
3. **Separate environments**: Use Terraform workspaces or separate directories for dev/staging/prod
4. **Use variables**: Parameterize your configurations with variables for reusability
5. **Tag resources**: Add labels to resources for better organization and cost tracking
6. **Plan before apply**: Always run `terraform plan` to preview changes before applying
7. **Secure secrets**: Use environment variables or secret management tools for sensitive data (API tokens, etc.)

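As an illustration of the remote state practice above, the sketch below renders a `backend "gcs"` block in the same style this notebook uses to generate its configurations. The bucket name and prefix are placeholders, the bucket would need to exist before `terraform init`, and `render_gcs_backend` is a hypothetical helper rather than part of any library.

```python
def render_gcs_backend(bucket: str, prefix: str) -> str:
    """Render a terraform block that stores state in a Cloud Storage bucket."""
    return (
        "terraform {\n"
        '  backend "gcs" {\n'
        f'    bucket = "{bucket}"\n'
        f'    prefix = "{prefix}"\n'
        "  }\n"
        "}\n"
    )


# Placeholder names; replace with a bucket you own.
backend_config = render_gcs_backend("my-terraform-state-bucket", "model-garden/dev")
print(backend_config)
```

Writing this string to a `backend.tf` file next to `main.tf` and re-running `terraform init` is one way to switch a deployment directory to shared state.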
## Next steps

Now that you've learned how to deploy models with Terraform, you can:

- Explore more models in the [Model Garden catalog](https://console.cloud.google.com/vertex-ai/model-garden)
- Learn about [Vertex AI Prediction](https://cloud.google.com/vertex-ai/docs/predictions/get-predictions) for inference
- Automate deployments with [CI/CD pipelines](https://cloud.google.com/docs/terraform/best-practices-for-terraform#cicd)
- Learn more about [Terraform on Google Cloud](https://cloud.google.com/docs/terraform)
