In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI SDK: Training an AutoML text sentiment analysis model for online predictions

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/sdk_automl_text_sentiment_analysis_online.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/sdk_automl_text_sentiment_analysis_online.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
<a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/automl/sdk_automl_text_sentiment_analysis_online.ipynb" target='_blank'>
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>
<br/><br/><br/>

## Overview

This tutorial demonstrates how to use the Vertex AI SDK to train and deploy an [AutoML](https://cloud.google.com/vertex-ai/docs/start/automl-users) text sentiment analysis model and get online predictions from it.

Learn more about [Sentiment analysis for text data](https://cloud.google.com/vertex-ai/docs/training-overview#sentiment_analysis_for_text).

### Objective

In this tutorial, you learn how to create an AutoML text sentiment analysis model and deploy it for online predictions from a Python script using the Vertex AI SDK. You can alternatively create and deploy models using the `gcloud` command-line tool or online using the Cloud Console.

This tutorial uses the following Google Cloud ML services and resources:
- Vertex AI Datasets
- Vertex AI Training (AutoML)
- Vertex AI Model Registry
- Vertex AI Endpoints

The steps performed include:

- Create a `Vertex AI Dataset` resource.
- Create a training job for the AutoML model on the dataset.
- View the model evaluation metrics.
- Deploy the `Vertex AI Model` resource to a serving `Vertex AI Endpoint`.
- Make a prediction request to the deployed model.
- Undeploy the model from the endpoint.
- Clean up the resources.

### Dataset

The dataset used in this tutorial is a CSV file of 1,600,000 tweets with the columns `target` (the sentiment label), `ids`, `date`, `flag`, `user`, and `text`. The file is read from Google Drive, preprocessed, and written to a Cloud Storage bucket. In this tutorial, you use the tweet text and labels to build an AutoML text sentiment analysis model on Google Cloud.

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Installation

Install the latest versions of the Vertex AI SDK for Python and the Cloud Storage client library.

In [4]:
! gcloud auth login



In [2]:
from google.colab import auth
auth.authenticate_user()

In [5]:
import os

! pip3 install --upgrade --quiet google-cloud-aiplatform \
                                 google-cloud-storage

### Colab only: Uncomment the following cell to restart the kernel

In [None]:
# Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

## Before you begin

### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
!gcloud projects list

PROJECT_ID                 NAME                      PROJECT_NUMBER
appbank-8a219              AppBank                   377221485638
applicationbancaire-2aa4d  ApplicationBancaire       874295996718
applicationbancaire-ac99c  applicationBancaire       318454794710
applicationbancairedata    ApplicationBancaireData   11165248269
citric-biplane-401208      My First Project          80446538947
clean-sequencer-401208     My First Project          343807322783
fbapplicationbancaire      FbApplicationBancaire     241901972096
monsecondprojet-401209     monsecondprojet           955221964053
projet-tuto-401908         projet tuto               124483135830
stone-passage-401820       My First Project          435337108847
tweetssentimentsanalyses   tweetsSentimentsAnalyses  710936728289


In [None]:
!gcloud config list

[component_manager]
disable_update_check = True
[compute]
gce_metadata_read_timeout_sec = 0
[core]
account = karrytuba.test22@gmail.com

Your active configuration is: [default]


In [6]:
PROJECT_ID = "tweetssentimentsanalyses"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}

Updated property [core/project].


#### Region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [7]:
REGION = "us-central1"  # @param {type: "string"}

### Authenticate your Google Cloud account

Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.

**1. Vertex AI Workbench**
* Do nothing as you are already authenticated.

**2. Local JupyterLab instance:** run `gcloud auth login` in a terminal or notebook cell.

**3. Colab:** run `from google.colab import auth` followed by `auth.authenticate_user()`.

**4. Service account or other**
* See how to grant Cloud Storage permissions to your service account at https://cloud.google.com/storage/docs/gsutil/commands/iam#ch-examples.
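
The environment-dependent choice above can be sketched in Python. This is a minimal sketch, not a cell from the original notebook: the helper names `running_in_colab` and `authenticate` are made up for illustration, and the check assumes that the `google.colab` module is loaded only inside a Colab runtime.

```python
import sys


def running_in_colab() -> bool:
    """Return True when the google.colab module is loaded (assumed to mean a Colab runtime)."""
    return "google.colab" in sys.modules


def authenticate() -> str:
    """Authenticate in Colab; do nothing elsewhere and report which path was taken."""
    if running_in_colab():
        # Imported lazily because the module only exists inside Colab.
        from google.colab import auth

        auth.authenticate_user()
        return "colab"
    # Vertex AI Workbench is already authenticated; for a local JupyterLab
    # instance, run `gcloud auth login` yourself.
    return "skipped"


print(authenticate())
```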

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [8]:
BUCKET_URI = f"gs://bucket-{PROJECT_ID}"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

Creating gs://bucket-tweetssentimentsanalyses/...


In [9]:
# Preprocessing
import re  # Regular expressions, used for pattern matching and text cleanup.

import nltk
import pandas as pd

# Download the NLTK stop-word list: common words (e.g., "the", "is", "and")
# that are often removed from text before sentiment analysis or classification.
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem import SnowballStemmer

stop_words = set(stopwords.words('english'))  # A set keeps the membership test below fast.
stemmer = SnowballStemmer('english')  # Maps e.g. "running" to "run"; not applied in preprocess() below.

# Matches @usernames, URLs, and any run of non-alphanumeric characters.
text_cleaning_re = r"@\S+|https?:\S+|[^A-Za-z0-9]+"

def preprocess(text):
    # Lowercase the text, replace every match of text_cleaning_re with a single space,
    # then strip leading and trailing whitespace.
    text = re.sub(text_cleaning_re, ' ', str(text).lower()).strip()
    # Keep only the tokens that are not stop words and join them with single spaces.
    return " ".join(token for token in text.split() if token not in stop_words)

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
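
To see what the cleaning step does to a single tweet, here is a self-contained sketch that runs without NLTK. It assumes a tiny hand-picked stop-word set (`DEMO_STOP_WORDS`, hypothetical) in place of the full NLTK list, so its output on other inputs can differ from that of `preprocess`.

```python
import re

# Assumption: a tiny stand-in for NLTK's English stop-word list.
DEMO_STOP_WORDS = {"that", "s", "a", "is", "the"}

# Same idea as text_cleaning_re: @usernames, URLs, then any non-alphanumeric run.
CLEANING_RE = r"@\S+|https?:\S+|[^A-Za-z0-9]+"


def preprocess_demo(text, stop_words=DEMO_STOP_WORDS):
    """Lowercase, replace pattern matches with spaces, and drop stop words."""
    cleaned = re.sub(CLEANING_RE, " ", str(text).lower()).strip()
    return " ".join(t for t in cleaned.split() if t not in stop_words)


print(preprocess_demo("@switchfoot http://twitpic.com/2y1zl - Awww, that's a bummer."))
# awww bummer
```

The username and the URL disappear entirely, punctuation becomes whitespace, and the apostrophe in "that's" splits the word into "that" and "s", both of which are then filtered out as stop words.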


In [10]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [11]:
import pandas as pd

# DATASET
DATASET_COLUMNS = ["target", "ids", "date", "flag", "user", "text"]
DATASET_ENCODING = "ISO-8859-1"

# Read the CSV file
d_d  = pd.read_csv('/content/drive/MyDrive/data/sent_data.csv', encoding=DATASET_ENCODING, names=DATASET_COLUMNS)

d_d.head(7)

Unnamed: 0,target,ids,date,flag,user,text
0,0,1467810369,Mon Apr 06 22:19:45 PDT 2009,NO_QUERY,_TheSpecialOne_,"@switchfoot http://twitpic.com/2y1zl - Awww, t..."
1,0,1467810672,Mon Apr 06 22:19:49 PDT 2009,NO_QUERY,scotthamilton,is upset that he can't update his Facebook by ...
2,0,1467810917,Mon Apr 06 22:19:53 PDT 2009,NO_QUERY,mattycus,@Kenichan I dived many times for the ball. Man...
3,0,1467811184,Mon Apr 06 22:19:57 PDT 2009,NO_QUERY,ElleCTF,my whole body feels itchy and like its on fire
4,0,1467811193,Mon Apr 06 22:19:57 PDT 2009,NO_QUERY,Karoli,"@nationwideclass no, it's not behaving at all...."
5,0,1467811372,Mon Apr 06 22:20:00 PDT 2009,NO_QUERY,joy_wolf,@Kwesidei not the whole crew
6,0,1467811592,Mon Apr 06 22:20:03 PDT 2009,NO_QUERY,mybirch,Need a hug


In [12]:
# Keep only the two columns text and target (the label)
data = d_d[['text', 'target']]
data

Unnamed: 0,text,target
0,"@switchfoot http://twitpic.com/2y1zl - Awww, t...",0
1,is upset that he can't update his Facebook by ...,0
2,@Kenichan I dived many times for the ball. Man...,0
3,my whole body feels itchy and like its on fire,0
4,"@nationwideclass no, it's not behaving at all....",0
...,...,...
1599995,Just woke up. Having no school is the best fee...,4
1599996,TheWDB.com - Very cool to hear old Walt interv...,4
1599997,Are you ready for your MoJo Makeover? Ask me f...,4
1599998,Happy 38th Birthday to my boo of alll time!!! ...,4


In [None]:
# Work on an explicit copy so that the assignment below does not trigger
# pandas' SettingWithCopyWarning.
data = data.copy()
data['text'] = data['text'].map(preprocess)


In [None]:
data


Unnamed: 0,text,target
0,awww bummer shoulda got david carr third day,0
1,upset update facebook texting might cry result...,0
2,dived many times ball managed save 50 rest go ...,0
3,whole body feels itchy like fire,0
4,behaving mad see,0
...,...,...
1599995,woke school best feeling ever,4
1599996,thewdb com cool hear old walt interviews,4
1599997,ready mojo makeover ask details,4
1599998,happy 38th birthday boo alll time tupac amaru ...,4


In [None]:
data.groupby(['target']).count()

Unnamed: 0_level_0,text
target,Unnamed: 1_level_1
0,800000
4,800000


In [13]:
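# Relabel positive tweets from 4 to 1 (in the target column only) so that labels run from 0 to 1
data = data.replace({'target': {4: 1}})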
data.groupby(['target']).count()

Unnamed: 0_level_0,text
target,Unnamed: 1_level_1
0,800000
1,800000


### Import libraries

In [14]:
import google.cloud.aiplatform as aiplatform

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and corresponding bucket.

In [15]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

In [None]:
# Specify the bucket path and the file name
bucket_path = 'gs://bucket-tweetssentimentsanalyses/dataSentimentsAnalysies.csv'

# Save the DataFrame to the Cloud Storage bucket (writing to a gs:// path requires the gcsfs package)
data.to_csv(bucket_path, header=False, index=False)

### Define the constants

Set the constants that you use in this tutorial.

In [37]:
# Set the location of the CSV index file in Cloud Storage.
IMPORT_FILE = "gs://bucket-tweetssentimentsanalyses/final_dataset.csv"
# Set the max. sentiment score
SENTIMENT_MAX = 1


## Take a quick peek at your data

This tutorial reads its training data through a CSV index file stored in Cloud Storage (`IMPORT_FILE`).

Start by taking a quick peek at the data: count the number of examples by counting the rows in the CSV index file (`wc -l`), then print the first few rows.

In [None]:
FILE = IMPORT_FILE

count = ! gsutil cat $FILE | wc -l
print("Number of Examples", int(count[0]))

print("First 10 rows")
! gsutil cat $FILE | head
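
The same peek can be combined with a basic sanity check in plain Python. The following is a minimal sketch, assuming each row has the form `text,label` (optionally followed by further columns) and that labels should be integers between 0 and `SENTIMENT_MAX`. The function name `summarize_index` is hypothetical, and the sample rows are made up for the demonstration rather than read from the bucket.

```python
import csv
import io


def summarize_index(csv_text, sentiment_max):
    """Return (row count, 1-based numbers of rows whose label is not an int in [0, sentiment_max])."""
    n_rows, bad_rows = 0, []
    for n_rows, fields in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        try:
            label = int(fields[1])
        except (IndexError, ValueError):
            # Missing or non-numeric label column.
            bad_rows.append(n_rows)
            continue
        if not 0 <= label <= sentiment_max:
            bad_rows.append(n_rows)
    return n_rows, bad_rows


# Made-up sample: the third row keeps an out-of-range label on purpose.
sample = (
    "want pizza chips lashings salt vinager blah,0\n"
    "woke school best feeling ever,1\n"
    "ready mojo makeover ask details,4\n"
)
print(summarize_index(sample, sentiment_max=1))  # (3, [3])
```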

## Create the Dataset

Now, create a `Vertex AI Dataset` resource using the `create` method of the `TextDataset` class, which takes the following parameters:

- `display_name`: The human readable name for the dataset resource.
- `gcs_source`: A list of one or more dataset index files to import the data items into the dataset resource.
- `import_schema_uri`: The data labeling schema for the data items.

This operation may take several minutes.

In [None]:
from google.colab import auth
auth.authenticate_user()

In [21]:
dataset = aiplatform.TextDataset.create(
    display_name="sentimentsData5",
    gcs_source=[IMPORT_FILE],
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.sentiment,
)

print(dataset.resource_name)

In [24]:
from google.cloud import aiplatform

def list_datasets(project_id, region):
    # Initialize AI Platform client
    client = aiplatform.gapic.DatasetServiceClient(client_options={
        'api_endpoint': f'{region}-aiplatform.googleapis.com'
    })

    # List datasets
    parent = f"projects/{project_id}/locations/{region}"
    for dataset in client.list_datasets(parent=parent):
        print("Dataset name:", dataset.name)
        print("Dataset display name:", dataset.display_name)
        print("Dataset ID:", dataset.name.split('/')[-1])



list_datasets(PROJECT_ID, REGION)

Dataset name: projects/710936728289/locations/us-central1/datasets/4331377623553802240
Dataset display name: untitled_1704450913614
Dataset ID: 4331377623553802240
Dataset name: projects/710936728289/locations/us-central1/datasets/8327090839822008320
Dataset display name: untitled_1704449775720
Dataset ID: 8327090839822008320
Dataset name: projects/710936728289/locations/us-central1/datasets/2552455770742456320
Dataset display name: sentimentsData3
Dataset ID: 2552455770742456320
Dataset name: projects/710936728289/locations/us-central1/datasets/1362379569209802752
Dataset display name: sentimentsData2
Dataset ID: 1362379569209802752
Dataset name: projects/710936728289/locations/us-central1/datasets/3895654359605706752
Dataset display name: sentimentsData
Dataset ID: 3895654359605706752
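
The resource names printed above follow a `key/value/key/value` pattern, which is why `dataset.name.split('/')[-1]` yields the dataset ID. A small sketch of a more general parser follows; the helper name `parse_resource_name` is made up for illustration.

```python
def parse_resource_name(resource_name):
    """Split 'projects/P/locations/L/datasets/D' into a dict of its key/value parts."""
    parts = resource_name.split("/")
    if len(parts) % 2 != 0:
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    return dict(zip(parts[0::2], parts[1::2]))


name = "projects/710936728289/locations/us-central1/datasets/4331377623553802240"
parsed = parse_resource_name(name)
print(parsed["datasets"])   # 4331377623553802240
print(parsed["locations"])  # us-central1
```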


## Create and run training job

In this section, to train an AutoML model, you perform these steps:

1) Create a training job.
2) Run the job.

### Create a training job

An AutoML training job is created with the `AutoMLTextTrainingJob` class, with the following parameters:

- `display_name`: The human readable name for the training job resource.
- `prediction_type`: The type of task to train the model for.
  - `classification`: A text classification model.
  - `sentiment`: A text sentiment analysis model.
  - `extraction`: A text entity extraction model.
- `multi_label`: If a classification task, whether single (False) or multi-labeled (True).
- `sentiment_max`: If a sentiment analysis task, the maximum sentiment value.

In [25]:
job = aiplatform.AutoMLTextTrainingJob(
    display_name="claritin",
    prediction_type="sentiment",
    sentiment_max=SENTIMENT_MAX,
)

print(job)

<google.cloud.aiplatform.training_jobs.AutoMLTextTrainingJob object at 0x7f78bc3c9f30>


### Run the training job

Next, you run the training job by invoking the method `run`, with the following parameters:

- `dataset`: The `Dataset` resource to train the model.
- `model_display_name`: The human readable name for the trained model.
- `training_fraction_split`: The fraction of the dataset (between 0 and 1) to use for training.
- `test_fraction_split`: The fraction of the dataset to use for testing (holdout data).
- `validation_fraction_split`: The fraction of the dataset to use for validation.

When it completes, the `run` method returns the `Model` resource.

Running the training pipeline can take up to 180 minutes.
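
Because a training run is long, it can be worth checking the split values before starting it. The sketch below is a local sanity check only, under the assumption that each fraction should lie between 0 and 1 and that together they should not exceed 1. The helper name `check_splits` is hypothetical and not part of the Vertex AI SDK.

```python
def check_splits(training, validation, test, tolerance=1e-9):
    """Validate the three split fractions and return them as a dict."""
    fractions = {"training": training, "validation": validation, "test": test}
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} fraction must be between 0 and 1, got {value}")
    total = sum(fractions.values())
    # Allow for floating-point rounding when the fractions add up to exactly 1.
    if total > 1.0 + tolerance:
        raise ValueError(f"fractions must not sum to more than 1, got {total}")
    return fractions


splits = check_splits(training=0.8, validation=0.1, test=0.1)
print(splits)
```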

In [26]:
selected_dataset_id = '4331377623553802240'  # Replace with the ID of the dataset you want to use
dataset = aiplatform.TextDataset(selected_dataset_id)
model = job.run(
    dataset=dataset,
    model_display_name="claritin",
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
)

INFO:google.cloud.aiplatform.training_jobs:View Training:
https://console.cloud.google.com/ai/platform/locations/us-central1/training/1616940356698374144?project=710936728289
INFO:google.cloud.aiplatform.training_jobs:AutoMLTextTrainingJob projects/710936728289/locations/us-central1/trainingPipelines/1616940356698374144 current state:
PipelineState.PIPELINE_STATE_PENDING
INFO:google.cloud.aiplatform.training_jobs:AutoMLTextTrainingJob projects/710936728289/locations/us-central1/trainingPipelines/1616940356698374144 current state:
PipelineState.PIPELINE_STATE_PENDING
INFO:google.cloud.aiplatform.training_jobs:AutoMLTextTrainingJob projects/710936728289/locations/us-central1/trainingPipelines/1616940356698374144 current state:
PipelineState.PIPELINE_STATE_RUNNING
INFO:google.cloud.aiplatform.training_jobs:AutoMLTextTrainingJob projects/710936728289/locations/us-central1/trainingPipelines/1616940356698374144 current state:
PipelineState.PIPELINE_STATE_RUNNING
INFO:google.cloud.aiplatform.

## Review model evaluation scores

Once your model training has finished, you can review the evaluation scores.

First, you need to get a reference to the newly created model. As with datasets, you can either use the model variable you created when you trained the model, or you can list all of the models in your project and filter by display name.

In [27]:
# Get model resource ID
models = aiplatform.Model.list(filter="display_name=claritin")

# Get a reference to the Model Service client
client_options = {"api_endpoint": f"{REGION}-aiplatform.googleapis.com"}
model_service_client = aiplatform.gapic.ModelServiceClient(
    client_options=client_options
)

model_evaluations = model_service_client.list_model_evaluations(
    parent=models[0].resource_name
)
model_evaluation = list(model_evaluations)[0]
print(model_evaluation)

name: "projects/710936728289/locations/us-central1/models/7165488690913869824@1/evaluations/51091298865643520"
metrics_schema_uri: "gs://google-cloud-aiplatform/schema/modelevaluation/text_sentiment_metrics_1.0.0.yaml"
metrics {
  struct_value {
    fields {
      key: "confusionMatrix"
      value {
        struct_value {
          fields {
            key: "annotationSpecs"
            value {
              list_value {
                values {
                  struct_value {
                    fields {
                      key: "displayName"
                      value {
                        string_value: "0"
                      }
                    }
                    fields {
                      key: "id"
                      value {
                        string_value: "9213938501966364672"
                      }
                    }
                  }
                }
                values {
                  struct_value {
                    fields {
      
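
The evaluation above contains a `confusionMatrix` entry. As a sketch of how such a matrix can be read, the function below derives accuracy, precision, and recall from a list of rows. It assumes that `rows[i][j]` counts items whose true label is `labels[i]` and whose predicted label is `labels[j]`. The counts are made up for illustration and are not results from this notebook, and `summarize_confusion_matrix` is a hypothetical helper.

```python
def summarize_confusion_matrix(labels, rows):
    """Compute overall accuracy plus per-label precision and recall."""
    total = sum(sum(row) for row in rows)
    correct = sum(rows[i][i] for i in range(len(rows)))
    summary = {"accuracy": correct / total if total else 0.0}
    for i, label in enumerate(labels):
        actual = sum(rows[i])                    # items whose true label is `label`
        predicted = sum(row[i] for row in rows)  # items predicted as `label`
        summary[label] = {
            "recall": rows[i][i] / actual if actual else 0.0,
            "precision": rows[i][i] / predicted if predicted else 0.0,
        }
    return summary


# Made-up counts for the two labels "0" and "1".
demo = summarize_confusion_matrix(["0", "1"], [[80, 20], [10, 90]])
print(demo["accuracy"])     # 0.85
print(demo["0"]["recall"])  # 0.8
```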

## Deploy the model

Next, deploy your model to serve online predictions. To deploy the model, you invoke the `deploy` method of the model resource, which returns the endpoint that the model is deployed to.

**Note:** Normally, an endpoint is created beforehand and passed to `deploy()` as a reference. When no endpoint is given, the `deploy()` method creates one.

In [28]:
endpoint = model.deploy()

INFO:google.cloud.aiplatform.models:Creating Endpoint
INFO:google.cloud.aiplatform.models:Create Endpoint backing LRO: projects/710936728289/locations/us-central1/endpoints/1677074090730455040/operations/2681473495937843200
INFO:google.cloud.aiplatform.models:Endpoint created. Resource name: projects/710936728289/locations/us-central1/endpoints/1677074090730455040
INFO:google.cloud.aiplatform.models:To use this Endpoint in another session:
INFO:google.cloud.aiplatform.models:endpoint = aiplatform.Endpoint('projects/710936728289/locations/us-central1/endpoints/1677074090730455040')
INFO:google.cloud.aiplatform.models:Deploying model to Endpoint : projects/710936728289/locations/us-central1/endpoints/1677074090730455040
INFO:google.cloud.aiplatform.models:Deploy Endpoint model backing LRO: projects/710936728289/locations/us-central1/endpoints/1677074090730455040/operations/4226208168125923328
INFO:google.cloud.aiplatform.models:Endpoint model deployed. Resource name: projects/71093672828

## Send online prediction requests

In this step, you prepare some test instances from the dataset and send an online prediction request to your deployed model.

### Create test instances

You use an arbitrary example out of the dataset as a test item. Don't be concerned that the example was likely used in training the model. It is just to demonstrate how to make a prediction.

In [39]:
import csv

# Read the first row of the index file.
first_row = ! gsutil cat $IMPORT_FILE | head -n1

# Parse it with the csv module so that quoted text containing commas stays intact.
fields = next(csv.reader([first_row[0]]))

# A row with four fields starts with an extra leading column; drop it.
# Any trailing maximum-sentiment column is ignored.
if len(fields) == 4:
    fields = fields[1:]
test_item, test_label = fields[0], fields[1]

print(test_item, test_label)


want pizza chips lashings salt vinager blah 0


### Make the prediction request

Now that your model is deployed to an endpoint, you can send online prediction requests to the endpoint resource.

#### Request format

The format of each instance should be in JSON as below:

     { 'content': text_string }

Since the `predict()` method can take multiple instances, send your request as a list of one test instance.

#### Response

The response from the `predict()` call is a `Prediction` object with the following fields:

- `predictions`: A list with one entry per instance, each holding the predicted `sentiment` value.
- `deployed_model_id`: The Vertex AI identifier for the deployed `Model` resource which did the predictions.
- `model_version_id` and `model_resource_name`: The version and resource name of the model that served the request.
- `explanations`: `None` for this request.

In [40]:
instances_list = [{"content": test_item}]

prediction = endpoint.predict(instances_list)
print(prediction)

Prediction(predictions=[{'sentiment': 0.0}], deployed_model_id='2051923992918360064', model_version_id='1', model_resource_name='projects/710936728289/locations/us-central1/models/7165488690913869824', explanations=None)
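
Building the request and reading the sentiment values out of the response can be sketched offline as follows. `FakePrediction` is a hypothetical stand-in that mimics two of the fields printed above so that the example runs without a deployed endpoint. With a real endpoint, the object returned by `endpoint.predict()` would take its place.

```python
from collections import namedtuple

# Hypothetical stand-in exposing two of the fields shown in the printed response.
FakePrediction = namedtuple("FakePrediction", ["predictions", "deployed_model_id"])


def build_instances(texts):
    """Wrap each text in the {'content': ...} format expected by the request."""
    return [{"content": text} for text in texts]


def sentiments(prediction):
    """Extract the sentiment value of every prediction, in request order."""
    return [item["sentiment"] for item in prediction.predictions]


instances = build_instances(["want pizza chips lashings salt vinager blah"])
response = FakePrediction(
    predictions=[{"sentiment": 0.0}],
    deployed_model_id="2051923992918360064",
)
print(instances)
print(sentiments(response))  # [0.0]
```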


## Undeploy the model

After you explore the predictions, undeploy the model from the `Endpoint` resource. This deprovisions all compute resources and ends billing for the deployed model.

In [None]:
endpoint.undeploy_all()

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

- Vertex AI Dataset
- Vertex AI Model
- Vertex AI Endpoint
- AutoML Training Job
- Cloud Storage Bucket (set `delete_bucket` to **True** to delete the bucket)

In [None]:
delete_bucket = False

# Delete the dataset using the Vertex dataset object
dataset.delete()

# Delete the model using the Vertex model object
model.delete()

# Delete the endpoint using the Vertex endpoint object
endpoint.delete()

# Delete the AutoML or Pipeline training job
job.delete()

# Delete the Cloud storage bucket
if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil rm -r $BUCKET_URI