# Deploy DistilBERT for Text Classification on Vertex AI

This notebook demonstrates how to deploy a DistilBERT model for text classification on Vertex AI. The model used here, `distilbert-base-uncased-finetuned-sst-2-english`, is already fine-tuned for sentiment analysis (on SST-2, as its name indicates) and is deployed to a Vertex AI endpoint for online prediction.

## Installations

Before installing the packages, make sure you have the gcloud CLI installed: https://cloud.google.com/sdk/docs/install

Install the packages required to run this notebook.

In [1]:
! pip install --upgrade --quiet google-cloud-aiplatform google-cloud-storage "google-auth>=2.23.3"

## Set up Vertex AI and the SDK

Log in to gcloud and set up Application Default Credentials.

```bash
gcloud auth login 
gcloud auth application-default login
```

### Set up the SDK with your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)
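If you paste a project ID by hand, a quick format check can catch typos (or a leftover placeholder) before any `gcloud` call. The sketch below assumes the documented project ID format: 6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen. The function name `is_valid_project_id` is hypothetical, and the check says nothing about whether the project exists or whether you can access it.

```python
import re

# Assumed project ID format: 6-30 characters, lowercase letters, digits,
# and hyphens; must start with a letter and must not end with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(project_id: str) -> bool:
    """Return True if project_id matches the assumed format (hypothetical helper)."""
    return _PROJECT_ID_RE.fullmatch(project_id) is not None


# The unedited placeholder is rejected because of the brackets.
print(is_valid_project_id("[your-project-id]"))  # → False
```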

In [4]:
# PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
PROJECT_ID = "gcp-partnership-412108"  # @param {type:"string"}
REGION = "us-central1"  # @param {type: "string"}

# Set the project id
! gcloud config set project {PROJECT_ID} --quiet
# Set the region
! gcloud config set ai/region {REGION} --quiet


To update your Application Default Credentials quota project, use the `gcloud auth application-default set-quota-project` command.
Updated property [core/project].
Updated property [ai/region].


Initialize the Vertex AI SDK for Python with your project and region.

In [10]:
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)

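## Deploy model to Vertex AI

Create a new model resource. The serving container is configured through the `HF_MODEL_ID` and `HF_TASK` environment variables, which name the Hugging Face model and the task to serve.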

In [11]:
SERVING_CONTAINER_IMAGE_URI = "us-central1-docker.pkg.dev/gcp-partnership-412108/base-infernece-image/base-inference-image:latest"

model = aiplatform.Model.upload(
    display_name="distilbert-base-uncased-finetuned-sst-2-english",
    serving_container_image_uri=SERVING_CONTAINER_IMAGE_URI,
    serving_container_environment_variables={
        "HF_MODEL_ID": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
        "HF_TASK": "text-classification",
        },
)


model.wait()

print(model.display_name)
print(model.resource_name)

Creating Model
Create Model backing LRO: projects/755607090520/locations/us-central1/models/6133463987339132928/operations/947387906241069056
Model created. Resource name: projects/755607090520/locations/us-central1/models/6133463987339132928@1
To use this Model in another session:
model = aiplatform.Model('projects/755607090520/locations/us-central1/models/6133463987339132928@1')
distilbert-base-uncased-finetuned-sst-2-english
projects/755607090520/locations/us-central1/models/6133463987339132928


Create an endpoint and deploy the model to it. The deployment takes roughly 20-25 minutes; you can check its status in the Cloud console.

In [12]:
machine_type = 'g2-standard-4' # L4 GPUs
endpoint = aiplatform.Endpoint.create(display_name="distilbert-base-uncased-finetuned-sst-2-english-endpoint")

deployed_model = model.deploy(
    endpoint=endpoint,
    deployed_model_display_name="distilbert-base-uncased-finetuned-sst-2-english-deployed",
    machine_type=machine_type,
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
    traffic_percentage=100,
    min_replica_count=1,
    sync=True,
)

Creating Endpoint
Create Endpoint backing LRO: projects/755607090520/locations/us-central1/endpoints/5604226159937060864/operations/1005231013955108864
Endpoint created. Resource name: projects/755607090520/locations/us-central1/endpoints/5604226159937060864
To use this Endpoint in another session:
endpoint = aiplatform.Endpoint('projects/755607090520/locations/us-central1/endpoints/5604226159937060864')
Deploying model to Endpoint : projects/755607090520/locations/us-central1/endpoints/5604226159937060864
Deploy Endpoint model backing LRO: projects/755607090520/locations/us-central1/endpoints/5604226159937060864/operations/3921311772677505024
Endpoint model deployed. Resource name: projects/755607090520/locations/us-central1/endpoints/5604226159937060864
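The endpoint resource name printed in the log above is also what a raw REST client would need. As a rough sketch, assuming the standard regional Vertex AI REST host (`{location}-aiplatform.googleapis.com`) and the `v1` API, the predict URL can be derived from the resource name alone. The helper name `predict_url` is hypothetical and not part of the SDK.

```python
def predict_url(endpoint_resource_name: str) -> str:
    """Build the REST predict URL for an endpoint resource name (hypothetical helper).

    Expects: projects/{project}/locations/{location}/endpoints/{endpoint_id}
    """
    parts = endpoint_resource_name.split("/")
    if len(parts) != 6 or parts[0::2] != ["projects", "locations", "endpoints"]:
        raise ValueError(f"unexpected endpoint resource name: {endpoint_resource_name!r}")
    location = parts[3]
    # Assumption: regional host and v1 API version.
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"{endpoint_resource_name}:predict"
    )


print(predict_url(
    "projects/755607090520/locations/us-central1/endpoints/5604226159937060864"
))
```

A request to that URL would carry a JSON body with `instances` and, optionally, `parameters`, plus an OAuth bearer token; the SDK call in the next cell handles all of this for you.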


In [15]:
res = deployed_model.predict(instances=["I love this product", "I hate this product"], parameters={ "top_k": 2 })
res.predictions

[[{'score': 0.9998788833618164, 'label': 'POSITIVE'},
  {'score': 0.0001210561968036927, 'label': 'NEGATIVE'}],
 [{'score': 0.9997544884681702, 'label': 'NEGATIVE'},
  {'score': 0.0002454846107866615, 'label': 'POSITIVE'}]]
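Each entry in `res.predictions` is a list of `{'score', 'label'}` candidates for one input. A minimal sketch for reducing that to a single label per input, assuming the response shape shown above (the helper name `top_labels` is made up for illustration):

```python
def top_labels(predictions):
    """Return the highest-scoring label for each instance (hypothetical helper)."""
    return [
        max(candidates, key=lambda c: c["score"])["label"]
        for candidates in predictions
    ]


# Sample data copied from the output above.
sample = [
    [{"score": 0.9998788833618164, "label": "POSITIVE"},
     {"score": 0.0001210561968036927, "label": "NEGATIVE"}],
    [{"score": 0.9997544884681702, "label": "NEGATIVE"},
     {"score": 0.0002454846107866615, "label": "POSITIVE"}],
]
print(top_labels(sample))  # → ['POSITIVE', 'NEGATIVE']
```

Taking the maximum explicitly avoids relying on the candidates being sorted by score, which the output above suggests but does not guarantee.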

## Delete resources

Undeploy the model, then delete the endpoint and the model so they stop incurring charges.

In [16]:
deployed_model.undeploy_all()
deployed_model.delete()
model.delete()

Undeploying Endpoint model: projects/755607090520/locations/us-central1/endpoints/5604226159937060864
Undeploy Endpoint model backing LRO: projects/755607090520/locations/us-central1/endpoints/5604226159937060864/operations/1597595102442684416
Endpoint model undeployed. Resource name: projects/755607090520/locations/us-central1/endpoints/5604226159937060864
Deleting Endpoint : projects/755607090520/locations/us-central1/endpoints/5604226159937060864
Delete Endpoint  backing LRO: projects/755607090520/locations/us-central1/operations/6209281120870072320
Endpoint deleted. . Resource name: projects/755607090520/locations/us-central1/endpoints/5604226159937060864
Deleting Model : projects/755607090520/locations/us-central1/models/6133463987339132928
Delete Model  backing LRO: projects/755607090520/locations/us-central1/models/6133463987339132928/operations/3266460239360163840
Model deleted. . Resource name: projects/755607090520/locations/us-central1/models/6133463987339132928
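Because the endpoint runs on a GPU machine, it can be worth making sure the teardown runs even when an earlier step fails. One possible pattern, sketched below, wraps the work in `try`/`finally` and then repeats the same three calls as the cleanup cell, in the same order. The wrapper name `with_cleanup` and its `work` callback are assumptions for illustration; only `undeploy_all()` and `delete()` come from the notebook.

```python
def with_cleanup(endpoint, model, work):
    """Run work(endpoint), then tear down resources even if work raises.

    Hypothetical wrapper: `endpoint` and `model` are expected to behave like
    the objects returned by model.deploy() and aiplatform.Model.upload().
    """
    try:
        return work(endpoint)
    finally:
        # Same order as the cleanup cell: undeploy, delete endpoint, delete model.
        endpoint.undeploy_all()
        endpoint.delete()
        model.delete()
```

With the objects from this notebook, a call might look like `with_cleanup(deployed_model, model, lambda ep: ep.predict(instances=["I love this product"]).predictions)`.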
