# Deploy & Run Online Inference

Now that the model is trained, and we are (presumably) happy with the training results, we can deploy it to a Vertex AI endpoint and use online prediction to test it on a few sample data points.

In [5]:
PROJECT_NAME = 'ds-training-380514'
LOCATION = "us-central1"
MODEL_NAME = "beatles_automl_file_out_2200_tags"
TARGET_COLUMN = "Like_The_Beatles"

In [6]:
from google.cloud import aiplatform

In [7]:
def create_endpoint(
    project: str,
    display_name: str,
    location: str,
):
    """Create a Vertex AI endpoint in the given project and location."""
    
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint

In [4]:
# Note that you don't have to create an endpoint every time you run this notebook
# create_endpoint(PROJECT_NAME, f'{MODEL_NAME}_endpoint', LOCATION)

Creating Endpoint
Create Endpoint backing LRO: projects/354621994428/locations/us-central1/endpoints/1823759936992051200/operations/8554083709706305536
Endpoint created. Resource name: projects/354621994428/locations/us-central1/endpoints/1823759936992051200
To use this Endpoint in another session:
endpoint = aiplatform.Endpoint('projects/354621994428/locations/us-central1/endpoints/1823759936992051200')
beatles_automl_file_out_2200_tags_endpoint
projects/354621994428/locations/us-central1/endpoints/1823759936992051200


<google.cloud.aiplatform.models.Endpoint object at 0x7f33d17e7850> 
resource name: projects/354621994428/locations/us-central1/endpoints/1823759936992051200

In [8]:
def deploy_model(
    project: str,
    location: str,
    model_name: str,
    endpoint_name: str
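):
    """Deploy a trained model to an existing Vertex AI endpoint.

    Args:
        project: Your project ID or project number.
        location: Region where the model and endpoint live. For example, 'us-central1'.
        model_name: A fully-qualified model resource name or model ID.
        endpoint_name: A fully-qualified endpoint resource name or endpoint ID.
    """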

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model(model_name=model_name)
    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_name)

    model.deploy(
        endpoint=endpoint,
        machine_type="e2-standard-4"
    )

    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

In [7]:
# deploy_model(
#     PROJECT_NAME,
#     LOCATION,
#     "projects/354621994428/locations/us-central1/models/5489591077625135104",
#     "projects/354621994428/locations/us-central1/endpoints/1823759936992051200"
# )

Deploying model to Endpoint : projects/354621994428/locations/us-central1/endpoints/1823759936992051200
Deploy Endpoint model backing LRO: projects/354621994428/locations/us-central1/endpoints/1823759936992051200/operations/6910269845716074496
Endpoint model deployed. Resource name: projects/354621994428/locations/us-central1/endpoints/1823759936992051200
beatles_automl_file_out_2485_tags-automl
projects/354621994428/locations/us-central1/models/5489591077625135104


<google.cloud.aiplatform.models.Model object at 0x7f33d05a3ed0> 
resource name: projects/354621994428/locations/us-central1/models/5489591077625135104

Now that the model is deployed to the endpoint, we send each test data point to the Vertex AI online prediction service to predict whether that user would like the Beatles.

In [9]:
from typing import List, Dict

def predict_tabular_classification(
    project: str,
    location: str,
    endpoint_name: str,
    instances: List[Dict],
):
    """Request an online prediction from a deployed tabular classification model.

    Args:
        project: Your project ID or project number.
        location: Region where the endpoint is located. For example, 'us-central1'.
        endpoint_name: A fully qualified endpoint name or endpoint ID. Example: "projects/123/locations/us-central1/endpoints/456" or
               "456" when project and location are initialized or passed.
        instances: A list of one or more instances (examples) to return a prediction for.

    Returns:
        The first prediction in the response, or None if the response is empty.
    """
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_name)

    response = endpoint.predict(instances=instances)

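    # The return sits inside the loop, so only the first prediction is printed
    # and returned. That is enough here because each call sends one instance.
    for prediction_ in response.predictions:
        print(prediction_)
        return prediction_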

In [10]:
import pandas as pd

inference_sample = pd.read_feather('test_data/inference_sample.feather')

In [11]:
import json

In [12]:
inference_sample

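| Unnamed: 0 | user_name | 30_Seconds_to_Mars | 65daysofstatic | A_Perfect_Circle | A_Tribe_Called_Quest | ABBA | ACDC | Adele | Aerosmith | Air | ... | tag_shoegazer | tag_hair_metal | tag_rapcore | tag_underground_hip_hop | tag_symphonic_black_metal | tag_darkwave | tag_world | tag_latin | tag_spanish | Like_The_Beatles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | thegiant | 1.0 | | | | | | 11.0 | 1.0 | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True |
| 1 | nezter | | | | | | | | | 3.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | False |
| 2 | augustohp | | 52.0 | 502.0 | | 1.0 | 452.0 | 1.0 | 215.0 | 14.0 | ... | 0.0 | 2.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | True |
| 3 | stalphonzo | | | | | | 6.0 | | | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True |
| 4 | davenall | | | | | | | | | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False |
| 5 | Andy_Greenwell | | | | | | | | | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | True |
| 6 | lilyean | | | | | | | | | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False |
| 7 | absentbebnim | | | | | | | | | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False |
| 8 | adherr | | | | | | | | | | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | False |
| 9 | auserzz | | | | | | | 25.0 | | | ... | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | False |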


In [13]:
inference_results = []
for _, row in inference_sample.iterrows():
    # Cast every value to a string so the row serializes to a flat JSON object,
    # then send it to the endpoint as a single-instance request.
    instance = json.loads(row.astype(str).to_json())
    results = predict_tabular_classification(PROJECT_NAME, LOCATION, 'projects/354621994428/locations/us-central1/endpoints/1823759936992051200', [instance])
    inference_results.append(results)


{'classes': ['True', 'False'], 'scores': [0.2971574068069458, 0.7028425931930542]}
{'classes': ['True', 'False'], 'scores': [0.4739128947257996, 0.5260871648788452]}
{'classes': ['True', 'False'], 'scores': [0.9814208745956421, 0.01857918873429298]}
{'classes': ['True', 'False'], 'scores': [0.4892224967479706, 0.510777473449707]}
{'classes': ['True', 'False'], 'scores': [0.02431050315499306, 0.9756895899772644]}
{'classes': ['True', 'False'], 'scores': [0.03823720291256905, 0.9617628455162048]}
{'scores': [0.03098485246300697, 0.9690151214599609], 'classes': ['True', 'False']}
{'scores': [0.02387252263724804, 0.9761275053024292], 'classes': ['True', 'False']}
{'classes': ['True', 'False'], 'scores': [0.5519230365753174, 0.4480769634246826]}
{'classes': ['True', 'False'], 'scores': [0.04236870259046555, 0.9576313495635986]}
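
Each prediction printed above is a dict with parallel `classes` and `scores` lists, and the key order can vary between responses. A minimal sketch of turning one into a single predicted label follows; the helper name `top_class` is hypothetical, and it assumes the two lists always have the same length.

```python
from typing import Dict, Tuple


def top_class(prediction: Dict) -> Tuple[str, float]:
    """Return the (class label, score) pair with the highest score."""
    # Pair each class label with its score, then pick the highest-scoring pair.
    pairs = zip(prediction["classes"], prediction["scores"])
    return max(pairs, key=lambda pair: pair[1])


# First prediction from the output above
sample = {'classes': ['True', 'False'], 'scores': [0.2971574068069458, 0.7028425931930542]}
label, score = top_class(sample)
print(label, round(score, 3))  # False 0.703
```

Looking up by key rather than by position keeps the helper indifferent to whether `classes` or `scores` comes first in the dict.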


In [14]:
import json

with open('inference.json', 'w') as outfile:
    json.dump(inference_results, outfile)

### Undeploy Model from Vertex AI Endpoint
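The notebook does not show the undeploy step, so what follows is only a sketch of one way to do it, not the author's code. It assumes `endpoint` is a `google.cloud.aiplatform.Endpoint` and relies on its `undeploy_all()` and `delete()` methods; the wrapper name `undeploy_all_models` is hypothetical.

```python
def undeploy_all_models(endpoint, delete_endpoint: bool = False) -> None:
    """Undeploy every model from `endpoint`; optionally delete the endpoint too.

    `endpoint` is assumed to be a google.cloud.aiplatform.Endpoint.
    """
    # Undeploying is what stops the per-deployment hosting charges.
    endpoint.undeploy_all()
    if delete_endpoint:
        # Called after undeploying, so the endpoint no longer hosts any model.
        endpoint.delete()


# Usage, commented out so it does not run by accident:
# from google.cloud import aiplatform
# aiplatform.init(project=PROJECT_NAME, location=LOCATION)
# endpoint = aiplatform.Endpoint(
#     'projects/354621994428/locations/us-central1/endpoints/1823759936992051200'
# )
# undeploy_all_models(endpoint)
```

Leaving `delete_endpoint` off by default keeps the endpoint available for a later redeploy, in line with the earlier note that you don't have to create an endpoint every time.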

### Pricing Notes
Which resources incur costs? You pay for three main activities:
- Training the model
    - The price per node hour for tabular classification is $21.252, so every training run of the AutoML Beatles model is billed at that rate for each node hour it uses
- Deploying the model to an endpoint (models must be deployed before they can make either online predictions or online evaluations)
    - You pay for each model deployed to an endpoint, even if no prediction is made
    - You must undeploy your model to stop incurring further charges
    - Models that are not deployed or have failed to deploy are not charged
- Using the model to make predictions; this applies to both batch and online predictions (which seems excessive to me, since we're already paying to host the model at an endpoint, but whatever)
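
To make the training charge concrete, here is a rough back-of-the-envelope sketch using the $21.252 per node hour rate quoted above. The function name and the `node_hours` input are hypothetical, and actual billing may differ (rates change, and this ignores how usage is rounded or metered).

```python
TRAINING_RATE_USD_PER_NODE_HOUR = 21.252  # rate quoted in the pricing notes


def estimate_training_cost(
    node_hours: float,
    rate: float = TRAINING_RATE_USD_PER_NODE_HOUR,
) -> float:
    """Estimate the training charge in USD as node hours times the hourly rate."""
    if node_hours < 0:
        raise ValueError("node_hours must be non-negative")
    # Round to cents for readability; this is an estimate, not an invoice.
    return round(node_hours * rate, 2)


print(estimate_training_cost(1))  # 21.25
```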