# **Lab 3:** Vertex AI Model Deployment
This lab deploys our trained BQML model to Vertex AI. We will then send inference requests to the deployed model and receive predictions in real time!

In [1]:
! pip install --quiet --upgrade google-cloud-aiplatform 

In [2]:
project_id   = ""
team_name    = "" 
location     = "us" #This is currently necessary
region       = "us-central1"

dataset_name = "datathon_ds_{}".format(team_name)
bucket_name  = "gs://{}_{}".format(project_id,dataset_name)

In [3]:
from typing import Dict, List, Union
from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value
from google.cloud import bigquery
from google.cloud.bigquery import Client, QueryJobConfig
import json

client = bigquery.Client(project=project_id)

In [4]:
! gcloud config set project $project_id

Updated property [core/project].


In [5]:
! gsutil mb -l $region $bucket_name

Creating gs://qwiklabs-gcp-01-723f91d866d9_datathon_ds_admin/...


TODO enable ml.googleapis.com API

## Deploying your BQML model to Vertex

**Step 1**: Export BQML model to GCS bucket

Export your trained BQML model to the previously created bucket by following the steps provided [[here]](https://cloud.google.com/bigquery/docs/exporting-models#export).
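The export can also be run from this notebook with BigQuery's `EXPORT MODEL` statement. The sketch below is a minimal example under stated assumptions: the model name `churn_logreg`, the team name `team1`, the project `my-project`, and the `model_export` folder are all hypothetical placeholders, so substitute the names from your own project.

```python
def build_export_sql(dataset: str, model: str, gcs_uri: str) -> str:
    """Return an EXPORT MODEL statement that writes the model to gcs_uri."""
    return f"EXPORT MODEL `{dataset}.{model}` OPTIONS(URI = '{gcs_uri}')"


def export_bqml_model(project: str, dataset: str, model: str, gcs_uri: str) -> None:
    """Run the export and block until the BigQuery job finishes."""
    # Imported lazily so build_export_sql works without the client library.
    from google.cloud import bigquery

    bq_client = bigquery.Client(project=project)
    bq_client.query(build_export_sql(dataset, model, gcs_uri)).result()


# Hypothetical names: replace with your own dataset, model, and bucket.
example_sql = build_export_sql(
    "datathon_ds_team1",
    "churn_logreg",
    "gs://my-project_datathon_ds_team1/model_export",
)
print(example_sql)
```

Calling `export_bqml_model(project_id, dataset_name, "<your-model-name>", bucket_name + "/model_export")` would then run the export with the variables defined at the top of this notebook.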

**Step 2**: Import model to Vertex AI Model Registry

We will now import the model into the Vertex AI Model Registry using the steps described [[here]](https://cloud.google.com/vertex-ai/docs/model-registry/import-model#import_a_model_using).

Under model settings select:
*   Model framework -> tensorflow
*   Model framework version -> 1.15
*   Accelerator type -> None
*   leave all other settings as default
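For reference, the same import can be sketched with the Vertex AI Python SDK. This is a hedged example rather than the lab's required path: the container URI below is an assumption about the pre-built TensorFlow 1.15 CPU serving image and should be checked against the current list of pre-built prediction containers, and the helper names are hypothetical.

```python
# Assumed pre-built serving image for TensorFlow 1.15 on CPU; verify the URI
# against the Vertex AI documentation before relying on it.
TF_1_15_CPU_IMAGE = "us-docker.pkg.dev/vertex-ai/prediction/tf-cpu.1-15:latest"


def import_model_kwargs(display_name: str, artifact_uri: str) -> dict:
    """Collect the arguments that mirror the console settings listed above."""
    return {
        "display_name": display_name,
        "artifact_uri": artifact_uri,  # GCS folder holding the exported model
        "serving_container_image_uri": TF_1_15_CPU_IMAGE,
    }


def import_model(project: str, region: str, display_name: str, artifact_uri: str):
    """Upload the exported model to the Vertex AI Model Registry."""
    # Imported lazily so import_model_kwargs works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=region)
    return aiplatform.Model.upload(**import_model_kwargs(display_name, artifact_uri))
```

`import_model` returns an `aiplatform.Model` object that can be deployed in the next step.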

**Step 3**: Deploy model to Vertex AI endpoint (~15 mins)

We will now deploy our model to an endpoint by following the steps provided [[here]](https://cloud.google.com/vertex-ai/docs/predictions/get-predictions#deploy_a_model_to_an_endpoint).

Under model settings select:
*   machine type -> n1-standard-2
*   leave all other settings as default
*   skip model monitoring section

Note that model deployment takes several minutes (~15 mins).
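The deployment can also be sketched with the SDK's `Model.deploy` method. The helper below is a minimal illustration, assuming `model` is an `aiplatform.Model` from the registry; the replica counts are assumptions chosen to match a single-node default, and `deploy_model` and `DEPLOY_SETTINGS` are hypothetical names.

```python
# Settings mirroring the console choices above; the replica counts are assumed.
DEPLOY_SETTINGS = {
    "machine_type": "n1-standard-2",
    "min_replica_count": 1,
    "max_replica_count": 1,
    "traffic_percentage": 100,
}


def deploy_model(model, endpoint=None, settings=None):
    """Deploy `model` and return the endpoint that serves it.

    When `endpoint` is None, the SDK creates a new endpoint for the model.
    The call blocks until the deployment finishes.
    """
    return model.deploy(endpoint=endpoint, **(settings or DEPLOY_SETTINGS))
```

The returned endpoint object exposes the endpoint ID needed for the prediction calls later in this lab.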




## Run inference on Deployed Model in real time

To run your first real-time prediction:
*   Select your model endpoint under deployments.
*   Select the `DEPLOY & TEST` sub-menu.
*   Under the `Test your model` section, paste the payload provided below.


```
{"instances": [{"country":"India","operating_system":"ANDROID","language":"en-us","cnt_user_engagement": 72,"cnt_level_start_quickplay": 0,"cnt_level_end_quickplay": 6,"cnt_level_complete_quickplay": 3,"cnt_level_reset_quickplay": 1,"cnt_post_score": 9,"cnt_spend_virtual_currency": 0,"cnt_ad_reward": 0,"cnt_challenge_a_friend": 0,"cnt_completed_5_levels": 1,"cnt_use_extra_steps": 0,"user_first_engagement": 1533434460293005}
]}
```

Congrats! You have successfully received your first prediction.
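The same single-instance request can also be sent from Python. The sketch below uses the SDK's high-level `Endpoint.predict` call; `predict_single` is a hypothetical helper name, and the endpoint ID must be copied from your own deployment.

```python
import json

# The same instance as in the console payload above.
instance = {
    "country": "India",
    "operating_system": "ANDROID",
    "language": "en-us",
    "cnt_user_engagement": 72,
    "cnt_level_start_quickplay": 0,
    "cnt_level_end_quickplay": 6,
    "cnt_level_complete_quickplay": 3,
    "cnt_level_reset_quickplay": 1,
    "cnt_post_score": 9,
    "cnt_spend_virtual_currency": 0,
    "cnt_ad_reward": 0,
    "cnt_challenge_a_friend": 0,
    "cnt_completed_5_levels": 1,
    "cnt_use_extra_steps": 0,
    "user_first_engagement": 1533434460293005,
}

# The request body always wraps the instances in a list.
request_body = {"instances": [instance]}


def predict_single(project: str, region: str, endpoint_id: str, instance: dict):
    """Send one instance to the endpoint and return its predictions."""
    # Imported lazily so the payload above can be built without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=region)
    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_id)
    return endpoint.predict(instances=[instance]).predictions


print(json.dumps(request_body)[:60])
```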

## Sending larger requests to the endpoint

In [6]:
query = f"""SELECT * FROM `{dataset_name}.cc_eval_dataset`"""
job = client.query(query)
df = job.to_dataframe()

In [7]:
df = df.drop(['user_pseudo_id', 'churned'], axis=1)


In [8]:
df.head(2)

Unnamed: 0,country,operating_system,language,cnt_user_engagement,cnt_level_start_quickplay,cnt_level_end_quickplay,cnt_level_complete_quickplay,cnt_level_reset_quickplay,cnt_post_score,cnt_spend_virtual_currency,cnt_ad_reward,cnt_challenge_a_friend,cnt_completed_5_levels,cnt_use_extra_steps,user_first_engagement
0,Germany,IOS,de-de,7,2,2,0,0,1,0,0,0,0,0,1529940146726009
1,United States,IOS,en-us,120,40,38,20,0,20,0,0,0,0,0,1528868733738017


In [9]:
df = df.dropna()

In [10]:
# format the dataframe for the endpoint
payload = json.loads(df.to_json(orient="records"))
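To see what this conversion produces, here is a small self-contained sketch on a toy DataFrame (the two rows are illustrative, not taken from the evaluation table). Each row becomes one dictionary, which is the per-instance format the endpoint expects, and rows with missing values are dropped first, as in the cell above.

```python
import json

import pandas as pd

# Toy data: the second row has a missing value and is removed by dropna().
toy = pd.DataFrame(
    {"country": ["Germany", None], "cnt_user_engagement": [7, 120]}
)
toy = toy.dropna()

# orient="records" yields one JSON object per row.
toy_payload = json.loads(toy.to_json(orient="records"))
print(toy_payload)
```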

In [11]:
def predict_custom_trained_model_sample(
    project: str,
    endpoint_id: str,
    instances: Union[Dict, List[Dict]],
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    """
    `instances` can be either single instance of type dict or a list
    of instances.
    """
    # The AI Platform services require regional API endpoints.
    client_options = {"api_endpoint": api_endpoint}
    # Initialize client that will be used to create and send requests.
    # This client only needs to be created once, and can be reused for multiple requests.
    client = aiplatform.gapic.PredictionServiceClient(client_options=client_options)
    # The format of each instance should conform to the deployed model's prediction input schema.
    instances = instances if isinstance(instances, list) else [instances]
    instances = [
        json_format.ParseDict(instance_dict, Value()) for instance_dict in instances
    ]
    parameters_dict = {}
    parameters = json_format.ParseDict(parameters_dict, Value())
    endpoint = client.endpoint_path(
        project=project, location=location, endpoint=endpoint_id
    )
    response = client.predict(
        endpoint=endpoint, instances=instances, parameters=parameters
    )
    print("response")
    print(" deployed_model_id:", response.deployed_model_id)
    # The predictions are a google.protobuf.Value representation of the model's predictions.
    predictions = response.predictions
    for prediction in predictions:
        print(" prediction:", dict(prediction))

In [12]:
predict_custom_trained_model_sample(
    project="<your-project-id>",
    endpoint_id="your-vertex-endpoint-id",
    location="us-central1",
    instances=payload
)

response
 deployed_model_id: 6358873766637862912
 prediction: {'predicted_churned': ['0'], 'churned_values': ['1', '0'], 'churned_probs': [0.2500426757360779, 0.7499573242639221]}
 prediction: {'predicted_churned': ['0'], 'churned_values': ['1', '0'], 'churned_probs': [0.2942146998622611, 0.705785300137739]}
 prediction: {'churned_probs': [0.2673844345280815, 0.7326155654719185], 'predicted_churned': ['0'], 'churned_values': ['1', '0']}
 prediction: {'churned_probs': [0.2683011743429922, 0.7316988256570078], 'predicted_churned': ['0'], 'churned_values': ['1', '0']}
 prediction: {'churned_values': ['1', '0'], 'predicted_churned': ['0'], 'churned_probs': [0.3544376882237117, 0.6455623117762883]}
 prediction: {'churned_values': ['1', '0'], 'churned_probs': [0.3698011653546153, 0.6301988346453847], 'predicted_churned': ['0']}
 prediction: {'churned_values': ['1', '0'], 'predicted_churned': ['0'], 'churned_probs': [0.3134395170988281, 0.686560482901172]}
 prediction: {'predicted_churned': [
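Each prediction lists the class labels in `churned_values` and the matching probabilities in `churned_probs`, and the key order varies between rows. A minimal sketch for pulling out the churn probability is shown below; `churn_probability` is a hypothetical helper, and it assumes the label `'1'` means the user churned.

```python
def churn_probability(prediction: dict, positive_label: str = "1") -> float:
    """Return the probability assigned to `positive_label`."""
    # Look the label up by value instead of relying on its position.
    idx = list(prediction["churned_values"]).index(positive_label)
    return float(prediction["churned_probs"][idx])


# First prediction from the output above.
sample = {
    "predicted_churned": ["0"],
    "churned_values": ["1", "0"],
    "churned_probs": [0.2500426757360779, 0.7499573242639221],
}
print(churn_probability(sample))
```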

## [Optional] Deploy a second version to the same endpoint

You can continue to iterate on your BQML models to improve the ROC metric. Once you are comfortable with your models, you can explore the advanced features of Vertex AI endpoints in the sections below.

Let's deploy a second version of our logistic model to the same endpoint and split the traffic 50% to each model.

Note that deploying a new version to the same endpoint takes less time (~5 mins).
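With the SDK, a split like this is expressed through the `traffic_split` argument of `Model.deploy`, where the key `"0"` stands for the model being deployed and the other keys are the IDs of models already on the endpoint. The helpers below are a hedged sketch with hypothetical names; the example ID is the `deployed_model_id` printed in the earlier prediction output.

```python
def build_traffic_split(existing_deployed_model_id: str, new_share: int = 50) -> dict:
    """Map deployed-model IDs to traffic percentages that sum to 100."""
    if not 0 <= new_share <= 100:
        raise ValueError("new_share must be between 0 and 100")
    # "0" refers to the model that is about to be deployed.
    return {"0": new_share, existing_deployed_model_id: 100 - new_share}


def deploy_second_version(new_model, endpoint, existing_deployed_model_id: str):
    """Deploy `new_model` to `endpoint`, sharing traffic with the current model."""
    return new_model.deploy(
        endpoint=endpoint,
        machine_type="n1-standard-2",
        traffic_split=build_traffic_split(existing_deployed_model_id),
    )


print(build_traffic_split("6358873766637862912"))
```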

## [Optional] Monitor prediction skew for incoming requests to our endpoint

In [None]:
#create df for data skew
df_data_skew = df.copy(deep=True)

In [1]:
from random import randrange

# add skew to the integer fields of our evaluation dataset 
df_data_skew['cnt_spend_virtual_currency'] = [ randrange(10000,100000)  for k in df_data_skew.index]
df_data_skew['cnt_user_engagement'] = [ randrange(2000,5000)  for k in df_data_skew.index]
df_data_skew['cnt_challenge_a_friend'] = [ randrange(10,20)  for k in df_data_skew.index]


In [None]:
df_data_skew.describe()

Unnamed: 0,cnt_user_engagement,cnt_level_start_quickplay,cnt_level_end_quickplay,cnt_level_complete_quickplay,cnt_level_reset_quickplay,cnt_post_score,cnt_spend_virtual_currency,cnt_ad_reward,cnt_challenge_a_friend,cnt_completed_5_levels,cnt_use_extra_steps,user_first_engagement
count,820.0,820.0,820.0,820.0,820.0,820.0,820.0,820.0,820.0,820.0,820.0,820.0
mean,3511.506098,9.279268,5.636585,2.170732,2.856098,4.947561,54794.231707,0.029268,14.593902,0.131707,0.371951,1532503222071694.0
std,858.023083,48.309329,16.776841,8.440707,40.023604,12.271887,25956.887559,0.375205,2.945946,0.352517,1.509253,3040558481159.4014
min,2000.0,0.0,0.0,0.0,0.0,0.0,10073.0,0.0,10.0,0.0,0.0,1528786836978009.0
25%,2773.5,1.0,0.0,0.0,0.0,0.0,33236.25,0.0,12.0,0.0,0.0,1529911261936505.2
50%,3542.5,2.0,1.0,0.0,0.0,2.0,54996.0,0.0,15.0,0.0,0.0,1531479896363501.0
75%,4228.0,6.0,4.0,2.0,0.0,5.0,76461.0,0.0,17.0,0.0,0.0,1535680959761751.5
max,4987.0,1255.0,255.0,176.0,1122.0,200.0,99929.0,7.0,19.0,2.0,18.0,1538603719745001.0
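A quick way to confirm that the injected skew is large is to compare column means before and after. The sketch below uses small toy frames with illustrative values; it is a rough sanity check, not the distance measure that model monitoring itself applies, and `mean_shift` is a hypothetical helper.

```python
import pandas as pd


def mean_shift(baseline: pd.DataFrame, skewed: pd.DataFrame, columns: list) -> pd.Series:
    """Return the ratio of the skewed mean to the baseline mean per column."""
    return skewed[columns].mean() / baseline[columns].mean()


# Toy values: baseline engagement counts versus counts in the skewed range.
baseline = pd.DataFrame({"cnt_user_engagement": [7, 120, 72]})
skewed = pd.DataFrame({"cnt_user_engagement": [2000, 3500, 4987]})

shift = mean_shift(baseline, skewed, ["cnt_user_engagement"])
print(shift)
```

In the notebook itself, the same helper could be called as `mean_shift(df, df_data_skew, ["cnt_user_engagement"])`.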


In [None]:
skew_payload = json.loads(df_data_skew.to_json(orient="records"))


In [None]:
predict_custom_trained_model_sample(
    project="<your-project-id>",
    endpoint_id="your-vertex-endpoint-id",
    location="us-central1",
    instances=skew_payload
)

response
 deployed_model_id: 6358873766637862912
 prediction: {'predicted_churned': ['0'], 'churned_values': ['1', '0'], 'churned_probs': [0.2355812018174638, 0.7644187981825361]}
 prediction: {'predicted_churned': ['1'], 'churned_probs': [0.5940648438790859, 0.4059351561209141], 'churned_values': ['1', '0']}
 prediction: {'churned_values': ['1', '0'], 'churned_probs': [0.6645167253144579, 0.3354832746855421], 'predicted_churned': ['1']}
 prediction: {'predicted_churned': ['0'], 'churned_probs': [0.2808192049872898, 0.7191807950127102], 'churned_values': ['1', '0']}
 prediction: {'churned_values': ['1', '0'], 'churned_probs': [0.3110666241688444, 0.6889333758311555], 'predicted_churned': ['0']}
 prediction: {'churned_values': ['1', '0'], 'predicted_churned': ['0'], 'churned_probs': [0.2578796000089605, 0.7421203999910395]}
 prediction: {'churned_values': ['1', '0'], 'churned_probs': [0.2295484150448799, 0.7704515849551201], 'predicted_churned': ['0']}
 prediction: {'churned_values': ['