# Classification Model Using BQML, Deployed to Vertex AI (Manual)
This notebook walks through creating a BQML classification model for bank marketing data on Vertex AI Managed Notebooks. The goal is to predict whether a client targeted by a marketing campaign will subscribe to a new financial product (variable `y`).

[Data Source](https://archive.ics.uci.edu/ml/datasets/bank+marketing#) | [Raw Data: bank-additional](https://archive.ics.uci.edu/ml/machine-learning-databases/00222/)

*Note: rename the column `default` before loading the data, because `DEFAULT` is a reserved keyword in BigQuery. This notebook uses `defaulted`.*

## Environment Setup
This demo uses the UCI Bank Marketing data linked above, loaded into BigQuery as the tables `marketing_bank.training` and `marketing_bank.test`.

In [1]:
PROJECT_ID = 'sandbox-marina'  # replace with your project ID
LOCATION = 'us-central1'
DATASET_NAME = 'marketing_bank'
MODEL_NAME = 'cl_model_willbuy'

BUCKET_NAME = f'{PROJECT_ID}_{MODEL_NAME}'  # or replace with the name of an existing bucket
DATASET_ID = f'{PROJECT_ID}.{DATASET_NAME}'  # or replace with the ID of an existing dataset
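
Because `BUCKET_NAME` is derived from the project ID and the model name, it can be worth checking the result against the Cloud Storage naming rules before trying to create the bucket. The sketch below covers only the basic rules (length, allowed characters, first and last character, the reserved `goog` prefix and the word `google`). The helper name `looks_like_valid_bucket_name` is hypothetical and the check is not exhaustive.

```python
import re

# Partial check of Cloud Storage bucket naming rules (not exhaustive):
# 3-63 characters, only lowercase letters, digits, dashes, underscores and dots,
# starting and ending with a letter or digit. Longer dotted names are not handled.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic naming checks described above."""
    if name.startswith("goog") or "google" in name:
        return False
    return _BUCKET_RE.fullmatch(name) is not None


# Same construction as in the configuration cell, using the notebook's example values.
PROJECT_ID = "sandbox-marina"
MODEL_NAME = "cl_model_willbuy"
BUCKET_NAME = f"{PROJECT_ID}_{MODEL_NAME}"

print(BUCKET_NAME, looks_like_valid_bucket_name(BUCKET_NAME))
```

Bucket names are also globally unique, so a name that passes this check can still be rejected if someone else already owns it.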

### Create Bucket and Dataset 
*(If you haven't already)*

**Create New Bucket**

In [2]:
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)
new_bucket = storage_client.create_bucket(bucket, location=LOCATION)

**Create Dataset**

In [None]:
from google.cloud import bigquery

client = bigquery.Client()
dataset = bigquery.Dataset(DATASET_ID)
dataset.location = "US"
dataset = client.create_dataset(dataset, timeout=30) 

print("Created dataset {}.{}".format(client.project, dataset.dataset_id))

## Explore Data
Preview Data 

#@bigquery
SELECT * 
FROM `marketing_bank.training` LIMIT 5

What is our conversion rate? 

#@bigquery
SELECT
  COUNT(y) AS total_target_market,
  COUNTIF(y is TRUE) AS total_purchasers,
  ROUND( COUNTIF(y is TRUE) / COUNT(y),2) AS conversion_rate
FROM
    `marketing_bank.training` 
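
To make the aggregate semantics concrete, here is a small Python sketch of the same calculation on made-up labels (the values in `y` are illustrative and do not come from the dataset). `COUNT(y)` skips NULLs, and `COUNTIF(y IS TRUE)` counts only the rows where `y` is true.

```python
# Made-up labels for illustration only; None stands in for SQL NULL.
y = [True, False, False, None, False, True, False, False, False, False, False]

total_target_market = sum(1 for v in y if v is not None)  # COUNT(y) ignores NULLs
total_purchasers = sum(1 for v in y if v is True)         # COUNTIF(y IS TRUE)
conversion_rate = round(total_purchasers / total_target_market, 2)

print(total_target_market, total_purchasers, conversion_rate)
```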

## Create BQML Classification Model
Predict whether a targeted individual will or won't subscribe, using `logistic_reg` as the model type. Replace the model destination and name if needed.

* We use TRANSFORM in the model definition so that the same transformations are applied automatically during prediction and evaluation.
* We use ML.FEATURE_CROSS to cross `housing` with `loan` (whether the individual has a housing loan and a personal loan), and `month` with `day_of_week`.
* The model type is `logistic_reg` because this is a binary classification problem.
* To enable model-level explainability, we set ENABLE_GLOBAL_EXPLAIN to True.
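
Conceptually, a feature cross combines two categorical columns into a single feature, so the model can learn a separate weight for each combination of values. The Python sketch below illustrates the idea only. It is not how BigQuery ML implements `ML.FEATURE_CROSS`, and the helper name `feature_cross` and the sample rows are hypothetical.

```python
from collections import Counter


def feature_cross(row: dict, columns: list) -> str:
    """Join the values of `columns` into one combined categorical value."""
    return "_".join(str(row[c]) for c in columns)


# Hypothetical sample rows that reuse column names from the training table.
rows = [
    {"housing": "yes", "loan": "no"},
    {"housing": "no", "loan": "no"},
    {"housing": "yes", "loan": "yes"},
    {"housing": "yes", "loan": "no"},
]

crossed = [feature_cross(r, ["housing", "loan"]) for r in rows]
counts = Counter(crossed)
print(counts)
```

Each distinct crossed value (for example `yes_no`) then behaves like one category of a new categorical feature.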


In [3]:
# Model Name 
f'{DATASET_NAME}.{MODEL_NAME}'

'marketing_bank.cl_model_willbuy'

#@bigquery
CREATE OR REPLACE MODEL `marketing_bank.cl_model_willbuy`
TRANSFORM(
    ML.FEATURE_CROSS(STRUCT(housing,loan)) AS house_loan,
    ML.FEATURE_CROSS(STRUCT(month,day_of_week)) AS day_month, *
)
OPTIONS(
    model_type='logistic_reg',
    labels = ['y'],
    ENABLE_GLOBAL_EXPLAIN = True
    )
AS 
SELECT * 
FROM `marketing_bank.training`

## Evaluation 

#@bigquery
SELECT * FROM 
ML.EVALUATE (MODEL marketing_bank.cl_model_willbuy)
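
For a logistic regression model, `ML.EVALUATE` reports metrics such as precision, recall, accuracy and F1 score. As a reminder of how those are defined, the sketch below computes them from a confusion matrix. The counts are made up for illustration and are not results from this model, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1_score": f1_score,
    }


# Made-up counts: 30 true positives, 10 false positives,
# 20 false negatives, 140 true negatives.
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```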

## Explain Model

#@bigquery
SELECT * 
FROM ML.GLOBAL_EXPLAIN(MODEL marketing_bank.cl_model_willbuy)

## Predict

#@bigquery

SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(
    MODEL `marketing_bank.cl_model_willbuy`,
    (SELECT * FROM `marketing_bank.test`),
    STRUCT(0.2 AS threshold))
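
The `threshold` value sets the probability cutoff above which a row is labeled as the positive class. Lowering it from the default of 0.5 to 0.2 labels more clients as likely subscribers, which typically raises recall at the cost of precision. The sketch below shows the effect on a handful of made-up probabilities. The helper name `predict_label` is hypothetical, and the probabilities are not model output.

```python
def predict_label(probability: float, threshold: float = 0.5) -> bool:
    """Label a row as positive when its predicted probability exceeds the threshold."""
    return probability > threshold


# Made-up predicted probabilities of "will subscribe" for five clients.
probabilities = [0.05, 0.15, 0.25, 0.45, 0.70]

at_default = [predict_label(p) for p in probabilities]
at_lowered = [predict_label(p, threshold=0.2) for p in probabilities]

print("threshold 0.5:", sum(at_default), "positive predictions")
print("threshold 0.2:", sum(at_lowered), "positive predictions")
```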

# Export Model to Vertex AI 
While being able to TRANSFORM features in BigQuery is convenient, models that use TRANSFORM are not supported by the export functionality. So we re-create the model without the additional feature crosses.

#@bigquery
CREATE OR REPLACE MODEL `marketing_bank.cl_model_willbuy`
OPTIONS(
    model_type='logistic_reg',
    labels = ['y'],
    ENABLE_GLOBAL_EXPLAIN = True
    )
AS 
SELECT * 
FROM `marketing_bank.training` 

**Export to GCS then upload to Vertex AI**

In [4]:
!bq extract -m {PROJECT_ID}:{DATASET_NAME}.{MODEL_NAME} gs://{BUCKET_NAME}/{MODEL_NAME}

Waiting on bqjob_r2c8a72a3935ddc25_0000017c158ff537_1 ... (22s) Current status: DONE   


**Upload model to Vertex**

In [5]:
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=LOCATION)

# Register the exported TensorFlow SavedModel with Vertex AI,
# served from a prebuilt TensorFlow 2.3 CPU prediction container.
model = aiplatform.Model.upload(
    display_name=MODEL_NAME,
    artifact_uri=f'gs://{BUCKET_NAME}/{MODEL_NAME}',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-3:latest',
)
model.wait()

print(model.display_name)
print(model.resource_name)

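ImportError: cannot import name 'aiplatform' from 'google.cloud' (unknown location)

This error means the Vertex AI SDK is not installed in the notebook kernel. Install it with `pip install google-cloud-aiplatform`, restart the kernel, and re-run the cell.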
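
A quick pre-flight check can catch a missing package before the upload cell runs. The sketch below uses only the standard library. The helper name `is_importable` is hypothetical.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when a parent package (for example `google.cloud`) is itself missing.
        return False


if not is_importable("google.cloud.aiplatform"):
    print("google-cloud-aiplatform is not installed in this kernel.")
```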

**Deploy to an endpoint**

In [None]:
endpoint = aiplatform.Endpoint.create(
    display_name=f'{MODEL_NAME}_endpt',
    project=PROJECT_ID,
    location=LOCATION,
)

model.deploy(
    endpoint=endpoint,
    traffic_percentage=100,
    machine_type='n1-highcpu-2',
)

model.wait()
print(model.resource_name)

## Endpoint Prediction
Predict whether a client will subscribe by calling the endpoint's REST API.

In [None]:
%%writefile default-pred.json
{"instances": [{"age": 39, "job": "self-employed", "marital": "divorced", "education": "high.school", "defaulted": "no", "housing": "no",
                "loan": "no", "contact": "cellular", "month": "sep", "day_of_week": "tue", "duration": 261, "campaign": 1, "pdays": 3, "previous": 1,
                "poutcome": "success", "emp_var_rate": -3.4, "cons_price_idx": 92.379, "cons_conf_idx": -29.8, "euribor3m": 0.788, "nr_employed": 5018.5}]}

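Hand-written JSON is easy to get subtly wrong, and a single missing quote is enough to make the request fail. One alternative, sketched below, is to build the instance as a Python dict and let the `json` module serialize it. The field values match the request cell above, and the file name is the one the `curl` call reads.

```python
import json

# One prediction instance, using the same field values as the request cell above.
instance = {
    "age": 39, "job": "self-employed", "marital": "divorced",
    "education": "high.school", "defaulted": "no", "housing": "no",
    "loan": "no", "contact": "cellular", "month": "sep", "day_of_week": "tue",
    "duration": 261, "campaign": 1, "pdays": 3, "previous": 1,
    "poutcome": "success", "emp_var_rate": -3.4, "cons_price_idx": 92.379,
    "cons_conf_idx": -29.8, "euribor3m": 0.788, "nr_employed": 5018.5,
}
request_body = {"instances": [instance]}

# Serialize, then parse the text back to confirm it is valid JSON.
payload = json.dumps(request_body, indent=2)
parsed = json.loads(payload)

with open("default-pred.json", "w") as f:
    f.write(payload)

print(parsed["instances"][0]["day_of_week"])
```
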
In [None]:
ENDPOINT_ID=endpoint.resource_name

!curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/$ENDPOINT_ID:predict \
-d "@default-pred.json"
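
The request URL is built from the endpoint's region and its full resource name. The sketch below shows that construction, assuming the regional v1 REST path `https://{LOCATION}-aiplatform.googleapis.com/v1/{endpoint}:predict`. The helper name `predict_url` and the example resource name are hypothetical.

```python
def predict_url(location: str, endpoint_resource_name: str) -> str:
    """Build the regional v1 REST predict URL for a Vertex AI endpoint."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"{endpoint_resource_name}:predict"
    )


# Hypothetical resource name, in the format returned by `endpoint.resource_name`.
example_name = "projects/123456789/locations/us-central1/endpoints/987654321"
url = predict_url("us-central1", example_name)
print(url)
```

From Python, calling `endpoint.predict(instances=[...])` on the SDK's `Endpoint` object sends the same request without building the URL by hand.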

# Creating a Vertex AI Pipeline

Now that we have manually created a model and deployed it to a Vertex AI endpoint, we can build a pipeline that retrains it, adding MLOps capabilities for a production-grade ML workflow. See the `bqpipeline_demo.ipynb` notebook.

In [None]:
# Additional resources:
# https://www.qwiklabs.com/focuses/1794?parent=catalog
# https://cloud.google.com/bigquery-ml/docs/exporting-models
# https://docs.google.com/document/d/1wre9hLVx-H8syG-806UPWGJbDVGieIM5VvFKi8lbGtw/edit
