# Model Training
## Contents
1. [BigQuery ML](#BQML)  
    1.1 [Training](#BQML_train)  
    1.2 [Evaluation](#BQML_eval)  
    1.3 [Prediction](#BQML_pred)
2. [AutoML Tables](#AutoMLTables)  
    2.1 [AutoML Tables UI](#AutoMLTablesUI)  
    2.2 [AutoML Tables API](#AutoMLTablesAPI)

In [5]:
from google.cloud import bigquery
import pandas as pd
import numpy as np
import sys

<a id='BQML'></a>
# BigQuery ML (BQML)
Reference the [CREATE MODEL syntax](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-create) to learn about additional model_options for your BigQuery ML model.  
<br/>
This is a great option if you are very comfortable with SQL and want to quickly iterate and test models.
<br/>
BigQuery ML takes care of the following preprocessing steps:
- Null imputation
- One-hot encoding  
<br/>
<a id='BQML_train'></a>  

## Train BQML Model
The below example assumes that you have already loaded a preprocessed table into BigQuery (See `Preprocessing.ipynb` for more information on preprocessing).  
If you want to do additional preprocessing in BigQuery, add the transformations to the `SELECT` statement.  
<br>The code sample below only trains a model if a model with the same name does not yet exist. This requirement ensures that we can compare model iterations. If you would like to train a new model, change `CREATE MODEL IF NOT EXISTS` to one of the following:
- `CREATE OR REPLACE MODEL [existing_model_name]`: overwrites the model with that name, if one exists
- `CREATE MODEL IF NOT EXISTS [new_model_name]`: creates a new model and leaves the old model untouched

In [None]:
sql = """
CREATE MODEL IF NOT EXISTS `test_upload.sample_model`
OPTIONS(
    MODEL_TYPE='logistic_reg',
    INPUT_LABEL_COLS = ['opened'],
    DATA_SPLIT_METHOD = 'CUSTOM',
    DATA_SPLIT_COL = 'eval'
    ) AS
SELECT * EXCEPT(campaign_send_dt, riid) # Use all columns as features besides key columns (campaign_send_dt and riid)
FROM `test_upload.pandas_table`
"""

client = bigquery.Client()
query_job = client.query(sql) # API request
result = query_job.to_dataframe()
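
The two variants described above differ only in the first clause of the statement. As a minimal sketch (the helper `build_create_model_sql` and the model name `sample_model_v2` are hypothetical and not part of the BigQuery client library), the statement can be assembled in Python so that switching between "keep" and "overwrite" is a single flag:

```python
def build_create_model_sql(model_name, source_table, label_col, split_col,
                           exclude_cols=(), replace=False):
    """Return a BigQuery ML CREATE MODEL statement as a string.

    replace=False -> CREATE MODEL IF NOT EXISTS (keeps an existing model)
    replace=True  -> CREATE OR REPLACE MODEL (overwrites an existing model)
    """
    if replace:
        create_clause = "CREATE OR REPLACE MODEL"
    else:
        create_clause = "CREATE MODEL IF NOT EXISTS"
    # Key columns are dropped from the feature set with SELECT * EXCEPT(...).
    select_clause = "SELECT *"
    if exclude_cols:
        select_clause += " EXCEPT({})".format(", ".join(exclude_cols))
    return (
        "{create} `{model}`\n"
        "OPTIONS(\n"
        "    MODEL_TYPE='logistic_reg',\n"
        "    INPUT_LABEL_COLS = ['{label}'],\n"
        "    DATA_SPLIT_METHOD = 'CUSTOM',\n"
        "    DATA_SPLIT_COL = '{split}'\n"
        "    ) AS\n"
        "{select}\n"
        "FROM `{table}`"
    ).format(create=create_clause, model=model_name, label=label_col,
             split=split_col, select=select_clause, table=source_table)


key_columns = ["campaign_send_dt", "riid"]

# Same statement as the cell above: train only if the model does not exist yet.
sql_keep = build_create_model_sql(
    "test_upload.sample_model", "test_upload.pandas_table",
    label_col="opened", split_col="eval", exclude_cols=key_columns)

# New iteration under a new (hypothetical) name; the old model is preserved.
sql_new = build_create_model_sql(
    "test_upload.sample_model_v2", "test_upload.pandas_table",
    label_col="opened", split_col="eval", exclude_cols=key_columns)

# Overwrite the existing model in place.
sql_overwrite = build_create_model_sql(
    "test_upload.sample_model", "test_upload.pandas_table",
    label_col="opened", split_col="eval", exclude_cols=key_columns,
    replace=True)

print(sql_overwrite)
```

The resulting string can be passed to `client.query(...)` exactly like the hand-written `sql` above.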

<a id='BQML_eval'></a>  
## Evaluate BQML Model
You have multiple options for analyzing a BQML model's evaluation metrics (e.g. precision and recall).  
<br/>
As long as you don't overwrite your old BQML models (i.e. by running `CREATE OR REPLACE MODEL...` and not using a new model name), you'll have a collection of old BigQuery models to reference and compare.

### Option #1: Via BigQuery UI
Evaluation metrics for each of your models can be found in the [BigQuery UI](https://console.cloud.google.com/bigquery) under the Evaluation tab.
<br>
<img src="img/eval_metrics_bqml.png" title="Eval Metrics"/>   
<br>
Available metrics include:
- ROC AUC
- Log loss
- Interactive (for different classification thresholds) precision, recall, accuracy, F1 score metrics
- Confusion matrix
- Precision-recall curve
- Precision and Recall vs. Threshold
- ROC Curve  
  
  
### Option #2: Via BigQuery ML
You can also access Evaluation Metrics using BQML queries, as shown in the samples below. More information about using `ML.EVALUATE` can be found [here](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-evaluate).  
  
You can either use the Python BigQuery API (from this notebook) or the [BigQuery UI](https://console.cloud.google.com/bigquery) to run these queries.

If you don't specify a table for `ML.EVALUATE`, the metrics will be based on the evaluation data (as specified during model training). If the provided or default table has more columns than were used for model training (e.g. key columns), the extra columns are ignored.

In [18]:
sql = """
SELECT *
FROM ML.EVALUATE(MODEL `test_upload.sample_model`)
"""
query_job = client.query(sql) # API request
result = query_job.to_dataframe()
print(result)

   precision    recall  accuracy  f1_score  log_loss   roc_auc
0   0.618956  0.279507  0.834747  0.385108  0.423249  0.573457


<br/>
You can also specify a table and/or custom threshold.  
  
If your source table has different column names or transformations than the table used for training, make sure to apply the same transformations and rename the columns before using it to query evaluation metrics.

In [20]:
sql = """
SELECT *
FROM ML.EVALUATE(MODEL `test_upload.sample_model`,
    (
    SELECT opened,
        hist_opens,
        hist_sends,
        hist_open_rate
    FROM `test_upload.pandas_table`
    WHERE eval),
    STRUCT(0.55 AS threshold))
"""
query_job = client.query(sql) # API request
result = query_job.to_dataframe()
print(result)

   precision    recall  accuracy  f1_score  log_loss   roc_auc
0   0.620039  0.276913  0.834707  0.382845  0.423249  0.573457


#### ROC Curve
`ML.ROC_CURVE` returns evaluation metrics for different classification thresholds.

In [23]:
sql = """
    SELECT
      *
    FROM
      ML.ROC_CURVE(MODEL `test_upload.sample_model`,
        TABLE `test_upload.pandas_table`)
"""
query_job = client.query(sql) # API request
result = query_job.to_dataframe()

In [24]:
result

Unnamed: 0,threshold,recall,false_positive_rate,true_positives,false_positives,true_negatives,false_negatives
0,0.960219,0.000107,0.0,2,0,81374,18624
1,0.743504,0.052668,0.002593,981,211,81163,17645
2,0.654836,0.12198,0.009131,2272,743,80631,16354
3,0.553906,0.272791,0.039939,5081,3250,78124,13545
4,0.488284,0.293407,0.045838,5465,3730,77644,13161
5,0.317057,0.348599,0.074078,6493,6028,75346,12133
6,0.239029,0.368034,0.088996,6855,7242,74132,11771
7,0.170302,0.378664,0.099037,7053,8059,73315,11573
8,0.167541,0.811178,0.585199,15109,47620,33754,3517
9,0.110966,0.923977,0.810283,17210,65936,15438,1416
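
The `recall` and `false_positive_rate` columns follow directly from the four count columns in each row. The sketch below (the helper name `roc_rates` is an illustrative assumption) recomputes both rates for the row with threshold 0.553906 using the standard definitions:

```python
def roc_rates(true_positives, false_positives, true_negatives, false_negatives):
    """Return (recall, false_positive_rate) for one threshold's counts."""
    # Recall: share of actual positives that were predicted positive.
    recall = true_positives / (true_positives + false_negatives)
    # False positive rate: share of actual negatives predicted positive.
    false_positive_rate = false_positives / (false_positives + true_negatives)
    return recall, false_positive_rate


# Counts copied from the row with threshold 0.553906 in the output above.
recall, fpr = roc_rates(true_positives=5081, false_positives=3250,
                        true_negatives=78124, false_negatives=13545)
print(round(recall, 6), round(fpr, 6))
```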


### Tuning probability threshold  
We can tune the threshold to achieve a target recall, accepting whatever precision results. Let's say that we want to identify at least 70% of opened emails, i.e. we want a recall of 0.7.   
<br/>We can identify this threshold by looking at the ROC table above, by referencing the interactive plots in the BigQuery UI, or by using the query below (as explained [here](https://towardsdatascience.com/how-to-tune-a-bigquery-ml-classification-model-to-achieve-a-desired-precision-or-recall-e4d40b93016a)).

In [28]:
sql = """
    WITH roc AS (
        SELECT
          *
        FROM
          ML.ROC_CURVE(MODEL `test_upload.sample_model`,
            (SELECT opened,
                hist_opens,
                hist_sends,
                hist_open_rate
            FROM `test_upload.pandas_table`
            WHERE eval = False)
            ))
    SELECT
        threshold,
        recall, false_positive_rate,
        ABS(recall - 0.7) AS from_desired_recall
    FROM roc
    ORDER BY from_desired_recall ASC
    LIMIT 1    
"""
query_job = client.query(sql) # API request
result = query_job.to_dataframe()
print(result)

   threshold    recall  false_positive_rate  from_desired_recall
0   0.167541  0.810714             0.586374             0.110714
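
The same selection can be done locally once the ROC rows are in Python. This is a minimal sketch using the `(threshold, recall)` pairs from the earlier `ML.ROC_CURVE` output; both helper names are hypothetical. The first helper mirrors the SQL (closest recall), while the second enforces "at least" the desired recall, which is a slightly stricter reading of the goal:

```python
# (threshold, recall) pairs copied from the ML.ROC_CURVE output shown earlier.
roc_rows = [
    (0.960219, 0.000107), (0.743504, 0.052668), (0.654836, 0.121980),
    (0.553906, 0.272791), (0.488284, 0.293407), (0.317057, 0.348599),
    (0.239029, 0.368034), (0.170302, 0.378664), (0.167541, 0.811178),
    (0.110966, 0.923977),
]


def closest_threshold(rows, desired_recall):
    """Row whose recall is closest to the desired recall (mirrors the SQL)."""
    return min(rows, key=lambda row: abs(row[1] - desired_recall))


def highest_threshold_with_recall(rows, min_recall):
    """Highest threshold whose recall is at least min_recall, or None."""
    candidates = [row for row in rows if row[1] >= min_recall]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[0])


print(closest_threshold(roc_rows, 0.7))
print(highest_threshold_with_recall(roc_rows, 0.7))
```

With only ten thresholds available, the recall jumps from about 0.38 to about 0.81, so neither approach can land close to 0.7 for this model.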


### Confusion Matrix
More information about BQML confusion matrices can be found [here](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-confusion).

In [43]:
sql = """
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `test_upload.sample_model`,
  (
    SELECT *
    FROM `test_upload.pandas_table`
    WHERE eval),
    STRUCT(0.55 AS threshold)
    )
"""
query_job = client.query(sql) # API request
result = query_job.to_dataframe()
result

Unnamed: 0,expected_label,_0,_1
0,0,19575,785
1,1,3345,1281
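
The confusion matrix contains everything needed to recompute the threshold-dependent metrics. The sketch below assumes that rows are expected labels and that columns `_0` and `_1` are predicted labels; the helper `binary_metrics` is hypothetical. Under that reading, the counts above reproduce the precision, recall, accuracy and F1 score that `ML.EVALUATE` reported earlier for the same threshold (0.55):

```python
def binary_metrics(tn, fp, fn, tp):
    """Compute standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1_score": f1_score,
    }


# expected_label 0: 19575 predicted 0 (TN), 785 predicted 1 (FP)
# expected_label 1: 3345 predicted 0 (FN), 1281 predicted 1 (TP)
metrics = binary_metrics(tn=19575, fp=785, fn=3345, tp=1281)
for name, value in metrics.items():
    print("{}: {:.6f}".format(name, value))
```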


### Feature Info
`ML.FEATURE_INFO`, as explained [here](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-feature), returns information about the input features used to train the model, including:
- input: name of the column
- min: sample minimum (NULL if categorical)
- max: sample maximum (NULL if categorical)
- mean: sample average (NULL if categorical)
- median: sample median (NULL if categorical)
- stddev: sample standard deviation (NULL if categorical)
- category_count: number of categories (NULL if not categorical)
- null_count: number of NULLs

In [34]:
sql = """
    SELECT
      *
    FROM
      ML.FEATURE_INFO(MODEL `test_upload.sample_model`)
"""

query_job = client.query(sql) # API request
result = query_job.to_dataframe()
result

Unnamed: 0,input,min,max,mean,median,stddev,category_count,null_count
0,hist_opens,0.0,8.0,0.228517,0.0,0.633862,,0
1,hist_sends,0.0,10.0,1.008878,1.0,1.283603,,0
2,hist_open_rate,0.0,1.0,0.215606,0.0,0.370378,,35713


### Weights
The `ML.WEIGHTS` function allows you to see the underlying weights used by a model during prediction, as explained [here](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-weights).  
  
Our example does not include categorical columns. However, if we want to look at weights for categorical (one-hot encoded) features, use the following query:
```
SELECT
  category,
  weight
FROM
  UNNEST((
    SELECT
      category_weights
    FROM
      ML.WEIGHTS(MODEL `[dataset_id].[model_name]`)
    WHERE
      processed_input = '[categorical_column]'))
```  
It's also very simple to look at the weights of numeric or boolean features, as shown below.

In [39]:
sql = """
    SELECT
      processed_input, weight
    FROM
      ML.WEIGHTS(MODEL `test_upload.sample_model`)
"""

query_job = client.query(sql) # API request
result = query_job.to_dataframe()
result

Unnamed: 0,processed_input,weight
0,hist_opens,0.518016
1,hist_sends,-0.094116
2,hist_open_rate,1.779359
3,__INTERCEPT__,-1.986794
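
To see how these weights turn into a prediction, the sketch below applies the usual logistic regression formula, `sigmoid(intercept + sum(weight * value))`, in plain Python. It rests on two assumptions that are not confirmed by this notebook: that the weights apply to unstandardized feature values, and that a NULL numeric feature is replaced by the training mean reported by `ML.FEATURE_INFO`. The function name is hypothetical. For a row with `hist_opens = 0`, `hist_sends = 0` and a NULL `hist_open_rate`, the result is very close to the 0.1675 probability that `ML.PREDICT` reports for such rows later in this notebook, which is at least consistent with those assumptions.

```python
import math

# Weights and intercept copied from the ML.WEIGHTS output above.
weights = {
    "hist_opens": 0.518016,
    "hist_sends": -0.094116,
    "hist_open_rate": 1.779359,
}
intercept = -1.986794

# Feature means copied from the ML.FEATURE_INFO output.
# Assumption: NULL numeric values are imputed with these means.
feature_means = {
    "hist_opens": 0.228517,
    "hist_sends": 1.008878,
    "hist_open_rate": 0.215606,
}


def predict_open_probability(row):
    """Logistic regression probability for one row (dict of feature values)."""
    z = intercept
    for name, weight in weights.items():
        value = row.get(name)
        if value is None:
            value = feature_means[name]  # assumed mean imputation
        z += weight * value
    return 1.0 / (1.0 + math.exp(-z))


prob = predict_open_probability(
    {"hist_opens": 0, "hist_sends": 0, "hist_open_rate": None})
print(round(prob, 4))
```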


<a id='BQML_pred'></a>  
## Predictions using BQML Model
The `ML.PREDICT` function can be used to predict outcomes using the model, as explained [here](https://cloud.google.com/bigquery-ml/docs/reference/standard-sql/bigqueryml-syntax-predict).

In [46]:
sql = """
    SELECT
      *
    FROM
      ML.PREDICT(MODEL `test_upload.sample_model`,
      (
        SELECT *
        FROM `test_upload.pandas_table`
        WHERE eval),
        STRUCT(0.55 AS threshold)
        )
"""

query_job = client.query(sql) # API request
result = query_job.to_dataframe()
result.head()

Unnamed: 0,predicted_opened,predicted_opened_probs,riid,campaign_send_dt,opened,hist_opens,hist_sends,hist_open_rate,eval
0,0,"[{'prob': 0.1675412616617039, 'label': 1}, {'p...",737847182,2018-01-01,0,0,0,,True
1,0,"[{'prob': 0.1675412616617039, 'label': 1}, {'p...",566134962,2018-01-01,0,0,0,,True
2,0,"[{'prob': 0.1675412616617039, 'label': 1}, {'p...",849236702,2018-01-01,0,0,0,,True
3,0,"[{'prob': 0.1675412616617039, 'label': 1}, {'p...",825551142,2018-01-01,0,0,0,,True
4,0,"[{'prob': 0.1675412616617039, 'label': 1}, {'p...",825759702,2018-01-01,0,0,0,,True


The predicted probabilities for each class are stored in a nested array. We can use BigQuery's `UNNEST` function to find the probabilities of an opened email.  

Notice that the name of the prediction column (`predicted_opened`) is formatted `predicted_[name_of_label_column]` and the column containing the nested probabilities (`predicted_opened_probs`) is formatted `predicted_[name_of_label_column]_probs`. You will need to replace the `opened` with the name of your label column in the code samples below.

In [49]:
sql = """WITH results  AS (
      SELECT
        *
      FROM
        ML.PREDICT(MODEL `test_upload.sample_model`,
        (
          SELECT *
          FROM `test_upload.pandas_table`
          WHERE eval),
          STRUCT(0.55 AS threshold)
          ))
    SELECT riid,
        campaign_send_dt,
        predicted_opened, # Replace with predicted_[name_of_label_column]
        probs.prob
    FROM results, UNNEST(predicted_opened_probs) as probs # Replace the column in UNNEST(...) with predicted_[name_of_label_column]_probs
    WHERE probs.label = 1"""

query_job = client.query(sql) # API request
result = query_job.to_dataframe()
result.head()

Unnamed: 0,riid,campaign_send_dt,predicted_opened,prob
0,737847182,2018-01-01,0,0.167541
1,566134962,2018-01-01,0,0.167541
2,849236702,2018-01-01,0,0.167541
3,825551142,2018-01-01,0,0.167541
4,825759702,2018-01-01,0,0.167541
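
If you have already pulled the raw `ML.PREDICT` output into Python, the nested probabilities can also be flattened locally instead of with `UNNEST`. This is a minimal sketch, assuming each entry of `predicted_opened_probs` is a dict with `prob` and `label` keys as in the truncated output shown earlier; the sample rows and the helper name are made up for illustration:

```python
def probability_for_label(probs, label):
    """Return the probability stored for `label`, or None if it is absent."""
    for entry in probs:
        if entry["label"] == label:
            return entry["prob"]
    return None


# Illustrative rows shaped like the ML.PREDICT output; the values are made up.
rows = [
    {"riid": 1, "predicted_opened_probs": [
        {"prob": 0.2, "label": 1}, {"prob": 0.8, "label": 0}]},
    {"riid": 2, "predicted_opened_probs": [
        {"prob": 0.7, "label": 1}, {"prob": 0.3, "label": 0}]},
]

# Replace `predicted_opened_probs` with predicted_[name_of_label_column]_probs.
open_probs = {
    row["riid"]: probability_for_label(row["predicted_opened_probs"], 1)
    for row in rows
}
print(open_probs)
```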


<a id='AutoMLTables'></a>
# AutoML Tables
AutoML Tables trains supervised learning models on structured data using neural network architecture search.  
  
Keep in mind that you can access the same datasets and models from both the AutoML Tables UI and the AutoML Tables API.  
  
Like BQML, AutoML Tables preserves your model versions so that you can compare evaluation metrics and predictions.  
<a id='AutoMLTablesUI'></a>
## Option #1: AutoML Tables UI
You can use the [AutoML Tables UI](https://console.cloud.google.com/automl-tables) to import data.
1. Select "Create Dataset" and name the dataset.
2. Select "Import data from BigQuery" and input the following values  
  -  BigQuery Project ID: levis-data-science-challenge
  -  BigQuery Dataset ID: [BigQuery_dataset_ID]
  -  BigQuery Table or View ID: [table_name_of_preprocessed_BQ_table]
3. Look at Schema tab (check Data Type and Nullability) and select a target column
  -  If you don't see the "Select a target" column, make sure your window is wide enough. 
  -  For this example dataset, select `opened` as the target column and press "Continue"
4. Look at Analyze tab (check for anomalies in the data)
5. Look at Train tab to start training job  
  -  Input numeric (1-72) budget
  -  Deselect key columns using "Input feature selection" (i.e. riid and campaign_send_dt)
  -  Select "Advanced options" to use a different optimization objective (e.g. Log loss) and/or turn off Early stopping if desired.
  - Select "Train"  
  
You will get an email when training completes. 
More information about training via the AutoML Tables UI can be found [here](https://cloud.google.com/automl-tables/docs/quickstart).  
  
<a id='AutoMLTablesAPI'></a>
## Option #2: AutoML Tables API  
Follow the example below to use the AutoML Tables API. Read the markdown cell above each code cell to make sure the code is updated for your own data.

In [None]:
%%bash
pip3 install google-cloud-automl

In [8]:
from google.cloud import automl_v1beta1
from google.cloud import bigquery
import os

Set `project_id` to your GCP project ID.

In [5]:
client = automl_v1beta1.AutoMlClient()
prediction_client = automl_v1beta1.PredictionServiceClient()
project_id = 'email-propensity-sandbox' # replace with your project ID
location = 'us-central1'
location_path = client.location_path(project_id, location)
location_path

Set `dataset_display_name` to your desired dataset name.

In [12]:
dataset_display_name = 'colab_trial11' # replace with your desired dataset name
create_dataset_response = client.create_dataset(
    location_path,
    {
        'display_name': dataset_display_name,
        'tables_dataset_metadata': {}
    }
)
dataset_name = create_dataset_response.name

Set `dataset_bq_input_uri` to `bq://[project_id].[dataset_id].[bq_table]`

In [9]:
dataset_bq_input_uri = 'bq://email-propensity-sandbox.test_upload.pandas_table'
# Define input configuration.
input_config = {
    'bigquery_source': {
        'input_uri': dataset_bq_input_uri
    }
}
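
Because the URI is just a formatted string, a typo in any of its three parts only surfaces once the import runs. A small sketch (both helper names are hypothetical, not part of the AutoML client) builds the URI and the input configuration from the separate IDs and rejects empty parts early:

```python
def bq_table_uri(project_id, dataset_id, table_id):
    """Build a bq://[project_id].[dataset_id].[bq_table] URI."""
    parts = (project_id, dataset_id, table_id)
    if not all(parts):
        raise ValueError("project_id, dataset_id and table_id must be non-empty")
    return "bq://{}.{}.{}".format(*parts)


def bq_input_config(project_id, dataset_id, table_id):
    """Return the input configuration dict in the shape used in the cell above."""
    return {
        "bigquery_source": {
            "input_uri": bq_table_uri(project_id, dataset_id, table_id)
        }
    }


example_config = bq_input_config(
    "email-propensity-sandbox", "test_upload", "pandas_table")
print(example_config)
```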

The below cell will continue running until the dataset is done uploading.

In [13]:
import_data_response = client.import_data(dataset_name, input_config)
print('Dataset import operation: {}'.format(import_data_response.operation))
# Wait until import is done.
import_data_result = import_data_response.result()
import_data_result

Dataset import operation: name: "projects/1030417721972/locations/us-central1/operations/TBL9051439204596187136"
metadata {
  type_url: "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata"
  value: "\032\014\010\347\236\263\352\005\020\370\355\330\324\003\"\014\010\347\236\263\352\005\020\370\355\330\324\003z\000"
}





### Review the table specs
Run the following code to list the table's column specs and the data type detected for each column.

In [15]:
import google.cloud.automl_v1beta1.proto.data_types_pb2 as data_types

# List table specs
list_table_specs_response = client.list_table_specs(dataset_name)
table_specs = [s for s in list_table_specs_response]
# List column specs
table_spec_name = table_specs[0].name
list_column_specs_response = client.list_column_specs(table_spec_name)
column_specs = {s.display_name: s for s in list_column_specs_response}
[(x, data_types.TypeCode.Name(
  column_specs[x].data_type.type_code)) for x in column_specs.keys()]

[('eval', 'CATEGORY'),
 ('hist_sends', 'CATEGORY'),
 ('campaign_send_dt', 'TIMESTAMP'),
 ('riid', 'FLOAT64'),
 ('hist_open_rate', 'FLOAT64'),
 ('hist_opens', 'CATEGORY'),
 ('opened', 'CATEGORY')]
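
With many columns it is easy to miss a wrongly detected type. The sketch below compares the detected types against the types you expect and lists the mismatches. The `expected_types` mapping is an assumption for this example dataset (the two count columns treated as numeric), and `find_type_mismatches` is a hypothetical helper:

```python
# Detected types copied from the output above.
detected_types = [
    ("eval", "CATEGORY"),
    ("hist_sends", "CATEGORY"),
    ("campaign_send_dt", "TIMESTAMP"),
    ("riid", "FLOAT64"),
    ("hist_open_rate", "FLOAT64"),
    ("hist_opens", "CATEGORY"),
    ("opened", "CATEGORY"),
]

# Assumption: the historical count columns should be treated as numeric.
expected_types = {"hist_sends": "FLOAT64", "hist_opens": "FLOAT64"}


def find_type_mismatches(detected, expected):
    """Return (column, detected_type, expected_type) for every mismatch."""
    return [
        (name, found, expected[name])
        for name, found in detected
        if name in expected and expected[name] != found
    ]


mismatches = find_type_mismatches(detected_types, expected_types)
for name, found, wanted in mismatches:
    print("{}: detected {}, expected {}".format(name, found, wanted))
```

Each reported column is a candidate for a column spec update.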

### Update dataset (labels and data type)
AutoML Tables automatically detects your data column type.  
  
Depending on the type of your label column, AutoML Tables chooses to run a classification or regression model.  
  
Set `label_column_name` to the name of your label column.  


In [16]:
label_column_name = 'opened' # replace with name of label column
label_column_spec = column_specs[label_column_name]
label_column_id = label_column_spec.name.rsplit('/', 1)[-1]
print('Label column ID: {}'.format(label_column_id))

update_dataset_dict = {
    'name': dataset_name,
    'tables_dataset_metadata': {
        'target_column_spec_id': label_column_id
    }
}
update_dataset_response = client.update_dataset(update_dataset_dict)
update_dataset_response

Label column ID: 532168025890095104


name: "projects/1030417721972/locations/us-central1/datasets/TBL1257058449896767488"
display_name: "colab_trial11"
create_time {
  seconds: 1565314918
  nanos: 298986000
}
etag: "AB3BwFo49585RFFdUZj2yP0L7oUHTQvJnlMnuItBtoQ1_PazB0cvsrg_j-ehZUqJRefx"
example_count: 100000
tables_dataset_metadata {
  primary_table_spec_id: "3736914567368802304"
  target_column_spec_id: "532168025890095104"
  stats_update_time {
    seconds: 1565315026
    nanos: 184000000
  }
}

If any of your features were detected as the wrong data type, use the below code to change the data type. Valid data_type options can be found [here](https://cloud.google.com/automl-tables/docs/reference/rpc/google.cloud.automl.v1beta1#google.cloud.automl.v1beta1.TypeCode).  
  
Set `column_to_category` to the name of the relevant column and `type_code` to a string representation of your desired data type.

In [18]:
column_to_category = 'hist_sends'

update_column_spec_dict = {
    "name": column_specs[column_to_category].name,
    "data_type": {
        "type_code": "FLOAT64"
    }
}
update_column_response = client.update_column_spec(update_column_spec_dict)
update_column_response.display_name , update_column_response.data_type

('hist_sends', type_code: FLOAT64)

In [19]:
column_to_category = 'hist_opens'

update_column_spec_dict = {
    "name": column_specs[column_to_category].name,
    "data_type": {
        "type_code": "FLOAT64"
    }
}
update_column_response = client.update_column_spec(update_column_spec_dict)
update_column_response.display_name , update_column_response.data_type

('hist_opens', type_code: FLOAT64)

Let's check the table schema again to see the new values.

In [20]:
# List table specs
list_table_specs_response = client.list_table_specs(dataset_name)
table_specs = [s for s in list_table_specs_response]
# List column specs
table_spec_name = table_specs[0].name
list_column_specs_response = client.list_column_specs(table_spec_name)
column_specs = {s.display_name: s for s in list_column_specs_response}
[(x, data_types.TypeCode.Name(
  column_specs[x].data_type.type_code)) for x in column_specs.keys()]

[('eval', 'CATEGORY'),
 ('hist_sends', 'FLOAT64'),
 ('campaign_send_dt', 'TIMESTAMP'),
 ('riid', 'FLOAT64'),
 ('hist_open_rate', 'FLOAT64'),
 ('hist_opens', 'FLOAT64'),
 ('opened', 'CATEGORY')]

### Train the model
Set `model_display_name` to your desired name for the model.  
Set `model_train_hours` to a training budget which should be an integer between 1 and 72.  
Set `model_optimization_objective` to your desired optimization objective. Options for binary classification are "MAXIMIZE_AU_ROC" (default), "MINIMIZE_LOG_LOSS", or "MAXIMIZE_AU_PRC".  
Set `columns_to_ignore` to your key columns, or any other columns that should not be used as features.  
  
You can stop the cell and the model will continue training.
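
Before submitting the training job, the inputs described above can be sanity-checked locally. This is a rough sketch with hypothetical helper names; it checks the 1 to 72 hour budget range, converts hours to milli node hours (hours × 1000), checks the objective against the three binary classification options listed above, and derives the feature list by removing the label and the ignored columns:

```python
VALID_OBJECTIVES = {"MAXIMIZE_AU_ROC", "MINIMIZE_LOG_LOSS", "MAXIMIZE_AU_PRC"}


def training_budget_milli_node_hours(train_hours):
    """Validate the budget (integer, 1-72 hours) and convert it."""
    if not isinstance(train_hours, int) or not 1 <= train_hours <= 72:
        raise ValueError("train_hours must be an integer between 1 and 72")
    return train_hours * 1000


def check_objective(objective):
    """Raise if the objective is not one of the options listed above."""
    if objective not in VALID_OBJECTIVES:
        raise ValueError("unknown optimization objective: {}".format(objective))
    return objective


def select_features(all_columns, label_column, columns_to_ignore):
    """Return all columns except the label and the ignored columns."""
    excluded = set(columns_to_ignore) | {label_column}
    missing = excluded - set(all_columns)
    if missing:
        raise ValueError("columns not in dataset: {}".format(sorted(missing)))
    return [c for c in all_columns if c not in excluded]


all_columns = ["eval", "hist_sends", "campaign_send_dt", "riid",
               "hist_open_rate", "hist_opens", "opened"]
budget = training_budget_milli_node_hours(12)
objective = check_objective("MINIMIZE_LOG_LOSS")
features = select_features(all_columns, "opened", ["riid", "campaign_send_dt"])
print(budget, objective, features)
```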

In [22]:
model_display_name = 'example_model' #@param {type:'string'}
model_train_hours = 12 #@param {type:'integer'}
model_optimization_objective = 'MINIMIZE_LOG_LOSS' #@param {type:'string'}
columns_to_ignore = ['riid', 'campaign_send_dt'] #@param {type:'string'}

# Create list of features to use
feat_list = list(column_specs.keys())
feat_list.remove(label_column_name)
for c in columns_to_ignore:
    feat_list.remove(c)

model_dict = {
    'display_name': model_display_name,
    'dataset_id': dataset_name.rsplit('/', 1)[-1],
    'tables_model_metadata': {
      'train_budget_milli_node_hours':model_train_hours * 1000,
      'optimization_objective': model_optimization_objective,
      'target_column_spec': column_specs[label_column_name],
      'input_feature_column_specs': [
            column_specs[x] for x in feat_list]}
    }
    
create_model_response = client.create_model(location_path, model_dict)
print('Model training operation: {}'.format(create_model_response.operation))
# Wait until model training is done.
create_model_result = create_model_response.result()
model_name = create_model_result.name
create_model_result

Model training operation: name: "projects/1030417721972/locations/us-central1/operations/TBL5414219555541090304"
metadata {
  type_url: "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata"
  value: "\032\014\010\301\251\263\352\005\020\320\260\265\270\002\"\014\010\301\251\263\352\005\020\320\260\265\270\002R\000"
}



KeyboardInterrupt: 

If you stopped the above cell, run the following command to check if model training is complete. It will return `True` if the model is done training.

In [23]:
create_model_response.done()

False
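
Instead of re-running `done()` by hand, the check can be wrapped in a polling loop. The sketch below works with any object that exposes a `done()` method, as the operation above does; `wait_until_done` and the `FakeOperation` stand-in are hypothetical and exist only so the example runs without a GCP project:

```python
import time


def wait_until_done(operation, poll_seconds=60, timeout_seconds=3600,
                    sleep=time.sleep):
    """Poll operation.done() until it returns True or the timeout elapses.

    Returns True if the operation finished, False if the timeout was hit.
    """
    waited = 0
    while not operation.done():
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True


class FakeOperation:
    """Stand-in for a long-running operation; finishes after N polls."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def done(self):
        self._remaining -= 1
        return self._remaining < 0


# Demo with a no-op sleep so the example returns immediately.
finished = wait_until_done(FakeOperation(2), poll_seconds=5,
                           timeout_seconds=60, sleep=lambda seconds: None)
print(finished)
```

In the notebook you would pass `create_model_response` as the operation and keep the default `sleep`.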

If your notebook has timed out, use `client.list_models(location_path)` to check whether your model has finished training. You will also get an email when it's done training.

In [None]:
model_name = 'projects/email-propensity/locations/us-central1/models/colab_trial11'
model = client.get_model(model_name)

## Model Metrics

In [None]:
create_model_result = create_model_response.result()
model_name = create_model_result.name

If your notebook has timed out, run the following code to set your model name.
```
model_name = 'projects/email-propensity/locations/us-central1/models/colab_trial11'
```
Set `model_name` to `projects/<project_id>/locations/<location>/models/<model_id>`
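
The resource name format above can be built and taken apart with plain string handling. A minimal sketch with hypothetical helper names; the parsed example uses the dataset name printed earlier in this notebook:

```python
def model_resource_name(project_id, location, model_id):
    """Build projects/<project_id>/locations/<location>/models/<model_id>."""
    return "projects/{}/locations/{}/models/{}".format(
        project_id, location, model_id)


def parse_resource_name(name):
    """Split a resource name into a {collection: id} dict."""
    parts = name.split("/")
    if len(parts) % 2 != 0:
        raise ValueError("expected alternating collection/id segments")
    return dict(zip(parts[0::2], parts[1::2]))


built = model_resource_name("email-propensity", "us-central1", "colab_trial11")
parsed = parse_resource_name(
    "projects/1030417721972/locations/us-central1/"
    "datasets/TBL1257058449896767488")
print(built)
print(parsed["datasets"])
```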

In [None]:
model = client.get_model(model_name)

In [None]:
metrics = list(client.list_model_evaluations(model_name))[-1]
# `opened` is a categorical target, so the classification metrics are the ones populated.
metrics.classification_evaluation_metrics

### Feature Metrics

In [None]:
model = client.get_model(model_name)
feat_list = [(x.feature_importance, x.column_display_name) for x in model.tables_model_metadata.tables_model_column_info]
feat_list.sort(reverse=True)
feat_list[:15]

### Batch Predictions  
More information about predictions using the AutoML Tables API can be found [here](https://cloud.google.com/automl-tables/docs/predict-batch).  
Set `batch_predict_bq_input_uri` to the BQ URI for your input table (to make predictions on).  
Set `batch_predict_bq_output` to `bq://<project_id>`

In [None]:
batch_predict_bq_input_uri = 'bq://email-propensity-sandbox.test_upload.pandas_table'
batch_predict_bq_output = 'bq://email-propensity-sandbox'
# Define input source.
batch_prediction_input_source = {
  'bigquery_source': {
    'input_uri': batch_predict_bq_input_uri
  }
}
# Define output target.
batch_prediction_output_target = {
    'bigquery_destination': {
      'output_uri': batch_predict_bq_output
    }
}
batch_predict_response = prediction_client.batch_predict(
    model_name, batch_prediction_input_source, batch_prediction_output_target)
print('Batch prediction operation: {}'.format(batch_predict_response.operation))
# The request runs asynchronously; check for completion with batch_predict_response.done().

The below function will return `True` when predictions are done.

In [None]:
batch_predict_response.done()

Your predictions are stored in a BigQuery table named `<model_id>.predictions`. Predictions are stored in the column `predicted_<target_column>`, which is an array.  
You can query this table (using the BQ UI or BQ API) to see predictions.

In [None]:
sql = """
    SELECT predicted_opened[OFFSET(0)].tables AS value_1,
        predicted_opened[OFFSET(1)].tables AS value_2
    FROM colab_trial11.predictions
"""
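bq_client = bigquery.Client() # `client` now refers to the AutoML client, so use a BigQuery client here
query_job = bq_client.query(sql) # API request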
result = query_job.to_dataframe()
print(result)