<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/MLOps/Serving/Serve%20TensorFlow%20SavedModel%20Format%20With%20BigQuery.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2FMLOps%2FServing%2FServe%2520TensorFlow%2520SavedModel%2520Format%2520With%2520BigQuery.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/MLOps/Serving/Serve%20TensorFlow%20SavedModel%20Format%20With%20BigQuery.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/MLOps/Serving/Serve%20TensorFlow%20SavedModel%20Format%20With%20BigQuery.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

# Serve TensorFlow SavedModel Format With BigQuery

Serve model predictions inside BigQuery for [TensorFlow SavedModel](https://www.tensorflow.org/guide/saved_model) format - the serverless advantage of BigQuery applied to inference!

BigQuery has a vast set of capabilities related to ML known as BigQuery ML (or BQML for short).
- [BigQuery ML](https://cloud.google.com/bigquery/docs/bqml-introduction)
- [BigQuery ML user journey for models](https://cloud.google.com/bigquery/docs/e2e-journey)

> For many workflow examples with BigQuery ML check out the [Framework Workflows/BQML](../../Framework%20Workflows/BQML/readme.md) folder in this repository!

With these capabilities you can train models directly in BigQuery, import model files for serving inside BigQuery, or connect to remote models for use from BigQuery.

This workflow focuses on importing a model, specifically a TensorFlow SavedModel format model as covered in [this documentation page](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-tensorflow). 

A few limits apply to imported models and are worth reviewing before getting started:
- Not all BigQuery ML functions work with imported models
- Models are limited to sizes under 450MB
- Model files must be in GCS, in the SavedModel format, use a GraphDef version of at least 20, use only core TensorFlow operations (no `tf.contrib` operations), and not use RaggedTensors
- See the [full list here](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-tensorflow#limitations)
- A more recent limitation: models with XLA compilation in their graph, which JAX-based models are likely to have, will not work
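
The size limit in the list above can be checked locally before uploading anything. The following is a minimal sketch, assuming the limit is counted over all files in the SavedModel directory and that 450MB means 450 * 1024 * 1024 bytes; the helper names `saved_model_size_bytes` and `fits_bqml_size_limit` are illustrative and not part of any library.

```python
import os
import tempfile

# Limit quoted in the list above; interpreting MB as 1024 * 1024 bytes is an assumption.
BQML_IMPORT_LIMIT_BYTES = 450 * 1024 * 1024


def saved_model_size_bytes(model_dir):
    """Sum the sizes of every file under a SavedModel directory."""
    total = 0
    for root, _, files in os.walk(model_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def fits_bqml_size_limit(model_dir, limit_bytes=BQML_IMPORT_LIMIT_BYTES):
    """Return True when the directory is smaller than the assumed limit."""
    return saved_model_size_bytes(model_dir) < limit_bytes


# Demo on a throwaway directory laid out like a SavedModel
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, 'variables'))
    with open(os.path.join(demo_dir, 'saved_model.pb'), 'wb') as f:
        f.write(b'\0' * 1024)
    with open(os.path.join(demo_dir, 'variables', 'variables.index'), 'wb') as f:
        f.write(b'\0' * 512)
    demo_size = saved_model_size_bytes(demo_dir)
    demo_fits = fits_bqml_size_limit(demo_dir)

print(demo_size, demo_fits)
```

In this notebook the same helper could be pointed at each model folder under `local_dir` once that variable is defined.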

What this workflow does:
- Prepare the TensorFlow SavedModel files created in this repository by another workflow: [Keras With TensorFlow Overview](../../Framework%20Workflows/Keras/Keras%20With%20TensorFlow%20Overview.ipynb)
- Set up a BigQuery dataset
- Create BigQuery ML models from the imported model files
- Serve predictions with the `ML.PREDICT` function

---
## Colab Setup

When running this notebook in [Colab](https://colab.google/) or [Colab Enterprise](https://cloud.google.com/colab/docs/introduction), this section will authenticate to GCP (follow prompts in the popup) and set the current project for the session.

In [1]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [2]:
try:
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
except Exception:
    pass

---
## Installs and API Enablement

The client packages may need to be installed in this environment.

### Installs (If Needed)

In [69]:
# tuples of (import name, install name, min_version)
packages = [
    ('google.cloud.bigquery', 'google-cloud-bigquery'),
    ('google.cloud.storage', 'google-cloud-storage'),
    ('numpy', 'numpy'),
    ('pandas', 'pandas'),
    ('tensorflow', 'tensorflow'),
]

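import importlib.util
import importlib.metadata

install = False
for package in packages:
    # find_spec raises ModuleNotFoundError when a parent package (e.g. `google.cloud`) is missing
    try:
        found = importlib.util.find_spec(package[0]) is not None
    except ModuleNotFoundError:
        found = False
    if not found:
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user
    elif len(package) == 3:
        # look up the installed version by install (distribution) name, not import name
        # note: this is a plain string comparison, so '10.0' sorts before '9.0'
        if importlib.metadata.version(package[1]) < package[2]:
            print(f'updating package {package[1]}')
            install = True
            !pip install {package[1]} -U -q --user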
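
The install cell allows an optional `min_version` as a third tuple element and compares it to the installed version as a string. String comparison orders versions character by character, which can misorder them (for example `'3.9.0'` against `'3.10.0'`). A minimal sketch of a numeric comparison follows; `version_tuple` and `needs_update` are hypothetical helpers that only handle simple dotted versions, and `packaging.version.parse` is a more thorough option where the `packaging` package is available.

```python
import re


def version_tuple(version):
    """Turn '2.15.0' or '2.15.0rc1' into a tuple of leading integers, e.g. (2, 15, 0)."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def needs_update(installed, minimum):
    """Return True when the installed version is older than the minimum."""
    return version_tuple(installed) < version_tuple(minimum)


string_says_older = '3.9.0' < '3.10.0'             # character-by-character comparison
tuple_says_older = needs_update('3.9.0', '3.10.0')  # numeric comparison
print(string_says_older, tuple_says_older)
```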

### API Enablement

In [4]:
!gcloud services enable aiplatform.googleapis.com

### Restart Kernel (If Installs Occurred)

After a kernel restart, execution can resume with the cell after this one.

In [5]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    IPython.display.display(IPython.display.Markdown("""<div class=\"alert alert-block alert-warning\">
        <b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. The previous cells do not need to be run again⚠️</b>
        </div>"""))

---
## Setup

Inputs

In [6]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [7]:
REGION = 'us-central1'
SERIES = 'mlops-serving'
EXPERIMENT = 'bigquery-tensorflow'

# gcs bucket name
GCS_BUCKET = PROJECT_ID

# Data source for this series of notebooks: Described above
BQ_SOURCE = 'bigquery-public-data.ml_datasets.ulb_fraud_detection'

# make this the BigQuery Project / Dataset / Table prefix to store results
BQ_PROJECT = PROJECT_ID
BQ_DATASET = SERIES.replace('-', '_')
BQ_TABLE = SERIES
BQ_REGION = REGION[0:2] # use a multi region

Packages

In [68]:
# import python package
import os

import tensorflow as tf
import numpy as np
import pandas as pd

# BigQuery
from google.cloud import bigquery

# gcs
from google.cloud import storage

Clients

In [9]:
# bigquery client
bq = bigquery.Client(project = PROJECT_ID)

# gcs client
gcs = storage.Client(project = PROJECT_ID)
bucket = gcs.bucket(GCS_BUCKET)

Parameters:

In [10]:
DIR = f"files/{EXPERIMENT}"

Environment:

In [11]:
if not os.path.exists(DIR):
    os.makedirs(DIR)

---
## Model Files For Examples

The TensorFlow SavedModel files used here were created in this repository by another workflow: [Keras With TensorFlow Overview](../../Framework%20Workflows/Keras/Keras%20With%20TensorFlow%20Overview.ipynb).  That workflow does not need to be run because the resulting model files are within this repository.  If this notebook is being used separately from a full clone of the repository, then this section will fetch the files from GitHub.

In [21]:
local_dir = '../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow'

In [22]:
import shutil

if not os.path.exists(local_dir):
    print('Retrieving files...')
    local_dir = DIR
    parent_dir = os.path.dirname(local_dir)
    temp_dir = os.path.join(parent_dir, 'temp')
    if not os.path.exists(temp_dir):
        os.makedirs(temp_dir)
    !git clone https://www.github.com/statmike/vertex-ai-mlops {temp_dir}/vertex-ai-mlops
    # DIR already exists (created in the Environment cell), so allow copying into it
    shutil.copytree(f'{temp_dir}/vertex-ai-mlops/Framework Workflows/Keras/files/keras-tf-overview/tensorflow', local_dir, dirs_exist_ok = True)
    shutil.rmtree(temp_dir)
    print(f'Files are now in folder `{local_dir}`')
else:
    print(f'Files Found in folder `{local_dir}`')

Files Found in folder `../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow`


In [23]:
for root, _, files in os.walk(local_dir):
    for file in files:
        print(os.path.join(root, file))

../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/fingerprint.pb
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/saved_model.pb
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/variables/variables.index
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/variables/variables.data-00000-of-00001
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/stacked_model/fingerprint.pb
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/stacked_model/saved_model.pb
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/stacked_model/variables/variables.index
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/stacked_model/variables/variables.data-00000-of-00001
../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/final_model/fingerprint.pb
../../Framework Workflows/Keras/files/keras-tf-overview/te

### Copy Files To GCS

In [35]:
!gcloud storage cp -r '{local_dir}' gs://{GCS_BUCKET}/{SERIES}/{EXPERIMENT}/models/

Copying file://../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/fingerprint.pb to gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/fingerprint.pb
Copying file://../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/saved_model.pb to gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/saved_model.pb
Copying file://../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/variables/variables.index to gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/variables/variables.index
Copying file://../../Framework Workflows/Keras/files/keras-tf-overview/tensorflow/embedding_model/variables/variables.data-00000-of-00001 to gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/variables/variables.data-00000-of-00001
Copying file://../../Fra

### Examine Model Signatures

To use the models with BigQuery ML we need to know the expected input.  In this case each model has an input named `input_layer` that expects an array of length 30, as defined when the models were created.  That means the input for the BigQuery `ML.PREDICT` function needs to be a column named `input_layer` that is an array of 30 floating point values.

This code also checks for the presence of XLA compilation, a newer feature for efficiency.  Imported TensorFlow models in BigQuery ML do not currently support it, so a model file with XLA compilation in its graph will not work.

In [71]:
def get_signature(uri):
    
    loaded_model = tf.saved_model.load(uri)
    signatures = loaded_model.signatures
    print(f"Available Signatures for: {uri}")
    for signature_name, signature_fn in signatures.items():
        print(f"\nSignature Name: {signature_name}")
        print("\nInputs:")
        if hasattr(signature_fn, 'structured_input_signature') and signature_fn.structured_input_signature:
            for input_name, input_spec in signature_fn.structured_input_signature[1].items():
                print(f"  {input_name}:")
                print(f"    dtype: {input_spec.dtype}")
                print(f"    shape: {input_spec.shape}")
        else:
            print("  No structured input signature available.")
        print("\nOutputs:")
        if hasattr(signature_fn, "structured_outputs") and signature_fn.structured_outputs:
            for output_name, output_spec in signature_fn.structured_outputs.items():
                print(f"  {output_name}:")
                print(f"    dtype: {output_spec.dtype}")
                print(f"    shape: {output_spec.shape}")
        else:
            print("  No structured output signature available.")
            
    def check_xla(signatures):
        try:
            concrete_func = signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
            graph_def = concrete_func.graph.as_graph_def()
            
            def _check_node(node):
                if "XlaCallModule" in node.op:
                    return True
                for attr in node.attr.values():
                    if attr.func:
                        func_name = attr.func.name
                        for func in graph_def.library.function:
                            if func.signature.name == func_name:
                                for func_node in func.node_def:
                                    if _check_node(func_node):
                                        return True
                return False
            
            for node in graph_def.node:
                if _check_node(node):
                    return True
            return False
        except Exception as e:
            print(f'Error checking XLA: {e}')
            return None
        
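    has_xla = check_xla(signatures)
    if has_xla:
        print('XLA Detected - Conversion Required for BQML.\n\n')
        return True
    elif has_xla is None:
        # the check itself failed, so do not report the model as XLA-free
        print('XLA Check Failed - Compatibility With BQML Is Unknown\n\n')
        return False
    else:
        print('XLA Not Detected - Model Should Work With BQML\n\n')
        return False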
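
A signature entry such as `input_layer` with dtype `float32` and shape `(None, 30)` translates into a requirement on the BigQuery side: a column with that name holding an array of 30 floats. The sketch below makes that translation explicit for scalar and 1-D inputs. The dtype mapping is an assumption based on the import documentation's list of supported inputs and should be checked there; `bq_column_for_input` is a hypothetical helper.

```python
# Assumed mapping from TensorFlow dtype names to BigQuery column types;
# verify against the supported-inputs table in the import documentation.
TF_TO_BQ_TYPE = {
    'float32': 'FLOAT64',
    'float64': 'FLOAT64',
    'int32': 'INT64',
    'int64': 'INT64',
    'string': 'STRING',
    'bool': 'BOOL',
}


def bq_column_for_input(name, dtype_name, shape):
    """Describe the column ML.PREDICT would need for one signature input.

    `shape` is the signature shape with the batch dimension first, e.g. (None, 30).
    """
    bq_type = TF_TO_BQ_TYPE[dtype_name]
    feature_dims = tuple(shape[1:])  # drop the batch dimension
    if len(feature_dims) == 0:
        return f'{name} {bq_type}'
    if len(feature_dims) == 1:
        return f'{name} ARRAY<{bq_type}> of length {feature_dims[0]}'
    raise ValueError('this sketch only handles scalar and 1-D inputs')


requirement = bq_column_for_input('input_layer', 'float32', (None, 30))
print(requirement)
```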

In [72]:
[blob.name for blob in bucket.list_blobs(prefix = f'{SERIES}/{EXPERIMENT}/models')]

['mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/fingerprint.pb',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/saved_model.pb',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/variables/variables.data-00000-of-00001',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/variables/variables.index',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/final_model/fingerprint.pb',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/final_model/saved_model.pb',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/final_model/variables/variables.data-00000-of-00001',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/final_model/variables/variables.index',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/stacked_model/fingerprint.pb',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/stacked_model/saved_model.pb',
 'mlops-serving/bigquery-tensorflow/models/tensorflow/stacked_model/variables/

In [73]:
models = []
for blob in bucket.list_blobs(prefix = f'{SERIES}/{EXPERIMENT}/models'):
    if blob.name.endswith('saved_model.pb'):
        uri = f"gs://{bucket.name}/{blob.name.split('/saved_model.pb')[0]}"
        has_xla = get_signature(uri)
        if has_xla:
            print('Model has XLA and will not work with BQML.')
        else:
            models.append(uri)

Available Signatures for: gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model

Signature Name: serve

Inputs:
  input_layer:
    dtype: <dtype: 'float32'>
    shape: (None, 30)

Outputs:
  output_0:
    dtype: <dtype: 'float32'>
    shape: (None, 4)

Signature Name: serving_default

Inputs:
  input_layer:
    dtype: <dtype: 'float32'>
    shape: (None, 30)

Outputs:
  output_0:
    dtype: <dtype: 'float32'>
    shape: (None, 4)
XLA Not Detected - Model Should Work With BQML


Available Signatures for: gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/final_model

Signature Name: serve

Inputs:
  input_layer:
    dtype: <dtype: 'float32'>
    shape: (None, 30)

Outputs:
  normalized_reconstruction:
    dtype: <dtype: 'float32'>
    shape: (None, 30)
  encoded:
    dtype: <dtype: 'float32'>
    shape: (None, 4)
  denormalized_reconstruction:
    dtype: <dtype: 'float32'>
    shape: (None, 30)
  normalized_RMSE:
    d
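
The loop above derives each model's GCS URI by finding blobs that end in `saved_model.pb` and taking their parent folder. A minimal sketch of the same idea with `posixpath` is shown here; it matches on the exact file name rather than a suffix. The bucket name `example-bucket` and the helper `saved_model_uris` are illustrative assumptions.

```python
import posixpath


def saved_model_uris(bucket_name, blob_names):
    """Return the gs:// folder of every blob named exactly saved_model.pb."""
    uris = []
    for name in blob_names:
        if posixpath.basename(name) == 'saved_model.pb':
            uris.append(f'gs://{bucket_name}/{posixpath.dirname(name)}')
    return sorted(uris)


example_blobs = [
    'mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/fingerprint.pb',
    'mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/saved_model.pb',
    'mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model/variables/variables.index',
    'mlops-serving/bigquery-tensorflow/models/tensorflow/final_model/saved_model.pb',
]

model_uris = saved_model_uris('example-bucket', example_blobs)
for uri in model_uris:
    print(uri)
```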

---
## BigQuery Model Import

### Create/Recall Dataset

In [42]:
dataset = bigquery.Dataset(f"{BQ_PROJECT}.{BQ_DATASET}")
dataset.location = BQ_REGION
bq_dataset = bq.create_dataset(dataset, exists_ok = True)

### Create/Recall Table With Preparation For ML

Copy the data from the source while adding columns:
- `transaction_id` as a unique identifier for the row
    - Use the `GENERATE_UUID()` function
- `splits` column to randomly assign rows to 'TRAIN', 'VALIDATE' and 'TEST' groups
    - stratified sampling within the levels of `class`: first assign row numbers in random order within each level of `class`, then use a CASE statement on the row number to assign the `splits` level.
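
The stratified split described above can be sketched in plain Python to make the logic concrete: shuffle the rows within each class, number them, and compare each row number to 80% and 90% of the class size. This is only an illustration of the idea; `assign_splits` and the toy labels are assumptions, and the notebook itself performs the split in SQL.

```python
import random
from collections import defaultdict


def assign_splits(labels, train_upto=0.8, validate_upto=0.9, seed=0):
    """Return one split name per row, stratified by label.

    Mirrors ROW_NUMBER() OVER (PARTITION BY class ORDER BY RAND())
    followed by CASE thresholds on the row number.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    splits = [None] * len(labels)
    for indices in by_label.values():
        rng.shuffle(indices)  # random order within the class
        n = len(indices)
        for rn, idx in enumerate(indices, start=1):
            if rn <= train_upto * n:
                splits[idx] = 'TRAIN'
            elif rn <= validate_upto * n:
                splits[idx] = 'VALIDATE'
            else:
                splits[idx] = 'TEST'
    return splits


# Toy data: an imbalanced two-class label column
labels = [0] * 1000 + [1] * 50
splits = assign_splits(labels)

counts = defaultdict(int)
for label, split in zip(labels, splits):
    counts[(label, split)] += 1
print(dict(counts))
```

Because the thresholds are applied per class, the rare class keeps roughly the same 80/10/10 proportions as the common one.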

In [43]:
job = bq.query(f"""
CREATE OR REPLACE TABLE
#CREATE TABLE IF NOT EXISTS 
    `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` AS
WITH
    add_id AS (
        SELECT *,
            GENERATE_UUID() transaction_id,
            ROW_NUMBER() OVER (PARTITION BY class ORDER BY RAND()) as rn
            FROM `{BQ_SOURCE}`
    )
SELECT * EXCEPT(rn),
    CASE 
        WHEN rn <= 0.8 * COUNT(*) OVER (PARTITION BY class) THEN 'TRAIN'
        WHEN rn <= 0.9 * COUNT(*) OVER (PARTITION BY class) THEN 'VALIDATE'
        ELSE 'TEST'
    END AS splits
FROM add_id
""")
job.result()
(job.ended-job.started).total_seconds()

10.3

In [44]:
raw_sample = bq.query(f'SELECT * FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` LIMIT 5').to_dataframe()
raw_sample

| | Time | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | ... | V23 | V24 | V25 | V26 | V27 | V28 | Amount | Class | transaction_id | splits |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 118229.0 | -0.163817 | 1.054898 | 0.863845 | 4.279682 | 1.618317 | 0.416816 | 0.556619 | -0.239917 | -2.329588 | ... | -0.247822 | -0.331911 | -0.828854 | 0.428658 | -0.031682 | 0.03737 | 0.0 | 0 | bba2482f-8c1d-445f-9c0b-5bae570620dc | TEST |
| 1 | 129755.0 | 2.052071 | 0.10391 | -1.230392 | 0.441973 | -0.127297 | -1.740949 | 0.441227 | -0.453292 | 0.447167 | ... | 0.284552 | 0.426661 | -0.067168 | -0.518989 | -0.029692 | -0.05163 | 0.0 | 0 | fd07dc5d-923a-4d6a-9066-4750b65294ef | TEST |
| 2 | 82566.0 | 1.225778 | 0.315277 | 0.968853 | 2.369175 | 0.000762 | 0.983311 | -0.500043 | 0.181432 | 0.010966 | ... | -0.142516 | -0.979973 | 0.528372 | 0.038881 | 0.049546 | 0.024313 | 0.0 | 0 | 62252bf3-2ac8-42e8-ab1e-b6eeedbc283b | TEST |
| 3 | 145092.0 | -19.332047 | -24.894678 | 0.580899 | 8.895214 | 17.479929 | -12.299495 | -12.000347 | 2.338247 | -0.342769 | ... | -0.334839 | 0.165841 | 0.048159 | -0.848368 | 0.310829 | -3.928403 | 0.0 | 0 | da56a7a0-4851-49b3-8a25-31235767d019 | TEST |
| 4 | 71188.0 | 1.182039 | 0.043534 | 0.187616 | 0.568309 | -0.241478 | -0.362329 | 0.002382 | 0.055659 | 0.020921 | ... | -0.054438 | 0.262501 | 0.505217 | 0.557609 | -0.044848 | -0.012868 | 0.0 | 0 | d9c0b8f7-9112-4495-8312-7ea56ab92052 | TEST |


### Add A Column For Feature Array

Combine all the features into a single column, an array of floats, for easier inference.

In [45]:
feature_columns = [col for col in raw_sample.columns if col not in ['splits', 'transaction_id', 'Class']]
feature_columns

['Time',
 'V1',
 'V2',
 'V3',
 'V4',
 'V5',
 'V6',
 'V7',
 'V8',
 'V9',
 'V10',
 'V11',
 'V12',
 'V13',
 'V14',
 'V15',
 'V16',
 'V17',
 'V18',
 'V19',
 'V20',
 'V21',
 'V22',
 'V23',
 'V24',
 'V25',
 'V26',
 'V27',
 'V28',
 'Amount']

In [46]:
query = f"""
CREATE OR REPLACE TABLE `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` AS
SELECT
    t.*, # EXCEPT(features_array),
    ARRAY[
        {', '.join(feature_columns)}
    ] AS features_array
FROM
    `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` AS t;
"""

job = bq.query(query)
job.result()
(job.ended - job.started).total_seconds()

8.65
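
Since the models expect exactly 30 values in a fixed order, it can help to guard the array construction with a length check before running the query. The following is a minimal sketch under that assumption; `feature_array_sql` is a hypothetical helper and the column list is rebuilt here only to keep the example self-contained.

```python
def feature_array_sql(columns, excluded, expected_length):
    """Build the ARRAY[...] expression, failing early if the feature count is off."""
    features = [c for c in columns if c not in excluded]
    if len(features) != expected_length:
        raise ValueError(
            f'expected {expected_length} features but found {len(features)}'
        )
    return 'ARRAY[' + ', '.join(features) + '] AS features_array'


# Column names matching the table in this workflow
columns = ['Time'] + [f'V{i}' for i in range(1, 29)] + [
    'Amount', 'Class', 'transaction_id', 'splits'
]
excluded = {'Class', 'transaction_id', 'splits'}

array_expr = feature_array_sql(columns, excluded, expected_length=30)
print(array_expr)
```

The order of names inside `ARRAY[...]` determines the order of values the model receives, so it should match the feature order used when the model was trained.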

### Review the number of records for each level of `Class` for each of the data splits:

In [47]:
bq.query(f"""
SELECT splits, class,
    count(*) as count,
    ROUND(count(*) * 100.0 / SUM(count(*)) OVER (PARTITION BY class), 2) AS percentage
FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}`
GROUP BY splits, class
""").to_dataframe()

| | splits | class | count | percentage |
|---|---|---|---|---|
| 0 | TRAIN | 1 | 393 | 79.88 |
| 1 | VALIDATE | 1 | 49 | 9.96 |
| 2 | TEST | 1 | 50 | 10.16 |
| 3 | TEST | 0 | 28432 | 10.0 |
| 4 | TRAIN | 0 | 227452 | 80.0 |
| 5 | VALIDATE | 0 | 28431 | 10.0 |


### Create The Models

In [48]:
models

['gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model',
 'gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/final_model',
 'gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/stacked_model']

In [49]:
bq_models = []
for model in models:
    bq_model = f"{BQ_PROJECT}.{BQ_DATASET}.{SERIES}-{EXPERIMENT}-{model.split('/')[-1]}"
    job = bq.query(f"""
    CREATE OR REPLACE MODEL `{bq_model}`
        OPTIONS(
            MODEL_TYPE = 'TENSORFLOW',
            MODEL_PATH = '{model}/*'
        )
    """)
    job.result()
    bq_models.append(bq_model)
    print(f"Created BigQuery Model:\n\tName: {bq_model}\n\tGCS URI: {model}\n\tTime (seconds): {(job.ended-job.started).total_seconds()}")

Created BigQuery Model:
	Name: statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-embedding_model
	GCS URI: gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model
	Time (seconds): 17.205
Created BigQuery Model:
	Name: statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-final_model
	GCS URI: gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/final_model
	Time (seconds): 8.769
Created BigQuery Model:
	Name: statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-stacked_model
	GCS URI: gs://statmike-mlops-349915/mlops-serving/bigquery-tensorflow/models/tensorflow/stacked_model
	Time (seconds): 13.572


In [50]:
bq_models

['statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-embedding_model',
 'statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-final_model',
 'statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-stacked_model']
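
The model-creation loop above builds a model name from the last folder of the GCS URI and points `MODEL_PATH` at every file in that folder. A small sketch that separates the string building from query execution makes the statement easy to inspect before it runs. The function `create_model_sql` and the project and bucket names are illustrative assumptions.

```python
def create_model_sql(project, dataset, series, experiment, model_uri):
    """Return (model_name, sql) for importing one SavedModel folder."""
    uri = model_uri.rstrip('/')
    folder = uri.split('/')[-1]  # e.g. embedding_model
    model_name = f'{project}.{dataset}.{series}-{experiment}-{folder}'
    sql = (
        f'CREATE OR REPLACE MODEL `{model_name}`\n'
        'OPTIONS(\n'
        "    MODEL_TYPE = 'TENSORFLOW',\n"
        f"    MODEL_PATH = '{uri}/*'\n"
        ')'
    )
    return model_name, sql


model_name, sql = create_model_sql(
    project='my-project',          # hypothetical project ID
    dataset='mlops_serving',
    series='mlops-serving',
    experiment='bigquery-tensorflow',
    model_uri='gs://my-bucket/mlops-serving/bigquery-tensorflow/models/tensorflow/embedding_model',
)
print(model_name)
print(sql)
```

The returned `sql` string could then be passed to `bq.query(sql)` as in the loop above.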

---
## BigQuery ML Predictions With ML.PREDICT

### Embedding Model

This model returns an embedding of dimension = 4:

In [53]:
bq_models[0]

'statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-embedding_model'

In [52]:
results = bq.query(f"""
SELECT *
FROM ML.PREDICT (MODEL `{bq_models[0]}`,(
    SELECT features_array as input_layer
    FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}`
    WHERE splits = 'TEST' and class = 1
    LIMIT 10)
  )
""").to_dataframe()

results

| | output_0 | input_layer |
|---|---|---|
| 0 | [0.0, 0.0, 0.022259382531046867, 0.00314655387... | [18690.0, -15.3988450085358, 7.472323896501121... |
| 1 | [0.310149222612381, 0.0, 0.016755089163780212,... | [94362.0, -26.4577446501446, 16.497471901867, ... |
| 2 | [0.19951258599758148, 0.0, 0.01866228878498077... | [110087.0, 1.9349464556154798, 0.6506777374982... |
| 3 | [0.0, 0.0, 0.022228442132472992, 0.00309095764... | [41305.0, -12.9809425647533, 6.72050777097643,... |
| 4 | [0.0, 0.0, 0.022287406027317047, 0.00319691235... | [30852.0, -2.83098405592803, 0.885657038258755... |
| 5 | [0.2360287755727768, 0.0, 0.019245415925979614... | [102318.0, -1.0206316658236099, 1.496959122432... |
| 6 | [0.29357069730758667, 0.0, 0.02039346098899841... | [148476.0, -1.12509160979577, 3.68287614423406... |
| 7 | [0.0, 0.0, 0.022289488464593887, 0.00320065440... | [7535.0, 0.0267792264491516, 4.132463897130029... |
| 8 | [0.0, 0.0, 0.022370172664523125, 0.00334563432... | [25095.0, 1.19239598990768, 1.33897371069007, ... |
| 9 | [0.0, 0.0, 0.022382868453860283, 0.00336844893... | [35942.0, -4.19407367570647, 4.382897362444670... |


In [54]:
results['output_0'][0]

array([0.        , 0.        , 0.02225938, 0.00314655])
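
The `ML.PREDICT` query has the same shape for every model in this workflow: select the feature array, alias it to the input name from the model signature, and wrap that in `ML.PREDICT`. A minimal sketch of a reusable query builder follows; `predict_sql` and the example project name are hypothetical. Values are interpolated directly into the SQL text, so this is only suitable for trusted names and filters.

```python
def predict_sql(model, table, input_name='input_layer',
                feature_column='features_array', where=None, limit=None):
    """Build an ML.PREDICT query that aliases the feature column to the model input name."""
    inner = [
        f'SELECT {feature_column} AS {input_name}',
        f'FROM `{table}`',
    ]
    if where:
        inner.append(f'WHERE {where}')
    if limit is not None:
        inner.append(f'LIMIT {int(limit)}')
    inner_sql = '\n    '.join(inner)
    return (
        'SELECT *\n'
        f'FROM ML.PREDICT(MODEL `{model}`, (\n'
        f'    {inner_sql}\n'
        '))'
    )


query = predict_sql(
    model='my-project.mlops_serving.mlops-serving-bigquery-tensorflow-embedding_model',
    table='my-project.mlops_serving.mlops-serving',
    where="splits = 'TEST' AND class = 1",
    limit=10,
)
print(query)
```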

### Final Model - Detailed Outputs

This model returns a highly customized set of outputs to help with anomaly detection:

In [56]:
bq_models[1]

'statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-final_model'

In [57]:
results = bq.query(f"""
SELECT *
FROM ML.PREDICT (MODEL `{bq_models[1]}`,(
    SELECT features_array as input_layer
    FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}`
    WHERE splits = 'TEST' and class = 1
    LIMIT 10)
  )
""").to_dataframe()

results

Unnamed: 0,denormalized_MAE,denormalized_MSE,denormalized_MSLE,denormalized_RMSE,denormalized_reconstruction,denormalized_reconstruction_errors,encoded,normalized_MAE,normalized_MSE,normalized_MSLE,normalized_RMSE,normalized_reconstruction,normalized_reconstruction_errors,input_layer
0,1519.109619,68626140.0,0.898874,8284.088867,"[64063.796875, -0.3806326389312744, 0.12962773...","[-45373.796875, -15.01821231842041, 7.34269618...","[0.0, 0.0, 0.022259382531046867, 0.00314655387...",5.74037,61.73278,0.663433,7.857021,"[-0.6491255164146423, -0.20191915333271027, 0....","[-0.9552432894706726, -7.777879238128662, 4.47...","[18690.0, -15.3988450085358, 7.472323896501121..."
1,1389.779785,57201260.0,0.876537,7563.151855,"[135787.03125, 0.6292573809623718, 0.007250748...","[-41425.03125, -27.08700180053711, 16.49022293...","[0.310149222612381, 0.0, 0.016755089163780212,...",7.825902,111.474777,0.620579,10.558162,"[0.860845685005188, 0.32109934091567993, 0.009...","[-0.8721107244491577, -14.028263092041016, 10....","[94362.0, -26.4577446501446, 16.497471901867, ..."
2,330.959473,3259470.0,0.400962,1805.400146,"[119975.5625, 0.25684669613838196, 0.058178655...","[-9888.5625, 1.6780997514724731, 0.59249907732...","[0.19951258599758148, 0.0, 0.01866228878498077...",0.673797,1.155872,0.177158,1.075115,"[0.5279709696769714, 0.12822915613651276, 0.04...","[-0.2081815004348755, 0.869081974029541, 0.361...","[110087.0, 1.9349464556154798, 0.6506777374982..."
3,765.706482,17259720.0,0.764454,4154.481934,"[64059.9609375, -0.38065868616104126, 0.129629...","[-22754.9609375, -12.6002836227417, 6.59087800...","[0.0, 0.0, 0.022228442132472992, 0.00309095764...",5.889391,72.089226,0.675512,8.490538,"[-0.649206280708313, -0.2019326388835907, 0.08...","[-0.4790545701980591, -6.5256428718566895, 4.0...","[41305.0, -12.9809425647533, 6.72050777097643,..."
4,1110.821899,36775370.0,0.31399,6064.27002,"[64067.2734375, -0.38060906529426575, 0.129625...","[-33215.2734375, -2.4503750801086426, 0.756031...","[0.0, 0.0, 0.022287406027317047, 0.00319691235...",1.088636,2.609732,0.219071,1.615467,"[-0.6490523815155029, -0.2019069343805313, 0.0...","[-0.699272871017456, -1.2690407037734985, 0.46...","[30852.0, -2.83098405592803, 0.885657038258755..."
5,767.165894,17582800.0,0.23872,4193.18457,"[125285.015625, 0.3764687478542328, 0.04184361...","[-22967.015625, -1.3971004486083984, 1.4551154...","[0.2360287755727768, 0.0, 0.019245415925979614...",1.592211,4.867549,0.24955,2.206252,"[0.6397494673728943, 0.19018100202083588, 0.03...","[-0.4835188090801239, -0.7235533595085144, 0.8...","[102318.0, -1.0206316658236099, 1.496959122432..."
6,495.459351,7278511.0,0.579933,2697.871582,"[133699.171875, 0.5675977468490601, 0.01573685...","[14776.828125, -1.692689299583435, 3.667139291...","[0.29357069730758667, 0.0, 0.02039346098899841...",2.352207,9.864285,0.403305,3.140746,"[0.8168907761573792, 0.28916603326797485, 0.01...","[0.3110927939414978, -0.8766378164291382, 2.23...","[148476.0, -1.12509160979577, 3.68287614423406..."
7,1887.618652,106530900.0,0.807245,10321.381836,"[64067.53125, -0.3806073069572449, 0.129625782...","[-56532.53125, 0.40738654136657715, 4.00283813...","[0.0, 0.0, 0.022289488464593887, 0.00320065440...",2.52954,13.866784,0.458278,3.723813,"[-0.6490469574928284, -0.2019060254096985, 0.0...","[-1.1901652812957764, 0.21098405122756958, 2.4...","[7535.0, 0.0267792264491516, 4.132463897130029..."
8,1300.985107,50654590.0,0.386287,7117.204102,"[64077.52734375, -0.3805393874645233, 0.129620...","[-38982.52734375, 1.5729354619979858, 1.209353...","[0.0, 0.0, 0.022370172664523125, 0.00334563432...",0.917202,1.755776,0.238906,1.325057,"[-0.6488364934921265, -0.201870858669281, 0.08...","[-0.8206894397735596, 0.8146177530288696, 0.73...","[25095.0, 1.19239598990768, 1.33897371069007, ..."
9,941.5625,26390770.0,0.48228,5137.194824,"[64079.5625, -0.3805256187915802, 0.1296194791...","[-28137.5625, -3.8135480880737305, 4.253277778...","[0.0, 0.0, 0.022382868453860283, 0.00336844893...",3.434613,26.307997,0.466861,5.129132,"[-0.648793637752533, -0.20186372101306915, 0.0...","[-0.5923730731010437, -1.9750230312347412, 2.5...","[35942.0, -4.19407367570647, 4.382897362444670..."


In [58]:
results.iloc[0].to_dict()

{'denormalized_MAE': 1519.109619140625,
 'denormalized_MSE': 68626136.0,
 'denormalized_MSLE': 0.8988741040229797,
 'denormalized_RMSE': 8284.0888671875,
 'denormalized_reconstruction': array([ 6.40637969e+04, -3.80632639e-01,  1.29627734e-01,  7.16346085e-01,
         6.34700134e-02, -2.42524907e-01, -1.63238764e-01, -3.76504138e-02,
         7.85131231e-02, -1.24087527e-01, -1.22334331e-01,  6.89424127e-02,
         1.36412308e-01, -4.08663228e-03,  2.87474059e-02,  2.39056751e-01,
         7.79918060e-02, -1.43595506e-02, -3.97860035e-02, -1.26899183e-02,
        -2.86139119e-02, -6.03521653e-02, -8.56110081e-02, -3.58477235e-02,
         5.68169802e-02,  1.14095710e-01, -6.90234676e-02,  1.34296576e-02,
         2.22326797e-02,  2.35895081e+01]),
 'denormalized_reconstruction_errors': array([-4.53737969e+04, -1.50182123e+01,  7.34269619e+00, -1.97432594e+01,
         1.11020555e+01, -6.65133095e+00, -1.95769787e+00, -1.48756800e+01,
        -7.99727261e-01, -7.05100918e+00, -1.4044

### Stacked Model - Reconstructed Feature Values

This model returns the reconstructed feature values for each input feature:

In [63]:
bq_models[2]

'statmike-mlops-349915.mlops_serving.mlops-serving-bigquery-tensorflow-stacked_model'

In [64]:
results = bq.query(f"""
SELECT *
FROM ML.PREDICT (MODEL `{bq_models[2]}`,(
    SELECT features_array as input_layer
    FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}`
    WHERE splits = 'TEST' and class = 1
    LIMIT 10)
  )
""").to_dataframe()

results

| | output_0 | input_layer |
|---|---|---|
| 0 | [64063.796875, -0.3806326389312744, 0.12962773... | [18690.0, -15.3988450085358, 7.472323896501121... |
| 1 | [135787.03125, 0.6292573809623718, 0.007250748... | [94362.0, -26.4577446501446, 16.497471901867, ... |
| 2 | [119975.5625, 0.25684669613838196, 0.058178655... | [110087.0, 1.9349464556154798, 0.6506777374982... |
| 3 | [64059.9609375, -0.38065868616104126, 0.129629... | [41305.0, -12.9809425647533, 6.72050777097643,... |
| 4 | [64067.2734375, -0.38060906529426575, 0.129625... | [30852.0, -2.83098405592803, 0.885657038258755... |
| 5 | [125285.015625, 0.3764687478542328, 0.04184361... | [102318.0, -1.0206316658236099, 1.496959122432... |
| 6 | [133699.171875, 0.5675977468490601, 0.01573685... | [148476.0, -1.12509160979577, 3.68287614423406... |
| 7 | [64067.53125, -0.3806073069572449, 0.129625782... | [7535.0, 0.0267792264491516, 4.132463897130029... |
| 8 | [64077.52734375, -0.3805393874645233, 0.129620... | [25095.0, 1.19239598990768, 1.33897371069007, ... |
| 9 | [64079.5625, -0.3805256187915802, 0.1296194791... | [35942.0, -4.19407367570647, 4.382897362444670... |


In [65]:
results['output_0'][0]

array([ 6.40637969e+04, -3.80632639e-01,  1.29627734e-01,  7.16346085e-01,
        6.34700134e-02, -2.42524907e-01, -1.63238764e-01, -3.76504138e-02,
        7.85131231e-02, -1.24087527e-01, -1.22334331e-01,  6.89424127e-02,
        1.36412308e-01, -4.08663228e-03,  2.87474059e-02,  2.39056751e-01,
        7.79918060e-02, -1.43595506e-02, -3.97860035e-02, -1.26899183e-02,
       -2.86139119e-02, -6.03521653e-02, -8.56110081e-02, -3.58477235e-02,
        5.68169802e-02,  1.14095710e-01, -6.90234676e-02,  1.34296576e-02,
        2.22326797e-02,  2.35895081e+01])

In [66]:
results['input_layer'][0]

array([ 1.86900000e+04, -1.53988450e+01,  7.47232390e+00, -1.90269123e+01,
        1.11655258e+01, -6.89385628e+00, -2.12093657e+00, -1.49133300e+01,
       -7.21214094e-01, -7.17509662e+00, -1.41667947e+01,  1.02777689e+01,
       -1.49854337e+01,  3.45179234e-01, -1.46663890e+01, -3.46352542e-01,
       -8.33324250e+00, -1.26025965e+01, -4.87668342e+00,  6.04625911e-01,
        1.11150221e+00, -2.44488368e+00,  7.27495341e-01, -3.45078151e-01,
       -9.81748551e-01,  9.95271346e-01,  8.16761718e-01,  2.26294237e+00,
       -1.17806316e+00,  1.00000000e+00])

In [70]:
pd.DataFrame(
    {
        'input_layer': results['input_layer'][0],
        'output': results['output_0'][0],
        'abs_difference': np.abs(results['input_layer'][0] - results['output_0'][0])
    }
)

| | input_layer | output | abs_difference |
|---|---|---|---|
| 0 | 18690.0 | 64063.796875 | 45373.796875 |
| 1 | -15.398845 | -0.380633 | 15.018212 |
| 2 | 7.472324 | 0.129628 | 7.342696 |
| 3 | -19.026912 | 0.716346 | 19.743258 |
| 4 | 11.165526 | 0.06347 | 11.102056 |
| 5 | -6.893856 | -0.242525 | 6.651331 |
| 6 | -2.120937 | -0.163239 | 1.957698 |
| 7 | -14.91333 | -0.03765 | 14.87568 |
| 8 | -0.721214 | 0.078513 | 0.799727 |
| 9 | -7.175097 | -0.124088 | 7.051009 |
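
The element-wise comparison above can also be summarized into single error values. As a hedged sketch, the code below computes MAE, MSE, and RMSE across all 30 features for this record, using the two arrays copied from the printed outputs above. For this row the results agree closely with the `denormalized_MAE`, `denormalized_MSE`, and `denormalized_RMSE` values returned by the final model for the same record, which suggests (but does not confirm) that those outputs are computed this way. The function name `reconstruction_metrics` is illustrative.

```python
import math

# Values copied from the printed `input_layer` and `output_0` arrays above
input_layer = [
    1.86900000e+04, -1.53988450e+01, 7.47232390e+00, -1.90269123e+01,
    1.11655258e+01, -6.89385628e+00, -2.12093657e+00, -1.49133300e+01,
    -7.21214094e-01, -7.17509662e+00, -1.41667947e+01, 1.02777689e+01,
    -1.49854337e+01, 3.45179234e-01, -1.46663890e+01, -3.46352542e-01,
    -8.33324250e+00, -1.26025965e+01, -4.87668342e+00, 6.04625911e-01,
    1.11150221e+00, -2.44488368e+00, 7.27495341e-01, -3.45078151e-01,
    -9.81748551e-01, 9.95271346e-01, 8.16761718e-01, 2.26294237e+00,
    -1.17806316e+00, 1.00000000e+00,
]
reconstruction = [
    6.40637969e+04, -3.80632639e-01, 1.29627734e-01, 7.16346085e-01,
    6.34700134e-02, -2.42524907e-01, -1.63238764e-01, -3.76504138e-02,
    7.85131231e-02, -1.24087527e-01, -1.22334331e-01, 6.89424127e-02,
    1.36412308e-01, -4.08663228e-03, 2.87474059e-02, 2.39056751e-01,
    7.79918060e-02, -1.43595506e-02, -3.97860035e-02, -1.26899183e-02,
    -2.86139119e-02, -6.03521653e-02, -8.56110081e-02, -3.58477235e-02,
    5.68169802e-02, 1.14095710e-01, -6.90234676e-02, 1.34296576e-02,
    2.22326797e-02, 2.35895081e+01,
]


def reconstruction_metrics(actual, predicted):
    """Return (MAE, MSE, RMSE) of the per-feature reconstruction errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, math.sqrt(mse)


mae, mse, rmse = reconstruction_metrics(input_layer, reconstruction)
print(f'MAE: {mae:.3f}  MSE: {mse:.1f}  RMSE: {rmse:.3f}')
```

Note that the first feature (`Time`) dominates all three values because it is on a much larger scale than the others, which is why the final model also reports normalized versions of these metrics.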
