In [1]:
PROJECT_ID = "formazione-riccardo-zanella"
REGION = 'us-central1'
BUCKET_NAME = "bbs-2021-opml4b-explainability"

In [2]:
TRAIN_FILE = 'gs://' + BUCKET_NAME + '/data/tabular_data/train.csv'

EXPORT_PATH = 'gs://' + BUCKET_NAME + '/mdl/tabular_data/tabular_data_20211117_172620/model/'

MODEL = 'bike'
VERSION = 'v1'

In [3]:
import pandas as pd
import tensorflow as tf
from explainable_ai_sdk.metadata.tf.v2 import SavedModelMetadataBuilder

2021-11-17 17:41:40.157364: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-11-17 17:41:40.157406: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


# Inspect model's signature

When using TensorFlow 2.x, you export the model as a `SavedModel` and upload it to Cloud Storage.

## Using command line

Use TensorFlow's `saved_model_cli` to inspect the model's SignatureDef. You'll use this information when you deploy your model to AI Explanations in the next section.

In [4]:
! saved_model_cli show --dir $EXPORT_PATH --all

2021-11-17 17:41:41.780912: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-11-17 17:41:41.780955: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['dense_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 6)
        name: serving_defa

## Using TensorFlow API

In [5]:
model = tf.keras.models.load_model(EXPORT_PATH)
# Print the names of your tensors
print('Model input tensor: ', model.input.name)
print('Model output tensor: ', model.output.name)



2021-11-17 17:41:51.069387: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-11-17 17:41:51.069481: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-11-17 17:41:51.069517: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (INJ-NB-126): /proc/driver/nvidia/version does not exist
2021-11-17 17:41:51.069946: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


Model input tensor:  dense_input
Model output tensor:  dense_2/BiasAdd:0
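TensorFlow tensor names have the form `<operation name>:<output index>`, which is why the output tensor above ends in `:0`. As a minimal sketch, the hypothetical helper `split_tensor_name` below separates the two parts; it is plain Python and does not call TensorFlow.

```python
from typing import Optional, Tuple


def split_tensor_name(name: str) -> Tuple[str, Optional[int]]:
    """Split a tensor name into (operation name, output index).

    Hypothetical helper for illustration. Names without a ':' suffix,
    such as 'dense_input', are returned unchanged with index None.
    """
    op_name, sep, index = name.rpartition(":")
    if not sep:
        return name, None
    return op_name, int(index)


# The two names printed by the cell above.
for tensor_name in ["dense_input", "dense_2/BiasAdd:0"]:
    print(tensor_name, "->", split_tensor_name(tensor_name))
```

Stripping the index this way has the same effect as the `split(':')[0]` call used when building the explanation metadata.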


# Deploy the model to AI Explanations

To deploy the model to AI Explanations, you need to generate an `explanation_metadata.json` file and upload it to the Cloud Storage bucket alongside your SavedModel. Then you'll deploy the model using `gcloud`.

## Prepare explanation metadata

The `explanation_metadata.json` file describes your model's inputs, outputs, and baseline. You can use the [Explainable AI SDK](https://pypi.org/project/explainable-ai-sdk/) to generate most of the fields.

The value for `input_baselines` tells the explanations service what the baseline input should be for your model. Here you're using the median for all of your input features. That means the baseline prediction for this model will be the trip duration your model predicts for the median of each feature in your dataset. 

Since this model accepts a single numpy array containing all the numerical features, you can optionally pass an `index_feature_mapping` list to AI Explanations to make the API response easier to parse. When you provide a list of feature names via this parameter, the service returns a key/value mapping of each feature to its corresponding attribution value.
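To illustrate what a name-to-attribution mapping makes possible, the sketch below pairs a flat list of attribution values with this notebook's feature names. The attribution numbers are made up, and the helper `label_attributions` is hypothetical; the service returns its own response format, which is not reproduced here.

```python
from typing import Dict, List, Sequence

# Feature order matches the training columns used in this notebook.
feature_names: List[str] = [
    "start_hr", "weekday", "euclidean", "temp", "dew_point", "max_temp",
]


def label_attributions(names: Sequence[str], values: Sequence[float]) -> Dict[str, float]:
    """Pair each attribution value with its feature name (hypothetical helper)."""
    if len(names) != len(values):
        raise ValueError("expected one attribution value per feature")
    return dict(zip(names, values))


# Made-up attribution values, in the same order as feature_names.
example_values = [1.5, -0.25, 12.0, 0.5, -0.125, 0.0]
labeled = label_attributions(feature_names, example_values)

# Rank features by the magnitude of their attribution.
ranked = sorted(labeled.items(), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranked:
    print(f"{name}: {value}")
```

Without the mapping you would have to keep track of the column order yourself to know which value belongs to which feature.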

In [6]:
with tf.io.gfile.GFile(TRAIN_FILE) as f:
    train_data = pd.read_csv(f)

features = train_data.drop(columns=['duration'])

In [7]:
FEATURE_NAMES = features.columns.tolist()
INPUT_BASELINES = features.median().to_list() 

In [8]:
for f, v in zip(FEATURE_NAMES, INPUT_BASELINES):
    print('{}: {}'.format(f, v))

start_hr: 14.0
weekday: 4.0
euclidean: 1797.8503302971844
temp: 55.1
dew_point: 46.2
max_temp: 62.2


In [9]:
builder = SavedModelMetadataBuilder(EXPORT_PATH)

builder.set_numeric_metadata(
    model.input.name.split(':')[0],
    input_baselines=INPUT_BASELINES,
    index_feature_mapping=FEATURE_NAMES
)

builder.save_metadata(EXPORT_PATH)

Since this is a regression model (predicting a numerical value), the baseline prediction will be the same for every example you send to the model. If this were instead a classification model, each class would have a different baseline prediction.
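A small sketch of why the baseline prediction is constant: it is simply the model's output at the baseline input, which does not depend on the example being explained. The linear `toy_model` and its weights below are hypothetical stand-ins for the deployed network; the baseline values are the medians printed earlier. For a linear model, the Shapley attribution of a feature is its weight times the feature's distance from the baseline, so the attributions add up to the prediction minus the baseline prediction.

```python
from typing import List, Sequence

# Medians printed earlier in this notebook, in training-column order.
baseline: List[float] = [14.0, 4.0, 1797.8503302971844, 55.1, 46.2, 62.2]

# Hypothetical weights and bias: a stand-in for the trained network.
weights: List[float] = [2.0, -1.0, 0.5, 0.1, 0.0, -0.2]
bias = 100.0


def toy_model(x: Sequence[float]) -> float:
    """Linear stand-in for the deployed regression model."""
    return bias + sum(w * v for w, v in zip(weights, x))


def linear_attributions(x: Sequence[float]) -> List[float]:
    """Exact Shapley attributions for a linear model relative to the baseline."""
    return [w * (v - b) for w, v, b in zip(weights, x, baseline)]


# The baseline prediction is computed once and reused for every example.
baseline_prediction = toy_model(baseline)

# Made-up examples to explain.
examples = [
    [8.0, 1.0, 2500.0, 60.0, 50.0, 70.0],
    [18.0, 6.0, 900.0, 40.0, 35.0, 45.0],
]
for x in examples:
    gap = toy_model(x) - baseline_prediction
    print("prediction - baseline:", gap, "sum of attributions:", sum(linear_attributions(x)))
```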

### Create the model

In [10]:
# Create the model if it doesn't exist yet (you only need to run this once)
! gcloud ai-platform models create $MODEL --enable-logging --region=$REGION

Using endpoint [https://us-central1-ml.googleapis.com/]
ERROR: (gcloud.ai-platform.models.create) Resource in projects [formazione-riccardo-zanella] is the subject of a conflict: Field: model.name Error: A model with the same name already exists.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: A model with the same name already exists.
    field: model.name


### Create the model version 

Creating the version takes roughly 5 to 10 minutes. Your first deployment may take longer.

In [11]:
# Create the version with gcloud.
# Sampled Shapley is tuned with --num-paths; --num-integral-steps is meant for
# integrated-gradients. 50 matches the numPaths value reported by `describe`.
explain_method = 'sampled-shapley'
! gcloud beta ai-platform versions create $VERSION --region=$REGION \
--model $MODEL \
--origin $EXPORT_PATH \
--runtime-version 2.1 \
--framework TENSORFLOW \
--python-version 3.7 \
--machine-type n1-standard-4 \
--explanation-method $explain_method \
--num-paths 50

Using endpoint [https://us-central1-ml.googleapis.com/]
Explanations reflect patterns in your model, but don't necessarily reveal fundamental relationships about your data population. See https://cloud.google.com/ml-engine/docs/ai-explanations/limitations for more information.
Creating version (this might take a few minutes)......done.


In [12]:
# Make sure the model deployed correctly. State should be `READY` in the following log
! gcloud ai-platform versions describe $VERSION --region $REGION --model $MODEL

Using endpoint [https://us-central1-ml.googleapis.com/]
createTime: '2021-11-17T16:42:41Z'
deploymentUri: gs://bbs-2021-opml4b-explainability/mdl/tabular_data/tabular_data_20211117_172620/model/
etag: h-pu1mNPSWc=
explanationConfig:
  sampledShapleyAttribution:
    numPaths: 50
framework: TENSORFLOW
isDefault: true
machineType: n1-standard-4
name: projects/formazione-riccardo-zanella/models/bike/versions/v1
pythonVersion: '3.7'
runtimeVersion: '2.1'
state: READY
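
The `numPaths` value in the output above is the number of feature orderings that Sampled Shapley averages over. The sketch below shows the general idea on a hypothetical `interaction_model`; it is an illustration under simplifying assumptions, not the service's implementation. For each random ordering, features are switched one at a time from the baseline value to the example value, and the resulting change in output is credited to the feature that was switched.

```python
import random
from typing import Callable, List, Sequence


def sampled_shapley(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
    num_paths: int,
    seed: int = 0,
) -> List[float]:
    """Estimate Shapley attributions by averaging over random feature orderings."""
    rng = random.Random(seed)
    n = len(x)
    totals = [0.0] * n
    for _ in range(num_paths):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            # Switch feature i from its baseline value to the example value
            # and credit the change in output to that feature.
            current[i] = x[i]
            output = model(current)
            totals[i] += output - previous_output
            previous_output = output
    return [t / num_paths for t in totals]


def interaction_model(v: Sequence[float]) -> float:
    """Hypothetical model with an interaction between features 1 and 2."""
    return 3.0 * v[0] + v[1] * v[2]


example = [1.0, 2.0, 3.0]
zero_baseline = [0.0, 0.0, 0.0]
attributions = sampled_shapley(interaction_model, example, zero_baseline, num_paths=50)
print("attributions:", attributions)
print("sum:", sum(attributions))
print("prediction - baseline:", interaction_model(example) - interaction_model(zero_baseline))
```

Along every ordering the credited changes telescope, so the attributions always sum to the prediction minus the baseline prediction. More paths reduce the sampling noise in how the interaction term is split between features, at the cost of more model evaluations.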
