# Inspecting TFX metadata


## Learning Objectives

1. Use the ML Metadata gRPC server to access and analyze pipeline artifacts stored in the ML Metadata service of your AI Platform Pipelines instance.

In this lab, you will explore TFX pipeline metadata, including pipeline and run artifacts. A hosted **AI Platform Pipelines** instance includes the [ML Metadata](https://github.com/google/ml-metadata) service. In **AI Platform Pipelines**, ML Metadata uses *MySQL* as its database backend and is accessed through a gRPC server.

## Setup

In [1]:
import os
import json

import ml_metadata
import tensorflow_data_validation as tfdv
import tensorflow_model_analysis as tfma


from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2

from tfx.orchestration import metadata
from tfx.types import standard_artifacts

from tensorflow.python.lib.io import file_io

2021-11-11 06:35:05.766008: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nccl2/lib:/usr/local/cuda/extras/CUPTI/lib64
2021-11-11 06:35:05.766066: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


In [2]:
!python -c "import tfx; print('TFX version: {}'.format(tfx.__version__))"
!python -c "import kfp; print('KFP version: {}'.format(kfp.__version__))"

TFX version: 0.25.0
KFP version: 1.0.4


### Explore metadata from existing TFX pipeline runs on your AI Platform Pipelines instance

#### Configure Kubernetes port forwarding

To enable access to the ML Metadata gRPC server, configure Kubernetes port forwarding.

From a JupyterLab terminal, execute the following commands:

```
gcloud container clusters get-credentials cluster-1 --zone us-central1-a
kubectl port-forward service/metadata-grpc-service --namespace default 7000:8080
```
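
Before connecting, it can help to confirm that the forwarded port accepts connections. The following is a minimal sketch using only the Python standard library. The helper name `port_is_open` is hypothetical, and the check only shows that something is listening on the port, not that it is the ML Metadata service.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# 7000 is the local port chosen in the port-forward command above.
print("Port-forward reachable:", port_is_open("localhost", 7000))
```

If this prints `False`, check that the `kubectl port-forward` command is still running in the terminal.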

Proceed to the next step, "Connecting to ML Metadata".

## Connecting to ML Metadata 

### Configure ML Metadata gRPC client

In [3]:
grpc_host = 'localhost'
grpc_port = 7000
connection_config = metadata_store_pb2.MetadataStoreClientConfig()
connection_config.host = grpc_host
connection_config.port = grpc_port

### Connect to ML Metadata service

In [4]:
store = metadata_store.MetadataStore(connection_config)

## Exploring ML Metadata 

The Metadata Store uses the following data model:

- `ArtifactType` describes an artifact's type and its properties that are stored in the Metadata Store. These types can be registered on-the-fly with the Metadata Store in code, or they can be loaded in the store from a serialized format. Once a type is registered, its definition is available throughout the lifetime of the store.
- `Artifact` describes a specific instance of an ArtifactType, and its properties that are written to the Metadata Store.
- `ExecutionType` describes a type of component or step in a workflow, and its runtime parameters.
- `Execution` is a record of a component run or a step in an ML workflow and the runtime parameters. An Execution can be thought of as an instance of an ExecutionType. Every time a developer runs an ML pipeline or step, executions are recorded for each step.
- `Event` is a record of the relationship between an Artifact and an Execution. When an Execution happens, Events record every Artifact that was used by the Execution, and every Artifact that was produced. These records allow for provenance tracking throughout a workflow. By looking at all Events, MLMD knows what Executions happened and what Artifacts were created as a result, and it can recurse back from any Artifact to all of its upstream inputs.
- `ContextType` describes a type of conceptual group of Artifacts and Executions in a workflow, and its structural properties. For example: projects, pipeline runs, experiments, owners.
- `Context` is an instance of a ContextType. It captures the shared information within the group. For example: project name, changelist commit id, experiment annotations. It has a user-defined unique name within its ContextType.
- `Attribution` is a record of the relationship between Artifacts and Contexts.
- `Association` is a record of the relationship between Executions and Contexts.
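
To make the role of `Event` records concrete, here is a toy in-memory sketch of how provenance can be traced back from an artifact to its upstream inputs. It does not use the ML Metadata API. The `Event` dataclass, the `upstream_artifacts` helper, and all ids are hypothetical, and the event kinds are simplified to `"INPUT"` and `"OUTPUT"`.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Event:
    """Simplified stand-in for an MLMD Event linking one artifact to one execution."""
    artifact_id: int
    execution_id: int
    kind: str  # "INPUT" (artifact consumed) or "OUTPUT" (artifact produced)


def upstream_artifacts(artifact_id: int, events: List[Event]) -> Set[int]:
    """Collect the ids of all artifacts that `artifact_id` transitively depends on."""
    inputs_of: Dict[int, Set[int]] = {}     # execution id -> artifacts it consumed
    producers_of: Dict[int, Set[int]] = {}  # artifact id -> executions that produced it
    for event in events:
        if event.kind == "INPUT":
            inputs_of.setdefault(event.execution_id, set()).add(event.artifact_id)
        elif event.kind == "OUTPUT":
            producers_of.setdefault(event.artifact_id, set()).add(event.execution_id)

    seen: Set[int] = set()
    stack = [artifact_id]
    while stack:
        current = stack.pop()
        for execution_id in producers_of.get(current, ()):
            for upstream in inputs_of.get(execution_id, ()):
                if upstream not in seen:
                    seen.add(upstream)
                    stack.append(upstream)
    return seen


# Hypothetical chain: artifact 1 -> execution 10 -> artifact 2 -> execution 11 -> artifact 3
events = [
    Event(1, 10, "INPUT"), Event(2, 10, "OUTPUT"),
    Event(2, 11, "INPUT"), Event(3, 11, "OUTPUT"),
]
print(sorted(upstream_artifacts(3, events)))
```

Here artifact 3 depends directly on artifact 2 and, through it, on artifact 1.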

List the registered artifact types.

In [5]:
for artifact_type in store.get_artifact_types():
    print(artifact_type.name)

Schema
Examples
ExampleStatistics
TransformGraph
TransformCache
ExampleAnomalies
Model
ModelRun
InfraBlessing
ModelBlessing
ModelEvaluation
PushedModel


Display the registered execution types.

In [6]:
for execution_type in store.get_execution_types():
    print(execution_type.name)

DummyExecutionType
components.Bigquery - Query@sha256=9bf58d89d71a29a9e34df0ca4e98f2c7cbf6a6ed5942009491116415b1ffb22e
components.Bigquery - Query@sha256=1e5a0ae7b926683a4059f46ffd576b762fa2e63dae9d2b93cdcc4a5f8968cd88
components.Bigquery - Query@sha256=ed22ab37c95f85c2f702af13dd81fb28bb36672d1d92794692956111cfe71968
components.Submitting a Cloud ML training job as a pipeline step@sha256=a948814528749b7bdf01e231a239c2a6dd6d7f818835497a43376c33fb3f70cd
components.Bigquery - Query@sha256=89b7bd521b8549df20d046b5a7f330443f322cfb982205bc79507064bc96f3e9
components.Bigquery - Query@sha256=c8b905fb8557d0d48ddfe284475168d4ff69167287b2ffb19d57c5a21c30efcc
components.Bigquery - Query@sha256=e23b5894e89c7920c285f8bdddf8e8332b44a436e6a125aa0a4897f53270e594
components.Submitting a Cloud ML training job as a pipeline step@sha256=a7ffa6d36287b0ac25911dc9d7d7fa452129e83cc547327a2e7fcedcee9c6df3
components.Bigquery - Query@sha256=2fee5aba5e450c3a555b71561ca4112104f91bb2f9b8736325d7d8b4e51643d3
compone
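
Many of the execution type names above appear to differ only in the digest that follows `@sha256=`. As a rough sketch, the names can be grouped by the part before the digest to see how many distinct versions of each component were registered. The list below copies a few names from the output. The helper `component_name` is hypothetical, and the naming convention is inferred from the listing rather than from a documented format.

```python
from collections import Counter

# A few execution type names copied from the output above.
execution_type_names = [
    "DummyExecutionType",
    "components.Bigquery - Query@sha256=9bf58d89d71a29a9e34df0ca4e98f2c7cbf6a6ed5942009491116415b1ffb22e",
    "components.Bigquery - Query@sha256=1e5a0ae7b926683a4059f46ffd576b762fa2e63dae9d2b93cdcc4a5f8968cd88",
    "components.Submitting a Cloud ML training job as a pipeline step@sha256=a948814528749b7bdf01e231a239c2a6dd6d7f818835497a43376c33fb3f70cd",
]


def component_name(execution_type_name: str) -> str:
    """Drop the '@sha256=<digest>' suffix, if present."""
    return execution_type_name.split("@sha256=", 1)[0]


version_counts = Counter(component_name(name) for name in execution_type_names)
for name, count in version_counts.items():
    print(f"{count} x {name}")
```

In the notebook, the same grouping could be applied to `[t.name for t in store.get_execution_types()]`.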

List the registered context types.

In [7]:
for context_type in store.get_context_types():
    print(context_type.name)

KfpRun
pipeline
pipeline
run
component_run


## Visualizing TFX artifacts

### Retrieve data analysis and validation artifacts

In [8]:
with metadata.Metadata(connection_config) as store:
    schema_artifacts = store.get_artifacts_by_type(standard_artifacts.Schema.TYPE_NAME)    
    stats_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleStatistics.TYPE_NAME)
    anomalies_artifacts = store.get_artifacts_by_type(standard_artifacts.ExampleAnomalies.TYPE_NAME)

In [9]:
schema_file = os.path.join(schema_artifacts[-1].uri, 'schema.pbtxt')
print("Generated schema file:{}".format(schema_file))

stats_path = stats_artifacts[-1].uri
train_stats_file = os.path.join(stats_path, 'train', 'stats_tfrecord')
eval_stats_file = os.path.join(stats_path, 'eval', 'stats_tfrecord')
print("Train stats file:{}, Eval stats file:{}".format(
    train_stats_file, eval_stats_file))

anomalies_path = anomalies_artifacts[-1].uri
train_anomalies_file = os.path.join(anomalies_path, 'train', 'anomalies.pbtxt')
eval_anomalies_file = os.path.join(anomalies_path, 'eval', 'anomalies.pbtxt')

print("Train anomalies file:{}, Eval anomalies file:{}".format(
    train_anomalies_file, eval_anomalies_file))

Generated schema file:gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/SchemaGen/schema/25/schema.pbtxt
Train stats file:gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/StatisticsGen/statistics/24/train/stats_tfrecord, Eval stats file:gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/StatisticsGen/statistics/24/eval/stats_tfrecord
Train anomalies file:gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/ExampleValidator/anomalies/26/train/anomalies.pbtxt, Eval anomalies file:gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/ExampleValidator/anomalies/26/eval/anomalies.pbtxt
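
The artifact URIs printed above appear to follow a common layout: bucket, pipeline name, run id, component, output name, and a trailing numeric id. The sketch below splits a URI into those parts using only the standard library. The layout is an assumption inferred from these printed paths, not a documented contract, and the `ArtifactLocation` and `parse_artifact_uri` names are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class ArtifactLocation(NamedTuple):
    bucket: str
    pipeline: str
    run_id: str
    component: str
    output_name: str
    trailing_id: str  # numeric path segment; its exact meaning is not confirmed here


def parse_artifact_uri(uri: str) -> ArtifactLocation:
    """Split a gs:// artifact URI into its assumed path components."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    if parsed.scheme != "gs" or len(parts) != 5:
        raise ValueError(f"Unexpected artifact URI layout: {uri}")
    return ArtifactLocation(parsed.netloc, *parts)


# Schema artifact URI taken from the output above (without the file name).
location = parse_artifact_uri(
    "gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/"
    "tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/"
    "SchemaGen/schema/25"
)
print(location.pipeline, location.component, location.output_name)
```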


### Visualize schema

In [10]:
schema = tfdv.load_schema_text(schema_file)
tfdv.display_schema(schema=schema)

| Feature name | Type | Presence | Valency | Domain |
| --- | --- | --- | --- | --- |
| 'Soil_Type' | STRING | required | single | 'Soil_Type' |
| 'Wilderness_Area' | STRING | required | single | 'Wilderness_Area' |
| 'Aspect' | INT | required | single | - |
| 'Cover_Type' | INT | required | single | - |
| 'Elevation' | INT | required | single | - |
| 'Hillshade_3pm' | INT | required | single | - |
| 'Hillshade_9am' | INT | required | single | - |
| 'Hillshade_Noon' | INT | required | single | - |
| 'Horizontal_Distance_To_Fire_Points' | INT | required | single | - |
| 'Horizontal_Distance_To_Hydrology' | INT | required | single | - |




| Domain | Values |
| --- | --- |
| 'Soil_Type' | 'C2702', 'C2703', 'C2704', 'C2705', 'C2706', 'C2717', 'C3501', 'C3502', 'C4201', 'C4703', 'C4704', 'C4744', 'C4758', 'C5101', 'C5151', 'C6101', 'C6102', 'C6731', 'C7101', 'C7102', 'C7103', 'C7201', 'C7202', 'C7700', 'C7701', 'C7702', 'C7709', 'C7710', 'C7745', 'C7746', 'C7755', 'C7756', 'C7757', 'C7790', 'C8703', 'C8707', 'C8708', 'C8771', 'C8772', 'C8776' |
| 'Wilderness_Area' | 'Cache', 'Commanche', 'Neota', 'Rawah' |


### Visualize statistics

#### Exercise: looking at the features visualized below, answer the following questions:

- Which feature transformations would you apply to each feature with TF Transform?
- Are there data quality issues with certain features that may impact your model's performance? How might you address them?

In [11]:
train_stats = tfdv.load_statistics(train_stats_file)
eval_stats = tfdv.load_statistics(eval_stats_file)
tfdv.visualize_statistics(lhs_statistics=eval_stats, rhs_statistics=train_stats,
                          lhs_name='EVAL_DATASET', rhs_name='TRAIN_DATASET')

Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`



### Visualize anomalies

In [12]:
train_anomalies = tfdv.load_anomalies_text(train_anomalies_file)
tfdv.display_anomalies(train_anomalies)

In [14]:
eval_anomalies = tfdv.load_anomalies_text(eval_anomalies_file)
tfdv.display_anomalies(eval_anomalies)

### Retrieve model artifacts

In [21]:
store.get_pipeline_context

<bound method Metadata.get_pipeline_context of <tfx.orchestration.metadata.Metadata object at 0x7f926ebefb90>>

In [24]:
store.search_artifacts(

id: 15
type_id: 40
uri: "gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/Evaluator/evaluation/28"
custom_properties {
  key: "name"
  value {
    string_value: "evaluation"
  }
}
custom_properties {
  key: "producer_component"
  value {
    string_value: "Evaluator"
  }
}
custom_properties {
  key: "state"
  value {
    string_value: "published"
  }
}
state: LIVE
create_time_since_epoch: 1636603281135
last_update_time_since_epoch: 1636603900819
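
The `create_time_since_epoch` and `last_update_time_since_epoch` fields hold milliseconds since the Unix epoch. As a small sketch, they can be converted to timezone-aware datetimes using only the standard library. The helper name `from_epoch_millis` is hypothetical, and the two values are copied from the artifact printed above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(millis: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    # Integer arithmetic via timedelta avoids floating-point rounding.
    return EPOCH + timedelta(milliseconds=millis)


created = from_epoch_millis(1636603281135)
updated = from_epoch_millis(1636603900819)
print("Created:     ", created.isoformat())
print("Last updated:", updated.isoformat())
print("Elapsed:     ", updated - created)
```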

In [15]:
with metadata.Metadata(connection_config) as store:
    model_eval_artifacts = store.get_artifacts_by_type(standard_artifacts.ModelEvaluation.TYPE_NAME)
    hyperparam_artifacts = store.get_artifacts_by_type(standard_artifacts.HyperParameters.TYPE_NAME)

# Use the most recent ModelEvaluation artifact.
model_eval_path = model_eval_artifacts[-1].uri
print("Generated model evaluation result:{}".format(model_eval_path))
# If no HyperParameters artifacts were recorded, hyperparam_artifacts is empty
# and the [-1] index on the next line raises IndexError.
best_hparams_path = os.path.join(hyperparam_artifacts[-1].uri, 'best_hyperparameters.txt')
print("Generated model best hyperparameters result:{}".format(best_hparams_path))

Generated model evaluation result:gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/Evaluator/evaluation/28


IndexError: list index out of range
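
The `IndexError` above is what indexing an empty list with `[-1]` produces. `HyperParameters` also does not appear among the artifact types listed earlier, which suggests that the pipeline runs recorded in this store did not produce any. A guarded lookup avoids the exception. The sketch below uses stand-in objects instead of real ML Metadata artifacts, and the helper `latest_uri` and the example URI are hypothetical.

```python
from types import SimpleNamespace
from typing import Optional, Sequence


def latest_uri(artifacts: Sequence) -> Optional[str]:
    """Return the uri of the last artifact, or None when the sequence is empty."""
    return artifacts[-1].uri if artifacts else None


# Stand-ins for the lists returned by get_artifacts_by_type (hypothetical values).
some_artifacts = [SimpleNamespace(uri="gs://example-bucket/Evaluator/evaluation/28")]
no_artifacts = []

print(latest_uri(some_artifacts))

hparams_uri = latest_uri(no_artifacts)
if hparams_uri is None:
    print("No HyperParameters artifacts found; skipping best_hyperparameters.txt")
```

With such a guard in place, the cells under "Return best hyperparameters" would run only when `hparams_uri` is not `None`.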

In [19]:
model_eval_path

'gs://artifacts.qwiklabs-gcp-04-0ad772141888.appspot.com/tfx_covertype_continuous_training/2447d9b8-83b4-4044-84c5-114feaa12ee0/Evaluator/evaluation/28'

### Return best hyperparameters

In [16]:
# Tuner search space from the latest pipeline run.
json.loads(file_io.read_file_to_string(best_hparams_path))['space']

NameError: name 'best_hparams_path' is not defined

In [17]:
# Best hyperparameter values found by the Tuner in the latest pipeline run.
json.loads(file_io.read_file_to_string(best_hparams_path))['values']

NameError: name 'best_hparams_path' is not defined

### Visualize model evaluations

#### Exercise: review the model evaluation results below and answer the following questions:

- Which Wilderness Area had the highest accuracy?
- Which Wilderness Area had the lowest performance? Why do you think that is? What are some steps you could take to improve your next model runs?

In [18]:
eval_result = tfma.load_eval_result(model_eval_path)
tfma.view.render_slicing_metrics(
    eval_result, slicing_column='Wilderness_Area')

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'Wilderness_Area:Cach…

**Debugging tip**: If the TFMA visualization of the Evaluator results does not render, try viewing it in a Classic Jupyter Notebook. To do so, click `Help > Launch Classic Notebook`, re-open the notebook, and run the above cell again to see the interactive TFMA results.

## License

<font size=-1>Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at [https://www.apache.org/licenses/LICENSE-2.0](https://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.</font>
