# Introduction and Purpose

Our framework is primarily intended for testing OpenCL implementations of common machine learning operations and, possibly, of hand-written networks. For now, we treat them as _pure functions_ whose return value depends solely on the arguments. To test them, we gather a set of input-output pairs, feed in the inputs, and compare the results with the corresponding expected outputs.
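
As a minimal sketch of this testing scheme, assuming NumPy arrays as the data representation: the helper below runs a function on each recorded input and compares the result with the recorded output within a tolerance. The names `check_against_reference` and `relu_under_test`, as well as the tolerance values, are hypothetical and not part of the framework; in practice the function under test would wrap an OpenCL kernel launch.

```python
import numpy as np

def check_against_reference(fn, cases, rtol=1e-5, atol=1e-6):
    """Return the indices of the cases where fn(*inputs) differs from the expected output."""
    failures = []
    for i, (inputs, expected) in enumerate(cases):
        actual = np.asarray(fn(*inputs))
        # Shapes must match exactly; values only up to a floating-point tolerance.
        if actual.shape != expected.shape or not np.allclose(actual, expected, rtol=rtol, atol=atol):
            failures.append(i)
    return failures

# Stand-in for an implementation under test (hypothetical).
def relu_under_test(x):
    return np.maximum(x, 0.0)

x = np.array([[-1.0, 2.0], [3.0, -4.0]], dtype=np.float32)
cases = [((x,), np.array([[0.0, 2.0], [3.0, 0.0]], dtype=np.float32))]
failures = check_against_reference(relu_under_test, cases)
```

An empty `failures` list means every case matched; a non-empty one identifies the input-output pairs worth inspecting.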

Gathering data presents the biggest challenge: the data needs to be both correct and varied, in dimensions as well as in content. Accumulating it by hand is error-prone and time-consuming; so is writing generators (essentially, host-based implementations of the target operations).

Instead, we choose to extract inputs and outputs from a TensorFlow _computational graph_. Its primitive operations are well-tested, and as such can be used to verify new implementations. Moreover, data can be dumped for a subgraph instead of a single node, allowing fused operations (e.g. batch normalization and ReLU activation combined) to be tested as well.
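
To illustrate the fused case with a sketch in NumPy rather than TensorFlow: the expected output of a fused batch normalization and ReLU kernel is simply the value at the end of the two-node subgraph, so no intermediate result needs to be recorded. All function names and the epsilon value below are assumptions made for illustration.

```python
import numpy as np

def batch_norm(x, mean, variance, scale, offset, eps=1e-3):
    # Inference-mode batch normalization over the last (channel) axis.
    return scale * (x - mean) / np.sqrt(variance + eps) + offset

def relu(x):
    return np.maximum(x, 0.0)

def fused_batch_norm_relu(x, mean, variance, scale, offset, eps=1e-3):
    # Hypothetical stand-in for a fused kernel: both steps in a single pass.
    inv_std = scale / np.sqrt(variance + eps)
    return np.maximum(x * inv_std + (offset - mean * inv_std), 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 4, 3))
mean, offset = rng.standard_normal(3), rng.standard_normal(3)
variance, scale = rng.uniform(0.5, 2.0, 3), rng.uniform(0.5, 2.0, 3)

# Reference: the output of the last node of the unfused subgraph.
expected = relu(batch_norm(x, mean, variance, scale, offset))
actual = fused_batch_norm_relu(x, mean, variance, scale, offset)
```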

# Importing a Graph

We envision that users would want to work with existing models, such as the [officially supported ones](https://github.com/tensorflow/models). This is achieved by loading a computational graph from a saved [checkpoint](https://www.tensorflow.org/guide/checkpoints):

In [1]:
import tensorflow as tf

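def load_graph_from_checkpoint(sess, chkpt_dir):
  # Path prefix of the most recent checkpoint, or None if the directory has none.
  latest_chkpt = tf.train.latest_checkpoint(chkpt_dir)
  if latest_chkpt is None:
    raise FileNotFoundError(f'no checkpoint found in {chkpt_dir}')

  # Rebuild the graph from the .meta file, then restore the saved variable values.
  saver = tf.train.import_meta_graph(f'{latest_chkpt}.meta')
  saver.restore(sess, latest_chkpt)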

# Dumping Inputs and Outputs

In [2]:
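def run_model_and_extract(sess, inputs, output_node_names):
  # Resolve names such as 'scope/op:0' to tensors in the session's graph.
  outputs = [sess.graph.get_tensor_by_name(n) for n in output_node_names]
  # Evaluate all requested tensors in a single run, feeding the given input values.
  return sess.run(outputs, feed_dict=inputs)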

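Note that `inputs` is a dictionary mapping tensor names to values (e.g. NumPy arrays), and it must provide input values for the _whole model_. For testing purposes, the values can be randomly generated as long as we know the tensor names. Here, those are obtained from `tf.report_uninitialized_variables()`, which returns the names of the variables that are still uninitialized in the session; appending `:0` to a variable name yields the name of its output tensor.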

In [3]:
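def random_inputs(sess):
  # Variable names come back as bytes; ':0' selects each variable's output tensor.
  input_tensor_names = [f'{str(v, "utf8")}:0' for v in sess.run(tf.report_uninitialized_variables())]
  # Draw uniform random values in [0, 1) of the matching shape for every such tensor.
  return {n: sess.run(tf.random_uniform(tf.get_default_graph().get_tensor_by_name(n).shape))
          for n in input_tensor_names}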
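
Each call to `tf.random_uniform` adds a new operation to the graph. If that is undesirable, or if reproducible data is needed, the values can be drawn with NumPy instead. This is only a sketch, under the assumption that the tensor names and shapes are already known; `random_feed_dict` and the example names are hypothetical.

```python
import numpy as np

def random_feed_dict(shapes, seed=0):
    """Map each tensor name to a float32 array of uniform random values in [0, 1)."""
    rng = np.random.default_rng(seed)
    return {name: rng.random(shape, dtype=np.float32) for name, shape in shapes.items()}

# Hypothetical tensor names and shapes, for illustration only.
shapes = {'x:0': (2, 4, 4, 3), 'y:0': (2,)}
feed = random_feed_dict(shapes)
```

Because the generator is seeded, calling `random_feed_dict` again with the same arguments returns the same values, which makes a failing test case easy to reproduce.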

# Example

The checkpoint used as an example was created using [TensorFlow benchmarking scripts](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks#tf_cnn_benchmarks-high-performance-benchmarks):

```
CHKPT_DIR='../../Documents/resnet50v1_traindir'

python3 tf_cnn_benchmarks.py --model=resnet50 --data_format=NHWC --batch_size=8 --num_batches=1 \
  --train_dir=${CHKPT_DIR} --trace_file=${CHKPT_DIR}/trace --tfprof_file=${CHKPT_DIR}/profile \
  --summary_verbosity=3 --save_summaries_steps=30 \
  --device=cpu --local_parameter_device=cpu --all_reduce_spec=pscpu
```

In [4]:
sess = tf.Session()
load_graph_from_checkpoint(sess, '../../Documents/resnet50v1_traindir')

INFO:tensorflow:Restoring parameters from ../../Documents/resnet50v1_traindir/model.ckpt-11


In [5]:
target_output_names = ['v/tower_0/cg/resnet_v10/add:0', 'v/tower_0/cg/resnet_v10/Relu:0']

inputs = random_inputs(sess)
add_output, relu_output = run_model_and_extract(sess, inputs, target_output_names)

In [6]:
print(add_output.shape, relu_output.shape)

(8, 56, 56, 256) (8, 56, 56, 256)
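
To reuse the extracted data outside the notebook, the input-output pairs can be written to disk. The sketch below uses NumPy's `.npz` archive; the file layout, the key names and the helpers `dump_case` and `load_case` are assumptions, not a format defined by the framework. Small stand-in arrays replace the real tensors so that the snippet runs on its own.

```python
import os
import tempfile
import numpy as np

def dump_case(path, inputs, outputs):
    """Store inputs and outputs (dicts of name -> array) in one .npz archive."""
    arrays = {}
    # Tensor names contain '/' and ':', so index-based keys are used for the
    # arrays and the original names are stored alongside them.
    for kind, tensors in (('in', inputs), ('out', outputs)):
        names = list(tensors)
        arrays[f'{kind}_names'] = np.array(names)
        for i, name in enumerate(names):
            arrays[f'{kind}_{i}'] = np.asarray(tensors[name])
    np.savez(path, **arrays)

def load_case(path):
    """Inverse of dump_case: return (inputs, outputs) as dicts of name -> array."""
    with np.load(path) as data:
        return tuple(
            {str(name): data[f'{kind}_{i}'] for i, name in enumerate(data[f'{kind}_names'])}
            for kind in ('in', 'out'))

# Stand-in data (hypothetical names and values).
inputs = {'x:0': np.ones((2, 3), dtype=np.float32)}
outputs = {'scope/add:0': np.full((2, 3), 2.0, dtype=np.float32)}

path = os.path.join(tempfile.mkdtemp(), 'case.npz')
dump_case(path, inputs, outputs)
loaded_inputs, loaded_outputs = load_case(path)
```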
