This notebook provides a set of utilities to output and inspect
tensors as they pass through the model.



This notebook details the process of identifying and tracking the values of tensors in a given network with an example using Mask RCNN.

To run this notebook on EC2, ssh into your instance with ports forwarded for Jupyter (8888) and TensorBoard (6006), then start both services on the instance:

ssh -i /your/ec2/keypair -L localhost:8888:localhost:8888 -L localhost:6006:localhost:6006 ec2-user@ip
nohup jupyter notebook --no-browser --ip=0.0.0.0 > notebook.log &
tensorboard --logdir ~/logs
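
Once the tunnel is up, it can help to confirm that both forwarded ports answer locally before opening the browser. The following is a minimal sketch using only the Python standard library. The helper name `port_is_open` is hypothetical, and the ports are the ones forwarded above (8888 for Jupyter, 6006 for TensorBoard).

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not open".
        return False


if __name__ == "__main__":
    # Ports taken from the ssh -L flags above.
    for name, port in [("jupyter", 8888), ("tensorboard", 6006)]:
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print("{} on localhost:{} is {}".format(name, port, state))
```

If a port reports as not reachable, the ssh session or the service behind it has most likely not started yet.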

This notebook is broken into three sections. First, we generate a small dataset consisting of a single image from the COCO data. We then look at how to track the tensors within Mask RCNN using that image. Finally, we track the gradients that backpropagate through the network.


In [1]:
import sys
import os
#os.environ['TF_CUDNN_DETERMINISTIC'] = 'true'
os.environ['TENSORPACK_FP16'] = 'true'
import tensorflow as tf
import tqdm
import numpy as np
import tensorpack.utils.viz as tpviz
from tensorpack import *
from tensorpack.tfutils.common import get_tf_version_tuple
sys.path.append('/mask-rcnn-tensorflow/MaskRCNN')
from model.generalized_rcnn import ResNetFPNModel
from config import finalize_configs, config as cfg
from eval import DetectionResult, predict_image, multithread_predict_dataflow, EvalCallback
from performance import ThroughputTracker, humanize_float
from data import get_eval_dataflow, get_train_dataflow, get_batch_train_dataflow

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])


The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.












In [2]:
MODEL = ResNetFPNModel(True)

In [3]:
#cfg.DATA.BASEDIR = '/home/ec2-user/data'
cfg.DATA.BASEDIR = '/data/coco/small_sample/'
#cfg.DATA.BASEDIR = '/home/ec2-user/small_data'
cfg.BACKBONE.WEIGHTS = '/data/coco/pretrained-models/ImageNet-R50-AlignPadding.npz'
#cfg.MODE_FPN=True
#cfg.FPN.NORM = 'GN'
#cfg.TRAIN.BATCH_SIZE_PER_GPU = 4
tf.set_random_seed(cfg.TRAIN.SEED)
fix_rng_seed(cfg.TRAIN.SEED)
np.random.seed(cfg.TRAIN.SEED)

In [4]:
train_dataflow = get_batch_train_dataflow(cfg.TRAIN.BATCH_SIZE_PER_GPU)
finalize_configs(is_training=True)

In train dataflow
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
[1106 10:35:30 @dataset.py:50] Instances loaded from /data/coco/small_sample/annotations/instances_train2017.json.


100%|██████████| 25/25 [00:00<00:00, 4934.48it/s]

[1106 10:35:30 @timer.py:50] Load Load annotations for train2017 finished, time:0.0068sec.
Done loading roidbs
[1106 10:35:30 @data.py:509] Filtered 0 images which contain no non-crowd groudtruth boxes. Total #images for training: 25
Batching roidbs
Done batching roidbs
[1106 10:35:30 @config.py:285] Config: ------------------------------------------
{'BACKBONE': {'FREEZE_AFFINE': False,
              'FREEZE_AT': 2,
              'NORM': 'FreezeBN',
              'RESNET_NUM_BLOCKS': [3, 4, 6, 3],
              'STRIDE_1X1': False,
              'TF_PAD_MODE': False,
              'WEIGHTS': '/data/coco/pretrained-models/ImageNet-R50-AlignPadding.npz'},
 'DATA': {'BASEDIR': '/data/coco/small_sample/',
          'CLASS_NAMES': ['BG', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
                          'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign',
                          'parking meter', 'bench', 'bird', 'cat', 'do




In [5]:
session_init = get_model_loader(cfg.BACKBONE.WEIGHTS)




In [6]:
traincfg = TrainConfig(
            model=MODEL,
            data=QueueInput(train_dataflow),
            steps_per_epoch=20,
            max_epoch=1,
            session_init=session_init,
            session_config=None,
            starting_epoch=cfg.TRAIN.STARTING_EPOCH
        )






At this point, we have our data and the model is set up. To export a tensor from the graph, we need to wrap the tensor we want to track in a print call. For example, say we want to export the second-to-last layer of the backbone network (c4). Add this import to the top of the backbone.py file:

```from performance import print_runtime_tensor, print_runtime_tensor_loose_branch```

Then, just after the c4 tensor is created in the network, add this line:

```c4 = print_runtime_tensor("tensor_c4_forward", c4)```

Similarly, we may want to output a list of tensors, such as the full output of the backbone (p23456). We can use something like:

```p23456 = [print_runtime_tensor("tensor_p23456_{}_forward".format(i), j) for i,j in enumerate(p23456)]```
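
One detail of the comprehension above is easy to miss: the number in each tag is the list position, not the pyramid level. The sketch below shows this in plain Python. `print_and_return` is a hypothetical toy stand-in for `print_runtime_tensor`, and the five placeholder strings assume that `p23456` holds one entry per level from p2 to p6.

```python
def print_and_return(tag, value):
    """Toy pass-through: print the tagged value, then hand it back unchanged."""
    print("{}: {}".format(tag, value))
    return value


# Placeholder strings stand in for the five FPN output tensors (an assumption).
p23456 = ["p2", "p3", "p4", "p5", "p6"]

# Same tag format as above: the index comes from enumerate, so it starts at 0.
tags = ["tensor_p23456_{}_forward".format(i) for i, _ in enumerate(p23456)]
p23456 = [print_and_return(tag, j) for tag, j in zip(tags, p23456)]
```

Under that assumption, the p2 output is tagged `tensor_p23456_0_forward` and the p6 output `tensor_p23456_4_forward`, which matters when searching the captured output later.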

On the other hand, we might want to see a tensor that isn't used later in the graph, which means it would not normally execute and so would never be printed. This can be done using

```print_runtime_tensor_loose_branch```

For this, you need a downstream trigger tensor to force the print of the tensor of interest. Say we have a tensor `t5` that isn't used in the graph, but `t1` is. We can print `t5` with:

```t1 = print_runtime_tensor_loose_branch("tensor_t5_forward", t5, trigger_tensor=t1)```
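
To see why a trigger tensor is needed, recall that graph execution only runs the nodes that the fetched outputs depend on. The toy model below is plain Python, not TensorFlow, and every name in it (`Node`, `with_trigger`) is hypothetical. It only illustrates the idea that attaching the unused node as an extra dependency of a used node forces it to run. How `print_runtime_tensor_loose_branch` achieves this internally is not shown in this notebook.

```python
class Node:
    """A lazily evaluated graph node (toy stand-in for a tensor)."""

    def __init__(self, name, fn, inputs=(), control_deps=()):
        self.name = name
        self.fn = fn
        self.inputs = tuple(inputs)
        self.control_deps = tuple(control_deps)

    def run(self, cache):
        # Each node is evaluated at most once per run.
        if self.name in cache:
            return cache[self.name]
        # Control dependencies run first; their values are not passed to fn.
        for dep in self.control_deps:
            dep.run(cache)
        args = [node.run(cache) for node in self.inputs]
        cache[self.name] = self.fn(*args)
        return cache[self.name]


def with_trigger(side_branch, trigger):
    """Return a copy of `trigger` that also forces `side_branch` to run."""
    return Node(trigger.name, trigger.fn, trigger.inputs,
                trigger.control_deps + (side_branch,))


x = Node("x", lambda: 2.0)
t1 = Node("t1", lambda v: v + 1.0, inputs=[x])
t5 = Node("t5", lambda v: v * 10.0, inputs=[x])  # nothing consumes t5

plain_cache = {}
t1.run(plain_cache)
print(list(plain_cache))  # t5 never ran

t1_triggered = with_trigger(t5, t1)
triggered_cache = {}
t1_triggered.run(triggered_cache)
print(list(triggered_cache))  # t5 ran because t1 now depends on it
```

The value of `t1` is the same in both runs; only the set of executed nodes changes, which is the behaviour we want from a print that must not alter the computation.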

Finally, say we want to print the gradients of the backward pass. This is a little more complicated. Add this gradient printer class to the generalized_rcnn.py file:

```
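# Requires: from performance import print_runtime_tensor
class GradientPrinter(tf.train.Optimizer):
    """Wraps an optimizer and prints every gradient before it is applied."""
    def __init__(self, opt, name="GradientPrinter"):
        # tf.train.Optimizer expects use_locking and name.
        super(GradientPrinter, self).__init__(False, name)
        self.opt = opt
    def compute_gradients(self, *args, **kwargs):
        return self.opt.compute_gradients(*args, **kwargs)
    def apply_gradients(self, gradvars, global_step=None, name=None):
        grads, variables = zip(*gradvars)
        for var in variables:
            print("gradient_name: {}".format(var.name))
        # Route each gradient through the printer; leave missing gradients (None) untouched.
        grads = [g if g is None else print_runtime_tensor("tensor_{}_backward".format(v.name), g)
                 for v, g in zip(variables, grads)]
        return self.opt.apply_gradients(list(zip(grads, variables)), global_step, name)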
```

Inside the detection model class, wrap the optimizer in the gradient printer so that every gradient passes through it.

```
opt = tf.train.MomentumOptimizer(lr, 0.9)
opt = GradientPrinter(opt)
```
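
The wrapper follows a simple delegation pattern: every call is forwarded to the wrapped optimizer, and only `apply_gradients` adds behaviour on the way through. The sketch below reproduces that pattern in plain Python so it can be run without TensorFlow. `ToySGD` and `RecordingOptimizer` are hypothetical stand-ins, variables are dictionary keys, and gradients are plain floats rather than tensors.

```python
class ToySGD:
    """Minimal stand-in optimizer: variables are a dict of name -> float."""

    def __init__(self, variables, lr):
        self.variables = variables
        self.lr = lr

    def apply_gradients(self, gradvars):
        for grad, name in gradvars:
            if grad is not None:  # variables without a gradient are skipped
                self.variables[name] -= self.lr * grad


class RecordingOptimizer:
    """Forwards to a wrapped optimizer and records each gradient it sees."""

    def __init__(self, opt):
        self.opt = opt
        self.seen = []

    def apply_gradients(self, gradvars):
        gradvars = list(gradvars)
        for grad, name in gradvars:
            # Same tag format as the GradientPrinter above.
            self.seen.append(("tensor_{}_backward".format(name), grad))
        # The update itself is left entirely to the wrapped optimizer.
        return self.opt.apply_gradients(gradvars)


weights = {"w": 1.0, "b": 0.5}
opt = RecordingOptimizer(ToySGD(weights, lr=0.1))
opt.apply_gradients([(2.0, "w"), (None, "b")])
print(weights)
print(opt.seen)
```

Because the wrapper never touches the update rule, it can be added or removed without changing training behaviour, which is what makes it convenient for debugging.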

Once the print calls have been added, run the cells below. The `%%capture` magic function catches the printed output and stores it in `cap_out`.

In [7]:
trainer = SimpleTrainer()

In [8]:
%%capture cap_out --no-stderr
launch_train_with_config(traincfg, trainer)



[1106 10:35:31 @input_source.py:222] Setting up the queue 'QueueInput/input_queue' for CPU prefetching ...

[1106 10:35:31 @trainers.py:49] Building graph for a single training tower ...
[1106 10:35:31 @registry.py:127] conv0 input: [None, 3, None, None]

[1106 10:35:32 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
[1106 10:35:32 @registry.py:135] conv0 output: [None, 64, None, None]
[1106 10:35:32 @registry.py:127] pool0 input: [None, 64, None, None]

[1106 10:35:32 @registry.py:135] pool0 output: [None, 64, None, None]
[1106 10:35:32 @registry.py:127] group0/block0/conv1 input: [None, 64, None, None]
[1106 10:35:32 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:32 @regist

[1106 10:35:33 @registry.py:127] group1/block3/conv1 input: [None, 512, None, None]
[1106 10:35:33 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:33 @registry.py:135] group1/block3/conv1 output: [None, 128, None, None]
[1106 10:35:33 @registry.py:127] group1/block3/conv2 input: [None, 128, None, None]
[1106 10:35:33 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:33 @registry.py:135] group1/block3/conv2 output: [None, 128, None, None]
[1106 10:35:33 @registry.py:127] group1/block3/conv3 input: [None, 128, None, None]
[1106 10:35:33 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:33 @registry.py:135] group1/block3/conv3 output: [None, 512, None, None]
[1106 10:35:33 @registry.py:127] group2/block0/conv1 input: [None, 512

[1106 10:35:35 @registry.py:135] group3/block1/conv1 output: [None, 512, None, None]
[1106 10:35:35 @registry.py:127] group3/block1/conv2 input: [None, 512, None, None]
[1106 10:35:35 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:35 @registry.py:135] group3/block1/conv2 output: [None, 512, None, None]
[1106 10:35:35 @registry.py:127] group3/block1/conv3 input: [None, 512, None, None]
[1106 10:35:35 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:35 @registry.py:135] group3/block1/conv3 output: [None, 2048, None, None]
[1106 10:35:35 @registry.py:127] group3/block2/conv1 input: [None, 2048, None, None]
[1106 10:35:35 @batch_norm.py:174] WRN [BatchNorm] Using moving_mean/moving_variance in training.
[1106 10:35:35 @registry.py:135] group3/block2/conv1 output: [None, 

INFO:tensorflow:Summary name mask_truth|pred is illegal; using mask_truth_pred instead.
[1106 10:35:39 @regularize.py:97] regularize_cost() found 63 variables to regularize.
[1106 10:35:39 @regularize.py:22] The following tensors will be regularized: group1/block0/conv1/W:0, group1/block0/conv2/W:0, group1/block0/conv3/W:0, group1/block0/convshortcut/W:0, group1/block1/conv1/W:0, group1/block1/conv2/W:0, group1/block1/conv3/W:0, group1/block2/conv1/W:0, group1/block2/conv2/W:0, group1/block2/conv3/W:0, group1/block3/conv1/W:0, group1/block3/conv2/W:0, group1/block3/conv3/W:0, group2/block0/conv1/W:0, group2/block0/conv2/W:0, group2/block0/conv3/W:0, group2/block0/convshortcut/W:0, group2/block1/conv1/W:0, group2/block1/conv2/W:0, group2/block1/conv3/W:0, group2/block2/conv1/W:0, group2/block2/conv2/W:0, group2/block2/conv3/W:0, group2/block3/conv1/W:0, group2/block3/conv2/W:0, group2/block3/conv3/W:0, group2/block4/conv1/W:0, group2/block4/conv2/W:0, group2/block4/con

  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


[1106 10:35:44 @monitor.py:259] WRN logger directory was not set. Ignore TFEventWriter.
[1106 10:35:44 @monitor.py:300] WRN logger directory was not set. Ignore JSONWriter.
[1106 10:35:44 @model_utils.py:66] Trainable Variables: 
name                                   shape                    dim
-------------------------------------  ------------------  --------
group1/block0/conv1/W:0                [1, 1, 256, 128]       32768
group1/block0/conv1/bn/gamma:0         [128]                    128
group1/block0/conv1/bn/beta:0          [128]                    128
group1/block0/conv2/W:0                [3, 3, 128, 128]      147456
group1/block0/conv2/bn/gamma:0         [128]                    128
group1/block0/conv2/bn/beta:0          [128]                    128
group1/block0/conv3/W:0                [1, 1, 128, 512]       65536
group1/block0/conv3/bn/gamma:0         [512]                    512
group1/block0/conv3/bn/beta:

[1106 10:35:44 @base.py:210] Setup callbacks graph ...

[1106 10:35:44 @argtools.py:148] WRN "import prctl" failed! Install python-prctl so that processes can be cleaned with guarantee.
[1106 10:35:44 @summary.py:48] [MovingAverageSummary] 27 operations in collection 'MOVING_SUMMARY_OPS' will be run with session hooks.
[1106 10:35:44 @summary.py:95] Summarizing collection 'summaries' of size 30.

[1106 10:35:44 @base.py:231] Creating the session ...
[1106 10:35:53 @base.py:237] Initializing the session ...
[1106 10:35:53 @sessinit.py:206] Variables to restore from dict: group2/block5/conv1/W:0, group1/block2/conv1/bn/variance/EMA:0, group3/block2/conv3/bn/mean/EMA:0, group2/block3/conv2/bn/gamma:0, group3/block1/conv3/bn/mean/EMA:0, group1/block2/conv3/bn/beta:0, group3/block2/conv2/bn/beta:0, group0/block1/conv3/bn/mean/EMA:0, group0/block1/conv1/W:0, group1/block3/conv1/bn/variance/EMA:0, group2/block1/conv1/

[1106 10:35:53 @sessinit.py:89] WRN The following variables are in the graph, but not found in the dict: fastrcnn/fc6/W, fastrcnn/fc6/b, fastrcnn/fc7/W, fastrcnn/fc7/b, fastrcnn/outputs/box/W, fastrcnn/outputs/box/b, fastrcnn/outputs/class/W, fastrcnn/outputs/class/b, fpn/lateral_1x1_c2/W, fpn/lateral_1x1_c2/b, fpn/lateral_1x1_c3/W, fpn/lateral_1x1_c3/b, fpn/lateral_1x1_c4/W, fpn/lateral_1x1_c4/b, fpn/lateral_1x1_c5/W, fpn/lateral_1x1_c5/b, fpn/posthoc_3x3_p2/W, fpn/posthoc_3x3_p2/b, fpn/posthoc_3x3_p3/W, fpn/posthoc_3x3_p3/b, fpn/posthoc_3x3_p4/W, fpn/posthoc_3x3_p4/b, fpn/posthoc_3x3_p5/W, fpn/posthoc_3x3_p5/b, global_step, learning_rate, maskrcnn/conv/W, maskrcnn/conv/b, maskrcnn/deconv/W, maskrcnn/deconv/b, maskrcnn/fcn0/W, maskrcnn/fcn0/b, maskrcnn/fcn1/W, maskrcnn/fcn1/b, maskrcnn/fcn2/W, maskrcnn/fcn2/b, maskrcnn/fcn3/W, maskrcnn/fcn3/b, rpn/box/W, rpn/box/b, rpn/class/W, rpn/class/b, rpn/conv0/W, rpn/conv0/b
[1106 10:35:53 @sessinit.py:89]

100%|##########|20/20[00:26<00:00, 0.75it/s]

[1106 10:36:46 @base.py:286] Epoch 1 (global_step 20) finished, time:26.7 seconds.
[1106 10:36:46 @monitor.py:469] QueueInput/queue_size: 50
[1106 10:36:46 @monitor.py:469] boxclass_losses/box_loss: 0.11954
[1106 10:36:46 @monitor.py:469] boxclass_losses/label_loss: 0.842
[1106 10:36:46 @monitor.py:469] boxclass_losses/label_metrics/accuracy: 0.91671
[1106 10:36:46 @monitor.py:469] boxclass_losses/label_metrics/false_negative: 0.97059
[1106 10:36:46 @monitor.py:469] boxclass_losses/label_metrics/fg_accuracy: 0
[1106 10:36:46 @monitor.py:469] boxclass_losses/num_fg_label: 27.88
[1106 10:36:46 @monitor.py:469] learning_rate: 0.003
[1106 10:36:46 @monitor.py:469] maskrcnn_loss/accuracy: 0.52806
[1106 10:36:46 @monitor.py:469] maskrcnn_loss/fg_pixel_ratio: 0.55169
[1106 10:36:46 @monitor.py:469] maskrcnn_loss/maskrcnn_loss: 0.82215
[1106 10:36:46 @monitor.py:469] maskrcnn_lo




In [9]:
cap_out.stdout

"Use channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse channels_first data format\nUse cha

[1106 10:36:46 @input_source.py:178] EnqueueThread QueueInput/input_queue Exited.
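
The captured string mixes the tensor records with unrelated messages such as the repeated "Use channels_first data format" lines, so the records of interest have to be filtered out. Below is a minimal sketch of one way to do that. It assumes only that each record appears on a line containing the tag passed to the print function. The helper `lines_with_tag` is hypothetical, and the `sample` string is a made-up stand-in for `cap_out.stdout`, since the real record layout is not shown here.

```python
def lines_with_tag(captured, tag):
    """Return the lines of `captured` that contain `tag`."""
    return [line for line in captured.splitlines() if tag in line]


# Made-up stand-in for cap_out.stdout; the real record layout may differ.
sample = (
    "Use channels_first data format\n"
    "tensor_c4_forward <placeholder record>\n"
    "Use channels_first data format\n"
    "tensor_c4_forward <placeholder record>\n"
)

c4_lines = lines_with_tag(sample, "tensor_c4_forward")
print(len(c4_lines))
```

In the notebook itself, the equivalent call would be `lines_with_tag(cap_out.stdout, "tensor_c4_forward")`, using whichever tag was chosen when the print call was added.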
