## Define a simple graph first to get a feel for how tf.profiler works
[TensorFlow official guide](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/profiler/g3doc/profile_model_architecture.md)

[Stack Overflow reference: also lists APIs for Keras](https://stackoverflow.com/questions/45085938/tensorflow-is-there-a-way-to-measure-flops-for-a-model?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa)

In [8]:
import tensorflow as tf

tf.reset_default_graph()

# Case 1: normal save and restore
# Define a simple graph
a = tf.Variable([3.], dtype=tf.float32, name='a')
b = tf.placeholder(tf.float32, shape=(), name='input')
# This graph has three FLOPs: one per multiply op
c = tf.multiply(a, b, name='wawa')
d = tf.multiply(c, c, name='tata')
e = tf.multiply(d, d, name='haha')
init = tf.global_variables_initializer()

# session will bind to the global default graph
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(d, feed_dict={b:2}))
    saver.save(sess, './tmp/model.ckpt')
# Under the directory ./tmp you will now find, among others, these two files:
# model.ckpt.meta : the definition of the graph
# model.ckpt.data-00000-of-00001 : the data (the values of the variables)

[36.]


In [9]:
# Here we count the floating point operations (FLOPs) in the saved graph
tf.reset_default_graph()


tf.train.import_meta_graph('./tmp/model.ckpt.meta')
saver = tf.train.Saver()
with tf.Session() as sess:
    # The session binds to the global default graph
    tf.profiler.profile(
        sess.graph,
        options=tf.profiler.ProfileOptionBuilder.float_operation())
    
    

## The terminal output will look like the following:
Indeed, there are three FLOPs, one for each multiply op in our simple graph.

```
Profile:
node name | # float_ops
_TFProfRoot (--/3 flops)
  haha (1/1 flops)
  tata (1/1 flops)
  wawa (1/1 flops)
```

## We can also count the number of parameters in the graph

Before running the following code, take another look at the graph we defined.

There should be only one parameter, `a`.

In [11]:
# Here we count the trainable parameters in the saved graph
tf.reset_default_graph()


tf.train.import_meta_graph('./tmp/model.ckpt.meta')
saver = tf.train.Saver()
with tf.Session() as sess:
    # The session binds to the global default graph
    parameters = tf.profiler.profile(
        sess.graph,
        options=tf.profiler.ProfileOptionBuilder
        .trainable_variables_parameter())
    print('total parameters: {}'.format(parameters.total_parameters))
    

total parameters: 1


## Now we can define a more complicated network and see if tf.profiler really does its job!

Paper: [Convolutional Neural Networks for Small-footprint Keyword Spotting](https://static.googleusercontent.com/media/research.google.com/zh-TW//pubs/archive/43969.pdf)

The authors list the number of parameters and multiplications for a small model, so we can check
whether tf.profiler reports the same numbers as the paper.

The columns of the table below mean:

m : Height of the kernel (time)

r : Width of the kernel (frequency)

n : Number of output channels (for the conv layers)

p : Pooling size in height

q : Pooling size in width

```
type     m   r   n    p  q  Par.     Mul.
conv     20  8   64   1  3  10.2K    4.4M
conv     10  4   64   1  1  164.8K   5.2M
lin      -   -   32   -  -  65.5K    65.5K
dnn      -   -   128  -  -  4.1K     4.1K
softmax  -   -   4    -  -  0.5K     0.5K
Total    -   -   -    -  -  244.2K   9.7M
```
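The first row of the table can be checked by hand before involving the profiler. The following is a minimal sketch in plain Python, not code from the paper: the helper name `conv_layer_cost` is hypothetical, and the 32 x 40 x 1 input, stride 1, and `VALID` padding are assumptions taken from the convolution cell in this notebook. It counts only the kernel weights (no bias) and one multiplication per weight per output position.

```python
def conv_layer_cost(in_h, in_w, in_ch, m, r, n):
    """Kernel parameters and multiplications of a stride-1, VALID-padded conv.

    Hypothetical helper for illustration; the bias is not counted.
    """
    out_h = in_h - m + 1          # output height with VALID padding
    out_w = in_w - r + 1          # output width with VALID padding
    params = m * r * in_ch * n    # one weight per kernel cell, per channel pair
    mults = out_h * out_w * params  # every weight is used once per output position
    return params, mults


# First conv layer of the table: m=20, r=8, n=64 on a 32 x 40 x 1 input
params, mults = conv_layer_cost(in_h=32, in_w=40, in_ch=1, m=20, r=8, n=64)
print(params)  # 10240   (about 10.2K)
print(mults)   # 4392960 (about 4.4M)
```

Both values round to the `Par.` and `Mul.` entries of the first `conv` row.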

In [21]:
tf.reset_default_graph()

# For simplicity we consider only the first conv layer

# Think of X as 1 example with 32 time frames, 40 mel-band spectral components, and one input channel.
# TF calls this layout the NHWC format.
X = tf.placeholder(tf.float32, [1, 32, 40, 1])
# Kernel height: 20, width: 8, input channels: 1, output channels: 64
W = tf.Variable(tf.random_normal([20, 8, 1, 64]))
b = tf.Variable(tf.random_normal([64]))
conv1 = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding='VALID')
conv1 = tf.nn.bias_add(conv1, b)
conv1 = tf.nn.max_pool(conv1, ksize=[1, 1, 3, 1], strides=[1,1,1,1], padding='VALID')

# Now that the graph is defined, we can count the FLOPs and the number of
# parameters

with tf.Session() as sess:
    # The session binds to the global default graph
    tf.profiler.profile(
        sess.graph,
        options=tf.profiler.ProfileOptionBuilder.float_operation())
    parameters = tf.profiler.profile(
        sess.graph,
        options=tf.profiler.ProfileOptionBuilder
        .trainable_variables_parameter())
    print('total parameters: {}'.format(parameters.total_parameters))
    

# Observe the output of this cell: 10,304 parameters = 10,240 kernel weights (the paper's 10.2K) + 64 biases

total parameters: 10304


## The outputs on the terminal are:

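For FLOPs: the total is 8.91M, about twice the paper's 4.4M, because the profiler counts both additions and multiplications.
Conv2D alone accounts for 8.79M, i.e. roughly 4.4M multiplications plus as many additions; the rest comes from MaxPool, BiasAdd, and the two `random_normal` initializers.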

```
Profile:
node name | # float_ops
_TFProfRoot (--/8.91m flops)
  Conv2D (8.79m/8.79m flops)
  MaxPool (77.38k/77.38k flops)
  BiasAdd (27.46k/27.46k flops)
  random_normal (10.24k/20.48k flops)
    random_normal/mul (10.24k/10.24k flops)
  random_normal_1 (64/128 flops)
    random_normal_1/mul (64/64 flops)
```
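The breakdown above can be reproduced with plain arithmetic. This is a sketch under stated assumptions: the per-op formulas below are inferred from the reported numbers (two operations per multiply-accumulate in `Conv2D`, one per output element in `BiasAdd`, one per pooling-window element in `MaxPool`, one multiply and one add per element in `random_normal`), not taken from the profiler's source.

```python
# Shapes from the cell above: 32 x 40 x 1 input, 20 x 8 kernel, 64 output channels
out_h, out_w, out_ch = 32 - 20 + 1, 40 - 8 + 1, 64   # conv output: 13 x 33 x 64
kernel_size = 20 * 8 * 1                             # weights per output element

# Assumption: each multiply-accumulate counts as two float ops
conv2d = 2 * out_h * out_w * out_ch * kernel_size
# Assumption: one add per conv output element
bias_add = out_h * out_w * out_ch
# Pooling window is 1 x 3 with stride 1, so the width shrinks from 33 to 31;
# assumption: one op per window element per pooled output
pool_w = out_w - 3 + 1
max_pool = out_h * pool_w * out_ch * 3
# Assumption: the initializers cost one multiply and one add per element
random_normal = 2 * (20 * 8 * 1 * 64)
random_normal_1 = 2 * 64

total = conv2d + bias_add + max_pool + random_normal + random_normal_1
print(conv2d)       # 8785920 (8.79m)
print(max_pool)     # 77376   (77.38k)
print(bias_add)     # 27456   (27.46k)
print(total)        # 8911360 (8.91m)
print(conv2d // 2)  # 4392960 multiplications, the paper's 4.4M
```

Under these assumptions every line of the profile is matched, and halving the `Conv2D` count recovers the multiplication count listed in the paper.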

For model parameters:

```
==================Model Analysis Report======================

Doc:
scope: The nodes in the model graph are organized by their names, which is hierarchical like filesystem.
param: Number of parameters (in the Variable).

Profile:
node name | # parameters
_TFProfRoot (--/10.30k params)
  Variable (20x8x1x64, 10.24k/10.24k params)
  Variable_1 (64, 64/64 params)

```
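The parameter report is simply the product of each variable's shape, summed over all trainable variables. A minimal plain-Python sketch, with the shapes copied from the report above (the dictionary `variable_shapes` is an illustrative stand-in, not a profiler structure):

```python
from math import prod

# Shapes as listed in the profiler report
variable_shapes = {
    "Variable": (20, 8, 1, 64),   # conv kernel W
    "Variable_1": (64,),          # bias b
}

# Number of parameters per variable = product of its dimensions
param_counts = {name: prod(shape) for name, shape in variable_shapes.items()}
total_parameters = sum(param_counts.values())

print(param_counts)      # {'Variable': 10240, 'Variable_1': 64}
print(total_parameters)  # 10304
```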
