# Summary of models from the Deep Learning Benchmarking Suite

The full list of models is [here](https://hewlettpackard.github.io/dlcookbook-dlbs/#/models/models?id=models). At a high level:
1. FLOPs are counted as multiply-add operations in dense and convolutional layers.
2. The algorithm for estimating memory requirements is very naive and will be updated.
3. Memory for training is approximately twice the inference memory.
4. FLOPs for the backward pass and for training are derived from the forward pass:
   ```
   gFLOPs(backward) = 2 * gFLOPs(forward)
   gFLOPs(training) = 3 * gFLOPs(forward)
   ```
5. FLOPs and memory are reported for one instance (batch size 1).
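
The rules of thumb in this list can be written as a small helper. This is a minimal sketch, not the suite's own code: the function name `training_cost` and its arguments are hypothetical, and the factors 2 and 3 are simply the approximations stated above. The memory factor is applied to activation memory here, which is an assumption based on how the summary tables report it.

```python
def training_cost(forward_gflops, inference_activations_mb):
    """Apply the rules of thumb to per-instance (batch size 1) inference numbers."""
    backward_gflops = 2 * forward_gflops                  # backward pass ~ 2x forward
    training_gflops = forward_gflops + backward_gflops    # forward + backward = 3x forward
    training_activations_mb = 2 * inference_activations_mb  # training memory ~ 2x inference
    return training_gflops, training_activations_mb


# Inference numbers of the English acoustic model (TOTAL row of its table).
gflops, act_mb = training_cost(0.03466, 0.149616)
print(round(gflops, 5), round(act_mb, 6))
```

The result is close to the training figures reported for this model in the summary; small differences in the last digit come from rounding of the printed inference values.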

Jump to the [summary](#Summary), or see below for per-model details.

**Models**
1.  [English acoustic model](#English-acoustic-model)
2.  [AlexNet](#AlexNet)
3.  [AlexNet OWT](#AlexNetOWT)
4.  [Deep MNIST](#DeepMNIST)
5.  [VGG-11](#VGG11)
6.  [VGG-13](#VGG13)
7.  [VGG-16](#VGG16)
8.  [VGG-19](#VGG19)
9.  [Overfeat](#Overfeat)
10. [ResNet-18](#ResNet18)
11. [ResNet-34](#ResNet34)
12. [ResNet-50](#ResNet50)
13. [ResNet-101](#ResNet101)
14. [ResNet-152](#ResNet152)
15. [ResNet-200](#ResNet200)
16. [ResNet-269](#ResNet269)


In [2]:
from nns.nns import (estimate, printable_dataframe)
from nns.models import dlbs as models

In [3]:
# Inference and training model summaries: name, shapes, parameters, gFLOPs, activations. Each element is a
# dictionary. These lists are later converted into pandas data frames and printed.
inference = []
training = []

### [English acoustic model](http://ethereon.github.io/netscope/#/gist/10f5dee56b6f7bbb5da26749bd37ae16)

In [4]:
estimate(models.EnglishAcousticModel(), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(540,)",0.0,0,540,0.0,0.00216
1,dense1,"(2048,)",0.001106,1107968,4096,4.431872,0.016384
2,dense2,"(2048,)",0.004194,4196352,4096,16.785408,0.016384
3,dense3,"(2048,)",0.004194,4196352,4096,16.785408,0.016384
4,dense4,"(2048,)",0.004194,4196352,4096,16.785408,0.016384
5,dense5,"(2048,)",0.004194,4196352,4096,16.785408,0.016384
6,dense6,"(8192,)",0.016777,16785408,16384,67.141632,0.065536
7,TOTAL,"(8192,)",0.03466,34678784,37404,138.715136,0.149616
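
The `gFLOPs`, `num_params` and `params_mem (MB)` columns for a dense layer can be reproduced by hand. The sketch below assumes a fully connected layer with a bias term, one multiply-add per weight, and 4 bytes per FP32 parameter; `dense_layer_cost` is a hypothetical helper, not part of `nns`.

```python
BYTES_PER_FP32 = 4


def dense_layer_cost(num_inputs, num_units):
    """Per-instance cost of a dense layer with bias (assumed layout)."""
    macs = num_inputs * num_units                     # one multiply-add per weight
    num_params = num_inputs * num_units + num_units   # weights + biases
    return {
        "gFLOPs": macs / 1e9,
        "num_params": num_params,
        "params_mem (MB)": num_params * BYTES_PER_FP32 / 1e6,
    }


dense1 = dense_layer_cost(540, 2048)   # input (540,) -> dense1 (2048,)
print(dense1)
```

For `dense1` this gives 1107968 parameters and about 0.001106 gFLOPs, matching the table row. The `num_activations` column is not modelled here.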


### [AlexNet](http://ethereon.github.io/netscope/#/gist/5c94a074f4e4ac4b81ee28a796e04b5d)

In [5]:
estimate(models.AlexNet(), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(227, 227, 3)",0.0,0,154587,0.0,0.618348
1,conv1,"(55, 55, 96)",0.105415,34944,580800,0.139776,2.3232
2,max_pooling2d,"(27, 27, 96)",0.0,0,69984,0.0,0.279936
3,conv2,"(27, 27, 256)",0.447898,614656,373248,2.458624,1.492992
4,max_pooling2d_1,"(13, 13, 256)",0.0,0,43264,0.0,0.173056
5,conv3,"(13, 13, 384)",0.14952,885120,129792,3.54048,0.519168
6,conv4,"(13, 13, 384)",0.224281,1327488,129792,5.309952,0.519168
7,conv5,"(13, 13, 256)",0.14952,884992,86528,3.539968,0.346112
8,max_pooling2d_2,"(6, 6, 256)",0.0,0,9216,0.0,0.036864
9,flatten,"(9216,)",0.0,0,9216,0.0,0.036864
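
The convolutional rows follow the same multiply-add counting. The sketch below is an illustration under stated assumptions: the kernel sizes (11x11 for `conv1`, 5x5 for `conv2`) do not appear in the table and are taken from the usual AlexNet definition, no grouped convolutions are assumed, and `conv2d_cost` is a hypothetical helper.

```python
def conv2d_cost(out_h, out_w, out_channels, kernel_h, kernel_w, in_channels, use_bias=True):
    """Per-instance multiply-adds (in gFLOPs) and parameter count of a 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    macs = out_h * out_w * weights          # every output position applies all weights once
    num_params = weights + (out_channels if use_bias else 0)
    return macs / 1e9, num_params


# conv1: input (227, 227, 3) -> output (55, 55, 96), assumed 11x11 kernel
conv1 = conv2d_cost(55, 55, 96, 11, 11, 3)
# conv2: input (27, 27, 96) -> output (27, 27, 256), assumed 5x5 kernel
conv2 = conv2d_cost(27, 27, 256, 5, 5, 96)
print(conv1, conv2)
```

With these assumed kernel sizes the parameter counts come out as 34944 and 614656, which agrees with the `num_params` column and suggests the assumptions match the model definition.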


### [AlexNetOWT](http://ethereon.github.io/netscope/#/gist/dc85cc15d59d720c8a18c4776abc9fd5)

In [6]:
estimate(models.AlexNet(version='owt'), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(227, 227, 3)",0.0,0,154587,0.0,0.618348
1,conv1,"(55, 55, 64)",0.070277,23296,387200,0.093184,1.5488
2,max_pooling2d,"(27, 27, 64)",0.0,0,46656,0.0,0.186624
3,conv2,"(27, 27, 192)",0.223949,307392,279936,1.229568,1.119744
4,max_pooling2d_1,"(13, 13, 192)",0.0,0,32448,0.0,0.129792
5,conv3,"(13, 13, 384)",0.11214,663936,129792,2.655744,0.519168
6,conv4,"(13, 13, 256)",0.14952,884992,86528,3.539968,0.346112
7,conv5,"(13, 13, 256)",0.09968,590080,86528,2.36032,0.346112
8,max_pooling2d_2,"(6, 6, 256)",0.0,0,9216,0.0,0.036864
9,flatten,"(9216,)",0.0,0,9216,0.0,0.036864


### [DeepMNIST](http://ethereon.github.io/netscope/#/gist/9c75cd95891207082bd42264eb7a2706)

In [7]:
estimate(models.DeepMNIST(), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(28, 28, 1)",0.0,0,784,0.0,0.003136
1,flatten,"(784,)",0.0,0,784,0.0,0.003136
2,dense1,"(2500,)",0.00196,1962500,5000,7.85,0.02
3,dense2,"(2000,)",0.005,5002000,4000,20.008,0.016
4,dense3,"(1500,)",0.003,3001500,3000,12.006,0.012
5,dense4,"(1000,)",0.0015,1501000,2000,6.004,0.008
6,dense5,"(500,)",0.0005,500500,1000,2.002,0.004
7,dense6,"(10,)",5e-06,5010,20,0.02004,8e-05
8,TOTAL,"(10,)",0.011965,11972510,16588,47.89004,0.066352


### [VGG11](http://ethereon.github.io/netscope/#/gist/5550b93fb51ab63d520af5be555d691f)

In [8]:
estimate(models.VGG(version='vgg11'), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.0,0,150528,0.0,0.602112
1,conv1_1,"(224, 224, 64)",0.086704,1792,6422528,0.007168,25.690112
2,pool1,"(112, 112, 64)",0.0,0,802816,0.0,3.211264
3,conv2_1,"(112, 112, 128)",0.924844,73856,3211264,0.295424,12.845056
4,pool2,"(56, 56, 128)",0.0,0,401408,0.0,1.605632
5,conv3_1,"(56, 56, 256)",0.924844,295168,1605632,1.180672,6.422528
6,conv3_2,"(56, 56, 256)",1.849688,590080,1605632,2.36032,6.422528
7,pool3,"(28, 28, 256)",0.0,0,200704,0.0,0.802816
8,conv4_1,"(28, 28, 512)",0.924844,1180160,802816,4.72064,3.211264
9,conv4_2,"(28, 28, 512)",1.849688,2359808,802816,9.439232,3.211264


### [VGG13](http://ethereon.github.io/netscope/#/gist/a96ba317064a61b22a1742bd05c54816)

In [9]:
estimate(models.VGG(version='vgg13'), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.0,0,150528,0.0,0.602112
1,conv1_1,"(224, 224, 64)",0.086704,1792,6422528,0.007168,25.690112
2,conv1_2,"(224, 224, 64)",1.849688,36928,6422528,0.147712,25.690112
3,pool1,"(112, 112, 64)",0.0,0,802816,0.0,3.211264
4,conv2_1,"(112, 112, 128)",0.924844,73856,3211264,0.295424,12.845056
5,conv2_2,"(112, 112, 128)",1.849688,147584,3211264,0.590336,12.845056
6,pool2,"(56, 56, 128)",0.0,0,401408,0.0,1.605632
7,conv3_1,"(56, 56, 256)",0.924844,295168,1605632,1.180672,6.422528
8,conv3_2,"(56, 56, 256)",1.849688,590080,1605632,2.36032,6.422528
9,pool3,"(28, 28, 256)",0.0,0,200704,0.0,0.802816


### [VGG16](http://ethereon.github.io/netscope/#/gist/050efcbb3f041bfc2a392381d0aac671)

In [10]:
estimate(models.VGG(version='vgg16'), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.0,0,150528,0.0,0.602112
1,conv1_1,"(224, 224, 64)",0.086704,1792,6422528,0.007168,25.690112
2,conv1_2,"(224, 224, 64)",1.849688,36928,6422528,0.147712,25.690112
3,pool1,"(112, 112, 64)",0.0,0,802816,0.0,3.211264
4,conv2_1,"(112, 112, 128)",0.924844,73856,3211264,0.295424,12.845056
5,conv2_2,"(112, 112, 128)",1.849688,147584,3211264,0.590336,12.845056
6,pool2,"(56, 56, 128)",0.0,0,401408,0.0,1.605632
7,conv3_1,"(56, 56, 256)",0.924844,295168,1605632,1.180672,6.422528
8,conv3_2,"(56, 56, 256)",1.849688,590080,1605632,2.36032,6.422528
9,conv3_3,"(56, 56, 256)",1.849688,590080,1605632,2.36032,6.422528


### [VGG19](http://ethereon.github.io/netscope/#/gist/f9e55d5947ac0043973b32b7ff51b778)

In [11]:
estimate(models.VGG(version='vgg19'), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.0,0,150528,0.0,0.602112
1,conv1_1,"(224, 224, 64)",0.086704,1792,6422528,0.007168,25.690112
2,conv1_2,"(224, 224, 64)",1.849688,36928,6422528,0.147712,25.690112
3,pool1,"(112, 112, 64)",0.0,0,802816,0.0,3.211264
4,conv2_1,"(112, 112, 128)",0.924844,73856,3211264,0.295424,12.845056
5,conv2_2,"(112, 112, 128)",1.849688,147584,3211264,0.590336,12.845056
6,pool2,"(56, 56, 128)",0.0,0,401408,0.0,1.605632
7,conv3_1,"(56, 56, 256)",0.924844,295168,1605632,1.180672,6.422528
8,conv3_2,"(56, 56, 256)",1.849688,590080,1605632,2.36032,6.422528
9,conv3_3,"(56, 56, 256)",1.849688,590080,1605632,2.36032,6.422528


### [Overfeat](http://ethereon.github.io/netscope/#/gist/ebfeff824393bcd66a9ceb851d8e5bde)

In [12]:
estimate(models.Overfeat(), inference, training)

Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(231, 231, 3)",0.0,0,160083,0.0,0.640332
1,conv1,"(56, 56, 96)",0.109283,34944,602112,0.139776,2.408448
2,max_pooling2d,"(28, 28, 96)",0.0,0,75264,0.0,0.301056
3,conv2,"(24, 24, 256)",0.353894,614656,294912,2.458624,1.179648
4,max_pooling2d_1,"(12, 12, 256)",0.0,0,36864,0.0,0.147456
5,conv3,"(12, 12, 512)",0.169869,1180160,147456,4.72064,0.589824
6,conv4,"(12, 12, 1024)",0.679477,4719616,294912,18.878464,1.179648
7,conv5,"(12, 12, 1024)",1.358954,9438208,294912,37.752832,1.179648
8,max_pooling2d_2,"(6, 6, 1024)",0.0,0,36864,0.0,0.147456
9,flatten,"(36864,)",0.0,0,36864,0.0,0.147456
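
The spatial sizes in the `out_shape` column follow from the standard output-size formula for convolution and pooling windows. The sketch below is a worked check, not the suite's code: the kernel sizes, strides and padding are assumptions chosen to match the usual Overfeat definition, since the table lists only output shapes.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


size = 231                                                  # input is (231, 231, 3)
size = conv_output_size(size, kernel_size=11, stride=4)     # conv1 (assumed 11x11, stride 4)
after_conv1 = size
size = conv_output_size(size, kernel_size=2, stride=2)      # max_pooling2d (assumed 2x2, stride 2)
size = conv_output_size(size, kernel_size=5)                # conv2 (assumed 5x5, no padding)
after_conv2 = size
size = conv_output_size(size, kernel_size=2, stride=2)      # max_pooling2d_1
size = conv_output_size(size, kernel_size=3, padding=1)     # conv3 (assumed 3x3, padding 1)
print(after_conv1, after_conv2, size)
```

Under these assumptions the chain reproduces the 56, 28, 24 and 12 spatial sizes listed in the table.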


### [ResNet18](http://ethereon.github.io/netscope/#/gist/649e0fb6c96c60c9f0abaa339da3cd27)

In [13]:
estimate(models.ResNet(version='resnet18'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
68,res4b1/relu,"(7, 7, 512)",0.000000,0,25088,0.000000,0.100352
69,apool,"(1, 1, 512)",0.000000,0,512,0.000000,0.002048
70,flatten,"(512,)",0.000000,0,512,0.000000,0.002048
71,output,"(1000,)",0.000512,513000,2000,2.052000,0.008000


### [ResNet34](http://ethereon.github.io/netscope/#/gist/277a9604370076d8eed03e9e44e23d53)

In [14]:
estimate(models.ResNet(version='resnet34'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
124,res4b2/relu,"(7, 7, 512)",0.000000,0,25088,0.000000,0.100352
125,apool,"(1, 1, 512)",0.000000,0,512,0.000000,0.002048
126,flatten,"(512,)",0.000000,0,512,0.000000,0.002048
127,output,"(1000,)",0.000512,513000,2000,2.052000,0.008000


### [ResNet50](http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006)

In [15]:
estimate(models.ResNet(version='resnet50'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
172,res4b2/relu,"(7, 7, 2048)",0.000000,0,100352,0.000000,0.401408
173,apool,"(1, 1, 2048)",0.000000,0,2048,0.000000,0.008192
174,flatten,"(2048,)",0.000000,0,2048,0.000000,0.008192
175,output,"(1000,)",0.002048,2049000,2000,8.196000,0.008000


### [ResNet101](http://ethereon.github.io/netscope/#/gist/b21e2aae116dc1ac7b50)

In [16]:
estimate(models.ResNet(version='resnet101'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
342,res4b2/relu,"(7, 7, 2048)",0.000000,0,100352,0.000000,0.401408
343,apool,"(1, 1, 2048)",0.000000,0,2048,0.000000,0.008192
344,flatten,"(2048,)",0.000000,0,2048,0.000000,0.008192
345,output,"(1000,)",0.002048,2049000,2000,8.196000,0.008000


### [ResNet152](http://ethereon.github.io/netscope/#/gist/d38f3e6091952b45198b)

In [17]:
estimate(models.ResNet(version='resnet152'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
512,res4b2/relu,"(7, 7, 2048)",0.000000,0,100352,0.000000,0.401408
513,apool,"(1, 1, 2048)",0.000000,0,2048,0.000000,0.008192
514,flatten,"(2048,)",0.000000,0,2048,0.000000,0.008192
515,output,"(1000,)",0.002048,2049000,2000,8.196000,0.008000


### [ResNet200](http://ethereon.github.io/netscope/#/gist/38a20d8dd1a4725d12659c8e313ab2c7)

In [18]:
estimate(models.ResNet(version='resnet200'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
672,res4b2/relu,"(7, 7, 2048)",0.000000,0,100352,0.000000,0.401408
673,apool,"(1, 1, 2048)",0.000000,0,2048,0.000000,0.008192
674,flatten,"(2048,)",0.000000,0,2048,0.000000,0.008192
675,output,"(1000,)",0.002048,2049000,2000,8.196000,0.008000


### [ResNet269](http://ethereon.github.io/netscope/#/gist/fbf7c67565523a9ac2c349aa89c5e78d)

In [19]:
estimate(models.ResNet(version='resnet269'), inference, training)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


Unnamed: 0,name,out_shape,gFLOPs,num_params,num_activations,params_mem (MB),activations_mem (MB)
0,input,"(224, 224, 3)",0.000000,0,150528,0.000000,0.602112
1,conv1/conv,"(112, 112, 64)",0.118014,9408,802816,0.037632,3.211264
2,conv1/bn,"(112, 112, 64)",0.000000,128,802816,0.000512,3.211264
3,conv1/relu,"(112, 112, 64)",0.000000,0,802816,0.000000,3.211264
4,conv1/mpool,"(56, 56, 64)",0.000000,0,200704,0.000000,0.802816
...,...,...,...,...,...,...,...
902,res4b7/relu,"(7, 7, 2048)",0.000000,0,100352,0.000000,0.401408
903,apool,"(1, 1, 2048)",0.000000,0,2048,0.000000,0.008192
904,flatten,"(2048,)",0.000000,0,2048,0.000000,0.008192
905,output,"(1000,)",0.002048,2049000,2000,8.196000,0.008000


# Summary

* The input shape column does not include the batch dimension, which is always the first dimension.
* `GFLOPs` are multiply-add operations for batch size 1 for one inference or one training pass. These values should not be used to compute times; instead, use them for comparing models.
* Activation size is the memory required to store activations (batch size 1). The algorithm currently used to estimate these numbers is very naive, so, again, use these numbers only to compare models.
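
Because all figures are per instance, a rough way to compare models at a larger batch size is to scale them linearly. The sketch below is a first-order approximation under stated assumptions: compute and activation memory are taken to grow linearly with batch size while parameter memory stays constant, and framework workspace, optimizer state and other overheads are ignored. `scale_to_batch` is a hypothetical helper.

```python
def scale_to_batch(gflops_per_instance, activations_mb_per_instance, params_mb, batch_size):
    """Scale per-instance estimates to a batch (linear, first-order approximation)."""
    return {
        "gFLOPs": gflops_per_instance * batch_size,
        "activations (MB)": activations_mb_per_instance * batch_size,
        "params (MB)": params_mb,   # weights are shared across the batch
    }


# ResNet18 inference figures from the summary, scaled to a batch of 32.
print(scale_to_batch(1.826918, 35.135296, 46.813856, batch_size=32))
```

As with the per-instance numbers, the scaled values are meant for comparing models, not for predicting run time or exact memory use.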

## Inference

In [20]:
printable_dataframe(inference)

Unnamed: 0,Model,Input shape,#Parameters,Model size (MB) FP32,GFLOPs (multiply-add),Activation size (MB) FP32
0,EnglishAcousticModel,"(540,)",34678784,138.715136,0.03466,0.149616
1,AlexNet,"(227, 227, 3)",62378344,249.513376,1.135256,6.452012
2,AlexNetOWT,"(227, 227, 3)",61100840,244.40336,0.714188,4.994732
3,DeepMNIST,"(28, 28, 1)",11972510,47.89004,0.011965,0.066352
4,VGG11,"(224, 224, 3)",132863336,531.453344,7.60909,66.338624
5,VGG13,"(224, 224, 3)",133047848,532.191392,11.308466,104.873792
6,VGG16,"(224, 224, 3)",138357544,553.430176,15.470264,115.3104
7,VGG19,"(224, 224, 3)",143667240,574.66896,19.632062,125.747008
8,Overfeat,"(231, 231, 3)",145920872,583.683488,2.801404,8.014988
9,ResNet18,"(224, 224, 3)",11703464,46.813856,1.826918,35.135296
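
The `Model size (MB) FP32` column is the parameter count times 4 bytes, expressed in units of 10^6 bytes. A minimal check, with `model_size_mb` as a hypothetical helper:

```python
def model_size_mb(num_params, bytes_per_param=4):
    """Model size in MB (10^6 bytes); the default of 4 bytes corresponds to FP32."""
    return num_params * bytes_per_param / 1e6


print(model_size_mb(34678784))   # EnglishAcousticModel
print(model_size_mb(11703464))   # ResNet18
```

Passing `bytes_per_param=2` gives a corresponding estimate for 16-bit weights; that setting is not covered by the tables here.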


## Training

In [21]:
printable_dataframe(training)

Unnamed: 0,Model,Input shape,#Parameters,Model size (MB) FP32,GFLOPs (multiply-add),Activation size (MB) FP32
0,EnglishAcousticModel,"(540,)",34678784,138.715136,0.103981,0.299232
1,AlexNet,"(227, 227, 3)",62378344,249.513376,3.405768,12.904024
2,AlexNetOWT,"(227, 227, 3)",61100840,244.40336,2.142565,9.989464
3,DeepMNIST,"(28, 28, 1)",11972510,47.89004,0.035895,0.132704
4,VGG11,"(224, 224, 3)",132863336,531.453344,22.82727,132.677248
5,VGG13,"(224, 224, 3)",133047848,532.191392,33.925399,209.747584
6,VGG16,"(224, 224, 3)",138357544,553.430176,46.410793,230.6208
7,VGG19,"(224, 224, 3)",143667240,574.66896,58.896187,251.494016
8,Overfeat,"(231, 231, 3)",145920872,583.683488,8.404212,16.029976
9,ResNet18,"(224, 224, 3)",11703464,46.813856,5.480755,70.270592
