# Memory Equivalent Capacity (MEC)

This notebook is used for calculation and documentation of the model MEC. 

The MEC calculation for our model only considers the classifier section. This means that the feature extraction from images performed by the CNN is not included. Consequently, the generalization of the model is calculated by comparing against the information content of the latent embedding.

In [None]:
import tensorflow as tf
import tensorflow.keras as keras

In [None]:
model = tf.keras.applications.MobileNetV2(
    alpha=1.0,
)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5


In [None]:
model.layers[-1].input_shape, model.layers[-1].output_shape


((None, 1280), (None, 1000))

We see that the feature extractor outputs $1280$ features, which is the number of input features for the linear model.

The MEC of the model is defined as in section 7.2 in the book

In summary, the book derives four engineering rules to determine the Memory-Equivalent Capacity of a neural network:
1. The output of a single neuron yields maximally one bit of information.
2. The capacity of a single neuron is the number of its parameters (weights and threshold) in bits.
3. The total capacity $C_{tot}$ of $M$ neurons in parallel is $C_{tot} = \sum_{i=1}^{M} C_i$, where $C_i$ is the capacity of each neuron.
4. For perceptrons in series (e.g., in subsequent layers), the capacity of a subsequent layer cannot be larger than the output of the previous layer.
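
As a rough illustration of how these rules combine, the sketch below computes the MEC of a small fully connected network in plain Python. The function name `mec_of_mlp` and the reading of rule 4 (capping a layer's capacity at the number of neurons in the previous layer, since each neuron emits at most one bit) are assumptions made for this example, not definitions taken from the book.

```python
def mec_of_mlp(n_inputs, layer_widths, use_bias=True):
    """Estimate the MEC (in bits) of a fully connected network.

    n_inputs: number of input features.
    layer_widths: number of neurons in each successive layer.
    """
    total_bits = 0
    fan_in = n_inputs
    previous_output_bits = None  # the first layer is not capped

    for width in layer_widths:
        # Rules 2 and 3: sum the parameters of all neurons in the layer.
        layer_bits = width * (fan_in + (1 if use_bias else 0))
        # Rule 4 (assumed reading): cap by the output of the previous layer.
        if previous_output_bits is not None:
            layer_bits = min(layer_bits, previous_output_bits)
        total_bits += layer_bits
        # Rule 1: each neuron outputs at most one bit.
        previous_output_bits = width
        fan_in = width

    return total_bits


# 3 inputs -> 2 hidden neurons -> 1 output neuron
print(mec_of_mlp(3, [2, 1]))       # 8 bits + min(3, 2) bits = 10
# A single dense layer with 1280 inputs and 1000 outputs
print(mec_of_mlp(1280, [1000]))    # 1000 * (1280 + 1) = 1281000
```

For a single dense layer the cap never applies, so the result reduces to the number of weights and biases in that layer.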

In [None]:
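# Only works for dense layers
def mec_of_linear_layer(layer):
    """Return the MEC of a dense layer in bits (one bit per parameter)."""
    assert isinstance(layer, keras.layers.Dense), "Only works for dense layers"
    # get_weights()[0] is the kernel with shape (n_inputs, n_neurons); [1] is the bias
    n_inputs, n_neurons = layer.get_weights()[0].shape
    # Capacity of one neuron: its weights, plus one for the bias (threshold) if present
    params_per_neuron = n_inputs
    if layer.bias is not None:
        params_per_neuron += 1
    return n_neurons * params_per_neuron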

**Calculate MEC**

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
# 0.000122 ~ 1/8192 (bits to kB) and 1.192e-7 ~ 1/8388608 (bits to MB), with 1 kB = 1024 bytes
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 1281000 bits of capacity or 156.282 kB or 0.1526952 MB
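
The factors `0.000122` and `1.192e-7` in the print statement are rounded versions of $1/8192$ and $1/8388608$, that is, bits converted to kilobytes and megabytes with 1 kB = 1024 bytes. A minimal sketch of the exact conversion, using the hypothetical helper name `bits_to_bytes_units`, could look like this:

```python
def bits_to_bytes_units(bits):
    """Convert a capacity in bits to (kilobytes, megabytes), with 1 kB = 1024 bytes."""
    n_bytes = bits / 8
    kilobytes = n_bytes / 1024
    megabytes = kilobytes / 1024
    return kilobytes, megabytes


kb, mb = bits_to_bytes_units(1281000)
print(f"{kb:.3f} kB, {mb:.4f} MB")
```

The exact values differ slightly from the printed ones because the constants in the notebook are truncated to three or four significant digits.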


### Width multiplier


<i><u>Excerpt from the original paper</u></i>


Although the base MobileNet architecture is already small and low latency, many times a specific use case or application may require the model to be smaller and faster.
In order to construct these smaller and less computationally expensive models we introduce a very simple parameter $\alpha$ called width multiplier. The role of the width multiplier $\alpha$ is to thin a network uniformly at each layer. For a given layer and width multiplier $\alpha$, the number of input channels $M$ becomes $\alpha M$ and the number of output channels $N$ becomes $\alpha N$.

The computational cost of a depthwise separable convolution with width multiplier $\alpha$ is:

$D_K \cdot D_K \cdot \alpha M \cdot D_F \cdot D_F + \alpha M \cdot \alpha N \cdot D_F \cdot D_F$

where $\alpha \in (0, 1]$ with typical settings of 1, 0.75, 0.5 and
0.25. $\alpha = 1$ is the baseline MobileNet and $\alpha < 1$ are
reduced MobileNets. Width multiplier has the effect of reducing computational cost and the number of parameters quadratically by roughly $\alpha^2$. Width multiplier can be applied to any model structure to define a new smaller model with a reasonable accuracy, latency and size trade off. It is used to define a new reduced structure that needs to be trained from scratch.
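
To make the roughly quadratic effect of $\alpha$ concrete, the following sketch (not part of the excerpt) evaluates the cost formula above for a few width multipliers. The layer dimensions ($D_K = 3$, $M = N = 512$, $D_F = 14$) and the function name `separable_conv_cost` are illustrative assumptions.

```python
def separable_conv_cost(d_k, m, n, d_f, alpha=1.0):
    """Cost of a depthwise separable convolution with width multiplier alpha.

    d_k: kernel size, m: input channels, n: output channels, d_f: feature map size.
    """
    m_thin = alpha * m
    n_thin = alpha * n
    depthwise = d_k * d_k * m_thin * d_f * d_f
    pointwise = m_thin * n_thin * d_f * d_f
    return depthwise + pointwise


base = separable_conv_cost(3, 512, 512, 14)
for alpha in (1.0, 0.75, 0.5, 0.25):
    ratio = separable_conv_cost(3, 512, 512, 14, alpha) / base
    print(f"alpha={alpha}: cost ratio {ratio:.3f} (alpha^2 = {alpha**2:.3f})")
```

The ratio stays slightly above $\alpha^2$ because the depthwise term scales only linearly with $\alpha$, while the dominant pointwise term scales quadratically.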




In [None]:
import plotly.express as px
import pandas as pd

In [None]:
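alphas = [0.35, 0.5, 0.75, 1.0, 1.3, 1.4]
# One [alpha, MEC] row per width multiplier; each model is built only to inspect its final dense layer
data = [
    [alpha, mec_of_linear_layer(tf.keras.applications.MobileNetV2(alpha=alpha).layers[-1])]
    for alpha in alphas
]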

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_0.35_224.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_0.5_224.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_0.75_224.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.3_224.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.4_224.h5


In [None]:
df = pd.DataFrame(data, columns=["alpha","mec"])

In [None]:
px.line(df, x="alpha",y="mec", title="MEC as function of Alpha")
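
To help interpret the plot, the MEC can also be estimated without downloading any weights. The sketch below assumes that the Keras MobileNetV2 implementation keeps the final feature width at 1280 for $\alpha \le 1$ and otherwise scales it to $1280\alpha$ rounded to a multiple of 8. This is our understanding of the Keras source, not something verified in this notebook, so the numbers should be checked against the values in `df`. The helper names are hypothetical.

```python
def assumed_feature_width(alpha, base_width=1280, divisor=8):
    """Assumed width of the embedding fed to the classifier (see caveat above)."""
    if alpha <= 1.0:
        return base_width
    scaled = base_width * alpha
    # Round to the nearest multiple of `divisor`
    return max(divisor, int(scaled + divisor / 2) // divisor * divisor)


def expected_mec(alpha, n_classes=1000):
    """MEC of the final dense layer: n_classes neurons, each with width + 1 parameters."""
    return n_classes * (assumed_feature_width(alpha) + 1)


for alpha in [0.35, 0.5, 0.75, 1.0, 1.3, 1.4]:
    print(alpha, expected_mec(alpha))
```

Under this assumption the curve would be flat up to $\alpha = 1$ and rise only for larger values, since the width multiplier would then not change the classifier input below the baseline.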

## Small ImageNet Models MEC

This part of the notebook compares the MEC of various small backbone architectures against MobileNetV2.

The result is a table with the following columns:

Model MEC, Model Name, Top 1 Accuracy

### nets

In [None]:
from tensorflow.keras.applications import ResNet50V2, ResNet101V2, DenseNet121, DenseNet169, NASNetMobile, EfficientNetB2, NASNetLarge

In [None]:
data = []

**ResNet50V2**

In [None]:
model=ResNet50V2()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50v2_weights_tf_dim_ordering_tf_kernels.h5


In [None]:
model.layers[-5:]

[<keras.layers.merging.add.Add at 0x7f9d7bca2760>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f9d7bca22e0>,
 <keras.layers.core.activation.Activation at 0x7f9d7bc869a0>,
 <keras.layers.pooling.global_average_pooling2d.GlobalAveragePooling2D at 0x7f9d7bc35100>,
 <keras.layers.core.dense.Dense at 0x7f9d7bca2460>]

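The classifier consists of only a single dense layer at the end. We use the same convolutional trick as for MobileNetV2: the convolutional part is treated as the feature extractor, and the MEC is computed from the dense layer alone.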

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 2049000 bits of capacity or 249.978 kB or 0.24424079999999998 MB


In [None]:
# acc from keras website
data.append(["ResNet50V2", MEC, 76.0])

**ResNet101V2**

In [None]:
model=ResNet101V2()

In [None]:
model.layers[-5:]

[<keras.layers.merging.add.Add at 0x7f51ec4169d0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ec45ed90>,
 <keras.layers.core.activation.Activation at 0x7f51ec438c40>,
 <keras.layers.pooling.global_average_pooling2d.GlobalAveragePooling2D at 0x7f51ec3a6e50>,
 <keras.layers.core.dense.Dense at 0x7f51ec3a6eb0>]

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 2049000 bits of capacity or 249.978 kB or 0.24424079999999998 MB


In [None]:
# acc from keras website
data.append(["ResNet101V2", MEC, 77.2])

**DenseNet121**

In [None]:
model=DenseNet121()

In [None]:
model.layers[-8:]

[<keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ebf9b9a0>,
 <keras.layers.core.activation.Activation at 0x7f51ebfeea90>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x7f51ebfe0910>,
 <keras.layers.merging.concatenate.Concatenate at 0x7f51ec0425e0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ec130be0>,
 <keras.layers.core.activation.Activation at 0x7f51ec023880>,
 <keras.layers.pooling.global_average_pooling2d.GlobalAveragePooling2D at 0x7f51ec02b700>,
 <keras.layers.core.dense.Dense at 0x7f51ec0e0640>]

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 1025000 bits of capacity or 125.05 kB or 0.12218 MB


In [None]:
# acc from keras website
data.append(["DenseNet121", MEC, 75.0])

**DenseNet169**

In [None]:
model=DenseNet169()

In [None]:
model.layers[-8:]

[<keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51eba3e910>,
 <keras.layers.core.activation.Activation at 0x7f51eb988850>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x7f51eb922fd0>,
 <keras.layers.merging.concatenate.Concatenate at 0x7f51eb991100>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ebdd5ac0>,
 <keras.layers.core.activation.Activation at 0x7f51eb98ea90>,
 <keras.layers.pooling.global_average_pooling2d.GlobalAveragePooling2D at 0x7f51eb934c70>,
 <keras.layers.core.dense.Dense at 0x7f51eb9229a0>]

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 1665000 bits of capacity or 203.13 kB or 0.198468 MB


In [None]:
# acc from keras website
data.append(["DenseNet169", MEC, 76.2])

**NASNetMobile**

In [None]:
model=NASNetMobile()

In [None]:
model.layers[-20:]

[<keras.layers.convolutional.separable_conv2d.SeparableConv2D at 0x7f51eb1bd610>,
 <keras.layers.convolutional.separable_conv2d.SeparableConv2D at 0x7f51eb0b3e80>,
 <keras.layers.convolutional.separable_conv2d.SeparableConv2D at 0x7f51eb0618e0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51eb114400>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51eb114220>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51eb1a5f40>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51eb0b3490>,
 <keras.layers.pooling.average_pooling2d.AveragePooling2D at 0x7f51eb0c4af0>,
 <keras.layers.pooling.average_pooling2d.AveragePooling2D at 0x7f51eb0b3fd0>,
 <keras.layers.pooling.average_pooling2d.AveragePooling2D at 0x7f51eb0bea90>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51eb061d60>,
 <keras.layers.merging.add.Add at 0x7f51eb0a6970>,
 <keras.layers.merging.add.Ad

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 1057000 bits of capacity or 128.954 kB or 0.1259944 MB


In [None]:
# acc from keras website
data.append(["NASNetMobile", MEC, 74.4])

**NASNetLarge**

In [None]:
model=NASNetLarge()

In [None]:
model.layers[-20:]

[<keras.layers.convolutional.separable_conv2d.SeparableConv2D at 0x7f51ea5ffb20>,
 <keras.layers.convolutional.separable_conv2d.SeparableConv2D at 0x7f51ea62fb80>,
 <keras.layers.convolutional.separable_conv2d.SeparableConv2D at 0x7f51ea59aac0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea61b9a0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea5e63a0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea5e61f0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea5df580>,
 <keras.layers.pooling.average_pooling2d.AveragePooling2D at 0x7f51ea66a5e0>,
 <keras.layers.pooling.average_pooling2d.AveragePooling2D at 0x7f51ea6a1520>,
 <keras.layers.pooling.average_pooling2d.AveragePooling2D at 0x7f51ea912c40>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea62f3d0>,
 <keras.layers.merging.add.Add at 0x7f51ea61fd60>,
 <keras.layers.merging.add.Ad

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 4033000 bits of capacity or 492.026 kB or 0.4807336 MB


In [None]:
# acc from keras website
data.append(["NASNetLarge", MEC, 82.5])

**EfficientNetB2**

In [None]:
model=EfficientNetB2()

In [None]:
model.layers[-10:]

[<keras.layers.convolutional.conv2d.Conv2D at 0x7f51ea059220>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea058070>,
 <keras.layers.regularization.dropout.Dropout at 0x7f51ea06dfa0>,
 <keras.layers.merging.add.Add at 0x7f51ea0eb250>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x7f51ea62f580>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f51ea1086d0>,
 <keras.layers.core.activation.Activation at 0x7f51ea0ebfd0>,
 <keras.layers.pooling.global_average_pooling2d.GlobalAveragePooling2D at 0x7f51ea0e6ac0>,
 <keras.layers.regularization.dropout.Dropout at 0x7f51ea0c9f70>,
 <keras.layers.core.dense.Dense at 0x7f51ea07b310>]

In [None]:
MEC = mec_of_linear_layer(model.layers[-1])
print(f"Model has {MEC} bits of capacity or {MEC*0.000122} kB or {MEC*(1.192e-7)} MB")

Model has 1409000 bits of capacity or 171.898 kB or 0.16795279999999999 MB


In [None]:
# acc from keras website
data.append(["EfficientNetB2", MEC, 80.1])

In [None]:
from math import prod
import numpy as np
import plotly.express as px
import pandas as pd

df=pd.DataFrame(data, columns=["name", "MEC", "acc"])
fig = px.scatter(df, x="acc", y="MEC", color="name")
fig.show()

# Parameter counts with and without the dense layer, and MEC of each model
Report visualization


In [None]:
from tensorflow.keras.applications import ResNet50V2, ResNet101V2, DenseNet121, DenseNet169, NASNetMobile, EfficientNetB2, NASNetLarge, MobileNetV2

In [None]:
data=[]
models=[ResNet50V2, ResNet101V2, DenseNet121, DenseNet169, NASNetMobile, EfficientNetB2, NASNetLarge, MobileNetV2]

In [None]:
data = []
for model_cls in models:
    # Backbone only (no classification head) vs. the full model with the dense classifier
    backbone = model_cls(include_top=False)
    full_model = model_cls(include_top=True)

    # Flattened sizes of the image input and of the embedding fed to the classifier
    input_size = prod(full_model.layers[0].input_shape[0][1:])
    embedding_size = prod(full_model.layers[-1].input_shape[1:])

    data.append({
        "name": backbone.name,
        "mec": mec_of_linear_layer(full_model.layers[-1]),
        "compression": input_size / embedding_size,
        "w/ classification_param_count": full_model.count_params(),
        "w/o classification_param_count": backbone.count_params(),
        "compression_output_shape": embedding_size,
        "compression_input_shape": input_size,
    })




In [None]:
from math import prod
import numpy as np
import pandas as pd

df=pd.DataFrame(data)
df

| | name | mec | compression | w/ classification_param_count | w/o classification_param_count | compression_output_shape | compression_input_shape |
|---|---|---|---|---|---|---|---|
| 0 | resnet50v2 | 2049000 | 73.5 | 25613800 | 23564800 | 2048 | 150528 |
| 1 | resnet101v2 | 2049000 | 73.5 | 44675560 | 42626560 | 2048 | 150528 |
| 2 | densenet121 | 1025000 | 147.0 | 8062504 | 7037504 | 1024 | 150528 |
| 3 | densenet169 | 1665000 | 90.461538 | 14307880 | 12642880 | 1664 | 150528 |
| 4 | NASNet | 1057000 | 142.545455 | 5326716 | 4269716 | 1056 | 150528 |
| 5 | efficientnetb2 | 1409000 | 144.034091 | 9177569 | 7768569 | 1408 | 202800 |
| 6 | NASNet | 4033000 | 81.518601 | 88949818 | 84916818 | 4032 | 328683 |
| 7 | mobilenetv2_1.00_224 | 1281000 | 117.6 | 3538984 | 2257984 | 1280 | 150528 |
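
One way to sanity-check the table is to note that the MEC of the dense layer should equal the number of parameters the classification head adds, and that the compression is the flattened input size divided by the embedding size. The sketch below re-checks this for three rows copied from the table above; the tuple layout and variable names are chosen for this example only.

```python
rows = [
    # (name, mec, compression, params_with_head, params_without_head, embedding_size, input_size)
    ("resnet50v2", 2049000, 73.5, 25613800, 23564800, 2048, 150528),
    ("densenet121", 1025000, 147.0, 8062504, 7037504, 1024, 150528),
    ("mobilenetv2_1.00_224", 1281000, 117.6, 3538984, 2257984, 1280, 150528),
]

for name, mec, compression, with_head, without_head, emb, inp in rows:
    head_params = with_head - without_head
    # The head is a single dense layer: 1000 neurons with emb weights and one bias each
    assert head_params == mec == 1000 * (emb + 1)
    # Compression is the ratio of input pixels (times channels) to embedding features
    assert abs(inp / emb - compression) < 1e-6
    print(f"{name}: head parameters = {head_params}, compression = {inp / emb:.1f}")
```

The same check applies to the remaining rows, since each listed model ends in a single 1000-way dense layer.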
