## How to calculate parameters in a Deep Learning model?

How to calculate the number of parameters of a deep learning model (with Keras):

In [0]:
from keras import models
from keras import layers
model = models.Sequential()

model.add(layers.Conv2D(32, (5, 5), activation='relu',
                        input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

|   | name     | size     |
|---|----------|----------|
| 0 | input    | 1x28x28  |
| 1 | conv2d1  | 32x24x24 |
| 2 | maxpool1 | 32x12x12 |
| 3 | conv2d2  | 32x10x10 |
| 4 | maxpool2 | 32x5x5   |
| 5 | dense    | 256      |
| 6 | output   | 10       |

|   | name     | size                       | parameters                        | formula |
|---|----------|----------------------------|-----------------------------------|---------|
| 0 | input    | 1x28x28                    | 0                                 | |
| 1 | conv2d1  | (28-(5-1))=24 -> 32x24x24  | (5 * 5 * 1 + 1) * 32 = 832        | (filter width * filter height * input channels + 1) * number of filters |
| 2 | maxpool1 | 32x12x12                   | 0                                 | |
| 3 | conv2d2  | (12-(3-1))=10 -> 32x10x10  | (3 * 3 * 32 + 1) * 32 = 9'248     | (filter width * filter height * previous output channels + 1) * number of filters |
| 4 | maxpool2 | 32x5x5                     | 0                                 | |
| 5 | dense    | 256                        | (32 * 5 * 5 + 1) * 256 = 205'056  | (previous layer units + 1) * current layer units |
| 6 | output   | 10                         | (256 + 1) * 10 = 2'570            | (previous layer units + 1) * current layer units |

So in your network, you have a total of 832 + 9'248 + 205'056 + 2'570 = 217'706 learnable parameters.

In [0]:
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_5 (Conv2D)            (None, 24, 24, 32)        832       
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 10, 10, 32)        9248      
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 5, 5, 32)          0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 800)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 256)               205056    
_________________________________________________________________
dense_6 (Dense)              (None, 10)               


Let's first look at how the number of learnable parameters is calculated for each individual type of layer you have, and then calculate the number of parameters in your example.

**Input layer**: All the input layer does is read the input image, so there are no parameters you could learn here.

**Convolutional layers**: Consider a convolutional layer which takes l feature maps as input and produces k feature maps as output. The filter size is n x m. For example:

![Convolutional layer with 32 input feature maps and 64 output feature maps](https://i.stack.imgur.com/2r4XG.png)


Here, the input has l=32 feature maps, the output has k=64 feature maps, and the filter size is n=3 x m=3. It is important to understand that we don't simply have a 3x3 filter, but actually a 3x3x32 filter, as our input has 32 channels. And we learn 64 different 3x3x32 filters. Thus, the total number of weights is n * m * k * l. There is also a bias term for each output feature map, so the total number of parameters is (n * m * l + 1) * k.

In words: (filter width * filter height * number of input feature maps + 1) * number of filters.
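The convolutional formula can be checked with a few lines of plain Python. This is only a minimal sketch: the helper name `conv2d_params` and its `use_bias` flag are ours for illustration, not part of Keras, and it assumes one bias per output feature map.

```python
def conv2d_params(n, m, l, k, use_bias=True):
    """Learnable parameters of a conv layer with n x m filters,
    l input feature maps and k output feature maps."""
    weights = n * m * l * k        # each of the k filters has shape n x m x l
    biases = k if use_bias else 0  # one bias per output feature map
    return weights + biases


# Example from the figure: 32 input maps, 64 output maps, 3x3 filters.
print(conv2d_params(3, 3, 32, 64))  # (3*3*32 + 1) * 64 = 18496

# The two convolutional layers of the model above.
print(conv2d_params(5, 5, 1, 32))   # 832
print(conv2d_params(3, 3, 32, 32))  # 9248
```

Note that the spatial size of the input (28x28, 12x12, ...) does not appear anywhere: the same filters slide over the whole image, so only the filter shape and the channel counts matter.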

**Pooling layers:** A pooling layer does something like "replace a 2x2 neighborhood by its maximum value". So there are no parameters to learn in a pooling layer.

**Fully-connected layers**: In a fully-connected layer, all input units have a separate weight to each output unit. For n inputs and m outputs, the number of weights is n * m. Additionally, you have a bias for each output node, so you are at (n + 1) * m parameters.
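As a quick check of the fully-connected formula, here is a minimal sketch in plain Python. The helper name `dense_params` is hypothetical (it is not a Keras function), and the 800 inputs are the 32 x 5 x 5 values produced by flattening the last pooling layer of the model above.

```python
def dense_params(n, m, use_bias=True):
    """Learnable parameters of a fully-connected layer
    with n inputs and m outputs."""
    weights = n * m                # one weight per (input, output) pair
    biases = m if use_bias else 0  # one bias per output unit
    return weights + biases


print(dense_params(32 * 5 * 5, 256))  # (800 + 1) * 256 = 205056
print(dense_params(256, 10))          # (256 + 1) * 10 = 2570
```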

**Output layer**: The output layer is a normal fully-connected layer, so (n + 1) * m parameters, where n is the number of inputs and m is the number of outputs.

The final difficulty is the first fully-connected layer: we do not know the dimensionality of the input to that layer, as it follows the convolutional and pooling layers. To calculate it, we have to start with the size of the input image and calculate the output size of each convolutional and pooling layer. In this case, Keras already calculates this for you and reports the sizes in the output of model.summary(), which makes it easy for us. If you have to calculate the size of each layer yourself, it's a bit more complicated: each convolution without padding shrinks the width and height by (filter size - 1), and each 2x2 pooling halves them, as in the size column of the table above.
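The size calculation can be sketched as follows. The helper names `conv_out` and `pool_out` are ours, and the sketch assumes the Keras defaults used in the model above: convolutions with stride 1 and no padding (`padding='valid'`), and non-overlapping pooling windows.

```python
def conv_out(size, kernel):
    """Output width/height of a convolution with stride 1 and no padding."""
    return size - (kernel - 1)


def pool_out(size, pool):
    """Output width/height of non-overlapping pooling (rounded down)."""
    return size // pool


size = 28                  # input image is 28x28
size = conv_out(size, 5)   # 24 after the 5x5 convolution
size = pool_out(size, 2)   # 12 after 2x2 max pooling
size = conv_out(size, 3)   # 10 after the 3x3 convolution
size = pool_out(size, 2)   # 5 after 2x2 max pooling

flat = size * size * 32    # 32 feature maps of 5x5 are flattened
print(size, flat)          # 5 800
```

The resulting 800 is the number of inputs n to the first fully-connected layer, which matches the `(None, 800)` shape of the flatten layer in the summary.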




## **YOUR TURN:**

From the previous model:

**Input:**

1) Change the input shape of the input image from 28 * 28 pixels (img_rows and img_cols) to 8 * 8.

2) Imagine the image is RGB with 3 channels

**Convolution 1**

1) Modify the filter size to 3 * 3

2) Modify the output feature maps to 64


**Convolution 2**

1) Use a 1 * 1 filter

2) Modify output feature maps to 128

**Fully-connected layers**

1) Use 516 output units

**Output layers (Dense classifier)**

1) Predict 5 labels 


In [0]:
model.summary()

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_21 (Conv2D)           (None, 6, 6, 64)          1792      
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 3, 3, 64)          0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 3, 3, 128)         8320      
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 1, 1, 128)         0         
_________________________________________________________________
flatten_7 (Flatten)          (None, 128)               0         
_________________________________________________________________
dense_13 (Dense)             (None, 516)               66564     
_________________________________________________________________
dense_14 (Dense)             (None, 5)               
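To check your answer by hand, the rules from the first part can be combined into one small function that walks through the layers. This is a sketch in plain Python, not Keras code: the function `count_parameters` and the tuple-based layer description are made up for illustration, and it assumes stride-1 convolutions without padding and non-overlapping pooling.

```python
def count_parameters(input_shape, layers):
    """Return (per-layer parameter counts, total) for a simple CNN.

    input_shape is (height, width, channels); layers is a list of tuples:
    ("conv", filters, kernel), ("pool", pool), ("flatten",), ("dense", units).
    """
    h, w, c = input_shape
    units = None
    counts = []
    for spec in layers:
        kind = spec[0]
        if kind == "conv":
            _, filters, kernel = spec
            counts.append((kernel * kernel * c + 1) * filters)
            h, w, c = h - (kernel - 1), w - (kernel - 1), filters
        elif kind == "pool":
            _, pool = spec
            counts.append(0)  # pooling has nothing to learn
            h, w = h // pool, w // pool
        elif kind == "flatten":
            counts.append(0)
            units = h * w * c
        elif kind == "dense":
            _, out_units = spec
            counts.append((units + 1) * out_units)
            units = out_units
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return counts, sum(counts)


exercise = [
    ("conv", 64, 3), ("pool", 2),
    ("conv", 128, 1), ("pool", 2),
    ("flatten",),
    ("dense", 516),
    ("dense", 5),
]
counts, total = count_parameters((8, 8, 3), exercise)
print(counts)  # [1792, 0, 8320, 0, 0, 66564, 2585]
print(total)   # 79261
```

The first six values agree with the summary above; for the final dense layer the formula gives (516 + 1) * 5 = 2585.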