<a href="https://colab.research.google.com/github/AtrCheema/Miscellaneous_DL_Tutorials/blob/master/understanding_dense_layer_tf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Intro
This notebook explains the dense (fully connected) layer using TensorFlow.

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
import numpy as np

def reset_seed(seed=313):
    tf.keras.backend.clear_session()
    tf.random.set_seed(seed)
    np.random.seed(seed)

np.set_printoptions(linewidth=100, suppress=True)

tf.__version__, np.__version__

('2.3.0', '1.18.5')

In [None]:
# set some global parameters
input_features = 2
batch_size = 10
dense_units = 5

Define the input to the model.

In [None]:
in_np = np.random.randint(0, 100, size=(batch_size,input_features))
in_np

array([[ 8, 42],
       [87, 31],
       [39, 73],
       [25, 72],
       [ 0, 50],
       [82, 26],
       [ 0, 58],
       [69, 56],
       [13, 73],
       [23, 46]])

Build a model consisting of a single dense layer.

In [None]:
reset_seed()


ins = Input(input_features, name='my_input')
out = Dense(dense_units, use_bias=False, name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

In [None]:
out_np = model.predict(in_np)
out_np



array([[-13.74732  ,   8.15492  , -10.744486 ,  25.119091 ,  -7.663808 ],
       [ -5.950535 ,  68.49644  ,  -9.491724 , -38.411003 ,  24.45086  ],
       [-22.595592 ,  33.50788  , -19.158081 ,  26.03558  ,  -4.0035515],
       [-22.982853 ,  22.674595 , -18.636395 ,  35.13561  ,  -8.948013 ],
       [-16.85867  ,   2.3709118, -12.607699 ,  36.59203  , -12.659398 ],
       [ -4.5233936,  64.40725  ,  -8.134692 , -38.558826 ,  23.860497 ],
       [-19.556057 ,   2.7502577, -14.624931 ,  42.446754 , -14.684901 ],
       [-15.311285 ,  55.81435  , -15.449032 ,  -7.4740105,  11.438467 ],
       [-23.94097  ,  13.476982 , -18.657522 ,  44.29477  , -13.656332 ],
       [-14.319835 ,  19.90088  , -12.041887 ,  17.512308 ,  -3.107648 ]], dtype=float32)

In [None]:
out_np.shape

(10, 5)

We can get all layers of the model as a list

In [None]:
model.layers

[<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7f59ce874b00>,
 <tensorflow.python.keras.layers.core.Dense at 0x7f59ce858828>]

or get a specific layer by its name.

In [None]:
dense_layer = model.get_layer('my_output')

The input to the dense layer must have the shape

In [None]:
dense_layer.input_shape

(None, 2)

The output of the dense layer will have the shape

In [None]:
dense_layer.output_shape

(None, 5)

A dense layer usually has two variables: the weight matrix (kernel) and the bias. Because we set `use_bias=False`, only the kernel is shown.

In [None]:
dense_layer.weights

[<tf.Variable 'my_output/kernel:0' shape=(2, 5) dtype=float32, numpy=
 array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
        [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)>]

The shape of the dense weights is of the form `(input_size, units)`
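
The shape rule above can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the Keras implementation; the variable names (`x`, `kernel`, `bad_x`, `shapes_compatible`) are chosen here for illustration only.

```python
import numpy as np

batch_size, input_size, units = 10, 2, 5

x = np.zeros((batch_size, input_size))   # one row per sample
kernel = np.zeros((input_size, units))   # one column per dense unit

# (batch_size, input_size) @ (input_size, units) -> (batch_size, units)
y = np.matmul(x, kernel)
print(y.shape)

# An input with a different number of features cannot be multiplied
# with this kernel, so NumPy raises a ValueError.
bad_x = np.zeros((batch_size, input_size + 1))
try:
    np.matmul(bad_x, kernel)
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)
```

The first dimension of the kernel is tied to the number of input features, while the second dimension decides how many values the layer produces per sample.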

`dense_layer.weights` returns a list whose first element is the kernel (weight matrix). We can convert it to a NumPy array.

In [None]:
dense_w = dense_layer.weights[0].numpy()
dense_w.shape

(2, 5)

In [None]:
dense_w

array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
       [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)

The output of our model, which consists of a single dense layer without bias, is simply the matrix product of the input and the weight matrix, as verified below.

In [None]:
np.matmul(in_np, dense_w)

array([[-13.74732053,   8.15491986, -10.74448609,  25.11909294,  -7.66380799],
       [ -5.95053476,  68.49643922,  -9.4917239 , -38.41100419,  24.45085973],
       [-22.59559184,  33.50788164, -19.15808117,  26.03557765,  -4.00355095],
       [-22.98285258,  22.67459404, -18.63639498,  35.13560927,  -8.94801366],
       [-16.85867012,   2.37091184, -12.60769963,  36.59203053, -12.65939772],
       [ -4.52339423,  64.40725183,  -8.13469243, -38.5588243 ,  23.86049569],
       [-19.55605733,   2.75025773, -14.62493157,  42.44675541, -14.68490136],
       [-15.31128514,  55.81434882, -15.44903231,  -7.47401035,  11.43846714],
       [-23.94096953,  13.4769814 , -18.65752137,  44.29476893, -13.65633076],
       [-14.31983471,  19.90088141, -12.04188657,  17.51230657,  -3.10764837]])

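Compare the output above with the model's output obtained earlier. The values agree up to floating point precision: the model computes in `float32`, while the NumPy product here is `float64`.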
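
Rather than comparing by eye, the check can be automated. The sketch below copies the first three input rows, the kernel, and the corresponding model outputs printed in this notebook, and compares them with `np.allclose`. The tolerances are an assumption, chosen only to absorb `float32` rounding.

```python
import numpy as np

# First three input rows, the kernel, and the model outputs,
# copied from the cell outputs printed in this notebook.
x = np.array([[8, 42], [87, 31], [39, 73]])
kernel = np.array(
    [[0.0517453, 0.77041924, -0.0192523, -0.7022766, 0.37126076],
     [-0.3371734, 0.04741824, -0.252154, 0.7318406, -0.25318795]],
    dtype=np.float32)
model_out = np.array(
    [[-13.74732, 8.15492, -10.744486, 25.119091, -7.663808],
     [-5.950535, 68.49644, -9.491724, -38.411003, 24.45086],
     [-22.595592, 33.50788, -19.158081, 26.03558, -4.0035515]],
    dtype=np.float32)

manual_out = np.matmul(x, kernel)

# Tolerances are an assumption; they only need to cover float32 rounding.
matches = np.allclose(manual_out, model_out, rtol=1e-5, atol=1e-4)
print(matches)
```

In the notebook itself the same comparison could be written with the live variables, for example `np.allclose(out_np, np.matmul(in_np, dense_w), atol=1e-4)`.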

## Using Bias

By default, the `Dense` layer in TensorFlow also uses a bias.

In [None]:
reset_seed()  # also clears the Keras session

ins = Input(input_features, name='my_input')
out = Dense(dense_units, use_bias=True, name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

In [None]:
out_np = model.predict(in_np)
out_np.shape, out_np



((10, 5),
 array([[-13.74732  ,   8.15492  , -10.744486 ,  25.119091 ,  -7.663808 ],
        [ -5.950535 ,  68.49644  ,  -9.491724 , -38.411003 ,  24.45086  ],
        [-22.595592 ,  33.50788  , -19.158081 ,  26.03558  ,  -4.0035515],
        [-22.982853 ,  22.674595 , -18.636395 ,  35.13561  ,  -8.948013 ],
        [-16.85867  ,   2.3709118, -12.607699 ,  36.59203  , -12.659398 ],
        [ -4.5233936,  64.40725  ,  -8.134692 , -38.558826 ,  23.860497 ],
        [-19.556057 ,   2.7502577, -14.624931 ,  42.446754 , -14.684901 ],
        [-15.311285 ,  55.81435  , -15.449032 ,  -7.4740105,  11.438467 ],
        [-23.94097  ,  13.476982 , -18.657522 ,  44.29477  , -13.656332 ],
        [-14.319835 ,  19.90088  , -12.041887 ,  17.512308 ,  -3.107648 ]], dtype=float32))

In [None]:
dense_layer = model.get_layer('my_output')
dense_layer.weights

[<tf.Variable 'my_output/kernel:0' shape=(2, 5) dtype=float32, numpy=
 array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
        [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)>,
 <tf.Variable 'my_output/bias:0' shape=(5,) dtype=float32, numpy=array([0., 0., 0., 0., 0.], dtype=float32)>]

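The bias vector above is all zeros, which is the default initialization, so it has no effect on the model output. With a bias, the dense layer computes the following, where $x$ is the input, $A$ is the kernel and $b$ is the bias vector. For a batch with one sample per row, this corresponds to `np.matmul(x, A) + b`.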

$$ y = Ax + b$$
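
To see how the bias enters the computation, here is a small NumPy sketch with made-up numbers (they are not taken from the model in this notebook). The bias has one value per unit and is added to every row of the batch through broadcasting.

```python
import numpy as np

x = np.array([[1., 2.],
              [3., 4.]])            # 2 samples, 2 features
kernel = np.array([[1., 0., 2.],
                   [0., 1., 3.]])   # 2 features -> 3 units
bias = np.array([10., 20., 30.])    # one bias value per unit

# The bias of shape (3,) is broadcast over the batch dimension,
# so the same vector is added to every row of the product.
y = np.matmul(x, kernel) + bias
print(y)

# A bias of zeros leaves the matrix product unchanged.
y_zero_bias = np.matmul(x, kernel) + np.zeros(3)
```

With these numbers the first row of `y` is `[11, 22, 38]` and the second is `[13, 24, 48]`.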

We can initialize the bias vector with ones and observe the effect on the output.

In [None]:
reset_seed()

ins = Input(input_features, name='my_input')
out = Dense(dense_units, use_bias=True, bias_initializer='ones', name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

In [None]:
out_np = model.predict(in_np)
out_np.shape, out_np



((10, 5),
 array([[-12.74732  ,   9.15492  ,  -9.744486 ,  26.119091 ,  -6.663808 ],
        [ -4.950535 ,  69.49644  ,  -8.491724 , -37.411003 ,  25.45086  ],
        [-21.595592 ,  34.50788  , -18.158081 ,  27.03558  ,  -3.0035515],
        [-21.982853 ,  23.674595 , -17.636395 ,  36.13561  ,  -7.9480133],
        [-15.858669 ,   3.3709118, -11.607699 ,  37.59203  , -11.659398 ],
        [ -3.5233936,  65.40725  ,  -7.134692 , -37.558826 ,  24.860497 ],
        [-18.556057 ,   3.7502577, -13.624931 ,  43.446754 , -13.684901 ],
        [-14.311285 ,  56.81435  , -14.449032 ,  -6.4740105,  12.438467 ],
        [-22.94097  ,  14.476982 , -17.657522 ,  45.29477  , -12.656332 ],
        [-13.319835 ,  20.90088  , -11.041887 ,  18.512308 ,  -2.107648 ]], dtype=float32))

In [None]:
dense_layer = model.get_layer('my_output')
dense_layer.weights

[<tf.Variable 'my_output/kernel:0' shape=(2, 5) dtype=float32, numpy=
 array([[ 0.0517453 ,  0.77041924, -0.0192523 , -0.7022766 ,  0.37126076],
        [-0.3371734 ,  0.04741824, -0.252154  ,  0.7318406 , -0.25318795]], dtype=float32)>,
 <tf.Variable 'my_output/bias:0' shape=(5,) dtype=float32, numpy=array([1., 1., 1., 1., 1.], dtype=float32)>]

We can verify that the model's output is obtained following the equation we wrote above.

In [None]:
dense_layer = model.get_layer('my_output')
dense_w = dense_layer.weights[0].numpy()
np.matmul(in_np, dense_w) + np.ones(dense_units)

array([[-12.74732053,   9.15491986,  -9.74448609,  26.11909294,  -6.66380799],
       [ -4.95053476,  69.49643922,  -8.4917239 , -37.41100419,  25.45085973],
       [-21.59559184,  34.50788164, -18.15808117,  27.03557765,  -3.00355095],
       [-21.98285258,  23.67459404, -17.63639498,  36.13560927,  -7.94801366],
       [-15.85867012,   3.37091184, -11.60769963,  37.59203053, -11.65939772],
       [ -3.52339423,  65.40725183,  -7.13469243, -37.5588243 ,  24.86049569],
       [-18.55605733,   3.75025773, -13.62493157,  43.44675541, -13.68490136],
       [-14.31128514,  56.81434882, -14.44903231,  -6.47401035,  12.43846714],
       [-22.94096953,  14.4769814 , -17.65752137,  45.29476893, -12.65633076],
       [-13.31983471,  20.90088141, -11.04188657,  18.51230657,  -2.10764837]])

## Using the `activation` Function

We can add non-linearity to the output of a dense layer with the `activation` keyword argument. A common activation function is `relu`, which sets all negative values to zero.

In this case the equation of the dense layer becomes

$$ y = \alpha (Ax + b) $$

where $\alpha$ is the non-linearity, applied element-wise.
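
The equation can be sketched as a small NumPy function. `dense_forward` is a hypothetical helper written for this illustration, not part of Keras, and the numbers are made up.

```python
import numpy as np

def relu(z):
    # Element-wise: negative entries become 0, others are unchanged.
    return np.maximum(0, z)

def dense_forward(x, kernel, bias, activation=None):
    """Hypothetical helper: activation(x @ kernel + bias)."""
    z = np.matmul(x, kernel) + bias
    return z if activation is None else activation(z)

x = np.array([[1., -2.],
              [-3., 4.]])
kernel = np.array([[1., 0., 2.],
                   [0., 1., 3.]])
bias = np.ones(3)

linear_out = dense_forward(x, kernel, bias)                 # no activation
relu_out = dense_forward(x, kernel, bias, activation=relu)  # with relu
print(linear_out)
print(relu_out)
```

Here the linear output is `[[2, -1, -3], [-2, 5, 7]]`, and `relu` only replaces the negative entries with zero, leaving the positive ones untouched.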

In [None]:
reset_seed()

ins = Input(input_features, name='my_input')
out = Dense(dense_units, use_bias=True, bias_initializer='ones',
            activation='relu', name='my_output')(ins)
model = Model(inputs=ins, outputs=out)

out_np = model.predict(in_np)
out_np.shape, out_np



((10, 5), array([[ 0.       ,  9.15492  ,  0.       , 26.119091 ,  0.       ],
        [ 0.       , 69.49644  ,  0.       ,  0.       , 25.45086  ],
        [ 0.       , 34.50788  ,  0.       , 27.03558  ,  0.       ],
        [ 0.       , 23.674595 ,  0.       , 36.13561  ,  0.       ],
        [ 0.       ,  3.3709118,  0.       , 37.59203  ,  0.       ],
        [ 0.       , 65.40725  ,  0.       ,  0.       , 24.860497 ],
        [ 0.       ,  3.7502577,  0.       , 43.446754 ,  0.       ],
        [ 0.       , 56.81435  ,  0.       ,  0.       , 12.438467 ],
        [ 0.       , 14.476982 ,  0.       , 45.29477  ,  0.       ],
        [ 0.       , 20.90088  ,  0.       , 18.512308 ,  0.       ]], dtype=float32))

We can again verify that the above output from dense layer follows the equation that we wrote above.

In [None]:
def relu(X):
    return np.maximum(0, X)


dense_layer = model.get_layer('my_output')
dense_w = dense_layer.weights[0].numpy()
relu(np.matmul(in_np, dense_w) + np.ones(dense_units))

array([[ 0.        ,  9.15491986,  0.        , 26.11909294,  0.        ],
       [ 0.        , 69.49643922,  0.        ,  0.        , 25.45085973],
       [ 0.        , 34.50788164,  0.        , 27.03557765,  0.        ],
       [ 0.        , 23.67459404,  0.        , 36.13560927,  0.        ],
       [ 0.        ,  3.37091184,  0.        , 37.59203053,  0.        ],
       [ 0.        , 65.40725183,  0.        ,  0.        , 24.86049569],
       [ 0.        ,  3.75025773,  0.        , 43.44675541,  0.        ],
       [ 0.        , 56.81434882,  0.        ,  0.        , 12.43846714],
       [ 0.        , 14.4769814 ,  0.        , 45.29476893,  0.        ],
       [ 0.        , 20.90088141,  0.        , 18.51230657,  0.        ]])

## Customizing Weights

We can set the weights and bias of a dense layer to values of our choice. This is useful, for example, when we want to initialize the layer with values that we already have.

In [None]:
custom_dense_weights = np.array([[1, 2, 3 , 4,  5],
                                 [6, 7, 8 , 9 , 10]], dtype=np.float32)
custom_bias = np.array([0., 0., 0., 0., 0.])

reset_seed() 

ins = Input(input_features, name='my_input')

dense_lyr = Dense(dense_units, use_bias=True, bias_initializer='ones', name='my_output')
out = dense_lyr(ins)

model = Model(inputs=ins, outputs=out)

dense_lyr.set_weights([custom_dense_weights, custom_bias])

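The method `set_weights` must be called after the layer has been built, i.e. after it has been called on an input so that its variables exist. Here we call it after constructing the `Model`. Note that it overwrites the values produced by the initializers, so the `bias_initializer='ones'` above no longer has any effect.

The input to `set_weights` is a list containing the weight matrix and the bias vector, in that order.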
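
Since the order and shapes of the arrays matter, it can help to check them before calling `set_weights`. The function `check_dense_weights` below is a hypothetical helper written for this illustration; it is not part of Keras and only mirrors the `(input_size, units)` and `(units,)` shapes seen earlier.

```python
import numpy as np

def check_dense_weights(weights, input_size, units):
    """Hypothetical helper: True if `weights` is a [kernel, bias] list
    with the shapes a dense layer with bias would expect."""
    if len(weights) != 2:
        return False
    kernel, bias = weights
    return kernel.shape == (input_size, units) and bias.shape == (units,)

kernel = np.array([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 10]], dtype=np.float32)
bias = np.zeros(5, dtype=np.float32)

ok = check_dense_weights([kernel, bias], input_size=2, units=5)
swapped = check_dense_weights([bias, kernel], input_size=2, units=5)
print(ok, swapped)
```

The list in the correct order passes the check, while the swapped list does not.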

In [None]:
out_np = model.predict(in_np)
out_np.shape, out_np



((10, 5), array([[260., 310., 360., 410., 460.],
        [273., 391., 509., 627., 745.],
        [477., 589., 701., 813., 925.],
        [457., 554., 651., 748., 845.],
        [300., 350., 400., 450., 500.],
        [238., 346., 454., 562., 670.],
        [348., 406., 464., 522., 580.],
        [405., 530., 655., 780., 905.],
        [451., 537., 623., 709., 795.],
        [299., 368., 437., 506., 575.]], dtype=float32))

In [None]:
dense_layer = model.get_layer('my_output')
dense_w = dense_layer.weights[0].numpy()
dense_w

array([[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.]], dtype=float32)

Verify that the output of the dense layer is just the matrix product of the input and the custom weights, plus the (zero) bias.

In [None]:
np.matmul(in_np, custom_dense_weights) + np.zeros(dense_units)

array([[260., 310., 360., 410., 460.],
       [273., 391., 509., 627., 745.],
       [477., 589., 701., 813., 925.],
       [457., 554., 651., 748., 845.],
       [300., 350., 400., 450., 500.],
       [238., 346., 454., 562., 670.],
       [348., 406., 464., 522., 580.],
       [405., 530., 655., 780., 905.],
       [451., 537., 623., 709., 795.],
       [299., 368., 437., 506., 575.]])
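
With such simple weights the matrix product can also be spelled out by hand. The sketch below uses the first two input rows printed earlier in this notebook and computes every output value as the dot product of one input row with one kernel column. The explicit loops are for illustration only.

```python
import numpy as np

x = np.array([[8, 42],
              [87, 31]])                              # first two input rows
kernel = np.array([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 10]], dtype=np.float32)

out = np.zeros((x.shape[0], kernel.shape[1]))
for i in range(x.shape[0]):            # loop over samples
    for j in range(kernel.shape[1]):   # loop over dense units
        # e.g. out[0, 0] = 8 * 1 + 42 * 6 = 260
        out[i, j] = np.dot(x[i], kernel[:, j])
print(out)
```

Each unit therefore corresponds to one column of the kernel, and `np.matmul` computes all of these dot products at once.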

## Reducing Dimensions

A dense layer acts only on the last dimension of its input, so it can be used to reduce that dimension.

In the following example the shape is reduced from `(10, 20, 30)` to `(10, 20, 1)`.

In [None]:
input_shape = 20, 30
in_np = np.random.randint(0, 100, size=(batch_size,*input_shape))

reset_seed()


ins = Input(input_shape, name='my_input')
out = Dense(1, use_bias=False, name='my_output')(ins)
model = Model(inputs=ins, outputs=out)
out_np = model.predict(in_np)
print('input shape: {}\n output shape: {}'.format(in_np.shape, out_np.shape))

input shape: (10, 20, 30)
 output shape: (10, 20, 1)
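
The same behaviour can be reproduced in NumPy. This is a sketch under the assumption that the layer multiplies the last axis of the input by a kernel of shape `(30, 1)`; the random kernel here merely stands in for the layer's weights.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 100, size=(10, 20, 30)).astype(np.float64)
kernel = rng.normal(size=(30, 1))   # stand-in for the layer's kernel

# matmul treats the leading axes as batch dimensions and
# contracts only the last axis of x with the kernel.
y = np.matmul(x, kernel)
print(x.shape, y.shape)

# Equivalent view: flatten all leading axes into one batch axis,
# apply the 2-D product, then restore the original leading axes.
y_flat = np.matmul(x.reshape(-1, 30), kernel).reshape(10, 20, 1)
same = np.allclose(y, y_flat)
print(same)
```

Every one of the `10 * 20` vectors of length 30 is mapped independently to a single value using the same kernel.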
