The parameter of the TCN #239
@yanghui-wng here is a detailed version of the weights contained in the TCN model:

[screenshot: per-weight breakdown of the TCN model]

And here are the TCN blocks (the breakdown by block):

[screenshot: per-block breakdown of the TCN weights]
To reproduce it, you can run this script:

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras import Sequential
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LeakyReLU

from tcn import TCN

input_dim = 7
timesteps = 1

print('Loading data...')
x_train = np.zeros(shape=(100, timesteps, input_dim))
y_train = np.zeros(shape=(100, 1))

batch_size = None

model = Sequential()
input_layer = Input(batch_shape=(batch_size, timesteps, input_dim))
model.add(input_layer)
model.add(TCN(nb_filters=100,  # Integer. The number of filters to use in the convolutional layers. Would be similar to units for LSTM. Can be a list.
              kernel_size=3,  # Integer. The size of the kernel to use in each convolutional layer.
              nb_stacks=1,  # The number of stacks of residual blocks to use.
              dilations=(1, 2, 4),  # List/Tuple. A dilation list. Example is: [1, 2, 4, 8, 16, 32, 64].
              padding='causal',
              use_skip_connections=False,
              dropout_rate=0.1,
              return_sequences=False,
              activation='relu',
              kernel_initializer='he_normal',
              use_batch_norm=False,
              use_layer_norm=False))
model.add(Dense(64))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(32))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(16))
model.add(LeakyReLU(alpha=0.3))
model.add(Dense(1))
model.add(LeakyReLU(alpha=0.3))
model.compile(loss='mse', optimizer='adam')

# Run: tensorboard --logdir logs_tcn
# Browse to http://localhost:6006/#graphs&run=train
# and double-click on TCN to expand the inner layers.
# It takes time to write the graph to TensorBoard. Wait until the first epoch is completed.
tensorboard = TensorBoard(
    log_dir='logs_tcn',
    histogram_freq=1,
    write_images=True
)

print('Train...')
model.fit(
    x_train, y_train,
    batch_size=batch_size,
    callbacks=[tensorboard],
    epochs=10
)
```

Run it and a folder called `logs_tcn` will be created. Go to http://localhost:6006/, select GRAPHS, and you will see it:

[screenshot: the TCN graph expanded in TensorBoard]
I guess that with all those tools, you should be able to find your answer.
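For completeness, here is a small back-of-the-envelope sketch (mine, not part of keras-tcn) that reproduces the 153,500 figure from `model.summary()`, under the assumption that each residual block contains two dilated `Conv1D` layers, and that the first block additionally has a 1x1 convolution to match the 7 input channels to the 100 residual channels:

```python
# Hypothetical helper, not a keras-tcn API: count Conv1D parameters
# (one weight per input channel x kernel tap x filter, plus one bias per filter).
def conv1d_params(in_channels, out_channels, kernel_size):
    return in_channels * kernel_size * out_channels + out_channels

input_dim = 7
nb_filters = 100
kernel_size = 3
dilations = (1, 2, 4)

total = 0
in_ch = input_dim
for i, d in enumerate(dilations, start=1):
    block = conv1d_params(in_ch, nb_filters, kernel_size)        # first dilated conv
    block += conv1d_params(nb_filters, nb_filters, kernel_size)  # second dilated conv
    if in_ch != nb_filters:
        block += conv1d_params(in_ch, nb_filters, 1)             # 1x1 matching conv
    print(f'block {i} (dilation {d}): {block} params')
    total += block
    in_ch = nb_filters  # subsequent blocks see nb_filters channels

print('total TCN parameters:', total)  # 153500
```

The first block contributes 2,200 + 30,100 + 800 = 33,100 parameters, and the second and third contribute 60,200 each, which sums to 153,500. Note the dilation rate does not affect the parameter count, only the receptive field.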
Thank you for your help! Thanks to your explanation, I now know how to calculate the parameters of the TCN.
Good to hear!
I am studying the TCN code on GitHub (https://github.com/philipperemy/keras-tcn). The number of parameters of the TCN that I calculated by hand differs from the answer given by `model.summary()`: the function reports 153500 parameters for the TCN layer, but my calculation gives 153000, and I am not clear about how the value 153500 is obtained.