<a href="https://colab.research.google.com/github/yoheikikuta/TensorFlow2-check/blob/master/colab/check_parameter_sharing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Check parameter sharing



In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass

import tensorflow as tf

TensorFlow 2.x selected.


In [2]:
tf.__version__

'2.0.0'

In [0]:
import numpy as np
from tensorflow.keras import layers
from tensorflow.keras import Model

## Non-shared case

In [0]:
class Model1(Model):
    def __init__(self):
        super(Model1, self).__init__()
        self.input_layer = layers.InputLayer((28, 28, 1))
        self.conv1 = layers.Conv2D(32, 3, activation='relu')
        self.conv2 = layers.Conv2D(32, 3, activation='relu')
        self.conv3 = layers.Conv2D(32, 3, activation='relu')
        self.flatten = layers.Flatten()
        self.d1 = layers.Dense(128, activation='relu')
        self.d2 = layers.Dense(10, activation='softmax')

    def call(self, x):
        x = self.input_layer(x)
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

In [0]:
model1 = Model1()

In [6]:
model1.trainable_variables, model1.non_trainable_variables

([], [])

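Both lists are empty because a subclassed Keras model creates its variables lazily: each layer is built the first time it is called on an input of known shape. We need to call the model at least once before inspecting its variables.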
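A minimal sketch of this behavior on a single layer, separate from the notebook's model. The `Dense(4)` layer and the all-zero input are illustrative assumptions chosen only to keep the example small.

```python
import tensorflow as tf

# A freshly constructed layer owns no variables yet,
# because it does not know its input shape.
dense = tf.keras.layers.Dense(4)
vars_before = len(dense.trainable_variables)

# The first call builds the layer: the kernel shape is inferred from the
# last input dimension (3) and the number of units (4).
_ = dense(tf.zeros((1, 3)))
shapes_after = [tuple(v.shape) for v in dense.trainable_variables]

print(vars_before)
print(shapes_after)
```

The same thing happens to every layer inside `Model1` when the model is first called on a batch.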

In [0]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

In [8]:
model1(tf.constant(x_train[0:1]))



To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.



<tf.Tensor: id=144, shape=(1, 10), dtype=float32, numpy=
array([[0.09472661, 0.09300236, 0.0958236 , 0.09831807, 0.11362754,
        0.11690934, 0.09909359, 0.09278698, 0.09523912, 0.10047279]],
      dtype=float32)>

In [9]:
len(model1.trainable_variables), len(model1.non_trainable_variables)

(10, 0)

In [10]:
model1.trainable_variables[0].shape, model1.trainable_variables[1].shape, model1.trainable_variables[2].shape

(TensorShape([3, 3, 1, 32]), TensorShape([32]), TensorShape([3, 3, 32, 32]))

In [0]:
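def compute_trainable_var_num(trainable: list) -> int:
    """Print the size of each variable and return the total parameter count."""
    total_num = 0
    for idx, component in enumerate(trainable):
        # Number of scalar parameters = product of the variable's dimensions.
        num = int(np.prod(component.shape))
        print(idx, component.name, num)
        total_num += num

    return total_num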

In [12]:
total_num = compute_trainable_var_num(model1.trainable_variables)

0 model1/conv2d/kernel:0 288
1 model1/conv2d/bias:0 32
2 model1/conv2d_1/kernel:0 9216
3 model1/conv2d_1/bias:0 32
4 model1/conv2d_2/kernel:0 9216
5 model1/conv2d_2/bias:0 32
6 model1/dense/kernel:0 1982464
7 model1/dense/bias:0 128
8 model1/dense_1/kernel:0 1280
9 model1/dense_1/bias:0 10


In [13]:
total_num

2002698
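The per-variable sizes above can be reproduced by hand. The sketch below assumes the Keras defaults used by `Model1` (3x3 kernels, `'valid'` padding, stride 1). The helper names `conv2d_params` and `dense_params` are hypothetical and not part of TensorFlow.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter for a square-kernel Conv2D."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(in_features, units):
    """Weight matrix plus one bias per unit for a Dense layer."""
    return in_features * units + units


# With 'valid' padding and stride 1, each 3x3 convolution shrinks the
# height and width by 2: 28 -> 26 -> 24 -> 22.
side = 28
for _ in range(3):
    side -= 3 - 1
flat_features = side * side * 32  # size of the Flatten output

counts = [
    conv2d_params(3, 1, 32),           # conv1: 288 + 32
    conv2d_params(3, 32, 32),          # conv2: 9216 + 32
    conv2d_params(3, 32, 32),          # conv3: 9216 + 32
    dense_params(flat_features, 128),  # d1: 1982464 + 128
    dense_params(128, 10),             # d2: 1280 + 10
]
total = sum(counts)
print(counts, total)
```

The total matches the value returned by `compute_trainable_var_num` for `model1`.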

## Shared case

In [0]:
class Model2(Model):
    def __init__(self):
        super(Model2, self).__init__()
        self.input_layer = layers.InputLayer((28, 28, 1))
        self.conv_1 = layers.Conv2D(32, 3, activation='relu')
        self.conv_2_3 = layers.Conv2D(32, 3, activation='relu')
        self.flatten = layers.Flatten()
        self.d1 = layers.Dense(128, activation='relu')
        self.d2 = layers.Dense(10, activation='softmax')

    def call(self, x):
        x = self.input_layer(x)
        x = self.conv_1(x)
        x = self.conv_2_3(x)
        x = self.conv_2_3(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

In [0]:
model2 = Model2()

In [16]:
model2.trainable_variables, model2.non_trainable_variables

([], [])

In [17]:
model2(tf.constant(x_train[0:1]))



To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.



<tf.Tensor: id=264, shape=(1, 10), dtype=float32, numpy=
array([[0.10101324, 0.10935233, 0.09916582, 0.09875683, 0.09361419,
        0.0980546 , 0.09641179, 0.10259576, 0.10012846, 0.10090692]],
      dtype=float32)>

In [18]:
total_num = compute_trainable_var_num(model2.trainable_variables)

0 model2/conv2d_3/kernel:0 288
1 model2/conv2d_3/bias:0 32
2 model2/conv2d_4/kernel:0 9216
3 model2/conv2d_4/bias:0 32
4 model2/dense_2/kernel:0 1982464
5 model2/dense_2/bias:0 128
6 model2/dense_3/kernel:0 1280
7 model2/dense_3/bias:0 10


In [19]:
total_num

1993450

In [20]:
2002698 - 1993450 == 9216 + 32

True

It's easy to share model parameters; we just need to reuse the same layer objects.
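Reuse works for `conv_2_3` because its input and its output both have 32 channels, so the layer can be applied to its own output. `conv_1` could not be reused this way: its kernel was built for a 1-channel input but it produces 32 channels. The following pure-Python sketch mimics that shape rule. `apply_conv` is a hypothetical helper written for illustration, not a Keras function, and it assumes `'valid'` padding with stride 1.

```python
def apply_conv(input_shape, expected_channels, filters, kernel_size=3):
    """Return the output shape of a 'valid', stride-1 convolution.

    Raises ValueError when the input's channel count differs from the one
    the kernel was built for.
    """
    height, width, channels = input_shape
    if channels != expected_channels:
        raise ValueError(
            f"kernel expects {expected_channels} channels, got {channels}"
        )
    return (height - kernel_size + 1, width - kernel_size + 1, filters)


# Model2's convolutional stack: conv_1 once, then conv_2_3 twice.
shape = apply_conv((28, 28, 1), expected_channels=1, filters=32)  # conv_1
shape = apply_conv(shape, expected_channels=32, filters=32)       # conv_2_3
shape = apply_conv(shape, expected_channels=32, filters=32)       # conv_2_3 again
print(shape)

# Applying conv_1 to its own 32-channel output violates the channel check.
try:
    apply_conv((26, 26, 32), expected_channels=1, filters=32)
    conv_1_reusable = True
except ValueError:
    conv_1_reusable = False
print(conv_1_reusable)
```

The final shape is the same as in `Model1`, which is why the two models differ only by the parameters of one convolution layer.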