# Modules, layers and Models

## Models
A model is, abstractly:
- A function that computes something on tensors (a forward pass)
- Some variables that can be updated in response to training
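
As a minimal sketch of these two ingredients, the snippet below pairs a forward function with two variables and applies one gradient-descent update. The names `forward`, `target` and `learning_rate`, and all numeric values, are illustrative assumptions rather than part of the original notebook.

```python
import tensorflow as tf

# Trainable state: one weight and one bias (arbitrary starting values).
w = tf.Variable(3.0)
b = tf.Variable(1.0)

def forward(x):
    """Forward pass: computes w * x + b on a tensor."""
    return w * x + b

x = tf.constant(2.0)
target = tf.constant(10.0)  # hypothetical training target

# Record the forward pass so gradients can be computed.
with tf.GradientTape() as tape:
    loss = (forward(x) - target) ** 2

grad_w, grad_b = tape.gradient(loss, [w, b])

# One gradient-descent step with a hypothetical learning rate.
learning_rate = 0.01
w.assign_sub(learning_rate * grad_w)
b.assign_sub(learning_rate * grad_b)

print(w.numpy(), b.numpy())
```

Here the loss gradient is -12 with respect to `w` and -6 with respect to `b`, so both variables move slightly upward, toward the target.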

## Modules
- Most models are made of layers. 
- Layers are functions with a known mathematical structure that can be reused and have trainable variables. 
- In TensorFlow, most high-level implementations of layers and models, such as Keras or Sonnet, are built on the same foundational class: **tf.Module**.
- Modules in TensorFlow provide a means to encapsulate related functionality, such as layers, blocks, or custom operations, into a reusable unit.

In summary,
- "**models**" in TensorFlow refer to higher-level abstractions that encompass the architecture and functionality of machine learning models, often defined by subclassing **tf.keras.Model**.
- "**Modules**", on the other hand, refer to reusable blocks or components that can be used within models, often defined by subclassing **tf.Module**. Modules help with code organization, modularity, and reusability, allowing you to encapsulate related functionality into separate units.

In [1]:
import tensorflow as tf
from datetime import datetime

# Load the TensorBoard notebook extension
%load_ext tensorboard

2023-07-13 21:01:34.434771: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-07-13 21:01:34.482282: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-07-13 21:01:34.484097: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [2]:
class SimpleModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        # You can set the trainability of variables on and off for any reason, 
        # including freezing layers and variables during fine-tuning.
        self.a_variable = tf.Variable(5.0, trainable=True, name="train")
        self.non_trainable_var = tf.Variable(5.0, trainable=False, name="non trainable")
    
    def __call__(self, inputs):
        return self.a_variable * inputs + self.non_trainable_var

In [3]:
simple_module = SimpleModule("simple")
simple_module.name

'simple'

In [4]:
simple_module(tf.Variable(2.0))

<tf.Tensor: shape=(), dtype=float32, numpy=15.0>

**Note:**
- ***tf.Module*** is the base class for both **tf.keras.layers.Layer** and **tf.keras.Model**, so everything you come across here also applies in Keras.
-  For historical compatibility reasons Keras layers do not collect variables from modules, so your models should use only modules or only Keras layers.

In [5]:
# list of all the trainable variables
simple_module.trainable_variables

(<tf.Variable 'train:0' shape=() dtype=float32, numpy=5.0>,)

In [6]:
# list of all the variables assigned to the simple_module
simple_module.variables

(<tf.Variable 'train:0' shape=() dtype=float32, numpy=5.0>,
 <tf.Variable 'non trainable:0' shape=() dtype=float32, numpy=5.0>)
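
To illustrate why the split between trainable and non-trainable variables matters, here is a small sketch of a manual update step that touches only `trainable_variables`. The class name `ScaleShift`, the step size and the input value are assumptions made for this example.

```python
import tensorflow as tf

class ScaleShift(tf.Module):  # hypothetical module, mirrors SimpleModule
    def __init__(self, name=None):
        super().__init__(name=name)
        self.scale = tf.Variable(5.0, trainable=True, name="scale")
        self.shift = tf.Variable(5.0, trainable=False, name="shift")

    def __call__(self, inputs):
        return self.scale * inputs + self.shift

module = ScaleShift()

with tf.GradientTape() as tape:
    loss = module(tf.constant(2.0))

# Differentiate and update only the trainable variables;
# the frozen variable is left untouched.
grads = tape.gradient(loss, module.trainable_variables)
for grad, var in zip(grads, module.trainable_variables):
    var.assign_sub(0.1 * grad)  # 0.1 is an arbitrary step size

print("scale:", module.scale.numpy())
print("shift:", module.shift.numpy())
```

Only `scale` changes after the step, which is the behaviour you rely on when freezing part of a model during fine-tuning.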

### Building a two-layer model using tf.Module

In [7]:
class Dense(tf.Module):
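    def __init__(self, in_features, out_features, name=None):
        super().__init__(name=name)
        # Note: name="W" is passed to tf.random.normal, not to tf.Variable,
        # so the variable itself gets the default name "Variable".
        self.weights = tf.Variable(tf.random.normal([in_features, out_features], name="W"))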
        self.bias = tf.Variable(tf.zeros(out_features), name="b")
    
    def __call__(self, inputs):
        z = inputs @ self.weights + self.bias
        return tf.nn.relu(z)

In [8]:
class SequentialModel(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        
        self.dense1 = Dense(in_features=4, out_features=3, name="dense1")
        self.dense2 = Dense(in_features=3, out_features=2, name="dense2")
    
    def __call__(self, inputs):
        out1 = self.dense1(inputs)
        out2 = self.dense2(out1)
        return out2

In [9]:
# create a model
model = SequentialModel()

In [10]:
x = tf.Variable([[1., 1., 0., 1.]])

print("Model output: ", model(x))

Model output:  tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)
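
An all-zero output is plausible here: the weights are random, and ReLU maps every negative pre-activation to zero. The sketch below walks through one dense layer by hand with fixed, made-up weights so the arithmetic can be checked.

```python
import tensorflow as tf

# Fixed, illustrative values (not taken from the model above).
x = tf.constant([[1.0, 2.0]])
weights = tf.constant([[1.0, -1.0],
                       [0.5, -2.0]])
bias = tf.constant([0.0, 1.0])

# Pre-activation: [1*1 + 2*0.5 + 0, 1*(-1) + 2*(-2) + 1] = [2, -4]
z = x @ weights + bias

# ReLU keeps positive entries and replaces negative ones with 0.
out = tf.nn.relu(z)

print(z.numpy())    # [[ 2. -4.]]
print(out.numpy())  # [[2. 0.]]
```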


In [11]:
model.variables

(<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 <tf.Variable 'Variable:0' shape=(4, 3) dtype=float32, numpy=
 array([[ 0.27624676,  0.8140204 , -1.423646  ],
        [-0.01590264, -1.0349654 , -0.7415145 ],
        [ 0.6664191 , -0.3525899 , -0.816536  ],
        [ 0.07416085, -0.1692806 , -0.16368097]], dtype=float32)>,
 <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>,
 <tf.Variable 'Variable:0' shape=(3, 2) dtype=float32, numpy=
 array([[-1.1496557 , -1.6198384 ],
        [ 0.13141622,  0.80947703],
        [-0.63101864,  0.09868614]], dtype=float32)>)

A tf.Module instance automatically and recursively collects any tf.Variable or tf.Module instances assigned to it. This allows you to manage a collection of tf.Modules through a single model instance, and to save and load whole models.
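
The recursion also reaches modules held in ordinary Python containers. The following sketch, with the hypothetical classes `Linear` and `Stack`, stores its layers in a list and still sees all of their variables from the top-level module.

```python
import tensorflow as tf

class Linear(tf.Module):  # hypothetical layer for this example
    def __init__(self, in_features, out_features, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([in_features, out_features]), name="w")
        self.b = tf.Variable(tf.zeros([out_features]), name="b")

    def __call__(self, x):
        return x @ self.w + self.b

class Stack(tf.Module):  # hypothetical container module
    def __init__(self, sizes, name=None):
        super().__init__(name=name)
        # Modules stored in a plain Python list are tracked as well.
        self.layers = [
            Linear(n_in, n_out, name=f"linear{i}")
            for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:]))
        ]

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

stack = Stack([4, 3, 2])
output = stack(tf.ones([1, 4]))

print(len(stack.variables))                  # two variables per layer
print([m.name for m in stack.submodules])
```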

In [12]:
print("List of all the submodules: ", model.submodules)
print("List of the name of the submodules: ", [sub.name for sub in model.submodules])

List of all the submodules:  (<__main__.Dense object at 0x7fa3d7195360>, <__main__.Dense object at 0x7fa3d7197730>)
List of the name of the submodules:  ['dense1', 'dense2']


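### Waiting to create variables
- By deferring variable creation until the first call, the layer no longer needs the input size up front; you specify only the output size.
- The input size is inferred from the last dimension of the first input, so the same layer class works with inputs of any width.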
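
A short sketch of the idea, using a hypothetical `LazyDense` class written for this example: two instances of the same class end up with differently shaped weights, depending only on the first input each one sees.

```python
import tensorflow as tf

class LazyDense(tf.Module):  # hypothetical name for this sketch
    def __init__(self, out_features, name=None):
        super().__init__(name=name)
        self.out_features = out_features
        self.is_built = False

    def __call__(self, x):
        if not self.is_built:
            # Infer the input size from the last dimension of the first input.
            in_features = x.shape[-1]
            self.w = tf.Variable(
                tf.random.normal([in_features, self.out_features]), name="w")
            self.b = tf.Variable(tf.zeros([self.out_features]), name="b")
            self.is_built = True
        return tf.nn.relu(x @ self.w + self.b)

narrow = LazyDense(out_features=2)
wide = LazyDense(out_features=2)

vars_before = len(narrow.variables)  # nothing exists before the first call

narrow(tf.ones([1, 3]))
wide(tf.ones([1, 10]))

print(vars_before, narrow.w.shape, wide.w.shape)
```

Note that the shapes are fixed after the first call, so every later input to the same instance needs the same last dimension.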

In [13]:
class FlexibleDense(tf.Module):
    def __init__(self, out_features, name=None):
        super().__init__(name=name)
        self.is_built = False
        self.out_features = out_features
    
    def __call__(self, x):
        if not self.is_built:
            self.weights = tf.Variable(
                tf.random.normal([x.shape[-1], self.out_features]),
                name="w"
            )
            self.bias = tf.Variable(tf.zeros(self.out_features), name="b")
            self.is_built = True
        z = x @ self.weights + self.bias
        return tf.nn.relu(z)

In [14]:
class SequentialModel(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        
        self.dense1 = FlexibleDense(out_features=3, name="dense1")
        self.dense2 = FlexibleDense(out_features=2, name="dense2")
    
    def __call__(self, inputs):
        out1 = self.dense1(inputs)
        out2 = self.dense2(out1)
        return out2

In [15]:
my_model = SequentialModel(name="flexible_model")
print("Model results (input_shape = 3):", my_model(tf.constant([[2.0, 2.0, 1.0]])))

Model results (input_shape = 3): tf.Tensor([[0. 0.]], shape=(1, 2), dtype=float32)


In [16]:
print(my_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor(
[[[0. 0.]
  [0. 0.]]], shape=(1, 2, 2), dtype=float32)


## Saving weights
- You can save a **tf.Module** as both a **checkpoint** and a **SavedModel**.

#### Checkpoint
- Checkpoints contain just the weights, that is, the values of the variables inside the module and its submodules.

In [17]:
chkp_path = "my_checkpoint"
checkpoint = tf.train.Checkpoint(model=my_model)
checkpoint.write(chkp_path)

'my_checkpoint'

In [18]:
checkpoint.model.variables

(<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 <tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
 array([[-1.6191789, -1.2352841,  0.1626661],
        [ 0.7143987, -0.2580629, -1.160111 ],
        [ 0.9796424,  1.2809355,  0.6541501]], dtype=float32)>,
 <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>,
 <tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
 array([[-1.4439256 , -0.00888472],
        [-1.0444151 ,  1.3851212 ],
        [-1.2846667 , -0.0374243 ]], dtype=float32)>)

In [19]:
# You can look inside a checkpoint to be sure the whole collection of variables is saved,
# sorted by the Python object that contains them
tf.train.list_variables(chkp_path)

[('_CHECKPOINTABLE_OBJECT_GRAPH', []),
 ('model/dense1/bias/.ATTRIBUTES/VARIABLE_VALUE', [3]),
 ('model/dense1/weights/.ATTRIBUTES/VARIABLE_VALUE', [3, 3]),
 ('model/dense2/bias/.ATTRIBUTES/VARIABLE_VALUE', [2]),
 ('model/dense2/weights/.ATTRIBUTES/VARIABLE_VALUE', [3, 2])]

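A checkpoint consists of two kinds of files: the data itself and an index file for metadata. During distributed (multi-machine) training the data files can be sharded, which is why they are numbered (e.g., '00000-of-00001'). In this case, though, there is only one shard.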
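
To see the files on disk, the sketch below writes a tiny checkpoint into a temporary directory and lists what was produced. The prefix `demo_checkpoint` and the attribute name `v` are assumptions chosen for the example.

```python
import os
import tempfile

import tensorflow as tf

# A bare module with a single variable is enough to produce a checkpoint.
module = tf.Module()
module.v = tf.Variable([1.0, 2.0, 3.0])

tmp_dir = tempfile.mkdtemp()
prefix = os.path.join(tmp_dir, "demo_checkpoint")  # hypothetical prefix
tf.train.Checkpoint(model=module).write(prefix)

# Expect one index file and one (unsharded) data file.
files = sorted(os.listdir(tmp_dir))
print(files)

# The saved keys follow the attribute path from the Checkpoint object.
saved_names = [name for name, _ in tf.train.list_variables(prefix)]
print(saved_names)
```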

In [20]:
new_model = SequentialModel()
new_checkpoint = tf.train.Checkpoint(model=new_model)
new_checkpoint.restore("my_checkpoint")

# The restored model uses the saved weights, so it gives the same
# output as my_model for the same input
new_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[0., 0.]], dtype=float32)>
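
Because the outputs above are all zeros, they do not show very clearly that the weights were restored. A more direct check, sketched here with hypothetical names and values, writes a known value, restores it into a fresh module that starts with a different value, and compares.

```python
import os
import tempfile

import tensorflow as tf

# Source module with a known value.
source = tf.Module()
source.v = tf.Variable(1.0)

prefix = os.path.join(tempfile.mkdtemp(), "demo_checkpoint")  # hypothetical prefix
tf.train.Checkpoint(model=source).write(prefix)

# Fresh module with the same structure but a different starting value.
target = tf.Module()
target.v = tf.Variable(99.0)
value_before = float(target.v.numpy())

# Restoring overwrites the variable with the saved value.
tf.train.Checkpoint(model=target).restore(prefix)
value_after = float(target.v.numpy())

print(value_before, value_after)
```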

## Saving functions
- TensorFlow can run models without the original Python objects.
- TensorFlow needs to know how to do the computations described in Python, but without the original code. To do this, you can make a graph.
- You can define a graph in the model above by adding the **@tf.function** decorator to indicate that this code should run as a graph.

In [21]:
class SequentialModel(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        
        self.dense1 = FlexibleDense(out_features=3, name="dense1")
        self.dense2 = FlexibleDense(out_features=2, name="dense2")
    
    @tf.function
    def __call__(self, inputs):
        out1 = self.dense1(inputs)
        out2 = self.dense2(out1)
        return out2

In [22]:
my_model = SequentialModel(name="flexible_model")
print(isinstance(my_model, SequentialModel))
print("Model results (input_shape = 3):", my_model(tf.constant([[2.0, 2.0, 1.0]])))

True
Model results (input_shape = 3): tf.Tensor([[0.        1.7730099]], shape=(1, 2), dtype=float32)


In [23]:
# print(my_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

The module you have made works exactly the same as before. Each unique signature passed into the function creates a separate graph. Check the Introduction to graphs and functions guide for details.
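
One way to observe the per-signature tracing is to record a Python side effect inside the function, since Python code in a `tf.function` body runs only while a graph is being traced. The function `double` and the list `trace_log` below are illustrative names, not part of the notebook.

```python
import tensorflow as tf

trace_log = []

@tf.function
def double(x):
    # This append runs only during tracing, not on every call.
    trace_log.append(x.shape)
    return x * 2

double(tf.constant([1.0, 2.0]))  # first signature: traces a graph
double(tf.constant([3.0, 4.0]))  # same shape and dtype: reuses that graph
double(tf.constant([[1.0]]))     # new shape: traces a second graph

print(len(trace_log))
```

Three calls produce two traces, one for each distinct input shape.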

#### You can visualize the graph by tracing it within a TensorBoard summary.

In [24]:
# # Set up logging.
# stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
# logdir = "logs/func/%s" % stamp
# writer = tf.summary.create_file_writer(logdir)

# # Create a new model to get a fresh trace
# # Otherwise the summary will not see the graph.
# new_model = SequentialModel()

# # Bracket the function call with
# # tf.summary.trace_on() and tf.summary.trace_export().
# tf.summary.trace_on(graph=True)
# tf.profiler.experimental.start(logdir)
# # Call only one tf.function when tracing.
# z = print(new_model(tf.constant([[2.0, 2.0, 2.0]])))
# with writer.as_default():
#     tf.summary.trace_export(
#       name="my_func_trace",
#       step=0,
#       profiler_outdir=logdir)

In [25]:
# %tensorboard --logdir logs/func

### Saving model
-  SavedModel contains both a collection of functions and a collection of weights. 

In [26]:
tf.saved_model.save(my_model, "the_saved_model_2")

INFO:tensorflow:Assets written to: the_saved_model_2/assets


In [27]:
!ls

'Gradient and Automatic Differentiation.ipynb'	'Tensorflow Components.ipynb'
 Modules_layers_models.ipynb			'Tensorflow operations.ipynb'
 my_checkpoint.data-00000-of-00001		 the_saved_model_2
 my_checkpoint.index
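
To look inside a SavedModel directory, the sketch below saves a small module to a temporary location and lists the contents. The class `Doubler` and the directory name are assumptions for this example, and the exact set of files may vary between TensorFlow versions.

```python
import os
import tempfile

import tensorflow as tf

class Doubler(tf.Module):  # hypothetical module for this example
    def __init__(self, name=None):
        super().__init__(name=name)
        self.factor = tf.Variable(2.0, name="factor")

    @tf.function
    def __call__(self, x):
        return self.factor * x

module = Doubler()
module(tf.constant([1.0, 2.0]))  # trace once so there is a graph to save

export_dir = os.path.join(tempfile.mkdtemp(), "demo_saved_model")
tf.saved_model.save(module, export_dir)

# saved_model.pb describes the functions; variables/ holds the weights.
contents = sorted(os.listdir(export_dir))
variable_files = sorted(os.listdir(os.path.join(export_dir, "variables")))
print(contents)
print(variable_files)
```

The `variables` directory is itself a checkpoint, with the same index and data files described earlier.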


### Loading the saved model

In [28]:
new_model = tf.saved_model.load("the_saved_model_2")
new_model

<tensorflow.python.saved_model.load.Loader._recreate_base_user_object.<locals>._UserObject at 0x7fa3cc255ea0>

In [29]:
isinstance(new_model, SequentialModel)

False

`new_model`, created by loading a saved model, is an internal TensorFlow user object without any of the class knowledge. It is not of type SequentialModel.

In [30]:
isinstance(my_model, SequentialModel)

True

In [31]:
print(my_model([[2, 12, 23]]))
print(my_model([[[12.0,12.0, 12.0], [2.0, 12.0, 12.0], [3.0, 4.0, 5.0]]]))

tf.Tensor([[ 0.       27.373873]], shape=(1, 2), dtype=float32)
tf.Tensor(
[[[ 0.       18.471558]
  [ 0.       14.634044]
  [ 0.        7.12486 ]]], shape=(1, 3, 2), dtype=float32)


This new model works on the already-defined input signatures. You can't add more signatures to a model restored like this.

In [36]:
print(new_model([[20, 12, 1]]))
try:
    print(new_model([[[12.0,12.0, 12.0], [2.0, 12.0, 12.0], [3.0, 4.0, 5.0]]]))
except Exception as e:
    print(f"{type(e).__name__}")

tf.Tensor([[0.        6.4120345]], shape=(1, 2), dtype=float32)
ValueError


**Explanation**:
- A SavedModel stores only the graphs that `@tf.function` traced before `tf.saved_model.save` was called. If `__call__` were not decorated with `@tf.function`, no graph would be saved and the loaded object could not be called at all.
- Before saving, `my_model` had only been called with an input of shape (1, 3), so that is the only signature stored. The loaded model has no graph for an input of shape (1, 3, 3), which is why it raises a ValueError.
- `my_model` itself is still the original Python object, so it can trace a new graph for each new signature. That is why the same call on `my_model` runs without error.
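
If the set of inputs is known in advance, one option is to declare it up front with the `input_signature` argument of `tf.function`. The sketch below is an assumption about how this could be applied here, not part of the original notebook: the class `Affine`, its fixed input width of 3 and the export path are all hypothetical. A `None` dimension lets the saved graph accept any batch size.

```python
import os
import tempfile

import tensorflow as tf

class Affine(tf.Module):  # hypothetical module for this example
    def __init__(self, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.ones([3, 2]), name="w")
        self.b = tf.Variable(tf.zeros([2]), name="b")

    # Declare the accepted input: any batch size, exactly 3 features.
    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
    def __call__(self, x):
        return x @ self.w + self.b

export_dir = os.path.join(tempfile.mkdtemp(), "demo_signature_model")
tf.saved_model.save(Affine(), export_dir)
loaded = tf.saved_model.load(export_dir)

# Both batch sizes match the single declared signature.
one_row = loaded(tf.ones([1, 3]))
four_rows = loaded(tf.ones([4, 3]))

print(one_row.shape, four_rows.shape)
```

With a declared signature the function does not need to be called before saving, and the loaded model accepts every input that matches the declaration. Inputs of a different rank, such as shape (1, 3, 3), are still rejected.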