
This guide covers machine learning with TensorFlow below the level of the Keras APIs.

Abstractly, a model is:
* A function that computes something on tensors (a forward pass)
* Some variables that can be updated in response to training
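
As a minimal sketch of these two ingredients (not part of the original notebook), the snippet below pairs a forward function with a single variable and applies one hand-written gradient step. The names `w` and `forward`, the target value 9.0, and the learning rate 0.1 are illustrative assumptions.

```python
import tensorflow as tf

# State: one trainable variable (hypothetical starting value).
w = tf.Variable(2.0)

def forward(x):
    # Forward pass: a function computed on tensors.
    return w * x

# One manual training step on a squared-error loss.
with tf.GradientTape() as tape:
    loss = (forward(tf.constant(3.0)) - 9.0) ** 2
grad = tape.gradient(loss, w)
w.assign_sub(0.1 * grad)  # update the variable in response to training

print(w.numpy())  # approximately 3.8
```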



In [55]:
import tensorflow as tf
from datetime import datetime

# Defining models and layers: 

Most models are made of layers. Layers are functions with a known mathematical structure that can be reused and that have trainable variables. In TensorFlow, most high-level implementations of layers and models, such as Keras or Sonnet, are built on the same foundational class: tf.Module


In [56]:
class SimpleModule(tf.Module):
  def __init__(self,name=None):
    super().__init__(name=name)
    self.a_variable = tf.Variable(5.0,name="train_me")
    self.non_trainable_variable = tf.Variable(5.0,trainable=False,name="do_not_train_me")
  @tf.function
  def __call__(self,x):
    return self.a_variable*x+self.non_trainable_variable

In [57]:
simple_module = SimpleModule(name="simple")

In [58]:
simple_module(tf.constant(5.0))

<tf.Tensor: shape=(), dtype=float32, numpy=30.0>

Modules and, by extension, layers are deep-learning terminology for "objects": they have internal state and methods that use that state.

There is nothing special about `__call__` except that it makes the object act like a Python callable; one can invoke a model through whatever methods one wishes.
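
To illustrate, here is a small sketch of a module that exposes its computation through an ordinary method instead of `__call__`. The class name `Affine` and the method name `forward` are hypothetical choices made for this example.

```python
import tensorflow as tf

class Affine(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(2.0, name="w")
        self.b = tf.Variable(1.0, name="b")

    # An ordinary method; nothing requires it to be `__call__`.
    def forward(self, x):
        return self.w * x + self.b

affine = Affine(name="affine")
result = affine.forward(tf.constant(3.0))
print(result.numpy())  # 7.0
```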

Trainability can be turned on or off for any reason, including freezing layers and variables during fine-tuning.
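
One visible effect of freezing, sketched below under illustrative names (`PartlyFrozen`, `scale`, `offset`): `tf.GradientTape` automatically watches only trainable variables, so a variable created with `trainable=False` receives no gradient unless it is watched explicitly.

```python
import tensorflow as tf

class PartlyFrozen(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.scale = tf.Variable(2.0, name="scale")
        self.offset = tf.Variable(3.0, trainable=False, name="offset")  # frozen

    def __call__(self, x):
        return self.scale * x + self.offset

module = PartlyFrozen()
with tf.GradientTape() as tape:
    y = module(tf.constant(4.0))

# Ask for gradients with respect to both variables.
scale_grad, offset_grad = tape.gradient(y, [module.scale, module.offset])
print(scale_grad.numpy())  # 4.0
print(offset_grad)         # None: the frozen variable was not watched
```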

*tf.Module* is the base class of both **tf.keras.layers.Layer** and **tf.keras.Model**. For historical compatibility reasons, Keras layers do not collect variables from modules, so a model should use only modules or only Keras layers.

By subclassing *tf.Module*, any *tf.Variable* or *tf.Module* instances assigned to this object's properties are automatically collected. This allows one to save and load variables, and also to create collections of *tf.Module*s.

In [59]:
print("trainable variables:",simple_module.trainable_variables)
#Every variable
print("all variables:",simple_module.variables)

trainable variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>,)
all variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>, <tf.Variable 'do_not_train_me:0' shape=() dtype=float32, numpy=5.0>)


In [60]:
class Dense(tf.Module):
  def __init__(self,in_features,out_features,name=None):
    super().__init__(name=name)
    self.w = tf.Variable(tf.random.normal([in_features,out_features]),name='w')
    self.b = tf.Variable(tf.zeros([out_features]),name='b')
  @tf.function 
  def __call__(self,x):
    y = tf.matmul(x,self.w)+self.b
    return tf.nn.relu(y)


In [61]:
class SequentialModel(tf.Module):
  def __init__(self,name=None):
    super().__init__(name=name)
    self.dense_1 = Dense(in_features=3,out_features=3)
    self.dense_2 = Dense(in_features=3,out_features=2)
  
  def __call__(self,x):
    x = self.dense_1(x)
    return self.dense_2(x)


In [62]:
my_model = SequentialModel(name="the_model")
print("Model results:",my_model(tf.constant([[2.0,2.0,2.0]])))

Model results: tf.Tensor([[0.        5.6875362]], shape=(1, 2), dtype=float32)


`tf.Module` instances automatically and recursively collect any `tf.Variable` or `tf.Module` instances assigned to them. This allows one to manage collections of `tf.Module`s with a single model instance, and to save and load whole models.

In [14]:
print("Submodels", my_model.submodules)

Submodels (<__main__.Dense object at 0x7f97ea3799e8>, <__main__.Dense object at 0x7f97ea3794a8>)


In [15]:
for var in my_model.variables:
  print(var,"\n")

<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
array([[-0.6983206 , -0.3311129 ,  1.7663767 ],
       [ 1.0761409 , -0.37199038, -0.1732689 ],
       [-0.31482482, -1.005461  ,  0.03379207]], dtype=float32)> 

<tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[ 0.8653841 , -0.839348  ],
       [-0.66861176,  0.49682468],
       [ 1.3994701 , -0.70983684]], dtype=float32)> 



# Waiting to create variables

By deferring variable creation to the first time the module is called with a specific input shape, one does not need to specify the input size up front.

In [24]:
class FlexibleDenseModule(tf.Module):
  #Note: No need for `in_features`
  def __init__(self,out_features,name=None):
    super().__init__(name=name)
    self.is_built=False
    self.out_features=out_features
  @tf.function
  def __call__(self,x):
    if not self.is_built:
      self.w = tf.Variable(tf.random.normal([x.shape[-1],self.out_features]),name='w')
      self.b = tf.Variable(tf.zeros([self.out_features]),name='b')
      self.is_built=True
    y =tf.matmul(x,self.w)+self.b
    return tf.nn.relu(y)  

In [25]:
class MySequentialModel(tf.Module):
  def __init__(self,name=None):
    super().__init__(name=name)
    self.dense1 = FlexibleDenseModule(out_features=3)
    self.dense2 = FlexibleDenseModule(out_features=2)
  
  @tf.function
  def __call__(self,x):
    x = self.dense1(x)
    return self.dense2(x)


In [26]:
my_model=MySequentialModel(name="the_model")
print("Model results:",my_model(tf.constant([[2.0,2.0,2.0]])))

Model results: tf.Tensor([[1.8044676 0.       ]], shape=(1, 2), dtype=float32)
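
The deferred creation can be observed directly. The sketch below uses a hypothetical `LazyDense` module (a simplified variant of the module above, with weights initialized to ones and no `tf.function`) and counts the collected variables before and after the first call.

```python
import tensorflow as tf

class LazyDense(tf.Module):
    def __init__(self, out_features, name=None):
        super().__init__(name=name)
        self.out_features = out_features
        self.is_built = False

    def __call__(self, x):
        # Create the variables on the first call, once the input size is known.
        if not self.is_built:
            self.w = tf.Variable(tf.ones([x.shape[-1], self.out_features]), name="w")
            self.b = tf.Variable(tf.zeros([self.out_features]), name="b")
            self.is_built = True
        return tf.matmul(x, self.w) + self.b

layer = LazyDense(out_features=2)
n_before = len(layer.variables)   # no variables exist yet
out = layer(tf.ones([1, 5]))      # the input size (5) is inferred here
n_after = len(layer.variables)
print(n_before, n_after, layer.w.shape)
```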


# Saving weights: 

One can save a tf.Module as both a checkpoint and a SavedModel. 

Checkpoints are just the weights (that is, the values of the set of variables inside the module and its submodules).

In [27]:
chkp_path = "my_checkpoint"
checkpoint = tf.train.Checkpoint(model=my_model)
checkpoint.write(chkp_path)

'my_checkpoint'

Checkpoints consist of two kinds of files: the data itself and an index file for metadata. The index file keeps track of what is actually saved and of the numbering of checkpoints, while the checkpoint data contains the variable values and their attribute lookup paths.

In [28]:
tf.train.list_variables(chkp_path)

[('_CHECKPOINTABLE_OBJECT_GRAPH', []),
 ('model/dense1/b/.ATTRIBUTES/VARIABLE_VALUE', [3]),
 ('model/dense1/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 3]),
 ('model/dense2/b/.ATTRIBUTES/VARIABLE_VALUE', [2]),
 ('model/dense2/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 2])]
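
The two kinds of files described above can also be seen on disk. The following sketch writes a checkpoint for a hypothetical one-variable module (`Scale`) into a temporary directory and lists what was produced; the directory and the prefix `demo_checkpoint` are assumptions made for this example.

```python
import os
import tempfile
import tensorflow as tf

class Scale(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(3.0, name="w")

with tempfile.TemporaryDirectory() as tmp_dir:
    prefix = os.path.join(tmp_dir, "demo_checkpoint")
    tf.train.Checkpoint(model=Scale()).write(prefix)
    # Files on disk: an index file plus one or more data shards.
    files = sorted(os.listdir(tmp_dir))
    # Variable values are keyed by their attribute lookup paths.
    saved_names = [name for name, _ in tf.train.list_variables(prefix)]

print(files)
print(saved_names)
```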

In [29]:
new_model = MySequentialModel()
new_checkpoint = tf.train.Checkpoint(model=new_model)
new_checkpoint.restore("my_checkpoint")

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f97ea275b70>

In [31]:
new_model(tf.constant([[2.0,2.0,2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[1.8044676, 0.       ]], dtype=float32)>

In [32]:
print(my_model([[[2.0,2.0,2.0],[2.0,2.0,2.0]]]))

tf.Tensor(
[[[1.8044676 0.       ]
  [1.8044676 0.       ]]], shape=(1, 2, 2), dtype=float32)


In [34]:
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = "logs/func/%s" % stamp
writer = tf.summary.create_file_writer(logdir)

new_model = MySequentialModel()
tf.summary.trace_on(graph=True,profiler=True)
print(new_model(tf.constant([[2.0,2.0,2.0]])))
with writer.as_default():
  tf.summary.trace_export(
      name="my_func_trace",
      step=0,
      profiler_outdir=logdir
  )

Instructions for updating:
use `tf.profiler.experimental.start` instead.
tf.Tensor([[2.0565543 1.4202245]], shape=(1, 2), dtype=float32)
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.


In [None]:
%load_ext tensorboard
%tensorboard --logdir logs/func


# Creating a SavedModel

The recommended way of sharing completely trained models is to use SavedModel. A SavedModel contains both a collection of functions and a collection of weights.

In [37]:
tf.saved_model.save(my_model,"the_saved_model")

INFO:tensorflow:Assets written to: the_saved_model/assets


In [38]:
!ls -l the_saved_model

total 28
drwxr-xr-x 2 root root  4096 Aug 15 10:53 assets
-rw-r--r-- 1 root root 19094 Aug 15 10:53 saved_model.pb
drwxr-xr-x 2 root root  4096 Aug 15 10:53 variables


In [39]:
!ls -l the_saved_model/variables

total 8
-rw-r--r-- 1 root root 402 Aug 15 10:53 variables.data-00000-of-00001
-rw-r--r-- 1 root root 355 Aug 15 10:53 variables.index


In [40]:
new_model = tf.saved_model.load("the_saved_model")

In [41]:
new_model([[2.0,2.0,2.0]])

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[1.8044676, 0.       ]], dtype=float32)>
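
As a self-contained sketch of the same round trip, the example below saves a hypothetical one-variable module (`Scale`) to a temporary directory and loads it back. Both the traced function and the weight survive, so the loaded object can be called and still exposes `w`. The `input_signature` is an assumption added here so the function is traced with a fixed signature before saving.

```python
import tempfile
import tensorflow as tf

class Scale(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(3.0, name="w")

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def __call__(self, x):
        return self.w * x

with tempfile.TemporaryDirectory() as export_dir:
    tf.saved_model.save(Scale(), export_dir)
    restored = tf.saved_model.load(export_dir)
    result = restored(tf.constant([1.0, 2.0])).numpy()  # saved function
    restored_w = float(restored.w.numpy())              # saved weight

print(result, restored_w)
```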

# Keras models and layers

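Note that up until this point, there has been no mention of Keras. One can build one's own high-level API on top of tf.Module, and people have. A module can be turned into a Keras layer by changing the parent class to `tf.keras.layers.Layer` and renaming `__call__` to `call`.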

In [46]:
class MyDense(tf.keras.layers.Layer):
  def __init__(self,in_features,out_features,**kwargs):
    super().__init__(**kwargs)
    self.w = tf.Variable(tf.random.normal([in_features,out_features]),name='W')
    self.b = tf.Variable(tf.zeros([out_features]),name='b')
  @tf.function
  def call(self,x):
    y = tf.matmul(x,self.w)+self.b
    return tf.nn.relu(y)

In [47]:
simple_layer = MyDense(name="simple",in_features=3,out_features=3)

In [48]:
simple_layer([[2.0,2.0,2.0]])

<tf.Tensor: shape=(1, 3), dtype=float32, numpy=array([[0.        , 0.        , 0.16166806]], dtype=float32)>

### The build step: 

As noted, it is convenient in many cases to wait to create variables until one is sure of the input shape.

Keras layers come with an extra lifecycle step that allows one more flexibility in how one defines the layers. This is defined in the `build()` function. 

`build` is called exactly once, and it is called with the shape of the input. It is usually used to create variables (weights).

In [49]:
class FlexibleDense(tf.keras.layers.Layer):
  def __init__(self,out_features,**kwargs):
    super().__init__(**kwargs)
    self.out_features=out_features
  
  def build(self,input_shape):# create the state of the layer(weights)
    self.w = tf.Variable(tf.random.normal([input_shape[-1],self.out_features]),name='w')
    self.b = tf.Variable(tf.zeros([self.out_features]),name='b')
  
  def call(self,inputs):
    return tf.matmul(inputs,self.w)+self.b
  
#Creating an instance of the layer
flexible_dense = FlexibleDense(out_features=3)

In [50]:
flexible_dense.variables

[]

In [51]:
print("Model results: ",flexible_dense(tf.constant([[2.0,2.0,2.0],[3.0,3.0,3.0]])))

Model results:  tf.Tensor(
[[ 5.0256906 -2.3145933  3.1164691]
 [ 7.538536  -3.47189    4.6747036]], shape=(2, 3), dtype=float32)


In [52]:
flexible_dense.variables

[<tf.Variable 'flexible_dense/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[ 1.3510668 ,  0.5209874 ,  0.50965416],
        [-0.5100422 ,  0.0262166 , -0.2408849 ],
        [ 1.6718206 , -1.7045007 ,  1.2894653 ]], dtype=float32)>,
 <tf.Variable 'flexible_dense/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

Since `build` is only called once, later inputs are rejected if their shape is not compatible with the variables the layer has already created.

In [54]:
try: 
  print("Model results:",flexible_dense(tf.constant([[2.0,2.0,2.0,2.0]])))
except tf.errors.InvalidArgumentError as e:
  print("Failed:", e)

Failed: Matrix size-incompatible: In[0]: [1,4], In[1]: [3,3] [Op:MatMul]
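
The once-only behaviour of `build` can be checked with a counter. This is a sketch under two assumptions: the `build_calls` attribute is a hypothetical addition for illustration, and the weight is created with the layer's `add_weight` method rather than with `tf.Variable` as in the notebook.

```python
import tensorflow as tf

class CountingDense(tf.keras.layers.Layer):
    def __init__(self, out_features, **kwargs):
        super().__init__(**kwargs)
        self.out_features = out_features
        self.build_calls = 0  # hypothetical counter, for illustration only

    def build(self, input_shape):
        self.build_calls += 1
        self.w = self.add_weight(
            shape=(input_shape[-1], self.out_features),
            initializer="ones",
            name="w",
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w)

layer = CountingDense(out_features=2)
first = layer(tf.ones([1, 3]))   # triggers build with the input shape
second = layer(tf.ones([4, 3]))  # same feature size: build is not run again
print(layer.build_calls)  # 1
print(second.shape)       # (4, 2)
```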


Keras layers have many more features, including:
* Optional losses
* Support for metrics
* Built-in support for an optional `training` argument to differentiate between training and inference use
* `get_config` and `from_config` methods that allow one to accurately store configurations so that models can be cloned in Python
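
Two of these features are sketched below with a hypothetical layer, `ScaleWhenTraining`: the `training` argument switches behaviour between training and inference, and `get_config` together with the inherited `from_config` rebuilds an equivalent layer. The layer and its `factor` parameter are invented for this example.

```python
import tensorflow as tf

class ScaleWhenTraining(tf.keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs, training=False):
        # Behave differently in training and in inference.
        if training:
            return inputs * self.factor
        return inputs

    def get_config(self):
        # Extend the base configuration with this layer's own argument.
        config = super().get_config()
        config.update({"factor": self.factor})
        return config

layer = ScaleWhenTraining(factor=3.0)
x = tf.constant([1.0, 2.0])
train_out = layer(x, training=True).numpy()
infer_out = layer(x, training=False).numpy()

# Rebuild an equivalent layer from the stored configuration.
clone = ScaleWhenTraining.from_config(layer.get_config())
print(train_out, infer_out, clone.factor)
```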

# Keras models: 

One can define a model as nested Keras layers.
However, Keras also provides a full-featured model class called `tf.keras.Model`. It inherits from `tf.keras.layers.Layer`, so a Keras model can be used, nested, and saved in the same way as Keras layers. Keras models come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.

One can define the `MySequentialModel` from above with nearly identical code, again converting `__call__` to `call()` and changing the parent class.

In [63]:
class MySequentialModel(tf.keras.Model):
  def __init__(self,name=None,**kwargs):
    super().__init__(**kwargs)
    self.dense1=FlexibleDense(out_features=3)
    self.dense2=FlexibleDense(out_features=2)
  def call(self,x):
    x = self.dense1(x)
    return self.dense2(x)

In [64]:
my_sequential_model = MySequentialModel(name="the_model")

In [65]:
print("Model results:",my_sequential_model(tf.constant([[2.0,2.0,2.0]])))

Model results: tf.Tensor([[-9.818333  6.250565]], shape=(1, 2), dtype=float32)


All the same features are available, including tracking variables and submodules. 

To emphasize the note above, a raw `tf.Module` nested inside a Keras layer or model will not get its variables collected for training or saving. Instead, nest Keras layers inside of Keras layers.

In [66]:
my_sequential_model.variables

[<tf.Variable 'my_sequential_model/flexible_dense_1/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[-1.399645  ,  0.06991847, -0.0432677 ],
        [-0.6534347 , -0.92082614,  0.4032547 ],
        [-0.17581803, -0.33068183, -0.86391443]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_1/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/w:0' shape=(3, 2) dtype=float32, numpy=
 array([[ 0.9092294 , -0.5006875 ],
        [ 1.7846612 , -1.5829638 ],
        [ 1.5356383 , -0.27561128]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>]

In [67]:
my_sequential_model.submodules

(<__main__.FlexibleDense at 0x7f97e7b23080>,
 <__main__.FlexibleDense at 0x7f97e7b237b8>)

Overriding `tf.keras.Model` is a very Pythonic approach to building TensorFlow models. If you are migrating from other frameworks, this can be straightforward. 

If one is constructing models that are simple assemblages of existing layers and inputs, one can save time and space by using the functional API, which comes with additional features around model reconstruction and architecture.

In [69]:
inputs = tf.keras.Input(shape=[3,])
x = FlexibleDense(3)(inputs)
x = FlexibleDense(2)(x)
my_functional_model = tf.keras.Model(inputs=inputs,outputs=x)
my_functional_model.summary()

Model: "functional_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 3)]               0         
_________________________________________________________________
flexible_dense_5 (FlexibleDe (None, 3)                 12        
_________________________________________________________________
flexible_dense_6 (FlexibleDe (None, 2)                 8         
Total params: 20
Trainable params: 20
Non-trainable params: 0
_________________________________________________________________


In [70]:
my_functional_model(tf.constant([[2.0,2.0,2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 2.8367457, 20.757769 ]], dtype=float32)>

The major difference here is that the input shape is specified up front as part of the functional construction process. The `input_shape` argument in this case does not have to be completely specified; one can leave some dimensions as `None`.
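
As a brief sketch of a partially specified shape, the model below leaves the middle dimension as `None` and then accepts inputs of different lengths along it. It uses the built-in `tf.keras.layers.Dense` rather than the notebook's `FlexibleDense`, purely to keep the example self-contained.

```python
import tensorflow as tf

# The sequence length is left unspecified; only the feature size (3) is fixed.
inputs = tf.keras.Input(shape=(None, 3))
outputs = tf.keras.layers.Dense(4)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

short = model(tf.ones([2, 5, 3]))  # sequence length 5
long = model(tf.ones([1, 7, 3]))   # sequence length 7
print(short.shape, long.shape)
```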

# Saving Keras models: 

Keras models can be checkpointed, and that will look the same as for `tf.Module`.

Keras models can also be saved with `tf.saved_model.save()`, as they are modules. However, Keras models have convenience methods and other functionality:

In [74]:
my_functional_model.save("exname_of_file")

INFO:tensorflow:Assets written to: exname_of_file/assets


In [75]:
reconstructed_model = tf.keras.models.load_model("exname_of_file")



In [76]:
reconstructed_model(tf.constant([[2.0,2.0,2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 2.8367457, 20.757769 ]], dtype=float32)>