<a href="https://colab.research.google.com/github/nigoda/machine_learning/blob/main/32(a)_TensorFlow_customization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **tf.keras**
We have used the tf.keras API extensively throughout the course. Let's review the main concepts behind tf.keras before learning how to customize it.

tf.keras is TensorFlow's implementation of the Keras API specification. This is a high-level API to build and train models that includes first-class support for TensorFlow-specific functionality, such as eager execution, tf.data pipelines, and Estimators. tf.keras makes TensorFlow easier to use without sacrificing flexibility and performance.

In [1]:
#  Note: Select 'GPU' hardware accelerator

from __future__ import absolute_import, division, print_function, unicode_literals

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf

# We need to import keras from tensorflow package.
from tensorflow import keras

`tf.keras` can run any Keras-compatible code, but keep in mind:
*  The `tf.keras` version in the latest TensorFlow release might not be the same as the latest keras version from PyPI. Check `tf.keras.__version__`.
*  When saving a model's weights, tf.keras defaults to the checkpoint format. Pass save_format='h5' to use HDF5 (or pass a filename that ends in .h5).

## **Build a simple model**
### **Sequential model**
In Keras, we assemble *layers* to build *models*. A model is (usually) a *graph of layers*. The most common type of model is a stack of layers: the tf.keras.Sequential model.

To build a simple, fully-connected network (i.e. a multi-layer perceptron), as we have seen on numerous occasions in this course:


In [2]:
from tensorflow.keras import layers

model = tf.keras.Sequential()
# Add a densely-connected layer with 64 units to the model:
model.add(layers.Dense(64, activation='relu'))
# Add another:
model.add(layers.Dense(64, activation = 'relu'))
# Add a softmax layer with 10 output units:
model.add(layers.Dense(10, activation = 'softmax'))
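
To make the "stack of layers" idea concrete, here is a minimal NumPy sketch of the forward pass that such a three-layer stack performs. It is not how tf.keras.Sequential is implemented; the helper names (`dense`, `relu`, `softmax`, `forward`), the input width of 32, and the small random weights are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row-wise max for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def dense(units_in, units_out, activation):
    """Return a function computing activation(x @ kernel + bias)."""
    kernel = rng.normal(scale=0.1, size=(units_in, units_out))
    bias = np.zeros(units_out)
    return lambda x: activation(x @ kernel + bias)

# A "sequential" model is just the layers applied one after another.
stack = [dense(32, 64, relu), dense(64, 64, relu), dense(64, 10, softmax)]

def forward(x):
    for layer in stack:
        x = layer(x)
    return x

batch = rng.random((4, 32))   # 4 samples, 32 features (assumed input width)
probs = forward(batch)
print(probs.shape)            # (4, 10)
print(probs.sum(axis=1))      # each row sums to 1, up to rounding
```

Each row of `probs` is a probability distribution over the 10 output units, which is what the final softmax layer provides.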
  

We will learn how to write advanced models by:
*  Implementing layers and models from scratch with subclassing.
*  Using the functional API.

## **Configure the layers**
There are many `tf.keras.layers` available. Most of them share common constructor arguments:
*  activation: Sets the activation function for the layer. This parameter is specified by the name of a built-in function or as a callable object. By default, no activation is applied.
*  kernel_initializer and bias_initializer: The initialization schemes that create the layer's weights (kernel and bias). This parameter is a name or a callable object. The kernel defaults to the "Glorot uniform" initializer, and the bias defaults to zeros.
*  kernel_regularizer and bias_regularizer: The regularization schemes that apply to the layer's weights (kernel and bias), such as L1 or L2 regularization. By default, no regularization is applied.

The following instantiates tf.keras.layers.Dense layers using constructor arguments:

In [3]:
# Create a sigmoid layer:
layers.Dense(64, activation='sigmoid')
# Or:
layers.Dense(64, activation=tf.keras.activations.sigmoid)

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix
layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1(0.01))

# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layers.Dense(64, bias_regularizer=tf.keras.regularizers.l2(0.01))

# A linear layer with kernel initialized to a random orthogonal matrix:
layers.Dense(64, kernel_initializer='orthogonal')

# A linear layer with a bias vector initialized to 2.0s:
layers.Dense(64, bias_initializer=tf.keras.initializers.Constant(2.0))


<tensorflow.python.keras.layers.core.Dense at 0x7fec8d727790>
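
As a rough illustration of what the initializer and regularizer arguments mean numerically, the NumPy sketch below draws a kernel from a Glorot uniform distribution and computes L1 and L2 penalty terms by hand. The layer sizes (`fan_in`, `fan_out`) are assumed values, and this is a hand-written approximation of the idea rather than the code Keras runs internally.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 32, 64   # assumed input width and number of units

# Glorot (Xavier) uniform: sample from U(-limit, limit).
limit = np.sqrt(6.0 / (fan_in + fan_out))
kernel = rng.uniform(-limit, limit, size=(fan_in, fan_out))
bias = np.full(fan_out, 2.0)   # like a constant initializer with value 2.0

# Penalty terms that regularizers add to the training loss.
l1_penalty = 0.01 * np.abs(kernel).sum()     # L1 with factor 0.01 on the kernel
l2_penalty = 0.01 * np.square(bias).sum()    # L2 with factor 0.01 on the bias

print(limit)   # 0.25, since 6 / (32 + 64) = 0.0625
```

The penalties grow with the magnitude of the weights, so minimizing the total loss pushes the regularized weights toward smaller values.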

## **Train and evaluate**

### **Set up training**
After the model is constructed, configure its learning process by calling the compile method:

In [4]:
model = tf.keras.Sequential([
  # Add a densely-connected layer with 64 units to the model:
  layers.Dense(64, activation='relu', input_shape=(32,)),
  # Add another:
  layers.Dense(64, activation='relu'),
  # Add a softmax layer with 10 output units:
  layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss = 'categorical_crossentropy',
              metrics=['accuracy'])

`tf.keras.Model.compile` takes three important arguments:
*  optimizer : This object specifies the training procedure. Pass it optimizer instances from the `tf.keras.optimizers` module, such as `tf.keras.optimizers.Adam` or `tf.keras.optimizers.SGD`. If you just want to use the default parameters, you can also specify optimizers via strings, such as 'adam' or 'sgd'.
*  loss : The function to minimize during optimization. Common choices include mean squared error (mse), categorical_crossentropy, and binary_crossentropy. Loss functions are specified by name or by passing a callable object from the tf.keras.losses module.
*  metrics : Used to monitor training. These are string names or callables from the tf.keras.metrics module.
*  Additionally, to make sure the model trains and evaluates eagerly, pass run_eagerly=True as a parameter to compile.

The following shows a few examples of configuring a model for training:

In [5]:
# Configure a model for mean-squared error regression.
model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss='mse',
              metrics=['mae'])
# Configure a model for categorical classification.
model.compile(optimizer=tf.keras.optimizers.RMSprop(0.01),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])
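
To see what these loss and metric names compute, here is a small NumPy sketch of categorical cross-entropy, categorical accuracy, and mean squared error on a toy batch. The function names, the clipping constant `eps`, and the example values are assumptions for illustration; the Keras implementations handle more cases (sample weights, logits, reduction modes).

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose highest-scoring class matches the label."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

def mean_squared_error(y_true, y_pred):
    """Mean of the squared element-wise differences."""
    return float(np.mean(np.square(y_true - y_pred)))

y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])   # one-hot labels
y_pred = np.array([[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]])   # predicted probabilities

loss = categorical_crossentropy(y_true, y_pred)   # (-ln 0.8 - ln 0.3) / 2
acc = categorical_accuracy(y_true, y_pred)        # 0.5: the second sample is misclassified
mse = mean_squared_error(y_true, y_pred)
print(loss, acc, mse)
```

The loss is what the optimizer minimizes, while metrics such as accuracy are only reported.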

## **Train from NumPy data**
For small datasets, use in-memory NumPy arrays to train and evaluate a model. The model is "fit" to the training data using the fit method:


In [6]:
import numpy as np

data = np.random.random((1000, 32))
print("Data:")
print(data.shape)
print(data)
print()

labels = np.random.random((1000, 10))
print("Labels:")
print(labels.shape)
print(labels)
print()

model.fit(data, labels, epochs=10, batch_size=32)


Data:
(1000, 32)
[[0.66493292 0.03427028 0.41507935 ... 0.52445452 0.34191234 0.15536374]
 [0.26773882 0.09192836 0.74742292 ... 0.81986545 0.87898716 0.8013343 ]
 [0.13219433 0.89601486 0.84929763 ... 0.80392135 0.85404215 0.84809613]
 ...
 [0.08073657 0.83437597 0.0373483  ... 0.24063384 0.10079778 0.76505798]
 [0.86642249 0.20750078 0.93285412 ... 0.00735963 0.70505304 0.81890177]
 [0.03504074 0.71309329 0.46734353 ... 0.94089298 0.73508259 0.60327789]]

Labels:
(1000, 10)
[[0.52678488 0.22353622 0.99433032 ... 0.60630028 0.25149095 0.88994783]
 [0.70849782 0.8903322  0.96949003 ... 0.02127129 0.69818863 0.44752314]
 [0.26490401 0.35204058 0.9720163  ... 0.9730748  0.18404089 0.27940687]
 ...
 [0.73279546 0.30233695 0.66213487 ... 0.37244311 0.42938965 0.3189341 ]
 [0.08103672 0.21838265 0.40160957 ... 0.14582414 0.80448217 0.28360193]
 [0.8852957  0.40168594 0.63173203 ... 0.11574588 0.24507505 0.37099878]]

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/

<tensorflow.python.keras.callbacks.History at 0x7fec70367510>

tf.keras.Model.fit takes three important arguments:
*  epochs : Training is structured into *epochs*. An epoch is one iteration over the entire input data (this is done in smaller batches).
*  batch_size : When passed NumPy data, the model slices the data into smaller batches and iterates over these batches during training. This integer specifies the size of each batch. Be aware that the last batch may be smaller if the total number of samples is not divisible by the batch size.
*  validation_data : When prototyping a model, you want to easily monitor its performance on validation data. Passing this argument (a tuple of inputs and labels) allows the model to display the loss and metrics in inference mode for the passed data at the end of each epoch.
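
The batching arithmetic described for batch_size can be checked with a few lines of plain Python. The helper `batch_sizes` is a hypothetical name used only in this sketch; it assumes that no samples are dropped, so the final batch holds whatever remains.

```python
import math

def batch_sizes(num_samples, batch_size):
    """Sizes of the consecutive batches that cover one epoch."""
    steps = math.ceil(num_samples / batch_size)
    return [min(batch_size, num_samples - i * batch_size) for i in range(steps)]

sizes = batch_sizes(1000, 32)
print(len(sizes))   # 32 steps per epoch
print(sizes[-1])    # 8: the last batch holds the remaining samples
```

With 1000 samples and a batch size of 32, one epoch therefore consists of 31 full batches and one batch of 8.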

Here's an example using validation_data:


In [7]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

val_data = np.random.random((100, 32))
val_labels = np.random.random((100, 10))

model.fit(data, labels, epochs=10, batch_size=32,
          validation_data=(val_data, val_labels))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fec70190790>

## **Train from tf.data datasets**
Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method:
 

In [8]:
# Instantiate a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)

model.fit(dataset, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fec87965fd0>

Since the Dataset yields batches of data, this snippet does not require a batch_size.

Datasets can also be used for validation:

In [9]:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)

val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32)

model.fit(dataset, epochs=10,
          validation_data=val_dataset)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fec2021e9d0>

## **Evaluate and predict**
The tf.keras.Model.evaluate and tf.keras.Model.predict methods can use NumPy data and a tf.data.Dataset.

Here's how to *evaluate* the inference-mode loss and metrics for the data provided: 

In [10]:
# With NumPy array
data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

model.evaluate(data, labels, batch_size=32)

# With a Dataset
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)

model.evaluate(dataset)



[201694.546875, 0.09200000017881393]

And here's how to *predict* the output of the last layer in inference mode for the data provided, as a NumPy array:


In [11]:
result = model.predict(data, batch_size=32)
print(result.shape)

(1000, 10)
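
Because the last layer is a softmax, each row of the prediction is a probability distribution over the classes. A common follow-up step, sketched below with NumPy, is to turn those rows into class indices. The array `probs` is a made-up stand-in for a model's output (3 samples, 4 classes), not a value produced by the model above.

```python
import numpy as np

# Stand-in for a prediction array: 3 samples, 4 classes (values are made up).
probs = np.array([[0.10, 0.70, 0.10, 0.10],
                  [0.25, 0.25, 0.40, 0.10],
                  [0.90, 0.05, 0.03, 0.02]])

predicted_classes = np.argmax(probs, axis=1)   # index of the largest probability per row
confidence = probs.max(axis=1)                 # the probability assigned to that class

print(predicted_classes)   # [1 2 0]
```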


## **Callbacks**
A callback is an object passed to a model to customize and extend its behavior during training. You can write your own custom callback, or use the built-in tf.keras.callbacks, which include:
*  tf.keras.callbacks.ModelCheckpoint : Save checkpoints of your model at regular intervals.
*  tf.keras.callbacks.LearningRateScheduler : Dynamically change the learning rate.
*  tf.keras.callbacks.EarlyStopping : Interrupt training when validation performance has stopped improving.
*  tf.keras.callbacks.TensorBoard : Monitor the model's behavior using TensorBoard.

To use a tf.keras.callbacks.Callback, pass it to the model's fit method:

In [13]:
callbacks = [
  # Interrupt training if 'val_loss' stops improving for over 2 epochs
  tf.keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
  # Write TensorBoard logs to './logs' directory
  tf.keras.callbacks.TensorBoard(log_dir='./logs')
]

model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
          validation_data=(val_data, val_labels))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fec109b5a50>
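
The patience rule used by early stopping can be sketched in plain Python. This is a simplified model of the idea (stop once the monitored value has failed to improve for `patience` consecutive epochs), not the Keras implementation; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0   # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses, one per epoch.
print(early_stopping_epoch([1.0, 0.9, 0.95, 0.97, 0.8]))   # 3
print(early_stopping_epoch([1.0, 0.9, 0.8, 0.7]))          # None
```

In the first run the loss fails to improve on 0.9 for two epochs in a row, so training stops at epoch 3 and never sees the later improvement to 0.8. A larger patience trades longer training for tolerance of such plateaus.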

In [14]:
!ls

logs  sample_data


In [15]:
!ls logs

train  validation


In [16]:
!ls logs/train

events.out.tfevents.1627111019.f89b2b0136f2.522.5126.v2    plugins
events.out.tfevents.1627111019.f89b2b0136f2.profile-empty


## **Save and restore**
Save and load the weights of a model using tf.keras.Model.save_weights:

In [18]:
model = tf.keras.Sequential([
  layers.Dense(64, activation='relu', input_shape=(32,)),
  layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

In [20]:
# Save weights to a TensorFlow checkpoint file
model.save_weights('./weigths/my_model')

# Restore the model's state,
# this requires a model with the same architecture.
model.load_weights('./weigths/my_model')

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7fec703d67d0>

In [21]:
!ls

logs  sample_data  weigths


In [23]:
!ls weigths

checkpoint  my_model.data-00000-of-00001  my_model.index


By default, this saves the model's weights in the TensorFlow checkpoint file format. Weights can also be saved to the Keras HDF5 format (the default for the multi-backend implementation of Keras):

In [24]:
# Save weights to an HDF5 file
model.save_weights('my_model.h5', save_format='h5')

# Restore the model's state
model.load_weights('my_model.h5')

In [26]:
!ls

logs  my_model.h5  sample_data	weigths


## **Save just the model configuration**
A model's configuration can be saved; this serializes the model architecture without any weights. A saved configuration can recreate and initialize the same model, even without the code that defined the original model. Keras supports JSON and YAML serialization formats:


In [27]:
# Serialize a model to JSON format
json_string = model.to_json()
json_string 

'{"class_name": "Sequential", "config": {"name": "sequential_3", "layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 32], "dtype": "float32", "sparse": false, "ragged": false, "name": "dense_14_input"}}, {"class_name": "Dense", "config": {"name": "dense_14", "trainable": true, "batch_input_shape": [null, 32], "dtype": "float32", "units": 64, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_15", "trainable": true, "dtype": "float32", "units": 10, "activation": "softmax", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer":

In [29]:
import json
import pprint
pprint.pprint(json.loads(json_string))

{'backend': 'tensorflow',
 'class_name': 'Sequential',
 'config': {'layers': [{'class_name': 'InputLayer',
                        'config': {'batch_input_shape': [None, 32],
                                   'dtype': 'float32',
                                   'name': 'dense_14_input',
                                   'ragged': False,
                                   'sparse': False}},
                       {'class_name': 'Dense',
                        'config': {'activation': 'relu',
                                   'activity_regularizer': None,
                                   'batch_input_shape': [None, 32],
                                   'bias_constraint': None,
                                   'bias_initializer': {'class_name': 'Zeros',
                                                        'config': {}},
                                   'bias_regularizer': None,
                                   'dtype': 'float32',
                                   'kern

Recreate the model (newly initialized) from the JSON:

In [30]:
fresh_model = tf.keras.models.model_from_json(json_string)
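
The point that a configuration describes the architecture but carries no weights can be illustrated with the standard `json` module alone. The dictionary below is a hypothetical, heavily simplified description and does not follow the real Keras schema.

```python
import json

# Hypothetical, simplified architecture description (not the real Keras schema).
config = {
    "class_name": "Sequential",
    "layers": [
        {"class_name": "Dense", "units": 64, "activation": "relu"},
        {"class_name": "Dense", "units": 10, "activation": "softmax"},
    ],
}

json_text = json.dumps(config)      # serialize the architecture only
restored = json.loads(json_text)    # rebuild the description from text

# The round trip preserves the architecture; no weights are stored,
# so a model rebuilt from it has to be initialized afresh.
print(restored == config)                                    # True
print([layer["units"] for layer in restored["layers"]])      # [64, 10]
```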

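Serializing a model to YAML format requires that you install pyyaml before you import TensorFlow. Note that to_yaml and model_from_yaml were removed in TensorFlow 2.6 (they now raise a RuntimeError), so this only works on earlier releases: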

In [31]:
yaml_string = model.to_yaml()
print(yaml_string)

backend: tensorflow
class_name: Sequential
config:
  layers:
  - class_name: InputLayer
    config:
      batch_input_shape: !!python/tuple [null, 32]
      dtype: float32
      name: dense_14_input
      ragged: false
      sparse: false
  - class_name: Dense
    config:
      activation: relu
      activity_regularizer: null
      batch_input_shape: !!python/tuple [null, 32]
      bias_constraint: null
      bias_initializer:
        class_name: Zeros
        config: {}
      bias_regularizer: null
      dtype: float32
      kernel_constraint: null
      kernel_initializer:
        class_name: GlorotUniform
        config: {seed: null}
      kernel_regularizer: null
      name: dense_14
      trainable: true
      units: 64
      use_bias: true
  - class_name: Dense
    config:
      activation: softmax
      activity_regularizer: null
      bias_constraint: null
      bias_initializer:
        class_name: Zeros
        config: {}
      bias_regularizer: null
      dtype: float32
   

Recreate the model from the YAML:

In [32]:
fresh_model = tf.keras.models.model_from_yaml(yaml_string)

Note: Subclassed models are not serializable because their architecture is defined by the Python code in the body of the call method.

## **Save the entire model in one file**
The entire model can be saved to a file that contains the weight values, the model's configuration, and even the optimizer's configuration. This allows you to checkpoint a model and resume training later, from the exact same state, without access to the original code.

In [33]:
# Create a simple model
model = tf.keras.Sequential([
    layers.Dense(10, activation='softmax', input_shape=(32,)),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels, batch_size=32, epochs=5)

# Save the entire model to an HDF5 file
model.save('my_model.h5')

# Recreate the exact same model, including weights and optimizer.
model = tf.keras.models.load_model('my_model.h5')

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


## **Multiple GPUs**
tf.keras models can run on multiple GPUs using `tf.distribute.Strategy`. This API provides distributed training on multiple GPUs with almost no changes to existing code.

This example uses `tf.distribute.MirroredStrategy`, which does in-graph replication with synchronous training using all-reduce on a single machine. To use a `tf.distribute.Strategy`, nest the optimizer instantiation and the model construction and compilation inside the strategy's `scope()`, then train the model.
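
The idea behind synchronous training with all-reduce can be sketched in NumPy without any GPUs. In this toy version, each "replica" computes the gradient of a mean-squared-error loss on its own slice of the batch, and the gradients are then averaged. The linear model, the two simulated replicas, and the helper `mse_gradient` are assumptions for illustration and do not reflect how `MirroredStrategy` is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((64, 10))          # one global batch: 64 samples, 10 features
y = rng.random((64, 1))
w = np.zeros((10, 1))             # weights, identical on every replica

def mse_gradient(x, y, w):
    """Gradient of mean((x @ w - y)**2) with respect to w."""
    return 2.0 * x.T @ (x @ w - y) / len(x)

# Each replica gets an equal slice of the global batch.
num_replicas = 2
x_shards = np.split(x, num_replicas)
y_shards = np.split(y, num_replicas)
replica_grads = [mse_gradient(xs, ys, w) for xs, ys in zip(x_shards, y_shards)]

# "All-reduce": average the per-replica gradients so every replica
# applies the same update and the weight copies stay in sync.
synced_grad = np.mean(replica_grads, axis=0)

print(np.allclose(synced_grad, mse_gradient(x, y, w)))   # True
```

With equal-sized shards, the averaged gradient matches the gradient of the whole batch, which is why synchronous data parallelism behaves like single-device training on a larger batch.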

The following example distributes a tf.keras.Model across multiple GPUs on a single machine.

First, define a model inside the distributed strategy scope:

In [34]:
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
  model = tf.keras.Sequential()
  model.add(layers.Dense(16, activation='relu', input_shape=(10,)))
  model.add(layers.Dense(1, activation='sigmoid'))

  optimizer = tf.keras.optimizers.SGD(0.2)

  model.compile(loss='binary_crossentropy', optimizer=optimizer)

model.summary()

INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_18 (Dense)             (None, 16)                176       
_________________________________________________________________
dense_19 (Dense)             (None, 1)              

Next, train the model on data as usual:

In [36]:
x = np.random.random((1024, 10))
y = np.random.randint(2, size=(1024, 1))

x = tf.cast(x, tf.float32)
dataset = tf.data.Dataset.from_tensor_slices((x, y))
dataset = dataset.shuffle(buffer_size=1024).batch(32)

model.fit(dataset, epochs=1)

INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).


<tensorflow.python.keras.callbacks.History at 0x7fec0fe8fd50>