# Custom Models and Training with TensorFlow

In [1]:
import tensorflow as tf
from tensorflow import keras

import numpy as np


TensorFlow’s API revolves around tensors, which flow from operation to operation (hence the name TensorFlow). A tensor is usually a multidimensional array (much like a NumPy ndarray), but it can also hold a scalar (a simple value, such as 42).

## Using TensorFlow like NumPy

### Tensors and Operations

In [2]:
tf.constant([[1., 2., 3.], [4., 5., 6.]]) # matrix

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)>

In [3]:
tf.constant(42) # scalar

<tf.Tensor: shape=(), dtype=int32, numpy=42>

In [4]:
t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
t.shape, t.dtype

(TensorShape([2, 3]), tf.float32)

In [5]:
t[:, 2]

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([3., 6.], dtype=float32)>

In [6]:
t[1:2, 2]

<tf.Tensor: shape=(1,), dtype=float32, numpy=array([6.], dtype=float32)>

In [7]:
t + 16

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[17., 18., 19.],
       [20., 21., 22.]], dtype=float32)>

In [8]:
tf.add(t, 16)

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[17., 18., 19.],
       [20., 21., 22.]], dtype=float32)>

In [9]:
tf.square(t)

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[ 1.,  4.,  9.],
       [16., 25., 36.]], dtype=float32)>

In [10]:
tf.transpose(t)

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[1., 4.],
       [2., 5.],
       [3., 6.]], dtype=float32)>

In [11]:
t

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)>

In [12]:
t @ tf.transpose(t) # matrix multiplication

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[14., 32.],
       [32., 77.]], dtype=float32)>

In [13]:
tf.matmul(t, tf.transpose(t)) # matrix multiplication

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[14., 32.],
       [32., 77.]], dtype=float32)>

Other operations are available:

        tf.add(), tf.multiply(), tf.square(), tf.exp(), tf.sqrt(), ...

        tf.reshape(), tf.squeeze(), tf.tile(), ...

        tf.reduce_mean(), tf.reduce_sum(), tf.reduce_max(), tf.math.log(): the equivalents of np.mean(), np.sum(), np.max(), and np.log().
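
As a quick illustration of a few of the operations listed above, here is a minimal sketch that applies reductions and shape manipulations to the same 2×3 tensor used earlier. The variable names (`mean_tf`, `row_sums`, `reshaped`, `tiled`) are only illustrative.

```python
import numpy as np
import tensorflow as tf

t = tf.constant([[1., 2., 3.], [4., 5., 6.]])

# Reductions behave like their NumPy counterparts
mean_tf = tf.reduce_mean(t)          # mean over all elements
row_sums = tf.reduce_sum(t, axis=1)  # sum along each row
max_tf = tf.reduce_max(t)            # largest element

# Shape manipulation
reshaped = tf.reshape(t, [3, 2])     # same six values, new shape
tiled = tf.tile(t, [1, 2])           # repeat the columns twice

print(mean_tf.numpy(), np.mean(t.numpy()))  # both are 3.5
print(row_sums.numpy())                     # row sums: 6 and 15
print(reshaped.shape, tiled.shape)          # (3, 2) and (2, 6)
```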

### Tensors and NumPy

Tensors play nice with NumPy: you can create a tensor from a NumPy array, and vice versa. You can even apply TensorFlow operations to NumPy arrays and NumPy operations to tensors:

In [14]:
a = np.array([2., 4., 5.])
tf.constant(a)

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([2., 4., 5.])>

In [15]:
t.numpy()

array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)

In [16]:
tf.square(a)

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 4., 16., 25.])>

In [17]:
np.square(t)

array([[ 1.,  4.,  9.],
       [16., 25., 36.]], dtype=float32)
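
Note in the outputs above that the tensor created from a NumPy array is float64, whereas TensorFlow creates float32 tensors by default. The following sketch shows one way to request 32-bit precision when converting; whether you need this depends on your use case.

```python
import numpy as np
import tensorflow as tf

a = np.array([2., 4., 5.])              # NumPy defaults to float64
t64 = tf.constant(a)                    # the tensor keeps float64
t32 = tf.constant(a, dtype=tf.float32)  # explicitly request float32

print(t64.dtype, t32.dtype)
```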

### Converting Types

In [18]:
t2 = tf.constant(40, dtype=tf.float64)
t2

<tf.Tensor: shape=(), dtype=float64, numpy=40.0>

In [19]:
tf.constant(2.0)

<tf.Tensor: shape=(), dtype=float32, numpy=2.0>

In [20]:
tf.constant(2.0) + tf.cast(t2, tf.float32)

<tf.Tensor: shape=(), dtype=float32, numpy=42.0>
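
The explicit tf.cast() above is needed because TensorFlow does not convert types automatically. The sketch below, which assumes TensorFlow's default settings (NumPy-style type promotion not enabled), shows what happens without the cast. The flag name `mixed_ok` is only illustrative.

```python
import tensorflow as tf

t2 = tf.constant(40, dtype=tf.float64)

try:
    tf.constant(2.0) + t2  # float32 + float64: no implicit conversion
    mixed_ok = True
except (tf.errors.InvalidArgumentError, TypeError) as ex:
    mixed_ok = False
    print(type(ex).__name__)

# Casting one operand makes the dtypes match
result = tf.constant(2.0) + tf.cast(t2, tf.float32)
print(result)
```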

### Variables

tf.Tensor values are immutable: you cannot modify them. tf.Variable values, however, can be modified in place.

In [21]:
tf.Variable([[1., 2., 3.], [4., 5., 6.]])

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)>

In [22]:
v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])
v.assign(2 * v)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]], dtype=float32)>

In [23]:
v[0, 1].assign(42)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[ 2., 42.,  6.],
       [ 8., 10., 12.]], dtype=float32)>

In [24]:
v.scatter_nd_update(indices=[[0, 0], [1, 2]], updates=[100., 200.])

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[100.,  42.,   6.],
       [  8.,  10., 200.]], dtype=float32)>
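
Variables must be modified through assign() and related methods, not through Python item assignment. The following sketch illustrates the difference. The flag name `item_assignment_ok` is only illustrative.

```python
import tensorflow as tf

v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])

try:
    v[0, 1] = 42.0  # direct item assignment is not supported
    item_assignment_ok = True
except TypeError as ex:
    item_assignment_ok = False
    print(ex)

v[0, 1].assign(42.0)      # update a single cell
v[:, 2].assign([0., 1.])  # update a whole column through a slice
print(v.numpy())
```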

### Strings

In [25]:
tf.constant("cafe")

<tf.Tensor: shape=(), dtype=string, numpy=b'cafe'>

In [26]:
tf.constant(b"notebook")

<tf.Tensor: shape=(), dtype=string, numpy=b'notebook'>

In [27]:
tf.constant("café")

<tf.Tensor: shape=(), dtype=string, numpy=b'caf\xc3\xa9'>
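
As the last output shows, string tensors hold byte strings, and Unicode text is encoded as UTF-8. A minimal sketch using the tf.strings package to tell bytes from characters:

```python
import tensorflow as tf

s = tf.constant("café")

n_bytes = tf.strings.length(s)                       # length in bytes
n_chars = tf.strings.length(s, unit="UTF8_CHAR")     # length in Unicode characters
code_points = tf.strings.unicode_decode(s, "UTF-8")  # int32 Unicode code points

print(n_bytes.numpy(), n_chars.numpy())  # 5 bytes, 4 characters ("é" takes 2 bytes)
print(code_points.numpy())               # code points 99, 97, 102, 233
```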

## Customizing Models and Training Algorithms

### Custom Loss Functions

#### The Huber loss

The Huber loss is not currently part of the official Keras API, but it is available in tf.keras (just use an instance of the keras.losses.Huber class). 

But let’s pretend it’s not there. Just create a function that takes the labels and predictions as arguments, and use TensorFlow operations to compute every instance’s loss:

In [28]:
def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2
    linear_loss = tf.abs(error) - 0.5
    return tf.where(is_small_error, squared_loss, linear_loss)

# Assumes that model, X_train, and y_train are already defined
model.compile(loss=huber_fn, optimizer="nadam")
model.fit(X_train, y_train, ...)

For each batch during training, Keras will call the huber_fn() function to compute the loss and use it to perform a Gradient Descent step. Moreover, it will keep track of the total loss since the beginning of the epoch, and it will display the mean loss.
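
As a sanity check, the custom function can be compared with the built-in keras.losses.Huber class on a couple of hand-picked values. This is only a sketch: the sample values are made up, and it relies on the built-in loss using its default threshold of 1.0 and averaging over the batch.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2
    linear_loss = tf.abs(error) - 0.5
    return tf.where(is_small_error, squared_loss, linear_loss)

# Made-up labels and predictions: one small error (0.5), one large error (2.0)
y_true = tf.constant([[1.0], [3.0]])
y_pred = tf.constant([[0.5], [1.0]])

per_instance = huber_fn(y_true, y_pred)               # one loss per instance
custom_mean = tf.reduce_mean(per_instance)            # averaged by hand
builtin_mean = keras.losses.Huber()(y_true, y_pred)   # built-in, already averaged

print(per_instance.numpy().ravel())  # 0.125 (quadratic part) and 1.5 (linear part)
print(float(custom_mean), float(builtin_mean))
```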

### Custom Activation Functions, Initializers, Regularizers, and Constraints

Here are examples of a custom activation function (equivalent to keras.activations.softplus() or tf.nn.softplus()), a custom Glorot initializer (equivalent to keras.initializers.glorot_normal()), a custom l1 regularizer (equivalent to keras.regularizers.l1(0.01)), and a custom constraint that ensures weights are all positive (equivalent to keras.constraints.nonneg() or tf.nn.relu()):

In [29]:
def my_softplus(z): # return value is just tf.nn.softplus(z) 
    return tf.math.log(tf.exp(z) + 1.0)


def my_glorot_initializer(shape, dtype=tf.float32):
    stddev = tf.sqrt(2. / (shape[0] + shape[1]))
    return tf.random.normal(shape, stddev=stddev, dtype=dtype)


def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.01 * weights))


def my_positive_weights(weights): # return value is just tf.nn.relu(weights) 
    return tf.where(weights < 0., tf.zeros_like(weights), weights)

These custom functions can then be used normally; for example:

In [30]:
layer = keras.layers.Dense(30, activation=my_softplus, 
                           kernel_initializer=my_glorot_initializer, 
                           kernel_regularizer=my_l1_regularizer, 
                           kernel_constraint=my_positive_weights)
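
To check the claimed equivalences, the sketch below evaluates three of the custom functions on a small made-up tensor and compares them with their built-in counterparts. The functions are repeated so that the snippet runs on its own.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def my_softplus(z):
    return tf.math.log(tf.exp(z) + 1.0)

def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.01 * weights))

def my_positive_weights(weights):
    return tf.where(weights < 0., tf.zeros_like(weights), weights)

z = tf.constant([-2.0, 0.0, 3.0])  # made-up sample values

softplus_matches = np.allclose(my_softplus(z), tf.nn.softplus(z))
relu_matches = np.allclose(my_positive_weights(z), tf.nn.relu(z))
l1_custom = float(my_l1_regularizer(z))            # 0.01 * (2 + 0 + 3)
l1_builtin = float(keras.regularizers.L1(0.01)(z))

print(softplus_matches, relu_matches, l1_custom, l1_builtin)
```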

### Custom Layers

First, some layers have no weights, such as keras.layers.Flatten or keras.layers.ReLU. If you want to create a custom layer without any weights, the simplest option is to write a function and wrap it in a keras.layers.Lambda layer. For example, the following layer will apply the exponential function to its inputs:

In [31]:
exponential_layer = keras.layers.Lambda(lambda x: tf.exp(x))
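
Like any other layer, the Lambda layer can be called directly on a tensor. A minimal sketch with made-up input values:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

exponential_layer = keras.layers.Lambda(lambda x: tf.exp(x))

inputs = tf.constant([[-1.0, 0.0, 1.0]])
outputs = exponential_layer(inputs)  # element-wise exp of the inputs

print(np.asarray(outputs))
```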

To build a custom stateful layer (i.e., a layer with weights), you need to create a subclass of the keras.layers.Layer class.

For example, the following class implements a simplified version of the Dense layer:

In [32]:
class MyDense(keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = keras.activations.get(activation)
        
        
    def build(self, batch_input_shape):
        self.kernel = self.add_weight(
            name="kernel", shape=[batch_input_shape[-1], self.units], initializer="glorot_normal")
        
        self.bias = self.add_weight(
            name="bias", shape=[self.units], initializer="zeros")
        
        super().build(batch_input_shape) # must be at the end 
        
        
    def call(self, X):
        return self.activation(X @ self.kernel + self.bias) 
    
    
    def compute_output_shape(self, batch_input_shape):
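        # tf.TensorShape() accepts both TensorShape objects and plain tuples or lists
        input_shape = tf.TensorShape(batch_input_shape).as_list()
        return tf.TensorShape(input_shape[:-1] + [self.units])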
    
    
    def get_config(self):
        base_config = super().get_config()
        
        return {**base_config, "units": self.units, "activation": keras.activations.serialize(self.activation)}
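
The sketch below shows how such a layer might be used. It repeats a condensed version of the class (without compute_output_shape()) so that it runs on its own; the input values and the variable names `layer`, `outputs`, and `config` are only illustrative.

```python
import tensorflow as tf
from tensorflow import keras

class MyDense(keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = keras.activations.get(activation)

    def build(self, batch_input_shape):
        self.kernel = self.add_weight(
            name="kernel", shape=[batch_input_shape[-1], self.units],
            initializer="glorot_normal")
        self.bias = self.add_weight(
            name="bias", shape=[self.units], initializer="zeros")
        super().build(batch_input_shape)

    def call(self, X):
        return self.activation(X @ self.kernel + self.bias)

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "units": self.units,
                "activation": keras.activations.serialize(self.activation)}

layer = MyDense(4, activation="relu")
outputs = layer(tf.ones((2, 3)))  # the first call triggers build()
config = layer.get_config()       # hyperparameters needed to re-create the layer

print(outputs.shape)                            # (2, 4)
print([w.shape for w in layer.trainable_weights])
print(config["units"])
```

The weights are created lazily: build() runs on the first call, once the size of the last input dimension is known.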


### Custom Models

To build a custom model, subclass the keras.Model class, create the layers and variables in the constructor, and implement the call() method to do whatever you want the model to do.

## TensorFlow Functions and Graphs

In [33]:
def cube(x):
    return x ** 3

In [34]:
cube(2)

8

In [36]:
cube(tf.constant(2))

<tf.Tensor: shape=(), dtype=int32, numpy=8>

In [37]:
cube(tf.constant(2.0))

<tf.Tensor: shape=(), dtype=float32, numpy=8.0>

Now, let’s use tf.function() to convert this Python function to a TensorFlow Function:

In [38]:
tf_cube = tf.function(cube)
tf_cube

<tensorflow.python.eager.def_function.Function at 0x7fd4c3796090>

This TF Function can then be used exactly like the original Python function, and it will return the same result (but as tensors):

In [39]:
tf_cube(2)

<tf.Tensor: shape=(), dtype=int32, numpy=8>

In [40]:
tf_cube(tf.constant(2.0))

<tf.Tensor: shape=(), dtype=float32, numpy=8.0>

Alternatively, we could have used tf.function as a decorator; this is actually more common:

In [41]:
@tf.function
def tf_square(x): 
    return x ** 2
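
A decorated function is called like any other function. The sketch below also hints at what happens under the hood: the Python body runs only when TensorFlow traces a new graph, which it does once per input signature (dtype and shape for tensor arguments). The `trace_log` list is a hypothetical helper added purely to make the tracing visible.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: records each time the Python body runs

@tf.function
def tf_cube(x):
    trace_log.append(x.dtype)  # Python side effect: runs during tracing only
    return x ** 3

a = tf_cube(tf.constant(2.0))
b = tf_cube(tf.constant(3.0))  # same dtype and shape: the traced graph is reused
traces_after_floats = len(trace_log)

c = tf_cube(tf.constant(2))    # int32 input: a new graph is traced
traces_after_int = len(trace_log)

print(a.numpy(), b.numpy(), c.numpy())
print(traces_after_floats, traces_after_int)
```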