# 12. Custom Models and Training with TensorFlow

The Keras high-level API is good enough for most of our everyday use cases. In this chapter we will dive deeper into TensorFlow's lower-level API.

This chapter uses TensorFlow 2 (TF2).

### Quick tour of TensorFlow

Here is a summary of what TensorFlow offers:

* A core that is similar to NumPy, but with GPU support
* Distributed computing support
* A just-in-time (JIT) compiler that optimizes computations for speed and memory usage. It works by:
    1. Extracting the **computation graph** from a Python function
    2. Optimizing it (e.g., by pruning unused nodes)
    3. Running it efficiently (e.g., by automatically running independent operations in parallel)
* Exportable computation graphs, so a model can be trained in one environment and run in another
* Automatic differentiation (autodiff) and a set of ready-made optimizers
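
To make the graph-extraction and autodiff points above a little more concrete, here is a minimal sketch. It assumes TF2 with eager execution enabled; the function `f` and the input value are made up for illustration. `tf.function` traces the Python function into a computation graph, and `tf.GradientTape` records operations so the gradient can be computed afterwards.

```python
import tensorflow as tf

@tf.function  # traces the Python function into a computation graph on first call
def f(x):
    return 3.0 * x ** 2 + 2.0 * x

x = tf.Variable(2.0)

# Record the operations applied to x so they can be differentiated
with tf.GradientTape() as tape:
    y = f(x)

grad = tape.gradient(y, x)  # analytically: df/dx = 6x + 2

print(float(y))     # 3 * 2**2 + 2 * 2 = 16.0
print(float(grad))  # 6 * 2 + 2 = 14.0
```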

### Using TensorFlow like NumPy

TensorFlow’s API revolves around **tensors**, which **flow** from operation to operation, hence the name _TensorFlow_. A tensor is usually a multidimensional array (very similar to a NumPy `ndarray`), but it can also hold a scalar.

#### Tensors and Operations

We can create a tensor with `tf.constant()`:

In [2]:
import tensorflow as tf

# floats matrix 2x3 
tf.constant([[1., 2., 3.], [4., 5., 6.]])

<tf.Tensor 'Const:0' shape=(2, 3) dtype=float32>

In [3]:
tf.constant(42)

<tf.Tensor 'Const_1:0' shape=() dtype=int32>

In [4]:
t = tf.constant([[1., 2., 3.], [4., 5., 6.]])

In [5]:
t.shape

TensorShape([Dimension(2), Dimension(3)])

In [6]:
t.dtype

tf.float32

Indexing works much like in NumPy:

In [8]:
t[:, 1:]

<tf.Tensor 'strided_slice:0' shape=(2, 2) dtype=float32>

In [9]:
t[..., 1, tf.newaxis]

<tf.Tensor 'strided_slice_1:0' shape=(2, 1) dtype=float32>

Tensor operations behave as we would expect:

In [10]:
t + 10

<tf.Tensor 'add:0' shape=(2, 3) dtype=float32>

In [11]:
tf.square(t)

<tf.Tensor 'Square:0' shape=(2, 3) dtype=float32>

In [12]:
# matrix multiplication
t @ tf.transpose(t)

<tf.Tensor 'matmul:0' shape=(2, 2) dtype=float32>

Generally, NumPy and TensorFlow are compatible in terms of operations.

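**Note**: NumPy uses 64-bit precision by default, while TensorFlow uses 32-bit. When creating a tensor from a NumPy array, don't forget to set `dtype=tf.float32` (32-bit precision is generally more than enough for neural networks).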
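
The following sketch illustrates the precision note, assuming TF2 with eager execution; the array values are arbitrary. It shows that a tensor built from a NumPy array inherits `float64` unless a `dtype` is given, and that the two libraries can exchange data in both directions.

```python
import numpy as np
import tensorflow as tf

a = np.array([2., 4., 5.])              # NumPy defaults to float64
t64 = tf.constant(a)                    # dtype is inherited from the array
t32 = tf.constant(a, dtype=tf.float32)  # explicit 32-bit precision

print(t64.dtype, t32.dtype)

# Converting back and mixing the two libraries
back = t32.numpy()        # tensor -> ndarray
squared = np.square(t32)  # NumPy functions accept eager tensors
print(back.dtype, squared)
```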

#### Type Conversions

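TensorFlow does not perform type conversions automatically: operations between tensors of different types, or even of different bit precisions (e.g., `float32` and `float64`), raise an exception. Use `tf.cast()` when a conversion is really needed.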
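
A small sketch of this behavior, assuming TF2 with its default settings; the values are arbitrary. The exact exception type may vary between versions, so the example catches both likely candidates.

```python
import tensorflow as tf

a = tf.constant(2.0)  # float32
b = tf.constant(40)   # int32

try:
    a + b  # mixing float32 and int32
except (tf.errors.InvalidArgumentError, TypeError) as err:
    print("rejected:", type(err).__name__)

# An explicit conversion with tf.cast() makes the intent clear
c = a + tf.cast(b, tf.float32)
print(float(c))  # 42.0
```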

#### Variables

`tf.Tensor` values are immutable, so for values that need to change over time (e.g., model weights) we need to use a `tf.Variable`:

In [13]:
v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])

In [14]:
v

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32_ref>

A `tf.Variable` acts much like a `tf.Tensor` but it can also be modified using the `assign()` method.

In [15]:
v.assign(2 * v)

<tf.Tensor 'Assign:0' shape=(2, 3) dtype=float32_ref>

In [16]:
v[0, 1].assign(42)

<tf.Tensor 'strided_slice_2/_assign:0' shape=(2, 3) dtype=float32_ref>

In [17]:
v[:, 2].assign([0., 1.])

<tf.Tensor 'strided_slice_3/_assign:0' shape=(2, 3) dtype=float32_ref>

In [18]:
v.scatter_nd_update(indices=[[0, 0], [1, 2]], updates=[100., 200.])

<tf.Tensor 'ScatterNdUpdate:0' shape=(2, 3) dtype=float32_ref>
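
Since the outputs above only show shapes and dtypes, the sketch below replays the same sequence of assignments and prints the resulting values. It assumes TF2 with eager execution. It also shows that plain item assignment is not supported on variables.

```python
import tensorflow as tf

v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])

v.assign(2 * v)           # [[2, 4, 6], [8, 10, 12]]
v[0, 1].assign(42.)       # [[2, 42, 6], [8, 10, 12]]
v[:, 2].assign([0., 1.])  # [[2, 42, 0], [8, 10, 1]]
v.scatter_nd_update(indices=[[0, 0], [1, 2]], updates=[100., 200.])

print(v.numpy())  # [[100, 42, 0], [8, 10, 200]]

# Direct item assignment is not supported; use assign() instead
try:
    v[1] = [7., 8., 9.]
except TypeError as err:
    print("TypeError:", err)
```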

### Customizing Models and Training Algorithms

#### Custom Loss Functions

Let's assume we are working with a very noisy dataset and want to use the Huber loss. Let's also pretend that it is not included in Keras (it actually is, as `keras.losses.Huber`):

In [19]:
def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2
    linear_loss = tf.abs(error) - 0.5
    return tf.where(is_small_error, squared_loss, linear_loss)
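
As a quick sanity check, the sketch below evaluates `huber_fn` on a few hand-picked values (made up for illustration) and compares its mean with the built-in `tf.keras.losses.Huber`. It assumes TF2 with eager execution.

```python
import tensorflow as tf

def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2
    linear_loss = tf.abs(error) - 0.5
    return tf.where(is_small_error, squared_loss, linear_loss)

y_true = tf.constant([0., 0., 0.])
y_pred = tf.constant([0.5, 1.0, 3.0])

# Per-element losses: 0.5**2 / 2 = 0.125, 1.0 - 0.5 = 0.5, 3.0 - 0.5 = 2.5
losses = huber_fn(y_true, y_pred)
print(losses.numpy())

# The built-in loss returns the mean over the elements
builtin = tf.keras.losses.Huber(delta=1.0)(y_true, y_pred)
print(float(tf.reduce_mean(losses)), float(builtin))
```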

We can now pass this function as the loss when compiling a Keras model, then train the model as usual:

In [20]:
# assumes `model`, `X_train` and `y_train` have already been defined
model.compile(loss=huber_fn, optimizer="nadam")
model.fit(X_train, y_train, [...])

#### Saving and Loading Models That Contain Custom Components

Keras saves a custom function by name only, so when loading the model we must map that name to the actual function:

In [21]:
model = tf.keras.models.load_model("my_model_with_a_custom_loss.h5",
                                   custom_objects={"huber_fn": huber_fn})

What if we want a different threshold for what counts as a _small error_ (1 in the example above)? One option is to write a function that creates a configured loss function:

In [22]:
def create_huber(threshold=1.0):
    def huber_fn(y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < threshold
        squared_loss = tf.square(error) / 2
        linear_loss = threshold * tf.abs(error) - threshold**2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)
    return huber_fn

# assumes `model` has already been defined
model.compile(loss=create_huber(2.0), optimizer="nadam")

Unfortunately, the threshold is not saved with the model, so we have to specify it again when loading:

In [25]:
# the key is the name of the inner function ("huber_fn"),
# not the name of the function that creates it
model = tf.keras.models.load_model(
    "my_model_with_a_custom_loss_threshold_2.h5",
    custom_objects={"huber_fn": create_huber(2.0)})
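
One common way to avoid repeating the threshold, sketched here as an assumption rather than as part of these notes, is to subclass `keras.losses.Loss` and implement `get_config()` so that the hyperparameter is stored with the loss. The class name `HuberLoss` and the test values are illustrative.

```python
import tensorflow as tf

class HuberLoss(tf.keras.losses.Loss):
    def __init__(self, threshold=1.0, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < self.threshold
        squared_loss = tf.square(error) / 2
        linear_loss = self.threshold * tf.abs(error) - self.threshold ** 2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)

    def get_config(self):
        # Include the threshold so it is stored alongside the other settings
        base_config = super().get_config()
        return {**base_config, "threshold": self.threshold}

loss = HuberLoss(2.0)
config = loss.get_config()
restored = HuberLoss.from_config(config)  # rebuilds the loss from its config
print(config["threshold"], restored.threshold)

y_true = tf.constant([[0.], [0.]])
y_pred = tf.constant([[1.], [4.]])
# Per-element losses are 1**2 / 2 = 0.5 and 2 * 4 - 2**2 / 2 = 6.0
print(float(loss(y_true, y_pred)))
```

With this approach, loading a saved model would map the class name to the class itself, e.g. `custom_objects={"HuberLoss": HuberLoss}`, without specifying the threshold again.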

#### Custom Activation Functions, Initializers, Regularizers, and Constraints

Most Keras functionality, such as activation functions, initializers, regularizers and constraints, can be customized in much the same way:

In [27]:
def my_softplus(z): # = tf.nn.softplus(z)
    return tf.math.log(tf.exp(z) + 1.0)

In [28]:
def my_glorot_initializer(shape, dtype=tf.float32):
    stddev = tf.sqrt(2. / (shape[0] + shape[1]))
    return tf.random.normal(shape, stddev=stddev, dtype=dtype)

In [29]:
def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.01 * weights))
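
The sketch below shows how such functions might be plugged into a standard `Dense` layer. It assumes TF2 with eager execution. The constraint `my_positive_weights` is a hypothetical addition for illustration (it does not appear in the cells above), and the layer size and input shape are arbitrary.

```python
import tensorflow as tf

def my_softplus(z):
    return tf.math.log(tf.exp(z) + 1.0)

def my_glorot_initializer(shape, dtype=tf.float32):
    stddev = tf.sqrt(2. / (shape[0] + shape[1]))
    return tf.random.normal(shape, stddev=stddev, dtype=dtype)

def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.01 * weights))

# Hypothetical constraint for illustration: replace negative weights with zero
def my_positive_weights(weights):
    return tf.where(weights < 0., tf.zeros_like(weights), weights)

layer = tf.keras.layers.Dense(30, activation=my_softplus,
                              kernel_initializer=my_glorot_initializer,
                              kernel_regularizer=my_l1_regularizer,
                              kernel_constraint=my_positive_weights)

# With all-zero inputs and a zero bias, every output equals softplus(0) = ln 2
outputs = layer(tf.zeros([2, 4]))
print(outputs.shape)

print(float(my_l1_regularizer(tf.constant([[1., -2.]]))))  # 0.01 * (1 + 2)
print(my_positive_weights(tf.constant([-1., 2.])).numpy())
```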

#### Custom Metrics

In most cases, designing a custom metric function is very similar to creating a custom loss function:

In [30]:
# using the Huber loss as a metric (assumes `model` has already been defined)
model.compile(loss="mse", optimizer="nadam", metrics=[create_huber(2.0)])

#### Custom Layer

In [31]:
# a layer without weights can simply wrap a function in a Lambda layer
exponential_layer = tf.keras.layers.Lambda(lambda x: tf.exp(x))

To build a custom stateful layer (i.e., a layer with weights), you need to create a subclass of the `keras.layers.Layer` class. 

In [32]:
# a simplified Dense layer
class MyDense(tf.keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        # hyperparameters: number of units and activation function
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    # create the layer's variables by calling add_weight() for each weight
    def build(self, batch_input_shape):
        self.kernel = self.add_weight(
            name="kernel", shape=[batch_input_shape[-1], self.units],
            initializer="glorot_normal")
        self.bias = self.add_weight(
            name="bias", shape=[self.units], initializer="zeros")
        super().build(batch_input_shape)  # must be at the end

    # perform the desired operations
    def call(self, X):
        # matrix-multiply the inputs X by the kernel, add the bias,
        # then apply the activation function
        return self.activation(X @ self.kernel + self.bias)

    # return the shape of the layer's outputs
    def compute_output_shape(self, batch_input_shape):
        return tf.TensorShape(batch_input_shape.as_list()[:-1] + [self.units])

    # return the hyperparameters so the layer can be saved and reloaded
    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "units": self.units,
                "activation": tf.keras.activations.serialize(self.activation)}
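
To see how the pieces fit together, the sketch below uses a compact copy of the layer (without `compute_output_shape()`) so that it runs on its own. It assumes TF2 with eager execution; the layer size and input shape are arbitrary. Calling the layer for the first time triggers `build()`, and `get_config()` makes it possible to recreate an equivalent layer.

```python
import tensorflow as tf

class MyDense(tf.keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, batch_input_shape):
        self.kernel = self.add_weight(
            name="kernel", shape=[batch_input_shape[-1], self.units],
            initializer="glorot_normal")
        self.bias = self.add_weight(
            name="bias", shape=[self.units], initializer="zeros")
        super().build(batch_input_shape)

    def call(self, X):
        return self.activation(X @ self.kernel + self.bias)

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "units": self.units,
                "activation": tf.keras.activations.serialize(self.activation)}

layer = MyDense(3, activation="relu")
outputs = layer(tf.ones([2, 4]))  # the first call creates the weights via build()
print(outputs.shape)              # (2, 3)
print(layer.kernel.shape, layer.bias.shape)

# Recreate a layer with the same hyperparameters (weights are not copied)
config = layer.get_config()
clone = MyDense.from_config(config)
print(clone.units)
```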