# Module 11: Using TensorFlow Interactively and Using Custom Functions

# Introduction

In this module, we will cover how to use TensorFlow like NumPy and introduce how to create tailored custom models for applications where using the Keras high-level API will be awkward or insufficient.

We will also look at how to save and restore models.

# Learning Outcomes

In this module, you will develop the knowledge and skills to use the TensorFlow (TF) Python API. This will include:

- Using TensorFlow interactively
- Customizing models
- Saving and restoring models

# Readings and Resources

We invite you to further supplement this notebook with the following recommended texts/resources.

- Géron, A. (2019). Chapter 12: Custom Models and Training with TensorFlow in *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow* (2nd ed.). O’Reilly Media. https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/


- TensorFlow documentation and tutorials: https://www.tensorflow.org/tutorials

<h1>Table of Contents<span class="tocSkip"></span></h1>
<br>
<div class="toc">
<ul class="toc-item">
<li><span><a href="#Module-11:-Using-TensorFlow-Interactively-and-Using-Custom-Functions" data-toc-modified-id="Module-11:-Using-TensorFlow-Interactively-and-Using-Custom-Functions">Module 11: Using TensorFlow Interactively and Using Custom Functions</a></span>
</li>
<li><span><a href="#Introduction" data-toc-modified-id="Introduction">Introduction</a></span>
</li>
<li><span><a href="#Learning-Outcomes" data-toc-modified-id="Learning-Outcomes">Learning Outcomes</a></span>
</li>
<li><span><a href="#Readings-and-Resources" data-toc-modified-id="Readings-and-Resources">Readings and Resources</a></span>
</li>
<li><span><a href="#Table-of-Contents" data-toc-modified-id="Table-of-Contents">Table of Contents</a></span>
</li>
<li><span><a href="#An-overview-of-TensorFlow" data-toc-modified-id="An-overview-of-TensorFlow">An overview of TensorFlow</a></span>
<ul class="toc-item">
<li><span><a href="#How-TensorFlow-works" data-toc-modified-id="How-TensorFlow-works">How TensorFlow works</a></span>
</li>
<li><span><a href="#TensorFlow-benefits" data-toc-modified-id="TensorFlow-benefits">TensorFlow benefits</a></span>
</li>
<li><span><a href="#TensorFlow-alternatives" data-toc-modified-id="TensorFlow-alternatives">TensorFlow alternatives</a></span>
</li>
<li><span><a href="#Computation-graphs" data-toc-modified-id="Computation-graphs">Computation graphs</a></span>
</li>
</ul>
</li>
<li><span><a href="#Using-TensorFlow-interactively" data-toc-modified-id="Using-TensorFlow-interactively">Using TensorFlow interactively</a></span>
<ul class="toc-item">
<li><span><a href="#Basic-mathematical-operations" data-toc-modified-id="Basic-mathematical-operations">Basic mathematical operations</a></span>
</li>
<li><span><a href="#Differences-from-NumPy" data-toc-modified-id="Differences-from-NumPy">Differences from NumPy</a></span>
</li>
<li><span><a href="#Variables" data-toc-modified-id="Variables">Variables</a></span>
</li>
<li><span><a href="#Other-TF-data-structures" data-toc-modified-id="Other-TF-data-structures">Other TF data structures</a></span>
</li>
</ul>
</li>
<li><span><a href="#Custom-functions" data-toc-modified-id="Custom-functions">Custom functions</a></span>
<ul class="toc-item">
<li><span><a href="#Custom-loss" data-toc-modified-id="Custom-loss">Custom loss</a></span>
</li>
<li><span><a href="#Saving-and-reloading-models" data-toc-modified-id="Saving-and-reloading-models">Saving and reloading models</a></span>
</li>
<li><span><a href="#Custom-activation-functions,-initializers,-regularizers,-and-constraints" data-toc-modified-id="Custom-activation-functions,-initializers,-regularizers,-and-constraints">Custom activation functions, initializers, regularizers, and constraints</a></span>
</li>
</ul>
</li>
<li><span><a href="#References" data-toc-modified-id="References">References</a></span>
</li>
</ul>
</div>

# An overview of TensorFlow

## How TensorFlow works

TensorFlow uses **dataflow graphs**, which are essentially structures that describe how data moves through a graph, or a series of processing nodes. Each node in the graph represents a mathematical operation, and each connection or edge between nodes is a multidimensional data array, or **tensor** (this will be discussed in more detail in the next section). 

In TensorFlow 1, the user had to construct and run the graph explicitly. TensorFlow 2 lets us work interactively, much as we do with NumPy: while we are experimenting, each operation is executed immediately ("eagerly"). Higher-level tools such as [tf.function](https://www.tensorflow.org/guide/basics#graphs_and_tffunction) and Keras construct the graphs for us when we actually need them.

It is important to note that the actual **math operations are not performed in Python**. The libraries of transformations that TensorFlow provides are written as high-performance C++ binaries so TF runs much faster than an implementation completely in Python would be capable of. **Python just directs traffic between the pieces, and provides simple and high-level programming abstractions to connect all of the pieces together.**

TensorFlow applications can be run on pretty much any platform or machine: your local computer, iOS and Android devices, a cluster in the cloud, CPUs or GPUs, even in a browser. If you use Google’s cloud platform, you can run TensorFlow on Google’s custom-designed Tensor Processing Units (TPUs) for even greater processing speed. That said, **regardless of what machine you use to train your model, the resulting models created can be deployed on almost any device with a proper TensorFlow installation.**

## TensorFlow benefits

The single greatest benefit TensorFlow provides for Neural Network model development is abstraction. The relatively simple API allows the developer to focus on the overall logic of their application, rather than getting bogged down by the nitty-gritty details of implementing complex neural network architectures.

TensorFlow makes parallel processing across multiple CPUs or GPUs fairly simple (at least from the developer's standpoint). TensorFlow also supports distributed computing, so you can train colossal neural networks on humongous training sets in a reasonable amount of time by splitting the computations across hundreds of servers if required. In fact, **TensorFlow can train a network with millions of parameters on a data set composed of billions of instances with millions of features each.**

TensorFlow also offers conveniences for developers who need to debug and inspect TensorFlow apps. The TensorBoard visualization suite (which will be discussed in the next section of this module) lets you inspect and profile the way your graphs run by way of a simple interactive, web-based dashboard.

## TensorFlow alternatives

TensorFlow is not the only such library. A popular alternative is PyTorch, developed primarily at Facebook (now Meta). PyTorch was easier to use than TensorFlow 1, but TensorFlow 2 has more or less caught up.

Google, where TensorFlow originated, has more recently also introduced a high-performance numerical library named JAX, which some organizations are using to accelerate their deep learning models. JAX is not, however, a full-fledged deep learning library like TensorFlow; at this time it is used primarily for research into new deep model architectures.

## Computation graphs

For more advanced applications of neural nets, it is useful to understand the underlying computational model, although you rarely need to deal with it directly (as was necessary with TensorFlow 1).

TensorFlow differs from the typical programming library in the sense that its entire architecture is based on a type of computational abstraction known as a **computation graph**. This abstraction (also known as a **dataflow graph**) is used to represent user-defined computations through a network of individual TensorFlow operations (also known as *ops*). This differs from standard programming conventions where one writes procedural instructions to perform calculations. In a TF graph, we use the graph to describe the structure (or **what**) of a computation rather than specifying the step-by-step process (**how**) to perform the calculation.

These graphs can be saved in a portable format that allows them to run on a different device than the one they were created on.
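To make the graph idea concrete, here is a minimal sketch, assuming TensorFlow 2.x running in its default eager mode. The function name `cube` is hypothetical and chosen only for illustration. `tf.function` traces the Python function into a graph, and `get_concrete_function` lets us list the operations that the traced graph contains.

```python
import tensorflow as tf

def cube(x):
    # Plain Python function built from TF operations
    return x ** 3

# Wrap the function so TensorFlow traces it into a computation graph
tf_cube = tf.function(cube)

eager_result = cube(tf.constant(2.0))      # executed operation by operation
graph_result = tf_cube(tf.constant(2.0))   # executed as a traced graph

# Inspect the operations recorded in the traced graph
concrete = tf_cube.get_concrete_function(tf.constant(2.0))
op_types = [op.type for op in concrete.graph.get_operations()]

print(eager_result.numpy(), graph_result.numpy())
print(op_types)
```

Both calls return the same value; the difference is that the second one runs a graph that describes the whole computation, which TensorFlow can optimize and export.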

# Using TensorFlow interactively

TensorFlow 2 defaults to "eager" mode, which means we can run individual TF operations and immediately see their results without having to create a computation graph first. We can work with TF much as we do with NumPy. Let's start with constant tensors, created with `tf.constant`, which hold static (immutable) values that we want to be a part of our computation.

Tensors are simply multi-dimensional arrays, similar to NumPy's ndarray type, with additional capabilities (such as distributed computation and support for neural network training) that you won't find in NumPy.

In [2]:
# We would first import TensorFlow like any other Python library
import tensorflow as tf

# Create a tensor holding the scalar value 2
c = tf.constant(2)
print(c)

# Create a tensor containing a matrix
m = tf.constant([[1., 2.], [3., 4.]])
print(m)

tf.Tensor(2, shape=(), dtype=int32)

tf.Tensor(
[[1. 2.]
 [3. 4.]], shape=(2, 2), dtype=float32)




Note how similar this is to NumPy. We can easily convert back and forth between tensors and NumPy ndarrays:

In [3]:
import numpy as np

npa = np.array([[1., 2.], [3., 4.]])
tft = tf.convert_to_tensor(npa)
tft

<tf.Tensor: shape=(2, 2), dtype=float64, numpy=
array([[1., 2.],
       [3., 4.]])>

Note that tensors also have a shape similar to that of ndarrays.

Let's convert the tensor back to an ndarray.

In [4]:
npa_again = tft.numpy()  # the idiomatic way to get an ndarray from an eager tensor
print(npa_again)
print(type(npa_again))

[[1. 2.]
 [3. 4.]]
<class 'numpy.ndarray'>


We can also slice tensors using the familiar Python/NumPy slicing syntax:

In [5]:
t = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(t[0:1])

print(t[:, 1:])

tf.Tensor([[1 2 3]], shape=(1, 3), dtype=int32)
tf.Tensor(
[[2 3]
 [5 6]
 [8 9]], shape=(3, 2), dtype=int32)


## Basic mathematical operations

Tensors provide a wide variety of built-in operations:

In [6]:
print(t)

print(t + 1)

print(t * 2)

print(t * t)

print(tf.transpose(t)) # note that this does not transpose in place; unlike NumPy's t.T, which is a view, it creates a new tensor

print(t @ t) # the @ sign means matrix multiply

tf.Tensor(
[[1 2 3]
 [4 5 6]
 [7 8 9]], shape=(3, 3), dtype=int32)
tf.Tensor(
[[ 2  3  4]
 [ 5  6  7]
 [ 8  9 10]], shape=(3, 3), dtype=int32)
tf.Tensor(
[[ 2  4  6]
 [ 8 10 12]
 [14 16 18]], shape=(3, 3), dtype=int32)
tf.Tensor(
[[ 1  4  9]
 [16 25 36]
 [49 64 81]], shape=(3, 3), dtype=int32)
tf.Tensor(
[[1 4 7]
 [2 5 8]
 [3 6 9]], shape=(3, 3), dtype=int32)
tf.Tensor(
[[ 30  36  42]
 [ 66  81  96]
 [102 126 150]], shape=(3, 3), dtype=int32)


## Differences from NumPy

There are several places where our intuition from using NumPy might mislead us though.

Since TensorFlow is typically used for large computations where efficiency is critical, sometimes conveniences that we are used to in Python and NumPy are not available.  This is often because of their performance characteristics or overhead which might go unnoticed but negatively impact model training or prediction speed.

For example, TF floating-point tensors default to 32-bit precision rather than NumPy's default of 64-bit. 32-bit precision works fine for most neural net applications and is much faster on GPUs, especially on consumer-grade graphics cards.

Another example is type conversions (e.g. integer to float, etc.). These must be explicitly handled so it is clear where in a computation they are occurring. Operations between incompatible types will cause an exception, terminating the computation. Types can be converted where necessary using tf.cast().
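The difference in default precision is easy to check. The following is a small sketch, assuming TensorFlow 2.x and NumPy with their default settings; the variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

tf_default = tf.constant([1.5, 2.5])     # Python floats become float32 in TF
np_default = np.array([1.5, 2.5])        # NumPy defaults to float64

# Converting a NumPy array keeps its float64 dtype unless we say otherwise
from_numpy = tf.constant(np_default)
as_float32 = tf.constant(np_default, dtype=tf.float32)

print(tf_default.dtype, np_default.dtype, from_numpy.dtype, as_float32.dtype)
```

When loading data from NumPy, it is usually worth requesting `tf.float32` explicitly so that the data matches the dtype of the model's weights.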

In [7]:
tf.cast(tf.constant(1), tf.float32) + tf.constant(2.0) # this would throw an exception if the cast were missing

<tf.Tensor: shape=(), dtype=float32, numpy=3.0>
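To see what happens when the cast is missing, here is a hedged sketch, assuming TensorFlow 2.x with its default (strict) type handling. The exact exception class may vary between versions, so the example catches a few likely candidates rather than relying on one.

```python
import tensorflow as tf

a = tf.constant(1)      # int32
b = tf.constant(2.0)    # float32

try:
    a + b               # mixing int32 and float32 without a cast
    mismatch_raised = False
except (tf.errors.InvalidArgumentError, TypeError, ValueError) as err:
    mismatch_raised = True
    print(type(err).__name__)

# Casting one operand to the other's dtype makes the addition valid
total = tf.cast(a, b.dtype) + b
print(total)
```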

## Variables

To create a variable tensor, we use tf.Variable() in the same way:

In [8]:
v = tf.Variable([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v

<tf.Variable 'Variable:0' shape=(3, 3) dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=int32)>

To change the value of an entry in a variable tensor, use assign().

In [9]:
v[0, 0].assign(22)
v

<tf.Variable 'Variable:0' shape=(3, 3) dtype=int32, numpy=
array([[22,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9]], dtype=int32)>
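A few more ways of updating a variable are sketched below, assuming TensorFlow 2.x. The last part shows that a constant tensor, unlike a variable, cannot be modified after it is created.

```python
import tensorflow as tf

v = tf.Variable([[1, 2, 3], [4, 5, 6]])

v.assign(2 * v)                  # replace every value
v[0, 1].assign(42)               # update a single entry
v.assign_add(tf.ones_like(v))    # increment every value in place
print(v.numpy())

# Constant tensors are immutable: item assignment is not supported
t = tf.constant([1, 2, 3])
try:
    t[0] = 5
    tensor_is_mutable = True
except TypeError:
    tensor_is_mutable = False
print(tensor_is_mutable)
```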

## Other TF data structures

TF supports a variety of other special-purpose data structures beyond "vanilla" tensors:

- tf.SparseTensor: For tensors that are mostly zero so would otherwise waste a lot of space
- tf.TensorArray: For working with a list/array of tensors
- tf.RaggedTensor: For working with tensors that are unevenly shaped rather than rectangular
- tf.string: For storing byte strings rather than Unicode strings (Unicode text has to be encoded, e.g. as UTF-8 bytes)
- Sets: TF provides set operations on tensors
- Queues: TF provides various types of standard queues
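Three of the structures listed above can be tried out in a few lines. This is a minimal sketch, assuming TensorFlow 2.x; the example values are made up for illustration.

```python
import tensorflow as tf

# Sparse tensor: store only the non-zero entries and their positions
sparse = tf.SparseTensor(indices=[[0, 1], [1, 0], [2, 3]],
                         values=[1.0, 2.0, 3.0],
                         dense_shape=[3, 4])
dense = tf.sparse.to_dense(sparse)

# Ragged tensor: rows may have different lengths
ragged = tf.ragged.constant([[1, 2, 3], [4], [5, 6]])
row_lengths = ragged.row_lengths()

# String tensor: a Python str is stored as its UTF-8 encoded bytes
text = tf.constant("café")
n_bytes = tf.strings.length(text)
n_chars = tf.strings.length(text, unit="UTF8_CHAR")

print(dense.numpy())
print(row_lengths.numpy())
print(n_bytes.numpy(), n_chars.numpy())
```

The two string lengths differ because the accented character takes two bytes in UTF-8, which illustrates why string tensors are best thought of as byte strings.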

# Custom functions

## Custom loss

As we've seen, the usual "go to" loss functions for regression and classification are mean squared error (MSE) and cross-entropy loss respectively. These are not necessarily the best choices, however, especially if we are building a custom model of some kind. Researchers have developed a large number of alternative loss functions for specific cases.

You can see a list of the Keras loss functions at https://www.tensorflow.org/api_docs/python/tf/keras/losses.  These cover the vast majority of specialized functions you are likely to require.

If you do indeed need to create a new loss function that is not covered by the standard set, you can code your own in Python, but for good performance you should use TensorFlow operations as we discussed above, not plain Python or NumPy code.


Here is an example of how to code a custom loss function using tensor operations. This is the popular *Huber Loss*, which is useful when training a regression model on noisy data. It uses half the squared error when the absolute error is below some threshold (set to its typical value of 1 in this example) and a linear function of the absolute error otherwise. (Huber Loss is actually an existing option in TensorFlow, but it makes a nice, simple example for our purposes.)

In [10]:
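def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    # Boolean mask: True where the absolute error is below the threshold of 1
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2    # used for small errors
    linear_loss = tf.abs(error) - 0.5      # used for large errors
    # Pick the squared or linear loss element by element
    return tf.where(is_small_error, squared_loss, linear_loss)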
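As a sanity check, the custom function can be compared with the built-in Huber loss. The sketch below assumes TensorFlow 2.x and uses made-up predictions; `tf.keras.losses.Huber` averages over the batch, so we average the per-sample values of `huber_fn` before comparing.

```python
import numpy as np
import tensorflow as tf

def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2
    linear_loss = tf.abs(error) - 0.5
    return tf.where(is_small_error, squared_loss, linear_loss)

# Hypothetical targets and predictions with absolute errors of 0.5, 1.0 and 3.0
y_true = tf.constant([[0.0], [0.0], [0.0]])
y_pred = tf.constant([[0.5], [1.0], [3.0]])

per_sample = huber_fn(y_true, y_pred)        # one loss value per instance
custom_mean = tf.reduce_mean(per_sample)
builtin_mean = tf.keras.losses.Huber(delta=1.0)(y_true, y_pred)

print(per_sample.numpy().ravel())
print(custom_mean.numpy(), builtin_mean.numpy())
```

To train with the custom loss, pass the function itself when compiling a model, for example `model.compile(loss=huber_fn, optimizer="sgd")`.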

## Saving and reloading models

Keras has the built-in capability to save and reload models. If the model hasn't been customized, it's as simple as model.save("myModel.h5") to save your model to the file "myModel.h5" and model = keras.models.load_model("myModel.h5") to restore it.

If you have developed a model that uses custom functions such as the huber_fn above, it is a little more involved but still relatively easy to do.  The save method will store the name of the custom function, but not the function itself.  When you load the model again, you will need to provide a Python dictionary to map the function name to the actual function, which you will need to provide the code for in the notebook you're restoring to:

model = keras.models.load_model("myModel.h5", custom_objects={"huber_fn": huber_fn})
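The full round trip can be sketched as follows. This is an illustration under stated assumptions, not a definitive recipe: it assumes TensorFlow 2.x with HDF5 support (the `h5py` package) available, and it uses a tiny synthetic data set and a temporary directory that are not part of the original notebook. Saving behavior differs somewhat between TensorFlow versions, so check the documentation for the version you are using.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow import keras

def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    is_small_error = tf.abs(error) < 1
    squared_loss = tf.square(error) / 2
    linear_loss = tf.abs(error) - 0.5
    return tf.where(is_small_error, squared_loss, linear_loss)

# Tiny synthetic regression problem (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3)).astype("float32")
y = X.sum(axis=1, keepdims=True).astype("float32")

model = keras.Sequential([keras.Input(shape=(3,)), keras.layers.Dense(1)])
model.compile(loss=huber_fn, optimizer="sgd")
model.fit(X, y, epochs=1, verbose=0)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "myModel.h5")
    model.save(path)
    # The file stores only the name "huber_fn", so we map it back to the function
    restored = keras.models.load_model(path, custom_objects={"huber_fn": huber_fn})

predictions_match = np.allclose(model.predict(X, verbose=0),
                                restored.predict(X, verbose=0))
print(predictions_match)
```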

If you have used subclassing to build a custom model rather than just adding custom functions, you will need to implement a get_config() method for the subclass to allow saving and restoring it. Consult the documentation for the version of TensorFlow that you're using for details if you need to do this.

## Custom activation functions, initializers, regularizers, and constraints

Keras's built-in functions cover the majority of general use cases. In most cases where a special loss function, regularizer, initializer, etc. is required, it can be implemented as a TF function, as we've seen with the Huber Loss. Here are some examples that reimplement standard functions already present in Keras or TF (again, it would be unusual to need something that isn't already provided):

In [11]:
# Works the same as tf.nn.softplus(z)
def my_softplus(z): 
    return tf.math.log(tf.exp(z) + 1.0)

# Works the same as keras.initializers.glorot_normal()
def my_glorot_initializer(shape, dtype=tf.float32):
    stddev = tf.sqrt(2.0 / (shape[0] + shape[1]))
    return tf.random.normal(shape, stddev=stddev, dtype=dtype)

# Works the same as keras.regularizers.l1(0.01)
def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.01 * weights))

# Works the same as keras.constraints.nonneg()
def my_positive_weights(weights):
    return tf.where(weights < 0.0, tf.zeros_like(weights), weights)

Note that again we needed to use TF operations (not NumPy ones) because we are performing operations on tensors, not on NumPy objects. Once we've defined custom functions we can use them in the model:

In [12]:
from tensorflow import keras

layer = keras.layers.Dense(30, activation=my_softplus,
                          kernel_initializer=my_glorot_initializer,
                          kernel_regularizer=my_l1_regularizer,
                          kernel_constraint=my_positive_weights)
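The custom functions can be sanity-checked against their built-in counterparts. The sketch below assumes TensorFlow 2.x and uses a small made-up weight matrix; the non-negativity constraint is written with the three-argument form of `tf.where` (condition, value if true, value if false).

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def my_softplus(z):
    return tf.math.log(tf.exp(z) + 1.0)

def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.01 * weights))

def my_positive_weights(weights):
    return tf.where(weights < 0.0, tf.zeros_like(weights), weights)

# Hypothetical weights containing both negative and positive entries
w = tf.constant([[-2.0, 0.5], [3.0, -0.1]])

softplus_matches = np.allclose(my_softplus(w).numpy(),
                               tf.nn.softplus(w).numpy())
l1_matches = np.allclose(my_l1_regularizer(w).numpy(),
                         keras.regularizers.l1(0.01)(w).numpy())
clipped = my_positive_weights(w)   # negative entries are replaced by zero

print(softplus_matches, l1_matches)
print(clipped.numpy())
```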

As with the loss, if your function has hyperparameters that you want saved along with the model, you will need to do a little more work by subclassing the appropriate Keras class.
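For the loss, such a subclass might look like the sketch below, which follows the pattern in the Géron chapter listed in the readings. It assumes TensorFlow 2.x; the class name `HuberLoss` and its `threshold` parameter are illustrative choices. The key piece is `get_config()`, which adds the hyperparameter to the configuration that Keras stores with the model.

```python
import tensorflow as tf
from tensorflow import keras

class HuberLoss(keras.losses.Loss):
    def __init__(self, threshold=1.0, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < self.threshold
        squared_loss = tf.square(error) / 2
        linear_loss = self.threshold * tf.abs(error) - self.threshold ** 2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)

    def get_config(self):
        # Include the hyperparameter so it is saved along with the model
        base_config = super().get_config()
        return {**base_config, "threshold": self.threshold}

loss = HuberLoss(threshold=2.0)
config = loss.get_config()
rebuilt = HuberLoss.from_config(config)   # what Keras does when reloading

# Hypothetical targets and predictions with absolute errors of 0.5, 1.0 and 3.0
value = loss(tf.constant([[0.0], [0.0], [0.0]]),
             tf.constant([[0.5], [1.0], [3.0]]))
print(config["threshold"], rebuilt.threshold, value.numpy())
```

When reloading a model that uses this class, it would be passed through `custom_objects`, for example `custom_objects={"HuberLoss": HuberLoss}`.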

**End of Module**

You have reached the end of this module.

If you have any questions, please reach out to your peers using the discussion boards. If you
and your peers are unable to come to a suitable conclusion, do not hesitate to reach out to
your instructor on the designated discussion board.

# References

- Géron, A. (2019). Chapter 12: Custom Models and Training with TensorFlow in *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow* (2nd ed.). O’Reilly Media. https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/


- TensorFlow documentation and tutorials: https://www.tensorflow.org/tutorials