In [90]:
import tensorflow as tf
import numpy as np

# A Quick Tour of TensorFlow

Here's a summary of what TensorFlow has to offer:

1. Its core is very similar to NumPy, but with GPU support
2. It supports distributed computing (across multiple devices and servers)
3. It includes a kind of just-in-time (JIT) compiler that allows it to optimize computations for speed and memory usage. It works by extracting the *computation graph* from a Python function, then optimizing it (e.g. by pruning unused nodes), and finally running it efficiently (e.g. by automatically running independent operations in parallel).
4. Computation graphs can be exported to a portable format, so you can train a TensorFlow model in one environment (e.g. using Python on Linux) and run it in another (e.g. using Java on an Android device).
5. It implements autodiff (see Chapter 10 and Appendix D) and provides some excellent optimizers, such as RMSProp and Nadam (see Chapter 11), so you can easily minimize all sorts of loss functions.
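
To make points 3 and 5 concrete, here is a minimal sketch that wraps a Python function in `tf.function` (so TensorFlow can extract a graph from it) and differentiates it with `tf.GradientTape`. The function `f` and the variables `w1` and `w2` are made up for illustration; they are not part of any TensorFlow API.

```python
import tensorflow as tf

@tf.function  # TensorFlow traces this Python function into a computation graph
def f(w1, w2):
    return 3 * w1 ** 2 + 2 * w1 * w2

# Hypothetical parameters, chosen only so the arithmetic is easy to check by hand
w1, w2 = tf.Variable(5.), tf.Variable(3.)

# Autodiff: operations on variables are recorded on the tape...
with tf.GradientTape() as tape:
    z = f(w1, w2)

# ...and the gradients are computed from that recording.
grad_w1, grad_w2 = tape.gradient(z, [w1, w2])

print(z.numpy())        # 3 * 25 + 2 * 5 * 3 = 105
print(grad_w1.numpy())  # dz/dw1 = 6 * w1 + 2 * w2 = 36
print(grad_w2.numpy())  # dz/dw2 = 2 * w1 = 10
```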

TensorFlow offers many more features built on top of these core features: the most important is of course tf.keras, but it also has data loading and preprocessing ops, image processing ops, signal processing ops, and more. 

As you may know, GPUs can dramatically speed up computations by splitting them into many smaller chunks and running them in parallel across many GPU threads. TPUs are even faster: they are custom ASIC chips built specifically for Deep Learning operations.

There's even more to the TensorFlow library:
1. TensorBoard - for visualization
2. TensorFlow Extended (TFX) - a set of libraries built by Google to productionize TensorFlow projects. It includes tools for data validation, preprocessing, model analysis, and serving.
3. TensorFlow Hub - provides a way to easily download and reuse pretrained neural networks. You can also get many neural network architectures, some of them pretrained, in TensorFlow's *model garden*.
4. TensorFlow Resources - contains TensorFlow-based projects. You will find hundreds of TensorFlow projects on GitHub, so it is often easy to find existing code for whatever you are trying to do.

More and more ML papers are released along with their implementations, and sometimes even with pretrained models. Check out https://paperswithcode.com/ to easily find them.

## Using TensorFlow like NumPy

TensorFlow's API revolves around tensors. A tensor is usually a multidimensional array, but it can also hold a scalar. Let's see how to create and manipulate them.

### Tensors and Operations

In [77]:
tf.constant([[1., 2., 3.], [4., 5., 6]]), tf.constant(42)

(<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
 array([[1., 2., 3.],
        [4., 5., 6.]], dtype=float32)>,
 <tf.Tensor: shape=(), dtype=int32, numpy=42>)

In [79]:
# Just like an ndarray, a tf.Tensor has a shape and a data type
t = tf.constant([[1., 2., 3.], [4., 5., 6]])
t.shape, t.dtype

(TensorShape([2, 3]), tf.float32)

In [81]:
# Indexing works much like in NumPy
t[:, 1:]

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[2., 3.],
       [5., 6.]], dtype=float32)>

In [83]:
t[..., 1, tf.newaxis]

<tf.Tensor: shape=(2, 1), dtype=float32, numpy=
array([[2.],
       [5.]], dtype=float32)>

In [85]:
# More importantly, all sorts of tensor operations are available
t + 10, tf.square(t), t @ tf.transpose(t)

(<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
 array([[11., 12., 13.],
        [14., 15., 16.]], dtype=float32)>,
 <tf.Tensor: shape=(2, 3), dtype=float32, numpy=
 array([[ 1.,  4.,  9.],
        [16., 25., 36.]], dtype=float32)>,
 <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[14., 32.],
        [32., 77.]], dtype=float32)>)

You will find all the basic math operations you need (tf.add(), tf.multiply(), tf.square(), tf.exp(), tf.sqrt(), etc.) and most operations that you can find in NumPy (e.g. tf.reshape(), tf.squeeze(), tf.tile()). Some functions have a different name than in NumPy; for instance, tf.reduce_mean(), tf.reduce_sum(), tf.reduce_max(), and tf.math.log() are the equivalents of np.mean(), np.sum(), np.max() and np.log().

When the name differs, there is often a good reason for it. For example, in TensorFlow you must write tf.transpose(t); you cannot just write t.T like in NumPy. The reason is that the tf.transpose() function does not do exactly the same thing as NumPy's T attribute: in TensorFlow, a new tensor is created with its own copy of the transposed data, while in NumPy, t.T is just a transposed view of the same data.
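
The view-versus-copy distinction can be checked on the NumPy side with a short sketch like the one below. The array `a` is just sample data. Since tensors are immutable, the TensorFlow half only shows that `tf.transpose()` returns a separate tensor with the transposed shape.

```python
import numpy as np
import tensorflow as tf

a = np.array([[1., 2., 3.], [4., 5., 6.]], dtype=np.float32)
a_t = a.T                         # a view: no data is copied
print(np.shares_memory(a, a_t))   # True

a[0, 1] = 42.                     # modify the original array...
print(a_t[1, 0])                  # ...and the view sees the change: 42.0

t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
t_t = tf.transpose(t)             # a new tensor holding the transposed data
print(t_t.shape)                  # (3, 2)
```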

Similarly, the tf.reduce_sum() operation is named this way because its GPU kernel (i.e. GPU implementation) uses a reduce algorithm that does not guarantee the order in which the elements are added: because 32-bit floats have limited precision, the result may change ever so slightly every time you call this operation.
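
To see why the order of additions matters at all, here is a small sketch using plain NumPy 32-bit floats. It does not reproduce the GPU kernel. It only illustrates the underlying rounding effect, with values picked so that the difference is obvious.

```python
import numpy as np

big = np.float32(1e8)    # near 1e8, consecutive float32 values are 8 apart
small = np.float32(1.0)

# Same three terms, added in two different orders
left = (big + small) - big    # small is lost when added to big first
right = (big - big) + small   # small survives when big cancels out first

print(left, right)  # 0.0 1.0
```

In a real reduction the discrepancies are usually far smaller than this, but they come from the same mechanism.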

### Tensors and NumPy

In [94]:
# Tensors play nice with NumPy
a = np.array([2., 4., 5.])
tf.constant(a), t.numpy()

(<tf.Tensor: shape=(3,), dtype=float64, numpy=array([2., 4., 5.])>,
 array([[1., 2., 3.],
        [4., 5., 6.]], dtype=float32))

In [96]:
tf.square(a), np.square(t)

(<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 4., 16., 25.])>,
 array([[ 1.,  4.,  9.],
        [16., 25., 36.]], dtype=float32))

Notice that NumPy uses 64-bit precision by default, while TensorFlow uses 32-bit. This is because 32-bit precision is generally more than enough for neural networks, plus it runs faster and uses less RAM. So when you create a tensor from a NumPy array, make sure to set dtype=tf.float32.
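
As a minimal sketch of that advice, the snippet below builds a tensor from the same NumPy array twice, once with the inherited dtype and once with an explicit `dtype=tf.float32`. The names `t64` and `t32` are only illustrative.

```python
import numpy as np
import tensorflow as tf

a = np.array([2., 4., 5.])               # NumPy defaults to float64

t64 = tf.constant(a)                     # dtype is inherited from the array
t32 = tf.constant(a, dtype=tf.float32)   # explicitly request 32-bit floats

print(t64.dtype)  # <dtype: 'float64'>
print(t32.dtype)  # <dtype: 'float32'>
```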

### Type Conversions

Type conversions can significantly hurt performance, and they can easily go unnoticed when they are done automatically. To avoid this, TensorFlow does not perform any type conversions automatically: it just raises an exception if you try to execute an operation on tensors with incompatible types. This may be a bit annoying at first, but remember that it's for a good cause! And of course you can use tf.cast() when you really need to convert types.

In [103]:
# Example of Exception
# tf.constant(2.) + tf.constant(40)

' InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a float tensor but is a int32 tensor [Op:AddV2]'


In [101]:
# Example casting variables as the correct type
t2 = tf.constant(40., dtype=tf.float64)
tf.constant(2.0) + tf.cast(t2, tf.float32)

<tf.Tensor: shape=(), dtype=float32, numpy=42.0>
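
The two cells above can also be combined into one runnable sketch that triggers the type error, catches it, and then fixes it with tf.cast(). The exact exception class may vary between TensorFlow versions, so catching `TypeError` as well is a defensive assumption here; the variable names are illustrative.

```python
import tensorflow as tf

x = tf.constant(2.)   # float32
y = tf.constant(40)   # int32

try:
    x + y             # incompatible dtypes: no automatic conversion
    raised = False
except (tf.errors.InvalidArgumentError, TypeError) as err:
    raised = True
    print(type(err).__name__)

# Convert explicitly when you really need to mix types
total = x + tf.cast(y, tf.float32)
print(total.numpy())  # 42.0
```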

## Variables

The tf.Tensor values we've seen so far are immutable: you cannot modify them. For mutable tf.Tensor values we need tf.Variable. A tf.Variable acts much like a tf.Tensor: you can perform the same operations with it, it plays nicely with NumPy as well, and it is just as picky with types. But it can also be modified in place using the assign() method (or assign_add() or assign_sub(), which increment or decrement the variable by the given value).

In practice you will rarely have to create variables manually, since Keras provides an add_weight() method that will take care of it for you, as we will see. Moreover, model parameters will generally be updated directly by the optimizers, so you will rarely need to update variables manually.

In [106]:
# Variable examples
v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])
v

<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)>

In [110]:
v.assign(2 * v)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]], dtype=float32)>

In [112]:
v[0, 1].assign(42)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[ 2., 42.,  6.],
       [ 8., 10., 12.]], dtype=float32)>

In [114]:
v[:, 2].assign([0., 1.])

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[ 2., 42.,  0.],
       [ 8., 10.,  1.]], dtype=float32)>

In [116]:
v.scatter_nd_update(
    indices=[[0, 0], [1, 2]],
    updates=[100., 200.]
)

<tf.Variable 'UnreadVariable' shape=(2, 3) dtype=float32, numpy=
array([[100.,  42.,   0.],
       [  8.,  10., 200.]], dtype=float32)>
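
The assign_add() and assign_sub() methods mentioned earlier are not shown in the cells above, so here is a short sketch of them. It also contrasts a variable with a regular tensor, which rejects item assignment. The values are arbitrary sample data.

```python
import tensorflow as tf

v = tf.Variable([1., 2., 3.])
v.assign_add([10., 10., 10.])   # in-place increment: v is now [11., 12., 13.]
v.assign_sub([1., 1., 1.])      # in-place decrement: v is now [10., 11., 12.]
print(v.numpy())

t = tf.constant([1., 2., 3.])
try:
    t[0] = 42.                  # tensors are immutable
    tensor_is_immutable = False
except TypeError:
    tensor_is_immutable = True
print(tensor_is_immutable)      # True
```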

## Other Data Structures

# Customizing Models and Training Algorithms

## Custom Loss Functions

## Saving and Loading Models That Contain Custom Components

## Custom Activation Functions, Initializers, Regularizers, and Constraints

## Custom Metrics

## Custom Layers

## Custom Models

# Losses and Metrics Based on Model Internals

## Computing Gradients Using Autodiff

## Custom Training Loops

# TensorFlow Functions and Graphs

## AutoGraph and Tracing

## TF Function Rules

# Exercises

1. **How would you describe TensorFlow in a short sentence? What are its main features? Can you name other popular Deep Learning libraries?**

2. **Is TensorFlow a drop-in replacement for NumPy? What are the main differences between the two?**

3. **Do you get the same result with tf.range(10) and tf.constant(np.arange(10))?**

4. **Can you name six other data structures available in TensorFlow, beyond regular tensors?**

5. **A custom loss function can be defined by writing a function or by subclassing the keras.losses.Loss class. When would you use each option?**

6. **Similarly, a custom metric can be defined in a function or a subclass of keras.metrics.Metric. When would you use each option?**

7. **When should you create a custom layer versus a custom model?**

8. **What are some use cases that require writing your own custom training loop?**

9. **Can custom Keras components contain arbitrary Python code, or must they be convertible to TF Functions?**

10. **What are the main rules to respect if you want a function to be convertible to a TF Function?**

11. **When would you need to create a dynamic Keras model? How do you do that? Why not make all your models dynamic?**

12. **Implement a custom layer that performs *Layer Normalization* (we will use this type of layer in Chapter 15):**

    a. The build() method should define two trainable weights $\alpha$ and $\beta$, both of shape input_shape[-1:] and data type tf.float32. $\alpha$ should be initialized with 1s and $\beta$ with 0s.

    b. The call() method should compute the mean $\mu$ and standard deviation $\sigma$ of each instance's features. For this, you can use tf.nn.moments(inputs, axes=-1, keepdims=True), which returns the mean $\mu$ and the variance $\sigma^{2}$ of all instances (compute the square root of the variance to get the standard deviation). Then the function should compute and return $\alpha \otimes \frac{X - \mu}{\sigma + \epsilon} + \beta$, where $\otimes$ represents itemwise multiplication and $\epsilon$ is a smoothing term (a small constant to avoid division by zero, e.g. 0.001).

    c. Ensure that your custom layer produces the same (or very nearly the same) output as the keras.layers.LayerNormalization layer.

13. **Train a model using a custom training loop to tackle the Fashion MNIST dataset (see Chapter 10).**

    a. Display the epoch, iteration, mean training loss, and mean accuracy over each epoch (updated at each iteration), as well as the validation loss and accuracy at the end of each epoch.
    
    b. Try using a different optimizer with a different learning rate for the upper layers and the lower layers.