<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/marco-canas/didactica_ciencia_datos/blob/main/referentes/geron/part_2/c_12/c_12.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  </td>
</table>

# Chapter 12. Custom Models and Training with TensorFlow

Up until now, we have used only TensorFlow's high-level API, `tf.keras`, but it has already taken us quite far: we built various neural network architectures, including regression and classification nets, wide and deep nets, and self-normalizing nets, using all sorts of techniques, such as Batch Normalization, dropout, and learning rate schedules.

In fact, 95% of the use cases you will encounter will not require anything other than `tf.keras` (and `tf.data`; see Chapter 13).

But now it is time to dive deeper into TensorFlow and take a look at its lower-level Python API.

This will be useful when you need extra control to write custom loss functions, custom metrics, layers, models, initializers, regularizers, weight constraints, and more.

You may even need to fully control the training loop itself, for example to apply special transformations or constraints to the gradients (beyond just clipping them) or to use multiple optimizers for different parts of the network.

We will cover all these cases in this chapter, and we will also look at how you can boost your custom models and training algorithms using TensorFlow’s automatic graph generation feature. 

But first, let’s take a quick tour of TensorFlow.

## NOTE

TensorFlow 2.0 (beta) was released in June 2019, making TensorFlow much easier to use. 

The first edition of this book used TF 1, while this edition uses TF 2.

# A Quick Tour of TensorFlow

As you know, TensorFlow is a powerful library for numerical computation, particularly well suited and fine-tuned for large-scale Machine Learning (but you could use it for anything else that requires heavy computations). 

It was developed by the Google Brain team and it powers many of Google’s large-scale services, such as Google Cloud Speech, Google Photos, and Google Search. 

It was open sourced in November 2015, and it is now the most popular Deep Learning library (in terms of citations in papers, adoption in companies, stars on GitHub, etc.).

Countless projects use TensorFlow for all sorts of Machine Learning tasks, such as image classification, natural language processing, recommender systems, and time series forecasting.

So what does TensorFlow offer? Here’s a summary:

* Its core is very similar to NumPy, but with GPU support.

* It supports distributed computing (across multiple devices and servers).

* It includes a kind of just-in-time (JIT) compiler that allows it to optimize computations for speed and memory usage. It works by extracting the computation graph from a Python function, then optimizing it (e.g., by pruning unused nodes), and finally running it efficiently (e.g., by automatically running independent operations in parallel).

* Computation graphs can be exported to a portable format, so you can train a TensorFlow model in one environment (e.g., using Python on Linux) and run it in another (e.g., using Java on an Android device).

* It implements autodiff (see Chapter 10 and Appendix D) and provides some excellent optimizers, such as RMSProp and Nadam (see Chapter 11), so you can easily minimize all sorts of loss functions.
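To make two of these features concrete, here is a minimal sketch of graph extraction with `tf.function` and of autodiff with `tf.GradientTape`. The function `f` and the variables `w1` and `w2` are illustrative choices, not taken from the text above, and the snippet assumes TensorFlow 2 running in eager mode.

```python
import tensorflow as tf

# Illustrative function (an assumption for this sketch): f(w1, w2) = 3*w1^2 + 2*w1*w2
@tf.function  # traces the Python function into a TensorFlow graph on its first call
def f(w1, w2):
    return 3.0 * w1 ** 2 + 2.0 * w1 * w2

w1 = tf.Variable(5.0)
w2 = tf.Variable(3.0)

# Autodiff: operations executed inside the tape context are recorded,
# so gradients with respect to the variables can be requested afterwards.
with tf.GradientTape() as tape:
    z = f(w1, w2)

grad_w1, grad_w2 = tape.gradient(z, [w1, w2])

print(float(z))        # 3*25 + 2*5*3 = 105.0
print(float(grad_w1))  # df/dw1 = 6*w1 + 2*w2 = 36.0
print(float(grad_w2))  # df/dw2 = 2*w1 = 10.0
```

The gradients match the analytical partial derivatives, which is a quick sanity check that the tape recorded the computation performed inside the traced function.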

TensorFlow offers many more features built on top of these core features: the most important is of course `tf.keras`, but it also has data loading and preprocessing ops (`tf.data`, `tf.io`, etc.), image processing ops (`tf.image`), signal processing ops (`tf.signal`), and more (see Figure 12-1 for an overview of TensorFlow’s Python API).

## TIP

We will cover many of the packages and functions of the TensorFlow API, but it’s impossible to cover them all, so you should really take some time to browse through the API; you will find that it is quite rich and well documented.

<img src = 'https://github.com/marco-canas/didactica_ciencia_datos/blob/main/referentes/geron/part_2/c_12/figure_12_1.jpg?raw=true'>

At the lowest level, each TensorFlow operation (op for short) is implemented using highly efficient C++ code. 

Many operations have multiple implementations called kernels: each kernel is dedicated to a specific device type, such as CPUs, GPUs, or even TPUs (tensor processing units). 

As you may know, GPUs can dramatically speed up computations by splitting them into many smaller chunks and running them in parallel across many GPU threads. 

TPUs are even faster: they are custom ASIC chips built specifically for Deep Learning operations (we will discuss how to use TensorFlow with GPUs or TPUs in Chapter 19).
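As a rough illustration of how ops map to devices, the sketch below lists the devices TensorFlow can see and pins a matrix multiplication to the CPU, so that the CPU kernel of `tf.matmul` is used. The variable names are illustrative, and the printed device lists depend entirely on the machine running the code.

```python
import tensorflow as tf

# List the devices visible to TensorFlow (the output varies by machine).
cpus = tf.config.list_physical_devices("CPU")
gpus = tf.config.list_physical_devices("GPU")
print("CPUs:", cpus)
print("GPUs:", gpus)

# Explicitly place the ops on the first CPU; without this context manager,
# TensorFlow chooses a device (and therefore a kernel) automatically.
with tf.device("/CPU:0"):
    a = tf.constant([[1., 2.], [3., 4.]])
    product = tf.matmul(a, a)

print(product.device)  # full device string, ending in "CPU:0"
print(product.numpy())
```

On a machine with a GPU, replacing `"/CPU:0"` with `"/GPU:0"` would request the GPU kernel instead.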

TensorFlow’s architecture is shown in Figure 12-2. 

<img src = 'https://github.com/marco-canas/didactica_ciencia_datos/blob/main/referentes/geron/part_2/c_12/figure_12_2.jpg?raw=true'>

Most of the time your code will use the high-level APIs (especially `tf.keras` and `tf.data`); but when you need more flexibility, you will use the lower-level Python API, handling tensors directly.

Note that APIs for other languages are also available. 

In any case, TensorFlow’s execution engine will take care of running the operations efficiently, even across multiple devices and machines if you tell it to.

TensorFlow runs not only on Windows, Linux, and macOS, but also on mobile devices (using TensorFlow Lite), including both iOS and Android (see Chapter 19). 

If you do not want to use the Python API, there are C++, Java, Go, and Swift APIs. 

There is even a JavaScript implementation called TensorFlow.js that makes it possible to run your models directly in your browser.

There’s more to TensorFlow than the library. 

TensorFlow is at the center of an extensive ecosystem of libraries. First, there’s TensorBoard for visualization (see Chapter 10). 

Next, there’s TensorFlow Extended (TFX), which is a set of libraries built by Google to productionize TensorFlow projects: it includes tools for data validation, preprocessing, model analysis, and serving (with TF Serving; see Chapter 19). 

Google’s TensorFlow Hub provides a way to easily download and reuse pretrained neural networks.

You can also get many neural network architectures, some of them pretrained, in TensorFlow’s model garden. 

Check out the TensorFlow Resources and https://github.com/jtoy/awesome-tensorflow for more TensorFlow-based projects. You will find hundreds of TensorFlow projects on GitHub, so it is often easy to find existing code for whatever you are trying to do. 

## TIP

More and more ML papers are released along with their implementations, and sometimes even with pretrained models. Check out https://paperswithcode.com/ to easily find them.

Last but not least, TensorFlow has a dedicated team of passionate and helpful developers, as well as a large community contributing to improving it. 

To ask technical questions, you should use http://stackoverflow.com/ and tag your question with `tensorflow` and `python`.

You can file bugs and feature requests through GitHub. 

For general discussions, join the Google group.

OK, it’s time to start coding!

# Using TensorFlow like NumPy

TensorFlow’s API revolves around tensors, which flow from operation to operation, hence the name TensorFlow. A tensor is usually a multidimensional array (much like a NumPy `ndarray`), but it can also hold a scalar (a simple value, such as 42).

These tensors will be important when we create custom cost functions, custom metrics, custom layers, and more, so let’s see how to create and manipulate them.

## Tensors and Operations

You can create a tensor with `tf.constant()`. 

For example, here is a tensor representing a matrix with two rows and three columns of floats:

In [1]:
import tensorflow as tf 

In [2]:
tf.constant([[1., 2., 3.], [4., 5., 6.]]) # matrix

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)>

In [3]:
tf.constant(42) # scalar


<tf.Tensor: shape=(), dtype=int32, numpy=42>
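The outputs above show each tensor's shape and data type. As a minimal sketch, these can also be inspected directly, and tensors can be converted to and from NumPy arrays. The variable names `t`, `s`, `a`, and `t2` are illustrative.

```python
import numpy as np
import tensorflow as tf

t = tf.constant([[1., 2., 3.], [4., 5., 6.]])  # matrix
s = tf.constant(42)                            # scalar

print(t.shape)  # (2, 3)
print(t.dtype)  # <dtype: 'float32'>
print(s.shape)  # ()

# Convert a tensor to a NumPy array.
a = t.numpy()
print(type(a))

# Create a tensor from a NumPy array. NumPy defaults to 64-bit floats,
# so the resulting tensor is float64 rather than float32.
t2 = tf.constant(np.array([1., 2., 3.]))
print(t2.dtype)
```

Note the difference in default precision: Python floats become `float32` tensors, whereas NumPy arrays keep their `float64` type unless a `dtype` is specified.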