# Introduction to TensorFlow

Here we import some useful libraries to help understand the concepts behind TensorFlow.

In [2]:
# Importing basic libraries

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

In [3]:
tf.__version__

'2.8.0'

## What is TensorFlow?
TensorFlow is a free, open-source, end-to-end platform for machine learning (ML). It comprises a comprehensive and flexible ecosystem of tools, libraries, and community resources that enables researchers and developers to build and deploy ML-based applications.

Some of the main advantages of TensorFlow are:

|Easy model building|Robust ML production|Experimentation for research|
|---|---|---|
|<img src="https://www.tensorflow.org/site-assets/images/marketing/home/model.svg" width="80%"> | <img src="https://www.tensorflow.org/site-assets/images/marketing/home/robust.svg" width="80%" />|<img src="https://www.tensorflow.org/site-assets/images/marketing/home/research.svg" width="80%" />|
| It makes it easy to deploy and train <br/> ML models using high-level APIs <br/> like *Keras* with *eager execution*, <br/> which allows for easy building and debugging.| It enables easy <br/>training and deployment of models in the cloud, on mobile devices, or <br/>in a browser regardless of the language used.| It has a simple and flexible architecture to <br/> take new ideas from concept to <br/>code, and lets you use <br/>state-of-the-art models to publish faster.|

# 1. General concepts in TensorFlow

## Computational graph

TensorFlow is well suited to *deep learning* mainly because it supports automatic differentiation and can parallelize mathematical operations. It achieves this by internally building a computational graph:

<img src="https://github.com/tensorflow/docs/blob/master/site/en/guide/images/intro_to_graphs/two-layer-network.png?raw=1" width="70%" />

This graph defines a data flow based on mathematical expressions. More specifically, TensorFlow uses a directed graph where each node represents an operation or a variable.

One of the main advantages of using a computational graph is that operations are defined as relationships or dependencies, which allows computations to be easily simplified and parallelized. This is much more practical compared to a conventional program where the operations are executed sequentially.
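
The automatic differentiation mentioned above can be illustrated with `tf.GradientTape`, which records operations so that gradients can be computed afterwards. The following is only a minimal sketch: the function `y = x**2 + 2x` and the starting value are illustrative assumptions, not part of the original notebook.

```python
import tensorflow as tf

# Illustrative scalar variable; the value 3.0 is an arbitrary assumption.
x = tf.Variable(3.0)

# Operations executed inside the tape context are recorded for differentiation.
with tf.GradientTape() as tape:
    y = x ** 2 + 2.0 * x

# Analytically, dy/dx = 2x + 2, which equals 8 at x = 3.
grad = tape.gradient(y, x)
print(grad.numpy())
```

The printed gradient should match the analytical derivative evaluated at `x = 3`.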

## Tensors

The main data structure used in TensorFlow is the tensor. Tensors are multidimensional arrays that store data. They can be viewed as a generalization of scalars (0D-tensor), vectors (1D-tensor), and matrices (2D-tensor). Let's see some examples of tensors of different orders:

In [4]:
# we define a constant 1D-tensor (vector) from a constant list
t = tf.constant([2, 3, 4, 5], dtype=tf.int32)
print(t)

tf.Tensor([2 3 4 5], shape=(4,), dtype=int32)


A tensor has two basic properties: 
* shape (`shape`)
* type (`dtype`) 

On the one hand, the `shape`, as in `numpy`, indicates the order (the number of dimensions) and the size of each dimension. In the previous example we have a tensor of order 1, that is, a single dimension, of size 4.
On the other hand, as in any programming language, tensors have an internal representation type: `tf.int32`, `tf.float32`, `tf.string`, among others. Choosing an appropriate data type can make the code more efficient in memory and computation. In the example above, the type of the tensor is 32-bit integer.
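
Both properties can be read directly from a tensor, and the type can be changed with `tf.cast`. This is a small sketch that reuses the 1D tensor from the previous cell; the name `t_float` is introduced here only for illustration.

```python
import tensorflow as tf

t = tf.constant([2, 3, 4, 5], dtype=tf.int32)

print(t.shape)             # size of each dimension
print(t.dtype)             # internal representation type
print(tf.rank(t).numpy())  # order (number of dimensions)

# tf.cast returns a new tensor with the requested type; t itself is unchanged.
t_float = tf.cast(t, tf.float32)
print(t_float.dtype)
```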

The following example corresponds to a tensor of **order** 2, a matrix, whose **type** is a 32-bit float.

In [5]:
# we define a constant 2D-tensor (matrix) from a nested list
t = tf.constant([[9, 5], [1, 0]], dtype=tf.float32)
print(t)

tf.Tensor(
[[9. 5.]
 [1. 0.]], shape=(2, 2), dtype=float32)


In TensorFlow there are two main types of tensors:

* ```tf.constant```: these are immutable multidimensional arrays, that is, they are tensors that will not change during execution.
* ```tf.Variable```: these are tensors whose values can change during execution (for example, the parameters of a model are defined as variables, since these values are updated iteratively).

Let's see an example of variables in TensorFlow:

In [6]:
# we define a 2D-tensor (matrix) variable from a list
t = tf.Variable([[1, 2], [3, 4]], dtype=tf.float32)
print(t)

<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[1., 2.],
       [3., 4.]], dtype=float32)>


In [7]:
# we can assign it a new value
t.assign([[-2, -1], [-3, -7]])
print(t)

<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[-2., -1.],
       [-3., -7.]], dtype=float32)>


In [8]:
# or we can add or subtract a value
t.assign_add([[1, 1], [1, 1]])
print(t)
t.assign_sub([[2, 2], [2, 2]])
print(t)

<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[-1.,  0.],
       [-2., -6.]], dtype=float32)>
<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[-3., -2.],
       [-4., -8.]], dtype=float32)>
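
In-place updates such as `assign_sub` are what make variables suitable for model parameters that are updated iteratively. The sketch below is a hypothetical one-parameter example (minimizing `(w - 2)**2` with plain gradient descent); the loss, learning rate, and number of steps are assumptions chosen for illustration.

```python
import tensorflow as tf

# Hypothetical one-parameter "model": minimize (w - 2)^2 starting from w = 5.
w = tf.Variable(5.0)
learning_rate = 0.1

for _ in range(50):
    # Record the loss computation so its gradient can be obtained.
    with tf.GradientTape() as tape:
        loss = (w - 2.0) ** 2
    grad = tape.gradient(loss, w)
    # Update the variable in place: w <- w - learning_rate * grad.
    w.assign_sub(learning_rate * grad)

print(w.numpy())  # approaches the minimizer at 2.0
```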


We can perform various operations and define functions on tensors. TensorFlow also provides *slicing* similar to that of NumPy arrays. Let's look at some examples:

In [9]:
# we define a 2D-tensor A
A=tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
# we define a 2D-tensor B
B=tf.constant([[-1, -2], [-3, -4]], dtype=tf.float32)

In [10]:
# sum
A + B

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[0., 0.],
       [0., 0.]], dtype=float32)>

In [11]:
# subtraction
A - B

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[2., 4.],
       [6., 8.]], dtype=float32)>

In [12]:
# scalar multiplication (in Python)
3 * A

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ 3.,  6.],
       [ 9., 12.]], dtype=float32)>

In [13]:
# Element-wise multiplication
A * B

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ -1.,  -4.],
       [ -9., -16.]], dtype=float32)>

In [14]:
# Matrix multiplication
print(tf.matmul(A, B))

tf.Tensor(
[[ -7. -10.]
 [-15. -22.]], shape=(2, 2), dtype=float32)


In [15]:
# Slicing examples
print('Original tensor:\n {}'.format(A))
print('First row:\n {}'.format(A[0]))
print('First element of the first row: \n {}'.format(A[0, 0]))
print('Second column:\n {}'.format(A[:, 1]))
print('Inverted rows:\n {}'.format(A[::-1]))

Original tensor:
 [[1. 2.]
 [3. 4.]]
First row:
 [1. 2.]
First element of the first row: 
 1.0
Second column:
 [2. 4.]
Inverted rows:
 [[3. 4.]
 [1. 2.]]


We can also apply different mathematical functions to all elements of a tensor:

In [16]:
# logarithm 
tf.math.log(A)

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[0.       , 0.6931472],
       [1.0986123, 1.3862944]], dtype=float32)>

Other arithmetic operations and mathematical functions can be found in the ```tf.math``` package, and linear algebra operations in the ```tf.linalg``` package.
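
As a brief sketch of what these packages offer, the following applies a few of their functions to the matrix `A` defined earlier. The selection of functions is illustrative only, and `A_inv` and `product` are names introduced here.

```python
import tensorflow as tf

A = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)

# Element-wise square root from tf.math
print(tf.math.sqrt(A))

# Mean over all elements: (1 + 2 + 3 + 4) / 4 = 2.5
print(tf.reduce_mean(A).numpy())

# Determinant: 1 * 4 - 2 * 3 = -2 (up to floating-point error)
print(tf.linalg.det(A).numpy())

# Inverse; multiplying it by A gives approximately the identity matrix
A_inv = tf.linalg.inv(A)
product = tf.matmul(A, A_inv)
print(product)
```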

## Eager execution

*TensorFlow* provides an imperative programming environment (*Eager execution*) that evaluates operations immediately, without requiring the user to explicitly specify a graph. That is, the results of operations are concrete values instead of symbolic variables within the computational graph. The graph can still be built automatically in cases where it is required. This makes it easier to get started with *TensorFlow* and to debug models. Additionally, *Eager execution* supports most of the features of *TensorFlow*, including GPU acceleration.

*Eager execution* is a flexible platform for research and experimentation that provides:

* **Intuitive interface**: Allows you to develop code naturally and use Python data structures. It also allows rapid development of applications in cases with small models and little data.

* **Simple debugging**: executing the operations directly allows you to review the models in detail during execution and evaluate changes. In addition, it uses native Python debugging tools to report bugs immediately.

* **Natural control flow**: Using Python control flow instead of graph control flow simplifies the specification of more dynamic models.

Since version 2.0, *TensorFlow* enables *Eager execution* by default.

In [17]:
# We check tf version
tf.__version__

'2.8.0'

In [18]:
# We check if eager execution is active
tf.executing_eagerly()

True
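
Because eager operations return concrete values, ordinary Python control flow can branch on them. The helper `count_halvings` below is a hypothetical function written only to illustrate this point.

```python
import tensorflow as tf

def count_halvings(x):
    """Count how many times x must be halved before it drops below 1."""
    steps = 0
    # In eager mode the comparison yields a concrete boolean tensor,
    # so a plain Python while loop can evaluate it directly.
    while x >= 1.0:
        x = x / 2.0
        steps += 1
    return steps

print(count_halvings(tf.constant(20.0)))  # 20 -> 10 -> 5 -> 2.5 -> 1.25 -> 0.625: 5 steps
```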

By default, *Eager execution* runs operations one at a time as they are called; it does not build a computational graph unless an operation requires it or we request it explicitly. To have TensorFlow build the graph, we can use the ```tf.function``` decorator as shown below:

In [19]:
# We define a decorated function (it internally builds the computational graph)
@tf.function
def poly(x):
    y1 = 2 * x + 3 * x ** 4 + 5 * x ** 2
    y2 = 5 * x - 2 * x ** 4 - 3 * x ** 2
    y3 = x + 2
    return y1 + y2 + y3

In [20]:
# We define a normal function in Python (it executes the operations sequentially)
def poly2(x):
    y1 = 2 * x + 3 * x ** 4 + 5 * x ** 2
    y2 = 5 * x - 2 * x ** 4 - 3 * x ** 2
    y3 = x + 2
    return y1 + y2 + y3

Now, let's compare the average execution time of these two functions:

In [21]:
%%timeit -n 1000
poly(tf.constant([1, 2, 3, 4], dtype=tf.float32))

343 µs ± 106 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [22]:
%%timeit -n 1000
poly2(tf.constant([1, 2, 3, 4], dtype=tf.float32))

816 µs ± 103 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
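
Outside a notebook, a similar comparison can be sketched with the standard `timeit` module. The first call to a `tf.function` traces the Python code to build the graph, so this sketch performs one warm-up call before timing. The names `poly_py` and `poly_graph` are introduced here for illustration, and the measured times will vary by machine and TensorFlow version.

```python
import timeit
import tensorflow as tf

def poly_py(x):
    y1 = 2 * x + 3 * x ** 4 + 5 * x ** 2
    y2 = 5 * x - 2 * x ** 4 - 3 * x ** 2
    y3 = x + 2
    return y1 + y2 + y3

# Wrapping the same Python function is equivalent to decorating it.
poly_graph = tf.function(poly_py)

x = tf.constant([1, 2, 3, 4], dtype=tf.float32)
poly_graph(x)  # warm-up call: triggers tracing, which is excluded from the timing

# timeit.timeit returns the total time in seconds for `number` calls.
eager_s = timeit.timeit(lambda: poly_py(x), number=1000)
graph_s = timeit.timeit(lambda: poly_graph(x), number=1000)
print(f"eager: {eager_s * 1e3:.1f} ms, graph: {graph_s * 1e3:.1f} ms (1000 calls each)")
```

Both versions compute the same values; only the execution strategy differs.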


## NumPy integration

One of the main advantages of TensorFlow 2.0 is its support for NumPy arrays and operations. NumPy is the most widely used array and linear algebra library in Python.

Let's see some examples with NumPy and TensorFlow:

In [23]:
# We define an array in numpy, the linspace function creates a sequence of 'num' numbers
# equally spaced between two limits 'start' and 'stop'
x = np.linspace(start=0, stop=1, num=10)
x

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])

In [24]:
# We perform some operations in Tensorflow
acum = tf.reduce_sum(x)
acum

<tf.Tensor: shape=(), dtype=float64, numpy=5.0>

In [25]:
# We define a tensor in TensorFlow
x = tf.linspace(0.0, 1.0, 10)
x

<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([0.        , 0.11111111, 0.22222222, 0.33333334, 0.44444445,
       0.5555556 , 0.6666667 , 0.7777778 , 0.8888889 , 1.        ],
      dtype=float32)>

In [26]:
# Now, we perform some operations in numpy
acum = np.sum(x)
acum

5.0

In [27]:
# We can also convert a numpy array to a tensor
x = np.linspace(0, 1, 10)
t = tf.constant(x)
t

<tf.Tensor: shape=(10,), dtype=float64, numpy=
array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])>

In [28]:
# Also, a tensor into a numpy array
t = tf.linspace(0.0,1.0,10)
x = t.numpy()
x

array([0.        , 0.11111111, 0.22222222, 0.33333334, 0.44444445,
       0.5555556 , 0.6666667 , 0.7777778 , 0.8888889 , 1.        ],
      dtype=float32)
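
Note in the outputs above that NumPy produces `float64` values by default, while `tf.linspace(0.0, 1.0, 10)` produces `float32`. TensorFlow generally does not promote types implicitly the way NumPy does, so an explicit cast is a safe way to combine the two. The sketch below assumes this setup; the names `x64`, `t32`, and `x32` are illustrative.

```python
import numpy as np
import tensorflow as tf

x64 = np.linspace(0, 1, 10)      # NumPy array, float64 by default
t32 = tf.linspace(0.0, 1.0, 10)  # TensorFlow tensor, float32

# Cast the NumPy data to float32 explicitly before mixing it with t32.
x32 = tf.cast(x64, tf.float32)

total = tf.reduce_sum(t32 + x32)
print(x32.dtype)
print(total.numpy())  # each sequence sums to 5, so the total is about 10
```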

# 2. Keras

Originally, Keras was a high-level *framework* written in Python that could run on different *deep learning* *backends* such as TensorFlow, CNTK, or Theano. Currently, it is a package within TensorFlow 2.0 that simplifies both the design and the training of *machine learning* models and neural networks. It also includes built-in and custom layers, models, optimizers, loss functions, and metrics.

```tf.keras``` is used for fast model creation and has three advantages:

* **User friendly**: Keras has a simple and consistent interface that has been optimized for typical use cases.
* **Modular**: Model building is based on connecting customizable blocks with few restrictions.
* **Easy extension**: It allows you to easily implement new modules using all TensorFlow features, which makes it easy to build new or state-of-the-art models.



## Layers

In Keras, `Layers` are the basic building blocks of neural networks. A layer can be understood as a simple input-output transformation. From the point of view of TensorFlow, a layer consists of a tensor-in tensor-out computation function and some state, held in TensorFlow variables (**the layer's weights**).

In [29]:
# For instance here's a linear projection layer that maps its inputs to a 16-dimensional feature space
dense = tf.keras.layers.Dense(units=16)

In [30]:
# We can also declare an input that receives an RGB image of any size
inputs = tf.keras.Input(shape=(None, None, 3))

Many layers have:

* Weights: they form a linear combination of the outputs that come from the previous layer.
* A non-linear activation: it introduces non-linearity into the model, so the network can represent more than linear functions.
* A bias term: equivalent to a weight on one extra input that is always set to 1.
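
These three components can be inspected on a `Dense` layer. The sketch below is illustrative: the batch size (4), the number of input features (8), and the choice of a ReLU activation are assumptions, not values from the original notebook.

```python
import tensorflow as tf

# A dense layer with 16 units and a non-linear (ReLU) activation.
dense = tf.keras.layers.Dense(units=16, activation="relu")

# Illustrative input: a batch of 4 samples with 8 features each.
x = tf.ones((4, 8))

# The layer creates its weights the first time it is called.
y = dense(x)

print(y.shape)             # (4, 16): one 16-dimensional output per sample
print(dense.kernel.shape)  # (8, 16): weights of the linear combination
print(dense.bias.shape)    # (16,): one bias term per unit
```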

## Most used Keras layers

Here you can see some of the most used layers from Keras:

| Layer                           | Data types               | How outputs from the previous layer are used |
|---------------------------------|--------------------------|----------------------------------------------|
| `InputLayer`                    | All                      | None                                         |
| `Embedding`                     | Categorical, text        | Categorical input to vector                  |
| `Dense`                         | All                      | Get fed to each neuron                       |
| `Dropout`                       | Most                     | Get fed to each neuron, with a probability   |
| `Conv1D` / `Conv2D`             | Text, time series, image | Adjacent values get combined                 |
| `MaxPooling1D` / `MaxPooling2D` | Text, time series, image | Take max of adjacent values                  |
| `RNN`                           | Text, time series        | Each "timestep" gets fed in order            |
| `LSTM`                          | Text, time series        | Each "timestep" gets fed in order            |
| `Bidirectional`                 | Text, time series        | Get passed on both forwards and backwards    |

Table adapted from [hergertarian.com](https://www.hergertarian.com/keras-layers-intro).
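
As a rough sketch of how several layers from the table can be stacked, the following builds a small, untrained text model with `tf.keras.Sequential`. The vocabulary size, layer sizes, and token ids are all illustrative assumptions.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # Maps each integer token id (assumed vocabulary of 1000) to an 8-dimensional vector.
    tf.keras.layers.Embedding(input_dim=1000, output_dim=8),
    # Reads the sequence forwards and backwards; the output has 2 * 4 = 8 features.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(4)),
    # Randomly drops features during training; inactive at inference time.
    tf.keras.layers.Dropout(0.5),
    # A single sigmoid output per sequence.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Illustrative batch of 2 sequences with 4 token ids each.
tokens = tf.constant([[1, 5, 9, 2], [7, 3, 0, 0]])
probs = model(tokens)
print(probs.shape)  # (2, 1)
```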

In our next notebook, we will use some of these layers to build a small neural network.