## TensorFlow:
TensorFlow is one of the most popular and widely adopted free and open-source deep learning libraries. It was developed and is maintained by Google, and it can be used for both research and production.

## **TensorFlow benefits:**
- Highly efficient
- Cross-platform (works on iOS, Android, Unix, Windows, in the cloud, in the browser, etc.)
- Calculates gradients automatically (this is truly useful for neural networks, where the analytical solution of gradients would be VERY tedious to derive).
- Deep integration with the Keras library (functional approach, as well as high-level wrapper)

# General notebook setup

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


# Hide warnings
import warnings
warnings.filterwarnings('ignore')

# Install TensorFlow 2.0

TensorFlow 2.x is a major change from TensorFlow 1.x. It is not backwards compatible, but the `tf_upgrade_v2` script that ships with TensorFlow can help convert TensorFlow 1.x code to 2.x.

The new version is designed to be more Pythonic. It is easier to debug models and to extract values during training, because you no longer have to build a graph and run it in a session as in TensorFlow 1.x.

TensorFlow 2.x supports eager execution by default, so you don't need a session to evaluate operations / tensors in order to extract their values.

In [None]:
#!pip install tensorflow
# or for GPU version:
# !pip install tensorflow-gpu

# Import TensorFlow

In [1]:
# Canonical way of importing TensorFlow
import tensorflow as tf

# If this doesn't work TensorFlow is not installed correctly

# TensorFlow 2.0
At the time of the last update of this notebook (Oct 22) we are still in the early days of TensorFlow 2.x: version 2.0.0 has just been released.

In [None]:
# Check tf version, oftentimes tensorflow is not backwards compatible
tf.__version__

# should be tensorflow 2

# Intro to TensorFlow
### Core components:

#### 1. Tensor
A Tensor in TensorFlow is an N-dimensional array (just like NumPy's `ndarray` object). Mathematically, tensors are multilinear maps from vector spaces to the real numbers; scalars, vectors and matrices are all tensors. Tensors represent the units of data in TensorFlow.

NumPy arrays or pandas DataFrames passed to TensorFlow functions are automatically converted into TensorFlow tensors.
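
As a small illustration of this automatic conversion, the sketch below passes a NumPy array straight to a TensorFlow function. The variable names are only illustrative, and `tf.convert_to_tensor` is shown as the explicit equivalent.

```python
import numpy as np
import tensorflow as tf

x = np.array([1.0, 2.0, 3.0])        # plain NumPy array (float64)

# The NumPy array is converted to a tensor before the op runs
y = tf.add(x, 1.0)

# The same conversion done explicitly; the NumPy dtype is kept
explicit = tf.convert_to_tensor(x)

print(type(y).__name__, y.dtype)
print(explicit.numpy())
```

The result `y` is a TensorFlow tensor, not a NumPy array; call `.numpy()` to go back.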

#### 2. Operations / Ops
TensorFlow operations (ops) are the units of computation (e.g. matrix multiplication, addition, etc.). In a computational graph the ops are the nodes, and the tensors flowing between them are the edges.

#### 3. Computation Graph
The computational graph is an optimized, compiled representation of the dataflow and the order of computations that is sent to an execution environment (for example during model training).

TensorFlow 2.x supports eager execution, but when we build a model and then train it, TensorFlow can compile the model and optimize the executions as a computational graph object. This is done by decorating a function with `@tf.function`.

This computational graph is then sent to a runtime environment (e.g. on a CPU or GPU) for execution, and the results are sent back to us. This makes TensorFlow computations highly distributable, and it also allows us to automatically evaluate all gradients in the computation nodes.

![](imgs/tf_graph.png)

We can confirm that eager execution is enabled:

In [3]:
tf.executing_eagerly() 

True

# 1. TensorFlow tensors

## 1.1 tf.constant

Constants are initialized directly, and eager execution lets us see their values without creating a session and running the tensor.

In [43]:
a = tf.constant(2, dtype=np.int64)
b = tf.constant(5, dtype=np.int64)

In [48]:
a = tf.constant(20, dtype=np.int64)

In [49]:
a # note the numpy value

<tf.Tensor: id=20, shape=(), dtype=int64, numpy=20>

In [50]:
type(a.numpy())

numpy.int64

The `.numpy()` method returns the value as a NumPy object: a NumPy array in general, or a NumPy scalar (here `numpy.int64`) for a 0-dimensional tensor.

In [51]:
# Eager evaluation of tensors
a.numpy()

20

### We can also perform operations on tensors

In [53]:
a * b

<tf.Tensor: id=22, shape=(), dtype=int64, numpy=100>

#### or the same with the equivalent TensorFlow function

In [None]:
tf.multiply(a, b)

In [None]:
t = tf.constant(np.arange(25).reshape(5, 5), name="mymat")

In [54]:
a_matrix = tf.constant([[1,2], [3,4]])
b_matrix = tf.constant([[5,6], [7,8]])
b_matrix

<tf.Tensor: id=24, shape=(2, 2), dtype=int32, numpy=
array([[5, 6],
       [7, 8]], dtype=int32)>

In [55]:
a_matrix

<tf.Tensor: id=23, shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]], dtype=int32)>

In [56]:
b_matrix

<tf.Tensor: id=24, shape=(2, 2), dtype=int32, numpy=
array([[5, 6],
       [7, 8]], dtype=int32)>

In [57]:
tf.matmul(a_matrix, b_matrix)

<tf.Tensor: id=25, shape=(2, 2), dtype=int32, numpy=
array([[19, 22],
       [43, 50]], dtype=int32)>
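
Since `*` was used on scalars above, it may help to contrast element-wise multiplication with the matrix product on the same two matrices. This is a minimal sketch; the names `elementwise` and `matrix_product` are just illustrative.

```python
import tensorflow as tf

a_matrix = tf.constant([[1, 2], [3, 4]])
b_matrix = tf.constant([[5, 6], [7, 8]])

# Element-wise product: each entry is multiplied with its counterpart
elementwise = a_matrix * b_matrix        # same as tf.multiply(a_matrix, b_matrix)

# Matrix product: rows of a_matrix times columns of b_matrix
matrix_product = a_matrix @ b_matrix     # same as tf.matmul(a_matrix, b_matrix)

print(elementwise.numpy())
print(matrix_product.numpy())
```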

##### Note, we cannot reassign values of constants (like we can with Variables).

In [58]:
a.assign(8)

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

## 1.2 tf.Variable

Variables are mutable: they can be updated and reassigned new values. Variables usually hold the weights and biases of a model, which are optimized during training. They also indicate the degrees of freedom of the model, that is, which parameters can change and thus make the model flexible.

In [62]:
var = tf.Variable(3, dtype=np.int8)
var

<tf.Variable 'Variable:0' shape=() dtype=int8, numpy=3>

In [63]:
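# Reassign the value of a Variable.
# Note: 40000 does not fit in int8 (range -128..127), so the value shown below has wrapped around.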
var.assign(40000)
var

<tf.Variable 'Variable:0' shape=() dtype=int8, numpy=64>

In [68]:
v = np.array([127], dtype=np.int8)

In [70]:
v + 1

array([-128], dtype=int8)
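
The wrap-around seen in the two outputs above follows from the range of the 8-bit signed integer type. The sketch below uses only NumPy to show that range and to reproduce the value 64 obtained for 40000; the modular formula is the standard two's-complement wrap and is included here as an explanatory assumption about what happens under the hood.

```python
import numpy as np

# Range of an 8-bit signed integer
info = np.iinfo(np.int8)
print(info.min, info.max)   # -128 127

# Casting a value outside that range wraps around modulo 256
wrapped = np.array([40000]).astype(np.int8)
print(wrapped)              # [64]

# The same result computed by hand (two's-complement wrap)
expected = ((40000 + 128) % 256) - 128
print(expected)             # 64
```

Choosing a wider dtype such as `np.int32` avoids the overflow.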

In [None]:
var.numpy()

In [71]:
# We can also create multi-dimensional Variables.
d = tf.Variable(np.random.randn(3, 1))
# The data type (float64 here) is inferred from the NumPy array
d

<tf.Variable 'Variable:0' shape=(3, 1) dtype=float64, numpy=
array([[0.99174327],
       [0.42845568],
       [0.01088375]])>

In [77]:
k = tf.Variable(5)

In [78]:
j = k

In [79]:
j is k

True

In [81]:
k.assign(k+1)

<tf.Variable 'UnreadVariable' shape=() dtype=int32, numpy=6>

In [82]:
j is k

True

In [84]:
k

<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=6>

In [92]:
var = tf.Variable(10.0)

In [93]:
# inplace increase / decrease Variable values

print('original value:', var.numpy())
print('add 1:', var.assign_add(1).numpy())
print('subtract 5:', var.assign_sub(5).numpy())

original value: 10.0
add 1: 11.0
subtract 5: 6.0


In [94]:
var.numpy()

6.0
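
To connect `assign_sub` with the idea of a Variable being a trainable parameter, here is a minimal sketch of gradient descent on the toy loss $(w - 3)^2$. The loss, the learning rate and the number of steps are all assumptions chosen for illustration, and the gradient $2(w - 3)$ is derived by hand.

```python
import tensorflow as tf

w = tf.Variable(0.0)       # the "model parameter" we want to optimize
learning_rate = 0.1        # illustrative step size

for _ in range(50):
    grad = 2.0 * (w - 3.0)              # hand-derived gradient of (w - 3)^2
    w.assign_sub(learning_rate * grad)  # in-place update: w <- w - lr * grad

print(w.numpy())  # close to the minimizer w = 3
```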

### Variables also have a lot of attributes associated with them:

In [104]:
v = tf.Variable([[3.,3.2], [1.2,2.2]], dtype=tf.float32, name='my_variable')

print('name  : ', v.name)
print('type  : ', v.dtype)
print('shape : ', v.shape)
print('device: ', v.device)

name  :  my_variable:0
type  :  <dtype: 'float32'>
shape :  (2, 2)
device:  /job:localhost/replica:0/task:0/device:CPU:0


In [98]:
mat = tf.Variable(np.random.randint(0, 100, (10, 10)))

In [103]:
tf.reduce_sum(mat, axis=1)

<tf.Tensor: id=151, shape=(10,), dtype=int64, numpy=array([429, 518, 443, 462, 307, 475, 487, 561, 584, 629])>

<div class='alert alert-info'><b>Note</b>: TensorFlow is really similar to NumPy, and you can think of tensors as n-dimensional arrays.</div>


![tf_to_np](imgs/tf_to_np.png)
Source: CS227d, NLP, Stanford
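
As a quick side-by-side of the similarity, the sketch below runs the same row-wise sum in NumPy and in TensorFlow. The array `data` is an arbitrary example.

```python
import numpy as np
import tensorflow as tf

data = np.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]

np_sum = np.sum(data, axis=1)            # NumPy: sum over each row
tf_sum = tf.reduce_sum(data, axis=1)     # TensorFlow counterpart

print(np_sum)
print(tf_sum.numpy())
```

Both calls take the same `axis` argument; the main visible difference is the function name and the returned type.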

# 2. Operations / Ops
Operations can be carried out directly or assigned to variables.

<tf.Tensor: id=18, shape=(), dtype=int64, numpy=5>

In [108]:
op1 = tf.add(a,b)
op1

<tf.Tensor: id=160, shape=(), dtype=int64, numpy=25>

In [109]:
a+b # same as tf.add

<tf.Tensor: id=161, shape=(), dtype=int64, numpy=25>

In [110]:
v = a+b
u = v+2
w = v*u
z = w*3
z

<tf.Tensor: id=167, shape=(), dtype=int64, numpy=2025>

In [111]:
np.add(3, 5)

8

## Look at the computational graph with @tf.function

`@tf.function` is a very useful decorator that converts a plain Python function into an optimized computational graph. When we build and then train a model, TensorFlow can use it to compile the model and optimize the executions.

In [114]:
@tf.function
def func(a, b):
    z = tf.multiply(a, b, name='z')
    y1 = tf.constant(3, name='3')
    y2 = tf.constant(4)
    w1 = tf.add(z, y1, name='w1')
    w2 = tf.add(z, y2, name='w2')

    return w1 + w2

In [116]:
p = tf.Variable(10)
q = tf.Variable(20)
func(p, q)

<tf.Tensor: id=202, shape=(), dtype=int32, numpy=407>

In [None]:
@tf.function
def func(a, b):
    with tf.name_scope('first'):
        z = tf.multiply(a, b, name='z')
    with tf.name_scope('second'):
        y1 = tf.constant(3, name='3')
        y2 = tf.constant(4)
        w1 = tf.add(z, y1, name='w1')
        w2 = tf.add(z, y2, name='w2')

    return w1 + w2

In [None]:
a = tf.constant(3)
b = tf.constant(4)
# Pass the tensors defined above (Python ints would trigger a new trace per value)
func(a, b)
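
To actually look at the graph that `@tf.function` builds, one option is to trace a concrete function and list its operations. The sketch below is one way to do this under stated assumptions: `scaled_sum` is a hypothetical function made up for the example, and the input signature (two scalar `int32` tensors) is chosen for illustration.

```python
import tensorflow as tf

@tf.function
def scaled_sum(a, b):
    z = tf.multiply(a, b, name="z")
    return tf.add(z, tf.constant(3), name="w")

# Trace the function for scalar int32 inputs to obtain a concrete graph
concrete = scaled_sum.get_concrete_function(
    tf.TensorSpec(shape=(), dtype=tf.int32),
    tf.TensorSpec(shape=(), dtype=tf.int32),
)

# Each node of the graph is an operation with a name and a type
for graph_op in concrete.graph.get_operations():
    print(graph_op.name, graph_op.type)

op_types = [graph_op.type for graph_op in concrete.graph.get_operations()]

# Calling the decorated function runs the traced graph
result = scaled_sum(tf.constant(5), tf.constant(6))
print(result.numpy())  # 5 * 6 + 3 = 33
```

The exact list of operations can differ between TensorFlow versions, but the multiplication and the inputs should appear as graph nodes.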

# Calculate gradients

Gradient evaluation is very important in machine learning, because training is based on function optimization. You can use the `tf.GradientTape()` context manager to record operations and then compute the gradient of an arbitrary function.

In [None]:
def op(w):
    # f(w) = (w^2 + 5)^2
    k = tf.constant(5, dtype=tf.float32)
    square_w = (w * w) + k
    another = square_w * square_w
    return another

# Evaluate the gradient df/dw at w = 0, 1, ..., 9
for i in range(10):
    w = tf.Variable(i, dtype=tf.float32)
    # Record the operations applied to w
    with tf.GradientTape() as tape:
        another = op(w)
    # Query the tape once the forward pass is done
    grad = tape.gradient(another, w)
    print(grad.numpy())
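
As a sanity check, the recorded gradient can be compared with the hand-derived one. For $f(w) = (w^2 + 5)^2$ the chain rule gives $f'(w) = 4w(w^2 + 5)$. The sketch below evaluates both at $w = 2$, a point chosen only for illustration.

```python
import tensorflow as tf

def op(w):
    # f(w) = (w^2 + 5)^2
    k = tf.constant(5, dtype=tf.float32)
    square_w = (w * w) + k
    return square_w * square_w

w = tf.Variable(2.0)

with tf.GradientTape() as tape:
    value = op(w)

auto_grad = tape.gradient(value, w)        # gradient from automatic differentiation
analytic_grad = 4.0 * w * (w * w + 5.0)    # chain rule: 4w(w^2 + 5)

print(auto_grad.numpy(), analytic_grad.numpy())  # both 72.0
```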

### Gradient of the Sigmoid function
In this example we evaluate the gradient of the sigmoid function 

$$\sigma(x) = \frac{1}{1+e^{-x}}$$

Note that 

$$\sigma'(x) = \frac{e^{-x}}{(1+e^{-x})^2} = \sigma(x)(1-\sigma(x)) $$

For instance 

$$\sigma'(0) = \sigma(0)(1-\sigma(0)) = \frac{1}{2}\left(1-\frac{1}{2} \right) = \frac{1}{4}$$

In [None]:
def sigmoid(x):
    return 1/(1 + tf.exp(-x))

In [None]:
# Define a variable
x = tf.Variable(0.)

# Record the gradient
with tf.GradientTape() as tape:
    sig = sigmoid(x)
    
res = tape.gradient(sig, x).numpy()
print('The gradient of the sigmoid function at 0.0 is ', res)
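
The identity $\sigma'(x) = \sigma(x)(1-\sigma(x))$ can also be checked at several points at once. The sketch below is a minimal check under assumptions: the three input values are arbitrary, and because the sigmoid acts element-wise, the gradient of the vector output with respect to `x` gives the derivative at each entry.

```python
import tensorflow as tf

def sigmoid(x):
    return 1 / (1 + tf.exp(-x))

# Arbitrary evaluation points, including x = 0
x = tf.Variable([-2.0, 0.0, 3.0])

with tf.GradientTape() as tape:
    sig = sigmoid(x)

# Element-wise function, so this holds sigma'(x_i) for each entry
auto_grad = tape.gradient(sig, x)

# Closed form: sigma(x) * (1 - sigma(x))
closed_form = sig * (1 - sig)

print(auto_grad.numpy())
print(closed_form.numpy())
```

The middle entry should be 0.25, matching the value of $\sigma'(0)$ derived above.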