
# TensorFlow - Computation Graph and Gradients
---



In [0]:
import tensorflow as tf

## Creating a computation graph
TensorFlow builds computation graphs to perform computations and to track how tensors depend on one another, from the inputs all the way to the outputs. A computation graph is a network of nodes, where each node represents a tensor or an operation; an operation applies a function to its input tensor(s) and returns zero or more output tensors. TensorFlow uses this computation graph to compute gradients.
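
To make the picture of nodes and operations concrete, here is a minimal pure-Python sketch of a computation graph for `z = 2 * (a - b) + c`. It is only an illustration and not how TensorFlow is implemented; the `Node` class and the `evaluate` function are hypothetical names introduced for this example.

```python
# Toy computation graph (illustrative only; not TensorFlow's implementation).
class Node:
  """A graph node: either a constant value or an operation on input nodes."""

  def __init__(self, op=None, inputs=(), value=None):
    self.op = op          # callable applied to the evaluated inputs, or None
    self.inputs = inputs  # upstream nodes feeding this node
    self.value = value    # stored value for constant nodes


def evaluate(node):
  """Recursively evaluate a node by first evaluating its inputs."""
  if node.op is None:
    return node.value
  return node.op(*(evaluate(n) for n in node.inputs))


a, b, c = Node(value=1), Node(value=2), Node(value=3)
two = Node(value=2)
r1 = Node(op=lambda x, y: x - y, inputs=(a, b))     # a - b
r2 = Node(op=lambda x, y: x * y, inputs=(two, r1))  # 2 * (a - b)
z = Node(op=lambda x, y: x + y, inputs=(r2, c))     # 2 * (a - b) + c

result = evaluate(z)
print('z =', result)
```

Nothing is computed while the nodes are being connected; the arithmetic only happens when `evaluate` walks the graph from the output node back to the constants.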




### TensorFlow v1.x
TensorFlow v1.x's low-level API uses **static computation graphs**, which must be explicitly defined, although this is not trivial for large and complex models. The steps for building, compiling and evaluating a graph are:

1. Instantiate an empty graph using `tf.Graph()`
2. Add nodes to the computation graph inside a `with g.as_default():` block
3. Evaluate the graph: \
     3.1 Start a new session using `tf.compat.v1.Session()` \
     3.2 Initialize the variables using `tf.compat.v1.global_variables_initializer()` \
     3.3 Execute the operations using `sess.run()`

In [0]:
# Example with only constants
g = tf.Graph()

with g.as_default():
  a = tf.constant(1, name='a')
  b = tf.constant(2, name='b')
  c = tf.constant(3, name='c')
  z = 2 * (a - b) + c

with tf.compat.v1.Session(graph=g) as sess:
  print('z =', sess.run(z))

z = 1


In [0]:
# Example with only placeholders
g = tf.Graph()

with g.as_default():
  a = tf.compat.v1.placeholder(shape=None, dtype=tf.int32, name='tf_a')
  b = tf.compat.v1.placeholder(shape=None, dtype=tf.int32, name='tf_b')
  c = tf.compat.v1.placeholder(shape=None, dtype=tf.int32, name='tf_c')
  z = 2 * (a - b) + c

with tf.compat.v1.Session(graph=g) as sess:
  print('z =', sess.run(z, feed_dict = {a: 1, b: 2, c: 3}))

z = 1


In [0]:
# Example with placeholders and variables
g = tf.Graph()

with g.as_default():
  x = tf.compat.v1.placeholder(shape=None, dtype=tf.float32, name='x')
  w = tf.Variable(2.0, name='weight')
  b = tf.Variable(0.7, name='bias')
  z = w * x + b

  init = tf.compat.v1.global_variables_initializer()

with tf.compat.v1.Session(graph=g) as sess:
  sess.run(init)
  for t in [1.0, 0.6, -1.8]:
    print('x = {:>5.2f} --> z = {:>5.2f}'.format(t, sess.run(z, feed_dict={x: t})))


x =  1.00 --> z =  2.70
x =  0.60 --> z =  1.90
x = -1.80 --> z = -2.90


### TensorFlow v2
TensorFlow v2 uses **dynamic computation graphs** by default (also called eager execution), which allow us to evaluate operations on the fly, without explicitly creating a graph or a session.

In [0]:
a = tf.constant(1, name='a')
b = tf.constant(2, name='b')
c = tf.constant(3, name='c')
z = 2 * (a - b) + c

print('z =', z.numpy())

z = 1


To **load input data** into a model, we can directly feed data in the form of Python variables or NumPy arrays, without the need to create a graph, placeholders and a session. 

In [0]:
def compute_z(a, b, c):
  z = tf.add(tf.multiply(2, tf.subtract(a, b)), c)
  return z

z = compute_z(1, 2, 3)
print('z =', z.numpy())

z = 1


Since dynamic graphs are not as computationally efficient as static ones, TensorFlow v2 provides a tool called AutoGraph, which converts Python **code into a static graph**. To compile a **function into a static graph**, decorate it with `@tf.function`; the function is traced and converted the first time it is called.

In [0]:
@tf.function
def compute_z(a, b, c):
  z = tf.add(tf.multiply(2, tf.subtract(a, b)), c)
  return z

z = compute_z(1, 2, 3)
print('z =', z.numpy())

z = 1
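
One way to observe that a decorated function is compiled into a graph is to count how often its Python body actually runs. The sketch below assumes tensor inputs and uses a hypothetical module-level counter, `trace_count`, that is not part of the original example. Plain Python statements in the body execute only while TensorFlow traces the function, so the counter increases only when a new graph is built.

```python
import tensorflow as tf

trace_count = 0  # hypothetical counter, used only to observe tracing


@tf.function
def compute_z(a, b, c):
  global trace_count
  trace_count += 1  # plain Python code: runs only while the function is traced
  return 2 * (a - b) + c


# First call with int32 scalars: the function is traced and a graph is built.
z1 = compute_z(tf.constant(1), tf.constant(2), tf.constant(3))

# Same dtypes and shapes: the existing graph is reused, no new trace.
z2 = compute_z(tf.constant(4), tf.constant(5), tf.constant(6))

# Different dtype (float32): a second graph is traced.
z3 = compute_z(tf.constant(1.0), tf.constant(2.0), tf.constant(3.0))

print('z1 =', z1.numpy(), '| z2 =', z2.numpy(), '| z3 =', z3.numpy())
print('number of traces:', trace_count)
```

Three calls lead to only two traces, because the second call matches the input signature of the first.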


**Limit the ways to call** a function by passing a tuple of `tf.TensorSpec` objects as the `input_signature` argument of `tf.function`.

In [0]:
@tf.function(input_signature=(tf.TensorSpec(shape=[None], dtype=tf.int32),
                              tf.TensorSpec(shape=[None], dtype=tf.int32),
                              tf.TensorSpec(shape=[None], dtype=tf.int32),))
def compute_z(a, b, c):
  z = tf.add(tf.multiply(2, tf.subtract(a, b)), c)
  return z

# Calling the function using tensors with rank 1 or lists that can be converted to rank 1 tensors:
tf.print('Rank 1 inputs:', compute_z([1], [2], [3]))
tf.print('Rank 1 inputs:', compute_z([1, 2], [2, 4], [3, 6]))

# Calling the function using tensors with ranks other than 1 will result in errors:
# tf.print('Rank 0 inputs:', compute_z(1, 2, 3))
# tf.print('Rank 2 inputs:', compute_z([[1], [2]], [[2], [4]], [[3], [6]]))

Rank 1 inputs: [1]
Rank 1 inputs: [1 2]
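
The rejected calls can also be handled explicitly instead of being commented out. The following sketch restates the function with a shared `spec` object (a name introduced here for brevity) and catches the error raised for a rank 0 call. The exact exception type appears to differ between TensorFlow versions, so both `ValueError` and `TypeError` are caught as a precaution.

```python
import tensorflow as tf

spec = tf.TensorSpec(shape=[None], dtype=tf.int32)  # rank 1, any length


@tf.function(input_signature=(spec, spec, spec))
def compute_z(a, b, c):
  return tf.add(tf.multiply(2, tf.subtract(a, b)), c)


# Rank 1 inputs match the signature.
ok = compute_z([1, 2], [2, 4], [3, 6]).numpy().tolist()
print('Rank 1 inputs:', ok)

rejected = False
try:
  compute_z(1, 2, 3)  # rank 0 does not match shape=[None]
except (ValueError, TypeError) as err:
  # The exception type is version-dependent, so both are caught here.
  rejected = True
  print('Rank 0 inputs rejected:', type(err).__name__)
```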


## Variable objects for storing and updating model parameters
A variable is a special tensor that allows us to store and update the parameters of our model during training.

**Create a variable** using `tf.Variable`

In [0]:
a = tf.Variable(initial_value=3, name='var_a')
print(a)

b = tf.Variable(initial_value=[1, 2, 3], name='var_b')
print(b)

c = tf.Variable(initial_value=[True, False], dtype=tf.bool)
print(c)

d = tf.Variable(initial_value=['abc'], dtype=tf.string)
print(d)

<tf.Variable 'var_a:0' shape=() dtype=int32, numpy=3>
<tf.Variable 'var_b:0' shape=(3,) dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
<tf.Variable 'Variable:0' shape=(2,) dtype=bool, numpy=array([ True, False])>
<tf.Variable 'Variable:0' shape=(1,) dtype=string, numpy=array([b'abc'], dtype=object)>


To use a **variable inside a decorated function**, define the variable outside of the decorated function; creating a new variable in the function body on every call raises a `ValueError`.

In [0]:
w = tf.Variable(tf.random.uniform((2, 3)))

@tf.function
def compute_z(x):
  return tf.matmul(w, x)

x = tf.constant([[1], [2], [3]], dtype=tf.float32)
print(compute_z(x).numpy())

[[3.2747576]
 [2.3064604]]
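
For contrast, the sketch below moves the variable definition into the function body. `compute_z_bad` is a hypothetical name used only here. Because the function would create a new variable each time it is traced, TensorFlow is expected to raise a `ValueError` as soon as the function is called.

```python
import tensorflow as tf


@tf.function
def compute_z_bad(x):
  # A new variable would be created on every trace of the function.
  w = tf.Variable(tf.random.uniform((2, 3)))
  return tf.matmul(w, x)


x = tf.constant([[1], [2], [3]], dtype=tf.float32)

raised = False
try:
  compute_z_bad(x)
except ValueError as err:
  raised = True
  print('ValueError:', str(err).splitlines()[0])
```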


**Specify trainable** and non-trainable variables using `trainable`.

In [0]:
a = tf.Variable(initial_value=3, trainable=False)
print('a.trainable:', a.trainable)

b = tf.Variable(initial_value=3, trainable=True)
print('b.trainable:', b.trainable)

a.trainable: False
b.trainable: True


**Modify the values** of a variable using `assign()`, `assign_add()`, and other methods.

In [0]:
a = tf.Variable(initial_value=1)
print('a before:', a.numpy())
a.assign(value=3, read_value=True)
print('a after:', a.numpy())

b = tf.Variable(initial_value=[1, 2, 3])
print('b before:', b.numpy())
b.assign_add(delta=[1, 1, 1], read_value=True)
print('b after:', b.numpy())

a before: 1
a after: 3
b before: [1 2 3]
b after: [2 3 4]


Initialize a variable with **random values** using `tf.keras.initializers`.

In [0]:
tf.random.set_seed(seed=1)
init = tf.keras.initializers.GlorotNormal()

print(init(shape=(3, )).numpy())

[-0.7227959   1.0145682   0.25180823]
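
The initializer by itself only returns a tensor. To obtain a variable that can be updated during training, its output can be wrapped in `tf.Variable`, as in the short sketch below; the shape `(2, 3)` and the name `'weight'` are arbitrary choices for illustration.

```python
import tensorflow as tf

tf.random.set_seed(seed=1)
init = tf.keras.initializers.GlorotNormal()

# Wrap the initializer's output in a variable so it can be updated later.
w = tf.Variable(init(shape=(2, 3)), name='weight')
print(w.shape, w.dtype, w.trainable)
```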


## Computing gradients via automatic differentiation
Optimizing neural networks via algorithms such as stochastic gradient descent requires computing the gradients of the cost with respect to the weights. TensorFlow supports automatic differentiation, a set of computational techniques for computing derivatives or gradients of arithmetic operations. During this process, gradients of nested functions (expressed as a series of operations) are obtained by accumulating the gradients through repeated applications of the chain rule.

Compute the **gradient with respect to a trainable variable** using `tf.GradientTape`.



In [0]:
w = tf.Variable(1.0, trainable=True)
b = tf.Variable(0.5, trainable=True)

x = tf.constant([1.4])
y = tf.constant([2.1])

with tf.GradientTape() as tape:
  z = tf.add(tf.multiply(w, x), b)
  loss = tf.reduce_sum(tf.square(tf.subtract(y, z)))

dloss_dw = tape.gradient(loss, w)
print('dL/dw =', dloss_dw.numpy())

dL/dw = -0.55999976
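
The value above can be checked by hand. With $z = wx + b$ and $L = (y - z)^2$, the chain rule gives $\partial L / \partial w = -2\,(y - z)\,x$. The following pure-Python sketch evaluates this expression for the same numbers and compares it with a central finite difference; the step size `eps` and the helper `loss_fn` are arbitrary choices made for this illustration.

```python
# Hand-derived gradient for loss = (y - (w * x + b)) ** 2, checked numerically.
w, b, x, y = 1.0, 0.5, 1.4, 2.1


def loss_fn(w, b):
  z = w * x + b
  return (y - z) ** 2


# Chain rule: dL/dw = dL/dz * dz/dw = -2 * (y - z) * x
z = w * x + b
analytic = -2.0 * (y - z) * x

# Central finite difference as an independent check.
eps = 1e-6
numeric = (loss_fn(w + eps, b) - loss_fn(w - eps, b)) / (2 * eps)

print('analytic dL/dw =', round(analytic, 6))
print('numeric  dL/dw =', round(numeric, 6))
```

Both values agree with the result from `tf.GradientTape` up to floating-point precision (TensorFlow computes in 32-bit floats here, hence the trailing digits in its output).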


Compute the **gradient with respect to a non-trainable tensor**, such as a constant input, using `tf.GradientTape` and `tape.watch`.

In [0]:
w = tf.Variable(1.0, trainable=True)
b = tf.Variable(0.5, trainable=True)

x = tf.constant([1.4])
y = tf.constant([2.1])

with tf.GradientTape() as tape:
  tape.watch(x)
  z = tf.add(tf.multiply(w, x), b)
  loss = tf.reduce_sum(tf.square(tf.subtract(y, z)))

dloss_dx = tape.gradient(loss, x)
print('dL/dx =', dloss_dx.numpy())

dL/dx = [-0.39999986]
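
To see why `tape.watch` is needed, the sketch below repeats the computation without it. The tape tracks trainable variables automatically but not constants, so the gradient with respect to `x` comes back as `None` instead of a tensor.

```python
import tensorflow as tf

w = tf.Variable(1.0, trainable=True)
b = tf.Variable(0.5, trainable=True)

x = tf.constant([1.4])
y = tf.constant([2.1])

with tf.GradientTape() as tape:
  # No tape.watch(x) here: the constant x is not tracked by the tape.
  z = tf.add(tf.multiply(w, x), b)
  loss = tf.reduce_sum(tf.square(tf.subtract(y, z)))

dloss_dx = tape.gradient(loss, x)
print('dL/dx =', dloss_dx)
```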


When we monitor computations with `tf.GradientTape`, by default the tape keeps the resources only for a single gradient computation: after calling `tape.gradient` once, the tape is cleared. 

To compute **more than one gradient** from the same tape, we need to make the tape persistent by passing `persistent=True`.



In [0]:
w = tf.Variable(1.0, trainable=True)
b = tf.Variable(0.5, trainable=True)

x = tf.constant([1.4])
y = tf.constant([2.1])

with tf.GradientTape(persistent=True) as tape:
  z = tf.add(tf.multiply(w, x), b)
  loss = tf.reduce_sum(tf.square(tf.subtract(y, z)))

dloss_dw = tape.gradient(loss, w)
print('dL/dw =', dloss_dw.numpy())

dloss_db = tape.gradient(loss, b)
print('dL/db =', dloss_db.numpy())

dL/dw = -0.55999976
dL/db = -0.39999986
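
As a possible alternative when all gradients are needed at once, a non-persistent tape can be given a list of sources in a single `tape.gradient` call. The sketch below does this and then shows that a second call on the same non-persistent tape fails with a `RuntimeError`; the flag `second_call_failed` is a hypothetical name used only to record that outcome.

```python
import tensorflow as tf

w = tf.Variable(1.0, trainable=True)
b = tf.Variable(0.5, trainable=True)

x = tf.constant([1.4])
y = tf.constant([2.1])

with tf.GradientTape() as tape:  # persistent=False by default
  z = tf.add(tf.multiply(w, x), b)
  loss = tf.reduce_sum(tf.square(tf.subtract(y, z)))

# One call with a list of sources returns one gradient per source.
dloss_dw, dloss_db = tape.gradient(loss, [w, b])
print('dL/dw =', dloss_dw.numpy())
print('dL/db =', dloss_db.numpy())

second_call_failed = False
try:
  tape.gradient(loss, w)  # the tape's resources were released by the first call
except RuntimeError:
  second_call_failed = True
print('second call failed:', second_call_failed)
```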


**Define an optimizer** from `tf.keras.optimizers` and apply the gradients with `apply_gradients` to update the model parameters.

In [0]:
w = tf.Variable(1.0, trainable=True)
b = tf.Variable(0.5, trainable=True)

x = tf.constant([1.4])
y = tf.constant([2.1])

with tf.GradientTape(persistent=True) as tape:
  z = tf.add(tf.multiply(w, x), b)
  loss = tf.reduce_sum(tf.square(tf.subtract(y, z)))

dloss_dw = tape.gradient(loss, w)
dloss_db = tape.gradient(loss, b)

optimizer = tf.keras.optimizers.SGD()
optimizer.apply_gradients(zip([dloss_dw, dloss_db], [w, b]))
print('Updated w:', w.numpy())
print('Updated b:', b.numpy())

Updated w: 1.0056
Updated b: 0.504
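
The updated values can be reproduced by hand. Plain stochastic gradient descent subtracts the gradient scaled by the learning rate, and `tf.keras.optimizers.SGD` uses a learning rate of 0.01 when none is given. The sketch below applies that rule in pure Python, using the rounded gradient values from the earlier cells.

```python
# Plain gradient descent step, computed by hand (no TensorFlow needed).
learning_rate = 0.01              # default learning rate of tf.keras.optimizers.SGD
w, b = 1.0, 0.5                   # initial parameters
dloss_dw, dloss_db = -0.56, -0.4  # gradients from the earlier cells (rounded)

# Update rule: parameter <- parameter - learning_rate * gradient
w_new = w - learning_rate * dloss_dw
b_new = b - learning_rate * dloss_db

print('Updated w:', round(w_new, 4))
print('Updated b:', round(b_new, 4))
```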
