
### TensorFlow is not limited to building neural networks. Behind the scenes, it is a tensor library with **automatic differentiation** capability, so you can also use it to solve numerical optimization problems with gradient descent.

### Overview
- Autograd in TensorFlow

- Using Autograd to Solve a Math Puzzle

## Autograd in TensorFlow

In [1]:
### Create a constant tensor (here a 1-D vector) ###

import tensorflow as tf

x = tf.constant([1, 2, 3])
print(x)
print(x.shape)
print(x.dtype)

tf.Tensor([1 2 3], shape=(3,), dtype=int32)
(3,)
<dtype: 'int32'>


In [2]:
### Creating variables in TensorFlow ###
import tensorflow as tf

x = tf.Variable([1, 2, 3])
print(x)
print(x.shape)
print(x.dtype)

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
(3,)
<dtype: 'int32'>
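
To make the mutability of the variable created above concrete, here is a minimal sketch. The names `v`, `c`, and `c_plus_one` are illustrative and not part of the original notebook. A variable can be updated in place with `assign` and `assign_add`, whereas arithmetic on a constant returns a new tensor and leaves the original unchanged.

```python
import tensorflow as tf

v = tf.Variable([1, 2, 3])
v.assign([4, 5, 6])        # overwrite the stored values in place
v.assign_add([1, 1, 1])    # increment the stored values in place
print(v.numpy())           # [5 6 7]

c = tf.constant([1, 2, 3])
c_plus_one = c + 1         # builds a new tensor; c itself is unchanged
print(c.numpy(), c_plus_one.numpy())
```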


The key **difference between variables and constants** is that a variable's value can be changed while a constant is immutable. This distinction matters when you run a **gradient tape**, because the tape automatically tracks trainable variables but not constants:

In [3]:
x = tf.Variable(3.6)

with tf.GradientTape() as tape:
    y = x*x

dy = tape.gradient(y, x)
print(dy)

tf.Tensor(7.2, shape=(), dtype=float32)
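
For contrast, the sketch below repeats the same computation with a constant instead of a variable. The names `c`, `grad_unwatched`, and `grad_watched` are illustrative. By default the tape does not track constants, so the first gradient should come back as `None`. Calling `tape.watch()` asks the tape to track the constant explicitly.

```python
import tensorflow as tf

c = tf.constant(3.6)

# Not watched: the tape records nothing for c, so no gradient is available.
with tf.GradientTape() as tape:
    y = c * c
grad_unwatched = tape.gradient(y, c)

# Explicitly watched: the tape now records operations that involve c.
with tf.GradientTape() as tape:
    tape.watch(c)
    y = c * c
grad_watched = tape.gradient(y, c)

print(grad_unwatched)
print(grad_watched)
```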


## Using Autograd to Solve a Math Puzzle


You can also use gradient descent to solve some math puzzles. For example, consider the following problem:

```
 [ A ]  +  [ B ]  =  9
   +         -
 [ C ]  -  [ D ]  =  1
   =         =
   8         2

```

In other words, find the values of A, B, C, and D such that:

- A + B = 9
- C - D = 1
- A + C = 8
- B - D = 2
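
As a point of comparison, the four equations form a small linear system, so they can be inspected directly with NumPy. This sketch is an addition, not part of the original notebook, and the names `M`, `b`, `rank`, and `x` are illustrative. It suggests the system has rank 3: the equations are consistent but not independent, so there is a whole family of solutions, which can be written as A = t, B = 9 - t, C = 8 - t, D = 7 - t.

```python
import numpy as np

# Rows follow the four equations; columns are the unknowns A, B, C, D.
M = np.array([
    [1, 1, 0,  0],   # A + B = 9
    [0, 0, 1, -1],   # C - D = 1
    [1, 0, 1,  0],   # A + C = 8
    [0, 1, 0, -1],   # B - D = 2
], dtype=float)
b = np.array([9, 1, 8, 2], dtype=float)

# Rank below 4 means the equations do not pin down a unique solution.
rank = np.linalg.matrix_rank(M)

# lstsq returns the minimum-norm solution among all exact solutions.
x, _, _, _ = np.linalg.lstsq(M, b, rcond=None)

print(rank)
print(x, np.allclose(M @ x, b))
```

Because the solution is not unique, an iterative method started from random values may settle on any member of this family.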

#### This can also be solved using autograd, as follows:

In [6]:
import tensorflow as tf
import random

A = tf.Variable(random.random())
B = tf.Variable(random.random())
C = tf.Variable(random.random())
D = tf.Variable(random.random())

# Gradient descent loop
EPOCHS = 1000

optimizer = tf.keras.optimizers.Nadam(learning_rate=0.1)

for _ in range(EPOCHS):
    with tf.GradientTape() as tape:
        y1 = A + B - 9
        y2 = C - D - 1
        y3 = A + C - 8
        y4 = B - D - 2
        sqerr = y1*y1 + y2*y2 + y3*y3 + y4*y4
    gradA, gradB, gradC, gradD = tape.gradient(sqerr, [A, B, C, D])

    optimizer.apply_gradients([(gradA, A), (gradB, B), (gradC, C), (gradD, D)])

print(A)
print(B)
print(C)
print(D)



<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=4.7298384>
<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=4.2701626>
<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=3.2701623>
<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=2.2701626>


- The above code defines the four unknowns as variables with random initial values.

- Then you compute the result of the four equations and compare it to the expected answer.

- You then sum up the squared error and ask TensorFlow to minimize it.

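- The minimum possible squared error is zero, attained when the solution fits the problem exactly. Because the four equations are not independent (subtracting the fourth from the first and the second from the third both give A + D = 7), many solutions reach zero error, and which one you get depends on the random initial values.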

- Note how the gradient tape is asked for the gradients: you request the gradient of sqerr with respect to A, B, C, and D in a single call, so four gradients are returned. You then apply each gradient to its variable in every iteration. A single call is used instead of four separate calls to tape.gradient() because, by default, a gradient tape is not persistent and its gradient() method can be called only once.
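
The last point can be sketched as follows, using one squared-error term and fixed starting values for brevity. The flag `second_call_failed` and the name `ptape` are illustrative. On a default tape, a second call to `gradient()` is expected to raise a `RuntimeError`. Passing `persistent=True` allows repeated calls, at the cost of holding the recorded operations until the tape is deleted.

```python
import tensorflow as tf

A = tf.Variable(1.0)
B = tf.Variable(2.0)

# Default (non-persistent) tape: resources are released after one gradient() call.
with tf.GradientTape() as tape:
    err = A + B - 9
    sqerr = err * err
gradA = tape.gradient(sqerr, A)
try:
    tape.gradient(sqerr, B)
    second_call_failed = False
except RuntimeError:
    second_call_failed = True

# Persistent tape: gradient() may be called more than once.
with tf.GradientTape(persistent=True) as ptape:
    err = A + B - 9
    sqerr = err * err
gradA_p = ptape.gradient(sqerr, A)
gradB_p = ptape.gradient(sqerr, B)
del ptape  # release the tape's resources once finished

print(second_call_failed)
print(gradA_p.numpy(), gradB_p.numpy())
```

Requesting all gradients in one call, as the puzzle code does, avoids the need for a persistent tape.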

