# __Importing TensorFlow__

In [2]:
import tensorflow as tf

# __Introducing TensorFlow Variables__

A TensorFlow variable is the recommended way to represent shared, persistent state your program manipulates.

Variables are created and tracked via the `tf.Variable` class. A `tf.Variable` represents a tensor whose value can be changed by running ops on it. Specific ops allow you to read and modify the values of this tensor. Higher level libraries like `tf.keras` use `tf.Variable` to store model parameters.
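
As a minimal sketch of what "running ops on it" means, the cell below changes a variable's value in place with `assign`, `assign_add` and `assign_sub`. The variable `counter` and its values are hypothetical, chosen only for illustration.

```python
import tensorflow as tf

# Hypothetical variable used only for this illustration.
counter = tf.Variable([1.0, 2.0])

counter.assign([10.0, 20.0])    # overwrite the stored value in place
counter.assign_add([1.0, 1.0])  # add to the stored value in place
counter.assign_sub([0.5, 0.5])  # subtract from the stored value in place

print(counter.numpy())  # [10.5 20.5]
```

A plain `tf.Tensor` is immutable and has no such methods, which is why mutable state such as model parameters lives in variables.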

# __Why do we need variables?__

In an optimization problem such as linear regression, we have parameters that must be tuned in order to optimize an objective function.

`tf.Variable` marks a tensor as a **tunable parameter**, so TensorFlow knows that this is a tensor to update while optimizing the objective.

This is what makes a variable special: the gradient of the objective with respect to the variable can be computed, and the variable's value can then be updated, for example by gradient descent.


## __Creating a variable__

Any tensor can be converted to a variable by wrapping it in `tf.Variable`.

In [None]:
a_constant_tensor = tf.random.normal([4, 4])
a_variable = tf.Variable(a_constant_tensor)

print(a_constant_tensor)
print('\n', a_variable)

tf.Tensor(
[[ 0.1685639   0.43453616  0.7295714   1.4290098 ]
 [ 0.43067005 -0.77764016 -1.041164    0.05287381]
 [-0.6257793   0.13404195 -0.13665919 -1.2711952 ]
 [ 0.9694465   0.2251089  -0.33621752 -0.60028934]], shape=(4, 4), dtype=float32)

 <tf.Variable 'Variable:0' shape=(4, 4) dtype=float32, numpy=
array([[ 0.1685639 ,  0.43453616,  0.7295714 ,  1.4290098 ],
       [ 0.43067005, -0.77764016, -1.041164  ,  0.05287381],
       [-0.6257793 ,  0.13404195, -0.13665919, -1.2711952 ],
       [ 0.9694465 ,  0.2251089 , -0.33621752, -0.60028934]],
      dtype=float32)>


In [None]:
print("Name: ", a_variable.name) # defaults to 'Variable:0' when no name is given; names are not guaranteed to be unique
print("Shape: ", a_variable.shape)
print("DType: ", a_variable.dtype)
print("As NumPy: ", a_variable.numpy())

Name:  Variable:0
Shape:  (4, 4)
DType:  <dtype: 'float32'>
As NumPy:  [[ 0.1685639   0.43453616  0.7295714   1.4290098 ]
 [ 0.43067005 -0.77764016 -1.041164    0.05287381]
 [-0.6257793   0.13404195 -0.13665919 -1.2711952 ]
 [ 0.9694465   0.2251089  -0.33621752 -0.60028934]]


Since a variable is **tunable**, or in other words **trainable**, variables are created with the `trainable` attribute set to `True` by default.

In [None]:
print('Variable is trainable:', a_variable.trainable)

Variable is trainable: True


Although variables are important for differentiation, some variables do not need to be differentiated. You can turn off gradients for a variable by setting `trainable=False` at creation.

In [None]:
step_counter = tf.Variable(1, trainable=False)
print(step_counter.trainable)

False
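
To illustrate the effect of `trainable=False`, here is a small sketch that uses `tf.GradientTape`, which is covered in the next section. The names `w` and `frozen` are hypothetical. The tape does not automatically watch a non-trainable variable, so the gradient with respect to it comes back as `None`.

```python
import tensorflow as tf

w = tf.Variable(2.0)                         # trainable by default
frozen = tf.Variable(3.0, trainable=False)   # excluded from automatic watching

with tf.GradientTape() as tape:
    y = w * frozen

grads = tape.gradient(y, [w, frozen])
print(grads[0])  # gradient wrt w, i.e. the value of `frozen` (3.0)
print(grads[1])  # None, because `frozen` is not trainable
```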


# __Gradient Computation in TensorFlow__

Automatic differentiation is useful for implementing machine learning algorithms such as backpropagation for training neural networks.

This is where the trainable variables introduced above come into play.

In [None]:
def objective_function(x):
    return 2 * x

x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    y = objective_function(x)

# `tape.gradient` computes the gradient; here, the gradient of y with respect to x
dy_dx = tape.gradient(y, x)  # equals 2, since y = 2x
print(dy_dx)

tf.Tensor(2.0, shape=(), dtype=float32)
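
The same pattern works when the gradient depends on the variable's value. As a small sketch, the hypothetical objective below is y = x² + 2x, whose derivative is 2x + 2, so at x = 3 the gradient should be 8.

```python
import tensorflow as tf

def quadratic(x):
    # Hypothetical objective for illustration: y = x^2 + 2x, so dy/dx = 2x + 2
    return x ** 2 + 2 * x

x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    y = quadratic(x)

dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())  # 8.0
```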


Let's do a slightly more complicated gradient computation.

In [None]:
w = tf.Variable(tf.ones((3, 2)), name='w')  # you can also pass a name when creating a variable
b = tf.Variable(tf.ones(2), name='b')
x = [[1., 2., 3.]]

with tf.GradientTape(persistent=True) as tape:
    y = tf.matmul(x, w) + b # shape of y would be [1, 2]

    loss = tf.reduce_mean(y**2)

dloss_dw_and_b = tape.gradient(loss, [w, b])  # returns a list with the gradients of loss wrt w and wrt b

print('Gradient of Loss wrt w:', dloss_dw_and_b[0])
print('\nGradient of Loss wrt b:', dloss_dw_and_b[1])

Gradient of Loss wrt w: tf.Tensor(
[[ 7.0000005  7.0000005]
 [14.000001  14.000001 ]
 [21.000002  21.000002 ]], shape=(3, 2), dtype=float32)

Gradient of Loss wrt b: tf.Tensor([7.0000005 7.0000005], shape=(2,), dtype=float32)
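
These values can be checked by hand. The sketch below applies the chain rule manually to the same computation, assuming the same inputs as above: since `loss = mean(y**2)` over N = 2 elements, `dloss/dy = 2y / N`, then `dloss/dw = xᵀ · dloss/dy` and `dloss/db` is `dloss/dy` summed over the batch axis. The intermediate names are chosen for illustration.

```python
import tensorflow as tf

w = tf.ones((3, 2))
b = tf.ones(2)
x = tf.constant([[1., 2., 3.]])

y = tf.matmul(x, w) + b                                # [[7., 7.]]
n = tf.cast(tf.size(y), tf.float32)                    # number of elements in y (2)
dloss_dy = 2.0 * y / n                                 # derivative of mean(y^2) wrt y
dloss_dw = tf.matmul(x, dloss_dy, transpose_a=True)    # x^T @ dloss_dy, shape (3, 2)
dloss_db = tf.reduce_sum(dloss_dy, axis=0)             # sum over the batch axis, shape (2,)

print(dloss_dw.numpy())
print(dloss_db.numpy())
```

Up to floating-point rounding, the results agree with what `tape.gradient` returned: rows of 7, 14 and 21 for `w`, and 7 for each entry of `b`.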


You can also get all the variables that took part in the gradient computation, as follows:

In [None]:
tape.watched_variables()  # returns a tuple of the variables watched by the tape

(<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
 array([[1., 1.],
        [1., 1.],
        [1., 1.]], dtype=float32)>,
 <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([1., 1.], dtype=float32)>)

The above is a basic introduction to automatic differentiation in TensorFlow, which is enough for basic neural network implementations.

`tf.GradientTape` is much more powerful than shown here; to learn about its other uses and functionality, check out [this link](https://www.tensorflow.org/guide/autodiff).
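
To tie variables and gradients together, here is a minimal sketch of gradient descent for a one-dimensional linear regression. The toy data (generated from y = 3x + 2), the learning rate and the number of steps are all assumptions made for this illustration, not part of the notebook above.

```python
import tensorflow as tf

# Hypothetical noise-free toy data generated from y = 3x + 2.
xs = tf.constant([0.0, 1.0, 2.0, 3.0])
ys = 3.0 * xs + 2.0

# The parameters we want to tune start at zero.
w = tf.Variable(0.0, name='w')
b = tf.Variable(0.0, name='b')
learning_rate = 0.05  # assumed value, small enough for this problem to converge

for step in range(500):
    with tf.GradientTape() as tape:
        # Mean squared error between the predictions and the targets
        loss = tf.reduce_mean((w * xs + b - ys) ** 2)
    dw, db = tape.gradient(loss, [w, b])
    # Gradient descent update: move each parameter against its gradient
    w.assign_sub(learning_rate * dw)
    b.assign_sub(learning_rate * db)

print(w.numpy(), b.numpy())  # should be close to 3.0 and 2.0
```

Higher-level tools such as the optimizers in `tf.keras` perform essentially this update for you, but the loop makes the roles of the variable, the tape and the gradient explicit.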

# __Your Assignment__

Below are some questions that you need to answer by completing the code. The desired output is shown above each cell. You need to write code that produces that output.

## __Question 1__
You need to create a variable of shape [4, 5] that is filled with 0.152 and named `I love machine learning... its super sexy baby`.

### __[ Expected Outputs ]__
```
<tf.Variable 'I love machine learning... its super sexy baby:0' shape=(4, 5) dtype=float32, numpy=
array([[0.152, 0.152, 0.152, 0.152, 0.152],
       [0.152, 0.152, 0.152, 0.152, 0.152],
       [0.152, 0.152, 0.152, 0.152, 0.152],
       [0.152, 0.152, 0.152, 0.152, 0.152]], dtype=float32)>
```

In [4]:
# [========== Your answer below ==========]

a = tf.Variable(tf.fill([4, 5], 0.152), name="I love machine learning... its super sexy baby")
a


<tf.Variable 'I love machine learning... its super sexy baby:0' shape=(4, 5) dtype=float32, numpy=
array([[0.152, 0.152, 0.152, 0.152, 0.152],
       [0.152, 0.152, 0.152, 0.152, 0.152],
       [0.152, 0.152, 0.152, 0.152, 0.152],
       [0.152, 0.152, 0.152, 0.152, 0.152]], dtype=float32)>

## __Question 2__
You need to create a function that multiplies every element of a tensor by 5.258 and returns the mean of the result.

Then compute and print the gradients of that function with respect to a variable of shape [4, 5] filled with ones.

### __[ Expected Outputs ]__
```
tf.Tensor(
[[0.2629 0.2629 0.2629 0.2629 0.2629]
 [0.2629 0.2629 0.2629 0.2629 0.2629]
 [0.2629 0.2629 0.2629 0.2629 0.2629]
 [0.2629 0.2629 0.2629 0.2629 0.2629]], shape=(4, 5), dtype=float32)
```

In [21]:
# [========== Your answer below ==========]

def function(tensor):
    return tf.reduce_mean(tensor * 5.258)

variable = tf.Variable(tf.ones([4, 5]))

with tf.GradientTape() as tape:
    y = function(variable)

# gradients of the function wrt variable
dy_dx = tape.gradient(y, variable)
dy_dx

<tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[0.2629, 0.2629, 0.2629, 0.2629, 0.2629],
       [0.2629, 0.2629, 0.2629, 0.2629, 0.2629],
       [0.2629, 0.2629, 0.2629, 0.2629, 0.2629],
       [0.2629, 0.2629, 0.2629, 0.2629, 0.2629]], dtype=float32)>

## __Question 3__
You need to create a function that takes two tensors as arguments and returns their element-wise product.

Then compute and print the gradients of that function with respect to two variables of shape [2, 2], one filled with 1 and the other with 2.5, both of dtype float32.

Also 😛, print out all the variables that are used to compute the value of the function using `tape`.

**HINT: use `tape.watc....` :)**

### __[ Expected Outputs ]__
```
Gradients
 [<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[2.5, 2.5],
       [2.5, 2.5]], dtype=float32)>, <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[1., 1.],
       [1., 1.]], dtype=float32)>]

Variables used to compute this function
 (<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[1., 1.],
       [1., 1.]], dtype=float32)>, <tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[2.5, 2.5],
       [2.5, 2.5]], dtype=float32)>)
```

In [30]:
# [========== Your answer below ==========]

def function(a, b):
    return a*b

a = tf.Variable(tf.ones([2,2]),dtype=tf.float32)
b = tf.Variable(tf.fill([2,2],2.5),dtype=tf.float32)

with tf.GradientTape() as tape:
    y = function(a,b)

dy_da_and_b = tape.gradient(y,[a,b])

print('Gradients\n', dy_da_and_b)
print('\nVariables used to compute this function')
# here print variables used to compute this function
tape.watched_variables()

Gradients
 [<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[2.5, 2.5],
       [2.5, 2.5]], dtype=float32)>, <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[1., 1.],
       [1., 1.]], dtype=float32)>]

Variables used to compute this function


(<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
 array([[1., 1.],
        [1., 1.]], dtype=float32)>,
 <tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
 array([[2.5, 2.5],
        [2.5, 2.5]], dtype=float32)>)