## Exercise: "Introduction" to TensorFlow

*TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. It was developed by the Google Brain team for Google's internal use in research and production. The initial version was released under the Apache License 2.0 in 2015. Google released an updated version, TensorFlow 2.0, in September 2019.* [Wikipedia](https://en.wikipedia.org/wiki/TensorFlow).

This file is only intended to give an intuition for how low-level programming in TensorFlow works. For comparison, we reproduce the example given in the notebook `5.0.intro-pytorch.ipynb`.

### Import TensorFlow and NumPy

In [None]:
import tensorflow as tf
import numpy as np

### Create a simple graph
The decorator `tf.function` compiles a Python function into a callable TensorFlow graph. Here, the same cost function as in `6.0.intro-pytorch.ipynb`, which depends on `W` and `b`, is evaluated together with its gradients with respect to `W` and `b`.

In [None]:
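@tf.function
def simple_graph(x, W, b):
    # forward pass: a = W x + b
    t = tf.linalg.matmul(W, x)
    a = t + b
    # cost = sum of the squared entries of a
    a_square = tf.math.square(a)
    cost = tf.reduce_sum(a_square)
    # symbolic gradients of the cost with respect to W and b
    grad_W, grad_b = tf.gradients(ys=cost, xs=[W, b])
    return cost, grad_W, grad_b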


In [None]:
x = np.array([[1., 2., 3.]]).T
# W and b are plain NumPy arrays; tf.function converts them to tensors when called
W = np.random.randn(2, 3)
b = np.random.randn(2, 1)

# evaluate the cost and its gradients with respect to W and b
cost, grad_W, grad_b = simple_graph(x, W, b)
print(cost)
print(grad_W)
print(grad_b)
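
For comparison with the graph-mode `tf.gradients` call, the same gradients can also be obtained in eager mode with `tf.GradientTape`. The following is only a minimal sketch and not part of the original exercise; wrapping `W` and `b` in `tf.Variable` is an assumption made here so that the tape tracks them automatically.

```python
import numpy as np
import tensorflow as tf

# Input vector as a constant; W and b as variables so the tape watches them.
# (tf.Variable is an assumption of this sketch; the cell above uses NumPy arrays.)
x = tf.constant([[1.0], [2.0], [3.0]], dtype=tf.float64)
W = tf.Variable(np.random.randn(2, 3))
b = tf.Variable(np.random.randn(2, 1))

# Record the forward pass so that it can be differentiated afterwards.
with tf.GradientTape() as tape:
    a = tf.linalg.matmul(W, x) + b
    cost = tf.reduce_sum(tf.math.square(a))

# Gradients of the recorded cost with respect to W and b.
grad_W, grad_b = tape.gradient(cost, [W, b])
print(cost.numpy())
print(grad_W.numpy())
print(grad_b.numpy())
```

Unlike `tf.gradients`, which only works while a graph is being built (for example inside a `tf.function`), the tape records operations as they execute, which is closer to the PyTorch workflow.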

#### Manual verification of gradient determination

For comparison, we compute the gradients manually. Recall that the cost, denoted $res$ below, is
$$
res = \mathbf{a}^T \cdot \mathbf{a} = (\mathbf{W} \cdot \mathbf{x} + \mathbf{b})^T \cdot (\mathbf{W} \cdot \mathbf{x} + \mathbf{b}) 
$$
Thus it is straightforward to show that:
$$
\frac{\partial}{\partial \mathbf{W}} res = 2 \cdot (\mathbf{W} \cdot \mathbf{x} + \mathbf{b}) \cdot \mathbf{x}^T = 2\cdot \mathbf{a} \cdot \mathbf{x}^T
$$
$$
\frac{\partial}{\partial \mathbf{b}} res = 2 \cdot (\mathbf{W} \cdot \mathbf{x} + \mathbf{b})= 2\cdot \mathbf{a}
$$
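
Both results can be checked entry by entry. A short sketch, assuming the index notation $a_i$, $W_{ij}$, $x_j$, $b_i$ (introduced here, not used elsewhere in this notebook) for the entries of $\mathbf{a}$, $\mathbf{W}$, $\mathbf{x}$ and $\mathbf{b}$:
$$
res = \sum_i a_i^2, \qquad a_i = \sum_j W_{ij} \, x_j + b_i
$$
$$
\frac{\partial}{\partial W_{ij}} res = 2 \, a_i \, \frac{\partial a_i}{\partial W_{ij}} = 2 \, a_i \, x_j, \qquad \frac{\partial}{\partial b_i} res = 2 \, a_i \, \frac{\partial a_i}{\partial b_i} = 2 \, a_i
$$
Collecting the entries $2 \, a_i \, x_j$ into a matrix gives $2\cdot \mathbf{a} \cdot \mathbf{x}^T$, and the entries $2 \, a_i$ form the vector $2\cdot \mathbf{a}$.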

In [None]:
# forward pass in NumPy
a = W@x + b
# closed-form gradients derived above
W_grad = 2*a@x.T
b_grad = 2*a
print(W_grad)
print(b_grad)
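
As an additional sanity check that does not rely on TensorFlow, the closed-form gradients can be compared with central finite differences. This is a minimal NumPy-only sketch; the helper names `cost_fn` and `numerical_grad`, the step size `eps` and the fixed random seed are assumptions made for illustration.

```python
import numpy as np

def cost_fn(W, b, x):
    """Sum of squared entries of a = W x + b."""
    a = W @ x + b
    return np.sum(a ** 2)

def numerical_grad(f, M, eps=1e-6):
    """Central finite-difference estimate of df/dM, one entry at a time."""
    grad = np.zeros_like(M)
    for idx in np.ndindex(M.shape):
        M_plus = M.copy()
        M_minus = M.copy()
        M_plus[idx] += eps
        M_minus[idx] -= eps
        grad[idx] = (f(M_plus) - f(M_minus)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)  # fixed seed for reproducibility
x = np.array([[1., 2., 3.]]).T
W = rng.standard_normal((2, 3))
b = rng.standard_normal((2, 1))

# closed-form gradients from the derivation above
a = W @ x + b
W_grad = 2 * a @ x.T
b_grad = 2 * a

# numerical estimates, perturbing W and b separately
W_grad_num = numerical_grad(lambda M: cost_fn(M, b, x), W)
b_grad_num = numerical_grad(lambda v: cost_fn(W, v, x), b)

print(np.allclose(W_grad_num, W_grad, atol=1e-5))
print(np.allclose(b_grad_num, b_grad, atol=1e-5))
```

Because the cost is quadratic in `W` and `b`, the central difference is exact up to floating-point round-off, so both comparisons should agree within the chosen tolerance.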