# Computing Gradients in PyTorch and TensorFlow with Automatic Differentiation

Ref:

https://d2l.ai/chapter_preliminaries/autograd.html#a-simple-function

All modern deep learning frameworks take the work of computing derivatives off our plates by offering automatic differentiation (often shortened to autograd). As we pass data through each successive function, the framework builds a computational graph that tracks how each value depends on the others. To calculate derivatives, automatic differentiation packages then work backwards through this graph, applying the chain rule. The computational algorithm for applying the chain rule in this fashion is called backpropagation.
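To make the graph-building and backward pass concrete, here is a minimal sketch of reverse-mode differentiation in plain Python. The `Value` class and its attributes are hypothetical names invented for this illustration. They are not part of PyTorch or TensorFlow, and real frameworks are far more elaborate. The sketch only supports scalar addition and multiplication, which is enough to express the function used in the cells of this notebook, y = 2 * dot(x, x).

```python
class Value:
    """A scalar that remembers how it was computed (illustrative helper, not a framework API)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents          # nodes this value was computed from
        self.local_grads = local_grads  # d(self)/d(parent), one entry per parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Order the graph so that every node appears after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # dy/dy = 1
        # Walk from the output back to the inputs, applying the chain rule at each edge.
        for node in reversed(order):
            for parent, local in zip(node.parents, node.local_grads):
                parent.grad += local * node.grad


# Build y = 2 * (x0*x0 + x1*x1 + x2*x2 + x3*x3) for x = [0, 1, 2, 3].
xs = [Value(float(i)) for i in range(4)]
total = Value(0.0)
for x in xs:
    total = total + x * x
y = total * Value(2.0)

y.backward()
grads = [x.grad for x in xs]
print(y.data, grads)
```

Each arithmetic operation stores its inputs and local derivatives, which is the "computational graph" in miniature. Calling `backward()` visits the nodes from the output back to the inputs and accumulates the product of local derivatives into each `grad`.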

In [3]:
import torch
import tensorflow as tf

In [12]:
x_torch = torch.arange(4.0, requires_grad=True)
x_torch

tensor([0., 1., 2., 3.], requires_grad=True)

In [13]:
y_torch = 2 * torch.dot(x_torch, x_torch)
y_torch

tensor(28., grad_fn=<MulBackward0>)

In [14]:
y_torch.backward()
x_torch.grad

tensor([ 0.,  4.,  8., 12.])
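The result can be checked by hand. A short derivation, using only the definition of `y_torch` above, shows why the gradient equals four times the input:

$$
y = 2\,\mathbf{x}^\top \mathbf{x} = 2 \sum_{i} x_i^2,
\qquad
\frac{\partial y}{\partial x_i} = 4 x_i,
\qquad
\nabla_{\mathbf{x}}\, y = 4\,\mathbf{x}.
$$

For $\mathbf{x} = [0, 1, 2, 3]$ this gives $y = 2\,(0 + 1 + 4 + 9) = 28$ and $\nabla_{\mathbf{x}}\, y = [0, 4, 8, 12]$, matching the values printed above.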

In [5]:
x_tf = tf.range(4, dtype=tf.float32)
x_tf

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0., 1., 2., 3.], dtype=float32)>

In [10]:
# tf.GradientTape is the API that exposes TensorFlow's automatic differentiation.

# Wrap the tensor in a tf.Variable so that the tape tracks it automatically
x_tfvar = tf.Variable(x_tf)
#x = tf.Variable(tf.random.uniform((2, 2)))

# Record all computations onto a tape
with tf.GradientTape() as tape:
    y_tf = 2 * tf.tensordot(x_tfvar, x_tfvar, axes=1)
y_tf

<tf.Tensor: shape=(), dtype=float32, numpy=28.0>

In [11]:
grad_of_y_tf_wrt_x_tf = tape.gradient(y_tf, x_tfvar)
grad_of_y_tf_wrt_x_tf

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 0.,  4.,  8., 12.], dtype=float32)>
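Both frameworks return the same gradient. As an independent sanity check that uses neither library, the sketch below approximates the gradient with central finite differences in plain Python. The helper names `f` and `numerical_gradient` and the step size `eps` are assumptions chosen for this illustration.

```python
def f(x):
    """The function differentiated above: y = 2 * dot(x, x)."""
    return 2.0 * sum(xi * xi for xi in x)


def numerical_gradient(func, x, eps=1e-4):
    """Approximate the gradient of func at x with central differences."""
    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps   # nudge coordinate i up
        minus[i] -= eps  # nudge coordinate i down
        grad.append((func(plus) - func(minus)) / (2.0 * eps))
    return grad


x = [0.0, 1.0, 2.0, 3.0]
approx = numerical_gradient(f, x)
exact = [4.0 * xi for xi in x]  # analytic gradient 4x
max_error = max(abs(a - e) for a, e in zip(approx, exact))
print(approx, max_error)
```

Finite differences need two function evaluations per input coordinate, so they scale poorly with dimension. They are useful mainly as a check on gradients produced by automatic differentiation.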