<table>
<tr>
<td width=15%><img src="./img/UGA.png"></img></td>
<td><center><h1>Introduction to Python for Data Sciences</h1></center></td>
<td width=15%><a href="https://tung-qle.github.io/" style="font-size: 16px; font-weight: bold">Quoc-Tung Le</a> </td>
</tr>
</table>


# 1 - Pytorch: a Numpy-like library that can differentiate

Pytorch is a Python library that is arguably the most popular one for deep learning. It provides many functions similar to those of Numpy (see the [second notebook](2_Numpy_and_co.ipynb)). Be careful: many functions in the two libraries share the same name, but their behavior can be (entirely) different!

As we will see in this tutorial, Pytorch provides one important feature that Numpy lacks: Automatic Differentiation (AD, sometimes shortened to autodiff). This means that if we implement a function using the Pytorch library, we can efficiently compute its gradient with respect to its parameters. The gradient is then used by an optimization algorithm (see the Optimization course for more details).

The following code demonstrates the autodiff feature of Pytorch. We compute the gradient of the quadratic function

$$f(x) = \frac{1}{2}x^\top \mathbf{A} x,$$

whose closed form is $\nabla f(x) = \frac{1}{2}(\mathbf{A} + \mathbf{A}^\top)x$.
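
As a brief check of this closed form, one can write $f$ in coordinates and differentiate with respect to a single entry $x_k$ (the indices $i$, $j$, $k$ are introduced here only for this derivation):

$$f(x) = \frac{1}{2}\sum_{i,j} A_{ij}\, x_i x_j, \qquad \frac{\partial f}{\partial x_k} = \frac{1}{2}\Big(\sum_{j} A_{kj}\, x_j + \sum_{i} A_{ik}\, x_i\Big) = \frac{1}{2}\big[(\mathbf{A} + \mathbf{A}^\top)x\big]_k.$$

In particular, when $\mathbf{A}$ is symmetric the gradient reduces to $\nabla f(x) = \mathbf{A}x$.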

In [42]:

import torch

dim = 20
# Create a parameter x and a matrix A
x = torch.randn(dim, requires_grad=True)
A = torch.randn((dim, dim))

# Compute the function f(x) and assign to the variable y
y = 0.5 * torch.dot(x, torch.matmul(A, x))

# Differentiating the function f by calling y.backward()
y.backward()

# Access the gradient of f with respect to x
print(x.grad)
# A was created without requires_grad=True, so no gradient is stored for it: this prints None
print(A.grad)

# Check the result against the closed-form gradient
try:
    torch.testing.assert_close(0.5 * torch.matmul(A + A.T, x), x.grad)
    print("Two vectors are equal. All is good")
except AssertionError:
    print("Wrong calculation")


tensor([-1.4024, -0.1146, -0.2406, -0.0727,  2.5713, -6.1343, -3.0981, -5.3953,
        -3.2368,  0.7224, -1.7826,  2.3095,  4.7901,  7.4169,  4.8418, -0.2430,
        -0.7804, -1.5638, -1.0425,  3.7074])
None
Two vectors are equal. All is good
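
To illustrate how such a gradient can feed an optimization algorithm, here is a minimal sketch of plain gradient descent on the same quadratic function. It is not part of the original example: the symmetric positive definite choice of `A`, the step size rule, and the number of iterations are assumptions made only so that the iteration is guaranteed to decrease $f$.

```python
import torch

torch.manual_seed(0)  # fixed seed so that the run is reproducible

dim = 5
# Assumption: A is symmetric positive definite, so f has a unique minimizer at x = 0
B = torch.randn(dim, dim)
A = B @ B.T + torch.eye(dim)

x = torch.randn(dim, requires_grad=True)
# Assumption: a step size below 1 / lambda_max(A) keeps gradient descent stable
step_size = 0.5 / torch.linalg.eigvalsh(A).max().item()

initial_value = (0.5 * torch.dot(x, A @ x)).item()
for _ in range(200):
    y = 0.5 * torch.dot(x, A @ x)
    y.backward()                      # fills x.grad with the gradient of f at x
    with torch.no_grad():             # the update itself must not be tracked by autograd
        x -= step_size * x.grad
    x.grad.zero_()                    # gradients accumulate, so reset them before the next step

final_value = (0.5 * torch.dot(x, A @ x)).item()
print(initial_value, final_value)
```

The call to `x.grad.zero_()` matters because `backward()` adds to the existing content of `x.grad` instead of overwriting it.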


## 1.1 - How to create Pytorch Tensors

A Pytorch tensor can be created in several ways: from a Python list, from a Numpy array, or by random initialization. The cell below builds one from a Python list:

In [37]:
# Initialize a tensor from a Python list (torch.Tensor produces float32 entries by default)
x = torch.Tensor([[1, 2, 3], [-3, -2, -1]])
print(x)

tensor([[ 1.,  2.,  3.],
        [-3., -2., -1.]])
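
The other creation routes mentioned above, from a Numpy array and by randomization, can be sketched as follows. The variable names are illustrative only, and the shapes are chosen to match the tensor built from a Python list.

```python
import numpy as np
import torch

# From a Numpy array: torch.from_numpy shares memory with the array and keeps its dtype
array = np.array([[1.0, 2.0, 3.0], [-3.0, -2.0, -1.0]])
from_array = torch.from_numpy(array)
print(from_array.dtype)   # torch.float64, inherited from the Numpy array

# Random tensors: standard normal entries (randn) and uniform entries in [0, 1) (rand)
gaussian = torch.randn(2, 3)
uniform = torch.rand(2, 3)
print(gaussian.shape, uniform.shape)

# Constant tensors of a given shape
zeros = torch.zeros(2, 3)
ones = torch.ones(2, 3)
```

Because `torch.from_numpy` shares memory with its input, modifying `array` in place also changes `from_array`; `torch.tensor(array)` makes a copy instead.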


Like a Numpy array, a Pytorch tensor has a shape and a dtype, but it also carries additional metadata fields such as its device and its autograd status. The following code shows how to access this information:

In [None]:
print("Tensor info:")
print(f"  shape        : {tuple(x.shape)}")
print(f"  size         : {x.size()}")
print(f"  dtype        : {x.dtype}")
print(f"  device       : {x.device}")         # e.g. cpu or cuda (GPU), depending on where the tensor was created and on your hardware
print(f"  requires_grad: {x.requires_grad}")  # If requires_grad = False, autograd does not track x and no gradient w.r.t. x is computed (see the quadratic example)
print(f"  grad_fn      : {x.grad_fn}")        # The operation that produced x; None if x was not produced by a tracked operation
print(f"  is_leaf      : {x.is_leaf}")        # True if requires_grad = False (by convention), or if x was created by the user rather than by an operation
print(f"  data         :\n{x}")

Tensor info:
  shape        : (2, 3)
  size         : torch.Size([2, 3])
  dtype        : torch.float32
  device       : cpu
  requires_grad: False
  grad_fn      : None
  is_leaf      : True
  data         :
tensor([[ 1.,  2.,  3.],
        [-3., -2., -1.]])
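
To see how the autograd-related fields change, the following sketch compares a user-created tensor with `requires_grad=True` and the result of an operation on it. The names `w` and `z` are illustrative and do not appear elsewhere in this notebook.

```python
import torch

# A leaf tensor created by the user and tracked by autograd
w = torch.ones(3, requires_grad=True)
print(w.requires_grad, w.is_leaf, w.grad_fn)   # True True None

# The result of an operation on w is not a leaf and records how it was produced
z = (2 * w).sum()
print(z.requires_grad, z.is_leaf)              # True False
print(z.grad_fn is not None)                   # True

# Backpropagating from z fills the .grad field of the leaf w
z.backward()
print(w.grad)                                  # tensor([2., 2., 2.])
```

Only leaf tensors with `requires_grad=True` have their `.grad` field populated by `backward()` by default.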


## 1.2 - Operations on Pytorch tensors

## 1.3 - Linear algebra with Pytorch

## 1.4 - Building functions using Pytorch functions

## 1.5 - Compute the gradient of a function

## 1.6 - Exercise