## PyTorch basics


In [15]:
import torch
import numpy as np

In [2]:
a = torch.tensor(3)
print(a)  # tensor(3)

tensor(3)


In [4]:
a.requires_grad

False

In [5]:
d = torch.rand([2, 2, 2])

In [8]:
d.size()

torch.Size([2, 2, 2])

In [9]:
d.requires_grad

False

Tensors support efficient algebraic operations. One of the most common operations in machine learning applications is matrix multiplication. For example, to multiply two random matrices of size 3x5 and 5x4, use the matrix multiplication (@) operator:

In [11]:
x = torch.randn([3, 5])
y = torch.randn([5, 4])
z = x @ y

print(z)

tensor([[ 1.6983, -1.9336,  0.8055, -0.1356],
        [ 1.1973,  2.8537, -1.9427, -1.8584],
        [-4.7054,  2.5842,  0.2806,  0.2686]])
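As a quick check on the shapes involved, the sketch below repeats the multiplication with a fixed seed (an assumption added here so the run is repeatable), compares the @ operator with torch.matmul, and shows that mismatched inner dimensions raise an error. The names z_matmul and mismatch_raised are illustrative only.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

x = torch.randn(3, 5)
y = torch.randn(5, 4)

z = x @ y                      # operator form
z_matmul = torch.matmul(x, y)  # equivalent function form

print(z.shape)                      # torch.Size([3, 4])
print(torch.allclose(z, z_matmul))  # True

# The inner dimensions must agree: (5, 4) @ (3, 5) has inner sizes 4 and 3.
mismatch_raised = False
try:
    y @ x
except RuntimeError:
    mismatch_raised = True

print(mismatch_raised)  # True
```

The result has one row per row of x and one column per column of y, which is why a 3x5 by 5x4 product is 3x4.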


In [13]:
type(z.numpy())

numpy.ndarray

In [16]:
# size= requests a 3x5 array; a bare [3, 5] would be read as the mean (loc) argument
x = torch.tensor(np.random.normal(size=[3, 5]))
type(x)

torch.Tensor
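One detail worth knowing when moving between the two libraries is whether the data is copied. The sketch below, with illustrative variable names, suggests that torch.tensor copies a NumPy array, while torch.from_numpy and Tensor.numpy() share memory for CPU tensors.

```python
import numpy as np
import torch

arr = np.zeros(3)               # float64 NumPy array
shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[0] = 1.0                    # modify the NumPy array in place

print(shared)  # tensor([1., 0., 0.], dtype=torch.float64)
print(copied)  # tensor([0., 0., 0.], dtype=torch.float64)

back = shared.numpy()           # CPU tensor to ndarray, also shares memory
back[1] = 2.0
print(arr)     # [1. 2. 0.]
```

Note that the tensors inherit NumPy's default float64 dtype, whereas tensors created with torch.rand or torch.randn default to float32.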

### Automatic differentiation

The most important advantage of PyTorch over NumPy is automatic differentiation, which is very useful in optimization problems such as fitting the parameters of a neural network. Let's walk through an example.

Say you have a composite function that chains two functions: g(u(x)). To compute the derivative of g with respect to x, we can use the chain rule: dg/dx = dg/du * du/dx. PyTorch applies the chain rule for us and evaluates the derivative exactly at a given input, rather than approximating it numerically.

To compute derivatives in PyTorch, first create a tensor and set its requires_grad to True. We can then use tensor operations to define our functions. Here u is a quadratic function and g is a simple linear function:

In [19]:
x = torch.tensor(1.0, requires_grad=True)

def u(x):
    return x * x

def g(u):
    return -u

In [20]:
# torch.autograd.grad computes the sum of gradients of the outputs w.r.t. the inputs
# and returns them as a tuple with one entry per input.
dgdx = torch.autograd.grad(g(u(x)), x)

In [21]:
dgdx

(tensor(-2.),)
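The result can be checked by hand: with u(x) = x^2 and g(u) = -u, the chain rule gives dg/dx = (-1) * (2x) = -2x, which is -2 at x = 1. The sketch below makes the same check with the more common pattern of calling backward() on the output and reading the gradient from x.grad. The variable names out and analytic are illustrative.

```python
import torch

def u(x):
    return x * x   # u(x) = x^2, so du/dx = 2x

def g(u):
    return -u      # g(u) = -u, so dg/du = -1

x = torch.tensor(1.0, requires_grad=True)

out = g(u(x))
out.backward()     # accumulates d(out)/dx into x.grad

analytic = -2 * x.detach()  # chain rule by hand: (-1) * (2x)

print(x.grad)                            # tensor(-2.)
print(torch.allclose(x.grad, analytic))  # True
```

Because backward() accumulates into x.grad rather than overwriting it, repeated calls add up unless the gradient is reset in between.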

## Curve fitting