In [10]:
# loading and setting up dependencies
import fastbook

fastbook.setup_book()

from fastbook import *
from torch.nn import functional as F

# `PyTorch` tensors

PyTorch tensors are multidimensional arrays that can be used for numerical computations.

`NumPy` arrays are multidimensional tables of data in which all items have the same type. `PyTorch` tensors are more restrictive: every element of the tensor must use the same basic numeric type. In addition, `PyTorch` tensors cannot be _jagged_, i.e. they cannot contain nested arrays of different sizes, so a tensor is always rectangular.
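
To make the "no jagged data" rule concrete, here is a minimal pure-Python sketch. The helper `shape_of` is hypothetical (it is not part of `PyTorch` or `NumPy`); it only checks whether a nested list has a consistent, rectangular shape, which is the condition the list has to meet before it can become a tensor.

```python
def shape_of(data):
    """Return the shape of a nested list as a tuple, or None if it is jagged."""
    if not isinstance(data, list):
        return ()  # a scalar leaf has an empty shape
    child_shapes = [shape_of(item) for item in data]
    if any(shape is None for shape in child_shapes):
        return None  # jaggedness found deeper in the nesting
    if len(set(child_shapes)) > 1:
        return None  # sibling items disagree on their shape
    inner = child_shapes[0] if child_shapes else ()
    return (len(data),) + inner


print(shape_of([[1, 2, 3], [4, 5, 6]]))  # (2, 3)
print(shape_of([[1, 2, 3], [4, 5]]))     # None
```

The first list is rectangular, with 2 rows of 3 items each. The second is jagged because its rows have different lengths.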

One of the main features of `PyTorch` tensors, compared to `NumPy` arrays, is that they can be used on a GPU to accelerate computing. They can also compute gradients automatically: `PyTorch` keeps track of the successive operations applied to a tensor and then applies the chain rule from calculus to them.

## Basic manipulations

In [11]:
# To create an array or tensor, pass a list (or list of lists, or list of lists of lists, etc.) to `array()` or `tensor()`:
data = [[1,2,3],[4,5,6]]
arr = array(data) # `NumPy`
tns = tensor(data) # `PyTorch`

tns

tensor([[1, 2, 3],
        [4, 5, 6]])

In [12]:
# to select a row of a tensor
tns[0]

tensor([1, 2, 3])

In [13]:
# to select a column of a tensor;
# `:` here says "take all the values of axis 0"
tns[:, 1]

tensor([2, 5])

In [14]:
# combination of the above using `slice` syntax:
# "take the second row, then take columns 1 to 3 (not including 3)"
tns[1,1:3]

tensor([5, 6])

In [15]:
# `PyTorch` makes it very easy to perform arithmetic operations at scale
tns+1

tensor([[2, 3, 4],
        [5, 6, 7]])

In [16]:
# all `PyTorch` tensors have a type
tns.type()

'torch.LongTensor'

In [17]:
# this type can change automatically based on the operations you perform
tns*1.5

tensor([[1.5000, 3.0000, 4.5000],
        [6.0000, 7.5000, 9.0000]])

## Compute derivatives with `PyTorch`

Consider the statement below:

```python
xt = tensor(3.).requires_grad_()
```

This statement does two main things:

- it creates a tensor `xt` with the value `3.`
- it enables gradient computation with `requires_grad_()`: this modifies the tensor in place so that `PyTorch` records the operations performed on it in a computation graph, which is later used to calculate the gradients (_automatic differentiation_)

Let's see this in action with a simple example.

In [18]:
# creating our tensor and enabling gradients computation
xt = torch.tensor(3.).requires_grad_()

# perform operations on the tensor using the example of a simple quadratic function
yt = xt**2

# let's compute the derivative of `yt` with respect to `xt` (backward propagation)
yt.backward()

# let's see the values of `xt`, its derivative (`gradient` in deep learning jargon), and `yt`;
# here, `xt.grad` is the derivative of `yt` with respect to `xt`
xt, xt.grad, yt

(tensor(3., requires_grad=True),
 tensor(6.),
 tensor(9., grad_fn=<PowBackward0>))

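Calling `backward` above triggers _backpropagation_: `PyTorch` walks the computation graph backward from `yt` and applies the chain rule to compute the derivative at each step (in a neural network, at each layer). The result is stored in `xt.grad`. Here the derivative of `xt**2` is `2*xt`, which is `6` at `xt = 3`.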
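
The chain rule behind `backward` can also be worked through by hand. The sketch below is plain Python, not `PyTorch`'s implementation; the function `y = (3x + 1)**2` and all names in it are chosen for illustration only. The forward pass stores the intermediate value, and the backward pass multiplies the local derivatives from the output back to the input.

```python
def forward_backward(x):
    """Evaluate y = (3x + 1)**2 and its derivative dy/dx via the chain rule."""
    # forward pass: compute and keep the intermediate value u
    u = 3 * x + 1
    y = u ** 2

    # backward pass: multiply local derivatives, from output back to input
    dy_du = 2 * u          # derivative of u**2 with respect to u
    du_dx = 3              # derivative of 3x + 1 with respect to x
    dy_dx = dy_du * du_dx  # chain rule
    return y, dy_dx


y, grad = forward_backward(2.0)
print(y, grad)  # 49.0 42.0
```

This matches the closed form `dy/dx = 6 * (3x + 1)`, which is `42` at `x = 2`.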

Let's do the same thing with a tensor that holds several values.

In [19]:
xt = tensor([3.,4.,10.]).requires_grad_()
xt

tensor([ 3.,  4., 10.], requires_grad=True)

In [20]:
# `f` takes in a rank-1 tensor and returns a tensor with a scalar value (rank-0 tensor)
def f(x): return (x**2).sum()

yt = f(xt)
yt

tensor(125., grad_fn=<SumBackward0>)

In [21]:
# let's get the gradients
yt.backward()
xt.grad

tensor([ 6.,  8., 20.])
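
The gradient above can be sanity-checked without `PyTorch`. Since `f` sums the squares of its inputs, its partial derivative with respect to each component is `2*x`. The sketch below compares that analytic result to a central finite-difference estimate; `numerical_grad` and the step size `h` are illustrative choices, not part of any library.

```python
def f(xs):
    """Sum of squares, mirroring `(x**2).sum()` on a plain Python list."""
    return sum(x ** 2 for x in xs)


def numerical_grad(func, xs, h=1e-4):
    """Estimate each partial derivative of `func` at `xs` by central differences."""
    grads = []
    for i in range(len(xs)):
        up, down = list(xs), list(xs)
        up[i] += h    # nudge component i upward
        down[i] -= h  # nudge component i downward
        grads.append((func(up) - func(down)) / (2 * h))
    return grads


xs = [3.0, 4.0, 10.0]
analytic = [2 * x for x in xs]    # derivative of x**2 is 2*x
estimate = numerical_grad(f, xs)  # should be very close to `analytic`

print(f(xs))     # 125.0
print(analytic)  # [6.0, 8.0, 20.0]
```

The values in `estimate` agree with `analytic` up to floating-point rounding, which is the same gradient `xt.grad` reports.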