# PyTorch Tutorial

**William Yue**

In this Jupyter notebook, my goal is to gain familiarity with PyTorch by following the [online tutorials](https://pytorch.org/tutorials/). Hopefully I will know how it works by the end.

## Introduction to PyTorch

### Tensors

Let's start by getting `torch` and `numpy` in here.

In [1]:
import torch
import numpy as np

Tensors appear to just be `torch`'s version of a matrix or multi-dimensional array, similar to `numpy`'s ndarrays. The difference is that tensors can run on GPUs or other fast hardware. They are also optimized for automatic differentiation.
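
To make the automatic differentiation remark a bit more concrete, here is a minimal sketch (my own example, not from the tutorial cells in this notebook). It assumes a scalar tensor created with `requires_grad=True` and differentiates `y = x**2`; the names `x` and `y` are just for illustration.

```python
import torch

# A scalar tensor that tracks the operations applied to it.
x = torch.tensor(3.0, requires_grad=True)

# y = x^2, so analytically dy/dx = 2x.
y = x ** 2

# Backpropagate from y; the gradient is accumulated into x.grad.
y.backward()

print(x.grad)  # dy/dx evaluated at x = 3
```

The gradient stored in `x.grad` should match the hand-computed derivative `2 * 3 = 6`.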

#### Initializing a Tensor

There are several ways to make a tensor:

In [2]:
data = [[1,2,3],[4,5,6]]
x_data = torch.tensor(data)
print(x_data)

tensor([[1, 2, 3],
        [4, 5, 6]])


In [3]:
np_data = np.arange(6).reshape(2,3)
x_np = torch.tensor(np_data)
print(x_np)

tensor([[0, 1, 2],
        [3, 4, 5]], dtype=torch.int32)
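
One detail worth sketching here: `torch.tensor` copies the `numpy` data, while `torch.from_numpy` (another way to build a tensor from an array) shares memory with it. The snippet below is a small illustration under that assumption; `x_copy` and `x_shared` are names I made up for the comparison.

```python
import numpy as np
import torch

np_data = np.arange(6).reshape(2, 3)

x_copy = torch.tensor(np_data)        # makes an independent copy of the data
x_shared = torch.from_numpy(np_data)  # shares memory with np_data

# Modify the NumPy array in place after both tensors exist.
np_data[0, 0] = 100

print(x_copy[0, 0])    # still the original value
print(x_shared[0, 0])  # reflects the in-place change
```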


In [4]:
x_ones = torch.ones_like(x_data)
print(x_ones)

x_rand = torch.rand_like(x_data, dtype=torch.float)
print(x_rand)

tensor([[1, 1, 1],
        [1, 1, 1]])
tensor([[0.7469, 0.3134, 0.7703],
        [0.1933, 0.7184, 0.0972]])


Note that we have a problem if we don't convert the `dtype` in the `torch.rand_like` function:

In [5]:
x_rand_test = torch.rand_like(x_data)

RuntimeError: "check_uniform_bounds" not implemented for 'Long'

This appears to be because the initial tensor `x_data` has the datatype `Long` (64-bit integer, according to the [documentation](https://pytorch.org/docs/stable/tensor_attributes.html)), and `torch.rand_like` has no way to sample a random number from the interval `[0,1)` for an integer datatype.
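
As a quick sanity check of that explanation, the sketch below inspects the `dtype` of `x_data` and shows two ways around the error: passing a floating-point `dtype`, or converting the template tensor first. The name `x_rand_alt` is my own, and I'm assuming the integer tensor is built from Python ints as in the earlier cell.

```python
import torch

x_data = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(x_data.dtype)                # integer tensors built from Python ints are 64-bit (Long)
print(x_data.is_floating_point())  # False, so uniform sampling in [0, 1) is not defined

# Option 1: override the dtype explicitly.
x_rand = torch.rand_like(x_data, dtype=torch.float)

# Option 2: convert the template tensor to float first.
x_rand_alt = torch.rand_like(x_data.float())

print(x_rand.dtype, x_rand_alt.dtype)
```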

We can also directly specify the shape for `torch.rand`, `torch.ones`, and `torch.zeros`:

In [10]:
shape=(2,3)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(rand_tensor,ones_tensor,zeros_tensor,sep='\n')

another_ones_tensor = torch.ones(4,5)
print(another_ones_tensor)

tensor([[0.7427, 0.3731, 0.1832],
        [0.7499, 0.2047, 0.2797]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
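
The cell above passes the shape both as a tuple and as separate integers. A small sketch (with my own variable names `a` and `b`) suggests the two forms are interchangeable:

```python
import torch

shape = (2, 3)
a = torch.ones(shape)  # shape passed as a single tuple
b = torch.ones(2, 3)   # shape passed as separate integers

print(a.shape == b.shape)  # same shape either way
print(torch.equal(a, b))   # same contents as well
print(a.dtype)             # floating-point by default, unlike ones_like on an integer tensor
```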


There are several attributes of tensors that we can check:

In [12]:
tensor = torch.rand(3,4)

print(tensor.shape)
print(tensor.dtype)
print(tensor.device)

torch.Size([3, 4])
torch.float32
cpu
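
A few related checks, sketched here as a supplement to the cell above: `shape` behaves like a tuple, `size()` returns the same thing, and `ndim` and `numel()` give the number of dimensions and elements. None of these appear in the original cell; they are just ones I know exist.

```python
import torch

tensor = torch.rand(3, 4)

print(tuple(tensor.shape))           # torch.Size is a tuple subclass
print(tensor.size() == tensor.shape) # size() is the method form of shape
print(tensor.ndim)                   # number of dimensions
print(tensor.numel())                # total number of elements
print(tensor.device.type)            # device type as a plain string
```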


#### Operations on Tensors

If you're observant, you'll notice that the tensor above is stored on the CPU! It turns out that by default, all tensors are created with `cpu` as their device. I'm on a Makerspace computer that comes with an NVIDIA GPU that can use CUDA, so we'll want to move the tensor to the GPU if possible. We can first check whether CUDA is available and then move the tensor there using the `.to` method.

In [13]:
if torch.cuda.is_available():
    tensor = tensor.to('cuda')

print(tensor)

tensor([[0.2022, 0.7543, 0.1385, 0.5809],
        [0.0607, 0.5798, 0.9910, 0.2934],
        [0.2830, 0.0540, 0.9153, 0.2379]], device='cuda:0')


Looks better now!
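
A common variation on the cell above, sketched under the assumption that we want code that runs with or without a GPU: pick the device once, then either move an existing tensor or create a new one directly on it. The names `device`, `moved`, and `direct` are mine.

```python
import torch

# Fall back to the CPU when CUDA is not available.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

tensor = torch.rand(3, 4)                 # created on the CPU by default
moved = tensor.to(device)                 # .to returns a tensor on `device`
direct = torch.rand(3, 4, device=device)  # created on `device` from the start

print(tensor.device, moved.device, direct.device)
```

Note that `.to` does not change `tensor` in place, which is why the original cell reassigns the result back to `tensor`.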

Tensors can be operated on much like `numpy` arrays, with standard indexing and slicing.

In [24]:
tensor = torch.rand(4,4)
print(tensor)

print(tensor[0])
print(tensor[0:1])
print(tensor[:,0])
print(tensor[:,-1])

tensor([[0.3103, 0.9059, 0.9750, 0.2465],
        [0.1227, 0.3771, 0.5851, 0.5742],
        [0.5704, 0.9578, 0.3762, 0.6578],
        [0.5233, 0.2576, 0.3134, 0.2994]])
tensor([0.3103, 0.9059, 0.9750, 0.2465])
tensor([[0.3103, 0.9059, 0.9750, 0.2465]])
tensor([0.3103, 0.1227, 0.5704, 0.5233])
tensor([0.2465, 0.5742, 0.6578, 0.2994])


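Note that, as with `numpy` arrays, `tensor[0]` and `tensor[0:1]` have different dimensionalities: the first is a 1-D tensor of shape `(4,)`, while the second keeps the row dimension and has shape `(1, 4)`. We can also do standard concatenation along a given `dim` (not `axis`) using `torch.cat`.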

In [26]:
t0 = torch.cat([tensor]*3, dim=0)
t1 = torch.cat([tensor]*3, dim=1)

print(t0,t1,sep='\n ----- \n')

tensor([[0.3103, 0.9059, 0.9750, 0.2465],
        [0.1227, 0.3771, 0.5851, 0.5742],
        [0.5704, 0.9578, 0.3762, 0.6578],
        [0.5233, 0.2576, 0.3134, 0.2994],
        [0.3103, 0.9059, 0.9750, 0.2465],
        [0.1227, 0.3771, 0.5851, 0.5742],
        [0.5704, 0.9578, 0.3762, 0.6578],
        [0.5233, 0.2576, 0.3134, 0.2994],
        [0.3103, 0.9059, 0.9750, 0.2465],
        [0.1227, 0.3771, 0.5851, 0.5742],
        [0.5704, 0.9578, 0.3762, 0.6578],
        [0.5233, 0.2576, 0.3134, 0.2994]])
 ----- 
tensor([[0.3103, 0.9059, 0.9750, 0.2465, 0.3103, 0.9059, 0.9750, 0.2465, 0.3103,
         0.9059, 0.9750, 0.2465],
        [0.1227, 0.3771, 0.5851, 0.5742, 0.1227, 0.3771, 0.5851, 0.5742, 0.1227,
         0.3771, 0.5851, 0.5742],
        [0.5704, 0.9578, 0.3762, 0.6578, 0.5704, 0.9578, 0.3762, 0.6578, 0.5704,
         0.9578, 0.3762, 0.6578],
        [0.5233, 0.2576, 0.3134, 0.2994, 0.5233, 0.2576, 0.3134, 0.2994, 0.5233,
         0.2576, 0.3134, 0.2994]])
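
Printing shapes instead of values makes the indexing and concatenation behavior easier to see. This is a small sketch that reuses the same variable names on a fresh random 4x4 tensor:

```python
import torch

tensor = torch.rand(4, 4)

# Integer indexing drops a dimension; slicing keeps it.
print(tensor[0].shape)    # torch.Size([4])
print(tensor[0:1].shape)  # torch.Size([1, 4])

# dim=0 stacks copies as extra rows, dim=1 as extra columns.
t0 = torch.cat([tensor] * 3, dim=0)
t1 = torch.cat([tensor] * 3, dim=1)
print(t0.shape)  # torch.Size([12, 4])
print(t1.shape)  # torch.Size([4, 12])
```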


We also have standard arithmetic operations. Three ways to do matrix multiplication are shown below: the `@` operator, the `.matmul` method, and `torch.matmul` with an `out` argument. `y1`, `y2`, and `y3` should have the same value. Note that `.T` transposes the matrix.

In [40]:
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)
y3 = torch.rand(tensor.shape)
torch.matmul(tensor, tensor.T, out=y3)

tensor([[1.9284, 1.0916, 1.5737, 0.7751],
        [1.0916, 0.8293, 1.0290, 0.5166],
        [1.5737, 1.0290, 1.8171, 0.8601],
        [0.7751, 0.5166, 0.8601, 0.5281]])
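
To actually confirm that the three results agree, a sketch like the following can be used. Here I preallocate `y3` with `torch.empty_like` rather than `torch.rand`, which is my own substitution; all that matters is that the `out` tensor has a matching shape and dtype.

```python
import torch

tensor = torch.rand(4, 4)

y1 = tensor @ tensor.T               # operator form
y2 = tensor.matmul(tensor.T)         # method form
y3 = torch.empty_like(y1)            # preallocated output buffer
torch.matmul(tensor, tensor.T, out=y3)  # function form, writing into y3

# Compare with a tolerance, since these are floating-point results.
print(torch.allclose(y1, y2), torch.allclose(y1, y3))
```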

If you want to do element-wise multiplication instead, you can use `*` or `mul`.

In [41]:
z1 = tensor * tensor
z2 = tensor.mul(tensor)
z3 = torch.rand(tensor.shape) # note that using torch.rand_like also works
torch.mul(tensor, tensor, out=z3)

tensor([[0.0963, 0.8207, 0.9507, 0.0607],
        [0.0151, 0.1422, 0.3423, 0.3297],
        [0.3254, 0.9174, 0.1416, 0.4327],
        [0.2739, 0.0664, 0.0982, 0.0896]])
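
The same kind of check works for the element-wise versions. The sketch below also compares against squaring each entry, and contrasts the result with the matrix product, assuming a fresh random 4x4 tensor:

```python
import torch

tensor = torch.rand(4, 4)

z1 = tensor * tensor
z2 = tensor.mul(tensor)
z3 = torch.rand_like(tensor)        # buffer that torch.mul overwrites
torch.mul(tensor, tensor, out=z3)

print(torch.allclose(z1, z2), torch.allclose(z1, z3))

# Element-wise multiplication of a tensor with itself squares each entry.
print(torch.allclose(z1, tensor ** 2))

# The shape matches the matrix product here, but the values generally do not.
print((tensor @ tensor).shape == z1.shape)
```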