# Intro to PyTorch
## 1. Tensors
PyTorch operates on a data structure known as the "tensor".  The term usually refers to a higher-dimensional generalization of a matrix (i.e., an array with 3 or more dimensions), but in PyTorch, an array of any dimensionality can be represented as a tensor:

In [10]:
import torch

#All are welcome in the world of PyTorch tensors!
print(torch.ones(1)) #A scalar?
print(torch.ones(10)) #A vector?
print(torch.ones(3,3)) #A matrix?
print(torch.ones(3,2,1)) #A 3D tensor?

tensor([1.])
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[[1.],
         [1.]],

        [[1.],
         [1.]],

        [[1.],
         [1.]]])
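
One subtlety in the cell above: `torch.ones(1)` is a one-element 1D tensor rather than a true scalar.  The short sketch below uses the standard `ndim` attribute to show the number of axes of each tensor, alongside a 0-dimensional tensor built with `torch.tensor(1.0)`.  The dictionary keys and variable names are only illustrative.

```python
import torch

# Map a description of each tensor to the tensor itself
examples = {
    "torch.tensor(1.0)": torch.tensor(1.0),   # a true 0D scalar
    "torch.ones(1)": torch.ones(1),           # 1D tensor holding one element
    "torch.ones(10)": torch.ones(10),         # 1D vector
    "torch.ones(3,3)": torch.ones(3, 3),      # 2D matrix
    "torch.ones(3,2,1)": torch.ones(3, 2, 1), # 3D tensor
}

# ndim reports the number of axes of each tensor
dims = {name: t.ndim for name, t in examples.items()}
for name, n in dims.items():
    print(f"{name}: ndim={n}")

# .item() extracts the Python number from a single-element tensor
scalar_value = examples["torch.tensor(1.0)"].item()
```

Either form can be converted to a plain Python number with `.item()`, as long as the tensor holds exactly one element.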


Tensors must contain entries of a single data type, and the size of each axis must be consistent throughout the tensor (i.e., a tensor is a rectangular *array* of *arrays*, not a Python-style *list* of *lists* whose rows may differ in length).  Much like numpy arrays, tensors can be easily constructed from Python lists and numpy arrays:

In [2]:
import numpy as np

print(torch.tensor([[i for i in range(10)] for j in range(10)]))
print(torch.tensor(np.ones((10,10))))

tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
tensor([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]], dtype=torch.float64)
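
To make the two constraints above concrete, here is a minimal sketch (variable names are illustrative).  It shows that a ragged nested list is rejected, that mixed Python ints and floats are promoted to one dtype, and that `torch.tensor` copies a numpy array while `torch.from_numpy` shares its memory.

```python
import numpy as np
import torch

# Rows of different lengths cannot form a rectangular tensor
ragged = [[1, 2, 3], [4, 5]]
try:
    torch.tensor(ragged)
    ragged_ok = True
except (ValueError, TypeError) as err:
    ragged_ok = False
    print("Ragged list rejected:", err)

# Mixed ints and floats are promoted to a single floating-point dtype
mixed = torch.tensor([1, 2.5])
print(mixed.dtype)

# torch.tensor copies the data; torch.from_numpy shares memory with the array
arr = np.ones((2, 2))
copied = torch.tensor(arr)
shared = torch.from_numpy(arr)
arr[0, 0] = 5.0  # modify the numpy array after conversion
print(copied[0, 0].item())  # unchanged, because it is a copy
print(shared[0, 0].item())  # reflects the change, because memory is shared
```

The copy is the safer default; the shared view avoids duplicating large arrays but ties the tensor to later changes in the numpy array.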


You can easily create tensors of a given size using the following functions:

In [12]:
print(torch.zeros(3,3)) #Make a tensor of zeros
print(torch.ones(3,3)) #Make a tensor of ones
print(torch.rand(3,3)) #Make a tensor of random numbers between 0 and 1
print(torch.empty(3,3)) #Make a tensor of uninitialized values (might be zeros, might be weird numbers)

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[0.9906, 0.3294, 0.7882],
        [0.9233, 0.6241, 0.4668],
        [0.3117, 0.3975, 0.9637]])
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
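
These constructors also accept a `dtype` argument, and `torch.manual_seed` makes `torch.rand` reproducible.  The sketch below is one way to use them; the seed value `0` and the variable names are arbitrary choices for illustration.

```python
import torch

# Request a specific data type at creation time
ints = torch.zeros(3, 3, dtype=torch.int64)

# Seeding the generator makes random tensors repeatable
torch.manual_seed(0)
first = torch.rand(3, 3)
torch.manual_seed(0)
second = torch.rand(3, 3)
print(torch.equal(first, second))  # identical draws after re-seeding

# The *_like functions copy the shape and dtype of an existing tensor
like = torch.ones_like(ints)
print(like.shape, like.dtype)
```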


A few attributes of a tensor:

In [13]:
a = torch.ones(5,5)

print(a.size()) #Get the size of each axis
print(a.dtype) #Get the data type of the entries
print(a.device) #Get which device the tensor is on

torch.Size([5, 5])
torch.float32
cpu
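
A few related accessors are often useful alongside these.  The sketch below assumes the same 5x5 tensor and shows `shape` (equivalent to `size()`), the size of a single axis, the total element count, and a dtype conversion with `.to`.

```python
import torch

a = torch.ones(5, 5)

print(a.shape)    # same information as a.size()
print(a.size(0))  # size of the first axis only
print(a.numel())  # total number of elements

# .to can also convert the data type; it returns a new tensor
a_double = a.to(torch.float64)
print(a.dtype, a_double.dtype)
```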


But hold on... what's that last one again?  What does it mean to have a tensor on a device?  The answer is that a key difference between numpy arrays and torch tensors is that tensors can be put on a GPU, allowing mathematical operations involving them to be GPU accelerated!  For that to happen, the following has to return `True`:

In [15]:
torch.cuda.is_available()

False

If this returns `True`, congratulations!  You're ready to accelerate using the GPU!  If it returns `False`, you either don't have a GPU, or don't have CUDA or PyTorch set up in a way that allows for GPU acceleration.  Getting this set up is highly system specific and beyond the scope of this tutorial.  Once it is set up, you can take advantage of your GPU by moving your tensor to GPU memory:

In [18]:
if torch.cuda.is_available():
    a = a.to("cuda") #.to() returns a new tensor rather than moving a in place, so reassign it
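
A common way to write code that runs with or without a GPU is to pick the device once and reuse it.  The following is a minimal sketch of that pattern, not the only way to do it; the names `device`, `moved`, `b`, and `total` are illustrative.

```python
import torch

# Choose the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(5, 5)

# .to returns a tensor on the target device (the same tensor if it is
# already there); it does not modify a in place, so keep the result
moved = a.to(device)

# Tensors can also be created directly on a device
b = torch.ones(5, 5, device=device)

# Both operands live on the same device, so this works on CPU or GPU
total = moved + b
print(total.device)
```

Use `.cpu()` to bring a result back to CPU memory, for example before converting it to a numpy array.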

Note that all created tensors are stored on the CPU by default unless specified otherwise on initialization (e.g., by specifying `device="cuda"` when the tensor is made).  Also, operations between tensors require that the tensors be on the same device; you'll get an error otherwise.  So what operations can you perform on tensors?

In [35]:
a = torch.rand(5,5)
b = torch.rand(5,5)
print(a)
print(b)

print(a+b) #Elementwise addition
print(a*b) #Elementwise multiplication
print(a[2:4,]) #Slice out rows
print(b[:,3:5]) #Slice out columns
print(torch.matmul(a,b)) #Matrix multiplication
print(torch.cat((a,b),dim=1)) #Tensor concatenation along columns

#In place addition, notice the underscore and that the tensor has been permanently changed
a.add_(5) 
print(a) 

tensor([[0.2558, 0.4919, 0.3648, 0.8303, 0.1401],
        [0.2633, 0.1573, 0.1584, 0.9556, 0.5556],
        [0.1617, 0.5476, 0.4154, 0.6731, 0.1571],
        [0.8064, 0.8664, 0.0642, 0.0574, 0.3945],
        [0.5852, 0.8945, 0.9340, 0.9404, 0.4619]])
tensor([[0.4583, 0.1443, 0.2191, 0.3032, 0.6162],
        [0.2973, 0.8497, 0.0771, 0.0942, 0.3936],
        [0.6465, 0.1874, 0.3082, 0.3536, 0.2133],
        [0.7136, 0.7007, 0.9473, 0.9918, 0.1927],
        [0.6428, 0.7178, 0.4280, 0.6711, 0.3129]])
tensor([[0.7141, 0.6362, 0.5839, 1.1335, 0.7563],
        [0.5607, 1.0071, 0.2354, 1.0498, 0.9492],
        [0.8082, 0.7350, 0.7237, 1.0267, 0.3704],
        [1.5200, 1.5671, 1.0115, 1.0493, 0.5872],
        [1.2280, 1.6123, 1.3620, 1.6115, 0.7748]])
tensor([[0.1172, 0.0710, 0.0799, 0.2517, 0.0863],
        [0.0783, 0.1337, 0.0122, 0.0900, 0.2187],
        [0.1045, 0.1026, 0.1281, 0.2380, 0.0335],
        [0.5754, 0.6071, 0.0608, 0.0570, 0.0760],
        [0.3762, 0.6420, 0.3997, 0.6311, 0.1445
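
Random inputs make these results hard to check by eye, so here is a small sketch of the same operations on hand-picked 2x2 tensors whose results can be verified by hand.  The values and variable names are chosen purely for illustration.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[10., 20.], [30., 40.]])

elementwise = a * b            # multiplies matching entries
matrix = torch.matmul(a, b)    # rows of a times columns of b; a @ b is equivalent
print(elementwise)
print(matrix)

# Concatenation joins along an existing axis
side_by_side = torch.cat((a, b), dim=1)  # shape (2, 4)
stacked = torch.cat((a, b), dim=0)       # shape (4, 2)
print(side_by_side.shape, stacked.shape)

# Slicing works as in numpy
second_column = a[:, 1]
print(second_column)

# Methods ending in an underscore modify the tensor in place
c = a.clone()  # clone first so a keeps its original values
c.add_(5)
print(c)
```

Elementwise multiplication and matrix multiplication give different results even for square tensors, which is an easy source of bugs.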

## 2. Datasets and DataLoaders
Now, let's begin actually training a model.  For this tutorial, we will be training a neural network to predict the secondary structure of a protein given its evolutionary 