# Introduction to PyTorch

Before reading this introduction, you should know a bit of:
1. Python: see the [official tutorial](https://docs.python.org/3/tutorial/)
2. Linear algebra and matrices: see the [Coursera tutorial](https://www.coursera.org/learn/linear-algebra-machine-learning) and/or the book [Introduction to Applied Linear Algebra](http://vmls-book.stanford.edu/vmls.pdf)



<hr>


On the official PyTorch page we can read that:


PyTorch is a Python-based scientific computing package targeted at two sets of audiences:

* a replacement for NumPy to use the power of GPUs
* a deep learning research platform that provides maximum flexibility and speed

<hr>


Unlike NumPy arrays, PyTorch tensors can live and be computed on a **GPU** as well as on the CPU. PyTorch represents an n-dimensional array as a `Tensor` object. To install the PyTorch library, follow the [official installation guide](https://pytorch.org/get-started/locally/). There are also some very good tutorials:
* [Official PyTorch tutorials](https://pytorch.org/tutorials/)
* [Deep Learning for Natural Language Processing with Pytorch](https://github.com/rguthrie3/DeepLearningForNLPInPytorch/blob/master/Deep%20Learning%20for%20Natural%20Language%20Processing%20with%20Pytorch.ipynb) 

Here we want to give you a quick crash course in using the PyTorch library, especially the `Tensor` object.
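As a minimal sketch of the GPU point above, the snippet below picks a device and moves a tensor to it. The names `device`, `cpu_tensor` and `dev_tensor` are illustrative, and the code simply falls back to the CPU when PyTorch cannot see a CUDA device.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cpu_tensor = torch.ones(2, 3)        # created on the CPU unless told otherwise
dev_tensor = cpu_tensor.to(device)   # a tensor on the chosen device

print(cpu_tensor.device)
print(dev_tensor.device)
```

Operations between tensors generally expect both operands to be on the same device, so it is common to create `device` once and reuse it.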

## 1.1 Basics: creating a PyTorch tensor

Important notes:
* all items in a PyTorch array (a.k.a. `Tensor`) share a single data type, e.g. `int8`, `float32`, ... ([all data types](https://pytorch.org/docs/stable/tensors.html))
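To illustrate the single-data-type rule, here is a small sketch of how the dtype is inferred when none is given and how to convert it afterwards. It assumes PyTorch's default floating-point dtype has not been changed; the variable names are illustrative.

```python
import torch

ints = torch.tensor([1, 2, 3])          # Python ints are inferred as int64
floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats use the default float32
mixed = torch.tensor([1, 2.5])          # mixed input is promoted to one float dtype

print(ints.dtype, floats.dtype, mixed.dtype)

# Converting returns a new tensor with the requested dtype.
as_float = ints.to(torch.float32)
print(as_float, as_float.dtype)
```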

In [1]:
import torch

In [66]:
# creating tensor from python list

print("1d Tensor from Python list (with 'int32' type)")
list1d = [0, 1, 2, 3, 4, 5, 6, 7]
tensor1d = torch.tensor(list1d, dtype=torch.int32)
print(tensor1d)
print(tensor1d.size())
print()


print("2d Tensor from Python list of lists  (with `float32` type)")
list2d = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14, 15]]
tensor2d = torch.tensor(list2d, dtype=torch.float32)
print(tensor2d)
print(tensor2d.size())

1d Tensor from Python list (with 'int32' type)
tensor([0, 1, 2, 3, 4, 5, 6, 7], dtype=torch.int32)
torch.Size([8])

2d Tensor from Python list of lists  (with `float32` type)
tensor([[ 0.,  1.],
        [ 2.,  3.],
        [ 4.,  5.],
        [ 6.,  7.],
        [ 8.,  9.],
        [10., 11.],
        [12., 13.],
        [14., 15.]])
torch.Size([8, 2])


In [36]:
# creating tensor using distribution

print("2d random tensor (with `float32` type)")
tensor2d_random = torch.rand(8, 2, dtype=torch.float32)
print(tensor2d_random)
print(tensor2d_random.size())
print()


print("2d tensor with uniform distribution")
tensor2d_uniform = torch.FloatTensor(8, 1).uniform_(-10, 10)
print(tensor2d_uniform)
print(tensor2d_uniform.size())

2d random tensor (with `float32` type)
tensor([[0.6430, 0.3873],
        [0.1354, 0.0599],
        [0.7824, 0.7412],
        [0.8572, 0.6942],
        [0.3523, 0.6777],
        [0.3822, 0.2264],
        [0.4619, 0.0919],
        [0.4787, 0.5517]])
torch.Size([8, 2])

2d tensor with uniform distribution
tensor([[ 6.0738],
        [-9.7094],
        [ 5.7095],
        [ 2.4143],
        [-3.3079],
        [-2.7146],
        [ 6.3315],
        [ 3.7597]])
torch.Size([8, 1])


In [34]:
# creating tensor using linspace and arange

print("1d tensor based on linearly spaced vector")
tensor1d_linspace = torch.linspace(0, 7, steps=8, dtype=torch.float32)
print(tensor1d_linspace)
print(tensor1d_linspace.size())
print()


print("1d tensor based on `arange` mechanism")
tensor1d_arange = torch.arange(0, 10, 3)
print(tensor1d_arange)
print(tensor1d_arange.size())

1d tensor based on linearly spaced vector
tensor([0., 1., 2., 3., 4., 5., 6., 7.])
torch.Size([8])

1d tensor based on `arange` mechanism
tensor([0, 3, 6, 9])
torch.Size([4])


In [33]:
# creating tensors filled with zeros and ones

print("2d zeros tensor")
torch2d_zeros = torch.zeros([2, 4])
print(torch2d_zeros)
print(torch2d_zeros.size())
print()

print("2d ones tensor")
torch2d_ones = torch.ones([2, 4])
print(torch2d_ones)
print(torch2d_ones.size())


2d zeros tensor
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.]])
torch.Size([2, 4])

2d ones tensor
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])
torch.Size([2, 4])


## 1.2 Basics: extracting specific values from tensors

Important notes:
* a tensor can be indexed using the standard Python `x[obj]` syntax, where `x` is the tensor and `obj` is the selection
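One detail worth knowing about the `x[obj]` syntax is whether the result shares memory with the original tensor. The sketch below, using an illustrative tensor `x`, shows that a basic slice is a view while boolean-mask indexing produces a copy.

```python
import torch

x = torch.arange(8)

# Negative indices count from the end.
last = x[-1]

# A basic slice is a view: writing through it changes `x` as well.
head = x[:3]
head[0] = 100

# Boolean-mask indexing returns a copy, so `x` is left untouched here.
evens = x[x % 2 == 0]
evens[0] = -1

print(last)
print(x)
print(evens)
```

If an independent copy of a slice is needed, calling `.clone()` on it avoids accidental writes to the original.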

In [30]:
print(tensor1d)
print(tensor2d)

tensor([0, 1, 2, 3, 4, 5, 6, 7], dtype=torch.int32)
tensor([[ 0.,  1.],
        [ 2.,  3.],
        [ 4.,  5.],
        [ 6.,  7.],
        [ 8.,  9.],
        [10., 11.],
        [12., 13.],
        [14., 15.]])


In [38]:
# single element

print("Get specific element")
print(tensor2d[1,1])

Get specific element
tensor(3.)


In [39]:
# slicing using i:j:k where i is the start index, j is the stop index (exclusive), k is the step

print("The basic slice syntax is i:j:k")
print(tensor1d[0:6:2])

The basic slice syntax is i:j:k
tensor([0, 2, 4], dtype=torch.int32)


In [41]:
# slicing the first column from an 8x2 tensor

print("Extract only one dimension from multidimensional tensor")
print(tensor2d[:, 0])
print(tensor2d[:, 0].shape)

Extract only one dimension from multidimensional tensor
tensor([ 0.,  2.,  4.,  6.,  8., 10., 12., 14.])
torch.Size([8])


In [44]:
# indexing using a boolean mask

print("Boolean array indexing") 
print(tensor1d[([True, False, True, False, True, False, True, False])])

Boolean array indexing
tensor([0, 2, 4, 6], dtype=torch.int32)


In [46]:
# slicing with a condition

print("Using condition statement for indexing array") 
print(tensor1d[(tensor1d % 2 == 0)])

Using condition statement for indexing array
tensor([0, 2, 4, 6], dtype=torch.int32)


## 1.2 NaN and infinity representation

In [48]:
# represent nan in tensor

print("represent `not a number value`")
print(torch.tensor(float('nan')))

represent `not a number value`
tensor(nan)


In [49]:
# represent inf in tensor

print("represent `infinite`")
print(torch.tensor(float('Inf')))

represent `infinite`
tensor(inf)
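Once a tensor can contain these special values, it is useful to be able to detect them. The following is a small sketch using `torch.isnan`, `torch.isinf` and `torch.isfinite` on an illustrative tensor named `values`.

```python
import torch

values = torch.tensor([1.0, float("nan"), float("inf"), -float("inf")])

print(torch.isnan(values))
print(torch.isinf(values))
print(torch.isfinite(values))

# NaN is not equal to anything, including itself, so use isnan rather than ==.
nan = torch.tensor(float("nan"))
print(nan == nan)
```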


## 1.3 Basics: sum, min, max, mean, reshape

In [50]:
print(tensor2d)

tensor([[ 0.,  1.],
        [ 2.,  3.],
        [ 4.,  5.],
        [ 6.,  7.],
        [ 8.,  9.],
        [10., 11.],
        [12., 13.],
        [14., 15.]])


In [52]:
print("calculate mean, max and min in tensor")
print("sum ", tensor2d.sum())
print("max ", tensor2d.max())
print("min ", tensor2d.min())
print("mean ", tensor2d.mean())

calculate mean, max and min in tensor
sum  tensor(120.)
max  tensor(15.)
min  tensor(0.)
mean  tensor(7.5000)


In [55]:
print("calculate max on different axis")
print("column max: ", tensor2d.max(dim=0))
print()
print("row max: ", tensor2d.max(dim=1)[0])

calculate max on different axis
column max:  torch.return_types.max(
values=tensor([14., 15.]),
indices=tensor([7, 7]))

row max:  tensor([ 1.,  3.,  5.,  7.,  9., 11., 13., 15.])
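The output above shows that `max` along a dimension returns both values and indices. As a short sketch on a smaller illustrative tensor `t`, the fields can be read by name, and other reductions such as `sum` and `mean` accept the same `dim` argument.

```python
import torch

t = torch.tensor([[0., 1.], [2., 3.], [4., 5.]])

# max along a dimension returns a named tuple of values and indices.
col_max = t.max(dim=0)
print(col_max.values)
print(col_max.indices)

# sum and mean return a plain tensor; keepdim=True keeps the reduced axis as size 1.
row_sum = t.sum(dim=1)
row_mean = t.mean(dim=1, keepdim=True)
print(row_sum, row_sum.shape)
print(row_mean.shape)
```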


## 1.4 Reshaping and resizing

Resizing or reshaping a tensor is one of the most frequently used tensor operations. Interestingly, there are several ways to achieve the same result:

`view`, `reshape` and `resize_`
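These methods are not fully interchangeable. The sketch below, which uses a small illustrative tensor named `base` rather than the notebook's own variables, shows one practical difference: `view` requires a compatible memory layout, whereas `reshape` copies the data when a view is not possible.

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.t()              # shape (3, 2), shares memory, not contiguous
print(transposed.is_contiguous())

# view needs a compatible memory layout and raises an error here.
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print("view failed:", view_failed)

# reshape falls back to copying when a view is not possible.
flat = transposed.reshape(6)
print(flat)

# On a contiguous tensor, view shares memory with the original.
alias = base.view(6)
alias[0] = 99
print(base[0, 0])
```

A reasonable rule of thumb is to use `reshape` unless sharing memory with the original tensor is explicitly required.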

In [56]:
tensor2d

tensor([[ 0.,  1.],
        [ 2.,  3.],
        [ 4.,  5.],
        [ 6.,  7.],
        [ 8.,  9.],
        [10., 11.],
        [12., 13.],
        [14., 15.]])

In [63]:
# view

print("reshape 2d tensor using view")
newtensor2d = tensor2d.view(4, 4)
print(newtensor2d)
print(newtensor2d.size())
print()

print("reshape 2d tensor using view 2")
newtensor2d = tensor2d.view(4, -1)  # the second dimension is inferred from the number of elements
print(newtensor2d)
print(newtensor2d.size())

reshape 2d tensor using view
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])
torch.Size([4, 4])

reshape 2d tensor using view 2
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])
torch.Size([4, 4])


In [61]:
# reshape

print("reshape 2d tensor using reshape")
newtensor2d = tensor2d.reshape(4, 4)
print(newtensor2d)
print(newtensor2d.size())

reshape 2d tensor using reshape
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])
torch.Size([4, 4])


In [65]:
# resize
# this method works in place: it permanently changes the size of the input tensor

print("reshape 2d tensor using resize")
newtensor2d = tensor2d.resize_(4, 4)
print(newtensor2d)
print(newtensor2d.size())

print(tensor2d)

reshape 2d tensor using resize
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])
torch.Size([4, 4])
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])


## 1.5 Basic math operations (plus, minus, times, divide)

In [73]:
print("Initialize tensor, x1")
x1 = torch.ones([3, 3], dtype=torch.float32)
x1[:, 0] = torch.tensor([4., 5., 6.])
print(x1)

Initialize tensor, x1
tensor([[4., 1., 1.],
        [5., 1., 1.],
        [6., 1., 1.]])


In [74]:
print("Transpose x1 tensor")
print(x1.transpose(0, 1))

Transpose x1 tensor
tensor([[4., 5., 6.],
        [1., 1., 1.],
        [1., 1., 1.]])


In [75]:
print("Multiply by scalar, x2=x1*3")
x2 = x1*3.
print(x2)

Multiply by scalar, x2=x1*3
tensor([[12.,  3.,  3.],
        [15.,  3.,  3.],
        [18.,  3.,  3.]])


In [76]:
print("Element-wise sum, x1+x2")
print(x1+x2)

Element-wise sum, x1+x2
tensor([[16.,  4.,  4.],
        [20.,  4.,  4.],
        [24.,  4.,  4.]])


In [77]:
print("Element-wise subtract, x1-x2")
print(x1-x2)

Element-wise subtract, x1-x2
tensor([[ -8.,  -2.,  -2.],
        [-10.,  -2.,  -2.],
        [-12.,  -2.,  -2.]])


In [78]:
print("Element-wise product, x1*x2")
print(x1*x2)

Element-wise product, x1*x2
tensor([[ 48.,   3.,   3.],
        [ 75.,   3.,   3.],
        [108.,   3.,   3.]])


In [79]:
print("Element-wise divide, x1/x2")
print(x1/x2)

Element-wise divide, x1/x2
tensor([[0.3333, 0.3333, 0.3333],
        [0.3333, 0.3333, 0.3333],
        [0.3333, 0.3333, 0.3333]])


In [80]:
print("Element-wise power, x2^2")
print(torch.pow(x2, 2))

Element-wise power, x2^2
tensor([[144.,   9.,   9.],
        [225.,   9.,   9.],
        [324.,   9.,   9.]])


In [81]:
print("Element-wise square root, sqrt(x2)")
print(torch.sqrt(x2))

Element-wise square root, sqrt(x2)
tensor([[3.4641, 1.7321, 1.7321],
        [3.8730, 1.7321, 1.7321],
        [4.2426, 1.7321, 1.7321]])


In [82]:
print("Matrix multiplication, x1 @ x2")
print(x1.mm(x2))

Matrix multiplication, x1 @ x2
tensor([[ 81.,  18.,  18.],
        [ 93.,  21.,  21.],
        [105.,  24.,  24.]])


In [83]:
print("Matrix-vector multiplication, x1 @ x2[0]")
print(x1.mm(x2[0].view([-1, 1])).view(-1))

Matrix-vector multiplication, x1 @ x2[0]
tensor([54., 66., 78.])
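The same products can be written more compactly with the `@` operator. This sketch rebuilds `x1` and `x2` with the values used above so that it runs on its own.

```python
import torch

x1 = torch.tensor([[4., 1., 1.], [5., 1., 1.], [6., 1., 1.]])
x2 = x1 * 3.

# `@` performs matrix multiplication; `*` stays element-wise.
product = x1 @ x2
print(torch.equal(product, x1.mm(x2)))

# Matrix-vector product without reshaping the vector by hand.
vec = x1 @ x2[0]
print(vec)
```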


## 1.6 Broadcasting

In short, if a PyTorch operation supports broadcasting, its tensor arguments can be automatically expanded to equal sizes (see more in the [broadcasting notes](https://pytorch.org/docs/stable/notes/broadcasting.html)).
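As a minimal sketch of the rule, shapes are compared from the last dimension backwards, and two sizes are compatible when they are equal or one of them is 1. The tensors `col` and `row` below are illustrative.

```python
import torch

col = torch.arange(3).view(3, 1)   # shape (3, 1)
row = torch.arange(4).view(1, 4)   # shape (1, 4)

# Size-1 dimensions are stretched to match, giving a (3, 4) result.
table = col + row
print(table)
print(table.shape)

# Trailing sizes 2 and 3 differ and neither is 1, so this cannot be broadcast.
try:
    torch.ones(3, 2) + torch.ones(3)
    broadcast_ok = True
except RuntimeError:
    broadcast_ok = False
print("(3, 2) + (3,) broadcasts:", broadcast_ok)
```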

In [90]:
print("Add vector x2 to each row of matrix x1 using broadcasting mechanism")
x1 = torch.tensor([[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14, 15]])
x2 = torch.tensor([1, 3])
print(x1+x2)

Add vector x2 to each row of matrix x1 using broadcasting mechanism
tensor([[ 1,  4],
        [ 3,  6],
        [ 5,  8],
        [ 7, 10],
        [ 9, 12],
        [11, 14],
        [13, 16],
        [15, 18]])


## 1.7 Dimension

A good intuition for dimensionality and tensor operations in general is a huge plus. This becomes especially important for things like batching.

A tensor of shape `(n,)` and a tensor of shape `(1, n)` can hold the same values but have a different number of dimensions.
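Why this matters in practice can be seen in a short sketch: combined with broadcasting, the same values in a different shape give a different result. The names `v`, `row` and `col` are illustrative.

```python
import torch

v = torch.tensor([1., 2., 3.])   # shape (3,)
row = v.view(1, 3)               # shape (1, 3)
col = v.view(3, 1)               # shape (3, 1)

print(v.ndim, row.ndim, col.ndim)

# Same values, different shapes: adding a (3,) and a (3, 1) tensor
# broadcasts to a (3, 3) matrix instead of an element-wise sum.
print((v + v).shape)
print((v + col).shape)
```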

In [94]:
a = torch.randn(3)
print(a)
print(a.shape)
print(a.ndim)

tensor([ 0.1012, -0.9094,  0.2257])
torch.Size([3])
1


In [95]:
b = torch.randn(1,3)
print(b)
print(b.shape)
print(b.ndim)

tensor([[ 0.9801,  0.8352, -0.6674]])
torch.Size([1, 3])
2


## 1.8 Adding or reducing dimension

As mentioned earlier, the batch dimension becomes very important later on. Some PyTorch layers, most notably RNNs, even have a `batch_first` argument, which accepts a boolean value.

A common operation when dealing with inputs is `.squeeze()`, or its inverse, `.unsqueeze()`.
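A typical use is adding a batch dimension to a single sample. The sketch below uses illustrative tensors `sample` and `padded` and also shows how `squeeze` behaves with and without a dimension argument.

```python
import torch

sample = torch.zeros(3, 4)       # a single sample without a batch dimension

batch = sample.unsqueeze(0)      # shape (1, 3, 4): a batch of one
also_batch = sample[None]        # indexing with None does the same
print(batch.shape, also_batch.shape)

# squeeze() without arguments drops every size-1 dimension.
padded = torch.zeros(1, 3, 1, 4)
print(padded.squeeze().shape)
print(padded.squeeze(0).shape)

# squeeze(dim) leaves the shape unchanged if that dimension is not size 1.
print(sample.squeeze(0).shape)
```

Passing an explicit dimension to `squeeze` is usually safer, since the argument-free form would also remove a batch dimension that happens to have size 1.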

In [96]:
print(a)

tensor([ 0.1012, -0.9094,  0.2257])


In [100]:
# add a new dimension at position 0
new_a_0 = a.unsqueeze(0)
print(new_a_0)
print()

# add a new dimension at position 1
new_a_1 = a.unsqueeze(1)
print(new_a_1)

tensor([[ 0.1012, -0.9094,  0.2257]])

tensor([[ 0.1012],
        [-0.9094],
        [ 0.2257]])


In [109]:
# remove the size-1 dimension at position 0
print(new_a_0.squeeze(0))

# remove the size-1 dimension at position 1
print(new_a_1.squeeze(1))

tensor([ 0.1012, -0.9094,  0.2257])
tensor([ 0.1012, -0.9094,  0.2257])


## 1.9 Concat

Concatenation and stacking are very commonly used in deep learning.
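The two operations differ in how they treat dimensions. A minimal sketch with illustrative tensors `a` and `b`: `torch.cat` joins along an existing dimension, while `torch.stack` creates a new one.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

# cat joins along an existing dimension, so that dimension grows.
cat_rows = torch.cat((a, b), dim=0)
cat_cols = torch.cat((a, b), dim=1)
print(cat_rows.shape)
print(cat_cols.shape)

# stack inserts a new dimension, so the inputs must have identical shapes.
stacked = torch.stack((a, b), dim=0)
print(stacked.shape)
```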

In [111]:
m1 = (torch.rand(2, 3, 4) * 10).int()
m2 = (torch.rand(2, 3, 4) * 10).int()
print(m1)
print()
print(m2)

tensor([[[1, 9, 4, 7],
         [8, 8, 8, 6],
         [5, 0, 9, 8]],

        [[8, 1, 4, 1],
         [3, 9, 3, 8],
         [5, 8, 4, 4]]], dtype=torch.int32)

tensor([[[4, 6, 1, 5],
         [1, 0, 1, 8],
         [1, 6, 8, 3]],

        [[2, 7, 5, 8],
         [3, 3, 8, 8],
         [2, 4, 4, 5]]], dtype=torch.int32)


In [113]:
# concatenate along dimension 0

cat0 = torch.cat((m1, m2), 0)
print(cat0)
print(cat0.shape)

tensor([[[1, 9, 4, 7],
         [8, 8, 8, 6],
         [5, 0, 9, 8]],

        [[8, 1, 4, 1],
         [3, 9, 3, 8],
         [5, 8, 4, 4]],

        [[4, 6, 1, 5],
         [1, 0, 1, 8],
         [1, 6, 8, 3]],

        [[2, 7, 5, 8],
         [3, 3, 8, 8],
         [2, 4, 4, 5]]], dtype=torch.int32)
torch.Size([4, 3, 4])


In [114]:
# concatenate along dimension 1

cat1 = torch.cat((m1, m2), 1)
print(cat1)
print(cat1.shape)

tensor([[[1, 9, 4, 7],
         [8, 8, 8, 6],
         [5, 0, 9, 8],
         [4, 6, 1, 5],
         [1, 0, 1, 8],
         [1, 6, 8, 3]],

        [[8, 1, 4, 1],
         [3, 9, 3, 8],
         [5, 8, 4, 4],
         [2, 7, 5, 8],
         [3, 3, 8, 8],
         [2, 4, 4, 5]]], dtype=torch.int32)
torch.Size([2, 6, 4])


In [115]:
# concatenate along dimension 2

cat2 = torch.cat((m1, m2), 2)
print(cat2)
print(cat2.shape)

tensor([[[1, 9, 4, 7, 4, 6, 1, 5],
         [8, 8, 8, 6, 1, 0, 1, 8],
         [5, 0, 9, 8, 1, 6, 8, 3]],

        [[8, 1, 4, 1, 2, 7, 5, 8],
         [3, 9, 3, 8, 3, 3, 8, 8],
         [5, 8, 4, 4, 2, 4, 4, 5]]], dtype=torch.int32)
torch.Size([2, 3, 8])


### Thanks for reading and goodbye!