## Crash Course in PyTorch

### Part 0 - Introduction

#### Instructor: Tudor Cebere


![](https://www.ambient-it.net/wp-content/uploads/2018/07/Logo-PyTorch-200x175.jpg)

PyTorch is a Python library for high-performance tensor operations.

PyTorch's internals are written in C++ for memory-friendly operations, and it exposes a set of primitive routines through a high-level API in Python, chosen for the language's ease of use and beginner-friendly syntax. PyTorch also exposes its API to C++, Rust <3, Haskell and Swift, and it can be compiled to other frameworks as well.

It is mainly used from Python because of its ease of prototyping and clean syntax, with the computationally heavy operations forwarded to the C++ core.

## What is a Tensor?

We can look at a tensor as a multidimensional matrix or a generalization of a vector. Let's try to define how a tensor should behave from a computer scientist's point of view. Imagine that you have a tensor `v`. What do you get if you ask for its first element?
* If `v` is a 4D array, the requested element will be a 3D tensor.
* If `v` is a 3D array, the requested element will be a 2D tensor (or a matrix).
* If `v` is a 2D array, the requested element will be a 1D tensor (or a vector).
* If `v` is a 1D array, the requested element will be a scalar.
* If `v` is a scalar, the request will raise an error.
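
The list above can be checked directly. The following is a minimal sketch, assuming a zero-filled 4D tensor with an arbitrary shape; the names `v`, `dims` and `raised` are illustrative only.

```python
import torch as th

v = th.zeros(2, 3, 4, 5)  # a 4D tensor; the shape is an arbitrary choice
dims = []
while v.dim() > 0:
    dims.append(v.dim())
    v = v[0]              # taking the first element drops one dimension
dims.append(v.dim())      # v is now a 0-dimensional tensor (a scalar)
print(dims)

# Asking a scalar tensor for its first element is an error.
raised = False
try:
    v[0]
except IndexError:
    raised = True
print(raised)
```

Each indexing step removes exactly one dimension until a scalar is reached, at which point there is nothing left to index.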

In other words, let's imagine the dimensions of a tensor:

![](https://miro.medium.com/max/2088/1*TPauIPgMOuwowxd53zNKVA.png)

In [None]:
# Note: notebook cells are stateful, so you only need to run this once per session.
import torch as th
import sys

### Creating a Tensor

The starting point:
[Tensor Creation API](https://pytorch.org/docs/stable/tensors.html)

![](https://i.imgflip.com/4mrg2n.jpg)

In [None]:
# Note: the int values are converted to float. Why?
one_dimension_tensor = th.Tensor([1, 2, 3])
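
One way to answer the question is to compare dtypes. The sketch below assumes the default dtype has not been changed in the session: `th.Tensor` is the constructor of the default tensor type, while the lowercase `th.tensor` factory infers the dtype from the data. The variable names are illustrative.

```python
import torch as th

legacy = th.Tensor([1, 2, 3])    # constructor: always uses the default dtype
inferred = th.tensor([1, 2, 3])  # factory function: dtype inferred from the data

print(legacy.dtype)    # torch.float32, unless the default dtype was changed
print(inferred.dtype)  # torch.int64
```

If the integers should stay integers, `th.tensor` (or an explicit `dtype=` argument) is probably the safer choice.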

In [None]:
# Note: Python lists place no restriction on the types of their elements, and almost all
# numerical types in Python can be converted to float.

another_one_dimensional_tensor = th.Tensor([False, True, 5])  # stored as floats, not ints

In [None]:
# Note: When creating a tensor, you can specify the dtype to cast the data to.

a = th.ones((2, 3), dtype=th.int)
print(a)

mask = th.zeros((2, 3, 4), dtype=th.bool)
print(mask)

In [None]:
# Note: The underlying data type can also be specified by using a typed tensor class,
# for example: LongTensor

v = th.LongTensor([1,2,3])   # A Tensor of type Long
print(f"LongTensor: {v}")
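
As a hedged aside, the typed constructor and the `dtype=` argument should produce the same result, and an existing tensor can be converted with `.to()`. The names `legacy`, `modern` and `as_float` are illustrative.

```python
import torch as th

legacy = th.LongTensor([1, 2, 3])             # typed tensor class
modern = th.tensor([1, 2, 3], dtype=th.long)  # same data, dtype given explicitly
print(th.equal(legacy, modern))

as_float = modern.to(th.float32)              # convert an existing tensor
print(as_float.dtype)
```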

In [None]:
# Initialize each element with a random number sampled from the uniform distribution.
uniform = th.rand(2, 3)
print(f"From the uniform distribution:\n{uniform}\n")

# Initialize each element with a random number sampled from the normal distribution.
normal = th.randn(2, 3)
print(f"From the normal distribution:\n{normal}\n")

# Initialize each element with a permutation from a range.
perm = th.randperm(4)
print(f"From permutations:\n{perm}\n")

In [None]:
v = th.linspace(1, 10, steps=10)  # 10 evenly spaced points from 1 to 10, inclusive
v = th.logspace(start=-10, end=10, steps=5)  # Size 5: 1.0e-10, 1.0e-05, 1.0e+00, 1.0e+05, 1.0e+10
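
To make the relationship between the two functions concrete, here is a small sketch: `logspace` can be read as raising 10 to the power of a `linspace` over the exponents. The names `lin`, `log` and `exponents` are illustrative.

```python
import torch as th

lin = th.linspace(1, 10, steps=10)             # both endpoints are included
print(lin)

log = th.logspace(start=-10, end=10, steps=5)  # base 10 by default
exponents = th.linspace(-10, 10, steps=5)      # -10, -5, 0, 5, 10
print(th.allclose(log, 10.0 ** exponents, rtol=1e-4))
```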

### Tensor indexing

![](https://miro.medium.com/max/1276/1*WArDf9h6Dtbo-4H5P4lguQ.png)

In [None]:
# Tip: If you are familiar with NumPy, the indexing is similar!
x = th.randn(2, 3)
print(x)

In [None]:
# let's get the first array
print(x[0])

In [None]:
# let's get the second element from the first array
print(x[0][1])

# similar to:
print(x[0, 1])

In [None]:
# .size() gives you good hints about how you can do the indexing!
print(x.size())

In [None]:
# .numel() gives you the number of elements in the tensor (this can become huge)
print(x.numel())

In [None]:
# your Swiss Army knife when it comes to tensor shaping is view:
x = th.randn(2, 3)

# let's see our tensor unrolled
y = x.view(6)

# what does a -1 in a dimension mean when viewing the tensor?
z = x.view(-1, 2)

print(y)
print(z)
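
A short sketch of the two properties of `view` worth remembering: a `-1` dimension is inferred from the remaining ones, and a view shares memory with the original tensor. It uses `th.arange` instead of random data so the effect is easy to see; the names are illustrative.

```python
import torch as th

x = th.arange(6)   # 0 1 2 3 4 5
z = x.view(-1, 2)  # -1 asks PyTorch to infer this dimension: 6 / 2 = 3
print(z.shape)

z[0, 0] = 100      # writing through the view changes the original too
print(x)
```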

## Tensor manipulation

In [None]:
v = th.arange(9)
print(v)
v = v.view(3, 3)
print(v)
print(v.shape)

In [None]:
# Concatenation
x3 = th.cat((x, x, x), 0) # Concatenate in the 0 dimension
print(x3)

# Stack
r = th.stack((v, v))
print(r)
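
The difference between the two is easiest to see in the resulting shapes. A minimal sketch, with illustrative names `joined` and `stacked`:

```python
import torch as th

v = th.arange(9).view(3, 3)

joined = th.cat((v, v), 0)  # grows an existing dimension
stacked = th.stack((v, v))  # adds a new leading dimension

print(joined.shape)   # torch.Size([6, 3])
print(stacked.shape)  # torch.Size([2, 3, 3])
```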

In [None]:
# Index select
# 0 2
# 3 5
# 6 8

indices = th.LongTensor([0, 2])
r = th.index_select(v, 1, indices) # Select columns 0 and 2, i.e. indices 0 and 2 along dimension 1.
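
For comparison, the same selection can also be written with NumPy-style indexing. This is a sketch with illustrative names, not a claim about which form is preferable:

```python
import torch as th

v = th.arange(9).view(3, 3)
indices = th.tensor([0, 2])

selected = th.index_select(v, 1, indices)  # functional form
indexed = v[:, indices]                    # indexing form

print(selected)
print(th.equal(selected, indexed))
```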

In [None]:
# Masked select
# 0  0  0
# 1  1  1
# 1  1  1
mask = v.ge(3)

# Size 6: 3 4 5 6 7 8
r = th.masked_select(v, mask)
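
Boolean indexing gives the same result as `th.masked_select`, and in both cases the output is flattened to one dimension. A small sketch with illustrative names:

```python
import torch as th

v = th.arange(9).view(3, 3)
mask = v >= 3                         # same mask as v.ge(3)

selected = th.masked_select(v, mask)  # 1D tensor of the elements where mask is True
indexed = v[mask]                     # boolean indexing, same result

print(selected)
print(th.equal(selected, indexed))
```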

In [None]:
t = th.ones(2,1,2,1) # Size 2x1x2x1
print(t)
r = th.squeeze(t)     # Size 2x2
print(r)
r = th.squeeze(t, 1)  # Squeeze dimension 1: Size 2x2x1
print(r)

In [None]:
# Un-squeeze a dimension
x = th.Tensor([1, 2, 3])
print(x)
r = th.unsqueeze(x, 0)       # Size: 1x3
print(r)
r = th.unsqueeze(x, 1)       # Size: 3x1
print(r)

In [None]:
# Transpose dim 0 and 1
print(v)

transposed_1 = th.transpose(v, 0, 1)
print(transposed_1)

transposed_2 = v.T
print(transposed_2)

## Point-wise operations

In [None]:
neg_data = th.tensor([-1, 2, -3])
x = th.abs(neg_data) # 1 2 3
y = th.tensor([-1, -1, -1])

# Add the scalar 10 to all elements of x
r = th.add(x, 10) # 11 12 13
print(r)

# What does the alpha argument do? It scales the second operand: x + alpha * y
r = th.add(x, y, alpha=10) # -9 -8 -7
print(r)
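
A brief sketch of the `alpha` keyword of `th.add`, which multiplies the second tensor before the addition. The values mirror the cell above; the names `scaled` and `manual` are illustrative.

```python
import torch as th

x = th.tensor([1, 2, 3])
y = th.tensor([-1, -1, -1])

scaled = th.add(x, y, alpha=10)  # computes x + 10 * y
manual = x + 10 * y              # the same thing written with operators

print(scaled)
print(th.equal(scaled, manual))
```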

In [None]:
# Clamp the value of a Tensor
r = th.clamp(v, min=-0.5, max=0.5)

# Element-wise divide
r = th.div(v, v+0.03)

# Element-wise multiplication
r = th.mul(v, v)

## Comparison operations

In [None]:
# Comparison
# Size 3x3: Element-wise comparison
r = th.eq(v, v)

# Max along dimension 1: the max of each row with the corresponding index
r = th.max(v, 1)
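
When a dimension is given, `th.max` returns a pair of tensors (values and indices); without one, it returns a single scalar tensor. A minimal sketch, with illustrative names:

```python
import torch as th

v = th.arange(9).view(3, 3)

values, indices = th.max(v, 1)  # per-row maximum and its column index
overall = th.max(v)             # maximum over all elements

print(values)   # tensor([2, 5, 8])
print(indices)  # tensor([2, 2, 2])
print(overall)  # tensor(8)
```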

In [None]:
# What will the following snippet of code return?
result = th.tensor([1, 2, 3]) == th.tensor([1, 4, 6])
print(result)

print(result.all())

print(result.any())

## Matrix multiplication


In [None]:
# mm is matrix multiplication between two tensors
mat1 = th.randn(2, 3)
mat2 = th.randn(3, 4)
r = th.mm(mat1, mat2)
print(r.shape)
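
The shape rule is the usual one: an `n x k` matrix times a `k x m` matrix gives an `n x m` matrix, and mismatched inner dimensions are rejected. The sketch below uses deterministic values and the `@` operator for comparison; the names are illustrative.

```python
import torch as th

mat1 = th.arange(6.0).view(2, 3)
mat2 = th.arange(12.0).view(3, 4)

r = th.mm(mat1, mat2)
print(r.shape)                       # torch.Size([2, 4])
print(th.allclose(r, mat1 @ mat2))   # @ is the matrix-multiplication operator

# Swapping the operands gives 3x4 times 2x3: the inner dimensions 4 and 2 differ.
raised = False
try:
    th.mm(mat2, mat1)
except RuntimeError:
    raised = True
print(raised)
```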

In [None]:
# dot is the dot product of two 1D tensors
vec1 = th.tensor([2, 3])
vec2 = th.tensor([2, 1])
r = th.dot(vec1, vec2)  # 2*2 + 3*1 = 7
print(r)

## Backward
![](https://colah.github.io/posts/2015-08-Backprop/img/tree-eval-derivs.png)

In [None]:
a = th.tensor([1.], requires_grad=True)
b = th.tensor([2.], requires_grad=True)

c = a * b
c.backward()

print(a.grad)
print(b.grad)
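
The printed gradients can be checked by hand: for `c = a * b`, the derivative with respect to `a` is `b`, and the derivative with respect to `b` is `a`. A small sketch of that check:

```python
import torch as th

a = th.tensor([1.], requires_grad=True)
b = th.tensor([2.], requires_grad=True)

c = a * b
c.backward()

# dc/da = b and dc/db = a
print(th.equal(a.grad, b.detach()))
print(th.equal(b.grad, a.detach()))
```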

In [None]:
a = th.tensor([1.], requires_grad=True)
b = th.tensor([2.], requires_grad=False)

c = a*b
c.backward()

print(a.grad)
print(b.grad)

In [None]:
# The number of leaves in the computational graph does not matter; there can be as many as we need.
a_1 = th.tensor([1., 2., 3.], requires_grad=True)
b_1 = th.tensor([4., 5., 6.], requires_grad=True)

c_1 = (a_1 + b_1).sum()

# be careful: call backward() only once per graph, because gradients accumulate in .grad
c_1.backward()
print(a_1.grad)
print(b_1.grad)

a_2 = th.tensor([1., 2., 3.], requires_grad=True)
b_2 = th.tensor([4., 5., 6.], requires_grad=True)
c_2 = c_1 + (a_2 + b_2).sum()

c_2.backward()

print(a_1.grad)
print(b_1.grad)
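
The reason for that caution is that `backward()` adds to `.grad` instead of overwriting it. The sketch below builds a fresh graph for each call so the accumulation is easy to see; the names `first` and `second` are illustrative.

```python
import torch as th

a = th.tensor([1., 2., 3.], requires_grad=True)

(2 * a).sum().backward()
first = a.grad.clone()    # gradient of the first pass

(2 * a).sum().backward()  # a new graph; its gradient is added to a.grad
second = a.grad.clone()

a.grad.zero_()            # reset before the next backward pass
print(first, second, a.grad)
```

This is why training loops typically zero the gradients between iterations.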

In [None]:
a_1 = th.tensor([1., 2., 3.], requires_grad=True)
b_1 = th.tensor([4., 5., 6.], requires_grad=True)

print(a_1.requires_grad)
print(b_1.requires_grad)

with th.no_grad():
    c_1 = a_1 + b_1
    print(c_1.requires_grad)
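
For contrast, the same operation outside the `th.no_grad()` block is tracked again, and `.detach()` can cut a single tensor out of the graph. A minimal sketch with illustrative names:

```python
import torch as th

a = th.tensor([1., 2., 3.], requires_grad=True)

with th.no_grad():
    inside = a * 2          # not tracked by autograd

outside = a * 2             # tracked as usual
detached = outside.detach() # same values, no gradient history

print(inside.requires_grad, outside.requires_grad, detached.requires_grad)
```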

Useful Links:
* [PyTorch](https://pytorch.org/docs/stable/) docs are amazing!
* [The blog of Chris Olah@OpenAI](https://colah.github.io/posts/2015-08-Backprop/) made me understand backpropagation and LSTMs.