# **Deep Learning with PyTorch** -  Introductory Lab

## **Part 1:** PyTorch Basics

* Here we will look at [**PyTorch**](https://pytorch.org/)
* It is one of the most popular deep learning libraries, alongside Google's TensorFlow
* You can easily search in the [full documentation](https://pytorch.org/docs/stable/index.html)
* There is also a good [starting tutorial](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)
* We will also cover the basics here.

### **Important:**
* Go through, **run** and **understand** the code in each cell
* It is important that you also **modify**, **explore**, and **play around** with the code :)
* There are also some **exercises** marked as:
### 💡 **Exercise**
* Now, let's start ...

In [None]:
# Import the pytorch library
import torch

In [None]:
# Create a 2D tensor with random values

x = torch.rand(3,4)    # The arguments indicate the shape

print(x)

In [None]:
# You can check the shape of the tensor
print(x.shape)

### Creating Tensors

In [None]:
# Can be any number of dimensions
x1 = torch.rand(4,2,6,3,8)

# Tensor with standard Normal distributed values
x2 = torch.randn(2,3)
print(x2)

# Tensor with zeros
x3 = torch.zeros(2,3)
print(x3)

# Tensor with ones
x4 = torch.ones(2,3)
print(x4)

In [None]:
# Tensor with increasing values
x5 = torch.arange(6)
print(x5)

# Create a tensor with your favourite values
my_tensor = torch.Tensor([1.41, 5.1, 3.1415])
print(my_tensor)

### Types of Tensors
* Tensors can have different data types
* The type can be checked through the `dtype` attribute
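
PyTorch has two similarly named constructors that treat data types differently: `torch.Tensor(...)` always produces the default floating-point type, while the lower-case `torch.tensor(...)` infers the type from the values. The sketch below assumes the default type has not been changed with `torch.set_default_dtype`; the variable names are only illustrative.

```python
import torch

a = torch.Tensor([1, 2, 3])     # legacy constructor: always the default float type
b = torch.tensor([1, 2, 3])     # infers an integer type from the Python ints
c = torch.tensor([1.0, 2.0])    # infers the default float type from the Python floats

print(a.dtype, b.dtype, c.dtype)
```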

In [None]:
x = torch.rand(3)
print(x.dtype)   # Print the type

* This was a `float32` tensor. That is the default type.
* There are other useful types ...

In [None]:
# Create a double precision tensor
x = torch.rand(3, dtype=torch.float64)
print(x.dtype)
print(x)

* `float64` tensors are useful if you need extra numerical precision
* But this is seldom needed for deep learning
* `float32` tensors consume half the memory and are therefore preferred
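
The memory claim can be checked directly. This is a small sketch with an arbitrarily chosen tensor size of 1000 elements:

```python
import torch

x32 = torch.rand(1000, dtype=torch.float32)
x64 = torch.rand(1000, dtype=torch.float64)

# element_size() is the number of bytes per element, numel() the number of elements
bytes32 = x32.element_size() * x32.numel()
bytes64 = x64.element_size() * x64.numel()

print(bytes32, bytes64)
```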

In [None]:
# Create a tensor with integers
x = torch.LongTensor([3, 6, -11])
print(x.dtype)
print(x)

In [None]:
# Try dividing the elements by 2
x / 2

In [None]:
# In recent PyTorch versions, / is true division and returns a float tensor. Use integer division to keep the integer type
x // 2
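
The rounding behaviour of integer division matters for negative values. The sketch below spells it out with `torch.div` and its `rounding_mode` argument, assuming a reasonably recent PyTorch version that supports this argument:

```python
import torch

x = torch.tensor([3, 6, -11])

true_div = x / 2                                     # true division, result is a float tensor
floor_div = torch.div(x, 2, rounding_mode='floor')   # rounds towards minus infinity
trunc_div = torch.div(x, 2, rounding_mode='trunc')   # rounds towards zero

print(true_div, true_div.dtype)
print(floor_div, floor_div.dtype)
print(trunc_div, trunc_div.dtype)
```

Note that the two integer variants differ only for the negative element.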

In [None]:
# Images often come as unsigned 8-bit tensors (or Byte tensors)
x = torch.ByteTensor([3, 0, 100, 255])
print(x)
print(x + 1)  # Note that 255 + 1 = 0  (i.e. modulo 256)

In [None]:
# We can convert from one tensor type to another
x_float = 10 * torch.rand(3)
x_long = x_float.long()
x_float2 = x_long.float()

print('x_float =', x_float)
print('x_long =', x_long)
print('x_float2 =', x_float2)
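
Two details of type conversion are easy to miss: converting a float tensor to an integer type drops the fractional part, and converting an 8-bit tensor to float before doing arithmetic avoids the wrap-around seen earlier. A minimal sketch with made-up values (the name `img` is only illustrative):

```python
import torch

x_float = torch.tensor([2.7, -2.7, 0.5])
x_long = x_float.long()          # the fractional part is dropped (truncation towards zero)
print(x_long)

# Assumed example values standing in for 8-bit image intensities
img = torch.tensor([3, 0, 100, 255], dtype=torch.uint8)
img_float = img.float() / 255    # convert first, then scale to the range [0, 1]
print(img_float)
print(img.float() + 1)           # no wrap-around: 255 + 1 = 256
```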

### Operating on Tensors

#### Point-wise operations

In [None]:
x = torch.rand(3,4)
y = torch.rand(3,4)

# You can play around with what you can do.
# These operations are pointwise
# Add print statements if you want to see the outputs
z = x + y
z = x / y
z = x**y
z = torch.exp(x)
z = torch.sin(x)
z = x.exp()
z = x.round()
z = (x - 0.5).abs()

print(z)

#### Reducing operations

In [None]:
x = torch.rand(3,4)

# Sum all elements
print(torch.sum(x))
print(x.sum())

print()

# Sum along a dimension
print(x.sum(dim=0))
print(x.sum(dim=1))

print()

# Keep dimensions after summing
print(x.shape)
print(x.sum(dim=0).shape)
print(x.sum(dim=0, keepdim=True).shape)

In [None]:
x = torch.rand(3,4)

# You can also compute the mean
print('Mean: ', x.mean())

# Max and min work in a similar way
print('Min: ', x.min())
print('Max: ', x.max())

In [None]:
# You can use .item() to convert a scalar tensor to just a number
s = x.sum()
print(s)
print(s.item())

In [None]:
# With max/min along a dimension you also get the argmax/argmin
max_val, arg_max = x.max(dim=0)
print(max_val)
print(arg_max)

### 💡 **Exercise**
* Using the functions above, write a function `argmin2d` in the cell below that returns the 2d-coordinate of the minimum value of any input 2d tensor.

In [None]:
def argmin2d(x):
    pass   # Write your code here

x = torch.rand(5,6)

print(x)
print(argmin2d(x))

#### Inplace operations
* These operations do not create a new tensor
* Instead they modify the values inside the tensor itself

In [None]:
x = torch.rand(3)
print(x)

x += torch.ones(3)
print(x)

x.exp_()    # PyTorch functions ending with _ are inplace
print(x)
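
The difference between an in-place update and a regular operation becomes visible when a second name points to the same tensor. A small sketch (the name `alias` is only illustrative):

```python
import torch

x = torch.zeros(3)
alias = x            # a second name for the same tensor, no copy

x += 1               # in-place: the alias sees the change
print(alias)

x = x + 1            # out-of-place: x now names a new tensor
print(x)
print(alias)         # the old tensor is unchanged
```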

#### Broadcasting
* Broadcasting is an important concept when performing an operation that involves two tensors.
* If the tensors have different shapes, then some dimensions are automatically expanded (replicated) without copying the underlying data.
* This is both very convenient and useful.
* It is easiest to learn from some examples ...
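
As a quick check of the claim that no data is copied, the explicit `expand` function can be used to mimic what broadcasting does behind the scenes. This is only a sketch of the idea, not a description of PyTorch's internals:

```python
import torch

y = torch.rand(1, 4)
y_expanded = y.expand(3, 4)     # explicit version of what broadcasting does implicitly

print(y_expanded.shape)
print(y_expanded.stride())                      # stride 0 in the expanded dimension
print(y_expanded.data_ptr() == y.data_ptr())    # same memory
```

A stride of 0 means that stepping along that dimension keeps reading the same values in memory.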

In [None]:
# Here, the tensors have an equal number of dimensions
# The dimension of size 1 is always expanded to match that of the other tensor

x = torch.rand(3,4)
y = torch.rand(1,4)
z = torch.rand(3,1)

print(x + y)   # First dimension in y is broadcasted
print(x * z)   # Second dimension in z is broadcasted
print(y / z)   # Both the first dimension in y and the second dimension in z are broadcasted
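
Broadcasting combines naturally with `keepdim=True` from the reduction examples. A common pattern, sketched here with an arbitrary 3x4 tensor, is to normalize each row so that it sums to one:

```python
import torch

x = torch.rand(3, 4)

row_sums = x.sum(dim=1, keepdim=True)   # shape (3, 1) instead of (3,)
x_normalized = x / row_sums             # the size-1 dimension is broadcast over the 4 columns

print(row_sums.shape)
print(x_normalized.sum(dim=1))          # each row now sums to (approximately) 1
```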

In [None]:
# This still gives an error since dimension 0 has non-matching sizes
# So, each dimension in y must either be 1 or the same size as in x

x = torch.rand(3,4)
y = torch.rand(2,4)

x + y

In [None]:
# Broadcasting also works if the number of dimensions is different in the two tensors.
# But be careful when using it. The behavior is not that intuitive unless you are used to it!
# Let's check some examples ...

x = torch.rand(3,6,4)
y1 = torch.rand(4)
y2 = torch.rand(6,4)
y3 = torch.rand(6,1)

# All these examples work!
# Note that the dimensions are aligned by starting from the last one.
# Therefore, the y tensors are always broadcast along the first dimension of x in this example

z = x + y1
z = x + y2
z = x + y3

print('No error!')

In [None]:
# We cannot align the tensors starting from the first dimension

x = torch.rand(3,6,4)
y = torch.rand(3,6)

z = x + y
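
One way to make such a case work is to add a size-1 dimension at the end of the smaller tensor, so that the shapes align from the last dimension. A minimal sketch using `unsqueeze` (the name `y_col` is only illustrative):

```python
import torch

x = torch.rand(3, 6, 4)
y = torch.rand(3, 6)

y_col = y.unsqueeze(-1)     # shape (3, 6, 1): adds a size-1 dimension at the end
z = x + y_col               # now the shapes align from the last dimension

print(y_col.shape, z.shape)
```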

#### Reshaping and permuting dimensions

In [None]:
x1 = torch.rand(3,4)

# Tensors can be reshaped with the view command
# The function takes the new shape as arguments

x2 = x1.view(6,2)
x3 = x1.view(1,-1)    # The size of a dimension with -1 is computed
x4 = x1.view(3,2,2)   # You can also add new dimensions

print(x1)
print()
print(x2)
print()
print(x3)
print()
print(x4)

* The ordering of the data is not changed.
* The tensor is "filled" starting with the last dimension and ending with the first.
* The data is not copied.
* Hence x1, x2, x3, and x4 refer to the same underlying data.
* So, if you change one of the tensors in place, the other ones will also change.
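
The shared data can be verified with a short experiment. This sketch uses a tensor of zeros so that the change is easy to spot:

```python
import torch

x1 = torch.zeros(3, 4)
x2 = x1.view(6, 2)

x2[0, 0] = 7.0                             # modify the view in place
print(x1[0, 0])                            # the original tensor changed as well
print(x1.data_ptr() == x2.data_ptr())      # both start at the same memory address
```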

In [None]:
# Next we will try to swap (transpose) dimensions

x1 = torch.rand(3,4)

# The transpose function swaps the indicated dimensions
# In this case, this corresponds to standard matrix transpose

x2 = x1.transpose(0,1)

# Let's compare with the effect of reshaping
x3 = x1.view(4,3)

print(x1)
print()
print(x2)
print()
print(x3)

* **Note:** x2 and x3 are not the same! Why? 

In [None]:
# The permute function is more general and can swap multiple dimensions at the same time.

x1 = torch.rand(3,4,5,6)

x2 = x1.permute(2,3,0,1)   # Input the new order of dimensions

print(x1.shape)
print(x2.shape)
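
A practical consequence of swapping dimensions is that the result is usually no longer stored contiguously in memory, so a subsequent `view` may fail. The sketch below shows two common workarounds, `reshape` and `contiguous`; the variable names are only illustrative.

```python
import torch

x1 = torch.rand(3, 4)
x2 = x1.transpose(0, 1)            # a view with swapped strides, no data is moved

print(x1.is_contiguous(), x2.is_contiguous())

try:
    x2.view(12)                    # view needs a compatible memory layout
except RuntimeError as err:
    print('view failed:', err)

flat = x2.reshape(12)              # reshape copies the data when it has to
flat2 = x2.contiguous().view(12)   # or make the layout contiguous first
```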

### 💡 **Exercise**
* Use broadcasting and the `.view(...)` function to generate a 2D multiplication table `mul_tab` of size 16x16.
* It should contain the numbers `mul_tab[i,j] = i*j`
* **Tip:** Check the `torch.arange` function above.

In [None]:
# Implement your solution here!

mul_tab = None


print(mul_tab)

### Indexing
* Tensors can be indexed in different ways. Let's explore...

#### Standard indexing

In [None]:
x = torch.rand(3,4,5)

# Index a single value
print(x[2,0,1])

# Slice a dimension. : means 'everything'
print(x[2,0,:])

# or multiple
print(x[:,0,:])

# Slice with a range
print(x[2, 1:, 2:-1])

# Triple dot ... means 'all remaining dimensions'
print(x[1, ...])

#### Index with a list of coordinates

In [None]:
x = torch.rand(3,4)

# We can use LongTensor to index out a list of specific values/rows/columns

ind_col = torch.LongTensor([1,1,0,3,-1,-2])   # Index these columns pls

print(x[:, ind_col])


# If we also want to index specific rows, we need to match the shapes

ind_row = torch.LongTensor([2,0])   # Index these rows

print(x[ind_row.view(-1,1), ind_col.view(1,-1)])
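
The reason the shapes need to match is that two index tensors are broadcast against each other. With two 1D index tensors of equal length, the result is a list of paired coordinates rather than a grid. A small sketch with easily readable values (the names `rows`, `cols`, `paired` and `grid` are only illustrative):

```python
import torch

x = torch.arange(12).view(3, 4)      # values 0..11 so the result is easy to read

rows = torch.tensor([2, 0, 1])
cols = torch.tensor([1, 3, 3])

paired = x[rows, cols]                           # elements (2,1), (0,3), (1,3)
grid = x[rows.view(-1, 1), cols.view(1, -1)]     # all 3x3 row/column combinations

print(paired)
print(grid.shape)
```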

#### Logical indexing

In [None]:
x = torch.rand(3,4)

# Finally, we do some logical indexing

logical_ind_col = torch.BoolTensor([True, False, False, True])   # Boolean tensor indicating which columns to keep

print(x[:, logical_ind_col])

# The length of logical indices must match the size of the corresponding dimension (4 in this case)

# Logical indexing is very useful in many cases
# For example, say we just want to keep columns whose average is larger than 0.5:

x2 = x[:, x.mean(dim=0) > 0.5]

print(x2)

print(x.mean(dim=0) > 0.5)  # Let's also check what the index looks like

#### Is data copied?

In [None]:
# Note that the indexed values map to the same underlying data

x = torch.rand(3,4)
y = x[0,:]

print('x =', x)
print('y =', y)
print()

# Now, let's modify y in place

y *= 0   # This should set all values to 0

print('x =', x)
print('y =', y)

* **Note:** `x` was also modified in the process!
* Both x and y refer to the same underlying data.
* So data is **not** copied by standard indexing (with integers and slices).
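
This sharing applies to standard indexing with integers and slices. Indexing with an index tensor or a boolean mask returns a new tensor instead, and `.clone()` gives an explicit copy when one is needed. A minimal sketch contrasting the three cases (variable names are only illustrative):

```python
import torch

x = torch.zeros(3, 4)

y_slice = x[0, :]                        # standard indexing: a view into x
y_index = x[torch.tensor([0]), :]        # index tensor: a copy
y_clone = x[0, :].clone()                # explicit copy of a view

y_index += 1
y_clone += 1
print(x)                                 # still all zeros

y_slice += 1
print(x)                                 # first row is now ones
```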

#### Indexing the left-hand side

In [None]:
x = torch.rand(3,4)

# Set a specific coordinate
x[1,1] = 999
print(x)

# Set a column to a value
x[:,-1] = 555
print(x)

# Set a row to another tensor
x[0,:] = -22 * torch.rand(4)
print(x)
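
Logical indexing also works on the left-hand side, which gives a compact way to overwrite all elements that satisfy a condition. A short sketch with an arbitrary threshold of 0.5:

```python
import torch

x = torch.rand(3, 4)

mask = x < 0.5        # boolean tensor with the same shape as x
x[mask] = 0           # only the selected elements are overwritten

print(x)
```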

### Combining tensors
* You can concatenate and stack tensors together.

In [None]:
x = torch.rand(3,4)
y = 5*torch.ones(2,4)

# Concatenate x and y along the first dimension (dim=0)
z1 = torch.cat([x, y], dim=0)  # The shapes, except in dim=0, must match
print(z1)

# Stacking, on the other hand, creates a new dimension
# Here, all dimensions must match
z2 = torch.stack([x, x], dim=1)
print(z2.shape)

# Nicely combined with list comprehension
z3 = torch.cat([n*torch.ones(3,n) for n in range(5)], dim=-1)
print(z3)

### Efficiency
* The key to efficiency when using PyTorch is parallelism.
* When performing an operation, such as `z = x + y`, the computation is parallelized over all elements on the CPU or GPU.
* It is therefore **crucial that you avoid Python for-loops over tensor elements** whenever a vectorized operation exists!
* Element-wise for-loop implementations also make gradient computation, which we need for deep learning, very inefficient.
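
To get a feeling for the difference, the following sketch times a simple element-wise addition written as a loop and as a single vectorized operation. The tensor size of 2000 is an arbitrary choice, and the measured times will vary between machines.

```python
import time
import torch

x = torch.rand(2000)
y = torch.rand(2000)

# Element-by-element loop
start = time.perf_counter()
z_loop = torch.zeros(2000)
for i in range(2000):
    z_loop[i] = x[i] + y[i]
t_loop = time.perf_counter() - start

# Single vectorized operation
start = time.perf_counter()
z_vec = x + y
t_vec = time.perf_counter() - start

print('loop:       %.6f s' % t_loop)
print('vectorized: %.6f s' % t_vec)
print(torch.equal(z_loop, z_vec))   # both give the same result
```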

### 💡 **Exercise**
* Here you have an example of a **very bad** implementation, which uses for-loops over the elements to compute the new tensor `z1`.
* Implement your own solution which completely avoids loops.
* Make sure that your result `z2` is equal to `z1`.

In [None]:
x = torch.rand(3,4)
y = torch.rand(4,7)
u = torch.rand(3)
v = torch.rand(7)

# Example of a very bad implementation

z1 = torch.zeros(x.shape[0], x.shape[1], y.shape[1])

for i in range(z1.shape[0]):
    for j in range(z1.shape[1]):
        for k in range(z1.shape[2]):
            diff = x[i,j] - y[j,k]
            z1[i,j,k] = u[i] * torch.exp(diff) + v[k]
        

# Write your implementation without loops here

z2 = None


# Check that the difference is zero (up to floating-point rounding)
print(z1 - z2)

### Great! In the next part we will use PyTorch to operate on images ...