<a href="https://colab.research.google.com/github/tonyzamyatin/learning-pytorch/blob/master/fcc-course/00_fundamentals.ipynb" target="_parent"><img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg"/></a>

# 00. PyTorch Fundamentals

Reference to online course book: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [101]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.3.1


## Introduction to Tensors
### Creating Tensors

In [102]:
# Scalar
s = torch.tensor(7)
s

tensor(7)

In [103]:
# Vector
v = torch.tensor([7, 7])
v

tensor([7, 7])

In [104]:
v.ndim

1

In [105]:
v.shape

torch.Size([2])

In [106]:
# Matrix
M = torch.tensor([
    [1, 2],
    [3, 4]
])
M

tensor([[1, 2],
        [3, 4]])

In [107]:
M.ndim

2

In [108]:
M.shape

torch.Size([2, 2])

In [109]:
# Second row of matrix (a tensor yet again)
M[1]

tensor([3, 4])

In [110]:
# 3-dimensional tensor of shape (3, 4, 2), randomly initialized
T = torch.rand(3, 4, 2)
T

tensor([[[0.1145, 0.9157],
         [0.8773, 0.6945],
         [0.2558, 0.6495],
         [0.5830, 0.4821]],

        [[0.8109, 0.5073],
         [0.4354, 0.9809],
         [0.9477, 0.3601],
         [0.7568, 0.5272]],

        [[0.2953, 0.7052],
         [0.9806, 0.6236],
         [0.4984, 0.5948],
         [0.3828, 0.4015]]])

In [111]:
T.ndim

3

In [112]:
T.shape

torch.Size([3, 4, 2])

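The first number of the `.shape` attribute denotes the number of tensors within the first pair of `[]`, the second number denotes the number of 
tensors inside the second pair of `[]` (in the three-dimensional case it denotes the number of rows), and the third number denotes the number of elements inside the innermost pair of `[]` (in the three-dimensional case, the number of columns).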
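The reading of `.shape` described above can be checked with `len()`, which counts the entries along the outermost dimension only. This is a minimal sketch; the names `outer`, `rows`, and `cols` are made up for illustration.

```python
import torch

# Same shape as T above; the values are random and do not matter here.
T = torch.rand(3, 4, 2)

outer = len(T)          # number of matrices inside the outermost []
rows = len(T[0])        # number of rows inside each matrix
cols = len(T[0][0])     # number of elements inside each row

print(outer, rows, cols)                        # 3 4 2
print(tuple(T.shape) == (outer, rows, cols))    # True
print(T.numel())                                # 24 = 3 * 4 * 2
```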

### Naming conventions
Scalars and vectors have lowercase names (e.g. `s`, `v`); matrices and higher-dimensional tensors have uppercase names (e.g. `M`, `T`).

## Random Tensors
### Why random tensors?
Often we want to randomly initialize the parameters of neural networks as an unbiased starting point for training. For this we can use random tensors.

In [113]:
random_img_tensor = torch.rand(224, 224, 3)
random_img_tensor.shape, random_img_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and Ones

Zero tensors are useful for masking: multiplying a data tensor element-wise by a tensor that is zero in selected rows, columns or sub-tensors blanks out those entries.

In [114]:
zeros = torch.zeros(3, 3)
zeros

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

In [115]:
ones = torch.ones(3, 3)
ones

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

In [116]:
zeros.dtype

torch.float32
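The masking use case mentioned above could look roughly like this. The sketch builds a mask from ones and zeros and applies it with an element-wise product; `data`, `mask`, and `masked` are illustrative names, not part of the course material.

```python
import torch

data = torch.arange(1., 10.).reshape(3, 3)

# Start from ones (keep everything) and zero out the middle column.
mask = torch.ones(3, 3)
mask[:, 1] = 0.

# Element-wise product: entries under a zero in the mask become 0.
masked = data * mask
print(masked)
```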

### Creating a range and tensors-like

In [117]:
range_tensor = torch.arange(start=0, end=691, step=69)
range_tensor

tensor([  0,  69, 138, 207, 276, 345, 414, 483, 552, 621, 690])

In [118]:
# Create a tensor of zeros with the same shape and dtype as range_tensor
eleven_zeros = torch.zeros_like(range_tensor)
eleven_zeros, len(eleven_zeros)

(tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 11)

## Tensor datatypes
*Note:* Datatype mismatches are one of the three most common errors you will run into with PyTorch and deep learning:
1. Tensors do not have the right datatype
2. Tensors do not have the right shape
3. Tensors are not on the right device

In [119]:
float_32_tensor = torch.tensor([3., 6., 9.],
                               dtype=None,          # datatype of the tensor (e.g. float32, float16); None infers it from the data
                               device=None,         # device the tensor is stored on (e.g. "cpu", "cuda"); None uses the default device
                               requires_grad=False) # whether PyTorch should track gradients for operations on this tensor
float_32_tensor, float_32_tensor.dtype, float_32_tensor.device, float_32_tensor.requires_grad

(tensor([3., 6., 9.]), torch.float32, device(type='cpu'), False)

### Most common datatypes
- 32 bit ... single precision
- 16 bit ... half precision
- 64 bit ... double precision

In [120]:
float_16_tensor = float_32_tensor.type(torch.float16)   # Change datatype to float16
float_16_tensor, float_16_tensor.dtype

(tensor([3., 6., 9.], dtype=torch.float16), torch.float16)

In [121]:
float_64_tensor = float_32_tensor.type(torch.float64)
float_64_tensor, float_64_tensor.dtype

(tensor([3., 6., 9.], dtype=torch.float64), torch.float64)

In [122]:
res_16_32 = float_16_tensor * float_32_tensor
res_16_32, res_16_32.dtype

(tensor([ 9., 36., 81.]), torch.float32)

In [123]:
res_32_64 = float_32_tensor * float_64_tensor
res_32_64, res_32_64.dtype

(tensor([ 9., 36., 81.], dtype=torch.float64), torch.float64)

Some tensor operations work with mixed datatypes by promoting the result to the larger datatype, as in the examples above. Other operations, 
including many of those used during neural network training, throw an error instead.

In [124]:
int_32_tensor = torch.tensor([3, 6, 9], dtype=torch.int32)

In [125]:
res_int_float = int_32_tensor * float_32_tensor
res_int_float, res_int_float.dtype

(tensor([ 9., 36., 81.]), torch.float32)
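To see both behaviours side by side, the following sketch asks PyTorch which dtype an element-wise operation would produce and then shows one operation that refuses mixed dtypes. Matrix multiplication via `torch.mm` is used here as one example of such an operation; the flag `mm_failed` is only there for illustration.

```python
import torch

f16 = torch.tensor([3., 6., 9.], dtype=torch.float16)
f32 = torch.tensor([3., 6., 9.])                       # float32 by default
f64 = torch.tensor([3., 6., 9.], dtype=torch.float64)
i32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Ask PyTorch which dtype an element-wise operation would produce.
print(torch.result_type(f16, f32))                         # torch.float32
print(torch.result_type(i32, f32))                         # torch.float32
print(torch.promote_types(torch.float32, torch.float64))   # torch.float64

# Matrix multiplication does not promote: mixed dtypes raise a RuntimeError.
try:
    torch.mm(f32.reshape(1, 3), f64.reshape(3, 1))
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("mm failed:", err)

# Casting one operand explicitly resolves the mismatch.
ok = torch.mm(f32.reshape(1, 3), f64.reshape(3, 1).to(torch.float32))
print(ok)   # tensor([[126.]])
```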

## Getting information from tensors
**Tensor attributes:**
1. Tensors not right datatype - get information using `tensor.dtype`
2. Tensor not right shape - get information using `tensor.shape`
3. Tensor not on the right device - get information using `tensor.device`

In [126]:
some_tensor = torch.rand(3, 4)
some_tensor

tensor([[0.4127, 0.2543, 0.5568, 0.9360],
        [0.8333, 0.4436, 0.1995, 0.4165],
        [0.1707, 0.8016, 0.5629, 0.0955]])

In [127]:
# size() and shape return the same information
some_tensor.size(), some_tensor.shape

(torch.Size([3, 4]), torch.Size([3, 4]))

In [128]:
some_tensor.dtype, some_tensor.shape, some_tensor.device

(torch.float32, torch.Size([3, 4]), device(type='cpu'))

In [131]:
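# Change datatype and device of tensor (.to() returns a new tensor; it does not modify some_tensor in place)
some_tensor_on_gpu = some_tensor.to(dtype=torch.float16, device="cuda")  # Raises an error if PyTorch was not built with CUDA support
some_tensor_on_gpu.dtype, some_tensor_on_gpu.device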

AssertionError: Torch not compiled with CUDA enabled
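One detail worth checking on a machine without a GPU: `.to()` returns a new tensor and leaves the original untouched. The sketch below changes only the datatype so that it runs on CPU; `half_tensor` is an illustrative name.

```python
import torch

some_tensor = torch.rand(3, 4)

# .to() does not modify the tensor in place; keep the returned tensor.
half_tensor = some_tensor.to(dtype=torch.float16)

print(some_tensor.dtype)   # torch.float32 (unchanged)
print(half_tensor.dtype)   # torch.float16
```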

## Manipulating tensors
Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication (dot product)

In [132]:
# Add scalar to tensor (the scalar is broadcast to every element)
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [133]:
# Multiply tensor by scalar
tensor * 10

tensor([10, 20, 30])

In [134]:
# Subtract scalar from tensor
tensor - 10

tensor([-9, -8, -7])

In [135]:
# PyTorch built-in functions
torch.add(tensor, 10), torch.mul(tensor, 10), torch.subtract(tensor, 10)

(tensor([11, 12, 13]), tensor([10, 20, 30]), tensor([-9, -8, -7]))

### Rules for matrix multiplication
1. **Inner dimensions** of the two matrices to be multiplied must match, e.g. 3x2 @ 2x3
2. The resulting matrix has the shape of the **outer dimensions**
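Both rules can be verified on small random matrices. This is a sketch with arbitrary shapes; `A`, `B`, and `shape_error` are illustrative names.

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 3)

# Inner dimensions match (2 and 2); the result takes the outer dimensions.
print((A @ B).shape)    # torch.Size([3, 3])
print((B @ A).shape)    # torch.Size([2, 2])

# Inner dimensions 2 and 3 do not match, so this raises a RuntimeError.
try:
    A @ A
    shape_error = False
except RuntimeError as err:
    shape_error = True
    print("shape mismatch:", err)

# Transposing one operand makes the inner dimensions line up.
print((A @ A.T).shape)  # torch.Size([3, 3])
```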

In [189]:
matrix = torch.rand(10, 3)

In [215]:
%%time
torch.matmul(matrix, matrix.T)    # dot product

CPU times: total: 0 ns
Wall time: 0 ns


tensor([[0.9973, 0.6776, 0.6147, 0.5762, 0.3779, 0.9448, 0.6945, 0.3637, 0.7314,
         0.3335],
        [0.6776, 0.8996, 0.6521, 0.5744, 0.6105, 0.8457, 0.5772, 0.3672, 0.8324,
         0.4936],
        [0.6147, 0.6521, 0.5335, 0.4793, 0.4409, 0.6702, 0.6268, 0.3230, 0.7275,
         0.4027],
        [0.5762, 0.5744, 0.4793, 0.4328, 0.3828, 0.6118, 0.5733, 0.2914, 0.6500,
         0.3529],
        [0.3779, 0.6105, 0.4409, 0.3828, 0.4405, 0.5085, 0.4404, 0.2570, 0.6106,
         0.3768],
        [0.9448, 0.8457, 0.6702, 0.6118, 0.5085, 1.0048, 0.6049, 0.3755, 0.7788,
         0.4008],
        [0.6945, 0.5772, 0.6268, 0.5733, 0.4404, 0.6049, 1.1996, 0.4503, 1.0627,
         0.5606],
        [0.3637, 0.3672, 0.3230, 0.2914, 0.2570, 0.3755, 0.4503, 0.2065, 0.4737,
         0.2590],
        [0.7314, 0.8324, 0.7275, 0.6500, 0.6106, 0.7788, 1.0627, 0.4737, 1.1164,
         0.6295],
        [0.3335, 0.4936, 0.4027, 0.3529, 0.3768, 0.4008, 0.5606, 0.2590, 0.6295,
         0.3750]])

In [216]:
%%time
torch.mm(matrix, matrix.T)  # matrix-matrix product for 2-D tensors only (no broadcasting)
# Unlike torch.matmul(), torch.mm() does not accept 1-D vectors.

CPU times: total: 0 ns
Wall time: 0 ns


tensor([[0.9973, 0.6776, 0.6147, 0.5762, 0.3779, 0.9448, 0.6945, 0.3637, 0.7314,
         0.3335],
        [0.6776, 0.8996, 0.6521, 0.5744, 0.6105, 0.8457, 0.5772, 0.3672, 0.8324,
         0.4936],
        [0.6147, 0.6521, 0.5335, 0.4793, 0.4409, 0.6702, 0.6268, 0.3230, 0.7275,
         0.4027],
        [0.5762, 0.5744, 0.4793, 0.4328, 0.3828, 0.6118, 0.5733, 0.2914, 0.6500,
         0.3529],
        [0.3779, 0.6105, 0.4409, 0.3828, 0.4405, 0.5085, 0.4404, 0.2570, 0.6106,
         0.3768],
        [0.9448, 0.8457, 0.6702, 0.6118, 0.5085, 1.0048, 0.6049, 0.3755, 0.7788,
         0.4008],
        [0.6945, 0.5772, 0.6268, 0.5733, 0.4404, 0.6049, 1.1996, 0.4503, 1.0627,
         0.5606],
        [0.3637, 0.3672, 0.3230, 0.2914, 0.2570, 0.3755, 0.4503, 0.2065, 0.4737,
         0.2590],
        [0.7314, 0.8324, 0.7275, 0.6500, 0.6106, 0.7788, 1.0627, 0.4737, 1.1164,
         0.6295],
        [0.3335, 0.4936, 0.4027, 0.3529, 0.3768, 0.4008, 0.5606, 0.2590, 0.6295,
         0.3750]])

In [218]:
%%time
# by hand
res_matrix = torch.zeros(10, 10)
for row in range(matrix.shape[0]):
    for col in range(matrix.shape[0]):
        res_matrix[row][col] = (matrix[row] * matrix[col]).sum()
res_matrix

CPU times: total: 0 ns
Wall time: 3.45 ms


tensor([[0.9973, 0.6776, 0.6147, 0.5762, 0.3779, 0.9448, 0.6945, 0.3637, 0.7314,
         0.3335],
        [0.6776, 0.8996, 0.6521, 0.5744, 0.6105, 0.8457, 0.5772, 0.3672, 0.8324,
         0.4936],
        [0.6147, 0.6521, 0.5335, 0.4793, 0.4409, 0.6702, 0.6268, 0.3230, 0.7275,
         0.4027],
        [0.5762, 0.5744, 0.4793, 0.4328, 0.3828, 0.6118, 0.5733, 0.2914, 0.6500,
         0.3529],
        [0.3779, 0.6105, 0.4409, 0.3828, 0.4405, 0.5085, 0.4404, 0.2570, 0.6106,
         0.3768],
        [0.9448, 0.8457, 0.6702, 0.6118, 0.5085, 1.0048, 0.6049, 0.3755, 0.7788,
         0.4008],
        [0.6945, 0.5772, 0.6268, 0.5733, 0.4404, 0.6049, 1.1996, 0.4503, 1.0627,
         0.5606],
        [0.3637, 0.3672, 0.3230, 0.2914, 0.2570, 0.3755, 0.4503, 0.2065, 0.4737,
         0.2590],
        [0.7314, 0.8324, 0.7275, 0.6500, 0.6106, 0.7788, 1.0627, 0.4737, 1.1164,
         0.6295],
        [0.3335, 0.4936, 0.4027, 0.3529, 0.3768, 0.4008, 0.5606, 0.2590, 0.6295,
         0.3750]])

In addition to the basic element-wise operations, PyTorch provides efficient implementations of computationally intensive operations such as matrix multiplication. The hand-written loop above took 3.45 ms, whereas the built-in calls finished below the timer's resolution, so prefer the PyTorch implementation over a Python loop.
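The gap becomes easier to measure on a larger input. The sketch below times both approaches with `time.perf_counter` on a 100 x 3 matrix (a size chosen arbitrarily for illustration) and checks that they agree; the absolute timings depend on the machine.

```python
import time
import torch

matrix = torch.rand(100, 3)

# Hand-written loop: one Python-level dot product per output entry.
start = time.perf_counter()
loop_result = torch.zeros(100, 100)
for row in range(matrix.shape[0]):
    for col in range(matrix.shape[0]):
        loop_result[row, col] = (matrix[row] * matrix[col]).sum()
loop_seconds = time.perf_counter() - start

# Built-in matrix multiplication: a single call.
start = time.perf_counter()
fast_result = matrix @ matrix.T
fast_seconds = time.perf_counter() - start

# Both approaches agree up to floating-point rounding.
print(torch.allclose(loop_result, fast_result, atol=1e-6))
print(f"loop: {loop_seconds:.4f} s, matmul: {fast_seconds:.6f} s")
```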

In [219]:
%%time
matrix @ matrix.T

CPU times: total: 0 ns
Wall time: 0 ns


tensor([[0.9973, 0.6776, 0.6147, 0.5762, 0.3779, 0.9448, 0.6945, 0.3637, 0.7314,
         0.3335],
        [0.6776, 0.8996, 0.6521, 0.5744, 0.6105, 0.8457, 0.5772, 0.3672, 0.8324,
         0.4936],
        [0.6147, 0.6521, 0.5335, 0.4793, 0.4409, 0.6702, 0.6268, 0.3230, 0.7275,
         0.4027],
        [0.5762, 0.5744, 0.4793, 0.4328, 0.3828, 0.6118, 0.5733, 0.2914, 0.6500,
         0.3529],
        [0.3779, 0.6105, 0.4409, 0.3828, 0.4405, 0.5085, 0.4404, 0.2570, 0.6106,
         0.3768],
        [0.9448, 0.8457, 0.6702, 0.6118, 0.5085, 1.0048, 0.6049, 0.3755, 0.7788,
         0.4008],
        [0.6945, 0.5772, 0.6268, 0.5733, 0.4404, 0.6049, 1.1996, 0.4503, 1.0627,
         0.5606],
        [0.3637, 0.3672, 0.3230, 0.2914, 0.2570, 0.3755, 0.4503, 0.2065, 0.4737,
         0.2590],
        [0.7314, 0.8324, 0.7275, 0.6500, 0.6106, 0.7788, 1.0627, 0.4737, 1.1164,
         0.6295],
        [0.3335, 0.4936, 0.4027, 0.3529, 0.3768, 0.4008, 0.5606, 0.2590, 0.6295,
         0.3750]])

`@` is the operator shorthand for matrix multiplication. Because other libraries such as NumPy also support the `@` operator, the implementation 
that runs depends on the operand types, which can be confusing when tensors and arrays are used in the same code.

## Tensor aggregation
min, max, mean, sum


In [220]:
torch.min(res_matrix), res_matrix.min()

(tensor(0.2065), tensor(0.2065))

In [221]:
torch.max(res_matrix), res_matrix.max()

(tensor(1.1996), tensor(1.1996))

In [222]:
torch.mean(res_matrix)

tensor(0.5638)

In [223]:
torch.sum(res_matrix), res_matrix.sum()

(tensor(56.3843), tensor(56.3843))

### Positional min and max
Positional min and max (`argmin()` and `argmax()`) return the index at which the minimum or maximum occurs.
*Note:* When called with a `dim` argument, `min()` and `max()` also return these indices as the second return value.
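The difference between calling `max()` with and without `dim` can be seen on a small matrix. This is a sketch with made-up values; `m`, `values`, and `indices` are illustrative names.

```python
import torch

m = torch.tensor([[4., 1., 7.],
                  [2., 9., 3.]])

# Without dim: a single 0-dimensional tensor holding the overall maximum.
print(m.max())                  # tensor(9.)

# With dim: a (values, indices) pair, here one entry per column.
values, indices = m.max(dim=0)
print(values)                   # tensor([4., 9., 7.])
print(indices)                  # tensor([0, 1, 0])

# The indices match what argmax returns for the same dim.
print(torch.equal(indices, m.argmax(dim=0)))   # True
```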

In [228]:
res_matrix.argmin(dim=0), res_matrix.argmax()

(tensor([9, 7, 7, 7, 7, 7, 4, 7, 7, 7]), tensor(66))

If `dim` is not set, the tensor is flattened by default and the index of the min/max within the flattened tensor is returned.

In [230]:
res_matrix.flatten()[res_matrix.argmax().item()]

tensor(1.1996)
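A flat index can be converted back into a row and column with integer division by the number of columns. For the 10 x 10 `res_matrix` above, index 66 corresponds to row 6, column 6. The sketch below does the same on a small made-up matrix.

```python
import torch

m = torch.tensor([[4., 1., 7.],
                  [2., 9., 3.]])

flat_index = m.argmax().item()              # position in the flattened tensor
row, col = divmod(flat_index, m.shape[1])   # divide by the number of columns

print(flat_index, row, col)                 # 4 1 1
print(m[row, col])                          # tensor(9.)
```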

## Reshaping, stacking, squeezing and unsqueezing tensors
* Reshape - reshape an input tensor to a given shape
* View - return a view of an input tensor with a given shape that shares memory with the original tensor
* Stack - combine multiple tensors along a new dimension; `torch.vstack` and `torch.hstack` instead place tensors on top of each other or side by side
* Squeeze - remove all dimensions of size `1` from a tensor
* Unsqueeze - add a dimension of size `1` to a target tensor
* Permute - return a view of an input tensor with its dimensions reordered

In [249]:
x = torch.arange(1., 11.)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]), torch.Size([10]))

In [250]:
# Add extra dimension on 0-th dimension
x_reshaped = x.reshape(1, 10)
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

In [251]:
# Add extra dimension on the 1-st dimension
x_reshaped = x_reshaped.reshape(10, 1)
x_reshaped, x_reshaped.shape

(tensor([[ 1.],
         [ 2.],
         [ 3.],
         [ 4.],
         [ 5.],
         [ 6.],
         [ 7.],
         [ 8.],
         [ 9.],
         [10.]]),
 torch.Size([10, 1]))

In [252]:
x_reshaped = x_reshaped.reshape(2, 5)
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]),
 torch.Size([2, 5]))

When reshaping a tensor, the number of elements of the input and output tensor must match.

In [253]:
# Change view
z = x.view(1, 10)
z, z.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

Same result as with reshaping. `tensor.view()` always returns a view that shares memory with the original tensor, so changes to one are 
reflected in the other. `tensor.reshape()` also returns a view when the memory layout allows it (as it does here) and copies the data only when it 
does not, so do not rely on `reshape()` to produce an independent copy.

In [261]:
z[:, 0] = 11.
z, x

(tensor([[11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 tensor([11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]))
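Whether `reshape()` shares memory depends on the memory layout of its input. The sketch below compares data pointers with `data_ptr()` for a contiguous and a non-contiguous (transposed) tensor; the variable names are illustrative.

```python
import torch

x = torch.arange(6.)

# Contiguous input: reshape returns a view, so memory is shared.
r = x.reshape(2, 3)
print(r.data_ptr() == x.data_ptr())             # True

# A transposed tensor is not contiguous in memory.
t = r.T
print(t.is_contiguous())                        # False

# reshape falls back to copying the data in this case ...
flat_copy = t.reshape(6)
print(flat_copy.data_ptr() == x.data_ptr())     # False

# ... whereas view refuses and raises a RuntimeError.
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)                              # True
```

When an independent copy is needed regardless of layout, `tensor.clone()` makes that explicit.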

We can combine tensors along a new dimension using `torch.stack()`.

In [275]:
# Stack tensors along a new first dimension (each x becomes a row)
x_stacked_dim0 = torch.stack([x, x, x, x], dim=0)
x_stacked_dim0, x_stacked_dim0.shape

(tensor([[11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([4, 10]))

In [276]:
# Stack tensors along a new second dimension (each x becomes a column)
x_stacked_dim1 = torch.stack([x, x, x, x], dim=1)
x_stacked_dim1, x_stacked_dim1.shape

(tensor([[11., 11., 11., 11.],
         [ 2.,  2.,  2.,  2.],
         [ 3.,  3.,  3.,  3.],
         [ 4.,  4.,  4.,  4.],
         [ 5.,  5.,  5.,  5.],
         [ 6.,  6.,  6.,  6.],
         [ 7.,  7.,  7.,  7.],
         [ 8.,  8.,  8.,  8.],
         [ 9.,  9.,  9.,  9.],
         [10., 10., 10., 10.]]),
 torch.Size([10, 4]))
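Stacking is easy to confuse with concatenation. As a short comparison on two small made-up vectors: `torch.stack` adds a dimension, whereas `torch.cat` joins along an existing one.

```python
import torch

a = torch.tensor([1., 2., 3.])
b = torch.tensor([4., 5., 6.])

# stack inserts a new dimension: two 1-D tensors become one 2-D tensor.
print(torch.stack([a, b], dim=0).shape)   # torch.Size([2, 3])
print(torch.stack([a, b], dim=1).shape)   # torch.Size([3, 2])

# cat joins along an existing dimension: the result stays 1-D.
print(torch.cat([a, b], dim=0).shape)     # torch.Size([6])

# vstack and hstack follow the NumPy functions of the same name.
print(torch.vstack([a, b]).shape)         # torch.Size([2, 3])
print(torch.hstack([a, b]).shape)         # torch.Size([6])
```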

In [277]:
torch.stack([x, x, x, x], dim=2)

IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

This throws an error because `x` has only one dimension: stacking adds one new dimension, so the result has two dimensions and `dim` must lie in 
the range [-2, 1]. To stack along `dim=2`, we first give the original tensor more dimensions with `tensor.reshape()`.

In [278]:
x_3d = x.reshape(1, 2, 5)
x_3d, x_3d.shape

(tensor([[[11.,  2.,  3.,  4.,  5.],
          [ 6.,  7.,  8.,  9., 10.]]]),
 torch.Size([1, 2, 5]))

If we perform the stacking on `dim=2` with the 3d tensor it should work.

In [279]:
x_3d_stacked_dim2 = torch.stack([x_3d, x_3d, x_3d, x_3d], dim=2)
x_3d_stacked_dim2, x_3d_stacked_dim2.shape

(tensor([[[[11.,  2.,  3.,  4.,  5.],
           [11.,  2.,  3.,  4.,  5.],
           [11.,  2.,  3.,  4.,  5.],
           [11.,  2.,  3.,  4.,  5.]],
 
          [[ 6.,  7.,  8.,  9., 10.],
           [ 6.,  7.,  8.,  9., 10.],
           [ 6.,  7.,  8.,  9., 10.],
           [ 6.,  7.,  8.,  9., 10.]]]]),
 torch.Size([1, 2, 4, 5]))

We can now also stack along `dim=3`.

In [280]:
x_3d_stacked_dim3 = torch.stack([x_3d, x_3d, x_3d, x_3d], dim=3)
x_3d_stacked_dim3, x_3d_stacked_dim3.shape

(tensor([[[[11., 11., 11., 11.],
           [ 2.,  2.,  2.,  2.],
           [ 3.,  3.,  3.,  3.],
           [ 4.,  4.,  4.,  4.],
           [ 5.,  5.,  5.,  5.]],
 
          [[ 6.,  6.,  6.,  6.],
           [ 7.,  7.,  7.,  7.],
           [ 8.,  8.,  8.,  8.],
           [ 9.,  9.,  9.,  9.],
           [10., 10., 10., 10.]]]]),
 torch.Size([1, 2, 5, 4]))

In [283]:
# Negative dimension indices for indexing from the back
torch.stack([x, x, x, x], dim=-2), torch.stack([x, x, x, x], dim=0)

(tensor([[11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 tensor([[11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
         [11.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]))

Remove all dimensions of size 1 with `torch.squeeze(tensor)` or `tensor.squeeze()`.

In [285]:
x_3d_squeezed = x_3d.squeeze()
x_3d.shape, x_3d_squeezed.shape

(torch.Size([1, 2, 5]), torch.Size([2, 5]))

Add a dimension of size 1 at position `dim` with `torch.unsqueeze(tensor, dim)` or `tensor.unsqueeze(dim)`.

In [291]:
x_unsqueeze_dim0 = x.unsqueeze(0)
x_unsqueeze_dim1 = x.unsqueeze(1)
x_unsqueeze_dim0.shape, x_unsqueeze_dim1.shape

(torch.Size([1, 10]), torch.Size([10, 1]))

In [293]:
# Throws an error
x.unsqueeze(2)

IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
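A few related variants, as a sketch: negative positions for `unsqueeze`, indexing with `None` as an alternative, and `squeeze` restricted to a single dimension. The tensor `y` is made up for illustration.

```python
import torch

x = torch.arange(1., 11.)                 # shape [10]

# Negative positions count from the end of the resulting shape.
print(x.unsqueeze(-1).shape)              # torch.Size([10, 1])

# Indexing with None inserts a size-1 dimension as well.
print(x[None, :].shape)                   # torch.Size([1, 10])
print(x[:, None].shape)                   # torch.Size([10, 1])

# squeeze can target one dimension instead of removing all size-1 dims.
y = torch.zeros(1, 10, 1)
print(y.squeeze().shape)                  # torch.Size([10])
print(y.squeeze(0).shape)                 # torch.Size([10, 1])
print(y.squeeze(1).shape)                 # torch.Size([1, 10, 1]), dim 1 has size 10
```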

Reorder the dimensions of a tensor using `tensor.permute(dims)`, where `dims` lists the original dimensions in their new order.

In [296]:
# Permute an image tensor so that the color dimension comes first
img_original = torch.rand(244, 244, 3)
img_permuted = img_original.permute([2, 0, 1])
img_original.shape, img_permuted.shape

(torch.Size([244, 244, 3]), torch.Size([3, 244, 244]))
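Because `permute` returns a view, the permuted tensor shares memory with the original. The sketch below uses a small 4 x 4 "image" of zeros (an arbitrary size chosen for illustration) to show that a write through the view is visible in the original.

```python
import torch

img = torch.zeros(4, 4, 3)                 # height, width, color channels
img_chw = img.permute(2, 0, 1)             # color channels, height, width

# Writing through the view changes the original tensor as well.
img_chw[0, 1, 2] = 5.
print(img[1, 2, 0])                        # tensor(5.)

# The view is not contiguous here; .contiguous() makes a compact copy.
print(img_chw.is_contiguous())               # False
print(img_chw.contiguous().is_contiguous())  # True
```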

## Indexing (selecting data from tensors)

In [298]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [299]:
# Get first tensor inside original tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [301]:
# Indexing on dim=1
x[0][0], x[0, 0]

(tensor([1, 2, 3]), tensor([1, 2, 3]))

In [303]:
# Indexing on dim=2
x[0][1][1], x[0, 1, 1]

(tensor(5), tensor(5))

Use `:` to select "all" of a target dimension.

In [304]:
# Get all values of the 0th and 1st dimension but only index 1 of the 2nd dimension.
x[:, :, 1]

tensor([[2, 5, 8]])

In [305]:
# Get all values of the 0th dimension but only index 1 of the 1st and 2nd dimension.
x[:, 1, 1]

tensor([5])

In [308]:
# This is the same
x[:, 1], x[:, 1, :]

(tensor([[4, 5, 6]]), tensor([[4, 5, 6]]))
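A few more selections on the same tensor, as a sketch combining single indices, `:` slices with bounds, and `...` (which stands for all remaining leading dimensions):

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[0, :, 2])      # last column of the inner matrix: tensor([3, 6, 9])
print(x[..., 0])       # first column, all leading dims kept: tensor([[1, 4, 7]])
print(x[0, 2, 2])      # single element: tensor(9)
print(x[0, :2, 1:])    # rows 0-1, columns 1-2: [[2, 3], [5, 6]]
```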

## PyTorch and NumPy
* Data in NumPy to PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor to NumPy -> `torch.Tensor.numpy()`

In [318]:
# PyTorch tensor from NumPy array
int_array = np.arange(1, 10)
int_32_tensor = torch.from_numpy(int_array)
int_array, int_32_tensor

(array([1, 2, 3, 4, 5, 6, 7, 8, 9]),
 tensor([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=torch.int32))

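*Note:* `torch.from_numpy()` keeps the array's datatype. Here `np.arange()` with integer arguments produced an int32 array (NumPy's default integer 
type is platform-dependent), so the resulting tensor is int32, whereas PyTorch's own default integer datatype is int64.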

In [319]:
float_array = np.arange(1., 10.)
float_64_tensor = torch.from_numpy(float_array)
float_array, float_64_tensor

(array([1., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=torch.float64))

*Note:* NumPy uses float64 as its default float datatype, whereas PyTorch uses float32. This means that we might have to change the 
datatype to float32 to avoid potential datatype issues later down the line.

In [320]:
float_32_tensor = torch.from_numpy(float_array.astype("float32"))
float_32_tensor

tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [321]:
float_32_tensor = torch.from_numpy(float_array).type(torch.float32)
float_32_tensor

tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [324]:
# Tensor to NumPy
float_32_array = float_32_tensor.numpy()
float_32_array

array([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)

**Note:** Conversion from NumPy to PyTorch and vice versa retains the original datatype of the array/tensor.
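Besides the datatype, the conversions also keep the underlying memory: `torch.from_numpy()` and `tensor.numpy()` share data with their source, while `torch.tensor()` makes a copy. A small sketch with illustrative names:

```python
import numpy as np
import torch

array = np.arange(1., 4.)                 # float64 array [1., 2., 3.]
shared = torch.from_numpy(array)          # shares memory with `array`
copied = torch.tensor(array)              # independent copy

array[0] = 100.                           # in-place change to the NumPy array

print(shared)   # first element is now 100.
print(copied)   # still starts with 1.

# The same holds in the other direction for CPU tensors.
t = torch.ones(3)
n = t.numpy()
t.add_(1)                                 # in-place add on the tensor
print(n)        # [2. 2. 2.]
```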

## Reproducibility: trying to remove the random from random
The weights of neural networks are usually initialized randomly, and this starting point affects the 
weights at the end of training. However, we often want results to be *reproducible*, e.g. when sharing code or publishing a paper.

To achieve reproducible randomness, we use *random seeds*.

In [325]:
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)
random_tensor_A, random_tensor_B, random_tensor_A == random_tensor_B

(tensor([[0.1664, 0.5858, 0.6485, 0.6501],
         [0.1531, 0.2001, 0.3092, 0.1462],
         [0.8469, 0.5949, 0.1607, 0.7096]]),
 tensor([[0.2094, 0.6273, 0.2818, 0.7761],
         [0.7730, 0.9874, 0.3691, 0.4016],
         [0.5740, 0.5646, 0.8798, 0.8231]]),
 tensor([[False, False, False, False],
         [False, False, False, False],
         [False, False, False, False]]))

In [327]:
# With random seed
RANDOM_SEED = 42    # some numerical value
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)
torch.manual_seed(RANDOM_SEED)      # Reset the seed so the next random call repeats the same values
random_tensor_D = torch.rand(3, 4)
random_tensor_C, random_tensor_D, random_tensor_C == random_tensor_D


(tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[True, True, True, True],
         [True, True, True, True],
         [True, True, True, True]]))
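Setting the seed fixes the whole sequence of random numbers that follows, not just the next call. The sketch below shows this for the global seed and, as an alternative, for a local `torch.Generator` that keeps its own state; all variable names are illustrative.

```python
import torch

# Seeding once fixes the entire sequence of draws that follows.
torch.manual_seed(42)
first_run = [torch.rand(2), torch.rand(2)]

torch.manual_seed(42)
second_run = [torch.rand(2), torch.rand(2)]

# The two runs match draw by draw.
print(all(torch.equal(a, b) for a, b in zip(first_run, second_run)))  # True

# A local generator carries its own seed, separate from the global state.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
local_a = torch.rand(2, generator=gen_a)
local_b = torch.rand(2, generator=gen_b)
print(torch.equal(local_a, local_b))                                  # True
```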

Extra resources for reproducibility:
* https://pytorch.org/docs/stable/notes/randomness.html
* https://en.wikipedia.org/wiki/Random_seed

## Running tensors and PyTorch objects on GPUs
GPUs = parallelization of matrix operations = faster computation

### Getting a GPU
1. Free GPUs on Google Colab, suited for small experiments and projects
2. Get your own GPU rack, see https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
3. Cloud computing on GCP, AWS, Azure, etc.

In [333]:
!nvidia-smi

The command "nvidia-smi" is either misspelled or
could not be found.
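GPU availability can also be checked from within PyTorch, which works whether or not `nvidia-smi` is installed. The following is a minimal sketch of a common pattern that falls back to the CPU when no CUDA device is visible; the variable names are illustrative.

```python
import torch

# Fall back to the CPU when no CUDA-capable GPU is visible to PyTorch.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device, torch.cuda.device_count())

# Create the tensor directly on the chosen device.
tensor = torch.tensor([1., 2., 3.], device=device)
print(tensor.device)

# NumPy only works with CPU memory, so move the tensor back before converting.
array = tensor.cpu().numpy()
print(array)    # [1. 2. 3.]
```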
