# Practical 1: PyTorch basics I
---

In [1]:
import torch

## 0. Introduction

In this practical lab, we discuss the basics of PyTorch, a Python library for machine learning and deep learning. Specifically, we cover
- Tensors and their initialisation
- Tensor math
- Tensor indexing
- Tensor reshaping
- The three common mistakes (or errors) in PyTorch 


## 1. Tensor Initialisation

PyTorch tensors can represent scalars (single numbers), vectors (lists of numbers), matrices (grids of numbers), and general $n$-dimensional arrays (tensors). Let $k$ be a scalar, $\mathbf{v}$ and $\mathbf{w}$ be vectors, $A$ and $B$ be matrices, and $\mathbf{T}$ be a tensor, such that
$$k=7,\quad \mathbf{v} = \begin{pmatrix} 2 \\ 2 \\ 1 \\ \end{pmatrix}, 
\quad \mathbf{w} = \begin{pmatrix} 13 \\ 20 \\ 10 \\ 12\\ \end{pmatrix}, \quad
A = \begin{bmatrix} 2 & 1 & 2 \\ 3 & 2 & 7 \\ 2 & 3 & 6 \\ \end{bmatrix}, \quad
B = \begin{bmatrix} 12 & 21 & 12 \\ 15 & 12 & 27 \\ 22 & 13 & 16 \\ \end{bmatrix}, \quad
\mathbf{T} = 
\begin{bmatrix} 
\begin{bmatrix} 2 & 1 & 2 \\ 3 & 2 & 7 \\ 2 & 3 & 6 \\ \end{bmatrix} \\
\begin{bmatrix} 12 & 21 & 12 \\ 15 & 12 & 27 \\ 22 & 13 & 16 \\ \end{bmatrix}
\end{bmatrix} = \begin{bmatrix} A \\ B \end{bmatrix}
$$

We can store this data in PyTorch tensors as follows:

In [2]:
# scalars
scalar_k = torch.tensor(7)
scalar_k

tensor(7)

In [3]:
# vectors 
vector_v = torch.tensor([2,2,1])
vector_w = torch.tensor([13,20,10,12])
vector_v, vector_w

(tensor([2, 2, 1]), tensor([13, 20, 10, 12]))

In [4]:
# matrices
matrix_A = torch.tensor([
    [2,1,2],
    [3,2,7],
    [2,3,6]
])
matrix_B = torch.tensor([
    [12,21,12],
    [15,12,27],
    [22,13,16]
])
matrix_A, matrix_B

(tensor([[2, 1, 2],
         [3, 2, 7],
         [2, 3, 6]]),
 tensor([[12, 21, 12],
         [15, 12, 27],
         [22, 13, 16]]))

In [5]:
# tensors (n-dimensional arrays)
tensor_T = torch.tensor([
    [
        [2,1,2],
        [3,2,7],
        [2,3,6]
    ],
    [
        [12,21,12],
        [15,12,27],
        [22,13,16]
    ]
])
tensor_T

tensor([[[ 2,  1,  2],
         [ 3,  2,  7],
         [ 2,  3,  6]],

        [[12, 21, 12],
         [15, 12, 27],
         [22, 13, 16]]])

### Dimension and shape

In [6]:
# a scalar has 0 dimensions
scalar_k.ndim

0

In [7]:
# vectors are one-dimensional
vector_v.ndim, vector_w.ndim

(1, 1)

In [8]:
# matrices are two-dimensional
matrix_A.ndim, matrix_B.ndim

(2, 2)

In [9]:
# Tensors are three-dimensional and up
tensor_T.ndim

3

We now investigate the shape of tensors:

In [10]:
scalar_k.shape

torch.Size([])

A scalar has an empty shape, since it has no dimensions. What about vectors?

In [11]:
vector_w.shape

torch.Size([4])

The shape of a vector gives its number of elements. For example, the shape of $\mathbf{w}$ is $4$, which means $\mathbf{w}$ has $4$ elements.

In [12]:
matrix_A.shape, tensor_T.shape

(torch.Size([3, 3]), torch.Size([2, 3, 3]))

So in general, **the shape of a tensor describes the number of elements along each dimension**. For example, tensor $\mathbf{T}$ has $2$ elements in the first dimension (the two matrices), each matrix has $3$ rows (the second dimension), and each row has $3$ entries (the third dimension).
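
As a small illustration of how shape relates to indexing, the sketch below rebuilds $\mathbf{T}$ (under the hypothetical name `T`, so that the snippet is self-contained) and shows that indexing away the leading dimension leaves a tensor whose shape is made of the remaining dimensions:

```python
import torch

# Rebuild T from the example above: two 3x3 matrices stacked together
T = torch.tensor([
    [[2, 1, 2], [3, 2, 7], [2, 3, 6]],
    [[12, 21, 12], [15, 12, 27], [22, 13, 16]],
])

print(T.shape)        # torch.Size([2, 3, 3])
print(T[0].shape)     # torch.Size([3, 3]) -> the first matrix
print(T[0, 1].shape)  # torch.Size([3])    -> second row of the first matrix
print(T[0, 1, 2])     # tensor(7)          -> a single entry (0-dimensional)
```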

### Devices

PyTorch can run on a CPU or on a GPU. We set ```device``` to ```'cuda'``` if a CUDA-enabled GPU is available and fall back to ```'cpu'``` otherwise:

In [13]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cpu'

To initialize a new tensor directly on a given device, we pass the ```device``` argument:


In [14]:
new_tensor = torch.tensor([1,2,3], dtype=torch.float32, device=device)
new_tensor.device

device(type='cpu')

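To move an existing tensor to the device, we use ```.to(device)```. It returns a tensor on that device rather than modifying the original, so assign the result if you need to keep it: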

In [15]:
tensor_T.to(device).device

device(type='cpu')

### NumPy and Pandas 

In [16]:
import pandas as pd
import numpy as np

In [17]:
np_array = np.array([2,1,3])
np_array

array([2, 1, 3])

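To create a tensor from a NumPy ```ndarray```, we can use either ```torch.tensor(...)```, which copies the data, or ```torch.from_numpy(...)```, which shares memory with the array: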

In [18]:
tensor_from_numpy = torch.tensor(np_array)
tensor_from_numpy

tensor([2, 1, 3])

In [44]:
tnsr_from_numpy = torch.from_numpy(np_array)
tnsr_from_numpy

tensor([2, 1, 3])

To convert a tensor into a NumPy ```ndarray```, we use the following code:

In [19]:
tensor_from_numpy.numpy()

array([2, 1, 3])
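
The difference between copying and sharing memory matters as soon as one side is modified. The sketch below uses hypothetical names (`arr`, `copied`, `shared`) and fixes the array dtype to `int64` so the printed output is the same on every platform:

```python
import numpy as np
import torch

arr = np.array([2, 1, 3], dtype=np.int64)
copied = torch.tensor(arr)      # copies the data
shared = torch.from_numpy(arr)  # shares memory with arr

arr[0] = 100  # modify the NumPy array in place

print(copied)  # tensor([2, 1, 3])     -> unaffected
print(shared)  # tensor([100, 1, 3])   -> sees the change
```

If the array and the tensor should evolve independently, `torch.tensor(...)` is the safer choice.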

To convert a pandas DataFrame to a tensor and vice versa:

In [20]:
# Example Pandas DataFrame
df = pd.DataFrame({
    'A': [1.0, 2.0, 3.0],
    'B': [4.0, 5.0, 6.0]
})

In [21]:
# from pandas to tensor
torch.tensor(df.values, dtype=torch.float32)

tensor([[1., 4.],
        [2., 5.],
        [3., 6.]])

In [22]:
# from tensor to pandas
pd.DataFrame(matrix_A.numpy(), columns=['A', 'B', 'C']) # specify the columns 

   A  B  C
0  2  1  2
1  3  2  7
2  2  3  6


### Other common initialization methods

In [23]:
# creates a tensor with uninitialised entries (they hold whatever was already in memory)
torch.empty((1,2))

tensor([[-8.2505e+11,  9.3467e-43]])

In [24]:
# creates a tensor filled with zeros
torch.zeros((3,4))

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [25]:
# creates a tensor filled with ones
torch.ones((2,2,4))

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

In [26]:
# creates a tensor with entries drawn uniformly from [0, 1)
torch.rand(1, dtype=torch.float64, device=device)

tensor([0.3435], dtype=torch.float64)

In [27]:
# creates an identity matrix
torch.eye(7, dtype=torch.int8)

tensor([[1, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 1]], dtype=torch.int8)

In [28]:
# returns 'steps' evenly spaced values from 'start' to 'end' (both included)
torch.linspace(start=10, end=90, steps=10)

tensor([10.0000, 18.8889, 27.7778, 36.6667, 45.5556, 54.4444, 63.3333, 72.2222,
        81.1111, 90.0000])

In [35]:
# works similarly to Python's range function ('end' is excluded)
torch.arange(start=10, end=22, step=2)

tensor([10, 12, 14, 16, 18, 20])

In [34]:
list(range(10,22,2))

[10, 12, 14, 16, 18, 20]

In [38]:
# fills the tensor in place with samples from the normal distribution parameterised by mean and std
torch.empty(size=(1,10)).normal_(mean=0, std=1)

tensor([[ 0.2447,  0.7794, -0.0952, -0.5203, -0.3524, -1.8581, -0.2164,  0.2737,
          0.4727,  0.1196]])

In [42]:
# fills the tensor in place with samples from the uniform distribution between 12 and 15
torch.empty(size=(1,100)).uniform_(12, 15)

tensor([[14.8226, 12.0745, 12.6554, 14.4834, 14.5791, 14.6777, 14.9227, 13.0805,
         14.9706, 13.4336, 13.6891, 14.9503, 14.5773, 14.6093, 13.4869, 14.9141,
         14.2065, 12.4368, 14.0307, 12.9429, 12.2360, 13.9300, 13.5394, 12.5432,
         14.5043, 14.0182, 14.2424, 12.2127, 14.2983, 14.2305, 14.7003, 12.4082,
         13.6703, 13.1115, 12.9725, 14.1429, 14.6416, 12.1552, 14.5767, 12.4684,
         12.4718, 13.1372, 12.0812, 13.7628, 14.1279, 14.7925, 13.1097, 14.5880,
         14.2641, 13.1057, 12.2301, 14.6504, 14.1189, 12.4299, 14.6281, 14.1733,
         14.7446, 13.1821, 12.9533, 14.5342, 13.3801, 14.0148, 12.0348, 13.2964,
         14.8919, 13.0803, 13.9430, 12.4567, 12.0293, 14.3399, 13.9862, 14.3409,
         13.4746, 14.3799, 13.1609, 13.8925, 13.0767, 13.0064, 12.1014, 13.2868,
         12.4728, 12.1787, 14.9755, 12.6716, 13.9604, 14.4200, 14.5912, 12.7536,
         14.9629, 13.0740, 14.9758, 12.3238, 12.9955, 14.0515, 13.3790, 12.1428,
         14.9291, 14.3067, 1

**Tensor Types**

In [45]:
tensor = torch.ones(size=(1,9))

In [52]:
tensor.bool()
tensor.short() # int16
tensor.long() # int64
tensor.float() # float32
tensor.half() # float16
tensor.double() # float64

tensor([[1., 1., 1., 1., 1., 1., 1., 1., 1.]], dtype=torch.float64)
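
Each of these casting methods returns a new tensor and leaves the original untouched. A minimal sketch, assuming the default dtype has not been changed from `float32`; the name `as_int` is only illustrative:

```python
import torch

t = torch.ones(size=(1, 9))   # float32 by default
print(t.dtype)                # torch.float32

as_int = t.to(torch.int32)    # general form of the shortcut methods
print(as_int.dtype)           # torch.int32
print(t.long().dtype)         # torch.int64
print(t.dtype)                # torch.float32 (t itself is unchanged)
```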

## 2. Tensor Math

In [55]:
tensor_X=torch.tensor([1,2,3])
tensor_Y = torch.tensor([2,3,4])

**Addition**

Let $Z$ be $$Z=X+Y$$

In [62]:
tensor_Z = torch.empty((1,3))
tensor_Z

tensor([[1.0000e+00, 4.4766e+00, 3.6013e-43]])

In [64]:
torch.add(tensor_X, tensor_Y, out=tensor_Z.resize_(0))
tensor_Z

tensor([3., 5., 7.])

In [58]:
tensor_Z = torch.add(tensor_X, tensor_Y)
tensor_Z

tensor([3, 5, 7])

In [60]:
tensor_Z = tensor_X.add(tensor_Y)
tensor_Z

tensor([3, 5, 7])

In [65]:
tensor_Z = tensor_X + tensor_Y
tensor_Z

tensor([3, 5, 7])

**Subtraction**

Let $Z$ now be $$Z=X-Y$$

In [67]:
tensor_Z = tensor_X-tensor_Y
tensor_Z

tensor([-1, -1, -1])

**Division**

Let $k$ be a scalar and $Z$ be a vector:

$$k=2, \qquad Z = X \oslash Y, \qquad \text{ where } \oslash \text{ represents element-wise division}$$

In [69]:
scalar_k = torch.tensor(2)
scalar_k

tensor(2)

In [70]:
# tensor-tensor division
tensor_Z = torch.true_divide(tensor_X, tensor_Y)
tensor_Z

tensor([0.5000, 0.6667, 0.7500])

In [72]:
# scalar division
torch.true_divide(tensor_X, scalar_k)

tensor([0.5000, 1.0000, 1.5000])
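
True division always produces floating-point results, even for integer inputs. If integer (floor) division is wanted instead, one option is `torch.div` with `rounding_mode="floor"`. This is a sketch that assumes a reasonably recent PyTorch (the `rounding_mode` argument was added in version 1.8):

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([2, 3, 4])

true_div = x / y                                    # same result as torch.true_divide
floor_div = torch.div(x, y, rounding_mode="floor")  # integer (floor) division

print(true_div)   # tensor([0.5000, 0.6667, 0.7500])
print(floor_div)  # tensor([0, 0, 0])
```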

**Inplace operations**

In-place operations modify a tensor's existing memory instead of creating a new tensor for the result. For large tensors this can save memory, but it comes with a downside: the original values are overwritten and cannot be recovered. The following are in-place operations. In general, every method with a trailing underscore (such as `add_`) is an in-place operation.

In [73]:
tensor_X.add_(tensor_Y) # inplace operation

tensor([3, 5, 7])

In [76]:
tensor_Y += tensor_Y + 1 # inplace operation

In [77]:
tensor_Y

tensor([5, 7, 9])

In [78]:
tensor_X = tensor_X + tensor_Y # not an inplace operation (a copy is created during operation)
tensor_X

tensor([ 8, 12, 16])
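
One way to see the difference is to compare the address of the underlying storage before and after each operation. This is a small sketch using `data_ptr()`; the variable names are illustrative:

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([2, 3, 4])

ptr_before = x.data_ptr()   # address of x's underlying storage

x.add_(y)                   # in-place: writes the result into x's memory
same_after_inplace = x.data_ptr() == ptr_before

x = x + y                   # out-of-place: allocates a new tensor
same_after_copy = x.data_ptr() == ptr_before

print(same_after_inplace)   # True
print(same_after_copy)      # False
```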

**Exponentiation and element-wise comparison**

Exponentiation uses the same `**` operator as standard Python (or, equivalently, `torch.pow`) and is applied element-wise. Element-wise comparisons (`<`, `>`, `==`, and so on) behave as in NumPy and pandas and return boolean tensors.
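
A short sketch of both, using fresh copies of the example vectors so that the earlier in-place cells do not affect the result:

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([2, 3, 4])

print(x ** 2)           # tensor([1, 4, 9])
print(torch.pow(x, 2))  # tensor([1, 4, 9])
print(x < y)            # tensor([True, True, True])
print(x == 2)           # tensor([False,  True, False])
print(x >= 2)           # tensor([False,  True,  True])
```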

**Matrix Multiplication and Exponentiation**

Let $A$ and $B$ be matrices:
$$
A = \begin{bmatrix} 2 & 1 & 2 \\ 3 & 2 & 7 \\ 2 & 3 & 6 \\ \end{bmatrix}, \quad
B = \begin{bmatrix} 12 & 21 & 12 \\ 15 & 12 & 27 \\ 22 & 13 & 16 \\ \end{bmatrix}, \quad
$$
We want to compute $A^5$ and $AB$ 

In [79]:
matrix_A.matrix_power(5)

tensor([[ 9428,  9760, 22484],
        [25328, 26224, 60416],
        [24460, 25328, 58348]])

In [81]:
AB = torch.mm(matrix_A, matrix_B)
AB

tensor([[ 83,  80,  83],
        [220, 178, 202],
        [201, 156, 201]])

**Element-wise multiplication and the dot-product**

Suppose $\mathbf{w}$ and $\mathbf{v}$ are the following vectors:
$$\mathbf{v} = \begin{pmatrix} 2 \\ 2 \\ 1 \\ \end{pmatrix}, 
\quad \mathbf{w} = \begin{pmatrix} 13 \\ 20 \\ 10 \\ \end{pmatrix}$$
We want to compute the dot product $\mathbf{w}\bullet \mathbf{v}$ and $\mathbf{w}\otimes \mathbf{v}$, where $\otimes$ is element-wise multiplication.

In [84]:
vector_w = torch.tensor((13,20,10))

In [85]:
torch.dot(vector_w, vector_v)

tensor(76)

In [86]:
vector_w * vector_v

tensor([26, 40, 10])

**Batch matrix multiplication**

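Suppose we have tensors $\mathbf{A}\in\mathbb{R}^{b\times n \times m}$ and $\mathbf{B}\in\mathbb{R}^{b\times m \times p}$

We want to compute the batched product $\mathbf{AB}\in\mathbb{R}^{b\times n \times p}$, in which the $i$-th matrix of $\mathbf{A}$ is multiplied by the $i$-th matrix of $\mathbf{B}$. (**TODO:** Investigate this more) 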

In [87]:
# suppose we have
batch=32
n=10
m=20
p=30

In [90]:
T_A = torch.rand((batch, n, m))
T_B = torch.rand((batch, m, p))
T_AB = torch.bmm(T_A, T_B)
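
To check what `torch.bmm` computes, the sketch below compares one slice of the result with an ordinary matrix product and with the `@` operator. The tolerance passed to `torch.allclose` is an arbitrary choice to absorb floating-point rounding:

```python
import torch

batch, n, m, p = 32, 10, 20, 30
T_A = torch.rand((batch, n, m))
T_B = torch.rand((batch, m, p))

T_AB = torch.bmm(T_A, T_B)
print(T_AB.shape)  # torch.Size([32, 10, 30])

# Each slice of the result is an ordinary matrix product
first = torch.mm(T_A[0], T_B[0])
print(torch.allclose(T_AB[0], first, atol=1e-5))    # True

# The @ operator (torch.matmul) also handles batched inputs
print(torch.allclose(T_AB, T_A @ T_B, atol=1e-5))   # True
```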

**Broadcasting**

If we have a matrix $A$ and a vector $\mathbf{w}$, mathematically it doesn't make sense to compute 
$A+\mathbf{w}$, but in PyTorch it is possible, since $\mathbf{w}$ is _broadcast_ to match $A$'s dimensions. A one-dimensional $\mathbf{w}$ is treated as a row and repeated for every row of $A$, so in PyTorch $A+\mathbf{w}$ becomes

$$A+\mathbf{w} = 
\begin{bmatrix} 2 & 1 & 2 \\ 3 & 2 & 7 \\ 2 & 3 & 6 \\ \end{bmatrix} + 
\begin{pmatrix} 13 & 20 & 10 \\ \end{pmatrix} = 
\begin{bmatrix} 2 & 1 & 2 \\ 3 & 2 & 7 \\ 2 & 3 & 6 \\ \end{bmatrix} + 
\begin{pmatrix} 13 & 20 & 10 \\ 13 & 20 & 10 \\ 13 & 20 & 10 \\ \end{pmatrix}
= \begin{bmatrix} 15 & 21 & 12 \\ 16 & 22 & 17 \\ 15 & 23 & 16 \\ \end{bmatrix}
$$

In [93]:
matrix_A + vector_w

tensor([[15, 21, 12],
        [16, 22, 17],
        [15, 23, 16]])
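
A one-dimensional vector is aligned with the last dimension of the matrix, so it is added to every row. To add it to every column instead, it can be turned into a column of shape `(3, 1)` first. A sketch of both behaviours, with the tensors rebuilt under the hypothetical names `A` and `w`:

```python
import torch

A = torch.tensor([[2, 1, 2], [3, 2, 7], [2, 3, 6]])
w = torch.tensor([13, 20, 10])

row_wise = A + w               # w has shape (3,), treated as a row (1, 3)
col_wise = A + w.unsqueeze(1)  # w reshaped to a column (3, 1)

print(row_wise)
# tensor([[15, 21, 12],
#         [16, 22, 17],
#         [15, 23, 16]])
print(col_wise)
# tensor([[15, 14, 15],
#         [23, 22, 27],
#         [12, 13, 16]])
```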

**Other commonly used mathematical operations**
- ```torch.sum(...)```
- ```torch.max(...)```, ```torch.min(...)```
- ```torch.abs(...)```
- ```torch.argmax(...)```, ```torch.argmin(...)```
- ```torch.mean(...)```
- ```torch.median(...)```
- ```torch.eq(...)```, equality
- ```torch.sort(...)```
- ```torch.clamp(...)``` , see documentation and examples
- ```torch.any(...)```
- ```torch.all(...)```
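
A few of these applied to the matrix $A$ from earlier, as a quick sketch. Most reductions accept a `dim` argument to reduce along one dimension only, and `torch.mean` needs a floating-point tensor, hence the `.float()` call:

```python
import torch

A = torch.tensor([[2, 1, 2], [3, 2, 7], [2, 3, 6]])

print(torch.sum(A))             # tensor(28)
print(torch.sum(A, dim=0))      # tensor([ 7,  6, 15])  (column sums)

values, indices = torch.max(A, dim=1)  # row-wise maximum and where it occurs
print(values)                   # tensor([2, 7, 6])

print(torch.argmax(A))          # tensor(5): flat index of the largest entry (7)
print(torch.mean(A.float()))    # tensor(3.1111)
print(torch.clamp(A, min=2, max=5))
# tensor([[2, 2, 2],
#         [3, 2, 5],
#         [2, 3, 5]])
```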

## 3. Tensor Indexing

Tensor indexing works as follows:

```python
tensor[n_1, n_2, n_3, ...]
```

- $n_1$: Refers to the specific index(es) desired in the **first dimension**.
- $n_2$: Refers to the specific index(es) desired in the **second dimension**.
- $n_3$: Refers to the specific index(es) desired in the **third dimension**, and so on.

For example, consider a matrix $A$,
$$\begin{bmatrix} 2 & 1 & 2 \\ 3 & 2 & 7 \\ 2 & 3 & 6 \\ \end{bmatrix}$$
which has the shape ```(rows, columns)``` -> ```(3, 3)```. If you run 
```python
matrix_A[2,1]
```
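This asks for the entry at index $2$ in the first dimension and index $1$ in the second. Since indexing starts at $0$, that is row **3** and column **2**, which holds the value $3$.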

In [95]:
matrix_A[2,1]

tensor(3)

In [98]:
# the two index lists are paired up: picks the entries at (0, 2) and (2, 1)
matrix_A[(0,2), [2,1]]

tensor([2, 3])
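
Slices work along each dimension just as they do for Python lists and NumPy arrays. A short sketch on the same matrix, rebuilt here as `A`:

```python
import torch

A = torch.tensor([[2, 1, 2], [3, 2, 7], [2, 3, 6]])

print(A[0])        # tensor([2, 1, 2])  first row
print(A[:, 1])     # tensor([1, 2, 3])  second column
print(A[1:, :2])   # tensor([[3, 2],
                   #         [2, 3]])   rows 1-2, columns 0-1
print(A[-1, -1])   # tensor(6)          last row, last column
```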

Get all the elements less than or equal to 3:

In [101]:
three_bounded=matrix_A[matrix_A<=3]
three_bounded

tensor([2, 1, 2, 3, 2, 2, 3])

Get unique elements from a matrix/tensor

In [102]:
three_bounded.unique()

tensor([1, 2, 3])

Using the ```torch.where(...)``` function

```python
torch.where(condition, value_if_true, value_if_false)
```
This is similar to Excel's `IF` function, applied element-wise.

In [103]:
# if element is greater than 3, divide it by 3, else multiply it by 3
torch.where(matrix_A > 3, matrix_A / 3, matrix_A * 3)

tensor([[6.0000, 3.0000, 6.0000],
        [9.0000, 6.0000, 2.3333],
        [6.0000, 9.0000, 2.0000]])

In [104]:
# Get the number of elements in the tensor
matrix_A.numel()

9

## 4. Tensor Reshaping

Reshaping and related operations in PyTorch are essential when working with tensors, as they allow you to adjust the structure of data without altering its content.

**Reshaping: `reshape` and `view`**

**`reshape`**
- Allows you to change the shape of a tensor while maintaining its data.
- The new shape must have the same total number of elements as the original tensor.

Example:
```python
import torch
a = torch.arange(12)  # Tensor with 12 elements [0, 1, ..., 11]
reshaped = a.reshape(3, 4)  # Reshape to 3x4 matrix
print(reshaped)
```

**`view`**
- Similar to `reshape`, but it never copies: the result always shares the same underlying data as the original tensor.
- It requires the new shape to be compatible with the tensor's memory layout, which is always the case when the tensor is contiguous in memory.

Example:
```python
viewed = a.view(3, 4)  # View as 3x4 matrix
print(viewed)
```

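**Note**: `view` raises an error when the requested shape is not compatible with the tensor's memory layout (typically after `transpose` or `permute`). `reshape` is the safer default because it copies the data when a view is not possible.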
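
The following sketch reproduces the situation described in the note: a transposed tensor is not contiguous, so `view` fails while `reshape` succeeds. It also shows `-1`, which asks PyTorch to infer one dimension from the others:

```python
import torch

a = torch.arange(12).reshape(3, 4)
t = a.t()                    # transpose: same data, different strides
print(t.is_contiguous())     # False

try:
    t.view(12)               # view cannot reinterpret this memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

flat = t.reshape(12)         # reshape falls back to copying the data
print(flat)                  # tensor([ 0,  4,  8,  1,  5,  9,  2,  6, 10,  3,  7, 11])

print(a.reshape(2, -1).shape)  # torch.Size([2, 6]); -1 infers the missing size
```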

**Stacking: `stack` and `cat`**

**`stack`**
- Combines multiple tensors along a new dimension.

Example:
```python
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
stacked = torch.stack([a, b], dim=0)  # Add a new dimension (0)
print(stacked)  # [[1, 2, 3], [4, 5, 6]]
```

**`cat`**
- Concatenates tensors along an existing dimension.

Example:
```python
concatenated = torch.cat([a.unsqueeze(0), b.unsqueeze(0)], dim=0)
print(concatenated)  # [[1, 2, 3], [4, 5, 6]]
```

**Adding and Removing Dimensions: `squeeze` and `unsqueeze`**

**`unsqueeze`**
- Adds a new dimension of size 1 at the specified position.

Example:
```python
a = torch.tensor([1, 2, 3])
unsqueezed = a.unsqueeze(0)  # Add a dimension at 0 (row vector)
print(unsqueezed)  # [[1, 2, 3]]
```

**`squeeze`**
- Removes dimensions of size 1.

Example:
```python
b = torch.tensor([[1, 2, 3]])  # Shape [1, 3]
squeezed = b.squeeze(0)  # Remove the 0th dimension
print(squeezed)  # [1, 2, 3]
```

**Transposing: `transpose` and `permute`**

**`transpose`**
- Swaps two specified dimensions of a tensor.

Example:
```python
a = torch.tensor([[1, 2, 3], [4, 5, 6]])  # Shape [2, 3]
transposed = a.transpose(0, 1)  # Swap dimensions 0 and 1
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```

**`permute`**
- Allows reordering of all dimensions.

Example:
```python
a = torch.randn(2, 3, 4)  # Shape [2, 3, 4]
permuted = a.permute(2, 0, 1)  # Change to [4, 2, 3]
```

**Flattening: `flatten`**

- Collapses a range of dimensions into one (all of them by default).

Example:
```python
a = torch.tensor([[1, 2], [3, 4]])
flattened = a.flatten()  # Flattens to [1, 2, 3, 4]
```

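**Splitting: `chunk` and `split`**

**`chunk`**
- Splits a tensor into a given number of chunks along a specified dimension. The chunks are equal-sized when the dimension is divisible by that number; otherwise the last chunk is smaller.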

Example:
```python
a = torch.arange(10)  # [0, 1, ..., 9]
chunks = torch.chunk(a, 5)  # Split into 5 chunks
print(chunks)
```

**`split`**
- Splits a tensor into chunks of specified sizes.

Example:
```python
splits = torch.split(a, [3, 7])  # Split into chunks of sizes 3 and 7
print(splits)
```
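
When the size does not divide evenly, the two functions handle the remainder in the same way. The sketch below also shows that `split` accepts a single integer, meaning "chunks of this size":

```python
import torch

a = torch.arange(10)

chunks = torch.chunk(a, 3)   # asks for 3 chunks; 10 is not divisible by 3
print([c.tolist() for c in chunks])
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

parts = torch.split(a, 4)    # an int means "chunks of this size"
print([p.tolist() for p in parts])
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```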

**Expanding and Repeating: `expand` and `repeat`**

**`expand`**
- Expands a tensor along singleton dimensions (without copying data).

Example:
```python
a = torch.tensor([1, 2, 3]).unsqueeze(0)  # [1, 3]
expanded = a.expand(3, 3)  # Expand to [3, 3]
```
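
That `expand` does not copy can be checked from the strides: the expanded dimension gets stride 0, so every row reads the same memory. Because of this sharing, it is usually safer to `clone()` an expanded tensor before writing to it in place. A sketch, where `materialised` is a hypothetical name:

```python
import torch

a = torch.tensor([1, 2, 3]).unsqueeze(0)  # shape [1, 3]
expanded = a.expand(3, 3)

print(expanded.shape)     # torch.Size([3, 3])
print(expanded.stride())  # (0, 1): stride 0 means every row reads the same memory
print(expanded.data_ptr() == a.data_ptr())  # True, no data was copied

materialised = expanded.clone()  # independent copy that is safe to modify
materialised[0, 0] = 99
print(a)                  # tensor([[1, 2, 3]]) (the original is untouched)
```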

**`repeat`**
- Repeats elements of a tensor (copies data).

Example:
```python
repeated = a.repeat(3, 1)  # Repeat 3 times along dim 0 and once along dim 1 -> shape [3, 3]
```

### When to Use Which Function:
1. **`reshape`**: To change the shape of data.
2. **`stack`/`cat`**: To combine tensors.
3. **`squeeze`/`unsqueeze`**: To adjust dimensions.
4. **`transpose`/`permute`**: To reorder dimensions.
5. **`split`/`chunk`**: To divide tensors.
6. **`expand`/`repeat`**: For broadcasting-like behaviors.


## 5. Common mistakes

Head over to the [learnpytorch.io](https://www.learnpytorch.io/pytorch_most_common_errors/) tutorial for this one.
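
As a rough sketch only, assuming the three errors in question are shape, data type, and device mismatches (an assumption about the linked tutorial, not a summary of it), the first two can be reproduced on a CPU-only machine, and the third is avoided by moving both operands to the same device:

```python
import torch

A = torch.rand(3, 4)
B = torch.rand(3, 4)

# 1. Shape mismatch: inner dimensions (4 and 3) do not agree
try:
    torch.mm(A, B)
except RuntimeError as err:
    print("shape error:", type(err).__name__)
print(torch.mm(A, B.t()).shape)  # torch.Size([3, 3]) after transposing B

# 2. Dtype mismatch: mm expects both operands to share a dtype
ints = torch.ones(4, 2, dtype=torch.int64)
try:
    torch.mm(A, ints)
except RuntimeError as err:
    print("dtype error:", type(err).__name__)
print(torch.mm(A, ints.float()).shape)  # torch.Size([3, 2])

# 3. Device mismatch: move both tensors to the same device before combining them
device = "cuda" if torch.cuda.is_available() else "cpu"
print((A.to(device) + B.to(device)).device.type == device)  # True
```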

---