In [2]:
import torch
import numpy as np
from PIL import Image
from pathlib import Path

**Some tensor exercises to regain familiarity with PyTorch**

In [None]:
# generate a random tensor of shape (3, 4), i.e. a matrix rather than a scalar
random_tensor = torch.rand(size = (3, 4))
random_tensor

tensor([[0.5600, 0.0561, 0.4043, 0.7830],
        [0.8432, 0.1763, 0.0597, 0.8329],
        [0.5180, 0.9345, 0.3740, 0.5444]])

In [None]:
# check dimensions
random_tensor.ndim

2

In [None]:
random_tensor.shape

torch.Size([3, 4])

To find the number of dimensions of a tensor, count the opening square brackets [ at the very start of the printed tensor (i.e. the leftmost square brackets).

To think about tensor shapes, we can think of the first dimension as the "count". For example, a tensor of shape [2, 4, 5] can be interpreted as:
- 2 matrices with shape [4, 5]
- where [4, 5] represents 4 vectors (rows) with 5 elements each

A tensor of shape [3, 4] can likewise be interpreted as a matrix with 3 vectors (rows), each of length 4 (aka 4 elements).
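
The "count" reading of a shape can be checked by indexing one level at a time. This is a minimal sketch; the name `example` and the use of `torch.zeros` are only for illustration.

```python
import torch

# A tensor of shape [2, 4, 5]: 2 matrices, each with 4 rows of 5 elements.
example = torch.zeros(size=(2, 4, 5))

print(example.ndim)         # 3
print(example.shape)        # torch.Size([2, 4, 5])
print(example[0].shape)     # one matrix: torch.Size([4, 5])
print(example[0][0].shape)  # one row:    torch.Size([5])
```

Each level of indexing strips the leading entry off the shape, which is why the first dimension reads naturally as "how many of the rest".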

In [None]:
custom_matrix = torch.tensor([[[1, 3, 4, 5],
                               [2, 4, 5, 5],
                               [1, 0, 9, 9]]])

The matrix above has 3 dimensions, and appears to have shape [1, 3, 4] (one matrix with 3 vectors, where each vector contains 4 elements).

In [None]:
custom_matrix.shape

torch.Size([1, 3, 4])

PyTorch also supports many of the same creation functions as numpy:
- torch.zeros(size = ())
- torch.ones(size = ())
- torch.arange(start =, end =, step =) (note: the upper bound is called end, not stop as in numpy)
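
A quick sketch of the three creation functions listed above. The variable names are made up for this example.

```python
import torch

zeros = torch.zeros(size=(2, 3))               # 2 x 3 matrix of 0.0
ones = torch.ones(size=(2, 3))                 # 2 x 3 matrix of 1.0
steps = torch.arange(start=0, end=10, step=2)  # end is exclusive

print(zeros.shape, ones.shape)
print(steps)  # tensor([0, 2, 4, 6, 8])
```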

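Some **parameters** that can be specified in torch.tensor() include:
- dtype (default: None, meaning it is inferred from the data: float32 for Python floats, int64 for Python ints)
- device (default: None, meaning the current default device, normally the CPU)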
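
To see how dtype is picked when it is not given, here is a small sketch. It assumes PyTorch's default float dtype has not been changed from float32; the variable names are illustrative.

```python
import torch

ints = torch.tensor([1, 2, 3])          # Python ints   -> torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats -> torch.float32
explicit = torch.tensor([1, 2, 3], dtype=torch.float16, device="cpu")

print(ints.dtype, floats.dtype, explicit.dtype)
print(explicit.device)  # cpu
```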

In [None]:
custom_matrix.dtype

torch.int64

**Tensor Operations**
- Adding, subtracting and multiplying by a scalar (e.g. tensor + 10) is applied to every element via broadcasting
- Can perform element-wise multiplication via torch.mul() (same as tensor * tensor)
- Can perform matrix multiplication with matrix @ matrix or torch.matmul()
- Two methods of transpose: torch.transpose(input, dim0, dim1) or tensor.T (for 2D tensors)
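
The operations above, sketched on two small 2 x 2 matrices. The names `a` and `b` are invented for this example.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[10, 20], [30, 40]])

added = a + 10                  # scalar is broadcast to every element
elementwise = torch.mul(a, b)   # same as a * b
matmul = a @ b                  # same as torch.matmul(a, b)
transposed = torch.transpose(a, 0, 1)  # same as a.T for a 2D tensor

print(added)        # tensor([[11, 12], [13, 14]])
print(elementwise)  # tensor([[ 10,  40], [ 90, 160]])
print(matmul)       # tensor([[ 70, 100], [150, 220]])
print(transposed)   # tensor([[1, 3], [2, 4]])
```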

**Aggregate Operations**
- Mean: .mean() requires a floating-point dtype, so convert integer tensors to float32 first, then use .mean()
- Can directly use tensor.min(), .max(), and .sum()
- We can also return the **index** of where the max and min are found using .argmax() and .argmin()
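
A short sketch of the aggregates on an integer tensor. The values are chosen here without ties so that the argmin/argmax positions are unambiguous.

```python
import torch

values = torch.tensor([3, 1, 5, 2, 6])

total = values.sum()                       # tensor(17)
smallest, largest = values.min(), values.max()
idx_min = values.argmin()                  # position of the 1
idx_max = values.argmax()                  # position of the 6
mean = values.type(torch.float32).mean()   # cast first: int tensors have no .mean()

print(total, smallest, largest, idx_min, idx_max, mean)
```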

**Changing Type of Tensor**
- Given an existing tensor, we can use tensor.type()

In [None]:
tensor = torch.tensor([1, 3, 5, 5, 2, 6, 1])

tensor.type(torch.float32).mean()

tensor(3.2857)

In [None]:
tensor.max()

tensor(6)

**Quick note on datatypes**
- The dtype determines the precision with which each value of the tensor is stored in memory
- Ex) 64-bit floats are more precise than 32-bit floats
- Lower-precision computations take less time and memory and lead to a smaller overall model, but can sacrifice accuracy
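
To make the precision trade-off concrete, this sketch stores 1/3 at three precisions and compares storage size and rounding error. The variable names are illustrative only.

```python
import torch

value = 1 / 3  # Python float (64-bit)

as_64 = torch.tensor(value, dtype=torch.float64)
as_32 = torch.tensor(value, dtype=torch.float32)
as_16 = torch.tensor(value, dtype=torch.float16)

# bytes used per element: 8, 4 and 2
print(as_64.element_size(), as_32.element_size(), as_16.element_size())

# rounding error relative to the 64-bit value grows as precision drops
err_32 = abs(as_32.item() - value)
err_16 = abs(as_16.item() - value)
print(err_32, err_16)
```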

In [None]:
tensor_16 = torch.arange(10.0, 100.0, 10.0).type(torch.float16)
tensor_16.dtype

torch.float16

**Reshaping without changing values**
- torch.reshape(input, shape) or torch.Tensor.reshape reshapes input to a compatible shape
- Tensor.view(shape) returns a view of the original tensor in a different shape; the view shares the same underlying data
- torch.stack(tensors, dim) stacks tensors along a new dimension inserted at dim
- torch.squeeze(tensor) removes all dimensions of size 1
- torch.unsqueeze(tensor, dim) adds a dimension of size 1 at dim
- torch.permute(tensor, dims) returns a view of the original input with its dimensions reordered according to dims
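
The shape changes listed above can be traced on a tiny tensor. This is a sketch with made-up names, tracking only the shapes.

```python
import torch

t = torch.arange(6)                              # shape [6]
reshaped = t.reshape(2, 3)                       # shape [2, 3]
unsqueezed = torch.unsqueeze(reshaped, dim=0)    # shape [1, 2, 3]
squeezed = torch.squeeze(unsqueezed)             # size-1 dim removed: [2, 3]
permuted = torch.permute(unsqueezed, (2, 0, 1))  # dims reordered: [3, 1, 2]

for name, item in [("reshaped", reshaped), ("unsqueezed", unsqueezed),
                   ("squeezed", squeezed), ("permuted", permuted)]:
    print(name, tuple(item.shape))
```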

In [3]:
x = torch.arange(1, 8)
x.shape

torch.Size([7])

In [5]:
# add a dimension using reshape
x_reshaped = x.reshape(1, 7, 1)
x_reshaped, x_reshaped.shape

(tensor([[[1],
          [2],
          [3],
          [4],
          [5],
          [6],
          [7]]]),
 torch.Size([1, 7, 1]))

In [6]:
# let's look at it in a new view (a view shares memory with the original, so modifying the view also modifies x_reshaped)
x_reshaped.view(1, 7)

tensor([[1, 2, 3, 4, 5, 6, 7]])
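
A small sketch of what "same data" means for a view: writing through the view is visible in the original. The names `base` and `viewed` are invented for this example.

```python
import torch

base = torch.arange(1, 8)   # tensor([1, 2, 3, 4, 5, 6, 7])
viewed = base.view(1, 7)    # different shape, same underlying memory

viewed[0, 0] = 100          # write through the view

print(base)    # tensor([100,   2,   3,   4,   5,   6,   7])
print(viewed.data_ptr() == base.data_ptr())  # True: both point at the same storage
```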

Let's perform some **operations on a 1D tensor**.
- Here, there is only one dimension, so we can only perform operations on dim = 0 (the horizontal dimension)
- Summing over a 1D tensor gives the sum of all values in the tensor
- Stacking **adds a dimension**

In [20]:
# we can start with the tensor we generated earlier
x, x.ndim, x.shape

(tensor([1, 2, 3, 4, 5, 6, 7]), 1, torch.Size([7]))

In [22]:
x.sum(dim = 0)

tensor(28)

In [33]:
stacked_1d = torch.stack([x, x, x])

stacked_1d, stacked_1d.shape, stacked_1d.ndim

(tensor([[1, 2, 3, 4, 5, 6, 7],
         [1, 2, 3, 4, 5, 6, 7],
         [1, 2, 3, 4, 5, 6, 7]]),
 torch.Size([3, 7]),
 2)

Now, let's try on a **2D tensor**
- Now, we have dimensions 0 and 1, so we can perform operations on these dimensions
- We can consider dim = 0 as the "outermost dimension". This means "the largest chunk" in the tensor
- We consider dim = 1 as the "innermost dimension", aka the smallest building block in the tensor

For example, .sum(dim = 0) sums over the "largest chunks", which here are the individual rows of the matrix (i.e. [1, 2, 3] and [3, 5, 7]). We can express this as element-wise addition of the rows:

```
  [1, 2, 3]
+ [3, 5, 7]
= [4, 7, 10]
```
The resulting sum is a 1D tensor with shape [3]

In contrast, .sum(dim = 1) sums over the "smallest chunks". In this case, that would be the numbers within each row. We can think of this as "within-group" addition:


```
[1 + 2 + 3, 3 + 5 + 7] = [6, 15]
```
The resulting sum is a 1D tensor with shape [2]


- We can also access a tensor's innermost dimension with dim = -1 (similar to list indexing)
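
As a quick check of the negative index, this sketch compares dim = -1 with dim = 1 on the same 2 x 3 values used in this section. The variable name `m` is illustrative.

```python
import torch

m = torch.tensor([[1, 2, 3], [3, 5, 7]])

last = m.sum(dim=-1)   # innermost dimension
inner = m.sum(dim=1)   # the same dimension, named explicitly

print(last)                      # tensor([ 6, 15])
print(torch.equal(last, inner))  # True
```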



In [27]:
tens_2d = torch.tensor([[1, 2, 3], [3, 5, 7]])
tens_2d, tens_2d.ndim, tens_2d.shape

(tensor([[1, 2, 3],
         [3, 5, 7]]),
 2,
 torch.Size([2, 3]))

In [29]:
# this will sum over the "largest chunks"
dim0_sum = tens_2d.sum(dim = 0)

print("Dimension 0")
print(dim0_sum, dim0_sum.shape, dim0_sum.ndim)

Dimension 0
tensor([ 4,  7, 10]) torch.Size([3]) 1


In [30]:
# this sums over the "smallest chunks" aka the individual numbers
dim1_sum = tens_2d.sum(dim = 1)

print("Dimension 1")
print(dim1_sum, dim1_sum.shape, dim1_sum.ndim)

Dimension 1
tensor([ 6, 15]) torch.Size([2]) 1


Let's try stacking for a 2D tensor

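- Here, stacking along dim = 0 adds a new dimension at the front. We stack along the axis of the "largest chunks", i.e. whole copies of the matrix
- Stacking along dim = 1 pairs up the rows ("vectors") of the two copies: each row is stacked with its counterpart
- Stacking along dim = 0 and dim = 1 gives the same shape and number of dimensions here, but the values are arranged differently

- Stacking along dim = 2 pairs up the individual numbers, so the new dimension is the innermost one. The result is the dim = 1 result with its last two dimensions swapped (transposed)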
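
The relationships between the three stacks can be verified directly. This is a sketch using the same 2 x 3 values; the names `t`, `s0`, `s1` and `s2` are made up here.

```python
import torch

t = torch.tensor([[1, 2, 3], [3, 5, 7]])

s0 = torch.stack([t, t], dim=0)  # shape [2, 2, 3]
s1 = torch.stack([t, t], dim=1)  # shape [2, 2, 3]
s2 = torch.stack([t, t], dim=2)  # shape [2, 3, 2]

print(torch.equal(s0, s1))                  # False: same shape, different layout
print(torch.equal(s2, s1.transpose(1, 2)))  # True: last two dims swapped
```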

In [35]:
dim0_stack = torch.stack([tens_2d, tens_2d], dim = 0)
dim0_stack, dim0_stack.shape, dim0_stack.ndim

## shape interpretation: 2 matrices of size [2, 3] (2 vectors, 3 elements per vector)

(tensor([[[1, 2, 3],
          [3, 5, 7]],
 
         [[1, 2, 3],
          [3, 5, 7]]]),
 torch.Size([2, 2, 3]),
 3)

In [36]:
dim1_stack = torch.stack([tens_2d, tens_2d], dim = 1)
dim1_stack, dim1_stack.shape, dim1_stack.ndim



(tensor([[[1, 2, 3],
          [1, 2, 3]],
 
         [[3, 5, 7],
          [3, 5, 7]]]),
 torch.Size([2, 2, 3]),
 3)

In [37]:
dim2_stack = torch.stack([tens_2d, tens_2d], dim = 2)
dim2_stack, dim2_stack.shape, dim2_stack.ndim

(tensor([[[1, 1],
          [2, 2],
          [3, 3]],
 
         [[3, 3],
          [5, 5],
          [7, 7]]]),
 torch.Size([2, 3, 2]),
 3)

Finally, let's work with 3D tensors !! We can use the same logic as our 2D tensors. To generate a 3D tensor, we should have three "outermost" brackets before reaching the first number.

Analyzing the sums:
- For summing along dim = 0, this is element-wise addition. We add the entire first matrix to the entire second matrix.

```
  [1, 2, 3] # first row of mat1
+ [4, 5, 1] # first row of mat2
= [5, 7, 4]
and
  [4, 2, 6] # second row of mat1
+ [2, 2, 3] # second row of mat2
= [6, 4, 9]
```
- For summing along dim = 1, we can think of this as "within-matrix addition". We sum the second-largest components (i.e. the vectors within each matrix) with each other

```
  [1, 2, 3] # first row of mat1
+ [4, 2, 6] # second row of mat1
= [5, 4, 9]
and
  [4, 5, 1] # first row of mat2
+ [2, 2, 3] # second row of mat2
= [6, 7, 4]
```
- For summing along dim = 2, we see that this is an "individual sum": we sum up all the elements within each vector.

```
[[1 + 2 + 3, 4 + 2 + 6],
 [4 + 5 + 1, 2 + 2 + 3]]
= [[ 6, 12],
   [10,  7]]
```
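
One way to keep the three cases straight: summing along a dimension removes that dimension from the shape. The sketch below uses a tensor with three different sizes (an assumption made purely so the removed dimension is easy to spot).

```python
import torch

t = torch.ones(size=(2, 3, 4))

shapes = {d: tuple(t.sum(dim=d).shape) for d in range(t.ndim)}
print(shapes)  # {0: (3, 4), 1: (2, 4), 2: (2, 3)}

# every entry of the dim=0 sum adds up 2 ones, dim=1 adds 3, dim=2 adds 4
print(t.sum(dim=0)[0, 0], t.sum(dim=1)[0, 0], t.sum(dim=2)[0, 0])
```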

In [3]:
tens_3d = torch.tensor([
    [[1, 2, 3], [4, 2, 6]],
    [[4, 5, 1], [2, 2, 3]]
    ])

tens_3d, tens_3d.shape, tens_3d.ndim

## shape interpretation: 2 matrices of size [2, 3], aka two vectors with three entries each

(tensor([[[1, 2, 3],
          [4, 2, 6]],
 
         [[4, 5, 1],
          [2, 2, 3]]]),
 torch.Size([2, 2, 3]),
 3)

In [4]:
# this will sum over the "largest chunks"
dim0_sum = tens_3d.sum(dim = 0)

print("Dimension 0")
print(dim0_sum, dim0_sum.shape, dim0_sum.ndim)

Dimension 0
tensor([[5, 7, 4],
        [6, 4, 9]]) torch.Size([2, 3]) 2


In [5]:
dim1_sum = tens_3d.sum(dim = 1)

print("Dimension 1")
print(dim1_sum, dim1_sum.shape, dim1_sum.ndim)

Dimension 1
tensor([[5, 4, 9],
        [6, 7, 4]]) torch.Size([2, 3]) 2


In [6]:
dim2_sum = tens_3d.sum(dim = 2)

print("Dimension 2")
print(dim2_sum, dim2_sum.shape, dim2_sum.ndim)

Dimension 2
tensor([[ 6, 12],
        [10,  7]]) torch.Size([2, 2]) 2


Let's examine stacking:
- As expected, stacking along dim = 0 simply stacks the two copies of tens_3d together along a new front dimension. Each copy has shape [2, 2, 3], i.e. 2 x [2, 3]:

```
[1, 2, 3],
[4, 2, 6]

[4, 5, 1],
[2, 2, 3]
```
- Dim = 1 is the next-outermost dimension, which corresponds to the smaller [2, 3] matrices. Each [2, 3] matrix is stacked with the matching matrix from the second copy.
- Dim = 2 refers to the 1x3 vectors within each [2, 3] matrix. Each individual vector from one copy of tens_3d is stacked with the matching vector from the second copy. There are four such vectors in tens_3d:
```
[1, 2, 3],
[4, 2, 6],
[4, 5, 1],
[2, 2, 3]
```
- Finally, dim = 3 refers to the individual numbers within each of the vectors (1, 2, 3, 4, 2, 6 .... 2, 3). These are "stacked", or combined with the corresponding number from the second copy of tens_3d.
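
Put differently, torch.stack inserts a new dimension (of size 2 here, since two tensors are stacked) at position dim. The sketch below uses a tensor with three distinct sizes, chosen only so the inserted 2 is easy to see in each shape.

```python
import torch

t = torch.zeros(size=(3, 4, 5))

shapes = {d: tuple(torch.stack([t, t], dim=d).shape) for d in range(t.ndim + 1)}
print(shapes)
# {0: (2, 3, 4, 5), 1: (3, 2, 4, 5), 2: (3, 4, 2, 5), 3: (3, 4, 5, 2)}
```

A 3D tensor therefore accepts dim = 0 through dim = 3 for stacking, one more position than it has dimensions.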

In [7]:
dim0_stack = torch.stack([tens_3d, tens_3d], dim = 0)
dim0_stack, dim0_stack.shape, dim0_stack.ndim

## shape interpretation: now 4-dims. 2 large matrices, with 2 smaller matrices inside. Each smaller matrix has shape [2, 3]


(tensor([[[[1, 2, 3],
           [4, 2, 6]],
 
          [[4, 5, 1],
           [2, 2, 3]]],
 
 
         [[[1, 2, 3],
           [4, 2, 6]],
 
          [[4, 5, 1],
           [2, 2, 3]]]]),
 torch.Size([2, 2, 2, 3]),
 4)

In [8]:
dim1_stack = torch.stack([tens_3d, tens_3d], dim = 1)
dim1_stack, dim1_stack.shape, dim1_stack.ndim

## shape interpretation: now 4-dims. 2 large matrices, with 2 smaller matrices inside. Each smaller matrix has shape [2, 3]

(tensor([[[[1, 2, 3],
           [4, 2, 6]],
 
          [[1, 2, 3],
           [4, 2, 6]]],
 
 
         [[[4, 5, 1],
           [2, 2, 3]],
 
          [[4, 5, 1],
           [2, 2, 3]]]]),
 torch.Size([2, 2, 2, 3]),
 4)

In [10]:
dim2_stack = torch.stack([tens_3d, tens_3d], dim = 2)
dim2_stack, dim2_stack.shape, dim2_stack.ndim

(tensor([[[[1, 2, 3],
           [1, 2, 3]],
 
          [[4, 2, 6],
           [4, 2, 6]]],
 
 
         [[[4, 5, 1],
           [4, 5, 1]],
 
          [[2, 2, 3],
           [2, 2, 3]]]]),
 torch.Size([2, 2, 2, 3]),
 4)

In [11]:
dim3_stack = torch.stack([tens_3d, tens_3d], dim = 3)
dim3_stack, dim3_stack.shape, dim3_stack.ndim

(tensor([[[[1, 1],
           [2, 2],
           [3, 3]],
 
          [[4, 4],
           [2, 2],
           [6, 6]]],
 
 
         [[[4, 4],
           [5, 5],
           [1, 1]],
 
          [[2, 2],
           [2, 2],
           [3, 3]]]]),
 torch.Size([2, 2, 3, 2]),
 4)

**Indexing**
- Indexing is similar to Python lists, going from outermost to innermost dimension.
- Ex) tensor[0] accesses the 0th element of the outermost dimension. tensor[0][0] accesses the 0th element of the second-outermost dimension, and so forth
- A colon selects all the values in a dimension, and commas separate the dimensions (unlike Python lists, which don't accept comma-separated indices)
- Ex) tensor[:, :, 1] takes all of dim0, all of dim1, and index 1 (the second element) of dim2
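
A brief sketch comparing chained and comma-separated indexing on a [1, 3, 3] tensor. The name `grid` is made up for this example.

```python
import torch

grid = torch.arange(1, 10).reshape(1, 3, 3)

chained = grid[0][1][2]    # one dimension at a time
combined = grid[0, 1, 2]   # same element, single indexing call
column = grid[:, :, 1]     # index 1 of the innermost dim, for every row

print(chained, combined)   # tensor(6) tensor(6)
print(column)              # tensor([[2, 5, 8]])
print(column.shape)        # torch.Size([1, 3]): colon dims are kept
```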

In [3]:
x = torch.arange(1, 10).reshape(1, 3, 3) # creating a 3D tensor

x, x.shape, x.ndim

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]),
 3)

In [4]:
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [5]:
x[0][1]

tensor([4, 5, 6])

In [6]:
x[0][1][2]

tensor(6)

In [7]:
x[:, 1, 1] # this should return [5]

tensor([5])

In [9]:
x[:, :, 2] # this returns [[3, 6, 9]], aka the third element (index 2) of each vector

tensor([[3, 6, 9]])

Running on GPU for large scale calculations:
1. Check if GPU is available
2. Define a device variable that is set to "cuda" if a GPU is available, and "cpu" otherwise
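
The two steps can be sketched as device-agnostic code that runs whether or not a GPU is present. The names `on_device` and `back_on_cpu` are illustrative.

```python
import torch

# fall back to the CPU when no CUDA device is available
device = "cuda" if torch.cuda.is_available() else "cpu"

on_device = torch.rand(size=(2, 3)).to(device)  # move the tensor to the chosen device
back_on_cpu = on_device.cpu()                   # copy back to the CPU if needed

print(on_device.device, back_on_cpu.device)
```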

In [3]:
torch.cuda.is_available()

True

In [4]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

Exercises:

In [5]:
# 1 random tensor

rando = torch.rand(size = (7, 7))
rando

tensor([[0.0457, 0.5235, 0.2222, 0.2548, 0.3591, 0.9179, 0.0644],
        [0.0491, 0.4401, 0.8279, 0.2169, 0.1140, 0.7570, 0.6340],
        [0.3988, 0.1262, 0.6524, 0.7222, 0.6510, 0.6468, 0.9578],
        [0.0657, 0.7745, 0.9396, 0.1807, 0.3543, 0.3372, 0.4056],
        [0.7353, 0.2378, 0.7191, 0.6935, 0.6947, 0.4327, 0.2974],
        [0.3118, 0.3243, 0.1541, 0.3925, 0.3662, 0.7409, 0.5370],
        [0.2067, 0.8930, 0.4838, 0.5705, 0.5794, 0.5639, 0.9413]])

In [6]:
rando_2 = torch.rand(size = (1, 7))
rando_2

tensor([[0.9083, 0.9561, 0.8080, 0.8432, 0.5581, 0.6355, 0.6683]])

In [7]:
# 2 multiplication

mult = torch.matmul(rando, rando_2.T)
mult

tensor([[1.7633],
        [2.2856],
        [3.0334],
        [2.3949],
        [2.9225],
        [2.0828],
        [3.2244]])

In [9]:
# 3 seed

random_seed = 0
torch.manual_seed(seed = random_seed)

rando = torch.rand(size = (7, 7))
rando

tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074, 0.6341, 0.4901],
        [0.8964, 0.4556, 0.6323, 0.3489, 0.4017, 0.0223, 0.1689],
        [0.2939, 0.5185, 0.6977, 0.8000, 0.1610, 0.2823, 0.6816],
        [0.9152, 0.3971, 0.8742, 0.4194, 0.5529, 0.9527, 0.0362],
        [0.1852, 0.3734, 0.3051, 0.9320, 0.1759, 0.2698, 0.1507],
        [0.0317, 0.2081, 0.9298, 0.7231, 0.7423, 0.5263, 0.2437],
        [0.5846, 0.0332, 0.1387, 0.2422, 0.8155, 0.7932, 0.2783]])

In [10]:
# re-seed before each rand call to reproduce the same values (rando_2 then matches the first row of rando)

torch.random.manual_seed(seed = random_seed)

rando_2 = torch.rand(size = (1, 7))
rando_2

tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074, 0.6341, 0.4901]])
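
The effect of re-seeding can be isolated in a few lines. This is a sketch with made-up variable names: without a reset the generator continues its sequence, and with a reset it starts over.

```python
import torch

torch.manual_seed(0)
first = torch.rand(3)
second = torch.rand(3)   # no reset: continues the random sequence

torch.manual_seed(0)
repeat = torch.rand(3)   # reset: starts the sequence again

print(torch.equal(first, second))  # False
print(torch.equal(first, repeat))  # True
```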

In [11]:
mult = torch.matmul(rando, rando_2.T)
mult

tensor([[1.5985],
        [1.1173],
        [1.2741],
        [1.6838],
        [0.8279],
        [1.0347],
        [1.2498]])

In [12]:
# 4 gpu seed

torch.cuda.manual_seed(seed = 1234)

In [14]:
# 5 tensors on gpu

torch.manual_seed(1234)

rando = torch.rand(size = (2, 3))
rando_gpu = rando.to(device)

rando2 = torch.rand(size = (2, 3))
rando2_gpu = rando2.to(device)

rando_gpu, rando2_gpu

(tensor([[0.0290, 0.4019, 0.2598],
         [0.3666, 0.0583, 0.7006]], device='cuda:0'),
 tensor([[0.0518, 0.4681, 0.6738],
         [0.3315, 0.7837, 0.5631]], device='cuda:0'))

In [15]:
mult = torch.matmul(rando_gpu, rando2_gpu.T)
mult

tensor([[0.3647, 0.4709],
        [0.5184, 0.5617]], device='cuda:0')

In [17]:
print(f"minimum: {torch.min(mult)}")
print(f"maximum: {torch.max(mult)}")
print(f"argmin: {torch.argmin(mult)}")
print(f"argmax: {torch.argmax(mult)}")

minimum: 0.3647301495075226
maximum: 0.5617256760597229
argmin: 0
argmax: 3


In [20]:
# 6 reshaping
torch.manual_seed(7)

rand = torch.rand(size = (1, 1, 1, 10))
new_rand = torch.squeeze(rand)

rand.shape, new_rand.shape

(torch.Size([1, 1, 1, 10]), torch.Size([10]))

In [21]:
rand, new_rand

(tensor([[[[0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297,
            0.3653, 0.8513]]]]),
 tensor([0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297, 0.3653,
         0.8513]))