In [1]:
## 00. PyTorch Fundamentals

print('well, this is a start')


well, this is a start


In [2]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


In [3]:
import torch
import pandas as pd
import numpy
import matplotlib.pyplot as plt
print(torch.__version__)

2.3.0+cu121


## Intro to Tensors

Tensors are the fundamental building block of machine learning.

Their job is to represent data in a numerical way.

For example, you could represent an image as a tensor with shape `[3, 224, 224]` which would mean `[colour_channels, height, width]`, as in the image has 3 colour channels (red, green, blue), a height of 224 pixels and a width of 224 pixels.
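As a rough illustration of that layout, the sketch below builds a stand-in "image" from random values (not a real photo) and unpacks its shape. The variable names are only for illustration.

```python
import torch

# A stand-in "image": random values with shape [colour_channels, height, width]
image = torch.rand(3, 224, 224)

# Unpack the three dimensions by name
colour_channels, height, width = image.shape
print(colour_channels, height, width)  # 3 224 224

# Indexing the first dimension selects one colour channel,
# which is a 2-D tensor of shape [height, width]
red_channel = image[0]
print(red_channel.shape)  # torch.Size([224, 224])
```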

### Creating tensors

**DOCS HERE**: [torch.Tensor](https://pytorch.org/docs/stable/tensors.html)


In [4]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [5]:
scalar.ndim

0

In [6]:
# Get tensor back as Python int
scalar.item()


7

In [7]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [8]:
# ndim is the number of dimensions. A quick way to read it off: count the opening square brackets at the start of the printed tensor
vector.ndim

1

In [9]:
vector.shape

torch.Size([2])

In [10]:
# MATRIX
MATRIX = torch.tensor([[7, 8], [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [11]:
MATRIX.ndim


2

In [12]:
MATRIX.shape


torch.Size([2, 2])

In [13]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [14]:
TENSOR[0]


tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

## Random tensors

We've established tensors represent some form of data.

And machine learning models such as neural networks manipulate and seek patterns within tensors.

But when building machine learning models with PyTorch, it's rare you'll create tensors by hand (like what we've been doing).

Instead, a machine learning model often starts out with large random tensors of numbers and adjusts these random numbers as it works through data to better represent it.

In essence:

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...`



In [15]:
# Create random tensor of size/shape (3, 4)
rand_tensor = torch.rand(3, 4)

rand_tensor, rand_tensor.dtype

(tensor([[0.5917, 0.8675, 0.5167, 0.0153],
         [0.5917, 0.7927, 0.1949, 0.6637],
         [0.2433, 0.9219, 0.6308, 0.5385]]),
 torch.float32)

In [16]:
# Create a random tensor of size (224, 224, 3)
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

## Zeros and ones
Sometimes you'll just want to fill tensors with zeros or ones.

This happens a lot with masking (like masking some of the values in one tensor with zeros to let a model know not to learn them).

Let's create a tensor full of zeros with `torch.zeros()`

Again, the `size` parameter comes into play.

In [17]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros, zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)

We can do the same to create a tensor of all ones except using `torch.ones()` instead.

In [18]:
# Create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones, ones.dtype

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.float32)
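The masking idea mentioned earlier in this section can be sketched with a tensor of ones that has a few entries set to zero. This is a minimal illustration, assuming a made-up `values` tensor and a mask that hides the middle column. Real masking schemes depend on the model.

```python
import torch

values = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

# Hypothetical mask: 1 keeps a value, 0 hides it
mask = torch.ones(size=(2, 3))
mask[:, 1] = 0  # hide the middle column

# Element-wise multiplication zeroes out the masked positions
masked = values * mask
print(masked)
# tensor([[1., 0., 3.],
#         [4., 0., 6.]])
```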

## Creating a range

Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

You can use `torch.arange(start, end, step)` to do so.

Where:

- start = start of range (e.g. 0)
- end = end of range (e.g. 10)
- step = the gap between consecutive values (e.g. 1)

Note: In Python, you can use `range()` to create a range. However in PyTorch, `torch.range()` is deprecated and may show an error in the future.

In [19]:
# Use torch.arange(), torch.range() is deprecated
# zero_to_ten_deprecated = torch.range(0, 10) # Note: this may return an error in the future

# Create a range of values from 0 up to (but not including) 10
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
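Two details of `torch.arange()` are easy to trip over: `end` is exclusive, and `step` is the gap between values rather than a count. A small sketch (the variable names are just for illustration):

```python
import torch

# end is exclusive, so stop at 11 to include 10
one_to_ten = torch.arange(start=1, end=11, step=1)
print(one_to_ten)  # tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

# step sets the gap between consecutive values
evens = torch.arange(start=0, end=10, step=2)
print(evens)  # tensor([0, 2, 4, 6, 8])
```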

## Tensor Datatypes



In [20]:
# Default datatype for floating-point tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,            # defaults to None, which infers the datatype from the data (torch.float32 for Python floats)
                               device='cpu',           # defaults to None, which uses the current default device (CPU unless changed)
                               requires_grad=False)   # if True, operations performed on the tensor are recorded

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

In [21]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16
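To see why the datatype matters, here is a small sketch of what happens when two float widths meet, and how much precision `float16` gives up. The `third_32` and `third_16` names are only for illustration.

```python
import torch

float_32_tensor = torch.tensor([3.0, 6.0, 9.0])                       # torch.float32
float_16_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)  # torch.float16

# Element-wise operations between different float widths promote to the wider dtype
product = float_16_tensor * float_32_tensor
print(product.dtype)  # torch.float32

# float16 stores fewer significant digits than float32
third_32 = torch.tensor(1 / 3, dtype=torch.float32)
third_16 = torch.tensor(1 / 3, dtype=torch.float16)
print(third_32.item(), third_16.item())
```

Lower precision uses less memory and can be faster, at the cost of accuracy.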

## Getting information from tensors

Once you've created tensors (or someone else or a PyTorch module has created them for you), you might want to get some information from them.

Three of the most common attributes you'll want to find out about tensors are:

- `shape` - what shape is the tensor? (some operations require specific shape rules)
- `dtype` - what datatype are the elements within the tensor stored in?
- `device` - what device is the tensor stored on? (usually GPU or CPU)

**Note**: When you run into issues in PyTorch, it's very often to do with one of the three attributes above. So when the error messages show up, sing yourself a little song called "what, what, where":

- *"what shape are my tensors? what datatype are they and where are they stored? what shape, what datatype, where where where"*

In [22]:
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.3933, 0.2981, 0.0744, 0.7520],
        [0.7344, 0.3234, 0.8232, 0.5034],
        [0.7833, 0.2148, 0.9900, 0.0297]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


In [23]:
another_tensor = torch.rand(size=(5, 6),
                            dtype=torch.float64,
                            device='cpu')

print(another_tensor)
print(f"Shape of tensor: {another_tensor.shape}")
print(f"Datatype of tensor: {another_tensor.dtype}")
print(f"Device tensor is stored on: {another_tensor.device}")

tensor([[0.2945, 0.0911, 0.7484, 0.3016, 0.7137, 0.2961],
        [0.3701, 0.9336, 0.2943, 0.6248, 0.5361, 0.9470],
        [0.5326, 0.1482, 0.0917, 0.1686, 0.8187, 0.9837],
        [0.1321, 0.5424, 0.9348, 0.8434, 0.3192, 0.9937],
        [0.4332, 0.2386, 0.4099, 0.0567, 0.8203, 0.5785]], dtype=torch.float64)
Shape of tensor: torch.Size([5, 6])
Datatype of tensor: torch.float64
Device tensor is stored on: cpu


## Manipulating tensors (tensor operations)

In deep learning, data (images, text, video, audio, protein structures, etc) gets represented as tensors.

A model learns by investigating those tensors and performing a series of operations (could be 1,000,000s+) on tensors to create a representation of the patterns in the input data.

These operations are often a wonderful dance between:

- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication

And that's it. Sure there are a few more here and there but these are the basic building blocks of neural networks.

Stacking these building blocks in the right way, you can create the most sophisticated of neural networks (just like lego!).


### Basic operations

In [24]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [25]:
# Multiply it by 10
tensor * 10

tensor([10, 20, 30])

Notice how the tensor values above didn't end up being `tensor([110, 120, 130])`. This is because the values inside the tensor don't change unless they're reassigned.

In [26]:
# Tensors don't change unless reassigned
tensor

tensor([1, 2, 3])

In [27]:
# Subtract and reassign
tensor = tensor - 10
tensor

tensor([-9, -8, -7])

In [28]:
# Add and reassign
tensor = tensor + 10
tensor

tensor([1, 2, 3])


PyTorch also has a bunch of built-in functions like `torch.mul()` (short for multiplication; `torch.multiply()` is an alias) and `torch.add()` to perform basic operations.

In [29]:
# Can also use torch functions
torch.multiply(tensor, 10)

tensor([10, 20, 30])

In [30]:
# Original tensor is still unchanged
tensor


tensor([1, 2, 3])
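The sketch below checks that the operators and the `torch` functions agree, and contrasts them with an in-place method. The in-place variant (`add_`, with a trailing underscore) isn't used in this notebook; it's included here only to show the one case where a tensor does change without reassignment.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Operators and torch functions give the same results
assert torch.equal(tensor * 10, torch.mul(tensor, 10))
assert torch.equal(tensor + 10, torch.add(tensor, 10))

# Out-of-place: returns a new tensor, the original is untouched
result = tensor + 10
print(tensor)  # tensor([1, 2, 3])

# In-place (trailing underscore): modifies the tensor's own values
tensor.add_(10)
print(tensor)  # tensor([11, 12, 13])
```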

## Matrix multiplication (is all you need)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication functionality in the `torch.matmul()` method.

The main two rules for matrix multiplication to remember are:

1. The inner dimensions must match:
  - `(3, 2) @ (3, 2)` won't work
  - `(2, 3) @ (3, 2)` will work
  - `(3, 2) @ (2, 3)` will work

2. The resulting matrix has the shape of the outer dimensions:
  - `(2, 3) @ (3, 2)` -> `(2, 2)`
  - `(3, 2) @ (2, 3)` -> `(3, 3)`

Note: "@" in Python is the symbol for matrix multiplication.

In [31]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])
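The two rules above can be checked directly. This is a minimal sketch using throwaway random matrices `A` and `B` (names chosen for illustration), followed by the difference between element-wise and matrix multiplication on a 1-D tensor.

```python
import torch

A = torch.rand(2, 3)
B = torch.rand(3, 2)

# Rule 1 + 2: inner dimensions match, result takes the outer dimensions
print((A @ B).shape)  # torch.Size([2, 2])
print((B @ A).shape)  # torch.Size([3, 3])

# Inner dimensions don't match: (2, 3) @ (2, 3)
try:
    A @ A
except RuntimeError as error:
    print(f"Shape error: {error}")

# Element-wise multiplication vs matrix multiplication on a 1-D tensor
tensor = torch.tensor([1, 2, 3])
print(tensor * tensor)               # tensor([1, 4, 9])
print(tensor @ tensor)               # tensor(14), i.e. 1*1 + 2*2 + 3*3
print(torch.matmul(tensor, tensor))  # tensor(14)
```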

## One of the most common errors: shape errors

Because much of deep learning is multiplying and performing operations on matrices and matrices have a strict rule about what shapes and sizes can be combined, one of the most common errors you'll run into in deep learning is shape mismatches.

In [32]:
# Shapes need to line up: the inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

# torch.matmul(tensor_A, tensor_B) # (this will error)

We can make matrix multiplication work between `tensor_A` and `tensor_B` by making their inner dimensions match.

One of the ways to do this is with a *transpose* (switch the dimensions of a given tensor).

You can perform transposes in PyTorch using either:

- `torch.transpose(input, dim0, dim1)` - where input is the desired tensor to transpose and `dim0` and `dim1` are the dimensions to be swapped.
- `tensor.T` - where tensor is the desired tensor to transpose.
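As a quick check that the two options agree for a 2-D tensor, here is a small sketch (`via_function` and `via_attribute` are illustrative names):

```python
import torch

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

# Swap dimension 0 and dimension 1 explicitly
via_function = torch.transpose(tensor_B, 0, 1)

# Shorthand for the same thing on a 2-D tensor
via_attribute = tensor_B.T

print(via_function.shape)                        # torch.Size([2, 3])
print(torch.equal(via_function, via_attribute))  # True
```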

In [33]:
# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [34]:
# View tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [35]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


Without the transpose, the rules of matrix multiplication aren't fulfilled and we get a shape error (which is why the `torch.matmul(tensor_A, tensor_B)` call above is commented out).

## Tensor aggregation

Now that we've seen a few ways to manipulate tensors, let's run through a few ways to aggregate them (go from more values to fewer values).


In [36]:
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [37]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error
print(f"Mean: {x.type(torch.float32).mean()}") # won't work without float datatype
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


**Note**: Some methods such as `torch.mean()` require tensors to be in `torch.float32` (the most common) or another specific datatype, otherwise the operation will fail.


You can also do the same as above with `torch` methods.


In [38]:
torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)


(tensor(90), tensor(0), tensor(45.), tensor(450))
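The datatype requirement for the mean can be seen in a short sketch. The exact error text may vary between PyTorch versions, so it's printed rather than assumed. `.float()` is used here as a shorthand cast to `torch.float32` and isn't part of the notebook above.

```python
import torch

x = torch.arange(0, 100, 10)  # integer input, so dtype is torch.int64
print(x.dtype)

# Taking the mean of an integer tensor raises a RuntimeError
try:
    x.mean()
except RuntimeError as error:
    print(f"mean() failed: {error}")

# Cast to a float datatype first, either way works
mean_a = x.type(torch.float32).mean()
mean_b = x.float().mean()  # .float() casts to torch.float32
print(mean_a, mean_b)  # tensor(45.) tensor(45.)
```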

### Positional min/max

You can also find the index of a tensor where the max or minimum occurs with `torch.argmax()` and `torch.argmin()` respectively.

This is helpful in case you just want the position where the highest (or lowest) value is and not the actual value itself (we'll see this in a later section when using the softmax activation function).

In [39]:
# Create a tensor
tensor = torch.rand(size=(2, 4, 5))
print(f"Tensor: {tensor}")

# Returns index of max and min values
print(f"Index where max value occurs: {tensor.argmax()}")
print(f"Index where min value occurs: {tensor.argmin()}")

Tensor: tensor([[[0.0757, 0.7427, 0.7120, 0.3682, 0.3985],
         [0.9651, 0.4214, 0.5962, 0.5198, 0.9939],
         [0.2867, 0.3916, 0.0586, 0.9227, 0.5142],
         [0.7224, 0.8538, 0.5811, 0.5291, 0.4006]],

        [[0.4675, 0.2979, 0.3091, 0.1650, 0.9569],
         [0.4543, 0.0900, 0.2677, 0.3266, 0.6938],
         [0.4010, 0.5771, 0.5161, 0.3927, 0.8014],
         [0.9426, 0.6035, 0.4158, 0.7720, 0.6492]]])
Index where max value occurs: 9
Index where min value occurs: 12
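When called without a `dim` argument, `argmax()` and `argmin()` return an index into the flattened tensor, which is why a shape `(2, 4, 5)` tensor can give an index like 9. The sketch below plants a known maximum and converts the flat index back to coordinates with NumPy's `np.unravel_index` (one option among several, chosen here as an assumption about what's convenient).

```python
import numpy as np
import torch

tensor = torch.zeros(size=(2, 4, 5))
tensor[0, 1, 4] = 1.0  # plant a known maximum

# Flat index: 0 * (4 * 5) + 1 * 5 + 4 = 9
flat_index = tensor.argmax()
print(flat_index)  # tensor(9)

# Convert the flat index back to (dim0, dim1, dim2) coordinates
coords = tuple(int(i) for i in np.unravel_index(flat_index.item(), tuple(tensor.shape)))
print(coords)  # (0, 1, 4)

# Passing dim gives one index per slice along that dimension instead
print(tensor.argmax(dim=-1).shape)  # torch.Size([2, 4])
```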


## Change tensor datatype

As mentioned, a common issue with deep learning operations is having your tensors in different datatypes.

If one tensor is in `torch.float64` and another is in `torch.float32`, you might run into some errors.

But there's a fix.

You can change the datatypes of tensors using `torch.Tensor.type(dtype=None)` where the dtype parameter is the datatype you'd like to use.



In [40]:
# Create a tensor and check its datatype
tensor = torch.arange(10., 100., 10.)
tensor.dtype

torch.float32

In [41]:
# Create a float16 tensor
tensor_float16 = tensor.type(torch.float16)
tensor_float16

tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)

In [42]:
# Create a int8 tensor
tensor_int8 = tensor.type(torch.int8)
tensor_int8


tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)
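Two related points, sketched under the assumption that you're on a recent PyTorch: `Tensor.to()` also accepts a datatype, and casting floats to an integer type drops the fractional part rather than rounding. The `fractional` tensor is made up for illustration.

```python
import torch

tensor = torch.arange(10., 100., 10.)

# .to() accepts a dtype as well as a device
tensor_float16 = tensor.to(torch.float16)
print(tensor_float16.dtype)  # torch.float16

# Casting float -> int truncates toward zero (it does not round)
fractional = torch.tensor([1.9, -1.9])
as_int8 = fractional.type(torch.int8)
print(as_int8)  # tensor([ 1, -1], dtype=torch.int8)
```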

## Reshaping, stacking, squeezing and unsqueezing

- Reshaping - reshapes an input tensor to a defined shape
- View - returns a view of an input tensor of a certain shape but keeps the same memory as the original tensor
- Stacking - combines multiple tensors on top of each other (vstack) or side-by-side (hstack)
- Squeeze - removes all `1` dimensions from a tensor
- Unsqueeze - adds a `1` dimension to a target tensor
- Permute - returns a view of the input with dimensions permuted (swapped) in a certain way

In [43]:
import torch
x = torch.arange(1, 10)
x, x.shape

(tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]), torch.Size([9]))

In [44]:
# Add an extra dimension
x_reshaped = x.reshape(1, 9, 1)
x_reshaped, x_reshaped.shape

(tensor([[[1],
          [2],
          [3],
          [4],
          [5],
          [6],
          [7],
          [8],
          [9]]]),
 torch.Size([1, 9, 1]))

In [45]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]]), torch.Size([1, 9]))

In [46]:
# Changing z changes x, because a view of a tensor shares the same memory as the original
z[:, 0] = 5
z, x


(tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9]]), tensor([5, 2, 3, 4, 5, 6, 7, 8, 9]))

In [47]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim=0)
x, x_stacked

(tensor([5, 2, 3, 4, 5, 6, 7, 8, 9]),
 tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9],
         [5, 2, 3, 4, 5, 6, 7, 8, 9],
         [5, 2, 3, 4, 5, 6, 7, 8, 9],
         [5, 2, 3, 4, 5, 6, 7, 8, 9]]))

In [48]:
x_stacked = torch.stack([x, x, x, x], dim=1)
x, x_stacked

(tensor([5, 2, 3, 4, 5, 6, 7, 8, 9]),
 tensor([[5, 5, 5, 5],
         [2, 2, 2, 2],
         [3, 3, 3, 3],
         [4, 4, 4, 4],
         [5, 5, 5, 5],
         [6, 6, 6, 6],
         [7, 7, 7, 7],
         [8, 8, 8, 8],
         [9, 9, 9, 9]]))

In [49]:
# torch.squeeze() - removes all single dimensions from a target tensor
x_stacked

tensor([[5, 5, 5, 5],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4],
        [5, 5, 5, 5],
        [6, 6, 6, 6],
        [7, 7, 7, 7],
        [8, 8, 8, 8],
        [9, 9, 9, 9]])

In [50]:
x_stacked.squeeze()

tensor([[5, 5, 5, 5],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4],
        [5, 5, 5, 5],
        [6, 6, 6, 6],
        [7, 7, 7, 7],
        [8, 8, 8, 8],
        [9, 9, 9, 9]])

In [51]:
x_test = torch.zeros([1, 3, 4, 1, 2])
x_test, x_test.shape

(tensor([[[[[0., 0.]],
 
           [[0., 0.]],
 
           [[0., 0.]],
 
           [[0., 0.]]],
 
 
          [[[0., 0.]],
 
           [[0., 0.]],
 
           [[0., 0.]],
 
           [[0., 0.]]],
 
 
          [[[0., 0.]],
 
           [[0., 0.]],
 
           [[0., 0.]],
 
           [[0., 0.]]]]]),
 torch.Size([1, 3, 4, 1, 2]))

In [52]:
x_test.squeeze()


tensor([[[0., 0.],
         [0., 0.],
         [0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.],
         [0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.],
         [0., 0.],
         [0., 0.]]])

In [53]:
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimension from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[[5],
         [2],
         [3],
         [4],
         [5],
         [6],
         [7],
         [8],
         [9]]])
Previous shape: torch.Size([1, 9, 1])

New tensor: tensor([5, 2, 3, 4, 5, 6, 7, 8, 9])
New shape: torch.Size([9])
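Unsqueeze is the reverse of squeeze: it adds a dimension of size 1 at the position you choose. A minimal sketch, which also shows `torch.vstack()` and `torch.hstack()` on 1-D inputs (variable names are illustrative):

```python
import torch

x = torch.arange(1, 10)  # shape [9]

# Add a size-1 dimension at the front or at the end
x_row = x.unsqueeze(dim=0)
x_col = x.unsqueeze(dim=1)
print(x_row.shape)  # torch.Size([1, 9])
print(x_col.shape)  # torch.Size([9, 1])

# squeeze() removes the size-1 dimension again
print(x_row.squeeze().shape)  # torch.Size([9])

# vstack stacks rows on top of each other, hstack joins them end to end
stacked_rows = torch.vstack([x, x])
joined = torch.hstack([x, x])
print(stacked_rows.shape)  # torch.Size([2, 9])
print(joined.shape)        # torch.Size([18])
```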


In [54]:
x_og = torch.rand(size=(224, 224, 3)) # [height, width, color_channels]

# Permute the og tensor to rearrange the axis (or dim) order
x_permuted = x_og.permute(2, 0, 1)
print(f"Previous shape: {x_og.shape}")
x_og[0,0,0] = 123
print(f"New shape: {x_permuted.shape}")
x_permuted[0,0,0]

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


tensor(123.)

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.



In [55]:
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [56]:
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [57]:
x[0, 0]

tensor([1, 2, 3])

In [58]:
x[0, 0, 2]

tensor(3)

In [59]:
x[0, 0, -1]

tensor(3)

In [60]:
x[0, 0, :]

tensor([1, 2, 3])

In [61]:
x[0, 0, :1]

tensor([1])
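You can also use `:` to select everything along a dimension and combine it with specific indices on the others. A short sketch on the same shape `(1, 3, 3)` tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# All of dim 0, index 0 of dim 1 (the first row)
print(x[:, 0])     # tensor([[1, 2, 3]])

# All of dims 0 and 1, index 1 of dim 2 (the middle column)
print(x[:, :, 1])  # tensor([[2, 5, 8]])

# All of dim 0, index 1 of dims 1 and 2 (the middle value)
print(x[:, 1, 1])  # tensor([5])

# Index 0 of dim 0, all of dim 1, index 2 of dim 2 (the last column)
print(x[0, :, 2])  # tensor([3, 6, 9])
```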

## PyTorch tensors & NumPy

Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.

The two main methods we'll want to use for NumPy to PyTorch (and back again) are:

- `torch.from_numpy(ndarray)` - NumPy array -> PyTorch tensor.
- `torch.Tensor.numpy()` - PyTorch tensor -> NumPy array.

In [62]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # when converting, PyTorch reflects NumPy's default datatype of float64 unless specified otherwise
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

**Note:** By default, NumPy creates floating-point arrays with the datatype `float64`, and if you convert one to a PyTorch tensor, it'll keep the same datatype (as above).

However, many PyTorch calculations default to using `float32`.

So if you want to convert your NumPy array (float64) -> PyTorch tensor (float64) -> PyTorch tensor (float32), you can use `tensor = torch.from_numpy(array).type(torch.float32)`.

In [63]:
# array + 1 creates a new array and rebinds the name, so the tensor (tied to the original array) keeps its values
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))


And if you want to go from PyTorch tensor to NumPy array, you can call `tensor.numpy()`.

In [70]:
# Tensor to NumPy array
tensor = torch.ones(7) # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed
# numpy_tensor = tensor.type(torch.float64).numpy()
tensor, numpy_tensor.dtype

(tensor([1., 1., 1., 1., 1., 1., 1.]), dtype('float32'))


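As above, `tensor = tensor + 1` creates a new tensor instead of modifying the original in place, so `numpy_tensor` stays the same. **This doesn't mean they don't share memory**: for CPU tensors, `tensor.numpy()` and `torch.from_numpy()` both return objects that share memory with the original, so an in-place change (such as `tensor.add_(1)`) would show up in both.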

In [71]:
# Change the tensor, keep the array the same
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
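The difference between rebinding a name and changing values in place is easy to miss, so here is a small sketch of both. `torch.from_numpy()` shares memory with the source array. The `.clone()` call at the end is one way (an addition to this notebook) to get an independent copy.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)  # shares memory with array

# In-place change to the array is visible through the tensor
array += 1
print(tensor)  # tensor([2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64)

# Rebinding the name creates a brand new array, so the tensor is unaffected
array = array + 1
print(array)   # [3. 4. 5. 6. 7. 8. 9.]
print(tensor)  # tensor([2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64)

# .clone() copies the data, so later in-place changes don't leak through
independent = torch.from_numpy(array).clone()
array += 1
print(independent)  # tensor([3., 4., 5., 6., 7., 8., 9.], dtype=torch.float64)
```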

## Reproducibility (trying to take the random out of random)

In short, how neural networks learn is:

`start with random numbers -> tensor operations -> try to make better (again and again and again)`

To reduce the randomness in neural networks, we have **random seeds**.

Essentially what the random seed does is "flavour" the randomness.

In [72]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B

Tensor A:
tensor([[0.6736, 0.8918, 0.2056, 0.5957],
        [0.9155, 0.0102, 0.4885, 0.5538],
        [0.7301, 0.3855, 0.9041, 0.1328]])

Tensor B:
tensor([[0.6488, 0.6065, 0.7416, 0.4879],
        [0.6822, 0.9986, 0.2792, 0.9434],
        [0.9476, 0.0818, 0.1389, 0.5930]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

But what if you wanted to create two random tensors with the same values?

As in, the tensors would still contain random values but they would be of the same flavour.

That's where `torch.manual_seed(seed)` comes in, where seed is an integer (like 42 but it could be anything) that flavours the randomness.

In [73]:
import torch
import random

# Set the random seed
RANDOM_SEED=42
torch.manual_seed(seed=RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

# Have to reset the seed every time a new rand() is called
# Without this, tensor_D would be different to tensor_C
torch.random.manual_seed(seed=RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
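An alternative to resetting the global seed is to pass an explicit `torch.Generator` to `torch.rand()`. This isn't used in the notebook above; it's sketched here as one possible way to keep seeded randomness local to a single call. The tensor and generator names are illustrative.

```python
import torch

# Two separate generators seeded with the same value
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

random_tensor_E = torch.rand(3, 4, generator=gen_a)
random_tensor_F = torch.rand(3, 4, generator=gen_b)
print(torch.equal(random_tensor_E, random_tensor_F))  # True

# A second draw from the same generator continues the sequence, so it differs
random_tensor_G = torch.rand(3, 4, generator=gen_a)
print(torch.equal(random_tensor_E, random_tensor_G))  # False
```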

## Running tensors on GPUs

In [1]:
!nvidia-smi


Sun Jun 30 08:59:50 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

You can test if PyTorch has access to a GPU using `torch.cuda.is_available()`.

In [2]:
# Check for GPU
import torch
torch.cuda.is_available()

True

In [3]:
# Set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

If you want to do faster computing you can use a GPU but if you want to do much faster computing, you can use multiple GPUs.

You can count the number of GPUs PyTorch has access to using `torch.cuda.device_count()`.



In [4]:
# Count number of devices
torch.cuda.device_count()

1

Knowing the number of GPUs PyTorch has access to is helpful in case you want to run a specific process on one GPU and another process on another (PyTorch also has features to let you run a process across all GPUs).



### Putting tensors (and models) on the GPU

You can put tensors (and models, we'll see this later) on a specific device by calling `to(device)` on them, where `device` is the target device you'd like the tensor (or model) to go to.

Why do this?

GPUs offer far faster numerical computing than CPUs do, and if a GPU isn't available, our device-agnostic code (see above) means it'll run on the CPU instead.


**Note**: Putting a tensor on GPU using `to(device)` (e.g. `some_tensor.to(device)`) returns a copy of that tensor, i.e. the same tensor will be on CPU and GPU. To overwrite tensors, reassign them:

`some_tensor = some_tensor.to(device)`

In [5]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')

Notice the second tensor has `device='cuda:0'`, this means it's stored on the 0th GPU available (GPUs are 0 indexed, if two GPUs were available, they'd be `'cuda:0'` and `'cuda:1'` respectively, up to `'cuda:n'`).
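A minimal device-agnostic sketch of the pattern described in this section. It runs on either CPU or GPU, and also shows creating a tensor directly on the target device with the `device=` argument (the `direct` name is illustrative).

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3])

# Calling .to(device) without reassigning leaves the original name pointing at the CPU tensor
tensor.to(device)
print(tensor.device)  # cpu

# Reassign to actually work with the moved tensor
tensor = tensor.to(device)
print(tensor.device)  # cuda:0 if a GPU is available, otherwise cpu

# Tensors can also be created directly on the target device
direct = torch.rand(3, 4, device=device)
print(direct.device)
```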

###  Moving tensors back to the CPU


In [7]:
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()


TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Instead, to get a tensor back to CPU and usable with NumPy we can use `Tensor.cpu()`.

This copies the tensor to CPU memory so it's usable with CPUs.

In [None]:
# Instead, copy the tensor back to cpu
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

The above returns a copy of the GPU tensor in CPU memory so the original tensor is still on GPU.

In [8]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
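If you do this conversion often, it can be wrapped in a small helper. `to_numpy` below is a hypothetical function (not part of PyTorch) and the sketch runs whether or not a GPU is present. The `.detach()` call is an extra precaution for tensors that track gradients, which this notebook hasn't covered yet.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_device = torch.tensor([1, 2, 3], device=device)

def to_numpy(tensor):
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    # detach() drops gradient tracking, cpu() copies to host memory if needed
    return tensor.detach().cpu().numpy()

array = to_numpy(tensor_on_device)
print(array)                    # [1 2 3]
print(tensor_on_device.device)  # the original tensor stays on its device
```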