### 00. First Tutorial - PyTorch, Using a T4 GPU

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

Question: https://github.com/mrdbourke/pytorch-deep-learning/discussions

In [None]:
!nvidia-smi # if using GPU

/bin/bash: line 1: nvidia-smi: command not found


#### Importing Libraries

In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

#### Checking torch version

In [None]:
print(torch.__version__)

2.6.0+cu124


### Introduction to Tensors

#### Creating Tensors

In [None]:
# scalar
# a torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

scalar = torch.tensor(7) # to create PyTorch tensors
scalar

tensor(7)

In [None]:
# attributes of a scalar

scalar.ndim

0

In [None]:
# Get number out of tensor type - Get it back as a python int

scalar.item()

7

In [None]:
# Vector

vector = torch.tensor([7,7]) # number of dimensions = number of square brackets
vector

tensor([7, 7])

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX

MATRIX = torch.tensor([[5,6],[7,8]])
MATRIX

tensor([[5, 6],
        [7, 8]])

In [None]:
MATRIX.ndim # ndim gives the number of dimensions (the rank) of the tensor

2

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# TENSOR - by convention, matrices and tensors are named in uppercase

TENSOR = torch.tensor([[[1,2],[4,5],[7,8]]])

TENSOR

tensor([[[1, 2],
         [4, 5],
         [7, 8]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 2])

In [None]:
TENSOR[0]

tensor([[1, 2],
        [4, 5],
        [7, 8]])

In [None]:
TENSOR[0][0][0]

tensor(1)

### Random Tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> Look at data -> Update random numbers -> Look at data -> Update the numbers`

Documentat

In [None]:
# Creating a random tensor of size/shape  (3,4)

random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.7158, 0.5864, 0.6644, 0.0232],
        [0.1013, 0.0642, 0.3200, 0.3416],
        [0.7802, 0.1671, 0.7572, 0.9625]])

In [None]:
random_tensor.ndim

2

In [None]:
# Create a random tensor with a similar shape to an image tensor

random_image_size_tensor = torch.rand(size=(224,244,3)) # size = (height, width, colour channels)
# The colour channels can also come first:
# random_image_size_tensor = torch.rand(size=(3,224,244))

random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 244, 3]), 3)

#### Tensors full of Zeros and Ones

In [None]:
zeroes = torch.zeros(size=(3,4))
zeroes

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
zeroes*random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

#### DataType

In [None]:
random_tensor.dtype

torch.float32

### Range of tensors and tensors-like

In [None]:
# torch.range() is deprecated; use torch.arange() instead

one_to_ten = torch.arange(start=1,end = 11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
one_to_thousand = torch.arange(start=0,end = 1000, step=77)
one_to_thousand

tensor([  0,  77, 154, 231, 308, 385, 462, 539, 616, 693, 770, 847, 924])

In [None]:
# Create tensors like

ten_zeros = torch.zeros_like(input = one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor datatypes

**Note**: Datatype mismatches are one of the 3 big errors you will run into with PyTorch & deep learning:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [None]:
# Float 32 tensor - float32 is the default datatype for floating-point values, even when dtype=None

float_32_tensor = torch.tensor([3.0,6.0,9.0], dtype=None, device="cpu", requires_grad=False)

# dtype - what datatype is the tensor? e.g. float32, float16 or float64
# device - what device is the tensor on? "cpu" here (also the default when device=None); use "cuda" for a GPU
# requires_grad - whether PyTorch should track gradients for this tensor


In [None]:
float_32_tensor.dtype

torch.float32

In [None]:
float_16_tensor = float_32_tensor.type(torch.half)
#float_16_tensor = float_32_tensor.type(torch.float16)

In [None]:
float_16_tensor.dtype

torch.float16

In [None]:
(float_16_tensor *  float_32_tensor).dtype

torch.float32

##### Getting information from tensors

In [None]:
# Getting info from tensors

print(float_32_tensor.dtype) # datatype
print(float_32_tensor.shape) # shape
print(float_32_tensor.device) # device

torch.float32
torch.Size([3])
cpu


### Mixing data types in tensor operations (int and float32) - they surprisingly work!

In [None]:
int_32_tensor = torch.tensor([3,6,9], dtype = torch.int32)

In [None]:
int_32_tensor * float_32_tensor # works: PyTorch promotes the result to float32

tensor([ 9., 36., 81.])

In [None]:
int_64_tensor = torch.tensor([3,6,9], dtype = torch.long) # torch.long is the same dtype as torch.int64

In [None]:
int_64_tensor * float_32_tensor # also worked

tensor([ 9., 36., 81.])
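
The mixed-dtype multiplications above work because PyTorch promotes the operands to a common dtype before computing. The following is a minimal sketch of how to check which dtype an operation would produce, using `torch.result_type`; the variable names are illustrative and not part of the original notebook.

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0])               # float32 by default
float_16 = float_32.type(torch.float16)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)
int_64 = torch.tensor([3, 6, 9], dtype=torch.int64)

# torch.result_type reports the dtype an operation on the two tensors would produce
print(torch.result_type(float_16, float_32))
print(torch.result_type(int_32, float_32))
print(torch.result_type(int_64, float_32))

# The actual multiplication ends up with the promoted dtype
promoted = int_64 * float_32
print(promoted.dtype)
```

Promotion makes these simple cases work, but datatype errors can still appear in operations that require a specific dtype, so checking `tensor.dtype` remains a useful habit.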

#### Getting Information from Tensors
(Written above)

1. Tensors not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors not the right shape - use `tensor.shape`
3. Tensors not on the right device - use `tensor.device`

In [None]:
some_tensor = torch.rand(3,4)

In [None]:
some_tensor.size() # same as shape (size() is a method, whereas shape is an attribute)

torch.Size([3, 4])

In [None]:
#Find out details:

print(some_tensor)
print(f'Datatype of tensor: {some_tensor.dtype}')
print(f'Shape of tensor: {some_tensor.shape}')
print(f'Device of tensor: {some_tensor.device}')

tensor([[0.8372, 0.7589, 0.4699, 0.8659],
        [0.7405, 0.0338, 0.8486, 0.5779],
        [0.3477, 0.2787, 0.5047, 0.7695]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device of tensor: cpu


## Manipulating Tensors (tensor operations)

Tensor operations include :

1. Addition
2. Subtraction
3. Multiplication (element-wise)
4. Division
5. Matrix Multiplication

In [None]:
# Addition with a scalar (a number) - create a tensor

tensor = torch.tensor([1,2,3])
tensor + 10 # not reassigned, so tensor is still [1, 2, 3]

tensor([11, 12, 13])

In [None]:
# Multiplication by a scalar

tensor * 10

tensor([10, 20, 30])

In [None]:
# Subtract a scalar

tensor - 10

tensor([-9, -8, -7])

In [None]:
# Try out the built-in PyTorch functions - torch.mul(tensor, 10) is the same as tensor * 10

# Prefer the operators where possible
torch.mul(tensor,10)

tensor([10, 20, 30])

In [None]:
torch.add(tensor,10) #addition inbuilt function

tensor([11, 12, 13])

### Matrix Multiplication

There are two main ways of performing multiplication in neural networks and deep learning.

1) Element-wise multiplication

2) Matrix multiplication (one of the most common operations) - multiplying one matrix by another. Each output element is the dot product of a row and a column, which is why "dot product" and "matrix multiplication" are often used interchangeably.

There are two main rules that matrix multiplication needs to satisfy (otherwise we get an error!):

1. The **inner dimensions** must match.

* `(3,2) @ (3,2)` won't work
* `(2,3) @ (3,2)` will work
* `(3,2) @ (2,3)` will work

2. The resulting matrix has the shape of the **outer dimensions**.

* `(3,2) @ (2,3)` will have 3x3 dimensions
* `(2,3) @ (3,2)` will have 2x2 dimensions
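
The two rules above can be checked directly. This is a small sketch with arbitrary random tensors; the variable names are illustrative only.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(3, 2)
c = torch.rand(2, 3)

# Rule 1: the inner dimensions must match, so (3,2) @ (3,2) raises an error
try:
    torch.matmul(a, b)
    shape_error = None
except RuntimeError as err:
    shape_error = err
    print(f"Shape error: {err}")

# Rule 2: the result takes the shape of the outer dimensions
out_3x3 = torch.matmul(a, c)   # (3,2) @ (2,3) -> (3,3)
out_2x2 = torch.matmul(c, a)   # (2,3) @ (3,2) -> (2,2)
print(out_3x3.shape, out_2x2.shape)
```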

In [None]:
# rule 1

# torch.matmul(torch.rand(3,2), torch.rand(3,2)) this doesn't work

answer = torch.matmul(torch.rand(3,2), torch.rand(2,3)) # works (matching inner dimensions)

In [None]:
# rule 2

answer.shape

torch.Size([3, 3])

In [None]:
# Element-wise multiplication

print(tensor ,"*", tensor)
print("Equals", tensor*tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals tensor([1, 4, 9])


In [None]:
# Matrix multiplication

torch.matmul(tensor, tensor) # 1 + 4 + 9 = 14: for two 1-D tensors, matmul returns their dot product

tensor(14)

In [None]:
# Matrix multiplication by hand

1*1 + 2*2 + 3*3

14

In [None]:
tensor@tensor # '@' is also used for indicating matrix multiplication

tensor(14)

In [None]:
%%time

value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 1.35 ms, sys: 60 µs, total: 1.41 ms
Wall time: 2.49 ms


In [None]:
%%time
print(torch.matmul(tensor,tensor)) # preferred: usually faster than the Python loop

tensor(14)
CPU times: user 898 µs, sys: 3 µs, total: 901 µs
Wall time: 909 µs


### Rules for larger matrix multiplication

Shape errors are among the most common errors in deep learning.

In [None]:
# Shapes for matrix multiplication

tensor_A = torch.tensor([[1,2],
                        [3,4],
                        [5,6]])

tensor_B = torch.tensor([[7,10],
                        [8,11],
                        [9,12]])

# torch.mm(tensor_A, tensor_B) # raises a shape error: (3,2) @ (3,2). torch.mm is the 2-D-only version of torch.matmul (no broadcasting)

In [None]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

##### To fix this, we can manipulate the shape of one of our tensors using transpose:

 A **transpose** switches the axes or dimensions of a given tensor.

In [None]:
# The matrix multiplication operation works when tensor_B is transposed

print(f'Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}')
print(f'New shapes: tensor_A = {tensor_A.shape}, tensor_B transpose = {tensor_B.T.shape}')

print(torch.mm(tensor_A, tensor_B.T)) # .T for transpose
print(f"Output Shape {torch.mm(tensor_A, tensor_B.T).shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])
New shapes: tensor_A = torch.Size([3, 2]), tensor_B transpose = torch.Size([2, 3])
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])
Output Shape torch.Size([3, 3])
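
Transposing the other tensor instead also satisfies the inner-dimension rule, but the output takes a different shape. A short sketch, re-creating the same `tensor_A` and `tensor_B` so it runs on its own:

```python
import torch

tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# (2,3) @ (3,2) -> (2,2): transposing tensor_A instead of tensor_B
output = torch.mm(tensor_A.T, tensor_B)
print(output)
print(output.shape)
```

Both options are valid matrix multiplications; which one is appropriate depends on what the rows and columns represent.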


### Finding Min, Max and Sum etc (Tensor aggregation)

In [None]:
# Create a tensor

x = torch.arange(1,100,10)
x, x.dtype

(tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91]), torch.int64)

In [None]:
torch.min(x), x.min()

(tensor(1), tensor(1))

In [None]:
torch.max(x), x.max()

(tensor(91), tensor(91))

In [None]:
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean() # mean() needs a float (or complex) dtype; x is int64 (long), so convert it first

(tensor(46.), tensor(46.))
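
To see why the conversion is needed, here is a minimal sketch that calls `mean()` on the integer tensor directly and catches the resulting error, then repeats the call after converting to float32.

```python
import torch

x = torch.arange(1, 100, 10)   # dtype is torch.int64

# mean() is not defined for integer dtypes, so this raises a RuntimeError
try:
    x.mean()
    mean_error = None
except RuntimeError as err:
    mean_error = err
    print(f"mean() on int64 failed: {err}")

# Converting to a float dtype first makes it work
mean_value = x.to(torch.float32).mean()
print(mean_value)
```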

In [None]:
# Find sum:

torch.sum(x), x.sum()

(tensor(460), tensor(460))

#### Finding positional min and max, that is ARGMIN, ARGMAX (the index at which the minimum and maximum numbers lie in the tensor)

In [None]:
x.argmin() # argmin

tensor(0)

In [None]:
x.argmax() # argmax - useful when we use the softmax activation function later!

tensor(9)
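
As a preview of how `argmax` pairs with softmax, the sketch below turns some made-up scores for three classes into probabilities and picks the most likely class. The `logits` values are hypothetical, chosen only for illustration.

```python
import torch

logits = torch.tensor([1.0, 3.0, 2.0])   # hypothetical raw model outputs for 3 classes

probs = torch.softmax(logits, dim=0)     # probabilities that sum to 1
predicted_class = probs.argmax()         # index of the highest probability

print(probs)
print(predicted_class)
```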

## Reshaping, stacking, squeezing and unsqueezing tensors:

* Reshaping - reshapes an input tensor to a defined shape

* View - return a view of an input tensor of a certain shape but keep the same memory as the original tensor (shows the same tensor from a different perspective)

* Stacking - combines multiple tensors along a new dimension with `torch.stack` (we can specify the dimension), on top of each other with `torch.vstack`, or side by side with `torch.hstack`

* Squeezing - removes all dimensions of size ``1`` from a tensor

* Unsqueezing - adds a dimension of size ``1`` to a target tensor

* Permute - return a view of the input with dimensions permuted (swapped) in a certain way


(All of these manipulate the shape/size/dim of tensors)

In [None]:
# Creating a tensor
import torch

x = torch.arange(1.,10.)
x, x.shape, x.dtype

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]), torch.float32)

In [None]:
# Reshape - add an extra dimension

x_reshaped = x.reshape(1,9) #the dimensions have to be compatible with the original dimensions
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
x_reshaped = x.reshape(9,1) #the dimensions have to be compatible with the original dimensions
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
# Change the view
z = x.view(1,9)
z, z.shape # z shares the same memory as x, changing z, changes x

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
z[:,0] = 5
z,x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [None]:
# Stack tensors on top of each other

x_stacked = torch.stack([x,x,x], dim=0) #dim = 0 default (vertical), dim 1 = horizontal
x_stacked, x_stacked.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 torch.Size([3, 9]))

In [None]:
y_vstack = torch.vstack([x,x])
y_hstack = torch.hstack([x,x])

y_vstack, y_vstack.shape, y_hstack, y_hstack.shape # hstack joins 1-D tensors end to end: shape goes from [9] to [18], whereas vstack gives [2, 9]

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 torch.Size([2, 9]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 torch.Size([18]))
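
The effect of the `dim` argument is easiest to see from the output shapes. A small sketch comparing `torch.stack` along both dimensions with `torch.cat`, which joins along an existing dimension:

```python
import torch

x = torch.arange(1., 10.)                        # shape [9]

stacked_rows = torch.stack([x, x, x], dim=0)     # new dimension in front: [3, 9]
stacked_cols = torch.stack([x, x, x], dim=1)     # new dimension at the end: [9, 3]
concatenated = torch.cat([x, x], dim=0)          # no new dimension: [18]

print(stacked_rows.shape, stacked_cols.shape, concatenated.shape)
```

For 1-D inputs, `torch.cat(..., dim=0)` gives the same result as `torch.hstack`.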

### Squeezing and Unsqueezing

In [None]:
# torch.squeeze() - removes all dimensions of size 1 from a target tensor
t = torch.zeros(1, 1, 2, 1, 2)
t # Initial tensor

tensor([[[[[0., 0.]],

          [[0., 0.]]]]])

In [None]:
t.ndim, t.size()

(5, torch.Size([1, 1, 2, 1, 2]))

In [None]:
s = torch.squeeze(t) # Squeezed tensor / output
s, s.ndim, s.size()

(tensor([[0., 0.],
         [0., 0.]]),
 2,
 torch.Size([2, 2]))

In [None]:
x, x_reshaped.shape # Second example with initially reshaped x

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9, 1]))

In [None]:
# Second example

w = x_reshaped.squeeze()

In [None]:
w.shape # After squeezing

torch.Size([9])

##### Unsqueeze

In [None]:
# torch.unsqueeze() - adds a dimension of size 1 to a target tensor at a specific dim!

print("Tensor: ", w)
print("Dimensions: ", w.ndim)
print("Shape: ", w.shape) # x_reshaped which was squeezed

Tensor:  tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Dimensions:  1
Shape:  torch.Size([9])


In [None]:
# ADDING dimension with unsqueeze

z = w.unsqueeze(dim=1) # dim = 0 will add before 9, and dim = 1, adds after 9
print("Tensor: ", z)
print("Dimensions: ", z.ndim)
print("Shape: ", z.shape)

Tensor:  tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
Dimensions:  2
Shape:  torch.Size([9, 1])


#### torch.permute - (returns a view) rearranges the dimensions of target tensor in a specific order

- Used with images more often

Syntax : `torch.permute ( input, (dimension order)) OR x.permute(dim order)`

In [None]:
x_original = torch.rand(size=(224,244,3)) # [height,width, colour channels]

# Permute the original tensor to rearrange the axis or dim order

x_permuted = x_original.permute(2,0,1) # [color channels, height, width]

print(f'Before Permutation: {x_original.shape}')
print(f'After Permutation: {x_permuted.shape}')

Before Permutation: torch.Size([224, 244, 3])
After Permutation: torch.Size([3, 224, 244])


In [None]:
x_original[0,0,0] = 437150
x_original[0,0,0]

tensor(437150.)

In [None]:
x_permuted[0,0,0] # yes, changes in original are reflected in the permuted tensor

tensor(437150.)

### Some additional information!

A tensor becomes non-contiguous when an operation reorders its dimensions without copying the data.

These operations create views, but the memory layout becomes non-standard (non-contiguous), because the tensor just points to parts of the original memory instead of storing the data in a row-by-row format. Examples: permute(), transpose(), expand(), and in some cases select() and narrow().

### Is it only when we create a view?
Not always, but most non-contiguous tensors are views created from the original tensor using operations like permute, transpose, etc.

However, a view can still be contiguous if it doesn't change the memory layout.

### Does modifying the permuted tensor change the original?

Yes. permute() returns a view (often a non-contiguous one) that shares memory with the original, so modifying it in-place also changes the original. What errors out on a non-contiguous tensor is view(), not in-place modification.

The two only become independent after an operation that copies data, such as clone(), or contiguous() when the tensor is non-contiguous.
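
A small sketch to check how a permuted view behaves in practice. It uses a tiny integer tensor (not the image-sized one above) so the values are easy to follow; the variable names are illustrative.

```python
import torch

original = torch.arange(6).reshape(2, 3)
permuted = original.permute(1, 0)            # a view with shape (3, 2)

print(original.is_contiguous(), permuted.is_contiguous())

# In-place modification of the view is allowed and writes to the shared memory
permuted[0, 0] = 100
print(original[0, 0])

# view() needs a compatible memory layout, so it fails on the permuted tensor
try:
    permuted.view(6)
    view_error = None
except RuntimeError as err:
    view_error = err
    print(f"view() failed: {err}")

# contiguous() copies the data here, so the copy no longer shares memory
copied = permuted.contiguous()
copied[0, 0] = -1
print(original[0, 0])
```

When a flat or reshaped version of a permuted tensor is needed, `reshape()` is a convenient alternative to `view()`, since it copies the data when it has to.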

## Indexing (selecting data from tensors)

Similar to indexing in NumPy

In [None]:
import torch

# Create a tensor

x = torch.arange(1,10).reshape(1,3,3) # 9 values
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Let's index on our new tensor

x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
x[0][1] # same as x[0,1] - index 1 on the second dimension (size 3)

tensor([4, 5, 6])

In [None]:
x[0,2,2] #- index on third dimension (3) - returns 9

tensor(9)

In [None]:
# you can also use `:` to select "all" of a target dimension

x[0,:,2]

tensor([3, 6, 9])

In [None]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension

x[0,0,:]

tensor([1, 2, 3])

In [None]:
# Get all values of the 0th dimension but only index 1 of the 1st and 2nd dimensions

x[:,1,1]

tensor([5])

## PyTorch Tensors and NumPy

NumPy is a popular Python library for scientific numerical computing, so PyTorch has functionality to interact with it.

1. Data in NumPy -> PyTorch tensor: `torch.from_numpy(ndarray)`
2. PyTorch tensor -> NumPy array: `torch.Tensor.numpy()`


In [None]:
# NumPy array to tensor

import numpy as np
import torch

array = np.arange(1.0,8.0)
tensor = torch.from_numpy(array) # warning: when converting NumPy -> PyTorch, the tensor keeps NumPy's dtype unless specified otherwise

# The tensor dtype is float64 because NumPy's default float datatype is float64, and PyTorch keeps it
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
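# Change the value of array, what will this do to the tensor?

array = array + 1 # creates a NEW array and rebinds the name; the original memory is untouched
array, tensor # Answer - tensor value doesn't change! (an in-place `array += 1` would change it, since from_numpy shares memory)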

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to NumPy array

tensor = torch.ones(7)
numpy_array = tensor.numpy() # will have dtype = float32, reflects the original PyTorch default
tensor, numpy_array

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
# Change the tensor, what happens to the NumPy array?

tensor = tensor + 1 # creates a NEW tensor and rebinds the name
tensor, numpy_array # Answer - doesn't change (an in-place `tensor.add_(1)` would change it, since numpy() shares memory on the CPU)

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
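
Whether the array and the tensor stay in sync depends on whether the change is made in place. The sketch below contrasts in-place updates, which go through the shared memory, with an explicit copy made by `torch.tensor`; the variable names are illustrative.

```python
import numpy as np
import torch

# NumPy -> PyTorch: from_numpy shares memory with the array
array = np.arange(1.0, 4.0)
shared = torch.from_numpy(array)
independent = torch.tensor(array)   # torch.tensor copies the data instead

array += 1                          # in-place update of the original memory
print(array, shared, independent)

# PyTorch -> NumPy: numpy() shares memory with a CPU tensor
ones = torch.ones(3)
ones_np = ones.numpy()
ones.add_(1)                        # in-place update of the tensor
print(ones, ones_np)
```

If an independent copy is wanted, `torch.tensor(array)` or `tensor.clone().numpy()` avoids the shared memory.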

## Reproducibility

**A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator.**

- Trying to take the random out of random
- In short, a NN learns like this:

`Start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> repeat again and again`

- We use torch.rand() to start with random numbers, but if you share this notebook with a friend, they will get different output.

- Hence, to reduce the randomness in neural networks and PyTorch, we use a **random seed**.

Essentially, the random seed "flavours" the randomness: the same seed always produces the same sequence of numbers. This is called pseudo-randomness!

In [None]:
import torch

random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

random_tensor_A, random_tensor_B, random_tensor_A == random_tensor_B # they are not equal (obviously)

(tensor([[9.1022e-01, 9.2708e-01, 9.2110e-01, 9.1411e-01],
         [2.1153e-01, 8.8279e-01, 8.6602e-01, 9.6379e-01],
         [2.9296e-04, 5.3588e-01, 6.8811e-01, 9.7044e-01]]),
 tensor([[0.8300, 0.5375, 0.0287, 0.2321],
         [0.2951, 0.6255, 0.6260, 0.6717],
         [0.0673, 0.6746, 0.2714, 0.0199]]),
 tensor([[False, False, False, False],
         [False, False, False, False],
         [False, False, False, False]]))

In [None]:
# Make them random but reproducible..?

RANDOM_SEED = 42 # can set it to any, different flavours of randomness
torch.manual_seed(RANDOM_SEED)

random_tensor_c = torch.rand(3,4)

RANDOM_SEED = 42 # the seed must be reset here: every rand() call advances the generator
torch.manual_seed(RANDOM_SEED)
random_tensor_d = torch.rand(3,4)

# to get identical tensors, call torch.manual_seed(RANDOM_SEED) right before each torch.rand() call

random_tensor_c, random_tensor_d, random_tensor_c == random_tensor_d

(tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[True, True, True, True],
         [True, True, True, True],
         [True, True, True, True]]))
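
The sketch below shows why the seed has to be set again before the second call: each `torch.rand` call advances the generator's state. It also shows one alternative, a dedicated `torch.Generator` object passed through the `generator` argument, which keeps reproducible draws separate from the global seed. The variable names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)          # the generator has advanced, so this differs from first

torch.manual_seed(42)
first_again = torch.rand(3, 4)     # reseeding restarts the sequence

print(torch.equal(first, second), torch.equal(first, first_again))

# Two separate generators with the same seed produce the same values
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
from_a = torch.rand(3, 4, generator=gen_a)
from_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(from_a, from_b))
```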

### Running tensors and PyTorch objects on the GPUs (and making faster computations)

GPUs - faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes to make everything hunky-dory (good).



### 1. Getting a GPU:

a) Easiest - Use Google Colab for a free GPU (paid upgrades are available)

b) Use your own GPU - takes a little bit of setup and requires the investment of purchasing a GPU (there are many options; see https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/)

c) Use cloud computing - GCP, AWS, Azure, these services allow you to rent computers on the cloud and access them

### 2. Check for GPU Access



In [None]:
# Check for GPU access with PyTorch

import torch
torch.cuda.is_available()

False

In [None]:
# Setup device agnostic code

device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [None]:
# Count number of devices
torch.cuda.device_count() # for large, models which are run on multiple devices

0

Since PyTorch is capable of running computations on the GPU or the CPU, it's best practice to set up device-agnostic code.

https://docs.pytorch.org/docs/stable/notes/cuda.html#best-practices
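
A minimal sketch of what device-agnostic code can look like: the device is chosen once, and every tensor is either created on it or moved to it, so the same cell runs with or without a GPU. The tensor names are illustrative.

```python
import torch

# Pick the device once and reuse it everywhere
device = "cuda" if torch.cuda.is_available() else "cpu"

created_on_device = torch.rand(3, 4, device=device)      # created directly on the device
moved_to_device = torch.tensor([1, 2, 3]).to(device)     # created on the CPU, then moved

# Operations work as long as both operands live on the same device
result = created_on_device.sum() + moved_to_device.sum()
print(result.device)
```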

### 3. Putting tensors (and models) on the GPU

The reason we want our tensors/models on the GPU is because using a GPU results in faster computations.


In [None]:
# Create a tensor ( default is on the CPU )

tensor = torch.tensor([1,2,3], device = "cpu")

print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU (if available)
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_gpu = tensor.to(device)
tensor_on_gpu, tensor_on_gpu.device # shows cuda:0 on a machine with one GPU (index 0); no GPU is available here, so it stays on the CPU

(tensor([1, 2, 3]), device(type='cpu'))

### 4. Moving tensors back to the CPU

In [None]:
# If a tensor is on the GPU, it can't be converted to NumPy directly

tensor_on_gpu.numpy() # on a GPU this raises the 3rd type of error (incompatible device); it works here because the tensor is on the CPU

array([1, 2, 3])

In [None]:
# To fix the GPU tensor with NumPy issue, we can first move it to the CPU

tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [None]:
tensor_on_gpu # remains unchanged!

tensor([1, 2, 3])
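
One way to make the NumPy conversion work on any device is to wrap it in a small helper. The sketch below assumes a hypothetical helper named `to_numpy` (not part of PyTorch) that detaches the tensor from gradient tracking, moves it to the CPU, and then converts it.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"


def to_numpy(tensor):
    """Hypothetical helper: convert a tensor on any device to a NumPy array."""
    # detach() drops gradient tracking, cpu() is a no-op if already on the CPU
    return tensor.detach().cpu().numpy()


tensor_on_device = torch.tensor([1, 2, 3]).to(device)
as_array = to_numpy(tensor_on_device)
print(as_array, type(as_array))
```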