# **PyTorch Fundamentals**

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

print(torch.__version__)

2.6.0+cu124


## Introduction to Tensors  

A vector is a quantity with both magnitude and direction, represented as an array of numbers such as (x, y, z).  
If a vector is a 1D array and a matrix is a 2D array, then a tensor is the generalization: an array with any number of dimensions.  
Tensors suit deep learning because operations on them can be vectorized and run in parallel, including on GPUs.  
Every tensor has a rank, given by its number of dimensions (Rank 0 = scalar, Rank 1 = vector, Rank 2 = matrix, and so on).  
  
In PyTorch, all of these are the same `torch.Tensor` type, regardless of rank.

### Creating Tensors

In [None]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

PyTorch tensors are created using

```
torch.tensor()
```

PyTorch documentation definition:  
A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

In [None]:
scalar.ndim # checking the number of dimensions

0

In [None]:
scalar.item() # changes it back to a regular python integer

7

In [None]:
# Vector - 1D Array
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [None]:
vector.ndim # 1 dimension (# of square brackets)

1

In [None]:
vector.shape # shape of the tensor: one dimension containing 2 elements

torch.Size([2])

In [None]:
# MATRIX - 2D Array
MATRIX = torch.tensor([[7, 8],
                       [9, 8]])
MATRIX

tensor([[7, 8],
        [9, 8]])

In [None]:
MATRIX.ndim # 2 Square brackets:

2

In [None]:
MATRIX[0, 1] # Indexes like a normal array

tensor(8)

In [None]:
MATRIX.shape # 2 by 2

torch.Size([2, 2])

In [None]:
# TENSOR - 3D array
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [None]:
TENSOR.ndim # 3 Dimensions

3

In [None]:
TENSOR.shape # we have one 3 by 3 shaped tensor

torch.Size([1, 3, 3])

In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

In [None]:
TENSOR = torch.tensor([[[[3, 3, 4], [1, 2, 4], [1 , 2, 3]],
                        [[3, 3, 4], [1, 2, 4], [1 , 2, 3]]]])
print( TENSOR.shape )
print( TENSOR.ndim )

torch.Size([1, 2, 3, 3])
4


##### The dimensions are counted from the outermost bracket inward; the outermost pair of brackets simply encloses the whole tensor.  
##### Inside it there is 1 block (a dimension of size 1), which holds 2 sub-blocks (size 2), each holding 3 rows (size 3), each holding 3 elements (size 3), giving a shape of (1, 2, 3, 3).

#### In summary, it is how many elements are in each bracket.
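
The bracket-counting rule can be checked against plain Python lists. This is a minimal sketch: `nested_shape` is a hypothetical helper written for this note (it is not part of PyTorch) and it assumes a regular, non-ragged nested list.

```python
import torch

def nested_shape(data):
    """Return the shape of a regular nested list by walking inward one bracket level at a time."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))  # how many elements sit inside this bracket level
        data = data[0]           # step into the first element of the next level
    return tuple(shape)

nested = [[[[3, 3, 4], [1, 2, 4], [1, 2, 3]],
           [[3, 3, 4], [1, 2, 4], [1, 2, 3]]]]

print(nested_shape(nested))               # (1, 2, 3, 3)
print(tuple(torch.tensor(nested).shape))  # (1, 2, 3, 3)
```

Both lines print the same tuple, so counting the elements at each bracket level gives the shape that PyTorch reports.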

### Random Tensors  
  
Why random tensors?  
Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those random numbers to better represent the data.


`Start w/ random numbers -> Look at data -> Update random numbers -> Look at data -> Update random numbers`



In [None]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.6284, 0.6001, 0.2895, 0.8759],
        [0.3338, 0.1847, 0.6915, 0.7893],
        [0.8311, 0.7183, 0.1566, 0.5625]])

In [None]:
random_tensor.ndim # 2 Dimensions (3 by 4)

2

In [None]:
# Create a random tensor w/ similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(244, 244, 3)) # height, width, color channels (R, G, B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([244, 244, 3]), 3)

Images can therefore be represented as tensors. Tensors are very similar to NumPy arrays, which is how image data is commonly stored.  
  
##### The `size=` keyword is optional: `torch.rand(3, 4)` and `torch.rand(size=(3, 4))` create tensors of the same shape.
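
A quick way to confirm that the `size=` keyword is optional is to compare the resulting shapes. This is only a sketch; the values themselves are random and will differ between the three tensors.

```python
import torch

a = torch.rand(3, 4)         # sizes passed positionally
b = torch.rand(size=(3, 4))  # sizes passed as a tuple through the keyword
c = torch.rand((3, 4))       # a tuple passed positionally also works

print(a.shape, b.shape, c.shape)
```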

### Zeros and ones

In [None]:
# Create a tensor of all zeros
zero = torch.zeros(size=(3, 4))
# Useful for creating masks, as when multiplying tensors by zero, they become zero.
zero

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create a tensor of all ones
one = torch.ones(size=(3,4))
one

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
one.dtype # Default datatype is torch.float32

torch.float32

### Creating a range of tensors and tensors-like

In [None]:
# Use torch.arange(start, end, step)
one_to_nine = torch.arange(1, 10, step=3) # end is exclusive, so values run from 1 up to (but not including) 10 in steps of 3 -> tensor([1, 4, 7])

In [None]:
# Creating tensors like
three_zeros = torch.zeros_like(input=one_to_nine) # creates a new tensor with old tensor's shape
three_zeros

tensor([0, 0, 0])
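
Since the `torch.arange` cell above has no printed output, here is a small sketch that shows the values it produces and how the `*_like` functions copy both shape and datatype. The name `ones_like_it` is just an illustrative variable.

```python
import torch

one_to_nine = torch.arange(1, 10, step=3)
print(one_to_nine)          # tensor([1, 4, 7])
print(torch.arange(1, 10))  # tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])

# *_like functions reuse the shape and the dtype of the input tensor
ones_like_it = torch.ones_like(one_to_nine)
print(ones_like_it, ones_like_it.dtype)  # tensor([1, 1, 1]) torch.int64
```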

### Tensor Datatypes  
  
**Note:** Datatype mismatches are one of the 3 most common sources of errors in PyTorch & deep learning:  
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # What datatype is the tensor (float32 is default)
                               device=None, # What device is your tensor on (CPU is default)
                               requires_grad=False) # Whether or not to track gradients w/ this tensor's operations

A single-precision floating point is called a float 32.  
Half-precision is called float 16.  
If you want to compute faster but sacrifice detail, you can use float 16. If you need more precision, you might go up to float 64.
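
The precision trade-off can be made concrete with `torch.finfo`, which reports the properties of a floating point datatype. This is a rough illustration only: it shows storage size and rounding error, not speed, which depends on the hardware.

```python
import torch

for dtype in (torch.float16, torch.float32, torch.float64):
    info = torch.finfo(dtype)
    stored = torch.tensor(0.1, dtype=dtype)  # 0.1 cannot be represented exactly in binary
    print(f"{dtype}: bits={info.bits}, eps={info.eps:.3e}, "
          f"bytes per element={stored.element_size()}, "
          f"0.1 is stored as {stored.item():.20f}")
```

The stored value of 0.1 gets closer to 0.1 as the number of bits grows, while the memory per element doubles at each step.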

In [None]:
# Converting Datatypes
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
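
The multiplication above works even though the datatypes differ because PyTorch promotes the operands to a common datatype. A short sketch of that behavior (variable names follow the cells above; `int_tensor` is added for illustration):

```python
import torch

float_32_tensor = torch.tensor([3.0, 6.0, 9.0])
float_16_tensor = float_32_tensor.to(torch.float16)

product = float_16_tensor * float_32_tensor
print(product.dtype)                                      # torch.float32
print(torch.promote_types(torch.float16, torch.float32))  # torch.float32

# An integer tensor combined with a float tensor is promoted to the float type
int_tensor = torch.tensor([1, 2, 3])
print((int_tensor * float_32_tensor).dtype)               # torch.float32
```

Not every operation promotes automatically, so datatype errors can still appear elsewhere, for example in `torch.mm` with mismatched datatypes.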

### Getting Information from Tensors  
  
1. Tensors not right datatype -> to get datatype from a tensor, can use `tensor.dtype`
2. Tensors not right shape -> to get shape from a tensor, can use `tensor.shape`
3. Tensors not on the right device -> to get device from a tensor can use `tensor.device`

In [None]:
# Create a tensor
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.1670, 0.8292, 0.7222, 0.2704],
        [0.1359, 0.1659, 0.8098, 0.3510],
        [0.4314, 0.2359, 0.8547, 0.5130]])

In [None]:
# Find out details about some tensor
print(some_tensor)
print(f"\nDatatype of tensor: {some_tensor.dtype}")
print(f"\nShape of tensor: {some_tensor.shape}")
print(f"\nDevice of tensor: {some_tensor.device}")

tensor([[0.1670, 0.8292, 0.7222, 0.2704],
        [0.1359, 0.1659, 0.8098, 0.3510],
        [0.4314, 0.2359, 0.8547, 0.5130]])

Datatype of tensor: torch.float32

Shape of tensor: torch.Size([3, 4])

Device of tensor: cpu


In [None]:
ran_tensor = torch.rand(size=(3, 4))
ran_tensor = ran_tensor.type(torch.int32)

### Manipulating Tensors (tensor operations)  
  
Tensor operations include:  
* Addition
* Subtraction
* Multiplication (element wise)
* Division
* Matrix Multiplication

In [None]:
# Create a tensor & add 10 to it
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
# Subtract 10
tensor - 10

tensor([-9, -8, -7])

In [None]:
# PyTorch also has built-in functions for these operations
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [None]:
torch.add(tensor, 10)

tensor([11, 12, 13])

### Matrix Multiplication  
  
There are two main ways of performing multiplication in neural networks and deep learning:  
  
1. Element-wise multiplication
2. Matrix multiplication (dot product)  <-- *Most common in deep learning*  
  
There are two main rules that performing matrix multiplication needs to satisfy:  
1. The **inner dimensions** must match:
* `(3, 2) @ (3, 2)` won't work
* `(2, 3) @ (3, 2)` will work
* `(3, 2) @ (2, 3)` will work  
Basically the inner numbers must match: (outer, **inner**) and (**inner**, outer)  
2. The resulting matrix has the shape of the **outer dimensions**:
* `(2, 3) @ (3, 2)` -> `(2, 2)`
* `(3, 2) @ (2, 3)` -> `(3, 3)`
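
The two rules can be written as a small function and compared against what PyTorch does. `matmul_output_shape` is a hypothetical helper made up for this note; it only handles the 2D case described above.

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Return the result shape of a 2D matrix product, or None if the inner dimensions differ."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:      # rule 1: inner dimensions must match
        return None
    return (rows_a, cols_b)   # rule 2: result takes the outer dimensions

for shape_a, shape_b in [((3, 2), (3, 2)), ((2, 3), (3, 2)), ((3, 2), (2, 3))]:
    expected = matmul_output_shape(shape_a, shape_b)
    if expected is None:
        print(shape_a, "@", shape_b, "-> inner dimensions do not match")
    else:
        actual = tuple((torch.rand(shape_a) @ torch.rand(shape_b)).shape)
        print(shape_a, "@", shape_b, "->", actual, "| predicted:", expected)
```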

In [None]:
torch.matmul(torch.rand(2, 3), torch.rand(3, 2)) # Inner dimensions match, gives a (2, 2) shape

tensor([[0.4983, 0.7353],
        [0.3355, 0.5017]])

In [None]:
# Element wise multiplication
print( tensor, "*", tensor)
print(f"Equals: {tensor*tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [None]:
# "@" stands for matrix multiplication
tensor @ tensor

tensor(14)

In [None]:
# Matrix Multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [None]:
# Matrix Multiplication by hand
1*1 + 2*2 + 3*3

14

Using `torch.matmul()` is far faster than coding a for loop by hand, because PyTorch runs the computation in optimized, vectorized native code.  
**Always Use `torch.matmul()` for matrix multiplication!**
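
To see the difference, the sketch below multiplies two small matrices with nested Python loops and with `torch.matmul()`, then checks that the results agree. `matmul_loops` is a hypothetical helper written for this comparison, and the timings depend on the machine, so no specific numbers are claimed.

```python
import time
import torch

torch.manual_seed(0)
a = torch.rand(64, 32)
b = torch.rand(32, 16)

def matmul_loops(m1, m2):
    """Naive matrix multiplication using three nested Python loops."""
    rows, inner = m1.shape
    cols = m2.shape[1]
    out = torch.zeros(rows, cols)
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for k in range(inner):
                total += m1[i, k].item() * m2[k, j].item()
            out[i, j] = total
    return out

start = time.perf_counter()
slow = matmul_loops(a, b)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = torch.matmul(a, b)
matmul_seconds = time.perf_counter() - start

print("Results match:", torch.allclose(slow, fast, atol=1e-4))
print(f"loops: {loop_seconds:.4f}s, torch.matmul: {matmul_seconds:.6f}s")
```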

### One of the most common errors in deep learning: shape errors

In [None]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

torch.mm(tensor_A, tensor_B) # torch.mm is a shorter alternative to torch.matmul for 2D matrices (it does not broadcast)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [None]:
tensor_A.shape, tensor_B.shape # Inner dimensions are not the same! 2 != 3

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a **transpose**.  
  
**transpose** - switches the axes or dimensions of a given tensor

In [None]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [None]:
tensor_B.T, tensor_B.T.shape # Changes shape from (3, 2) to (2, 3)

# Still have the same elements, just rearranged.

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [None]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same shape as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match\n")

print("Output:")
output = torch.mm(tensor_A, tensor_B.T)
print(output)

print(f"Output shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same shape as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match

Output:
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])
Output shape: torch.Size([3, 3])


## Finding the min, max, mean, sum, etc (tensor aggregation)

In [None]:
# Create a tensor
x = torch.arange(1, 100, 10)
x, x.dtype

(tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91]), torch.int64)

In [None]:
# Find the min
torch.min(x), x.min()

(tensor(1), tensor(1))

In [None]:
# Find the max
torch.max(x), x.max()

(tensor(91), tensor(91))

In [None]:
# Find the mean - note: torch.mean() requires a floating point (or complex) datatype such as float32
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(46.), tensor(46.))

In [None]:
# Find the sum
torch.sum(x), x.sum()

(tensor(460), tensor(460))
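
The datatype requirement of `torch.mean()` is easy to trip over, so here is a sketch of what happens with an integer tensor and a few ways around it. The exact error message may vary between PyTorch versions.

```python
import torch

x = torch.arange(1, 100, 10)  # int64 by default

try:
    torch.mean(x)             # integer input is rejected
except RuntimeError as error:
    print(f"RuntimeError: {error}")

print(x.float().mean())                    # convert first, then aggregate
print(torch.mean(x, dtype=torch.float32))  # or ask mean() to cast the input
print(x.sum() / x.numel())                 # or compute it from sum and element count
```

All three working variants give the same mean of 46.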

## Finding the positional min and max

In [None]:
x

tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])

In [None]:
# Find the position in tensor that has the minimum value w/ argmin() -> returns the index position of target tensor where the min value occurs
x.argmin()
# At index [0], we get the minimum value.

tensor(0)

In [None]:
x[0]

tensor(1)

In [None]:
# Find the position in tensor that has the maximum value w/ argmax() -> returns the index position of target tensor where the max value occurs
x.argmax()

tensor(9)

In [None]:
x[9]

tensor(91)

## Reshaping, stacking, squeezing and unsqueezing tensors  
  
* Reshaping - reshapes an input tensor to a defined shape
* View - return a view of an input tensor of a certain shape but keep the same memory as the original tensor
* Stacking - combine multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - add a `1` dimension to a target tensor
* Permute - return a view of the input with dimensions permuted (swapped) in a certain way

In [None]:
# Let's create a tensor
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Lets reshape
x_reshaped= x.reshape(1, 9) # Dimensions must be compatible with the original dimensions
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Change the view
z = x.view(1, 9)
z, z.shape

# View is similar to reshape, but a view always shares memory with the original tensor (reshape returns a view when it can and a copy otherwise).

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Changing z changes x (because a view of a tensor shares the same memory as the original input)
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
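
The difference between `view` and `reshape` only shows up when the tensor is not contiguous in memory. The sketch below uses a transposed tensor as an example of that case; `base`, `viewed`, `transposed` and `copied` are illustrative names.

```python
import torch

base = torch.arange(6)
viewed = base.view(2, 3)
print(viewed.data_ptr() == base.data_ptr())  # True: same underlying memory

transposed = viewed.T                        # shape (3, 2), no longer contiguous
print(transposed.is_contiguous())            # False

try:
    transposed.view(6)                       # view cannot flatten a non-contiguous tensor
except RuntimeError as error:
    print("view failed:", error)

copied = transposed.reshape(6)               # reshape falls back to making a copy
print(copied)                                # tensor([0, 3, 1, 4, 2, 5])
print(copied.data_ptr() == base.data_ptr())  # False: this one is a copy
```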

In [None]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x], dim=0) # Dimension 0 is default.
x_stacked

# It combines two 1D tensors along a new dimension, resulting in a 2D tensor.

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [None]:
# Using vstack aka vertical stack - Stack tensors in sequence vertically (row wise).
x_vstacked = torch.vstack([x,x,x,x])
x_vstacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [None]:
# Using hstack aka horizontal stack - Stack tensors in sequence horizontally (column wise).
x_hstacked = torch.hstack([x,x,x,x])
x_hstacked

tensor([5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.,
        5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.])
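
Comparing shapes side by side makes the difference between the stacking functions clearer. This sketch uses a fresh `x`; `torch.cat` is included for contrast because it joins along an existing dimension instead of creating a new one.

```python
import torch

x = torch.arange(1., 10.)  # shape [9]

print(torch.stack([x, x], dim=0).shape)  # torch.Size([2, 9])  new dimension in front
print(torch.stack([x, x], dim=1).shape)  # torch.Size([9, 2])  new dimension at the end
print(torch.vstack([x, x]).shape)        # torch.Size([2, 9])  rows on top of each other
print(torch.hstack([x, x]).shape)        # torch.Size([18])    1D tensors joined end to end
print(torch.cat([x, x], dim=0).shape)    # torch.Size([18])    joined along an existing dimension
```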

### Squeeze and Unsqueeze  

These are used to manipulate tensor shapes without changing the underlying data.

In [None]:
# torch.squeeze() - removes all single dimensions from a target tensor
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimensions from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
Previous shape: torch.Size([1, 9])

New tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape: torch.Size([9])


So you can see there that it removed a set of brackets, thus turning it from (1, 9) shape to (9) or just 9 elements.

In [None]:
# torch.unsqueeze() - adds a single dimension to a target tensor at a specific dim (dimension)
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension w/ unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


Now observe that it added a set of brackets, turning it from (9) to (1, 9), so it added a dimension.
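
Both functions accept a `dim` argument that controls which dimension is removed or where the new one is added. A short sketch with zero-filled tensors (`t` and `v` are illustrative names):

```python
import torch

t = torch.zeros(1, 9, 1)
print(t.squeeze().shape)       # torch.Size([9])        every size-1 dimension removed
print(t.squeeze(dim=0).shape)  # torch.Size([9, 1])     only dimension 0 removed
print(t.squeeze(dim=1).shape)  # torch.Size([1, 9, 1])  dimension 1 has size 9, so nothing changes

v = torch.zeros(9)
print(v.unsqueeze(dim=0).shape)   # torch.Size([1, 9])
print(v.unsqueeze(dim=1).shape)   # torch.Size([9, 1])
print(v.unsqueeze(dim=-1).shape)  # torch.Size([9, 1])  negative dims count from the end
```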

In [None]:
# torch.permute - reorders the dimensions of a tensor
x_original = torch.rand(size=(224, 224, 3)) # [height, width, color_channels]

# Permute the original tensor to rearrange the axis (or dim) order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}") # [color_channels, height, width]

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


Permute returns a view of the original tensor. It doesn't make a copy: it is the same underlying data, just accessed with the dimensions in a different order.

In [None]:
x_original[0, 0, 1] = 9999
x_original[0, 0, 1], x_permuted[1, 0, 0] # x_permuted was never assigned to; it changed because it shares memory with x_original.
# The index order differs because the dimensions were reordered: [0, 0, 1] in x_original is [1, 0, 0] in x_permuted.

(tensor(9999.), tensor(9999.))
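
One way to see that `permute` only changes how the data is indexed is to look at the strides, and to compare with an explicit copy made by `.contiguous()`. This is a sketch; `x_copy` is an illustrative name.

```python
import torch

x_original = torch.rand(224, 224, 3)
x_permuted = x_original.permute(2, 0, 1)

print(x_permuted.data_ptr() == x_original.data_ptr())  # True: same memory
print(x_original.stride(), x_permuted.stride())        # (672, 3, 1) (1, 672, 3)
print(x_permuted.is_contiguous())                      # False

x_copy = x_permuted.contiguous()  # makes an independent copy laid out in the new order
x_original[0, 0, 1] = 9999

print(x_permuted[1, 0, 0].item())  # 9999.0, the view follows the original
print(x_copy[1, 0, 0].item())      # still the old random value, the copy does not
```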

## Indexing (selecting data from tensors)  
  
Indexing w/ PyTorch is similar to indexing w/ NumPy.

In [None]:
# Create a tensor
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Lets index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# Lets index on the middle bracket (dim=1)
x[0][0] # <-- First bracket, First element in the bracket

tensor([1, 2, 3])

In [None]:
x[0, 0] # this is the same

tensor([1, 2, 3])

In [None]:
# Let's index on the most inner bracket (last dimension)
x[0, 2, 2]

tensor(9)

In [None]:
# You can also use the ":" to select "all" of the target dimension
x[:, 0]

tensor([[1, 2, 3]])

In [None]:
# Get all values of 0th and 1st dimensions but only index 1 of the 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0th dimension but only index 1 of the 1st and 2nd dimensions
x[:, 1, 1]

tensor([5])

In [None]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0, 0, :]

tensor([1, 2, 3])

In [None]:
# Get index 0 of the 0th dimension, all values of the 1st dimension, and the last index of the 2nd dimension
x[0, :, 2]

tensor([3, 6, 9])
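
A few more indexing patterns, including slices, which keep a dimension where an integer index would drop it. This sketch reuses the same `x` as above.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[0, 1, :])    # tensor([4, 5, 6])  the middle row
print(x[0, :, 0])    # tensor([1, 4, 7])  the first column
print(x[0, 1:, 1:])  # tensor([[5, 6], [8, 9]])  the bottom-right 2x2 block

print(x[:, 1].shape)    # torch.Size([1, 3])     an integer index removes that dimension
print(x[:, 1:2].shape)  # torch.Size([1, 1, 3])  a slice keeps it with size 1
```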

## PyTorch tensors & NumPy  
  
PyTorch has functionality to interact w/ NumPy.  
  
* Data in NumPy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`  
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [None]:
# NumPy array to tensor
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

Warning: `torch.from_numpy()` keeps the NumPy array's datatype, so converting a default NumPy float array gives a float64 tensor unless you convert it afterwards (e.g. `.type(torch.float32)`).

In [None]:
array.dtype # Numpy's default datatype is float64

dtype('float64')

In [None]:
torch.arange(1.0, 8.0).dtype # Pytorch's default datatype is float32

torch.float32

In [None]:
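# Change the value of array, what will this do to `tensor`?
array = array + 1
array, tensor
# `array + 1` builds a new array and rebinds the name `array`, so `tensor` still sees the old data.
# Note: torch.from_numpy() shares memory with its source array, so an in-place change (e.g. `array += 1`) would show up in `tensor`.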

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
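# Change the tensor, what happens to `numpy_tensor`?
tensor = tensor + 1
tensor, numpy_tensor
# `tensor + 1` builds a new tensor and rebinds the name `tensor`, so `numpy_tensor` still sees the old data.
# Note: tensor.numpy() shares memory with the tensor, so an in-place change (e.g. `tensor.add_(1)`) would show up in `numpy_tensor`.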

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
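
The cells above reassign the variable, which hides the fact that `torch.from_numpy()` and `Tensor.numpy()` share memory with their source. The sketch below contrasts in-place changes with reassignment; `independent`, `ones` and `ones_as_numpy` are illustrative names.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

array += 1          # in-place: modifies the memory the tensor also uses
print(tensor)       # tensor([2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64)

array = array + 1   # reassignment: creates a new array, the tensor keeps the old memory
print(tensor)       # unchanged: tensor([2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64)

# clone() gives a tensor that no longer depends on the array
independent = torch.from_numpy(array).clone()
array += 100
print(independent)  # tensor([3., 4., 5., 6., 7., 8., 9.], dtype=torch.float64)

# The same sharing applies in the other direction
ones = torch.ones(7)
ones_as_numpy = ones.numpy()
ones.add_(1)          # in-place addition on the tensor
print(ones_as_numpy)  # [2. 2. 2. 2. 2. 2. 2.]
```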

## Reproducibility (trying to take the random out of random)

It is like getting the same seed for a Minecraft speedrun. Normally, the seed is completely random, and even with a good strategy, results may vary. However, if you use the same seed each time, you can make direct comparisons between your speedrunning techniques to see if you can improve your time.  
  
It is the same with PyTorch model training. PyTorch uses pseudorandom numbers (for example, to initialize weights), so each time you adjust the model and retrain it, the results can vary because of the randomness alone.  
If you fix the random seed, the value that initializes the random number generator, you get the same sequence of random numbers each time, so you can tell whether a change actually improved your model.


In [None]:
# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B= torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.3786, 0.1083, 0.9574, 0.1207],
        [0.5193, 0.0936, 0.2997, 0.6653],
        [0.6320, 0.3516, 0.0333, 0.5107]])
tensor([[0.4681, 0.4859, 0.9973, 0.1444],
        [0.5047, 0.4975, 0.5073, 0.5576],
        [0.2862, 0.1979, 0.0947, 0.6266]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Let's make some random but reproducible tensors

# Set the random seed
RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

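# torch.manual_seed() resets the random number generator; every random call after it advances the generator.
# To repeat the same values, set the seed again right before each call (in a notebook, do this inside each cell that needs it).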

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
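
The sketch below shows why the seed has to be set again before each call, and one alternative: a dedicated `torch.Generator` object, which keeps its random state separate from the global one. Variable names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)               # the generator has advanced, so this differs
print(torch.equal(first, second))       # False

torch.manual_seed(42)                   # resetting the seed restarts the sequence
first_again = torch.rand(3, 4)
print(torch.equal(first, first_again))  # True

# Two separate generators with the same seed produce the same values
generator_a = torch.Generator().manual_seed(42)
generator_b = torch.Generator().manual_seed(42)
from_a = torch.rand(3, 4, generator=generator_a)
from_b = torch.rand(3, 4, generator=generator_b)
print(torch.equal(from_a, from_b))      # True
```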


## Running tensors and PyTorch objects on GPUs (and making faster computations)  
  
GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes to make everything hunky dory.

### 1. Getting a GPU  
  
1. Easiest - Use Google Colab for a free GPU.
2. Use your own GPU - takes a little bit of setup & purchase of a GPU
3. Use cloud computing - GCP, AWS, Azure, these services allow you to rent computers in the cloud and access them.

In [None]:
!nvidia-smi

Wed Jul  9 22:57:39 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   48C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

### 2. Check for GPU access w/ PyTorch

In [None]:
import torch
torch.cuda.is_available()

True

Run on GPU if available, else default to CPU

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count the number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU
  
The reason we want our tensors/models on the GPU is because using a GPU results in faster computations.

In [None]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1,2,3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### 4. Moving tensors back to the CPU

In [None]:
# If tensor is on GPU, can't transform it to NumPy
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [None]:
# To fix the GPU tensor w/ NumPy issue, we can first set it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu()
tensor_back_on_cpu.numpy()

array([1, 2, 3])
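
Putting the pieces together, here is a sketch of a device-agnostic round trip. It runs on the GPU when one is available and on the CPU otherwise; `on_device`, `moved` and `result` are illustrative names.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

on_device = torch.ones(3, device=device)   # create the tensor directly on the target device
moved = torch.tensor([1, 2, 3]).to(device) # or create on the CPU and move it afterwards

# Both operands are on the same device, so the operation works;
# .cpu() is a no-op if the tensor is already on the CPU
result = (on_device * moved).cpu().numpy()
print(result)  # [1. 2. 3.]
```

Creating tensors directly with `device=device` avoids an extra copy from CPU to GPU, and calling `.cpu()` before `.numpy()` keeps the code working on both kinds of machines.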