### Imports

##### Notes: 
`torch.Tensor` - A class constructor that creates a tensor of a specified size, which is uninitialized by default and defaults to a float32 datatype.

`torch.tensor()` - A factory function that creates a tensor from existing data, which is always initialized and infers the datatype from the input.
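The difference between the two is easiest to see side by side. The snippet below is a minimal sketch, assuming PyTorch's default dtype has not been changed from `float32`; the variable names are only illustrative.

```python
import torch

# torch.Tensor(2, 3) allocates a 2x3 tensor; its values are whatever was in memory
uninitialized = torch.Tensor(2, 3)
print(uninitialized.shape, uninitialized.dtype)

# torch.tensor() copies the given data and infers the dtype from it
from_ints = torch.tensor([1, 2, 3])
from_floats = torch.tensor([1.0, 2.0, 3.0])
print(from_ints.dtype, from_floats.dtype)
```

Because the values in `uninitialized` are arbitrary, only its shape and dtype are worth inspecting.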

In [3]:
import torch
import numpy as np

### Initializing and Creating a Tensor


PyTorch tensors are typically created from existing data using `torch.tensor()`.

In [4]:
# scalar
scalar = torch.tensor(7)
scalar 

tensor(7)

In [5]:
scalar.ndim

0

In [6]:
# Get tensor back as python int
scalar.item()

7

In [7]:
# vectors
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [8]:
vector.ndim

1

In [9]:
vector.shape

torch.Size([2])

**MATRIX**

In [10]:
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
MATRIX 

tensor([[ 7,  8],
        [ 9, 10]])

In [11]:
MATRIX.ndim

2

In [12]:
MATRIX[1]

tensor([ 9, 10])

In [13]:
MATRIX.shape

torch.Size([2, 2])

**TENSOR**

In [14]:
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [15]:
TENSOR.ndim

3

In [16]:
TENSOR.shape

torch.Size([1, 3, 3])

In [17]:
torch.rand(size=(3, 4))

tensor([[0.0045, 0.6852, 0.5535, 0.9865],
        [0.3760, 0.8978, 0.6018, 0.6894],
        [0.3110, 0.8157, 0.7050, 0.4586]])

In [18]:
torch.rand(3, 4)

tensor([[0.6885, 0.9362, 0.0493, 0.3285],
        [0.2260, 0.0215, 0.0738, 0.0597],
        [0.8451, 0.9234, 0.9524, 0.7914]])

**Random Tensors**

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...`
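The loop above can be sketched with a single random weight. This is a toy illustration, not a full training setup: the data (`y = 2 * x`), the learning rate, and the number of steps are all assumptions chosen so the example converges quickly.

```python
import torch

torch.manual_seed(0)

# Toy data that follows y = 2 * x
x = torch.tensor([1.0, 2.0, 3.0])
y = 2 * x

# Start with a random number
weight = torch.rand(1, requires_grad=True)
learning_rate = 0.05

for step in range(50):
    # Look at the data: measure how wrong the current weight is
    prediction = weight * x
    loss = ((prediction - y) ** 2).mean()
    loss.backward()

    # Update the random number in the direction that reduces the loss
    with torch.no_grad():
        weight -= learning_rate * weight.grad
    weight.grad.zero_()

print(weight.item())  # ends up close to 2.0
```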

In [19]:
# Create a random tensor of size (3, 4)
# random_tensor = torch.rand(1, 10 , 10)
# random_tensor = torch.rand(1, 3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.1029, 0.2193, 0.4258, 0.5353],
        [0.8592, 0.0591, 0.5978, 0.4943],
        [0.6799, 0.6051, 0.6330, 0.8697]])

In [20]:
print(random_tensor.ndim)

2


In [21]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(3, 224, 224)) # color channels, height, width
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([3, 224, 224]), 3)

### Zeros and Ones Tensor

In [22]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [23]:
# Create a tensor of all ones
ones = torch.ones(3, 4)
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [24]:
ones.dtype

torch.float32

**Directly from data**

In [25]:
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
print(x_data)

tensor([[1, 2],
        [3, 4]])


**From a NumPy Array**

In [26]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
print(x_np)

tensor([[1, 2],
        [3, 4]])


**From another Tensor**

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

In [27]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.1497, 0.9118],
        [0.9367, 0.2235]]) 



**With Random or constant values**

shape is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.

In [28]:
shape = (2,3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor} \n")

Random Tensor: 
 tensor([[0.1266, 0.5012, 0.2305],
        [0.0656, 0.0513, 0.1110]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]]) 



### Creating a range of tensors and tensors-like

In [29]:
# use torch.arange()
 
# one_to_ten = torch.arange(0, 10)
one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [30]:
# Creating tensors like
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor Datatypes

**Note:** A wrong tensor datatype is one of the 3 big errors you'll run into with PyTorch & deep learning:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [85]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                            #    dtype=torch.float16, 
                               dtype=None, # what datatype is the tensor (e.g. float32 or float16)
                               device=None, # What device is your tensor on
                               requires_grad=False # Whether or not to track gradients
                               ) 
float_32_tensor

tensor([3., 6., 9.])

In [86]:
float_32_tensor.dtype

torch.float32

In [87]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [88]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
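The multiplication above works even though the dtypes differ because PyTorch promotes the operands to a common dtype. The sketch below uses `torch.result_type` to ask which dtype a mixed operation would produce; the tensor names are illustrative.

```python
import torch

float_16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
float_32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
int_64 = torch.tensor([1, 2, 3])

# Ask PyTorch which dtype a mixed operation would produce
print(torch.result_type(float_16, float_32))

# The results carry the promoted dtype
print((float_16 * float_32).dtype)
print((int_64 * float_32).dtype)
```

Not every operation promotes silently, so checking `tensor.dtype` is still the first thing to do when a datatype error appears.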

### Attributes of a Tensor

Tensor attributes describe their shape, datatype, and the device on which they are stored.

1. Tensors not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors not the right shape - to get the shape of a tensor, use `tensor.shape`
3. Tensors not on the right device - to get the device of a tensor, use `tensor.device`

In [35]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


In [36]:
some_tensor = torch.rand(3, 4)
some_tensor

tensor([[0.6914, 0.8281, 0.4172, 0.1646],
        [0.1191, 0.6375, 0.4355, 0.7285],
        [0.2973, 0.3286, 0.1865, 0.6102]])

In [37]:
some_tensor.size, some_tensor.size(), some_tensor.shape

(<function Tensor.size>, torch.Size([3, 4]), torch.Size([3, 4]))

### Operations on Tensors

Over 1200 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), sampling and more are comprehensively described [here](https://docs.pytorch.org/docs/stable/torch.html).

By default, tensors are created on the **CPU**. To use an accelerator, first check that one is available, then explicitly move the tensor to it with the `.to` method.

Moving large tensors across devices can be expensive in terms of time and memory!


In [38]:
# We move our tensor to the current accelerator if available
if torch.accelerator.is_available():
    tensor = tensor.to(torch.accelerator.current_accelerator())
print(tensor)

tensor([[0.4252, 0.5038, 0.6527, 0.3586],
        [0.9153, 0.5130, 0.5783, 0.5227],
        [0.5210, 0.7752, 0.2019, 0.0740]], device='cuda:0')


**Few Operations**

In [39]:
shape = (2,3,)

tensor = torch.rand(shape)
print(tensor)

# Operations
print(torch.is_tensor(tensor))

tensor([[0.7831, 0.9448, 0.5845],
        [0.4244, 0.3711, 0.2625]])
True


**Standard Numpy-Like Indexing and Slicing**

In [40]:
tensor = torch.ones(5,4)
print(f"First Row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)

First Row: tensor([1., 1., 1., 1.])
First column: tensor([1., 1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1., 1.])
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


**Joining Tensors**

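`torch.cat` concatenates a sequence of tensors along an existing dimension. All tensors must have the same shape except in the dimension being joined.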

In [41]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
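To make the role of `dim` concrete, here is a small sketch comparing `torch.cat` (joins along an existing dimension) with `torch.stack` (joins along a new one). The tensors `a` and `b` are made up for illustration.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

# cat keeps the number of dimensions and grows the chosen one
cat_rows = torch.cat([a, b], dim=0)
cat_cols = torch.cat([a, b], dim=1)

# stack inserts a new dimension at the chosen position
stacked = torch.stack([a, b], dim=0)

print(cat_rows.shape, cat_cols.shape, stacked.shape)
```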


**Arithmetic Operations**

In [42]:
# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
# `tensor.T` returns the transpose of a tensor
print(tensor)

y1 = tensor @ tensor.T
print(f"y1:\n {y1}")

y2 = tensor.matmul(tensor.T)
print(f"y2:\n {y2}")

y3 = torch.rand_like(y1)
torch.matmul(tensor, tensor.T, out=y3)

# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)

z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
y1:
 tensor([[3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.]])
y2:
 tensor([[3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.],
        [3., 3., 3., 3., 3.]])


tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])

**Single-element Tensors**

If you have a one-element tensor, for example by aggregating all values of a tensor into one value, you can convert it to a Python numerical value using `item()`:

In [43]:
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))

15.0 <class 'float'>


**In-Place Operations** 

Operations that store the result in the operand are called in-place. They are denoted by a `_` suffix. For example, `x.copy_(y)` and `x.t_()` will change `x`.

**Note**: 
In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.

In [44]:
print(f"{tensor} \n")
tensor.add_(5)
# tensor.sub_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
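The sketch below illustrates both halves of the note: an in-place operation changes the existing tensor (so every name pointing at it sees the change), and autograd refuses an in-place update on a leaf tensor that requires gradients. The variable names and the `in_place_rejected` flag are illustrative only.

```python
import torch

values = torch.ones(3)
alias = values        # second name for the same tensor

values.add_(5)        # in-place: the shared tensor is modified
print(alias)          # alias sees the change

values = values + 1   # out-of-place: a new tensor is created
print(alias, values)  # alias still holds the old tensor

# In-place updates on a leaf tensor that tracks gradients are rejected
weights = torch.ones(3, requires_grad=True)
in_place_rejected = False
try:
    weights.add_(1)
except RuntimeError as error:
    in_place_rejected = True
    print(f"In-place update rejected: {error}")
```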


### Bridge with NumPy

Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

- Data in numpy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`
- PyTorch tensor -> Numpy -> `torch.Tensor.numpy()`

**Tensor to NumPy array**

In [45]:
tensor = torch.ones(5)
print(f"tensor: {tensor}")

numpy_array = tensor.numpy()
print(f"array: {numpy_array}")

tensor, numpy_array

print(f"Datatype of numpy array: {numpy_array.dtype}")
print(f"Datatype of tensor: {tensor.dtype}")

tensor: tensor([1., 1., 1., 1., 1.])
array: [1. 1. 1. 1. 1.]
Datatype of numpy array: float32
Datatype of tensor: torch.float32


A change in the tensor reflects in the NumPy array.

In [46]:
tensor.add_(1)
print(f"tensor: {tensor}")
print(f"array: {numpy_array}")

tensor: tensor([2., 2., 2., 2., 2.])
array: [2. 2. 2. 2. 2.]


In [47]:
# Reassigning creates a new tensor, so numpy_array keeps the old values
tensor = tensor + 1
tensor, numpy_array

(tensor([3., 3., 3., 3., 3.]), array([2., 2., 2., 2., 2.], dtype=float32))

**NumPy array to Tensor**

In [48]:
# Warning: NumPy arrays are created as float64 by default, and
# torch.from_numpy() keeps that datatype in the resulting tensor.
np_array = np.ones(5)
t = torch.from_numpy(np_array)
# t = torch.from_numpy(np_array).type(torch.float32)  # convert to float32 if needed
np_array, t

(array([1., 1., 1., 1., 1.]),
 tensor([1., 1., 1., 1., 1.], dtype=torch.float64))

In [49]:
print(f"Datatype of numpy array: {np_array.dtype}")
print(f"Datatype of tensor: {t.dtype}")

Datatype of numpy array: float64
Datatype of tensor: torch.float64


A change in the NumPy array reflects in the tensor.

In [50]:
np.add(np_array, 1, out=np_array)
print(f"t: {t}")
print(f"np_array: {np_array}")

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
np_array: [2. 2. 2. 2. 2.]
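Memory sharing depends on how the tensor was created. As a minimal sketch (the array and variable names are illustrative), `torch.from_numpy` shares memory with the array, while `torch.tensor` copies the data:

```python
import numpy as np
import torch

source = np.zeros(3)

shared = torch.from_numpy(source)  # shares memory with the array
copied = torch.tensor(source)      # makes an independent copy

source[0] = 42.0
print(shared)  # first element follows the array
print(copied)  # first element is still 0
```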


In [51]:
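# Reassigning creates a new array, so t keeps the old values
np_array = np_array + 1
np_array, t

(array([3., 3., 3., 3., 3.]),
 tensor([2., 2., 2., 2., 2.], dtype=torch.float64))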

### Manipulating Tensors (tensor operations)

Tensor operations include:
- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication 

In [None]:
# Create a Tensor and add 10 to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
tensor

tensor([1, 2, 3])

In [None]:
# Subtract 10
tensor - 10

tensor([-9, -8, -7])

In [None]:
# PyTorch in-built functions
torch.mul(tensor, 10)

tensor([10, 20, 30])

**MATRIX MULTIPLICATION**

https://www.matrixmultiplication.xyz

There are two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (dot product)

The two main rules to remember for matrix multiplication are:

1. The **inner dimensions** must match:
- (3, 2) @ (3, 2) won't work
- (2, 3) @ (3, 2) will work
- (3, 2) @ (2, 3) will work

2. The resulting matrix has the shape of the **outer dimensions**:
- (2, 3) @ (3, 2) -> (2, 2)
- (3, 2) @ (2, 3) -> (3, 3)
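Both rules can be checked in a few lines. `matmul_output_shape` below is a hypothetical helper written for this note (it is not part of PyTorch) and it only covers the 2-D case; the loop compares its answer with what `torch.matmul` actually returns.

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Hypothetical helper: output shape of a 2-D matmul, or None if the inner dimensions differ."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:  # rule 1: inner dimensions must match
        return None
    return (rows_a, cols_b)  # rule 2: result takes the outer dimensions

for shape_a, shape_b in [((3, 2), (3, 2)), ((2, 3), (3, 2)), ((3, 2), (2, 3))]:
    expected = matmul_output_shape(shape_a, shape_b)
    if expected is None:
        print(f"{shape_a} @ {shape_b} won't work")
    else:
        actual = tuple(torch.matmul(torch.rand(shape_a), torch.rand(shape_b)).shape)
        print(f"{shape_a} @ {shape_b} -> {actual} (expected {expected})")
```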

In [None]:
# torch.matmul(torch.rand(2, 3), torch.rand(2, 3))
torch.matmul(torch.rand(2, 3), torch.rand(3, 2))

tensor([[0.3965, 0.6830],
        [0.2702, 0.3799]])

In [None]:
# Element wise
print(tensor, "*" ,tensor)
print(f"Equals: {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [None]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [None]:
tensor

tensor([1, 2, 3])

In [None]:
# Matrix multiplication by hand
1*1 + 2*2 + 3*3

14

In [None]:
%%time
# Matrix multiplication by hand 
# (avoid Python for loops for tensor math where possible; they are much slower than built-in operations)
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
value

CPU times: total: 0 ns
Wall time: 1.01 ms


tensor(14)

In [None]:
%%time
torch.matmul(tensor, tensor)

CPU times: total: 0 ns
Wall time: 0 ns


tensor(14)

### One of the most common errors in deep learning: Shape errors

In [None]:
# Shape for matrix multiplication
tensor_A = torch.tensor([[1, 2,],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]])

# torch.mm does the same job as torch.matmul for 2-D tensors (it does not broadcast)
torch.matmul(tensor_A, tensor_B) # throws an error

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

To fix tensor shape issues, we can manipulate the shape of one of our tensor using a **transpose**.
A **transpose** switches the axes or dimensions of a given tensor.

In [None]:
tensor_B, tensor_B.shape

In [None]:
tensor_B.T, tensor_B.T.shape

In [None]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n") 
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output) 
print(f"\nOutput shape: {output.shape}")

### Finding the min, max, mean, sum, etc (tensor aggregation)

In [None]:
# create a tensor 
x = torch.arange(1, 100, 10)
x

In [None]:
# Find the min
torch.min(x), x.min()

In [None]:
# Find the max
torch.max(x), x.max()

In [None]:
# Find the mean - note: torch.mean() requires a floating point (or complex) datatype to work
# torch.mean(x.type(torch.float32)), x.mean() # datatype error because x is int64 (long)
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()
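The datatype error and two ways around it can be sketched as follows. The snippet rebuilds `x` so it runs on its own; the `dtype` keyword of `torch.mean` casts the input before the mean is computed.

```python
import torch

x = torch.arange(1, 100, 10)  # int64 by default

# Calling mean() on an integer tensor raises a RuntimeError
try:
    x.mean()
except RuntimeError as error:
    print(f"mean() failed on {x.dtype}: {error}")

# Option 1: cast first, then take the mean
mean_via_cast = x.float().mean()

# Option 2: let torch.mean cast the input through its dtype argument
mean_via_dtype = torch.mean(x, dtype=torch.float32)

print(mean_via_cast, mean_via_dtype)
```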

In [None]:
# Find the sum
torch.sum(x), x.sum()

### Finding the positional min and max

You can also find the index of a tensor where the max or minimum occurs with `torch.argmax()` and `torch.argmin()` respectively.

In [None]:
x

In [None]:
# Find the position in the tensor that has the minimum value with argmin() -> returns the index where the minimum value occurs
x.argmin(), x.argmax()

In [None]:
x[0], x[9]

### Reshaping, Stacking, squeezing and unsqueezing tensors

- Reshaping - reshapes an input tensor to a defined shape

- View - returns a view of an input tensor in a certain shape but keeps the same memory as the original tensor

- Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)

- Squeeze - removes all dimensions of size 1 from a tensor

- Unsqueeze - adds a dimension of size 1 to a target tensor

- Permute - returns a view of the input with dimensions permuted (swapped) in a certain way
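One practical difference between reshaping and viewing: `view` only works when the requested shape is compatible with the tensor's memory layout, while `reshape` falls back to copying the data when it has to. The sketch below shows this on a transposed (non-contiguous) tensor; the names are illustrative.

```python
import torch

x = torch.arange(6).reshape(2, 3)
transposed = x.T  # a non-contiguous view with shape (3, 2)
print(transposed.is_contiguous())

# view() cannot flatten a non-contiguous tensor
try:
    transposed.view(6)
except RuntimeError as error:
    print(f"view failed: {error}")

# reshape() works here, but it has to copy the data to do so
flat = transposed.reshape(6)
flat[0] = 100
print(flat)
print(x[0, 0])  # unchanged, because flat is a copy in this case
```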

In [None]:
# Let's create a tensor
import torch 
x = torch.arange(1., 10.)
x, x.shape

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1, 9) # the new shape must hold the same number of elements (1 * 9 = 9)
x_reshaped, x_reshaped.shape

In [None]:
# Change the view 
z = x.view(1, 9)
z, z.shape

In [None]:
# Changing z changes x ( because a view of a tensor shares the same memory as the original input)
z[:, 0] =  5
z, x

In [None]:
x_stacked = torch.stack([x, x, x, x], dim=0)
x_stacked

**`torch.squeeze()`** - removes all single dimensions from a target tensor

(I remember this as squeezing the tensor so that only dimensions larger than 1 remain.)

In [None]:
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimension from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

**`torch.unsqueeze()`** - adds a single dimension to a target tensor at a specific dim ( dimension )

In [None]:
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

## Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

**`torch.permute()`** - rearrange the dimensions of a target tensor in specified order

In [None]:
# x_original = torch.rand(size=(3, 224, 224)) # [color channel, height, width]
x_original = torch.rand(size=(224, 224, 3)) # [height, width, color channel]

# Permute the original tensor to rearrange the axis (or dim) order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

# The permuted tensor is a view: it shares the same memory as the original tensor
print(f"Previous tensor: {x_original}")
print(f"Previous shape: {x_original.shape}")
print(f"New tensor: {x_permuted}")
print(f"New shape: {x_permuted.shape}")

In [None]:
x_original[0, 0, 1]
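Because a permuted tensor is a view, writing to the original also changes the permuted tensor. A minimal sketch with a small made-up "image" (sizes and values are arbitrary):

```python
import torch

image = torch.zeros(4, 5, 3)             # [height, width, color channel]
channels_first = image.permute(2, 0, 1)  # [color channel, height, width]

# Write through the original tensor
image[1, 2, 0] = 9.0

# The same element shows up in the permuted view at the rearranged index
print(channels_first.shape)
print(channels_first[0, 1, 2])
```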

### Indexing (selecting data from tensors)
Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# Create a tensor
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

In [None]:
# Let's index on our new tensor
x[0]

In [None]:
# Let's index on the middle bracket (dim=1)
x[0][0]

In [None]:
# Let's index on the innermost bracket (last dimension)
x[0][0][0]

In [None]:
# You can also use ":" to select "all" of a target dimension
x[:, 0]

In [None]:
# Get all values of the 0th and 1st dimensions but only index 1 of the 2nd dimension
x[:, :, 1]

In [None]:
# Get all values of the 0th dimension but only the index 1 value of the 1st and 2nd dimensions
print(x)
print(x[:, 1, 1])

In [None]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension
x[0, 0, :]

### Reproducibility (Trying to take random out of random)

**Extra Resource**
https://docs.pytorch.org/docs/stable/notes/randomness.html#reproducibility

In short how a neural network learns: 
 
`start with random numbers -> tensor operations -> update random numbers to try and make better representations of the data -> again -> again... `

To reduce the randomness in neural networks and PyTorch, we use the concept of a random seed.

Essentially, what the random seed does is "flavour" the randomness.

In [None]:
import torch

# Create two random tensor

random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

In [None]:
# Let's make some random but reproducible tensors
import torch

# Set the random seed
RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
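Re-seeding the global generator before every call works, but it affects all later random calls too. An alternative sketch, assuming you are happy to pass a generator explicitly, is to give each call its own `torch.Generator`:

```python
import torch

# Two independent generators seeded with the same value
generator_a = torch.Generator().manual_seed(42)
generator_b = torch.Generator().manual_seed(42)

sample_a = torch.rand(3, 4, generator=generator_a)
sample_b = torch.rand(3, 4, generator=generator_b)

# Same seed, same sequence of random numbers
print(torch.equal(sample_a, sample_b))
```

This leaves the global random state untouched, which can be handy when only part of a program needs to be reproducible.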


### Accessing a GPU

`nvidia-smi`

In [80]:
!nvidia-smi

Sun Sep  7 13:56:30 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 576.52                 Driver Version: 576.52         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3050 ...  WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   43C    P5              5W /   75W |     733MiB /   4096MiB |     34%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [52]:
# Checking if GPU driver and CUDA is enabled and accessible by pytorch.
torch.cuda.is_available()

True

**Note**: In PyTorch, it's best practice to write device agnostic code. This means code that'll run on CPU (always available) or GPU (if available).

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU

The reason we want our tensor/models on the GPU is because using GPU results in faster computations.

In [56]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU 
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [57]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### 4. Moving tensors back to the CPU 

In [58]:
# If tensor is on GPU, can't transform it to NumPy
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [60]:
# To fix the GPU tensor with NumPy issue, we can first move it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [61]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
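The round trip above can be wrapped so the same code runs with or without a GPU. `to_numpy` is a hypothetical helper written for this note, not a PyTorch function; it also calls `detach()` so it works on tensors that track gradients.

```python
import torch

# Device-agnostic setup: falls back to the CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

def to_numpy(tensor):
    """Hypothetical helper: return a NumPy array from a tensor on any device."""
    return tensor.detach().cpu().numpy()

tensor_on_device = torch.tensor([1, 2, 3]).to(device)
array = to_numpy(tensor_on_device)
print(array, type(array))
```

Calling `.cpu()` on a tensor that is already on the CPU returns it as-is, so the helper adds no extra copy in that case.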