In [2]:
import torch
print(torch.__version__)

2.7.0+cu118


### **Intro to Tensors** 

**Creating Tensors**

In [50]:
#Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [51]:
scalar.ndim

0

In [52]:
#Vector
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [53]:
vector.ndim

1

In [54]:
vector.shape

torch.Size([2])

In [55]:
#MATRIX
MATRIX = torch.tensor([[1 , 2],
                       [3 , 4]])
MATRIX

tensor([[1, 2],
        [3, 4]])

In [56]:
MATRIX.ndim

2

In [57]:
MATRIX.shape

torch.Size([2, 2])

In [58]:
#Tensor
TENSOR = torch.tensor([[
    [1 , 2 , 3],
    [4 , 5 , 6],
    [7 , 8 , 9]
]])

In [59]:
TENSOR.ndim

3

In [60]:
TENSOR.shape

torch.Size([1, 3, 3])

**Random Tensors**  
We've established tensors represent some form of data.  
  
And machine learning models such as neural networks manipulate and seek patterns within tensors.  
  
But when building machine learning models with PyTorch, it's rare you'll create tensors by hand (like what we've been doing).  
  
Instead, a machine learning model often starts out with large random tensors of numbers and adjusts these random numbers as it works through data to better represent it.  
  
In essence:  
  
Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...  

In [61]:
#Create a random tensor of size (3,4)
random_tensor = torch.rand(3 , 4)
random_tensor

tensor([[0.0619, 0.1091, 0.1871, 0.2883],
        [0.7423, 0.4650, 0.0242, 0.5491],
        [0.4984, 0.9962, 0.9743, 0.9429]])

In [62]:
random_tensor.ndim

2

In [40]:
random_tensor.shape

torch.Size([3, 4])

In [41]:
#Random tensor with a shape similar to an image tensor
random_image_tensor = torch.rand(size=(3,224,224))  #colour channels (R,G,B), height, width

In [42]:
random_image_tensor

tensor([[[0.9681, 0.8757, 0.3956,  ..., 0.3941, 0.6442, 0.2677],
         [0.0655, 0.1788, 0.2567,  ..., 0.9253, 0.4347, 0.9333],
         [0.5591, 0.3680, 0.7631,  ..., 0.8198, 0.4463, 0.8901],
         ...,
         [0.5475, 0.8664, 0.9954,  ..., 0.8089, 0.8798, 0.8404],
         [0.3772, 0.1607, 0.2592,  ..., 0.1331, 0.1471, 0.1216],
         [0.1494, 0.7423, 0.9972,  ..., 0.8657, 0.2837, 0.2063]],

        [[0.5937, 0.5048, 0.2155,  ..., 0.2482, 0.1480, 0.9156],
         [0.9484, 0.3724, 0.1656,  ..., 0.5479, 0.1754, 0.2026],
         [0.5767, 0.9453, 0.4056,  ..., 0.2314, 0.0300, 0.2972],
         ...,
         [0.2924, 0.9555, 0.2533,  ..., 0.7689, 0.7517, 0.4501],
         [0.4013, 0.6907, 0.3460,  ..., 0.9442, 0.2465, 0.5752],
         [0.4359, 0.9969, 0.0229,  ..., 0.0893, 0.8634, 0.0739]],

        [[0.6740, 0.4404, 0.7405,  ..., 0.3696, 0.0354, 0.9212],
         [0.9091, 0.9139, 0.2474,  ..., 0.5654, 0.7006, 0.2083],
         [0.9739, 0.1725, 0.4442,  ..., 0.7440, 0.1591, 0.

Example:  
![Failed To Load](./images/img16.png)

**Zeros and Ones**

In [43]:
#Creating a tensor of all zeros 
zeros = torch.zeros((3,4))

In [44]:
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [45]:
#Creating a tensor of all ones 
ones = torch.ones((3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [46]:
#The default data type is torch.float32
zeros.dtype, ones.dtype

(torch.float32, torch.float32)

In [47]:
ones = torch.ones(size = (3,4), dtype=torch.int)
ones

tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]], dtype=torch.int32)

**Creating a Range and Tensors-Like**

In [48]:
#Using torch.arange(start, end): end is exclusive, so this holds 0 to 10 despite the variable name
one_to_ten = torch.arange(0 , 11)
one_to_ten

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [49]:
tensor_range = torch.arange(start = 0, end = 1000, step = 100)
tensor_range

tensor([  0, 100, 200, 300, 400, 500, 600, 700, 800, 900])

In [63]:
#Creating a tensor like another tensor
zero_tens = torch.zeros_like(input = one_to_ten) # Same shape (and dtype) as input
zero_tens

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### **Tensor Datatypes**

In [67]:
#Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype= torch.float32,  #Higher precision, slower computation
                               device = None,   #What device is your tensor on
                               requires_grad= False  #Whether or not to track gradients for operations on this tensor
                               )
float_32_tensor

tensor([3., 6., 9.])

In [68]:
float_32_tensor.dtype

torch.float32

In [69]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor.dtype

torch.float16

In [70]:
#Or you can specify params:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype= torch.float16, 
                               device = None,  
                               requires_grad= False  
                               )
float_16_tensor.dtype

torch.float16
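To make the precision trade-off concrete, here is a minimal sketch comparing the two datatypes. It assumes only `torch` is installed and uses `element_size()` and `torch.finfo()` to inspect storage size and precision.

```python
import torch

# Same values stored at two precisions
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
float_16_tensor = float_32_tensor.type(torch.float16)

# element_size() reports the number of bytes used per element
print(float_32_tensor.element_size())  # 4
print(float_16_tensor.element_size())  # 2

# torch.finfo() reports the numerical limits of a floating-point dtype;
# a larger eps means coarser precision
print(torch.finfo(torch.float32).eps)
print(torch.finfo(torch.float16).eps)
```

Half precision uses half the memory per element, at the cost of representing fewer distinct values.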

### **Getting Information from Tensors**  
Once you've created tensors (or someone else or a PyTorch module has created them for you), you might want to get some information from them.  
  
We've seen these before, but three of the most common attributes you'll want to find out about tensors are:  
  
* `shape` - what shape is the tensor? (some operations require specific shape rules)  
* `dtype` - what datatype are the elements within the tensor stored in?  
* `device` - what device is the tensor stored on? (usually GPU or CPU)  

Let's create a random tensor and find out details about it.  

In [71]:
#Create random tensor
some_tensor = torch.rand(3,3)

print(some_tensor)

print(f"The shape of tensor is : {some_tensor.shape}")
print(f"The device of tensor is : {some_tensor.device}")
print(f"The datatype of tensor is : {some_tensor.dtype}")

tensor([[0.1309, 0.6453, 0.7077],
        [0.3643, 0.1649, 0.2705],
        [0.2177, 0.2710, 0.2027]])
The shape of tensor is : torch.Size([3, 3])
The device of tensor is : cpu
The datatype of tensor is : torch.float32


### **Manipulating Tensors**  
In deep learning, data (images, text, video, audio, protein structures, etc.) gets represented as tensors.  
  
A model learns by investigating those tensors and performing a series of operations (possibly millions or more) on them to create a representation of the patterns in the input data.  
  
These operations are often a wonderful dance between:  
  
* Addition  
* Subtraction  
* Multiplication (element-wise)  
* Division  
* Matrix multiplication  

And that's it. Sure, there are a few more here and there, but these are the basic building blocks of neural networks.  
  
Stacking these building blocks in the right way, you can create the most sophisticated of neural networks (just like Lego!).  

**Basic Operations**

In [72]:
#Create a tensor and add 10 to it
tensor = torch.tensor([1 , 2, 3])
tensor + 10

tensor([11, 12, 13])

In [73]:
#Create a tensor and multiply by 10
tensor * 10

tensor([10, 20, 30])

In [74]:
#Subtract 10
tensor - 10

tensor([-9, -8, -7])

In [75]:
#Using built-in functions
torch.add(tensor , 10)

tensor([11, 12, 13])
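The cells above cover addition, multiplication and subtraction. As a small sketch of the remaining basic operation (division) and of the function forms of each operator, assuming the same `tensor` as above:

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Dividing an integer tensor returns a floating-point tensor
divided = tensor / 10
print(divided.dtype)  # torch.float32

# Function forms of the operators
print(torch.mul(tensor, 10))
print(torch.sub(tensor, 10))
print(torch.div(tensor, 10))

# None of these change `tensor` itself; reassign to keep a result
print(tensor)  # tensor([1, 2, 3])
tensor = tensor * 10
print(tensor)  # tensor([10, 20, 30])
```

The operators (`+`, `-`, `*`, `/`) are usually easier to read than the function forms, so which one to use is mostly a matter of style.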

**Matrix Multiplication**  
  
There are two main ways of performing multiplication on tensors:  
  
**1. Element-wise multiplication**

In [76]:
#Element wise operation

print(tensor , "*" , tensor)
print(tensor * tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3])
tensor([1, 4, 9])


In [77]:
#Matrix multiplication of two 1-D tensors gives their dot product: 1*1 + 2*2 + 3*3 = 14
torch.matmul(tensor, tensor)

tensor(14)

**2. Matrix multiplication**  
PyTorch implements matrix multiplication in the `torch.matmul()` function.  
  
The two main rules to remember for matrix multiplication are:  
  
1. The inner dimensions must match:  
   * `(3, 2) @ (3, 2)` won't work  
   * `(2, 3) @ (3, 2)` will work  
   * `(3, 2) @ (2, 3)` will work  
2. The resulting matrix has the shape of the outer dimensions:  
   * `(2, 3) @ (3, 2) -> (2, 2)`  
   * `(3, 2) @ (2, 3) -> (3, 3)`  

> Note: `@` in Python is the operator for matrix multiplication.  
> Note: Shape errors are among the most common errors in deep learning.
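The two rules can be checked directly on random tensors. This is a small sketch; the names `a`, `b` and `shape_error` are just illustrative.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match, output takes the outer dimensions
print((a @ b).shape)  # torch.Size([2, 2])
print((b @ a).shape)  # torch.Size([3, 3])

# Inner dimensions (2 and 3) don't match, so this raises a RuntimeError
shape_error = None
try:
    b @ b
except RuntimeError as error:
    shape_error = error
    print(f"Shape error: {error}")
```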

In [79]:
#Shapes for matrix multiplication
tensor_A = torch.tensor([
    [1 , 2 ],
    [3 , 4 ],
    [5 , 6 ]
])

tensor_B = torch.tensor([
    [7 , 10],
    [8 , 11],
    [9 , 12]
])

# torch.mm(tensor_A , tensor_B)  # for 2-D tensors, torch.mm gives the same result as torch.matmul()
torch.matmul(tensor_A ,  tensor_B)  # raises an error: the inner dimensions (2 and 3) don't match

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [80]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

We can make matrix multiplication work between tensor_A and tensor_B by making their inner dimensions match.  
  
One of the ways to do this is with a transpose (switch the dimensions of a given tensor).  
  
You can perform transposes in PyTorch using either:  
  
1. torch.transpose(input, dim0, dim1) - where input is the desired tensor to transpose and dim0 and dim1 are the dimensions to be swapped.  
2. tensor.T - where tensor is the desired tensor to transpose. 
   
Let's try the latter.  

In [81]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))
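For comparison, here is a short sketch of the first option, `torch.transpose(input, dim0, dim1)`. For a 2-D tensor it should give the same result as `.T`.

```python
import torch

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# Swap dimension 0 and dimension 1
transposed = torch.transpose(tensor_B, 0, 1)

print(transposed.shape)                     # torch.Size([2, 3])
print(torch.equal(transposed, tensor_B.T))  # True
```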

In [82]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output) 
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape: torch.Size([3, 3])


**Tensor Aggregation (min, max, sum etc)**

In [83]:
#Create a tensor
x =  torch.arange(0, 100, 10)
x , x.shape

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), torch.Size([10]))

In [84]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error
print(f"Mean: {x.type(torch.float32).mean()}") # won't work without float datatype
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


In [85]:
torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)

(tensor(90), tensor(0), tensor(45.), tensor(450))
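To see why the cast to `torch.float32` is needed, this sketch catches the error that `mean()` raises on an integer tensor. The name `mean_error` is only for illustration.

```python
import torch

x = torch.arange(0, 100, 10)
print(x.dtype)  # torch.int64

# mean() needs a floating-point (or complex) dtype
mean_error = None
try:
    x.mean()
except RuntimeError as error:
    mean_error = error
    print(f"mean() failed: {error}")

# Casting first makes it work; .float() is shorthand for .type(torch.float32)
print(x.type(torch.float32).mean())  # tensor(45.)
print(x.float().mean())              # tensor(45.)
```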

**Positional min/max**  
You can also find the index at which the maximum or minimum value of a tensor occurs with `torch.argmax()` and `torch.argmin()` respectively.  
  
This is helpful in case you just want the position of the highest (or lowest) value and not the value itself.  

In [86]:
#Create a tensor
tensor = torch.arange(10 , 100 , 10)
print(f"Tensor : {tensor}")

# Returns index of max and min values
print(f"Index where the min value occurs {tensor.argmin()}")
print(f"Index where the max value occurs {tensor.argmax()}")

Tensor : tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])
Index where the min value occurs 0
Index where the max value occurs 8


**Change tensor datatype**  
As mentioned, a common issue with deep learning operations is having your tensors in different datatypes.  
  
If one tensor is in `torch.float64` and another is in `torch.float32`, some operations (matrix multiplication, for example) may raise an error.  
  
But there's a fix.  
  
You can change the datatype of a tensor using `torch.Tensor.type(dtype=None)`, where the `dtype` parameter is the datatype you'd like to use.  
  
First we'll create a tensor and check its datatype (the default for floating-point values is `torch.float32`).  

In [87]:
#Create a float tensor with the default datatype
tensor = torch.arange(10. , 100. , 10.)
tensor.dtype

torch.float32

In [88]:
#Now we'll create another tensor the same as before but change its datatype to torch.float16
tensor_float16 = tensor.type(torch.float16)
tensor_float16

tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)

In [89]:
# And we can do something similar to make a torch.int8 tensor
tensor_int8 = tensor.type(torch.int8)
tensor_int8

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)
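As a sketch of the kind of datatype problem described above: element-wise operations promote mixed datatypes automatically, while matrix multiplication expects both inputs to share a datatype. The names below are illustrative.

```python
import torch

f32 = torch.ones(3, dtype=torch.float32)
f64 = torch.ones(3, dtype=torch.float64)

# Element-wise addition promotes to the wider datatype
print((f32 + f64).dtype)  # torch.float64

a = torch.rand(2, 3, dtype=torch.float32)
b = torch.rand(3, 2, dtype=torch.float64)

# Matrix multiplication with mismatched datatypes raises a RuntimeError
dtype_error = None
try:
    torch.matmul(a, b)
except RuntimeError as error:
    dtype_error = error
    print(f"Datatype error: {error}")

# The fix: convert one tensor so both datatypes match
fixed = torch.matmul(a, b.type(torch.float32))
print(fixed.dtype)  # torch.float32
```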

### Reshaping, stacking, squeezing and unsqueezing

Often you'll want to reshape or change the dimensions of your tensors without actually changing the values inside them.

To do so, some popular methods are:

| Method | One-line description |
| ----- | ----- |
| [`torch.reshape(input, shape)`](https://pytorch.org/docs/stable/generated/torch.reshape.html#torch.reshape) | Reshapes `input` to `shape` (if compatible), can also use `torch.Tensor.reshape()`. |
| [`Tensor.view(shape)`](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html) | Returns a view of the original tensor in a different `shape` but shares the same data as the original tensor. |
| [`torch.stack(tensors, dim=0)`](https://pytorch.org/docs/1.9.1/generated/torch.stack.html) | Concatenates a sequence of `tensors` along a new dimension (`dim`), all `tensors` must be same size. |
| [`torch.squeeze(input)`](https://pytorch.org/docs/stable/generated/torch.squeeze.html) | Squeezes `input` to remove all the dimensions of size `1`. |
| [`torch.unsqueeze(input, dim)`](https://pytorch.org/docs/1.9.1/generated/torch.unsqueeze.html) | Returns `input` with a dimension value of `1` added at `dim`. | 
| [`torch.permute(input, dims)`](https://pytorch.org/docs/stable/generated/torch.permute.html) | Returns a *view* of the original `input` with its dimensions permuted (rearranged) to `dims`. | 

Why do any of these?

Because deep learning models (neural networks) are all about manipulating tensors in some way. And because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. These methods help you make sure the right elements of your tensors are mixing with the right elements of other tensors. 

Let's try them out.

In [90]:
#Lets create a tensor
x = torch.arange(0 , 10)
x , x.shape

(tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), torch.Size([10]))

In [91]:
#Add an extra dimension
x_reshaped = x.reshape(1 , 10)
x_reshaped, x_reshaped.shape

(tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]), torch.Size([1, 10]))

In [92]:
#Change the view
z = x.view(1,10)
z, z.shape

(tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]), torch.Size([1, 10]))

Remember though, changing the view of a tensor with `Tensor.view()` only creates a new view of the same tensor; no data is copied.  
  
So changing values through the view changes the original tensor too.  

In [93]:
z[:,0] = 5
z , x

(tensor([[5, 1, 2, 3, 4, 5, 6, 7, 8, 9]]),
 tensor([5, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
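If you need an independent copy instead, one option is to `clone()` the tensor first. A minimal sketch contrasting the two (the name `independent` is illustrative):

```python
import torch

x = torch.arange(0, 10)

z = x.view(1, 10)                    # shares memory with x
independent = x.clone().view(1, 10)  # separate copy of the data

# data_ptr() returns the memory address of the first element
print(x.data_ptr() == z.data_ptr())            # True
print(x.data_ptr() == independent.data_ptr())  # False

z[:, 0] = 5
print(x[0])               # tensor(5), changed through the view
print(independent[0, 0])  # tensor(0), unaffected
```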

In [94]:
#Stack tensors side by side: with dim=1, each copy of x becomes a column
x_stacked = torch.stack([x,x,x,x], dim = 1)
x_stacked

tensor([[5, 5, 5, 5],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4],
        [5, 5, 5, 5],
        [6, 6, 6, 6],
        [7, 7, 7, 7],
        [8, 8, 8, 8],
        [9, 9, 9, 9]])

In [95]:
#Stack tensors on top of each other: with dim=0, each copy of x becomes a row
x_stacked = torch.stack([x,x,x,x], dim = 0)
x_stacked

tensor([[5, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [96]:
# torch.squeeze() - removes all single dimensions from a target tensor
print(f"Previous Tensor {x_reshaped}")
print(f"Previous Tensor Shape {x_reshaped.shape}")

#Remove extra dimension
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"\nNew shape: {x_squeezed.shape}")

Previous Tensor tensor([[5, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
Previous Tensor Shape torch.Size([1, 10])

New tensor: tensor([5, 1, 2, 3, 4, 5, 6, 7, 8, 9])

New shape: torch.Size([10])


In [97]:
#torch.unsqueeze() - adds a single dimension to a target tensor at a specific dim
print(f"Previous Target {x_squeezed}")
print(f"Previous Target Shape {x_squeezed.shape}")

#Add extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim = 0)
print(f"New Tensor {x_unsqueezed}")
print(f"New Tensor Shape {x_unsqueezed.shape}")

Previous Target tensor([5, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Previous Target Shape torch.Size([10])
New Tensor tensor([[5, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
New Tensor Shape torch.Size([1, 10])
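The `dim` argument controls where the size-1 dimension is added or removed. A short sketch using a fresh tensor (the names `row`, `column` and `y` are illustrative):

```python
import torch

x = torch.arange(0, 10)       # shape [10]

row = x.unsqueeze(dim=0)      # new dimension at the front
column = x.unsqueeze(dim=1)   # new dimension at the end
print(row.shape)     # torch.Size([1, 10])
print(column.shape)  # torch.Size([10, 1])

y = torch.zeros(1, 10, 1)
print(y.squeeze().shape)       # torch.Size([10]), all size-1 dims removed
print(y.squeeze(dim=0).shape)  # torch.Size([10, 1]), only dim 0 removed
print(y.squeeze(dim=1).shape)  # torch.Size([1, 10, 1]), dim 1 has size 10 so nothing changes
```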


In [98]:
#You can also rearrange the order of axes values with torch.permute(input, dims)
x_original = torch.rand(size = (244,244,3))

#Permute the original tensor to rearrange dims
x_permuted = x_original.permute(2, 0 ,1)  # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([244, 244, 3])
New shape: torch.Size([3, 244, 244])


> Note: `permute()` returns a view, so the permuted and original tensors share the same memory.
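A quick way to confirm the shared memory is to change a value in the original and read it back through the permuted tensor. This sketch uses the same shapes as above; the value `123.0` is arbitrary.

```python
import torch

x_original = torch.rand(size=(244, 244, 3))
x_permuted = x_original.permute(2, 0, 1)

# Write through the original, read through the permuted view
x_original[0, 0, 0] = 123.0
print(x_original[0, 0, 0], x_permuted[0, 0, 0])  # both tensor(123.)

# Use clone() when you need a copy that doesn't track the original
x_copy = x_permuted.clone()
x_original[0, 0, 0] = 0.0
print(x_permuted[0, 0, 0])  # tensor(0.)
print(x_copy[0, 0, 0])      # tensor(123.)
```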

### **Indexing (selecting data from tensors)**   
Sometimes you'll want to select specific data from tensors (for example, only the first column or second row).  
  
To do so, you can use indexing.  
  
If you've ever done indexing on Python lists or NumPy arrays, indexing in PyTorch with tensors is very similar.  

In [99]:
#Create tensor
x = torch.arange(1 , 10).reshape(1,3,3)
x ,  x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [100]:
#Indexing values goes outer dimension -> inner dimension (check out the square brackets).

# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}") 
print(f"Second square bracket: {x[0][0]}") 
print(f"Third square bracket: {x[0][0][0]}")

First square bracket:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Second square bracket: tensor([1, 2, 3])
Third square bracket: 1


In [101]:
#You can also use : to specify "all values in this dimension" 
# and then use a comma (,) to add another dimension.

# Get all values of 0th dimension and the 0 index of 1st dimension
x[:, 0]

tensor([[1, 2, 3]])

In [102]:
# Get all values of 0th & 1st dimensions but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [103]:
# Get all values of the 0 dimension but only the 1 index value of the 1st and 2nd dimension
x[:, 1, 1]

tensor([5])
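A few more indexing patterns on the same tensor, as a sketch you can adapt. Each line combines integer indices with `:` slices.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# A single value: index 0 of dim 0, index 2 of dim 1, index 2 of dim 2
print(x[0, 2, 2])    # tensor(9)

# The last column across all rows
print(x[:, :, 2])    # tensor([[3, 6, 9]])

# The first row of the first (only) matrix
print(x[0, 0, :])    # tensor([1, 2, 3])

# Slices work too: rows 1 onwards, first two columns
print(x[0, 1:, :2])  # tensor([[4, 5], [7, 8]])
```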

### **PyTorch tensors & NumPy**

Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.  

The two main methods you'll want to use for NumPy to PyTorch (and back again) are: 
* [`torch.from_numpy(ndarray)`](https://pytorch.org/docs/stable/generated/torch.from_numpy.html) - NumPy array -> PyTorch tensor. 
* [`torch.Tensor.numpy()`](https://pytorch.org/docs/stable/generated/torch.Tensor.numpy.html) - PyTorch tensor -> NumPy array.

Let's try them out.

In [105]:
#Numpy array to tensor
import numpy as np 

array = np.arange(1.0 , 8.0)
tensor = torch.from_numpy(array)

array , tensor


(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

> **Note:** By default, NumPy creates floating-point arrays with the datatype `float64`, and if you convert such an array to a PyTorch tensor, it'll keep the same datatype (as above). 
>
> However, many PyTorch calculations default to using `float32`. 
> 
> So if you want to convert your NumPy array (float64) -> PyTorch tensor (float64) -> PyTorch tensor (float32), you can use `tensor = torch.from_numpy(array).type(torch.float32)`.

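A tensor created with `torch.from_numpy()` shares memory with the source array. The next cell doesn't modify that memory, though: `array = array + 1` creates a new array and rebinds the name `array`, so the tensor keeps its original values.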

In [106]:
# Change the array, keep the tensor
array = array + 1
array , tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))
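The difference between an in-place change and a reassignment is easy to miss, so here is a small sketch of both. It assumes NumPy and PyTorch are installed; `tensor_float32` is an illustrative name.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)  # shares memory with `array`

# In-place change: the tensor sees it
array += 1
print(tensor)  # values 2. to 8.

# Reassignment: `array` now names a new array, the tensor is untouched
array = array + 1
print(array)   # values 3. to 9.
print(tensor)  # still 2. to 8.

# Convert to float32 if the rest of your code expects it
tensor_float32 = torch.from_numpy(array).type(torch.float32)
print(tensor_float32.dtype)  # torch.float32
```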

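The same applies in the other direction: `tensor.numpy()` returns an array that shares memory with the (CPU) tensor, so the array only stays unchanged if you reassign `tensor` to a new tensor (for example `tensor = tensor + 1`) rather than modifying it in place.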

In [108]:
# Tensor to numpy 
tensor = torch.ones(7)
array = tensor.numpy()

tensor , array

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
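A minimal sketch of that behaviour for `tensor.numpy()`, using the in-place method `add_()` for comparison:

```python
import torch

tensor = torch.ones(7)
array = tensor.numpy()  # shares memory with `tensor`

# In-place change on the tensor shows up in the array
tensor.add_(1)
print(array)   # all 2.

# Reassignment creates a new tensor, so the array is left alone
tensor = tensor + 1
print(tensor)  # all 3.
print(array)   # still all 2.
```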

### **Reproducibility (trying to take the random out of random)**  
  
As you learn more about neural networks and machine learning, you'll start to discover how much randomness plays a part.  
  
Well, pseudorandomness that is. A computer, as designed, is fundamentally deterministic (each step is predictable), so the randomness it creates is simulated (though there is debate on this too).  
  
How does this relate to neural networks and deep learning then?  
  
We've discussed how neural networks start with random numbers to describe patterns in data (these numbers are poor descriptions) and then try to improve them using tensor operations (and a few other things we haven't discussed yet) to better describe patterns in data.  
  
In short:   
  
``start with random numbers -> tensor operations -> try to make better (again and again and again)``  
  
Although randomness is nice and powerful, sometimes you'd like there to be a little less randomness.  
  
Why?  
  
So you can perform repeatable experiments.  
  
For example, you create an algorithm capable of achieving X performance.  
  
And then your friend tries it out to verify you're not crazy.  
  
How could they do such a thing?  
  
That's where **reproducibility** comes in.  
  
In other words, can you get the same (or very similar) results on your computer running the same code as I get on mine?  
  
Let's see a brief example of reproducibility in PyTorch.  
  
We'll start by creating two random tensors. Since they're random, you'd expect them to be different, right? 

In [110]:
#Create random tensors 

tensor_A = torch.rand((3,4))
tensor_B = torch.rand((3,4))

print(tensor_A)
print(tensor_B)

print(tensor_A ==  tensor_B)

tensor([[0.5541, 0.8543, 0.3694, 0.4855],
        [0.1727, 0.2309, 0.6540, 0.4154],
        [0.7184, 0.9161, 0.3937, 0.5592]])
tensor([[0.5280, 0.5128, 0.9880, 0.5027],
        [0.4204, 0.1079, 0.6641, 0.2262],
        [0.0755, 0.4410, 0.0492, 0.3259]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [114]:
# Set the random seed
RANDOM_SEED=42 # try changing this to different values and see what happens to the numbers below
torch.manual_seed(seed=RANDOM_SEED) 
random_tensor_C = torch.rand(3, 4)

# Reset the seed before each rand() call that should reproduce the same values
# Without this, random_tensor_D would be different from random_tensor_C
torch.random.manual_seed(seed=RANDOM_SEED) 
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
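To see what happens without the reset, and one alternative to reseeding the global state, here is a hedged sketch. It uses a local `torch.Generator` passed through the `generator` argument of `torch.rand()`; the variable names are illustrative.

```python
import torch

RANDOM_SEED = 42

# Seed once, then draw twice: the second draw continues the sequence
torch.manual_seed(RANDOM_SEED)
first = torch.rand(3, 4)
second = torch.rand(3, 4)
print(torch.equal(first, second))  # False

# Reseeding restarts the sequence
torch.manual_seed(RANDOM_SEED)
first_again = torch.rand(3, 4)
print(torch.equal(first, first_again))  # True

# A local generator keeps its own state, separate from the global seed
gen = torch.Generator().manual_seed(RANDOM_SEED)
local_a = torch.rand(3, 4, generator=gen)
gen.manual_seed(RANDOM_SEED)
local_b = torch.rand(3, 4, generator=gen)
print(torch.equal(local_a, local_b))  # True
```

A local generator can be handy when you want reproducible values in one place without affecting random calls elsewhere in your code.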

### **Running tensors on GPUs (and making faster computations)**  
  
Deep learning algorithms require a lot of numerical operations.  
  
And by default these operations are often done on a CPU (central processing unit).  
  
However, there's another common piece of hardware called a GPU (graphics processing unit), which is often much faster at performing the specific types of operations neural networks need (matrix multiplications) than CPUs.  
  
Your computer might have one.  
  
If so, you should look to use it whenever you can to train neural networks because chances are it'll speed up the training time dramatically.  
  
There are a few ways to first get access to a GPU and secondly get PyTorch to use the GPU.  
  



In [115]:
!nvidia-smi

Thu Jun  5 08:44:19 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 576.02                 Driver Version: 576.02         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1650      WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   43C    P8              1W /   50W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [116]:
#Check for GPU access with PyTorch
torch.cuda.is_available()

True

If the above outputs `True`, PyTorch can see and use the GPU. If it outputs `False`, it can't see the GPU, and in that case you'll have to go back through the installation steps.  
  
Now, let's say you wanted to set up your code so it ran on the GPU if one was available and on the CPU otherwise.  
  
That way, if you or someone decides to run your code, it'll work regardless of the computing device they're using.  
  
Let's create a device variable to store what kind of device is available.

In [117]:
# Set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

### **Putting tensors (and models) on the GPU**  
You can put tensors (and models, we'll see this later) on a specific device by calling `to(device)` on them, where `device` is the target device you'd like the tensor (or model) to go to.  
  
Why do this?  
  
GPUs offer far faster numerical computing than CPUs do, and if a GPU isn't available, our device-agnostic code (see above) means it'll run on the CPU.  
  
Note: Putting a tensor on the GPU using `to(device)` (e.g. `some_tensor.to(device)`) returns a copy of that tensor, i.e. the same tensor will be on both CPU and GPU. To overwrite tensors, reassign them:  
  
`some_tensor = some_tensor.to(device)`  
  
Let's try creating a tensor and putting it on the GPU (if it's available).

In [119]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')
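Besides moving an existing tensor, you can create a tensor directly on the target device with the `device` argument. The sketch below is device-agnostic, so it should also run on a machine without a GPU; the variable names are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3])
tensor_on_device = tensor.to(device)

# Create a tensor on the target device from the start
created_on_device = torch.rand(3, 3, device=device)

print(tensor.device)  # cpu, the original stays where it was
print(tensor_on_device.device, created_on_device.device)

# Compute on the device, then bring the result back for NumPy
result = (tensor_on_device * 2).cpu().numpy()
print(result)  # [2 4 6]
```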

**Moving tensors back to the CPU**

In [120]:
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [123]:
# Instead, copy the tensor back to cpu
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [124]:
#The above returns a copy of the GPU tensor in CPU memory so the original tensor is still on GPU.
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')