In [1]:
%pip install torch torchvision torchaudio

Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch)
  Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.5.147 (from torch)
  Downloading nvidia_curand_cu12-10.3.5

In [1]:
!nvidia-smi

Thu Jul 10 19:06:30 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 566.24         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3070        On  |   00000000:01:00.0  On |                  N/A |
| 30%   48C    P8             20W /  220W |    7836MiB /   8192MiB |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

Let's check the `PyTorch` version

In [2]:
import torch
torch.__version__

'2.7.1+cu126'

Everything in this notebook was written for, and should be compatible with, the PyTorch version shown above.

## **Tensors**

Tensors are the fundamental building block of machine learning. They represent data numerically. For example, an image can be represented as a tensor with shape `[3, 224, 224]`, which means `[colour_channels, height, width]`.





### **Creating Tensors**

PyTorch has extensive documentation for the `torch.Tensor` class. Let's go through the basics.

A `torch.Tensor` is a multi-dimensional matrix containing elements of a single data type.



In [3]:
torch.tensor([[1,-1],[-1,1]])

tensor([[ 1, -1],
        [-1,  1]])

In [4]:
import numpy as np
torch.tensor(np.array([[1,2,3],[4,5,6]]))

tensor([[1, 2, 3],
        [4, 5, 6]])

#### **Scalar**

Let's start with a scalar. A scalar is a single number; in tensor language it is a zero-dimensional tensor.

In [5]:
scalar = torch.tensor(9)
scalar

tensor(9)

This means that although `scalar` holds a single number, its type is `torch.Tensor`. We can check the number of dimensions of a tensor using `ndim`.

In [6]:
scalar.ndim

0

If we want to extract the number from the tensor, we can convert it to a Python integer using the `item()` method.

In [7]:
scalar.item()

9

#### **Vector**

A vector is a one-dimensional tensor and can contain many numbers.

In [8]:
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [9]:
vector.ndim

1

You can tell the number of dimensions of a tensor by counting the opening square brackets `[` on the outside; you only need to count one side. Here there is only one.
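The bracket-counting rule is easy to check directly. Here is a minimal sketch; the variable names are made up for illustration and are not used elsewhere in this notebook.

```python
import torch

# ndim equals the number of opening brackets before the first number
zero_d = torch.tensor(5)             # no brackets
one_d = torch.tensor([5, 6])         # [
two_d = torch.tensor([[5, 6]])       # [[
three_d = torch.tensor([[[5, 6]]])   # [[[

for t in (zero_d, one_d, two_d, three_d):
    print(t.ndim, tuple(t.shape))
```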

In [10]:
vector.shape

torch.Size([2])

Since it has two elements, its shape is `[2]`.

#### **Matrix**

A matrix probably needs no definition (neither did a vector), but it's nice to reiterate the basic building blocks: a matrix is a two-dimensional tensor.

In [11]:
MATRIX = torch.tensor([[7,8],[9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

There's a reason I wrote `MATRIX` in all caps instead of `matrix`: `matrix` is used very frequently in code. Now, can you guess its number of dimensions?

In [12]:
MATRIX.ndim

2

and the shape?

In [13]:
MATRIX.shape

torch.Size([2, 2])

Since it's a $2\times2$ matrix

#### **Tensor**

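Well, a tensor is a tensor: as the physicists' joke goes, it is something that transforms like a tensor. For our purposes it is an n-dimensional array of numbers, so scalars, vectors and matrices are all tensors too.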

In [14]:
TENSOR = torch.tensor([[[1,2,3],[4,5,6], [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

`TENSOR` is in all caps here for the same reason as `MATRIX`.

In [15]:
TENSOR.ndim

3

In [16]:
TENSOR.shape

torch.Size([1, 3, 3])

We can make a $3 \times 3 \times 3$ tensor as well

In [17]:
TENSOR = torch.tensor([[[1,2,3],[4,5,6], [7,8,9]],[[1,2,3],[4,5,6], [7,8,9]],[[1,2,3],[4,5,6], [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]],

        [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]],

        [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [18]:
TENSOR.shape

torch.Size([3, 3, 3])

### **Random Tensors**



Machine learning models manipulate these tensors and look for patterns in them. Let's see how we can create random tensors.

In [19]:
random_tensor = torch.rand(size=(3, 4))
print(random_tensor)
print(f"Shape of tensor: {random_tensor.shape}")
print(f"Datatype of tensor: {random_tensor.dtype}")

tensor([[0.2485, 0.7105, 0.3127, 0.8816],
        [0.6174, 0.0922, 0.5848, 0.3300],
        [0.2171, 0.8162, 0.8548, 0.6319]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32


We can adjust it to any shape or size that we want

In [20]:
random_image_size_tensor = torch.rand(size=(224,224,3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

I will not print it, since it is very big, but you get the idea.

### **Zeros and Ones**

Sometimes we need tensors filled entirely with zeros or ones, for example for padding or masking. (An identity matrix is a different case; PyTorch has `torch.eye()` for that.) We can make them easily.

In [21]:
zeros = torch.zeros(size=(3,4))
print(zeros)
print(f"Shape of tensor: {zeros.shape}")
print(f"Datatype of tensor: {zeros.dtype}")

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32


We can do the same for ones:

In [22]:
ones = torch.ones(size=(3,4))
print(ones)
print(f"Shape of tensor: {ones.shape}")
print(f"Datatype of tensor: {ones.dtype}")

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32


### **Range in Tensors**

Sometimes we want a range of numbers, such as 1 to 10 or even 1 to 100. We can use `torch.arange(start, end, step)` for this:

- `start` = start of the range (inclusive)
- `end` = end of the range (exclusive)
- `step` = the difference between consecutive values

In [23]:
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
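To make the role of `end` and `step` concrete, here is a small sketch. The names `by_fives` and `one_to_ten` are only for illustration.

```python
import torch

# `end` is exclusive, so 10 itself is not included
print(torch.arange(start=0, end=10, step=1))

# A larger step skips values: 0, 5, 10, ..., 95
by_fives = torch.arange(start=0, end=100, step=5)
print(by_fives)
print(by_fives.shape)  # torch.Size([20])

# To include the upper bound, push `end` one step further
one_to_ten = torch.arange(start=1, end=11)
print(one_to_ten)
```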

Suppose you want a tensor of all zeros with the same shape as another tensor. The easiest way to do this is `torch.zeros_like()`:

In [24]:
ten_zeros = torch.zeros_like(input=zero_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
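Notice that the zeros above are integers, not floats. A short sketch of why: the `*_like` functions copy both the shape and the dtype of their input. The tensors `template` and `int_template` are hypothetical examples.

```python
import torch

template = torch.rand(size=(2, 3))            # float32 by default
int_template = torch.arange(start=0, end=4)   # int64 by default

zeros = torch.zeros_like(input=template)
ones = torch.ones_like(input=int_template)

print(zeros.shape, zeros.dtype)  # torch.Size([2, 3]) torch.float32
print(ones.shape, ones.dtype)    # torch.Size([4]) torch.int64
```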

### **Tensor Datatypes**

There are many different datatypes available in PyTorch; some are CPU-specific and some are GPU-specific. As a general rule of thumb, if you see `cuda` written anywhere, the tensor is being used on a GPU.

The default is `torch.float32` (also written `torch.float`), the 32-bit floating point type, but there are 16-bit and 64-bit floating point types as well.

These types differ in precision, i.e. the amount of detail used to describe a number. Lower-precision datatypes are generally faster to compute with, but can sacrifice some performance on evaluation metrics such as accuracy.
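One way to see the difference in precision is `torch.finfo`, which reports the numerical limits of a floating point dtype. This is a rough sketch; the variables `third_16` and `third_64` are just illustrative.

```python
import torch

# Bit width, machine epsilon and largest representable value per dtype
for dtype in (torch.float16, torch.float32, torch.float64):
    info = torch.finfo(dtype)
    print(f"{dtype}: bits={info.bits}, eps={info.eps}, max={info.max}")

# The same value stored at two precisions
third_16 = torch.tensor(1 / 3, dtype=torch.float16)
third_64 = torch.tensor(1 / 3, dtype=torch.float64)
print(third_16.item(), third_64.item())
```

The 16-bit value drifts away from 1/3 after a few decimal places, while the 64-bit value keeps full Python float precision.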

In [25]:
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, which infers the dtype from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the default device
                               requires_grad=False) # if True, operations performed on the tensor are recorded

print(float_32_tensor)
print(float_32_tensor.dtype)

tensor([3., 6., 9.])
torch.float32


In [26]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16

The broad idea is to keep the tensors that interact with each other at the same precision and on the same device.

#### **Getting Information**

We can also get information out of a tensor, which helps when debugging errors.

In [27]:
some_tensor = torch.rand(3,4)

print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Device tensor is stored on: {some_tensor.device}")

tensor([[0.6146, 0.0921, 0.5154, 0.0826],
        [0.9760, 0.8325, 0.9739, 0.7954],
        [0.3148, 0.6308, 0.4066, 0.1792]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor is stored on: cpu


### **Manipulating Tensors**

In deep learning, data (images, text, video, audio) is represented as tensors. A model learns by investigating those tensors and performing a series of operations on them to create a representation of the patterns in the input data.

In [28]:
# addition
t1 = torch.tensor([1,2,3])
t2 = torch.tensor([9,8,7])
t1 + t2


tensor([10, 10, 10])

In [29]:
t1+10

tensor([11, 12, 13])

In [30]:
# multiply by 10
t1*10

tensor([10, 20, 30])

In [31]:
# subtract
t1-10

tensor([-9, -8, -7])

PyTorch has some built-in functions as well, like `torch.mul()` (also available as `torch.multiply()`) and `torch.add()`, to perform basic operations.

In [32]:
torch.multiply(t1,10)

tensor([10, 20, 30])

#### **Matrix Multiplication**

This is the most commonly used operation. In PyTorch it is implemented by `torch.matmul()`.

The basic rules of matrix multiplication apply here as well: the inner dimensions must match, and the result takes the shape of the outer dimensions.
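As a quick reminder of those rules, here is a minimal sketch with two hypothetical random matrices `A` and `B`: a `(3, 2)` matrix times a `(2, 4)` matrix gives a `(3, 4)` matrix, and each entry is a row-by-column dot product.

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 4)

# Inner dimensions match (2 and 2), so the product is defined
# and takes the outer dimensions: (3, 2) @ (2, 4) -> (3, 4)
C = torch.matmul(A, B)
print(C.shape)  # torch.Size([3, 4])

# Each entry is the dot product of a row of A with a column of B
manual = (A[0, :] * B[:, 1]).sum()
print(torch.isclose(C[0, 1], manual))  # tensor(True)
```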

In [33]:
# multiply two tensors

tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# Element-wise multiplication
element_wise_multiply = tensor_A * tensor_B
print(f"Element-wise multiplication:\n{element_wise_multiply}")

# Matrix multiplication (requires appropriate shapes, transpose one if needed)
# For A (3x2) and B (3x2), we can't do A @ B. We can do A @ B.T (3x2 @ 2x3 -> 3x3)
matrix_multiply = torch.matmul(tensor_A, tensor_B.T)
print(f"\nMatrix multiplication (A @ B.T):\n{matrix_multiply}")

# Alternative using the @ operator
matrix_multiply_alt = tensor_A @ tensor_B.T
print(f"\nMatrix multiplication (A @ B.T) using @ operator:\n{matrix_multiply_alt}")

Element-wise multiplication:
tensor([[ 7, 20],
        [24, 44],
        [45, 72]])

Matrix multiplication (A @ B.T):
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Matrix multiplication (A @ B.T) using @ operator:
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])


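Let's see which method is faster: a Python loop or `torch.matmul()`. For a tensor this small the timings are dominated by overhead, so the benefit of `torch.matmul()` only shows up on larger tensors.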

In [34]:
tensor_1 = torch.tensor([1,2,3])
tensor_1 * tensor_1

tensor([1, 4, 9])

In [35]:
%%time
value = 0
for i in range(len(tensor_1)):
  value += tensor_1[i] * tensor_1[i]
value

CPU times: user 1.52 ms, sys: 184 μs, total: 1.71 ms
Wall time: 1.2 ms


tensor(14)

In [36]:
%%time
torch.matmul(tensor_1, tensor_1)

CPU times: user 0 ns, sys: 1.19 ms, total: 1.19 ms
Wall time: 1.62 ms


tensor(14)

### **ERRORS**

The most common errors you will encounter are shape errors. Most of deep learning is multiplying and performing operations on matrices, and there are strict rules on which shapes and sizes can be combined, so beware of that.
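Here is what a shape error looks like in practice, as a small sketch. The tensors `left` and `right` are hypothetical; both are `(3, 2)`, so they cannot be multiplied directly.

```python
import torch

left = torch.rand(3, 2)
right = torch.rand(3, 2)

try:
    torch.matmul(left, right)   # (3, 2) @ (3, 2): inner dims 2 and 3 differ
except RuntimeError as err:
    print(f"Shape error: {err}")

# Transposing one operand makes the inner dimensions agree
print(torch.matmul(left, right.T).shape)  # torch.Size([3, 3])
print(torch.matmul(left.T, right).shape)  # torch.Size([2, 2])
```

Which operand to transpose depends on the output shape you need, so it is worth printing the shapes before multiplying.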

Neural networks are full of matrix multiplications and dot products. The `torch.nn.Linear()` module, also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input `x` and a weight matrix `A`:

$$ y = x \cdot A^{T} + b $$


where:

- `x` is the input to the layer
- `A` is the weight matrix created by the layer; it starts as random numbers that get adjusted as the neural network learns to better represent the patterns in the data
- `b` is the bias term used to slightly offset the weights and inputs
- `y` is the output


This is a linear function, and can be used to draw a straight line.
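To connect the formula to the module, the sketch below recomputes the layer output by hand. It assumes that `linear.weight` plays the role of `A` and `linear.bias` the role of `b`; the variable `manual` is only for illustration.

```python
import torch

torch.manual_seed(2)
linear = torch.nn.Linear(in_features=2, out_features=8)

x = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

# weight has shape [out_features, in_features], bias has shape [out_features]
print(linear.weight.shape, linear.bias.shape)

# y = x . A^T + b, computed manually
manual = x @ linear.weight.T + linear.bias

# Compare with a small tolerance, since float32 rounding can differ slightly
print(torch.allclose(linear(x), manual, atol=1e-5))  # True
```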

In [37]:
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

In [38]:
torch.manual_seed(2)

linear = torch.nn.Linear(in_features=2, out_features=8)
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[ 0.1249, -0.2955,  0.3669, -0.5446, -1.7832,  0.1389,  0.9449,  1.1352],
        [ 0.1128,  0.0201,  1.3077, -1.8505, -2.7124,  0.4187,  1.7903,  1.8160],
        [ 0.1006,  0.3357,  2.2486, -3.1563, -3.6416,  0.6986,  2.6357,  2.4968]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 8])


#### **Min, Max, Mean, Sum**

In [39]:
tensor_1 = torch.arange(0,100,10)
tensor_1

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [40]:
# performing aggregation
print(f"Minimum: {tensor_1.min()}")
print(f"Maximum: {tensor_1.max()}")
# print(f"Mean: {tensor_1.mean()}") # this will error: mean() needs a floating point (or complex) dtype
print(f"Mean: {tensor_1.type(torch.float32).mean()}") # convert the int64 tensor to float32 first
print(f"Sum: {tensor_1.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


#### **Positional Min Max**

We can also find the index of the tensor where the maximum or the minimum occurs, using `torch.argmax()` and `torch.argmin()`

In [41]:
print(f"Index where maximum value occurs is {tensor_1.argmax()}")
print(f"Index where minimum value occurs is {tensor_1.argmin()}")

Index where maximum value occurs is 9
Index where minimum value occurs is 0


### **Change Tensor Datatype**

If one tensor is in `torch.float16` and the other in `torch.float64`, some operations will raise errors. We can fix this by changing the datatype with `torch.Tensor.type(dtype=None)`. Let's see an example:

In [42]:
tensor_1.dtype

torch.int64

In [43]:
tensor_float_16 = tensor_1.type(torch.float16)
tensor_float_16.dtype

torch.float16

In [44]:
tensor_float_16

tensor([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)
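Here is a rough sketch of when mismatched datatypes actually matter. The tensors `a` and `b` are hypothetical, and `.to(dtype)` is used as an alternative to `.type(dtype)`.

```python
import torch

a = torch.ones(2, 2, dtype=torch.float32)
b = torch.ones(2, 2, dtype=torch.float64)

# Element-wise operations promote to the wider type
print((a + b).dtype)  # torch.float64

# Matrix multiplication expects matching dtypes
try:
    torch.matmul(a, b)
except RuntimeError as err:
    print(f"matmul failed: {err}")

# Converting one operand fixes it
result = torch.matmul(a, b.to(torch.float32))
print(result.dtype)  # torch.float32
```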

The different datatypes and their numbers might be confusing, but think of it this way: the smaller the number, the fewer bits the computer uses to store each value, so calculations are faster but less precise.

### **Reshaping, stacking, squeezing and unsqueezing**

Sometimes we want to reshape or change the dimensions of the tensors we are working with, without changing the values inside them. We can use the following:

In [45]:
# torch.reshape(input, shape)
"""
Reshapes input to shape, can also use torch.Tensor.reshape()
"""

tensor_1 = torch.arange(0,100,10)
print(f"Original tensor: {tensor_1}")


# added extra dimension
tensor_1_reshaped = tensor_1.reshape(1,10)
print(f"Reshaped tensor: {tensor_1_reshaped}")

Original tensor: tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
Reshaped tensor: tensor([[ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]])


In [46]:
# tensor.view(shape)
"""
Returns a view of the original tensor in a different shape but shares the same data as the original tensor.
"""

tensor_1_view = tensor_1.view(10,1)
tensor_1_view

tensor([[ 0],
        [10],
        [20],
        [30],
        [40],
        [50],
        [60],
        [70],
        [80],
        [90]])
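Because a view shares its data with the original tensor, changing one changes the other. A minimal sketch, using hypothetical names `base` and `base_view`:

```python
import torch

base = torch.arange(0, 100, 10)
base_view = base.view(10, 1)

# Writing through the view changes the original tensor too
base_view[0, 0] = 5
print(base[0])  # tensor(5)

# Both tensors point at the same underlying storage
print(base.data_ptr() == base_view.data_ptr())  # True
```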

In [47]:
# torch.stack(tensors,dim=0)
"""
Concatenates a sequence of tensors along a new dimension (dim), all tensors must be the same size
"""

tensor_1_concat = torch.stack([tensor_1,tensor_1,tensor_1],dim=1)
tensor_1_concat

tensor([[ 0,  0,  0],
        [10, 10, 10],
        [20, 20, 20],
        [30, 30, 30],
        [40, 40, 40],
        [50, 50, 50],
        [60, 60, 60],
        [70, 70, 70],
        [80, 80, 80],
        [90, 90, 90]])

In [48]:
# torch.squeeze(input)
"""
Squeezes input by removing all the dimensions of size 1.
"""

tensor_1_squeezed = tensor_1_reshaped.squeeze()
tensor_1_squeezed


tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
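The opposite operation is `torch.unsqueeze(input, dim)`, which adds a dimension of size 1 at the given position. A short sketch with illustrative names:

```python
import torch

flat = torch.arange(0, 100, 10)      # shape [10]

as_row = flat.unsqueeze(dim=0)       # shape [1, 10]
as_column = flat.unsqueeze(dim=1)    # shape [10, 1]
print(as_row.shape, as_column.shape)

# squeeze() undoes it by dropping the size-1 dimension
print(as_row.squeeze().shape)        # torch.Size([10])
```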

In [49]:
# torch.permute(input,dims)
"""
Returns a view of the original input with its dimensions permuted (rearranged) to dims.
"""

x = torch.randn(2, 3, 5)
torch.permute(x, (2, 0, 1)).size()  # shift axis 0 to 1, 1 to 2 and 2 to 0



torch.Size([5, 2, 3])
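A common use for `permute` is moving an image between the `[height, width, colour_channels]` layout and the `[colour_channels, height, width]` layout mentioned at the start. The sketch below uses a hypothetical random image.

```python
import torch

# Hypothetical image stored as [height, width, colour_channels]
image_hwc = torch.rand(size=(224, 224, 3))

# Reorder to [colour_channels, height, width]
image_chw = image_hwc.permute(2, 0, 1)

print(image_hwc.shape)  # torch.Size([224, 224, 3])
print(image_chw.shape)  # torch.Size([3, 224, 224])

# permute returns a view, so both tensors share the same underlying data
image_hwc[0, 0, 0] = 123.0
print(image_chw[0, 0, 0])  # tensor(123.)
```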

### **Indexing**

Sometimes we want to select specific data from a tensor. We can use indexing for this.

In [50]:
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

Indexing goes outer dimension -> inner dimension

In [51]:
# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}")
print(f"Second square bracket: {x[0][0]}")
print(f"Third square bracket: {x[0][0][0]}")

First square bracket:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Second square bracket: tensor([1, 2, 3])
Third square bracket: 1
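Indices can also be combined in a single bracket, and `:` selects everything along a dimension. A small sketch on the same `(1, 3, 3)` tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[0, 1, 2])   # row 1, column 2 -> tensor(6)
print(x[:, 0])      # first row of every matrix -> tensor([[1, 2, 3]])
print(x[:, :, 1])   # middle column -> tensor([[2, 5, 8]])
print(x[0, :, 2])   # last column of the first matrix -> tensor([3, 6, 9])
```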


### **PyTorch tensors & NumPy**

NumPy is very popular, so it is natural that PyTorch has functionality to interact with it:

- `torch.from_numpy(ndarray)` converts a NumPy array to a PyTorch tensor
- `torch.Tensor.numpy()` does the opposite

In [52]:
array_1 = np.arange(1.0,8.0)
tensor_1 = torch.from_numpy(array_1)
print(array_1.dtype)
print(tensor_1.dtype)




float64
torch.float64


NumPy arrays are created with `float64` by default, and a converted tensor keeps the same dtype. However, many PyTorch calculations default to `float32`, so keep that in mind and specify the dtype when converting:

`torch.from_numpy(array).type(torch.float32)`

In [53]:
tensor_2 = torch.from_numpy(array_1).type(torch.float32)

print(tensor_2.dtype)

torch.float32
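One detail worth knowing: `torch.from_numpy()` shares memory with the source array, while changing the dtype creates a separate copy. A minimal sketch, with hypothetical names `shared` and `copied`:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)                       # shares memory with `array`
copied = torch.from_numpy(array).type(torch.float32)   # dtype change makes a copy

# Modify the NumPy array in place
array[0] = 100.0
print(shared[0])  # tensor(100., dtype=torch.float64)
print(copied[0])  # tensor(1.)
```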


### **Reproducibility**

We want our experiments to be reproducible, so that anyone else running our code gets the same results. Pseudorandomness plays an important role in that.

In [54]:
random_tensor_1 = torch.rand(3,4)
random_tensor_2 = torch.rand(3,4)

print(random_tensor_1)
print(random_tensor_2)
print(random_tensor_1 == random_tensor_2)

tensor([[0.0445, 0.9356, 0.1712, 0.6581],
        [0.4811, 0.5881, 0.5484, 0.0326],
        [0.3926, 0.1839, 0.9251, 0.4386]])
tensor([[0.0021, 0.6211, 0.7171, 0.2762],
        [0.4531, 0.7162, 0.1889, 0.2357],
        [0.4518, 0.1489, 0.8073, 0.5409]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


These are two random tensors, and they are not equal to each other at any position. But what if we want to create two random tensors with the same values? This is where `torch.manual_seed(seed)` comes in, where `seed` is an integer that flavours the randomness. Note that the seed has to be set again before each `torch.rand()` call to get the same values.

In [55]:
seed = 135
torch.manual_seed(seed=seed)

random_tensor_3 = torch.rand(3,4)

torch.manual_seed(seed=seed)
random_tensor_4 = torch.rand(3,4)

print(random_tensor_3)
print(random_tensor_4)
print(random_tensor_3 == random_tensor_4)

tensor([[0.3002, 0.7669, 0.8898, 0.9107],
        [0.0219, 0.6328, 0.9235, 0.2540],
        [0.2259, 0.8689, 0.3308, 0.7802]])
tensor([[0.3002, 0.7669, 0.8898, 0.9107],
        [0.0219, 0.6328, 0.9235, 0.2540],
        [0.2259, 0.8689, 0.3308, 0.7802]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
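An alternative to resetting the global seed is to pass a dedicated `torch.Generator` to the random function. This is a sketch of that option, not what the cells above do; the names `gen_a` and `gen_b` are made up.

```python
import torch

# A dedicated generator keeps the seed local instead of resetting the global one
gen_a = torch.Generator().manual_seed(135)
gen_b = torch.Generator().manual_seed(135)

sample_a = torch.rand(3, 4, generator=gen_a)
sample_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(sample_a, sample_b))  # True

# Drawing again without reseeding advances the generator, so values differ
sample_c = torch.rand(3, 4, generator=gen_a)
print(torch.equal(sample_a, sample_c))  # False
```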


## **Running Tensors on GPU**

Deep learning requires a lot of numerical computation. By default it runs on the CPU, but we can run it on a GPU as well, which will generally speed things up.

We will focus on NVIDIA GPUs in this one; for Apple Silicon, I will maybe write another one. We can check the GPU by running `!nvidia-smi`.

In [56]:
!nvidia-smi

Thu Jul 10 19:06:33 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 566.24         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3070        On  |   00000000:01:00.0  On |                  N/A |
| 30%   47C    P8             21W /  220W |    7799MiB /   8192MiB |     20%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

### **Getting PyTorch to run on GPU**

Once you have GPU access, the next step is to get PyTorch to store data and run computations on the GPU. We use `torch.cuda` for this.

In [57]:
torch.cuda.is_available()

True

Or we can use this:

In [58]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

This will run for everyone: if no GPU is available, it falls back to the CPU.

You can make use of multiple GPUs as well, if you have them available:

In [59]:
# count the number of GPUs
torch.cuda.device_count()

1

### **Apple Silicon**

Apple's M1/M2/M3 and now M4 chips have GPUs as well, and PyTorch can use them through `torch.backends.mps`.

In [60]:
torch.backends.mps.is_available()

False

In [61]:
device = "mps" if torch.backends.mps.is_available() else "cpu"
device

'cpu'

In [62]:
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

device

device(type='cuda')

### **Putting tensors and models on GPU**

We can put tensors and models on a specific device by calling `to(device)` on them. The reason to do this is that GPUs offer far faster numerical computation than CPUs.

Let's try it:

In [63]:
tensor_A = torch.tensor([1,2,4])

# not on GPU
print(tensor_A, tensor_A.device)

# move tensor to GPU (if available)
tensor_B = tensor_A.to(device)
print(tensor_B, tensor_B.device)

tensor([1, 2, 4]) cpu
tensor([1, 2, 4], device='cuda:0') cuda:0
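Tensors can also be created directly on the target device with the `device=` argument, which avoids a separate copy. The sketch below falls back to the CPU when no GPU is present; the names `on_device`, `other` and `total` are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create one tensor directly on the device, move another one there
on_device = torch.rand(3, 4, device=device)
other = torch.ones(3, 4).to(device)

# Both operands live on the same device, so the operation is allowed
total = on_device + other
print(total.device)
```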


### **Moving them back to CPU**

In [64]:
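# NumPy works on CPU memory, so copy the tensor back with .cpu() before calling .numpy()
tensor_C = tensor_B.cpu().numpy()
print(f"Tensor on GPU: {tensor_B} and the device is:  {tensor_B.device}")
# tensor_C is now a NumPy array; the ndarray.device attribute assumes NumPy 2.x
print(f"Tensor on CPU: {tensor_C} and the device is:  {tensor_C.device}")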

Tensor on GPU: tensor([1, 2, 4], device='cuda:0') and the device is:  cuda:0
Tensor on CPU: [1 2 4] and the device is:  cpu
