<a href="https://colab.research.google.com/github/tasnim0tantawi/deep-learning-PyTorch/blob/main/00_PyTorch_Fundenmentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

1.13.1+cu116


# Introduction to Tensors

## Types of Tensors: 

1.   Scalar
2.   Vector
3.   Matrix
4.   Tensor




In [2]:
scalar = torch.tensor(7) 
scalar.ndim

0

In [3]:
scalar.item()

7

In [4]:
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

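`ndim` is the number of dimensions. A scalar has 0 dimensions, a vector has 1, and a matrix has 2. A quick way to count dimensions is to count the opening square brackets (`[`) at the start of the tensor.
Note that this is a different use of the word from the "dimensionality" of a dataset, which refers to its number of attributes, features or input variables.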
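As a quick check of the bracket-counting rule, the sketch below builds one tensor of each kind and prints `ndim` next to `shape`. The variable names are only illustrative; the point is that `ndim` always equals the length of `shape`.

```python
import torch

scalar = torch.tensor(7)
vector = torch.tensor([7, 7])
matrix = torch.tensor([[7, 9], [8, 76]])

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # ndim always equals the number of entries in the shape
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```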

In [5]:
vector.ndim

1

In [6]:
vector.shape

torch.Size([2])

In [7]:
MATRIX = torch.tensor([[7, 9], 
                      [8, 76],
                      [12, 8]]
                      )
MATRIX

tensor([[ 7,  9],
        [ 8, 76],
        [12,  8]])

In [8]:
MATRIX.ndim

2

In [9]:
MATRIX.shape

torch.Size([3, 2])

In [10]:
TENSOR = torch.tensor(
    [[[2, 8, 9],
      [5, 9, 0],
      [5, 9, 77]
      ]]
)
TENSOR


tensor([[[ 2,  8,  9],
         [ 5,  9,  0],
         [ 5,  9, 77]]])

In [11]:
TENSOR.shape

torch.Size([1, 3, 3])

In [12]:
TENSOR[0][0]

tensor([2, 8, 9])

### Random Tensors 
We've established tensors represent some form of data.

And machine learning models such as neural networks manipulate and seek patterns within tensors.

But when building machine learning models with PyTorch, it's rare you'll create tensors by hand (like we've been doing).

Instead, a machine learning model often starts out with large random tensors of numbers and adjusts these random numbers as it works through data to better represent it.
`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...`

In [13]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(size=(3, 4))
random_tensor, random_tensor.dtype


(tensor([[0.9258, 0.2797, 0.5745, 0.1870],
         [0.1931, 0.4586, 0.8498, 0.6143],
         [0.7958, 0.1663, 0.2071, 0.4395]]), torch.float32)

A random tensor can also stand in for an image, with shape [224, 224, 3] ([height, width, color_channels]).

In [14]:
# Create a random tensor of size (224, 224, 3)
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and Ones Tensors

In [15]:
zeros = torch.zeros(3, 4)
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [16]:
ones = torch.ones(4, 5)
ones

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

In [17]:
ones.dtype
# The default data type is float32 in PyTorch. It can be changed. 

torch.float32

### Creating a range and tensors like another tensor

In [18]:
torch.arange(1, 11) # creating a range from 1 to 10

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [19]:
torch.arange(start=0, end=11, step=2)

tensor([ 0,  2,  4,  6,  8, 10])

In [20]:
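# Tensors like: zeros with the same shape and dtype as the input
# (arange(0, 11) has 11 elements, so the result has 11 zeros)
zeros_like_range = torch.zeros_like(input=torch.arange(start=0, end=11, step=1))
zeros_like_range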

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor datatypes
There are many different tensor datatypes available in PyTorch.

Some are specific for CPU and some are better for GPU.

Getting to know which is which can take some time.

Generally if you see torch.cuda anywhere, the tensor is being used for GPU (since Nvidia GPUs use a computing toolkit called CUDA).

The most common type (and generally the default) is torch.float32 or torch.float.

This is referred to as "32-bit floating point".

But there's also 16-bit floating point (torch.float16 or torch.half) and 64-bit floating point (torch.float64 or torch.double).

And to confuse things even more there's also 8-bit, 16-bit, 32-bit and 64-bit integers.

The reason for all of these has to do with precision in computing.

Precision is the amount of detail used to describe a number.

The higher the precision value (8, 16, 32), the more detail and hence data used to express a number.

This matters in deep learning and numerical computing because you perform so many operations: the more detail you have to calculate on, the more compute you have to use.

So lower precision datatypes are generally faster to compute on but sacrifice some performance on evaluation metrics like accuracy (faster to compute but less accurate).
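To make the precision trade-off concrete, here is a small sketch that stores the same number in three floating point datatypes. The value `0.1` is just a convenient choice because it cannot be represented exactly in binary. The sketch prints how many bytes each datatype uses per element and how far the stored value is from the original.

```python
import torch

value = 0.1  # not exactly representable in binary floating point

for dtype in (torch.float16, torch.float32, torch.float64):
    t = torch.tensor(value, dtype=dtype)
    # element_size() is the number of bytes used per element
    # .item() converts to a Python float so the rounding error is visible
    error = abs(t.item() - value)
    print(f"{dtype}: {t.element_size()} bytes per element, error = {error:.2e}")
```

The error should shrink as the number of bits grows, while the memory used per element doubles each time.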

In [21]:
# Default datatype for floating point tensors is float32
tensor_32 = torch.tensor([3.0, 6.0, 9.0],
                        dtype=None, # defaults to None, which infers the dtype from the data (torch.float32 for Python floats)
                        device=None, # defaults to None, which uses the default device (the CPU unless changed)
                        requires_grad=False) # if True, operations performed on the tensor are recorded for gradient computation
tensor_32.shape, tensor_32.device, tensor_32.dtype

(torch.Size([3]), device(type='cpu'), torch.float32)

Aside from shape issues (tensor shapes don't match up), two of the other most common issues you'll come across in PyTorch are datatype and device issues.

For example, one of your tensors is torch.float32 and the other is torch.float16 (PyTorch often likes tensors to be the same format). Sometimes this throws an error, especially when training large neural networks, and other times it does not. Either way, it is better to keep data types consistent when performing calculations (mainly matrix multiplication) with tensors.

Or one of your tensors is on the CPU and the other is on the GPU (PyTorch likes calculations between tensors to be on the same device). 
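The sketch below illustrates the datatype case with two small tensors (the names `a` and `b` are only for illustration). Element-wise addition follows PyTorch's type promotion rules and succeeds. Matrix multiplication with mixed datatypes is expected to raise a `RuntimeError`, but that is an assumption about the installed version, so the call is wrapped in `try`/`except` rather than relied upon.

```python
import torch

a = torch.ones(2, 2, dtype=torch.float32)
b = torch.ones(2, 2, dtype=torch.float16)

# Element-wise operations follow type promotion rules, so this works
# and the result takes the wider dtype.
print((a + b).dtype)

# Matrix multiplication is stricter. It is expected to raise a RuntimeError
# on mismatched dtypes (an assumption about the installed version), so the
# call is wrapped rather than relied upon.
try:
    torch.matmul(a, b)
except RuntimeError as err:
    print(f"matmul failed: {err}")

# Casting to a common dtype first avoids the question entirely.
result = torch.matmul(a, b.to(a.dtype))
print(result.dtype, result.device)
```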

In [22]:
# Creating float16 tensor
tensor_16 = torch.tensor(
    [4.0, 9.0, 100.0, 12.],
    dtype=torch.float16
)
tensor_16, tensor_16.dtype, tensor_16.shape

(tensor([  4.,   9., 100.,  12.], dtype=torch.float16),
 torch.float16,
 torch.Size([4]))

### Getting information from tensors

Three of the most common attributes of tensors are:

*   shape - what shape is the tensor? (some operations require specific shape rules)
*   dtype - what datatype are the elements within the tensor stored in? e.g., float32, int, long, etc.
*   device - what device is the tensor stored on? (usually GPU or CPU, and less commonly TPU)



In [23]:
# Create a tensor
some_tensor = torch.rand(5, 10) # 5 rows 10 columns

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.5477, 0.3477, 0.7854, 0.7087, 0.8080, 0.7055, 0.5773, 0.7819, 0.8090,
         0.1695],
        [0.5310, 0.6643, 0.8081, 0.2583, 0.8319, 0.4102, 0.2413, 0.1573, 0.0412,
         0.7788],
        [0.4037, 0.9901, 0.1786, 0.6957, 0.7830, 0.4978, 0.5826, 0.3440, 0.0704,
         0.4771],
        [0.6919, 0.2559, 0.4427, 0.3884, 0.7486, 0.0687, 0.8772, 0.8260, 0.8402,
         0.1360],
        [0.3829, 0.1332, 0.7726, 0.0436, 0.4106, 0.0856, 0.5443, 0.0274, 0.6115,
         0.8065]])
Shape of tensor: torch.Size([5, 10])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


## Tensor Operations
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

##### Basic operations: 
* `+`
* `-`
* `*`


In [24]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([[1, 2, 3],
                       [3, 77, 8]])
tensor + 20 
# will add 20 to all values

tensor([[21, 22, 23],
        [23, 97, 28]])

In [25]:
# Multiply it by 10
tensor * 10

tensor([[ 10,  20,  30],
        [ 30, 770,  80]])

In [26]:
# Subtract
tensor - 10


tensor([[-9, -8, -7],
        [-7, 67, -2]])

In [27]:
# Can also use torch functions
# But it is more common to use *
torch.multiply(tensor, 10), torch.mul(tensor, 56)

(tensor([[ 10,  20,  30],
         [ 30, 770,  80]]), tensor([[  56,  112,  168],
         [ 168, 4312,  448]]))

### Matrix multiplication (Most Common Operation in Neural Networks)
Matrix multiplication is done with PyTorch's `torch.matmul()` function.
The `@` operator is a shorthand for it.

Two rules for matrix multiplication:
1. Inner dimensions must match.
   * (4, 2) @ (3, 6) won't work
   * (5, 3) @ (3, 2) will work
   * (3, 2) @ (2, 3) will work

2. The resulting matrix has the shape of the outer dimensions.
   * (2, 3) @ (3, 2) -> (2, 2)
   * (6, 2) @ (2, 8) -> (6, 8)
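The two rules can be checked directly on random tensors. This is a minimal sketch with illustrative names: the first multiplication has matching inner dimensions, the second does not and raises a `RuntimeError`.

```python
import torch

a = torch.rand(5, 3)
b = torch.rand(3, 2)

# Inner dimensions (3 and 3) match, so the result takes the outer dimensions
print((a @ b).shape)  # torch.Size([5, 2])

# Inner dimensions (2 and 3) do not match, so this raises a RuntimeError
c = torch.rand(4, 2)
d = torch.rand(3, 6)
try:
    c @ d
except RuntimeError as err:
    print(f"Shape error: {err}")
```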


In [28]:
# Element-wise multiplication (each element is multiplied by the element in the same position)
tensor, tensor * tensor

(tensor([[ 1,  2,  3],
         [ 3, 77,  8]]), tensor([[   1,    4,    9],
         [   9, 5929,   64]]))

In [29]:
# Matrix multiplication
tensor1 = torch.tensor([[10, 4, 7],
                        [8, 9, 0],
                        [5,8, 3]] # 3 x 3
)
tensor2 = torch.tensor([[10, 7],
                        [8, 0],
                        [5, 3]] # 3 x 2
)
torch.matmul(tensor1, tensor2)

tensor([[167,  91],
        [152,  56],
        [129,  44]])

In [30]:
# Can also use the "@" operator for matrix multiplication, though torch.matmul() is more explicit and arguably easier to read.
tensor1 @ tensor2

tensor([[167,  91],
        [152,  56],
        [129,  44]])

In [31]:
%%time
# Matrix multiplication by hand (the simplest case: a vector multiplied with itself)
# (avoid element-by-element Python for loops where possible, they are computationally expensive)
value = 0
tensor = torch.tensor([3, 8, 9])
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
value

CPU times: user 773 µs, sys: 957 µs, total: 1.73 ms
Wall time: 2.01 ms


tensor(154)

In [32]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 378 µs, sys: 0 ns, total: 378 µs
Wall time: 441 µs


tensor(154)

Performing matrix multiplication with a Python loop is much slower than PyTorch's `matmul()` function (about 4 to 5 times slower in the timings above, even for a three-element vector).
That's because PyTorch uses vectorization and performs these operations in parallel.
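To see the gap on a larger input outside a notebook (where `%%time` is not available), the following sketch times both approaches with `time.perf_counter()`. The vector length of 1000 is an arbitrary choice, and the exact timings depend on the machine, so no expected numbers are given.

```python
import time
import torch

n = 1000
v = torch.arange(n)

# Python loop: one tensor indexing operation and one multiply per element
start = time.perf_counter()
loop_result = 0
for i in range(len(v)):
    loop_result += v[i] * v[i]
loop_seconds = time.perf_counter() - start

# Single vectorized call
start = time.perf_counter()
matmul_result = torch.matmul(v, v)
matmul_seconds = time.perf_counter() - start

print(f"loop:   {loop_result.item()} in {loop_seconds:.6f} s")
print(f"matmul: {matmul_result.item()} in {matmul_seconds:.6f} s")
```

Both approaches produce the same value; only the time taken differs.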

### Shape Errors
Shape errors happen when the inner dimensions of the multiplied matrices do not match. They can often be fixed with a transpose, which switches the dimensions of a given tensor.

In [33]:
# Shapes need to line up: the inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]], dtype=torch.float32)

# torch.matmul(tensor_A, tensor_B) # (this will cause an error: (3, 2) @ (3, 2))

# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [34]:
# View tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)
# torch.transpose(tensor_B, 0, 1) gives the same result as tensor_B.T
print(torch.transpose(tensor_B, 0, 1))
# Now we can multiply both since the inner dimensions match: (3, 2) @ (2, 3)
print(f"Multiplying...dim A: {tensor_A.shape}, dim_B: {tensor_B.T.shape}: " )
output = torch.matmul(tensor_A, tensor_B.T)
print(output) 
print(f"\nOutput shape: {output.shape}")


tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
Multiplying...dim A: torch.Size([3, 2]), dim_B: torch.Size([2, 3]): 
tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


In [35]:
# torch.mm also performs matrix multiplication, but only for 2D tensors (it does not broadcast like matmul)
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

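**Note:** Each element of the output is the **dot product** of a row of the first matrix with a column of the second, which is why matrix multiplication is sometimes loosely called the dot product of two matrices.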
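The relationship between matrix multiplication and dot products can be verified on the same `tensor_A` and `tensor_B` used above. This sketch picks one output position (the indices `i` and `j` are arbitrary) and recomputes it with `torch.dot`.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]], dtype=torch.float32)

output = torch.matmul(tensor_A, tensor_B.T)

# Element [i, j] of the output is the dot product of row i of the first
# matrix with column j of the second matrix.
i, j = 1, 2
row = tensor_A[i]          # values 3 and 4
column = tensor_B.T[:, j]  # values 9 and 12
print(torch.dot(row, column), output[i, j])  # both are 75 (3*9 + 4*12)
```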

## Finding the min, max, mean, sum, etc (Aggregation)

In [36]:
# Create a tensor
x = torch.arange(0, 100, 10)
print(x)
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error because x has an integer dtype (int64)
print(f"Mean: {x.type(torch.float32).mean()}") # mean() needs a floating point dtype
print(f"Sum: {x.sum()}")

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


In [37]:
# The same aggregations, written as torch functions:
torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)
# Note that mean() only works on floating point (or complex) tensors. It does not work on integer tensors.

(tensor(90), tensor(0), tensor(45.), tensor(450))

In [38]:
# Returns index of max and min values
print(f"Index where max value occurs: {x.argmax()}")
print(f"Index where min value occurs: {x.argmin()}")

Index where max value occurs: 9
Index where min value occurs: 0


### Tensor datatype, reshaping, stacking, squeezing and unsqueezing

In [39]:
# Create a tensor and check its datatype
tensor = torch.arange(10., 100., 10.)
print(tensor.dtype) # default is float32

tensor = torch.arange(10, 100, 10) # without the decimal points, the dtype is inferred as int64
print(tensor.dtype)

# Create a float16 tensor
tensor_float16 = tensor.type(torch.float16)
print(tensor_float16.dtype)

# Create an int8 tensor
tensor_int8 = tensor.type(torch.int8)
print(tensor_int8.dtype)


torch.float32
torch.int64
torch.float16
torch.int8


Deep learning models (neural networks) are all about manipulating tensors in some way, and because of the rules of matrix multiplication, shape mismatches lead to errors. Squeeze, unsqueeze, and reshape help you make sure the right elements of your tensors are mixing with the right elements of other tensors.

In [40]:
# Create a tensor
x = torch.arange(1., 8.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [41]:
# Add an extra dimension
x_reshaped = x.reshape(7, 1) # the product of the new dimensions must equal the number of elements (7 * 1 = 7)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.]]), torch.Size([7, 1]))

In [42]:
# Two rows holding the numbers 1 to 9
x = torch.stack([torch.arange(1, 10), torch.arange(1, 10)]) #---> 2*9 = 18 elements
print(list(x))
x_reshaped = x.reshape(1, 18) #---> 1*18 = 18 elements
x_reshaped, x_reshaped.shape



[tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]), tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])]


(tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]]),
 torch.Size([1, 18]))

In [43]:
# .view() returns a view of the tensor that shares the same memory as the original, so changing one changes the other.
z = x.view(1, 18)
z, z.shape

(tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]]),
 torch.Size([1, 18]))

In [44]:
# Changing z changes x
z[0, 0] = 44
print(z)
print(x)

tensor([[44,  2,  3,  4,  5,  6,  7,  8,  9,  1,  2,  3,  4,  5,  6,  7,  8,  9]])
tensor([[44,  2,  3,  4,  5,  6,  7,  8,  9],
        [ 1,  2,  3,  4,  5,  6,  7,  8,  9]])
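If the sharing is not wanted, one option is to copy the data first. The sketch below contrasts a plain view with a view of a `clone()`, using `data_ptr()` (the memory address of the tensor's first element) to show which tensors share storage. The variable names are illustrative.

```python
import torch

x = torch.arange(1, 7).reshape(2, 3)

shared = x.view(1, 6)               # a view: same underlying memory as x
independent = x.clone().view(1, 6)  # clone() copies the data first

# data_ptr() is the address of the first element of the tensor's data
print(shared.data_ptr() == x.data_ptr())       # True
print(independent.data_ptr() == x.data_ptr())  # False

x[0, 0] = 44
print(shared[0, 0].item())       # 44, the view sees the change
print(independent[0, 0].item())  # 1, the copy does not
```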


To stack tensors on top of each other, use `torch.stack()`.



In [45]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim=0) # try changing dim to dim=1 and see what happens
print(x_stacked)
print(x_stacked.ndim)

tensor([[[44,  2,  3,  4,  5,  6,  7,  8,  9],
         [ 1,  2,  3,  4,  5,  6,  7,  8,  9]],

        [[44,  2,  3,  4,  5,  6,  7,  8,  9],
         [ 1,  2,  3,  4,  5,  6,  7,  8,  9]],

        [[44,  2,  3,  4,  5,  6,  7,  8,  9],
         [ 1,  2,  3,  4,  5,  6,  7,  8,  9]],

        [[44,  2,  3,  4,  5,  6,  7,  8,  9],
         [ 1,  2,  3,  4,  5,  6,  7,  8,  9]]])
3
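To see what the `dim` argument changes, this sketch stacks four copies of a `(2, 9)` tensor along each possible dimension and prints the resulting shapes. A tensor of ones is used here only to keep the example short.

```python
import torch

x = torch.ones(2, 9)

for dim in (0, 1, 2):
    stacked = torch.stack([x, x, x, x], dim=dim)
    # The new dimension of size 4 (one entry per stacked tensor) appears at index `dim`
    print(f"dim={dim}: {tuple(stacked.shape)}")
```

The shapes are `(4, 2, 9)`, `(2, 4, 9)` and `(2, 9, 4)`: stacking always adds one new dimension, and `dim` decides where it goes.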


In [46]:
# squeeze() removes all dimensions of size 1 from a tensor.
tensor = torch.tensor([[[4,7,7,9]]])
print(f"Previous tensor: {tensor}")
print(f"Previous shape: {tensor.shape}")

# Remove the extra dimensions from tensor
t_squeezed = tensor.squeeze()
print(f"\nNew tensor: {t_squeezed}")
print(f"New shape: {t_squeezed.shape}")


Previous tensor: tensor([[[4, 7, 7, 9]]])
Previous shape: torch.Size([1, 1, 4])

New tensor: tensor([4, 7, 7, 9])
New shape: torch.Size([4])


In [47]:
# torch.unsqueeze() adds a dimension of size 1 at a specific index.
print(f"Previous tensor: {t_squeezed}")
print(f"Previous shape: {t_squeezed.shape}")


# Add an extra dimension with unsqueeze at dim=1
t_unsqueezed = t_squeezed.unsqueeze(dim=1)
print(f"\nNew tensor with extra dimension at 1: {t_unsqueezed}")
print(f"New shape: {t_unsqueezed.shape}")

# Add an extra dimension with unsqueeze at dim=0
t_unsqueezed = t_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor with extra dimension at 0: {t_unsqueezed}")
print(f"New shape: {t_unsqueezed.shape}")

Previous tensor: tensor([4, 7, 7, 9])
Previous shape: torch.Size([4])

New tensor with extra dimension at 1: tensor([[4],
        [7],
        [7],
        [9]])
New shape: torch.Size([4, 1])

New tensor with extra dimension at 0: tensor([[4, 7, 7, 9]])
New shape: torch.Size([1, 4])


In [48]:
# Create a tensor with a specific shape, then permute (rearrange) its dimensions.
x_original = torch.rand(size=(224, 224, 3))
# Permute the original tensor to rearrange the axis order (0, 1, 2) ----> (2, 0, 1)
x_permuted = x_original.permute(2, 0, 1)
# Old axis 0 moves to position 1,
# old axis 1 moves to position 2,
# old axis 2 moves to position 0, so the shape becomes (3, 224, 224)
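The cell above produces no output, so here is a short sketch that prints the shapes before and after permuting. It also shows that `permute()` returns a view: writing to the original tensor changes the permuted one too. The value `123.0` is arbitrary.

```python
import torch

x_original = torch.rand(size=(224, 224, 3))  # [height, width, color_channels]
x_permuted = x_original.permute(2, 0, 1)     # [color_channels, height, width]

print(x_original.shape)  # torch.Size([224, 224, 3])
print(x_permuted.shape)  # torch.Size([3, 224, 224])

# permute() returns a view, so both tensors share the same data
x_original[0, 0, 0] = 123.0
print(x_permuted[0, 0, 0].item())  # 123.0
```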


In [49]:
# Let's index bracket by bracket
x = torch.arange(1, 10)
x = torch.stack([x, x, x, x, x]).unsqueeze(dim=0)
print(x)

print(f"First square bracket:\n{x[0]}") # getting a matrix
print(f"Second square bracket: {x[0][0]}") # getting the 1st row of that matrix 
print(f"Third square bracket: {x[0][0][0]}") # element

tensor([[[1, 2, 3, 4, 5, 6, 7, 8, 9],
         [1, 2, 3, 4, 5, 6, 7, 8, 9],
         [1, 2, 3, 4, 5, 6, 7, 8, 9],
         [1, 2, 3, 4, 5, 6, 7, 8, 9],
         [1, 2, 3, 4, 5, 6, 7, 8, 9]]])
First square bracket:
tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9],
        [1, 2, 3, 4, 5, 6, 7, 8, 9],
        [1, 2, 3, 4, 5, 6, 7, 8, 9],
        [1, 2, 3, 4, 5, 6, 7, 8, 9],
        [1, 2, 3, 4, 5, 6, 7, 8, 9]])
Second square bracket: tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
Third square bracket: 1


You can also use `:` to specify "all values in this dimension" and then use a comma (`,`) to move on to the next dimension.



In [50]:
x = torch.tensor([[
      [1, 2, 3, 4],
      [10, 20, 30, 40],
      [100, 200, 300, 400],
      [1000, 2000, 3000, 4000]]])
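print(x[:])        # everything
print(x[:, 1, :])  # all of dim 0, index 1 of dim 1 (second row), all of dim 2
print(x[:, :, 0])  # all of dims 0 and 1, index 0 of dim 2 (first column)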

tensor([[[   1,    2,    3,    4],
         [  10,   20,   30,   40],
         [ 100,  200,  300,  400],
         [1000, 2000, 3000, 4000]]])
tensor([[10, 20, 30, 40]])
tensor([[   1,   10,  100, 1000]])
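A few more indexing patterns on the same tensor, as a sketch. Plain integer indices remove a dimension, while `:` and slices such as `1:3` keep it.

```python
import torch

x = torch.tensor([[
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [100, 200, 300, 400],
    [1000, 2000, 3000, 4000]]])  # shape (1, 4, 4)

print(x[0, 2, 1])           # a single element: 200
print(x[:, 1, 1])           # keeps dim 0, so the result has shape (1,): value 20
print(x[0, :, 2])           # third column: values 3, 30, 300, 3000
print(x[0, 1:3, :2].shape)  # slices keep their dimensions: torch.Size([2, 2])
```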


## PyTorch tensors & NumPy

In [51]:
array = np.arange(1.0, 8.0)
# Converting from numpy to tensor
tensor = torch.from_numpy(array)
print(array, tensor, tensor.dtype)
tensor = tensor.type(torch.float32)




[1. 2. 3. 4. 5. 6. 7.] tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64) torch.float64


**Note:** By default, NumPy creates floating point arrays with the datatype float64, and if you convert one to a PyTorch tensor, it'll keep the same datatype (as above).
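One more detail worth knowing: `torch.from_numpy()` shares memory with the source array, while converting the datatype produces a new tensor. The sketch below shows both effects with an in-place change to the array (the names `shared` and `converted` are illustrative).

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)           # float64 by default
shared = torch.from_numpy(array)      # keeps float64 and shares memory with the array
converted = shared.to(torch.float32)  # dtype conversion creates a new tensor

array[0] = 100.0                      # in-place change to the NumPy array

print(shared.dtype, shared[0].item())        # torch.float64 100.0
print(converted.dtype, converted[0].item())  # torch.float32 1.0
```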



In [52]:
# Tensor to NumPy array
tensor = torch.ones(10) # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32))

## Reproducibility and random seeds (trying to take the randomness out of random)
Neural networks start with random numbers to describe patterns in data (these numbers are poor descriptions) and then try to improve those random numbers, using tensor operations and machine learning algorithms, to better describe patterns in data.

In short:
`start with random numbers -> tensor operations -> try to make better (again and again and again)` 

**Reproducibility**: Getting the same random numbers again.




In [53]:
# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B

Tensor A:
tensor([[0.9601, 0.0256, 0.4391, 0.7214],
        [0.5082, 0.5991, 0.4742, 0.6298],
        [0.3299, 0.0543, 0.2164, 0.7189]])

Tensor B:
tensor([[0.4061, 0.9625, 0.4351, 0.0213],
        [0.5543, 0.7348, 0.5461, 0.0605],
        [0.8355, 0.3648, 0.5487, 0.3939]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [54]:
# Set the random seed. Any number works; 42 is a common choice, and 45 is used here.
RANDOM_SEED = 45 # try changing this to different values and see what happens to the numbers below
torch.manual_seed(seed=RANDOM_SEED) 
random_tensor_C = torch.rand(3, 4)

# The seed has to be reset before each rand() call that should reproduce the same numbers
# Without this, tensor_D would be different from tensor_C
torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and seeing what happens
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.1869, 0.9613, 0.6834, 0.8988],
        [0.0505, 0.5555, 0.7861, 0.0566],
        [0.7842, 0.1480, 0.0388, 0.1037]])

Tensor D:
tensor([[0.1869, 0.9613, 0.6834, 0.8988],
        [0.0505, 0.5555, 0.7861, 0.0566],
        [0.7842, 0.1480, 0.0388, 0.1037]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
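An alternative to resetting the global seed is to pass an explicit `torch.Generator` to `torch.rand()`. This is a sketch of that approach, not something the cells above rely on. The seed `42` and the names `gen_a` and `gen_b` are arbitrary.

```python
import torch

# Two generators seeded identically produce identical sequences,
# without touching PyTorch's global random state.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

first_a = torch.rand(3, 4, generator=gen_a)
first_b = torch.rand(3, 4, generator=gen_b)
second_a = torch.rand(3, 4, generator=gen_a)  # next draw from the same generator

print(torch.equal(first_a, first_b))   # True: same seed, same position in the sequence
print(torch.equal(first_a, second_a))  # False: the generator has advanced
```

This mirrors what happens with the global seed: the sequence of random numbers is reproducible, but each call moves one step further along it.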

## Running tensors on GPUs 
Deep learning algorithms require a lot of numerical operations (mainly matrix multiplications).

And by default these operations are often done on a CPU.

However, there's another common piece of hardware called a GPU (graphics processing unit), which is often much faster at performing the specific types of operations neural networks need than CPUs.

In [55]:
!nvidia-smi


Tue Jan 31 00:02:07 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   59C    P0    27W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### 1. Ways to get a GPU
1. Using Google Colab.
2. Buying a GPU.
3. Renting one from a cloud provider (AWS, Azure or GCP).

### 2. Running PyTorch on GPU

In [56]:
# Setting up device-agnostic code: it will use the GPU if available. Writing device-agnostic code is best practice.
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [57]:
# Count number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU

You can put tensors (and models, we'll see this later) on a specific device by calling `to(device)` on them.

In [59]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# The tensor is not on the GPU; it is on the CPU by default
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

# Moving the tensor back to the CPU

# The next line would raise an error if the tensor is on the GPU, because NumPy arrays must live on the CPU.
# tensor_on_gpu.numpy() 


# Instead, copy the tensor back to the CPU so we can convert it to a NumPy array.
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

tensor([1, 2, 3]) cpu


array([1, 2, 3])
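Putting the pieces together, here is a minimal device-agnostic sketch that runs unchanged with or without a GPU. Tensors can be created directly on the target device with the `device` argument or moved there with `.to(device)`. The shapes and names are illustrative.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Tensors can be created directly on the target device...
a = torch.rand(2, 3, device=device)
# ...or moved there afterwards
b = torch.ones(3, 2).to(device)

# Both operands live on the same device, so the operation is valid
product = a @ b
print(product.device)

# .cpu() is a no-op for tensors already on the CPU, so this works either way
as_numpy = product.cpu().numpy()
print(type(as_numpy), as_numpy.shape)
```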