<a href="https://colab.research.google.com/github/Gus-Victrix/starting_data/blob/main/00_PyTorch_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 00. PyTorch Fundamentals

In [1]:
import torch
print(torch.__version__)

2.1.0+cu121


In [2]:
!nvidia-smi


Wed Feb 14 23:39:40 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   41C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Introduction to Tensors

### Creating tensors
Tensors are created with `torch.tensor()`.

In [3]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [4]:
# rank
scalar.ndim

0

In [5]:
# getting value from scalar tensor
scalar.item()

7

In [6]:
# vector
vector = torch.tensor([4, 8])
vector

tensor([4, 8])

In [7]:
vector.ndim # number of dimensions (count the opening square brackets)

1

In [8]:
# Shape
vector.shape

torch.Size([2])

In [9]:
# matrix
matrix = torch.tensor([[ 4, 4],
                       [ 4, 3]])
matrix

tensor([[4, 4],
        [4, 3]])

In [10]:
print(matrix.shape)
print(matrix.ndim)
print(matrix)

torch.Size([2, 2])
2
tensor([[4, 4],
        [4, 3]])


In [11]:
# tensor
tensor = torch.tensor([[[1, 2, 3],
                        [3, 5, 3],
                        [2, 3, 4]]])

In [12]:
print(tensor)
print(tensor.shape)
print(tensor.ndim)

tensor([[[1, 2, 3],
         [3, 5, 3],
         [2, 3, 4]]])
torch.Size([1, 3, 3])
3


### Random tensors

Why random tensors?

> Many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Take random numbers -> look at data -> update random numbers -> repeat step 2 until random numbers closely match data`
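
The loop above can be sketched in a few lines. This is only an illustration under assumptions: the toy data (`y = 3 * x`), the learning rate and the step count are made up for the example, and the update rule is plain gradient descent written by hand rather than how a real network is trained in PyTorch.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# Toy data (an assumption for this sketch): y is always 3 times x
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

# 1. Take a random number as the starting guess for the weight
w = torch.rand(1)

learning_rate = 0.01  # hypothetical value chosen for this toy data
for _ in range(200):
    # 2. Look at the data: how far off are the current predictions?
    error = w * x - y
    # 3. Update the number in the direction that shrinks the squared error
    gradient = 2 * (error * x).mean()
    w = w - learning_rate * gradient
    # 4. Repeat

print(w)  # ends up very close to 3
```

The weight starts as a random value between 0 and 1 and is nudged towards 3 on every pass, which is the "update random numbers until they match the data" idea in miniature.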

In [13]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.1697, 0.6018, 0.8632, 0.6173],
        [0.7566, 0.1024, 0.0160, 0.3017],
        [0.0896, 0.6034, 0.0493, 0.8302]])

In [14]:
# Create a random tensor similar to image
random_image_tensor = torch.rand(size=(224, 224, 3)) # height, width, color channels
random_image_tensor.shape, random_image_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeroes and ones

In [15]:
# tensor with all zeroes
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [16]:
# tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [17]:
# data-type
ones.dtype

torch.float32

### Creating a range of values and tensors like other tensors

In [18]:
# torch.range is deprecated (its end value is inclusive); prefer torch.arange
one_to_ten = torch.range(1, 10)
one_to_ten


tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [19]:
# Torch.arange
one_to_nine = torch.arange(1, 10)
one_to_nine

tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [20]:
# Creating tensors with same shape as another
ten_zeros = torch.zeros_like(one_to_ten)
ten_zeros

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

### Tensor datatypes

**Note:** A wrong tensor datatype is one of the 3 main sources of errors in deep learning:

1. Tensors are not the right datatype
2. Tensors are not the right shape
3. Tensors are not on the right device
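
A minimal sketch of the first kind of error, assuming a recent PyTorch release: element-wise arithmetic promotes mixed dtypes automatically, while `torch.matmul` is expected to reject an integer tensor paired with a float tensor. The variable names are made up for the example, and the `try`/`except` keeps the block runnable whether or not the error is raised.

```python
import torch

ints = torch.tensor([[1, 2], [3, 4]])   # dtype inferred as torch.int64
floats = torch.rand(2, 2)               # dtype defaults to torch.float32

# Element-wise arithmetic promotes the integer tensor to float automatically
print((ints + floats).dtype)

# Matrix multiplication is stricter about mixed dtypes
try:
    result = torch.matmul(ints, floats)
except RuntimeError as err:
    print(f"dtype mismatch: {err}")
    # Casting one operand so that both dtypes agree resolves the error
    result = torch.matmul(ints.to(torch.float32), floats)

print(result.dtype)
```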

In [21]:
# Float 32 tensor ** default data-type **
float_32 = torch.tensor([2, 3 ,4.], dtype=None)
float_32.dtype

torch.float32

In [22]:
# Float 64 tensor
float_64 = torch.tensor([2. ,3 ,4],  # Data used to create the tensor
                        dtype=torch.float64,  # Precision
                        device=None,  # The device where the tensor is located
                        requires_grad=False)  # Whether or not to track gradient
float_64.dtype

torch.float64

In [23]:
# torch.float16 tensor, conversion from float64
float_16 = float_64.type(torch.float16)
float_16.dtype

torch.float16

### Getting information from tensors

1. Tensor shape :`tensor.shape`
2. Tensor data type: `tensor.dtype`
3. Tensor device: `tensor.device`

In [24]:
# Create random tensor
rand = torch.rand(3, 2, 3)

# Obtain information from the random tensor
print(f"Tensor shape: {rand.shape}")
print(f"Tensor datatype: {rand.dtype}")
print(f"Tensor's device: {rand.device}")

Tensor shape: torch.Size([3, 2, 3])
Tensor datatype: torch.float32
Tensor's device: cpu


### Manipulating tensors (tensor operations)

These include:
- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication

In [25]:
# Create a tensor
tensor = torch.tensor([1, 2, 3])
# add ten
tensor + 10

tensor([11, 12, 13])

In [26]:
# multiply by ten
tensor * 10

tensor([10, 20, 30])

In [27]:
# divide by ten
tensor / 10

tensor([0.1000, 0.2000, 0.3000])

In [28]:
# subtract ten
tensor - 10

tensor([-9, -8, -7])

In [29]:
# Built-in PyTorch functions for the same operations
print(torch.mul(tensor, 10))
print(torch.add(tensor, 10))
print(torch.sub(tensor, 10))
print(torch.div(tensor, 10))

tensor([10, 20, 30])
tensor([11, 12, 13])
tensor([-9, -8, -7])
tensor([0.1000, 0.2000, 0.3000])
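
The examples above combine a tensor with a single number. As a short sketch of the difference between element-wise multiplication and matrix multiplication, here are two small tensors (`a` and `b` are names chosen for this example) multiplied both ways.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# Element-wise: each position is multiplied independently
print(a * b)               # tensor([ 4, 10, 18])

# Matrix multiplication of two 1-D tensors is the dot product: 1*4 + 2*5 + 3*6
print(torch.matmul(a, b))  # tensor(32)
print(a @ b)               # the @ operator is shorthand for torch.matmul
```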


#### Matrix multiplication


In [30]:
tensor = torch.rand(size=[2,2,2])

In [31]:
%%time
tensor.matmul(tensor)

CPU times: user 1.23 ms, sys: 48 µs, total: 1.28 ms
Wall time: 4.33 ms


tensor([[[0.3955, 0.5437],
         [0.8078, 1.1426]],

        [[0.3963, 0.5553],
         [0.3170, 0.5043]]])

In [32]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 73 µs, sys: 0 ns, total: 73 µs
Wall time: 77 µs


tensor([[[0.3955, 0.5437],
         [0.8078, 1.1426]],

        [[0.3963, 0.5553],
         [0.3170, 0.5043]]])

1. Inner dimensions must match
- `(3, 2) @ (3, 2)` won't work
- `(2, 3) @ (3, 2)` will work
2. Outer dimensions determine final shape
- `(3, 2) @ (2, 3)` produces `(3, 3)`

In [33]:
torch.rand(2,3).matmul(torch.rand(3, 2))

tensor([[0.2733, 0.7964],
        [0.2597, 1.0040]])

In [34]:
torch.rand(3,2).matmul(torch.rand(2, 3))

tensor([[0.1378, 0.3949, 0.2319],
        [0.6781, 0.5311, 0.4973],
        [0.0757, 0.2138, 0.1260]])
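
The first rule can also be seen failing. In this sketch (tensor names are made up for the example), two `(3, 2)` tensors cannot be multiplied directly, and transposing the second one makes the inner dimensions match.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(3, 2)

# (3, 2) @ (3, 2): the inner dimensions are 2 and 3, so this fails
try:
    torch.matmul(a, b)
except RuntimeError as err:
    print(f"shape mismatch: {err}")

# Transposing b gives (3, 2) @ (2, 3): the inner dimensions are both 2
product = torch.matmul(a, b.T)
print(product.shape)  # torch.Size([3, 3])
```

Transposing only fixes the shapes; whether it is the right fix depends on what the rows and columns of the data mean.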

## Reshaping, stacking, squeezing and unsqueezing tensors

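- Reshaping - changes a tensor to a specified shape that holds the same number of elements
- View - returns the same data under a different shape; a view shares memory with the original tensor
- Stacking - combines multiple tensors along a new dimension (`torch.stack`), on top of one another (`torch.vstack`) or side by side (`torch.hstack`)
- Squeeze - removes all dimensions of size `1` from a tensor
- Unsqueeze - adds a dimension of size `1` at a given position in the target tensor
- Permute - returns a view with the dimensions rearranged in a given order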

In [35]:
# Creating a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [36]:
# Add an extra dimension
x.reshape(1, 9), x.reshape(9, 1)

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]))

In [37]:
# Changing the view
z = x.view(9,1)
z, z.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [38]:
# Changing z also changes x because a view shares memory with the original tensor
z[0][0] = 10
z, x

(tensor([[10.],
         [ 2.],
         [ 3.],
         [ 4.],
         [ 5.],
         [ 6.],
         [ 7.],
         [ 8.],
         [ 9.]]),
 tensor([10.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]))

In [39]:
# Stack tensors along a new dimension (dim=1 makes each copy of x a column; dim=0 would make each copy a row)
x_stacked = torch.stack([x,x,x,x,x,x], dim=1)
x_stacked

tensor([[10., 10., 10., 10., 10., 10.],
        [ 2.,  2.,  2.,  2.,  2.,  2.],
        [ 3.,  3.,  3.,  3.,  3.,  3.],
        [ 4.,  4.,  4.,  4.,  4.,  4.],
        [ 5.,  5.,  5.,  5.,  5.,  5.],
        [ 6.,  6.,  6.,  6.,  6.,  6.],
        [ 7.,  7.,  7.,  7.,  7.,  7.],
        [ 8.,  8.,  8.,  8.,  8.,  8.],
        [ 9.,  9.,  9.,  9.,  9.,  9.]])
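
For comparison, a brief sketch of how the `dim` argument and the `vstack`/`hstack` shortcuts change the resulting shape. It uses a fresh `x` so that it runs on its own; the other variable names are chosen for the example.

```python
import torch

x = torch.arange(1., 10.)                     # shape: (9,)

stacked_rows = torch.stack([x, x, x], dim=0)  # new leading dimension  -> (3, 9)
stacked_cols = torch.stack([x, x, x], dim=1)  # new trailing dimension -> (9, 3)
v = torch.vstack([x, x, x])                   # rows on top of one another -> (3, 9)
h = torch.hstack([x, x, x])                   # 1-D tensors joined end to end -> (27,)

print(stacked_rows.shape, stacked_cols.shape, v.shape, h.shape)
```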

In [40]:
# Squeeze
x_stacked.squeeze(), x.squeeze(), z.squeeze()


(tensor([[10., 10., 10., 10., 10., 10.],
         [ 2.,  2.,  2.,  2.,  2.,  2.],
         [ 3.,  3.,  3.,  3.,  3.,  3.],
         [ 4.,  4.,  4.,  4.,  4.,  4.],
         [ 5.,  5.,  5.,  5.,  5.,  5.],
         [ 6.,  6.,  6.,  6.,  6.,  6.],
         [ 7.,  7.,  7.,  7.,  7.,  7.],
         [ 8.,  8.,  8.,  8.,  8.,  8.],
         [ 9.,  9.,  9.,  9.,  9.,  9.]]),
 tensor([10.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]),
 tensor([10.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]))

In [41]:
#unsqueeze
x.unsqueeze(dim=1)

tensor([[10.],
        [ 2.],
        [ 3.],
        [ 4.],
        [ 5.],
        [ 6.],
        [ 7.],
        [ 8.],
        [ 9.]])

In [42]:
# Permute
print(x.unsqueeze(dim=1).shape)
print(x.unsqueeze(dim=1).permute(1,0))

torch.Size([9, 1])
tensor([[10.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]])
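
The tensors above have few size-1 dimensions, so squeezing changes little. The sketch below uses shapes chosen for illustration (a `(1, 9, 1)` tensor and an image-like `(224, 224, 3)` tensor) to make the effect of each operation on the shape more visible.

```python
import torch

t = torch.zeros(1, 9, 1)
print(t.squeeze().shape)                   # all size-1 dims removed -> torch.Size([9])
print(t.squeeze(dim=0).shape)              # only dim 0 removed -> torch.Size([9, 1])
print(t.squeeze().unsqueeze(dim=0).shape)  # size-1 dim added in front -> torch.Size([1, 9])

image = torch.rand(224, 224, 3)            # height, width, color channels
channels_first = image.permute(2, 0, 1)    # color channels, height, width
print(channels_first.shape)                # torch.Size([3, 224, 224])

# permute returns a view: writing through it changes the original tensor
channels_first[0, 0, 0] = 2.0
print(image[0, 0, 0])                      # tensor(2.)
```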


## Indexing (selecting data from tensors)

Indexing a tensor follows the same rules as NumPy indexing: one index or slice per dimension, separated by commas.

In [43]:
print(tensor, end="\n\n")
tensor[:, 0, 0]

tensor([[[0.3828, 0.4093],
         [0.6082, 0.9454]],

        [[0.5008, 0.5049],
         [0.2882, 0.5990]]])



tensor([0.3828, 0.5008])
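
Random values make it hard to see which element was picked. A small sketch with known values (the tensor `x` is made up for this example) shows what each index selects.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)
# tensor([[[1, 2, 3],
#          [4, 5, 6],
#          [7, 8, 9]]])

print(x[0])        # the first (and only) 3x3 block
print(x[0, 1])     # second row of that block -> tensor([4, 5, 6])
print(x[0, 1, 2])  # third element of that row -> tensor(6)
print(x[:, :, 1])  # middle column -> tensor([[2, 5, 8]])
print(x[0, 0, :])  # first row -> tensor([1, 2, 3])
```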

## PyTorch tensor & NumPy
You can convert in both directions:
- NumPy ndarray to PyTorch tensor: `torch.from_numpy(ndarray)`
- PyTorch tensor to NumPy ndarray: `torch.Tensor.numpy()`

In [44]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [45]:
# torch has default data type as float32 while numpy has float64 as default
np.arange(1., 3.).dtype, torch.arange(1., 3.).dtype

(dtype('float64'), torch.float32)
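
Two details are easy to miss here: a tensor made with `torch.from_numpy()` shares memory with the source array, and it keeps NumPy's `float64` unless it is converted. The sketch below illustrates both (`tensor32` is a name chosen for the example).

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)  # float64, shares memory with `array`

array += 1                        # an in-place change to the array ...
print(tensor)                     # ... shows up in the tensor too

# Converting the dtype produces a separate float32 copy
tensor32 = torch.from_numpy(array).type(torch.float32)
array += 1
print(tensor32)                   # unchanged by the second in-place update
print(tensor32.dtype)             # torch.float32
```

Note that `array = array + 1` would behave differently: it creates a new array and leaves the tensor untouched.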

In [46]:
# Tensor to NumPy array
tensor = torch.ones(7,3)
numpy_tensor = tensor.numpy()  # The array keeps the tensor's dtype (torch's default float32)
tensor, numpy_tensor

(tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]]),
 array([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=float32))

## Reproducibility (trying to take random out of random)

Sometimes every run of an experiment needs to start from the same point, for example so that results can be compared or shared.

To reduce the randomness in neural networks, PyTorch lets you set a `random seed`.

The seed flavors the randomness: the numbers still look random, but every run that uses the same seed produces the same sequence.

In [47]:
# Demonstrating that unseeded random generators are not reproducible
torch.rand(3, 3), torch.rand(3, 3), torch.rand(3, 3)

(tensor([[0.2114, 0.5945, 0.7739],
         [0.8348, 0.0742, 0.0235],
         [0.5365, 0.9832, 0.1946]]),
 tensor([[0.9052, 0.8732, 0.9854],
         [0.9938, 0.2059, 0.7892],
         [0.3107, 0.9796, 0.7936]]),
 tensor([[0.2605, 0.5638, 0.7787],
         [0.4882, 0.5510, 0.1286],
         [0.6481, 0.2491, 0.2851]]))

In [48]:
# Making randomness reproducible
import torch
# Set random seed
RANDOM_SEED = 42
# seeding first tensor
torch.manual_seed(RANDOM_SEED)
tensor1 = torch.rand(3, 4)
# seeding second tensor
torch.manual_seed(RANDOM_SEED)
tensor2 = torch.rand(3,4)
# comparing the generated values
tensor2 == tensor1

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
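
The seed has to be reset before each call that should match, because every call to `torch.rand` advances the random stream. The sketch below shows this, along with `torch.Generator` as one way to keep a seed local rather than global; the variable names are chosen for the example.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)          # continues the same random stream
print(torch.equal(first, second))  # False

torch.manual_seed(42)              # reset the stream to the same starting point
repeat = torch.rand(3, 4)
print(torch.equal(first, repeat))  # True

# A dedicated generator carries its own seed instead of using the global one
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
from_a = torch.rand(3, 4, generator=gen_a)
from_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(from_a, from_b))  # True
```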

## Running tensors and PyTorch objects on GPUs

A GPU typically gives faster computation on numbers.

### Getting a GPU

1. Free on Colab
2. Buying
3. Cloud computing

In [49]:
!nvidia-smi

Wed Feb 14 23:39:41 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   41C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

### Checking for GPU access with PyTorch

In [50]:
import torch
torch.cuda.is_available()

True

In [51]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)

cuda


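### Putting tensors (and models) on the GPU
The main reason is the faster computation a GPU provides.

For TPUs and other XLA devices, the same idea applies, assuming the separate `torch_xla` package is installed and imported:

import torch

# Assumes torch_xla has been installed and imported so that the "xla" device exists
t = torch.randn(2, 2, device="xla")
print(t.device)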

In [52]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [53]:
# Move tensor to GPU (if available)
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_gpu = tensor.to(device)
tensor_on_gpu.device

device(type='cuda', index=0)
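
Models are moved the same way as tensors. As a small sketch, assuming a single `torch.nn.Linear` layer stands in for a model (this notebook has not built one), the model and its inputs both go to `device` so that the computation runs in one place.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in model for illustration: one linear layer with 3 inputs and 1 output
model = nn.Linear(in_features=3, out_features=1).to(device)
inputs = torch.tensor([[1.0, 2.0, 3.0]]).to(device)

outputs = model(inputs)  # model and inputs live on the same device
print(outputs.device)
```

If the model and its inputs sit on different devices, PyTorch raises an error, which is the third of the common errors listed earlier.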

In [54]:
# A tensor on the GPU cannot be converted to NumPy directly,
# so first move it back to the CPU
tensor_on_cpu = tensor_on_gpu.to("cpu")
print(tensor_on_cpu.device)
print(tensor_on_cpu.numpy())


cpu
[1 2 3]


In [55]:
# Converting to NumPy (via the CPU) and back to a GPU tensor in one line
tensor_on_gpu = torch.from_numpy(tensor_on_gpu.to("cpu").numpy()).to("cuda")
tensor_on_gpu.device

device(type='cuda', index=0)
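
The CPU round trip can be wrapped in a small helper so the same code works on either device. `to_numpy` is a hypothetical name introduced for this sketch, not a PyTorch function; `detach()` is included as a precaution for tensors that track gradients.

```python
import torch

def to_numpy(t):
    """Hypothetical helper: return a NumPy array for `t` whatever device it is on."""
    return t.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_device = torch.tensor([1, 2, 3], device=device)

array = to_numpy(tensor_on_device)
print(array)  # [1 2 3]
```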