# Introduction to Deep Learning with PyTorch

In this notebook, you'll be introduced to [PyTorch](http://pytorch.org/), a framework for building and training neural networks. In many ways, PyTorch tensors behave like the arrays you know from NumPy; NumPy arrays are, after all, just tensors. PyTorch makes it simple to move these tensors to GPUs for the faster processing needed when training neural networks. It also provides a module that automatically calculates gradients (for backpropagation!) and another module specifically for building neural networks. Altogether, PyTorch ends up being more coherent with Python and the NumPy/SciPy stack than TensorFlow and other frameworks.

### Install PyTorch

Getting PyTorch up and running is straightforward. Pick the install command that matches your system configuration on the [Start Locally](https://pytorch.org/get-started/locally/) page:

<img src="assets\Capture.JPG" alt="Example Image" width="900px">

In [28]:
# For CPU-only
!pip install torch torchvision

# For GPU support (needs an NVIDIA GPU with a recent driver; the pip packages bring their own CUDA libraries)
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

In [29]:
# Check your CUDA driver and device.
!nvidia-smi

Sat Sep 14 19:23:56 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.85                 Driver Version: 555.85         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1650      WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   46C    P8              2W /   35W |    1611MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

### Let's start by importing PyTorch and checking the version we're using.

In [30]:
import torch
torch.__version__

'2.4.0+cu124'

## Understanding Tensors

It turns out neural network computations are just a bunch of linear algebra operations on *tensors*, a generalization of matrices. A vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and an array with three indices is a 3-dimensional tensor (an RGB color image, for example). The tensor is the fundamental data structure for neural networks, and PyTorch (like pretty much every other deep learning framework) is built around it.

<img src="assets\tensor_examples.svg" width=600px>

### Initializing a Tensor

Tensors can be initialized in various ways. Take a look at the following examples:

**Directly from data**

Tensors can be created directly from data. The data type is automatically inferred.

In [31]:
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
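
To see the automatic type inference in action, the short sketch below builds tensors from a few Python lists and prints the resulting `dtype`. The variable names are illustrative, and the results assume PyTorch's default settings (no earlier call to `torch.set_default_dtype`).

```python
import torch

# Integer data is inferred as 64-bit integers
int_tensor = torch.tensor([[1, 2], [3, 4]])
print(int_tensor.dtype)    # torch.int64

# Floating-point data is inferred as the default float type (float32)
float_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(float_tensor.dtype)  # torch.float32

# Boolean data stays boolean
bool_tensor = torch.tensor([True, False])
print(bool_tensor.dtype)   # torch.bool

# The inferred type can be overridden with the dtype argument
explicit = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)
print(explicit.dtype)      # torch.float64
```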

**From a NumPy array**

Tensors can be created from NumPy arrays using `torch.from_numpy` (and vice versa, see `torch.Tensor.numpy`).

In [32]:
import numpy as np

np_array = np.array(data)
x_np = torch.from_numpy(np_array)

**From another tensor:**

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

In [33]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.7710, 0.3980],
        [0.8417, 0.7372]]) 



## Types of Tensors (from a dimensionality perspective)

<img src='assets\Tensor_types.jpeg' width=600px>


### Rank 0 Tensor **(scalar)**

A scalar is a single number; in tensor-speak, it's a zero-dimensional tensor.

In [34]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

What if we wanted to retrieve the number from the tensor? To do so, we can use the `item()` method.

In [35]:
# Get the Python number within a tensor (only works with one-element tensors)
scalar.item()

7
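
As a small illustration of the "one-element tensors only" rule, the sketch below calls `item()` on a one-element tensor that is not a scalar and then on a two-element tensor. The variable names are made up for this example. The exact exception type raised in the failing case has varied between PyTorch versions, so the sketch catches both `RuntimeError` and `ValueError` as a precaution.

```python
import torch

# item() works for any tensor that holds exactly one element, whatever its shape
single = torch.tensor([[3.5]])
print(single.item())  # 3.5

# With more than one element, item() raises an error
pair = torch.tensor([1, 2])
try:
    pair.item()
    failed = False
except (RuntimeError, ValueError) as err:
    failed = True
    print("item() failed:", err)
```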

We can check the dimensions of a tensor using the `ndim` attribute.

In [36]:
scalar.ndim

0

### Rank 1 Tensor **(Vector)**

A vector is a one-dimensional tensor, but it can contain many numbers.

In [37]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

How many dimensions do you think it'll have?

In [38]:
# Check the number of dimensions of vector
vector.ndim

1

As we see, `vector` contains two numbers but only has a single dimension.

You can tell how many dimensions a PyTorch tensor has by counting the square brackets (`[`) on the outside; you only need to count one side.

### **Dimension vs. Shape**

Another important concept for tensors is their `shape` attribute. The shape tells you how many elements there are along each dimension. For a vector, the shape holds a single number: the length of the vector.

In [39]:
# Check shape of vector
vector.shape

torch.Size([2])

The above returns `torch.Size([2])` which means our vector has a shape of `[2]`. This is because of the two elements we placed inside the square brackets `([7, 7])`.

### Rank 2 Tensor **(Matrix)**

A matrix is a two-dimensional tensor: its numbers are arranged in rows and columns.

In [40]:
# Matrix
MATRIX = torch.tensor([[7, 8], 
                        [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [41]:
# Check number of dimensions
MATRIX.ndim

2

`MATRIX` has two dimensions (did you count the number of square brackets on the outside of one side?).

What `shape` do you think it will have?

In [42]:
MATRIX.shape

torch.Size([2, 2])
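
The relationship between `ndim` and `shape` can be summarized in a few lines. The sketch below recreates the `vector` and `MATRIX` tensors from above and prints both attributes side by side, along with the total element count from `numel()`.

```python
import torch

vector = torch.tensor([7, 7])
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])

for name, t in [("vector", vector), ("MATRIX", MATRIX)]:
    # ndim is the number of dimensions; shape lists the size of each one
    print(name, "ndim:", t.ndim, "shape:", tuple(t.shape), "elements:", t.numel())

# ndim always equals the length of shape
print(vector.ndim == len(vector.shape))  # True
print(MATRIX.ndim == len(MATRIX.shape))  # True
```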

### Rank `N` Tensor


In [43]:
# Tensor
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

How many dimensions do you think it has? (hint: use the square bracket counting trick)

In [44]:
# Check number of dimensions for TENSOR
TENSOR.ndim

3

And what about its shape?

In [45]:
# Check shape of TENSOR
TENSOR.shape

torch.Size([1, 3, 3])

Alright, it outputs `torch.Size([1, 3, 3])`.

The dimensions go outer to inner.

That means the outermost dimension has size 1 and holds a single 3 by 3 matrix.
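
One way to check the outer-to-inner reading of the shape is to index into the tensor one level at a time. The sketch below recreates `TENSOR` and peels off one dimension per step.

```python
import torch

TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])

# Indexing the outermost dimension (size 1) yields the 3 by 3 matrix
print(TENSOR[0].shape)     # torch.Size([3, 3])

# Indexing one level deeper yields a row of 3 elements
print(TENSOR[0][1])        # tensor([3, 6, 9])

# Indexing all three dimensions yields a single element
print(TENSOR[0][1][2])     # tensor(9)
```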

<img src="assets/PyTorch-different-tensor-dimensions.png" alt="Example Image" width="800"/>

### Attributes of a Tensor

Tensor attributes describe their shape, datatype, and the device on which they are stored.

In [46]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


As we will see later, we can perform many operations on tensors. Each of these operations can be run on the GPU (typically at higher speeds than on a CPU). If you're using Colab, allocate a GPU by going to `Runtime > Change runtime type > GPU`.

By default, tensors are created on the CPU. We need to explicitly move tensors to the GPU using the `.to` method (after checking for GPU availability). Keep in mind that copying large tensors across devices can be expensive in terms of time and memory!

In [52]:
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to("cuda")
    
print(f"Device tensor is stored on: {tensor.device}")

Device tensor is stored on: cuda:0
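
A common way to keep code runnable on machines with and without a GPU is to pick the device once and reuse it. The following is a minimal sketch of that pattern; the names `device`, `x`, `y`, and `z` are illustrative and not part of the notebook above.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Create a tensor directly on the chosen device (avoids a later copy)
x = torch.rand(3, 4, device=device)
y = torch.ones(3, 4).to(device)  # or move an existing CPU tensor

# Both operands live on the same device, so the addition works
z = x + y
print(z.device)

# Move the result back to the CPU, e.g. before converting to NumPy
z_cpu = z.cpu()
print(z_cpu.device)  # cpu
```

Creating tensors directly on the target device avoids the cross-device copy mentioned above.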


### Operations on Tensors

Over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), sampling and more are comprehensively described [here](https://pytorch.org/docs/stable/torch.html).

Let's dive into tensor operations:
- **Arithmetic Operations**: Addition, subtraction, multiplication, division.

In [48]:
# Create tensors
a = torch.tensor([2.0, 4.0, 6.0])
b = torch.tensor([1.0, 3.0, 5.0])

In [49]:
# Arithmetic Operations
print("Addition:")
print(a + b)

print("\nSubtraction:")
print(a - b)

print("\nElement-wise Multiplication:")
print(a * b)

print("\nElement-wise Division:")
print(a / b)

Addition:
tensor([ 3.,  7., 11.])

Subtraction:
tensor([1., 1., 1.])

Element-wise Multiplication:
tensor([ 2., 12., 30.])

Element-wise Division:
tensor([2.0000, 1.3333, 1.2000])


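- **In-place Operations**: Operations that modify a tensor directly instead of allocating a new tensor for the result. In PyTorch, in-place methods end with an underscore (for example, `mul_`).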

In [50]:
# In-place Operations
print("\nIn-place Multiplication (a *= 2):")
a.mul_(2)
print(a)


In-place Multiplication (a *= 2):
tensor([ 4.,  8., 12.])
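
To make the difference between in-place and out-of-place operations visible, the sketch below compares memory addresses with `data_ptr()`, which returns the address of a tensor's first element. The variable names are illustrative.

```python
import torch

a = torch.tensor([2.0, 4.0, 6.0])
address_before = a.data_ptr()  # memory address of the first element

# Out-of-place: the result is a new tensor and `a` is unchanged
doubled = a * 2
print(a)                                     # tensor([2., 4., 6.])
print(doubled.data_ptr() == address_before)  # False

# In-place: `a` itself is modified and keeps the same memory
a.mul_(2)
print(a)                                     # tensor([ 4.,  8., 12.])
print(a.data_ptr() == address_before)        # True
```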


- **Reduction Operations**: Summing, averaging, or finding the maximum of tensor elements.

In [54]:
# Reduction Operations
c = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

print("\nSum of elements in c:")
print(c.sum())

print("\nMean of elements in c:")
print(c.mean())  # mean() needs a floating-point tensor; c is already float

print("\nMaximum value in c:")
print(c.max())


Sum of elements in c:
tensor(10.)

Mean of elements in c:
tensor(2.5000)

Maximum value in c:
tensor(4.)
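
Reductions can also be applied along a single dimension instead of over the whole tensor. The sketch below reuses the values of `c` and shows the `dim` argument, plus the conversion needed before taking the mean of an integer tensor (`counts` is a made-up example).

```python
import torch

c = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# dim=0 collapses the rows, giving one value per column
print(c.sum(dim=0))   # tensor([4., 6.])

# dim=1 collapses the columns, giving one value per row
print(c.sum(dim=1))   # tensor([3., 7.])

# max along a dimension returns both the values and their indices
values, indices = c.max(dim=1)
print(values)         # tensor([2., 4.])
print(indices)        # tensor([1, 1])

# mean() is not defined for integer tensors, so convert to float first
counts = torch.tensor([1, 2, 3, 4])
print(counts.float().mean())  # tensor(2.5000)
```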


- **Matrix Multiplication**: Performing dot products and matrix operations.

In [56]:
# Matrix Multiplication
d = torch.tensor([[5.0, 6.0], [7.0, 8.0]])
print("\nMatrix Multiplication of c and d:")
print(torch.matmul(c, d))  # or torch.mm(c, d)


Matrix Multiplication of c and d:
tensor([[19., 22.],
        [43., 50.]])
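
Matrix multiplication is easy to confuse with element-wise multiplication, so the sketch below puts the two side by side and shows the shape rule for non-square matrices. `left` and `right` are illustrative names.

```python
import torch

c = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
d = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

# The @ operator is shorthand for torch.matmul
print(torch.equal(c @ d, torch.matmul(c, d)))  # True

# Element-wise multiplication is a different operation
elementwise = c * d
print(elementwise)  # values: [[5, 12], [21, 32]]

# Inner dimensions must match: (2, 3) @ (3, 4) -> (2, 4)
left = torch.rand(2, 3)
right = torch.rand(3, 4)
print((left @ right).shape)  # torch.Size([2, 4])
```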


### Indexing and Slicing

Indexing and slicing in PyTorch allow you to access and manipulate specific elements, rows, columns, or slices of tensors. This is essential for data preprocessing and manipulation in deep learning models.

Examples:

- **Basic Indexing**: Accessing individual elements.

In [57]:
# Create a 2D tensor
tensor = torch.tensor([[10, 20, 30],
                       [40, 50, 60],
                       [70, 80, 90]])
print("Original Tensor:")
print(tensor)

Original Tensor:
tensor([[10, 20, 30],
        [40, 50, 60],
        [70, 80, 90]])


In [58]:
# Basic Indexing
print("\nElement at row 1, column 2:")
print(tensor[1, 2])


Element at row 1, column 2:
tensor(60)


- **Slicing**: Extracting sub-tensors.

In [59]:
# Slicing
print("\nFirst two rows and last two columns:")
print(tensor[:2, 1:])


First two rows and last two columns:
tensor([[20, 30],
        [50, 60]])


- **Integer Array Indexing**: Using tensors of indices.

In [60]:
# Integer Array Indexing
rows = torch.tensor([0, 2])
cols = torch.tensor([1, 0])
print("\nElements at positions (0,1) and (2,0):")
print(tensor[rows, cols])


Elements at positions (0,1) and (2,0):
tensor([20, 70])


- **Boolean Masking**: Filtering elements based on conditions.

In [61]:
# Boolean Masking
print("\nElements greater than or equal to 50:")
print(tensor[tensor >= 50])


Elements greater than or equal to 50:
tensor([50, 60, 70, 80, 90])
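
One detail worth knowing when manipulating tensors this way: basic slicing returns a view of the original data, while boolean masking returns a copy. The sketch below demonstrates both behaviors on the same 3 by 3 tensor; `first_row` and `large` are illustrative names.

```python
import torch

tensor = torch.tensor([[10, 20, 30],
                       [40, 50, 60],
                       [70, 80, 90]])

# A slice is a view: writing to it changes the original tensor
first_row = tensor[0, :]
first_row[0] = -1
print(tensor[0, 0])   # tensor(-1)

# Boolean masking returns a copy: writing to it leaves the original alone
large = tensor[tensor >= 50]
large[0] = 0
print(tensor[1, 1])   # tensor(50)

# A mask on the left-hand side assigns into the original tensor
tensor[tensor >= 50] = 0
print(tensor)
```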


### Conversion to Other Python Objects

Interoperability with other Python libraries is often necessary. PyTorch tensors can be converted to NumPy arrays, Python scalars, and lists.

Examples:
- **Tensor to NumPy Array**: Using `tensor.numpy()`.

In [62]:
# Tensor to NumPy array
tensor = torch.tensor([100, 200, 300])
array = tensor.numpy()
print("NumPy array:", array)

NumPy array: [100 200 300]


**Pay Attention**

The array and the tensor share the same underlying memory (for tensors on the CPU), so changing values in the array also changes the values of the tensor.

In [63]:
# Modifying the NumPy array affects the tensor
array[0] = 999
print("\nAfter modifying array[0] = 999")
print("NumPy array:", array)
print("Tensor:", tensor)


After modifying array[0] = 999
NumPy array: [999 200 300]
Tensor: tensor([999, 200, 300])


- **NumPy Array to Tensor**: Using `torch.from_numpy()`.

In [64]:
# NumPy array to Tensor
np_array = np.array([400, 500, 600])
tensor_from_array = torch.from_numpy(np_array)
print("\nTensor from NumPy array:", tensor_from_array)


Tensor from NumPy array: tensor([400, 500, 600], dtype=torch.int32)


- **Sharing Memory**: Understanding that they share underlying data.

In [65]:
# Modifying the tensor affects the NumPy array
tensor_from_array[1] = 555
print("\nAfter modifying tensor_from_array[1] = 555")
print("Tensor:", tensor_from_array)
print("NumPy array:", np_array)


After modifying tensor_from_array[1] = 555
Tensor: tensor([400, 555, 600], dtype=torch.int32)
NumPy array: [400 555 600]
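
When shared memory is not what you want, a copy can be made explicitly. The sketch below contrasts `torch.from_numpy` (shares memory) with `torch.tensor` (copies), and uses `clone()` before `numpy()` for the other direction. The names `shared`, `copied`, and `independent` are illustrative.

```python
import numpy as np
import torch

np_array = np.array([400, 500, 600])

shared = torch.from_numpy(np_array)  # shares memory with np_array
copied = torch.tensor(np_array)      # copies the data

np_array[0] = 0
print(shared)  # first element is now 0
print(copied)  # still starts with 400

# In the other direction, clone() before numpy() gives an independent array
tensor = torch.tensor([100, 200, 300])
independent = tensor.clone().numpy()
independent[0] = 999
print(tensor)  # tensor([100, 200, 300])
```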


- **Converting to Scalars and Lists**: Using `item()` and `tolist()`.

In [66]:
# Tensor to Python scalar
scalar_tensor = torch.tensor(42)
scalar_value = scalar_tensor.item()
print("\nPython scalar:", scalar_value)


Python scalar: 42


In [67]:
# Tensor to list
tensor = torch.tensor([[1, 2], [3, 4]]) # 2D Tensor
list_from_tensor = tensor.tolist()
print("\nList from tensor:", list_from_tensor)


List from tensor: [[1, 2], [3, 4]]


### Benchmarking NumPy Arrays vs PyTorch Tensors (CPU and GPU Performance)

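This demonstration compares the performance of matrix multiplication using the three setups below. Each timing also includes creating the random matrices, and NumPy generates float64 values while `torch.rand` defaults to float32, so treat the numbers as a rough comparison rather than a precise benchmark: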

1. NumPy (CPU): NumPy arrays are restricted to CPU computations.

2. PyTorch (CPU): PyTorch Tensors, similar to NumPy arrays, are used on the CPU.

3. PyTorch (GPU): PyTorch Tensors, when moved to the GPU, utilize hardware acceleration for faster computation.

In [68]:
import numpy as np
import torch
import time

# Define matrix sizes for the benchmark
N = 10000

# 1. NumPy (CPU)
start_time = time.time()
a_np = np.random.rand(N, N)
b_np = np.random.rand(N, N)
result_np = np.dot(a_np, b_np)
time_numpy_cpu = time.time() - start_time
print(f"Time taken by NumPy (CPU): {time_numpy_cpu:.4f} seconds")

# 2. PyTorch (CPU)
start_time = time.time()
a_torch_cpu = torch.rand(N, N)
b_torch_cpu = torch.rand(N, N)
result_torch_cpu = torch.mm(a_torch_cpu, b_torch_cpu)
time_torch_cpu = time.time() - start_time
print(f"Time taken by PyTorch (CPU): {time_torch_cpu:.4f} seconds")

# 3. PyTorch (GPU)
if torch.cuda.is_available():
    start_time = time.time()
    a_torch_gpu = torch.rand(N, N, device='cuda')
    b_torch_gpu = torch.rand(N, N, device='cuda')
    result_torch_gpu = torch.mm(a_torch_gpu, b_torch_gpu)
    torch.cuda.synchronize()  # Wait for all operations to finish on GPU
    time_torch_gpu = time.time() - start_time
    print(f"Time taken by PyTorch (GPU): {time_torch_gpu:.4f} seconds")
else:
    print("CUDA is not available.")

Time taken by NumPy (CPU): 19.4701 seconds
Time taken by PyTorch (CPU): 26.7916 seconds
Time taken by PyTorch (GPU): 1.6697 seconds
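
For a somewhat fairer comparison, the timing can be restricted to the multiplication itself, with the same `float32` inputs for every backend and a warm-up run before measuring. The following is only a sketch under those assumptions: the helper `time_it`, the smaller matrix size, and the number of repeats are choices made for this example, not part of the benchmark above, and no timings are quoted because they depend on the hardware.

```python
import time

import numpy as np
import torch

N = 1024  # assumption: a smaller size than above so the sketch runs quickly


def time_it(fn, repeats=5, sync=False):
    """Return the best wall-clock time of fn over several runs."""
    fn()  # warm-up run, not timed
    if sync:
        torch.cuda.synchronize()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync:
            torch.cuda.synchronize()  # wait for the GPU to finish
        best = min(best, time.perf_counter() - start)
    return best


# Create the inputs once, outside the timed region, all as float32
a_np = np.random.rand(N, N).astype(np.float32)
b_np = np.random.rand(N, N).astype(np.float32)
a_cpu = torch.from_numpy(a_np)
b_cpu = torch.from_numpy(b_np)

print(f"NumPy (CPU):   {time_it(lambda: a_np @ b_np):.4f} s")
print(f"PyTorch (CPU): {time_it(lambda: a_cpu @ b_cpu):.4f} s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a_cpu.to("cuda"), b_cpu.to("cuda")
    print(f"PyTorch (GPU): {time_it(lambda: a_gpu @ b_gpu, sync=True):.4f} s")
else:
    print("CUDA is not available.")
```

Taking the best of several runs reduces the influence of one-off costs such as library initialization and caching.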


### Reshaping Tensors in PyTorch

Here is a sneak peek at what's ahead: in deep learning, especially in image classification models, it's common to reshape tensors while preserving their contents and number of elements. This is often necessary when transitioning from a convolutional layer to a linear (fully connected) layer.

So let's learn how to reshape a tensor using the `reshape()` method.

In [69]:
# Generate a 3D tensor with random values of shape (6, 20, 20)
output3d = torch.rand(6, 20, 20)
print(output3d.shape)  # Print the shape of the 3D tensor

# Reshape the 3D tensor into a 1D tensor
input1d = output3d.reshape(6 * 20 * 20)
print(input1d.shape)  # Print the shape of the 1D tensor

torch.Size([6, 20, 20])
torch.Size([2400])


In [70]:
# Create a tensor
import torch
x = torch.arange(1., 8.)
x, x.shape

# Add an extra dimension
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))
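
In the convolutional-to-linear case mentioned above, there is usually a batch dimension that should be kept while everything else is flattened. The sketch below assumes a hypothetical batch of 8 feature maps and shows two ways to do this, along with what happens when the element count does not match.

```python
import torch

# Hypothetical batch of 8 feature maps, each of shape (6, 20, 20)
batch = torch.rand(8, 6, 20, 20)

# -1 asks PyTorch to infer that dimension from the number of elements
flat = batch.reshape(8, -1)
print(flat.shape)                    # torch.Size([8, 2400])

# flatten(start_dim=1) does the same while keeping the batch dimension
also_flat = batch.flatten(start_dim=1)
print(also_flat.shape)               # torch.Size([8, 2400])
print(torch.equal(flat, also_flat))  # True

# The total number of elements must stay the same
try:
    batch.reshape(8, 1000)
    reshape_failed = False
except RuntimeError as err:
    reshape_failed = True
    print("reshape failed:", err)
```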

### Tensor Stacking

If we wanted to stack `x` on top of itself four times, we could do so with `torch.stack()`.

In [71]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim=0) # try changing dim to dim=1 and see what happens
x_stacked

tensor([[1., 2., 3., 4., 5., 6., 7.],
        [1., 2., 3., 4., 5., 6., 7.],
        [1., 2., 3., 4., 5., 6., 7.],
        [1., 2., 3., 4., 5., 6., 7.]])
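
To show the effect of the `dim` argument without running the cell twice, the sketch below stacks along both dimensions and compares the result with `torch.cat`, which joins along an existing dimension instead of creating a new one. The names `rows`, `cols`, and `joined` are illustrative.

```python
import torch

x = torch.arange(1., 8.)  # shape (7,)

# stack adds a new dimension at the position given by dim
rows = torch.stack([x, x, x, x], dim=0)
cols = torch.stack([x, x, x, x], dim=1)
print(rows.shape)    # torch.Size([4, 7])
print(cols.shape)    # torch.Size([7, 4])

# cat joins along an existing dimension, so no new dimension appears
joined = torch.cat([x, x, x, x], dim=0)
print(joined.shape)  # torch.Size([28])
```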

### Tensor Squeezing
In PyTorch, tensor squeezing refers to the process of removing dimensions of size 1 from a tensor's shape. This is done using the `torch.squeeze()` function. It is useful when you want to eliminate redundant dimensions that do not add value to the tensor's data.

For example, if you have a tensor with the shape `(1, 3, 1, 4)`, where there are dimensions with size 1, `squeeze()` will remove these dimensions, resulting in a shape of `(3, 4)`.

In [72]:
# Create a tensor with extra dimensions of size 1
tensor = torch.rand(1, 3, 1, 4)
print(tensor.shape)  # Output: torch.Size([1, 3, 1, 4])

# Apply squeeze
squeezed_tensor = torch.squeeze(tensor)
print(squeezed_tensor.shape)  # Output: torch.Size([3, 4])

torch.Size([1, 3, 1, 4])
torch.Size([3, 4])
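
When only one of the size-1 dimensions should go, `squeeze` accepts a `dim` argument. A short sketch using the same shape as above:

```python
import torch

tensor = torch.rand(1, 3, 1, 4)

# Remove only the first dimension
print(torch.squeeze(tensor, dim=0).shape)  # torch.Size([3, 1, 4])

# Remove only the third dimension
print(tensor.squeeze(dim=2).shape)         # torch.Size([1, 3, 4])

# Squeezing a dimension whose size is not 1 leaves the shape unchanged
print(tensor.squeeze(dim=1).shape)         # torch.Size([1, 3, 1, 4])
```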


### Unsqueezing

To add a dimension of size 1, you can use the `torch.unsqueeze()` function. This is the reverse operation of squeezing.

In [73]:
# Create a tensor with shape (3, 4)
tensor = torch.rand(3, 4)
print(f"Original tensor shape: {tensor.shape}")  # Output: torch.Size([3, 4])

# Apply unsqueeze to add a dimension at position 0 (before the first dimension)
unsqueezed_tensor_0 = torch.unsqueeze(tensor, 0)
print(f"Shape after unsqueeze at dim 0: {unsqueezed_tensor_0.shape}")  # Output: torch.Size([1, 3, 4])

# Apply unsqueeze to add a dimension at position 2 (after the last dimension)
unsqueezed_tensor_2 = torch.unsqueeze(tensor, 2)
print(f"Shape after unsqueeze at dim 2: {unsqueezed_tensor_2.shape}")  # Output: torch.Size([3, 4, 1])

Original tensor shape: torch.Size([3, 4])
Shape after unsqueeze at dim 0: torch.Size([1, 3, 4])
Shape after unsqueeze at dim 2: torch.Size([3, 4, 1])
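
The same result can be obtained by indexing with `None`, which you may come across in other PyTorch code. The sketch below lists the equivalent forms for each position and confirms that `squeeze` undoes `unsqueeze`; `round_trip` is an illustrative name.

```python
import torch

tensor = torch.rand(3, 4)

# Indexing with None inserts a size-1 dimension, like unsqueeze
print(tensor[None].shape)        # torch.Size([1, 3, 4])  same as unsqueeze(0)
print(tensor[:, None].shape)     # torch.Size([3, 1, 4])  same as unsqueeze(1)
print(tensor[..., None].shape)   # torch.Size([3, 4, 1])  same as unsqueeze(2)

# A negative dim counts from the end, so -1 appends a trailing dimension
print(tensor.unsqueeze(-1).shape)  # torch.Size([3, 4, 1])

# squeeze undoes unsqueeze
round_trip = tensor.unsqueeze(0).squeeze(0)
print(torch.equal(round_trip, tensor))  # True
```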


# Happy Learning