# Introduction to PyTorch

Welcome to the `01_intro_to_pytorch` notebook. This is part of a portfolio designed to showcase foundational PyTorch concepts and techniques that will be utilized in later projects. 

Here, I cover essential topics such as setting up the environment, working with tensors, leveraging GPU acceleration, and using automatic differentiation. Through various exercises, this notebook will show how to create and manipulate tensors, build and train simple neural networks, and evaluate model performance.

This notebook lays the groundwork for more advanced PyTorch applications in subsequent projects.

Also, keep in mind that these notebooks follow a "question-and-answer" format for active learning purposes. Instead of just presenting explanatory code, I'd rather try to actively recall (or look up) the answer to a problem I face, which could be as simple as loading libraries or as complex as fine-tuning a model.

## What is PyTorch?

PyTorch is an open-source deep learning framework originally developed by Facebook's AI Research lab (now Meta AI). It provides a flexible and intuitive platform for building and training neural networks.
PyTorch's key features include dynamic computation graphs, which are built as the code runs and so make models easier to write and debug, and support for GPU acceleration, which speeds up computation.

With its extensive library of tools and utilities, PyTorch is widely used for both research and production in machine learning and artificial intelligence projects. These projects include:

- **Natural Language Processing (NLP)**: Building models for text classification, sentiment analysis, and machine translation.
- **Computer vision**: Implementing image classification, object detection, and image generation tasks.
- **Reinforcement learning**: Developing algorithms for game playing and decision-making processes.
- **Generative Adversarial Networks (GANs)**: Generating realistic images, videos, and other synthetic data.
- **Time series analysis**: Forecasting and anomaly detection in sequential data.
- **Speech recognition and synthesis**: Building models for converting speech to text and text to speech.
- **Robotics**: Developing intelligent control systems for robotic movements and actions.
- **Healthcare**: Predictive modeling and medical image analysis for diagnostics and treatment planning.

## Setting up the environment

##### **Q1: How do you install the base PyTorch libraries using a Jupyter notebook?**

In [1]:
!pip install torch torchvision torchaudio

Defaulting to user installation because normal site-packages is not writeable
Collecting torch
  Downloading torch-2.3.1-cp311-cp311-win_amd64.whl (159.8 MB)
     -------------------------------------- 159.8/159.8 MB 2.0 MB/s eta 0:00:00
Collecting torchvision
  Downloading torchvision-0.18.1-cp311-cp311-win_amd64.whl (1.2 MB)
     ---------------------------------------- 1.2/1.2 MB 2.5 MB/s eta 0:00:00
Collecting torchaudio
  Downloading torchaudio-2.3.1-cp311-cp311-win_amd64.whl (2.4 MB)
     ---------------------------------------- 2.4/2.4 MB 2.3 MB/s eta 0:00:00
Collecting typing-extensions>=4.8.0
  Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting sympy
  Downloading sympy-1.13.0-py3-none-any.whl (6.2 MB)
     ---------------------------------------- 6.2/6.2 MB 2.1 MB/s eta 0:00:00
Collecting networkx
  Downloading networkx-3.3-py3-none-any.whl (1.7 MB)
     ---------------------------------------- 1.7/1.7 MB 1.5 MB/s eta 0:00:00
Collecting mkl<=2021.4.0,>=2


[notice] A new release of pip available: 22.3.1 -> 24.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


##### **Q2: How do you import the base PyTorch libraries for later use?**

In [2]:
import torch

print(torch.__version__)

2.3.1+cpu


## PyTorch basics

##### **Q3: How do you create a tensor in PyTorch? Provide examples of different ways to create tensors.**

In [3]:
# From a list
tensor_from_list = torch.tensor([1, 2, 3, 4])
print(tensor_from_list)

tensor([1, 2, 3, 4])


In [4]:
# Zeros tensor
zeros_tensor = torch.zeros(3, 3)
print(zeros_tensor)

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])


In [5]:
# Ones tensor
ones_tensor = torch.ones(2, 4)
print(ones_tensor)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])


In [7]:
# Random values
random_tensor = torch.rand(3, 2)
print(random_tensor)

tensor([[0.1047, 0.7210],
        [0.6622, 0.0976],
        [0.5603, 0.8628]])


In [8]:
# From a NumPy array
import numpy as np

numpy_array = np.array([[1, 2], [3, 4]])
tensor_from_numpy = torch.tensor(numpy_array)
print(tensor_from_numpy)

tensor([[1, 2],
        [3, 4]], dtype=torch.int32)


In [9]:
# With a specific data type
float_tensor = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)
print(float_tensor)

int_tensor = torch.tensor([1, 2, 3], dtype=torch.int32)
print(int_tensor)

tensor([1., 2., 3.])
tensor([1, 2, 3], dtype=torch.int32)


In [13]:
# Uninitialized
uninitialized_tensor = torch.empty(2, 3)
# torch.empty() allocates memory without setting the values, so the tensor
# holds whatever data was already in that memory block. This is useful for
# performance when the initial values are irrelevant.
print(uninitialized_tensor)

tensor([[-1.9481e+33,  8.2256e-43,  0.0000e+00],
        [ 0.0000e+00,  0.0000e+00,  0.0000e+00]])


In [12]:
# Using a range
range_tensor = torch.arange(0, 10, step=2)
print(range_tensor)

tensor([0, 2, 4, 6, 8])


In [14]:
# Using linspace()
linspace_tensor = torch.linspace(0, 1, steps=5)
print(linspace_tensor)

tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])


##### **Q4: How do you perform basic tensor operations such as addition and multiplication?**

In [21]:
# Create two tensors for the exercise
tensor1 = torch.tensor([1, 2, 3])
tensor2 = torch.tensor([4, 5, 6])

# Element-wise addition
result = tensor1 + tensor2
print(result)

tensor([5, 7, 9])


In [16]:
# Using torch.add()
result = torch.add(tensor1, tensor2)
print(result)

tensor([5, 7, 9])


In [23]:
# Element-wise subtraction
result = tensor2 - tensor1
print(result)

tensor([3, 3, 3])


In [28]:
# Using torch.sub()
result = torch.sub(tensor2, tensor1)
print(result)

tensor([3, 3, 3])


In [17]:
# Element-wise multiplication
result = tensor1 * tensor2
print(result)

tensor([ 4, 10, 18])


In [18]:
# Using torch.mul()
result = torch.mul(tensor1, tensor2)
print(result)

tensor([ 4, 10, 18])


In [27]:
# Element-wise division
result = tensor2 / tensor1
print(result)

tensor([4.0000, 2.5000, 2.0000])


In [29]:
# Using torch.div()
result = torch.div(tensor2, tensor1)
print(result)

tensor([4.0000, 2.5000, 2.0000])


In [30]:
# Examples for matrix operations
tensor1 = torch.tensor([[1, 2], [3, 4]])
tensor2 = torch.tensor([[5, 6], [7, 8]])

# Matrix multiplication
result = torch.matmul(tensor1, tensor2)
print(result)

tensor([[19, 22],
        [43, 50]])


In [31]:
# Using the @ operator
result = tensor1 @ tensor2
print(result)

tensor([[19, 22],
        [43, 50]])


In [32]:
# Broadcasting (i.e., arithmetic operations on tensors of different shapes)
tensor1 = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor2 = torch.tensor([1, 2, 3])

result = tensor1 + tensor2
print(result)

tensor([[2, 4, 6],
        [5, 7, 9]])
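
To make the broadcasting rule a bit more concrete, here is a minimal sketch (the tensor names are my own) showing that shapes are compared from the right, that dimensions of size 1 are stretched, and that incompatible shapes raise an error.

```python
import torch

column = torch.tensor([[10], [20]])  # shape (2, 1)
row = torch.tensor([1, 2, 3])        # shape (3,)

# Shapes are aligned from the right; size-1 dimensions are stretched to match
result = column + row
print(result.shape)  # torch.Size([2, 3])
print(result)

# Trailing dimensions 3 and 2 are incompatible, so this cannot be broadcast
try:
    torch.ones(2, 3) + torch.ones(2)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)  # True
```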


##### **Q5: How do you slice and index tensors in PyTorch?**

In [33]:
# Indexing a single element
tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
element = tensor[0, 1]  # Access the element at row 0, column 1
print(element)

tensor(2)


In [34]:
# Basic slicing
slice_tensor = tensor[:2, 1:]  # Slice the first two rows and columns from the second to the end
print(slice_tensor)

tensor([[2, 3],
        [5, 6]])


In [35]:
# Slicing with steps
step_slice = tensor[::2, ::2]  # Slice every second element along both dimensions
print(step_slice)

tensor([[1, 3],
        [7, 9]])


In [36]:
# Ellipsis (...) stands for "all remaining dimensions"
tensor = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
ellipsis_slice = tensor[..., 1]  # Select index 1 along the last dimension
print(ellipsis_slice)

tensor([[2, 4],
        [6, 8]])


In [37]:
# Boolean indexing
tensor = torch.tensor([1, 2, 3, 4, 5, 6])
bool_index = tensor[tensor > 3]  # Select elements greater than 3
print(bool_index)

tensor([4, 5, 6])


In [38]:
# Indexing with a tensor of indices
tensor = torch.tensor([10, 20, 30, 40, 50])
indices = torch.tensor([0, 2, 4])
advanced_index = tensor[indices]  # Select elements at positions 0, 2, and 4
print(advanced_index)

tensor([10, 30, 50])


In [39]:
# Indexing + slicing
tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
combined = tensor[1:, :2]  # Slice rows from the second to the end and columns up to the second
print(combined)

tensor([[4, 5],
        [7, 8]])


##### **Q6: How do you change the shape of a tensor in PyTorch?**

In [40]:
# Using reshape()
tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

reshaped_tensor = tensor.reshape(3, 2)
print(reshaped_tensor)

tensor([[1, 2],
        [3, 4],
        [5, 6]])


In [41]:
# Using view()
viewed_tensor = tensor.view(3, 2)
print(viewed_tensor)

tensor([[1, 2],
        [3, 4],
        [5, 6]])


In [42]:
# Using transpose()
transposed_tensor = tensor.transpose(0, 1)
print(transposed_tensor)

tensor([[1, 4],
        [2, 5],
        [3, 6]])
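
`reshape()` and `view()` gave the same result above, but they are not always interchangeable. The sketch below (variable names are my own) shows the usual case where they differ: a transposed tensor is no longer contiguous in memory, so `view()` raises an error while `reshape()` quietly falls back to copying.

```python
import torch

tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])
transposed = tensor.transpose(0, 1)  # shape (3, 2), shares storage with `tensor`

print(transposed.is_contiguous())  # False

# view() needs a compatible memory layout, so it fails on this tensor
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# reshape() copies the data when a view is not possible
flat = transposed.reshape(6)
print(flat)  # tensor([1, 4, 2, 5, 3, 6])

# Calling contiguous() first also makes view() work
flat_via_view = transposed.contiguous().view(6)
```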


In [43]:
# Create a 3D tensor
tensor_3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Use permute()
permuted_tensor = tensor_3d.permute(2, 0, 1)
print(permuted_tensor)

tensor([[[1, 3],
         [5, 7]],

        [[2, 4],
         [6, 8]]])


In [44]:
# Using flatten()
flattened_tensor = tensor.flatten()
print(flattened_tensor)

tensor([1, 2, 3, 4, 5, 6])


##### **Q7: How do you concatenate two tensors in PyTorch?**

In [48]:
# Concatenate along a specified dimension
tensor1 = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor2 = torch.tensor([[7, 8, 9], [10, 11, 12]])

# Along the first dimension (rows)
concatenated_tensor = torch.cat((tensor1, tensor2), dim=0)
print(concatenated_tensor, '\n')

# Along the second dimension (columns)
concatenated_tensor = torch.cat((tensor1, tensor2), dim=1)
print(concatenated_tensor)

tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]]) 

tensor([[ 1,  2,  3,  7,  8,  9],
        [ 4,  5,  6, 10, 11, 12]])


In [51]:
# Stack along a new dimension
stacked_tensor = torch.stack((tensor1, tensor2), dim=0)
print(stacked_tensor, '\n')

stacked_tensor = torch.stack((tensor1, tensor2), dim=1)
print(stacked_tensor)

tensor([[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]]) 

tensor([[[ 1,  2,  3],
         [ 7,  8,  9]],

        [[ 4,  5,  6],
         [10, 11, 12]]])


##### **Q8: How do you convert a NumPy array to a PyTorch tensor and vice versa?**

In [52]:
# Create a NumPy array
numpy_array = np.array([[1, 2, 3], [4, 5, 6]])

# Convert the NumPy array to a PyTorch tensor
torch_tensor = torch.from_numpy(numpy_array)
print(torch_tensor)

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)


In [53]:
# Create a PyTorch tensor
torch_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

# Convert the PyTorch tensor to a NumPy array
numpy_array = torch_tensor.numpy()
print(numpy_array)

[[1 2 3]
 [4 5 6]]


In [57]:
# Avoiding memory sharing by creating a copy
torch_tensor_copy = torch.from_numpy(numpy_array.copy())
numpy_array_copy = torch_tensor.numpy().copy()

print(torch_tensor_copy, '\n') 
print(numpy_array_copy)

tensor([[1, 2, 3],
        [4, 5, 6]]) 

[[1 2 3]
 [4 5 6]]
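
The reason for copying is that `torch.from_numpy()` and `Tensor.numpy()` share memory with their source on the CPU. A small sketch to check this (variable names are my own):

```python
import numpy as np
import torch

numpy_array = np.array([1, 2, 3])

shared = torch.from_numpy(numpy_array)  # shares memory with numpy_array
copied = torch.tensor(numpy_array)      # torch.tensor() always copies the data

numpy_array[0] = 100  # modify the NumPy array in place

print(shared.tolist())  # [100, 2, 3] -> the change is visible in the tensor
print(copied.tolist())  # [1, 2, 3]   -> the copy is unaffected

# The same holds in the other direction for CPU tensors
tensor = torch.ones(3)
array_view = tensor.numpy()
tensor.add_(1)  # in-place update of the tensor
print(array_view)  # [2. 2. 2.]
```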


##### **Q9: How do you get the size and shape of a tensor in PyTorch?**

In [58]:
# Get the size of a tensor
tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])

size = tensor.size()
print(size)

torch.Size([2, 3])


In [59]:
# Get the shape of the tensor
shape = tensor.shape
print(shape)

torch.Size([2, 3])


In [60]:
# Get number of dimensions
num_dimensions = tensor.ndimension()
print(num_dimensions)

2


In [61]:
# Get size of a specific dimension
rows = tensor.size(0)
cols = tensor.size(1)
print(f"Rows: {rows}, Columns: {cols}")

Rows: 2, Columns: 3


##### **Q10: How do you use advanced indexing techniques in PyTorch?**

In [62]:
# Boolean indexing
tensor = torch.tensor([1, 2, 3, 4, 5, 6])

mask = tensor > 3
selected_elements = tensor[mask]
print(selected_elements)

tensor([4, 5, 6])


In [63]:
# Indexing with another tensor
tensor = torch.tensor([10, 20, 30, 40, 50])

indices = torch.tensor([0, 2, 4])
selected_elements = tensor[indices]
print(selected_elements)

tensor([10, 30, 50])


In [64]:
# Indexing with a list of indices
tensor = torch.tensor([[1, 2], [3, 4], [5, 6]])

selected_rows = tensor[[0, 2]]
print(selected_rows)

tensor([[1, 2],
        [5, 6]])


In [65]:
# Boolean mask on a 2D tensor (the selected elements come back as a 1D tensor)
tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

mask = tensor > 4

selected_elements = tensor[mask]
print(selected_elements)

tensor([5, 6, 7, 8, 9])


In [66]:
# Ellipsis indexing
tensor = torch.randn(3, 4, 5)

selected_elements = tensor[..., 1]
print(selected_elements.shape)

torch.Size([3, 4])


In [67]:
# Using torch.gather()
tensor = torch.tensor([[1, 2], [3, 4], [5, 6]])

indices = torch.tensor([[0, 0], [1, 0], [0, 1]])

gathered_tensor = torch.gather(tensor, 1, indices)
print(gathered_tensor)

tensor([[1, 1],
        [4, 3],
        [5, 6]])


## GPU acceleration

##### **Q11: How do you check if a GPU is available in PyTorch?**

In [68]:
if torch.cuda.is_available():
    print("GPU is available")
else:
    print("GPU is not available")

GPU is not available


##### **Q12: How do you move tensors to GPU and perform operations on them?**
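
One possible answer, as a minimal sketch: pick a device once and create or move tensors onto it. The code falls back to the CPU when no GPU is present (as on the machine used for this notebook), so it should run either way.

```python
import torch

# Use the GPU if one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.tensor([1.0, 2.0, 3.0], device=device)  # create directly on the device
b = torch.tensor([4.0, 5.0, 6.0]).to(device)      # create on the CPU, then move

# Both operands live on the same device, so the operation runs there
result = a * b + 1
print(result.device)

# Move the result back to the CPU before converting to NumPy or a list
result_cpu = result.cpu()
print(result_cpu.tolist())  # [5.0, 11.0, 19.0]
```

Mixing tensors from different devices in one operation raises an error, which is why everything is moved to `device` first.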

##### **Q13: How do you measure the time taken for tensor operations on GPU versus CPU?**
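
A rough way to do this, sketched below, is to time a matrix multiplication with `time.perf_counter()`. The helper `time_matmul` and its `size` and `repeats` parameters are my own choices, not a PyTorch API. Because CUDA kernels run asynchronously, the sketch calls `torch.cuda.synchronize()` before reading the clock; the GPU branch only runs if a GPU is available.

```python
import time
import torch

def time_matmul(device, size=512, repeats=5):
    """Return the average time in seconds of a size x size matmul on `device`."""
    x = torch.rand(size, size, device=device)
    y = torch.rand(size, size, device=device)

    torch.matmul(x, y)  # warm-up run, not timed
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for pending GPU work

    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(x, y)
    if device.type == "cuda":
        torch.cuda.synchronize()  # make sure the GPU has finished
    return (time.perf_counter() - start) / repeats

cpu_time = time_matmul(torch.device("cpu"))
print(f"CPU: {cpu_time:.6f} s per matmul")

if torch.cuda.is_available():
    gpu_time = time_matmul(torch.device("cuda"))
    print(f"GPU: {gpu_time:.6f} s per matmul")
```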

##### **Q14: How do you handle tensors when working with multiple GPUs?**

## Automatic differentiation

##### **Q15: How do you enable automatic differentiation in PyTorch and compute gradients?**
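
A minimal sketch of one answer: set `requires_grad=True` on the input, build the computation, and call `backward()` on the scalar result. The function y = x² + 2x + 1 is just an example I picked.

```python
import torch

# requires_grad=True tells autograd to track operations on x
x = torch.tensor(3.0, requires_grad=True)

y = x ** 2 + 2 * x + 1  # y = x^2 + 2x + 1

# Compute dy/dx and store it in x.grad
y.backward()

print(x.grad)  # tensor(8.) since dy/dx = 2x + 2 = 8 at x = 3
```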

##### **Q16: How do you stop PyTorch from tracking history on tensors?**
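
Two common options, sketched here with example tensors of my own: wrap the code in `torch.no_grad()`, or call `detach()` on a tensor that is already tracked.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# Normal operation: the result is tracked by autograd
tracked = x * 2
print(tracked.requires_grad)  # True

# Option 1: nothing inside the no_grad() block is tracked
with torch.no_grad():
    untracked = x * 2
print(untracked.requires_grad)  # False

# Option 2: detach() returns a tensor cut off from the graph
detached = tracked.detach()
print(detached.requires_grad)  # False
```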

##### **Q17: How do you manually zero the gradients in PyTorch?**
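
Gradients accumulate across `backward()` calls, which is why they need to be reset. A small sketch with a single example parameter `w` (my own name) shows the accumulation and the reset with `grad.zero_()`:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(w * 3).backward()
first_grad = w.grad.item()
print(first_grad)  # 3.0

# A second backward pass adds to the existing gradient
(w * 3).backward()
accumulated_grad = w.grad.item()
print(accumulated_grad)  # 6.0

# Reset the gradient in place
w.grad.zero_()
zeroed_grad = w.grad.item()
print(zeroed_grad)  # 0.0
```

When an optimizer is used, `optimizer.zero_grad()` does the same job for all of its parameters.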

##### **Q18: How do you use the `backward()` method for computing gradients?**
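
`backward()` with no arguments only works on a scalar. For a non-scalar output, one option is to pass a `gradient` tensor of the same shape; the sketch below uses a tensor of ones, which gives the same result as calling `y.sum().backward()`. The values are my own example.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2  # non-scalar output

# Weight each element of y by 1 when backpropagating
y.backward(gradient=torch.ones_like(y))

print(x.grad)  # tensor([2., 4., 6.]) since dy_i/dx_i = 2 * x_i
```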

## Building a simple neural network

##### **Q19: How do you define a simple neural network using `nn.Module` in PyTorch?**
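
A minimal sketch: subclass `nn.Module`, create the layers in `__init__`, and describe the computation in `forward()`. The class name `SimpleNet` and the layer sizes are arbitrary choices for illustration.

```python
import torch
from torch import nn

class SimpleNet(nn.Module):
    """A small fully connected network with one hidden layer."""

    def __init__(self, in_features=4, hidden=8, out_features=2):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, out_features)

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # hidden layer with ReLU activation
        return self.fc2(x)           # raw output scores

model = SimpleNet()
batch = torch.rand(5, 4)  # 5 samples with 4 features each
output = model(batch)     # calling the model runs forward()
print(output.shape)  # torch.Size([5, 2])
```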

##### **Q20: How do you initialize the weights and biases of a neural network?**
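
Layers come with default initial values, but they can be overridden. One way, sketched below, is to write a small function (here called `init_weights`, my own name) that uses `torch.nn.init` and pass it to `model.apply()`, which calls it on every submodule. Xavier initialization is just one example scheme.

```python
import torch
from torch import nn

def init_weights(module):
    # Only touch linear layers; other modules are left as they are
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)

model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)

# apply() runs init_weights on every submodule of the model
model.apply(init_weights)

print(model[0].bias)  # all zeros after initialization
```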

##### **Q21: How do you add multiple layers to a neural network?**

## Loss function and optimizer

##### **Q22: How do you define a loss function and an optimizer for your neural network?**
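
As a minimal sketch, assuming a regression task: the loss comes from `torch.nn` and the optimizer from `torch.optim`, and the optimizer receives the model's parameters. The model, the random data, and the learning rate are placeholders of my own.

```python
import torch
from torch import nn

model = nn.Linear(3, 1)

criterion = nn.MSELoss()                                  # mean squared error loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # plain gradient descent

# Placeholder data: 8 samples with 3 features each
inputs = torch.rand(8, 3)
targets = torch.rand(8, 1)

weight_before = model.weight.detach().clone()

# One optimization step
optimizer.zero_grad()                     # clear old gradients
loss = criterion(model(inputs), targets)  # forward pass and loss
loss.backward()                           # compute gradients
optimizer.step()                          # update the parameters

print(loss.item())
```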

##### **Q23: How do you use different types of optimizers in PyTorch?**

##### **Q24: How do you adjust the learning rate during training?**
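
One option is a learning rate scheduler from `torch.optim.lr_scheduler`. The sketch below uses `StepLR`, which multiplies the learning rate by `gamma` every `step_size` epochs; the model, data, and hyperparameter values are my own placeholders.

```python
import torch
from torch import nn

model = nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Multiply the learning rate by 0.1 every 2 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

learning_rates = []
for epoch in range(6):
    learning_rates.append(optimizer.param_groups[0]["lr"])

    # Stand-in for a real training step
    optimizer.zero_grad()
    loss = model(torch.rand(4, 2)).sum()
    loss.backward()
    optimizer.step()

    scheduler.step()  # update the learning rate once per epoch

# Roughly 0.1, 0.1, 0.01, 0.01, 0.001, 0.001
print(learning_rates)
```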

## Training the model

##### **Q25: How do you create a training loop to train your neural network in PyTorch?**
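
A basic loop repeats four steps: zero the gradients, run the forward pass, backpropagate the loss, and update the parameters. The sketch below fits a single linear layer to the toy function y = 2x + 1; the data, learning rate, and number of epochs are my own choices for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy data for y = 2x + 1
x = torch.linspace(-1, 1, 100).unsqueeze(1)  # shape (100, 1)
y = 2 * x + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    optimizer.zero_grad()          # 1. clear old gradients
    loss = criterion(model(x), y)  # 2. forward pass and loss
    loss.backward()                # 3. backpropagation
    optimizer.step()               # 4. parameter update
    losses.append(loss.item())

learned_weight = model.weight.item()
learned_bias = model.bias.item()
print(f"weight={learned_weight:.3f}, bias={learned_bias:.3f}")  # close to 2 and 1
```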

## Evaluation and inference

##### **Q26: How do you evaluate your model's performance and make predictions on new data?**

##### **Q27: How do you calculate the accuracy of your model?**
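
For classification, accuracy is the fraction of predictions that match the labels. A small sketch with made-up logits and labels (no trained model involved):

```python
import torch

# Made-up model outputs for 4 samples and 2 classes
logits = torch.tensor([[2.0, 0.5],
                       [0.1, 1.5],
                       [1.2, 0.3],
                       [0.2, 0.9]])
labels = torch.tensor([0, 1, 1, 1])

# The predicted class is the index of the largest score in each row
predictions = logits.argmax(dim=1)
print(predictions)  # tensor([0, 1, 0, 1])

# Compare with the labels and average the matches
accuracy = (predictions == labels).float().mean().item()
print(accuracy)  # 0.75
```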

##### **Q28: How do you handle model evaluation for regression tasks?**

##### **Q29: How do you handle model evaluation for classification tasks?**

##### **Q30: How do you use confusion matrices to evaluate model performance?**
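
A confusion matrix counts how often each true class was predicted as each class. Libraries such as scikit-learn provide this, but as a sketch it can also be built with plain PyTorch. The labels and predictions are made up; rows are true classes and columns are predicted classes.

```python
import torch

labels = torch.tensor([0, 1, 1, 1])
predictions = torch.tensor([0, 1, 0, 1])
num_classes = 2

# confusion[i, j] = number of samples with true class i predicted as class j
confusion = torch.zeros(num_classes, num_classes, dtype=torch.long)
for true_label, predicted_label in zip(labels.tolist(), predictions.tolist()):
    confusion[true_label, predicted_label] += 1

print(confusion)
# tensor([[1, 0],
#         [1, 2]])

# The diagonal holds the correct predictions
accuracy = confusion.diag().sum().item() / confusion.sum().item()
print(accuracy)  # 0.75
```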

## Saving and loading models

##### **Q31: How do you save and load a PyTorch model?**
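
A common approach is to save the model's `state_dict` (its parameters) and load it into a fresh instance of the same architecture. The sketch below writes to a temporary directory so it leaves no files behind; the file name `model_weights.pt` is arbitrary.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(3, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_weights.pt")

    # Save only the parameters, not the whole Python object
    torch.save(model.state_dict(), path)

    # Recreate the architecture, then load the saved parameters into it
    restored = nn.Linear(3, 2)
    restored.load_state_dict(torch.load(path, weights_only=True))

restored.eval()  # switch to evaluation mode before inference

weights_match = torch.equal(model.weight, restored.weight)
print(weights_match)  # True
```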

##### **Q32: How do you save and load model checkpoints during training?**

## Custom datasets and DataLoaders

##### **Q33: How do you use PyTorch's DataLoader to load a dataset in batches?**
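
A minimal sketch: wrap tensors in a `TensorDataset` and hand it to a `DataLoader` with a `batch_size`. The data here is a made-up set of 10 samples, so the last batch is smaller than the others.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 samples with 2 features each, plus one label per sample
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10)

dataset = TensorDataset(features, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=False)

batch_sizes = []
for batch_features, batch_labels in loader:
    batch_sizes.append(batch_features.shape[0])

print(batch_sizes)  # [4, 4, 2]
```

Setting `shuffle=True` is the usual choice for training data, and `drop_last=True` discards the incomplete final batch.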

##### **Q34: How do you implement a custom dataset in PyTorch?**
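
A custom dataset subclasses `torch.utils.data.Dataset` and implements `__len__` and `__getitem__`. The `SquaresDataset` class below is a made-up example that pairs each number with its square.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class SquaresDataset(Dataset):
    """Toy dataset: each sample is (x, x squared)."""

    def __init__(self, n):
        self.values = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        # Number of samples in the dataset
        return len(self.values)

    def __getitem__(self, index):
        # Return one (input, target) pair
        x = self.values[index]
        return x, x ** 2

dataset = SquaresDataset(5)
print(len(dataset))  # 5
print(dataset[3])    # (tensor(3.), tensor(9.))

# A custom dataset works with DataLoader like any built-in one
loader = DataLoader(dataset, batch_size=2)
first_inputs, first_targets = next(iter(loader))
print(first_inputs, first_targets)  # tensor([0., 1.]) tensor([0., 1.])
```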

##### **Q35: How do you apply data transformations using `torchvision.transforms`?**

##### **Q36: How do you handle data augmentation in PyTorch?**

## Conclusion

## Further exercises

##### **Q37: How do you create a tensor of shape (2, 3) filled with zeros and then with ones?**

##### **Q38: How do you train a neural network to predict the output of a simple linear function?**

##### **Q39: How do you experiment with different optimizers and learning rates to see their effect on training?**

##### **Q40: How do you visualize the training loss and accuracy over epochs in PyTorch?**

##### **Q41: How do you implement dropout regularization in a neural network using PyTorch?**
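
Dropout is available as the `nn.Dropout` layer. The sketch below applies it to a tensor of ones to show its two modes: in training mode it zeroes elements with probability `p` and scales the rest by 1 / (1 - p), while in evaluation mode it passes the input through unchanged. The input tensor and `p=0.5` are my own example values.

```python
import torch
from torch import nn

torch.manual_seed(0)

dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Training mode: elements are dropped at random, survivors are scaled by 2
dropout.train()
train_output = dropout(x)
print(train_output.unique())  # only the values 0 and 2 appear

# Evaluation mode: dropout is disabled
dropout.eval()
eval_output = dropout(x)
print(torch.equal(eval_output, x))  # True
```

Inside a network, the layer is placed between other layers, and calling `model.train()` or `model.eval()` switches every dropout layer at once.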