
### Section 1: Introduction to PyTorch

This section will serve as a foundation for learning PyTorch, focusing on tensors, tensor operations, broadcasting, and automatic differentiation. The goal is to ensure a solid understanding of these concepts before moving forward to building neural networks.

---



#### **1.1. What is PyTorch?**
- PyTorch is an open-source machine learning library originally developed by Facebook’s AI Research lab (FAIR), now part of Meta AI.
- It is widely used for tasks like deep learning, computer vision, natural language processing, and reinforcement learning.
- PyTorch builds its computation graph dynamically, as the code runs. Ordinary Python control flow and debugging tools therefore work as expected, which many users find more intuitive than the static, define-then-run graphs of TensorFlow 1.x.
- PyTorch supports automatic differentiation, GPU acceleration, and a rich set of APIs for building neural networks.



#### **1.2. PyTorch Installation (On Google Colab)**
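- Google Colab comes with PyTorch pre-installed, so no manual installation is needed.
- For a local environment, install PyTorch from a terminal (a leading `!` is only needed when running the command inside a notebook cell). GPU builds may need a platform-specific command, which the selector on the official PyTorch website generates.
    ```bash
    pip install torch torchvision
    ```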
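
A quick way to confirm that the installation works is to import the library and run a trivial computation. The snippet below is a minimal sketch; the printed version and GPU availability depend on your environment.

```python
import torch

# The installed version string (the exact value depends on your environment)
print(torch.__version__)

# True only when a CUDA-capable GPU and a matching PyTorch build are present
print(torch.cuda.is_available())

# A tiny computation confirms that the library actually runs
check = torch.ones(2) + torch.ones(2)
print(check)  # tensor([2., 2.])
```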

---



#### **1.3. Tensors: The Core Data Structure in PyTorch**
- **Tensors** are multidimensional arrays that are fundamental to PyTorch. They resemble NumPy arrays, but can also live on a GPU and take part in automatic differentiation.
- They represent all data inputs, parameters, and outputs in machine learning models.
  
**Key properties of tensors:**
  - Shape: The dimensions of a tensor (e.g., a 2x3 matrix has a shape of `[2, 3]`).
  - Dtype: The data type of elements within the tensor (`float32`, `int64`, etc.).
  - Device: Where the tensor's data lives (CPU or GPU). Tensors can be moved between devices using `.to()` or `.cuda()`.
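
The three properties listed above can be read directly off any tensor. The following is a small sketch; the variable names `t`, `i`, and `d` are illustrative only, and the device shown assumes the default CPU placement.

```python
import torch

# A 2x3 tensor of 32-bit floats (the default floating-point dtype)
t = torch.zeros(2, 3)

print(t.shape)   # torch.Size([2, 3])
print(t.dtype)   # torch.float32
print(t.device)  # cpu

# Integer Python data produces an int64 tensor by default
i = torch.tensor([1, 2, 3])
print(i.dtype)   # torch.int64

# .to() can change the dtype as well as the device; it returns a new tensor
d = i.to(torch.float64)
print(d.dtype)   # torch.float64
```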

---

**1.3.1. Basic Tensor Operations**

- **Creating Tensors**:
  PyTorch provides several functions to create tensors, including `torch.tensor()`, `torch.zeros()`, `torch.ones()`, `torch.randn()`, and more.
  
  **Demonstration: Basic Tensor Creation**

In [1]:

import torch

# Creating a tensor from a list
x = torch.tensor([[1, 2], [3, 4]])
print(x)  # Output: tensor([[1, 2], [3, 4]])

# Creating a tensor filled with zeros
zeros_tensor = torch.zeros((2, 3))
print(zeros_tensor)  # Output: A 2x3 tensor filled with 0s

# Creating a tensor filled with ones
ones_tensor = torch.ones((3, 3))
print(ones_tensor)  # Output: A 3x3 tensor filled with 1s

# Creating a tensor with random values
random_tensor = torch.randn((3, 3))
print(random_tensor)  # Output: A 3x3 tensor with random values


tensor([[1, 2],
        [3, 4]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[ 1.8048, -1.8740,  0.1330],
        [ 0.1068,  0.2662, -0.5584],
        [-0.0408,  0.6901, -0.8036]])



  **Explanation:**
  - `torch.tensor()`: Creates a tensor from Python lists or NumPy arrays, copying the data and inferring the dtype.
  - `torch.zeros()`: Creates a tensor filled with zeros.
  - `torch.ones()`: Creates a tensor filled with ones.
  - `torch.randn()`: Creates a tensor with random values drawn from a standard normal distribution (mean 0, variance 1), so the printed values differ on every run.
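
Two details of tensor creation are easy to miss: `torch.tensor()` copies its input, while `torch.from_numpy()` shares memory with the source array, and random tensors become reproducible once the generator is seeded. The sketch below illustrates both; it assumes NumPy is installed, and the seed value `0` is arbitrary.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])

# torch.tensor() copies its input, so later changes to the array do not propagate
copied = torch.tensor(arr)

# torch.from_numpy() shares memory with the array instead of copying
shared = torch.from_numpy(arr)

arr[0] = 100.0
print(copied)  # first element is still 1.0
print(shared)  # first element is now 100.0

# Seeding the generator makes torch.randn() reproducible
torch.manual_seed(0)
a = torch.randn(2, 2)
torch.manual_seed(0)
b = torch.randn(2, 2)
print(torch.equal(a, b))  # True
```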

---

**1.3.2. Tensor Shapes and Indexing**
- A tensor's shape lists its size along each dimension (rows, columns, and so on).
- PyTorch allows you to easily access and manipulate specific elements, rows, or columns using indexing.

  **Demonstration: Tensor Indexing and Shape**

In [2]:
# Checking the shape of the tensor
print(x.shape)  # Output: torch.Size([2, 2])

# Accessing specific elements
element = x[1, 0]  # Access element at row 1, column 0
print(element)  # Output: tensor(3)

# Slicing a tensor
slice_tensor = x[:, 1]  # Access all rows, column 1
print(slice_tensor)  # Output: tensor([2, 4])


torch.Size([2, 2])
tensor(3)
tensor([2, 4])



  **Explanation:**
  - `x.shape`: Returns the shape of the tensor.
  - `x[1, 0]`: Indexes the tensor to access the element at row 1, column 0. Indices are zero-based, so this is the second row and first column.
  - `x[:, 1]`: Slices the tensor to access column 1 across all rows.
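
Indexing follows the same conventions as NumPy, including negative indices, ranges, and boolean masks. The sketch below uses a separate illustrative tensor `m` so that the tutorial's `x` is left unchanged; note in particular that basic slices are views onto the original data.

```python
import torch

m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

last_row = m[-1]        # tensor([4, 5, 6])
top_right = m[0, 1:]    # tensor([2, 3])
big = m[m > 3]          # tensor([4, 5, 6]); a boolean mask returns a 1-D copy

# Basic slices are views: writing through a slice changes the original tensor
col = m[:, 0]
col[0] = 10
print(m[0, 0])  # tensor(10)
```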

---

**1.3.3. Tensor Reshaping and Broadcasting**
- **Reshaping** tensors is essential for machine learning, as data often needs to be prepared in a specific format.
- **Broadcasting** allows PyTorch to perform operations on tensors with different shapes, as long as the shapes are compatible.

  **Demonstration: Tensor Reshaping and Broadcasting**

In [3]:

# Reshaping a tensor
reshaped_tensor = x.view(4)
print(reshaped_tensor)  # Output: tensor([1, 2, 3, 4])

# Broadcasting operation
y = torch.tensor([1, 2])
broadcast_result = x + y
print(broadcast_result)  # Output: tensor([[2, 4], [4, 6]])


tensor([1, 2, 3, 4])
tensor([[2, 4],
        [4, 6]])



  **Explanation:**
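  - `view()`: Returns a tensor with a new shape that shares the same underlying data. It requires a compatible memory layout; `reshape()` is the more forgiving alternative because it copies when necessary.
  - `x + y`: Broadcasting stretches `y` (shape `[2]`) across the rows of `x` (shape `[2, 2]`), so `y` is added to each row.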
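
Whether two shapes are "compatible" follows a simple rule: compare sizes from the last dimension backwards, and each pair must be equal or contain a 1 (a missing dimension counts as 1). The sketch below shows a successful broadcast, a failing one, and a case where `view()` would not work but `reshape()` does. All tensor names are illustrative.

```python
import torch

col = torch.tensor([[1], [2], [3]])   # shape [3, 1]
row = torch.tensor([10, 20])          # shape [2], treated as [1, 2]
result = col + row                    # broadcast to shape [3, 2]
print(result)
# tensor([[11, 21],
#         [12, 22],
#         [13, 23]])

# Incompatible shapes ([2, 3] and [2]) raise a RuntimeError
try:
    torch.ones(2, 3) + torch.ones(2)
except RuntimeError:
    print("broadcast failed")

# A transposed tensor is not contiguous, so view(6) would raise an error here;
# reshape() copies the data when it has to
t = torch.arange(6).view(2, 3).t()    # shape [3, 2]
flat = t.reshape(6)
print(flat)                           # tensor([0, 3, 1, 4, 2, 5])
```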

---



#### **1.4. Operations on Tensors**
- PyTorch supports a variety of tensor operations similar to those in NumPy.
  
**Common Operations:**
  - Element-wise operations: `+`, `-`, `*`, `/`, etc.
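  - Matrix multiplication: `torch.mm()` (2-D tensors only), or `torch.matmul()` and the `@` operator for the general case.
  - Aggregation: `sum()`, `mean()`, `max()`, etc.
  - Joining: `torch.cat()` (along an existing dimension), `torch.stack()` (along a new dimension).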
  - Transpose and permute for changing tensor dimensions.
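
The joining and dimension-reordering operations in the list above are easiest to understand by looking at the resulting shapes. This is a minimal sketch; the tensor `img` and its channels/height/width interpretation are assumptions made purely for illustration.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

# cat joins along an existing dimension: two [2, 2] tensors become [4, 2]
joined = torch.cat([a, b], dim=0)
print(joined.shape)     # torch.Size([4, 2])

# stack adds a new dimension: two [2, 2] tensors become [2, 2, 2]
stacked = torch.stack([a, b], dim=0)
print(stacked.shape)    # torch.Size([2, 2, 2])

# transpose swaps two dimensions; permute reorders all of them
img = torch.zeros(3, 32, 64)          # e.g. channels, height, width
swapped = img.transpose(1, 2)
print(swapped.shape)    # torch.Size([3, 64, 32])
reordered = img.permute(1, 2, 0)
print(reordered.shape)  # torch.Size([32, 64, 3])
```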

---

**1.4.1. Element-wise Operations**

  **Demonstration: Element-wise Operations**

In [6]:

print(x)  # Output: tensor([[1, 2], [3, 4]])
# Element-wise addition
add_result = x + 2
print(add_result)  # Output: tensor([[3, 4], [5, 6]])

# Element-wise multiplication
multiply_result = x * 3
print(multiply_result)  # Output: tensor([[ 3,  6], [ 9, 12]])


tensor([[1, 2],
        [3, 4]])
tensor([[3, 4],
        [5, 6]])
tensor([[ 3,  6],
        [ 9, 12]])



  **Explanation:**
  - PyTorch supports element-wise operations directly using standard Python operators like `+` and `*`.
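
Element-wise operators also work between two tensors of the same (or broadcastable) shape. One detail worth knowing is that `/` always performs true division and returns a floating-point tensor, even for integer inputs. The sketch below redefines `x` and introduces an illustrative tensor `y`.

```python
import torch

x = torch.tensor([[1, 2], [3, 4]])
y = torch.tensor([[10, 20], [30, 40]])

product = x * y      # element-wise, not a matrix product
print(product)       # tensor([[ 10,  40], [ 90, 160]])

halved = x / 2       # true division: the result is a float tensor
print(halved)        # tensor([[0.5000, 1.0000], [1.5000, 2.0000]])
print(halved.dtype)  # torch.float32

floored = x // 2     # floor division keeps the integer dtype
print(floored)       # tensor([[0, 1], [1, 2]])
```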

---

**1.4.2. Matrix Multiplication**
- Matrix multiplication is a crucial operation in deep learning, used to calculate activations in neural networks.
  
  **Demonstration: Matrix Multiplication**

In [7]:

# Matrix multiplication: each output entry is a row of matrix_a dotted with a column of matrix_b
matrix_a = torch.tensor([[1, 2], [3, 4]])
matrix_b = torch.tensor([[5, 6], [7, 8]])

result = torch.mm(matrix_a, matrix_b)
print(result)  # Output: tensor([[19, 22], [43, 50]])


tensor([[19, 22],
        [43, 50]])



  **Explanation:**
  - `torch.mm()`: Performs matrix multiplication of two 2-D tensors. Each entry of the result is the dot product of a row of the first matrix with a column of the second.
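
It helps to contrast the matrix product with element-wise multiplication, and to see the more general alternatives to `torch.mm()`. The following sketch reuses the same two matrices; the `batch` and `weights` tensors are illustrative placeholders.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

elementwise = a * b   # tensor([[ 5, 12], [21, 32]])
matrix_prod = a @ b   # tensor([[19, 22], [43, 50]])
print(torch.equal(matrix_prod, torch.mm(a, b)))  # True

# torch.matmul (and @) also handle batches; torch.mm accepts only 2-D inputs
batch = torch.ones(4, 2, 3)
weights = torch.ones(3, 5)
batched = torch.matmul(batch, weights)
print(batched.shape)  # torch.Size([4, 2, 5])

# The dot product proper is defined for two 1-D tensors
v = torch.tensor([1, 2, 3])
dot = torch.dot(v, v)
print(dot)            # tensor(14)
```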
  
---

**1.4.3. Tensor Aggregation**
- Aggregation functions reduce a tensor to fewer values. They are used, for example, when computing a loss such as mean squared error.

  **Demonstration: Aggregation Functions**

In [9]:

print(x)  # Output: tensor([[1, 2], [3, 4]])
# Summing all elements in a tensor
sum_result = torch.sum(x)
print(sum_result)  # Output: tensor(10)

# Finding the maximum element
max_result = torch.max(x)
print(max_result)  # Output: tensor(4)

# Calculating the mean
mean_result = torch.mean(x.float())
print(mean_result)  # Output: tensor(2.5)


tensor([[1, 2],
        [3, 4]])
tensor(10)
tensor(4)
tensor(2.5000)



  **Explanation:**
  - `torch.sum()`: Sums all elements in the tensor.
  - `torch.max()`: Returns the maximum value from the tensor.
  - `torch.mean()`: Computes the mean (average) of the tensor's elements. It requires a floating-point tensor, which is why the integer tensor `x` is converted with `.float()` first.
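
Aggregations can also be applied along a single dimension through the `dim` argument, which is how per-row or per-column statistics are usually computed. A minimal sketch, redefining `x` so that it runs on its own:

```python
import torch

x = torch.tensor([[1, 2], [3, 4]])

col_sums = x.sum(dim=0)               # collapse rows: tensor([4, 6])
row_sums = x.sum(dim=1)               # collapse columns: tensor([3, 7])
kept = x.sum(dim=1, keepdim=True)     # shape [2, 1] instead of [2]

# With dim given, max() returns both the values and their indices
values, indices = x.max(dim=1)
print(values)   # tensor([2, 4])
print(indices)  # tensor([1, 1])

# .item() converts a one-element tensor to a plain Python number
mean_value = x.float().mean().item()
print(mean_value)  # 2.5
```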

---



#### **1.5. Automatic Differentiation (Autograd)**
- PyTorch's **autograd** system automates the computation of gradients, which are essential for training neural networks.
- **Gradients** are partial derivatives of loss functions with respect to model parameters, helping optimize parameters using gradient descent.
  
---

**1.5.1. Setting Up Autograd**
- To have autograd track a tensor, create it with `requires_grad=True`. PyTorch then records every operation performed on it. Only floating-point (and complex) tensors can require gradients.

  **Demonstration: Simple Autograd Example**

In [12]:
# Importing torch library for tensor operations and autograd functionality
import torch

# Step 1: Creating a tensor with requires_grad=True to track computation.
# 'requires_grad=True' tells PyTorch that we want to compute gradients with respect to this tensor during backpropagation.
# This is necessary when we want to perform automatic differentiation later on.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Step 2: Performing an operation on the tensor. Here, we're squaring each element in the tensor 'x'.
# PyTorch keeps track of all the operations performed on tensors that have 'requires_grad=True',
# so it can later compute gradients for these operations during backpropagation.
y = x ** 2

# Step 3: Printing the result of the operation.
# 'y' is a new tensor that holds the squared values of 'x'. Notice that the tensor has a 'grad_fn' attribute.
# This indicates that PyTorch is recording the operation for automatic differentiation. In this case,
# 'PowBackward0' is the gradient function created by PyTorch, which will be used during backpropagation.
print(y)  # Output: tensor([1., 4., 9.], grad_fn=<PowBackward0>)

# Step 4: Summing up the elements of tensor 'y'. This operation will also be tracked for gradient computation.
# backward() expects a scalar by default, so we reduce 'y' to a single number and treat it as a stand-in "loss".
# The 'grad_fn' here is now 'SumBackward0', showing that summing was recorded as well.
loss = y.sum()

# Step 5: Printing the loss. The loss is a scalar (14.0), which is the sum of [1., 4., 9.].
# This will be used for backpropagation in the next step.
print(loss)  # Output: tensor(14., grad_fn=<SumBackward0>)

# Step 6: Backpropagating the loss to compute gradients. This calculates the derivative of the loss with respect to
# each element of 'x'. PyTorch will traverse the recorded operations in reverse (hence "backward") to compute the gradients.
loss.backward()

# Step 7: Accessing the gradient stored in 'x'. After calling 'backward()', the gradients are computed and stored in 'x.grad'.
# The gradient is the derivative of the loss (14) with respect to 'x', which in this case turns out to be:
# d(loss)/dx = 2 * x (from the derivative of the square operation).
# Therefore, x.grad will be [2 * 1.0, 2 * 2.0, 2 * 3.0] = [2.0, 4.0, 6.0].
print(x.grad)  # Output: tensor([2., 4., 6.])


tensor([1., 4., 9.], grad_fn=<PowBackward0>)
tensor(14., grad_fn=<SumBackward0>)
tensor([2., 4., 6.])



  **Explanation:**
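  - `requires_grad=True`: Tells PyTorch to track operations on the tensor so that gradients can be computed for it.
  - `.backward()`: Computes the gradient of the scalar it is called on (here `loss`) with respect to every tracked leaf tensor in its graph, such as `x`.
  - `x.grad`: Holds the computed gradients after backpropagation.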
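
To connect these pieces to gradient descent, the sketch below minimises a toy function by hand. It is an illustration under stated assumptions rather than a recommended training loop: the function `(w - 3) ** 2`, the learning rate `lr`, and the step count are all arbitrary choices. It also shows two practical points: gradients accumulate across `backward()` calls and must be reset, and the parameter update itself should not be tracked.

```python
import torch

# Minimise f(w) = (w - 3)^2 with plain gradient descent
w = torch.tensor(0.0, requires_grad=True)
lr = 0.1  # learning rate, chosen arbitrarily for this sketch

for step in range(100):
    loss = (w - 3) ** 2
    loss.backward()          # fills w.grad with d(loss)/dw = 2 * (w - 3)
    with torch.no_grad():    # the update itself must not be recorded
        w -= lr * w.grad
    w.grad.zero_()           # gradients accumulate, so reset them each step

print(w.item())  # close to 3.0, the minimiser of f
```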

---

**1.5.2. Observations in Current Research:**
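  - PyTorch's dynamic computation graph is rebuilt on every forward pass, so models can use ordinary Python control flow (loops, conditionals, recursion). This flexibility suits novel architectures and research prototypes.
  - With automatic differentiation, complex models such as transformers, generative models, and reinforcement learning agents can be built without deriving gradients by hand.
  - Because the graph is built while regular Python code runs, standard tools such as `print()` and debuggers work on intermediate tensors. Many researchers find this easier for debugging and iterative experiments than static graph-based libraries.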
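
As a small illustration of what "dynamic" means in practice, the sketch below uses a data-dependent `while` loop inside a differentiable function. The function `f` and its threshold of 10 are hypothetical and chosen only so the numbers are easy to follow; autograd records whichever operations actually ran.

```python
import torch

def f(x):
    # Ordinary Python control flow decides which operations are recorded
    y = x
    while y.norm() < 10:
        y = y * 2
    return y.sum()

x = torch.tensor([1.0, 1.0], requires_grad=True)
out = f(x)       # the loop runs three times here, so y = 8 * x
out.backward()

print(out)       # tensor(16., grad_fn=<SumBackward0>)
print(x.grad)    # tensor([8., 8.])
```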

---



#### **1.6. Working with GPU Acceleration**
- PyTorch makes it easy to move tensors and models to GPU to utilize CUDA for faster computation.

  **Demonstration: Using GPU in PyTorch**

In [11]:

# Checking if a GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)  # Output: 'cuda' if GPU is available, otherwise 'cpu'

# Moving a tensor to the selected device (nothing changes if it is already there)
x_gpu = x.to(device)
print(x_gpu)  # Tensor lives on `device`: the GPU if available, otherwise the CPU

# Moving it back to CPU
x_cpu = x_gpu.to('cpu')
print(x_cpu)  # Tensor is now back on CPU


cpu
tensor([1., 2., 3.], requires_grad=True)
tensor([1., 2., 3.], requires_grad=True)



  **Explanation:**
  - `torch.cuda.is_available()`: Checks if a CUDA-enabled GPU is available.
  - `.to(device)`: Returns the tensor on the specified device (either CPU or GPU). The original tensor is left where it was, so the result must be assigned to a variable.
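
A common pattern is to write device-agnostic code: pick the device once, then create or move everything onto it. The sketch below assumes nothing about the hardware and falls back to the CPU; `torch.nn.Linear` appears only as a stand-in model to show that modules are moved the same way.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Create tensors directly on the target device instead of copying afterwards
a = torch.ones(2, 2, device=device)
b = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).to(device)

# Operands must live on the same device; the result stays there too
c = a + b
print(c.device)

# A tensor must be on the CPU before it can be converted to a NumPy array
c_np = c.cpu().numpy()
print(c_np)

# Modules are moved the same way (nn.Linear is only a placeholder model)
model = torch.nn.Linear(2, 1).to(device)
print(next(model.parameters()).device)
```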

---



### Continuity to the Next Section:
- In the next section, we will explore **building simple neural networks** using PyTorch's `torch.nn` module.
- We will leverage the understanding of tensors, operations, and automatic differentiation to implement neural networks from scratch.
  