
# PyTorch Basics: Tensors

In [4]:
# ============================
# Creation, Manipulation, and Operations in PyTorch
# ============================

# Import the PyTorch library
# torch is the main package used for tensor computations and deep learning
import torch


# -------------------------------------------------
# WHAT IS A TENSOR?
# -------------------------------------------------
# A tensor is the core data structure in PyTorch.
# It is similar to a NumPy array but with extra features:
# 1. Can run on GPU or other accelerators
# 2. Supports automatic differentiation (used in backpropagation)
# 3. Optimized for deep learning workloads
#
# Tensors are used to store:
# - Input data
# - Output data
# - Model parameters (weights and biases)


# -------------------------------------------------
# 1. Creating a tensor from a Python list
# -------------------------------------------------
tensor1 = torch.tensor([1, 2, 3])

# torch.tensor() converts a Python list into a PyTorch tensor
# This creates a 1-dimensional tensor (vector)
# Shape of tensor1 -> (3,)
# Data type is automatically inferred (usually int64)

print("Tensor from list:")
print(tensor1)
print("Shape of tensor1:", tensor1.shape)
print()


# -------------------------------------------------
# 2. Creating a tensor filled with zeros
# -------------------------------------------------
tensor2 = torch.zeros(2, 3)

# torch.zeros(2, 3) creates a tensor with:
# 2 rows and 3 columns
# All values initialized to 0
# This is a 2D tensor (matrix)
# Shape of tensor2 -> (2, 3)

print("Tensor of zeros:")
print(tensor2)
print("Shape of tensor2:", tensor2.shape)
print()


# -------------------------------------------------
# 3. Creating a tensor with random values
# -------------------------------------------------
tensor3 = torch.rand(3, 2)

# torch.rand(3, 2) creates a tensor with random values
# Values are sampled from a uniform distribution between 0 and 1
# Shape of tensor3 -> (3, 2)
# Random tensors are often used to initialize neural network weights

print("Random tensor:")
print(tensor3)
print("Shape of tensor3:", tensor3.shape)
print()


# =================================================
# OPERATIONS ON TENSORS
# =================================================


# -------------------------------------------------
# 4. Tensor Addition
# -------------------------------------------------
result_add = tensor1 + tensor2

# tensor1 shape -> (3,)
# tensor2 shape -> (2, 3)
#
# PyTorch uses BROADCASTING here:
# tensor1 is automatically expanded to match the shape of tensor2
# The values [1, 2, 3] are added to each row of tensor2
#
# Result shape -> (2, 3)
# Result dtype -> float32 (int64 + float32 is promoted to float32)

print("Addition result (tensor1 + tensor2):")
print(result_add)
print("Shape of addition result:", result_add.shape)
print()


# -------------------------------------------------
# 5. Scalar Multiplication
# -------------------------------------------------
result_mul = tensor2 * 5

# Each element of tensor2 is multiplied by the scalar value 5
# This operation does NOT change the shape
# Shape remains -> (2, 3)

print("Multiplication result (tensor2 * 5):")
print(result_mul)
print("Shape of multiplication result:", result_mul.shape)
print()


# -------------------------------------------------
# 6. Matrix Multiplication
# -------------------------------------------------
result_matmul = torch.matmul(tensor2, tensor3)

# tensor2 shape -> (2, 3)
# tensor3 shape -> (3, 2)
#
# Matrix multiplication rule:
# (m x n) · (n x p) = (m x p)
#
# So:
# (2 x 3) · (3 x 2) = (2 x 2)
#
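# torch.matmul() performs matrix multiplication: each output entry is the
# dot product of a row of tensor2 with a column of tensor3 (tensor2 @ tensor3 is equivalent)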

print("Matrix multiplication result (tensor2 @ tensor3):")
print(result_matmul)
print("Shape of matrix multiplication result:", result_matmul.shape)
print()


# -------------------------------------------------
# SUMMARY
# -------------------------------------------------
# - torch.tensor() creates tensors from Python data
# - torch.zeros() creates tensors filled with zeros
# - torch.rand() creates tensors with random values
# - PyTorch supports broadcasting for element-wise operations
# - torch.matmul() follows strict matrix multiplication rules
# - Tensors are the foundation of all PyTorch models


Tensor from list: tensor([1, 2, 3])
Tensor of zeros: tensor([[0., 0., 0.],
        [0., 0., 0.]])
Random tensor: tensor([[0.5498, 0.9187],
        [0.7770, 0.7964],
        [0.0734, 0.3793]])
Addition result: tensor([[1., 2., 3.],
        [1., 2., 3.]])
Multiplication result: tensor([[0., 0., 0.],
        [0., 0., 0.]])
Matrix multiplication result: tensor([[0., 0.],
        [0., 0.]])
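
The addition above relies on two implicit rules: broadcasting of shapes and promotion of data types. The sketch below illustrates both on the same shapes used in the cell; the variable names `a`, `b`, `total`, and `broadcast_failed` are chosen here for illustration only.

```python
import torch

a = torch.tensor([1, 2, 3])   # shape (3,), dtype int64
b = torch.zeros(2, 3)         # shape (2, 3), dtype float32 (the default float type)

# Broadcasting aligns shapes from the right: (3,) is treated as (1, 3),
# then stretched along the first dimension to match (2, 3).
total = a + b

# Mixing int64 with float32 promotes the result to float32.
print(total.shape, total.dtype)

# Shapes whose trailing dimensions differ (and are not 1) cannot be broadcast.
try:
    torch.tensor([1, 2]) + b   # (2,) vs (2, 3): trailing sizes 2 and 3 do not match
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print("Incompatible shapes raise an error:", broadcast_failed)
```

This is why the addition result in the output above prints `1., 2., 3.` with decimal points even though `tensor1` holds integers.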


### Autograd : Automatic Differentiation in PyTorch

In [8]:
# ============================================
# PyTorch Autograd: Automatic Differentiation
# ============================================

# Import PyTorch
import torch


# -------------------------------------------------
# WHAT IS AUTOGRAD?
# -------------------------------------------------
# Autograd is PyTorch's automatic differentiation engine.
# It automatically computes gradients for tensor operations.
# It answers the question: "If I change this number a little, how much does the output change?" That rate of change is the gradient.
# In deep learning:
# - Gradients tell us how much a parameter affects the output
# - They are used to update model weights during training
#
# Without autograd, we would need to manually calculate
# derivatives, which is slow and error-prone.


# -------------------------------------------------
# 1. Creating tensors with gradient tracking enabled
# -------------------------------------------------
# Create two PyTorch tensors holding the values 2.0 and 3.0 and store them in the variables x and y
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

# torch.tensor(2.0, ...) stores the number 2.0 in a PyTorch tensor (likewise 3.0 for y)
# requires_grad=True tells PyTorch to:
# - Track all operations performed on this tensor and remember how the output was computed
# - Build a computation graph
# - Enable gradient calculation later for this tensor
#
# x and y are scalar tensors (0-dimensional)


# -------------------------------------------------
# 2. Performing a computation
# -------------------------------------------------
z = x**2 + y**3

# This expression means:
# z = (x squared) + (y cubed)
#
# Substituting values:
# z = (2^2) + (3^3)
# z = 4 + 27 = 31
#
# PyTorch internally records this operation in a computation graph

print("Output tensor z:", z)


# -------------------------------------------------
# 3. Backpropagation (Gradient Computation)
# -------------------------------------------------
z.backward()

# z.backward() computes:
# - dz/dx (gradient of z with respect to x)
# - dz/dy (gradient of z with respect to y)
#
# This is done using the chain rule from calculus
# Gradients are stored in x.grad and y.grad


# -------------------------------------------------
# 4. Understanding the gradients mathematically
# -------------------------------------------------
# z = x^2 + y^3
#
# Partial derivative with respect to x:
# dz/dx = 2x
# dz/dx = 2 * 2 = 4
#
# Partial derivative with respect to y:
# dz/dy = 3y^2
# dz/dy = 3 * (3^2) = 27

print("Gradient of x (dz/dx):", x.grad)
print("Gradient of y (dz/dy):", y.grad)


# -------------------------------------------------
# IMPORTANT NOTES
# -------------------------------------------------
# 1. Gradients are accumulated by default
#    Calling backward() multiple times will add gradients
#
# 2. backward() can only be called on scalar outputs
#    (or you must provide a gradient argument)
#
# 3. Autograd is the backbone of:
#    - Backpropagation
#    - Optimizers
#    - Neural network training in PyTorch


# -------------------------------------------------
# SUMMARY
# -------------------------------------------------
# - Autograd automatically computes gradients
# - requires_grad=True enables gradient tracking
# - backward() triggers backpropagation
# - .grad stores the computed gradients
# - No manual derivative calculations are needed


Output tensor z: tensor(31., grad_fn=<AddBackward0>)
Gradient of x (dz/dx): tensor(4.)
Gradient of y (dz/dy): tensor(27.)
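
The first two points under IMPORTANT NOTES can be checked directly. The following is a minimal sketch, assuming a fresh scalar tensor `x` and a small vector `v` (both introduced here for illustration, separate from the cell above): it shows gradients adding up across `backward()` calls, and a non-scalar output being reduced to a scalar before differentiating.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# First backward pass: d(x^2)/dx = 2x = 4
(x ** 2).backward()
first = x.grad.item()

# Second backward pass without clearing: the new gradient is ADDED to the old one
(x ** 2).backward()
accumulated = x.grad.item()

# Reset the stored gradient in place, then run backward again
x.grad.zero_()
(x ** 2).backward()
after_reset = x.grad.item()

print(first, accumulated, after_reset)

# For a non-scalar output, reduce it to a scalar first (here with sum())
v = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
(v ** 2).sum().backward()
print(v.grad)  # d(sum(v^2))/dv = 2v
```

This accumulation is the reason training loops call `optimizer.zero_grad()` once per iteration.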


### NEURAL NETWORKS IN PYTORCH

In [11]:
# ============================================================
# NEURAL NETWORKS IN PYTORCH
# ============================================================

# ------------------------------------------------------------
# 1. IMPORTING LIBRARIES (TOOLS WE NEED)
# ------------------------------------------------------------
# torch         -> main PyTorch library
# torch.nn      -> used to build neural networks
# torch.optim   -> used to update model weights (learning)
# sklearn       -> only used to load and prepare the dataset

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split # Split your data into training and testing parts
from sklearn.preprocessing import StandardScaler


# ------------------------------------------------------------
# 2. WHAT IS THE IRIS DATASET?
# ------------------------------------------------------------
# Each flower has 4 input values:
# - sepal length
# - sepal width
# - petal length
# - petal width
#
# The output (label) is the flower type:
# 0 = Setosa
# 1 = Versicolor
# 2 = Virginica

iris = load_iris()

# X = input data (features)
# y = correct answers (labels)
X = iris.data  # input data (what the model sees)
y = iris.target  # target/output (what the model predicts)


# ------------------------------------------------------------
# 3. SPLITTING DATA INTO TRAINING AND TESTING
# ------------------------------------------------------------
# Training data -> used to teach the model
# Testing data  -> used to check how good it learned
# test_size=0.2 means 20% of the data goes to testing and 80% goes to training
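# random_state=42 fixes the random seed, so every run gives the same split and results are reproducible
# (without it, every run gives a different split and results change each time)
# 42 is an arbitrary choice; any integer works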

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


# ------------------------------------------------------------
# 4. STANDARDIZING THE DATA
# ------------------------------------------------------------
# Neural networks learn better when all features are on a similar scale.
# If one feature has much larger values than the others, it dominates
# and the smaller features barely influence the result.
# StandardScaler transforms each feature so that it has:
# - a mean of 0 → values are centered around 0
# - a standard deviation of 1 → all features are comparable in size

scaler = StandardScaler()                # tool to scale data
X_train = scaler.fit_transform(X_train)  # compute the mean and std. dev. of each training feature,
                                         # then apply (x - mean) / std to the training data
X_test = scaler.transform(X_test)        # reuse the mean and std from the training data
                                         # - do NOT recalculate them from X_test!
                                         # this keeps the evaluation fair: no information leaks from the test set
# After preprocessing, you have (illustrative values; the real X_train has 4 columns):
# X_train  # features, like [[0.5, -1.2], [1.3, 0.7], ...]
# y_train  # labels, like [0, 1, 0, 2, ...]

# ------------------------------------------------------------
# 5. CONVERT NUMPY ARRAYS TO PYTORCH TENSORS
# ------------------------------------------------------------
# Right now, X_train, X_test, y_train, y_test are NumPy arrays.
# PyTorch layers and loss functions expect tensors, so we convert them.

X_train = torch.tensor(X_train, dtype=torch.float32)   # features have decimals (like 0.5, -1.2) → float32
X_test = torch.tensor(X_test, dtype=torch.float32)     # same as X_train
y_train = torch.tensor(y_train, dtype=torch.long)      # labels are integers (0, 1, 2) → CrossEntropyLoss expects int64 class indices
y_test = torch.tensor(y_test, dtype=torch.long)        # same as y_train

# After conversion the data looks like this (illustrative values; the real X_train has 4 columns):
# X_train: tensor([[ 0.5000, -1.2000],
#         [ 1.3000,  0.7000]])
# y_train: tensor([0, 1])

# ============================================================
# 6. WHAT IS nn.Module?
# ============================================================
# nn.Module is the base class (the blueprint) for all neural networks in PyTorch.
# Every PyTorch model inherits from it.
# Think of it like:
# "If you want to build a neural network, you MUST follow this format"
#
# It handles:
# - storing weights and biases
# - tracking parameters
# - saving & loading models
# - working with autograd automatically


class SimpleNN(nn.Module):                                            # SimpleNN is a child of nn.Module
                                                                      # will have all the features from nn.Module
    # --------------------------------------------------------
    # __init__(): CREATE/BUILDS THE LAYERS PARTS
    # --------------------------------------------------------
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNN, self).__init__()                              # super(): connects your network to PyTorch’s base functionality

        # nn.Linear = fully connected layer
        # It learns weights and bias automatically

        # Input layer/fully connected layer:
        # Takes 4 input numbers -> outputs hidden_size numbers
        self.fc1 = nn.Linear(input_size, hidden_size)

        # ReLU activation:
        # Makes the network NON-LINEAR. Without it, stacked linear layers collapse into a single linear transformation
        self.relu = nn.ReLU()

        # Output layer:
        # Takes hidden_size numbers -> outputs 3 numbers (classes)
        self.fc2 = nn.Linear(hidden_size, output_size)


    # --------------------------------------------------------
    # forward(): HOW DATA FLOWS THROUGH THE NETWORK
    # --------------------------------------------------------
    def forward(self, x):

        # Step 1: Input goes through first linear layer
        # fc1 multiplies the inputs by learned weights, sums them, and adds a learned bias → produces new numbers
        # These new numbers are called hidden activations because they live in the hidden layer (inside the network, not the final output yet).
        x = self.fc1(x)

        # Step 2: Apply activation function
        x = self.relu(x)

        # Step 3: Pass through output layer
        x = self.fc2(x)

        # Final output returned
        return x

        # input → machine → hidden calculations → activation → final prediction


# ============================================================
# 7. WHAT IS nn.Parameter?
# ============================================================
# You DO NOT see nn.Parameter directly here
#
# Why?
# Because nn.Linear AUTOMATICALLY creates nn.Parameter objects
#
# Example:
# - weights and biases inside fc1 and fc2 are nn.Parameter objects
# - PyTorch knows they must be learned
# - Optimizer updates them automatically


# ============================================================
# 8. CREATE THE MODEL
# ============================================================
# input_size  = 4 features
# hidden_size = 10 neurons (your choice)
# output_size = 3 flower classes

model = SimpleNN(input_size=4, hidden_size=10, output_size=3)


# ------------------------------------------------------------
# 9. LOSS FUNCTION AND OPTIMIZER
# ------------------------------------------------------------
# Loss function:
# Measures how wrong the model's predictions are
criterion = nn.CrossEntropyLoss()

# Optimizer:
# Updates model parameters to reduce loss
optimizer = optim.Adam(model.parameters(), lr=0.01)


# ============================================================
# 10. TRAINING THE NEURAL NETWORK
# ============================================================
# Training means:
# - Predict
# - Check error
# - Fix weights
# - Repeat

num_epochs = 100

for epoch in range(num_epochs):

    # 1. Forward pass (prediction)
    outputs = model(X_train)

    # 2. Compute loss (how wrong)
    loss = criterion(outputs, y_train)

    # 3. Clear old gradients
    optimizer.zero_grad()

    # 4. Backward pass (compute gradients)
    loss.backward()

    # 5. Update weights
    optimizer.step()

    # Print progress
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")


# ============================================================
# 11. EVALUATING THE MODEL (TESTING)
# ============================================================
# We check how well the model performs on unseen data

with torch.no_grad():  # Disable gradient tracking: no weights are updated during testing

    outputs = model(X_test)

    # Get class with highest score
    _, predicted = torch.max(outputs, 1)

    # Calculate accuracy
    accuracy = (predicted == y_test).sum().item() / len(y_test)

print(f"Accuracy on the test set: {accuracy:.2f}")


# ============================================================
# FINAL SIMPLE SUMMARY
# ============================================================
# Tensor        = multi-dimensional array with GPU and autograd support
# nn.Module    = neural network template
# nn.Linear    = fully connected layer with learnable weights and bias
# ReLU         = adds non-linearity
# forward()    = data flow
# Loss         = how wrong
# Optimizer    = fixes mistakes
# Training     = learn again and again
# Evaluation   = test how good it is

Epoch [10/100], Loss: 0.8548
Epoch [20/100], Loss: 0.5777
Epoch [30/100], Loss: 0.4146
Epoch [40/100], Loss: 0.3266
Epoch [50/100], Loss: 0.2591
Epoch [60/100], Loss: 0.2026
Epoch [70/100], Loss: 0.1581
Epoch [80/100], Loss: 0.1261
Epoch [90/100], Loss: 0.1053
Epoch [100/100], Loss: 0.0917
Accuracy on the test set: 1.00
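
Section 7 of the cell above says that `nn.Linear` creates `nn.Parameter` objects automatically. One way to see this is to list them. The sketch below uses a hypothetical stand-in, `sketch_model`, built with `nn.Sequential` from the same layer sizes as `SimpleNN` (4 → 10 → 3), so that it runs on its own; calling `named_parameters()` on the trained `model` works the same way.

```python
import torch.nn as nn

# Stand-in with the same layer sizes as SimpleNN above (4 -> 10 -> 3)
sketch_model = nn.Sequential(
    nn.Linear(4, 10),
    nn.ReLU(),
    nn.Linear(10, 3),
)

# named_parameters() yields every nn.Parameter registered on the module
for name, param in sketch_model.named_parameters():
    print(name, tuple(param.shape), type(param).__name__, param.requires_grad)

# Count all learnable values: 4*10 weights + 10 biases + 10*3 weights + 3 biases
total_params = sum(p.numel() for p in sketch_model.parameters())
print("Total learnable values:", total_params)
```

These are exactly the tensors that `model.parameters()` hands to the optimizer.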


### Working with Data in PyTorch

In [12]:
# ============================================================
# PyTorch Data Handling: Dataset & DataLoader
# ============================================================
"""
In PyTorch, working with data is a **crucial step** for building machine learning models.
Efficient data handling helps your model train faster and use memory sensibly.

Two main components for data handling in PyTorch:

1️⃣ Dataset
2️⃣ DataLoader

-----------------------------------
1️⃣ Dataset: The interface for your data
-----------------------------------
- The Dataset class allows PyTorch to understand **your data format**.
- You can use built-in datasets (like MNIST, CIFAR) or create **custom datasets**.
- A custom dataset **must implement two functions**:
    a) __len__()     → returns the total number of samples
    b) __getitem__(idx) → returns a single sample (features + label) at index idx

Think of Dataset as a **library card catalog**: you can ask it:
    - How many books (samples) are there? → __len__()
    - Give me the 5th book → __getitem__(4)

-----------------------------------
2️⃣ DataLoader: How to feed data in batches
-----------------------------------
- DataLoader wraps around Dataset and allows you to:
    - Load **batches** of data (mini-batch training)
    - **Shuffle** data (randomize order for better training)
    - Use **multiple workers** for faster loading
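    - Optionally use pinned memory (pin_memory=True) to speed up CPU-to-GPU copies; moving each batch to the GPU is still done in your training loop with .to(device)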

Think of DataLoader as a **conveyor belt**:
- Dataset = books in a library
- DataLoader = conveyor belt delivering N books at a time to the model

-----------------------------------
Example: Custom Dataset + DataLoader
-----------------------------------
"""

import torch
from torch.utils.data import DataLoader, Dataset

# -----------------------------
# Step 1: Define a Custom Dataset
# -----------------------------
class CustomDataset(Dataset):
    def __init__(self, data, targets):
        """
        data: input samples (features)
        targets: labels
        """
        self.data = data
        self.targets = targets

    def __len__(self):
        # total number of samples
        return len(self.data)

    def __getitem__(self, idx):
        # fetch sample at index 'idx'
        return self.data[idx], self.targets[idx]

# -----------------------------
# Step 2: Create sample data
# -----------------------------
data = torch.randn(100, 3, 32, 32)   # 100 example images (3 channels, 32x32 pixels)
targets = torch.randint(0, 10, (100,))  # 100 labels for 10 classes

# Create an instance of the custom dataset
custom_dataset = CustomDataset(data, targets)

# -----------------------------
# Step 3: Wrap dataset in a DataLoader
# -----------------------------
batch_size = 32
shuffle = True       # shuffle data each epoch
num_workers = 4      # number of parallel workers for loading data

data_loader = DataLoader(custom_dataset, batch_size=batch_size,
                         shuffle=shuffle, num_workers=num_workers)

# -----------------------------
# Step 4: Iterate over batches
# -----------------------------
for batch_idx, (inputs, labels) in enumerate(data_loader):   # "labels" avoids overwriting the full "targets" tensor
    print(f"Batch {batch_idx+1}: Inputs shape: {inputs.shape}, Targets shape: {labels.shape}")

"""
-----------------------------------
What happens in the code:

1️⃣ Dataset:
- CustomDataset stores your data and labels
- __len__ tells how many samples there are (100 in this case)
- __getitem__ fetches one sample (image + label)

2️⃣ DataLoader:
- Batches data into size 32 (100 samples → 3 full batches plus a final batch of 4)
- Shuffles the order each epoch for better learning
- Uses 4 workers to load data faster

3️⃣ Iteration:
- Each iteration gives a **batch of inputs and targets**
- Example output shapes:
    - Inputs: torch.Size([32, 3, 32, 32]) → 32 images in batch
    - Targets: torch.Size([32]) → 32 labels

-----------------------------------
Key Notes:
- Use Dataset + DataLoader for **any PyTorch model** (images, text, tabular data)
- Shuffling changes the sample order each epoch, so batches differ between epochs and the model does not learn from the order of the data
- Batching reduces memory usage and speeds up training
- CustomDataset is flexible: you can add **preprocessing**, **data augmentation**, etc. in __getitem__()
"""



Batch 1: Inputs shape: torch.Size([32, 3, 32, 32]), Targets shape: torch.Size([32])
Batch 2: Inputs shape: torch.Size([32, 3, 32, 32]), Targets shape: torch.Size([32])
Batch 3: Inputs shape: torch.Size([32, 3, 32, 32]), Targets shape: torch.Size([32])
Batch 4: Inputs shape: torch.Size([4, 3, 32, 32]), Targets shape: torch.Size([4])



### Preprocessing Data: Transformations and Normalization

In [13]:
# ============================================================
# PyTorch Image Preprocessing: Transformations & Normalization
# ============================================================
"""
Before feeding data to a neural network, we need to **preprocess it**.
Preprocessing ensures that data is in the **right format** and makes training **faster and more stable**.

Two main steps in preprocessing:

1️⃣ Transformations
2️⃣ Normalization

-----------------------------------
1️⃣ Transformations:
-----------------------------------
Transformations are **operations applied to images** to prepare or augment them.
- Resize: Make all images the same size
- Crop: Cut a part of the image (can be random for augmentation)
- Flip / Rotate: Create variations of images for robustness
- ToTensor: Convert images into PyTorch tensors (numbers the model can understand)

Think of it like **preparing ingredients** for a recipe:
- You chop, clean, and adjust ingredients before cooking.

-----------------------------------
2️⃣ Normalization:
-----------------------------------
- Normalization rescales each channel with a fixed mean and std; the result has roughly **zero mean and unit variance** when the images resemble the data those statistics came from
- Helps the neural network **learn faster** and keeps training more stable
- Standard normalization formula:
    normalized_pixel = (pixel - mean) / std

- Commonly used values for RGB images (torchvision's pretrained models expect them):
    mean = [0.485, 0.456, 0.406]   # per-channel mean of the ImageNet dataset
    std  = [0.229, 0.224, 0.225]   # per-channel standard deviation of ImageNet

-----------------------------------
Example: Preprocessing an image
-----------------------------------
"""

import torch
import torchvision.transforms as transforms

# -----------------------------
# Step 1: Define transformations
# -----------------------------
transform = transforms.Compose([
    transforms.Resize(256),              # Resize so the shorter side is 256 pixels (aspect ratio is kept)
    transforms.RandomCrop(224),          # Randomly crop 224x224 from resized image
    transforms.RandomHorizontalFlip(),   # Randomly flip image horizontally
    transforms.ToTensor(),               # Convert PIL image to PyTorch tensor (0-1 range)
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # Normalize each channel
                         std=[0.229, 0.224, 0.225])
])

# -----------------------------
# Step 2: Create an example image
# -----------------------------
# Random tensor simulating an RGB image (torch.rand gives values in the 0-1 range that ToPILImage expects for float input)
example_image = transforms.ToPILImage()(torch.rand(3, 256, 256))

# -----------------------------
# Step 3: Apply transformations
# -----------------------------
transformed_image = transform(example_image)

# -----------------------------
# Step 4: Check shape
# -----------------------------
print("Transformed image shape:", transformed_image.shape)
# Expected output: torch.Size([3, 224, 224])

"""
-----------------------------------
Explanation of what happened:

1️⃣ Resize:
- Shorter side of the image → 256 pixels (the 256x256 example image is unchanged)
- Brings images to a comparable size before cropping

2️⃣ RandomCrop:
- Randomly selects 224x224 region
- Acts as **data augmentation** (network sees slightly different images each epoch)

3️⃣ RandomHorizontalFlip:
- Randomly flips image left-right
- Adds more variations → network becomes more robust

4️⃣ ToTensor:
- Converts image to tensor
- Pixel values now range 0-1 instead of 0-255

5️⃣ Normalize:
- Shifts and scales each channel using the given mean and std (result is roughly mean=0, std=1 for ImageNet-like images)
- Helps **stabilize training** and speeds up learning

-----------------------------------
Analogy:
- Resize & crop = chopping ingredients to same size
- Flip = flipping or rotating ingredients to get more variety
- ToTensor = converting ingredients into a format your blender (model) understands
- Normalize = making all ingredients balanced in taste before mixing

-----------------------------------
Key Notes:
- Transformations = **data augmentation + formatting**
- Normalization = **scaling data to help learning**
- Combine transformations with `transforms.Compose()` to apply in sequence
"""

Transformed image shape: torch.Size([3, 224, 224])
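
To make the normalization formula concrete, the sketch below applies `(pixel - mean) / std` by hand with plain tensor operations, which mirrors the per-channel arithmetic that `transforms.Normalize` performs. The image is uniform random noise (an assumption made for illustration), so its statistics do not match ImageNet and the result is not exactly zero-mean, unit-variance.

```python
import torch

mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

# A fake RGB image in the 0-1 range, shape (channels, height, width)
image = torch.rand(3, 224, 224)

# Reshape the statistics to (3, 1, 1) so they broadcast over height and width
normalized = (image - mean[:, None, None]) / std[:, None, None]

# Per-channel statistics after normalization. Uniform noise has a mean near 0.5
# and a std near 0.29, so these values are NOT 0 and 1.
print(normalized.mean(dim=(1, 2)))
print(normalized.std(dim=(1, 2)))

# Undo the normalization, e.g. before displaying the image
restored = normalized * std[:, None, None] + mean[:, None, None]
print(torch.allclose(restored, image, atol=1e-5))
```

The inverse step is handy when plotting a normalized tensor, since the normalized values fall outside the 0-1 range that image viewers expect.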



In [14]:
# ============================================================
# PyTorch: Handling Custom Datasets
# ============================================================
"""
In PyTorch, we often work with datasets that are **not standard**, i.e., custom datasets.
To handle these datasets efficiently, PyTorch provides the `Dataset` class.

Key Points:
1️⃣ Custom Dataset allows you to define **how to fetch your data and labels**.
2️⃣ You need to implement two methods:
   a) __len__()      → returns total number of samples
   b) __getitem__(idx) → returns a sample and its corresponding label

Analogy:
- Dataset = your personal library
- __len__ = how many books are in the library
- __getitem__ = fetch the Nth book

-----------------------------------
Step 1: Import required libraries
-----------------------------------
"""
import torch
from torch.utils.data import Dataset, DataLoader

# ----------------------------------
# Step 2: Create a Custom Dataset
# ----------------------------------
class CustomDataset(Dataset):
    def __init__(self, data, targets):
        """
        data: features (input samples)
        targets: labels
        """
        self.data = data
        self.targets = targets

    def __len__(self):
        # Return total number of samples
        return len(self.data)

    def __getitem__(self, index):
        # Fetch sample and its target based on the index
        sample = self.data[index]
        target = self.targets[index]
        return sample, target

# ----------------------------------
# Step 3: Create sample data
# ----------------------------------
data = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])  # example features
targets = torch.tensor([0, 1, 0, 1])                   # example labels

# Create instance of the custom dataset
custom_dataset = CustomDataset(data, targets)

# ----------------------------------
# Step 4: Create a DataLoader
# ----------------------------------
batch_size = 2
data_loader = DataLoader(custom_dataset, batch_size=batch_size, shuffle=True)

# ----------------------------------
# Step 5: Iterate over DataLoader
# ----------------------------------
for batch_idx, (samples, labels) in enumerate(data_loader):   # "labels" avoids overwriting the full "targets" tensor
    print(f"Batch {batch_idx}:")
    print("Samples:", samples)
    print("Targets:", labels)

"""
-----------------------------------
Explanation:

1️⃣ __init__():
- Store your data and labels in the dataset object

2️⃣ __len__():
- Returns the number of samples
- Example: len(custom_dataset) → 4

3️⃣ __getitem__(index):
- Returns the sample and label at that index
- Example: custom_dataset[2] → (tensor([5, 6]), tensor(0))

4️⃣ DataLoader:
- Handles batching, shuffling, and parallel loading
- batch_size = number of samples per batch
- shuffle = randomizes order for each epoch

5️⃣ Iterating:
- Each iteration returns a **batch of samples and targets**
- Example output (the order varies from run to run because shuffle=True):
Batch 0:
Samples: tensor([[5, 6],
                 [3, 4]])
Targets: tensor([0, 1])
Batch 1:
Samples: tensor([[1, 2],
                 [7, 8]])
Targets: tensor([0, 1])

-----------------------------------
Key Notes:
- CustomDataset + DataLoader = most flexible way to handle **any dataset**
- You can add **transformations, preprocessing, or augmentation** in __getitem__()
- DataLoader ensures **efficient mini-batch training**
"""

Batch 0:
Samples: tensor([[7, 8],
        [5, 6]])
Targets: tensor([1, 0])
Batch 1:
Samples: tensor([[1, 2],
        [3, 4]])
Targets: tensor([0, 1])
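
The key note about adding transformations in `__getitem__()` can be sketched as follows. This is a minimal illustration, not part of the original notebook: `TransformedDataset`, its `transform` argument, and the `scale_to_unit` helper are hypothetical names, and dividing by 8 is just an arbitrary scaling chosen because 8 is the largest value in the toy data.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TransformedDataset(Dataset):
    """Hypothetical variant of CustomDataset with an optional per-sample transform."""

    def __init__(self, data, targets, transform=None):
        self.data = data
        self.targets = targets
        self.transform = transform  # any callable that takes and returns a tensor

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        sample = self.data[index]
        target = self.targets[index]
        if self.transform is not None:
            # Applied lazily, one sample at a time, when the sample is requested
            sample = self.transform(sample)
        return sample, target


def scale_to_unit(sample):
    # Illustrative preprocessing: cast to float and divide by 8
    return sample.float() / 8.0


features = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])
labels = torch.tensor([0, 1, 0, 1])
dataset = TransformedDataset(features, labels, transform=scale_to_unit)

# shuffle=False keeps the batch order fixed so the result is reproducible
loader = DataLoader(dataset, batch_size=2, shuffle=False)
batches = [(x, y) for x, y in loader]
first_x, first_y = batches[0]
print("First batch samples:", first_x)
print("First batch targets:", first_y)
```

Because the transform runs inside `__getitem__()`, the stored tensor stays untouched and the same dataset class can be reused with a different transform (or none) for validation data.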



In [15]:
# ============================================================
# PyTorch Intermediate Topics: Optimizers, Loss, Evaluation
# ============================================================
"""
After learning the basics, we now move to **intermediate topics** in PyTorch.
These are essential to **train, optimize, and evaluate models** properly.

Topics covered:
1️⃣ Optimizers
2️⃣ Loss Functions
3️⃣ Validation and Testing
4️⃣ Overfitting vs Underfitting
5️⃣ Model Evaluation
"""

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# ============================================================
# Step 1: Load Example Dataset (Iris)
# ============================================================
iris = load_iris()
X = torch.tensor(iris.data, dtype=torch.float32)  # Features
y = torch.tensor(iris.target, dtype=torch.long)   # Labels (0,1,2)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# ============================================================
# Step 2: Build a simple Neural Network
# ============================================================
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Initialize model
input_size = X.shape[1]   # 4 features
hidden_size = 10
output_size = 3           # 3 classes
model = SimpleNN(input_size, hidden_size, output_size)

# ============================================================
# Step 3: Define Loss Function
# ============================================================
# CrossEntropyLoss for multi-class classification
criterion = nn.CrossEntropyLoss()

# ============================================================
# Step 4: Define Optimizer
# ============================================================
# Using Adam optimizer
optimizer = optim.Adam(model.parameters(), lr=0.01)

# ============================================================
# Step 5: Train the Model
# ============================================================
epochs = 100
for epoch in range(epochs):
    # Forward pass
    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    # Backward pass & optimization
    optimizer.zero_grad()  # reset gradients
    loss.backward()        # compute gradients
    optimizer.step()       # update weights

    if (epoch+1) % 20 == 0:
        print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}")

# ============================================================
# Step 6: Evaluate the Model
# ============================================================
# Switch to evaluation mode
model.eval()
with torch.no_grad():  # No need to track gradients for evaluation
    test_outputs = model(X_test)
    _, predicted = torch.max(test_outputs, 1)  # predicted class
    accuracy = (predicted == y_test).sum().item() / y_test.size(0)

print(f"\nTest Accuracy: {accuracy*100:.2f}%")

"""
------------------------------------------------------------
Explanation:

1️⃣ Optimizers:
- Optimizers such as Adam, SGD, and Adagrad update the model weights to reduce the loss
- Adam adapts the step size per parameter and often converges faster than plain SGD

2️⃣ Loss Functions:
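- CrossEntropyLoss = measures how far the predicted class scores are from the true class;
  it expects raw logits (it applies log-softmax internally), so forward() has no softmax layer
- MSELoss = for regression, the mean of the squared errors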

3️⃣ Validation & Testing:
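- Hold out data the model never trains on, then check the model on that **unseen data**
- This example uses only a train/test split; a separate validation set is what you would use to tune hyperparameters
- Comparing train and held-out results helps detect overfitting or underfitting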

4️⃣ Overfitting vs Underfitting:
- Overfitting: model performs well on train but poorly on test
- Underfitting: model performs poorly on both train and test
- For overfitting: add regularization, train for fewer epochs, shrink the model, or get more data
- For underfitting: use a larger model or train for more epochs

5️⃣ Model Evaluation:
- Check accuracy, precision, recall, F1 (classification)
- Check MSE, MAE (regression)
- Here we calculated **accuracy on the Iris test set** (30 samples, so a single score is a rough estimate)

------------------------------------------------------------
Key Notes:
- Optimizer + Loss + Proper Evaluation = backbone of model training
- Intermediate topics help **improve model performance** and generalization
- Experiment with the learning rate, hidden layer size, and number of epochs to see their effect on loss and accuracy
"""


Epoch [20/100], Loss: 0.6663
Epoch [40/100], Loss: 0.3969
Epoch [60/100], Loss: 0.2743
Epoch [80/100], Loss: 0.1808
Epoch [100/100], Loss: 0.1296

Test Accuracy: 100.00%
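
The explanation lists precision, recall, and F1 alongside accuracy, but the cell only computes accuracy. The sketch below shows one way to get the per-class versions from a confusion matrix using plain tensor operations. The function name `per_class_metrics` and the demo tensors are assumptions made for illustration; they do not come from the notebook.

```python
import torch


def per_class_metrics(predicted, actual, num_classes):
    """Return per-class precision, recall, and F1 as three 1-D float tensors."""
    # confusion[i, j] = number of samples with true class i predicted as class j
    confusion = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for t, p in zip(actual.tolist(), predicted.tolist()):
        confusion[t, p] += 1

    true_pos = confusion.diag().float()
    predicted_count = confusion.sum(dim=0).float()  # column sums: how often each class was predicted
    actual_count = confusion.sum(dim=1).float()     # row sums: how often each class occurred

    # clamp(min=...) avoids division by zero for classes never predicted or never present
    precision = true_pos / predicted_count.clamp(min=1)
    recall = true_pos / actual_count.clamp(min=1)
    f1 = 2 * precision * recall / (precision + recall).clamp(min=1e-12)
    return precision, recall, f1


# Made-up labels and predictions, purely for illustration
y_true_demo = torch.tensor([0, 0, 1, 1, 2, 2])
y_pred_demo = torch.tensor([0, 1, 1, 1, 2, 0])

precision, recall, f1 = per_class_metrics(y_pred_demo, y_true_demo, num_classes=3)
print("Precision per class:", precision)
print("Recall per class:   ", recall)
print("F1 per class:       ", f1)
```

With the notebook's own variables, the equivalent call would be `per_class_metrics(predicted, y_test, num_classes=3)`. Per-class scores matter most when classes are imbalanced, where a single accuracy number can hide a class the model rarely gets right.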

