# Note
Over the next few notebooks, we will walk through the basic machine learning workflow, which consists of:

    1. Working with Data - Tensors, Datasets, DataLoaders and Transforms

    2. Creating Models - Building the neural network and understanding automatic differentiation with torch.autograd

    3. Optimizing Model Parameters

    4. Saving and Loading the Model

All code is written in .ipynb notebooks using the PyTorch API.

Dataset used: FashionMNIST
Tutorial Link: https://pytorch.org/tutorials/beginner/basics/intro.html

# 1. Working with data

PyTorch provides two primitives for working with data: 'torch.utils.data.DataLoader' and 'torch.utils.data.Dataset'. A Dataset stores the samples and their corresponding labels, while a DataLoader wraps an iterable around the Dataset.
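To make the split between the two primitives concrete, here is a minimal sketch of a custom Dataset wrapped in a DataLoader. The `ToyDataset` class and its random "images" are hypothetical stand-ins for illustration, not part of FashionMNIST or the tutorial.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical dataset: random 1x28x28 'images' with integer labels."""

    def __init__(self, num_samples=10):
        generator = torch.Generator().manual_seed(0)
        self.samples = torch.rand(num_samples, 1, 28, 28, generator=generator)
        self.labels = torch.arange(num_samples) % 10

    def __len__(self):
        # Number of samples the dataset holds
        return len(self.samples)

    def __getitem__(self, idx):
        # Return one (sample, label) pair
        return self.samples[idx], self.labels[idx]


toy_dataset = ToyDataset()
toy_loader = DataLoader(toy_dataset, batch_size=4)

# The DataLoader groups individual samples into batches
batch_shapes = [tuple(X.shape) for X, y in toy_loader]
print(batch_shapes)
# [(4, 1, 28, 28), (4, 1, 28, 28), (2, 1, 28, 28)]
```

The Dataset only knows how to return one sample at a time; batching is handled entirely by the DataLoader, which is why the last batch here holds the two leftover samples.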

In [2]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

Every TorchVision "Dataset" accepts two arguments, "transform" and "target_transform", to modify the samples and labels respectively.

In [3]:
# Download training data from open datasets
training_data = datasets.FashionMNIST(
    root="./data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets
test_data = datasets.FashionMNIST(
    root="./data",
    train=False,
    download=True,
    transform=ToTensor(),
)
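The sketch below illustrates how "transform" and "target_transform" act on a sample and its label. It avoids downloading anything by using a hypothetical `ToyImageDataset`; `scale_to_unit` is a simplified stand-in for what ToTensor does to a uint8 image (scaling values to the range 0 to 1), and `one_hot` is an assumed example of a label transform.

```python
import torch
from torch.utils.data import Dataset


def scale_to_unit(image_uint8):
    # Simplified stand-in for ToTensor: add a channel dimension
    # and scale integer pixel values 0..255 to floats 0..1
    return image_uint8.unsqueeze(0).float() / 255.0


def one_hot(label, num_classes=10):
    # Example target_transform: integer class index -> one-hot vector
    vector = torch.zeros(num_classes)
    vector[label] = 1.0
    return vector


class ToyImageDataset(Dataset):
    """Hypothetical dataset of three all-white 28x28 uint8 images."""

    def __init__(self, transform=None, target_transform=None):
        self.images = torch.full((3, 28, 28), 255, dtype=torch.uint8)
        self.labels = [0, 3, 9]
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        image, label = self.images[idx], self.labels[idx]
        if self.transform is not None:
            image = self.transform(image)          # modifies the sample
        if self.target_transform is not None:
            label = self.target_transform(label)   # modifies the label
        return image, label


toy = ToyImageDataset(transform=scale_to_unit, target_transform=one_hot)
image, target = toy[1]
print(image.shape, image.dtype, image.max().item())
# torch.Size([1, 28, 28]) torch.float32 1.0
print(target.argmax().item())
# 3
```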

Now, we pass the "Dataset" as an argument to "DataLoader". This wraps an iterable over our FashionMNIST dataset and supports the following operations:

    a) Automatic batching

    b) Sampling

    c) Shuffling

    d) Multiprocess data loading

In [4]:
batch_size = 64

In [5]:
# Create data loaders
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
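The DataLoader features listed earlier (batching and shuffling in particular) can be seen on a small example. This is a sketch on a made-up ten-sample `TensorDataset`, not on FashionMNIST; `num_workers` is left at its default of 0, so no multiprocess loading happens here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten toy samples whose label equals their index
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)  # shape [10, 1]
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# shuffle=True draws the samples in a random order each epoch;
# a seeded generator makes that order reproducible
generator = torch.Generator().manual_seed(0)
loader = DataLoader(dataset, batch_size=4, shuffle=True, generator=generator)

batch_sizes = []
seen_labels = []
for X, y in loader:
    batch_sizes.append(len(y))
    seen_labels.extend(y.tolist())

print(batch_sizes)
# [4, 4, 2]
print(sorted(seen_labels) == list(range(10)))
# True
```

Shuffling changes the order in which samples are visited, but every sample still appears exactly once per epoch. Passing `drop_last=True` would discard the final, smaller batch.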


# 2. Creating Models

In PyTorch, we define a neural network as a class that inherits from nn.Module and implements two methods: __init__() and forward(). __init__() defines the layers of the network, and forward() specifies how data passes through the network.

To speed up neural network operations, we can move the model to an accelerator such as CUDA, MPS, MTIA or XPU. If no accelerator is available, we fall back to the CPU.

In [6]:
device = torch.accelerator.current_accelerator().type if torch.accelerator.is_available() else "cpu"

print(f"Using  {device} device")

Using  cuda device
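The `torch.accelerator` namespace is only present in recent PyTorch releases. As a hedged sketch, the hypothetical helper `pick_device` below falls back to the older per-backend checks when that namespace is missing; the fallback order (CUDA, then MPS, then CPU) is an assumption, not something prescribed by the tutorial.

```python
import torch


def pick_device():
    # Prefer the unified accelerator API when this PyTorch build has it
    accelerator = getattr(torch, "accelerator", None)
    if accelerator is not None and accelerator.is_available():
        return accelerator.current_accelerator().type

    # Fallback for older releases: check individual backends
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()

# Tensors (and models, via .to(device)) must live on the chosen device
x = torch.ones(2, 2, device=device)
print(f"Using {device} device, tensor is on {x.device.type}")
```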


In [None]:
'''
__init__() - a special method called the constructor. It is called automatically when a NeuralNetwork object is created.
             Here we define the layers of the neural network.

self - refers to the instance of the class (the current object).

super().__init__() - calls the constructor of the parent class (nn.Module). It is required to properly initialize everything
                     from nn.Module, such as the tracking of layers and parameters.

self.flatten - a layer that flattens its input. nn.Flatten() converts each multi-dimensional sample (e.g. a 1 x 28 x 28 image)
               into a 1-D vector of 784 values while keeping the batch dimension. It is needed here because the first
               nn.Linear layer expects 784 input features per sample.

self.linear_relu_stack = nn.Sequential() - defines the sequence of layers in the network. nn.Sequential combines
                                           multiple layers into a single block and applies them in order.

nn.Linear(28 * 28, 512) - a fully connected layer that takes the 784 flattened inputs and outputs 512 features.
nn.ReLU() - an activation function that introduces non-linearity; it replaces negative values with zero.
nn.Linear(512, 512) and nn.ReLU() - a fully connected layer with 512 inputs and 512 outputs, followed by a ReLU activation.
nn.Linear(512, 10) - the final fully connected layer, which takes 512 features and outputs 10 values (logits), one for each
                     of the 10 possible classes.
'''
# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )
    '''
    forward() - defines the forward pass of the network, i.e. how input data flows through the network to produce an output.
    x - the input data
    self.flatten(x) - flattens each sample into a 1-D vector
    self.linear_relu_stack(x) - passes the flattened input through the layers defined earlier; the result is returned as logits
    '''

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

In [8]:
model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
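To check the shapes described in the comments, the sketch below pushes a random batch through the same architecture on the CPU. The random input `X` is made up for illustration, and the softmax and argmax steps are one common way to turn logits into a predicted class; the untrained model's predictions are meaningless.

```python
import torch
from torch import nn


class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        return self.linear_relu_stack(x)


model = NeuralNetwork()

# A made-up batch of 3 single-channel 28x28 "images"
X = torch.rand(3, 1, 28, 28)

flat = model.flatten(X)                     # batch dimension is kept
logits = model(X)                           # raw scores, one per class
probabilities = nn.Softmax(dim=1)(logits)   # each row sums to 1
predicted = probabilities.argmax(dim=1)     # most likely class per sample

num_params = sum(p.numel() for p in model.parameters())

print(flat.shape)       # torch.Size([3, 784])
print(logits.shape)     # torch.Size([3, 10])
print(predicted.shape)  # torch.Size([3])
print(num_params)       # 669706
```

The parameter count follows from the layer sizes: (784 x 512 + 512) + (512 x 512 + 512) + (512 x 10 + 10) = 669706.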


# 3. Optimizing the Model Parameters

To train a model, we need a loss function and an optimizer.

In [9]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
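As a rough sketch of what these two objects do, the example below runs a single optimization step on a tiny made-up linear model and checks two things by hand: that nn.CrossEntropyLoss on raw logits equals the mean negative log-softmax of the true class, and that a plain SGD step updates a weight as weight - lr * gradient. The model, data and sizes are assumptions chosen only to keep the example small.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny stand-in model: 4 input features, 3 classes
model = nn.Linear(4, 3)
loss_fn = nn.CrossEntropyLoss()
lr = 1e-3
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

# Made-up batch of 8 samples with integer class labels
X = torch.rand(8, 4)
y = torch.randint(0, 3, (8,))

# CrossEntropyLoss expects raw logits and integer class indices
logits = model(X)
loss = loss_fn(logits, y)

# Manual check: mean negative log-probability of the true class
log_probs = torch.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(8), y].mean()

# One optimization step: backpropagate, update, reset gradients
weight_before = model.weight.detach().clone()
loss.backward()
expected_weight = weight_before - lr * model.weight.grad
optimizer.step()
optimizer.zero_grad()

print(torch.allclose(loss, manual_loss))                     # True
print(torch.allclose(model.weight.detach(), expected_weight))  # True
```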