<a href="https://colab.research.google.com/github/davidandw190/pytorch-deep-learning-workspace/blob/main/notebooks/mnist_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
from torchvision import datasets
from torchvision.transforms import ToTensor


In [None]:

# Load the MNIST dataset for training
train_data = datasets.MNIST(
    root = 'data',          # Directory where the data will be stored
    train = True,           # Specifies that this is the training dataset
    transform = ToTensor(), # Transforms the images to PyTorch tensors
    download = True         # Downloads the dataset if not present
)

# Load the MNIST dataset for testing
test_data = datasets.MNIST(
    root = 'data',
    train = False,          # Specifies that this is the test dataset
    transform = ToTensor(),
    download = True
)

In [4]:
train_data

Dataset MNIST
    Number of datapoints: 60000
    Root location: data
    Split: Train
    StandardTransform
Transform: ToTensor()

In [5]:
test_data


Dataset MNIST
    Number of datapoints: 10000
    Root location: data
    Split: Test
    StandardTransform
Transform: ToTensor()

The `train_data.data` attribute contains the raw pixel values of the images in the MNIST training dataset. When using the `datasets.MNIST` class from `torchvision`, this attribute is a `torch.Tensor` of shape (60000, 28, 28), where:
- 60000 is the number of training images.
- 28 is the height of each image in pixels.
- 28 is the width of each image in pixels.


The `.shape` attribute of a `torch.Tensor` gives the tensor's dimensions as a `torch.Size` object, which behaves like a tuple. It is analogous to the `.shape` attribute of NumPy arrays. Knowing the shape of your data is crucial for making sure it is correctly formatted as input to a neural network.

In [6]:
train_data.data.shape


torch.Size([60000, 28, 28])
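Note that `train_data.data` holds the raw `uint8` pixel values; the `ToTensor()` transform is applied only when a sample is fetched by index or through a data loader. The sketch below writes out by hand what `ToTensor` does to a single grayscale image. The tensor `raw` is a hypothetical random stand-in for one MNIST image, so nothing has to be downloaded.

```python
import torch

# Hypothetical stand-in for one entry of train_data.data: a 28x28 uint8 image
raw = torch.randint(0, 256, (28, 28), dtype=torch.uint8)

# What ToTensor does to such an image, written out by hand:
# add a leading channel dimension and rescale 0-255 integers to floats in [0, 1]
scaled = raw.unsqueeze(0).to(torch.float32) / 255

print(raw.dtype, tuple(raw.shape))        # torch.uint8 (28, 28)
print(scaled.dtype, tuple(scaled.shape))  # torch.float32 (1, 28, 28)
```

The extra leading dimension is the single grayscale channel that convolutional layers expect.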

The `train_data.targets` attribute contains the labels for the images in the MNIST training dataset. Each label corresponds to the digit (0-9) that the image represents. In the case of the MNIST dataset, this attribute is a `torch.Tensor` of shape (60000,), where 60000 is the number of training images.

The `.size()` method of a `torch.Tensor` returns the size of the tensor as a `torch.Size` object, which is a subclass of Python's tuple. This method is useful for quickly inspecting the dimensions of a tensor.

In [None]:
train_data.targets.size()

torch.Size([60000])
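As a quick check of the two accessors described above, the following sketch compares `.shape` and `.size()` on small zero-filled tensors. The names `images` and `labels` are hypothetical stand-ins for `train_data.data` and `train_data.targets`, shrunk to 6 samples so the example runs without the dataset.

```python
import torch

# Hypothetical stand-ins for train_data.data and train_data.targets,
# shrunk to 6 samples so the example runs without downloading MNIST
images = torch.zeros(6, 28, 28, dtype=torch.uint8)
labels = torch.zeros(6, dtype=torch.int64)

# .shape and .size() describe the same dimensions
same_result = images.shape == images.size()

# torch.Size is a tuple subclass, so it unpacks and indexes like a tuple
is_tuple = isinstance(images.shape, tuple)
num_images, height, width = images.shape

# .size(dim) returns a single dimension as a plain int
num_labels = labels.size(0)

print(same_result, is_tuple, num_images, height, width, num_labels)
```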

In [7]:
from torch.utils.data import DataLoader

In [9]:
# Create data loaders for training and testing datasets
loaders = {
    'train': DataLoader(train_data,
                        batch_size=100,  # Number of samples per batch to load
                        shuffle=True,    # Shuffle the data at every epoch
                        num_workers=1),  # Number of subprocesses to use

    'test': DataLoader(test_data,
                       batch_size=100,
                       shuffle=False,   # Evaluation metrics do not depend on sample order
                       num_workers=1)
}

In [10]:
loaders

{'train': <torch.utils.data.dataloader.DataLoader at 0x7a66cae0aaa0>,
 'test': <torch.utils.data.dataloader.DataLoader at 0x7a66cae0aa10>}
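To see what a data loader actually yields, here is a minimal sketch on a synthetic dataset. `fake_images`, `fake_labels`, and `fake_data` are hypothetical names, and `num_workers=0` is an assumption made to keep the example portable; the notebook itself uses one worker process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 1,000 random 1x28x28 "images" with labels 0-9
fake_images = torch.rand(1000, 1, 28, 28)
fake_labels = torch.randint(0, 10, (1000,))
fake_data = TensorDataset(fake_images, fake_labels)

# num_workers=0 loads batches in the main process
loader = DataLoader(fake_data, batch_size=100, shuffle=True, num_workers=0)

# len(loader) is the number of batches: 1000 / 100 = 10
num_batches = len(loader)

# Each iteration yields one (inputs, targets) pair of batched tensors
batch_images, batch_labels = next(iter(loader))
print(num_batches, tuple(batch_images.shape), tuple(batch_labels.shape))
```

By the same arithmetic, the 60,000 training images with `batch_size=100` give 600 batches per epoch, and the 10,000 test images give 100.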

In [11]:
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class CNN(nn.Module):

  def __init__(self):
    super(CNN, self).__init__()

    # Convolutional layers
    self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
    self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
    self.conv2_drop = nn.Dropout2d() # Dropout layer for regularization

    # Fully connected layers
    self.fc1 = nn.Linear(320, 50)
    self.fc2 = nn.Linear(50, 10)

  def forward(self, x):
    # Apply first convolutional layer, then 2x2 max pooling,
    # then ReLU activation
    x = F.relu(F.max_pool2d(self.conv1(x), 2))

    # Apply second convolutional layer, channel-wise dropout,
    # 2x2 max pooling, then ReLU activation
    x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))

    # Flatten the tensor to fit into fully connected layer
    x = x.view(-1, 320)

    # Apply first fully connected layer followed by ReLU activation
    x = F.relu(self.fc1(x))

    # Apply dropout during training
    x = F.dropout(x, training=self.training)

    # Apply second fully connected layer
    x = self.fc2(x)

    return F.log_softmax(x, dim=1)  # Apply log softmax on the output
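The input size of 320 for `fc1` follows from the layer shapes. The sketch below traces one spatial dimension through the network using the standard output-size formula for unpadded convolution and pooling windows; `conv_out` is a hypothetical helper written for this illustration, not a PyTorch function.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of one spatial dimension after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                   # MNIST images are 28x28
size = conv_out(size, kernel=5)             # conv1: 28 -> 24
size = conv_out(size, kernel=2, stride=2)   # max_pool2d(2): 24 -> 12
size = conv_out(size, kernel=5)             # conv2: 12 -> 8
size = conv_out(size, kernel=2, stride=2)   # max_pool2d(2): 8 -> 4

channels = 20                               # output channels of conv2
flat_features = channels * size * size
print(size, flat_features)                  # 4 320
```

Changing the kernel size, the pooling, or the input resolution changes this number, and `fc1` and the `view(-1, 320)` call would need to change with it.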

In [12]:
import torch

# Set the device to GPU if available, otherwise CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Initialize the model and move it to the specified device
model = CNN().to(device)

# Define the optimizer, here using Adam with a learning rate of 0.001
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Define the loss function. CrossEntropyLoss applies log_softmax internally; because the model
# already returns log-probabilities, that extra step leaves them unchanged (nn.NLLLoss would be the direct match)
loss_fn = nn.CrossEntropyLoss()

def train(epoch):
    model.train()  # Set the model to training mode
    for batch_idx, (data, target) in enumerate(loaders['train']):  # Iterate over the training batches
        data, target = data.to(device), target.to(device)  # Move data and target tensors to the specified device
        optimizer.zero_grad()  # Clear the gradients
        output = model(data)  # Forward pass
        loss = loss_fn(output, target)  # Compute the loss
        loss.backward()  # Backward pass
        optimizer.step()  # Update the weights

        # Print training status every 25 batches
        if batch_idx % 25 == 0:
            print(f'Train Epoch: {epoch} [{batch_idx * len(data)}/{len(loaders["train"].dataset)} '
                  f'({100. * batch_idx / len(loaders["train"]):.0f}%)]\tLoss: {loss.item():.6f}')

def test():
    model.eval()  # Set the model to evaluation mode
    test_loss = 0
    correct = 0

    with torch.no_grad():  # Disable gradient computation for evaluation
        for data, target in loaders['test']:  # Iterate over the test data
            data, target = data.to(device), target.to(device)  # Move data and target tensors to the specified device
            output = model(data)  # Forward pass
            test_loss += loss_fn(output, target).item()  # Accumulate the loss
            pred = output.argmax(dim=1, keepdim=True)  # Get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()  # Count correct predictions

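    # loss_fn returns a per-batch mean, so this divides the sum of batch means by the
    # number of samples: the printed value is the mean loss scaled by 1/batch_size.
    # Dividing by len(loaders['test']) instead would give the true average loss.
    test_loss /= len(loaders['test'].dataset)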
    accuracy = 100. * correct / len(loaders['test'].dataset)  # Calculate accuracy

    print(f'\nTest set: Average loss: {test_loss:.4f}, Accuracy: {correct}/{len(loaders["test"].dataset)} '
          f'({accuracy:.0f}%)\n')  # Print test results
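The model's `forward` ends in `log_softmax`, and its output is then passed to `nn.CrossEntropyLoss`, which applies `log_softmax` again internally. The sketch below, using hypothetical random `logits` and `target` values, suggests why this pairing still produces the expected loss: applying `log_softmax` to values that are already log-probabilities leaves them unchanged.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # hypothetical raw scores: 4 samples, 10 classes
target = torch.tensor([3, 0, 7, 1])    # hypothetical class labels

log_probs = F.log_softmax(logits, dim=1)

# CrossEntropyLoss on raw logits ...
ce_on_logits = nn.CrossEntropyLoss()(logits, target)
# ... equals NLLLoss on log-probabilities
nll_on_log_probs = nn.NLLLoss()(log_probs, target)
# CrossEntropyLoss on log-probabilities: the second log_softmax changes nothing
ce_on_log_probs = nn.CrossEntropyLoss()(log_probs, target)

print(ce_on_logits.item(), nll_on_log_probs.item(), ce_on_log_probs.item())
```

All three values agree up to floating-point error. A common convention is to either return raw logits and use `nn.CrossEntropyLoss`, or return log-probabilities and use `nn.NLLLoss`.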


In [13]:
for epoch in range(1, 11):
  train(epoch)
  test()


Test set: Average loss: 0.0014, Accuracy: 9575/10000 (96%)


Test set: Average loss: 0.0010, Accuracy: 9690/10000 (97%)


Test set: Average loss: 0.0008, Accuracy: 9733/10000 (97%)


Test set: Average loss: 0.0007, Accuracy: 9767/10000 (98%)


Test set: Average loss: 0.0006, Accuracy: 9799/10000 (98%)


Test set: Average loss: 0.0006, Accuracy: 9827/10000 (98%)


Test set: Average loss: 0.0005, Accuracy: 9826/10000 (98%)


Test set: Average loss: 0.0005, Accuracy: 9828/10000 (98%)


Test set: Average loss: 0.0005, Accuracy: 9838/10000 (98%)


Test set: Average loss: 0.0005, Accuracy: 9853/10000 (99%)
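
The "Average loss" values printed above are sums of per-batch mean losses divided by the number of test samples, so they are smaller than the mean per-sample loss by a factor of the batch size (100 in this notebook). The following sketch reproduces that scaling on hypothetical random data and compares it with two ways of computing the per-sample mean; all names in it are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, num_batches, num_classes = 100, 5, 10
num_samples = batch_size * num_batches

# Hypothetical model outputs and labels standing in for a test set
logits = torch.randn(num_samples, num_classes)
targets = torch.randint(0, num_classes, (num_samples,))

sum_of_batch_means = 0.0    # what the test() loop accumulates
sum_of_sample_losses = 0.0  # alternative: per-sample losses summed
for lg, t in zip(logits.split(batch_size), targets.split(batch_size)):
    sum_of_batch_means += F.cross_entropy(lg, t).item()  # default reduction='mean'
    sum_of_sample_losses += F.cross_entropy(lg, t, reduction='sum').item()

reported = sum_of_batch_means / num_samples      # scaling used in test()
per_sample = sum_of_sample_losses / num_samples  # mean loss per sample
per_batch = sum_of_batch_means / num_batches     # same value when all batches are full

print(f'{reported:.4f} {per_sample:.4f} {per_batch:.4f}')
```

The accuracy figures are unaffected, since they are computed from prediction counts and not from the loss.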

