Data does not always arrive in the final processed form required for training machine learning algorithms. We use transforms to manipulate the data and make it suitable for training.

All TorchVision datasets have two parameters: transform to modify the features and target_transform to modify the labels. Both accept callables containing the transformation logic. The torchvision.transforms module offers several commonly used transforms out of the box.
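To make the roles of the two parameters concrete, here is a minimal sketch of a dataset that applies them when an item is fetched. TinyDataset, its data, and both lambda functions are hypothetical names invented for illustration; they are not part of TorchVision, and the real dataset classes may differ in detail.

```python
import torch
from torch.utils.data import Dataset


class TinyDataset(Dataset):
    """Hypothetical dataset (not a TorchVision class) showing where the two hooks run."""

    def __init__(self, features, labels, transform=None, target_transform=None):
        self.features = features
        self.labels = labels
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.labels[idx]
        if self.transform is not None:
            x = self.transform(x)          # touches the feature only
        if self.target_transform is not None:
            y = self.target_transform(y)   # touches the label only
        return x, y


tiny = TinyDataset(
    features=[torch.tensor([0.0, 128.0, 255.0])],  # made-up pixel values
    labels=[2],
    transform=lambda x: x / 255.0,        # scale features to [0, 1]
    target_transform=lambda y: y + 100,   # arbitrary label change, for illustration
)
x, y = tiny[0]
print(x, y)
```

The feature and the label pass through separate callables, so either one can be changed without affecting the other.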

The FashionMNIST features are in PIL Image format, and the labels are integers. For training, we need the features as normalized tensors, and the labels as one-hot encoded tensors. 

To make these transformations, we use ToTensor and Lambda.

In [1]:
import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)

- **lambda y: ...**: A lambda function that takes an input y, the integer target label of an image in the dataset.

- **torch.zeros(10, dtype=torch.float)**: Creates a tensor of ten zeros. FashionMNIST has 10 classes (labeled 0 to 9), so this tensor has one slot per class and becomes the one-hot encoded representation of the label.

- **.scatter_(0, torch.tensor(y), value=1)**: Called on the tensor of zeros, scatter_ modifies it in place, writing the given value at the given indices. Its arguments are:

  - **0**: The dimension along which to index. The tensor has only one dimension, so this is dimension 0.

  - **torch.tensor(y)**: The index tensor, built from the label y. For example, if y is 3, torch.tensor(y) is a scalar tensor with the value 3.

  - **value=1**: The value written at the specified index. If y is 3, the element at index 3 is set to 1.

Since the tensor was initialized with zeros and only the element at index y is set to 1, the result is a one-hot encoding of the label.
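The steps above can be traced on a single label. This short sketch assumes an example label of 3, chosen only for illustration, and runs the same calls outside the Lambda wrapper:

```python
import torch

y = 3                                          # example integer label
one_hot = torch.zeros(10, dtype=torch.float)   # ten zeros, one slot per class
one_hot.scatter_(0, torch.tensor(y), value=1)  # in place: write 1 at index 3
print(one_hot)
# tensor([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])
```

Because scatter_ works in place and also returns the tensor, the chained form used in the dataset definition produces the same result.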

### ToTensor()

ToTensor converts a PIL image or NumPy ndarray into a FloatTensor and scales the image’s pixel intensity values to the range [0., 1.].

### Lambda Transforms

Lambda transforms apply any user-defined lambda function. Here, we define a function to turn the integer label into a one-hot encoded tensor. It first creates a zero tensor of size 10 (the number of labels in our dataset) and then calls scatter_, which assigns value=1 at the index given by the label y.

In [2]:
target_transform = Lambda(lambda y: torch.zeros(
    10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))