# Transform

Data does not always come in the final processed form required for training machine learning algorithms. We use transforms to manipulate the data and make it suitable for training.

All TorchVision datasets have two parameters (**transform** to modify the features and **target_transform** to modify the labels) that accept callables containing the transformation logic. The torchvision.transforms module offers several commonly used transforms out of the box.

## About the dataset

The FashionMNIST features are in PIL image format and the labels are integers. For training, we need the features as normalized tensors and the labels as one-hot encoded tensors. To make these transformations, we use **ToTensor** and **Lambda**.


In [1]:
import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

ds = datasets.FashionMNIST(
    root='data',
    train=True,
    download=True,
    # Features: PIL image -> float tensor scaled to [0, 1]
    transform=ToTensor(),
    # Labels: integer class index -> one-hot float tensor of length 10
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)



## ToTensor()

ToTensor converts a PIL image or NumPy ndarray into a FloatTensor and scales the image's pixel intensity values to the range [0., 1.].

## Lambda transforms

Lambda transforms apply any user-defined lambda function. Here, we define a function to turn the integer label into a one-hot encoded tensor. It first creates a zero tensor of size 10 (the number of labels in our dataset) and then calls scatter_, which assigns value=1 at the index given by the label y.

In [15]:
def onehot_encode(y):
    # Start from ten zeros, then write a 1 at position y along dim 0.
    z = torch.zeros(10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1)
    print(z)

onehot_encode(0)


tensor([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
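As a cross-check, the same encoding can be produced with `torch.nn.functional.one_hot`. This is an alternative sketch, not the approach used in this notebook, and the label value 3 is an arbitrary assumption.

```python
import torch
import torch.nn.functional as F

label = 3  # assumed example label

# scatter_ approach, as in the cell above
via_scatter = torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(label), value=1)

# one_hot returns an integer (int64) tensor, so cast to float to match
via_one_hot = F.one_hot(torch.tensor(label), num_classes=10).float()

print(torch.equal(via_scatter, via_one_hot))
```

Both routes give a length-10 float tensor with a single 1 at the label's index; `scatter_` writes into a preallocated tensor, while `one_hot` builds a new one.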
