# Transforms

# Import required libraries

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
%matplotlib inline

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

# Transforms


Data does not always come in the final processed form required for training machine learning algorithms. We use **transforms** to manipulate the data and make it suitable for training.

All TorchVision datasets have two parameters, ``transform`` to modify the features and
``target_transform`` to modify the labels, that accept callables containing the transformation logic.
The [`torchvision.transforms`](https://pytorch.org/vision/stable/transforms.html) module offers several commonly-used transforms out of the box.
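
To make the role of the two callables concrete, here is a minimal sketch of a dataset that applies them when an item is fetched. `ToyDataset` and its data are hypothetical and exist only for illustration; the sketch mirrors the general pattern rather than reproducing TorchVision's actual implementation.

```python
import torch
from torch.utils.data import Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset that only illustrates the two transform hooks."""

    def __init__(self, features, labels, transform=None, target_transform=None):
        self.features = features
        self.labels = labels
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.labels[idx]
        # The feature callable is applied to x, the label callable to y.
        if self.transform is not None:
            x = self.transform(x)
        if self.target_transform is not None:
            y = self.target_transform(y)
        return x, y


toy = ToyDataset(
    features=[torch.tensor([0.0, 255.0]), torch.tensor([51.0, 102.0])],
    labels=[0, 1],
    transform=lambda x: x / 255.0,      # scale made-up pixel values
    target_transform=lambda y: y + 10,  # arbitrary label change, for illustration
)
x0, y0 = toy[0]
x1, y1 = toy[1]
print(x0, y0)
```

Any callable works in either slot, which is why a plain `lambda` or a transform object such as ``ToTensor()`` can be passed interchangeably.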

The FashionMNIST features are in PIL Image format, and the labels are integers.
For training, we need the features as normalized tensors, and the labels as one-hot encoded tensors.
To make these transformations, we use ``ToTensor`` and ``Lambda``.

ToTensor()
-------------------------------

[`ToTensor`](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ToTensor)
converts a PIL image or NumPy ``ndarray`` into a ``FloatTensor`` and scales the image's pixel intensity values to the range [0., 1.].

Lambda Transforms
-------------------------------

Lambda transforms apply any user-defined lambda function. Here, we define a function
to turn the integer label into a one-hot encoded tensor.
It first creates a zero tensor of size 10 (the number of labels in our dataset) and then calls
[`scatter_`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.scatter_), which assigns
``value = 1`` at the index given by the label ``y``.

### Understanding torch.zeros

In [2]:
# There are 10 output classes.
# Create a (10, ) tensor of zeros
y_zeros = torch.zeros(10, dtype = torch.float)
print(f"y_zeros is: {y_zeros}")

y_zeros is: tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])


### Understanding torch.Tensor.scatter_

In [3]:
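# One-hot encode y.
# scatter_ works in place: it modifies y_zeros and returns that same tensor,
# so y_ohe and y_zeros refer to the same object after this call.
y = 5
y_ohe = y_zeros.scatter_(dim = 0, index = torch.tensor(y), value = 1.0)
print(y_ohe)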

tensor([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])
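
The trailing underscore in ``scatter_`` marks an in-place operation. The sketch below contrasts it with the out-of-place ``scatter``; the names `base`, `copy`, and `result` are chosen for this example only, and a 1-D index tensor is used for clarity.

```python
import torch

base = torch.zeros(10, dtype=torch.float)
index = torch.tensor([5])  # 1-D index tensor holding the class label

# Out-of-place: returns a new tensor and leaves `base` untouched.
copy = base.scatter(0, index, value=1.0)
base_sum_before = base.sum().item()
print(base_sum_before, copy.sum().item())

# In-place: writes into `base` and returns the very same object.
result = base.scatter_(0, index, value=1.0)
print(result is base)
```

This is why the ``Lambda`` transform creates a fresh zero tensor on every call: reusing one shared tensor with ``scatter_`` would accumulate ones across labels.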


### Understanding Lambda function

In [4]:
# Build a fresh zero tensor on every call, then set position y to 1.
target_transform = Lambda(
    lambda y: torch.zeros(10, dtype = torch.float).scatter_(
        dim = 0, index = torch.tensor(y), value = 1.0
    )
)
print(target_transform(5))
print()
print(target_transform(8))

tensor([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])

tensor([0., 0., 0., 0., 0., 0., 0., 0., 1., 0.])
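
As a cross-check, the same encoding can be produced with ``torch.nn.functional.one_hot``. This is an alternative to the approach above, not a replacement for it. The helper names `one_hot_scatter` and `one_hot_builtin` are hypothetical.

```python
import torch
import torch.nn.functional as F


def one_hot_scatter(y, num_classes=10):
    # Same idea as the Lambda transform: zeros, then a 1 at position y.
    zeros = torch.zeros(num_classes, dtype=torch.float)
    return zeros.scatter_(0, torch.tensor([y]), value=1.0)


def one_hot_builtin(y, num_classes=10):
    # F.one_hot returns an integer tensor, so cast it to float to match.
    return F.one_hot(torch.tensor(y), num_classes=num_classes).float()


for label in range(10):
    assert torch.equal(one_hot_scatter(label), one_hot_builtin(label))

print(one_hot_builtin(8))
```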


# Import dataset

In [5]:
dataset_path = os.path.normpath(r'E:\Sync_With_NAS_Ext\Datasets\Image_Datasets\Pytorch_Datasets')
ds = datasets.FashionMNIST(root = dataset_path, train = True, download = True,
                           transform = ToTensor(), target_transform = target_transform)

In [6]:
print(f"Type of ds[0] is {type(ds[0])}")
print(f"Type of ds[0][0] is {type(ds[0][0])}")
print(f"Type of ds[0][1] is {type(ds[0][1])}")
print(f"Shape of ds[0][0] is {ds[0][0].shape}")
print(f"Shape of ds[0][1] is {ds[0][1].shape}")
print(f"Value of ds[0][1] is {ds[0][1]}")

Type of ds[0] is <class 'tuple'>
Type of ds[0][0] is <class 'torch.Tensor'>
Type of ds[0][1] is <class 'torch.Tensor'>
Shape of ds[0][0] is torch.Size([1, 28, 28])
Shape of ds[0][1] is torch.Size([10])
Value of ds[0][1] is tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])
