In [None]:
%matplotlib inline


Transforms
===================

Data does not always arrive in the final, processed form required for
training machine learning algorithms. We use **transforms** to manipulate
the data and make it suitable for training.

All TorchVision datasets have two parameters, ``transform`` to modify the features and
``target_transform`` to modify the labels, that accept callables containing the transformation logic.
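
To make the role of these two callables concrete, here is a minimal sketch of a dataset that applies them when a sample is fetched. ``ToyDataset`` and its data are hypothetical and exist only for illustration; this is not TorchVision's implementation, only the general pattern under the assumption that each callable is applied to one sample at a time.

```python
import torch
from torch.utils.data import Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset for illustration; not TorchVision's implementation."""

    def __init__(self, features, labels, transform=None, target_transform=None):
        self.features = features
        self.labels = labels
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.labels[idx]
        # Each callable is applied lazily, one sample at a time.
        if self.transform is not None:
            x = self.transform(x)
        if self.target_transform is not None:
            y = self.target_transform(y)
        return x, y


toy = ToyDataset(
    features=[[0, 255], [128, 64]],  # made-up "pixel" rows
    labels=[0, 1],
    transform=lambda x: torch.tensor(x, dtype=torch.float) / 255,
    target_transform=lambda y: float(y),
)
x0, y0 = toy[0]
print(x0, y0)
```

Because the callables run inside ``__getitem__``, the stored data stays untouched and the same dataset can be reused with different transforms.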

The FashionMNIST features are in PIL Image format, and the labels are integers.
For training, we need the features as normalized tensors, and the labels as one-hot encoded tensors.
To make these transformations, we use ``ToTensor`` and ``Lambda``.



In [None]:
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda
import torch 
import numpy as np
ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)

In [None]:
a = lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1)

In [None]:
# One-hot encode label 1: write the value 1 at index 1 along the last dimension
torch.zeros(10).scatter_(-1, torch.tensor(1), 1)

tensor([0., 1., 0., 0., 0., 0., 0., 0., 0., 0.])

ToTensor()
-------------------------------

`ToTensor <https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ToTensor>`_
converts a PIL image or NumPy ``ndarray`` into a ``FloatTensor`` and scales
the image's pixel intensity values to the range [0., 1.].


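Note that this description is incomplete: ``ToTensor`` rescales values to [0., 1.] only when
the input is a PIL image in one of the supported modes or a NumPy ``ndarray`` with ``dtype=np.uint8``.
Any other input, such as the ``float64`` array shown at the end of this notebook, is converted
to a tensor without scaling.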

Lambda Transforms
-------------------------------

Lambda transforms apply any user-defined lambda function. Here, we define a function 
to turn the integer into a one-hot encoded tensor. 
It first creates a zero tensor of size 10 (the number of labels in our dataset) and then calls
`scatter_ <https://pytorch.org/docs/stable/tensors.html#torch.Tensor.scatter_>`_, which assigns
``value=1`` at the index given by the label ``y``.



In [None]:
target_transform = Lambda(
    lambda y: torch.zeros(10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1)
)
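
As a sanity check on the ``scatter_`` approach, the sketch below compares it with ``torch.nn.functional.one_hot``, which should produce the same encoding once cast to float. The helper name ``one_hot_scatter`` and the label value are illustrative, not part of the tutorial.

```python
import torch
import torch.nn.functional as F


def one_hot_scatter(y, num_classes=10):
    # Start from zeros and write a 1 at position y along dimension 0.
    return torch.zeros(num_classes, dtype=torch.float).scatter_(
        dim=0, index=torch.tensor(y), value=1
    )


label = 3  # arbitrary example label
via_scatter = one_hot_scatter(label)

# F.one_hot returns an integer tensor, so cast it to float for comparison.
via_one_hot = F.one_hot(torch.tensor(label), num_classes=10).float()

print(via_scatter)
print(torch.equal(via_scatter, via_one_hot))
```

Either form can be wrapped in ``Lambda`` and passed as ``target_transform``; the ``scatter_`` version is the one this tutorial uses.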

--------------




Further Reading
~~~~~~~~~~~~~~~~~
- `torchvision.transforms API <https://pytorch.org/vision/stable/transforms.html>`_



In [None]:
# float64 array with values in [0, 10), used to check whether ToTensor rescales it
data = np.random.rand(5, 5)
data *= 10
data

array([[2.2618976 , 8.13607603, 0.09087905, 3.33010455, 7.40326179],
       [5.47015373, 1.35842145, 6.74888793, 7.50638372, 6.43637557],
       [3.37894968, 8.35823174, 4.84454888, 2.76349987, 7.4866254 ],
       [8.30306295, 6.35991555, 7.82510809, 8.4991081 , 7.3997656 ],
       [4.8398277 , 6.40551415, 0.07326943, 6.8269829 , 0.29764871]])

In [None]:
ToTensor()(data)

tensor([[[2.2619, 8.1361, 0.0909, 3.3301, 7.4033],
         [5.4702, 1.3584, 6.7489, 7.5064, 6.4364],
         [3.3789, 8.3582, 4.8445, 2.7635, 7.4866],
         [8.3031, 6.3599, 7.8251, 8.4991, 7.3998],
         [4.8398, 6.4055, 0.0733, 6.8270, 0.2976]]], dtype=torch.float64)