<h1>Dataset and Transforms</h1> 



<h3 span style='color:yellow'>When we load a dataset using the PyTorch Dataset class, we can set the 'transform' argument within the class constructor.</h3>

<h3 span style='color:yellow'>A transform is a callable that converts the data's shape, format, and so on, for example converting data from a NumPy array into a PyTorch tensor. Other kinds of transforms include data normalization and scaling.</h3>

<h3 span style='color:yellow'>PyTorch (through torchvision) includes many off-the-shelf transforms that are easy to use.</h3>

<h3 span style='color:yellow'>Transforms can be applied to images, tensors, ndarrays, or custom data
during creation of the Dataset.</h3>

<ul>
<h3 span style='color:yellow'> Transforms that are applied to tensor data:</h3>
<ul>
  <li style="color:lightgreen;"><span style="font-size:18px;"> LinearTransformation, Normalize, RandomErasing, etc.</span></li>
</ul>

<h3 span style='color:yellow'> Transforms that are applied to images:</h3>
<ul>
  <li style="color:lightgreen;"><span style="font-size:18px;"> CenterCrop, Grayscale, Pad, RandomAffine,
RandomCrop, RandomHorizontalFlip, RandomRotation,
Resize, Scale, etc.</span></li>
</ul>

<h3 span style='color:yellow'> Transforms that are used for conversion:</h3>
<ul>
  <li style="color:lightgreen;"><span style="font-size:18px;"> ToPILImage: from a tensor or ndarray.
ToTensor: from a numpy.ndarray or PIL Image</span></li>
</ul>

<h3 span style='color:yellow'> Generic transforms:</h3>
<ul>
  <li style="color:lightgreen;"><span style="font-size:18px;"> Lambda</span></li>
</ul>

<h3 span style='color:yellow'> Custom transforms:</h3>
<ul>
  <li style="color:lightgreen;"><span style="font-size:18px;">A custom transform is implemented by writing your own Python class that defines the __call__ method, which makes instances of the class callable.</span></li>
</ul>
</ul>

<h3 span style='color:yellow'>The complete list of built-in transforms can be found at: <span style='color:lightgreen'>https://pytorch.org/docs/stable/torchvision/transforms.html</span></h3>



In [36]:
import torch, torchvision
from torch.utils.data import Dataset
import numpy as np

In [37]:
class MyDataset(Dataset):
    def __init__(self,transform=None):
        Xy=np.loadtxt("synthetic_classification_data copy.csv",delimiter=',',skiprows=1, dtype=np.float32)
        
        # There's no need to convert the following NumPy array to a tensor; we will do that using transform
        self.X=Xy[:,1:]
        self.y=Xy[:,[0]]
        self.n_samples=self.X.shape[0]
        
        self.transform=transform
        
    def __len__(self):
        return self.n_samples

    def __getitem__(self,index):
        sample= self.X[index], self.y[index]
        
        if self.transform:
            sample=self.transform(sample)
        return sample
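
To try the transform hook without the CSV file, the same pattern can be exercised on an in-memory array. This is a sketch under assumptions: `ArrayDataset` and `to_tensor` are hypothetical names, and the array values are made up; only the column layout (label first, features after) follows the class above.

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class ArrayDataset(Dataset):  # hypothetical stand-in for MyDataset
    def __init__(self, Xy, transform=None):
        # Same column layout as above: column 0 is the label, the rest are features.
        self.X = Xy[:, 1:]
        self.y = Xy[:, [0]]
        self.transform = transform

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, index):
        sample = self.X[index], self.y[index]
        if self.transform:
            sample = self.transform(sample)
        return sample

# Made-up values, only to exercise the class.
Xy = np.array([[0.0, 1.0, 2.0], [1.0, 3.0, 4.0]], dtype=np.float32)

# Without a transform, samples come back as NumPy arrays.
raw = ArrayDataset(Xy)
features, label = raw[1]

# A plain function works as a transform too, since it is callable.
def to_tensor(sample):
    data, label = sample
    return torch.from_numpy(data), torch.from_numpy(label)

converted = ArrayDataset(Xy, transform=to_tensor)
t_features, t_label = converted[1]
print(type(features), type(t_features))
```

The transform is applied lazily, one sample at a time inside `__getitem__`, rather than to the whole array up front.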

In [38]:
# We define our own transform class that converts the NumPy arrays to tensors.
# (This is only an example; we could do the conversion inside the dataset class, but we want to illustrate how a custom transform works.)
class Totensor:
    def __call__(self,sample):
        # unpack sample
        data,label=sample
        return torch.from_numpy(data), torch.from_numpy(label)
dataset1= MyDataset(transform=Totensor())

first_sample1=dataset1[0]
data1,label1=first_sample1
# To confirm that the NumPy array has been transformed into a tensor, we print the data type
print(type(data1))

<class 'torch.Tensor'>
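
One detail worth keeping in mind when a transform relies on `torch.from_numpy`: the returned tensor shares memory with the source array, so in-place operations on the tensor also change the array. A small sketch with made-up values:

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0], dtype=np.float32)
t = torch.from_numpy(arr)  # t shares memory with arr

t *= 2                     # in-place: arr is modified as well
print(arr)

arr2 = np.array([1.0, 2.0, 3.0], dtype=np.float32)
t2 = torch.from_numpy(arr2) * 2  # out-of-place: arr2 is left untouched
print(arr2, t2)
```

For a dataset that keeps its data as NumPy arrays, this means a transform that modifies tensors in place can silently alter the stored samples.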


In [39]:
# Let us apply another transform
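class Factorscaler:
    def __init__(self,factor):
        self.factor=factor
    
    def __call__(self, sample):
        data, label=sample
        # Multiply out of place: the tensor from torch.from_numpy shares memory with
        # the dataset's NumPy array, so an in-place `data*=...` would modify the stored data.
        data=data*self.factor
        return data,label

In [40]:
# To apply the above two transforms together, i.e., Totensor and Factorscaler, we compose them with transforms.Compose
from torchvision import transforms
composed_transform=transforms.Compose([Totensor(),Factorscaler(5.5)])  # convert to tensors, then scale the features by 5.5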
dataset2=MyDataset(transform=composed_transform)
first_sample2=dataset2[0]
data2,label2=first_sample2
print(f'The data with the Totensor transform: {data1} <><> The data with both transforms: {data2}')


The data with the Totensor transform: tensor([0.3257, 0.6295, 0.2986]) <><> The data with both transforms: tensor([1.7914, 3.4622, 1.6425])
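
Conceptually, composing transforms just means feeding the output of each one into the next. The sketch below is a simplified, pure-Python stand-in written for illustration, not torchvision's actual implementation; `SimpleCompose`, `scale`, and `shift` are hypothetical names. It also shows that the order of the transforms matters.

```python
class SimpleCompose:  # hypothetical, simplified stand-in for transforms.Compose
    def __init__(self, transforms_list):
        self.transforms_list = transforms_list

    def __call__(self, sample):
        # Feed the output of each transform into the next one.
        for t in self.transforms_list:
            sample = t(sample)
        return sample

def scale(sample):
    # Multiply every feature by 2; the label passes through unchanged.
    data, label = sample
    return [v * 2 for v in data], label

def shift(sample):
    # Add 1 to every feature; the label passes through unchanged.
    data, label = sample
    return [v + 1 for v in data], label

scale_then_shift = SimpleCompose([scale, shift])
shift_then_scale = SimpleCompose([shift, scale])

sample = ([1.0, 2.0], 0)
print(scale_then_shift(sample))
print(shift_then_scale(sample))
```

Because each transform receives and returns the full `(data, label)` tuple, every transform in the pipeline has to agree on that sample format.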
