## Readings:

mastering dataset:
 - https://pytorch.org/tutorials/beginner/data_loading_tutorial.html
 - https://pytorch.org/docs/stable/data.html
 
torch datasets:
 - https://pytorch.org/docs/stable/torchvision/datasets.html
 - https://pytorch.org/audio/datasets.html
 
transforms:
 - https://pytorch.org/docs/stable/torchvision/transforms.html
 - https://pytorch.org/audio/transforms.html

## loading packages

In [3]:
import torch
from torch.utils.data import Dataset, DataLoader, random_split
from torchvision import transforms as vt
from torchaudio import transforms as at
import numpy as np

## defining dataset

In [4]:
class CustomDataset(Dataset):
    def __init__(self):
        super().__init__()
        # Define dataset attributes here: 10 samples, 4 features, binary labels.
        self.X = np.random.normal(size=(10, 4))
        self.y = np.random.randint(0, 2, size=10)

    def __len__(self):
        # Length of the dataset, i.e. the number of samples.
        return self.X.shape[0]

    def __getitem__(self, idx):
        # Return the (features, label) pair at a specific index.
        if torch.is_tensor(idx):
            idx = idx.tolist()
        return self.X[idx], self.y[idx]

In [5]:
dataset = CustomDataset()

In [6]:
n = len(dataset)
t = int(0.2*n)
print(t,n)

2 10


In [7]:
train_data, test_data = random_split(dataset, [n-t, t])
print('train:',len(train_data), 'test:', len(test_data))

train: 8 test: 2
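`random_split` draws a fresh random permutation on every call, so the train/test membership changes between runs. A minimal sketch of a reproducible split follows, assuming a seeded `torch.Generator` is passed through the `generator` argument. The `toy` dataset and the seed value `0` are hypothetical stand-ins, not part of the notebook above.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical toy data standing in for CustomDataset: 10 samples, 1 feature.
toy = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))

n = len(toy)
t = int(0.2 * n)

# Two generators seeded identically produce the same index permutation,
# so both calls assign the same samples to train and test.
train_a, test_a = random_split(toy, [n - t, t], generator=torch.Generator().manual_seed(0))
train_b, test_b = random_split(toy, [n - t, t], generator=torch.Generator().manual_seed(0))

print("test indices (first call): ", list(test_a.indices))
print("test indices (second call):", list(test_b.indices))
```

Each returned `Subset` keeps its sample positions in `.indices`, which is a convenient way to check that the two splits agree.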


In [8]:
dataset[0]

(array([ 0.40196957, -2.00037414, -0.04192366, -0.08493043]), 0)

In [12]:
loader = DataLoader(train_data, batch_size=2, shuffle=True)

# Passing pin_memory=True to a DataLoader automatically puts the fetched
# data tensors in pinned (page-locked) memory, which enables faster
# data transfer to CUDA-enabled GPUs.
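The pinned-memory note can be sketched as follows. This is an illustration under stated assumptions rather than a tuned setup: the `toy` dataset is hypothetical, pinning is enabled only when CUDA is available, and `non_blocking=True` is used on the copy so that it can overlap with computation when the source tensor is pinned.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy data: 8 samples, 4 features, binary labels.
toy = TensorDataset(torch.randn(8, 4), torch.randint(0, 2, (8,)))

# Pin host memory only when a GPU is present; on CPU-only machines it brings no benefit.
loader = DataLoader(toy, batch_size=2, shuffle=True,
                    pin_memory=torch.cuda.is_available())

shapes = []
for x, y in loader:
    # With pinned source memory, non_blocking=True allows an asynchronous host-to-device copy.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    shapes.append((tuple(x.shape), tuple(y.shape)))

print(shapes)
```

On a CPU-only machine the loop still runs; the `.to(device)` calls simply leave the tensors where they are.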

In [13]:
for batch in loader:
    x,y = batch
    print(x.shape, y.shape)

torch.Size([2, 4]) torch.Size([2])
torch.Size([2, 4]) torch.Size([2])
torch.Size([2, 4]) torch.Size([2])
torch.Size([2, 4]) torch.Size([2])
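The batches printed above are tensors even though `__getitem__` returns NumPy values, because the default collate function of `DataLoader` converts and stacks them. One consequence worth checking is the dtype: NumPy's default `float64` is carried over, while most model layers expect `float32`. The sketch below uses a hypothetical `ArrayDataset` that mirrors `CustomDataset` so that the block is self-contained.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical stand-in mirroring CustomDataset: NumPy features and labels."""

    def __init__(self):
        self.X = np.random.normal(size=(10, 4))
        self.y = np.random.randint(0, 2, size=10)

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]


x, y = next(iter(DataLoader(ArrayDataset(), batch_size=2)))
print(type(x), x.dtype)  # features keep NumPy's float64 after collation

# Cast before feeding a model whose parameters are float32.
x32 = x.float()
print(x32.dtype)
```

Casting inside `__getitem__` (or in a transform) instead of in the training loop is an equally valid choice; it only moves where the conversion happens.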


## transforms

In [None]:
final_transform = vt.Compose([
    vt.CenterCrop(10),  # crop the central 10x10 region of the image
    vt.ToTensor(),      # convert a PIL image or ndarray to a tensor
])
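The pipeline above targets images. For array data like the `CustomDataset` defined earlier, a transform can be any callable applied inside `__getitem__`, which is the pattern the data loading tutorial in the readings follows. The sketch below is one possible arrangement: `ToFloatTensor`, `Scale`, `compose`, and `TransformedDataset` are hypothetical names introduced for illustration, and only `torch` and `numpy` are used.

```python
import numpy as np
import torch
from torch.utils.data import Dataset


class ToFloatTensor:
    """Hypothetical transform: NumPy feature vector -> float32 tensor."""

    def __call__(self, x):
        return torch.from_numpy(np.asarray(x)).float()


class Scale:
    """Hypothetical transform: multiply every feature by a constant factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


def compose(*fns):
    """Chain callables left to right, similar in spirit to a Compose pipeline."""
    def run(x):
        for fn in fns:
            x = fn(x)
        return x
    return run


class TransformedDataset(Dataset):
    """Hypothetical dataset that applies an optional transform to the features."""

    def __init__(self, X, y, transform=None):
        self.X, self.y, self.transform = X, y, transform

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, idx):
        x = self.X[idx]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.y[idx]


X = np.ones((10, 4))
y = np.zeros(10, dtype=np.int64)
ds = TransformedDataset(X, y, transform=compose(ToFloatTensor(), Scale(0.5)))
x0, y0 = ds[0]
print(x0, x0.dtype, y0)
```

Keeping the transform as a constructor argument means the same dataset class can serve training and evaluation with different pipelines.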