To speed up data loading in PyTorch, use torch.utils.data.DataLoader with its num_workers and pin_memory parameters.



Setting num_workers to a value greater than 0 allows data loading and augmentation to happen in parallel with model training, using separate worker subprocesses. 



Tune num_workers to your system, for example the number of CPU cores and whether the data sits on local disk or network storage; there is no single best value. For GPU-based training, set pin_memory=True. The DataLoader then places loaded batches in pinned (page-locked) host memory, which makes transfers to the GPU faster.
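
One way to pick num_workers is simply to time a pass over the data for a few candidate values. The sketch below is a minimal illustration under assumptions: ToyDataset and time_one_epoch are hypothetical names made up for this example, and the candidate values (0, 2, 4) are arbitrary. Real numbers depend heavily on how expensive each __getitem__ call is; for a dataset this cheap, extra workers may even be slower because of process start-up cost.

```python
import time

import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in dataset; replace with your own."""

    def __len__(self):
        return 1000

    def __getitem__(self, index):
        # Fake sample: a small tensor plus a binary label
        return torch.full((8,), float(index)), index % 2


def time_one_epoch(dataset, num_workers, batch_size=32):
    """Return the seconds needed to iterate over the dataset once."""
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    start = time.perf_counter()
    for _batch in loader:
        pass  # a real benchmark would also run the training step here
    return time.perf_counter() - start


if __name__ == "__main__":
    # The guard matters on platforms that start worker processes with "spawn"
    dataset = ToyDataset()
    for workers in (0, 2, 4):
        elapsed = time_one_epoch(dataset, workers)
        print(f"num_workers={workers}: {elapsed:.3f}s")
```

Timing the loader alone only shows loading throughput; if the model step dominates, a small num_workers is usually enough to keep the GPU busy.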

from torch.utils.data import DataLoader, Dataset

# Example dataset
class MyDataset(Dataset):
    def __len__(self):
        # Example size
        return 1000

    def __getitem__(self, index):
        # Return a (data, label) pair as an example
        return (index, index % 2)

# Creating a DataLoader instance
dataset = MyDataset()

data_loader = DataLoader(dataset, batch_size=32, num_workers=4, pin_memory=True)
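
Pinned memory pays off mainly when the batches are then copied to the GPU. The following is a minimal sketch of consuming such a loader, not a full training loop: ToyDataset and count_positive_labels are hypothetical names used only for illustration, and the label counting stands in for a real model step. It assumes batches are moved with Tensor.to(device, non_blocking=True), which can overlap the copy with other work when the source tensor is pinned, and it falls back to the CPU when no GPU is available.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in that mirrors the example dataset."""

    def __len__(self):
        return 1000

    def __getitem__(self, index):
        return index, index % 2


def count_positive_labels(loader, device):
    """Move each batch to `device` and count the labels equal to 1."""
    total = 0
    for data, labels in loader:
        # non_blocking=True only helps when the source tensor is pinned
        data = data.to(device, non_blocking=True)
        labels = labels.to(device, non_blocking=True)
        total += int((labels == 1).sum())
    return total


if __name__ == "__main__":
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda" if use_cuda else "cpu")
    loader = DataLoader(
        ToyDataset(),
        batch_size=32,
        num_workers=2,
        pin_memory=use_cuda,  # pinning is only useful with a GPU
    )
    print(count_positive_labels(loader, device))
```

Tying pin_memory to torch.cuda.is_available() keeps the same script usable on CPU-only machines, where pinning brings no benefit.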
