# 1. Imports

In [6]:
# importing other dependencies
import numpy as np
import pandas as pd
# importing PyTorch
import torch
# importing torch.nn Module
import torch.nn as nn
# importing Dataset, DataLoader
from torch.utils.data import Dataset, DataLoader, TensorDataset
# import Compose to compose transforms
from torchvision.transforms import Compose
# plotting
import matplotlib.pyplot as plt

In [7]:
# checks whether MPS is available
print(torch.backends.mps.is_available())

# this checks that the current PyTorch installation was built with MPS support
print(torch.backends.mps.is_built())

# use "mps" when it is available, otherwise fall back to the default "cpu"
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

True
True
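Once `device` is set, tensors and models are moved onto it with `.to(device)`. A minimal sketch, assuming a PyTorch build recent enough to expose `torch.backends.mps` (it falls back to the CPU when MPS is unavailable):

```python
import torch

# pick MPS when it is available, otherwise fall back to the CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# tensors are created on the CPU by default and moved explicitly
x = torch.ones(2, 3).to(device)
print(x.device.type)  # "mps" or "cpu", depending on the machine
```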


# 2. DataLoaders

Running the forward pass, loss calculation and gradient computation over an entire large dataset at once is expensive. The usual remedy is to load the data in smaller batches (mini-batches), run these steps on one mini-batch at a time, and update the weights after each one.

Some terminology:
- **epoch**: one forward and backward pass of *all* training samples
- **batch_size**: number of training samples used in one forward/backward pass
- **#iterations**: number of passes, each pass (forward+backward) using [batch_size] number of samples
- **Example**: 100 samples, batch_size=20 -> 100/20=5 iterations for 1 epoch
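The arithmetic in the example carries over to dataset sizes that are not a multiple of the batch size: the last batch is simply smaller, so the count rounds up. A minimal sketch (the helper name `iterations_per_epoch` is made up for illustration):

```python
import math

def iterations_per_epoch(n_samples: int, batch_size: int) -> int:
    # number of mini-batches needed to see every sample once;
    # the final batch may hold fewer than batch_size samples
    return math.ceil(n_samples / batch_size)

print(iterations_per_epoch(100, 20))  # 5, as in the example above
print(iterations_per_epoch(178, 4))   # 45: 44 full batches plus one batch of 2
```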


The training loop then follows this template:
```Python
# training loop
for epoch in range(num_epochs):
    # loop over all the mini-batches:
    for i in range(total_batches):
        batch_X, batch_y = ...
```
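Filling in the template by hand might look like the sketch below. It uses random stand-in tensors (`X`, `y` and the sizes are invented for illustration) and is only meant to show the bookkeeping involved, not how any library implements batching internally:

```python
import torch

# stand-in data: 10 samples with 3 features each, binary labels
X = torch.randn(10, 3)
y = torch.randint(0, 2, (10, 1)).float()

batch_size = 4
num_epochs = 2
batch_sizes_seen = []

for epoch in range(num_epochs):
    # reshuffle the sample order at the start of every epoch
    perm = torch.randperm(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        idx = perm[start:start + batch_size]
        batch_X, batch_y = X[idx], y[idx]
        batch_sizes_seen.append(batch_X.shape[0])
        # forward pass, loss, backward pass and update would go here

print(batch_sizes_seen)  # [4, 4, 2, 4, 4, 2]
```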

DataLoader can do batch computation for us. Before we do that, we implement a custom Dataset class.

## 2.1 Wine Dataset

In this example, we will be loading the `Wine Dataset` by reading the `.csv` file via the custom Dataset class that we will be implementing.

In [3]:
data_df = pd.read_csv('data/wine.csv')
data_df.head()

Unnamed: 0,Wine,Alcohol,Malic.acid,Ash,Acl,Mg,Phenols,Flavanoids,Nonflavanoid.phenols,Proanth,Color.int,Hue,OD,Proline
0,1,14.23,1.71,2.43,15.6,127,2.8,3.06,0.28,2.29,5.64,1.04,3.92,1065
1,1,13.2,1.78,2.14,11.2,100,2.65,2.76,0.26,1.28,4.38,1.05,3.4,1050
2,1,13.16,2.36,2.67,18.6,101,2.8,3.24,0.3,2.81,5.68,1.03,3.17,1185
3,1,14.37,1.95,2.5,16.8,113,3.85,3.49,0.24,2.18,7.8,0.86,3.45,1480
4,1,13.24,2.59,2.87,21.0,118,2.8,2.69,0.39,1.82,4.32,1.04,2.93,735


In [4]:
# Implementing a custom Dataset:
#   inherit 'Dataset' class
#   implement:
#       - __init__: to define the variables
#       - __getitem__: to obtain (x,y) from a certain index
#       - __len__: to get the length of the dataset, or get #samples


# custom class for wine dataset

class WineDataset(Dataset):
    def __init__(self):
        # initialize data, download etc.
        # read with numpy or pandas
        df = pd.read_csv('data/wine.csv')
        
        # extract features and target
        features = df.drop('Wine', axis=1).values
        target = df[['Wine']].values
        
        # convert to pytorch tensor (after converting it to float32) using TensorDataset
        data_tensor = TensorDataset(
            torch.tensor(features, dtype=torch.float32),
            torch.tensor(target, dtype=torch.float32)
        )
        # TensorDataset is not a tensor itself; indexing it returns (x, y) tuples

        # extract the no. of samples
        self.n_samples = len(data_tensor)
        
        # extract features and target
        self.X = data_tensor[:][0] # size [n_samples, n_features]
        self.y = data_tensor[:][1] # size [n_samples, 1]
    
    # support indexing such that dataset[i] can be used to obtain the i-th sample
    def __getitem__(self, index):
        # return tuple (xi,yi)
        return self.X[index], self.y[index]

    # we can also obtain n_samples by calling length of the dataset
    def __len__(self):
        return self.n_samples  

Now that we have created the custom `WineDataset` class, we can wrap it via `DataLoader` to extract data in batches.

In [5]:
# create dataset
dataset = WineDataset()

# get first sample and unpack
first_data = dataset[0]
x0, y0 = first_data
print(x0, y0)

tensor([1.4230e+01, 1.7100e+00, 2.4300e+00, 1.5600e+01, 1.2700e+02, 2.8000e+00,
        3.0600e+00, 2.8000e-01, 2.2900e+00, 5.6400e+00, 1.0400e+00, 3.9200e+00,
        1.0650e+03]) tensor([1.])


In [6]:
# create dataset
dataset = WineDataset()

# Load whole dataset with DataLoader
# shuffle=True: reshuffle the samples at every epoch
train_loader = DataLoader(
    dataset=dataset,
    batch_size=4,
    shuffle=True,
)

In [7]:
# we can also convert the DataLoader into an iterator and extract the data one batch at a time
train_iter = iter(train_loader)
data_batch = next(train_iter)

features, targets = data_batch
print("Batch_X:\n",features)
print("Batch_y:\n", targets)
print("Shapes:", features.shape, targets.shape)

Batch_X:
 tensor([[1.2290e+01, 1.4100e+00, 1.9800e+00, 1.6000e+01, 8.5000e+01, 2.5500e+00,
         2.5000e+00, 2.9000e-01, 1.7700e+00, 2.9000e+00, 1.2300e+00, 2.7400e+00,
         4.2800e+02],
        [1.1650e+01, 1.6700e+00, 2.6200e+00, 2.6000e+01, 8.8000e+01, 1.9200e+00,
         1.6100e+00, 4.0000e-01, 1.3400e+00, 2.6000e+00, 1.3600e+00, 3.2100e+00,
         5.6200e+02],
        [1.2820e+01, 3.3700e+00, 2.3000e+00, 1.9500e+01, 8.8000e+01, 1.4800e+00,
         6.6000e-01, 4.0000e-01, 9.7000e-01, 1.0260e+01, 7.2000e-01, 1.7500e+00,
         6.8500e+02],
        [1.3410e+01, 3.8400e+00, 2.1200e+00, 1.8800e+01, 9.0000e+01, 2.4500e+00,
         2.6800e+00, 2.7000e-01, 1.4800e+00, 4.2800e+00, 9.1000e-01, 3.0000e+00,
         1.0350e+03]])
Batch_y:
 tensor([[2.],
        [2.],
        [3.],
        [1.]])
Shapes: torch.Size([4, 13]) torch.Size([4, 1])
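When the dataset size is not divisible by `batch_size`, the last batch is smaller unless `drop_last=True` is passed. A small sketch on a stand-in `TensorDataset` of 10 samples (the toy data is an assumption for illustration, not the wine data):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# toy dataset: 10 samples, 3 features, one target column
toy = TensorDataset(torch.randn(10, 3), torch.zeros(10, 1))

loader = DataLoader(toy, batch_size=4, shuffle=True)
print(len(loader))                         # 3 batches per epoch
print([xb.shape[0] for xb, yb in loader])  # [4, 4, 2]

# drop_last=True discards the incomplete final batch
loader_drop = DataLoader(toy, batch_size=4, shuffle=True, drop_last=True)
print([xb.shape[0] for xb, yb in loader_drop])  # [4, 4]
```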


## 2.2 Dummy Training Loop

In [8]:
# dummy training loop with DataLoaders

dataset = WineDataset()

num_epochs = 2
n_samples = len(dataset)

train_loader = DataLoader(
    dataset=dataset,
    batch_size=4,
    shuffle=True,
    num_workers=0
)

for epoch in range(num_epochs):
    for (idx, batch) in enumerate(train_loader):
        # extract feature and labels
        features, targets = batch

        # run your training processes here: forward, backward, update
        
        # print info after every 5 mini-batches
        if idx%5 == 0:
            print('-------------------------------')
            print(f'Epoch: {epoch+1}/{num_epochs}')
            print(f'Steps: {idx+1}')
            print(f'Batch Features shape: {features.shape} | Batch Target shape: {targets.shape}')

-------------------------------
Epoch: 1/2
Steps: 1
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 6
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 11
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 16
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 21
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 26
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 31
Batch Features shape: torch.Size([4, 13]) | Batch Target shape: torch.Size([4, 1])
-------------------------------
Epoch: 1/2
Steps: 

# 3. Data Transforms

There are many data transform techniques. The complete list of built-in transforms in PyTorch can be found [here](https://pytorch.org/vision/stable/transforms.html).

- On **Images**:
    - `CenterCrop`, `Grayscale`, `Pad`, `RandomAffine`
    - `RandomCrop`, `RandomHorizontalFlip`, `RandomRotation`
    - `Resize` (`Scale` is its deprecated former name)

- On **Tensors**:
    - `LinearTransformation`, `Normalize`, `RandomErasing`

- **Conversion**:
    - `ToPILImage`: from tensor or ndarray
    - `ToTensor`: from numpy.ndarray or PIL Image

- **Generic**:
    - use `Lambda`

- **Custom**:
    - write own `class`

- **Composing multiple Transforms**:
```Python
from torchvision.transforms import Compose
# Compose takes a single list of transforms, applied in order
compose = Compose([
    transform_1,
    transform_2,
    ...
])
```
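Conceptually, composing transforms means calling each one on the output of the previous one, so the order matters. The pure-Python stand-in below illustrates the idea; `SimpleCompose`, `add_one` and `double` are hypothetical names and this is not the torchvision implementation:

```python
class SimpleCompose:
    """Hypothetical stand-in that mimics the idea behind Compose."""

    def __init__(self, transforms):
        # a single list of callables, applied left to right
        self.transforms = transforms

    def __call__(self, sample):
        for t in self.transforms:
            sample = t(sample)
        return sample


def add_one(x):
    return x + 1


def double(x):
    return x * 2


pipeline = SimpleCompose([add_one, double])
print(pipeline(3))  # (3 + 1) * 2 = 8

reversed_pipeline = SimpleCompose([double, add_one])
print(reversed_pipeline(3))  # 3 * 2 + 1 = 7
```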

We can implement `Transforms` into the custom Dataset class itself. We again use the `wine.csv` dataset for this purpose.

In [43]:
# Implementing a custom Dataset:
#   inherit 'Dataset' class
#   implement:
#       - __init__: to define the variables
#       - __getitem__: to obtain (x,y) from a certain index
#       - __len__: to get the length of the dataset, or get #samples


# custom class for wine dataset

class WineDataset(Dataset):
    def __init__(self, transform = None):
        # initialize data, download etc.
        # read with numpy or pandas
        df = pd.read_csv('data/wine.csv')
        
        # extract features and target
        self.X = df.drop('Wine', axis=1).values
        self.y = df[['Wine']].values
        # we do not convert to tensor here

        # extract the transform into a variable
        self.transform = transform

        # extract the no. of samples
        self.n_samples = df.shape[0]

    
    # support indexing such that dataset[i] can be used to obtain the i-th sample
    def __getitem__(self, index):
        feature = self.X[index].reshape(1,-1)
        target = self.y[index].reshape(-1,1)

        # if transform is not `None`
        if self.transform:
            feature = self.transform(feature)
            target = self.transform(target)
        
        return feature, target

    # we can also obtain n_samples by calling length of the dataset
    def __len__(self):
        return self.n_samples

## 3.1 Without Transforms

Note that by default without any transforms, these will give numpy arrays as output.

In [44]:
# create dataset without transform
dataset = WineDataset()

# get first sample and unpack
first_data = dataset[0]
x0, y0 = first_data
print(x0, y0)
print(type(x0), type(y0))
print(x0.shape, y0.shape)

[[1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00
  2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03]] [[1]]
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
(1, 13) (1, 1)


## 3.2 With Transforms

In this case, we first convert the samples from numpy arrays into PyTorch tensors. For this we use the built-in `ToTensor` transform.

In [45]:
from torchvision.transforms import ToTensor

# create dataset with transform
dataset = WineDataset(transform=ToTensor())

# get first sample and unpack
first_data = dataset[0]
x0, y0 = first_data
print(x0, y0)
print(type(x0), type(y0))
print(x0.shape, y0.shape)

tensor([[[1.4230e+01, 1.7100e+00, 2.4300e+00, 1.5600e+01, 1.2700e+02,
          2.8000e+00, 3.0600e+00, 2.8000e-01, 2.2900e+00, 5.6400e+00,
          1.0400e+00, 3.9200e+00, 1.0650e+03]]], dtype=torch.float64) tensor([[[1]]])
<class 'torch.Tensor'> <class 'torch.Tensor'>
torch.Size([1, 1, 13]) torch.Size([1, 1, 1])


Notice that the numpy arrays have been converted to PyTorch Tensors.
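`ToTensor` is designed for images, which is why an extra channel dimension appears in the shapes above. For tabular data, `torch.from_numpy` converts without changing the shape. A sketch on a hand-written array (values taken from the first row printed above, with only three features kept for brevity):

```python
import numpy as np
import torch

feature = np.array([[14.23, 1.71, 2.43]])  # shape (1, 3), dtype float64

t = torch.from_numpy(feature)
print(t.shape, t.dtype)      # torch.Size([1, 3]) torch.float64

# cast to float32, the default dtype of nn.Linear weights
t32 = torch.from_numpy(feature.astype(np.float32))
print(t32.shape, t32.dtype)  # torch.Size([1, 3]) torch.float32
```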

## 3.3 Custom Transforms

We can create custom transforms either by writing our own version of a built-in transform, defining its behaviour in the `__call__` method, or by building a new transform class from scratch.

Suppose we decide to write our own version of `ToTensor()` that operates on an `(x, y)` sample tuple.

In [97]:
# Implementing a custom Dataset:
#   inherit 'Dataset' class
#   implement:
#       - __init__: to define the variables
#       - __getitem__: to obtain (x,y) from a certain index
#       - __len__: to get the length of the dataset, or get #samples


# custom class for wine dataset

class WineDataset(Dataset):
    def __init__(self, transform = None):
        # initialize data, download etc.
        # read with numpy or pandas
        df = pd.read_csv('data/wine.csv')
        
        # extract features and target
        self.X = df.drop('Wine', axis=1).values
        self.y = df[['Wine']].values
        # we do not convert to tensor here

        # extract the transform into a variable
        self.transform = transform

        # extract the no. of samples
        self.n_samples = df.shape[0]


    # support indexing such that dataset[i] can be used to obtain the i-th sample
    def __getitem__(self, index):
        sample = self.X[index], self.y[index]

        # if transform is not `None`
        if self.transform:
            sample = self.transform(sample)
        
        return sample
        

    # we can also obtain n_samples by calling length of the dataset
    def __len__(self):
        return self.n_samples

In [98]:
# a custom `ToTensor` transform that works on (feature, target) samples:
class ToTensor:
    # converting numpy arrays to Tensors
    def __call__(self, sample):
        feature, target = sample
        # convert them to Tensors
        feature = torch.from_numpy(feature.astype(np.float32))
        target = torch.from_numpy(target.astype(np.float32))
        return feature, target

In [99]:
# create dataset with transform
dataset = WineDataset(transform=ToTensor())

# get first sample and unpack
first_data = dataset[0]
x0, y0 = first_data
print(x0, y0)
print(type(x0), type(y0))
print(x0.shape, y0.shape)

tensor([1.4230e+01, 1.7100e+00, 2.4300e+00, 1.5600e+01, 1.2700e+02, 2.8000e+00,
        3.0600e+00, 2.8000e-01, 2.2900e+00, 5.6400e+00, 1.0400e+00, 3.9200e+00,
        1.0650e+03]) tensor([1.])
<class 'torch.Tensor'> <class 'torch.Tensor'>
torch.Size([13]) torch.Size([1])


Similarly, we can also create a new transform, say `MulTransform()`, which takes a number as an argument and multiplies all the input features by it.

In [100]:
# create a new transform class
class MulTransform:
    # this takes in a no. as an arg
    def __init__(self, factor=1):
        self.factor = factor
    
    # multiplying factor to the input features
    def __call__(self, sample):
        feature, target = sample
        # multiply out-of-place so that the data stored in the dataset is not modified
        feature = feature * self.factor
        return feature, target

In [101]:
# create dataset with transform
dataset = WineDataset(transform=MulTransform(factor=2))

# get first sample and unpack
first_data = dataset[0]
x0, y0 = first_data
print(x0, y0)
print(type(x0), type(y0))
print(x0.shape, y0.shape)

[2.846e+01 3.420e+00 4.860e+00 3.120e+01 2.540e+02 5.600e+00 6.120e+00
 5.600e-01 4.580e+00 1.128e+01 2.080e+00 7.840e+00 2.130e+03] [1]
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
(13,) (1,)


Note that the samples are still numpy arrays.
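A caveat with NumPy-backed datasets: indexing a row of a 2-D array returns a view, so in-place arithmetic such as `*=` inside a transform writes back into the stored data, and the same sample would be scaled again on every access. A small sketch of the difference (toy array, names invented for illustration):

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])

row = X[0]          # a view into X, not a copy
row *= 2            # in-place: also changes X
print(X[0])         # [2. 4.]

scaled = X[1] * 2   # out-of-place: creates a new array
print(scaled)       # [6. 8.]
print(X[1])         # [3. 4.] (unchanged)
```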

## 3.4 Composing Transforms

Suppose, we want to compose 2 transforms `ToTensor()` and `MulTransform()`. We can do this by using `torchvision.transforms.Compose()`.

In [102]:
from torchvision.transforms import Compose

composedTransform = Compose([
    ToTensor(),
    MulTransform(factor=2)
])

In [105]:
# create dataset with transform
dataset = WineDataset(transform=composedTransform)

# get first sample and unpack
first_data = dataset[0]
x0, y0 = first_data
print(x0, y0)
print(type(x0), type(y0))
print(x0.shape, y0.shape)

tensor([2.8460e+01, 3.4200e+00, 4.8600e+00, 3.1200e+01, 2.5400e+02, 5.6000e+00,
        6.1200e+00, 5.6000e-01, 4.5800e+00, 1.1280e+01, 2.0800e+00, 7.8400e+00,
        2.1300e+03]) tensor([1.])
<class 'torch.Tensor'> <class 'torch.Tensor'>
torch.Size([13]) torch.Size([1])


Notice that the features have been multiplied by a factor of 2 and have also been converted to PyTorch Tensors.

# 4. Full Pipeline

We train a Logistic Regression model on the Breast Cancer dataset using the techniques above.

## 4.1 Preparing Data

### 4.1.1 Creating the Dataset Class

Note: In order for the logistic regression model to train well with gradient descent, we standardize the features (zero mean, unit variance) before fitting.
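Standard scaling rescales each feature column to zero mean and unit variance. The NumPy sketch below shows the computation on a made-up 3x2 array; `StandardScaler` with its default settings performs the same per-column operation:

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# z-score per column: subtract the mean, divide by the standard deviation
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled.mean(axis=0))  # approximately [0. 0.]
print(X_scaled.std(axis=0))   # approximately [1. 1.]
```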

In [112]:
# Implementing a custom Dataset:
#   inherit 'Dataset' class
#   implement:
#       - __init__: to define the variables
#       - __getitem__: to obtain (x,y) from a certain index
#       - __len__: to get the length of the dataset, or get #samples


# custom class for Breast Cancer dataset

from torch.utils.data import Dataset
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

class BCDataset(Dataset):
    def __init__(self, transform = None):
        # initialize data, download etc.
        # read with numpy or pandas
        
        # extract features and target
        self.X, self.y = load_breast_cancer(return_X_y=True)
        
        # reshape y
        self.y = self.y.reshape(-1,1)
        
        # standard scale X
        ss = StandardScaler()
        self.X = ss.fit_transform(self.X)

        # we do not convert to tensor here

        # extract the transform into a variable
        self.transform = transform

        # extract the no. of samples
        self.n_samples = self.X.shape[0]

        # extract the no. of input features
        self.n_features = self.X.shape[1]


    # support indexing such that dataset[i] can be used to obtain the i-th sample
    def __getitem__(self, index):
        sample = self.X[index], self.y[index]

        # if transform is not `None`
        if self.transform:
            sample = self.transform(sample)
        
        return sample
        

    # we can also obtain n_samples by calling length of the dataset
    def __len__(self):
        return self.n_samples

We define some custom Transforms here.

In [113]:
# a custom `ToTensor` transform for (feature, target) samples;
# it replaces torchvision's built-in `ToTensor`, so that import is not needed
class ToTensor:
    # converting numpy arrays to Tensors
    def __call__(self, sample):
        feature, target = sample
        # convert them to Tensors
        feature = torch.from_numpy(feature.astype(np.float32))
        target = torch.from_numpy(target.astype(np.float32))
        return feature, target

In [120]:
# create dataset with transform
dataset = BCDataset(transform=ToTensor())

### 4.1.2 Splitting into Training and Test Datasets

To split the dataset, we use `torch.utils.data.random_split()` before feeding the training data into a DataLoader.

In [121]:
from torch.utils.data import random_split

# determining the train,test size
n_samples = len(dataset)
test_ratio = 0.1

test_size = int(n_samples*test_ratio)
train_size = n_samples - test_size

train_dataset, test_dataset = random_split(dataset, [train_size, test_size])

print("Training Dataset Size:", len(train_dataset))
print("Test Dataset Size:", len(test_dataset))

Training Dataset Size: 513
Test Dataset Size: 56
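`random_split` draws a different split on every run. Passing a seeded `torch.Generator` makes it reproducible. A sketch on a stand-in dataset of 569 items (the seed `42`, the helper `split` and the stand-in data are arbitrary choices for illustration):

```python
import torch
from torch.utils.data import TensorDataset, random_split

toy = TensorDataset(torch.arange(569))

test_size = int(len(toy) * 0.1)    # 56
train_size = len(toy) - test_size  # 513

def split(seed):
    # a fresh generator with a fixed seed gives the same permutation each time
    gen = torch.Generator().manual_seed(seed)
    return random_split(toy, [train_size, test_size], generator=gen)

train_a, test_a = split(42)
train_b, test_b = split(42)

print(len(train_a), len(test_a))                     # 513 56
print(list(test_a.indices) == list(test_b.indices))  # True: same seed, same split
```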


### 4.1.3 Feeding into DataLoader

In [122]:
from torch.utils.data import DataLoader

# for the test dataset, we consider the full batch size for evaluation purposes, hence batch_size=len(test_dataset) and shuffle = False
test_loader = DataLoader(
    dataset=test_dataset,
    batch_size=len(test_dataset),
    shuffle=False,
    num_workers=0
)

# we use an iterator to extract the full batch of test dataset
X_test, y_test = next(iter(test_loader))
print(X_test.shape, y_test.shape)

# set the batch_size
batch_size = 5

# for the train dataset, we split this into batches of size `batch_size` (5 here)
train_loader = DataLoader(
    dataset=train_dataset,
    batch_size=batch_size,
    shuffle=True,
    num_workers=0   
)

torch.Size([56, 30]) torch.Size([56, 1])


## 4.2 Creating the Model

In [129]:
class LogisticRegression(nn.Module):
    def __init__(self, input_dim):
        super(LogisticRegression, self).__init__()

        # define the layers here
        self.linear = nn.Linear(input_dim, 1)
        # here output dim is going to be 1
    
    def forward(self, X):
        # we use sigmoid activation
        y_hat = torch.sigmoid(self.linear(X))
        return y_hat

In [130]:
input_dim = dataset.n_features

# initialise logistic regression model instance
model = LogisticRegression(input_dim)

In [131]:
# we also check the initial accuracy on the test data before training

from sklearn.metrics import accuracy_score

with torch.no_grad():
    # this gives the sigmoid output
    y_test_pred = model(X_test)
    # round to the nearest integer: 1 if the sigmoid output is above 0.5, else 0
    y_test_hat = y_test_pred.round()
    print("Initial Accuracy on test data:", accuracy_score(y_test, y_test_hat))

Initial Accuracy on test data: 0.07142857142857142


## 4.3 Training

In [132]:
# in this scenario, we use the Binary Cross Entropy (BCE) loss, with SGD optimization

# learning rate
lr = 0.01

# no. of epochs
num_epochs = 300

# define loss criterion: Binary Cross Entropy loss
criterion = nn.BCELoss()

# define optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

# format of optimizer: torch.optim.SGD(weights, lr, ...)
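For reference, `nn.BCELoss` with its default settings averages `-(y*log(p) + (1-y)*log(1-p))` over the batch, where `p` is the sigmoid output. The sketch below checks this against a hand-written version on made-up predictions:

```python
import torch
import torch.nn as nn

p = torch.tensor([[0.9], [0.2], [0.6]])  # predicted probabilities
y = torch.tensor([[1.0], [0.0], [1.0]])  # true labels

# binary cross entropy written out by hand, averaged over the batch
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()
builtin = nn.BCELoss()(p, y)

print(manual.item(), builtin.item())
print(torch.allclose(manual, builtin))  # True
```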

In [133]:
for epoch in range(num_epochs):
    # load training data in batches via train_loader
    for (idx, batch) in enumerate(train_loader):
        # extract features and labels
        X_train_batch, y_train_batch = batch

        # training processes
        # forward pass:
        # compute prediction
        y_train_batch_pred = model(X_train_batch)
        # compute loss
        loss = criterion(y_train_batch_pred, y_train_batch)

        # backward pass:
        # compute gradients
        loss.backward()

        # update weights:
        optimizer.step()
        # zero-gradients after updating
        optimizer.zero_grad()

        # print info after every 5 mini-batches
        if idx%5 == 0:
            print('-------------------------------')
            print(f'Epoch: {epoch+1}/{num_epochs}')
            print(f'Steps: {idx+1}')
            print(f'Batch Features shape: {X_train_batch.shape} | Batch Target shape: {y_train_batch.shape}')
            print(f'Loss: {loss}')


-------------------------------
Epoch: 1/300
Steps: 1
Batch Features shape: torch.Size([5, 30]) | Batch Target shape: torch.Size([5, 1])
Loss: 0.9560569524765015
-------------------------------
Epoch: 1/300
Steps: 6
Batch Features shape: torch.Size([5, 30]) | Batch Target shape: torch.Size([5, 1])
Loss: 0.8834799528121948
-------------------------------
Epoch: 1/300
Steps: 11
Batch Features shape: torch.Size([5, 30]) | Batch Target shape: torch.Size([5, 1])
Loss: 0.6790915131568909
-------------------------------
Epoch: 1/300
Steps: 16
Batch Features shape: torch.Size([5, 30]) | Batch Target shape: torch.Size([5, 1])
Loss: 0.5666138529777527
-------------------------------
Epoch: 1/300
Steps: 21
Batch Features shape: torch.Size([5, 30]) | Batch Target shape: torch.Size([5, 1])
Loss: 0.61336749792099
-------------------------------
Epoch: 1/300
Steps: 26
Batch Features shape: torch.Size([5, 30]) | Batch Target shape: torch.Size([5, 1])
Loss: 0.5747867226600647
--------------------------

## 4.4 Evaluation

In [134]:
# we also check the final accuracy on the test data after training

from sklearn.metrics import accuracy_score

with torch.no_grad():
    # this gives the sigmoid output
    y_test_pred = model(X_test)
    # round to the nearest integer: 1 if the sigmoid output is above 0.5, else 0
    y_test_hat = y_test_pred.round()
    print("Final Accuracy on test data:", accuracy_score(y_test, y_test_hat))

Final Accuracy on test data: 1.0


So, we manage to improve the test accuracy from `7%` to `100%`.
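The same accuracy can be computed without scikit-learn by thresholding the sigmoid outputs at 0.5 and comparing with the labels. A sketch on made-up values:

```python
import torch

y_prob = torch.tensor([[0.8], [0.3], [0.6], [0.1]])  # sigmoid outputs
y_true = torch.tensor([[1.0], [0.0], [0.0], [0.0]])

y_hat = (y_prob >= 0.5).float()  # hard 0/1 predictions
accuracy = (y_hat == y_true).float().mean().item()
print(accuracy)  # 0.75
```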

# 5. In-built datasets in PyTorch

PyTorch's companion packages provide several built-in datasets, exposed through modules such as `torchvision.datasets` and `torchtext.datasets`. The data files are not bundled with the packages and are usually downloaded on request. The package `torch` contains the core classes and methods required to implement neural networks, while `torchvision` is a supporting package with popular datasets, model architectures, and common image transformations for computer vision. A further package, `torchtext`, provides basic utilities for natural language processing with PyTorch, including text datasets.

## 5.1 Datasets in TorchVision

- **MNIST:** `torchvision.datasets.MNIST()`
- **Fashion MNIST:** `torchvision.datasets.FashionMNIST()`
- **CIFAR:** `torchvision.datasets.CIFAR10()`, `torchvision.datasets.CIFAR100()`
- **COCO:** `torchvision.datasets.CocoCaptions()`
- **EMNIST:** `torchvision.datasets.EMNIST()`
- **IMAGE-NET:** `torchvision.datasets.ImageNet()`

## 5.2 Datasets in TorchText

- **IMDB:** `torchtext.datasets.IMDB()`
- **WikiText2:** `torchtext.datasets.WikiText2()`

## 5.3 `ImageFolder` Class

`ImageFolder` is a generic data loader class in `torchvision` that helps you load your own image dataset. Let’s imagine you are working on a classification problem and building a neural network to identify if a given image is an apple or an orange. To do this in PyTorch, the first step is to arrange images in a default folder structure as shown below:
```
root
├── orange
│   ├── orange_image1.png
│   └── orange_image2.png
└── apple
    ├── apple_image1.png
    ├── apple_image2.png
    └── apple_image3.png
```
After we have arranged our dataset as shown, we can use the `ImageFolder` class to load all these images. Below is the code snippet you would use to do so:
```Python
torchvision.datasets.ImageFolder(root, transform)
```
