# Baseline Transfer Learning Model for TrashNet Classification

Our baseline model uses a pretrained VGG16 (with batch normalization) feature extractor with a shallow, wide fully connected head. Every layer shares a single (homogeneous) learning rate. We are going to validate the model with stratified cross-validation splits, tracking the F1 score and multi-class AUC.

This model acts as a stepping stone / template for future experiments.
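Since the F1 score is one of the validation metrics, it helps to be clear about which average is reported. The sketch below is a plain-Python illustration (not scikit-learn's implementation; the function names and the toy labels are made up for this example) showing that, for single-label multi-class predictions, the micro-averaged F1 equals accuracy, while the macro average penalizes a class that is never predicted correctly.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_micro_macro(y_true, y_pred, num_classes):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # true class t was missed

    # Micro: pool the counts over all classes, then compute a single F1.
    total_tp, total_fp, total_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * total_tp / (2 * total_tp + total_fp + total_fn)

    # Macro: compute F1 per class, then take the unweighted mean.
    per_class = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / num_classes
    return micro, macro


# Toy labels (hypothetical): class 1 is never predicted correctly.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 2, 1]

acc = accuracy(y_true, y_pred)
micro, macro = f1_micro_macro(y_true, y_pred, num_classes=3)
print(f"accuracy={acc:.3f} micro_f1={micro:.3f} macro_f1={macro:.3f}")
```

If per-class performance matters, a macro (or per-class) F1 is likely the more informative number to log alongside accuracy.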

In [1]:
!pip install pkbar



In [2]:
import os
import pkbar
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import pandas as pd
import numpy as np
from PIL import Image
from skimage import io, transform
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils
from torchvision import models
import matplotlib.pyplot as plt
from google.colab import drive
from sklearn.model_selection import StratifiedKFold, StratifiedShuffleSplit
from sklearn.metrics import roc_auc_score, f1_score, accuracy_score

## Data Pre-processing
For the baseline model, we will not be applying any data augmentation or color manipulation.
- Load the index CSV file, which lists every image's path (relative to the dataset directory) and its one-hot label.

In [5]:
dataset = 'trashnet'
# `root` must already point at the data directory (presumably set in a cell not shown here)
csv_path = os.path.join(root, dataset, f'{dataset}_index.csv')
trash_index = pd.read_csv(csv_path)

In [6]:
print(trash_index)

                    Filename  metal  cardboard  paper  trash  glass  plastic
0         metal/metal282.jpg      1          0      0      0      0        0
1         metal/metal296.jpg      1          0      0      0      0        0
2           metal/metal2.jpg      1          0      0      0      0        0
3         metal/metal255.jpg      1          0      0      0      0        0
4         metal/metal241.jpg      1          0      0      0      0        0
...                      ...    ...        ...    ...    ...    ...      ...
2525  plastic/plastic441.jpg      0          0      0      0      0        1
2526  plastic/plastic482.jpg      0          0      0      0      0        1
2527  plastic/plastic327.jpg      0          0      0      0      0        1
2528  plastic/plastic323.jpg      0          0      0      0      0        1
2529  plastic/plastic109.jpg      0          0      0      0      0        1

[2530 rows x 7 columns]
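Each row of the index stores its label as a one-hot vector over the six class columns. As a minimal sketch, the snippet below converts such rows into integer class indices and counts the classes. It uses plain Python lists instead of the DataFrame, and the four rows are copied from the printout above; the helper name `one_hot_to_index` is an assumption for illustration.

```python
from collections import Counter

# Column order as printed in the index above.
CLASSES = ["metal", "cardboard", "paper", "trash", "glass", "plastic"]

# A few (filename, one-hot label) rows taken from the printed index.
rows = [
    ("metal/metal282.jpg", [1, 0, 0, 0, 0, 0]),
    ("metal/metal296.jpg", [1, 0, 0, 0, 0, 0]),
    ("plastic/plastic441.jpg", [0, 0, 0, 0, 0, 1]),
    ("plastic/plastic482.jpg", [0, 0, 0, 0, 0, 1]),
]


def one_hot_to_index(one_hot):
    """Return the position of the single 1 in a one-hot vector."""
    if sum(one_hot) != 1:
        raise ValueError(f"expected exactly one active class, got {one_hot}")
    return one_hot.index(1)


class_ids = [one_hot_to_index(label) for _, label in rows]
class_counts = Counter(CLASSES[i] for i in class_ids)
print(class_ids)
print(class_counts)
```

Integer class indices are the form that stratified splitters and index-based loss functions typically expect.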


### Trash Dataset
A `Dataset` object that handles the various datasets we will be working with: TrashNet, ISBNet, and ISBNet extended.



In [7]:
class TrashDataset(Dataset):
  def __init__(self, csv_file, directory, root_dir, transform=None):
    """
    csv_file: CSV file that contains information about each image and their labels.
    directory: the directory where the trash data is kept
    root_dir: path to the `directory`
    transform: optional augmentations that are to be applied onto the images
    """
    self.images = os.path.join(root_dir, directory)
    self.csv = csv_file
    self.transform = transform
  
  def __len__(self):
    return len(self.csv)
  
  def __getitem__(self, idx):
    if torch.is_tensor(idx):
      idx = idx.tolist()
    
    img_name = os.path.join(self.images, self.csv.iloc[idx, 0])
    # Force three channels so the ImageNet normalization always applies
    image = Image.open(img_name).convert('RGB')
    labels = self.csv.iloc[idx, 1:]
    sample = {'image': image,
              'label': torch.tensor(labels.tolist(), dtype=torch.float)}

    if self.transform:
      sample['image'] = self.transform(sample['image'])

    return sample

## Model and Training Setup
- VGG16 (with batch normalization) pretrained on ImageNet
- Shallow, wide fully connected head ending in a log-softmax activation
- Cross-entropy loss and Adam optimizer
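Combining a log-softmax output with a cross-entropy loss deserves a closer look, because cross-entropy on raw scores is itself log-softmax followed by negative log-likelihood. The framework-free sketch below (helper names are made up for this illustration) checks two things numerically: applying log-softmax twice changes nothing, so the extra layer is redundant rather than harmful, and the normalization has to run over the class axis (`dim=1` for a `(batch, classes)` tensor), not over the batch axis.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a flat list of scores."""
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum_exp for s in scores]


def cross_entropy(scores, target):
    """Cross-entropy = negative log-likelihood of log-softmax(scores)."""
    return -log_softmax(scores)[target]


# 1) Log-softmax is idempotent, so the loss is the same either way.
logits = [2.0, 0.5, -1.0, 0.0, 1.5, -0.5]  # six class scores for one sample
target = 4
once = log_softmax(logits)
twice = log_softmax(once)
loss_on_logits = cross_entropy(logits, target)
loss_on_log_probs = cross_entropy(once, target)

# 2) The axis matters: normalize each sample over its classes.
batch = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]  # shape (batch=2, classes=3)
over_classes = [log_softmax(row) for row in batch]
over_batch = [list(r) for r in zip(*[log_softmax(list(col)) for col in zip(*batch)])]

sums_over_classes = [sum(math.exp(v) for v in row) for row in over_classes]
sums_over_batch = [sum(math.exp(v) for v in row) for row in over_batch]
print(sums_over_classes)  # each sample's probabilities sum to 1
print(sums_over_batch)    # per-sample sums are no longer 1
```

In PyTorch terms, this suggests either feeding raw scores to `nn.CrossEntropyLoss` or pairing `nn.LogSoftmax(dim=1)` with `nn.NLLLoss`.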

### Device Setup

In [8]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

### Constants

In [9]:
FOLDS = 5
EPOCHS = 150
BATCH_SIZE = 32

### Model

In [10]:
class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.model = models.vgg16_bn(pretrained=True)
    # Replace the final classification layer with our own fully connected head
    self.model.classifier[6] = nn.Sequential(
                                nn.Linear(4096, 1024, bias=True),
                                nn.BatchNorm1d(1024),
                                nn.Dropout(.25),
                                nn.ReLU(),
                                nn.Linear(1024, 512, bias=True),
                                nn.BatchNorm1d(512),
                                nn.Dropout(.5),
                                nn.ReLU(),
                                nn.Linear(512, 6, bias=True),
                                # Normalize over the class axis (dim=1), not the batch axis
                                nn.LogSoftmax(dim=1))
  def forward(self, x):
    return self.model(x)
  
  def num_flat_features(self, x):
    """
    https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py
    """
    size = x.size()[1:]  # get all dimensions except for batch size
    features = 1
    for s in size:
      features *= s
    return features



### KFold Training and CV
* Stratified splits with `StratifiedShuffleSplit` (the `StratifiedKFold` variant is left commented out)
* Creating Dataloaders in the training loop
* Using Adam and CrossEntropy loss
* Center crop on images to make them 224x224, the input size VGG expects
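To make the idea of a stratified split concrete, here is a simplified pure-Python illustration. It is not scikit-learn's algorithm, and the function name, toy class sizes, and rounding rule are assumptions for this sketch. It shuffles the indices of each class separately and sends the same fraction of every class to the held-out set, so class proportions are preserved.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Split sample indices so each class keeps the same test fraction."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for _, indices in sorted(by_class.items()):
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy, imbalanced labels: 60 / 30 / 10 samples for classes 0 / 1 / 2.
toy_labels = [0] * 60 + [1] * 30 + [2] * 10
toy_train, toy_test = stratified_split(toy_labels, test_fraction=0.2)

print(Counter(toy_labels[i] for i in toy_train))
print(Counter(toy_labels[i] for i in toy_test))
```

Calling the function with different seeds gives independent random splits, which is roughly what repeated shuffle-split validation does; a true K-fold scheme would instead partition the data so every sample is held out exactly once.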

In [11]:
labels = trash_index.iloc[:,1:].values
labels = [list(v).index(1) for v in labels]
# s = StratifiedKFold(n_splits=FOLDS, shuffle=True).split(trash_index, labels)
s = StratifiedShuffleSplit(n_splits=FOLDS, test_size=0.17, random_state=0).split(trash_index, labels)

In [12]:
transform = transforms.Compose([
                                transforms.Resize(300),
                                # Deterministic center crop: the baseline uses no augmentation
                                transforms.CenterCrop(224),
                                transforms.ToTensor(),
                                transforms.Normalize([0.485, 0.456, 0.406],
                                                     [0.229, 0.224, 0.225])
])
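The last two steps of the pipeline are simple arithmetic: `ToTensor` rescales 8-bit values to the range [0, 1], and `Normalize` then subtracts a per-channel mean and divides by a per-channel standard deviation (here the usual ImageNet statistics). The sketch below reproduces that arithmetic for a single RGB pixel in plain Python; the helper name `normalize_pixel` is hypothetical.

```python
# Per-channel ImageNet statistics, as passed to transforms.Normalize above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb_uint8):
    """Apply ToTensor-style scaling, then Normalize, to one 8-bit RGB pixel."""
    scaled = [value / 255.0 for value in rgb_uint8]  # [0, 255] -> [0, 1]
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]


black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

After normalization the inputs span roughly -2.1 to +2.6 rather than 0 to 1, which matches the input distribution the pretrained weights were trained on.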

### Training and Validation

In [13]:

for fold, (train_idx, test_idx) in enumerate(s):
  # Create model and send it to device
  model = Net()
  model.to(device)

  # Freeze layers that are a part of vgg.
  # c = 0
  # vgg = next(model.children())
  # for param in vgg:
  #     if c <= 39:
  #         if hasattr(param, 'weight') and hasattr(param, 'bias'):
  #             param.weight.requires_grad = False
  #             param.bias.requires_grad = False
  #         param.requires_grad = False

  #     c += 1

  loss = nn.CrossEntropyLoss()
  optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))

  # Create TrashDataset objects from the newly separated splits.
  train = TrashDataset(trash_index.iloc[train_idx,:], dataset, root, transform)
  test = TrashDataset(trash_index.iloc[test_idx,:], dataset, root, transform)

  # Use these fragmented datasets to create dataloaders.
  train_loader = torch.utils.data.DataLoader(train, 
                                             batch_size=BATCH_SIZE,
                                             shuffle=True,
                                             num_workers=4)
  test_loader = torch.utils.data.DataLoader(test,
                                            batch_size=BATCH_SIZE,
                                            shuffle=False,  # evaluation does not need shuffling
                                            num_workers=4)

  # Wrap dataloaders into a dictionary for ease of access
  dataloaders = {'train': train_loader, 'test': test_loader}
  best_val = 0.
  for epoch in range(EPOCHS):
    # Generate Keras-like progress bar
    train_steps_per_epoch = len(train) // BATCH_SIZE
    test_steps_per_epoch = len(test) // BATCH_SIZE
    print(f'Fold: {fold+1} Epochs: {epoch+1}/{EPOCHS} Train for {train_steps_per_epoch} steps, Validate for {test_steps_per_epoch} steps')
    kbar = pkbar.Kbar(target=len(train), width=10)

    for phase in ['train', 'test']:
      if phase == 'train':
        model.train()
      else:
        model.eval()

      loss_log = []
      f1_log = []
      acc_log = [] 

      for batch_num, inputs in enumerate(dataloaders[phase]):
        # Load data onto device: GPU or CPU
        # Variable wrappers are no longer needed; tensors track gradients directly
        images = inputs['image']
        # Convert one-hot labels to class indices
        labels = inputs['label'].argmax(dim=1)

        images = images.to(device, dtype=torch.float)
        labels = labels.to(device, dtype=torch.long)

        # Zero the optimizer
        optimizer.zero_grad()
        
        # Forward Feeding
        with torch.set_grad_enabled(phase=='train'):
          outputs = model(images)
          loss_value = loss(outputs, labels)
          preds = torch.max(outputs, 1)[1].cpu().detach().numpy()

          # Calculating Metrics
          # scikit-learn metrics expect (y_true, y_pred)
          y_true = labels.cpu().numpy()
          acc = accuracy_score(y_true, preds)
          f1 = f1_score(y_true, preds, average='micro')

          if phase == 'train':
            loss_value.backward()
            optimizer.step()
            # .item() logs a plain float instead of a tensor tied to the graph
            kbar.update((batch_num+1) * BATCH_SIZE, values=[('loss', loss_value.item()),
                                                            ('f1_score', f1),
                                                            ('acc', acc)])
          if phase == 'test':
            loss_log.append(loss_value.item())
            f1_log.append(f1)
            acc_log.append(acc)

      if phase == 'test':
        kbar.add(1, values=[('val_loss', sum(loss_log)/len(loss_log)), 
                            ('val_f1_score', sum(f1_log)/len(f1_log)), 
                            ('val_acc',  sum(acc_log)/len(acc_log))])

Fold: 1 Epochs: 1/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 2/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 3/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 4/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 5/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 6/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 7/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 8/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 9/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 10/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 11/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 12/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 13/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 14/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 15/150 Train for 65 steps, Validate for 13 steps
Fold: 1 Epochs: 16/150 Train for 6

KeyboardInterrupt: ignored
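The step counts in the log above come from floor division (`len(train) // BATCH_SIZE`), while a `DataLoader` with the default `drop_last=False` also yields the final partial batch. The sketch below works through that arithmetic. The split sizes are an assumption: they are derived by rounding `test_size=0.17` of the 2530 rows up, which mirrors how scikit-learn is expected to size the test set, and the helper name `batch_counts` is hypothetical.

```python
import math

N_SAMPLES = 2530   # rows in the index
TEST_SIZE = 0.17   # fraction passed to the splitter
BATCH_SIZE = 32

# Assumed split sizes: test set rounded up, training set takes the rest.
n_test = math.ceil(TEST_SIZE * N_SAMPLES)
n_train = N_SAMPLES - n_test


def batch_counts(n_items, batch_size):
    """Return (full batches, batches actually yielded, size of the partial batch)."""
    full = n_items // batch_size
    yielded = math.ceil(n_items / batch_size)
    leftover = n_items - full * batch_size
    return full, yielded, leftover


train_counts = batch_counts(n_train, BATCH_SIZE)
test_counts = batch_counts(n_test, BATCH_SIZE)
print(n_train, train_counts)
print(n_test, test_counts)
```

Under these assumptions the logged 65 and 13 steps each undercount by one batch, so a progress bar sized from the floor value will overshoot slightly on the last update.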