## Bagging of U-Nets

This notebook contains the code to build an ensemble (bagging) of independently trained U-Net models.

In [1]:
try: # Google Colab integration
  from google.colab import drive

  print('Colab environment detected. Mounting drive...')
  drive.mount('/content/drive')

  print('Mounted. Switching to directory... ', end = '')
  %cd /content/drive/'My Drive'/CILroadseg
  print('done.')
except ImportError: # google.colab can only be imported inside Colab
  print('Colab environment not found. Working on ordinary directory.')

Colab environment detected. Mounting drive...
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Mounted. Switching to directory... /content/drive/My Drive/CILroadseg
done.


In [2]:
import numpy as np
np.random.seed(18)

import tensorflow as tf
tf.random.set_seed(33)

import sys
import os
import matplotlib.image as mpimg

from util.submit import *      # util/submit.py contains the functions used to generate the CSV file for the Kaggle competition
from util.visualize import *   # util/visualize.py provides functions for image visualization
from util.notebooks import *   # util/notebooks.py contains various utility functions used in the notebooks

## Loading Training Data

`nb_load_data` is a helper function provided in `util/notebooks.py`.

In [3]:
train_dir = "training/images/"
gt_dir = "training/groundtruth/"
test_dir = "test/images/"

X, Y, X_test = nb_load_data(train_dir, gt_dir, test_dir)

Y = (Y >= 0.25) * 1

Loading training input...
Progress: done (100 images).
Loading training groundtruth...
Progress: done (100 images).
Loading test input...
Progress: done (94 images).

       Training data shape: (100, 400, 400, 3)
Training groundtruth shape: (100, 400, 400)
           Test data shape: (94, 608, 608, 3)
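
The line `Y = (Y >= 0.25) * 1` in the cell above binarizes the groundtruth. As a minimal sketch on a made-up array (the values below are illustrative, not taken from the dataset), the thresholding behaves as follows:

```python
import numpy as np

# Toy groundtruth with grayscale values in [0, 1] (made-up values).
Y_toy = np.array([[0.0, 0.1],
                  [0.25, 0.9]])

# Pixels at or above 0.25 become 1 (road); all others become 0 (background).
Y_bin = (Y_toy >= 0.25) * 1
```

Multiplying the boolean mask by `1` converts it to an integer array, so the labels are `0` and `1` rather than `False` and `True`.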


## Loading the Models

Here we load the model weights from previously generated `.h5` files.

Please have a look at `unet.ipynb` and `unet.py` to see how to train a new U-Net model, and generate a new `.h5` weight file.

In [4]:
from tensorflow import keras

from util.helpers import Patchifier
from recomposer import *

from discretize import *
from rotate_mean import *
from unet import *

Using TensorFlow backend.


In [5]:
weights_path = "saves/final/bagging/"

files = [weights_path + file for file in os.listdir(weights_path)]

voters = []
for i in range(len(files)):

  model = UNetModel()

  # model = RotAndMean(model)
  # We do not use rot and mean here. Rot and Mean is useful to remove false
  # positives, but a bagging should obtain the same effect by itself.

  model = Discretizer(model, threshold=0.5)
  # Computing the mean of rounded predictions and then rounding again is
  # equivalent to computing the majority.
  # Comment this out to obtain a mean instead of a majority.

  model.initialize()
  model.load(files[i])

  model = Recomposer(Patchifier(model))
  # This divides and then recomposes the images in patches.
  # Comment this out to obtain pixelwise mean/majority.

  voters.append(model)

print('Voters: '+ str(len(voters)))

Voters: 9
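
The comment in the cell above states that averaging rounded predictions and rounding again is equivalent to a majority vote. The following is a minimal NumPy sketch of that claim. The random `votes` array is a hypothetical stand-in for the outputs of 9 discretized voters, and the `>= 0.5` comparison is an assumption about how the threshold is applied. With an odd number of voters no tie can occur, so the choice between `>=` and `>` does not matter here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the outputs of 9 discretized voters:
# shape (voters, images, height, width), values in {0, 1}.
votes = rng.integers(0, 2, size=(9, 2, 4, 4))

# Mean of the binary votes, thresholded again at 0.5.
mean_then_round = (votes.mean(axis=0) >= 0.5).astype(int)

# Explicit majority count: a pixel is 1 when more than half of the voters say 1.
majority = (votes.sum(axis=0) > votes.shape[0] / 2).astype(int)

same = np.array_equal(mean_then_round, majority)
```

With an even number of voters, ties become possible and the two formulations agree only if both treat a tie the same way.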


## Computing the Mean

We create a new decorator `VoterMean` that takes a list of models and returns the mean of the predictions among these voters.

In [6]:
from util.model_base import ModelBase

In [7]:
class VoterMean(ModelBase):

  def __init__(self, voters):
    self.voters = voters

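  def classify(self, X):
    # Z[i] holds the prediction of voter i for every image in X.
    Z = np.empty((len(self.voters), X.shape[0], X.shape[1], X.shape[2]))

    # Use the voters stored on the instance, not the global list.
    for i, voter in enumerate(self.voters):
      Z[i] = voter.classify(X)

    # Average the predictions over the voter axis.
    return np.mean(Z, axis=0)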

In [8]:
voter_model = VoterMean(voters)

voter_model_majority = Discretizer(voter_model, threshold=0.5)
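
To illustrate the difference between averaging raw predictions and taking a majority of discretized predictions, here is a self-contained sketch. `ConstantVoter`, `ThresholdedVoter`, and `MeanEnsemble` are hypothetical stand-ins written for this example only. They are not the `UNetModel`, `Discretizer`, or `VoterMean` classes of this repository, and the `>= 0.5` threshold rule is an assumption.

```python
import numpy as np

class ConstantVoter:
    """Hypothetical stand-in for a trained model: predicts one fixed probability."""
    def __init__(self, p):
        self.p = p

    def classify(self, X):
        # One value per pixel, shape (N, H, W).
        return np.full(X.shape[:3], self.p)

class ThresholdedVoter:
    """Hypothetical wrapper that rounds a model's predictions to {0, 1}."""
    def __init__(self, model, threshold=0.5):
        self.model = model
        self.threshold = threshold

    def classify(self, X):
        return (self.model.classify(X) >= self.threshold) * 1

class MeanEnsemble:
    """Averages the predictions of all voters."""
    def __init__(self, voters):
        self.voters = voters

    def classify(self, X):
        return np.stack([v.classify(X) for v in self.voters]).mean(axis=0)

X_toy = np.zeros((1, 2, 2, 3))  # one tiny dummy RGB image
raw_voters = [ConstantVoter(p) for p in (0.9, 0.4, 0.4)]

# Mean of the raw probabilities, discretized once at the end.
soft_mean = MeanEnsemble(raw_voters).classify(X_toy)
mean_label = (soft_mean >= 0.5) * 1

# Majority: discretize each voter first, then average and discretize again.
hard_voters = [ThresholdedVoter(v) for v in raw_voters]
majority_label = (MeanEnsemble(hard_voters).classify(X_toy) >= 0.5) * 1
```

In this toy case the mean of the probabilities is about 0.57, so `mean_label` is 1 everywhere, while only one of the three voters exceeds the threshold, so `majority_label` is 0 everywhere. A single confident voter can sway the mean but not the majority.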

## Making Predictions

The function `nb_predict_masks` is a helper function provided in `util/notebooks.py`, while `masks_to_submission` is based on the implementation provided in the Kaggle competition.

The following two cells can be skipped if you do not want to generate the `.csv` file.

In [9]:
test_masks_dir = "test/pred/bagging/"
test_dir = "test/images/"

nb_predict_masks(voter_model_majority, test_dir, test_masks_dir)

Predicting test cases... 
Progress: done.


In [10]:
image_paths = [test_masks_dir + file for file in os.listdir(test_masks_dir)]
masks_to_submission("test/bagging.csv", image_paths)

This prediction achieved an F1 score of 0.9195 on Kaggle's public test set.

## Visualizing Predictions

The function `view_image_array` is provided in `util/visualize.py`. It uses `matplotlib` to visualize the images and the corresponding predictions.

In [11]:
Xt = X_test[0:10]

Y_pred = voter_model.classify(Xt)
Y_pred_dsc1 = voter_model_majority.classify(Xt)

view_image_array(Xt, Y_pred, Y_pred_dsc1)
