This notebook handles the data processing, modelling, predictions, and post-processing for the Unet with 'resnet34' architecture. The notebook is set up to run in Google Colab to access free GPU. To run the notebook, please upload the following dataset on your Google Drive: TODO

### Imports and setup

In [3]:
# Mounts drive in Colab
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Installing the segmentation models for Keras library

In [4]:
pip install -U segmentation-models

Collecting segmentation-models
  Downloading segmentation_models-1.0.1-py3-none-any.whl (33 kB)
Collecting keras-applications<=1.0.8,>=1.0.7
  Downloading Keras_Applications-1.0.8-py3-none-any.whl (50 kB)
[?25l[K     |██████▌                         | 10 kB 19.9 MB/s eta 0:00:01[K     |█████████████                   | 20 kB 15.7 MB/s eta 0:00:01[K     |███████████████████▍            | 30 kB 14.1 MB/s eta 0:00:01[K     |█████████████████████████▉      | 40 kB 10.4 MB/s eta 0:00:01[K     |████████████████████████████████| 50 kB 4.5 MB/s 
[?25hCollecting image-classifiers==1.0.0
  Downloading image_classifiers-1.0.0-py3-none-any.whl (19 kB)
Collecting efficientnet==1.0.0
  Downloading efficientnet-1.0.0-py3-none-any.whl (17 kB)
Installing collected packages: keras-applications, image-classifiers, efficientnet, segmentation-models
Successfully installed efficientnet-1.0.0 image-classifiers-1.0.0 keras-applications-1.0.8 segmentation-models-1.0.1


In [27]:
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import plot_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import segmentation_models as sm
sm.set_framework('tf.keras')
import os
import matplotlib.image as mpimg
from PIL import Image
import pandas as pd
import cv2
import math
from sklearn.metrics import f1_score, accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
import re

# Setting the style for plots using seaborn
sns.set()
sns.set_style("white")

### Helper functions
The helper functions are included here such that the notebook might exported and run in Colab.

In [25]:
def patch_to_label(patch, thr):
  ''' Converting a patch to road if the average pixel value in the patch is larger than the threshold

  Parameters
  ------------
  patch: ndarray
    An array with predictions or values for a patch
  thr: float
    The threshold for converting a patch to road
  
  Returns
  --------
  value: int
    1 if the patch is classified as road, 0 otherwise
  '''

  df = np.mean(patch)
  value = 0
  if df > thr:
    value = 1
  
  return value

def window_predict(img, model):
    ''' Predicting segmentation on an image using the window method, i.e. predicting on 256x256 crops of the image.
    
    Parameters
    ------------
    img: ndarray
        An image that should be segmented
    model: Keras model
      The model that should make predictions
    
    Returns
    ------------
    pred: ndarray
        The predicted image
    '''

    # Cropping the images into images of size 256x256
    img1 = img[0:256,0:256,:]
    img2 = img[0:256,256:512,:]
    img3 = img[0:256,352:608,:]
    img4 = img[256:512,0:256,:]
    img5 = img[256:512,256:512,:]
    img6 = img[256:512,352:608,:]
    img7 = img[352:608,0:256,:]
    img8 = img[352:608,256:512,:]
    img9 = img[352:608,352:608,:]

    # Predicting on each of the cropped images
    pred_1 = model.predict(np.expand_dims(img1, axis=0))[0]
    pred_2 = model.predict(np.expand_dims(img2, axis=0))[0]
    pred_3 = model.predict(np.expand_dims(img3, axis=0))[0]
    pred_4 = model.predict(np.expand_dims(img4, axis=0))[0]
    pred_5 = model.predict(np.expand_dims(img5, axis=0))[0]
    pred_6 = model.predict(np.expand_dims(img6, axis=0))[0]
    pred_7 = model.predict(np.expand_dims(img7, axis=0))[0]
    pred_8 = model.predict(np.expand_dims(img8, axis=0))[0]
    pred_9 = model.predict(np.expand_dims(img9, axis=0))[0]

    # Cropping the images which are to the right in the original image
    pred_3 = pred_3[:,160:256,:]
    pred_6 = pred_6[:,160:256,:]
    pred_9 = pred_9[:,160:256,:]

    # Stacking images horizontally into three parts, top, middle, and bottom of the original image
    top = np.hstack([np.hstack([pred_1,pred_2]),pred_3])
    middle = np.hstack([np.hstack([pred_4,pred_5]),pred_6])
    bottom = np.hstack([np.hstack([pred_7,pred_8]),pred_9])

    # Cropping the bottom
    bottom = bottom[160:256,:,:]

    # Stacking top, middle, and bottom to create finished prediction
    pred = np.vstack([np.vstack([top, middle]), bottom])
    
    return pred


def save_predictions(img, name):
  ''' Saving an image to file.
  
  Parameters
  -----------
  img: ndarray
    The image that should be saved
  name: string
    The filename for the image
  '''

  # Converting the image from one to three channels
  w = img.shape[0]
  h = img.shape[1]
  gt_img_3c = np.zeros((w, h, 3), dtype=np.uint8)
  gt_img8 = img_float_to_uint8(img)          
  gt_img_3c[:, :, 0] = gt_img8[:,:,0]
  gt_img_3c[:, :, 1] = gt_img8[:,:,0]
  gt_img_3c[:, :, 2] = gt_img8[:,:,0]

  # Saving the image
  cv2.imwrite('/content/%s.png'%(name), gt_img_3c)


def mask_to_submission_strings(image_filename, thr):
    """ Reading a single image and outputs the strings that should go into the submission file.
    
    Parameters
    ------------
    image_filename: string
      The image filename
    thr: float
      The threshold for converting a patch to label
    
    Yields
    --------
    A formatted prediction string
    """

    img_number = int(re.search(r"\d+", image_filename).group(0))
    im = mpimg.imread(image_filename)
    patch_size = 16
    for j in range(0, im.shape[1], patch_size):
        for i in range(0, im.shape[0], patch_size):
            patch = im[i:i + patch_size, j:j + patch_size]
            label = patch_to_label(patch, thr)
            yield("{:03d}_{}_{},{}".format(img_number, j, i, label))


def masks_to_submission(submission_filename, thr, *image_filenames):
    """ Converting images into a submission file.
    
    Parameters
    ------------
    submission_filename: string
      the name of the submission file
    thr: float
      The threshold for converting a patch to label
    *image_filenames: list
      list of the image filnames that should be included in the prediction
    """
    
    with open(submission_filename, 'w') as f:
        f.write('id,prediction\n')
        for fn in image_filenames[0:]:
            f.writelines('{}\n'.format(s) for s in mask_to_submission_strings(fn, thr))


def img_float_to_uint8(img):
    ''' Converting image array with floats to uint8
    
    parameters
    -----------
    img: ndarray
        image array
    
    returns
    -------
    rimg: ndarray
        converted array'''

    rimg = img - np.min(img)
    rimg = (rimg / np.max(rimg) * 255).round().astype(np.uint8)
    return rimg


def extract_data(folderpath):
    """ (ETH) Extracting the images into a 4D tensor [image index, y, x, channels].
    Values are rescaled from [0, 255] down to [-0.5, 0.5].

    Parameters
    ----------
    filename: string
        The name of the image file
    num_images: int
        The number of images that should be extracted

    Returns
    -------
    data: ndarray
        A numpy array containting the images
    """

    files = os.listdir(folderpath)
    n = len(files)
    imgs = [(mpimg.imread(folderpath+files[i])) for i in range(n)]
    data = np.asarray(imgs)
    return data

def extract_data_test(folderpath):
    """ (ETH) Extracting the images into a 4D tensor [image index, y, x, channels].
    Values are rescaled from [0, 255] down to [-0.5, 0.5].

    Parameters
    ----------
    filename: string
        The name of the image file
    num_images: int
        The number of images that should be extracted

    Returns
    -------
    data: ndarray
        A numpy array containting the images
    """

    imgs=[]
    for i in range(1,51):
      img = mpimg.imread(folderpath+'test_%d.png'%i)
      imgs.append(img)
    data = np.asarray(imgs)
    return data


def extract_labels(folderpath):
    """ (ETH) Extracting the labels into a 1-hot matrix [image index, label index].
    
    Parameters
    ----------
    filename: string
        The name of the image file
    num_images: int
        The number of images
    
    Returns
    --------
    labels: ndarray
        1-hot matrix [image index, label index]
    """

    gt_imgs = []
    files = os.listdir(folderpath)
    n = len(files)
    for i in range(n):
        img = mpimg.imread(folderpath+files[i])
        try:
            gt_imgs.append(img[:,:,0])
        except:
            gt_imgs.append(img)

    return np.asarray(gt_imgs)

def test_threshold(preds, gts, min, max):
  """ Finding the optimal threshold for labeling a patch as road. Searches with stepsize 0.01.

  parameters
  -----------
  preds: ndarray
    Pixelwise label predictions of an iamge
  gts:
    Pixelwise labels for the original image
  min:
    100 times the value to start searching for an optimum
  max:
    100 times the value for stopping searching for an optimum
  """
  
  # Defining a list of potential thresholds
  thresholds = [0.01*i for i in range(min, max)]
  
  # List for saving f1-scores
  f1s = []
  
  # Saving highest f1-score 
  highest = 0
  foreground_threshold = 0
  
  # Iterating through each threshold and calculating F1-score and accuracy
  for thr in thresholds:

    # Converting pixelwise predictions to patchwise predictions
    y_pred_flattened = []
    for im in preds:
      for j in range(0, im.shape[1], patch_size):
            for i in range(0, im.shape[0], patch_size):
                patch = im[i:i + patch_size, j:j + patch_size]
                label = patch_to_label(patch, thr)
                y_pred_flattened.append(label)
    y_pred_flattened = np.array(y_pred_flattened)

    # Converting mask to patchwise values
    y_val_flattened = []
    for im in gts:
      for j in range(0, im.shape[1], patch_size):
            for i in range(0, im.shape[0], patch_size):
                patch = im[i:i + patch_size, j:j + patch_size]
                label = patch_to_label(patch, thr)
                y_val_flattened.append(label)

    # Calculating and storing f1-score and accuracy
    f1 = f1_score(y_val_flattened, y_pred_flattened)
    acc = accuracy_score(y_val_flattened, y_pred_flattened)
    f1s.append(f1)

    # Setting foreground_threshold for future predcitions to thr if thr gives the best f1-score
    if f1>highest:
      foreground_threshold=thr
      highest = f1

  print("The best threshold is: %.2f and achieves a F1-score of : %.4f"%(foreground_threshold, highest))
  return foreground_threshold

### Extracting data and masks

Unzipping the data folders for quick access in Colab

In [7]:
!unzip "/content/drive/My Drive/road_segmentation/data.zip" -d "/content"

Archive:  /content/drive/My Drive/road_segmentation/data.zip
   creating: /content/data_volt/training/
   creating: /content/data_volt/training/groundtruth/
  inflating: /content/data_volt/training/groundtruth/satImage_001.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug00.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug01.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug02.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug03.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug04.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug05.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug06.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug07.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug08.png  
  inflating: /content/data_volt/training/groundtruth/satImage_001_Aug

In [8]:
!unzip "/content/drive/My Drive/road_segmentation/testing.zip" -d "/content"

Archive:  /content/drive/My Drive/road_segmentation/testing.zip
  inflating: /content/testing/test_1.png  
  inflating: /content/testing/test_10.png  
  inflating: /content/testing/test_11.png  
  inflating: /content/testing/test_12.png  
  inflating: /content/testing/test_13.png  
  inflating: /content/testing/test_14.png  
  inflating: /content/testing/test_15.png  
  inflating: /content/testing/test_16.png  
  inflating: /content/testing/test_17.png  
  inflating: /content/testing/test_18.png  
  inflating: /content/testing/test_19.png  
  inflating: /content/testing/test_2.png  
  inflating: /content/testing/test_20.png  
  inflating: /content/testing/test_21.png  
  inflating: /content/testing/test_22.png  
  inflating: /content/testing/test_23.png  
  inflating: /content/testing/test_24.png  
  inflating: /content/testing/test_25.png  
  inflating: /content/testing/test_26.png  
  inflating: /content/testing/test_27.png  
  inflating: /content/testing/test_28.png  
  inflating: /

In [9]:
# Defining paths to images and masks
train_data_path = '/content/data_volt/training/images/'
train_labels_path = '/content/data_volt/training/groundtruth/'

# Extracting the data and masks
x = extract_data(train_data_path)
y = extract_labels(train_labels_path)

### Model

In [10]:
# Defining backbone for the model
BACKBONE = 'resnet34'

# Downloading preprocessing function for the model
preprocess_input = sm.get_preprocessing(BACKBONE)

patch_size = 16


In [26]:
num_models = 5

for i in range(num_models):
    # Splitting the dataset into two, one training set and one validation set
    x_val, y_val = x[340*i:340*(i+1)], y[340*i:340*(i+1)]
    x_train, y_train = x[np.isin(np.arange(len(x)), np.arange(340*0,340*(0+1)), invert=True)], y[np.isin(np.arange(len(y)), np.arange(340*0,340*(0+1)), invert=True)]

    # Sreprocessing training and validation data
    x_train = preprocess_input(x_train)
    x_val = preprocess_input(x_val)

    # Defining model, using 'imagenet' as weights to converge faster
    model = sm.Unet(BACKBONE, encoder_weights='imagenet', input_shape=(256, 256, 3))

    # Adding  L2 kernel regularizer
    sm.utils.set_regularization(model, kernel_regularizer=keras.regularizers.l2(1))

    # Compiling the model using Adam optimizer and Binary Cross Entropy with Jaccard loss
    model.compile(
        'Adam',
        loss=sm.losses.bce_jaccard_loss,
        metrics=[sm.metrics.iou_score, sm.metrics.FScore(),'accuracy'],
    )

    # Saving the model thats scores best on the validation data
    callbacks = [keras.callbacks.ModelCheckpoint("m%d.h5"%(i+1), save_best_only=True)]
    print("Training model %d\n"%(i+1))
    
    # Training the model for 50 epochs with batch size = 32
    history = model.fit(x=x_train, y=y_train,
      epochs=50, batch_size=32,
      callbacks=callbacks,
      validation_data=(x_val,y_val)
    )

    # Testing the model and finding optimal threshold
    model = load_model('m%d.h5'%(i+1), custom_objects
                   = {'binary_crossentropy_plus_jaccard_loss':sm.losses.bce_jaccard_loss, 
                      'iou_score': sm.metrics.iou_score, 'f1-score': sm.metrics.FScore()})
    
    # Generating predictions on validation set
    y_pred = model.predict(x_val)
    
    # Finding optimal threshold on validation set
    thr = test_threshold(y_pred, y_val, 0, 30)

    print('\n')
    print('Creating predictions for model %d'%(i+1))
    
    # Generating predictions on test set
    test_images = extract_data_test('/content/testing/')

    test_images = preprocess_input(test_images)

    # Generating and saving predictions for the test images
    for k in range(len(test_images)):
      pred = window_predict(test_images[k], model)
      save_predictions(pred, 'test%d'%(k+1))

    # Generating the prediction file for the test set
    submission_filename = 'm%d_pred.csv' % (i+1)
    image_filenames = []
    for j in range(1, 51):
        image_filename = '/content/test%d.png' % j
        image_filenames.append(image_filename)
    masks_to_submission(submission_filename, thr, *image_filenames)


  layer_config = serialize_layer_fn(layer)


Training model 1

The best threshold is: 0.00 and achieves a F1-score of : 0.1068


Creating predictions for model 1


  layer_config = serialize_layer_fn(layer)


Training model 2

The best threshold is: 0.00 and achieves a F1-score of : 0.5193


Creating predictions for model 2


  layer_config = serialize_layer_fn(layer)


Training model 3

The best threshold is: 0.00 and achieves a F1-score of : 0.2368


Creating predictions for model 3


  layer_config = serialize_layer_fn(layer)


Training model 4

The best threshold is: 0.00 and achieves a F1-score of : 0.5360


Creating predictions for model 4


  layer_config = serialize_layer_fn(layer)


Training model 5

The best threshold is: 0.00 and achieves a F1-score of : 0.5268


Creating predictions for model 5


In [28]:
# Filepaths to the five models' predictions on the test set
X1 = '/content/m1_pred.csv'
X2 = '/content/m2_pred.csv'
X3 = '/content/m3_pred.csv'
X4 = '/content/m4_pred.csv'
X5 = '/content/m5_pred.csv'

Finally, we create the ensemble model.

In [30]:
# Reading the models' prediction into five dataframes

df1 = pd.read_csv(X1)
df1 = df1.set_index(['id'])
df1 = df1.rename({'prediction':'p1'},axis=1)

df2 = pd.read_csv(X2)
df2 = df2.set_index(['id'])
df2 = df2.rename({'prediction':'p2'},axis=1)

df3 = pd.read_csv(X3)
df3 = df3.set_index(['id'])
df3 = df3.rename({'prediction':'p3'},axis=1)

df4 = pd.read_csv(X4)
df4 = df4.set_index(['id'])
df4 = df4.rename({'prediction':'p4'},axis=1)

df5 = pd.read_csv(X5)
df5 = df5.set_index(['id'])
df5 = df5.rename({'prediction':'p5'},axis=1)

# Dataframe containing the prediction of all models for each patch
df = pd.concat([df1,df2,df3,df4,df5], axis=1)

# Inspecting the dataframe to ensure correct loading
df.head()

# Generating predictions, predicting road if all models predict road
df['prediction'] = df.apply(lambda x: 1 if np.sum(x)>4 else 0, axis=1)

# Extracting only the prediction column
df = df['prediction']

# Writing predctions to csv
df.to_csv('/content/ensemble.csv')