# U-Net for vehicle detection 

## This is the notebook with the training code



## Overview

In this notebook, we implement and train a U-Net to detect vehicles in a video stream of images provided by Udacity. U-Net is an encoder-decoder network for pixel-wise prediction. What sets U-Nets apart from standard ConvNets is that the feature maps from the contracting (downsampling) path are concatenated with the feature maps of matching resolution in the expanding (up-convolution) path. This lets the network combine features from the lower layers with the features produced by up-convolution. The expanding path makes training more demanding, because it needs considerably more memory than a standard ConvNet that only downsamples. U-Nets are used extensively in biomedical applications, for example to detect cancer and kidney pathologies and to track cells, and have proven to be a very powerful segmentation tool.

<img src='output_images/u-net-architecture.png'>
U-net, taken from http://lmb.informatik.uni-freiburg.de/Publications/2015/RFB15a/ 
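
The concatenation of encoder and decoder features can be illustrated with plain NumPy. This is only a shape-level sketch, not the network trained in this notebook: `max_pool2x` and `upsample2x` are hypothetical helpers that mimic 2x2 max pooling and nearest-neighbour upsampling on a single (rows, cols, channels) feature map.

```python
import numpy as np

def max_pool2x(x):
    """2x2 max pooling on a (rows, cols, channels) array with even rows and cols."""
    rows, cols, ch = x.shape
    return x.reshape(rows // 2, 2, cols // 2, 2, ch).max(axis=(1, 3))

def upsample2x(x):
    """Nearest-neighbour 2x upsampling along rows and cols."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
encoder_features = rng.random((8, 12, 16))   # feature map from the contracting path
deeper = max_pool2x(encoder_features)        # half the spatial resolution
decoder_features = upsample2x(deeper)        # back to the encoder resolution

# Skip connection: stack decoder and encoder features along the channel axis
merged = np.concatenate([decoder_features, encoder_features], axis=2)
print(deeper.shape, decoder_features.shape, merged.shape)
```

The merged map has twice as many channels as either input, which is one reason the expanding path is memory-hungry.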


The input to the U-Net is a 3-channel RGB image resized to 960x640, and the output is a 960x640 1-channel mask of predictions. We want each prediction to reflect the probability that a pixel belongs to a vehicle, so the last layer uses a sigmoid activation.
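
As a small illustration of how a sigmoid output can be read as a per-pixel probability, the sketch below squashes a few made-up pre-activation values into (0, 1) and turns them into a binary mask. The 0.5 threshold and the example values are assumptions for illustration only; this notebook does not fix a threshold.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up pre-activation values for a 2x2 patch of pixels
logits = np.array([[-4.0, -0.5],
                   [0.5, 4.0]])
probs = sigmoid(logits)                   # per-pixel "vehicle" probability
mask = (probs > 0.5).astype(np.uint8)     # assumed threshold of 0.5
print(probs.round(3))
print(mask)
```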

The solution in this notebook is based on the original research paper on [U-net](http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/) and the prize-winning submission to Kaggle’s ultrasound segmentation challenge. The U-Net code is based on this repository: https://github.com/orobix/retina-unet


## The Data

We used the annotated vehicle data set provided by [Udacity](https://www.udacity.com/). The [4.5 GB data set](https://github.com/udacity/self-driving-car/tree/master/annotation) is composed of frames collected from two videos recorded while driving the Udacity car around the Mountain View area in heavy traffic. The data set contains a label file with bounding boxes marking other cars, trucks and pedestrians. The entire data set comprises about 22000 images. We combined cars and trucks into one class, vehicle, and dropped all the bounding boxes for pedestrians. For each image a set of bounding boxes is provided.



## Data preparation and augmentation

The frames were obtained from a video feed, so shuffling is very important. The data is split into training and testing sets, with 2000 images used for testing. The following augmentations are applied to the training data:

- translation, to account for cars being at different locations
- brightness, to account for different lighting conditions
- stretching
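
Because the label file has one row per bounding box and consecutive video frames look alike, one way to shuffle is at the frame level, so that all boxes of a frame end up on the same side of the split. The sketch below is an assumption about how this could be done, not the split used by the generators in this notebook (those hold out the last 2000 rows); `split_frames` and the toy file names are hypothetical.

```python
import numpy as np

def split_frames(frame_names, n_test, seed=0):
    """Shuffle the unique frame names and hold out n_test of them for testing."""
    unique_frames = np.unique(frame_names)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unique_frames)
    return shuffled[n_test:], shuffled[:n_test]

# Toy label table: one entry per bounding box, several boxes may share a frame
frames = np.array(['a.jpg', 'a.jpg', 'b.jpg', 'c.jpg', 'c.jpg', 'd.jpg'])
train_frames, test_frames = split_frames(frames, n_test=2)

# Boolean row masks that could be used to index a label DataFrame
train_rows = np.isin(frames, train_frames)
test_rows = np.isin(frames, test_frames)
print(train_frames, test_frames)
```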


## Training 
The goal is to train the U-Net to identify the bounding boxes of the cars in an image, that is, to learn to generate a mask that highlights all cars in the image. I chose image sizes of 640x960 and 480x720 for training. For the larger image size I trained on a Titan X Pascal GPU with a batch size of 8 (larger batch sizes raise an out-of-memory error), and for the smaller image size I trained on a GTX 1080 GPU with a batch size of 8. The training time per epoch (1000 samples) was about 160 s and 106 s, respectively. I trained for several hundred epochs. I used Keras with the TensorFlow backend and an approximate Intersection over Union (IoU) between the network output and the target mask as the objective function.
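
The objective defined later in this notebook (`IOU_calc`) has the form 2·intersection / (sum of both masks), which is the Dice coefficient rather than the exact IoU. The NumPy sketch below, using two made-up binary masks and no smoothing term, shows how the two quantities relate: IoU = Dice / (2 - Dice). The function names `dice` and `iou` are illustrative and not part of the notebook.

```python
import numpy as np

def dice(y_true, y_pred):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|)."""
    inter = np.sum(y_true * y_pred)
    return 2.0 * inter / (np.sum(y_true) + np.sum(y_pred))

def iou(y_true, y_pred):
    """Intersection over Union: |A and B| / |A or B|."""
    inter = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - inter
    return inter / union

# Two made-up 2x4 binary masks that overlap in one column
y_true = np.array([[1, 1, 0, 0], [1, 1, 0, 0]], dtype=float)
y_pred = np.array([[0, 1, 1, 0], [0, 1, 1, 0]], dtype=float)

d = dice(y_true, y_pred)
j = iou(y_true, y_pred)
print(d, j, d / (2.0 - d))
```

Since the mapping between the two is monotonic, maximizing one also maximizes the other, but the reported values are not interchangeable.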

## Remarks

As in many cases, preparing the data and training a ConvNet is quite straightforward. The only differences here are the special U-Net structure and the fact that we train a deep network for image segmentation rather than image classification. Compared to the classical image processing approach in the CarND P5 project, with its complicated feature engineering (color histograms, HOG) and sliding-window technique, this approach is fairly simple and requires no tuning at this stage. We will see in the car detection notebook that using the pretrained U-Net model is simple, powerful and very robust.

## Additional links:

1. U-Net: Convolutional Networks for Biomedical Image Segmentation: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
2. Good collection of various segmentation models: https://handong1587.github.io/deep_learning/2015/10/09/segmentation.html
3. Original prize-winning submission to Kaggle: https://github.com/jocicmarko/ultrasound-nerve-segmentation



In [None]:
# Import libraries
import cv2
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import numpy as np
%matplotlib inline
import glob

from keras.models import Model
from keras.layers import Input, merge, Convolution2D, MaxPooling2D, UpSampling2D, Reshape, core, Dropout,Lambda
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as K
from scipy.ndimage import label



In [None]:
dir_label = ['object-dataset','object-detection-crowdai']

In [None]:
# Load label data from the CrowdAI source

import pandas as pd

df_files1 = pd.read_csv(dir_label[1]+'/labels.csv', header=0)
df_vehicles1 = df_files1[(df_files1['Label']=='Car') | (df_files1['Label']=='Truck')].reset_index()
df_vehicles1 = df_vehicles1.drop('index', axis=1)
df_vehicles1['File_Path'] =  dir_label[1] + '/' +df_vehicles1['Frame']
df_vehicles1 = df_vehicles1.drop('Preview URL', axis=1)
print(dir_label[1])
df_vehicles1.head()

In [None]:
### Load label data from the second source
### Renamed frames and labels to match the CrowdAI source
names=['Frame',  'xmin', 'xmax', 'ymin','ymax', 'ind', 'Label','RM']#, 'Color']
# Raw string so that '\s' is passed to the regex engine unchanged
df_files2 = pd.read_csv('object-dataset/labels.csv', header=None,names=names,sep=r'\s+')
#df_files2.columns= ['Frame',  'xmin', 'xmax', 'ymin','ymax', 'ind', 'Label','RM', 'XX']
df_vehicles2 = df_files2[(df_files2['Label']=='car') | (df_files2['Label']=='truck')].reset_index()
df_vehicles2 = df_vehicles2.drop('index', axis=1)
df_vehicles2 = df_vehicles2.drop('RM', axis=1)
df_vehicles2 = df_vehicles2.drop('ind', axis=1)

df_vehicles2['File_Path'] = dir_label[0] + '/' +df_vehicles2['Frame']

df_vehicles2.head()

In [None]:
### concat both data frames
df_vehicles = pd.concat([df_vehicles1,df_vehicles2]).reset_index()
df_vehicles = df_vehicles.drop('index', axis=1)
df_vehicles.columns =['File_Path','Frame','Label','ymin','xmin','ymax','xmax']
df_vehicles.head()

In [None]:
# Augmentation functions 

def translate_image(image,bb_boxes_f,trans_range):
    bb_boxes_f = bb_boxes_f.copy(deep=True)
    tr_x = trans_range*np.random.uniform()-trans_range/2
    tr_y = trans_range*np.random.uniform()-trans_range/2

    Trans_M = np.float32([[1,0,tr_x],[0,1,tr_y]])
    rows,cols,channels = image.shape
    bb_boxes_f['xmin'] = bb_boxes_f['xmin']+tr_x
    bb_boxes_f['xmax'] = bb_boxes_f['xmax']+tr_x
    bb_boxes_f['ymin'] = bb_boxes_f['ymin']+tr_y
    bb_boxes_f['ymax'] = bb_boxes_f['ymax']+tr_y    
    image_tr = cv2.warpAffine(image,Trans_M,(cols,rows))    
    return image_tr,bb_boxes_f


def stretch_image(img,bb_boxes_f,scale_range):
    bb_boxes_f = bb_boxes_f.copy(deep=True)
    
    tr_x1 = scale_range*np.random.uniform()
    tr_y1 = scale_range*np.random.uniform()
    p1 = (tr_x1,tr_y1)
    tr_x2 = scale_range*np.random.uniform()
    tr_y2 = scale_range*np.random.uniform()
    p2 = (img.shape[1]-tr_x2,tr_y1)

    p3 = (img.shape[1]-tr_x2,img.shape[0]-tr_y2)
    p4 = (tr_x1,img.shape[0]-tr_y2)

    pts1 = np.float32([[p1[0],p1[1]],[p2[0],p2[1]],
                   [p3[0],p3[1]],[p4[0],p4[1]]])
    pts2 = np.float32([[0,0],[img.shape[1],0], [img.shape[1],img.shape[0]],
                   [0,img.shape[0]] ] )

    M = cv2.getPerspectiveTransform(pts1,pts2)
    img = cv2.warpPerspective(img,M,(img.shape[1],img.shape[0]))
    img = np.array(img,dtype=np.uint8)
    
    bb_boxes_f['xmin'] = (bb_boxes_f['xmin'] - p1[0])/(p2[0]-p1[0])*img.shape[1]
    bb_boxes_f['xmax'] = (bb_boxes_f['xmax'] - p1[0])/(p2[0]-p1[0])*img.shape[1]
    bb_boxes_f['ymin'] = (bb_boxes_f['ymin'] - p1[1])/(p3[1]-p1[1])*img.shape[0]
    bb_boxes_f['ymax'] = (bb_boxes_f['ymax'] - p1[1])/(p3[1]-p1[1])*img.shape[0]
    
    return img,bb_boxes_f

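def augment_brightness(image):
    # Scale the V channel in HSV space by a random factor in [0.25, 1.25)
    image1 = cv2.cvtColor(image,cv2.COLOR_RGB2HSV)
    random_bright = .25+np.random.uniform()
    # Clip before writing back to the uint8 array so that bright pixels
    # saturate at 255 instead of wrapping around
    image1[:,:,2] = np.clip(image1[:,:,2]*random_bright, 0, 255)
    image1 = cv2.cvtColor(image1,cv2.COLOR_HSV2RGB)
    return image1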


def load_image_name(df,ind,size=(640,300),augmentation = False,trans_range = 20,scale_range=20):
    # Load one image, resize it to `size` (width, height) and return its
    # bounding boxes in the pixel coordinates of the resized image
    file_name = df['File_Path'][ind]
    img = cv2.imread(file_name)
    img_size = np.shape(img)
    
    img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
    img = cv2.resize(img,size)
    name_str = file_name.split('/')
    name_str = name_str[-1]
    bb_boxes = df[df['Frame'] == name_str].reset_index()
    img_size_post = np.shape(img)
    
    # Rescale the boxes to the resized image first, so that the augmentations
    # below shift and stretch the image and the boxes in the same pixel units
    bb_boxes['xmin'] = bb_boxes['xmin']/img_size[1]*img_size_post[1]
    bb_boxes['xmax'] = bb_boxes['xmax']/img_size[1]*img_size_post[1]
    bb_boxes['ymin'] = bb_boxes['ymin']/img_size[0]*img_size_post[0]
    bb_boxes['ymax'] = bb_boxes['ymax']/img_size[0]*img_size_post[0]
    
    if augmentation:
        img,bb_boxes = translate_image(img,bb_boxes,trans_range)
        img,bb_boxes = stretch_image(img,bb_boxes,scale_range)
        img = augment_brightness(img)
        
    for col in ['xmin','xmax','ymin','ymax']:
        bb_boxes[col] = np.round(bb_boxes[col])
    bb_boxes['Area'] = (bb_boxes['xmax']- bb_boxes['xmin'])*(bb_boxes['ymax']- bb_boxes['ymin']) 
    return name_str,img,bb_boxes


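def get_mask_segmentation(img,bb_boxes_f):
    # Rasterize the bounding boxes into a binary mask of shape (rows, cols, 1)
    img_mask = np.zeros_like(img[:,:,0])
    for i in range(len(bb_boxes_f)):
        # Cast to int and clamp at 0: slice indices must be integers, and a
        # negative start (box shifted off-image by augmentation) would wrap around
        x0,y0,x1,y1 = [max(int(bb_boxes_f.iloc[i][c]),0)
                       for c in ['xmin','ymin','xmax','ymax']]
        img_mask[y0:y1,x0:x1]= 1
    # Add the channel axis once, after all boxes have been drawn
    img_mask = np.reshape(img_mask,(np.shape(img_mask)[0],np.shape(img_mask)[1],1))
    return img_mask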

In [None]:
def plot_bbox(bb_boxes,ind_bb,color='r',linewidth=2):
    bb_box_i = [bb_boxes.iloc[ind_bb]['xmin'],
                bb_boxes.iloc[ind_bb]['ymin'],
                bb_boxes.iloc[ind_bb]['xmax'],
                bb_boxes.iloc[ind_bb]['ymax']]
    plt.plot([bb_box_i[0],bb_box_i[2],bb_box_i[2],bb_box_i[0],bb_box_i[0]],
             [bb_box_i[1],bb_box_i[1],bb_box_i[3],bb_box_i[3],bb_box_i[1]],
             color,linewidth=linewidth)
    
def plot_img_bbox(img,bb_boxes):
    # Show the image with all of its bounding boxes drawn in blue
    plt.imshow(img)
    for i in range(len(bb_boxes)):
        plot_bbox(bb_boxes,i,'b')
    plt.axis('off');

def plot_image_mask(img,img_mask):
    img = np.array(img,dtype=np.uint8)
    img_mask = np.array(img_mask,dtype=np.uint8)
    plt.figure(figsize=(12,8))
    plt.subplot(1,3,1)
    plt.imshow(img)
    plt.axis('off')
    plt.subplot(1,3,2)
    plt.imshow(img_mask[:,:,0])
    plt.axis('off')
    plt.subplot(1,3,3)
    plt.imshow(cv2.bitwise_and(img,img,mask=img_mask));
    plt.axis('off')
    plt.show();


In [None]:
# Testing the perspective stretch by hand with fixed offsets

name,img,bb_boxes = load_image_name(df_vehicles,1,augmentation=False,trans_range=10,scale_range=0)

tr_x1 = 80
tr_y1 = 30
tr_x2 = 40
tr_y2 = 20

p1 = (tr_x1,tr_y1)
p2 = (img.shape[1]-tr_x2,tr_y1)

p3 = (img.shape[1]-tr_x2,img.shape[0]-tr_y2)
p4 = (tr_x1,img.shape[0]-tr_y2)

pts1 = np.float32([[p1[0],p1[1]],
                   [p2[0],p2[1]],
                   [p3[0],p3[1]],
                   [p4[0],p4[1]]])
pts2 = np.float32([[0,0],
                   [img.shape[1],0],
                   [img.shape[1],img.shape[0]],[0,img.shape[0]] ]
                   )

M = cv2.getPerspectiveTransform(pts1,pts2)
dst = cv2.warpPerspective(img,M,(img.shape[1],img.shape[0]))
dst = np.array(dst,dtype=np.uint8)


plt.figure(figsize=(18,12))
plt.subplot(1,2,1)
plt.imshow(img)
for i in range(len(bb_boxes)):
    plot_bbox(bb_boxes,i,'b')
    
    bb_box_i = [bb_boxes.iloc[i]['xmin'],bb_boxes.iloc[i]['ymin'],
                bb_boxes.iloc[i]['xmax'],bb_boxes.iloc[i]['ymax']]
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(dst)
bb_boxes1 = bb_boxes.copy(deep=True)
bb_boxes1['xmin'] = (bb_boxes['xmin'] - p1[0])/(p2[0]-p1[0])*img.shape[1]
bb_boxes1['xmax'] = (bb_boxes['xmax'] - p1[0])/(p2[0]-p1[0])*img.shape[1]
bb_boxes1['ymin'] = (bb_boxes['ymin'] - p1[1])/(p3[1]-p1[1])*img.shape[0]
bb_boxes1['ymax'] = (bb_boxes['ymax'] - p1[1])/(p3[1]-p1[1])*img.shape[0]
plot_img_bbox(dst,bb_boxes1)

plt.axis('off');

In [None]:
#### Testing stretch_image and the mask generation

name,img,bb_boxes = load_image_name(df_vehicles,1,augmentation=False)
img_mask =get_mask_segmentation(img,bb_boxes)

plt.figure(figsize=(18,12))
plt.subplot(2,2,1)
plot_img_bbox(img,bb_boxes)

plt.subplot(2,2,2)
plt.imshow(img_mask[:,:,0])
plt.axis('off')

plt.subplot(2,2,3)
#bb_boxes1 = bb_boxes.copy()
dst,bb_boxes1 = stretch_image(img,bb_boxes,100)

plt.imshow(dst)

plot_img_bbox(dst,bb_boxes1)

plt.subplot(2,2,4)
img_mask2 =get_mask_segmentation(dst,bb_boxes1)
plt.imshow(img_mask2[:,:,0])
plt.axis('off');

In [None]:
name_str,img,bb_boxes = load_image_name(df_vehicles,1,augmentation=False)
img_mask =get_mask_segmentation(img,bb_boxes)
plt.figure(figsize=(12,8))
plt.imshow(img)
plot_img_bbox(img,bb_boxes)
plt.show()
plot_image_mask(img,img_mask)

In [None]:
# Training and test generators; the last 2000 rows of `data` are held out for testing
def generate_train_batch(data,batch_size = 32):
    # Yields augmented (images, masks) batches sampled from all but the last 2000 rows
    batch_images = np.zeros((batch_size, img_rows, img_cols, 3))
    batch_masks = np.zeros((batch_size, img_rows, img_cols, 1))
    while True:
        for i_batch in range(batch_size):
            i_line = np.random.randint(len(data)-2000)
            # Use the `data` argument rather than the global df_vehicles
            name,img,bb_boxes = load_image_name(data,i_line,size=(img_cols, img_rows),
                                                  augmentation=True, trans_range=50, scale_range=50)
            img_mask = get_mask_segmentation(img,bb_boxes)
            batch_images[i_batch] = img
            batch_masks[i_batch] =img_mask
        yield batch_images, batch_masks
        

def generate_test_batch(data,batch_size = 32):
    # Yields unaugmented (images, masks) batches sampled from the last 2000 rows
    batch_images = np.zeros((batch_size, img_rows, img_cols, 3))
    batch_masks = np.zeros((batch_size, img_rows, img_cols, 1))
    while True:
        for i_batch in range(batch_size):
            i_line = np.random.randint(2000)
            i_line = i_line+len(data)-2000
            name,img,bb_boxes = load_image_name(data,i_line, size=(img_cols, img_rows),
                                                  augmentation=False,  trans_range=0, scale_range=0 )
            img_mask = get_mask_segmentation(img,bb_boxes)
            batch_images[i_batch] = img
            batch_masks[i_batch] =img_mask
        yield batch_images, batch_masks

In [None]:
##### Image size, 
img_rows = 640
img_cols = 960

#img_rows = 480
#img_cols = 720
#img_rows = 320
#img_cols = 480


In [None]:
# Testing the generator
training_gen = generate_train_batch(df_vehicles,10)

In [None]:
batch_img,batch_mask = next(training_gen)

In [None]:
### Plotting generator output
for i in range(10):
    im = np.array(batch_img[i],dtype=np.uint8)
    im_mask = np.array(batch_mask[i],dtype=np.uint8)
    plt.figure(figsize=(12,8))
    plt.subplot(1,3,1)
    plt.imshow(im)
    plt.axis('off')
    plt.subplot(1,3,2)
    plt.imshow(im_mask[:,:,0])
    plt.axis('off')
    plt.subplot(1,3,3)
    plt.imshow(cv2.bitwise_and(im,im,mask=im_mask));
    plt.axis('off')
    plt.show();

In [None]:
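### IOU coeff and loss calculation
# 2*intersection / (sum(y_true) + sum(y_pred)) is the Dice coefficient, used
# here as a differentiable approximation of IoU; `smooth` avoids division by zero
smooth=1.0
def IOU_calc(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    
    return 2*(intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)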


def IOU_calc_loss(y_true, y_pred):
    return -IOU_calc(y_true, y_pred)



In [None]:
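# Based on the repository https://github.com/orobix/retina-unet,
# slightly modified to save GPU memory; the original code is kept in comments
def get_gnet(drop=0.0):
    inputs = Input((img_rows, img_cols,3))
    # NOTE: this normalization layer is created but never applied to `inputs`,
    # so the network is trained on raw 0-255 pixel values
    inputs_norm = Lambda(lambda x: x/127.5 - 1.)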
  #  conv1 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(inputs)  
    conv1 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(inputs)  
    conv1 = Dropout(drop)(conv1)
  #  conv1 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv1)
    conv1 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv1)
   # up1 = UpSampling2D(size=(2, 2))(conv1)
  
    conv2 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv1)
    conv2 = Dropout(drop)(conv2)
    conv2 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    #
    conv3 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(pool2)
    conv3 = Dropout(drop)(conv3)
    conv3 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    #
    conv4 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(pool3)
    conv4 = Dropout(drop)(conv4)
    conv4 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)
    #
    conv5 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(pool4)
    conv5 = Dropout(drop)(conv5)
    conv5 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(conv5)
    #
    up6 = merge([UpSampling2D(size=(2, 2))(conv5), conv4], mode='concat', concat_axis=3)
    conv6 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(up6)
    conv6 = Dropout(drop)(conv6)
    conv6 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv6)
    #
    up7 = merge([UpSampling2D(size=(2, 2))(conv6), conv3], mode='concat', concat_axis=3)
    conv7 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(up7)
    conv7 = Dropout(drop)(conv7)
    conv7 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv7)
    #
    up8 = merge([UpSampling2D(size=(2, 2))(conv7), conv2], mode='concat', concat_axis=3)
    conv8 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(up8)
    conv8 = Dropout(drop)(conv8)
    conv8 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv8)
    #
  #  pool4 = MaxPooling2D(pool_size=(2, 2))(conv8)

  #  conv9 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(pool4)
    conv9 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv8)
    conv9 = Dropout(drop)(conv9)
 #   conv9 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv9)
    conv9 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv9)

    #
    conv10 = Convolution2D(1, 1, 1, activation='sigmoid')(conv9)

    model = Model(input=inputs, output=conv10)

   
    return model



In [None]:
training_gen = generate_train_batch(df_vehicles,8)
model = get_gnet(drop=0.1)
model.summary()
model.compile(optimizer=Adam(lr=1e-4),  loss=IOU_calc_loss, metrics=[IOU_calc])
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.6839
model.save('model_Unet_640_960_e100.h5')

model.save_weights("model_Unet_Weights_640_960_e100.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7262 
model.save('model_Unet_640_960_e200.h5')

model.save_weights("model_Unet_Weights_640_960_e200.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7539
model.save('model_Unet_640_960_e300.h5')

model.save_weights("model_Unet_Weights_640_960_e300.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7617
model.save('model_Unet_640_960_e400.h5')

model.save_weights("model_Unet_Weights_640_960_e400.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7757
model.save('model_Unet_640_960_e500.h5')

model.save_weights("model_Unet_Weights_640_960_e500.h5", overwrite=True)
print("done")

In [None]:
def get_adopted_unet():
    inputs = Input((img_rows, img_cols,3))
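    # NOTE: this normalization layer is created but never applied to `inputs`,
    # so the network is trained on raw 0-255 pixel values
    inputs_norm = Lambda(lambda x: x/127.5 - 1.)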
    conv1 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(inputs)
    conv1 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(pool1)
    conv2 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    conv3 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(pool2)
    conv3 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

    conv4 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(pool3)
    conv4 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

    conv5 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(pool4)
    conv5 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(conv5)

    up6 = merge([UpSampling2D(size=(2, 2))(conv5), conv4], mode='concat', concat_axis=3)
    conv6 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(up6)
    conv6 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv6)

    up7 = merge([UpSampling2D(size=(2, 2))(conv6), conv3], mode='concat', concat_axis=3)
    conv7 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(up7)
    conv7 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv7)

    up8 = merge([UpSampling2D(size=(2, 2))(conv7), conv2], mode='concat', concat_axis=3)
    conv8 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(up8)
    conv8 = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(conv8)

    up9 = merge([UpSampling2D(size=(2, 2))(conv8), conv1], mode='concat', concat_axis=3)
    conv9 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(up9)
    conv9 = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(conv9)

    conv10 = Convolution2D(1, 1, 1, activation='sigmoid')(conv9)

    model = Model(input=inputs, output=conv10)

    
    return model
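
Each 2x2 max pooling halves the feature-map size and each 2x2 upsampling doubles it, so the concatenations only line up when the input height and width are divisible by 2 to the power of the number of pooling layers (three in `get_gnet`, four in `get_adopted_unet`). The helper below is a hypothetical sanity check, not part of the original code, that reports the size at the bottleneck for a candidate input size.

```python
def bottleneck_size(rows, cols, n_pools):
    """Return the feature-map size after n_pools 2x2 poolings.

    Raises ValueError if rows or cols is not divisible by 2**n_pools, in which
    case the upsampled maps would not match the encoder maps at the skip
    connections.
    """
    factor = 2 ** n_pools
    if rows % factor or cols % factor:
        raise ValueError(
            "input size %dx%d is not divisible by %d" % (rows, cols, factor))
    return rows // factor, cols // factor

# Image sizes listed in this notebook, checked against four pooling layers
for rows, cols in [(640, 960), (480, 720), (320, 480)]:
    print((rows, cols), '->', bottleneck_size(rows, cols, n_pools=4))
```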


In [None]:
training_gen = generate_train_batch(df_vehicles,16)
model = get_adopted_unet()
model.summary()
model.compile(optimizer=Adam(lr=1e-4),  loss=IOU_calc_loss, metrics=[IOU_calc])
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)


In [None]:
### Save model and weights IOU_calc: 0.6989
model.save('model_AdoptedUnet_640_960_e100.h5')

model.save_weights("model_AdoptedUnet_Weights_640_960_e100.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7480
model.save('model_AdoptedUnet_640_960_e200.h5')

model.save_weights("model_AdoptedUnet_Weights_640_960_e200.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7674
model.save('model_AdoptedUnet_640_960_e300.h5')

model.save_weights("model_AdoptedUnet_Weights_640_960_e300.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.7872
model.save('model_AdoptedUnet_640_960_e400.h5')
model.save_weights("model_AdoptedUnet_Weights_640_960_e400.h5", overwrite=True)
print("done")

In [None]:
# 100 more epochs
history = model.fit_generator(training_gen, samples_per_epoch=1000,  nb_epoch=100)

In [None]:
### Save model and weights IOU_calc: 0.8010
model.save('model_AdoptedUnet_640_960_e500.h5')
model.save_weights("model_AdoptedUnet_Weights_640_960_e500.h5", overwrite=True)
print("done")

# Reload GNet model

In [None]:
##### Image size, 
img_rows = 640
img_cols = 960

#img_rows = 480
#img_cols = 720


#recreate model and load trained weights
#del model
model = get_gnet()
model.compile(optimizer=Adam(lr=1e-4), loss=IOU_calc_loss, metrics=[IOU_calc])
model.load_weights("model_Unet_Weights_640_960_e500.h5")   
   
model.summary()


In [None]:
#### Function for drawing bounding boxes, taken from Udacity

def draw_labeled_bboxes(img, labels):
    # Iterate through all detected cars
    for car_number in range(1, labels[1]+1):
        # Find pixels with each car_number label value
        nonzero = (labels[0] == car_number).nonzero()
        # Identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        # Keep only regions larger than 50 px in both directions, then
        # define a bounding box based on min/max x and y
        if (np.max(nonzeroy)-np.min(nonzeroy)>50) and (np.max(nonzerox)-np.min(nonzerox)>50):
            bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
            # Draw the box on the image       
            cv2.rectangle(img, bbox[0], bbox[1], (0,0,255),6)
    # Return the image
    return img

def pred_for_img(img):
    img = cv2.resize(img,(img_cols, img_rows))
    img = np.reshape(img,(1,img_rows, img_cols,3))
    pred = model.predict(img)
    return pred,img[0]

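def get_BB_new_img(img):
    # Take in RGB image, predict its mask and draw a box around each connected region
    img  = np.array(img,dtype= np.uint8)
    # Compute the prediction here instead of relying on a global `pred`
    pred,img = pred_for_img(img)
    img_pred = np.array(255*pred[0],dtype=np.uint8)
    heatmap = img_pred[:,:,0]
    labels = label(heatmap)
    draw_img = draw_labeled_bboxes(np.copy(img), labels)
    return draw_img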
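
The step from a predicted mask to bounding boxes relies on connected-component labeling. The sketch below shows the idea on a tiny made-up heatmap using `scipy.ndimage.label` and `scipy.ndimage.find_objects`; it omits the 50-pixel size filter used in `draw_labeled_bboxes`, and the array values are invented for illustration.

```python
import numpy as np
from scipy.ndimage import label, find_objects

# Tiny made-up heatmap with two separate blobs of non-zero pixels
heatmap = np.zeros((6, 8), dtype=np.uint8)
heatmap[1:3, 1:4] = 255
heatmap[4:6, 5:8] = 255

# label() assigns an integer id to every connected region of non-zero pixels
labels, n_regions = label(heatmap)

# find_objects() returns one (row_slice, col_slice) pair per labeled region
boxes = []
for row_slice, col_slice in find_objects(labels):
    top_left = (col_slice.start, row_slice.start)               # (x, y)
    bottom_right = (col_slice.stop - 1, row_slice.stop - 1)     # (x, y), inclusive
    boxes.append((top_left, bottom_right))

print(n_regions, boxes)
```

Any non-zero pixel counts as foreground here, so in practice a threshold on the predicted probabilities may be needed before labeling.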

In [None]:
### Testing generator

testing_gen = generate_test_batch(df_vehicles,20)


In [None]:

%time pred_all= model.predict(batch_img)


In [None]:
### Test on last frames of data

batch_img,batch_mask = next(testing_gen)
pred_all= model.predict(batch_img)
np.shape(pred_all)

for i in range(20):
    
    im = np.array(batch_img[i],dtype=np.uint8)
    im_mask = np.array(255*batch_mask[i],dtype=np.uint8)
    im_pred = np.array(255*pred_all[i],dtype=np.uint8)
    
    rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
    rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]
    rgb_mask_true= cv2.cvtColor(im_mask,cv2.COLOR_GRAY2RGB)
    rgb_mask_true[:,:,0] = 0*rgb_mask_true[:,:,0]
    rgb_mask_true[:,:,2] = 0*rgb_mask_true[:,:,2]
    
    img_pred = cv2.addWeighted(rgb_mask_pred,0.5,im,0.5,0)
    img_true = cv2.addWeighted(rgb_mask_true,0.5,im,0.5,0)
    
    plt.figure(figsize=(12,6))
    plt.subplot(1,3,1)
    plt.imshow(im)
    plt.title('Original image')
    plt.axis('off')
    plt.subplot(1,3,2)
    plt.imshow(img_true)
    plt.title('Ground truth BB')
    plt.axis('off')
    plt.subplot(1,3,3)
    plt.imshow(img_pred)
    plt.title('Predicted segmentation mask')
    plt.axis('off')
    plt.show()


In [None]:
test_img = 'test_images/test5.jpg'
im = cv2.imread(test_img)
im = cv2.cvtColor(im,cv2.COLOR_BGR2RGB)
pred,im = pred_for_img(im)
im  = np.array(im,dtype= np.uint8)
im_pred = np.array(255*pred[0],dtype=np.uint8)
rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]

img_pred = cv2.addWeighted(rgb_mask_pred,0.85,im,1,0)


draw_img = get_BB_new_img(im)

plt.figure(figsize=(12,6))
plt.subplot(1,3,1)
plt.imshow(im)
plt.title('Original')
plt.axis('off')
plt.subplot(1,3,2)
plt.imshow(img_pred)
plt.title('Segmentation')
plt.axis('off')
plt.subplot(1,3,3)
plt.imshow(draw_img)
plt.title('Bounding Box')
plt.axis('off');



# Reload Adopted UNet

In [None]:
##### Image size, 
img_rows = 640
img_cols = 960

#img_rows = 480
#img_cols = 720


#recreate model and load trained weights
#del model
model = get_adopted_unet()
model.compile(optimizer=Adam(lr=1e-4), loss=IOU_calc_loss, metrics=[IOU_calc])
model.load_weights("model_AdoptedUnet_Weights_640_960_e500.h5")   
   
model.summary()

In [None]:
### Test on last frames of data

batch_img,batch_mask = next(testing_gen)
pred_all= model.predict(batch_img)
np.shape(pred_all)

for i in range(20):
    
    im = np.array(batch_img[i],dtype=np.uint8)
    im_mask = np.array(255*batch_mask[i],dtype=np.uint8)
    im_pred = np.array(255*pred_all[i],dtype=np.uint8)
    
    rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
    rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]
    rgb_mask_true= cv2.cvtColor(im_mask,cv2.COLOR_GRAY2RGB)
    rgb_mask_true[:,:,0] = 0*rgb_mask_true[:,:,0]
    rgb_mask_true[:,:,2] = 0*rgb_mask_true[:,:,2]
    
    img_pred = cv2.addWeighted(rgb_mask_pred,0.5,im,0.5,0)
    img_true = cv2.addWeighted(rgb_mask_true,0.5,im,0.5,0)
    
    plt.figure(figsize=(12,6))
    plt.subplot(1,3,1)
    plt.imshow(im)
    plt.title('Original image')
    plt.axis('off')
    plt.subplot(1,3,2)
    plt.imshow(img_true)
    plt.title('Ground truth BB')
    plt.axis('off')
    plt.subplot(1,3,3)
    plt.imshow(img_pred)
    plt.title('Predicted segmentation mask')
    plt.axis('off')
    plt.show()


In [None]:
test_img = 'test_images/test5.jpg'
im = cv2.imread(test_img)
im = cv2.cvtColor(im,cv2.COLOR_BGR2RGB)
pred,im = pred_for_img(im)
im  = np.array(im,dtype= np.uint8)
im_pred = np.array(255*pred[0],dtype=np.uint8)
rgb_mask_pred = cv2.cvtColor(im_pred,cv2.COLOR_GRAY2RGB)
rgb_mask_pred[:,:,1:3] = 0*rgb_mask_pred[:,:,1:2]

img_pred = cv2.addWeighted(rgb_mask_pred,0.85,im,1,0)


draw_img = get_BB_new_img(im)

plt.figure(figsize=(12,6))
plt.subplot(1,3,1)
plt.imshow(im)
plt.title('Original')
plt.axis('off')
plt.subplot(1,3,2)
plt.imshow(img_pred)
plt.title('Segmentation')
plt.axis('off')
plt.subplot(1,3,3)
plt.imshow(draw_img)
plt.title('Bounding Box')
plt.axis('off');