## Semantic Segmentation Project

### Introduction

In this project, the model labels the road pixels in images using a Fully Convolutional Network (FCN) built on a pre-trained VGG16 model.

### Dataset

The model is trained on the [Kitti Road dataset](http://www.cvlibs.net/datasets/kitti/eval_road.php), which can be downloaded from [here](http://www.cvlibs.net/download.php?file=data_road.zip).

### Architecture

The model is a Fully Convolutional Network (see [this paper](https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) for details) built on top of a pre-trained VGG16 model. The last layer of VGG16 is replaced by a 1x1 convolution whose depth equals the number of classes (2: road, not road). Transposed convolutions then upsample the result back to the spatial dimensions of the input image. Skip connections from VGG layers 3 and 4 to the new layers are used to improve performance.
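The spatial bookkeeping behind this decoder can be sketched in plain Python. The sketch assumes the usual VGG16 strides (layer 3 at 1/8, layer 4 at 1/16 and layer 7 at 1/32 of the input resolution) and that each transposed convolution with `'SAME'` padding multiplies height and width by its stride. `decoder_shapes` is a hypothetical helper for illustration, not part of the project code.

```python
def decoder_shapes(image_shape):
    """Trace (height, width) through an FCN-8-style decoder.

    Assumes VGG16 encoder strides of 8, 16 and 32 for layers 3, 4 and 7.
    """
    height, width = image_shape
    if height % 32 or width % 32:
        raise ValueError("height and width should be multiples of 32")

    layer3 = (height // 8, width // 8)
    layer4 = (height // 16, width // 16)
    layer7 = (height // 32, width // 32)

    # The 1x1 convolution keeps the spatial size; only the depth changes.
    score = layer7
    # A stride-2 transposed convolution doubles the size to match layer 4.
    up2 = (score[0] * 2, score[1] * 2)
    assert up2 == layer4
    # A second stride-2 transposed convolution matches layer 3.
    up4 = (up2[0] * 2, up2[1] * 2)
    assert up4 == layer3
    # The final stride-8 transposed convolution restores the input size.
    output = (up4[0] * 8, up4[1] * 8)
    return {"layer7": layer7, "layer4": layer4, "layer3": layer3, "output": output}


shapes = decoder_shapes((160, 576))
print(shapes)
```

Under these assumptions, an input whose sides are not multiples of 32 would leave the upsampled tensors misaligned with the skip connections, which is presumably why a shape such as (160, 576) is convenient.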

### Training

The hyperparameters used for training are:

-   keep_prob: 0.5
    
-   learning_rate: 0.001
    
-   epochs: 60
    
-   batch_size: 5
    

The model was trained on a Google Colab GPU runtime. Training took about 1-2 hours.

### Results

After 60 epochs, the epoch loss (the sum of the batch losses within one epoch) reached 1.4.
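The loss printed by the training loop is the sum of the batch losses in one epoch, so its value depends on the number of batches. As a rough sketch, assuming the KITTI road training set contains 289 images (a figure not stated in this README) and using the batch size of 5, the mean loss per batch can be estimated like this:

```python
import math

num_train_images = 289  # assumption: size of the KITTI road training set
batch_size = 5
epoch_loss = 1.4        # summed over all batches of the final epoch

# The last batch may be smaller, so round the batch count up.
batches_per_epoch = math.ceil(num_train_images / batch_size)
mean_batch_loss = epoch_loss / batches_per_epoch

print("batches per epoch:", batches_per_epoch)
print("mean batch loss: {:.4f}".format(mean_batch_loss))
```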

### Setup

##### Frameworks and Packages

`helper.py` and the pre-trained VGG model are provided by Udacity. See their GitHub repository [here](https://github.com/udacity/CarND-Semantic-Segmentation).

Make sure you have the following installed:

-   [Python 3](https://www.python.org/)
    
-   [TensorFlow](https://www.tensorflow.org/)
    
-   [NumPy](http://www.numpy.org/)
    
-   [SciPy](https://www.scipy.org/)

In [0]:
import os

import tensorflow as tf

In [0]:
num_classes = 2
image_shape = (160, 576)
data_dir = './data'
runs_dir = './runs'
EPOCHS = 60
BATCH_SIZE = 5

In [0]:
## Execute this cell only the first time, to download the dataset and the VGG weights
import helper  # helper.py is provided by Udacity

!wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_road.zip
!unzip data_road.zip
helper.maybe_download_pretrained_vgg(data_dir)

--2018-09-02 21:53:23--  https://s3.eu-central-1.amazonaws.com/avg-kitti/data_road.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.73.0
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.73.0|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 470992343 (449M) [application/zip]
Saving to: ‘data_road.zip.7’


2018-09-02 21:53:52 (15.5 MB/s) - ‘data_road.zip.7’ saved [470992343/470992343]

Archive:  data_road.zip
replace data_road/training/image_2/umm_000032.png? [y]es, [n]o, [A]ll, [N]one, [r]ename: 

In [0]:
def load_vgg(sess, path):
  # Load the pre-trained VGG16 SavedModel (tagged 'vgg16') into the session
  tf.saved_model.loader.load(sess, ['vgg16'], path)
  graph = tf.get_default_graph()
  # Look up the input, the dropout keep probability and the encoder outputs by name
  image_input = graph.get_tensor_by_name('image_input:0')
  keep_prob = graph.get_tensor_by_name('keep_prob:0')
  layer3 = graph.get_tensor_by_name('layer3_out:0')
  layer4 = graph.get_tensor_by_name('layer4_out:0')
  layer7 = graph.get_tensor_by_name('layer7_out:0')

  return image_input, keep_prob, layer3, layer4, layer7

In [0]:
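def layers(layer3, layer4, layer7, num_classes):
  # 1x1 convolution: reduce the VGG layer 7 output to one channel per class
  fcn8 = tf.layers.conv2d(layer7, filters=num_classes, kernel_size=1, name='fcn8')

  # Upsample by 2 and match the depth of VGG layer 4 so the two can be added
  fcn9 = tf.layers.conv2d_transpose(fcn8, filters=layer4.get_shape().as_list()[-1],
    kernel_size=4, strides=(2, 2), padding='SAME', name="fcn9")
  # Skip connection from VGG layer 4
  fcn9_skip = tf.add(fcn9, layer4, name="fcn9_plus_vgg_layer4")

  # Upsample by 2 again and match the depth of VGG layer 3
  fcn10 = tf.layers.conv2d_transpose(fcn9_skip, filters=layer3.get_shape().as_list()[-1],
    kernel_size=4, strides=(2, 2), padding='SAME', name="fcn10_conv2d")
  # Skip connection from VGG layer 3
  fcn10_skip = tf.add(fcn10, layer3, name='fcn10_plus_vgg_layer3')

  # Upsample by 8 to the input resolution, with one channel per class
  fcn11 = tf.layers.conv2d_transpose(fcn10_skip, filters=num_classes,
    kernel_size=16, strides=(8, 8), padding='SAME', name="fcn11")

  return fcn11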

In [0]:
def optimize(last_layer, correct_label, lr, num_classes):
  # Flatten to 2D: each row is a pixel, each column a class
  logits = tf.reshape(last_layer, (-1, num_classes), name="fcn_logits")
  correct_label_reshaped = tf.reshape(correct_label, (-1, num_classes))
  # Per-pixel cross-entropy between the predictions and the one-hot labels
  cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=correct_label_reshaped)
  # Take the mean over all pixels as the loss
  loss_op = tf.reduce_mean(cross_entropy, name="fcn_loss")
  train_op = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss_op, name="fcn_train_op")
  return logits, train_op, loss_op
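To make the reshape and the loss in `optimize` more concrete, here is a minimal NumPy sketch of the same computation on a toy tensor. It is an illustration under stated assumptions, not the TensorFlow implementation: `softmax_cross_entropy` is a hypothetical helper, and treating class index 1 as "road" is an assumption made for the example.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over rows (one row per pixel)."""
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())


num_classes = 2
# Toy "network output": a batch of one 2x3 image with per-pixel class scores.
last_layer = np.zeros((1, 2, 3, num_classes))
last_layer[0, 0, 0] = [0.0, 4.0]  # one pixel confidently scored as class 1
# One-hot ground truth: every pixel belongs to class 1.
correct_label = np.zeros((1, 2, 3, num_classes))
correct_label[..., 1] = 1.0

# Same flattening as in optimize(): one row per pixel, one column per class.
logits = last_layer.reshape(-1, num_classes)
labels = correct_label.reshape(-1, num_classes)
loss = softmax_cross_entropy(logits, labels)
print(logits.shape, loss)
```

The five undecided pixels each contribute ln 2 to the average, while the confident pixel contributes much less, so the mean drops as more pixels are scored correctly.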

In [0]:
def train_nn(sess, epochs, batch_size, get_batches_fn, train_op,
             cross_entropy_loss, input_image,
             correct_label, keep_prob, learning_rate):

  keep_prob_value = 0.5
  learning_rate_value = 0.001
  for epoch in range(epochs):
    # Accumulate the batch losses over the epoch
    total_loss = 0
    for X_batch, gt_batch in get_batches_fn(batch_size):
      # Run one optimization step and fetch the loss of this batch
      loss, _ = sess.run([cross_entropy_loss, train_op],
                         feed_dict={input_image: X_batch, correct_label: gt_batch,
                                    keep_prob: keep_prob_value,
                                    learning_rate: learning_rate_value})
      print("Batch Loss = {:.5f}".format(loss))
      total_loss += loss

    print("EPOCH {} ...".format(epoch + 1))
    print("Total Loss = {:.3f}".format(total_loss))
    print()

In [0]:
import helper
tf.reset_default_graph()
# A function to get batches
get_batches_fn = helper.gen_batch_function("data_road/training", image_shape)
session=tf.Session()

vgg_path = os.path.join(data_dir, 'vgg')
# Returns the three layers, keep probability and input layer from the vgg architecture
image_input, keep_prob, layer3, layer4, layer7 = load_vgg(session, vgg_path)

# The resulting network architecture from adding a decoder on top of the given vgg model
model_output = layers(layer3, layer4, layer7, num_classes)

# Placeholders for the one-hot ground-truth labels and the learning rate
correct_label = tf.placeholder(tf.int32, [None, None, None, num_classes], name='correct_label')
learning_rate = tf.placeholder(tf.float32, name='learning_rate')

# Returns the output logits, training operation and cost operation to be used
# - logits: each row represents a pixel, each column a class
# - train_op: operation that updates the model parameters to label the pixels correctly
# - cross_entropy_loss: the cost being minimized; lower cost should yield higher accuracy
logits, train_op, cross_entropy_loss = optimize(model_output, correct_label, learning_rate, num_classes)

# Initialize all variables
session.run(tf.global_variables_initializer())
session.run(tf.local_variables_initializer())

print("Model build successful, starting training")

# Train the neural network
train_nn(session, EPOCHS, BATCH_SIZE, get_batches_fn, 
         train_op, cross_entropy_loss, image_input,
         correct_label, keep_prob, learning_rate)

# Run the model with the test images and save each painted output image (roads painted Violet)
helper.save_inference_samples(runs_dir, "./", session, image_shape, logits, keep_prob, image_input)

print("All done!")
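How the flattened logits might become a painted road overlay can be sketched with NumPy. The actual logic lives in Udacity's `helper.py` and may differ; `road_mask`, the 0.5 threshold and the choice of column 1 as the road class are assumptions made for illustration.

```python
import numpy as np


def road_mask(logits, image_shape, threshold=0.5):
    """Turn flattened per-pixel logits into a boolean road mask.

    Assumes column 1 of the logits holds the road class.
    """
    # Softmax over the class axis, stabilized by subtracting the row maximum.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Keep pixels whose road probability exceeds the threshold.
    return (probs[:, 1] > threshold).reshape(image_shape)


# Toy 2x2 "image": two pixels favour road, one favours background, one is a tie.
toy_logits = np.array([[0.0, 3.0], [2.0, -1.0], [-0.5, 0.5], [1.0, 1.0]])
mask = road_mask(toy_logits, (2, 2))
print(mask)
```

A mask like this can then be blended over the input image in a highlight colour to produce the saved output samples.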

In [0]:
## Restore the model from a previously saved checkpoint
saver = tf.train.Saver()
saver.restore(session, "./tmp/model.ckpt")

INFO:tensorflow:Restoring parameters from ./tmp/model.ckpt


In [0]:
## You can zip the test output using this cell, then upload it to your Google Drive using the next cell
!zip -r img08.zip ./runs/1535902746.2466056


zip error: Nothing to do! (try: zip -r img08.zip . -i ./runs/1535902746.2466056)


In [0]:
# Install the PyDrive wrapper & import libraries.
# This only needs to be done once in a notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
# This only needs to be done once in a notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Create & upload a file.
uploaded = drive.CreateFile({'title': 'img08.zip'})
uploaded.SetContentFile('img08.zip')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

Uploaded file with ID 1n52Jai_cHTlMdOkyFSwOOKe-8nfXvj47


In [0]:
from google.colab import files
files.download('./runs/img0-out/image-027.png')