# `FStream` Online Training

**[THIS IS WORK IN PROGRESS]**

This notebook performs online training of the **flow stream parent model** on the **car-shadow** sequence, so make sure you've run the [`FStream` Offline Training](fstream_offline_training.ipynb) notebook before running this one.


The online training of the `FStream` network is done by finetuning the parent model **on the first frame** of the video sequence. This is the only frame for which a mask is provided. It is augmented using scaling and vertical flipping. The network is trained for 500 iterations using the same training parameters as during offline training, except that deep supervision is disabled.
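The augmentation step can be sketched as follows. This is a minimal NumPy illustration, not the code in the `datasets` module: the helper names, the nearest-neighbor resize, the scale factors (0.5 and 0.8), and the flip axis are all assumptions. It does reproduce the counts shown in the dataset log further down: one annotated frame becomes three samples after scaling and six after flipping.

```python
import numpy as np

def scale_nearest(arr, factor):
    """Resize an H x W (x C) array by `factor` using nearest-neighbor sampling."""
    h, w = arr.shape[:2]
    new_h = max(1, int(round(h * factor)))
    new_w = max(1, int(round(w * factor)))
    # Map every output row/column back to its nearest source index
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return arr[rows][:, cols]

def augment(frame, mask, scales=(0.5, 0.8), flip_axis=0):
    """Return (frame, mask) pairs: original, scaled copies, then flipped copies of each.

    flip_axis=0 flips top to bottom (the "vertical flipping" mentioned in the text);
    flip_axis=1 would mirror left to right instead. Both values are assumptions.
    """
    pairs = [(frame, mask)]
    pairs += [(scale_nearest(frame, s), scale_nearest(mask, s)) for s in scales]
    # The comprehension is fully evaluated before the list is extended
    pairs += [(np.flip(f, axis=flip_axis), np.flip(m, axis=flip_axis)) for f, m in pairs]
    return pairs

# Dummy 480p frame and mask standing in for 00000.jpg / 00000.png
frame = np.zeros((480, 854, 3), dtype=np.uint8)
mask = np.zeros((480, 854), dtype=np.uint8)
samples = augment(frame, mask)
print(len(samples), [f.shape for f, _ in samples[:3]])
```

Applying the same transform to the frame and its mask keeps the two aligned, which is what makes a single annotated frame usable as a small training set.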

![](img/osvos_child.png)

To monitor training, run the following command (adjust the path to your own checkout), then open http://localhost:6006 in a browser:
```
tensorboard --logdir E:\repos\tf-video-seg\tfvos\models\fstream_car-shadow
```

In [1]:
"""
fstream_online_training.ipynb

FStream online trainer

Written by Phil Ferriere

Licensed under the MIT License (see LICENSE for details)

Based on:
  - https://github.com/scaelles/OSVOS-TensorFlow/blob/master/osvos_parent_demo.py
    Written by Sergi Caelles (scaelles@vision.ee.ethz.ch)
    This file is part of the OSVOS paper presented in:
      Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixe, Daniel Cremers, Luc Van Gool
      One-Shot Video Object Segmentation
      CVPR 2017
    Unknown code license
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os, sys
from PIL import Image
import numpy as np
import tensorflow as tf
slim = tf.contrib.slim
import matplotlib.pyplot as plt

In [2]:
# Import model files
import model
import datasets
import visualize  # used by the visual evaluation cells below

## Configuration

In [3]:
# Model paths
seq_name = "car-shadow"
segnet_stream = 'fstream'
parent_path = 'models/' + segnet_stream + '_parent/' + segnet_stream + '_parent.ckpt-50000'
ckpt_name = segnet_stream + '_' + seq_name
logs_path = 'models/' + ckpt_name

# Online training parameters
gpu_id = 0
max_training_iters = 500
learning_rate = 1e-8
save_step = max_training_iters
side_supervision = 3
display_step = 10

## Dataset load

In [4]:
# Load the DAVIS 2016 sequence
# Copy the defaults so the module-level options are not modified in place
options = dict(datasets._DEFAULT_DAVIS16_OPTIONS)
options['use_cache'] = False
options['data_aug'] = True
# Set the following to wherever you have downloaded the DAVIS 2016 dataset
dataset_root = 'E:/datasets/davis2016/' if sys.platform.startswith("win") else '/media/EDrive/datasets/davis2016/'
test_frames = sorted(os.listdir(dataset_root + 'JPEGImages/480p/' + seq_name))
test_imgs = ['JPEGImages/480p/' + seq_name + '/' + frame for frame in test_frames]
train_imgs = ['JPEGImages/480p/' + seq_name + '/' + '00000.jpg ' + 'Annotations/480p/' + seq_name + '/' + '00000.png']
dataset = datasets.davis16(train_imgs, test_imgs, dataset_root, options)

Initializing dataset...
['JPEGImages/480p/car-shadow/00000.jpg Annotations/480p/car-shadow/00000.png']
Loading training images and masks...


100%|█████████████████████████████████████████████| 1/1 [00:00<00:00, 26.30it/s]


...done loading training images and masks.
Performing scaling data augmentation on frames/masks...


100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 999.83it/s]
100%|████████████████████████████████████████████| 1/1 [00:00<00:00, 500.99it/s]


... done performing scaling data augmentation on frames/masks.
Performing flipping data augmentation on frames/masks...


100%|████████████████████████████████████████████| 3/3 [00:00<00:00, 749.43it/s]


... done performing flipping data augmentation on frames/masks.
Converting images and masks to numpy arrays...


100%|████████████████████████████████████████████| 6/6 [00:00<00:00, 667.16it/s]


...done converting images and masks to numpy arrays.
Computing optical flows and warping masks...


video: 100%|█████████████████████████████████████| 6/6 [00:00<00:00, 752.68it/s]


...done with computing optical flows and warping masks.
Loading testing images...


100%|███████████████████████████████████████████| 40/40 [00:00<00:00, 79.62it/s]


...done loading testing images.
...done initializing Dataset


In [5]:
# Display dataset configuration
dataset.print_config()


Configuration:
  in_memory            True
  data_aug             True
  use_cache            False
  use_optical_flow     True
  use_warped_masks     True
  use_bboxes           True
  optical_flow_mgr     pyflow


## Online Training

In [6]:
# Finetune this branch of the binary segmentation network
with tf.Graph().as_default():
    with tf.device('/gpu:' + str(gpu_id)):
        global_step = tf.Variable(0, name='global_step', trainable=False)
        model.train_finetune(dataset, parent_path, side_supervision, learning_rate, logs_path, max_training_iters,
                             save_step, display_step, global_step, segnet_stream, iter_mean_grad=1, ckpt_name=ckpt_name)


Network Layers:
   name = fstream/conv1/conv1_1/Relu:0, shape = (1, ?, ?, 64)
   name = fstream/conv1/conv1_2/Relu:0, shape = (1, ?, ?, 64)
   name = fstream/pool1/MaxPool:0, shape = (1, ?, ?, 64)
   name = fstream/conv2/conv2_1/Relu:0, shape = (1, ?, ?, 128)
   name = fstream/conv2/conv2_2/Relu:0, shape = (1, ?, ?, 128)
   name = fstream/pool2/MaxPool:0, shape = (1, ?, ?, 128)
   name = fstream/conv3/conv3_1/Relu:0, shape = (1, ?, ?, 256)
   name = fstream/conv3/conv3_2/Relu:0, shape = (1, ?, ?, 256)
   name = fstream/conv3/conv3_3/Relu:0, shape = (1, ?, ?, 256)
   name = fstream/pool3/MaxPool:0, shape = (1, ?, ?, 256)
   name = fstream/conv4/conv4_1/Relu:0, shape = (1, ?, ?, 512)
   name = fstream/conv4/conv4_2/Relu:0, shape = (1, ?, ?, 512)
   name = fstream/conv4/conv4_3/Relu:0, shape = (1, ?, ?, 512)
   name = fstream/pool4/MaxPool:0, shape = (1, ?, ?, 512)
   name = fstream/conv5/conv5_1/Relu:0, shape = (1, ?, ?, 512)
   name = fstream/conv5/conv5_2/Relu:0, shape = (1, ?, ?, 512)

2018-02-02 09:19:29.456114 Iter 350: Training Loss = 172.9282
2018-02-02 09:19:31.696558 Iter 360: Training Loss = 98.5829
2018-02-02 09:19:33.947753 Iter 370: Training Loss = 191.5003
2018-02-02 09:19:36.088972 Iter 380: Training Loss = 97.0496
2018-02-02 09:19:38.448853 Iter 390: Training Loss = 96.4809
2018-02-02 09:19:40.683812 Iter 400: Training Loss = 99.6578
2018-02-02 09:19:43.013047 Iter 410: Training Loss = 182.6668
2018-02-02 09:19:45.130933 Iter 420: Training Loss = 180.7202
2018-02-02 09:19:47.357807 Iter 430: Training Loss = 97.6054
2018-02-02 09:19:49.689024 Iter 440: Training Loss = 158.0362
2018-02-02 09:19:51.795934 Iter 450: Training Loss = 175.0791
2018-02-02 09:19:54.008078 Iter 460: Training Loss = 173.5029
2018-02-02 09:19:56.260132 Iter 470: Training Loss = 153.9403
2018-02-02 09:19:58.494076 Iter 480: Training Loss = 149.3242
2018-02-02 09:20:00.610344 Iter 490: Training Loss = 93.8205
2018-02-02 09:20:02.708245 Iter 500: Training Loss = 89.5559
INFO:tensorflow

## Training losses & learning rate
You should get training curves similar to the following:
![](img/fstream_car-shadow_main_loss.png)
![](img/fstream_car-shadow_total_loss.png)
![](img/fstream_car-shadow_learning_rate.png)
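The console loss printed during training swings between roughly 90 and 190 from one display step to the next, which makes the trend hard to read from the raw log. A minimal sketch, assuming log lines in the format shown above, that extracts the values and smooths them with a running mean (`parse_losses` and `running_mean` are hypothetical helpers, not part of this repository):

```python
import re

# Matches e.g. "2018-02-02 09:20:02.708245 Iter 500: Training Loss = 89.5559"
LOSS_RE = re.compile(r"Iter (\d+): Training Loss = ([0-9.]+)")

def parse_losses(log_text):
    """Return a list of (iteration, loss) tuples found in the log text."""
    return [(int(m.group(1)), float(m.group(2))) for m in LOSS_RE.finditer(log_text)]

def running_mean(values, window=5):
    """Average each value with up to `window - 1` preceding values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

log_text = """
2018-02-02 09:20:00.610344 Iter 490: Training Loss = 93.8205
2018-02-02 09:20:02.708245 Iter 500: Training Loss = 89.5559
"""
points = parse_losses(log_text)
smoothed = running_mean([loss for _, loss in points], window=2)
print(points, smoothed)
```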

## Testing

In [None]:
# Result path (if you want to check how well this branch is doing on its own)
result_path = dataset_root + 'Results/Segmentations/480p/' + ckpt_name

# Test this branch of the network
with tf.Graph().as_default():
    with tf.device('/gpu:' + str(gpu_id)):
        ckpt_path = logs_path + '/' + ckpt_name + '.ckpt-' + str(max_training_iters)
        model.test(dataset, ckpt_path, result_path)

Log output should be similar to the following (the checkpoint and result paths will reflect your own `ckpt_path` and `result_path`):
```
INFO:tensorflow:Restoring parameters from models\car-shadow_new\car-shadow_new.ckpt-500
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00000.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00001.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00002.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00003.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00004.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00005.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00006.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00007.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00008.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00009.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00010.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00011.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00012.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00013.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00014.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00015.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00016.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00017.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00018.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00019.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00020.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00021.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00022.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00023.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00024.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00025.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00026.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00027.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00028.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00029.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00030.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00031.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00032.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00033.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00034.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00035.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00036.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00037.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00038.png
Saving E:/datasets/davis2016\Results\Segmentations\480p\OSVOS\car-shadow_new\00039.png
```
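To check how well this branch is doing with a number rather than by eye, the saved masks can be compared against ground-truth annotations using the Jaccard index (intersection over union), the region-similarity measure used by the DAVIS benchmark. A minimal sketch, assuming both masks are arrays in which non-zero pixels mark the object; `jaccard` is a hypothetical helper, not part of this repository:

```python
import numpy as np

def jaccard(pred, gt):
    """Intersection over union of two binary masks (non-zero = foreground)."""
    pred = np.asarray(pred) > 0
    gt = np.asarray(gt) > 0
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty: treat them as a perfect match
        return 1.0
    return float(np.logical_and(pred, gt).sum()) / float(union)

# Toy example: one of the two predicted pixels overlaps the ground truth
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
score = jaccard(pred, gt)
print(score)
```

Averaging this score over the frames of the sequence gives a single figure for comparing runs.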

## Visual evaluation

Let's load the original images and their predicted masks to get an idea of how this branch of the network is doing. The first mask is displayed with a red overlay because it is the one provided to us; the predicted masks are displayed with a green overlay.
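A colored overlay of this kind can be sketched as simple alpha blending. This is not the `visualize` module's implementation, which is not shown in this notebook; `overlay_mask`, `color`, and `alpha` are hypothetical names chosen for illustration.

```python
import numpy as np

def overlay_mask(frame, mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` into the pixels of an RGB `frame` where `mask` is non-zero."""
    out = frame.astype(np.float32)  # astype returns a copy, the input is untouched
    fg = mask > 0
    out[fg] = (1.0 - alpha) * out[fg] + alpha * np.array(color, dtype=np.float32)
    return out.astype(np.uint8)

# Toy 2x2 gray frame with a single foreground pixel
frame = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[255, 0], [0, 0]], dtype=np.uint8)
green = overlay_mask(frame, mask)                   # predicted-mask style
red = overlay_mask(frame, mask, color=(255, 0, 0))  # provided-mask style
print(green[0, 0], red[0, 0], green[0, 1])
```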

In [None]:
# Load results
frames = []
predicted_masks = []
for test_frame in test_frames:
    frame_num = test_frame.split('.')[0]
    frame = np.array(Image.open(dataset_root + 'JPEGImages/480p/' + seq_name + '/' + test_frame))
    # Join with a path separator so the mask is read from inside the results folder
    predicted_mask = np.array(Image.open(os.path.join(result_path, frame_num + '.png')))
    frames.append(frame)
    predicted_masks.append(predicted_mask)

In [None]:
# Overlay the masks on top of the frames
frames_with_predictions = visualize.overlay_frames_with_predictions(frames, predicted_masks)

### Display individual frames

In [None]:
visualize.display_images(frames_with_predictions)

### Display results as a video clip

In [None]:
# Set path to video clips
video_clip_folder = dataset_root + 'clips/'
video_clip = video_clip_folder + ckpt_name + '.mp4'

# Combine images in a video clip
visualize.make_clip(video_clip, frames_with_predictions)

In [None]:
# Display video
visualize.show_clip(video_clip)