PWC-Net-small model pre-training (with cyclical learning rate schedule)
==========================================================

In this notebook we:
- Use a small model (no dense or residual connections) with a 6-level pyramid, upsampling level 1 by 2 to obtain the final flow prediction
- Pre-train the PWC-Net-small model on the `FlyingChairs` dataset using a Cyclic<sub>long</sub> schedule of our own
- **ABANDONED** Finetune the pre-trained PWC-Net-small model on the `FlyingThings3D` dataset using the Cyclic<sub>fine</sub> schedule

The third step was abandoned after observing that upsampling level 1 by 2, instead of upsampling level 2 by 4, does not reduce EPE while incurring longer training times (a batch of 2 instead of 8, due to the larger model size).
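
To make the trade-off concrete, the sketch below lists the feature-map size at pyramid levels 1 and 2 for the 384x448 crops used in this notebook. It assumes, as in the PWC-Net paper, that level `l` runs at 1/2^l of the input resolution. The helper name `level_shape` is hypothetical and not part of this repo.

```python
def level_shape(height, width, level):
    """Spatial size of pyramid level `level`, assuming each level halves the resolution."""
    return height // 2 ** level, width // 2 ** level


crop = (384, 448)  # crop_preproc used for FlyingChairs in this notebook
for lvl, factor in ((1, 2), (2, 4)):
    h, w = level_shape(crop[0], crop[1], lvl)
    # Upsampling by `factor` brings either prediction back to the input size
    print("level %d: %dx%d flow, upsampled by %d -> %dx%d"
          % (lvl, h, w, factor, h * factor, w * factor))
```

Under that assumption, predicting at level 1 means running one more estimation stage on maps with four times as many pixels as level 2, which is consistent with the smaller batch size reported above.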

Below, look for `TODO` references and customize this notebook based on your own needs.

## Reference

[2018a]<a name="2018a"></a> Sun et al. 2018. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. [[arXiv]](https://arxiv.org/abs/1709.02371) [[web]](http://research.nvidia.com/publication/2018-02_PWC-Net%3A-CNNs-for) [[PyTorch (Official)]](https://github.com/NVlabs/PWC-Net/tree/master/PyTorch) [[Caffe (Official)]](https://github.com/NVlabs/PWC-Net/tree/master/Caffe)

In [1]:
"""
pwcnet_train.ipynb

PWC-Net model training.

Written by Phil Ferriere

Licensed under the MIT License (see LICENSE for details)

Tensorboard:
    [win] tensorboard --logdir=E:\\repos\\tf-optflow\\tfoptflow\\pwcnet-sm-6-1-cyclic-chairs
    [ubu] tensorboard --logdir=/media/EDrive/repos/tf-optflow/tfoptflow/pwcnet-sm-6-1-cyclic-chairs
then,    
    [win] tensorboard --logdir=E:\\repos\\tf-optflow\\tfoptflow\\pwcnet-sm-6-1-cyclic-chairs-things
    [ubu] tensorboard --logdir=/media/EDrive/repos/tf-optflow/tfoptflow/pwcnet-sm-6-1-cyclic-chairs-things
"""
from __future__ import absolute_import, division, print_function
import sys
from copy import deepcopy

from dataset import FlyingChairsDataset, _DEFAULT_DS_TRAIN_OPTIONS, FlyingThings3DDataset, _DEFAULT_DS_TUNE_OPTIONS
from model_pwcnet import ModelPWCNet, _DEFAULT_PWCNET_TRAIN_OPTIONS, _DEFAULT_PWCNET_FINETUNE_OPTIONS

## TODO: Set this first!

In [2]:
# TODO: You MUST set dataset_root to the correct path on your machine!
if sys.platform.startswith("win"):
    _DATASET_ROOT = 'E:/datasets/'
    _FLYINGTHINGS3D_ROOT = '//naspro/devt/Datasets/FlyingThings3D'
else:
    _DATASET_ROOT = '/media/EDrive/datasets/'
    _FLYINGTHINGS3D_ROOT = '/media/YDrive/datasets/FlyingThings3D'
_FLYINGCHAIRS_ROOT = _DATASET_ROOT + 'FlyingChairs_release'
    
# TODO: You MUST adjust the settings below based on the number of GPU(s) used for training
# Set controller device and devices
# A one-gpu setup would be something like controller='/device:GPU:0' and gpu_devices=['/device:GPU:0']
# Here, we use a dual-GPU setup, as shown below
gpu_devices = ['/device:GPU:0', '/device:GPU:1']
controller = '/device:CPU:0'

# TODO: You MUST adjust this setting below based on the amount of memory on your GPU(s)
# Batch size
batch_size = 2
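
As a rough illustration of the device settings described in the comments above, the sketch below builds `gpu_devices` and `controller` from a GPU count. The helper `pick_devices` is hypothetical (it is not part of this repo) and simply encodes the two cases the comments mention: a single GPU acting as its own controller, and a multi-GPU setup controlled from the CPU.

```python
def pick_devices(num_gpus):
    """Hypothetical helper: derive gpu_devices/controller from the number of GPUs."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is expected")
    gpu_devices = ['/device:GPU:%d' % i for i in range(num_gpus)]
    # Single GPU: that GPU is also the controller. Multi-GPU: the CPU is the controller.
    controller = gpu_devices[0] if num_gpus == 1 else '/device:CPU:0'
    return gpu_devices, controller


gpu_devices, controller = pick_devices(2)
per_gpu_batch = 2
print(gpu_devices, controller, "global batch:", per_gpu_batch * len(gpu_devices))
```

With two GPUs and a per-GPU batch of 2, the global batch size is 4, which is the value the dataset and model configurations report further down.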

# Pre-train on `FlyingChairs`

## Load the dataset

In [3]:
# TODO: You MUST set the batch size based on the capabilities of your GPU(s)
# Load the training dataset
ds_opts = deepcopy(_DEFAULT_DS_TRAIN_OPTIONS)
ds_opts['in_memory'] = False                          # Too many samples to keep in memory at once, so don't preload them
ds_opts['aug_type'] = 'heavy'                         # Apply all supported augmentations
ds_opts['batch_size'] = batch_size * len(gpu_devices) # Multiply by number of GPUs (Titan X & 1080 Ti)
ds_opts['crop_preproc'] = (384, 448)                  # Crop to a smaller input size
ds = FlyingChairsDataset(mode='train_with_val', ds_root=_FLYINGCHAIRS_ROOT, options=ds_opts)

In [4]:
# Display dataset configuration
ds.print_config()


Dataset Configuration:
  verbose              False
  in_memory            False
  crop_preproc         (384, 448)
  scale_preproc        None
  input_channels       3
  tb_test_imgs         False
  random_seed          1969
  val_split            0.03
  aug_type             heavy
  aug_labels           True
  fliplr               0.5
  flipud               0.5
  translate            (0.5, 0.05)
  scale                (0.5, 0.05)
  batch_size           4
  mode                 train_with_val
  train size           22232
  val size             640
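
For a sense of scale, the sketch below converts the reported dataset size and batch size into steps per epoch and an approximate epoch count. The `max_steps` value of 100000 is taken from the schedule configured later in this notebook. The variable names are illustrative only.

```python
import math

train_size = 22232    # 'train size' reported in the dataset configuration
global_batch = 4      # 'batch_size' reported above (2 per GPU x 2 GPUs)
max_steps = 100000    # total number of training steps configured later on

steps_per_epoch = int(math.ceil(train_size / float(global_batch)))
epochs = max_steps / float(steps_per_epoch)
print("steps/epoch: %d, epochs in %d steps: %.1f" % (steps_per_epoch, max_steps, epochs))
```

With these numbers, one epoch is 5558 steps, so the full run covers roughly 18 passes over the training split.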


## Configure the training

In [5]:
# Start from the default options
nn_opts = deepcopy(_DEFAULT_PWCNET_TRAIN_OPTIONS)
nn_opts['verbose'] = True
nn_opts['ckpt_dir'] = './pwcnet-sm-6-1-cyclic-chairs/'
nn_opts['batch_size'] = ds_opts['batch_size']
nn_opts['x_shape'] = [2, ds_opts['crop_preproc'][0], ds_opts['crop_preproc'][1], 3]
nn_opts['y_shape'] = [ds_opts['crop_preproc'][0], ds_opts['crop_preproc'][1], 2]
nn_opts['use_tf_data'] = True # Use tf.data reader
nn_opts['gpu_devices'] = gpu_devices
nn_opts['controller'] = controller

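# Use the PWC-Net-small model in high-res mode
nn_opts['use_dense_cx'] = False  # No dense connections
nn_opts['use_res_cx'] = False    # No residual connections
nn_opts['flow_pred_lvl'] = 1     # Predict the flow at pyramid level 1 (then upsample by 2)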

In [6]:
# Set the learning rate schedule. This schedule is for a single GPU using a batch size of 8.
nn_opts['lr_policy'] = 'cyclic'
nn_opts['cyclic_lr_max'] = 4e-04
nn_opts['cyclic_lr_base'] = 1e-05
nn_opts['cyclic_lr_stepsize'] = 10000
nn_opts['max_steps'] = 100000

# Below, we adjust the schedule to the size of the batch and our number of GPUs (2).
nn_opts['cyclic_lr_stepsize'] /= len(gpu_devices)
nn_opts['max_steps'] /= len(gpu_devices)
nn_opts['cyclic_lr_stepsize'] = int(nn_opts['cyclic_lr_stepsize'] / (float(ds_opts['batch_size']) / 8))
nn_opts['max_steps'] = int(nn_opts['max_steps'] / (float(ds_opts['batch_size']) / 8))
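
The adjustment above can be checked in isolation. The sketch below repeats the same arithmetic in a hypothetical helper, `scale_schedule` (not part of this repo), with the reference batch size of 8 passed in as `ref_batch`.

```python
def scale_schedule(stepsize, max_steps, num_gpus, global_batch, ref_batch=8):
    """Hypothetical helper mirroring the schedule arithmetic of the cell above."""
    # First, divide by the number of GPUs
    stepsize = float(stepsize) / num_gpus
    max_steps = float(max_steps) / num_gpus
    # Then, rescale by the ratio of the global batch size to the reference batch size
    ratio = float(global_batch) / ref_batch
    return int(stepsize / ratio), int(max_steps / ratio)


print(scale_schedule(10000, 100000, num_gpus=2, global_batch=4))  # (10000, 100000)
```

With two GPUs and a global batch of 4, the two corrections cancel out, so the run keeps a step size of 10000 and 100000 total steps. This agrees with the training output below, which runs to just under 100000 iterations.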

In [7]:
# Instantiate the model and display the model configuration
nn = ModelPWCNet(mode='train_with_val', options=nn_opts, dataset=ds)
nn.print_config()

Building model towers...
  Building tower_0...
Instructions for updating:
`normal` is a deprecated alias for `truncated_normal`
  ...tower_0 built.
  Building tower_1...
  ...tower_1 built.
... model towers built.
Initializing model with random values for initial training...

... model initialized

Model Configuration:
  verbose                True
  ckpt_dir               ./pwcnet-sm-6-1-cyclic-chairs/
  max_to_keep            10
  x_dtype                <dtype: 'float32'>
  x_shape                [2, 384, 448, 3]
  y_dtype                <dtype: 'float32'>
  y_shape                [384, 448, 2]
  train_mode             train
  display_step           100
  snapshot_step          1000
  val_step               1000
  val_batch_size         -1
  tb_val_imgs            pyramid
  tb_test_imgs           None
  gpu_devices            ['/device:GPU:0', '/device:GPU:1']
  controller             /device:CPU:0
  use_tf_data            True
  batch_size             4
  lr_policy              cycl

## Train the model

In [8]:
# Train the model
nn.train()

Start training from scratch...
2018-09-06 13:33:45 Iter 100 [Train]: loss=1141.15, epe=15.81, lr=0.000014, samples/sec=5.6, sec/step=1.418, eta=1 day, 15:20:17
2018-09-06 13:34:51 Iter 200 [Train]: loss=1140.38, epe=15.80, lr=0.000018, samples/sec=13.0, sec/step=0.616, eta=17:04:56
2018-09-06 13:35:57 Iter 300 [Train]: loss=1123.94, epe=15.58, lr=0.000022, samples/sec=12.7, sec/step=0.629, eta=17:24:32
2018-09-06 13:37:02 Iter 400 [Train]: loss=1155.73, epe=16.02, lr=0.000026, samples/sec=13.1, sec/step=0.612, eta=16:56:29
2018-09-06 13:38:06 Iter 500 [Train]: loss=1187.41, epe=16.46, lr=0.000029, samples/sec=13.2, sec/step=0.608, eta=16:48:40
2018-09-06 13:39:10 Iter 600 [Train]: loss=1118.06, epe=15.50, lr=0.000033, samples/sec=13.1, sec/step=0.609, eta=16:48:37
2018-09-06 13:40:15 Iter 700 [Train]: loss=1132.66, epe=15.69, lr=0.000037, samples/sec=13.2, sec/step=0.608, eta=16:46:50
2018-09-06 13:41:19 Iter 800 [Train]: loss=1147.07, epe=15.85, lr=0.000041, samples/sec=13.2, sec/step

2018-09-06 14:35:16 Iter 5600 [Train]: loss=864.13, epe=10.75, lr=0.000228, samples/sec=13.2, sec/step=0.606, eta=15:53:56
2018-09-06 14:36:20 Iter 5700 [Train]: loss=845.25, epe=10.49, lr=0.000232, samples/sec=13.2, sec/step=0.608, eta=15:55:00
2018-09-06 14:37:25 Iter 5800 [Train]: loss=824.78, epe=10.31, lr=0.000236, samples/sec=13.2, sec/step=0.607, eta=15:53:29
2018-09-06 14:38:29 Iter 5900 [Train]: loss=797.37, epe=10.03, lr=0.000240, samples/sec=13.2, sec/step=0.606, eta=15:51:09
2018-09-06 14:39:33 Iter 6000 [Train]: loss=861.58, epe=10.83, lr=0.000244, samples/sec=13.2, sec/step=0.607, eta=15:51:40
2018-09-06 14:39:54 Iter 6000 6000 [Val]: loss=654.54, epe=8.43
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-6000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-6000
2018-09-06 14:41:06 Iter 6100 [Train]: loss=795.48, epe=9.88, lr=0.000248, samples/sec=13.2, sec/step=0.604, eta=15:45

... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-11000
2018-09-06 15:37:00 Iter 11100 [Train]: loss=578.36, epe=7.49, lr=0.000357, samples/sec=13.2, sec/step=0.605, eta=14:55:48
2018-09-06 15:38:04 Iter 11200 [Train]: loss=561.84, epe=7.24, lr=0.000353, samples/sec=13.2, sec/step=0.608, eta=14:59:34
2018-09-06 15:39:08 Iter 11300 [Train]: loss=529.67, epe=6.82, lr=0.000349, samples/sec=13.1, sec/step=0.609, eta=14:59:36
2018-09-06 15:40:12 Iter 11400 [Train]: loss=556.94, epe=7.16, lr=0.000345, samples/sec=13.1, sec/step=0.610, eta=15:00:02
2018-09-06 15:41:17 Iter 11500 [Train]: loss=550.88, epe=7.08, lr=0.000342, samples/sec=13.2, sec/step=0.608, eta=14:57:05
2018-09-06 15:42:21 Iter 11600 [Train]: loss=559.79, epe=7.24, lr=0.000338, samples/sec=13.2, sec/step=0.606, eta=14:53:20
2018-09-06 15:43:25 Iter 11700 [Train]: loss=561.15, epe=7.25, lr=0.000334, samples/sec=13.2, sec/step=0.606, eta=14:51:14
2018-09-06 15:44:29 Iter 11800 [Train]: loss=536.05, epe=6.95, lr=0.0003

2018-09-06 16:37:12 Iter 16500 [Train]: loss=456.56, epe=5.79, lr=0.000147, samples/sec=13.2, sec/step=0.606, eta=14:02:56
2018-09-06 16:38:16 Iter 16600 [Train]: loss=444.40, epe=5.65, lr=0.000143, samples/sec=13.2, sec/step=0.606, eta=14:02:49
2018-09-06 16:39:21 Iter 16700 [Train]: loss=443.72, epe=5.62, lr=0.000139, samples/sec=13.2, sec/step=0.608, eta=14:04:00
2018-09-06 16:40:25 Iter 16800 [Train]: loss=423.95, epe=5.34, lr=0.000135, samples/sec=13.2, sec/step=0.608, eta=14:02:55
2018-09-06 16:41:29 Iter 16900 [Train]: loss=426.75, epe=5.36, lr=0.000131, samples/sec=13.2, sec/step=0.607, eta=14:00:36
2018-09-06 16:42:33 Iter 17000 [Train]: loss=430.72, epe=5.39, lr=0.000127, samples/sec=13.1, sec/step=0.608, eta=14:01:36
2018-09-06 16:42:54 Iter 17000 17000 [Val]: loss=384.20, epe=4.92
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-17000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ck

2018-09-06 17:38:34 Iter 22000 [Train]: loss=389.41, epe=4.87, lr=0.000049, samples/sec=13.1, sec/step=0.610, eta=13:12:42
2018-09-06 17:38:55 Iter 22000 22000 [Val]: loss=355.85, epe=4.53
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-22000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-22000
2018-09-06 17:40:08 Iter 22100 [Train]: loss=401.91, epe=5.02, lr=0.000051, samples/sec=13.2, sec/step=0.606, eta=13:06:28
2018-09-06 17:41:13 Iter 22200 [Train]: loss=389.81, epe=4.86, lr=0.000053, samples/sec=13.1, sec/step=0.610, eta=13:10:23
2018-09-06 17:42:17 Iter 22300 [Train]: loss=387.47, epe=4.86, lr=0.000055, samples/sec=13.1, sec/step=0.609, eta=13:08:43
2018-09-06 17:43:22 Iter 22400 [Train]: loss=390.40, epe=4.87, lr=0.000057, samples/sec=13.1, sec/step=0.610, eta=13:08:44
2018-09-06 17:44:26 Iter 22500 [Train]: loss=393.76, epe=4.94, lr=0.000059, samples/sec=13.1, sec/step=0.609, eta=

2018-09-06 18:38:24 Iter 27300 [Train]: loss=401.80, epe=5.01, lr=0.000152, samples/sec=13.1, sec/step=0.608, eta=12:17:09
2018-09-06 18:39:28 Iter 27400 [Train]: loss=390.39, epe=4.89, lr=0.000154, samples/sec=13.1, sec/step=0.610, eta=12:17:59
2018-09-06 18:40:33 Iter 27500 [Train]: loss=389.59, epe=4.88, lr=0.000156, samples/sec=13.1, sec/step=0.609, eta=12:16:16
2018-09-06 18:41:37 Iter 27600 [Train]: loss=407.95, epe=5.13, lr=0.000158, samples/sec=13.2, sec/step=0.606, eta=12:11:03
2018-09-06 18:42:41 Iter 27700 [Train]: loss=400.08, epe=5.01, lr=0.000160, samples/sec=13.2, sec/step=0.606, eta=12:10:20
2018-09-06 18:43:45 Iter 27800 [Train]: loss=365.71, epe=4.54, lr=0.000162, samples/sec=13.2, sec/step=0.606, eta=12:09:35
2018-09-06 18:44:49 Iter 27900 [Train]: loss=390.71, epe=4.84, lr=0.000164, samples/sec=13.2, sec/step=0.606, eta=12:07:52
2018-09-06 18:45:53 Iter 28000 [Train]: loss=403.25, epe=5.05, lr=0.000166, samples/sec=13.2, sec/step=0.607, eta=12:08:34
2018-09-06 18:46

2018-09-06 19:42:06 Iter 33000 33000 [Val]: loss=339.30, epe=4.33
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-33000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-33000
2018-09-06 19:43:20 Iter 33100 [Train]: loss=359.04, epe=4.47, lr=0.000145, samples/sec=13.1, sec/step=0.609, eta=11:19:35
2018-09-06 19:44:25 Iter 33200 [Train]: loss=341.03, epe=4.22, lr=0.000143, samples/sec=13.1, sec/step=0.613, eta=11:21:55
2018-09-06 19:45:29 Iter 33300 [Train]: loss=343.12, epe=4.26, lr=0.000141, samples/sec=13.2, sec/step=0.605, eta=11:12:03
2018-09-06 19:46:33 Iter 33400 [Train]: loss=364.74, epe=4.53, lr=0.000139, samples/sec=13.2, sec/step=0.606, eta=11:12:37
2018-09-06 19:47:37 Iter 33500 [Train]: loss=380.91, epe=4.75, lr=0.000137, samples/sec=13.2, sec/step=0.608, eta=11:13:43
2018-09-06 19:48:41 Iter 33600 [Train]: loss=373.07, epe=4.63, lr=0.000135, samples/sec=13.2, sec/step=0.608, eta=

2018-09-06 20:43:26 Iter 38400 [Train]: loss=317.02, epe=3.88, lr=0.000041, samples/sec=12.9, sec/step=0.618, eta=10:34:44
2018-09-06 20:44:31 Iter 38500 [Train]: loss=319.62, epe=3.90, lr=0.000039, samples/sec=12.9, sec/step=0.619, eta=10:34:34
2018-09-06 20:45:37 Iter 38600 [Train]: loss=330.88, epe=4.06, lr=0.000037, samples/sec=13.0, sec/step=0.618, eta=10:31:57
2018-09-06 20:46:43 Iter 38700 [Train]: loss=319.62, epe=3.91, lr=0.000035, samples/sec=12.9, sec/step=0.620, eta=10:33:13
2018-09-06 20:47:48 Iter 38800 [Train]: loss=312.78, epe=3.83, lr=0.000033, samples/sec=12.9, sec/step=0.618, eta=10:30:42
2018-09-06 20:48:54 Iter 38900 [Train]: loss=322.10, epe=3.95, lr=0.000031, samples/sec=12.9, sec/step=0.619, eta=10:30:31
2018-09-06 20:49:59 Iter 39000 [Train]: loss=330.13, epe=4.01, lr=0.000030, samples/sec=12.9, sec/step=0.619, eta=10:29:28
2018-09-06 20:50:21 Iter 39000 39000 [Val]: loss=300.62, epe=3.77
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt

2018-09-06 21:46:07 Iter 43900 [Train]: loss=322.93, epe=3.93, lr=0.000048, samples/sec=13.0, sec/step=0.616, eta=9:35:57
2018-09-06 21:47:12 Iter 44000 [Train]: loss=305.88, epe=3.72, lr=0.000049, samples/sec=13.0, sec/step=0.616, eta=9:34:41
2018-09-06 21:47:33 Iter 44000 44000 [Val]: loss=301.08, epe=3.76
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-44000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-44000
2018-09-06 21:48:49 Iter 44100 [Train]: loss=310.33, epe=3.79, lr=0.000050, samples/sec=13.1, sec/step=0.612, eta=9:30:36
2018-09-06 21:49:54 Iter 44200 [Train]: loss=328.48, epe=4.02, lr=0.000051, samples/sec=13.0, sec/step=0.616, eta=9:32:38
2018-09-06 21:50:59 Iter 44300 [Train]: loss=325.25, epe=3.97, lr=0.000052, samples/sec=13.0, sec/step=0.616, eta=9:31:55
2018-09-06 21:52:04 Iter 44400 [Train]: loss=324.33, epe=3.97, lr=0.000053, samples/sec=12.9, sec/step=0.620, eta=9:34:

2018-09-06 22:47:58 Iter 49400 [Train]: loss=328.09, epe=4.04, lr=0.000102, samples/sec=13.3, sec/step=0.604, eta=8:29:06
2018-09-06 22:49:02 Iter 49500 [Train]: loss=334.15, epe=4.09, lr=0.000103, samples/sec=13.2, sec/step=0.605, eta=8:29:35
2018-09-06 22:50:06 Iter 49600 [Train]: loss=344.96, epe=4.26, lr=0.000104, samples/sec=13.2, sec/step=0.605, eta=8:28:04
2018-09-06 22:51:10 Iter 49700 [Train]: loss=335.74, epe=4.13, lr=0.000105, samples/sec=13.2, sec/step=0.605, eta=8:27:34
2018-09-06 22:52:14 Iter 49800 [Train]: loss=343.74, epe=4.26, lr=0.000106, samples/sec=13.2, sec/step=0.604, eta=8:25:35
2018-09-06 22:53:18 Iter 49900 [Train]: loss=348.84, epe=4.32, lr=0.000107, samples/sec=13.3, sec/step=0.604, eta=8:24:09
2018-09-06 22:54:22 Iter 50000 [Train]: loss=336.24, epe=4.12, lr=0.000108, samples/sec=13.3, sec/step=0.602, eta=8:21:42
2018-09-06 22:54:43 Iter 50000 50000 [Val]: loss=306.56, epe=3.85
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-50000 

2018-09-06 23:50:19 Iter 55000 55000 [Val]: loss=293.75, epe=3.66
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-55000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-55000
2018-09-06 23:51:34 Iter 55100 [Train]: loss=310.31, epe=3.78, lr=0.000058, samples/sec=13.4, sec/step=0.597, eta=7:27:02
2018-09-06 23:52:37 Iter 55200 [Train]: loss=317.77, epe=3.88, lr=0.000057, samples/sec=13.3, sec/step=0.603, eta=7:30:09
2018-09-06 23:53:41 Iter 55300 [Train]: loss=323.05, epe=3.94, lr=0.000056, samples/sec=13.3, sec/step=0.602, eta=7:28:33
2018-09-06 23:54:45 Iter 55400 [Train]: loss=321.59, epe=3.92, lr=0.000055, samples/sec=13.3, sec/step=0.602, eta=7:27:37
2018-09-06 23:55:49 Iter 55500 [Train]: loss=317.69, epe=3.86, lr=0.000054, samples/sec=13.3, sec/step=0.603, eta=7:26:54
2018-09-06 23:56:53 Iter 55600 [Train]: loss=300.79, epe=3.63, lr=0.000053, samples/sec=13.3, sec/step=0.604, eta=7:26:

2018-09-07 00:50:28 Iter 60400 [Train]: loss=310.86, epe=3.75, lr=0.000012, samples/sec=13.3, sec/step=0.603, eta=6:37:57
2018-09-07 00:51:32 Iter 60500 [Train]: loss=293.98, epe=3.53, lr=0.000012, samples/sec=13.3, sec/step=0.603, eta=6:37:18
2018-09-07 00:52:36 Iter 60600 [Train]: loss=295.16, epe=3.56, lr=0.000013, samples/sec=13.3, sec/step=0.603, eta=6:35:54
2018-09-07 00:53:40 Iter 60700 [Train]: loss=296.55, epe=3.58, lr=0.000013, samples/sec=13.3, sec/step=0.603, eta=6:35:14
2018-09-07 00:54:44 Iter 60800 [Train]: loss=293.00, epe=3.52, lr=0.000014, samples/sec=13.3, sec/step=0.603, eta=6:34:11
2018-09-07 00:55:48 Iter 60900 [Train]: loss=291.06, epe=3.51, lr=0.000014, samples/sec=13.2, sec/step=0.604, eta=6:33:49
2018-09-07 00:56:52 Iter 61000 [Train]: loss=285.51, epe=3.44, lr=0.000015, samples/sec=13.3, sec/step=0.602, eta=6:31:16
2018-09-07 00:57:12 Iter 61000 61000 [Val]: loss=276.98, epe=3.43
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-61000 

2018-09-07 01:51:29 Iter 65900 [Train]: loss=296.73, epe=3.58, lr=0.000039, samples/sec=13.3, sec/step=0.602, eta=5:42:22
2018-09-07 01:52:33 Iter 66000 [Train]: loss=303.70, epe=3.68, lr=0.000039, samples/sec=13.3, sec/step=0.602, eta=5:40:59
2018-09-07 01:52:54 Iter 66000 66000 [Val]: loss=278.84, epe=3.45
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-66000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-66000
2018-09-07 01:54:06 Iter 66100 [Train]: loss=290.67, epe=3.50, lr=0.000040, samples/sec=13.3, sec/step=0.601, eta=5:39:34
2018-09-07 01:55:10 Iter 66200 [Train]: loss=302.42, epe=3.65, lr=0.000040, samples/sec=13.2, sec/step=0.604, eta=5:40:10
2018-09-07 01:56:14 Iter 66300 [Train]: loss=298.14, epe=3.63, lr=0.000041, samples/sec=13.3, sec/step=0.602, eta=5:37:59
2018-09-07 01:57:18 Iter 66400 [Train]: loss=299.39, epe=3.62, lr=0.000041, samples/sec=13.3, sec/step=0.602, eta=5:37:

2018-09-07 02:52:59 Iter 71400 [Train]: loss=321.92, epe=3.92, lr=0.000052, samples/sec=13.3, sec/step=0.603, eta=4:47:29
2018-09-07 02:54:03 Iter 71500 [Train]: loss=311.86, epe=3.78, lr=0.000051, samples/sec=13.3, sec/step=0.603, eta=4:46:26
2018-09-07 02:55:07 Iter 71600 [Train]: loss=313.24, epe=3.83, lr=0.000051, samples/sec=13.3, sec/step=0.602, eta=4:45:10
2018-09-07 02:56:11 Iter 71700 [Train]: loss=288.49, epe=3.48, lr=0.000050, samples/sec=13.3, sec/step=0.602, eta=4:43:58
2018-09-07 02:57:15 Iter 71800 [Train]: loss=292.00, epe=3.54, lr=0.000050, samples/sec=13.3, sec/step=0.603, eta=4:43:11
2018-09-07 02:58:19 Iter 71900 [Train]: loss=312.43, epe=3.77, lr=0.000049, samples/sec=13.3, sec/step=0.602, eta=4:42:03
2018-09-07 02:59:23 Iter 72000 [Train]: loss=299.09, epe=3.62, lr=0.000049, samples/sec=13.3, sec/step=0.603, eta=4:41:17
2018-09-07 02:59:44 Iter 72000 72000 [Val]: loss=276.25, epe=3.42
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-72000 

2018-09-07 03:54:04 Iter 76900 [Train]: loss=284.61, epe=3.41, lr=0.000025, samples/sec=13.2, sec/step=0.605, eta=3:52:55
2018-09-07 03:55:08 Iter 77000 [Train]: loss=286.55, epe=3.45, lr=0.000025, samples/sec=13.3, sec/step=0.603, eta=3:51:08
2018-09-07 03:55:29 Iter 77000 77000 [Val]: loss=266.31, epe=3.27
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-77000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-77000
2018-09-07 03:56:41 Iter 77100 [Train]: loss=282.77, epe=3.38, lr=0.000024, samples/sec=13.3, sec/step=0.600, eta=3:49:00
2018-09-07 03:57:45 Iter 77200 [Train]: loss=284.34, epe=3.43, lr=0.000024, samples/sec=13.3, sec/step=0.604, eta=3:49:20
2018-09-07 03:58:49 Iter 77300 [Train]: loss=284.94, epe=3.41, lr=0.000023, samples/sec=13.3, sec/step=0.603, eta=3:48:17
2018-09-07 03:59:53 Iter 77400 [Train]: loss=306.81, epe=3.70, lr=0.000023, samples/sec=13.3, sec/step=0.603, eta=3:47:

2018-09-07 04:53:37 Iter 82200 [Train]: loss=280.93, epe=3.36, lr=0.000015, samples/sec=13.3, sec/step=0.603, eta=2:58:58
2018-09-07 04:54:41 Iter 82300 [Train]: loss=275.24, epe=3.29, lr=0.000016, samples/sec=13.2, sec/step=0.604, eta=2:58:19
2018-09-07 04:55:45 Iter 82400 [Train]: loss=282.34, epe=3.37, lr=0.000016, samples/sec=13.2, sec/step=0.606, eta=2:57:45
2018-09-07 04:56:50 Iter 82500 [Train]: loss=293.95, epe=3.53, lr=0.000016, samples/sec=13.2, sec/step=0.606, eta=2:56:39
2018-09-07 04:57:54 Iter 82600 [Train]: loss=273.24, epe=3.27, lr=0.000016, samples/sec=13.2, sec/step=0.607, eta=2:56:03
2018-09-07 04:58:58 Iter 82700 [Train]: loss=277.08, epe=3.31, lr=0.000017, samples/sec=13.2, sec/step=0.606, eta=2:54:46
2018-09-07 05:00:02 Iter 82800 [Train]: loss=275.15, epe=3.30, lr=0.000017, samples/sec=13.2, sec/step=0.605, eta=2:53:24
2018-09-07 05:01:07 Iter 82900 [Train]: loss=288.87, epe=3.49, lr=0.000017, samples/sec=13.2, sec/step=0.606, eta=2:52:48
2018-09-07 05:02:11 Iter

2018-09-07 05:55:03 Iter 87700 [Train]: loss=296.77, epe=3.58, lr=0.000029, samples/sec=13.2, sec/step=0.606, eta=2:04:08
2018-09-07 05:56:07 Iter 87800 [Train]: loss=291.91, epe=3.53, lr=0.000029, samples/sec=13.2, sec/step=0.607, eta=2:03:25
2018-09-07 05:57:11 Iter 87900 [Train]: loss=298.38, epe=3.56, lr=0.000029, samples/sec=13.2, sec/step=0.605, eta=2:02:05
2018-09-07 05:58:15 Iter 88000 [Train]: loss=277.15, epe=3.31, lr=0.000030, samples/sec=13.2, sec/step=0.605, eta=2:00:56
2018-09-07 05:58:37 Iter 88000 88000 [Val]: loss=266.11, epe=3.28
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-88000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-88000
2018-09-07 05:59:50 Iter 88100 [Train]: loss=272.66, epe=3.25, lr=0.000030, samples/sec=13.3, sec/step=0.602, eta=1:59:23
2018-09-07 06:00:54 Iter 88200 [Train]: loss=281.29, epe=3.39, lr=0.000030, samples/sec=13.2, sec/step=0.604, eta=1:58:

2018-09-07 06:57:51 Iter 93300 [Train]: loss=271.65, epe=3.24, lr=0.000026, samples/sec=13.3, sec/step=0.603, eta=1:07:23
2018-09-07 06:58:55 Iter 93400 [Train]: loss=285.05, epe=3.41, lr=0.000026, samples/sec=13.2, sec/step=0.604, eta=1:06:29
2018-09-07 06:59:59 Iter 93500 [Train]: loss=292.68, epe=3.50, lr=0.000026, samples/sec=13.2, sec/step=0.606, eta=1:05:39
2018-09-07 07:01:04 Iter 93600 [Train]: loss=287.91, epe=3.47, lr=0.000026, samples/sec=13.2, sec/step=0.605, eta=1:04:32
2018-09-07 07:02:08 Iter 93700 [Train]: loss=277.35, epe=3.30, lr=0.000025, samples/sec=13.2, sec/step=0.605, eta=1:03:31
2018-09-07 07:03:12 Iter 93800 [Train]: loss=278.76, epe=3.34, lr=0.000025, samples/sec=13.2, sec/step=0.605, eta=1:02:34
2018-09-07 07:04:16 Iter 93900 [Train]: loss=279.22, epe=3.35, lr=0.000025, samples/sec=13.2, sec/step=0.604, eta=1:01:27
2018-09-07 07:05:21 Iter 94000 [Train]: loss=300.72, epe=3.61, lr=0.000025, samples/sec=13.2, sec/step=0.605, eta=1:00:27
2018-09-07 07:05:42 Iter

2018-09-07 07:59:03 Iter 98800 [Train]: loss=273.18, epe=3.26, lr=0.000013, samples/sec=13.3, sec/step=0.603, eta=0:12:04
2018-09-07 08:00:07 Iter 98900 [Train]: loss=282.97, epe=3.38, lr=0.000013, samples/sec=13.3, sec/step=0.602, eta=0:11:02
2018-09-07 08:01:11 Iter 99000 [Train]: loss=293.29, epe=3.50, lr=0.000012, samples/sec=13.3, sec/step=0.604, eta=0:10:04
2018-09-07 08:01:32 Iter 99000 99000 [Val]: loss=260.52, epe=3.19
Saving model...
INFO:tensorflow:./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-99000 is not in all_model_checkpoint_paths. Manually adding it.
... model saved in ./pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-99000
2018-09-07 08:02:45 Iter 99100 [Train]: loss=268.93, epe=3.20, lr=0.000012, samples/sec=13.3, sec/step=0.601, eta=0:09:01
2018-09-07 08:03:49 Iter 99200 [Train]: loss=278.70, epe=3.32, lr=0.000012, samples/sec=13.2, sec/step=0.604, eta=0:08:04
2018-09-07 08:04:53 Iter 99300 [Train]: loss=259.01, epe=3.08, lr=0.000012, samples/sec=13.3, sec/step=0.603, eta=0:07:

## Training log

Here are the training curves for the run above:

![](img/pwcnet-sm-6-1-cyclic-chairs/loss.png)
![](img/pwcnet-sm-6-1-cyclic-chairs/epe.png)
![](img/pwcnet-sm-6-1-cyclic-chairs/lr.png)
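
The `lr` values in the training output above are consistent with a "triangular2" cyclical learning rate, where the rate ramps linearly between `cyclic_lr_base` and `cyclic_lr_max` and the amplitude is halved after each full cycle. That reading is inferred from the logged values, not from the model code, so treat the sketch below as an assumption. The function name `cyclic_lr` is hypothetical.

```python
import math


def cyclic_lr(step, base_lr=1e-05, max_lr=4e-04, stepsize=10000):
    """Triangular2 cyclical learning rate: each full cycle halves the amplitude."""
    cycle = math.floor(1 + step / (2.0 * stepsize))   # 1-based index of the current cycle
    x = abs(step / float(stepsize) - 2 * cycle + 1)    # distance from the cycle's peak, in [0, 1]
    scale = 1.0 / (2 ** (cycle - 1))                   # amplitude halves with every cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x) * scale


for step in (100, 5600, 11100, 22000, 33100, 44000):
    print("step %6d: lr=%.6f" % (step, cyclic_lr(step)))
```

For these steps the sketch gives 0.000014, 0.000228, 0.000357, 0.000049, 0.000145 and 0.000049, which are the values logged at the same iterations.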

Here are the predictions issued by the model for a few validation samples:

![](img/pwcnet-sm-6-1-cyclic-chairs/val1.png)
![](img/pwcnet-sm-6-1-cyclic-chairs/val2.png)
![](img/pwcnet-sm-6-1-cyclic-chairs/val3.png)
![](img/pwcnet-sm-6-1-cyclic-chairs/val4.png)
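
The `epe` figure reported throughout the log is the average end-point error. The sketch below shows the standard definition: the mean Euclidean distance between predicted and ground-truth flow vectors. It is a plain-Python illustration, and the repo's own implementation may differ in detail. `average_epe` is a hypothetical name.

```python
import math


def average_epe(flow_pred, flow_gt):
    """Mean Euclidean distance between predicted and ground-truth (u, v) vectors.

    Both flows are nested lists shaped [H][W][2], matching y_shape above.
    """
    total, count = 0.0, 0
    for row_pred, row_gt in zip(flow_pred, flow_gt):
        for (u_p, v_p), (u_g, v_g) in zip(row_pred, row_gt):
            total += math.hypot(u_p - u_g, v_p - v_g)
            count += 1
    return total / count


pred = [[[3.0, 4.0], [0.0, 0.0]]]   # 1x2 flow field
gt = [[[0.0, 0.0], [0.0, 0.0]]]
print(average_epe(pred, gt))  # 2.5
```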

# Finetune on `FlyingThings3D`

## Load the dataset

In [None]:
# Load the dataset
ds_opts = deepcopy(_DEFAULT_DS_TUNE_OPTIONS)
ds_opts['in_memory'] = False                 # Too many samples to keep in memory at once, so don't preload them
ds_opts['aug_type'] = 'heavy'                # Apply all supported augmentations
ds_opts['batch_size'] = 4 * len(gpu_devices) # Use a multiple of 4; here, 8 for dual-GPU mode (Titan X & 1080 Ti)
ds_opts['crop_preproc'] = (384, 768)         # Crop to a smaller input size
ds_opts['type'] = 'into_future'
ds = FlyingThings3DDataset(mode='train_with_val', ds_root=_FLYINGTHINGS3D_ROOT, options=ds_opts)

In [None]:
# Display dataset configuration
ds.print_config()

## Configure the finetuning

In [None]:
# Configure the finetuning, starting from the default options
nn_opts = deepcopy(_DEFAULT_PWCNET_FINETUNE_OPTIONS)
nn_opts['verbose'] = True
# TODO: Point ckpt_path at the pre-trained checkpoint you want to finetune from
nn_opts['ckpt_path'] = './pwcnet-sm-6-1-cyclic-chairs/pwcnet.ckpt-25000'
nn_opts['ckpt_dir'] = './pwcnet-sm-6-1-cyclic-chairs-things/'
nn_opts['batch_size'] = ds_opts['batch_size']
nn_opts['x_shape'] = [2, ds_opts['crop_preproc'][0], ds_opts['crop_preproc'][1], 3]
nn_opts['y_shape'] = [ds_opts['crop_preproc'][0], ds_opts['crop_preproc'][1], 2]
nn_opts['use_tf_data'] = True # Use tf.data reader
nn_opts['gpu_devices'] = gpu_devices
nn_opts['controller'] = controller

# Use the PWC-Net-small model
nn_opts['use_dense_cx'] = False
nn_opts['use_res_cx'] = False

# Couldn't get the training to converge with robust loss... switch back to multiscale loss
nn_opts['loss_fn'] = 'loss_multiscale'
nn_opts['q'] = 1.
nn_opts['epsilon'] = 0.
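
For context on the two losses mentioned above, the sketch below compares the per-pixel penalties described in the PWC-Net paper: the L2 norm of the flow error used by the multiscale training loss, and the robust penalty `(|error|_1 + epsilon) ** q` used for finetuning, with the paper's values `q=0.4` and `epsilon=0.01` as defaults. The per-level weights are omitted, and the function names are hypothetical. How this repo's loss functions use `q` and `epsilon` internally is not shown here.

```python
import math


def l2_penalty(du, dv):
    """Per-pixel penalty of the multiscale loss: the L2 norm of the flow error."""
    return math.hypot(du, dv)


def robust_penalty(du, dv, q=0.4, epsilon=0.01):
    """Per-pixel robust penalty: (L1 norm of the flow error + epsilon) ** q."""
    return (abs(du) + abs(dv) + epsilon) ** q


for err in (0.5, 5.0, 50.0):
    print("error %5.1f: l2=%.2f robust=%.2f"
          % (err, l2_penalty(err, 0.0), robust_penalty(err, 0.0)))
```

Because `q` is less than 1, the robust penalty grows sublinearly, so large errors (outliers) contribute much less than they do under the L2 penalty.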

In [None]:
# Set the learning rate schedule. This schedule is for a single GPU using a batch size of 4.
nn_opts['lr_policy'] = 'cyclic'
nn_opts['cyclic_lr_max'] = 2e-05
nn_opts['cyclic_lr_base'] = 1e-06
nn_opts['cyclic_lr_stepsize'] = 10000
nn_opts['max_steps'] = 100000

# Below, we adjust the schedule to the size of the batch and our number of GPUs (2).
nn_opts['cyclic_lr_stepsize'] /= len(gpu_devices)
nn_opts['max_steps'] /= len(gpu_devices)
nn_opts['cyclic_lr_stepsize'] = int(nn_opts['cyclic_lr_stepsize'] / (float(ds_opts['batch_size']) / 4))
nn_opts['max_steps'] = int(nn_opts['max_steps'] / (float(ds_opts['batch_size']) / 4))

In [None]:
# Instantiate the model and display the model configuration
nn = ModelPWCNet(mode='train_with_val', options=nn_opts, dataset=ds)
nn.print_config()

## Finetune the model

In [None]:
# Finetune the model
nn.train()

## Training log

Here are the training curves for the run above:
    
![](pwcnet-sm-6-1-cyclic-chairs-things/img/loss.png)
![](pwcnet-sm-6-1-cyclic-chairs-things/img/epe.png)
![](pwcnet-sm-6-1-cyclic-chairs-things/img/lr.png)

Here are the predictions issued by the model for a few validation samples:
    
![](pwcnet-sm-6-1-cyclic-chairs-things/img/val1.png)
![](pwcnet-sm-6-1-cyclic-chairs-things/img/val2.png)