PWC-Net-small mixed-precision model evaluation (on three datasets)
==========================================================

In this notebook we:
- Load the PWC-Net-small mixed-precision model trained on a mix of the `FlyingChairs` and `FlyingThings3DHalfRes` datasets using a Cyclic<sub>short</sub> schedule
- Evaluate the trained model on the **validation split** of the `FlyingChairs` dataset and on the **'final'** and **'clean'** versions of the `MPI-Sintel` training dataset, yielding the following results:

| | FlyingChairs | Sintel 'clean' | Sintel 'final' |
| :---: | :---: | :---: | :---: |
| Avg EPE | 2.47 | 3.77 | 4.90 |
| Inference Time | 39.53ms | 55.16ms | 55.09ms |

Below, look for `TODO` references and customize this notebook based on your own needs.

## Reference

[2018a]<a name="2018a"></a> Sun et al. 2018. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. [[arXiv]](https://arxiv.org/abs/1709.02371) [[web]](http://research.nvidia.com/publication/2018-02_PWC-Net%3A-CNNs-for) [[PyTorch (Official)]](https://github.com/NVlabs/PWC-Net/tree/master/PyTorch) [[Caffe (Official)]](https://github.com/NVlabs/PWC-Net/tree/master/Caffe)

In [1]:
"""
pwcnet_eval.ipynb

PWC-Net model evaluation.

Written by Phil Ferriere

Licensed under the MIT License (see LICENSE for details)
"""
from __future__ import absolute_import, division, print_function
import sys
from copy import deepcopy
import tensorflow as tf

from dataset_base import _DEFAULT_DS_VAL_OPTIONS
from dataset_flyingchairs import FlyingChairsDataset
from dataset_mpisintel import MPISintelDataset
from model_pwcnet import ModelPWCNet, _DEFAULT_PWCNET_VAL_OPTIONS

## TODO: Set this first!

In [2]:
# TODO: You MUST set _DATASET_ROOT to the correct path on your machine!
if sys.platform.startswith("win"):
    _DATASET_ROOT = 'E:/datasets/'
else:
    _DATASET_ROOT = '/media/EDrive/datasets/'
_FLYINGCHAIRS_ROOT = _DATASET_ROOT + 'FlyingChairs_release'
_MPISINTEL_ROOT = _DATASET_ROOT + 'MPI-Sintel'
    
# TODO: Set device on which to perform the evaluation
gpu_devices = ['/device:GPU:0'] # We're doing the evaluation on a single GPU
controller = '/device:GPU:0'

# Model to eval
ckpt_path = './models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375'

## Eval on the `FlyingChairs` dataset

In [3]:
# We're doing the evaluation on the validation split of the dataset
mode = 'val'  

# Load the dataset in evaluation mode, starting with the default evaluation options
ds_opts = deepcopy(_DEFAULT_DS_VAL_OPTIONS)
ds = FlyingChairsDataset(mode=mode, ds_root=_FLYINGCHAIRS_ROOT, options=ds_opts)

# Configure the model for evaluation, starting with the default evaluation options
nn_opts = deepcopy(_DEFAULT_PWCNET_VAL_OPTIONS)
nn_opts['verbose'] = True
nn_opts['ckpt_path'] = ckpt_path
nn_opts['batch_size'] = 1  # Setting this to 1 leads to more accurate evaluations of the processing time
nn_opts['use_tf_data'] = False  # Don't use tf.data reader for this simple task
nn_opts['gpu_devices'] = gpu_devices
nn_opts['controller'] = controller  # Evaluate on CPU or GPU?

# We're evaluating the PWC-Net-small model in quarter-resolution mode
# That is, with a 6-level pyramid, and upsampling of level 2 by 4 in each dimension as the final flow prediction
nn_opts['use_dense_cx'] = False
nn_opts['use_res_cx'] = False
nn_opts['pyr_lvls'] = 6
nn_opts['flow_pred_lvl'] = 2

# Mixed precision fields
nn_opts['use_mixed_precision'] = True
nn_opts['x_dtype'] = tf.float16
nn_opts['y_dtype'] = tf.float32

# Instantiate the model in evaluation mode and display the model configuration
nn = ModelPWCNet(mode=mode, options=nn_opts, dataset=ds)

# Evaluate the model
avg_metric, avg_duration, _ = nn.eval(metric_name='EPE', save_preds=False)

Building model...
Instructions for updating:
`normal` is a deprecated alias for `truncated_normal`
... model built.
Loading model checkpoint ./models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375 for eval or testing...

INFO:tensorflow:Restoring parameters from ./models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375
... model loaded


Measuring EPE: 100%|##############################################| 640/640 [00:43<00:00, 14.76it/s]


In [4]:
print(f'Average EPE={avg_metric:.2f}, mean inference time={avg_duration*1000.:.2f}ms')

Average EPE=2.47, mean inference time=39.53ms
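
For readers unfamiliar with the metric, EPE (endpoint error) is the Euclidean distance between a predicted flow vector and the ground-truth flow vector, averaged over all pixels. The snippet below is a minimal NumPy sketch of that definition only; the function name `average_epe` is hypothetical, and it is not necessarily how `nn.eval()` computes the metric internally.

```python
import numpy as np

def average_epe(pred_flow, gt_flow):
    """Mean endpoint error between two flow fields of shape (H, W, 2)."""
    diff = pred_flow - gt_flow
    # Per-pixel Euclidean distance between the predicted and true (u, v) vectors
    per_pixel_epe = np.sqrt(np.sum(diff ** 2, axis=-1))
    return float(np.mean(per_pixel_epe))

# Toy example: every predicted vector is off by (3, 4), so each pixel's error is 5
gt = np.zeros((4, 4, 2), dtype=np.float32)
pred = gt.copy()
pred[..., 0] = 3.0
pred[..., 1] = 4.0
print(average_epe(pred, gt))  # 5.0
```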


## Eval on the 'clean' `MPI-Sintel` dataset

In [5]:
# We're using the entire dataset for evaluation
mode = 'val_notrain'  

# Load the dataset in evaluation mode, starting with the default evaluation options
ds_opts = deepcopy(_DEFAULT_DS_VAL_OPTIONS)
ds_opts['type'] = 'clean'
ds = MPISintelDataset(mode=mode, ds_root=_MPISINTEL_ROOT, options=ds_opts)

# Configure the model for evaluation, starting with the default evaluation options
nn_opts = deepcopy(_DEFAULT_PWCNET_VAL_OPTIONS)
nn_opts['verbose'] = True
nn_opts['ckpt_path'] = ckpt_path
nn_opts['batch_size'] = 1  # Setting this to 1 leads to more accurate evaluations of the processing time
nn_opts['use_tf_data'] = False  # Don't use tf.data reader for this simple task
nn_opts['gpu_devices'] = gpu_devices
nn_opts['controller'] = controller  # Evaluate on CPU or GPU?

# We're evaluating the PWC-Net-small model in quarter-resolution mode
# That is, with a 6-level pyramid, and upsampling of level 2 by 4 in each dimension as the final flow prediction
nn_opts['use_dense_cx'] = False
nn_opts['use_res_cx'] = False
nn_opts['pyr_lvls'] = 6
nn_opts['flow_pred_lvl'] = 2

# Mixed precision fields
nn_opts['use_mixed_precision'] = True
nn_opts['x_dtype'] = tf.float16
nn_opts['y_dtype'] = tf.float32

# The images in this dataset are 436x1024, and 436 is not a multiple of 64, while the model generates flows padded to
# multiples of 64. Hence, we need to crop the predicted flows to their original size
nn_opts['adapt_info'] = (1, 436, 1024, 2)

# Instantiate the model in evaluation mode and display the model configuration
nn = ModelPWCNet(mode=mode, options=nn_opts, dataset=ds)

# Evaluate the model
avg_metric, avg_duration, _ = nn.eval(metric_name='EPE', save_preds=False)

Building model...
... model built.
Loading model checkpoint ./models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375 for eval or testing...

INFO:tensorflow:Restoring parameters from ./models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375
... model loaded


Measuring EPE: 100%|############################################| 1041/1041 [02:46<00:00,  6.25it/s]


In [6]:
print(f'Average EPE={avg_metric:.2f}, mean inference time={avg_duration*1000.:.2f}ms')

Average EPE=3.77, mean inference time=55.16ms
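
The `adapt_info` setting above tells the model to crop its padded predictions back to 436x1024. The arithmetic can be illustrated with a small NumPy sketch. The helpers `pad_to_multiple` and `crop_to_original` are hypothetical, and padding on the bottom and right edges is an assumption; the actual model code may pad differently.

```python
import numpy as np

def pad_to_multiple(img, multiple=64):
    """Zero-pad an (H, W, C) array so that H and W become multiples of `multiple`."""
    h, w = img.shape[:2]
    pad_h = (-h) % multiple  # rows to add (0 if already a multiple)
    pad_w = (-w) % multiple  # columns to add (0 if already a multiple)
    # Assumption: pad on the bottom and right edges only
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='constant')
    return padded, (h, w)

def crop_to_original(flow, original_hw):
    """Crop an (H, W, 2) flow back to the unpadded image size."""
    h, w = original_hw
    return flow[:h, :w, :]

frame = np.zeros((436, 1024, 3), dtype=np.float32)  # MPI-Sintel frame size
padded, original_hw = pad_to_multiple(frame)
print(padded.shape)  # (448, 1024, 3)

# Stand-in for a flow predicted at the padded resolution
padded_flow = np.zeros(padded.shape[:2] + (2,), dtype=np.float32)
print(crop_to_original(padded_flow, original_hw).shape)  # (436, 1024, 2)
```

The height needs 12 extra rows (436 to 448), while the width of 1024 is already a multiple of 64 and is left unchanged.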


## Eval on the 'final' `MPI-Sintel` dataset

In [7]:
# We're using the entire dataset for evaluation
mode = 'val_notrain'

# Load the dataset in evaluation mode, starting with the default evaluation options
ds_opts = deepcopy(_DEFAULT_DS_VAL_OPTIONS)
ds_opts['type'] = 'final'
ds = MPISintelDataset(mode=mode, ds_root=_MPISINTEL_ROOT, options=ds_opts)

# Configure the model for evaluation, starting with the default evaluation options
nn_opts = deepcopy(_DEFAULT_PWCNET_VAL_OPTIONS)
nn_opts['verbose'] = True
nn_opts['ckpt_path'] = ckpt_path
nn_opts['batch_size'] = 1               # Setting this to 1 leads to more accurate evaluations of the processing time
nn_opts['use_tf_data'] = False          # Don't use tf.data reader for this simple task
nn_opts['gpu_devices'] = gpu_devices
nn_opts['controller'] = controller      # Evaluate on CPU or GPU?

# We're evaluating the PWC-Net-small model in quarter-resolution mode
# That is, with a 6-level pyramid, and upsampling of level 2 by 4 in each dimension as the final flow prediction
nn_opts['use_dense_cx'] = False
nn_opts['use_res_cx'] = False
nn_opts['pyr_lvls'] = 6
nn_opts['flow_pred_lvl'] = 2

# Mixed precision fields
nn_opts['use_mixed_precision'] = True
nn_opts['x_dtype'] = tf.float16
nn_opts['y_dtype'] = tf.float32

# The images in this dataset are 436x1024, and 436 is not a multiple of 64, while the model generates flows padded to
# multiples of 64. Hence, we need to crop the predicted flows to their original size
nn_opts['adapt_info'] = (1, 436, 1024, 2)

# Instantiate the model in evaluation mode and display the model configuration
nn = ModelPWCNet(mode=mode, options=nn_opts, dataset=ds)

# Evaluate the model
avg_metric, avg_duration, _ = nn.eval(metric_name='EPE', save_preds=False)

Building model...
... model built.
Loading model checkpoint ./models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375 for eval or testing...

INFO:tensorflow:Restoring parameters from ./models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375
... model loaded


Measuring EPE: 100%|############################################| 1041/1041 [02:45<00:00,  6.28it/s]


In [8]:
print(f'Average EPE={avg_metric:.2f}, mean inference time={avg_duration*1000.:.2f}ms')

Average EPE=4.90, mean inference time=55.09ms
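
The three evaluation cells repeat the same model options and differ only in the dataset and in `adapt_info`. If you adapt this notebook, one possible way to cut the duplication is a small helper such as the one sketched below. `make_eval_options` is a hypothetical name, not part of the repository; the option keys are the ones used in the cells above. The data types are passed in as arguments so the sketch runs without TensorFlow; in the notebook you would pass `tf.float16` and `tf.float32`, and `deepcopy(_DEFAULT_PWCNET_VAL_OPTIONS)` as the base.

```python
from copy import deepcopy

def make_eval_options(base_options, ckpt_path, gpu_devices, controller,
                      x_dtype, y_dtype, adapt_info=None):
    """Build the PWC-Net-small mixed-precision eval options used in this notebook."""
    opts = deepcopy(base_options)
    opts.update({
        'verbose': True,
        'ckpt_path': ckpt_path,
        'batch_size': 1,          # Batch size 1 gives more accurate timing
        'use_tf_data': False,
        'gpu_devices': gpu_devices,
        'controller': controller,
        # PWC-Net-small, 6-level pyramid, flow predicted at level 2
        'use_dense_cx': False,
        'use_res_cx': False,
        'pyr_lvls': 6,
        'flow_pred_lvl': 2,
        # Mixed precision fields
        'use_mixed_precision': True,
        'x_dtype': x_dtype,
        'y_dtype': y_dtype,
    })
    if adapt_info is not None:
        # Only needed when predictions must be cropped back to the image size
        opts['adapt_info'] = adapt_info
    return opts

# Stand-in values (plain dict and strings) so the sketch runs on its own
base = {'verbose': False}
ckpt = './models/pwcnet-sm-6-2-cyclic-chairsthingsmix-fp16/pwcnet.ckpt-41375'
chairs_opts = make_eval_options(base, ckpt, ['/device:GPU:0'], '/device:GPU:0',
                                'float16', 'float32')
sintel_opts = make_eval_options(base, ckpt, ['/device:GPU:0'], '/device:GPU:0',
                                'float16', 'float32', adapt_info=(1, 436, 1024, 2))
print('adapt_info' in chairs_opts, sintel_opts['adapt_info'])
```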
