<h1>Comparative analysis on different versions of the "Stere-omics" algorithm</h1>
<h2>Introduction</h2>
<p>Following the constant weight (CW) approach, the next step was to implement an adaptive support weight approach. The technique selected is called bilateral support weights. The experience gained with it is discussed in this notebook in a comparative manner.</p>
<h2>Abstract</h2>
<p>Two variants of bilateral weights were introduced to the pipeline as part of the cost function. These are referred to as the &ldquo;summed&rdquo; and &ldquo;product&rdquo; weight variants. As the names suggest, the summed variant adds the intensity and spatial weights together, whereas the product variant multiplies the two components of the bilateral weights. The improvement achieved with respect to the previous increment, where CW windows were used, was 19.2%. The best percentage of bad pixels achieved (with a threshold of 1, which is bad4 here) was 26.8% after introducing a truncation operation to the cost function. Horizontal structural noise and streaking artefacts were observed upon visual analysis.</p>
<h2>Relevant theory</h2>
<p>Countless techniques have been proposed for cost aggregation in the stereo matching pipeline in order to improve matching accuracy. Perhaps the first and simplest is what Fang termed Constant Window Aggregation (Fang <em>et al.</em>, 2012). This entails aggregating the cost within a window (a matrix of adjacent pixels around the pixel whose cost is being computed). Using Fang&rsquo;s notation, this can be generalised as</p>
<table width="100%">
<tbody>
<tr>
<td width="7%">
<p>&nbsp;</p>
</td>
<td width="86%">$$C_{\text{CW}}\left( x,y,d \right) = \sum_{\forall\left( x^{'},y^{'} \right)\mathcal{\in N}\left( x,y \right)}^{}{C\left( x^{'},y^{'},d \right)}$$</td>
<td width="7%">
<p>(1)</p>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<p>where $\mathcal{N}(x, y)$ is the set of pixels (represented by their x and y coordinates) within a support window around a pixel P(x,y), each of them having a constant weight, that is, an equal say in the outcome. This approach was employed in this project&rsquo;s previously conducted experiment (ALG_005_EXP_001), resulting in a 6-9% accuracy improvement compared to the results reported by Madeo et al. (2016) for the scenes Teddy and Cones of the Middlebury 2003 dataset. A more challenging dataset, Middlebury 2014, was tested as well (<a href="ALG_005_EXP_003-PatchMatch_2014-VIS.ipynb">ALG_005_EXP_003-PatchMatch_2014-VIS</a>), achieving an overall 46% bad pixel ratio (with a threshold of 1 pixel difference).</p>
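<p>As an illustration of equation (1), the following is a minimal NumPy sketch of constant window aggregation over a cost volume. The function name, the (H, W, D) layout of the cost volume and the zero padding at the image borders are assumptions made for this example and are not taken from the project&rsquo;s code.</p>

```python
import numpy as np


def aggregate_constant_window(cost, win_h=3, win_w=3):
    """Sum cost[y, x, d] over a win_h x win_w window centred on each pixel.

    cost is assumed to have shape (H, W, D). Window sizes are assumed odd,
    and pixels outside the image contribute zero cost (zero padding).
    """
    pad_y, pad_x = win_h // 2, win_w // 2
    padded = np.pad(cost, ((pad_y, pad_y), (pad_x, pad_x), (0, 0)))
    height, width = cost.shape[:2]
    out = np.zeros_like(cost, dtype=float)
    # Every offset inside the window contributes with the same weight.
    for dy in range(win_h):
        for dx in range(win_w):
            out += padded[dy:dy + height, dx:dx + width, :]
    return out


# Toy cost volume: 4 x 5 pixels, 2 disparity levels, unit cost everywhere.
cost_volume = np.ones((4, 5, 2))
aggregated = aggregate_constant_window(cost_volume, 3, 3)
```

<p>With unit costs, each aggregated value simply counts how many window pixels fall inside the image, which makes the equal weighting easy to see.</p>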
<p>According to Fang <em>et al.</em> (2012), after CW approaches the proposed solutions branched into two categories: Cross-Based Aggregation (CROSS) and Adaptive Weight Aggregation (AW). While both ideas seemed interesting to consider for implementation, later work (Hamzah and Ibrahim, 2016) suggested that Adaptive Support Weights (ASW, which is equivalent to AW in Fang&rsquo;s terminology) would perform best (Chen, Ardabilian and Chen, 2015). Therefore, it was decided that an ASW method would be implemented.</p>
<p>Here, similarly to CW, cost is aggregated for both a reference and a target image pixel over a set of neighbouring pixels, but the weights are calculated dynamically. A variety of methods have been proposed to calculate these dynamic weights (Hosni, Bleyer and Gelautz, 2013). One main characteristic that helps differentiate between these strategies is whether they are performed in a symmetric or an asymmetric manner. While the symmetric strategy calculates weights for both windows (left and right), the asymmetric one is only concerned with the left image window&rsquo;s weights. However, Hosni et al. conclude that there was no significant or conclusive difference in accuracy between the two strategies in their comparison.</p>
<p>Perhaps one of the most advanced techniques for calculating adaptive weights is trilateral filtering (Chen, Ardabilian and Chen, 2015). Chen et al.&rsquo;s ASW technique was based on three main rules: colour, spatial distance and boundary. According to the colour rule, if the centre pixel and a pixel in the support window around it have similar intensities, the probability of them having the same disparity is high, therefore this pixel&rsquo;s weight in the window should be high. According to the spatial distance rule, the greater the distance of a pixel from the centre pixel, the lower the probability of them having the same disparity. The proposed boundary rule, which was also the novelty in their solution, states that if there is a boundary (an edge, i.e. a discontinuity) between two pixels, then the probability of them having the same disparity is smaller. The proposed weight function was</p>
<table width="100%">
<tbody>
<tr>
<td width="7%">
<p>&nbsp;</p>
</td>
<td width="86%">$$
w_{\text{tf}}(p,q) = e^{- \frac{\mathrm{\Delta}C_{\text{pq}}}{\gamma_{c}}} + \ e^{- \frac{\mathrm{\Delta}S_{\text{pq}}}{\gamma_{s}}} + \ \sqrt{e^{- \frac{\mathrm{\Delta}C_{\text{pq}}}{\gamma_{c}}} + \ e^{- \frac{\mathrm{\Delta}S_{\text{pq}}}{\gamma_{s}}} + \ e^{- \frac{\mathrm{\Delta}E_{\text{pq}}}{\gamma_{E}}}}\ 
$$,</td>
<td width="7%">
<p>(2)</p>
</td>
</tr>
</tbody>
</table>

<p>where q is a pixel in the support window centred around pixel p. $\Delta C_{\text{pq}}$ is calculated as the Euclidean distance between the pixel intensities of p and q, $\Delta S_{\text{pq}}$ is the geometric Euclidean distance between the x,y coordinates of p and q, while $\Delta E_{\text{pq}}$ is a local energy model responsible for enforcing the boundary rule. The &ldquo;$\gamma$&rdquo; terms are arbitrary parameters used to adjust the strength of the respective constraints. They proposed that the weight terms be used in an asymmetric manner and as the multiplier of the calculated cost at pixel q (see equation 3 below).</p>
<table width="100%">
<tbody>
<tr>
<td width="7%">
<p>&nbsp;</p>
</td>
<td width="86%">
    $$C_{d}^{A}(p) = \sum_{q \in w_{p}}^{}{w\left( p,\ q \right) \bullet C_{d}(q)}$$
    </td>
<td width="7%">
<p>(3)</p>
</td>
</tr>
</tbody>
</table>
<p>However, as a first step towards adaptive weights in this project, this was deviated from in three regards. First, only the spatial and colour terms were used as part of the cost aggregation function. Secondly, instead of using the support weight term w(p,q) as the multiplier of the cost matrix within a support window, it was made part of the cost calculation. This decision was made in order to store intermediary weight results and avoid recalculating them, thus speeding up stereo correspondence. Thirdly, instead of the sum of the two weight terms, their product was used, as suggested by De-Maeztu, Villanueva and Cabeza (2011). In initial trials this proved to be more stable when attempting to find optimal parameters for this pipeline. This could be formulated as</p>
<table width="100%">
<tbody>
<tr>
<td width="7%">
<p>&nbsp;</p>
</td>
<td width="86%">
$$C\left( p,d \right) = \sum_{q \in w_{p},\ \ \ r \in w_{d}}^{}{\left| \left( e^{- \frac{\mathrm{\Delta}C_{\text{pq}}}{\gamma_{c}}}*\ e^{- \frac{\mathrm{\Delta}S_{\text{pq}}}{\gamma_{s}}} \right) \bullet I_{\text{pq}} - \ \left( e^{- \frac{\mathrm{\Delta}C_{\text{dr}}}{\gamma_{c}}}*\ e^{- \frac{\mathrm{\Delta}S_{\text{dr}}}{\gamma_{s}}} \right) \bullet I_{\text{dr}} \right|\text{ }}$$,
    </td>
<td width="7%">
<p>(4)</p>
</td>
</tr>
</tbody>
</table>
<p>where w<sub>p</sub> and w<sub>d</sub> denote the support windows around the left image&rsquo;s pixel p and the right image&rsquo;s pixel d, q and r denote pixel coordinates in the respective support windows, and I stands for pixel intensity.</p>
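<p>The cost in equation (4) can be sketched for a single pair of grayscale support windows as follows. This is an illustrative sketch rather than the project&rsquo;s implementation: the function names and the <code>combine</code> switch are hypothetical, the intensity distance is taken as the absolute grayscale difference to the centre pixel, and the parameter values are only examples in the range listed for the experiments.</p>

```python
import numpy as np


def bilateral_weights(window, gamma_c, gamma_s, combine="product"):
    """Bilateral weights of one grayscale window relative to its centre pixel."""
    height, width = window.shape
    cy, cx = height // 2, width // 2
    ys, xs = np.mgrid[0:height, 0:width]
    # Intensity distance to the centre pixel (Euclidean distance in 1-D).
    delta_c = np.abs(window.astype(float) - float(window[cy, cx]))
    # Geometric Euclidean distance to the centre pixel.
    delta_s = np.hypot(ys - cy, xs - cx)
    w_c = np.exp(-delta_c / gamma_c)
    w_s = np.exp(-delta_s / gamma_s)
    # "product" multiplies the two terms, anything else adds them.
    return w_c * w_s if combine == "product" else w_c + w_s


def weighted_window_cost(left_win, right_win, gamma_c, gamma_s, combine="product"):
    """Sum of absolute differences between the weighted window intensities."""
    w_left = bilateral_weights(left_win, gamma_c, gamma_s, combine)
    w_right = bilateral_weights(right_win, gamma_c, gamma_s, combine)
    return float(np.sum(np.abs(w_left * left_win - w_right * right_win)))


# Two 3x3 windows that differ only in one corner pixel.
left = np.array([[10, 10, 10], [10, 10, 10], [10, 10, 200]], dtype=float)
right = np.full((3, 3), 10.0)

cost_product = weighted_window_cost(left, right, gamma_c=8, gamma_s=90)
cost_summed = weighted_window_cost(left, right, gamma_c=5, gamma_s=1, combine="summed")
```

<p>Because the weights depend only on a window&rsquo;s own pixels, they could be computed once per pixel and reused for every disparity candidate, which is the caching motivation mentioned above.</p>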
<p>A variant where the bilateral terms were added together ($e^{- \frac{\mathrm{\Delta}C_{\text{pq}}}{\gamma_{c}}} + \ e^{- \frac{\mathrm{\Delta}S_{\text{pq}}}{\gamma_{s}}}$) was tested as well. For a more detailed discussion of bilateral terms, please see the sources cited at the beginning of this section. The final version of the algorithm discussed in this report included one last alteration: truncation of the calculated cost, which was conducted as</p>
<table width="100%">
<tbody>
<tr>
<td width="86%">
$$C'\left( p,d \right) = \ min(C\left( p,d \right),\ \gamma)$$</td>
<td width="7%">
<p>(5)</p>
</td>
</tr>
</tbody>
</table>
<p>where C&rsquo; is the minimum of a user-defined term $\gamma$ and the cost whose calculation was described above (equation 4). This truncation operation was suggested by multiple sources for different cost functions, therefore implementing it was thought to be justified (De-Maeztu, Villanueva and Cabeza, 2011; Chen, Ardabilian and Chen, 2013, 2015; Hosni, Bleyer and Gelautz, 2013).</p>
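<p>Equation (5) amounts to an element-wise minimum. A minimal sketch, with a hypothetical function name and an illustrative threshold value:</p>

```python
import numpy as np


def truncate_cost(cost, gamma):
    """Clamp every cost value to at most gamma (equation 5)."""
    return np.minimum(cost, gamma)


# Illustrative values only: costs above the threshold are capped.
costs = np.array([5.0, 30.0, 120.0])
truncated = truncate_cost(costs, 30.0)
```

<p>Capping the cost limits the influence that a single badly mismatched pixel can have on the total.</p>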
<h2>Methodology</h2>
<p>First, in order to mitigate the effects of poor parameter configuration and reduce the time needed to demonstrate relevant results, the three algorithm variants were tested on the Middlebury 2003 dataset&rsquo;s quarter resolution scenes in grayscale mode. The methodology was similar to the one followed for <a href="./ALG_005_EXP_001-VIS.ipynb">ALG_005_EXP_001</a>. Here, however, a range of parameter values was first tested with a relatively big step (20-50). Then, the ranges with the best results were tested again with smaller steps (5-10). This way, local minima or locations near local minima were thought to have been found. Notebook versions of these experiments can be found at <a href="ALG_006_EXP_001-Bilateral_product_2003.ipynb">ALG_006_EXP_001</a>, <a href="ALG_006_EXP_002-Bilateral_summed_2003.ipynb">ALG_006_EXP_002</a>, <a href="./ALG_006_EXP_006-Bilateral_sum_truncated_2014-VIS.ipynb">ALG_006_EXP_006</a>. Separate written analyses were not created for these experiments.</p>
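<p>The two-pass search described above can be sketched for a single integer parameter as follows. The function name and the toy scoring function are hypothetical; in the actual experiments the score would be a bad-pixel percentage obtained by running the full pipeline.</p>

```python
def coarse_to_fine_search(score, low, high, coarse_step, fine_step):
    """Return the value in [low, high] with the lowest score after two passes."""
    # Pass 1: sample the whole range with a big step.
    best_coarse = min(range(low, high + 1, coarse_step), key=score)
    # Pass 2: resample around the best coarse value with a small step.
    fine_low = max(low, best_coarse - coarse_step)
    fine_high = min(high, best_coarse + coarse_step)
    return min(range(fine_low, fine_high + 1, fine_step), key=score)


def toy_score(match):
    """Stand-in for an error metric with its minimum at match = 37."""
    return (match - 37) ** 2


best_match = coarse_to_fine_search(toy_score, low=10, high=110,
                                   coarse_step=20, fine_step=5)
```

<p>Like the manual procedure, this only finds a value near a local minimum: anything between two fine samples is never evaluated.</p>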
<p>Informed numerically by these three initial experiments, the Middlebury 2014 dataset&rsquo;s training images were then tested at quarter resolution in grayscale for a set of parameters thought to be near the local minima. For the list of parameters please see Table 1.</p>
<p>&nbsp;</p>
<table width="601">
<tbody>
<tr>
<td width="145">
<p>Algorithm variant</p>
</td>
<td width="123">
<p>Alias</p>
</td>
<td width="156">
<p>Support window dimensions</p>
</td>
<td width="67">
<p>Match values</p>
</td>
<td width="67">
<p>&gamma;<sub>c</sub></p>
</td>
<td width="43">
<p>&gamma;<sub>s</sub></p>
</td>
</tr>
<tr>
<td width="145">
<p>bilateral-summed weights*</p>
</td>
<td width="123">
<p>trunc_plusblg</p>
</td>
<td width="156">
<p>5x7</p>
</td>
<td width="67">
<p>30, 35</p>
</td>
<td width="67">
<p>1</p>
</td>
<td width="43">
<p>5</p>
</td>
</tr>
<tr>
<td width="145">
<p>bilateral-product-weights</p>
</td>
<td width="123">
<p>blg</p>
</td>
<td width="156">
<p>3x3, 3x5, 5x7, 7x3</p>
</td>
<td width="67">
<p>30, 40, 50</p>
</td>
<td width="67">
<p>8, 10, 12</p>
</td>
<td width="43">
<p>90</p>
</td>
</tr>
<tr>
<td width="145">
<p>bilateral-summed weights</p>
</td>
<td width="123">
<p>plusblg</p>
</td>
<td width="156">
<p>3x3, 3x5, 5x7, 7x3</p>
</td>
<td width="67">
<p>30, 40, 45, 55</p>
</td>
<td width="67">
<p>1-7</p>
</td>
<td width="43">
<p>1,2</p>
</td>
</tr>
<tr>
<td width="145">
<p>constant window</p>
</td>
<td width="123">
<p>bm</p>
</td>
<td width="156">
<p>1x1, 1x3, 1x5, 1x7, 3x1, 3x3, 3x5, 3x7, 5x1, 5x3, 7x1, 7x3, 7x5, 9x3, 9x5, 9x7, 11x3, 11x5, 13x3, 13x5, 13x7, 15x3, 15x5, 15x7, 15x9</p>
</td>
<td width="67">
<p>10-110 with a step of 20</p>
</td>
<td width="67">
<p>-</p>
</td>
<td width="43">
<p>-</p>
</td>
</tr>
</tbody>
</table>
<p>* This variant was only tested with a limited set of parameters informed by the experimental outcomes of the other two bilateral variants. As it was near the end of the project, final tweaks were made by changing the gap and egap values, which had remained relatively constant throughout the project. The values tested were: -4, -7, -8, -9, -11, -12, -13 for gap and -1, -3, -5 for egap.</p>
<p>Table 1: Parameter list for the last three conducted experiments (ALG_005_EXP_003, ALG_006_EXP_001, ALG_006_EXP_003-Bilateral_product_2014-VIS, ALG_006_EXP_004-Bilateral_summed_2014-VIS)</p>
<h2>Discussion and results</h2>
<p>Compared to the results logged during the experiment on constant weight (CW) windows (ALG_005_EXP_001), there was a 17% increase in accuracy overall (best CW: 46.01%, best bilateral: 29.4%) when analysing bad4 results. Introducing the truncation operation yielded an additional 2.5% reduction in erroneous pixels overall.</p>
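<p>The improvements quoted here are differences between bad-pixel percentages, i.e. percentage points. The short sketch below shows the arithmetic using the mean bad4 ratios from the tabulated output further down in this notebook; since it works from the unrounded table values, its results may differ slightly from the rounded figures quoted in the text. The dictionary keys are shortened aliases chosen for this example.</p>

```python
# Mean bad4 ratios copied from the tabulated results in this notebook.
best_bad4 = {
    "constant_window": 0.460599,    # bm_30_9x3
    "summed": 0.29496,              # plusblg_45_5x7...
    "summed_truncated": 0.268293,   # trunc_plusblg_30_5x7...
}


def percentage_point_gain(baseline, improved):
    """Difference between two bad-pixel ratios, in percentage points."""
    return round((baseline - improved) * 100, 1)


cw_to_bilateral = percentage_point_gain(best_bad4["constant_window"],
                                        best_bad4["summed"])
truncation_gain = percentage_point_gain(best_bad4["summed"],
                                        best_bad4["summed_truncated"])
total_gain = percentage_point_gain(best_bad4["constant_window"],
                                   best_bad4["summed_truncated"])
```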
<p>At first, no significant difference in accuracy was thought to exist between the summed and product weight term methods; the difference between their best performances (26.8% and 29.4%) was hypothesized to be due to more favourable parameter selection. However, this assumption was not confirmed analytically.</p>
<p>On the other hand, when analysing how each variant performed on individual scenes, two scenes, Jadeplant and Vintage, stood out with rounded differences of 26% and 24% respectively. This suggested that the summed term performed better, although the product term was not tested with gap and egap values other than the default ones used throughout this project. Therefore, the experiment was deemed inconclusive in this regard.</p>
<p>The greatest improvements in accuracy between the constant and bilateral weighted methods were achieved on the scenes Adirondack (29%), ArtL (27%), Jadeplant (32%), MotorcycleE (61%), PianoL (26%) and Vintage (29%). This suggested that applying bilateral weights (especially the &ldquo;plus termed&rdquo; version) made the algorithm more robust to lighting and exposure variations within a scene. Additionally, scenes with a large number of discontinuities (Jadeplant) improved considerably as well. This was an interesting result since this pipeline only considered pixel intensities, not gradients, which are thought to be more invariant when lighting and exposure conditions change (De-Maeztu, Villanueva and Cabeza, 2011). The smallest improvements were observed for the scenes Playtable and PlaytableP. The reason behind this was not confirmed, though it was thought to be either the high amount of noise present in these scenes (streaking artefacts were observable) or the mainly diagonal orientation of edges within these scenes, which the algorithm was not able to interpret properly. The streaking artefacts (horizontal structural noise (Boyat and Joshi, 2015)) were observed throughout the dataset; they have been characterised as a symptomatic artefact of DP approaches (Scharstein and Szeliski, 2002).</p>
<p>Near the right edges of the images, especially near discontinuities, a contradictory phenomenon was observed. Both streaking artefacts and occluded regions (black pixels) affected the matching accuracy negatively. While the former was previously identified as a result of an ill-configured match value, the occlusions indicated the opposite effect (a match value that was too low). The reason for this was unknown.</p>
<p>The truncation values worked best when they equalled the match value.</p>
<h2>Conclusion</h2>
<p>The best results overall were achieved with the plus termed bilateral weighted variant after the application of truncation. This resulted in an additional 19% improvement in accuracy measured by bad1. The greatest improvements benchmarked were achieved on scenes with a high number of discontinuities and on scenes with lighting and exposure variations. Horizontal structural noise and many streaking artefacts were observed upon visual analysis. The reasons were not determined, though they were hypothesized to be noise and diagonal edges.</p>
<h2>Summative conclusion</h2>
<p>The application of bilateral support weights combined with a truncation operation yielded the best results both overall and individually. Compared to the results reported by Madeo et al. (2016) for quarter resolution images of the Middlebury 2003 scenes in grayscale mode in non-occluded regions, improvements of 13.14% and 11.86% were achieved as measured by the metric bad1. This outperformed said authors&rsquo; reported results in RGB as well (see Table 2 below).</p>
<p>&nbsp;</p>
<table width="444">
<tbody>
<tr>
<td width="274">
<p>&nbsp;Algorithm</p>
</td>
<td width="76">
<p>Teddy</p>
</td>
<td width="95">
<p>Cones</p>
</td>
</tr>
<tr>
<td width="274">
<p>Proposed</p>
</td>
<td width="76">
<p>11.27%</p>
</td>
<td width="95">
<p>7.18%</p>
</td>
</tr>
<tr>
<td width="274">
<p>Madeo et al. (2016)</p>
</td>
<td width="76">
<p>24.41%</p>
</td>
<td width="95">
<p>19.04%</p>
</td>
</tr>
<tr>
<td width="274">
<p>Madeo et al. (2016) -RGB</p>
</td>
<td width="76">
<p>13.9%</p>
</td>
<td width="95">
<p>9.6%</p>
</td>
</tr>
<tr>
<td width="274">
<p>Improvement w.r.t. Madeo et al. (2016) grayscale</p>
</td>
<td width="76">
<p>13.14%</p>
</td>
<td width="95">
<p>11.86%</p>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<p>Table 2: Comparison chart to the results reported by Madeo et al. (2016) on grayscale images at quarter resolution in non-occluded regions of the scenes of the Middlebury 2003 dataset.</p>
<p>For the Middlebury 2014 dataset, an overall 27.7% increase in accuracy was measured by the metric bad1 compared to the baseline. However, after the last experiment, many streaking artefacts (horizontal structural noise), symptomatic of DP approaches, were still observable throughout the scenes. The hypothesized reason for this was noise present in the scenes.</p>
<p><img src="https://raw.githubusercontent.com/regorigregory/FYP_PUBLIC/master/git_assets/insert_graph.png" alt="Benchmarking result at Middlebury 2014 - quarter resolution, grayscale, non-occluded - bad1" width="1914" height="852" /></p>
<p>Figure 1: the performance of different versions of the algorithms developed in this project against the Middlebury 2014 dataset&rsquo;s training images at quarter resolution in grayscale mode. X-axis: Algorithm versions. Y-axis: bad1: percentage of bad pixels whose absolute difference to the ground truth is greater than one.</p>
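<p>The bad-pixel metric defined in the caption can be sketched as below. This is an illustrative version only: the function name and the optional <code>valid_mask</code> argument (for example, to restrict the evaluation to non-occluded pixels) are assumptions and do not reproduce the evaluation code used for the benchmarks.</p>

```python
import numpy as np


def bad_pixel_ratio(disparity, ground_truth, threshold=1.0, valid_mask=None):
    """Fraction of evaluated pixels whose absolute disparity error exceeds threshold."""
    error = np.abs(disparity.astype(float) - ground_truth.astype(float))
    if valid_mask is None:
        # Evaluate every pixel when no mask is supplied.
        valid_mask = np.ones(error.shape, dtype=bool)
    return float(np.mean(error[valid_mask] > threshold))


# Toy 2x2 example: absolute errors are 0, 2, 1 and 5 pixels.
disp = np.array([[10, 12], [20, 25]])
gt = np.array([[10, 10], [21, 20]])
ratio_all = bad_pixel_ratio(disp, gt, threshold=1.0)

# Excluding the bottom-right pixel, e.g. because it is occluded.
mask = np.array([[True, True], [True, False]])
ratio_masked = bad_pixel_ratio(disp, gt, threshold=1.0, valid_mask=mask)
```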
<p>With regard to exposure and lighting variations, although improvements were achieved, they were certainly not at a satisfactory level.</p>
<h1>References</h1>
<ol>
<li>Boyat, A. K. and Joshi, B. K. (2015) &lsquo;A Review Paper: Noise Models in Digital Image Processing&rsquo;, <em>Signal &amp; Image Processing: An International Journal</em>. Academy and Industry Research Collaboration Center (AIRCC), 6(2), pp. 63&ndash;75. doi: 10.5121/sipij.2015.6206.</li>
<li>Chen, D., Ardabilian, M. and Chen, L. (2013) &lsquo;A novel trilateral filter based adaptive support weight method for stereo matching&rsquo;, in <em>BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013</em>. British Machine Vision Association, BMVA. doi: 10.5244/C.27.96.</li>
<li>Chen, D., Ardabilian, M. and Chen, L. (2015) &lsquo;A fast trilateral filter-based adaptive support weight method for stereo matching&rsquo;, <em>IEEE Transactions on Circuits and Systems for Video Technology</em>. Institute of Electrical and Electronics Engineers Inc., 25(5), pp. 730&ndash;743. doi: 10.1109/TCSVT.2014.2361422.</li>
<li>De-Maeztu, L., Villanueva, A. and Cabeza, R. (2011) &lsquo;Stereo matching using gradient similarity and locally adaptive support-weight&rsquo;, <em>Pattern Recognition Letters</em>. North-Holland, 32(13), pp. 1643&ndash;1651. doi: 10.1016/j.patrec.2011.06.027.</li>
<li>Fang, J. <em>et al.</em> (2012) &lsquo;Accelerating cost aggregation for real-time stereo matching&rsquo;, in <em>Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS</em>, pp. 472&ndash;481. doi: 10.1109/ICPADS.2012.71.</li>
<li>Hamzah, R. A. and Ibrahim, H. (2016) &lsquo;Literature Survey on Stereo Vision Disparity Map Algorithms&rsquo;, <em>Journal of Sensors</em>, 2016, pp. 1&ndash;23. doi: 10.1155/2016/8742920.</li>
<li>Hosni, A., Bleyer, M. and Gelautz, M. (2013) &lsquo;Secrets of adaptive support weight techniques for local stereo matching&rsquo;, <em>Computer Vision and Image Understanding</em>. Academic Press, 117(6), pp. 620&ndash;632. doi: 10.1016/j.cviu.2013.01.007.</li>
<li>Madeo, S. <em>et al.</em> (2016) &lsquo;An optimized stereo vision implementation for embedded systems: application to RGB and infra-red images&rsquo;, <em>Journal of Real-Time Image Processing</em>, 12(4), pp. 725&ndash;746. doi: 10.1007/s11554-014-0461-7.</li>
<li>Scharstein, D. and Szeliski, R. (2002) &lsquo;A taxonomy and evaluation of dense two-frame stereo correspondence algorithms&rsquo;, <em>International Journal of Computer Vision</em>, 47(1&ndash;3), pp. 7&ndash;42. doi: 10.1023/A:1014573219977.</li>
</ol>

In [1]:
import pandas as pd
import ipywidgets as widgets
import numpy as np
import sys
import os

sys.path.append(os.path.join("..", ".."))
import glob
import cv2

import plotly.graph_objs as go
import plotly.express as px
from ipywidgets import HBox, VBox, Button

from components.utils import plotly_helpers as ph
from components.utils import utils as u

In [2]:

available_metrix = ['abs_error',
       'mse', 'avg', 'eucledian', 'bad1', 'bad2', 'bad4', 'bad8']

metrics_selector = widgets.Dropdown(
    options=[(m,m) for m in available_metrix],
    description='Metric:',
    value="bad4"
)


nonoccluded = widgets.Dropdown(
    options=[("Yes", False), ("No", True)],
    description='Nonoccluded:'
)


### Please select metrics and whether occlusions are counted as errors

In [3]:
VBox([metrics_selector, nonoccluded])

VBox(children=(Dropdown(description='Metric:', index=6, options=(('abs_error', 'abs_error'), ('mse', 'mse'), (…

### Loading the data and building the plots

In [4]:
# best_results.csv

selected_file = os.path.join("..","..", "benchmarking", "MiddEval", "custom_log", "best_results.csv")
#selected_file = "./fixed_csv2.csv"
df = ph.load_n_clean(selected_file)

##Filtering to selected occlusion parameter

df = df[df["are_occlusions_errors"]==nonoccluded.value]
df.sort_values(by=["scene", "match", "h", "w"], inplace=True)

In [5]:
### Dashboard 1


from ipywidgets import Image, Layout

img_widget = Image(value=df["loaded_imgs"].iloc[0], 
                   layout=Layout(height='375px', width='450px'))

fig_a = ph.get_figure_widget (df, "scene", metrics_selector.value, 
                           "Scene w.r.t."+metrics_selector.value)
fig_b = ph.get_figure_widget (df, "match", "kernel_size", "Kernel sizes w.r.t. match values")


figs = [fig_a, fig_b]
ph.bind_hover_function(figs, img_widget, df)
ph.bind_brush_function(figs, df)

button = ph.get_reset_brush_button(figs)
dashboard1 = VBox([button, fig_a,
                  HBox([img_widget, fig_b])])


### Dashboard 2

# sort_values returns a new frame, so the result has to be assigned back
df = df.sort_values(by=["experiment_id"])
traced_fig_1, dfs_1 = ph.get_figure_widget_traced(df, "scene", metrics_selector.value, "experiment_id")

traced_fig_widget_1 = go.FigureWidget(traced_fig_1)



traced_fig_1_imw_1 = Image(value=df["loaded_imgs"].iloc[0], 
                   layout=Layout(height='375px', width='450px'))
traced_fig_1_imw_2 = Image(value=df["loaded_gts"].iloc[0], 
                   layout=Layout(height='375px', width='450px'))

#figs, img_widget, selected_scene_df
ph.bind_hover_function2([traced_fig_widget_1], traced_fig_1_imw_1, dfs_1, img_widget_groundtruth=traced_fig_1_imw_2)


turn_the_lights_on = ph.get_dropdown_widget(["On", "Off"], label="Turn plots:", values = [True, False])

ph.bind_dropdown_switch_traces_fn(turn_the_lights_on, traced_fig_widget_1)

dashboard2 = VBox([turn_the_lights_on, traced_fig_widget_1, HBox([traced_fig_1_imw_1,traced_fig_1_imw_2])])


### Dashboard 3


traced_fig_2, dfs_2 = ph.get_figure_widget_traced(df, "experiment_id", metrics_selector.value, "scene")

traced_fig_widget_2 = go.FigureWidget(traced_fig_2)

traced_fig_2_imw_1 = Image(value=df["loaded_imgs"].iloc[0], 
                   layout=Layout(height='375px', width='450px'))
traced_fig_2_imw_2 = Image(value=df["loaded_gts"].iloc[0], 
                   layout=Layout(height='375px', width='450px'))



#figs, img_widget, selected_scene_df
ph.bind_hover_function2([traced_fig_widget_2], traced_fig_2_imw_1, dfs_2, img_widget_groundtruth=traced_fig_2_imw_2)

turn_the_lights_on_2 = ph.get_dropdown_widget(["On", "Off"], label="Turn plots:", values = [True, False])

ph.bind_dropdown_switch_traces_fn(turn_the_lights_on_2, traced_fig_widget_2)


dashboard3 = VBox([turn_the_lights_on_2, traced_fig_widget_2, HBox([traced_fig_2_imw_1,traced_fig_2_imw_2])])


### To aid interaction with the plots, the best results in a tabular form are displayed below

In [6]:
pd.pivot_table(df, index = ["experiment_id", "kernel_size", "match"], values = "bad4", aggfunc=np.mean).sort_values(by="bad4").head(5)

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,bad4
experiment_id,kernel_size,match,Unnamed: 3_level_1
trunc_plusblg_30_5x7no_infmge_30_-9_-1_gc_5_gs_1_a_30,5x7,30,0.268293
plusblg_45_5x7gc_5_gs_1_alph_0,5x7,45,0.29496
blg_40_5x7gc_8_gs_90_alph_0,5x7,40,0.336915
bm_30_9x3,9x3,30,0.460599
bm_30_1x1,1x1,30,0.54546


In [7]:
df.pivot_table(index = "experiment_id", columns="scene", values = "bad4", aggfunc=np.min).head(10)

scene,Adirondack,ArtL,Jadeplant,Motorcycle,MotorcycleE,Piano,PianoL,Pipes,Playroom,Playtable,PlaytableP,Recycle,Shelves,Teddy,Vintage
experiment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
blg_40_5x7gc_8_gs_90_alph_0,0.20303,0.232686,0.628386,0.152385,0.352644,0.352539,0.548609,0.269689,0.359628,0.36198,0.243838,0.224692,0.396556,0.114476,0.612593
bm_30_1x1,0.512256,0.619496,0.643227,0.305835,0.862832,0.499068,0.800975,0.405057,0.533133,0.483722,0.458109,0.4748,0.545383,0.294317,0.743693
bm_30_9x3,0.452485,0.483601,0.637492,0.177582,0.839589,0.437705,0.727591,0.3065,0.454554,0.369285,0.291468,0.396472,0.482954,0.188396,0.663315
plusblg_45_5x7gc_5_gs_1_alph_0,0.190174,0.245474,0.357256,0.152771,0.318755,0.344669,0.528892,0.262903,0.334247,0.359784,0.245759,0.204286,0.39917,0.104634,0.375619
trunc_plusblg_30_5x7no_infmge_30_-9_-1_gc_5_gs_1_a_30,0.162497,0.212571,0.313128,0.139842,0.227339,0.303526,0.467895,0.206119,0.300056,0.375146,0.250492,0.177198,0.413843,0.093185,0.381559


### Dashboard 1: Scene w.r.t. {metric} (selection plot)
<ol>
    <li>The following figure allows the "lasso" tool to be used for selection.</li>
    <li>As a result, the relevant data points and their corresponding values in the figure in the bottom right corner will be highlighted.</li>
    <li>Pressing the "clear selection" button will reset the figure.</li>
    <li>Additionally, if a data point is hovered over, the corresponding disparity output will be displayed in the bottom left corner.</li>

</ol>

In [8]:
dashboard1

VBox(children=(Button(description='clear selection', style=ButtonStyle()), FigureWidget({
    'data': [{'custo…

### Dashboard 2: Scenes w.r.t. {metric} with color coded "epochs"
An "epoch" in this context means an experiment with the same settings evaluated across every scene in the Middlebury 2014 training dataset.<br>
<ol>
    <li>The following figure allows all the plots to be turned on and off.</li>
    <li>Additionally, their visibility can also be controlled by interacting with their legend entries on the right side of the plot.
    </li>
    <li> Therefore custom comparison can be made between different scenes, kernel sizes and match values. </li>
    <li> The figure in the bottom left corner shows the corresponding disparity map. </li>
    <li> The figure in the bottom right corner shows the corresponding ground truth disparity map. </li>
</ol>

In [9]:
dashboard2

VBox(children=(Dropdown(description='Turn plots:', options=(('On', True), ('Off', False)), value=True), Figure…

### Dashboard 3: "Epoch" w.r.t. {metric} with color coded scenes
An "epoch" in this context means an experiment with the same settings evaluated across every scene in the Middlebury 2014 training dataset.<br>
<ol>
    <li>The following figure allows all the plots to be turned on and off.</li>
    <li>Additionally, their visibility can also be controlled by interacting with their legend entries on the right side of the plot.
    </li>
    <li> Therefore custom comparison can be made between different scenes, kernel sizes and match values. </li>
    <li> The figure in the bottom left corner shows the corresponding disparity map. </li>
    <li> The figure in the bottom right corner shows the corresponding ground truth disparity map. </li>
</ol>

In [10]:
dashboard3

VBox(children=(Dropdown(description='Turn plots:', options=(('On', True), ('Off', False)), value=True), Figure…