# Analysis of the Runtime

In this notebook, we analyze the performance of the different experiment-wares in terms of runtime.
More precisely, we compare the experiment-wares only on the time they take to complete their task (and thus, experiments in which the timeout is reached are considered unsuccessful, whatever their outcome).

## Imports

As usual, we start by importing the needed classes and functions from *Metrics-Wallet*.

In [None]:
from metrics.wallet import BasicAnalysis, DecisionAnalysis, LineType
from metrics.wallet import find_best_cpu_time_input

## Loading the data of the experiments

In a [dedicated notebook]({{ load_notebook }}.ipynb), we already read and preprocessed the data collected during our experiments.
We can now simply reload the cached `BasicAnalysis` to retrieve it.

In [None]:
basic_analysis = BasicAnalysis.import_from_file('.cache')

Since we now want to perform a more specific analysis, we need to create a `DecisionAnalysis` that will provide methods dedicated to the analysis of the runtime of experiments.

In [None]:
analysis = DecisionAnalysis(basic_analysis=basic_analysis)

## Virtual Best Experiment-Ware

The *Virtual Best Experiment-Ware* (or VBEW) is an experiment-ware that does not really exist.
Its runtime on a particular input is that of the fastest experiment-ware that was run on that input (even though one could define a VBEW based on other criteria).
If one had an oracle that selected the best experiment-ware for a particular input and then ran this experiment-ware on the input, the resulting runtime would be that of the VBEW.
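
To make this definition concrete, here is a minimal sketch in plain Python, independent of *Metrics-Wallet*.
The data, the names `cpu_times` and `virtual_best`, and the use of `None` for timeouts are assumptions made for illustration only, and do not reflect how the library computes the VBEW internally.

```python
# Hypothetical CPU times (in seconds), indexed by input and then by experiment-ware.
# None means that the experiment-ware reached the timeout on this input.
cpu_times = {
    'input-1': {'ware-a': 12.0, 'ware-b': 3.5},
    'input-2': {'ware-a': 40.0, 'ware-b': None},
    'input-3': {'ware-a': None, 'ware-b': None},
}

def virtual_best(times_by_ware):
    """Return the (experiment-ware, time) pair of the fastest successful run, or None."""
    solved = {ware: t for ware, t in times_by_ware.items() if t is not None}
    if not solved:
        return None  # No experiment-ware completed this input.
    best = min(solved, key=solved.get)
    return best, solved[best]

# The VBEW takes, on each input, the runtime of the fastest experiment-ware.
vbew = {name: virtual_best(times) for name, times in cpu_times.items()}
```

In this toy example, the VBEW inherits the runtime of `ware-b` on `input-1` and that of `ware-a` on `input-2`, and it fails on `input-3` since no experiment-ware completed it.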

In [None]:
analysis = analysis.add_virtual_experiment_ware(function=find_best_cpu_time_input, name='VBEW')

We can now compute the contribution of each experiment-ware to the VBEW.

In [None]:
analysis.remove_experiment_wares(['VBEW'])\
        .contribution_table(deltas=(1, 10))

Let us describe how to read this table.
In the first column, we can see for each experiment-ware the number of inputs for which the runtime of the experiment-ware is equal to that of the VBEW.
In the second column (resp. third column), we can see for each experiment-ware the number of inputs for which the experiment-ware is at least 1 second faster (resp. 10 seconds faster) than every other experiment-ware.
Finally, in the fourth column, we can see the number of inputs for which this experiment-ware is the only one to run until completion (and thus, all other experiment-wares reached the time limit on this input).
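
The way these columns are counted can be sketched in plain Python as follows.
This is only an illustration of the description above, not the implementation used by *Metrics-Wallet*: the data, the column names and the function `contributions` are hypothetical, at least two experiment-wares are assumed, and a run that reached the timeout is assumed to be infinitely slow.

```python
import math

# Hypothetical CPU times (in seconds); math.inf marks a run that reached the timeout.
cpu_times = {
    'input-1': {'ware-a': 12.0, 'ware-b': 3.5},
    'input-2': {'ware-a': 40.0, 'ware-b': math.inf},
    'input-3': {'ware-a': 7.0, 'ware-b': 7.4},
}

def contributions(cpu_times, deltas=(1, 10)):
    """Count, for each experiment-ware, the inputs matching each column of the table."""
    wares = sorted({ware for times in cpu_times.values() for ware in times})
    table = {
        ware: {'vbew': 0, **{f'delta {d}s': 0 for d in deltas}, 'alone': 0}
        for ware in wares
    }
    for times in cpu_times.values():
        best = min(times.values())
        if math.isinf(best):
            continue  # No experiment-ware completed this input.
        for ware, time in times.items():
            if time != best:
                continue
            row = table[ware]
            row['vbew'] += 1  # Same runtime as the VBEW.
            others = [t for w, t in times.items() if w != ware]
            gap = min(others) - time  # Lead over the second-fastest experiment-ware.
            for d in deltas:
                if gap >= d:
                    row[f'delta {d}s'] += 1
            if all(math.isinf(t) for t in others):
                row['alone'] += 1  # Only experiment-ware to run until completion.
    return table

table = contributions(cpu_times)
```

Here, `ware-a` matches the VBEW on two inputs, but it is at least 1 second faster than `ware-b` on only one of them (`input-2`, which it is also alone to complete).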

For a more visual representation of these contributions, we can represent the information provided in the table above with a bar plot.
This plot shows for each experiment-ware the number of inputs on which its runtime is equal to that of the VBEW.

In [None]:
analysis.marginal_contribution()

## Overview of the results

An overview of the results can easily be obtained using a so-called *cactus-plot*, a type of figure that is particularly popular in the SAT and CP communities.

In [None]:
analysis.cactus_plot(
    cactus_col='cpu_time',
    show_marker=False,

    title='Cactus-plot',
    x_axis_name='Number of solved inputs',
    y_axis_name='Time (s)',

    color_map={ 'VBEW': '#000000' },
    style_map={ 'VBEW': LineType.DASH_DOT },

    dynamic=False
)

On this plot, we can easily read for each experiment-ware the number of inputs on which it can run until completion within a certain time limit.
In particular, the further to the right the line of an experiment-ware extends, the more inputs it completes within the time limit, and thus the better it performs in general.
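
As an illustration of how such a line is built, the sketch below computes the points of a cactus-plot for a single experiment-ware in plain Python.
The data and the helper names are hypothetical, and the sketch assumes the usual construction in which the `k`-th point gives the runtime of the `k`-th fastest completed run.

```python
from bisect import bisect_right

# Hypothetical CPU times (in seconds) of one experiment-ware; None marks a timeout.
times = [3.5, 120.0, 0.8, None, 42.0, None]

def cactus_points(times):
    """Return the points (number of solved inputs, time) of the cactus-plot line."""
    solved = sorted(t for t in times if t is not None)
    return list(enumerate(solved, start=1))

def solved_within(times, limit):
    """Return the number of inputs completed within the given time limit."""
    solved = sorted(t for t in times if t is not None)
    return bisect_right(solved, limit)

points = cactus_points(times)
within_50s = solved_within(times, 50)
```

With these values, the line has four points, and reading it at 50 seconds shows that three inputs are completed within this limit.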

**TODO: ADD HERE AN INTERPRETATION FOR THIS CACTUS-PLOT!**


Another way to get an overview of the results is to use the *cumulative distribution function* (CDF), which may be seen as a cactus-plot in which the axes have been switched.

In [None]:
analysis.cdf_plot(
    cdf_col='cpu_time',
    show_marker=False,
    normalized=True,

    title='CDF',
    x_axis_name='Time (s)',
    y_axis_name='Percentage of solved inputs',

    color_map={ 'VBEW': '#000000' },
    style_map={ 'VBEW': LineType.DASH_DOT },

    dynamic=False
)

The interpretation of this plot is similar to that of a cactus-plot.
One of the advantages of this representation is that the order of the lines in the plot is the same as that of the legend, so that the best experiment-wares appear at the top.
Additionally, the CDF is a standard notion in statistics, whereas cactus-plots are not so meaningful outside the communities that use them.
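
For comparison, a normalized CDF for one experiment-ware could be computed as in the following sketch.
The data and the function `cdf_points` are hypothetical, and the sketch assumes that normalizing means dividing the number of solved inputs by the total number of inputs, timeouts included.

```python
# Hypothetical CPU times (in seconds) of one experiment-ware; None marks a timeout.
times = [3.5, 0.8, None, 42.0]

def cdf_points(times):
    """Return the points (time, fraction of inputs solved within this time)."""
    total = len(times)
    solved = sorted(t for t in times if t is not None)
    return [(t, rank / total) for rank, t in enumerate(solved, start=1)]

cdf = cdf_points(times)
```

Compared to the cactus-plot, the time is now on the x-axis, and the line never reaches 1 when some inputs are not solved.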

Speaking of statistics, box-plots can also be used to get an overview of the distribution of the runtime of the different experiment-wares.

In [None]:
analysis.box_plot(
    box_by='experiment_ware',
    box_col='cpu_time',

    title='Box-plots of the runtime',

    dynamic=False
)

**TODO: ADD HERE AN INTERPRETATION FOR THESE BOXPLOTS!**

## Numerical results

To get more information about the statistics of our experiments, let us refer to the following table.

In [None]:
analysis.stat_table(
    commas_for_number=True,
    dollars_for_number=True
)

Let us describe the content of this table, for each considered experiment-ware:

- The column `count` is the number of inputs solved by the experiment-ware.
- The column `sum` is the time taken by the experiment-ware to run on all inputs (including timeouts).
- The columns `PARx` are equivalent to `sum` but add a penalty of `x` times the timeout to failed experiments (*PAR* stands for *Penalized Average Runtime*).
- The column `common count` is the number of inputs commonly solved by all experiment-wares.
- The column `common sum` is the time taken by the (current) experiment-ware to solve the commonly solved inputs.
- The column `uncommon count` is the number of inputs solved by the experiment-ware without considering common ones (which are considered as *easy*).
- The column `total` is the total number of experiments run with the experiment-ware.
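
The `count` and `PARx` columns can be illustrated with the sketch below, written in plain Python.
The data, the timeout value and the function `par` are hypothetical, and the sketch assumes that a failed experiment is counted as `x` times the timeout (so that `PAR1` corresponds to `sum`); the exact convention used by *Metrics-Wallet* may differ.

```python
# Hypothetical CPU times (in seconds) of one experiment-ware; None marks a timeout.
TIMEOUT = 100.0
times = [3.5, None, 42.0, None]

def par(times, timeout, x):
    """Sum of the runtimes, counting each failed experiment as x times the timeout."""
    return sum(x * timeout if t is None else t for t in times)

count = sum(t is not None for t in times)  # Number of solved inputs.
par1 = par(times, TIMEOUT, 1)
par2 = par(times, TIMEOUT, 2)
par10 = par(times, TIMEOUT, 10)
```

The larger `x` is, the more the score is dominated by the number of failed experiments rather than by the runtime of the successful ones.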

**TODO: ADD AN INTERPRETATION OF THE RESULTS IN THE TABLE ABOVE.**

Let us now consider another table, which provides for each input and for each experiment-ware the information relative to a particular variable in the analysis, for instance the `cpu_time` of the experiment-ware on an input.

In [None]:
analysis.pivot_table(
    index='input',
    columns='experiment_ware',
    values='cpu_time',
    commas_for_number=True,
    dollars_for_number=True
)#.head()

**TODO: ADD HERE AN INTERPRETATION OF THE TABLE ABOVE.**

## Pairwise comparisons

Now that we have an overview of the results, we can make a pairwise comparison of the experiment-wares, to have a closer look at their behavior.
We can do so by drawing so-called *scatter-plots*.

First, we need to select two of the experiment-wares among those run during the experiments.

In [None]:
xp_ware_x = analysis.experiment_wares[0]
xp_ware_y = analysis.experiment_wares[1]

Once the experiment-wares have been selected, we can draw a scatter-plot that compares the runtime of both experiment-wares on each input.
Here, each point is an input, and the x-axis and y-axis correspond to the runtime of `xp_ware_x` and `xp_ware_y` on this input, respectively.
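
The points of such a plot can be sketched in plain Python as follows.
The data and the function `scatter_points` are hypothetical, and placing the runs that reached the timeout on the border of the plot is an assumption of this sketch.

```python
# Hypothetical CPU times (in seconds) on the same inputs; None marks a timeout.
TIMEOUT = 100.0
times_x = {'input-1': 12.0, 'input-2': 40.0, 'input-3': None}
times_y = {'input-1': 3.5, 'input-2': None, 'input-3': 7.4}

def scatter_points(times_x, times_y, timeout):
    """Build one (x, y) point per input, placing timeouts on the border of the plot."""
    points = {}
    for name in times_x.keys() & times_y.keys():
        x = timeout if times_x[name] is None else times_x[name]
        y = timeout if times_y[name] is None else times_y[name]
        points[name] = (x, y)
    return points

points = scatter_points(times_x, times_y, TIMEOUT)

# Points above the diagonal (x < y) are inputs on which the first experiment-ware is faster.
x_faster = sorted(name for name, (x, y) in points.items() if x < y)
# Points below the diagonal (y < x) are inputs on which the second one is faster.
y_faster = sorted(name for name, (x, y) in points.items() if y < x)
```

Counting the points on each side of the diagonal gives a quick summary of which experiment-ware is faster more often.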

In [None]:
analysis.scatter_plot(
    xp_ware_x,
    xp_ware_y,

    scatter_col='cpu_time',
    title=f'Comparison between {xp_ware_x} and {xp_ware_y}',

    x_min=1,
    x_max={{ timeout }},
    y_min=1,
    y_max={{ timeout }},

    logx=True,
    logy=True,

    dynamic=False
)

**TODO: ADD HERE AN INTERPRETATION FOR THIS SCATTER-PLOT!**

While this template only shows one scatter-plot for demonstration purposes, you may find it useful to draw more of them, depending on your needs.
In fact, all pairwise comparisons between two experiment-wares could be visualized using scatter-plots.
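
As a minimal sketch, all the pairs to compare could be enumerated with `itertools.combinations`.
The names below are hypothetical; in this notebook, they would come from `analysis.experiment_wares`.

```python
from itertools import combinations

# Hypothetical names; in the notebook, use analysis.experiment_wares instead.
experiment_wares = ['ware-a', 'ware-b', 'ware-c']

# Each unordered pair of experiment-wares appears exactly once.
pairs = list(combinations(experiment_wares, 2))
```

Each pair can then play the role of `xp_ware_x` and `xp_ware_y` in a call to `scatter_plot` similar to the one above.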