# Loading Experiment Data

In this notebook, we load the data collected while running different experiment-wares (in this case, MIP solvers) and preprocess it so that it can be used for further analysis in dedicated notebooks.

## Imports

We first import the modules required to load the data.
In particular, we need *Metrics-Wallet*, which we will use to manipulate our data.

In [1]:
from itertools import product
from metrics.wallet import BasicAnalysis

## Reading the data

The next step is to read the data from the log files produced by our different experiment-wares.
This data is described in the file [`metrics_scalpel.yml`](config/metrics_scalpel.yml), and automatically parsed by *Metrics-Scalpel* to create a `BasicAnalysis` object.

In [2]:
analysis = BasicAnalysis(input_file='config/metrics_scalpel.yml')

The `BasicAnalysis` object instantiated above provides elementary, general-purpose methods for preprocessing our data before actually analyzing the results (which requires more specific methods, as can be seen in the dedicated notebooks).

An important thing to do now is to visualize the collected data, to make sure that everything was properly read.
This can be achieved by looking at the data-frame that has been built inside the `BasicAnalysis` object.

In [3]:
analysis.data_frame

Unnamed: 0,input,experiment_ware,cpu_time,timestamp_list,bound_list,status,decision,objective,timeout,success,user_success,missing,consistent_xp,consistent_input,error
0,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,scip.sh,1200.860,"[52.3, 52.9, 71.8, 73.3, 75.5, 80.0, 84.5, 90....","[0.0, -24.62681, -24.62681, -24.62681, -24.626...",,,,1200.0,True,True,False,True,True,False
240,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,ilog_4.sh,1200.430,,,,,,1200.0,True,True,False,True,True,False
480,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,scip_c.sh,1198.330,"[52.3, 52.9, 72.0, 73.4, 75.7, 80.3, 84.8, 90....","[0.0, -24.62681, -24.62681, -24.62681, -24.626...",interrupted,,-43.566620,1200.0,True,True,False,True,True,False
720,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,ilog_6.sh,1200.070,,,,,,1200.0,True,True,False,True,True,False
960,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,scip_b.sh,1198.190,"[52.2, 52.8, 71.8, 73.3, 75.6, 79.8, 85.0, 91....","[0.0, -24.62681, -24.62681, -24.62681, -24.626...",interrupted,,-50.430375,1200.0,True,True,False,True,True,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
479,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,ilog_4.sh,177.752,,,"optimal, tolerance",optimal,25009.663366,1200.0,True,True,False,True,True,False
719,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,scip_c.sh,891.971,"[77.5, 78.6, 81.4, 84.9, 95.4, 107.0, 108.0, 1...","[33439.4, 31171.87, 30103.76, 30103.76, 27707....",solved,optimal,25009.663366,1200.0,True,True,False,True,True,False
959,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,ilog_6.sh,180.399,,,"optimal, tolerance",optimal,25009.663366,1200.0,True,True,False,True,True,False
1199,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,scip_b.sh,763.956,"[77.7, 78.8, 81.7, 85.2, 95.6, 108.0, 108.0, 1...","[33439.4, 31171.87, 30103.76, 30103.76, 27707....",solved,optimal,25009.663366,1200.0,True,True,False,True,True,False


As everything seems to be OK, we can now proceed with the preprocessing of our data.
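
Beyond inspecting the table visually, a couple of quick counts can help confirm that the data was read as expected. The following is a minimal sketch on toy rows written as plain dictionaries; the helpers `duplicated_pairs` and `runs_per_ware` are hypothetical and not part of *Metrics*, and the input names are made up for illustration.

```python
from collections import Counter

# Toy rows standing in for the rows of the data-frame (illustrative values).
rows = [
    {'input': 'a.mps.gz', 'experiment_ware': 'scip.sh'},
    {'input': 'a.mps.gz', 'experiment_ware': 'ilog_4.sh'},
    {'input': 'b.mps.gz', 'experiment_ware': 'scip.sh'},
    {'input': 'b.mps.gz', 'experiment_ware': 'ilog_4.sh'},
]

def duplicated_pairs(rows):
    """Return the (input, experiment_ware) pairs that appear more than once."""
    counts = Counter((r['input'], r['experiment_ware']) for r in rows)
    return [pair for pair, n in counts.items() if n > 1]

def runs_per_ware(rows):
    """Count how many experiments were read for each experiment-ware."""
    return Counter(r['experiment_ware'] for r in rows)

duplicates = duplicated_pairs(rows)
per_ware = runs_per_ware(rows)
```

If every solver was run exactly once on every input, `duplicates` should be empty and all the counts in `per_ware` should be equal.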

## More meaningful names for solvers

Currently, the names of the solvers correspond to the scripts that were used to run them during our experiments.
While it is enough to discriminate them in our analysis, they are not really meaningful.
We will thus replace them with more descriptive names, which may even contain LaTeX code if you want pretty names in your papers.

In [4]:
name_map = {
    'ilog.sh'  : '$CPLEX_{default}$',
    'ilog_4.sh': '$CPLEX_{barrier}$',
    'ilog_6.sh': '$CPLEX_{concurrent-optimizers}$',
    'scip.sh'  : '$SCIP_{default}$',
    'scip_b.sh': '$SCIP_{barrier}$',
    'scip_c.sh': '$SCIP_{barrier-crossover}$'
}

Based on the map above, we can easily replace the names of the experiment-wares as follows.

In [5]:
analysis = analysis.add_variable(
    new_var='experiment_ware', 
    function=lambda xp: name_map[xp['experiment_ware']]
)
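
Note that indexing a dictionary with `name_map[...]` raises a `KeyError` when a script name is absent from the map. As a minimal sketch, assuming you would rather keep unknown names unchanged, a lookup with a fallback could be written as below. The helper `rename_ware` is hypothetical and not part of *Metrics*, and the map is shortened for illustration.

```python
# Shortened version of the map defined above.
name_map = {
    'ilog.sh': '$CPLEX_{default}$',
    'scip.sh': '$SCIP_{default}$',
}

def rename_ware(xp, mapping):
    """Return the descriptive name for xp, or its original name if unmapped."""
    original = xp['experiment_ware']
    return mapping.get(original, original)

known = rename_ware({'experiment_ware': 'scip.sh'}, name_map)
unknown = rename_ware({'experiment_ware': 'other.sh'}, name_map)
```

Such a helper could then be passed to `add_variable` as `function=lambda xp: rename_ware(xp, name_map)`.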

## Missing bound and timestamp lists for *CPLEX*

In our experiments, the solver *CPLEX* did not output the intermediate bounds it found.
As the list of bounds is required to perform an *optimization* analysis, we need to provide it for the results of *CPLEX* whenever it managed to find a bound.
First, we need a function to identify when an experiment is solved by *CPLEX*.

In [6]:
def is_solved_by_cplex(xp):
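    """
    Checks both that an experiment is a CPLEX experiment and that CPLEX
    found a solution during this experiment.
    In our logs, SCIP reports the status 'solved' for the instances it
    solves, so an optimal decision with any other status comes from CPLEX.
    """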
    return xp['decision'] == 'optimal' and xp['status'] != 'solved'

Then, we compute the list of bounds by setting it to the best bound provided by the solver, and the corresponding timestamp to its overall runtime.
When the solver is not *CPLEX* or when no bound is provided (i.e., when the instance is not solved), we leave both lists unchanged.

In [7]:
analysis = analysis.add_variable(
    'timestamp_list',
    lambda xp: [xp['cpu_time']] if is_solved_by_cplex(xp) else xp['timestamp_list']
).add_variable(
    'bound_list',
    lambda xp: [xp['objective']] if is_solved_by_cplex(xp) else xp['bound_list']
)
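
To make the effect of this step concrete, here is a minimal sketch of the same logic applied to plain dictionaries rather than to the `BasicAnalysis` object. The helper `fill_lists` is hypothetical, the rows are abridged from the table above, and missing values are represented by `None` for simplicity (an assumption: the actual data-frame may use another marker for missing values).

```python
def is_solved_by_cplex(xp):
    """Same test as above: optimal decision, but not SCIP's 'solved' status."""
    return xp['decision'] == 'optimal' and xp['status'] != 'solved'

def fill_lists(xp):
    """Return a copy of xp with single-element lists when CPLEX solved it."""
    filled = dict(xp)
    if is_solved_by_cplex(xp):
        filled['timestamp_list'] = [xp['cpu_time']]
        filled['bound_list'] = [xp['objective']]
    return filled

rows = [
    # CPLEX run that solved the instance: both lists are missing.
    {'cpu_time': 177.752, 'timestamp_list': None, 'bound_list': None,
     'status': 'optimal, tolerance', 'decision': 'optimal',
     'objective': 25009.663366},
    # SCIP run that solved the instance: lists already present (abridged).
    {'cpu_time': 891.971, 'timestamp_list': [77.5, 78.6],
     'bound_list': [33439.4, 31171.87],
     'status': 'solved', 'decision': 'optimal', 'objective': 25009.663366},
    # CPLEX run that timed out: nothing to fill in.
    {'cpu_time': 1200.43, 'timestamp_list': None, 'bound_list': None,
     'status': None, 'decision': None, 'objective': None},
]

filled_rows = [fill_lists(row) for row in rows]
```

Only the first row is modified: its lists become `[177.752]` and `[25009.663366]`, while the two other rows are left unchanged.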

We can now check that our changes to the bound and timestamp lists have been taken into account by having a look at the data-frame of our analysis.

In [8]:
analysis.data_frame

Unnamed: 0,input,experiment_ware,cpu_time,timestamp_list,bound_list,status,decision,objective,timeout,success,user_success,missing,consistent_xp,consistent_input,error
0,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,$SCIP_{default}$,1200.860,"[52.3, 52.9, 71.8, 73.3, 75.5, 80.0, 84.5, 90....","[0.0, -24.62681, -24.62681, -24.62681, -24.626...",,,,1200.0,True,True,False,True,True,False
240,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,$CPLEX_{barrier}$,1200.430,,,,,,1200.0,True,True,False,True,True,False
480,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,$SCIP_{barrier-crossover}$,1198.330,"[52.3, 52.9, 72.0, 73.4, 75.7, 80.3, 84.8, 90....","[0.0, -24.62681, -24.62681, -24.62681, -24.626...",interrupted,,-43.566620,1200.0,True,True,False,True,True,False
720,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,$CPLEX_{concurrent-optimizers}$,1200.070,,,,,,1200.0,True,True,False,True,True,False
960,/home/cril/wwf/Benchmarks/MILP/roi5alpha10n8.m...,$SCIP_{barrier}$,1198.190,"[52.2, 52.8, 71.8, 73.3, 75.6, 79.8, 85.0, 91....","[0.0, -24.62681, -24.62681, -24.62681, -24.626...",interrupted,,-50.430375,1200.0,True,True,False,True,True,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
479,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,$CPLEX_{barrier}$,177.752,[177.752],[25009.663366],"optimal, tolerance",optimal,25009.663366,1200.0,True,True,False,True,True,False
719,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,$SCIP_{barrier-crossover}$,891.971,"[77.5, 78.6, 81.4, 84.9, 95.4, 107.0, 108.0, 1...","[33439.4, 31171.87, 30103.76, 30103.76, 27707....",solved,optimal,25009.663366,1200.0,True,True,False,True,True,False
959,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,$CPLEX_{concurrent-optimizers}$,180.399,[180.399],[25009.663366],"optimal, tolerance",optimal,25009.663366,1200.0,True,True,False,True,True,False
1199,/home/cril/wwf/Benchmarks/MILP/neos-4722843-wi...,$SCIP_{barrier}$,763.956,"[77.7, 78.8, 81.7, 85.2, 95.6, 108.0, 108.0, 1...","[33439.4, 31171.87, 30103.76, 30103.76, 27707....",solved,optimal,25009.663366,1200.0,True,True,False,True,True,False


## Checking the success and consistency of the results

During our analysis, we will need to know whether a given experiment was successful.
Typically, this is the case when the solver provided an optimal bound or proved the infeasibility of the problem within the time limit (even though we can use different criteria depending on what we want to analyze).

In [9]:
def is_success(xp):
    """
    This function checks that the solver either proved the optimality
    of its best bound or proved the infeasibility of the problem
    within the time limit.
    """
    return xp['decision'] in ('optimal', 'infeasible')
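
As an illustration of how a different criterion could be plugged in, the sketch below contrasts the strict criterion above with a looser, hypothetical one (`has_bound`) that accepts any experiment in which at least one bound was reported. The rows are abridged from the table above, and missing values are represented by `None`, which is an assumption made for this sketch.

```python
def is_success(xp):
    """Strict criterion: optimality or infeasibility was proved."""
    return xp['decision'] in ('optimal', 'infeasible')

def has_bound(xp):
    """Looser, hypothetical criterion: at least one bound was reported."""
    bounds = xp['bound_list']
    return bounds is not None and len(bounds) > 0

# SCIP run interrupted at the time limit, with intermediate bounds.
interrupted = {'decision': None, 'bound_list': [0.0, -24.62681]}
# SCIP run that proved optimality.
solved = {'decision': 'optimal', 'bound_list': [33439.4, 31171.87]}

strict = [is_success(interrupted), is_success(solved)]
loose = [has_bound(interrupted), has_bound(solved)]
```

Here the interrupted run counts as a success only under the looser criterion, which may be preferable when analyzing the quality of intermediate bounds rather than the number of instances solved.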

To make sure that our experiments are consistent, we also need to compare the results obtained by the different solvers.
In particular, if different solvers claim to have found an optimal value, this value must be the same for all solvers, or at least within a certain tolerance when floating-point values are computed.

In [10]:
def is_consistent_by_input(df_input):
    """
    This function checks that the difference between any two optimal
    bounds found on the same input is small enough (at most 0.01) to
    consider these bounds as consistent.
    """
    best_values = df_input[df_input['decision'] == 'optimal']['objective'].unique()
    for o1, o2 in product(best_values, best_values):
        if abs(o1 - o2) > 0.01:
            return False
    return True
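
To see what this tolerance means in practice, the sketch below applies the same test to plain lists of objective values, using values that appear in the tables of this notebook. The helpers `is_consistent` and `is_consistent_spread` are hypothetical and not part of *Metrics*; the second one relies on the fact that the largest pairwise difference in a list is the difference between its maximum and its minimum.

```python
from itertools import combinations

TOLERANCE = 0.01  # Same threshold as in is_consistent_by_input.

def is_consistent(values, tolerance=TOLERANCE):
    """Check that every pair of values differs by at most the tolerance."""
    return all(abs(a - b) <= tolerance for a, b in combinations(values, 2))

def is_consistent_spread(values, tolerance=TOLERANCE):
    """Equivalent check: compare only the largest and smallest values."""
    return not values or max(values) - min(values) <= tolerance

# Distinct optimal objective values reported for three inputs.
neos_4722843 = [25009.663366]        # all solvers agree
dano3_5 = [576.9249, 576.9452]       # difference of about 0.0203
neos_957323 = [-237.7567, -237.742]  # difference of about 0.0147

results = [is_consistent(v) for v in (neos_4722843, dano3_5, neos_957323)]
```

With a tolerance of 0.01, only the first input is consistent. Using `combinations` instead of `product` avoids comparing each pair twice and each value with itself, without changing the result.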

In [11]:
analysis.check_success(is_success)
analysis.check_input_consistency(is_consistent_by_input)

4 inputs are inconsistent and linked experiments are now declared as unsuccessful.


We can see here that for 4 inputs, there are inconsistent results provided by the solvers.
Let us have a look at these inputs.

In [12]:
analysis.error_table()

Unnamed: 0,input,experiment_ware,cpu_time,timestamp_list,bound_list,status,decision,objective,timeout,success,user_success,missing,consistent_xp,consistent_input,error
41,/home/cril/wwf/Benchmarks/MILP/dano3_5.mps.gz,$SCIP_{default}$,412.0,"[15.8, 61.8, 62.0, 62.5, 63.3, 64.1, 65.2, 65....","[588.43, 588.43, 588.43, 588.43, 588.43, 588.4...",solved,optimal,576.9249,1200.0,False,True,False,True,False,True
281,/home/cril/wwf/Benchmarks/MILP/dano3_5.mps.gz,$CPLEX_{barrier}$,378.057,[378.057],[576.94524466],"optimal, tolerance",optimal,576.9452,1200.0,False,True,False,True,False,True
521,/home/cril/wwf/Benchmarks/MILP/dano3_5.mps.gz,$SCIP_{barrier-crossover}$,387.075,"[15.7, 61.7, 61.8, 62.4, 63.1, 64.0, 65.0, 65....","[588.43, 588.43, 588.43, 588.43, 588.43, 588.4...",solved,optimal,576.9249,1200.0,False,True,False,True,False,True
761,/home/cril/wwf/Benchmarks/MILP/dano3_5.mps.gz,$CPLEX_{concurrent-optimizers}$,373.588,[373.588],[576.94524466],"optimal, tolerance",optimal,576.9452,1200.0,False,True,False,True,False,True
1001,/home/cril/wwf/Benchmarks/MILP/dano3_5.mps.gz,$SCIP_{barrier}$,413.488,"[15.7, 61.6, 61.8, 62.4, 63.1, 64.0, 65.0, 65....","[588.43, 588.43, 588.43, 588.43, 588.43, 588.4...",solved,optimal,576.9249,1200.0,False,True,False,True,False,True
1241,/home/cril/wwf/Benchmarks/MILP/dano3_5.mps.gz,$CPLEX_{default}$,372.586,[372.586],[576.94524466],"optimal, tolerance",optimal,576.9452,1200.0,False,True,False,True,False,True
79,/home/cril/wwf/Benchmarks/MILP/neos-957323.mps.gz,$SCIP_{default}$,182.686,"[3.5, 4.3, 8.3, 19.6, 19.7, 19.7, 21.5, 22.0, ...","[0.0, -61.94329, -87.92383, -87.92383, -205.80...",solved,optimal,-237.7567,1200.0,False,True,False,True,False,True
319,/home/cril/wwf/Benchmarks/MILP/neos-957323.mps.gz,$CPLEX_{barrier}$,26.2247,[26.2247],[-237.74195062],"optimal, tolerance",optimal,-237.742,1200.0,False,True,False,True,False,True
559,/home/cril/wwf/Benchmarks/MILP/neos-957323.mps.gz,$SCIP_{barrier-crossover}$,302.23,"[3.5, 4.3, 8.3, 19.7, 19.7, 19.8, 21.6, 22.0, ...","[0.0, -61.94329, -87.92383, -87.92383, -205.80...",solved,optimal,-237.7567,1200.0,False,True,False,True,False,True
799,/home/cril/wwf/Benchmarks/MILP/neos-957323.mps.gz,$CPLEX_{concurrent-optimizers}$,23.8214,[23.8214],[-237.74195062],"optimal, tolerance",optimal,-237.742,1200.0,False,True,False,True,False,True


For some inconsistent inputs, the difference between the optimal bounds found by the different solvers remains quite small, but for others, there seems to be a problem, as the optimal bound found by one solver may be twice as large as that found by the others.
Depending on the context of your analysis, you may proceed in different ways from here, e.g., disqualify inconsistent solvers or fix them before running the analysis again.
In our case, as we only demonstrate *Metrics*' capabilities, we will simply exclude the 4 inputs on which inconsistent results were obtained from our future figures (this is already handled by *Metrics*, thanks to the `consistent_input` column that has been added to the data-frame).

## Summary and export of the analysis

We can now summarize the analysis with the following table.

In [13]:
analysis.description_table()

Unnamed: 0,analysis
n_experiment_wares,6
n_inputs,240
n_experiments,1440
n_missing_xp,0
n_inconsistent_xp,0
n_inconsistent_xp_due_to_input,24
more_info_about_variables,<analysis>.data_frame.describe(include='all')
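
The counts in this table can be cross-checked by hand against what we observed earlier. The short sketch below only redoes that arithmetic, under the assumption that each solver was run exactly once on each input; the variable names are illustrative.

```python
n_experiment_wares = 6     # solvers (table above)
n_inputs = 240             # benchmark instances (table above)
n_inconsistent_inputs = 4  # reported by check_input_consistency

# One experiment per (solver, input) pair.
n_experiments = n_experiment_wares * n_inputs

# Every experiment on an inconsistent input is flagged, whatever the solver.
n_inconsistent_xp_due_to_input = n_inconsistent_inputs * n_experiment_wares
```

Both products match the table: 1440 experiments overall, 24 of which are discarded because of the 4 inconsistent inputs.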


Finally, we export the analysis, both to share the data so that the analysis can be reproduced, and to reuse it in other notebooks dedicated to more specific analyses.

In [14]:
analysis.export('.cache')