# Measuring the impact of tolerances on Gurobi

Following the second analysis, I made some necessary changes to the model, and the numerics look worse again:

```
[2019-04-11 14:03:10] SOLVER   Matrix range     [4e-06, 3e+02]
[2019-04-11 14:03:10] SOLVER   Objective range  [1e+00, 1e+00]
[2019-04-11 14:03:10] SOLVER   Bounds range     [5e-02, 2e+03]
[2019-04-11 14:03:10] SOLVER   RHS range        [7e-03, 2e+03]
```
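To put a number on "worse", the spread of each range can be expressed in orders of magnitude. The following is a minimal sketch, not part of the original analysis; `RANGE_PATTERN` and `magnitude_spans` are names introduced here for illustration, and the input lines are the ones from the log excerpt above with the timestamp prefix removed.

```python
import math
import re

# Range lines copied from the log excerpt above (timestamp prefix removed).
LOG_LINES = [
    "SOLVER   Matrix range     [4e-06, 3e+02]",
    "SOLVER   Objective range  [1e+00, 1e+00]",
    "SOLVER   Bounds range     [5e-02, 2e+03]",
    "SOLVER   RHS range        [7e-03, 2e+03]",
]

# Hypothetical helper pattern: captures the range name and its two limits.
RANGE_PATTERN = re.compile(r"SOLVER\s+(\w+) range\s+\[([^,]+), ([^\]]+)\]")


def magnitude_spans(lines):
    """Map each range name to log10(max / min) of its reported limits."""
    spans = {}
    for line in lines:
        match = RANGE_PATTERN.search(line)
        if match:
            name, low, high = match.groups()
            spans[name] = math.log10(float(high) / float(low))
    return spans


for name, span in magnitude_spans(LOG_LINES).items():
    print(f"{name}: {span:.1f} orders of magnitude")
```

For the excerpt above, the matrix coefficients span almost eight orders of magnitude, while the objective range spans none.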

In the last round of performance runs, the best parameter setting solved the model in less than 2 hours. Now some scenarios are not solved within 48h using the same parameter values, so I wanted to see whether I can find better values for the adapted model. Because I did not change anything related to the objective function, I kept `OptimalityTol` constant at 1e-5. I varied `FeasibilityTol` for the baseline scenario and applied the following values:

* 1e-6 (default value)
* 1e-5
* 1e-4
* 1e-3
* 1e-2 (max value)
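
The sweep above can be written down as a list of parameter sets. This is only a sketch: the scripts that actually launched the runs are not shown in this notebook, so `parameter_sets` and the constant names are hypothetical.

```python
# Hypothetical names for illustration; the original run scripts are not shown.
OPTIMALITY_TOL = 1e-5
FEASIBILITY_TOL_EXPONENTS = [-6, -5, -4, -3, -2]


def parameter_sets():
    """Return one Gurobi parameter dict per FeasibilityTol value in the sweep."""
    return [
        {"OptimalityTol": OPTIMALITY_TOL, "FeasibilityTol": float(f"1e{exponent}")}
        for exponent in FEASIBILITY_TOL_EXPONENTS
    ]


for params in parameter_sets():
    # With gurobipy, each entry could be applied via model.setParam(name, value).
    print(params)
```

Keeping `OptimalityTol` fixed in every set means any change in runtime can be attributed to `FeasibilityTol` alone.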

In [1]:
from pathlib import Path
import re

from scipy import stats

In [2]:
SOLVE_DURATION_PREFIX = "SOLVER   Solved in"
SOLUTION_PREFIX = "SOLVER   Optimal objective"


def _preprocess_log(path_to_log):
    """Read a log file and strip the 22-character timestamp prefix from each line."""
    path_to_log = Path(path_to_log)
    with path_to_log.open("r") as log_file:
        lines = log_file.readlines()
    return [line[22:] for line in lines]


def parse_solution_time(path_to_log):
    lines = _preprocess_log(path_to_log)
    duration_line = list(filter(lambda line: line.startswith(SOLVE_DURATION_PREFIX), lines))[0]
    # The "Solved in" line holds two numbers: iterations first, then seconds.
    return float(re.findall(r'\d+(?:\.\d+)?', duration_line)[1])


def parse_iterations(path_to_log):
    lines = _preprocess_log(path_to_log)
    # Iterations are reported on the same line as the solve duration.
    iterations_line = list(filter(lambda line: line.startswith(SOLVE_DURATION_PREFIX), lines))[0]
    return float(re.findall(r'\d+(?:\.\d+)?', iterations_line)[0])


def parse_solution(path_to_log):
    lines = _preprocess_log(path_to_log)
    solution_line = list(filter(lambda line: line.startswith(SOLUTION_PREFIX), lines))[0]
    # Accept an optional sign and an optional exponent such as e+05.
    return float(re.findall(r'-?\d+(?:\.\d+)?(?:e[+-]?\d+)?', solution_line)[0])
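
To illustrate what the parsing functions above extract, here is a small self-contained sketch. The two sample lines are an assumption: they are made up to resemble the log format and reuse the values reported in this notebook, and are not copied from an actual log. `NUMBER_PATTERN` and `extract_numbers` are hypothetical names.

```python
import re

# Matches integers, decimals, and scientific notation such as 2.7e+05.
NUMBER_PATTERN = re.compile(r"-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")

# Made-up sample lines (assumption), modeled on the values reported in this notebook.
SAMPLE_LINES = [
    "[2019-04-11 20:10:16] SOLVER   Solved in 1047331 iterations and 22026.00 seconds",
    "[2019-04-11 20:10:16] SOLVER   Optimal objective  2.728955760e+05",
]

TIMESTAMP_WIDTH = len("[2019-04-11 14:03:10] ")  # 22 characters


def extract_numbers(line):
    """Strip the timestamp prefix and return all numbers in the rest of the line."""
    return [float(token) for token in NUMBER_PATTERN.findall(line[TIMESTAMP_WIDTH:])]


iterations, seconds = extract_numbers(SAMPLE_LINES[0])
(objective,) = extract_numbers(SAMPLE_LINES[1])
print(iterations, seconds, objective)
```

Stripping the timestamp first matters: otherwise the date and time digits would be picked up as numbers as well.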

In [3]:
def analyse_runs(path_to_root_folder):
    durations = [parse_solution_time(path) for path in Path(path_to_root_folder).glob("*.log")]
    solutions = [parse_solution(path) for path in Path(path_to_root_folder).glob("*.log")]
    iterations = [parse_iterations(path) for path in Path(path_to_root_folder).glob("*.log")]
    
    mean, std = stats.norm.fit(solutions)
    print(f"Gurobi found the following optimal value on average: {mean} (std {std}).")
    mean, std = stats.norm.fit(iterations)
    print(f"Gurobi needed on average {mean:.0f} (std {std:.0f}) iterations to find the optimal value.")
    mean, std = stats.norm.fit(durations)
    print(f"Gurobi needed on average {mean:.0f}s (std {std:.0f}s) to find the optimal value.")

    
def diff(path_to_root_folder, path_to_root_folder_of_base):
    durations = [parse_solution_time(path) for path in Path(path_to_root_folder).glob("*.log")]
    solutions = [parse_solution(path) for path in Path(path_to_root_folder).glob("*.log")]
    iterations = [parse_iterations(path) for path in Path(path_to_root_folder).glob("*.log")]
    
    base_durations = [parse_solution_time(path) for path in Path(path_to_root_folder_of_base).glob("*.log")]
    base_solutions = [parse_solution(path) for path in Path(path_to_root_folder_of_base).glob("*.log")]
    base_iterations = [parse_iterations(path) for path in Path(path_to_root_folder_of_base).glob("*.log")]
    
    # Each printed value is the ratio of the run average to the base average (1.0 means no change).
    mean, std = stats.norm.fit(solutions)
    mean_base, std_base = stats.norm.fit(base_solutions)
    print(f"Ratio of objective average to base: {mean / mean_base}")
    mean, std = stats.norm.fit(iterations)
    mean_base, std_base = stats.norm.fit(base_iterations)
    print(f"Ratio of iterations average to base: {mean / mean_base}")
    mean, std = stats.norm.fit(durations)
    mean_base, std_base = stats.norm.fit(base_durations)
    print(f"Ratio of durations average to base: {mean / mean_base}")
    

In [6]:
print("Tolerances e-6")
print("All runs needed more than 24h and were killed by the cluster.")
print()
print("Tolerances e-5")
print("--------------")
analyse_runs("minus5/")
print()
print("Tolerances e-4")
print("--------------")
print("All runs needed more than 24h and were killed by the cluster.")
print()
print("Tolerances e-3")
print("--------------")
print("All runs needed more than 24h and were killed by the cluster.")
print()
print("Tolerances e-2")
print("--------------")
print("All runs needed more than 24h and were killed by the cluster.")
print()

Tolerances e-6
All runs needed more than 24h and were killed by the cluster.

Tolerances e-5
--------------
Gurobi found the following optimal value on average: 272895.576 (std 0.0).
Gurobi needed on average 1047331 (std 0) iterations to find the optimal value.
Gurobi needed on average 22026s (std 103s) to find the optimal value.

Tolerances e-4
--------------
All runs needed more than 24h and were killed by the cluster.

Tolerances e-3
--------------
All runs needed more than 24h and were killed by the cluster.

Tolerances e-2
--------------
All runs needed more than 24h and were killed by the cluster.
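
The timeout messages above were printed by hand. A possible way to detect unfinished runs automatically is sketched below, under the assumption that a run killed by the cluster never writes a `Solved in` line to its log. `run_finished` and `split_runs` are hypothetical helpers, and the two demo logs are made up.

```python
from pathlib import Path
import tempfile

SOLVE_DURATION_PREFIX = "SOLVER   Solved in"
TIMESTAMP_WIDTH = 22  # length of the "[YYYY-MM-DD hh:mm:ss] " prefix


def run_finished(path_to_log):
    """Return True if the log contains a 'Solved in' line, i.e. the solve completed."""
    with Path(path_to_log).open("r") as log_file:
        return any(
            line[TIMESTAMP_WIDTH:].startswith(SOLVE_DURATION_PREFIX) for line in log_file
        )


def split_runs(path_to_root_folder):
    """Partition the *.log files of a folder into finished and unfinished runs."""
    finished, unfinished = [], []
    for path in sorted(Path(path_to_root_folder).glob("*.log")):
        (finished if run_finished(path) else unfinished).append(path.name)
    return finished, unfinished


# Demonstration on two made-up logs in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "done.log").write_text(
        "[2019-04-11 20:10:16] SOLVER   Solved in 1047331 iterations and 22026.00 seconds\n"
    )
    Path(folder, "killed.log").write_text(
        "[2019-04-11 14:03:10] SOLVER   Matrix range     [4e-06, 3e+02]\n"
    )
    finished, unfinished = split_runs(folder)

print("finished:", finished)
print("unfinished:", unfinished)
```

With such a check, a folder containing only unfinished logs could be reported as timed out instead of raising an `IndexError` in the parsing functions.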



## Results

Only the runs with a `FeasibilityTol` of 1e-5 solved the problem within 24h. It still seems to be the best parameter setting.