# Running the `ialg` algorithm on the TSPLIB and the HardTSPLIB instances

Here, we compute the minimum number of subtour elimination constraints (SECs) needed to prove optimality for some small, well-known instances from TSPLIB [1] and for some hard-to-solve instances from HardTSPLIB [2]. All computations use the `ialg` algorithm.
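For reference, the quantity being computed can be written down as follows. This is a sketch under assumptions: the SEC is given in its standard cut form, and the definition of $k^*$ is inferred from the checks performed in the loops of this notebook rather than stated by it. The symbols $x_e$, $\delta(S)$, $\mathcal{S}$, $z_{\mathrm{2F}}$ and $z_{\mathrm{TSP}}$ are introduced here for illustration only.

```latex
% SEC for a vertex subset S, where \delta(S) is the set of edges with exactly one endpoint in S
\[
  \sum_{e \in \delta(S)} x_e \;\ge\; 2,
  \qquad \emptyset \neq S \subsetneq V .
\]

% z_{2F}(\mathcal{S}): optimal cost of the 2-factor problem with the SECs for all S in \mathcal{S}
% z_{TSP}: cost of an optimal tour
\[
  k^* \;=\; \min \bigl\{\, |\mathcal{S}| \;:\; z_{\mathrm{2F}}(\mathcal{S}) = z_{\mathrm{TSP}} \,\bigr\}
\]
```

Under this reading, a family $\mathcal{S}$ proves optimality when adding only its SECs to the 2-factor problem already yields the TSP optimum.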

### References
[1] Reinelt, Gerhard. "TSPLIB—A traveling salesman problem library." ORSA Journal on Computing 3.4 (1991): 376-384.

[2] Vercesi, Eleonora, et al. "On the generation of metric TSP instances with a large integrality gap by branch-and-cut." Mathematical Programming Computation 15.2 (2023): 389-416.


In [40]:
from ialg import ialg, mip
from cover import set_cover_subroutine
from utils import from_tsplib_file_to_graph
import pandas as pd

## TSPLIB instances

These are the instances on which we were able to run the experiment; larger instances were not feasible on our machine. To test other instances, insert them at position 0 of the `tsplib_instances` list, so that they fall within the first `max_instance` entries processed by the loop.

In [41]:
tsplib_instances = [("burma14", 14), ("ulysses16", 16), ("gr17", 17), ("gr21", 21), ("ulysses22", 22), ("gr24", 24), ("fri26", 26), ("bayg29", 29), ("bays29", 29), ("dantzig42", 42), ("swiss42", 42), ("att48", 48), ("gr48", 48), ("hk48", 48), ("eil51", 51), ("berlin52", 52), ("brazil58", 58), ("st70", 70), ("eil76", 76), ("pr76", 76)]

In [42]:
# Store the values in a dictionary
out = {}

Now we run the `ialg` algorithm on the TSPLIB instances. This may take a while; to make it faster, reduce `max_instance`, which sets how many instances are taken from the front of the list.

In [43]:
max_instance = 6
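
The interaction between `max_instance` and the advice to insert new instances at position 0 can be illustrated with a short, self-contained sketch. The list below is a shortened stand-in for `tsplib_instances`, and `my_instance` is a hypothetical name used only for illustration.

```python
# Toy stand-in for the notebook's list of (instance_name, number_of_nodes) pairs.
instances = [("burma14", 14), ("ulysses16", 16), ("gr17", 17)]
max_instance = 2

# Hypothetical extra instance, inserted at the front of the list.
instances.insert(0, ("my_instance", 12))

# The experiment loop only visits the first `max_instance` entries.
selected = instances[:max_instance]
selected_names = [name for name, _ in selected]
print(selected_names)
```

Because the loop slices from the front, an instance appended at the end would be skipped unless `max_instance` were raised to cover the whole list.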

In [44]:
for instance_name, n in tsplib_instances[:max_instance]:
    # Parse the instance
    G = from_tsplib_file_to_graph("./data/" + instance_name)
    print("******* Instance:", instance_name, "*******")
    (S_family, size_S_family, partitions, c, runtime) = ialg(G, verbose=True)
    out[instance_name] = (S_family, size_S_family, partitions, c, runtime)
    
    if size_S_family == -1:
        print("Ran into time limit.")
        continue
        
    # check that k* >= size_S_family
    partitions_list = [ [ list(part) for part in partition ] for npts in partitions for partition in partitions[npts] ]
    smallest_S_family = set_cover_subroutine(partitions_list, verbose=False)
    print("k* =",size_S_family,"for S_family =",smallest_S_family)
    assert len(smallest_S_family) == size_S_family
    
    # check that k* <= size_S_family
    two_factor_cost = mip(G, initial_subtours=smallest_S_family, verbose=False)
    tsp_cost = mip(G, subtour_callbacks=True, verbose=False)
    assert round(two_factor_cost) == round(tsp_cost)
    
    print(" ") # Leave some space

******* Instance: burma14 *******
TSP compute in 0.0038750171661376953 seconds. TSP cost = 3323
With smart initialization, we begin with #SECs = 2
They are:
frozenset({2, 3, 4, 5, 6, 11, 12, 13})
frozenset({0, 1, 2, 3, 4, 5, 6, 7, 11, 12, 13})
Found a solution with #SECs: 2
Specifically, they are:
frozenset({2, 3, 4, 5, 6, 11, 12, 13})
frozenset({0, 1, 2, 3, 4, 5, 6, 7, 11, 12, 13})
k* = 2 for S_family = [frozenset({0, 1, 7, 8, 9, 10}), frozenset({8, 9, 10})]
 
******* Instance: ulysses16 *******
TSP compute in 0.004925966262817383 seconds. TSP cost = 6859
With smart initialization, we begin with #SECs = 4
They are:
frozenset({0, 1, 2, 3, 7})
frozenset({4, 5, 6, 8, 9, 10, 14})
frozenset({4, 5, 6, 8, 9, 10, 11, 12, 13, 14})
frozenset({0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15})
Found a solution with #SECs: 4
Specifically, they are:
frozenset({0, 1, 2, 3, 7})
frozenset({4, 5, 6, 8, 9, 10, 14})
frozenset({4, 5, 6, 8, 9, 10, 11, 12, 13, 14})
frozenset({0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
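
The nested comprehension that builds `partitions_list` in the loop above can be hard to read at a glance. The following sketch shows what it does on toy data. It assumes, based only on how the loop indexes it, that `partitions` is a dictionary mapping a number of parts to a list of partitions, each partition being a collection of node sets; the actual structure returned by `ialg` may differ. The sketch uses `sorted(part)` instead of `list(part)` so that the result is deterministic.

```python
# Toy stand-in for the `partitions` value returned by `ialg` (assumed structure).
partitions = {
    2: [
        (frozenset({0, 1, 2}), frozenset({3, 4, 5})),
        (frozenset({0, 1, 5}), frozenset({2, 3, 4})),
    ],
    3: [
        (frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})),
    ],
}

# Flatten the dictionary into one list of partitions,
# converting every part from a frozenset into a plain list of nodes.
partitions_list = [
    [sorted(part) for part in partition]
    for npts in partitions              # iterate over the dictionary keys
    for partition in partitions[npts]   # iterate over the partitions stored under each key
]

for partition in partitions_list:
    print(partition)
```

The key (`npts`) is discarded during flattening, so `partitions_list` only keeps the partitions themselves.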

Now we print the table as reported in the paper.

In [45]:
# Create a dataframe out of the dictionary out
df = pd.DataFrame([(x[0], x[1][1], x[1][3], x[1][4]) for x in out.items()], columns=["instance", "S_min", "b_max", "runtime"])
# Print the dataframe
df

| | instance | S_min | b_max | runtime |
|---|---|---|---|---|
| 0 | burma14 | 2 | 2 | 0.003528 |
| 1 | ulysses16 | 4 | 2 | 0.003805 |
| 2 | gr17 | 5 | 2 | 0.088158 |
| 3 | gr21 | 0 | 0 | 0.005231 |
| 4 | ulysses22 | 5 | 2 | 0.014288 |
| 5 | gr24 | 1 | 2 | 0.050329 |
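
The dataframe cell above picks values out of each stored tuple by position (`x[1][1]`, `x[1][3]`, `x[1][4]`), which is easy to get wrong. A possible alternative, sketched below, names the five fields first. `IalgResult`, `summary_rows` and the `timed_out` column are hypothetical helpers introduced here, not part of the `ialg` package; the field order follows the tuple unpacked in the experiment loop.

```python
from typing import Any, NamedTuple


class IalgResult(NamedTuple):
    """Hypothetical wrapper that names the five values stored per instance."""
    S_family: Any
    size_S_family: int
    partitions: Any
    c: int
    runtime: float


def summary_rows(results):
    """Turn {instance_name: 5-tuple} into a list of row dictionaries."""
    rows = []
    for name, values in results.items():
        r = IalgResult(*values)
        rows.append({
            "instance": name,
            "S_min": r.size_S_family,
            "b_max": r.c,
            "runtime": r.runtime,
            # The loops in this notebook treat size_S_family == -1 as a time-out.
            "timed_out": r.size_S_family == -1,
        })
    return rows


# Toy input: the S_family and partitions entries are placeholders.
toy_out = {
    "burma14": ([], 2, {}, 2, 0.003528),
    "too_big": ([], -1, {}, 0, 60.0),  # hypothetical timed-out instance
}
rows = summary_rows(toy_out)
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` should give the same columns as the cell above, plus the extra `timed_out` flag.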


## On the impact of the minimalization procedure

## HardTSPLIB instances

HardTSPLIB contains instances generated at random as well as instances derived from TSPLIB instances. We split them into two lists, `hardtsplib_instances_random` and `hardtsplib_instances_tsplib`, so that the latter can be compared directly with TSPLIB. We only run the algorithm on the instances that are feasible on our machine.

In [46]:
hardtsplib_instances_random = [("10001_hard", 10), ("10007_hard", 10), ("10008_hard", 10), ("10010_hard", 10), ("11675_hard", 11), ("12290_hard", 12), ("14850_hard", 14), ("15002_hard", 15), ("15005_hard", 15), ("15007_hard", 15), ("16038_hard", 16), ("20004_hard", 20), ("20007_hard", 20), ("20009_hard", 20), ("20181_hard", 20), ("25001_hard", 25), ("25004_hard", 25), ("25006_hard", 25), ("30001_hard", 30), ("30003_hard", 30), ("30005_hard", 30), ("33001_hard", 33), ("35002_hard", 35), ("35003_hard", 35), ("35009_hard", 35), ("40003_hard", 40), ("40004_hard", 40), ("40008_hard", 40)]

hardtsplib_instances_tsplib = [("gr24_hard", 24), ("bayg29_hard", 29), ("bays29_hard", 29), ("dantzig42_hard", 42),  ("gr48_hard", 48), ("hk48_hard", 48), ("att48_hard", 48), ("eil51_hard", 51), ("brazil58_hard", 58), ("st70_hard", 70), ("pr76_hard", 76)]


In [47]:
max_instance_random = 10
max_instance_from_tsplib = 1


## Hard instances derived from TSPLIB instances

In [48]:
# Store the values in a dictionary
out_tsplib = {}

In [49]:
for instance_name, n in hardtsplib_instances_tsplib[:max_instance_from_tsplib]:
    # Parse the instance
    G = from_tsplib_file_to_graph("./data/" + instance_name)
    print("******* Instance:", instance_name, "*******")
    (S_family, size_S_family, partitions, c, runtime) = ialg(G, verbose=True)
    out_tsplib[instance_name] = (S_family, size_S_family, partitions, c, runtime)
    
    if size_S_family == -1:
        print("Ran into time limit.")
        continue
        
    # check that k* >= size_S_family
    partitions_list = [ [ list(part) for part in partition ] for npts in partitions for partition in partitions[npts] ]
    smallest_S_family = set_cover_subroutine(partitions_list, verbose=False)
    print("k* =",size_S_family,"for S_family =",smallest_S_family)
    assert len(smallest_S_family) == size_S_family
    
    # check that k* <= size_S_family
    two_factor_cost = mip(G, initial_subtours=smallest_S_family, verbose=False)
    tsp_cost = mip(G, subtour_callbacks=True, verbose=False)
    assert round(two_factor_cost) == round(tsp_cost)
    
    print(" ") # Leave some space

******* Instance: gr24_hard *******
TSP compute in 0.20900607109069824 seconds. TSP cost = 1000
With smart initialization, we begin with #SECs = 6
They are:
frozenset({0, 1, 2, 3, 4, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22})
frozenset({4, 5, 6, 7, 20, 9, 23})
frozenset({0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 20, 22, 23})
frozenset({4, 5, 6, 7, 20, 23})
frozenset({17, 2, 10, 21})
frozenset({4, 5, 6, 7, 9, 16, 20, 23})
Found smaller partition: 5 -> 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 1 6 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 2 7 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 3 8 2
Found smaller partition: 4 -> 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 4 9 2
Found smaller partition: 4 -> 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 5 10 2
Found smaller partition: 4 -> 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 6 11 2
Found smaller partition: 4 -> 2
num_bb_nodes, num_subtour_constrs, num

In this case, we can compare the TSPLIB and HardTSPLIB values line by line.
    

In [61]:
# Create a dataframe pairing each HardTSPLIB result (out_tsplib) with its TSPLIB counterpart (out)
df_list = []
for hard_name, hardtsplib in out_tsplib.items():
    # "gr24_hard" is derived from the TSPLIB instance "gr24"
    base_name = hard_name[:-len("_hard")]
    # Requires that the TSPLIB run above covered the base instance
    tsplib = out[base_name]
    df_list.append((base_name, tsplib[1], hardtsplib[1], tsplib[3], hardtsplib[3]))
df = pd.DataFrame(df_list, columns=["instance", "S_min_TSPLIB", "S_min_HardTSPLIB", "b_TSPLIB", "b_HardTSPLIB"])
# Print the dataframe
df

| | instance | S_min_TSPLIB | S_min_HardTSPLIB | b_TSPLIB | b_HardTSPLIB |
|---|---|---|---|---|---|
| 0 | gr24 | 1 | 63 | 2 | 2 |



## Hard instances derived from random instances

In [62]:
# Store the values in a dictionary
out_random = {}

In [63]:
for instance_name, n in hardtsplib_instances_random[:max_instance_random]:
    # Parse the instance
    G = from_tsplib_file_to_graph("./data/" + instance_name)
    print("******* Instance:", instance_name, "*******")
    (S_family, size_S_family, partitions, c, runtime) = ialg(G, verbose=True)
    out_random[instance_name] = (S_family, size_S_family, partitions, c, runtime)
    
    if size_S_family == -1:
        print("Ran into time limit.")
        continue
        
    # check that k* >= size_S_family
    partitions_list = [ [ list(part) for part in partition ] for npts in partitions for partition in partitions[npts] ]
    smallest_S_family = set_cover_subroutine(partitions_list, verbose=False)
    print("k* =",size_S_family,"for S_family =",smallest_S_family)
    assert len(smallest_S_family) == size_S_family
    
    # check that k* <= size_S_family
    two_factor_cost = mip(G, initial_subtours=smallest_S_family, verbose=False)
    tsp_cost = mip(G, subtour_callbacks=True, verbose=False)
    assert round(two_factor_cost) == round(tsp_cost)
    
    print(" ") # Leave some space

******* Instance: 10001_hard *******
TSP compute in 0.015692949295043945 seconds. TSP cost = 1000
With smart initialization, we begin with #SECs = 2
They are:
frozenset({2, 3, 4, 5, 6, 7})
frozenset({0, 1, 9})
num_bb_nodes, num_subtour_constrs, num_conn_comp = 1 2 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 2 3 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 3 4 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 4 5 2
num_bb_nodes, num_subtour_constrs, num_conn_comp = 5 6 2
Found a solution with #SECs: 7
Specifically, they are:
frozenset({2, 3, 4, 5, 6, 7})
frozenset({0, 1, 9})
[2, 6, 7]
[8, 1, 9]
[0, 1, 5, 8, 9]
[0, 9, 5]
[0, 9, 5, 1]
k* = 7 for S_family = [frozenset({0, 1, 9}), frozenset({2, 6, 7}), frozenset({0, 9, 5, 1}), frozenset({8, 1, 9}), frozenset({0, 1, 5, 8, 9}), frozenset({0, 9, 5}), frozenset({0, 1, 9, 8})]
 
******* Instance: 10007_hard *******
TSP compute in 0.004911184310913086 seconds. TSP cost = 1000
With smart initialization, we begin with #SECs = 2


In [65]:
# Create a dataframe out of the dictionary out
df = pd.DataFrame([(x[0], x[1][1], x[1][3], x[1][4]) for x in out_random.items()],
                  columns=["instance", "S_min", "b_max", "runtime"])
# Print the dataframe
df

| | instance | S_min | b_max | runtime |
|---|---|---|---|---|
| 0 | 10001_hard | 7 | 2 | 0.018102 |
| 1 | 10007_hard | 6 | 2 | 0.006753 |
| 2 | 10008_hard | 7 | 2 | 0.007804 |
| 3 | 10010_hard | 7 | 2 | 0.011068 |
| 4 | 11675_hard | 8 | 2 | 0.022102 |
| 5 | 12290_hard | 13 | 2 | 0.071965 |
| 6 | 14850_hard | 19 | 2 | 0.220108 |
| 7 | 15002_hard | 22 | 2 | 0.266334 |
| 8 | 15005_hard | 21 | 2 | 0.245669 |
| 9 | 15007_hard | 27 | 2 | 0.494747 |
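
The three experiment loops in this notebook share the same body, so they could be factored into one function. The sketch below is one possible way to do that, not the authors' code: `run_and_verify` is a hypothetical helper, and the four callables are passed in as parameters so that the example runs on its own with toy stand-ins (`fake_load`, `fake_ialg`, `fake_cover`, `fake_mip`) in place of the real `from_tsplib_file_to_graph`, `ialg`, `set_cover_subroutine` and `mip`.

```python
def run_and_verify(instance_names, load_graph, run_ialg, min_cover, solve_mip):
    """Run the experiment once per instance and return {name: 5-tuple}."""
    results = {}
    for instance_name in instance_names:
        G = load_graph("./data/" + instance_name)
        S_family, size_S_family, partitions, c, runtime = run_ialg(G, verbose=False)
        results[instance_name] = (S_family, size_S_family, partitions, c, runtime)

        if size_S_family == -1:  # time limit reached, nothing to verify
            continue

        # Check that k* >= size_S_family
        partitions_list = [
            [list(part) for part in partition]
            for npts in partitions
            for partition in partitions[npts]
        ]
        smallest_S_family = min_cover(partitions_list, verbose=False)
        assert len(smallest_S_family) == size_S_family

        # Check that k* <= size_S_family
        two_factor_cost = solve_mip(G, initial_subtours=smallest_S_family, verbose=False)
        tsp_cost = solve_mip(G, subtour_callbacks=True, verbose=False)
        assert round(two_factor_cost) == round(tsp_cost)
    return results


# Toy stand-ins so that the sketch can be executed without the real solvers.
def fake_load(path):
    return path


def fake_ialg(G, verbose=False):
    if G.endswith("slow"):
        return (None, -1, {}, 0, 0.0)  # simulate a time-out
    partitions = {2: [(frozenset({0, 1}), frozenset({2, 3}))]}
    return ([frozenset({0, 1})], 1, partitions, 2, 0.01)


def fake_cover(partitions_list, verbose=False):
    return [frozenset(partitions_list[0][0])]


def fake_mip(G, initial_subtours=None, subtour_callbacks=False, verbose=False):
    return 10.0


results = run_and_verify(["toy", "toy_slow"], fake_load, fake_ialg, fake_cover, fake_mip)
print({name: values[1] for name, values in results.items()})
```

In the notebook, the call would presumably look like `run_and_verify([name for name, _ in tsplib_instances[:max_instance]], from_tsplib_file_to_graph, ialg, set_cover_subroutine, mip)`, and likewise for the two HardTSPLIB lists.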
