# Evolutionary Computation - Assignment 3: Local Search

Bartosz Stachowiak 148259<br>
Andrzej Kajdasz 148273

## 1. Problem Statement

The input consists of rows of integers, one row per node. Each row contains the node's x and y coordinates in a plane, as well as a cost associated with the node. There are 4 such datasets, each consisting of 200 rows (each representing a single node).

The problem is to choose exactly 50% of the nodes (rounding up if the number of nodes is odd) and to create a Hamiltonian cycle (a closed path) through this subset of nodes. The goal is to minimize the sum of the total length of the path and the total cost of the selected nodes.

Distances between nodes were calculated as Euclidean distances rounded to the nearest integer. As suggested, the distances were computed once after loading the data and stored in a matrix, so that later evaluations only needed to look these values up, which reduced the running cost of the algorithms.

To solve the problem, the local search algorithm was used in different configurations:
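As a minimal sketch of the setup described above (not the code used in the experiments), the distance matrix and the objective function could look as follows. The function names are hypothetical, and rounding halves up is an assumption, since the text only says "nearest integer".

```python
import math


def round_half_up(x):
    # Assumption: ties are rounded up; the report only says "nearest integer".
    return int(math.floor(x + 0.5))


def build_distance_matrix(coords):
    # coords: list of (x, y) pairs; returns a full symmetric matrix
    # of rounded Euclidean distances, computed once up front.
    return [
        [round_half_up(math.hypot(ax - bx, ay - by)) for (bx, by) in coords]
        for (ax, ay) in coords
    ]


def num_selected(n_nodes):
    # Exactly 50% of the nodes, rounded up for an odd node count.
    return math.ceil(n_nodes / 2)


def objective(cycle, dist, costs):
    # Length of the closed cycle plus the costs of the nodes on it.
    n = len(cycle)
    length = sum(dist[cycle[k]][cycle[(k + 1) % n]] for k in range(n))
    return length + sum(costs[v] for v in cycle)
```

With the matrix precomputed, evaluating a solution only requires table lookups, which is the saving the paragraph above refers to.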
- Type of local search:
  - greedy - with random order ensured by shuffling the vector of all possible moves (inter and intra)
  - steepest
- Type of intra neighborhood:
  - edge
  - node
  - edge + node (both)
- Type of starting solutions:
  - random
  - best greedy heuristic (Weighted Greedy Regret Heuristic (0.25))
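
To illustrate how moves from these neighborhoods can be evaluated without recomputing the whole objective, here is a hedged sketch of delta evaluation for two of them: the intra-route edge exchange and the inter-route exchange of a selected node with an unselected one. All function names are hypothetical, a symmetric distance matrix is assumed, and the intra-route node swap is left out for brevity.

```python
def edge_exchange_delta(cycle, dist, i, j):
    # Edge exchange: reverse cycle[i+1..j], with 0 <= i < j < len(cycle).
    # Edges (a, b) and (c, e) are replaced by (a, c) and (b, e).
    n = len(cycle)
    a, b = cycle[i], cycle[i + 1]
    c, e = cycle[j], cycle[(j + 1) % n]
    return dist[a][c] + dist[b][e] - dist[a][b] - dist[c][e]


def apply_edge_exchange(cycle, i, j):
    # Returns a new cycle with the segment between the two edges reversed.
    return cycle[: i + 1] + cycle[i + 1 : j + 1][::-1] + cycle[j + 1 :]


def inter_exchange_delta(cycle, dist, costs, i, new_node):
    # Inter-route move: replace cycle[i] with a node outside the solution.
    # Both the node cost and the two adjacent edges change.
    n = len(cycle)
    old = cycle[i]
    prev, nxt = cycle[i - 1], cycle[(i + 1) % n]
    return (
        costs[new_node] - costs[old]
        + dist[prev][new_node] + dist[new_node][nxt]
        - dist[prev][old] - dist[old][nxt]
    )


def apply_inter_exchange(cycle, i, new_node):
    # Returns a new cycle with position i taken over by new_node.
    return cycle[:i] + [new_node] + cycle[i + 1 :]
```

A negative delta means the move improves the solution. Each delta touches only a constant number of matrix entries, regardless of the cycle length.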

## 2. Pseudocode of all implemented algorithms

### Local Greedy Search

```
function local_greedy_search(solution, nodes, distance_matrix, neighborhood_type):
    operations = generate_possible_operations(solution, len(nodes), neighborhood_type)
    
    while true:
        // randomization of the operations is ensured by shuffling
        shuffle(operations)
        selected_operation = None
        for operation in operations:
            if operation.evaluate_delta(solution, nodes, distance_matrix) < 0:
                selected_operation = operation
                break
        if selected_operation is None:
            break
        
        // When exchanging nodes between solution and outside of the solution,
        // we need to update the operations list.
        update_operations_list(operations, selected_operation)
        
        solution = selected_operation.apply(solution)

    return solution
```

### Local Steepest Search

```
function local_steepest_search(solution, nodes, distance_matrix, neighborhood_type):
    operations = generate_possible_operations(solution, len(nodes), neighborhood_type)
    
    while true:
        best_operation = None
        best_delta = 0
        for operation in operations:
            delta = operation.evaluate_delta(solution, nodes, distance_matrix)
            if delta < best_delta:
                best_operation = operation
                best_delta = delta

        if best_operation is None:
            break
        
        // When exchanging nodes between solution and outside of the solution,
        // we need to update the operations list.
        update_operations_list(operations, best_operation)
        
        solution = best_operation.apply(solution)

    return solution
```
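
The difference between the two strategies can be shown in a simplified, runnable form. The sketch below is an illustration under stated assumptions, not the implementation used for the experiments: it uses only the edge-exchange neighborhood on a fixed set of nodes, ignores node costs, and assumes a symmetric integer distance matrix. Because the node set never changes, the move list does not need updating, unlike in the pseudocode above. All names are hypothetical.

```python
import random


def tour_length(cycle, dist):
    n = len(cycle)
    return sum(dist[cycle[k]][cycle[(k + 1) % n]] for k in range(n))


def move_delta(cycle, dist, move):
    # Edge exchange: reverse cycle[i+1..j] for a move (i, j) with i < j.
    i, j = move
    n = len(cycle)
    a, b = cycle[i], cycle[i + 1]
    c, e = cycle[j], cycle[(j + 1) % n]
    return dist[a][c] + dist[b][e] - dist[a][b] - dist[c][e]


def apply_move(cycle, move):
    i, j = move
    return cycle[: i + 1] + cycle[i + 1 : j + 1][::-1] + cycle[j + 1 :]


def local_search(cycle, dist, strategy="steepest", seed=0):
    rng = random.Random(seed)
    n = len(cycle)
    moves = [(i, j) for i in range(n - 1) for j in range(i + 1, n)]
    if not moves:
        return cycle
    while True:
        if strategy == "greedy":
            # Greedy: take the first improving move in a random order.
            rng.shuffle(moves)
            chosen = next(
                (m for m in moves if move_delta(cycle, dist, m) < 0), None
            )
        else:
            # Steepest: scan the whole neighborhood, take the best move.
            best = min(moves, key=lambda m: move_delta(cycle, dist, m))
            chosen = best if move_delta(cycle, dist, best) < 0 else None
        if chosen is None:
            # No improving move left: a local optimum has been reached.
            return cycle
        cycle = apply_move(cycle, chosen)
```

Both variants terminate because every accepted move strictly decreases an integer-valued tour length.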

## 3. Results of the computational experiments

### 3.1. Code for visualization of the results

In [None]:
import pathlib
import itertools

import numpy as np
import matplotlib.pyplot as plt

import pandas as pd
from common import *

In [None]:
DATA_FOLDER = '../data/'
OLD_RESULTS_FOLDER = f'{DATA_FOLDER}old_results/'
RESULT_FOLDER = f'{DATA_FOLDER}results/'
INSTANCE_FOLDER = f'{DATA_FOLDER}tsp_instances/'

SOLVERS = {
    'r': "Random",
    'd': "Weighted Greedy Regret Heuristic (0.25)",
    'lssnode-r' : "Steepest LS, node (random)",
    'lssedge-r' : "Steepest LS, edge (random)",
    'lssboth-r' : "Steepest LS, both (random)",
    'lsgnode-r' : "Greedy LS, node (random)",
    'lsgedge-r' : "Greedy LS, edge (random)",
    'lsgboth-r' : "Greedy LS, both (random)",
    'lssnode-d' : "Steepest LS, node (WGRH)",
    'lssedge-d' : "Steepest LS, edge (WGRH)",
    'lssboth-d' : "Steepest LS, both (WGRH)",
    'lsgnode-d' : "Greedy LS, node (WGRH)",
    'lsgedge-d' : "Greedy LS, edge (WGRH)",
    'lsgboth-d' : "Greedy LS, both (WGRH)",
}

OLD_SOLVERS = {
    'n': "Nearest Neighbor",
    'g': "Greedy Cycle",
    'd-1': "Greedy 2-regret heuristics",
    'd-1.75': "Weighted greedy heuristics (0.75)",
}
SOLVERS_TO_PLOT = SOLVERS.copy()
SOLVERS_TO_PLOT.pop("r")
SOLVERS.update(OLD_SOLVERS)
NUM_NODES = 200

instance_files = [path for path in pathlib.Path(INSTANCE_FOLDER).iterdir() if path.is_file()]
instance_names = [path.name[:4] for path in instance_files]

In [None]:
instances_data = {
    name: read_instance(f'{INSTANCE_FOLDER}{name}.csv')
    for name in instance_names
}

In [None]:
instances_solvers_pairs = itertools.product(instances_data.keys(), SOLVERS.keys())

all_results = {}
all_costs = {}
all_times = {}
all_stats = {}

for instance, solver in instances_solvers_pairs:
    all_results[instance] = all_results.get(instance, {})
    all_costs[instance] = all_costs.get(instance, {})
    all_times[instance] = all_times.get(instance, {})
    all_stats[instance] = all_stats.get(instance, {})
    costs = []
    times = []
    paring_results = []
    # One result file per run index; NUM_NODES runs are expected per instance-solver pair.
    for idx in range(NUM_NODES):
        if solver in OLD_SOLVERS:
            time = 0
            solution, cost = read_solution_timeless(f'{OLD_RESULTS_FOLDER}{instance}-{solver}-{idx}.txt')
        else:
            solution, cost, time = read_solution(f'{RESULT_FOLDER}{instance}-{solver}-{idx}.txt')
        paring_results.append(solution)
        costs.append(cost)
        times.append(time)
    all_results[instance][solver] = np.array(paring_results)
    all_costs[instance][solver] = np.array(costs)
    all_stats[instance][solver] = {
        'mean': np.mean(costs),
        'std': np.std(costs),
        'min': np.min(costs),
        'max': np.max(costs),
    }
    all_times[instance][solver] = {
        'mean': np.mean(times),
        'std': np.std(times),
        'min': np.min(times),
        'max': np.max(times),
    }

In [None]:
costs_df = pd.DataFrame(all_stats).T
time_df = pd.DataFrame(all_times).T
max_df = pd.DataFrame(all_stats).T
min_df = pd.DataFrame(all_stats).T
mean_time_df = pd.DataFrame(all_times).T

for column in SOLVERS.keys():
    costs_df[column] = costs_df[column].apply(lambda x: f'{x["mean"]:.0f} ({x["min"]:.0f} - {x["max"]:.0f})')
    time_df[column] = time_df[column].apply(lambda x: f'{x["mean"]/1000:.2f} ({x["min"]/1000:.2f} - {x["max"]/1000:.2f})')
    max_df[column] = max_df[column].apply(lambda x: x['max'])
    min_df[column] = min_df[column].apply(lambda x: x['min'])
    mean_time_df[column] = mean_time_df[column].apply(lambda x: x['mean']/1000)

for df in [costs_df, time_df, max_df, min_df, mean_time_df]:
    df.rename(columns=SOLVERS, inplace=True)
time_df = time_df.drop(columns = OLD_SOLVERS.values())
mean_time_df = mean_time_df.drop(columns = OLD_SOLVERS.values())

### 3.2. Visualizations and statistics of cost for all dataset-algorithm pairs

In tabular form we present the Mean, Minimum and Maximum of the results of the algorithms for each dataset.

In [None]:
print("Mean (min-max) of the costs:")
costs_df.T

In [None]:
fig, axs = plt.subplots(2, 2, figsize=(15, 8), sharey=True)

for idx, instance in enumerate(instances_data.keys()):
    if idx%2 == 0:
        axs[(idx//2)%2][idx%2].set_ylabel('Cost')
    axs[(idx//2)%2][idx%2].set_title(instance)

    axs[(idx//2)%2][idx%2].violinplot(
        [all_costs[instance][solver] for solver in SOLVERS_TO_PLOT.keys()],
        showmeans=True,
    )

    axs[(idx//2)%2][idx%2].set_xticks(range(1, len(SOLVERS_TO_PLOT.keys()) + 1))
    if idx > 1:
        axs[(idx//2)%2][idx%2].set_xticklabels(SOLVERS_TO_PLOT.values(), rotation=45, ha='right')
    else :
        axs[(idx//2)%2][idx%2].set_xticklabels([])

plt.suptitle('Distribution of the costs')
plt.show()

### 3.3. Visualizations and statistics of running times for all dataset-algorithm pairs

Note: The running times for the non-local-search algorithms have been averaged over 100 runs.

The times for the local search algorithms are not averaged and do not include the time needed to create the initial (input) solution.

The min/mean time for the Random algorithm is inaccurate due to its very short running time (at times yielding 0 μs in our measurements, which indicates that the time is too short to be measured).

In [None]:
print("Mean (min-max) of the time [ms]:")
time_df.T

In [None]:
x_range = np.arange(len(SOLVERS_TO_PLOT))
bar_width = 0.8 / len(instances_data.keys())

mean_time_plot_df = mean_time_df.drop(columns = "Random").T.sort_values(by="TSPA", ascending=False).T

fig, ax = plt.subplots(figsize=(15, 8), sharey=True)
for idx, instance in enumerate(instances_data.keys()):
    ax.bar(
        x_range + idx * bar_width,
        height=mean_time_plot_df.loc[instance].values,
        width=bar_width,
        label=instance,
    )

ax.set_xticks(x_range + bar_width * (len(instances_data.keys()) - 1) / 2)
ax.set_xticklabels(mean_time_plot_df.columns, rotation=45, ha='right')
plt.title('Time per instance per solver')
plt.ylabel('Running Time [ms]')
plt.legend()
plt.show()


## 4. Best solutions for all datasets and algorithms

To more easily compare the results, we present the best solutions for each dataset side by side.

The weight of each node is denoted both by its size and color. The bigger and brighter the node, the higher its weight.

In [None]:
for solver_idx, solver in enumerate(SOLVERS_TO_PLOT.keys()):
    fig, axs = plt.subplots(1, 4, figsize=(20, 5))
    for idx, instance in enumerate(instances_data.keys()):
        best_instance_idx = np.argmin(all_costs[instance][solver])
        plot_solution_for_instance(instances_data[instance], all_results[instance][solver][best_instance_idx], axs[idx])
        axs[idx].set_title(f'{instance}: {all_costs[instance][solver][best_instance_idx]:.0f}')
    fig.suptitle(f'{SOLVERS_TO_PLOT[solver]}', fontsize=16, y=1.05)
plt.show()


## 5. Source Code

[GitHub](https://github.com/Tremirre/ECP)

## 6. Conclusions



Analyzing the results and visualizations, one can come to several conclusions about the algorithms used in the task:
- Most local search configurations obtained comparable results, with two exceptions: both steepest and greedy search with the node neighborhood, started from a random initial solution. The visualizations of their solutions suggest that their relatively poor performance may be due to an inability to get rid of very long edges. Starting from a random solution, they replace expensive nodes with cheaper ones, but in this way they reach a local minimum very quickly, with no further improving moves available in this type of neighborhood. This leads to the conclusion that it is **much better to consider edge-swapped neighbors**: even for a random solution, which may contain expensive nodes, arranging the nodes in the right order (i.e. avoiding long edges) gives a better result.

- Interestingly, for TSPA and TSPB the better results were obtained by the searches that started from random solutions. This may mean that the greedy heuristic solution was often already relatively close to some local optimum, which is also consistent with the execution times of this type of local search. In such a situation, **the random solution allows for more exploration of the solution space**.

- Looking at the execution times of the algorithms, **the steepest version of local search tends to be faster than its greedy counterpart**. In terms of the objective function, i.e. the final costs, the results are almost identical, which may mean that both versions very often end up in similar local optima, with the greedy version performing a larger number of less significant moves along the way.

- Comparing the time efficiency of the greedy heuristic and local search approaches, **it is most time- and cost-efficient to start off with a greedy solution and then apply local search to it**.
