# Assignment 1 Part 2 - Ant Colony Optimization

**Note**: The following assumes that the shortest path between two cities is their Euclidean distance.
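
As a small illustration of that assumption, the sketch below computes all pairwise Euclidean distances for three made-up cities using NumPy broadcasting. The function name `pairwise_euclidean` and the coordinates are hypothetical and only serve the example.

```python
import numpy as np

def pairwise_euclidean(coordinates: np.ndarray) -> np.ndarray:
  """Return the (n, n) matrix of Euclidean distances for an (n, 2) coordinate array."""
  # diff[i, j] is the coordinate difference between city i and city j
  diff = coordinates[:, np.newaxis, :] - coordinates[np.newaxis, :, :]
  return np.sqrt((diff ** 2).sum(axis=-1))

# Three made-up cities forming a 3-4-5 right triangle
cities = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
distances = pairwise_euclidean(cities)
print(distances)
```

The matrix is symmetric with zeros on the diagonal, so the distance between the second and third city can be read from either `distances[1, 2]` or `distances[2, 1]`.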

## (0) Setting up an ACO algorithm

### Parameters

I decided to go with simple parameters for these experiments.

* _coordinates_ - Obviously we'll need those.
* _iteration\_count_ - Always good to keep the number of performed iterations adjustable.
* _ant\_count_ - The number of ants that will be used in the algorithm.
* _heuristic\_strength_ - The relative strength of the heuristic value (inverse distance) against the pheromone value.
* _evaporation\_rate_ - The rate at which the pheromone decays.  
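
Written out, the two rules that _heuristic\_strength_ and _evaporation\_rate_ control in the implementation further below look roughly as follows. The symbols are shorthand introduced here and do not appear in the code: $\tau_{ij}$ is the pheromone on the edge between cities $i$ and $j$, $d_{ij}$ their distance, $\beta$ the heuristic strength, $\rho$ the evaporation rate and $L_k$ the length of the tour built by ant $k$.

$$
p_{ij} = \frac{\tau_{ij}^{\,1-\beta}\,\left(1/d_{ij}\right)^{\beta}}{\sum_{l \,\notin\, \text{visited}} \tau_{il}^{\,1-\beta}\,\left(1/d_{il}\right)^{\beta}}
$$

$$
\tau_{ij} \leftarrow (1-\rho)\,\tau_{ij} + \sum_{k \,:\, (i,j) \,\in\, \text{tour}_k} \frac{1}{L_k}
$$

The first rule gives the probability of moving from city $i$ to an unvisited city $j$. The second is applied once per iteration, after all ants have finished their tours.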

For convenient evaluation I added a print method.

In [117]:
from numpy import ndarray

class Parameters:
  def __init__(self, coordinates: ndarray, iteration_count: int, ant_count: int, heuristic_strength: float, evaporation_rate: float):
    # At least one iteration and one ant are required, otherwise no path is ever produced
    assert iteration_count > 0, "Invalid number of iterations"
    assert ant_count > 0, "Invalid number of ants"
    assert 0 <= heuristic_strength <= 1, "Invalid heuristic strength"
    assert 0 <= evaporation_rate <= 1, "Invalid pheromone evaporation rate"

    self.coordinates: ndarray = coordinates
    self.iteration_count: int = iteration_count
    self.ant_count: int = ant_count
    self.heuristic_strength: float = heuristic_strength
    self.evaporation_rate: float = evaporation_rate

  def print(self):
    print("PARAMETERS")
    print(f"Iterations: {self.iteration_count}")
    print(f"Ant count: {self.ant_count}")
    print(f"Heuristic strength: {self.heuristic_strength}")
    print(f"Evaporation rate: {self.evaporation_rate}")
    print("Coordinates: ", end="")
    for i in range(min(5, len(self.coordinates))):
      print(f"({self.coordinates[i][0]}|{self.coordinates[i][1]})", end="")
      print(", ", end="") if i < min(4, len(self.coordinates) - 1) else print(", …")

### Measurements

For measuring the algorithm performance, I added a few measurements compared to the last assignment:

* _time_ - The execution time. I already used this one for the GA.
* _avg_ - The average length of the shortest path. I used this one for the GA as well.
* _std_ - The standard deviation of the shortest path length. This one is new and should help in determining the volatility of the solutions found.
* _min_ - The minimum shortest path length (the best value). I also used this one for the GA.
* _max_ - The maximum shortest path length (the worst value). This one is also new and should give a better understanding of the range of solutions.
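
Note that in the implementation further below, these statistics are taken over the best-so-far path length recorded after each iteration, not over the paths of all ants. A minimal sketch of that bookkeeping, using made-up per-iteration values:

```python
import numpy as np

# Made-up best path length found by the ants in each of five iterations
iteration_best = [90.0, 85.0, 88.0, 80.0, 83.0]

# Best-so-far length after each iteration (running minimum)
best_so_far = np.minimum.accumulate(iteration_best)

summary = {
  "avg": float(np.mean(best_so_far)),
  "std": float(np.std(best_so_far)),
  "min": float(np.min(best_so_far)),
  "max": float(np.max(best_so_far)),
}
print(best_so_far, summary)
```

Because the recorded value can only stay the same or improve, _max_ is the best length after the first iteration and _min_ is the best length after the last one.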

For convenient evaluation I added a print experiment method for printing any number of measurements.

In [118]:
from typing import List, Self

class Measurements:
  def __init__(self, time: float, avg: float, std: float, min: float, max: float):
    self.time: float = time
    self.avg: float = avg
    self.std: float = std
    self.min: float = min
    self.max: float = max

  @staticmethod
  def print_experiment(title: str, values: List[float], measurements: List[Self], col_width: int = 12):
    headers = ['Experiment', title, 'Time', 'Avg', 'Std', 'Min', 'Max']
    print(' '.join(f'{h:{col_width}}' for h in headers))
    for i, (value, measurement) in enumerate(zip(values, measurements), start=1):
      row = [i, f'{value:{col_width}.2f}', f'{measurement.time:{col_width}.2f}',
              f'{measurement.avg:{col_width}.2f}', f'{measurement.std:{col_width}.2f}',
              f'{measurement.min:{col_width}.2f}', f'{measurement.max:{col_width}.2f}']
      print(' '.join(str(cell).ljust(col_width) for cell in row))

### The algorithm

This time, I chose to run the algorithm directly on initialization: the constructor calls _optimize_, so the optimization has already completed by the time the constructor returns. The property _is\_finished_ reports whether the measurements are available.

This implementation handles almost everything, from calculating the distances and performing the actual optimization to providing the measurements.
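
One step of the optimization that may be less obvious is how an ant picks its next city. The `choose_next_city` method further below does this with a roulette-wheel draw over the cumulative probabilities. Here is a minimal standalone sketch of that idea, with made-up probabilities and the random value passed in explicitly so that the result is reproducible. The function name `roulette_wheel` is hypothetical.

```python
import numpy as np

def roulette_wheel(probabilities, random_value: float) -> int:
  """Return the first index whose cumulative probability reaches random_value."""
  cumulative = np.cumsum(probabilities)
  return int(np.searchsorted(cumulative, random_value))

# Made-up probabilities; entries 0 and 2 stand for already visited cities
probabilities = [0.0, 0.2, 0.0, 0.8]

picks = [roulette_wheel(probabilities, r) for r in (0.1, 0.5, 0.95)]
print(picks)
```

A city with a larger probability covers a larger slice of the interval from 0 to 1 and is therefore picked more often. One edge case: a random value of exactly 0.0 selects index 0 even if its probability is zero, which is rare but possible with `np.random.rand()`.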

In [119]:
from math import sqrt
import time
from typing import List, Optional
import numpy as np

class ACOAlgorithm:
  def __init__(self, parameters: Parameters):
    self.parameters: Parameters = parameters
    self.distances: ndarray = self.calculate_distance_matrix(parameters.coordinates)

    self.pheromones: ndarray = np.ones_like(self.distances) / len(parameters.coordinates)
    self.shortest_path: Optional[ndarray] = None
    self.measurements: Optional[Measurements] = None

    self.optimize()

  @property
  def is_finished(self):
    return self.measurements is not None

  def calculate_distance_matrix(self, city_coordinates: ndarray) -> ndarray:
    city_count = city_coordinates.shape[0]
    distance_matrix = np.zeros((city_count, city_count))

    for i in range(city_count):
      for j in range(city_count):
        if i != j:
          dx = city_coordinates[i][0] - city_coordinates[j][0]
          dy = city_coordinates[i][1] - city_coordinates[j][1]
          distance_matrix[i][j] = sqrt(dx*dx + dy*dy)

    return distance_matrix

  def optimize(self) -> None:
    start_time = time.time()
    shortest_path_lengths = []

    for iteration in range(self.parameters.iteration_count):
      paths = self.find_paths()
      self.update_pheromones(paths)
      self.update_shortest_path(paths)
      if self.shortest_path is not None:
        shortest_path_lengths.append(self.calculate_path_length(self.shortest_path))

    end_time = time.time()

    self.measurements = Measurements(
        time=end_time - start_time,
        avg=np.mean(shortest_path_lengths),
        std=np.std(shortest_path_lengths),
        min=np.min(shortest_path_lengths),
        max=np.max(shortest_path_lengths),
    )
 
  def find_paths(self) -> List[ndarray]:
    return [self.find_path(ant) for ant in range(self.parameters.ant_count)]
  
  def update_pheromones(self, paths: List[ndarray]):
    self.pheromones *= (1 - self.parameters.evaporation_rate)
    for path in paths:
      path_length = self.calculate_path_length(path)
      for i in range(len(path)):
        from_node, to_node = path[i], path[(i+1) % len(path)]
        pheromone = 1.0 / path_length if path_length != 0 else 0
        self.pheromones[from_node, to_node] += pheromone
        self.pheromones[to_node, from_node] += pheromone

  def update_shortest_path(self, paths: List[ndarray]):
    path_lengths = [self.calculate_path_length(path) for path in paths]
    shortest_path_index = np.argmin(path_lengths)
    if self.shortest_path is None or path_lengths[shortest_path_index] < self.calculate_path_length(self.shortest_path):
      self.shortest_path = paths[shortest_path_index] 

  def find_path(self, ant_index: int) -> ndarray:
    num_cities = len(self.parameters.coordinates)
    path = []
    visited = np.zeros(num_cities, dtype=bool)

    for i in range(num_cities):
      city = ant_index % num_cities if i == 0 else self.choose_next_city(path[-1], visited)
      path.append(city)
      visited[city] = True

    return np.array(path)

  def choose_next_city(self, current_city: int, visited: ndarray) -> int:
    probabilities = self.calculate_probabilities(current_city, visited)
    next_city_index = np.searchsorted(np.cumsum(probabilities), np.random.rand())
    return next_city_index if next_city_index < len(probabilities) else np.argmax(probabilities)
  
  def calculate_probabilities(self, current_city: int, visited: ndarray) -> ndarray:
    num_cities = len(self.parameters.coordinates)
    probabilities = np.zeros(num_cities)
    beta = self.parameters.heuristic_strength
    for i in range(num_cities):
      probabilities[i] = 0.0 if visited[i] else (self.pheromones[current_city][i] ** (1-beta)) * ((1.0 / max(self.distances[current_city][i], 1e-12)) ** beta)
    return probabilities / probabilities.sum() if probabilities.sum() != 0 else np.ones(num_cities) / num_cities
  
  def calculate_path_length(self, path: ndarray) -> float:
    return sum(self.distances[path[i], path[(i+1) % len(path)]] for i in range(len(path)))

### Getting a baseline

Below I created a first baseline experiment. I define some baseline parameters which will be adjusted during the experiments. The execution is very simple: Just wait for the algorithm to complete and print relevant information.

In [120]:
from copy import deepcopy
from time import sleep
from numpy import loadtxt

baseline_params = Parameters(
  loadtxt("d200-41.tsp", dtype=float).reshape(-1, 2), 
  iteration_count=10,
  ant_count=10,
  heuristic_strength=0.5,
  evaporation_rate=0.5
)

experiments = [0.0 for i in range(10)]
all_measurements = []

for _ in experiments:  # repeat the same configuration; the loop value is unused
  aco = ACOAlgorithm(baseline_params)
  while not aco.is_finished: sleep(0.1)
  all_measurements.append(aco.measurements)

baseline_params.print()
print()
Measurements.print_experiment("Baseline", experiments, all_measurements)

# declare final params which will be adjusted in the experiments
final_params = deepcopy(baseline_params)

PARAMETERS
Iterations: 10
Ant count: 10
Heuristic strength: 0.5
Evaporation rate: 0.5
Coordinates: (0.08792|0.114808), (0.886453|0.438732), (0.710248|0.321491), (0.248268|1.306361), (0.336466|0.449712), …

Experiment   Baseline     Time         Avg          Std          Min          Max         
1                    0.00         1.29        78.32         7.39        67.78        90.08
2                    0.00         1.28        78.81         7.32        68.69        90.82
3                    0.00         1.28        77.70         7.09        68.22        88.14
4                    0.00         1.28        79.56         7.32        68.38        90.90
5                    0.00         1.28        78.21         8.05        68.05        89.73
6                    0.00         1.28        80.30         7.44        69.35        92.02
7                    0.00         1.28        78.89         7.29        70.16        91.22
8                    0.00         1.28        77.65         7.88  

As can be seen, a run takes about 1.3 seconds on average. The average distance is about 78.5, with a standard deviation of about 7.
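
These rounded figures can be checked against the table. The sketch below averages the columns of the eight rows that are visible in the output. The remaining rows are cut off and therefore not included, so the numbers only approximate the full experiment.

```python
import numpy as np

# Columns copied from the eight rows visible in the baseline output
times = [1.29, 1.28, 1.28, 1.28, 1.28, 1.28, 1.28, 1.28]
avgs = [78.32, 78.81, 77.70, 79.56, 78.21, 80.30, 78.89, 77.65]
stds = [7.39, 7.32, 7.09, 7.32, 8.05, 7.44, 7.29, 7.88]

mean_time = float(np.mean(times))
mean_avg = float(np.mean(avgs))
mean_std = float(np.mean(stds))

print(f"time: {mean_time:.2f} s, avg: {mean_avg:.2f}, std: {mean_std:.2f}")
```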

## (1) What are the effects of the heuristic strength on the quality of solutions obtained from ACO?

In [121]:
heuristic_strength_params = deepcopy(baseline_params)

experiments = [x/10 for x in range(11)]
all_measurements = []

for heuristic_strength in experiments:
  heuristic_strength_params.heuristic_strength = heuristic_strength
  aco = ACOAlgorithm(heuristic_strength_params)
  while not aco.is_finished: sleep(0.1)
  all_measurements.append(aco.measurements)

heuristic_strength_params.print()
print()
Measurements.print_experiment("Heuristic Strength", experiments, all_measurements)

PARAMETERS
Iterations: 10
Ant count: 10
Heuristic strength: 1.0
Evaporation rate: 0.5
Coordinates: (0.08792|0.114808), (0.886453|0.438732), (0.710248|0.321491), (0.248268|1.306361), (0.336466|0.449712), …

Experiment   Heuristic Strength Time         Avg          Std          Min          Max         
1                    0.00         1.26       100.37         1.03        98.02       102.01
2                    0.10         1.28        96.85         2.60        92.92       102.60
3                    0.20         1.28        92.24         4.19        86.95        98.30
4                    0.30         1.28        86.20         5.07        78.66        92.57
5                    0.40         1.28        82.06         7.08        71.62        90.56
6                    0.50         1.28        79.03         7.81        66.60        89.06
7                    0.60         1.28        76.12         7.37        65.42        86.76
8                    0.70         1.28        73.06         

The experimental results indicate that a higher heuristic strength leads to better results.

* execution time - marginally higher for mid-range values, but the differences amount to only a few hundredths of a second.
* average - seems to decrease with increasing heuristic strength, but with diminishing intensity.
* standard deviation - largest in the mid-range at about 50% to 60%, smaller at the extremes.
* range - the range of values is largest in the mid-range and smaller at the extremes.

### 100% heuristic strength?

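A 100% heuristic strength leads to the best solution on average here. So why not just go for it? This is where the trade-off between exploration and exploitation comes into play. With a heuristic strength of 100%, the pheromone term is raised to the power of zero and drops out, so the ants rely on the inverse distance alone. That means fully exploiting the greedy heuristic without learning anything from previous iterations, which makes it easy to get stuck in a local minimum.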
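
To see what the extremes do to a single decision, the following sketch re-implements the weighting from `calculate_probabilities` for two candidate cities. The pheromone and distance values are made up, and the function name `transition_probabilities` is hypothetical.

```python
import numpy as np

def transition_probabilities(pheromone, distance, beta: float) -> np.ndarray:
  """Normalised weights pheromone^(1 - beta) * (1 / distance)^beta."""
  pheromone = np.asarray(pheromone, dtype=float)
  distance = np.asarray(distance, dtype=float)
  weights = pheromone ** (1 - beta) * (1.0 / distance) ** beta
  return weights / weights.sum()

pheromone = [0.9, 0.1]  # made up: the trail strongly favours the first city
distance = [4.0, 1.0]   # made up: the second city is four times closer

for beta in (0.0, 0.5, 1.0):
  print(beta, transition_probabilities(pheromone, distance, beta))
```

At a heuristic strength of 0 only the pheromone counts, at 1 only the distance counts, and values in between blend the two.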

### Choosing a heuristic strength

Based on these findings, I will opt for a heuristic strength of 70%. This gives a good average result while keeping a high standard deviation for exploration.

In [122]:
final_params.heuristic_strength = 0.7


## (2) What are the effects of the pheromone evaporation rate on the quality of the solutions obtained from ACO?

In [123]:
evaporation_rate_params = deepcopy(baseline_params)

experiments = [x/10 for x in range(11)]
all_measurements = []

for evaporation_rate in experiments:
  evaporation_rate_params.evaporation_rate = evaporation_rate
  aco = ACOAlgorithm(evaporation_rate_params)
  while not aco.is_finished: sleep(0.1)
  all_measurements.append(aco.measurements)

evaporation_rate_params.print()
print()
Measurements.print_experiment("Evaporation Rate", experiments, all_measurements)

PARAMETERS
Iterations: 10
Ant count: 10
Heuristic strength: 0.5
Evaporation rate: 1.0
Coordinates: (0.08792|0.114808), (0.886453|0.438732), (0.710248|0.321491), (0.248268|1.306361), (0.336466|0.449712), …

Experiment   Evaporation Rate Time         Avg          Std          Min          Max         
1                    0.00         1.30        82.81         2.20        81.33        87.00
2                    0.10         1.31        85.07         2.82        81.18        89.31
3                    0.20         1.31        82.58         4.80        75.44        91.47
4                    0.30         1.30        81.97         4.79        75.71        89.75
5                    0.40         1.32        81.84         5.49        72.92        89.22
6                    0.50         1.28        79.22         6.90        69.75        92.82
7                    0.60         1.28        77.19         8.46        63.66        87.69
8                    0.70         1.29        72.49        10.

The experimental results indicate that a higher evaporation rate leads to better results.

* execution time - seems to decrease slightly with a higher evaporation rate, but not consistently or noticeably.
* average - seems to decrease with increasing evaporation rate, and here at an increasing pace.
* standard deviation - largest at about 90%, smaller at the extremes.
* range - seems to increase with increasing evaporation rate.

### 100% evaporation rate?
A 100% evaporation rate also seems to lead to the best solution on average. In contrast to 100% heuristic strength, here all pheromone from earlier iterations is wiped out before the new deposits are added, so each round can only build on the paths of the round directly before it. That leads to a very large standard deviation, so this does not seem like a great approach. It largely eliminates the exploitation of accumulated knowledge, which makes it unlikely to get stuck in a local minimum but also likely to miss good solutions in the near neighborhood.
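
The effect of the evaporation rate on a single update can be illustrated with a small sketch that mirrors `update_pheromones` for one tour. The starting trail, the tour and its length are made up, and the function name `evaporate_and_deposit` is hypothetical.

```python
import numpy as np

def evaporate_and_deposit(pheromones, path, path_length, evaporation_rate):
  """Apply one evaporation step, then deposit 1 / path_length on each tour edge."""
  updated = pheromones * (1 - evaporation_rate)
  for i in range(len(path)):
    a, b = path[i], path[(i + 1) % len(path)]
    updated[a, b] += 1.0 / path_length
    updated[b, a] += 1.0 / path_length
  return updated

old = np.full((4, 4), 0.25)  # made-up trail left by earlier iterations
tour = [0, 1, 2, 3]          # made-up closed tour of length 10

results = {rate: evaporate_and_deposit(old, tour, 10.0, rate) for rate in (0.0, 0.4, 1.0)}
for rate, matrix in results.items():
  print(rate, "on tour:", matrix[0, 1], "off tour:", matrix[0, 2])
```

At a rate of 1.0 the edges that are not on the tour drop to zero, so nothing from earlier iterations survives.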

### Choosing an evaporation rate
A reasonable evaporation rate seems to be in the lower ranges, as otherwise the variance grows very large. I think a value of 40% is a good compromise.

In [127]:
final_params.evaporation_rate = 0.4

## (3) Compare the best results you have obtained using GA and ACO. Comment on your findings.

In [128]:
experiments = [0.0 for i in range(10)]
all_measurements = []

for _ in experiments:  # repeat the same configuration; the loop value is unused
  aco = ACOAlgorithm(final_params)
  while not aco.is_finished: sleep(0.1)
  all_measurements.append(aco.measurements)

final_params.print()
print()
Measurements.print_experiment("Final", experiments, all_measurements)

PARAMETERS
Iterations: 10
Ant count: 10
Heuristic strength: 0.7
Evaporation rate: 0.4
Coordinates: (0.08792|0.114808), (0.886453|0.438732), (0.710248|0.321491), (0.248268|1.306361), (0.336466|0.449712), …

Experiment   Baseline     Time         Avg          Std          Min          Max         
1                    0.00         1.32        76.84         5.74        67.12        88.44
2                    0.00         1.30        75.04         6.20        64.59        86.12
3                    0.00         1.29        75.92         4.92        69.80        83.07
4                    0.00         1.29        75.93         4.74        68.99        85.40
5                    0.00         1.29        75.61         5.92        66.76        84.08
6                    0.00         1.31        75.32         4.67        68.25        84.32
7                    0.00         1.30        75.29         5.46        68.83        86.29
8                    0.00         1.29        75.39         6.01  

The best result with the GA was:
                             
* Average result fitness: 81.497.
* Average execution time: 0.554s.

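The GA's average path length is somewhat higher, and therefore worse, than the ACO's (81.497 versus roughly 75 to 77). But the ACO runs take about 1.3 seconds compared to 0.554 seconds for the GA, so that does not seem to be a fair comparison. I therefore decided to use fewer ants to match the execution time of the GA.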
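
A suitable ant count can be estimated with a quick calculation. This is only a sketch and assumes that the run time grows roughly linearly with the number of ants, since each ant builds one tour per iteration.

```python
# Figures taken from the results above
aco_time_s = 1.30     # approximate ACO run time with 10 ants
aco_ant_count = 10
ga_time_s = 0.554     # reported average GA run time

# Assumption: run time is proportional to the number of ants
time_per_ant_s = aco_time_s / aco_ant_count
matched_ant_count = round(ga_time_s / time_per_ant_s)
print(matched_ant_count)  # → 4
```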

In [129]:
final_params.ant_count = 4

experiments = [0.0 for i in range(10)]
all_measurements = []

for _ in experiments:  # repeat the same configuration; the loop value is unused
  aco = ACOAlgorithm(final_params)
  while not aco.is_finished: sleep(0.1)
  all_measurements.append(aco.measurements)

final_params.print()
print()
Measurements.print_experiment("Final", experiments, all_measurements)

PARAMETERS
Iterations: 10
Ant count: 4
Heuristic strength: 0.7
Evaporation rate: 0.4
Coordinates: (0.08792|0.114808), (0.886453|0.438732), (0.710248|0.321491), (0.248268|1.306361), (0.336466|0.449712), …

Experiment   Final        Time         Avg          Std          Min          Max         
1                    0.00         0.53        80.82         3.52        75.06        86.79
2                    0.00         0.51        79.64         3.85        73.33        86.46
3                    0.00         0.51        80.48         4.27        75.17        86.90
4                    0.00         0.51        77.89         2.77        72.49        79.96
5                    0.00         0.52        80.56         3.85        73.07        84.59
6                    0.00         0.51        78.28         3.03        74.69        82.17
7                    0.00         0.51        78.14         4.42        72.59        82.64
8                    0.00         0.52        78.82         4.13   

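As you can see, with matched execution times the results are hardly any different: the ACO averages in the rows shown lie between about 78 and 81, only slightly below the GA's 81.497. Unfortunately, I do not have the other measurements (standard deviation, minimum and maximum) for the GA.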

## Conclusion

All in all, the ACO was definitely an interesting algorithm to explore. In the end it did not provide a great improvement over the GA, but that might just need some more parameter tuning.