# Evolutionary algorithms analysis

Complete run of the evolutionary algorithms analysis. This notebook runs several evolutionary algorithms and tests them on the benchmark functions from the `benchmark_functions.py` file. The results are saved in the `results` folder.

Analysis is done via average ranks, with specific notes on outliers.

In [4]:
# Evo algorithms setup

# Imports
import source.evolutionary_algorithms as ea
import numpy as np
import source.helpers as hp
import importlib
import inspect
import time
import sys
import os

# Load benchmark functions
module_path = os.path.abspath(os.path.join('./source'))
sys.path.append(module_path)
benchmarks = importlib.import_module('benchmark_functions')
functions = [obj for _, obj in inspect.getmembers(benchmarks) if inspect.isfunction(obj)]

# Hyperparameters
DIMENSIONS = [2, 10, 30]

lower_bound = -100.0
upper_bound = 100.0

POPULATION_SIZES = [10, 20, 50]
FUNCTION_EVALUATIONS = 2000
RESULTS = dict()
REPEATS = 30

## Running ea

Run ea for benchmark functions and write results to global variables.


In [5]:
USE_PERSISTENT_DATA = True
FILENAME = "results/full_run_1703071365.4187908.json"  # Rename this in case of

if USE_PERSISTENT_DATA:
    RESULTS = hp.load_results(FILENAME)
    print("Loaded results from file")
else:
    for i, dim in enumerate(DIMENSIONS):
        BOUNDS = np.array([[lower_bound, upper_bound]] * DIMENSIONS[i])
        pop_size = POPULATION_SIZES[i]
        evaluations = dim * FUNCTION_EVALUATIONS
        dimension_results = []
        print(f"Calculating dimension {dim} - ", end="")
        counter = 0

        for function in functions:
            runs = []
            counter += 1
            
            for r in range(REPEATS):
                row = [0, 0, 0, 0, 0]
                row[0] = ea.differential_evolution(function, BOUNDS, pop_size, max_evaluations=evaluations, F=0.8, strategy="rand/1/bin")[1]
                row[1] = ea.differential_evolution(function, BOUNDS, pop_size, max_evaluations=evaluations, F=0.5, strategy="best/1/bin")[1]
                row[2] = ea.particle_swarm_optimization(function, BOUNDS, pop_size, max_evaluations=evaluations)[1]
                row[3] = ea.soma_ato(function, BOUNDS, pop_size, max_evaluations=evaluations)[1]
                row[4] = ea.soma_ata(function, BOUNDS, pop_size, max_evaluations=evaluations)[1]
                runs.append(row)

            print("#", end="")
            if counter % 5 == 0:
                print("|", end="")

            dimension_results.append(runs)
        
        RESULTS[dim] = dimension_results
        print("")
        
    if REPEATS == 30:
        RUN_NAME = "full"
    else:
        RUN_NAME = "partial"

    hp.save_results(RESULTS, f"results/{RUN_NAME}_run_{time.time()}.json")


Loaded results from file


## Analyze results

Legend/note:
- **Rank**: the evolutionary algorithm's average rank on a benchmark function across all runs (`REPEATS`)
- **Rank Avg. (Diff)**: Euclidean distance between a benchmark function's ranks and the Total Avg. Rank
- **Total Avg. Rank**: the evolutionary algorithm's average rank across all benchmark functions
- **Chi square**: Friedman test statistic, used to test whether the algorithms' ranks differ significantly from the average
  - if the p-value is less than 0.05, the difference is considered significant
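
The legend above can be illustrated with a minimal sketch. The helpers `hp.rank_array` and `hp.friedman_test` are not shown in this notebook, so the functions below are hypothetical stand-ins that assume ascending ranks (1 = lowest objective value), tied values sharing their mean rank, and the textbook Friedman statistic without tie correction.

```python
def rank_values(values):
    """Rank values ascending (1 = smallest); ties share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the group
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def average_ranks(runs):
    """Average each algorithm's rank over all runs (rows) of one benchmark."""
    per_run = [rank_values(run) for run in runs]
    n_algorithms = len(runs[0])
    return [sum(r[j] for r in per_run) / len(per_run) for j in range(n_algorithms)]


def friedman_statistic(rank_table):
    """Friedman chi-square for a rank table (rows: benchmarks, columns: algorithms)."""
    n = len(rank_table)
    k = len(rank_table[0])
    mean_ranks = [sum(row[j] for row in rank_table) / n for j in range(k)]
    expected = (k + 1) / 2  # mean rank of every algorithm under the null hypothesis
    return 12 * n / (k * (k + 1)) * sum((r - expected) ** 2 for r in mean_ranks)


# Toy data (made up for illustration): two runs, three algorithms.
toy_runs = [[0.3, 0.1, 0.3], [0.2, 0.1, 0.5]]
print(average_ranks(toy_runs))
```

Under these assumptions, the p-value would come from a chi-square distribution with `k - 1` degrees of freedom, for example via `scipy.stats.chi2.sf(statistic, k - 1)`.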

In [6]:
for dim in RESULTS:
    print(f"\nResults for {dim} dimensions:\n")
    ranking_table = []
    hp.table_header()
    result_dict = dict()
    for i, fun in enumerate(RESULTS[dim]):
        res = np.zeros(5)
        for run in fun:
            res += hp.rank_array(run)
        res /= len(fun)  # average over the runs actually stored for this function
        result_dict[functions[i]._custom_name] = res
        ranking_table.append(res)

    average_ranks = np.mean(ranking_table, axis=0)
    avg_distance = 0
    
    for pair in result_dict:
        result_rank = result_dict[pair]
        distance = hp.euclidean_distance(result_rank, average_ranks)
        avg_distance += distance
        hp.table_row(pair, result_rank, distance)

    avg_distance /= len(result_dict)
    
    hp.table_footer(average_ranks, avg_distance)
    chi, p = hp.friedman_test(ranking_table)
    print("\nFriedman test:")
    print("Chi-square: {:>14.6}\nP-value: {:>18.6}".format(chi, p))

    print("\n\n")


Results for 2 dimensions:

--------------------------------------------------------------------------------------------------
| Evo. Algorithm → | DE Rand 1  | DE Best 1  |    PSO     |  SOMA ato  |  SOMA ata  | Rank Avg.  |
| Benchmark ↓      |   (Rank)   |   (Rank)   |   (Rank)   |   (Rank)   |   (Rank)   |   (Diff)   |
--------------------------------------------------------------------------------------------------
| Ackley 1st       |    2.73    |    3.20    |    3.47    |    2.93    |    3.57    |    0.93    |
| Ackley 1st Alt.  |    3.87    |    4.63    |    1.23    |    2.90    |    2.37    |    2.70    |
| Alpine 1st       |    1.50    |    3.10    |    2.27    |    3.50    |    4.67    |    1.58    |
| Alpine 2nd       |    2.47    |    4.40    |    3.13    |    1.80    |    3.20    |    1.53    |
| Csendes          |    1.47    |    2.23    |    2.60    |    3.73    |    4.97    |    2.23    |
| Custom 1 (LLM)   |    2.20    |    4.13    |    3.80    |    1.97    |    2.97 

## Analysis of results
*(based on data from results/full_run_1703071365.4187908.json)*

### Biggest differences

#### Dimension 2

##### Csendes

Csendes' bowl-like shape is easy to traverse and has a single global minimum, which makes it easy for the algorithms to converge to it. However, there are not enough steps for the more complex algorithms to outperform the simpler ones. This is why DE rand/1/bin and DE best/1/bin outperform PSO and both SOMA variants.

##### Svanda 3rd

The wave-like shape with many local minima and maxima is difficult terrain for algorithms that are prone to converging to local minima. With the available steps, the best strategies were the more stochastic ones, such as DE rand/1/bin and SOMA ata, likely because they explore more of the terrain.

##### Quartic

The Quartic function is very similar to Csendes. Essentially the same reasoning applies, as the results themselves are very similar to those for Csendes.

##### Ackley 1st Altered

An extremely difficult problem with many local minima and maxima. It rewards local exploration and commitment to local minima. That is why PSO did so well (it did so on this benchmark in every dimension): its hive-like exploration lets it escape local minima and traverse difficult terrain, and once it reaches the best member of the population, it commits to it and explores around it. SOMA ata did well too, as its migration from each member towards every other member performs local exploration around each member.
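
The "commit to the best member and explore around it" behaviour can be seen in the textbook PSO update rule. The sketch below is not taken from `ea.particle_swarm_optimization`, whose implementation is not shown here; the function name `pso_step` and the coefficient values `w`, `c1` and `c2` are assumptions chosen for illustration.

```python
import random


def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """One velocity and position update for a single particle.

    w, c1 and c2 are illustrative values, not the notebook's settings.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # inertia + pull towards the particle's own best + pull towards the swarm's best
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


# A particle far from the best-known point is drawn towards it.
pos, vel = pso_step([0.0], [0.0], personal_best=[10.0], global_best=[10.0])
print(pos, vel)
```

In this formulation, a particle that already sits on both its personal best and the global best with zero velocity stays put, while the other particles keep oscillating around that point, which amounts to a local search near the best member.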

##### De Jong 1st

De Jong 1st is very similar to Csendes and Quartic. Once again, the same reasoning applies.
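
The similarity between these three functions is visible in their commonly cited textbook forms, sketched below. These are assumptions: the actual definitions in `benchmark_functions.py` are not shown here and may differ (for instance, Quartic is often defined with an additive random noise term, which is omitted in this sketch).

```python
import math


def de_jong_1(x):
    """De Jong 1st (sphere): sum of squares, minimum 0 at the origin."""
    return sum(xi ** 2 for xi in x)


def quartic(x):
    """Quartic without the noise term: sum of i * x_i^4, minimum 0 at the origin."""
    return sum(i * xi ** 4 for i, xi in enumerate(x, start=1))


def csendes(x):
    """Csendes: sum of x_i^6 * (2 + sin(1 / x_i)); the term is taken as 0 at x_i = 0."""
    return sum(xi ** 6 * (2 + math.sin(1 / xi)) for xi in x if xi != 0)


for f in (de_jong_1, quartic, csendes):
    print(f.__name__, f([0.0, 0.0]), f([1.0, 2.0]))
```

All three are sums of even powers of the coordinates with one minimum at the origin, so they form the same kind of bowl and differ mainly in how steep the walls are.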

#### Dimension 10

##### Svanda 2nd

Svanda 2nd is a very difficult problem. It has many local minima and maxima, which become more pronounced closer to the global minimum. Population exploration seems to perform very badly here, with SOMA ato doing best among those algorithms (presumably thanks to a better exploration pattern via migration steps towards the best member). DE best/1/bin performed best overall, as basing the new population on the best member of the previous population lets it explore in the direction of local minima.
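
The difference between the two DE strategies comes down to the base vector used in mutation. The following is a minimal sketch of the standard rand/1 and best/1 mutations with binomial ("bin") crossover. It is not the code of `ea.differential_evolution`; the function names and the crossover rate `CR` are hypothetical and introduced only for illustration.

```python
import random


def mutate(population, costs, i, F, strategy, rng=random):
    """Build a mutant vector for target index i (lower cost is better)."""
    candidates = [j for j in range(len(population)) if j != i]
    if strategy == "rand/1":
        # Base vector is a random member: more exploration.
        r1, r2, r3 = rng.sample(candidates, 3)
        base = population[r1]
    elif strategy == "best/1":
        # Base vector is the current best member: faster, greedier search.
        r2, r3 = rng.sample(candidates, 2)
        best = min(range(len(population)), key=lambda j: costs[j])
        base = population[best]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [b + F * (a - c)
            for b, a, c in zip(base, population[r2], population[r3])]


def binomial_crossover(target, mutant, CR, rng=random):
    """Take each component from the mutant with probability CR (at least one always)."""
    j_rand = rng.randrange(len(target))
    return [m if (rng.random() < CR or j == j_rand) else t
            for j, (t, m) in enumerate(zip(target, mutant))]


# Toy population (made up): four members in two dimensions.
pop = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
costs = [5.0, 1.0, 2.0, 3.0]
trial = binomial_crossover(pop[0], mutate(pop, costs, 0, 0.5, "best/1"), CR=0.9)
print(trial)
```

With best/1, every trial vector is a perturbation of the current best member, which matches the behaviour described above; with rand/1, the base is a random member, so the search stays more spread out.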

##### Schwefel

Schwefel has relatively gentle hills and pits that are nevertheless difficult to escape. The competitive result of DE rand/1/bin confirms this. By far the best performance came from SOMA ato, as it could explore the terrain effectively and escape local minima within a few steps.
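
The migration steps mentioned above can be sketched as follows, assuming the usual SOMA formulation in which an individual jumps along the line towards a leader and keeps the best point it visits. This is not the code of `ea.soma_ato` or `ea.soma_ata`; the function names and the values of `path_length`, `step` and `prt` are illustrative assumptions.

```python
import random


def migrate(individual, leader, cost, path_length=3.0, step=0.11, prt=0.1, rng=random):
    """Move one individual towards a leader in discrete jumps; keep the best point visited."""
    best, best_cost = list(individual), cost(individual)
    t = step
    while t <= path_length:
        # PRT vector: each dimension takes part in this jump with probability prt.
        mask = [1 if rng.random() < prt else 0 for _ in individual]
        candidate = [x + (l - x) * t * m for x, l, m in zip(individual, leader, mask)]
        c = cost(candidate)
        if c < best_cost:
            best, best_cost = candidate, c
        t += step
    return best, best_cost


def soma_leaders(population, costs, i, variant):
    """Indices that individual i migrates towards in one migration loop."""
    if variant == "ato":  # all-to-one: everyone moves towards the current best
        leader = min(range(len(population)), key=lambda j: costs[j])
        return [] if leader == i else [leader]
    if variant == "ata":  # all-to-all: everyone moves towards every other member
        return [j for j in range(len(population)) if j != i]
    raise ValueError(f"unknown variant: {variant}")


def sphere(x):
    """Simple test objective used only for this illustration."""
    return sum(v * v for v in x)


print(migrate([4.0, 4.0], [0.0, 0.0], sphere, prt=1.0))
```

Because `path_length` is greater than 1, the jumps overshoot the leader, which is one way such a scheme can step over a shallow pit instead of settling into it. The all-to-all variant spends its evaluations on many more migrations per loop than the all-to-one variant.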

##### Svanda 6th

This is another difficult function, requiring the traversal of flat terrain in search of the best narrow pits. Performance is very comparable between all algorithms (except SOMA ata), and this holds across all tested dimensions. This suggests that neither the more stochastic nor the more deterministic algorithms have an advantage here. The reason might be the low number of steps available to the algorithms.

##### Ackley 1st Altered

As mentioned for dimension 2, this is a very difficult function. The results are mostly the same, with PSO being clearly the best and SOMA ata performing worse as the problem became even more difficult with more dimensions.

##### Custom 2nd

Custom 2nd shares some similarities with Zakharov, combining sloped terrain with a relatively flat area around the base. This function, however, has multiple pronounced minima. All algorithms were comparable here, apart from SOMA ata. The results suggest that the problem was fairly difficult, as DE rand/1/bin performed fairly well. Exploration in the direction of the best member of the population was beneficial, as it allowed the algorithms to slide down the slope and explore the flat terrain around the base. SOMA ato performed best, likely thanks to more efficient exploration via migration steps (both on the slopes and at the base).

#### Dimension 30

##### Svanda 2nd

Interestingly, on this benchmark every algorithm ranked in the same position in all 30 runs. The function has many local minima, increasingly deep around the global minimum, with hills to match. The same reasoning as in dimension 10 applies.

##### Zakharov

An interesting result. Zakharov is seemingly a very simple function, akin to Quartic and De Jong 1st, yet its U-like shape poses different challenges. It seems that the mainly stochastic DE rand/1/bin was not enough. DE best/1/bin is indisputably the best, presumably thanks to its fast exploration in the direction of the current best result. SOMA ato did well too, being able to explore flat terrain effectively. A certain similarity with Michalewicz can be seen.

##### Ackley 1st

It seems that this function's structure was too difficult to traverse effectively while escaping local minima, resulting in the best performance from DE rand/1/bin (due to its stochastic nature). SOMA ato came second, showing that migration-based exploration was capable of traversing this difficult terrain.

##### Ackley 1st Altered

The same reasoning as in dimensions 2 and 10 applies. PSO did very well, being able to traverse difficult terrain and converge on minima in the vicinity of the best member.

##### Michalewicz

This function requires the ability to explore flat terrain effectively. The reasoning is much the same as for Zakharov. The only significant difference is the function's structure: most of the landscape returns high values, with only a small area of low values. Because of this, an algorithm like SOMA ata could not perform well, as its many directional migrations did not have enough steps to find a way into the low-value area. This also explains the better showing of DE rand/1/bin here than on Zakharov: with the other algorithms less able to find good values, DE rand/1/bin could perform competitively.
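
The structural difference between the two functions can be sketched from their commonly cited textbook forms. These are assumptions: the definitions in `benchmark_functions.py` are not shown here and may differ, and the steepness parameter `m=10` is the value usually quoted for Michalewicz rather than a setting confirmed by this notebook.

```python
import math


def zakharov(x):
    """Zakharov: a sum of squares plus two powers of a weighted linear term."""
    s1 = sum(xi ** 2 for xi in x)
    s2 = sum(0.5 * i * xi for i, xi in enumerate(x, start=1))
    return s1 + s2 ** 2 + s2 ** 4


def michalewicz(x, m=10):
    """Michalewicz: close to 0 almost everywhere, with narrow valleys of low values."""
    return -sum(math.sin(xi) * math.sin(i * xi ** 2 / math.pi) ** (2 * m)
                for i, xi in enumerate(x, start=1))


print(zakharov([1.0, 1.0]))        # a point on the slope
print(michalewicz([0.5, 0.5]))     # a point on the plateau
print(michalewicz([2.20, 1.57]))   # a point near the usually quoted 2-D minimum
```

In this form, Zakharov slopes towards the origin from every direction, whereas Michalewicz gives almost no gradient information until a search point lands inside one of its narrow valleys.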


