# LAB9

Write a local-search algorithm (e.g. an EA) able to solve the *Problem* instances 1, 2, 5, and 10 on 1000-loci genomes, using a minimum number of fitness calls. That's all.

### Deadlines:

* Submission: Sunday, December 3 ([CET](https://www.timeanddate.com/time/zones/cet))
* Reviews: Sunday, December 10 ([CET](https://www.timeanddate.com/time/zones/cet))

Notes:

* Reviews will be assigned on Monday, December 4
* You need to commit in order to be selected as a reviewer (i.e. better to commit an empty work than not to commit)


## Work
This code was designed, programmed and tested by:

* Giacomo Fantino
* Farisan Fekri
* Lorenzo Bonannella 
* Giacomo Cauda

In [1]:
from dataclasses import dataclass
from random import randint,choice, choices, random, sample
from copy import copy
import lab9_lib

In [24]:
'''
The fitness function is instantiated here for convenience;
the argument of make_problem selects the problem instance (1, 2, 5 or 10).
After each algorithm the fitness-call counter is reset to zero.
'''

fitness = lab9_lib.make_problem(10)

## A simple Evolutionary Algorithm

In [None]:
"""
Basic configuration used for the first part:
POPULATION_SIZE = 30
OFFSPRING_SIZE = 20
TOURNAMENT_SIZE = 3
MUTATION_PROBABILITY = .2
"""

# Reduced configuration used for the second part (fewer fitness calls)
POPULATION_SIZE = 20
OFFSPRING_SIZE = 20
TOURNAMENT_SIZE = 3
MUTATION_PROBABILITY = .2

@dataclass
class Individual:
    fitness: int
    genotype: list[int]

population = [
    Individual(
        genotype=choices([0, 1], k=1000),
        fitness=None,
    )
    for _ in range(POPULATION_SIZE)
]

for i in population:
    i.fitness = fitness(i.genotype)

def select_parent(pop):
    pool = [choice(pop) for _ in range(TOURNAMENT_SIZE)]  
    champion = max(pool, key=lambda i: i.fitness)
    return champion

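def mutate(ind: Individual) -> Individual:
    # copy(ind) is shallow: the offspring would share the parent's gene list
    # and the flip below would silently change the parent too.
    offspring = Individual(fitness=None, genotype=ind.genotype.copy())
    pos = randint(0, len(offspring.genotype) - 1)
    offspring.genotype[pos] = 1 - offspring.genotype[pos]  # flip one bit, keep it an int
    return offspring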


def one_cut_xover(ind1: Individual, ind2: Individual) -> Individual:
    cut_point = randint(0, len(ind1.genotype)) 
    offspring = Individual(fitness=None,
                           genotype=ind1.genotype[:cut_point] + ind2.genotype[cut_point:])
    return offspring

"""
In the first part we used a large number of generations (10_000) to find
an upper bound on the fitness value, without worrying about the
number of fitness calls.
"""

# for generation in range(10_000):  # used in the first part
for generation in range(1_500):
    offspring = list() 
    for counter in range(OFFSPRING_SIZE):
        if random() < MUTATION_PROBABILITY:
            p = select_parent(population)
            o = mutate(p)
        else:
            # xover # add more xovers
            p1 = select_parent(population)
            p2 = select_parent(population)
            o = one_cut_xover(p1, p2)
        offspring.append(o) 

    for i in offspring:
        i.fitness = fitness(i.genotype)
    population.extend(offspring) 
    population.sort(key=lambda i: i.fitness, reverse=True) 
    population = population[:POPULATION_SIZE] 
    print(f"{population[0].fitness:.2%}")


print(fitness.calls)
fitness._calls = 0
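
A note on copying individuals: because `Individual` keeps its genes in a list, `copy.copy` duplicates only the dataclass wrapper, and both objects end up pointing at the same list. The stand-alone sketch below illustrates the difference between a shallow copy and an explicit copy of the genotype. The 5-gene genome is chosen only for readability and is not part of the lab setup.

```python
from copy import copy
from dataclasses import dataclass


@dataclass
class Individual:
    fitness: float
    genotype: list


parent = Individual(fitness=0.0, genotype=[0, 0, 0, 0, 0])

# Shallow copy: a new Individual object, but the SAME genotype list.
shallow = copy(parent)
shallow.genotype[0] = 1
shared = shallow.genotype is parent.genotype
parent_after_shallow = list(parent.genotype)  # the parent changed as well

# Explicit copy of the gene list: the parent is no longer affected.
independent = Individual(fitness=None, genotype=parent.genotype.copy())
independent.genotype[1] = 1
parent_after_independent = list(parent.genotype)

print(shared, parent_after_shallow, parent_after_independent)
```

With a shared list, a mutated offspring also changes its parent's genes while the parent keeps its old fitness value, so copying the list explicitly is the safer choice for a mutation operator.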

## ES with Diversity in Parent selection

In [None]:
POPULATION_SIZE = 30
OFFSPRING_SIZE = 20
TOURNAMENT_SIZE = 3
MUTATION_PROBABILITY = .2

@dataclass
class Individual:
    fitness: int
    genotype: list[int]

population = [
    Individual(
        genotype=choices([0, 1], k=1000),
        fitness=None,
    )
    for _ in range(POPULATION_SIZE)
]

for i in population:
    i.fitness = fitness(i.genotype)

def select_parent(pop):
    pool = [choice(pop) for _ in range(TOURNAMENT_SIZE)]  
    champion = max(pool, key=lambda i: i.fitness)
    return champion

def most_diverse_couple(k=5):
    best_couple = None
    best_diversity = -1
    for _ in range(k):
        #select a random couple of parents
        p1, p2 = choice(population), choice(population)
        diversity_value = diversity(p1, p2)
        if diversity_value > best_diversity:
            best_couple = p1, p2
            best_diversity = diversity_value
    return best_couple

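def mutate(ind: Individual) -> Individual:
    # copy(ind) is shallow: the offspring would share the parent's gene list
    # and the flip below would silently change the parent too.
    offspring = Individual(fitness=None, genotype=ind.genotype.copy())
    pos = randint(0, len(offspring.genotype) - 1)
    offspring.genotype[pos] = 1 - offspring.genotype[pos]  # flip one bit
    return offspring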


def one_cut_xover(ind1: Individual, ind2: Individual) -> Individual:
    cut_point = randint(0, len(ind1.genotype)) 
    offspring = Individual(fitness=None,
                           genotype=ind1.genotype[:cut_point] + ind2.genotype[cut_point:])
    return offspring


def diversity(ind1: Individual, ind2: Individual) -> float:
    # normalised Hamming distance: fraction of loci where the two genotypes differ
    diff = sum(g1 != g2 for g1, g2 in zip(ind1.genotype, ind2.genotype))
    return diff / len(ind1.genotype)

for generation in range(10_000): 
    offspring = list() 
    for counter in range(OFFSPRING_SIZE):
        if random() < MUTATION_PROBABILITY: 
            p = select_parent(population)
            o = mutate(p)
        else:
            p1, p2 = most_diverse_couple()
            o = one_cut_xover(p1, p2)
        offspring.append(o) 

    for i in offspring:
        i.fitness = fitness(i.genotype)
    population.extend(offspring) 
    population.sort(key=lambda i: i.fitness, reverse=True) 
    population = population[:POPULATION_SIZE] 
    print(f"{population[0].fitness:.2%}")


print(fitness.calls)
fitness._calls = 0

## ES + Adaptiveness

In [None]:
n_mutation = 1
POPULATION_SIZE = 30
OFFSPRING_SIZE = 20
TOURNAMENT_SIZE = 3
MUTATION_PROBABILITY = .2

@dataclass
class Individual:
    fitness: int
    genotype: list[int]

population = [
    Individual(
        genotype=choices([0, 1], k=1000),
        fitness=None,
    )
    for _ in range(POPULATION_SIZE)
]

for i in population:
    i.fitness = fitness(i.genotype)

def select_parent(pop):
    pool = [choice(pop) for _ in range(TOURNAMENT_SIZE)]  
    champion = max(pool, key=lambda i: i.fitness)
    return champion

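def mutate(ind: Individual) -> Individual:
    # copy(ind) is shallow: copy the gene list so the parent is left untouched
    offspring = Individual(fitness=None, genotype=ind.genotype.copy())
    # flip n_mutation distinct loci (n_mutation is adapted in the main loop)
    for p in sample(range(len(offspring.genotype)), k=n_mutation):
        offspring.genotype[p] = 1 - offspring.genotype[p]
    return offspring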


def one_cut_xover(ind1: Individual, ind2: Individual) -> Individual:
    cut_point = randint(0, len(ind1.genotype)) 
    offspring = Individual(fitness=None,
                           genotype=ind1.genotype[:cut_point] + ind2.genotype[cut_point:])
    return offspring


def diversity(ind1: Individual, ind2: Individual) -> float:
    # normalised Hamming distance: fraction of loci where the two genotypes differ
    diff = sum(g1 != g2 for g1, g2 in zip(ind1.genotype, ind2.genotype))
    return diff / len(ind1.genotype)

previous_fitness=-1
counter_same_fitness = 0
for generation in range(10_000): 
    offspring = list() 
    for counter in range(OFFSPRING_SIZE):
        if random() < MUTATION_PROBABILITY:
            p = select_parent(population)
            o = mutate(p)
        else:
            # xover # add more xovers
            p1 = select_parent(population)
            p2 = select_parent(population)
            o = one_cut_xover(p1, p2)
        offspring.append(o) 

    for i in offspring:
        i.fitness = fitness(i.genotype)
    population.extend(offspring) 
    population.sort(key=lambda i: i.fitness, reverse=True) 
    population = population[:POPULATION_SIZE] 
    
    
    if population[0].fitness>previous_fitness:
        print(f"generation {generation} fitness {population[0].fitness:.2%}")
        previous_fitness = population[0].fitness
        counter_same_fitness = 0
        n_mutation = 1  # improvement found: immediately back to exploitation
    else:
        counter_same_fitness += 1
        print(f"generation {generation} mutate {n_mutation} fitness {population[0].fitness:.2%}")
        # once 5 generations pass without improvement, increase n_mutation by one per generation
        if counter_same_fitness >= 5:
            n_mutation = min(1000, n_mutation + 1)  # exploration is capped at the genome length

print(fitness.calls)
fitness._calls = 0
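
The stagnation rule used in the loop above can be isolated as a small pure function, which makes it easier to check on its own. This is only a sketch: `adapt_n_mutation`, `patience` and `max_flips` are hypothetical names introduced here, with defaults taken from the values used in the loop (5 and 1000), and the sequence of fitness values is made up for illustration.

```python
def adapt_n_mutation(best, previous_best, stalled, n_mutation,
                     patience=5, max_flips=1000):
    """Return the updated (previous_best, stalled, n_mutation) after one generation."""
    if best > previous_best:
        # Improvement: reset the stagnation counter and go back to single-bit flips.
        return best, 0, 1
    stalled += 1
    if stalled >= patience:
        # Stagnation: flip one more bit per mutation, capped at max_flips.
        n_mutation = min(max_flips, n_mutation + 1)
    return previous_best, stalled, n_mutation


# Replay a made-up sequence of best-fitness values, one per generation.
state = (-1, 0, 1)  # previous_best, stalled generations, n_mutation
history = []
for best in [0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.52]:
    state = adapt_n_mutation(best, *state)
    history.append(state[2])
print(history)
```

Note that once the threshold is reached, `n_mutation` keeps growing by one every generation until the best fitness improves, at which point it drops straight back to 1.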

## ES + Diversity + Adaptiveness

In [None]:
n_mutation = 1
POPULATION_SIZE = 30
OFFSPRING_SIZE = 20
TOURNAMENT_SIZE = 3
MUTATION_PROBABILITY = .2

@dataclass
class Individual:
    fitness: int
    genotype: list[int]

population = [
    Individual(
        genotype=choices([0, 1], k=1000),
        fitness=None,
    )
    for _ in range(POPULATION_SIZE)
]

for i in population:
    i.fitness = fitness(i.genotype)

def select_parent(pop):
    pool = [choice(pop) for _ in range(TOURNAMENT_SIZE)]  
    champion = max(pool, key=lambda i: i.fitness)
    return champion

def most_diverse_couple(k=10):
    best_couple = None
    best_diversity = -1
    for _ in range(k):
        #select a random couple of parents
        p1, p2 = choice(population), choice(population)
        diversity_value = diversity(p1, p2)
        if diversity_value > best_diversity:
            best_couple = p1, p2
            best_diversity = diversity_value
    return best_couple

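def mutate(ind: Individual) -> Individual:
    # copy(ind) is shallow: copy the gene list so the parent is left untouched
    offspring = Individual(fitness=None, genotype=ind.genotype.copy())
    # flip n_mutation distinct loci (n_mutation is adapted in the main loop)
    for p in sample(range(len(offspring.genotype)), k=n_mutation):
        offspring.genotype[p] = 1 - offspring.genotype[p]
    return offspring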


def one_cut_xover(ind1: Individual, ind2: Individual) -> Individual:
    cut_point = randint(0, len(ind1.genotype)) 
    offspring = Individual(fitness=None,
                           genotype=ind1.genotype[:cut_point] + ind2.genotype[cut_point:])
    return offspring


def diversity(ind1: Individual, ind2: Individual) -> float:
    # normalised Hamming distance: fraction of loci where the two genotypes differ
    diff = sum(g1 != g2 for g1, g2 in zip(ind1.genotype, ind2.genotype))
    return diff / len(ind1.genotype)

previous_fitness=-1
counter_same_fitness = 0
for generation in range(10_000): 
    offspring = list() 
    for counter in range(OFFSPRING_SIZE):
        if random() < MUTATION_PROBABILITY:
            p = select_parent(population)
            o = mutate(p)
        else:
            p1, p2 = most_diverse_couple()
            o = one_cut_xover(p1, p2)
        offspring.append(o) 

    for i in offspring:
        i.fitness = fitness(i.genotype)
    population.extend(offspring) 
    population.sort(key=lambda i: i.fitness, reverse=True) 
    population = population[:POPULATION_SIZE] 
    
    
    if population[0].fitness>previous_fitness:
        print(f"generation {generation} fitness {population[0].fitness:.2%}")
        previous_fitness = population[0].fitness
        counter_same_fitness = 0
        n_mutation = 1  # improvement found: immediately back to exploitation
    else:
        counter_same_fitness += 1
        print(f"generation {generation} mutate {n_mutation} fitness {population[0].fitness:.2%}")
        # once 5 generations pass without improvement, increase n_mutation by one per generation
        if counter_same_fitness >= 5:
            n_mutation = min(1000, n_mutation + 1)  # exploration is capped at the genome length

print(fitness.calls)
fitness._calls = 0

For each problem instance we recorded, for each algorithm, the best fitness reached and the number of fitness calls.
The idea was that the other algorithms should outperform the first one, and should therefore reach a similar result
with fewer individuals and fewer generations.

# Results

Here is what we got using the same configuration for every algorithm (10_000 generations, population size = 30 and offspring size = 20). Each cell is the best fitness reached on that problem instance:

| Problem |  ES    | ES+Div | ES+Adapt|ES+Div+Adapt|
|------|--------|--------|---------|------------|
|  1   | 94.60% | 89.60% | 89.50%  | 81.65%     |
|  2   | 96.04% | 93.00% | 94.72%  | 87.42%     |
|  5   | 98.78% | 96.11% | 98.06%  | 88.28%     |
|  10  | 98.89% | 94.57% | 98.45%  | 87.11%     |

A couple of comments:
1) We were not able to reduce the number of fitness calls with the other algorithms, since they gave similar or worse results than ES.
2) Applying diversity in parent selection wasn't effective. Maybe picking the most diverse of k random couples wasn't the best criterion for this particular task.
3) Combining diversity with the adaptive strategy always gave the worst results.
4) In general, rerunning the same algorithm multiple times can yield results that differ by up to about ±1.5%.
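
Given the run-to-run variation noted in point 4, comparisons between algorithms are more reliable when each one is run several times and the results are summarised. The sketch below shows one way to do that. `run_once` is a hypothetical stand-in that only returns an arbitrary noisy number; in the notebook it would be replaced by a complete run of one algorithm returning its best fitness.

```python
from random import Random
from statistics import mean, stdev


def run_once(seed: int) -> float:
    """Hypothetical stand-in for one complete run; the values are arbitrary."""
    rng = Random(seed)
    return 0.90 + rng.uniform(-0.015, 0.015)


def summarize(n_runs: int = 10) -> tuple[float, float, float, float]:
    """Run the algorithm n_runs times and return (mean, stdev, min, max)."""
    results = [run_once(seed) for seed in range(n_runs)]
    return mean(results), stdev(results), min(results), max(results)


avg, sd, lo, hi = summarize()
print(f"mean {avg:.2%}  stdev {sd:.2%}  min {lo:.2%}  max {hi:.2%}")
```

Reporting the mean together with the spread makes it clearer whether a gap between two algorithms is larger than the noise between runs.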

Since the best algorithm is ES, we then tried to reduce its number of fitness calls on each problem instance:

| Problem | Fitness | Calls |
|------|---------|-------|
|  1   | 94.40%  | 30020 |
|  2   | 96.46%  | 30020 |
|  5   | 98.28%  | 30020 |
|  10  | 98.11%  | 30020 |

To lower the number of fitness calls we set both the population size and the offspring size to 20 and limited the run to 1_500 generations. This led to very similar results with a large reduction in the number of calls to the fitness function.
With lower values the evolutionary process is stopped far too early and the results get worse.
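
The call counts follow directly from the loop structure: the initial population is evaluated once, and every generation then evaluates only the new offspring. A quick check of the arithmetic, with `fitness_calls` as a hypothetical helper name:

```python
def fitness_calls(population_size: int, offspring_size: int, generations: int) -> int:
    """Initial evaluations plus one evaluation per offspring per generation."""
    return population_size + offspring_size * generations


# Reduced configuration: population 20, offspring 20, 1_500 generations.
reduced = fitness_calls(population_size=20, offspring_size=20, generations=1_500)
# First configuration: population 30, offspring 20, 10_000 generations.
full = fitness_calls(population_size=30, offspring_size=20, generations=10_000)

print(reduced)  # 30020, the value reported in the table
print(full)     # 200030
```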

Lastly, we tried the same reduced configuration with the other algorithms, but since their results were already worse than those of ES, using fewer generations didn't change anything.