# Evolutionary Computing Exercise No. 3
**Stu. Name:** Mohammad Amin Dadgar

**Stu. Id:** 4003624016

## Algorithm Configurations

### Representation
The representation used by our algorithm has three levels.

- The first level shows the overall network architecture, in which the green boxes are optional (each can be absent or present, encoded as 0 or 1)

    <img src='architecture_level1.png'>
- The second level represents a transformer layer

    <img src='architecture_level2.png'>
- The third level shows the feed-forward network (FFN) architecture

    <img src='architecture_level3.png'>

Each gene takes an integer value from 0 to 9. In the exercise each hyperparameter has 2, 3, or 4 possible values, so we map the gene value to a hyperparameter index using the intervals below.

**Two-valued hyperparameters:**
- 0:4 → 0
- 5:9 → 1

**Three-valued hyperparameters:**
- 0:3 → 0
- 4:6 → 1
- 7:9 → 2

**Four-valued hyperparameters:**
- 0:1 → 0
- 2:4 → 1
- 5:7 → 2
- 8:9 → 3

**100-valued hyperparameters:**
- 0 → 0
- 1 → 10
- 2 → 20
- ...
- 9 → 90
- A value of 100 would not really work for the 100-valued dropout probability, so we exclude it from the possible values.
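
The interval tables above can be sketched as a small lookup. This is only an illustration: the names `INTERVAL_UPPER_BOUNDS`, `gene_to_index`, and `gene_to_dropout_percent` are hypothetical and are not the helpers used in the notebook's `util` module.

```python
# Hypothetical helpers (names are illustrative, not taken from the notebook's util module).
# Upper bound of each interval, copied from the tables above.
INTERVAL_UPPER_BOUNDS = {
    2: (4, 9),        # 0:4 -> 0, 5:9 -> 1
    3: (3, 6, 9),     # 0:3 -> 0, 4:6 -> 1, 7:9 -> 2
    4: (1, 4, 7, 9),  # 0:1 -> 0, 2:4 -> 1, 5:7 -> 2, 8:9 -> 3
}


def gene_to_index(gene, n_values):
    """Return the index of the hyperparameter value selected by a gene (digit 0-9)."""
    if not 0 <= gene <= 9:
        raise ValueError("gene must be a digit between 0 and 9")
    for index, upper in enumerate(INTERVAL_UPPER_BOUNDS[n_values]):
        if gene <= upper:
            return index


def gene_to_dropout_percent(gene):
    """Map a gene to a dropout percentage: 0 -> 0, 1 -> 10, ..., 9 -> 90."""
    if not 0 <= gene <= 9:
        raise ValueError("gene must be a digit between 0 and 9")
    return gene * 10
```

For example, `gene_to_index(4, 3)` selects index `1` of a three-valued hyperparameter, and `gene_to_dropout_percent(9)` gives `90`.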


So a chromosome requires `1+3×(3+3×2)+3` genes, each holding one digit. The leading `1` is $d_{model}$. In the term `3×(3+3×2)`, the outer `3` is the number of possible transformer layers. Inside the parentheses, the first `3` covers the genes for the attention head count and the normalization layers, and `3×2` is the count of FFN hyperparameters (`3`) times the number of FFN layers a transformer layer can have (`2`). The last `3` in the expression is the final FFN layer of the network. In total `1+27+3 = 31` genes are used.
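
As a quick check of this arithmetic, the sketch below recomputes the chromosome length and the start index of each first-level block. The constant names are assumptions made for illustration; they do not appear in the exercise code.

```python
# Hypothetical constants; the names are illustrative only.
D_MODEL_GENES = 1             # leading 1: d_model
TRANSFORMER_LAYERS = 3        # outer 3: possible transformer layers
ATTENTION_AND_NORM_GENES = 3  # first 3 inside the parentheses
FFN_GENES = 3                 # hyperparameters per FFN layer
FFN_PER_TRANSFORMER = 2       # FFN layers inside one transformer layer
FINAL_FFN_GENES = 3           # trailing 3: final FFN layer

genes_per_transformer = ATTENTION_AND_NORM_GENES + FFN_GENES * FFN_PER_TRANSFORMER
chromosome_length = (
    D_MODEL_GENES
    + TRANSFORMER_LAYERS * genes_per_transformer
    + FINAL_FFN_GENES
)

# Start index of each block: d_model, the three transformer layers, the final FFN.
block_starts = [0, D_MODEL_GENES]
for _ in range(TRANSFORMER_LAYERS):
    block_starts.append(block_starts[-1] + genes_per_transformer)

print(chromosome_length)  # 31
print(block_starts)       # [0, 1, 10, 19, 28]
```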

### Combinations
Both combination methods, recombination and mutation, follow the three-level architecture. Chromosomes are combined at the first level first. Once the first-level architecture of the new chromosome is found, the second level is combined in a population-based way (more than two parents), and for the third level the matching parts of the parent chromosomes are again treated as multiple chromosomes. Each method is explained in the subsections below.

#### Recombination
We apply one-point crossover at the first level of the representation: for the 31-gene representation the break point can be 1, 10, or 19. For the second and third levels a probabilistic uniform crossover is used (for example, the feed-forward hyperparameters are repeated 7 times and the attention head count hyperparameter is repeated 3 times).
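
A minimal sketch of these two operators, assuming chromosomes are plain Python lists of digits. The function names and the `swap_probability` parameter are hypothetical; the notebook's own `single_point` function is not shown here and may differ.

```python
import random

# Break points named in the text above.
FIRST_LEVEL_BREAK_POINTS = (1, 10, 19)


def one_point_first_level(parent1, parent2, rng=random):
    """One-point crossover whose cut is restricted to the first-level break points."""
    point = rng.choice(FIRST_LEVEL_BREAK_POINTS)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def uniform_crossover(segment1, segment2, swap_probability=0.5, rng=random):
    """Uniform crossover: each position is swapped independently with swap_probability."""
    child1, child2 = list(segment1), list(segment2)
    for i in range(len(child1)):
        if rng.random() < swap_probability:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2
```

Restricting the cut to block boundaries keeps each transformer layer's genes together during first-level recombination, while the uniform operator mixes genes inside a block.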

#### Mutation
An integer mutation is used at the first level, which represents the availability of the transformer layers; for the second and third levels a simple integer mutation is used.
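
One common integer mutation is creep mutation, sketched below under the assumption that each gene is a digit from 0 to 9. The name `creep_mutation_sketch` and the `max_step` parameter are hypothetical; the notebook imports its own `mutation_creep`, whose implementation is not shown and may differ.

```python
import random


def creep_mutation_sketch(chromosome, p_m, max_step=1, rng=random):
    """With probability p_m per gene, add a small non-zero step and clip to 0..9."""
    mutated = list(chromosome)  # copy so the parent is left untouched
    steps = [s for s in range(-max_step, max_step + 1) if s != 0]
    for i, gene in enumerate(mutated):
        if rng.random() < p_m:
            mutated[i] = min(9, max(0, gene + rng.choice(steps)))
    return mutated
```

Because the result is clipped, a gene already at 0 or 9 can stay unchanged even when it is selected for mutation.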

### End Condition
The end condition is ... TODO

In [15]:
from population import generate_population
from combination import mutation_creep, single_point
from util import convert_genotype_to_phenotype_values, map_hyperparameters
from fitness import static_fitness
from selection import binary_tournament
import numpy as np

In [8]:
pop = generate_population()
convert_genotype_to_phenotype_values(pop[3])

(64,
 ((20, 'S', 0.1, True), (30, 'S', 0.5, False), 2),
 ((20, 'S', 0.5, True), (5, 'S', 0.6, False), 2),
 ((10, 'R', 0.9, False), (5, 'R', 0.5, True), 4),
 (30, 'S', 0.4))

In [19]:
def algorithm_run(pop_count, SELECTION_METHOD, FITNESS_FUNCTION, MUTATION_METHOD, RECOMBINATION_METHOD, p_m=0.1, p_c=0.9, max_generations=10):
    """
    Run the evolutionary algorithm for `max_generations` generations and
    return the best chromosome found together with its fitness.
    Survivors are picked by ascending fitness, so lower fitness is treated as better.
    """

    population = generate_population(pop_size=pop_count)
    fitness_pop = []
    for chromosome in population:
        chromosome_fitness = FITNESS_FUNCTION(chromosome)
        fitness_pop.append(chromosome_fitness)

    best_chromosome = None
    best_chromsome_fitness = None

    for generation_idx in range(max_generations):
        print(f'Generation Number: {generation_idx}')

        ## create the pairs of parents
        parent_pairs = []
        for _ in range(pop_count):
            pair = SELECTION_METHOD(population, fitness_pop)
            parent_pairs.append(pair)

        
        offsprings = []
        fitness_offsprings = []
        for parents in parent_pairs:
            recombination_p = np.random.random()

            ## the offspring for this iteration
            ## first save the parents to change them later
            iteration_offspring = [parents[0], parents[1]]
            
            ######## Recombination ########
            if recombination_p < p_c:
                offspring1, offspring2 =  RECOMBINATION_METHOD(iteration_offspring[0], iteration_offspring[1])

                iteration_offspring = [offspring1, offspring2]

            ######## Mutation ########
            offspring1 = MUTATION_METHOD(iteration_offspring[0], p_m)
            offspring2 = MUTATION_METHOD(iteration_offspring[1], p_m)

            iteration_offspring = [offspring1, offspring2]
                
            ## finally append the generated offspring to the offspring array
            offsprings.append(iteration_offspring[0])
            offsprings.append(iteration_offspring[1])
            
            fitness_offsprings.append(FITNESS_FUNCTION(iteration_offspring[0]))
            fitness_offsprings.append(FITNESS_FUNCTION(iteration_offspring[1]))
            
                
        ######## Replacement ########

        ## the whole generation: parents + offsprings
        generation_population = population.copy()
        generation_population.extend(offsprings)

        ## whole generation fitness: parents fitness + offsprings fitness
        generation_fitness = fitness_pop.copy()
        generation_fitness.extend(fitness_offsprings)

        ## the sorted generation
        generation_population_sorted = np.array(generation_population)[np.argsort(generation_fitness)]
        generation_fitness_sorted = np.sort(generation_fitness)

        ## keep the pop_count individuals with the lowest fitness from parents + offsprings
        best_of_generation_population = generation_population_sorted[:pop_count]
        best_of_generation_fitness = generation_fitness_sorted[:pop_count]

        best_chromosome = generation_population_sorted[0]
        best_chromsome_fitness = generation_fitness_sorted[0]
        
        ## save them into the original population arrays
        population = best_of_generation_population.tolist()
        fitness_pop = best_of_generation_fitness.tolist()

    
    return best_chromosome, best_chromsome_fitness
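
The replacement step in the function above merges parents and offspring and keeps the `pop_count` individuals with the lowest fitness. A standalone sketch of that step, written without NumPy, is given below; the function name `mu_plus_lambda_replacement` is hypothetical, and treating lower fitness as better is an assumption taken from the ascending sort in the code.

```python
def mu_plus_lambda_replacement(parents, parent_fitness, offspring, offspring_fitness):
    """Keep the len(parents) individuals with the lowest fitness from parents + offspring."""
    mu = len(parents)
    combined = list(parents) + list(offspring)
    combined_fitness = list(parent_fitness) + list(offspring_fitness)
    # Indices sorted by ascending fitness; sorted() is stable, so ties keep their order.
    order = sorted(range(len(combined)), key=combined_fitness.__getitem__)[:mu]
    survivors = [combined[i] for i in order]
    survivor_fitness = [combined_fitness[i] for i in order]
    return survivors, survivor_fitness


survivors, survivor_fitness = mu_plus_lambda_replacement(
    [[0], [1]], [3.0, 1.0],
    [[2], [3]], [2.0, 0.5],
)
print(survivors)         # [[3], [1]]
print(survivor_fitness)  # [0.5, 1.0]
```

If the fitness is meant to be maximized instead, the sort order would need to be reversed.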

In [66]:
## run the algorithm with a static fitness value
## with a static fitness the search is effectively random; this run is only for debugging the pipeline
answer_chromosome, answer_chromsome_fitness = algorithm_run(pop_count=10, 
                SELECTION_METHOD=binary_tournament, 
                FITNESS_FUNCTION=static_fitness, 
                MUTATION_METHOD=mutation_creep, 
                RECOMBINATION_METHOD=single_point,
                p_m=0.1,
                p_c=0.9,
                max_generations=10)

Generation Number: 0
Generation Number: 1
Generation Number: 2
Generation Number: 3
Generation Number: 4
Generation Number: 5
Generation Number: 6
Generation Number: 7
Generation Number: 8
Generation Number: 9


In [67]:
convert_genotype_to_phenotype_values(answer_chromosome)

(128,
 ((5, 'R', 0.6, True), (20, 'S', 0.2, True), 4),
 ((10, 'S', 0.3, False), (10, 'R', 0.9, True), 1),
 ((10, 'R', 0.8, False), (20, 'R', 0.8, False), 1),
 (5, 'R', 0))