# Evolutionary Computing Exercise No. 3
**Stu. Name:** Mohammad Amin Dadgar

**Stu. Id:** 4003624016

## Algorithm Configurations

### Representation
Representation for our algorithm has three levels.

- The first level shows the overall network architecture, in which the green boxes are optional (present or absent, encoded as 0 or 1)

    <img src='architecture_level1.png'>
- The second level represents the transformer layer

    <img src='architecture_level2.png'>
- The third level shows the FFN (feed-forward network) architecture

    <img src='architecture_level3.png'>

Each gene is a digit from 0 to 9. Each hyperparameter in the exercise takes 2, 3, or 4 possible values, so a gene's digit is converted to a value index using the intervals below:

**Two valued hyperparameters:**
- 0:4 → 0
- 5:9 → 1

**Three valued hyperparameters:**
- 0:3 → 0
- 4:6 → 1
- 7:9 → 2

**Four valued hyperparameters:**
- 0:1 → 0
- 2:4 → 1
- 5:7 → 2
- 8:9 → 3

**100 valued hyperparameters:**
- 0 → 0
- 1 → 10
- 2 → 20
- ...
- 9 → 90
- A value of 100 would not really work for the dropout probability (our 100-valued hyperparameter), so we exclude it from the possible values.
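
The interval tables above can be written as a small lookup. This is a minimal sketch, not the notebook's `util` module: the names `UPPER_BOUNDS`, `digit_to_index`, and `digit_to_dropout` are hypothetical, and the dropout helper assumes the listed values 0 to 90 are percentages.

```python
# Inclusive upper bound of each digit interval, keyed by the number of
# values the hyperparameter can take (copied from the tables above).
UPPER_BOUNDS = {
    2: (4, 9),
    3: (3, 6, 9),
    4: (1, 4, 7, 9),
}


def digit_to_index(digit: int, n_values: int) -> int:
    """Map a gene digit (0-9) to the value index of a 2-, 3-, or 4-valued hyperparameter."""
    if not 0 <= digit <= 9:
        raise ValueError(f"gene digit must be in 0..9, got {digit}")
    for index, upper in enumerate(UPPER_BOUNDS[n_values]):
        if digit <= upper:
            return index
    raise AssertionError("unreachable: the last bound is always 9")


def digit_to_dropout(digit: int) -> float:
    """Map a gene digit to a dropout probability (assumption: digit * 10 percent)."""
    if not 0 <= digit <= 9:
        raise ValueError(f"gene digit must be in 0..9, got {digit}")
    return digit / 10
```

For example, `digit_to_index(6, 3)` falls in the `4:6` interval and selects the second of three values, while `digit_to_dropout(9)` gives the largest allowed dropout probability.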


So representing a chromosome requires `1+3×(3+3×2)+3` genes, each a single digit. The first `1` is $d_{model}$. In the term `3×(3+3×2)`, the leading `3` is the number of possible transformer layers. Inside the parentheses, the first `3` covers the genes for the attention head count and the normalization layers, and `3×2` is the three FFN hyperparameters for each of the two FFN layers a transformer can have. The last `3` in the expression is the final FFN layer of the network. In total, `31` genes are used.
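
The count can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names and `block_slices` are hypothetical, and the block order (first $d_{model}$, then the transformer layers, then the final FFN layer) is an assumption based on the description above.

```python
# Gene counts taken from the expression 1 + 3*(3 + 3*2) + 3.
D_MODEL_GENES = 1
TRANSFORMER_LAYERS = 3
ATTENTION_AND_NORM_GENES = 3
FFN_GENES = 3
FFN_PER_TRANSFORMER = 2
FINAL_FFN_GENES = 3

GENES_PER_TRANSFORMER = ATTENTION_AND_NORM_GENES + FFN_GENES * FFN_PER_TRANSFORMER
CHROMOSOME_LENGTH = (
    D_MODEL_GENES + TRANSFORMER_LAYERS * GENES_PER_TRANSFORMER + FINAL_FFN_GENES
)


def block_slices():
    """Return (start, end) index pairs of the level-1 blocks (assumed order)."""
    slices = [(0, D_MODEL_GENES)]
    start = D_MODEL_GENES
    for _ in range(TRANSFORMER_LAYERS):
        slices.append((start, start + GENES_PER_TRANSFORMER))
        start += GENES_PER_TRANSFORMER
    slices.append((start, start + FINAL_FFN_GENES))
    return slices
```

Here `CHROMOSOME_LENGTH` evaluates to 31, and `block_slices()` yields one slice for $d_{model}$, three nine-gene slices for the transformer layers, and one three-gene slice for the final FFN layer.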

### Recombination and Mutation
Both the recombination and the mutation operators have to respect the three-level architecture. To that end, we implemented mutation at the conceptual level (level 1) of the chromosome, meaning that each transformer layer and the FFN layer is mutated as a pack. For recombination, an ordinary single-point or uniform crossover can be used.
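
A minimal sketch of both operators on digit-string chromosomes follows. It is not the notebook's `combination` module: the function names and the `BLOCKS` layout are hypothetical, and the mutation here simply redraws every digit of a selected block, which is one possible reading of "mutated as a pack" rather than the creep mutation the notebook imports.

```python
import random

# Assumed level-1 layout of the 31 genes: d_model, three transformer layers
# of nine genes each, then the final FFN layer.
BLOCKS = [(0, 1), (1, 10), (10, 19), (19, 28), (28, 31)]


def single_point_sketch(parent1: str, parent2: str, rng: random.Random):
    """Swap the tails of two equal-length chromosomes at one random cut point."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    point = rng.randint(1, len(parent1) - 1)  # cut strictly inside the string
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def block_mutation_sketch(chromosome: str, p_m: float, rng: random.Random) -> str:
    """With probability p_m per block, redraw every digit of that block."""
    genes = list(chromosome)
    for start, end in BLOCKS:
        if rng.random() < p_m:
            for i in range(start, end):
                genes[i] = str(rng.randint(0, 9))
    return "".join(genes)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when debugging the operators separately from the expensive fitness evaluation.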

### End Condition
The end condition is the count of generations, which is 10 as given in the exercise.

### Fitness Function
The fitness function trains the transformer network for 5 epochs and returns its test accuracy averaged over 5 runs. As we will see, the transformer network has a high computational cost (in time and hardware resources), which is why the end condition, the averaging count, and the number of epochs are set as low as they can be.
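
The averaging idea can be sketched as below. This is an illustration under assumptions, not the notebook's `fitness_evaluate`: `averaged_fitness` and the `train_and_test` callback (which would build the network from a chromosome, train it, and return a test accuracy) are hypothetical names.

```python
from statistics import mean
from typing import Callable


def averaged_fitness(chromosome: str,
                     train_and_test: Callable[[str, int], float],
                     epochs: int = 5,
                     runs: int = 5) -> float:
    """Train `runs` times for `epochs` epochs each and average the test accuracy."""
    accuracies = [train_and_test(chromosome, epochs) for _ in range(runs)]
    return mean(accuracies)
```

Keeping the training routine behind a callback makes it easy to swap in a cheap stand-in while debugging the evolutionary loop.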

In [1]:
from population import generate_population
from combination import mutation_creep, single_point
from util import convert_genotype_to_phenotype_values, map_hyperparameters
from fitness import static_fitness
from selection import binary_tournament
from transformer_network_creator import fitness_evaluate
import numpy as np

In [2]:
pop = generate_population()
convert_genotype_to_phenotype_values(pop[3])

(128,
 ((20, 'R', 0.2, True), (30, 'R', 0.4, True), 4),
 ((10, 'S', 0.5, False), (20, 'S', 0.9, True), 2),
 ((30, 'S', 0.7, False), (10, 'S', 0.7, True), 1),
 (20, 'S', 0.3))

In [3]:
def algorithm_run(pop_count, SELECTION_METHOD, FITNESS_FUNCTION, MUTATION_METHOD, RECOMBINATION_METHOD, initial_population=None, initial_population_fitness = None, p_m=0.1, p_c =0.9, max_generations = 10, RESULTS_DIR='/content/gdrive/MyDrive/EC Project/'):
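    """
    Run the evolutionary loop: select parents, recombine and mutate them,
    evaluate the offspring, and keep the first `pop_count` chromosomes of the
    fitness-sorted parents + offspring pool.

    Returns the top-ranked chromosome of the final generation and its fitness.
    """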

    ## generate and evaluate a new population unless one was passed in
    if initial_population is None:
        population = generate_population(pop_size=pop_count)
        fitness_pop = []
        for chromosome in population:
            chromosome_fitness = FITNESS_FUNCTION(chromosome, None, 5)
            fitness_pop.append(chromosome_fitness)
    else:
        print('Population is loaded from file!\n')
        fitness_pop = initial_population_fitness
        population = initial_population
    
    best_chromosome = None
    best_chromsome_fitness = None

    for generation_idx in range(max_generations):
        print(f'Generation Number: {generation_idx}')

        # with open('result.txt', mode='a') as file:
        #     file.write(f'\nGeneration number: {generation_idx}\n')


        ## create pair of the parents
        parent_pairs = []
        for _ in range(pop_count):
            pair = SELECTION_METHOD(population, fitness_pop)
            parent_pairs.append(pair)

        
        offsprings = []
        fitness_offsprings = []
        for parents in parent_pairs:
            recombination_p = np.random.random()

            ## the offspring for this iteration
            ## first save the parents to change them later
            iteration_offspring = [parents[0], parents[1]]
            
            ######## Recombination ########
            if recombination_p < p_c:
                offspring1, offspring2 =  RECOMBINATION_METHOD(iteration_offspring[0], iteration_offspring[1])

                iteration_offspring = [offspring1, offspring2]

            ######## Mutation ########
            offspring1 = MUTATION_METHOD(iteration_offspring[0], p_m)
            offspring2 = MUTATION_METHOD(iteration_offspring[1], p_m)

            iteration_offspring = [offspring1, offspring2]
                
            ## finally append the generated offspring to the offspring list
            offsprings.append(iteration_offspring[0])
            offsprings.append(iteration_offspring[1])
            
            fitness_offsprings.append(FITNESS_FUNCTION(iteration_offspring[0], RESULTS_DIR + f'generation_number_{generation_idx}.txt', 5))
            fitness_offsprings.append(FITNESS_FUNCTION(iteration_offspring[1], RESULTS_DIR + f'generation_number_{generation_idx}.txt', 5))
            
                
        ######## Replacement ########

        ## the whole generation: parents + offsprings
        generation_population = population.copy()
        generation_population.extend(offsprings)

        ## whole generation fitness: parents fitness + offsprings fitness
        generation_fitness = fitness_pop.copy()
        generation_fitness.extend(fitness_offsprings)

        ## the sorted generation
        generation_population_sorted = np.array(generation_population)[np.argsort(generation_fitness)]
        generation_fitness_sorted = np.sort(generation_fitness)

        ## keep the first `pop_count` chromosomes of the sorted generation
        ## (np.argsort / np.sort order the fitness ascending, lowest first)
        best_of_generation_population = generation_population_sorted[:pop_count]
        best_of_generation_fitness = generation_fitness_sorted[:pop_count]

        best_chromosome = generation_population_sorted[0]
        best_chromsome_fitness = generation_fitness_sorted[0]
        
        ## save them into the original population arrays
        population = best_of_generation_population.tolist()
        fitness_pop = best_of_generation_fitness.tolist()

    
    return best_chromosome, best_chromsome_fitness
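
The replacement step above sorts with `np.argsort`, which orders fitness values ascending, so the slice `[:pop_count]` keeps the lowest ones. That is appropriate only if the fitness function returns a quantity to minimize. If it returns a test accuracy (higher is better), the direction probably needs to be reversed. A minimal sketch of the replacement with an explicit direction follows; `select_survivors` and `maximize` are hypothetical names, and plain Python sorting is used in place of NumPy.

```python
def select_survivors(population, fitness, pop_count, maximize=True):
    """Keep the pop_count best individuals of a parents + offspring pool."""
    # Indices of the pool ordered from best to worst for the chosen direction.
    order = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=maximize)
    keep = order[:pop_count]
    return [population[i] for i in keep], [fitness[i] for i in keep]
```

With `maximize=True` the first returned chromosome is the one with the highest fitness, which matches a fitness defined as accuracy.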

In [4]:
## start the algorithm with a static fitness value
## the search is then effectively random, but this lets us debug any problems in the loop
answer_chromosome, answer_chromsome_fitness = algorithm_run(pop_count=10, 
                SELECTION_METHOD=binary_tournament, 
                FITNESS_FUNCTION=static_fitness, 
                MUTATION_METHOD=mutation_creep, 
                RECOMBINATION_METHOD=single_point,
                p_m=0.1,
                p_c=0.9,
                max_generations=10)

Generation Number: 0
Generation Number: 1
Generation Number: 2
Generation Number: 3
Generation Number: 4
Generation Number: 5
Generation Number: 6
Generation Number: 7
Generation Number: 8
Generation Number: 9


In [4]:
convert_genotype_to_phenotype_values(pop[1])

(128,
 ((20, 'S', 0.5, True), (20, 'R', 0.7, True), 2),
 ((10, 'S', 0.9, False), (5, 'S', 0.7, True), 8),
 ((5, 'R', 0.3, True), (20, 'S', 0.5, True), 1),
 (10, 'R', 0))

In [8]:
CH = mutation_creep(pop[0], 1)
CH

'4577566333000000000000000000000'

In [9]:
convert_genotype_to_phenotype_values(CH)

(32,
 ((20, 'S', 0.7, False), (20, 'S', 0.3, True), 2),
 ((None, None, None, None), (None, None, None, True), 1),
 ((None, None, None, None), (None, None, None, True), 1),
 (None, None, None))

In [6]:
from transformer_network_creator import start_training

start_training(answer_chromosome)

TypeError: '<=' not supported between instances of 'str' and 'int'