<a href="https://colab.research.google.com/github/amindadgar/Evolutionary-Computing/blob/colab/main.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Evolutionary Computing Exercise No. 3
**Stu. Name:** Mohammad Amin Dadgar

**Stu. Id:** 4003624016

In [1]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [2]:
RESULTS_DIR = '/content/gdrive/MyDrive/EC Project/'

## Algorithm Configurations

### Representation
The representation used by our algorithm has three levels.

- The first level shows the overall concept of the network architecture, in which the green boxes are optional (each can be present or absent, encoded as 1 or 0).

    <img src='architecture_level1.png'>
- The second level represents the transformer layer.

    <img src='architecture_level2.png'>
- The third level shows the feed-forward network (FFN) architecture.

    <img src='architecture_level3.png'>

Each gene takes a numerical value from 0 to 9. As given in the exercise, each hyperparameter has 2, 3, or 4 possible values, so we map a gene value to a hyperparameter option using the intervals below.

**Two valued hyperparameters:**
- 0:4 → 0
- 5:9 → 1

**Three valued hyperparameters:**
- 0:3 → 0
- 4:6 → 1
- 7:9 → 2

**Four valued hyperparameters:**
- 0:1 → 0
- 2:4 → 1
- 5:7 → 2
- 8:9 → 3

**100 valued hyperparameters:**
- 0 → 0
- 1 → 10
- 2 → 20
- ...
- 9 → 90
- The value 100 would not be usable as a dropout probability, so we exclude it from the possible values.
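
The interval tables above can be sketched as a small lookup. This is only an illustration: `gene_to_index` and `gene_to_dropout_value` are hypothetical names, not the functions in the project's `util` module, whose implementation is not shown here.

```python
# Upper bound (inclusive) of each interval, taken from the tables above.
INTERVAL_UPPER_BOUNDS = {
    2: [4, 9],        # 0:4 -> 0, 5:9 -> 1
    3: [3, 6, 9],     # 0:3 -> 0, 4:6 -> 1, 7:9 -> 2
    4: [1, 4, 7, 9],  # 0:1 -> 0, 2:4 -> 1, 5:7 -> 2, 8:9 -> 3
}


def gene_to_index(gene: int, n_values: int) -> int:
    """Map a gene digit (0-9) to the option index of an n-valued hyperparameter."""
    if not 0 <= gene <= 9:
        raise ValueError("gene must be a digit between 0 and 9")
    for index, upper in enumerate(INTERVAL_UPPER_BOUNDS[n_values]):
        if gene <= upper:
            return index
    raise AssertionError("unreachable: the last interval always ends at 9")


def gene_to_dropout_value(gene: int) -> int:
    """Map a gene digit to the dropout value from the table: 0 -> 0, ..., 9 -> 90."""
    if not 0 <= gene <= 9:
        raise ValueError("gene must be a digit between 0 and 9")
    return gene * 10
```

For example, a gene value of 5 selects option 1 of a two-valued hyperparameter and option 2 of a four-valued one.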


To represent a chromosome, `1+3×(3+3×2)+3` genes are required. The leading `1` is the gene for $d_{model}$. In the term `3×(3+3×2)`, the outer `3` is the number of possible transformer layers. Inside the parentheses, the first `3` counts the genes for the attention head count and the normalization layers, and `3×2` is the three hyperparameters of an FFN layer multiplied by the two FFN layers a transformer layer can have. The final `3` covers the final FFN layer of the network. In total, `31` genes are used, each one a digit from 0 to 9.
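
The length arithmetic can be checked with a few lines of Python. The sketch below also splits a chromosome string into its segments. The gene order ($d_{model}$ first, then the three transformer layers, then the final FFN) is an assumption taken from the order of the terms in the formula, and `split_chromosome` is a hypothetical helper, not part of the project code.

```python
N_TRANSFORMER_LAYERS = 3   # possible transformer layers
GENES_ATTENTION_NORM = 3   # attention head count and normalization layers
GENES_PER_FFN = 3          # hyperparameters of one FFN layer
FFN_PER_TRANSFORMER = 2    # FFN layers inside one transformer layer
GENES_D_MODEL = 1
GENES_FINAL_FFN = 3

genes_per_layer = GENES_ATTENTION_NORM + GENES_PER_FFN * FFN_PER_TRANSFORMER
chromosome_length = (
    GENES_D_MODEL + N_TRANSFORMER_LAYERS * genes_per_layer + GENES_FINAL_FFN
)


def split_chromosome(chromosome: str):
    """Split a chromosome into (d_model gene, transformer layers, final FFN).

    Assumes the segments appear in the same order as the terms of the formula.
    """
    if len(chromosome) != chromosome_length:
        raise ValueError(f"expected {chromosome_length} genes, got {len(chromosome)}")
    d_model_gene = chromosome[:GENES_D_MODEL]
    layers = [
        chromosome[GENES_D_MODEL + i * genes_per_layer:
                   GENES_D_MODEL + (i + 1) * genes_per_layer]
        for i in range(N_TRANSFORMER_LAYERS)
    ]
    final_ffn = chromosome[-GENES_FINAL_FFN:]
    return d_model_gene, layers, final_ffn
```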

### Recombination and Mutation
To apply recombination and mutation to chromosomes, we have to take the three-level architecture into account. To that end, we implemented mutation on the conceptual (level 1) chromosome, meaning that each transformer layer and the FFN layer is mutated as a pack. For recombination, a standard single-point or uniform crossover can be used.
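
As a rough sketch of these two operators, the code below implements single-point crossover on digit strings and a creep mutation that acts on a whole level-1 block at once. It is not the code of `single_point` or `mutation_creep` in the `combination` module. The block boundaries, the ±1 creep step, and the clipping to the range 0 to 9 are all assumptions made for illustration.

```python
import random

# Assumed level-1 blocks: d_model, three transformer layers, final FFN.
BLOCK_BOUNDS = [(0, 1), (1, 10), (10, 19), (19, 28), (28, 31)]


def single_point_crossover(parent1: str, parent2: str, rng: random.Random):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    point = rng.randint(1, len(parent1) - 1)  # both ends inclusive
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def creep_gene(gene: str, rng: random.Random) -> str:
    """Move a digit one step up or down, clipped to the range 0-9."""
    value = int(gene) + rng.choice((-1, 1))
    return str(min(9, max(0, value)))


def block_mutation(chromosome: str, p_m: float, rng: random.Random) -> str:
    """With probability p_m per block, creep every gene of that block."""
    genes = list(chromosome)
    for start, stop in BLOCK_BOUNDS:
        if rng.random() < p_m:
            for i in range(start, stop):
                genes[i] = creep_gene(genes[i], rng)
    return "".join(genes)
```

Passing an explicit `random.Random` instance keeps a run reproducible, which helps when a single fitness evaluation is as expensive as it is here.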

### End Condition
The end condition is the number of generations, which is 10 as given in the exercise.

### Fitness Function
The fitness function trains the transformer network for 5 epochs and returns its test accuracy averaged over 5 runs. As we will see, the transformer network has a high computational cost (in time and hardware resources), which is why the number of generations, the number of averaged runs, and the number of epochs are set as low as possible.
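
The averaging idea can be sketched independently of the network itself. In the code below, `train_and_evaluate` is a hypothetical callable standing in for the real training routine (it is not the project's `fitness_evaluate`), and it is assumed to return one test accuracy per call.

```python
from statistics import mean
from typing import Callable


def average_test_accuracy(
    train_and_evaluate: Callable[[str, int], float],
    chromosome: str,
    epochs: int = 5,
    repeats: int = 5,
) -> float:
    """Train `repeats` times for `epochs` epochs and average the test accuracies."""
    accuracies = [train_and_evaluate(chromosome, epochs) for _ in range(repeats)]
    return mean(accuracies)
```

Averaging over several runs reduces the effect of random weight initialization on the fitness value, at the price of multiplying the training cost by `repeats`.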

## Evolutionary Algorithm Part

In [3]:
from population import generate_population
from combination import mutation_creep, single_point
from util import convert_genotype_to_phenotype_values, map_hyperparameters
from fitness import static_fitness
from selection import binary_tournament
from transformer_network_creator import fitness_evaluate
import numpy as np

In [4]:
def algorithm_run(pop_count, SELECTION_METHOD, FITNESS_FUNCTION, MUTATION_METHOD, RECOMBINATION_METHOD, start_generation, last_population=None, last_population_fitness = None, p_m=0.1, p_c =0.9 ,max_generations = 10, RESULTS_DIR='/content/gdrive/MyDrive/EC Project/'):
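    """
    Run the evolutionary algorithm and return the best chromosome and its fitness.
    To resume a previous run, pass `last_population`, `last_population_fitness`,
    and `start_generation`.
    """
    ## generate a new population if none was loaded from a previous run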
    if last_population is None:
      population = generate_population(pop_size=pop_count)
      fitness_pop = []
      for chromosome in population:
          chromosome_fitness = FITNESS_FUNCTION(chromosome, None, 5)
          fitness_pop.append(chromosome_fitness)
    else:
      print('Population is loaded from file!\n')
      fitness_pop = last_population_fitness
      population = last_population


    best_chromosome = None
    best_chromsome_fitness = None
    
    generation_num = 0
    if start_generation is not None:
      generation_num = start_generation
    for generation_idx in range( generation_num ,max_generations):
        print('*' * 30 + f'Generation Number: {generation_idx}' + '*' * 30)


        ## create pair of the parents
        parent_pairs = []
        for _ in range(pop_count):
            pair = SELECTION_METHOD(population, fitness_pop)
            parent_pairs.append(pair)

        
        offsprings = []
        fitness_offsprings = []
        for parents in parent_pairs:
            recombination_p = np.random.random()

            ## the offspring for this iteration
            ## first save the parents to change them later
            iteration_offspring = [parents[0], parents[1]]
            
            ######## Recombination ########
            if recombination_p < p_c:
                offspring1, offspring2 =  RECOMBINATION_METHOD(iteration_offspring[0], iteration_offspring[1])

                iteration_offspring = [offspring1, offspring2]

            ######## Mutation ########
            offspring1 = MUTATION_METHOD(iteration_offspring[0], p_m)
            offspring2 = MUTATION_METHOD(iteration_offspring[1], p_m)

            iteration_offspring = [offspring1, offspring2]
                
            ## finally append the generated offspring to the offspring array
            offsprings.append(iteration_offspring[0])
            offsprings.append(iteration_offspring[1])
            
            fitness_offsprings.append(FITNESS_FUNCTION(iteration_offspring[0], RESULTS_DIR + f'generation_number_{generation_idx}.txt', 5))
            fitness_offsprings.append(FITNESS_FUNCTION(iteration_offspring[1], RESULTS_DIR + f'generation_number_{generation_idx}.txt', 5))
            
                
        ######## Replacement ########

        ## the whole generation: parents + offsprings
        generation_population = population.copy()
        generation_population.extend(offsprings)

        ## whole generation fitness: parents fitness + offsprings fitness
        generation_fitness = fitness_pop.copy()
        generation_fitness.extend(fitness_offsprings)

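        ## the sorted generation (both arrays in descending order of fitness,
        ## so each chromosome stays paired with its own fitness value)
        sorted_indices = np.argsort(generation_fitness)[::-1]
        generation_population_sorted = np.array(generation_population)[sorted_indices]
        generation_fitness_sorted = np.array(generation_fitness)[sorted_indices]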

        ## Step 10
        ## extract the best of the new generation
        best_of_generation_population = generation_population_sorted[:pop_count]
        best_of_generation_fitness = generation_fitness_sorted[:pop_count]

        best_chromosome = generation_population_sorted[0]
        best_chromsome_fitness = generation_fitness_sorted[0]
        
        ## save them into the original population arrays
        population = best_of_generation_population.tolist()
        fitness_pop = best_of_generation_fitness.tolist()

    return best_chromosome, best_chromsome_fitness
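
The replacement step of `algorithm_run` merges parents and offspring and keeps the `pop_count` fittest individuals. A minimal standalone sketch of that idea follows. `keep_best` is a hypothetical name, and the sketch uses plain Python lists instead of NumPy so that it runs without dependencies.

```python
def keep_best(population, fitness, offspring, offspring_fitness, pop_count):
    """Merge parents and offspring, then keep the pop_count fittest individuals.

    One shared ordering is applied to both lists, so every chromosome stays
    paired with its own fitness value.
    """
    candidates = list(population) + list(offspring)
    scores = list(fitness) + list(offspring_fitness)
    # indices of the candidates, from highest to lowest fitness
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    survivors = [candidates[i] for i in order[:pop_count]]
    survivor_fitness = [scores[i] for i in order[:pop_count]]
    return survivors, survivor_fitness
```

Because the parents compete with their offspring, the best individual found so far is never lost between generations.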

## Start

In [5]:
## Reading the last saved population
## comment this code if you are starting from scratch
with open(RESULTS_DIR + 'generation_number_10.txt', 'r') as file:
  information = file.read()
  print(information)

population = []
population_fitness = []
for individual_info in information.split('\n'):
  if individual_info != '':
    individual, individual_fitness_str_arr = individual_info.replace(' ', '').split(':') 
    ## convert to their specific types
    individual = str(individual)
  
    float_data_arr = individual_fitness_str_arr.replace('[', '').replace(']', '').split(',')
    float_data_arr = [float(data) for data in float_data_arr]
    individual_fitness = np.mean(float_data_arr)

    population.append(individual)
    population_fitness.append(individual_fitness)

6703314083000000000924090931000: [0.6442092657089233, 0.7925621867179871, 0.8464341163635254, 0.8553950786590576, 0.7993895411491394]
5703315711000000000924090931000: [0.6091698408126831, 0.8221049308776855, 0.686714768409729, 0.7401701807975769, 0.7316446304321289]
6105440005000000000724090931000: [0.728226900100708, 0.6634303331375122, 0.7120457291603088, 0.7315895557403564, 0.8217155337333679]
7303314083000000000724090931000: [0.7593838572502136, 0.9333409667015076, 0.7749147415161133, 0.75413578748703, 0.6989219784736633]
7847115888000000000000000000000: [0.559407651424408, 0.5646238923072815, 0.5709661841392517, 0.5234590768814087, 0.6605796813964844]
7309042281000000000924090931000: [0.7848855257034302, 0.8703641295433044, 0.9506310820579529, 0.7922589182853699, 0.7452265024185181]
7147376202000000000924090931000: [0.7380618453025818, 0.6678559184074402, 0.6933618187904358, 0.9267239570617676, 0.6466026306152344]
6703314083000000000000000000000: [0.5978671312332153, 0.61308926343
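
Each complete line printed above has the form `chromosome: [accuracy, accuracy, ...]`. The parsing done in the loading cell can be condensed into one helper, sketched below. `parse_result_line` is a hypothetical name, and the sketch assumes every line is complete and well formed.

```python
from statistics import mean


def parse_result_line(line: str):
    """Parse 'chromosome: [a1, a2, ...]' into (chromosome, mean accuracy)."""
    chromosome, accuracies_text = line.replace(" ", "").split(":")
    accuracies = [float(value) for value in accuracies_text.strip("[]").split(",")]
    return chromosome, mean(accuracies)
```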

In [None]:
## resuming the algorithm from generation 11 with the population loaded above
## fitness is evaluated by training the transformer network (fitness_evaluate)
answer_chromosome, answer_chromsome_fitness = algorithm_run(pop_count=10, 
                SELECTION_METHOD=binary_tournament, 
                FITNESS_FUNCTION=fitness_evaluate, 
                MUTATION_METHOD=mutation_creep, 
                RECOMBINATION_METHOD=single_point,
                p_m=0.1,
                p_c=0.9,
                max_generations=14, 
                last_population=population,
                last_population_fitness=population_fitness, 
                start_generation=11)

Population is loaded from file!

******************************Generation Number: 11******************************
25000 Training sequences
25000 Test sequences
Epoch 1/5
391/391 - 8s - loss: 0.3654 - accuracy: 0.8349 - 8s/epoch - 20ms/step
Epoch 2/5
391/391 - 4s - loss: 0.1839 - accuracy: 0.9308 - 4s/epoch - 11ms/step
Epoch 3/5
391/391 - 4s - loss: 0.1282 - accuracy: 0.9535 - 4s/epoch - 11ms/step
Epoch 4/5
391/391 - 4s - loss: 0.0848 - accuracy: 0.9712 - 4s/epoch - 11ms/step
Epoch 5/5
391/391 - 4s - loss: 0.0567 - accuracy: 0.9813 - 4s/epoch - 11ms/step
25000 Training sequences
25000 Test sequences
Epoch 1/5
391/391 - 6s - loss: 0.3742 - accuracy: 0.8282 - 6s/epoch - 14ms/step
Epoch 2/5
391/391 - 4s - loss: 0.1908 - accuracy: 0.9259 - 4s/epoch - 11ms/step
Epoch 3/5
391/391 - 4s - loss: 0.1227 - accuracy: 0.9559 - 4s/epoch - 11ms/step
Epoch 4/5
391/391 - 4s - loss: 0.0854 - accuracy: 0.9702 - 4s/epoch - 11ms/step
Epoch 5/5
391/391 - 4s - loss: 0.0575 - accuracy: 0.9800 - 4s/epoch - 11m

(128,
 ((10, 'S', 0.1, False), (30, 'R', 0.9, True), 4),
 ((20, 'S', 0.9, True), (10, 'S', 0.6, False), 8),
 ((10, 'R', 0.5, False), (5, 'R', 0.7, True), 4),
 (20, 'R', 0.6))