# Create your own fitness function that works similarly to OneMax, except that the maximum fitness is when 50% of the individual's genes are 1, with a linear decay on either side

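Even with a population size of 10, the problem is solved at generation 1.
For a single random individual of 100 bits, the chance of having exactly 50 ones is:

``P(X=50) = C(100, 50) x (0.5)^50 x (1−0.5)^50 = 0.0796``

With a population size of 10, the expected number of optimal individuals in the random initial population is 10 x 0.0796 = 0.796, and the chance that at least one of them is optimal is 1 − (1 − 0.0796)^10 ≈ 0.564. So roughly 56% of random populations of 10 already contain a solution without any GA.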
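
The figures above can be checked numerically. The following is a minimal sketch using only the standard library; the names `n_bits`, `pop_size`, `p_single`, `expected_optimal` and `p_at_least_one` are illustrative and not part of the notebook. It computes the binomial probability for one random individual, the expected number of optimal individuals in a random population of 10, and the probability that such a population contains at least one optimal individual.

```python
from math import comb

n_bits = 100    # same value as ONE_MAX_LENGTH in this notebook
pop_size = 10   # smallest population size tried in the grid search

# Probability that one uniformly random bit string has exactly n_bits / 2 ones
p_single = comb(n_bits, n_bits // 2) * 0.5 ** n_bits

# Expected number of optimal individuals in a random population
expected_optimal = pop_size * p_single

# Probability that a random population contains at least one optimal individual
p_at_least_one = 1 - (1 - p_single) ** pop_size

print(f"P(X=50) = {p_single:.4f}")
print(f"expected optimal individuals = {expected_optimal:.3f}")
print(f"P(at least one optimal) = {p_at_least_one:.3f}")
```

Note that 10 x 0.0796 = 0.796 is an expected count rather than a probability; the probability of at least one optimal individual is the smaller value, about 0.564.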

Install DEAP. If you are running this on your own computer you might not need to do this: it is better practice to install DEAP once in your environment so that it is always available. When running on Colab, we do need this step.

In [1]:
%pip install deap
%pip install numpy



Note: you may need to restart the kernel to use updated packages.


Import the DEAP tools and useful libraries (random and matplotlib).

In [2]:
from deap import base
from deap import creator
from deap import tools

import random

import matplotlib.pyplot as plt

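Set our Genetic Algorithm parameters. Only the generation cap is fixed here; the population size, mutation rate, and crossover rate are varied in the grid search further down.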

In [3]:
MAX_GENERATIONS = 10000

Set any problem-specific constants here. In this case we need to know how long the string is.

In [4]:
ONE_MAX_LENGTH = 100  # length of bit string to be optimized

Set the random seed. This is important so that we can reproduce runs later on.

In [5]:
RANDOM_SEED = 42
random.seed(RANDOM_SEED)
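
To illustrate why seeding matters, here is a minimal sketch (the names `first_run` and `second_run` are illustrative): re-seeding Python's `random` module with the same value replays the same sequence of draws. This assumes, as in this notebook, that all randomness comes from the `random` module; other sources of randomness would need their own seed.

```python
import random

RANDOM_SEED = 42

random.seed(RANDOM_SEED)
first_run = [random.randint(0, 1) for _ in range(10)]

random.seed(RANDOM_SEED)  # re-seeding restarts the generator from the same state
second_run = [random.randint(0, 1) for _ in range(10)]

print(first_run == second_run)  # True: both runs draw the same bits
```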

Create our toolbox. Note that we can pull in a number of predefined operators to tailor our Evolutionary Algorithm, which in this case is a GA. It is also possible to create our **own** operators and functions, which is what we do with our **oneHalfFitness** function below.
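
As a rough mental model, `toolbox.register(alias, function, *args)` stores `function` under `alias` with the given arguments pre-filled, much like `functools.partial`. The sketch below mimics that idea with the standard library only; it is not DEAP's implementation, and `zero_or_one` and `make_individual` are hypothetical names used for illustration.

```python
import random
from functools import partial

random.seed(42)

# Roughly what registering "zeroOrOne" with random.randint, 0, 1 provides:
# a zero-argument callable that returns 0 or 1
zero_or_one = partial(random.randint, 0, 1)


def make_individual(gene_func, length):
    """Call gene_func `length` times and collect the results in a list."""
    return [gene_func() for _ in range(length)]


individual = make_individual(zero_or_one, 100)
print(len(individual), set(individual) <= {0, 1})
```

In the notebook itself, `tools.initRepeat` plays the role of `make_individual`, and it additionally wraps the genes in the `creator.Individual` container.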

In [6]:
toolbox = base.Toolbox()

# create an operator that randomly returns 0 or 1:
toolbox.register("zeroOrOne", random.randint, 0, 1)

# define a single objective, maximizing fitness strategy:
creator.create("FitnessMax", base.Fitness, weights=(1.0,))

# create the Individual class based on list:
creator.create("Individual", list, fitness=creator.FitnessMax)
# creator.create("Individual", array.array, typecode='b', fitness=creator.FitnessMax)

# create the individual operator to fill up an Individual instance:
toolbox.register("individualCreator", tools.initRepeat, creator.Individual, toolbox.zeroOrOne, ONE_MAX_LENGTH)

# create the population operator to generate a list of individuals:
toolbox.register("populationCreator", tools.initRepeat, list, toolbox.individualCreator)


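# fitness calculation:
# the number of '1's if at most half of the genes are 1, otherwise the number of '0's,
# so fitness peaks at len(individual) / 2 and decays linearly on either side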
def oneHalfFitness(individual):
    i = sum(individual)  # Count the number of 1s
    if i <= len(individual) / 2:
        return i,
    else:
        return len(individual) - i,

toolbox.register("evaluate", oneHalfFitness)

# genetic operators:

# Tournament selection with tournament size of 3:
toolbox.register("select", tools.selTournament, tournsize=3)

# Single-point crossover:
toolbox.register("mate", tools.cxOnePoint)

# Flip-bit mutation:
# indpb: Independent probability for each attribute to be flipped
toolbox.register("mutate", tools.mutFlipBit, indpb=1.0/ONE_MAX_LENGTH)
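
To see the shape of this fitness function, the sketch below re-declares `oneHalfFitness` so that it runs without DEAP and evaluates it on a few 10-bit strings. The helper `bits_with_ones` is hypothetical and only builds test inputs. The last check illustrates that the function is equivalent to `min(i, n - i)`, where `i` is the number of 1s and `n` the string length.

```python
def oneHalfFitness(individual):
    i = sum(individual)  # number of 1s
    if i <= len(individual) / 2:
        return i,
    else:
        return len(individual) - i,


def bits_with_ones(ones, length=10):
    """Hypothetical helper: a bit list with `ones` leading 1s, padded with 0s."""
    return [1] * ones + [0] * (length - ones)


# fitness rises up to 5 ones, then falls again
for ones in (0, 3, 5, 7, 10):
    print(ones, oneHalfFitness(bits_with_ones(ones))[0])

# equivalent closed form: fitness = min(number of 1s, number of 0s)
all_match = all(
    oneHalfFitness(bits_with_ones(k))[0] == min(k, 10 - k) for k in range(11)
)
print(all_match)
```

The function returns a one-element tuple because DEAP fitness values are tuples, matching the single weight in `weights=(1.0,)`.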

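Here is the main GA loop, wrapped in a grid search. For each population size we run the GA for every combination of mutation rate and crossover rate, up to MAX_GENERATIONS generations per run, and then print the combination that reached the maximum fitness in the fewest generations.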

In [11]:
import numpy as np

# for each population_size, grid search over mutation_rate and crossover_rate
result = {}
for populationSize in range(10, 400, 20):

    minGenerationCount = MAX_GENERATIONS + 1
    minGenMutationRate = 0
    minGenCrossoverRate = 0

    for mutationRate in np.arange(0, 0.2, 0.02):
        for crossoverRate in np.arange(0, 1, 0.1):
            population = toolbox.populationCreator(n=populationSize)
            generationCounter = 0

            # calculate fitness tuple for each individual in the population:
            fitnessValues = list(map(toolbox.evaluate, population))
            for individual, fitnessValue in zip(population, fitnessValues):
                individual.fitness.values = fitnessValue

            # extract fitness values from all individuals in population:
            fitnessValues = [individual.fitness.values[0] for individual in population]

            # initialize statistics accumulators:
            maxFitnessValues = []
            meanFitnessValues = []

            # main evolutionary loop:
            # stop if max fitness value reached the known max value
            # OR if number of generations exceeded the preset value:
            while True:
                if generationCounter >= MAX_GENERATIONS:
                    # print(f"Stopping since max generations reached. Population size: {populationSize}, Mutation rate: {mutationRate}, Crossover rate: {crossoverRate}")
                    break

                # update counter:
                generationCounter = generationCounter + 1

                # apply the selection operator, to select the next generation's individuals:
                offspring = toolbox.select(population, len(population))
                # clone the selected individuals:
                offspring = list(map(toolbox.clone, offspring))

                # apply the crossover operator to pairs of offspring:
                for child1, child2 in zip(offspring[::2], offspring[1::2]):
                    if random.random() < crossoverRate:
                        toolbox.mate(child1, child2)
                        del child1.fitness.values
                        del child2.fitness.values

                for mutant in offspring:
                    if random.random() < mutationRate:
                        toolbox.mutate(mutant)
                        del mutant.fitness.values

                # calculate fitness for the individuals with no previous calculated fitness value:
                freshIndividuals = [ind for ind in offspring if not ind.fitness.valid]
                freshFitnessValues = list(map(toolbox.evaluate, freshIndividuals))
                for individual, fitnessValue in zip(freshIndividuals, freshFitnessValues):
                    individual.fitness.values = fitnessValue

                # replace the current population with the offspring:
                population[:] = offspring

                # collect fitnessValues into a list and update statistics:
                fitnessValues = [ind.fitness.values[0] for ind in population]

                maxFitness = max(fitnessValues)
                meanFitness = sum(fitnessValues) / len(population)
                maxFitnessValues.append(maxFitness)
                meanFitnessValues.append(meanFitness)

                if maxFitness == ONE_MAX_LENGTH // 2 :
                    if generationCounter < minGenerationCount:
                        minGenerationCount = generationCounter
                        minGenMutationRate = mutationRate
                        minGenCrossoverRate = crossoverRate
                    # print(f"Stopping since max fitness reached. Population size: {populationSize}, Mutation rate: {mutationRate}, Crossover rate: {crossoverRate}")
                    break

    if minGenerationCount <= MAX_GENERATIONS:
        print(f"{populationSize*minGenerationCount} Population size: {populationSize}, Min generations: {minGenerationCount}, Min mutation rate: {minGenMutationRate}, Min crossover rate: {minGenCrossoverRate}")
        result[populationSize] = (
            populationSize * minGenerationCount,
            minGenerationCount,
            minGenMutationRate,
            minGenCrossoverRate,
        )
    else:
        print(f"Population size: {populationSize}, No solution found")

10 Population size: 10, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.1
30 Population size: 30, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
50 Population size: 50, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
70 Population size: 70, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
90 Population size: 90, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
110 Population size: 110, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
130 Population size: 130, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
150 Population size: 150, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
170 Population size: 170, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
190 Population size: 190, Min generations: 1, Min mutation rate: 0.0, Min crossover rate: 0.0
210 Population size: 210, Min generations: 1, Min mutation rate: 0.0, 

In [9]:
import pandas as pd

columns = ["Population size", "Smallest number of individual", "Generations", "Mutation rate", "Crossover rate"]
# one row per population size:
# (population size, population size x generations, generations, mutation rate, crossover rate)
rows = [(key, *value) for key, value in result.items()]
df = pd.DataFrame(rows, columns=columns)
df

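| | Population size | Smallest number of individual | Generations | Mutation rate | Crossover rate |
|---|---|---|---|---|---|
| 0 | 10 | 10 | 1 | 0.0 | 0.2 |
| 1 | 30 | 30 | 1 | 0.0 | 0.0 |
| 2 | 50 | 50 | 1 | 0.0 | 0.0 |
| 3 | 70 | 70 | 1 | 0.0 | 0.0 |
| 4 | 90 | 90 | 1 | 0.0 | 0.0 |
| 5 | 110 | 110 | 1 | 0.0 | 0.0 |
| 6 | 130 | 130 | 1 | 0.0 | 0.0 |
| 7 | 150 | 150 | 1 | 0.0 | 0.0 |
| 8 | 170 | 170 | 1 | 0.0 | 0.0 |
| 9 | 190 | 190 | 1 | 0.0 | 0.0 |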
