# Genetic Algorithm

The genetic algorithm emulates evolution by "breeding" new candidate solutions from existing ones and applying random mutation. The likelihood that a solution "survives" depends on its "fitness" value, as defined by a "fitness function".

### Problem to solve

Let's try using it to solve a simple system of two linear equations:

* x − y = −1
* 3x + y = 9

**(The real solution is x=2, y=3)**
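
As a quick sanity check, the stated solution can be substituted back into both equations. This is only a minimal sketch, and the variable names are chosen for illustration.

```python
# Substitute the stated solution into both equations.
x, y = 2, 3

eq1_holds = (x - y == -1)      # x - y = -1
eq2_holds = (3 * x + y == 9)   # 3x + y = 9

print(eq1_holds, eq2_holds)
```

The same answer follows by hand: adding the two equations eliminates y and gives 4x = 8, so x = 2 and then y = 3.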

## Generic Genetic Algorithm

First, let's write a basic genetic algorithm.

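Each "individual" will be a list in this form: `[fitness, val1, val2, ...]`. This way, we can simply call `sort()` to rank the population by fitness, since lists are compared element by element, starting with the first. Here "fitness" is an error value, so lower is better and the fittest individual ends up first.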
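
To illustrate the sorting trick, here is a small sketch with made-up individuals (the numbers are not taken from any run):

```python
# Each individual is [fitness, x, y]; the values here are made up for illustration.
population = [
    [4.0, 1.0, 2.5],
    [0.5, 2.1, 3.2],
    [10.0, 0.0, 0.0],
]

# Lists compare element by element, so the first element (fitness) decides the order.
population.sort()

best = population[0]
print(best)
```

If two individuals happen to have exactly the same fitness, the comparison falls through to x and then y, which is harmless here because every element is a number.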

In [15]:
import random

def generate_couples(population):
    couples = []
    for i in range(0, len(population)-1, 2):
        c = (population[i], population[i+1])
        couples.append(c)
    return couples

def genetic_algorithm(fit_func, cross_func, mutate_func, init_pop, max_iter=1000,
                      max_pop=100, quit_at_err=0.01, mut_prob=0.1, mut_mu=0, mut_sigma=1):
    population = init_pop
    num_solutions_considered = 0
    top_dog = init_pop[0]
    
    for i in range(max_iter):        
        # Calculate fitness function for each individual
        for individual in population:
            fitness = individual[0]
            if fitness < 0:  # Meaning it has not been calculated, since fitness is always positive or 0
                fitness = fit_func(individual)
                individual[0] = fitness

        # Sort population by fitness - this is needed both to find the top dog (to see if we have a
        # good-enough solution) and to put the population in order for pairing up as couples.
        population.sort()
        num_solutions_considered += len(population)
        
        # Find the "most fit" individual. If this solution is close enough (error less than quit_at_err),
        # then just stop the algorithm and return this solution.
        top_dog = population[0]
        if top_dog[0] < quit_at_err:
            print('Solutions considered: {}, Error: {}'.format(num_solutions_considered, top_dog[0]))
            return top_dog
        
        # Make couples - they pair up according to fitness
        couples = generate_couples(population)
        
        # Mate - cross_func builds the baby's genetics from the two parents (here, x from one parent
        # and y from the other), and mutate_func then randomly tweaks the 'genes'
        babies = [mutate_func(cross_func(couple[0], couple[1]), mut_prob, mut_mu, mut_sigma) for couple in couples]
        population += babies
        
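        # Sort and cull (simulating limited resources). Note that the new babies still carry
        # the placeholder fitness -1, so they sort first and always survive this cull; they
        # are evaluated at the start of the next generation.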
        population.sort()
        population = population[:max_pop]

    print('Solutions considered: {}, Error: {}'.format(num_solutions_considered, top_dog[0]))
    return top_dog
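
To make the pairing step concrete, here is a small sketch of what happens to a fitness-sorted population of five. `pair_up` is a hypothetical name that restates `generate_couples` from the cell above so the snippet runs on its own, and the individuals are made-up values.

```python
def pair_up(population):
    """Pair neighbours in a sorted population: (0, 1), (2, 3), ..."""
    return [(population[i], population[i + 1])
            for i in range(0, len(population) - 1, 2)]

# Made-up individuals in [fitness, x, y] form, already sorted by fitness.
ranked = [
    [0.1, 2.0, 3.1],
    [0.4, 1.8, 3.0],
    [2.0, 1.0, 2.0],
    [5.0, 4.0, 4.0],
    [7.5, -4.0, 0.0],
]

couples = pair_up(ranked)
print(len(couples))
```

With an odd number of individuals, the last (least fit) one is left without a partner for that generation, so the fittest individuals always breed with each other.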


## Write problem-specific functions

In [16]:
def fitness_function(individual):
    fitness, x, y = individual
    # We square the error so it's always positive
    eq1_error = ((x - y) - (-1))**2
    eq2_error = ((3*x + y) - 9)**2
    return eq1_error + eq2_error

def have_sex(a, b):
    """ Given parents a and b, take the x value from a and the y value from b.
    This is similar to how humans get half their genetic material from the father and half
    from the mother. """
    x = a[1]
    y = b[2]
    # Set fitness at -1, indicating it hasn't been calculated yet (i.e. the babies haven't 
    # been tested in the wild yet)
    return [-1, x, y]

def mutate(a, mutation_probability, mu, sigma):
    """ Mutate the values for x and y according to the probability of mutation - 
    if mutation_probability is 0.01, then there is a 1% chance of a value being mutated.
    
    If we mutate, we do so by generating a random number according to the normal distribution
    as specified by mu (mean) and sigma (standard deviation).
    """
    mutant = [-1]
    for var in a[1:]:
        if random.random() <= mutation_probability:
            new_var = var + random.gauss(mu, sigma)
            mutant.append(new_var)
        else:
            mutant.append(var)
    return mutant

initial_population = [[-1, 0, 0], [-1, 10, 10]]     
most_fit_solution = genetic_algorithm(fitness_function, have_sex, mutate, initial_population)
most_fit_solution

Solutions considered: 5583, Error: 0.0051585675279162005


[0.0051585675279162005, 1.9800869495118212, 2.988398522965479]

**Wow, so that's a pretty close solution in not too many generations (or solutions considered).**
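
To see what "pretty close" means here, the reported solution can be plugged back into the two equations. This sketch just repeats the arithmetic of `fitness_function` on the output shown above; the residual names are chosen for illustration.

```python
# Solution reported by the run above: [fitness, x, y]
fitness, x, y = [0.0051585675279162005, 1.9800869495118212, 2.988398522965479]

residual_1 = (x - y) - (-1)     # how far x - y is from -1
residual_2 = (3 * x + y) - 9    # how far 3x + y is from 9
recomputed_error = residual_1 ** 2 + residual_2 ** 2

print(residual_1, residual_2, recomputed_error)
```

The residuals come out at roughly -0.008 and -0.071, so nearly all of the remaining error comes from the second equation.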

Now, let's compare it with these two types of random search:

* Random walk
* Uniform random search (in a constrained window)

## Comparison with random walk solution

In [17]:
def random_walk_next_solution(prev_solution, mu, sigma):
    return [prev_solution[0] + random.gauss(mu, sigma), prev_solution[1] + random.gauss(mu, sigma)]

def error_function(individual):
    x, y = individual
    # We square the error so it's always positive
    eq1_error = ((x - y) - (-1))**2
    eq2_error = ((3*x + y) - 9)**2
    return eq1_error + eq2_error

def do_random_walk_search(mu, sigma, init_guess, max_iter=100*10000+1, quit_at_err=0.01):
    current_solution = init_guess
    
    for i in range(max_iter):
        current_solution = random_walk_next_solution(current_solution, mu, sigma)
        current_solution_err = error_function(current_solution)
        if current_solution_err < quit_at_err:
            break
            
    print('Solutions considered: {}, Error: {}'.format(i, current_solution_err))
    return current_solution

do_random_walk_search(0, 0.5, [0, 0])

Solutions considered: 1000000, Error: 2674475.7667086828


[-517.4518389625891, 14.596513748922629]

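That is a very poor solution with an enormous error: the walk has no pull toward low-error regions, so it simply drifts away. Note also that the function returns the walk's final position, not the best point it visited.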
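
One way to see why the walk ends up so far from the answer: each coordinate is a sum of independent Gaussian steps, so its standard deviation grows with the square root of the number of steps. A small sketch of that arithmetic (the helper name `walk_spread` is hypothetical):

```python
import math

def walk_spread(num_steps, sigma):
    """Standard deviation of one coordinate after num_steps independent N(0, sigma) steps."""
    return sigma * math.sqrt(num_steps)

# Roughly the number of steps and the step size used in the run above.
spread = walk_spread(1_000_000, 0.5)
print(spread)
```

That gives a spread of about 500 per coordinate, which is the same order of magnitude as the x value of about -517 in the output above.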

## Comparison with random (non-walk) solution

In [20]:
def random_next_solution(x_min, x_max, y_min, y_max):
    return [random.uniform(x_min, x_max), random.uniform(y_min, y_max)]

def do_random_search(x_min, x_max, y_min, y_max, max_iter=100*10000+1, quit_at_err=0.01):
    for i in range(max_iter):
        # Note that with this solution, we are artificially limiting the search window
        current_solution = random_next_solution(x_min, x_max, y_min, y_max)
        current_solution_err = error_function(current_solution)
        if current_solution_err < quit_at_err:
            break

    print('Solutions considered: {}, Error: {}'.format(i, current_solution_err))
    return current_solution

do_random_search(-20, 20, -20, 20)

Solutions considered: 258008, Error: 0.003331292698078634


[2.0172999771900813, 2.9610554044041812]

So the random uniform search looks decent (though nowhere near as efficient as the Genetic Algorithm). BUT, we had to limit the search window for x and y to (-20, 20). The Genetic Algorithm is really nice because no such limitation of the window is necessary. With some problems, we don't have such a good idea of where the solution (or a good-enough solution) lies.

Now, look how poorly the random search algorithm behaves when we expand the window to (-100, 100):

In [21]:
do_random_search(-100, 100, -100, 100)

Solutions considered: 1000000, Error: 13621.719576470292


[8.392531191861323, -84.92413223145496]

*So when we expand the search space, the results are much worse for the uniform random search.*
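
A rough back-of-the-envelope estimate shows why the window size matters so much. The error is the squared length of A·(v − v*), where A = [[1, −1], [3, 1]] is the coefficient matrix and v* = (2, 3), so the points with error below a threshold form an ellipse of area π · threshold / |det A|. The sketch below (helper name hypothetical) assumes that ellipse lies entirely inside the search window, which holds for both windows used here.

```python
import math

def expected_uniform_samples(half_width, quit_at_err=0.01):
    """Expected number of uniform draws from a square window
    (-half_width, half_width)^2 before one lands within quit_at_err."""
    det_a = abs(1 * 1 - (-1) * 3)                 # |det [[1, -1], [3, 1]]| = 4
    target_area = math.pi * quit_at_err / det_a   # area of the 'good enough' ellipse
    window_area = (2 * half_width) ** 2
    hit_probability = target_area / window_area
    return 1 / hit_probability

n_small = expected_uniform_samples(20)
n_large = expected_uniform_samples(100)
print(round(n_small), round(n_large))
```

This comes to roughly 2 × 10^5 draws for the (-20, 20) window, the same order as the 258,008 seen above, and roughly 5 × 10^6 for the (-100, 100) window, which is well beyond the iteration limit of about one million.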

# Conclusion

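In these runs, the genetic algorithm was far more efficient than either alternative: it reached the error threshold after about 5,600 solutions considered, versus about 258,000 for the uniform random search in a (-20, 20) window. Neither the random walk nor the uniform search over (-100, 100) reached the threshold within 1,000,000 iterations.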