
Is it possible to implement elitism? #158

Open
ELC opened this issue Jan 11, 2021 · 16 comments

Comments


ELC commented Jan 11, 2021

I wrote this function and used it as a callback to attempt elitist selection, without success:

from copy import deepcopy

best_chromosome = None

def save_best(population):
    global best_chromosome

    population_best = population.current_best

    if best_chromosome is None or population_best.fitness > best_chromosome.fitness:
        best_chromosome = deepcopy(population_best)

Is there a way to do it?


koaning commented Jan 11, 2021

I think this is where the callback step can really make a difference.

https://github.com/godatadriven/evol/blob/master/evol/evolution.py#L189-L204


ELC commented Jan 11, 2021

Hi @koaning, I use the callback step with the function I showed, but somehow the population does not have the behaviour I want.


koaning commented Jan 11, 2021

Is there a minimum viable example that you can share? Without one it is very hard for me to understand the missing behavior.


ELC commented Jan 11, 2021

Thanks for the quick reply!

This is an example. It is based on the README example, so it doesn't add much behaviour beyond it. Below is the history of the best fitness, which I would have expected to be monotonically increasing, but it wasn't.

import random
from copy import deepcopy

import numpy as np
import matplotlib.pyplot as plt

from evol import Population, Evolution

random.seed(30)

def random_start():
    """
    This function generates a random (x,y) coordinate
    """
    return (random.random() - 0.5) * 20, (random.random() - 0.5) * 20

def func_to_optimise(xy):
    """
    This is the function we want to optimise (maximize)
    """
    x, y = xy
    return -(1-x)**2 - 2*(2-x**2)**2

def pick_random_parents(pop):
    """
    This is how we are going to select parents from the population
    """
    mom = random.choice(pop)
    dad = random.choice(pop)
    return mom, dad

def make_child(mom, dad):
    """
    This function describes how two candidates combine into a new candidate
    Note that the output is a tuple, just like the output of `random_start`
    We leave it to the developer to ensure that chromosomes are of the same type
    """
    child_x = (mom[0] + dad[0])/2
    child_y = (mom[1] + dad[1])/2
    return child_x, child_y

def add_noise(chromosome, sigma):
    """
    This is a function that will add some noise to the chromosome.
    """
    new_x = chromosome[0] + (random.random()-0.5) * sigma
    new_y = chromosome[1] + (random.random()-0.5) * sigma
    return new_x, new_y


best_chromosome = None
fitness_history = []

def save_best(population):
    global best_chromosome

    if not population.is_evaluated:
        population.evaluate()
        
    population_best = population.current_best

    if best_chromosome is None or population_best.fitness > best_chromosome.fitness:
        best_chromosome = deepcopy(population_best)

    amount = len(population.individuals)
    index = np.random.randint(0, amount)
    population.individuals[index] = best_chromosome
    
    fitness_history.append(population.current_best.fitness)

# We start by defining a population with candidates.
pop = Population(chromosomes=[random_start() for _ in range(10)],
                 eval_function=func_to_optimise, maximize=True)

# We define a sequence of steps to change these candidates
evo1 = (Evolution()
        .survive(fraction=0.5)
        .breed(parent_picker=pick_random_parents, combiner=make_child)
        .mutate(mutate_function=add_noise, sigma=1.5)
        .callback(save_best))

pop = pop.evolve(evo1, n=50)

with plt.style.context("bmh"):
    plt.figure(figsize=(16,5))
    plt.plot(fitness_history)

First 100 steps:

[figure: best-fitness history over the first 100 steps]


koaning commented Jan 11, 2021

Your example isn't complete. You didn't add the code that handles the actual logging of the stats you see in your chart, or the generate_chromosome function that generates the candidates. I cannot reproduce your code, so I cannot confirm or deny whether there's a bug in it.


ELC commented Jan 11, 2021

You are right, I am sorry.

I edited the message and used the README code to build a minimal self-contained example.

I am reporting population.current_best.fitness as the generation fitness, and this is the metric I would like to keep monotonically increasing.


koaning commented Jan 11, 2021

I have a hint of what is happening. You're logging the best candidate per generation, but you're still applying a mutate step, and that mutation is what is breaking your expectation. Here are the same charts when I increase/decrease the noise added.

Same run with sigma=10

[figure: best-fitness history, sigma=10]

Same run with sigma=0.01

[figure: best-fitness history, sigma=0.01]

You might want to adapt the mutate function so that it doesn't change a chromosome if it is the current best candidate.
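That suggestion can be sketched as a wrapper around the thread's add_noise function. Note this is a standalone, hypothetical helper (make_elitist_mutator is not part of evol's API, and evol's mutate step does not pass the current best to the mutate function, so wiring this in would need extra plumbing):

```python
import random

def add_noise(chromosome, sigma):
    """Perturb an (x, y) chromosome with uniform noise (from the example above)."""
    new_x = chromosome[0] + (random.random() - 0.5) * sigma
    new_y = chromosome[1] + (random.random() - 0.5) * sigma
    return new_x, new_y

def make_elitist_mutator(mutate_fn):
    """Hypothetical helper: wrap a mutate function so that a chromosome
    equal to the current best is passed through unchanged.
    Comparison by value is a simplification for this sketch."""
    def mutate(chromosome, sigma, best=None):
        if best is not None and chromosome == best:
            return chromosome  # skip mutation for the elite
        return mutate_fn(chromosome, sigma)
    return mutate

elitist_noise = make_elitist_mutator(add_noise)

best = (1.0, 2.0)
assert elitist_noise(best, 1.5, best=best) == best  # elite is untouched
```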


ELC commented Jan 11, 2021

I know, but the idea behind elitism is that we save the best chromosome from the population before selection and mutation, and then add it back to the population after mutation.

The behaviour I want is to let some chromosomes skip the breed and mutate steps even if those steps occur 100% of the time. This ensures that, at the end of the run, the best chromosome is the historical best.

And this should happen independently of how breeding and mutation are configured.

Here is an illustration of what I am looking for:

[figure: diagram of the desired elitism flow]
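The save-before / reinsert-after pattern described above can be sketched as a plain Python loop, without evol. This is illustrative only: fitness and the chromosome layout match the thread's example, but `step` is a stand-in for the whole survive/breed/mutate pipeline:

```python
import random
from copy import deepcopy

def fitness(xy):
    # The objective from the example above (maximised)
    x, y = xy
    return -(1 - x) ** 2 - 2 * (2 - x ** 2) ** 2

def step(population):
    """Stand-in for survive/breed/mutate: jitter every chromosome."""
    return [(x + random.gauss(0, 1.5), y + random.gauss(0, 1.5))
            for x, y in population]

def evolve_with_elitism(population, generations):
    history = []
    for _ in range(generations):
        elite = deepcopy(max(population, key=fitness))  # saved before the step
        population = step(population)                   # may destroy the best
        worst = min(range(len(population)),
                    key=lambda i: fitness(population[i]))
        population[worst] = elite                       # reinserted afterwards
        history.append(max(fitness(c) for c in population))
    return population, history

random.seed(30)
pop = [((random.random() - 0.5) * 20, (random.random() - 0.5) * 20)
       for _ in range(10)]
pop, history = evolve_with_elitism(pop, 50)

# Because the elite is reinserted every generation, the logged best
# fitness is monotonically non-decreasing by construction.
assert all(a <= b for a, b in zip(history, history[1:]))
```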


koaning commented Jan 11, 2021

Mhmm ... now I understand.

I'll need to ponder this one, but I think you're right that this isn't easily implemented with the current API. Maybe a "concat" verb makes sense, but I'll need to give it a think.

It should also be said: I no longer work for GoDataDriven and I haven't touched this library in ... years? Just to check, @rogiervandergeer, are you still watching this repo?


koaning commented Jan 11, 2021

Would it be sensible to change the mutate function so that it does not change the chromosomes of the best-performing candidates?


ELC commented Jan 12, 2021

Modifying the breed and mutate methods of Population should work for most cases (cases where callbacks modify the population are exceptions).

breed should also be modified, because when luck=True it is not guaranteed that the best-performing candidate will be kept.


ELC commented Jan 12, 2021

After trying several things, I think I found a solution:

I was using

population.individuals[index] = best_chromosome,

and I should have used

population.individuals[index] = deepcopy(best_chromosome)

I will do some tests and update the issue tomorrow
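The quoted snippet actually hides two separate pitfalls, which a minimal sketch (using plain lists as stand-ins for mutable individuals) makes visible: the trailing comma wraps the individual in a one-element tuple, and even without the comma, assigning the same object creates an alias that later in-place changes would corrupt:

```python
from copy import deepcopy

best = [1.0, 2.0]            # stand-in for a mutable individual
population = [[0.0, 0.0], [3.0, 4.0]]

# Pitfall 1 - trailing comma: the slot becomes a one-element tuple,
# not the chromosome itself.
population[0] = best,
assert population[0] == ([1.0, 2.0],)

# Pitfall 2 - aliasing: without a copy, mutating the slot in place
# also changes `best`.
population[0] = best
population[0][0] = 99.0
assert best[0] == 99.0

# deepcopy gives the population its own independent copy.
best = [1.0, 2.0]
population[0] = deepcopy(best)
population[0][0] = 99.0
assert best == [1.0, 2.0]   # the saved elite is untouched
```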

@rogiervandergeer

> It should also be said; I no longer work for GoDataDriven and I haven't touched this library in ... years? Just to check @rogiervandergeer you still watching this repo?

Of course I am! And don't let a mutation in the value for current_employer stop you from working on Evol!

> Modifying the breed and mutate methods of population should work for most cases (cases where callbacks modify the population are exceptions).
>
> breed should also be modified because when luck=True it is not guaranteed that the best performing candidate will be kept.

In #159 I've added a flag to let mutate behave elitistically. For the survive function, I think setting luck=True is so contradictory to elitism that I'm happy with the current situation. @ELC, would this work for you?


koaning commented Jan 12, 2021

I just pushed version 0.5.2 which should have this elitism feature. @ELC let us know if it works.

@GiliardGodoi

Hi guys!

I don't know if this is the right place for my thoughts, but here we go!

I know that 'evol' has an elegant way of defining an evolutionary algorithm's pipeline.
As in the example above, we don't code directly against the 'Evolution Step' classes.
Instead, 'Evolution Step' objects are appended to the 'Evolution' chain, and the Population's 'evolve' method applies them to the population.
Under the hood, each step calls a method on the 'Population' class, such as breed or mutate, or operates on the population directly through methods like filter and map.

And I'm aware of the great flexibility and simplicity of implementing custom mate, chromosome-evaluation, picker, and combiner functions and passing them to the evolutionary pipeline.

However, I notice that the methods of the 'Population' class are getting more and more complicated.
We keep adding behaviours to these methods that are not needed all the time, along with 'if ... else' clauses to switch between them.
One side effect is that some parameters are required in some cases but not in others, and it is not clear when they need to be set.

So I wondered -- and this would be a significant change to the framework's architecture -- whether the 'Evolution Step' classes could implement these different strategies themselves. For instance, we could have different mutation strategies: simple mutation, elitist mutation, and so on, and compose them in countless ways.

In that design, the new objects would implement the evolution step themselves rather than merely calling a method on a class like 'Population'.

Again, this is just a thought.

@rogiervandergeer

Hi @GiliardGodoi,

Thank you for your thoughts!

When first setting up Evol we spent a lot of time overthinking the API we wanted to expose. Our most important goal was to have a rather simple API that is intuitive to use and still makes it easy to do (somewhat) complicated things. Hence we came up with the "verbs" of survive, mutate, combine etc. The nice thing about having these is that you can play around by manually calling these methods on any population, as well as have Evol do all the magic for you in an Evolution object.

One downside of this approach is indeed that those methods of the Population object get quite complicated. The issue with which arguments are needed in which cases could be solved by better documentation (I guess we could spend some more time there), but at some point it will just be impossible to incorporate more logic.

You propose to create alternative EvolutionStep objects that implement different versions of our verbs. I don't like that idea so much, because it would mean we can no longer play around and call the methods on the Population itself (and I find that by far the best way to debug).

As an alternative we could offer multiple different Population-type objects. We already have two kinds -- why not more? We could have an ElitistPopulation, for example. If you want to mix-and-match different versions of the "verbs", we could do something with mixin classes, where you can define your own Population class by choosing your favourite way of surviving, mutating and combining -- although I guess that gets very complicated rather quickly. The upside of this solution is that the simple and intuitive way of using Evol is maintained: you only need to bother about alternative Population types when you want to do complicated things.
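The mixin idea can be sketched in a few lines. All class names here (BasePopulation, ElitistMutateMixin, ElitistPopulation) are hypothetical and not evol's actual classes; the point is only how composing verbs via mixins would look:

```python
class BasePopulation:
    """Hypothetical minimal population: a list of individuals plus an
    evaluation function (not evol's Population class)."""
    def __init__(self, individuals, eval_function):
        self.individuals = individuals
        self.eval_function = eval_function

    def best(self):
        return max(self.individuals, key=self.eval_function)

class ElitistMutateMixin:
    """A mutate verb that never touches the current best. Comparison by
    value is a simplification; ties with the elite are all preserved."""
    def mutate(self, mutate_fn):
        elite = self.best()
        self.individuals = [ind if ind == elite else mutate_fn(ind)
                            for ind in self.individuals]
        return self

class ElitistPopulation(ElitistMutateMixin, BasePopulation):
    """Compose behaviour by picking mixins for each verb."""
    pass

pop = ElitistPopulation([1, 5, 3], eval_function=lambda x: x)
pop.mutate(lambda x: x - 10)
assert 5 in pop.individuals  # the elite survived the mutation
```

The upside, as noted above, is that the simple calling convention is preserved: you can still poke at `pop.mutate(...)` interactively, and only reach for a different Population type when you need different semantics.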

Would you think that is a good solution?
