Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'plus' replacement in replacers.py might have an issue #27

Closed
albertotonda opened this issue Jul 21, 2023 · 5 comments
Closed

'plus' replacement in replacers.py might have an issue #27

albertotonda opened this issue Jul 21, 2023 · 5 comments

Comments

@albertotonda
Copy link
Contributor

  • inspyred version: 1.0.1
  • Python version: 3.10.9
  • Operating System: Windows

Description

By performing debug printouts, I noticed something weird in inspyred.ec.replacers.plus_replacement ; in theory, "plus" should take the current population + the offspring, sort them, and return the best len(population) individuals as the survivors. However, unless I made a mistake with my debug printouts, it looks like that the 'parent' list in the argument contains only individuals that were actually selected for reproduction. So, the code for plus_replacement that goes:

    pool = list(offspring)
    pool.extend(parents)
    pool.sort(reverse=True)
    survivors = pool[:len(population)]
    return survivors

is not correct, as it creates a list with all individuals which reproduced + the offspring. In this way, it actually creates a 'survivors' list with potentially multiple copies of the same individual (if, for example, one individual was selected for reproduction multiple times). Correcting this should be easy enough:

    pool = list(offspring)
    **pool.extend(population)**
    pool.sort(reverse=True)
    survivors = pool[:len(population)]
    return survivors

However, I would double-check all other replacers, because they might have similar issues if they assume that 'parents' contains the current population and not just a list (with potential duplicates!) of the individuals that were selected for reproduction.

Here below is a printout of the content of 'parents', 'population', and 'offspring' for a problem where I have a custom individual.candidate genome (akin to Genetic Programming). I also have unique ids for the individuals. As you can see, while 'population' contains one copy of each individual, 'parents' has several copies of individuals 1 and 4, which were selected for reproduction multiple times.

[DEBUG 2023-07-21 10:09:01] Initial state of the parents:
[DEBUG 2023-07-21 10:09:01] 1:'u' : '(cos(P)/(P-P))'
[DEBUG 2023-07-21 10:09:01] 4:'u' : 'sin(cos(((P-P)+(L/P))))'
[DEBUG 2023-07-21 10:09:01] 1:'u' : '(cos(P)/(P-P))'
[DEBUG 2023-07-21 10:09:01] 1:'u' : '(cos(P)/(P-P))'
[DEBUG 2023-07-21 10:09:01] 4:'u' : 'sin(cos(((P-P)+(L/P))))'
[DEBUG 2023-07-21 10:09:01] 5:'u' : 'max(cos(min(L,(0.6575))),min(min(L,L),((0.6309)+(0.5804))))'
[DEBUG 2023-07-21 10:09:01] 9:'u' : '(((min(P,P)-max(L,P))*(min(L,(-0.9861))+min(L,P)))/(-0.6634))'
[DEBUG 2023-07-21 10:09:01] 4:'u' : 'sin(cos(((P-P)+(L/P))))'
[DEBUG 2023-07-21 10:09:01] 2:'u' : '(max(L,L)+max((-0.0879),(0.9665)))'
[DEBUG 2023-07-21 10:09:01] 8:'u' : 'sin(min(max(L,(0.7915)),(P/(0.2088))))'

[DEBUG 2023-07-21 10:09:01] Initial state of the population:
[DEBUG 2023-07-21 10:09:01] 0:'u' : 'cos(sin((-0.8001)))'
[DEBUG 2023-07-21 10:09:01] 1:'u' : '(cos(P)/(P-P))'
[DEBUG 2023-07-21 10:09:01] 2:'u' : '(max(L,L)+max((-0.0879),(0.9665)))'
[DEBUG 2023-07-21 10:09:01] 3:'u' : 'sqrt(sin(min((L-P),max(P,P))))'
[DEBUG 2023-07-21 10:09:01] 4:'u' : 'sin(cos(((P-P)+(L/P))))'
[DEBUG 2023-07-21 10:09:01] 5:'u' : 'max(cos(min(L,(0.6575))),min(min(L,L),((0.6309)+(0.5804))))'
[DEBUG 2023-07-21 10:09:01] 6:'u' : 'max(max((0.7001),P),sin(P))'
[DEBUG 2023-07-21 10:09:01] 7:'u' : '(cos((0.1225))*sqrt(L))'
[DEBUG 2023-07-21 10:09:01] 8:'u' : 'sin(min(max(L,(0.7915)),(P/(0.2088))))'
[DEBUG 2023-07-21 10:09:01] 9:'u' : '(((min(P,P)-max(L,P))*(min(L,(-0.9861))+min(L,P)))/(-0.6634))'

[DEBUG 2023-07-21 10:09:01] Initial state of the offspring:
[DEBUG 2023-07-21 10:09:01] 10:'u' : '(((P-P)+(L/P))/(P-P))'
[DEBUG 2023-07-21 10:09:01] 11:'u' : '(sin(P)/min(P,L))'
[DEBUG 2023-07-21 10:09:01] 12:'u' : 'sin(cos(min((P-P),max(P,P))))'
[DEBUG 2023-07-21 10:09:01] 13:'u' : 'L'
[DEBUG 2023-07-21 10:09:01] 14:'u' : 'sin(min(max(L,(0.7915)),(P/(0.2088))))'
@aarongarrett
Copy link
Owner

aarongarrett commented Jul 21, 2023 via email

@albertotonda
Copy link
Contributor Author

I see! However, I still think the implementation should be modified, for two reasons:

  1. When manuscripts in EAs talk about "parent population", I think they mean "whole population at the current generation, before creating offspring", rather than "only the individuals selected for reproduction". I tried to find some pseudo-code to support my claim, and the first thing I stumbled upon is from Hansen's introduction to Evolution Strategies, http://www.cmap.polytechnique.fr/~nikolaus.hansen/es-overview-2015.pdf :

image

Here, it is clear that the population P at line 7, just before selection, is the union of the old population and the new offspring. Even from another point of view, since mu refers to the size of the whole population, if only the parent individuals selected for reproduction were included in the global archive before selection, the size of the population at the end of a generation, just before selection, would not be of size mu+lambda, but rather (size of an archive with just individuals selected for reproduction)+lambda. The size of such an archive would vary with the type of variators applied (e.g. 2 individuals for each activation of crossover(s), 1 individual for each activation of a mutation).

  1. Even if the current interpretation were correct, the 'parent' list can contain multiple copies of the same individual. I don't think this is the intended behavior, as it can rapidly degrade population diversity.

@markcoletti
Copy link

markcoletti commented Jul 21, 2023

Alberto is correct. For an ES(µ+λ) the pool of prospective parents and created offspring are poured into the same bucket and then truncation selection yields a new pool of prospective parents of size |µ|. Combining just the selected parents and offspring introduces a new source of genetic drift -- that is, parents not selected drop out of sight even if they happen to be some of the best.

This is an ES example from LEAP:

    while generation_counter.generation() < max_generation:
        offspring = pipe(parents,
                         ops.random_selection,
                         ops.clone,
                         mutate_gaussian(std=context['leap']['std'], expected_num_mutations=1),
                         ops.evaluate,
                         # create the brood
                         ops.pool(size=len(parents) * BROOD_SIZE),
                         # mu + lambda
                         ops.truncation_selection(size=len(parents),
                                                  parents=parents))
        parents = offspring

pipe() takes a data source as its first argument, in this case the parents, from which offspring are accumulated via the ops.pool() operator. Then those offspring are passed to the truncation_selection() operator where the pool of prospective parents, including those that were not selected by happenstance for creating offspring, are combined and truncated by fitness; this way we do not lose the best parents (unless they happen to all be worse than the offspring). Converting this snippet to an ES(µ,λ) would simply entail dropping the parents=parents argument to the truncation selection operator.

@aarongarrett
Copy link
Owner

aarongarrett commented Jul 21, 2023 via email

@sanjayankur31
Copy link
Collaborator

Fixed in #28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants