Genome mutate - implements new logic for mutations #180

HenrikMettler
Contributor

Implements the new logic for mutation discussed in #172. Closes #172 and #173, and makes #124 obsolete. Does not pass test_parallel_population in test/test_hl_api.py, since the fitness is not the same with 1 or 2 parallel processes. I created the PR anyway since I couldn't figure out why, or whether this is a problem with the new logic in mutate.

Member

@mschmidt87 mschmidt87 left a comment

Can you rebase your branch on current master? I think you branched off a while ago and now there are some changes in the PR which don't belong here.

@HenrikMettler HenrikMettler force-pushed the issue/assert_real_mutations branch 3 times, most recently from 47f99b8 to a8034c1 Compare July 14, 2020 11:58
@jakobj jakobj added this to the 0.2.0 milestone Jul 14, 2020
Member

@jakobj jakobj left a comment

awesome, great work @HenrikMettler! 🚀

this is quite a substantial change, so accordingly I had a few remarks. ;) they should be addressed before merging. please let me know if anything is unclear.

@jakobj
Member

jakobj commented Jul 14, 2020

[snip] Does not pass test_parallel_population in test/test_hl_api.py, since the fitness is not the same with 1 or 2 parallel processes. I created the PR anyway since I couldn't figure out why, or whether this is a problem with the new logic in mutate.

the fitness should be identical, independent of the number of processes as far as I remember. this is obviously desirable from a reproducibility viewpoint. is it possible that any of the stochastic operations are not using the internal rng of the population? that would be my first guess.
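
To illustrate the reproducibility point, here is a minimal sketch, not hal-cgp's actual implementation. It assumes a hypothetical objective `noisy_fitness` and a hypothetical driver `evaluate_population`, and it uses threads as a stand-in for processes to keep the example short. The idea is that all seeds are drawn from the population-level generator in a fixed order before work is distributed, so the result does not depend on the number of workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def noisy_fitness(individual_seed):
    # Hypothetical objective: all randomness comes from a generator
    # seeded per individual, never from the global `random` module.
    rng = random.Random(individual_seed)
    return sum(rng.random() for _ in range(10))


def evaluate_population(population_seed, n_individuals, n_workers):
    # The population-level generator hands out one seed per individual,
    # in a fixed order, before any work is distributed.
    population_rng = random.Random(population_seed)
    seeds = [population_rng.randrange(2**32) for _ in range(n_individuals)]
    if n_workers == 1:
        return [noisy_fitness(seed) for seed in seeds]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map preserves the input order, so results line up with seeds.
        return list(pool.map(noisy_fitness, seeds))


reference = evaluate_population(1234, 8, n_workers=1)
```

If any stochastic step drew from a shared or global generator instead, the order of draws could depend on how work is split, which is one way a single-process run could end up differing from the others.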

@HenrikMettler
Contributor Author

Just noticed that test_create_new_offspring_and_parent_generation in test_ea_mu_plus_lambda.py occasionally fails. I assume this happens when none of the genes that have more than one permissible value is selected for mutation. Should I adapt the test logic, or adapt mutate such that at least one gene is always mutated?

@HenrikMettler
Contributor Author

the fitness should be identical, independent of the number of processes as far as I remember. this is obviously desirable from a reproducibility viewpoint. is it possible that any of the stochastic operations are not using the internal rng of the population? that would be my first guess.

As far as I can see the problem just arises when setting n_process = 1 and not for any other (valid) number of processes. Could the error have something to do with https://github.com/Happy-Algorithms-League/hal-cgp/blob/master/cgp/ea/mu_plus_lambda.py#L157 ? I don't know how my changes influence this though

@jakobj
Member

jakobj commented Jul 16, 2020

Just noticed that test_create_new_offspring_and_parent_generation in test_ea_mu_plus_lambda.py occasionally fails. I assume this happens when none of the genes that have more than one permissible value is selected for mutation. Should I adapt the test logic, or adapt mutate such that at least one gene is always mutated?

i would adapt the test logic as we are happy with the logic in mutate. maybe you can increase the mutation rate to 1.0?

@jakobj
Member

jakobj commented Jul 16, 2020

the fitness should be identical, independent of the number of processes as far as I remember. this is obviously desirable from a reproducibility viewpoint. is it possible that any of the stochastic operations are not using the internal rng of the population? that would be my first guess.

As far as I can see the problem just arises when setting n_process = 1 and not for any other (valid) number of processes. Could the error have something to do with https://github.com/Happy-Algorithms-League/hal-cgp/blob/master/cgp/ea/mu_plus_lambda.py#L157 ? I don't know how my changes influence this though

what do you mean by "the problem"? is the fitness always the same except for running a single process?

Member

@jakobj jakobj left a comment

this is improving nicely! just added two more comments

@HenrikMettler
Contributor Author

Just noticed that test_create_new_offspring_and_parent_generation in test_ea_mu_plus_lambda.py occasionally fails. I assume this happens when none of the genes that have more than one permissible value is selected for mutation. Should I adapt the test logic, or adapt mutate such that at least one gene is always mutated?

i would adapt the test logic as we are happy with the logic in mutate. maybe you can increase the mutation rate to 1.0?

I can't set it to 1.0 because of https://github.com/Happy-Algorithms-League/hal-cgp/blob/master/cgp/population.py#L37 (if not (0.0 < mutation_rate and mutation_rate < 1.0): raise ValueError("mutation rate needs to be in (0, 1)")). But 0.99 should be fine, I guess? (The chance of having 0 mutations with that mutation_rate should be negligible.)

@HenrikMettler
Contributor Author

the fitness should be identical, independent of the number of processes as far as I remember. this is obviously desirable from a reproducibility viewpoint. is it possible that any of the stochastic operations are not using the internal rng of the population? that would be my first guess.

As far as I can see the problem just arises when setting n_process = 1 and not for any other (valid) number of processes. Could the error have something to do with https://github.com/Happy-Algorithms-League/hal-cgp/blob/master/cgp/ea/mu_plus_lambda.py#L157 ? I don't know how my changes influence this though

what do you mean by "the problem"? is the fitness always the same except for running a single process?

Yes. (It also doesn't matter at which position in the list n_process = 1 appears, e.g. if I set for n_process in [2, 1, 4], the run with n_process = 1 still has a different fitness.)

@jakobj
Member

jakobj commented Jul 17, 2020

i would adapt the test logic as we are happy with the logic in mutate. maybe you can increase the mutation rate to 1.0?

I can't set it to 1.0 because of https://github.com/Happy-Algorithms-League/hal-cgp/blob/master/cgp/population.py#L37 (if not (0.0 < mutation_rate and mutation_rate < 1.0): raise ValueError("mutation rate needs to be in (0, 1)")). But 0.99 should be fine, I guess? (The chance of having 0 mutations with that mutation_rate should be negligible.)

ah, interesting. well, why don't we allow a mutation rate of 1.0? excluding the lower bound makes sense since no evolution takes place if no mutation can happen, but the upper bound could be included, right?
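
A minimal sketch of what such a relaxed check could look like, assuming a hypothetical helper `check_mutation_rate` (the name and the standalone form are illustrative, not the library's code). It accepts the half-open interval (0, 1], so the lower bound stays excluded while 1.0 becomes valid.

```python
def check_mutation_rate(mutation_rate):
    # Hypothetical helper, not the library's code: accepts the half-open
    # interval (0, 1], so 1.0 is allowed while 0.0 is still rejected.
    if not (0.0 < mutation_rate <= 1.0):
        raise ValueError("mutation rate needs to be in (0, 1]")
    return mutation_rate


def rejects(value):
    # Small convenience for the usage below: True if the value is refused.
    try:
        check_mutation_rate(value)
    except ValueError:
        return True
    return False
```

With this check, `check_mutation_rate(1.0)` returns 1.0, while `rejects(0.0)` and `rejects(1.1)` are both true. For the 0.99 workaround: assuming each of n genes is selected independently with probability p, the chance that no gene is selected is (1 - p)^n, which for p = 0.99 and n = 10 is 0.01^10 = 1e-20.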

@jakobj
Member

jakobj commented Jul 17, 2020

the fitness should be identical, independent of the number of processes as far as I remember. this is obviously desirable from a reproducibility viewpoint. is it possible that any of the stochastic operations are not using the internal rng of the population? that would be my first guess.

As far as I can see the problem just arises when setting n_process = 1 and not for any other (valid) number of processes. Could the error have something to do with https://github.com/Happy-Algorithms-League/hal-cgp/blob/master/cgp/ea/mu_plus_lambda.py#L157 ? I don't know how my changes influence this though

what do you mean by "the problem"? is the fitness always the same except for running a single process?

Yes. (It also doesn't matter at which position in the list n_process = 1 appears, e.g. if I set for n_process in [2, 1, 4], the run with n_process = 1 still has a different fitness.)

this is not good and we should fix it before merging. i'll see whether i find time today to look into this.

Member

@jakobj jakobj left a comment

this looks great, nice solution to the only_silent_mutations testing 🚀

added a few more comments

Member

@jakobj jakobj left a comment

added a few inline comments regarding the generation of random numbers

Member

@jakobj jakobj left a comment

looking good, just some more nitpicking :)

Member

@jakobj jakobj left a comment

alright, four more small comments to fix and this is good to go! :)

you also need to rebase on the current master to fix the conflicts before we can merge

@HenrikMettler
Contributor Author

I am not sure how to fix the Travis CI build issue: when I introduce a line break, black "refixes" it.

@jakobj
Member

jakobj commented Jul 20, 2020

I am not sure how to fix the Travis CI build issue: when I introduce a line break, black "refixes" it.

hmm, yeah, black and flake seem to bite each other here. i suggested a shorter variable name, that should do the trick :)

… random prob is below the mutation rate, evaluate permissible values for every gene, make a mutation always change the gene value, if more than one value is possible for this gene.

Add test for number of mutations in a large population (must be close to expected value), test only silent mutation by monkey patching the indices selection function, testing permissible values for hidden and output region.
Allow mutation_rate = 1.0.
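
The mutation rule described in this commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code merged in cgp/genome.py: the function `mutate`, its signature, and the flat list representation of genes and permissible values are all hypothetical.

```python
import random


def mutate(genes, permissible_values, mutation_rate, rng):
    # genes: list of ints; permissible_values: one list of allowed values
    # per gene. Returns a mutated copy and the number of changed genes.
    mutated = list(genes)
    n_changed = 0
    for idx, current in enumerate(genes):
        if rng.random() >= mutation_rate:
            continue  # gene not selected for mutation
        alternatives = [v for v in permissible_values[idx] if v != current]
        if not alternatives:
            continue  # only one permissible value, nothing to change
        # Choosing among the *other* values guarantees a real change.
        mutated[idx] = rng.choice(alternatives)
        n_changed += 1
    return mutated, n_changed


rng = random.Random(0)
genes = [0] * 1000
permissible = [[0, 1, 2]] * 1000
mutated, n_changed = mutate(genes, permissible, 0.1, rng)
```

Because a selected gene always changes when it has an alternative, the number of changed genes in a large genome should be close to the mutation rate times the number of genes that have more than one permissible value, which is the kind of property the added test checks.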
Member

@jakobj jakobj left a comment

great work, this looks good 🎉 ✨

approving 👍 conditioned on travis passing

@jakobj jakobj merged commit 607d028 into Happy-Algorithms-League:master Jul 21, 2020