
Arbiter : Genetic Search Algorithm #7081

Merged
merged 17 commits into eclipse:master from aboulang2002:ab2002_genetic_search on Feb 20, 2019

Conversation

@aboulang2002
Contributor

commented Jan 26, 2019

I have implemented a genetic search algorithm and I think others would find it useful.

For those unfamiliar with genetic algorithms (GA), the gist of it is to mimic biological evolution. That is, a population of candidate solutions is bred and/or mutated to generate better solutions.
Implementation details:

There are four building blocks to this algorithm: the crossover operators, the mutation operators, the cull operators and the population model. But the heart of the candidate generation is the SelectionOperator.

SelectionOperator:
The default operator, GeneticSelectionOperator, has two distinct behaviors, both driven from buildNextGenes(). The first behavior (buildRandomGenes()) is used while the population has not yet reached a certain size; in that case, the operator simply generates a random candidate. Once the required size has been reached, the selection operator starts to breed candidates from the population (buildOffspring()). It first calls a crossover operator, which may or may not change an existing candidate from the population (more on that later). This new candidate is then passed to the mutation operator, which may also change the candidate or leave it as is. The crossover/mutation step is repeated until a change has been made (by the crossover and/or the mutation). Finally, a step is taken that is not strictly necessary in a GA: since evaluating a candidate is very costly, we don't want to waste time on a candidate that we have already evaluated, so the generated candidate is searched for among a certain number of past candidates and is discarded if found.
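The selection flow described above can be sketched as follows. This is a minimal illustration of the idea, not Arbiter's actual implementation; the constants, field names, and helper bodies are my own assumptions:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Random;

// Hypothetical sketch of the selection flow: random genes until the
// population is large enough, then breeding, rejecting recently seen genes.
public class SelectionSketch {
    static final int MIN_POPULATION = 5;   // assumed threshold
    static final int HISTORY_SIZE = 64;    // assumed look-back window
    final Random rng = new Random(42);
    final Deque<double[]> recent = new ArrayDeque<>();

    double[] buildNextGenes(int populationSize, int geneCount) {
        double[] genes;
        do {
            genes = (populationSize < MIN_POPULATION)
                    ? buildRandomGenes(geneCount)
                    : buildOffspring(geneCount);
        } while (seenRecently(genes));     // avoid re-evaluating a past candidate
        remember(genes);
        return genes;
    }

    double[] buildRandomGenes(int geneCount) {
        double[] g = new double[geneCount];
        for (int i = 0; i < geneCount; i++) g[i] = rng.nextDouble();
        return g;
    }

    double[] buildOffspring(int geneCount) {
        // The real code loops crossover + mutation until the genes changed;
        // here a randomly perturbed vector stands in for that.
        double[] g = buildRandomGenes(geneCount);
        g[rng.nextInt(geneCount)] = rng.nextDouble();
        return g;
    }

    boolean seenRecently(double[] genes) {
        for (double[] past : recent) if (Arrays.equals(past, genes)) return true;
        return false;
    }

    void remember(double[] genes) {
        if (recent.size() == HISTORY_SIZE) recent.removeFirst();
        recent.addLast(genes);
    }

    public static void main(String[] args) {
        SelectionSketch s = new SelectionSketch();
        double[] a = s.buildNextGenes(0, 3);   // small population -> random genes
        double[] b = s.buildNextGenes(10, 3);  // large population -> offspring
        assert a.length == 3 && b.length == 3;
    }
}
```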

Crossover Operators:
The crossover operator is the one doing the breeding. The operator flow is this: first, parents are selected randomly. Then the operator randomly decides whether or not the crossover will proceed (with probability equal to the crossover rate). If it does not proceed, the first selected parent is returned. If the crossover does proceed, the selected parents are bred according to the selected crossover operator.

These are the currently implemented crossover operators:

  • ArithmeticCrossover: Each gene is interpolated at a random point (i.e., for each gene: t*a + (1-t)*b, where t is between 0 and 1)

  • SinglePointCrossover: A special case of the following KPointCrossover. The genes from the first parent are selected up to a random point (the crossover point), after which genes from the second parent are selected.

  • KPointCrossover: A bit like the SinglePointCrossover, but with multiple crossover points, which makes the gene selection switch back and forth between the first and second parents.

  • UniformCrossover: For each gene, randomly select whether it comes from the first or the second parent. It supports a parent bias parameter, so one parent can be favored.

    Parent Selection: Presently, all crossover operators only support two parents. However, it has been shown that multi-parent crossover can converge more rapidly. I will probably implement this at a later time.
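The crossover operators above can be sketched like this (method names and signatures are mine, not Arbiter's):

```java
import java.util.Random;

// Illustrative sketches of three of the crossover operators described above.
public class CrossoverSketch {

    // ArithmeticCrossover: each gene is interpolated at a random t in [0,1)
    static double[] arithmetic(double[] a, double[] b, Random rng) {
        double[] child = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            double t = rng.nextDouble();
            child[i] = t * a[i] + (1.0 - t) * b[i];
        }
        return child;
    }

    // SinglePointCrossover: genes from parent a up to the crossover point, then from b
    static double[] singlePoint(double[] a, double[] b, int point) {
        double[] child = new double[a.length];
        for (int i = 0; i < a.length; i++) child[i] = (i < point) ? a[i] : b[i];
        return child;
    }

    // UniformCrossover with a parent bias: each gene comes from a with probability bias
    static double[] uniform(double[] a, double[] b, double bias, Random rng) {
        double[] child = new double[a.length];
        for (int i = 0; i < a.length; i++) child[i] = (rng.nextDouble() < bias) ? a[i] : b[i];
        return child;
    }

    public static void main(String[] args) {
        Random rng = new Random(123);
        double[] a = {0.0, 0.0, 0.0, 0.0};
        double[] b = {1.0, 1.0, 1.0, 1.0};

        double[] sp = singlePoint(a, b, 2);
        assert sp[0] == 0.0 && sp[1] == 0.0 && sp[2] == 1.0 && sp[3] == 1.0;

        double[] ar = arithmetic(a, b, rng);
        for (double g : ar) assert g >= 0.0 && g <= 1.0;  // interpolation stays between parents

        double[] u = uniform(a, b, 1.0, rng);             // bias 1.0 -> always parent a
        for (double g : u) assert g == 0.0;
    }
}
```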

Mutation Operators:
Presently, only a random mutation operator is implemented. Each gene has a probability of being mutated, determined by the mutation rate.
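A random mutation operator of this kind might look like the following sketch (the name and the choice of replacing a mutated gene with a fresh random value are illustrative assumptions):

```java
import java.util.Random;

// Sketch of a random mutation operator: each gene mutates independently
// with probability mutationRate.
public class MutationSketch {
    static double[] mutate(double[] genes, double mutationRate, Random rng) {
        double[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < mutationRate) out[i] = rng.nextDouble(); // replace with a random value
        }
        return out;
    }

    public static void main(String[] args) {
        Random rng = new Random(7);
        double[] genes = {0.5, 0.5, 0.5};
        double[] unchanged = mutate(genes, 0.0, rng);  // rate 0 -> never mutates
        for (int i = 0; i < genes.length; i++) assert unchanged[i] == genes[i];
        double[] all = mutate(genes, 1.0, rng);        // rate 1 -> every gene replaced
        assert all.length == genes.length;
    }
}
```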

Population Model:
There are two main population models in GA. The more popular one is the generational model, in which candidates are generated from the current generation and placed in the 'next generation'. When the next generation has reached the desired size, the current generation is discarded completely and the next one becomes the current. This model has two drawbacks: firstly, newly generated chromosomes only become available for breeding when the generation changes; and secondly, possibly good chromosomes are discarded when the generation changes.

The other popular model is the steady-state model, in which a newly generated candidate takes the place of an old one in the population. This model solves one of the problems of the generational model: new chromosomes are immediately available for crossover. But it has a drawback of its own: if the culling strategy is not chosen carefully, it tends to reduce genetic diversity and thus favor local optima over the global optimum.

The implementation presented here is a compromise between the two. The population is allowed to grow to a certain size (the population size) and is then culled down to a smaller size (the culled size). This way, new chromosomes are immediately available and, no matter what our culling strategy is, they will have an opportunity to be selected as a parent, thus helping to keep a diverse gene pool.
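The grow-then-cull behavior can be sketched as follows. This is a simplified stand-in, assuming a least-fit cull and a Chromosome that holds only a fitness score:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the grow-then-cull population model: the population grows to
// populationSize, then is culled back down to culledSize.
public class PopulationSketch {
    static class Chromosome {
        final double fitness;
        Chromosome(double fitness) { this.fitness = fitness; }
    }

    final int populationSize, culledSize;
    final List<Chromosome> population = new ArrayList<>();

    PopulationSketch(int populationSize, int culledSize) {
        this.populationSize = populationSize;
        this.culledSize = culledSize;
    }

    void add(Chromosome c) {
        population.add(c);                       // new chromosomes are immediately available
        if (population.size() > populationSize) cullLeastFit();
    }

    // Least-fit cull: keep only the culledSize best-scoring chromosomes
    void cullLeastFit() {
        population.sort(Comparator.comparingDouble((Chromosome c) -> c.fitness).reversed());
        population.subList(culledSize, population.size()).clear();
    }

    public static void main(String[] args) {
        PopulationSketch p = new PopulationSketch(5, 3);
        for (int i = 0; i <= 5; i++) p.add(new Chromosome(i)); // 6th add triggers one cull
        assert p.population.size() == 3;
        assert p.population.get(0).fitness == 5.0;             // the fittest survive
    }
}
```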

Cull Operators:
This operator discards the least desirable candidates from the population. The only strategy currently implemented is the least-fit operator, an elitist operator which discards the candidates with the worst fitness scores. I may implement other strategies such as random replacement, kill tournament, and kill oldest.

I'd like to implement a way to save and load the GA's state so the optimization can be stopped and restarted at will. I'd also like better support for on-line optimization, that is, handling the scenario in which new training/eval data arrives constantly and good but old candidates may not remain as good as time goes by.

My immediate next steps, however, will be to write examples and documentation.

Quick checklist

  • Reviewed the Contributing Guidelines and followed the steps within.
  • Created tests for any significant new code additions.
  • Relevant tests for your changes are passing.
  • Ran mvn formatter:format (see formatter instructions for targeting your specific files).
Merge pull request #1 from deeplearning4j/master
Get latest changes to my fork
Merge pull request #2 from deeplearning4j/master
merged down from master fork
@AlexDBlack

Contributor

commented Jan 30, 2019

Hi, sorry for not getting to this sooner. At first glance it looks good, thanks for submitting it!
I'll do a formal review soon, but one thing I did notice initially is the lack of javadoc.... Javadoc on classes and methods would make this easier to use and maintain. We need copyright headers too (you can copy/paste from existing files).
You've got a bunch of unit tests (which is great) but I didn't see any using MultiLayerNetwork/ComputationGraph when I scanned the code (i.e., "full functionality" tests that are more like integration tests). Some tests like this would be good to make sure all of the components work together correctly.

@aboulang2002

Contributor Author

commented Jan 31, 2019

Thanks Alex. I added the javadoc and the copyright headers. I will also add some integration tests soon.

@AlexDBlack
Contributor

left a comment

Finally got to formally reviewing this.
Overall it's great - I have noted a few stylistic things and a few minor things that could be polished/improved.
Final integration tests aside, this looks close to being mergable.


public final PopulationModel populationModel;
public final ChromosomeFactory chromosomeFactory;
public final SelectionOperator selectionOperator;

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

Let's make these 3 protected final if we can.


@Override
public boolean hasMoreCandidates() {
return true;

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

There's probably some edge cases where this isn't true.
Consider the case where there's 1 discrete parameter to optimize, with possible values {A,B,C}.
In this case, there's only 3 candidates - and the candidate generator should terminate once all 3 have been generated.
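To illustrate the edge case, here is a hypothetical generator over a three-value discrete space that terminates once all candidates have been produced (not Arbiter code; names are illustrative):

```java
import java.util.Iterator;
import java.util.List;

// Sketch: with one discrete parameter {A, B, C}, only 3 distinct candidates
// exist, so hasMoreCandidates() can in principle return false once they are
// exhausted instead of always returning true.
public class DiscreteGeneratorSketch {
    private final Iterator<String> remaining;

    DiscreteGeneratorSketch(List<String> values) { this.remaining = values.iterator(); }

    boolean hasMoreCandidates() { return remaining.hasNext(); }
    String getCandidate() { return remaining.next(); }

    public static void main(String[] args) {
        DiscreteGeneratorSketch g = new DiscreteGeneratorSketch(List.of("A", "B", "C"));
        int count = 0;
        while (g.hasMoreCandidates()) { g.getCandidate(); count++; }
        assert count == 3;   // terminates after all 3 candidates were generated
    }
}
```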

/**
* The fitness score of the genes.
*/
public final double fitness;

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

Let's use private or protected here... that goes for all other public fields too (I won't point out every instance in this review).
And add a lombok @Data annotation to the class, to add getters, toString, equals etc
In case you aren't familiar with lombok: https://projectlombok.org/ - you'll also need IDE plugin.

public Builder crossoverRate(double rate) {
if(rate < 0 || rate > 1.0) {
throw new IllegalArgumentException("Rate must be between 0.0 and 1.0");
}

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

It's more a stylistic thing (I'll leave it up to you to change this sort of input validation if you want), but I tend to use the following (which is shorter and more informative for the user):

Preconditions.checkState(rate >= 0.0 && rate <= 1.0, "Rate must be between 0.0 and 1.0, got %s", rate);

That's the ND4J Preconditions class:
https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/nd4j-common/src/test/java/org/nd4j/base/TestPreconditions.java
https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/nd4j-common/src/main/java/org/nd4j/base/Preconditions.java

The ND4J preconditions class also formats arrays properly in the error string, and usually results in no object creation unless an exception is thrown.
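For illustration, the basic shape of such a checkState helper can be sketched like this. It is not the ND4J implementation, just the pattern: the message is only formatted when the condition fails, which is why the happy path is cheap.

```java
// Sketch of a checkState-style precondition helper (illustrative, not ND4J's).
public class PreconditionsSketch {
    // Message formatting happens only on failure
    static void checkState(boolean condition, String msgFormat, Object... args) {
        if (!condition) throw new IllegalStateException(String.format(msgFormat, args));
    }

    public static void main(String[] args) {
        double rate = 0.5;
        checkState(rate >= 0.0 && rate <= 1.0, "Rate must be between 0.0 and 1.0, got %s", rate);

        boolean threw = false;
        try {
            checkState(1.5 <= 1.0, "Rate must be between 0.0 and 1.0, got %s", 1.5);
        } catch (IllegalStateException e) {
            threw = e.getMessage().contains("1.5");   // the bad value ends up in the message
        }
        assert threw;
    }
}
```

Note this sketch still boxes the varargs on every call; the real class avoids even that in most cases.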

/**
* A comparator used when higher fitness value is better
*/
public class MaximizeScoreComparator implements Comparator<Chromosome> {

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

This (and the other comparator) should be a static class?

return 1;
else if (rhs.fitness < lhs.fitness)
return -1;
return 0;

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

Let's replace these 5 lines with: return Double.compare(lhs.getFitness(), rhs.getFitness()) (possibly with order switched, if that's the intention here).
Note I'm assuming the fitness field is protected/private and getter via @Data was added, as per earlier review comment.
Same thing for other comparator.
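The suggested comparator could look like this sketch, assuming a Chromosome with a fitness getter (e.g. generated by Lombok's @Data):

```java
import java.util.Comparator;

// Sketch of a maximize-score comparator built on Double.compare.
public class ComparatorSketch {
    static class Chromosome {
        private final double fitness;
        Chromosome(double fitness) { this.fitness = fitness; }
        double getFitness() { return fitness; }
    }

    // Higher fitness sorts first ("maximize score"): compare rhs against lhs
    static final Comparator<Chromosome> MAXIMIZE =
            (lhs, rhs) -> Double.compare(rhs.getFitness(), lhs.getFitness());

    public static void main(String[] args) {
        assert MAXIMIZE.compare(new Chromosome(1.0), new Chromosome(2.0)) > 0;
        assert MAXIMIZE.compare(new Chromosome(2.0), new Chromosome(1.0)) < 0;
        assert MAXIMIZE.compare(new Chromosome(1.0), new Chromosome(1.0)) == 0;
    }
}
```

Using Double.compare also handles NaN and -0.0 consistently, which hand-written `<`/`>` comparisons do not.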

runner.addListeners(new LoggingStatusListener());
runner.execute();

System.out.println("----- Complete -----");

@AlexDBlack

AlexDBlack Feb 5, 2019

Contributor

Maybe add a check that there were 0 failed, 50 successful, 50 total results?

aboulang2002 and others added 2 commits Feb 6, 2019
@AlexDBlack

Contributor

commented Feb 7, 2019

New changes look good. 👍
Let me know when you're ready for another review (hopefully final + merge). And let me know if you need help with the integration tests (DL4J net optimization).

@aboulang2002

Contributor Author

commented Feb 7, 2019

Hi Alex, I should have taken a moment earlier to thank you for your review.

I've been scratching my head a bit about how to fix the return true; in hasMoreCandidates(). The use case you mention looks easy, but it quickly gets too complex for my taste once other use cases are taken into consideration. I suppose the norm is a space having at least one non-discrete hyperparameter; in those cases, we can safely always return true. But in the case of an all-discrete space, we would have to keep track of what has been tried and what has not, and that can grow gigantic rapidly.

Another approach would be to monitor the diversity in the population. When near an optimum, the algorithm will have a very similar gene pool and the fitness of all individuals will be very close. When that happens, hasMoreCandidates() could return false. But that would be a lie; there are more candidates, we would just have decided on the user's behalf that the population they got is good enough. Besides, that approach should be implemented in a TerminationCondition if one really wants to go that way.

Instead, I decided to add an attempt counter in GeneticSelectionOperator.buildNextGenes() and GeneticSelectionOperator.buildOffspring() that throws an exception if either of these methods fails to generate a new set of genes. GeneticSearchCandidateGenerator.getCandidate() catches this exception, and hasMoreCandidates() will return false after that. That also fixes a possible infinite loop. But I think I found another infinite loop in BaseOptimizationRunner.execute(). There is an outer while(true) loop and an inner while(config.getCandidateGenerator().hasMoreCandidates() ...) loop. In my test, I hadn't set a termination condition and hasMoreCandidates() immediately returned false. When the test runs, it never terminates, trapped in the outer while loop. I haven't had time yet to understand what that outer loop does, but the code in the rest of the class leads me to believe it is some multi-threading-related loop. If this is the case, I need to change a few things in the genetic algorithm classes because thread safety has not been a concern.
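The attempt-counter pattern described here can be sketched as follows (illustrative names; a tiny all-discrete space stands in for the real parameter space):

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Sketch: retry a bounded number of times to produce an unseen candidate,
// throw when exhausted, and flip hasMoreCandidates() to false on catch.
public class AttemptCounterSketch {
    static class ExhaustedException extends RuntimeException {}

    static final int MAX_ATTEMPTS = 100;
    final Random rng = new Random(1);
    final Set<Integer> seen = new HashSet<>();
    boolean hasMore = true;

    int buildNextGenes() {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            int candidate = rng.nextInt(3);          // tiny all-discrete space {0,1,2}
            if (seen.add(candidate)) return candidate;
        }
        throw new ExhaustedException();              // could not generate anything new
    }

    Integer getCandidate() {
        try {
            return buildNextGenes();
        } catch (ExhaustedException e) {
            hasMore = false;                         // stop instead of looping forever
            return null;
        }
    }

    boolean hasMoreCandidates() { return hasMore; }

    public static void main(String[] args) {
        AttemptCounterSketch g = new AttemptCounterSketch();
        int produced = 0;
        while (g.hasMoreCandidates()) {
            if (g.getCandidate() != null) produced++;
        }
        assert produced == 3;   // all 3 distinct candidates, then clean termination
    }
}
```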

What do you think would be a good way to fix hasMoreCandidates()?
Should I change my code to make everything thread safe?

aboulang2002 added 2 commits Feb 9, 2019

@aboulang2002 aboulang2002 changed the title [WIP] (Request for comments) Arbiter : Genetic Search Algorithm Arbiter : Genetic Search Algorithm Feb 9, 2019

@aboulang2002

Contributor Author

commented Feb 9, 2019

I think this PR is ready for another review.
Thanks

aboulang2002 added 3 commits Feb 9, 2019
@AlexDBlack

Contributor

commented Feb 14, 2019

What do you think would be a good way to fix hasMoreCandidates()?

As for the hasMoreCandidates() issue - yeah, I gave that some thought, and came to basically the same conclusion that you did.
If there's real-valued parameter spaces, there's always more possible candidates.
As for discrete tracking - that would work in principle but isn't easy in practice. Furthermore, a user could implement their own parameter space, which we can't actually check to see if we've "exhausted" it the way we can for discrete parameter spaces.

I didn't realize until now that we actually do the same thing (return true) in RandomSearchCandidateGenerator, so in light of the fact that we don't even solve it there, let's ignore it (but maybe we should make a brief note in the javadoc). That's not exactly ideal, but in the absence of a better idea, we might have to just go with that.

Anyway, I scanned the code again, it looks good to me. The main thing is the final integration test.
For that, you can probably adapt one of these: https://github.com/deeplearning4j/deeplearning4j/blob/master/arbiter/arbiter-deeplearning4j/src/test/java/org/deeplearning4j/arbiter/computationgraph/TestGraphLocalExecution.java
Basically take one of the (non-ignored) tests and plug in the new candidate generator instead of RandomSearchGenerator. Given your other tests are pretty thorough, that should be sufficient I think.

In my test, I haven't set a termination condition and hasMoreCandidates() immediately returns false.

Termination conditions aren't optional... you should be hitting this validation error instead:
https://github.com/deeplearning4j/deeplearning4j/blob/master/arbiter/arbiter-core/src/main/java/org/deeplearning4j/arbiter/optimize/runner/BaseOptimizationRunner.java#L70-L73

Let's make sure that check (or something equivalent) is actually being triggered for that case.

@aboulang2002

Contributor Author

commented Feb 16, 2019

I added an integration test that I copied from TestGraphLocalExecution.testLocalExecutionDataSources(). I liked the fact this test is pretty thorough.

@AlexDBlack
Contributor

left a comment

LGTM - thanks again for the contribution!
I'll just build/test it locally, but assuming that passes this can be merged 👍

@AlexDBlack AlexDBlack merged commit 34c1059 into eclipse:master Feb 20, 2019

1 of 3 checks passed

codeclimate: Code Climate encountered an error attempting to analyze this pull request.
continuous-integration/jenkins/pr-head: This commit cannot be built.
Codacy/PR Quality Review: Up to standards. A positive pull request.

@aboulang2002 aboulang2002 deleted the aboulang2002:ab2002_genetic_search branch Feb 20, 2019
