Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: Skip duplicate candidates with IdentityHashMap and make AbstractEvolutionEngine thread safe #31

Open
bernard01 opened this issue Jan 17, 2018 · 2 comments

Comments

@bernard01
Copy link

bernard01 commented Jan 17, 2018

We can improve evolution performance of the AbstractEvolutionEngine by exploiting the specified SelectionStrategy behavior.

interface SelectionStrategy#select comments:

... the same individual may potentially be selected more than once

This has a consequence in AbstractEvolutionEngine#evaluatePopulation()

which will then evaluate identical candidates more than once.

The result is reduced evolution performance which is a general problem not specific to any application.

One gets partial relief by maintaining the fitness value in the candidate itself (not something that is always possible) and evaluating conditionally. That requires synchronizing on the candidate in FitnessEvaluator#getFitness(). However, this scheme runs into thread contention problems with a high number of identical candidates, again resulting in reduced evolution performance.

The permanent solution is to only evaluate a reduced size set of distinct candidates. Synchronizing as described above is then no longer required.

So in AbstractEvolutionEngine#evaluatePopulation()

// Each entry contains the count of skipped duplicates (in addition to the 1st).
Map<T, Integer> duplicatesCountMap = new IdentityHashMap();
for (T candidate: population){
	if(duplicatesCountMap.containsKey(candidate)){
		duplicatesCountMap.put(candidate, duplicatesCountMap.get(candidate) + 1);
	}else{
		duplicatesCountMap.put(candidate, 0);
	}
}

and then in both single threaded and multi-threaded if branches

replace
for (T candidate : population)
with
for (T candidate : duplicatesCountMap.keySet())

and then later some post-processing to add the skipped duplicates back to the results so that the population size remains the same (that is why we count the duplicates):

final List<EvaluatedCandidate<T>> skippedPopulation = new ArrayList<EvaluatedCandidate<T>>(population.size());
for(EvaluatedCandidate<T> evaluatedCandidate : evaluatedPopulation){
	final Integer skippedCount = duplicatesCountMap.get(evaluatedCandidate.getCandidate());
	for(int index = 0; index < skippedCount; index++){
		skippedPopulation.add(evaluatedCandidate);
	}
}
evaluatedPopulation.addAll(skippedPopulation);

Please refer to the attached source file
AbstractEvolutionEngine.zip

.

@bernard01 bernard01 changed the title Performance: Skip duplicate candidates with IdentityHashMap Performance: Skip duplicate candidates with IdentityHashMap and make AbstractEvolutionEngine thread safe Feb 21, 2018
@bernard01
Copy link
Author

AbstractEvolutionEngine is not thread safe. That is because it is likely that multiple threads evaluate the same candidate at the same time.

We fix that and improve evolution performance of the AbstractEvolutionEngine at the same time by exploiting the specified SelectionStrategy behavior.

interface SelectionStrategy#select comments:

... the same individual may potentially be selected more than once

This has a consequence in AbstractEvolutionEngine#evaluatePopulation()

which will then evaluate identical candidates more than once.

The result is contention and reduced evolution performance which is a general problem not specific to any application.

One gets partial relief by maintaining the fitness value in the candidate itself (not something that is always possible) and evaluating conditionally. That requires synchronizing on the candidate in FitnessEvaluator#getFitness() because the engine is not thread safe. However, this scheme runs into thread contention problems with a high number of identical candidates, again resulting in reduced evolution performance.

The permanent solution is to only evaluate a reduced size set of distinct candidates. Synchronizing as described above is then no longer required.

So in AbstractEvolutionEngine#evaluatePopulation()

	// Each entry contains the count of skipped duplicates (in addition to the 1st).
	Map<T, Integer> duplicatesCountMap = new IdentityHashMap();
	for (T candidate: population){
		if(duplicatesCountMap.containsKey(candidate)){
			duplicatesCountMap.put(candidate, duplicatesCountMap.get(candidate) + 1);
		}else{
			duplicatesCountMap.put(candidate, 0);
		}
	}

and then in both single threaded and multi-threaded if branches

replace
for (T candidate : population)
with
for (T candidate : duplicatesCountMap.keySet())

and then later some post-processing to add the skipped duplicates back to the results so that the population size remains the same (that is why we count the duplicates):


final List<EvaluatedCandidate<T>> skippedPopulation = new ArrayList<EvaluatedCandidate<T>>(population.size());
for(EvaluatedCandidate<T> evaluatedCandidate : evaluatedPopulation){
    final Integer skippedCount = duplicatesCountMap.get(evaluatedCandidate.getCandidate());
        for(int index = 0; index < skippedCount; index++){
            skippedPopulation.add(evaluatedCandidate);
        }
}
evaluatedPopulation.addAll(skippedPopulation);

Please refer to the attached source code.
AbstractEvolutionEngine.zip

@bernard01 bernard01 reopened this Feb 21, 2018
@JoonasVali
Copy link

JoonasVali commented Nov 15, 2020

Shouldn't really be a problem with CachingFitnessEvaluator, which caches the fitnesses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants