ga_instance.best_solution() performs all solutions in instance #79

Closed
Stoops-ML opened this issue Dec 30, 2021 · 5 comments

@Stoops-ML

The best_solution() method of a pygad.GA instance recalculates the fitness of all the solutions within the instance. This seems unnecessary, as the instance could save the best solution on the fly and therefore would not need to evaluate all the solutions again.

@Stoops-ML
Author

I didn't notice the save_best_solutions=False input to pygad.GA(). Apologies.

@ahmedfgad added the "question" label (Further information is requested) on Jan 1, 2022

@whubaichuan

@Stoops-ML hi, what do you mean here? Does "runs all the solutions" mean the solutions of every generation, or only the solutions of the last generation?

@ahmedfgad
Owner

Hi @whubaichuan,

@Stoops-ML means that we do not have to re-calculate the fitness of the best solutions in the best_solution() method. As these solutions got their fitness values calculated previously, we can re-use them instead of calling the fitness function again.

As @Stoops-ML mentioned, we can use the save_best_solutions parameter. When it is set to True, then the best solutions in addition to their fitness values are saved into these 2 instance attributes:

  1. best_solutions
  2. best_solutions_fitness
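
The idea of recording the best solution and its fitness once per generation, so that reading the best solution later needs no further fitness calls, can be sketched in plain Python. This is not PyGAD code: the names CountingFitness and run_and_track_best are hypothetical, and random populations stand in for real generations. Only the two list names mirror the attributes mentioned above.

```python
import random


class CountingFitness:
    """Wraps a fitness function and counts how often it is called."""

    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, solution):
        self.calls += 1
        return self.func(solution)


def run_and_track_best(fitness, num_generations=3, sol_per_pop=4,
                       num_genes=3, seed=2):
    """Hypothetical loop that stores each generation's best solution."""
    rng = random.Random(seed)
    best_solutions = []
    best_solutions_fitness = []
    for _ in range(num_generations):
        # Stand-in for a real generation: random genes drawn from 1..4.
        population = [[rng.randint(1, 4) for _ in range(num_genes)]
                      for _ in range(sol_per_pop)]
        scores = [fitness(solution) for solution in population]
        best_idx = max(range(sol_per_pop), key=scores.__getitem__)
        best_solutions.append(population[best_idx])
        best_solutions_fitness.append(scores[best_idx])
    return best_solutions, best_solutions_fitness


fitness = CountingFitness(sum)
best_solutions, best_solutions_fitness = run_and_track_best(fitness)
calls_after_run = fitness.calls

# Reading the overall best re-uses the stored values only.
overall = max(range(len(best_solutions_fitness)),
              key=best_solutions_fitness.__getitem__)
best, best_fitness = best_solutions[overall], best_solutions_fitness[overall]
```

With 3 generations of 4 solutions, the fitness function is called 12 times during the run and not at all when the best solution is read back.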

@whubaichuan

whubaichuan commented Feb 22, 2023

@ahmedfgad hi, thanks for your reply.

In my testing,

  1. It seems that best_solution() does not calculate the fitness of all the solutions in the last generation when keep_elitism is higher than 0 (for example, keep_elitism=1), which conflicts with the documentation.
  2. It seems that best_solutions_fitness is populated even when save_best_solutions is False.

import pygad


def on_start(ga_instance):
    print("Initial Generation")
    print(ga_instance.population)


# Two-parameter fitness signature, as in the original post.
def fitness_func(solution, solution_idx):
    return sum(solution)


def on_generation(ga):
    print("Generation", ga.generations_completed)
    print(ga.population)


if __name__ == '__main__':
    num_generations = 3
    num_parents_mating = 2
    fitness_function = fitness_func
    sol_per_pop = 4
    num_genes = 3
    parent_selection_type = "rank"
    keep_elitism = 1

    # Each of the 3 genes takes a value from 1 to 4.
    gene_space = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]

    ga_instance = pygad.GA(num_generations=num_generations,
                           num_parents_mating=num_parents_mating,
                           fitness_func=fitness_function,
                           sol_per_pop=sol_per_pop,
                           num_genes=num_genes,
                           # callback_generation=callback_gen,
                           gene_space=gene_space,
                           parent_selection_type=parent_selection_type,
                           keep_elitism=keep_elitism,
                           random_seed=2,
                           on_start=on_start,
                           on_generation=on_generation,
                           gene_type=int,
                           save_best_solutions=False,
                           parallel_processing=["process", 4])

    ga_instance.run()
    print(ga_instance.population)
    print(ga_instance.last_generation_fitness)
    # Populated even though save_best_solutions=False (point 2 above).
    print(ga_instance.best_solutions_fitness)

    solution, solution_fitness, solution_idx = ga_instance.best_solution()
    print(f"Parameters of the best solution : {solution}")
    print(f"Fitness value of the best solution = {solution_fitness}")
    print(f"Index of the best solution : {solution_idx}")

@ahmedfgad
Owner

@whubaichuan,

Thanks for your notes.

For the first point, the documentation will be updated to include these details about the cal_pop_fitness() method:

The cal_pop_fitness() method calculates and returns the fitness values of the solutions in the current population.
This function is optimized to save time by making fewer calls to the fitness function. It follows this process:

  1. If the save_solutions parameter is set to True, then it checks if the solution is already explored and saved in the solutions instance attribute. If so, then it just retrieves its fitness from the solutions_fitness instance attribute without calling the fitness function.
  2. If save_solutions is set to False or if it is True but the solution was not explored yet, then the cal_pop_fitness() method checks if the keep_elitism parameter is set to a positive integer. If so, then it checks if the solution is saved into the last_generation_elitism instance attribute. If so, then it retrieves its fitness from the previous_generation_fitness instance attribute.
  3. If the fitness was not retrieved in the steps above (because save_solutions is set to False, or it is True but the solution was not explored yet, or keep_elitism is set to zero), then the cal_pop_fitness() method checks if the keep_parents parameter is set to -1 or a positive integer. If so, then it checks if the solution is saved into the last_generation_parents instance attribute. If so, then it retrieves its fitness from the previous_generation_fitness instance attribute.
  4. If the fitness was not retrieved in any of the steps above, then the fitness function has to be called to calculate the fitness for the solution. This is done by calling the function assigned to the fitness_func parameter.

This function takes into consideration:

  1. The parallel_processing parameter to check whether parallel processing is in effect.
  2. The fitness_batch_size parameter to check if the fitness should be calculated in batches of solutions.

It returns a vector of the solutions' fitness values.
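
The lookup order described above can be sketched in plain Python. This is only one reading of the description, not the library's implementation: the FitnessCache class, its dictionary fields, and population_fitness are hypothetical names, solutions are keyed as tuples for simplicity, and the sketch assumes that the keep_parents step is only considered when keep_elitism is zero.

```python
from dataclasses import dataclass, field


@dataclass
class FitnessCache:
    """Hypothetical container for previously calculated fitness values."""
    save_solutions: bool = False
    keep_elitism: int = 1
    keep_parents: int = -1
    explored: dict = field(default_factory=dict)  # all explored solutions
    elitism: dict = field(default_factory=dict)   # last generation's elitism
    parents: dict = field(default_factory=dict)   # last generation's parents


def population_fitness(population, cache, fitness_func):
    """Return the population's fitness, calling fitness_func only if needed."""
    results = []
    for idx, solution in enumerate(population):
        key = tuple(solution)
        if cache.save_solutions and key in cache.explored:
            # Step 1: the solution was explored and saved earlier.
            value = cache.explored[key]
        elif cache.keep_elitism > 0 and key in cache.elitism:
            # Step 2: the solution was kept as elitism.
            value = cache.elitism[key]
        elif (cache.keep_elitism == 0
              and (cache.keep_parents == -1 or cache.keep_parents > 0)
              and key in cache.parents):
            # Step 3: the solution was kept as a parent.
            value = cache.parents[key]
        else:
            # Step 4: nothing cached, so call the fitness function.
            value = fitness_func(solution, idx)
        results.append(value)
    return results


evaluated = []  # indices for which the fitness function was really called


def counting_fitness(solution, solution_idx):
    evaluated.append(solution_idx)
    return sum(solution)


cache = FitnessCache(keep_elitism=1, elitism={(4, 4, 4): 12})
fitness_values = population_fitness([[4, 4, 4], [1, 2, 3]], cache,
                                    counting_fitness)
```

In this usage the elite solution [4, 4, 4] re-uses its stored fitness, so the fitness function is called only for [1, 2, 3].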

For the second point, you are right. Even if save_best_solutions=False, the fitness values of the best solutions are saved into the best_solutions_fitness attribute. Note that this comes with a minor increase in computational time.

ahmedfgad added a commit that referenced this issue Feb 22, 2023
PyGAD 2.19.0 Release Notes
1. A new `summary()` method is supported to return a Keras-like summary of the PyGAD lifecycle.
2. A new optional parameter called `fitness_batch_size` is supported to calculate the fitness function in batches. If it is assigned the value `1` or `None` (default), then the normal flow is used where the fitness function is called for each individual solution. If the `fitness_batch_size` parameter is assigned a value satisfying this condition `1 < fitness_batch_size <= sol_per_pop`, then the solutions are grouped into batches of size `fitness_batch_size` and the fitness function is called once for each batch. In this case, the fitness function must return a list/tuple/numpy.ndarray with a length equal to the number of solutions passed. #136.
3. The `cloudpickle` library (https://github.com/cloudpipe/cloudpickle) is used instead of the `pickle` library to pickle the `pygad.GA` objects. This solves the issue of having to redefine the functions (e.g. fitness function). The `cloudpickle` library is added as a dependency in the `requirements.txt` file. #159
4. Support of assigning methods to these parameters: `fitness_func`, `crossover_type`, `mutation_type`, `parent_selection_type`, `on_start`, `on_fitness`, `on_parents`, `on_crossover`, `on_mutation`, `on_generation`, and `on_stop`. #92 #138
5. Validating the output of the parent selection, crossover, and mutation functions.
6. The built-in parent selection operators return the parent's indices as a NumPy array.
7. The outputs of the parent selection, crossover, and mutation operators must be NumPy arrays.
8. Fix an issue when `allow_duplicate_genes=True`. #39
9. Fix an issue creating scatter plots of the solutions' fitness.
10. Sampling from a `set()` is no longer supported in Python 3.11. Instead, sampling happens from a `list()`. Thanks `Marco Brenna` for pointing to this issue.
11. The lifecycle is updated to reflect that the new population's fitness is calculated at the end of the lifecycle not at the beginning. #154 (comment)
12. There was an issue when `save_solutions=True` that caused the fitness function to be called for solutions that were already explored and had their fitness pre-calculated. #160
13. A new instance attribute named `last_generation_elitism_indices` added to hold the indices of the selected elitism. This attribute helps to re-use the fitness of the elitism instead of calling the fitness function.
14. Fewer calls to the `best_solution()` method, which in turn saves some calls to the fitness function.
15. Some updates in the documentation to give more details about the `cal_pop_fitness()` method. #79 (comment)
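
The batch behaviour described in item 2 of the release notes can be sketched in plain Python. This is an illustration under stated assumptions, not PyGAD code: batch_fitness and evaluate_in_batches are hypothetical names, and the only property taken from the notes is that the fitness function is called once per batch and must return one value per solution passed.

```python
def batch_fitness(solutions, solution_indices):
    """Hypothetical batch fitness: one value per solution in the batch."""
    return [sum(solution) for solution in solutions]


def evaluate_in_batches(population, fitness_batch_size):
    """Group solutions into batches and call batch_fitness once per batch."""
    fitness = []
    calls = 0
    for start in range(0, len(population), fitness_batch_size):
        batch = population[start:start + fitness_batch_size]
        indices = list(range(start, start + len(batch)))
        values = batch_fitness(batch, indices)
        if len(values) != len(batch):
            raise ValueError("fitness must return one value per solution")
        fitness.extend(values)
        calls += 1
    return fitness, calls


population = [[1, 2, 3], [4, 4, 4], [2, 2, 2], [3, 1, 1]]
fitness, calls = evaluate_in_batches(population, fitness_batch_size=2)
```

With 4 solutions and a batch size of 2, the fitness function is called twice instead of four times.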