Can not control the number of max_evals or max_iters #376 (Closed)

hanamthang opened this issue Feb 13, 2022 · 11 comments

hanamthang commented Feb 13, 2022

Hi,

Thank you very much for your hard work creating NiaPy, which has great potential to improve my research :-))))

Could you please explain the difference between the parameters max_evals and max_iters in the task? And how can I control them?
I tried to minimize the mean squared error during feature selection using the Firefly Algorithm and a CatBoost regressor. However, when I set max_evals = 10 as follows:
task = Task(problem, max_evals=10, optimization_type=OptimizationType.MINIMIZATION, enable_logging=True)
it only ran 5 evaluations:

INFO:niapy.task.Task:evals:1 => 446.13928695170216
INFO:niapy.task.Task:evals:5 => 433.54774275563943
Number of selected features: 21

or when I set max_iters = 10 as follows:
task = Task(problem, max_iters=10, optimization_type=OptimizationType.MINIMIZATION, enable_logging=True)
it ran 2423 evaluations:

INFO:niapy.task.Task:evals:1 => 446.13928695170216
INFO:niapy.task.Task:evals:5 => 433.54774275563943
INFO:niapy.task.Task:evals:11 => 428.94143771504224
INFO:niapy.task.Task:evals:20 => 422.539381286218
INFO:niapy.task.Task:evals:28 => 412.32678534520574
INFO:niapy.task.Task:evals:30 => 412.07734133808253
INFO:niapy.task.Task:evals:109 => 411.98004342657293
INFO:niapy.task.Task:evals:139 => 400.99684114079884
INFO:niapy.task.Task:evals:442 => 393.40534326526745
INFO:niapy.task.Task:evals:1900 => 393.07398868489685
INFO:niapy.task.Task:evals:2423 => 378.8922834335721
Number of selected features: 22

It seems a stopping criterion was set and the algorithm only stopped when it reached that criterion. Is that correct? And should I use max_evals or max_iters for my problem?

Many thanks,
Thang

zStupan (Contributor) commented Feb 13, 2022

Hello,

You are correct: max_iters and max_evals are stopping criteria. max_iters controls the maximum number of iterations (generations) of the algorithm, while max_evals is the maximum number of fitness function evaluations.

The reason your first example only logs 5 evaluations is that a fitness value is only logged when there is an improvement over the current global best solution. In this case it ran 10 evaluations, but there was no improvement after the 5th one.

I'd say in your case, you're better off going with max_iters as your stopping condition.
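For example, here is a minimal sketch contrasting the two criteria (Sphere is just a stand-in for your own feature-selection problem, and the import paths assume NiaPy 2.x):

from niapy.algorithms.basic import FireflyAlgorithm
from niapy.problems import Sphere
from niapy.task import Task, OptimizationType

problem = Sphere(dimension=10)  # stand-in for your own problem

# Stop after at most 1000 fitness evaluations, regardless of the iteration count.
evals_task = Task(problem, max_evals=1000,
                  optimization_type=OptimizationType.MINIMIZATION, enable_logging=True)

# Stop after at most 10 iterations (generations); how many evaluations that costs
# depends on the population size and the algorithm itself.
iters_task = Task(problem, max_iters=10,
                  optimization_type=OptimizationType.MINIMIZATION, enable_logging=True)

algorithm = FireflyAlgorithm(population_size=30)
best_x, best_fitness = algorithm.run(iters_task)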

hanamthang (Author) commented

Oh, I got it. Could you explain why the number of evaluations was 2423 when I set max_iters to 10? And how can I force the algorithm to run 10 evaluations with max_iters = 10?

Many thanks,
Thang

zStupan (Contributor) commented Feb 14, 2022

You could do task = Task(problem, max_evals=10, max_iters=10, optimization_type=OptimizationType.MINIMIZATION, enable_logging=True), but if you set max_evals to 10 it will just calculate the fitness of the first 10 individuals in the initial population and return the best one of them. It won't actually do 10 iterations.

Could you help me explain why the number of evaluations were 2423 when I set the max_iters to 10?

The logger only prints the evaluation count at which the fitness improved over the current global best. The algorithm actually performed more evaluations, but there was no improvement after the 2423rd one. For example, if we ran DE with a population size of 100 and max_iters = 10, it would perform 100 + 10 * 100 = 1100 evaluations, but if there were no improvements past the 900th one, the last logged output would be the fitness at evaluation 900.
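As a rough check of that arithmetic, something like this should work (Sphere is a stand-in problem, and reading task.evals assumes the Task exposes its evaluation counter):

from niapy.algorithms.basic import DifferentialEvolution
from niapy.problems import Sphere
from niapy.task import Task

task = Task(Sphere(dimension=10), max_iters=10)
algorithm = DifferentialEvolution(population_size=100)
best_x, best_fitness = algorithm.run(task)

# Roughly 100 (initial population) + 10 * 100 (one evaluation per individual
# per iteration) = 1100 evaluations in total.
print(task.evals)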

hanamthang (Author) commented

Thank you very much for the quick and detailed explanation. I would like to force the algorithm to run with max_iters so I can compare it with metaheuristic algorithms from other packages (a whale optimization algorithm in another library, for instance) on the same axes of iterations and MSE. It seems we cannot get the desired number of iterations in this case.

I just wonder: in my example using the Firefly Algorithm with a population size of 30 and max_iters = 10, the number of evaluations should be 30 + 10 * 30 = 330, but the algorithm reported ~2500 evaluations. So I missed something here, right?

zStupan (Contributor) commented Feb 14, 2022

The Firefly Algorithm is a bit different: in each iteration there are nested for loops that calculate the distance between each pair of fireflies, move them, and evaluate them again, so it performs far more function evaluations per iteration.
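A simplified, self-contained sketch (not NiaPy's actual implementation) of why a single iteration can cost on the order of population_size**2 evaluations:

import numpy as np

population_size, dimension = 30, 10
rng = np.random.default_rng(0)
fireflies = rng.uniform(-5, 5, (population_size, dimension))

def evaluate(x):
    return float(np.sum(x ** 2))  # stand-in fitness function

fitness = np.array([evaluate(x) for x in fireflies])
evals = population_size  # evaluations spent on the initial population

# One Firefly iteration: every firefly is compared with every other one, and
# each move triggers a re-evaluation inside the nested loop.
for i in range(population_size):
    for j in range(population_size):
        if fitness[j] < fitness[i]:  # firefly j is "brighter", so i moves towards it
            fireflies[i] += 0.5 * (fireflies[j] - fireflies[i])
            fitness[i] = evaluate(fireflies[i])
            evals += 1

print(evals)  # far more than population_size for just one iteration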

hanamthang (Author) commented Feb 14, 2022

So, if I want to compare the optimization performance of various algorithms (say, Firefly, PSO, ...) using convergence speed (of MSE) over a number of iterations, could you advise me on an approach for this case?

zStupan (Contributor) commented Mar 1, 2022

Hello,

My sincerest apologies for the late response. Unfortunately, we currently only support convergence by number of evaluations: task.plot() plots a convergence graph with the number of evaluations on the x axis and the fitness function values on the y axis (we should probably add support for convergence by iterations, @firefly-cpp).

Unfortunately, the best you can do right now is to use max_evals, set it to, say, 100 000 for each algorithm, and compare the results that way. If you just want the data from the graph, task.return_conv() returns a tuple of (evaluations, fitness values).
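For instance, a rough sketch of that workaround (Sphere stands in for your own problem; only two of the algorithms are shown):

from niapy.algorithms.basic import FireflyAlgorithm, ParticleSwarmOptimization
from niapy.problems import Sphere
from niapy.task import Task

for algorithm in (FireflyAlgorithm(), ParticleSwarmOptimization()):
    task = Task(Sphere(dimension=10), max_evals=100_000)
    algorithm.run(task)
    evals, fitness = task.return_conv()  # x axis: evaluations, y axis: best fitness so far
    print(type(algorithm).__name__, fitness[-1])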

I'll try to add convergence by iterations by the end of this week; it shouldn't be complicated. Then we can hopefully resolve this issue.

hanamthang (Author) commented

I understand and will wait for your next update.

Many thanks,
Thang

zStupan (Contributor) commented Mar 4, 2022

@hanamthang I've submitted a PR that will hopefully provide the features you are looking for. You'll be able to plot a convergence graph by running task.plot_convergence(x_axis='iters'), or get the raw data by running task.convergence_data(x_axis='iters'), which returns two numpy arrays: the x axis and the y axis. The first is an array of ints from 0 to the number of iterations, and the second is the best global fitness at each iteration.

zStupan (Contributor) commented Mar 6, 2022

So, if I want to compare the optimization performance of various algorithms (say, Firefly, PSO, ...) using convergence speed (of MSE) over a number of iterations, could you advise me on an approach for this case?

Now in version 2.0.1 you could do something like:

from niapy.algorithms.basic import DifferentialEvolution, FireflyAlgorithm, ParticleSwarmOptimization
from niapy.task import Task

# problem = your feature-selection Problem instance
algorithms = [FireflyAlgorithm(), ParticleSwarmOptimization(), DifferentialEvolution()]
convergence = []
for algorithm in algorithms:
    task = Task(problem, max_iters=50)
    best_x, best_mse = algorithm.run(task)
    _, fitness = task.convergence_data(x_axis='iters')  # best fitness at each iteration
    convergence.append(fitness)

ff_fit, pso_fit, de_fit = convergence  # unpack
# plot ....
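One way to finish the "# plot ...." step, continuing from the block above (matplotlib is just one option):

import matplotlib.pyplot as plt

iterations = range(len(ff_fit))
plt.plot(iterations, ff_fit, label='Firefly Algorithm')
plt.plot(iterations, pso_fit, label='PSO')
plt.plot(iterations, de_fit, label='DE')
plt.xlabel('Iteration')
plt.ylabel('MSE')
plt.legend()
plt.show()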

hanamthang (Author) commented

Thank you so much for your work and the example code. I will test it with the new version of NiaPy.

You can close the issue now.

Many thanks,
Thang
