Options.npopulations = nothing, does not detect number of cores #38

Closed
cobac opened this issue Jul 13, 2021 · 2 comments

cobac commented Jul 13, 2021

Hello,

From the API documentation, I understand that the program should run as many populations as there are cores when Options.npopulations is set to nothing. However, that is not what happens. Am I understanding correctly what this option is supposed to do?

Running Julia with -t 5:

```julia
using SymbolicRegression

x = rand(2, 30)
f(x) = x[1] + x[2]
y = f.(eachcol(x))

opt1 = Options()
opt2 = Options(npopulations = 5)

EquationSearch(x, y, niterations = 2, numprocs = 5, runtests = false, options = opt1)
# With opt1: 2 iterations in total

EquationSearch(x, y, niterations = 2, numprocs = 5, runtests = false, options = opt2)
# With opt2: 10 iterations in total as expected
```

MilesCranmer commented Jul 13, 2021

So, npopulations=nothing will trigger npopulations=nworkers() as expected. Note that -t 5 creates 5 threads, whereas -p 5 creates 5 processes. The latter is what will produce the correct number of populations.
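
To make the thread/process distinction concrete, here is a minimal sketch that uses only Julia's standard library (nothing from SymbolicRegression.jl). Run it once under `julia -t 5` and once under `julia -p 5` and compare the two counts; the variable names are just for illustration.

```julia
using Distributed  # standard library; provides nworkers() and nprocs()

# `julia -t 5` raises the thread count of the current process.
n_threads = Threads.nthreads()

# `julia -p 5` adds worker processes. With no workers added,
# nworkers() reports 1 (the main process itself).
n_workers = nworkers()

println("threads = ", n_threads, ", workers = ", n_workers)
```

Under `-t 5` alone no worker processes exist, so a default derived from `nworkers()` would not see the five threads.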

However, if you are manually creating processes like that, you will need to declare everything with Distributed.@everywhere, since SymbolicRegression.jl won't set it up automatically then.
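
As a rough illustration of what declaring things with `@everywhere` looks like, the sketch below adds two workers by hand and defines a function on all of them. `square_sum` is a hypothetical helper, not part of SymbolicRegression.jl; for the setup in this thread the analogous step would presumably be `@everywhere using SymbolicRegression`, plus any custom functions you define.

```julia
using Distributed

addprocs(2)  # same effect as starting Julia with `-p 2`

# Definitions made only on the main process are not visible on the
# workers, so declare them everywhere.
@everywhere square_sum(v) = sum(abs2, v)  # hypothetical helper

# For the package itself, the analogous line would be:
# @everywhere using SymbolicRegression

# Each input is evaluated on some worker process.
results = pmap(square_sum, [[1, 2], [3, 4]])
println(results)
```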

So this may be a mistake on my part: when SR sets up the processes itself via numprocs, it does not correct the number of populations. This is also problematic because you can pass multithreading=true to EquationSearch, in which case the number of populations should match the number of threads rather than the number of workers. I will see what I can do. Note that Options is immutable, so it would probably need to be re-created after calling EquationSearch, which might introduce other issues...
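
Until that is resolved, one possible workaround is to pick the population count yourself and pass it explicitly. The helper below is a sketch of that selection logic only; `pick_npopulations` is a hypothetical name and not something SymbolicRegression.jl provides.

```julia
using Distributed

# Hypothetical helper: match populations to threads when multithreading,
# otherwise to the number of worker processes.
function pick_npopulations(; multithreading::Bool)
    return multithreading ? Threads.nthreads() : nworkers()
end

n_pop = pick_npopulations(multithreading = true)
println("npopulations = ", n_pop)

# Assumed usage with the calls shown earlier in this issue:
# options = Options(npopulations = n_pop)
# EquationSearch(x, y, niterations = 2, multithreading = true, options = options)
```

Setting the value explicitly also sidesteps the immutability of Options, since nothing has to be rewritten after construction.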

Maybe the easiest solution is to just fix the number of populations to 20 or 40, as done in PySR, since that makes pipelines more reproducible!

cobac commented Jul 19, 2021

Oh right, thank you for the answer.

Yeah, I found the distributed-options logic a bit confusing, but that could definitely be just me, since I'm clearly not yet very familiar with how parallelism is handled in Julia. Glad to have at least nudged you to think about that loophole.

Thanks again!

cobac closed this as completed Jul 19, 2021