Add option threads? #54

Open
HDembinski opened this issue Jul 16, 2020 · 4 comments

Comments

@HDembinski
Member

One of the first questions I got after the presentation on resample at PyHEP was about parallelization.

In principle, resampling methods are perfectly parallelizable, assuming that fn is pure (has no side effects). That is generally a reasonable assumption. In Python, there are many ways to parallelize: you may want to use your own cores, a cluster of computers, or the cloud. Therefore, offering direct access to resample is good, because it allows the user to choose their own parallelization scheme.
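To illustrate what choosing a scheme could look like, here is a minimal sketch using only the standard library. The names `replicas` and `apply_to_replicas` are hypothetical stand-ins invented for this example, not the library's API. The idea is that the replicas come from a plain generator and the caller passes in whatever map implementation they prefer.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def replicas(sample, size, seed=None):
    """Hypothetical stand-in for a replica generator (not the library's API)."""
    rng = random.Random(seed)
    n = len(sample)
    for _ in range(size):
        # Draw n values with replacement from the original sample.
        yield rng.choices(sample, k=n)


def apply_to_replicas(fn, reps, map_fn=map):
    """Evaluate fn on every replica using a caller-supplied map implementation."""
    return list(map_fn(fn, reps))


data = [1.0, 2.0, 3.0, 4.0, 5.0]

# Serial evaluation with the built-in map.
serial = apply_to_replicas(statistics.mean, replicas(data, 100, seed=1))

# Same computation, but the caller picks a thread pool as the backend.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = apply_to_replicas(
        statistics.mean, replicas(data, 100, seed=1), map_fn=pool.map
    )
```

Any callable with the signature of `map` would fit here, so a process pool or a cluster client with a compatible `map` method could be swapped in without touching the resampling code.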

For the simple common cases, however, we may want to offer a threads option to our methods, which would compute fn on the replicas using that number of threads on the current computer, to better utilize common multi-core processors. This would be an option for the functions bootstrap and jackknife and those that build on them, e.g. bias and variance. @dsaxton What do you think?

@dsaxton
Collaborator

dsaxton commented Jul 17, 2020

I think it makes sense, although I wouldn't know how to implement it. Are there options built into numpy and scipy that can be used?

@HDembinski
Member Author

HDembinski commented Jul 18, 2020

Parallel execution is easy to implement with concurrent.futures.ThreadPoolExecutor, I can do that. It is mainly a question of whether we want to add this. I think it would be useful and convenient, but you were worried a while ago about adding too many keywords, that's why I bring it up before coding something.
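A rough sketch of what such a keyword could look like, assuming a simplified bootstrap that draws replicas with the standard library's `random` module. The function name `bootstrap_sketch` and its signature are made up for illustration and are not the library's actual implementation.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def bootstrap_sketch(fn, sample, size=100, threads=None, seed=None):
    """Hypothetical bootstrap with an optional threads keyword (illustration only)."""
    rng = random.Random(seed)
    n = len(sample)
    # Replicas are drawn in the calling thread, so the random stream does not
    # depend on how the work is scheduled afterwards.
    reps = [rng.choices(sample, k=n) for _ in range(size)]
    if threads is None or threads <= 1:
        # Default: serial evaluation, same behavior as without the keyword.
        return [fn(r) for r in reps]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map returns results in the order of the inputs.
        return list(pool.map(fn, reps))


data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
serial_means = bootstrap_sketch(statistics.mean, data, size=50, seed=0)
threaded_means = bootstrap_sketch(statistics.mean, data, size=50, threads=4, seed=0)
```

Because the keyword defaults to serial evaluation, existing calls would keep working unchanged. One caveat: in CPython, threads only give a speed-up when fn spends its time in code that releases the global interpreter lock, so the benefit depends on fn.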

@dsaxton
Collaborator

dsaxton commented Jul 18, 2020

> Parallel execution is easy to implement with concurrent.futures.ThreadPoolExecutor, I can do that. It is mainly a question of whether we want to add this. I think it would be useful and convenient, but you were worried a while ago about adding too many keywords, that's why I bring it up before coding something.

I'd be in favor of adding it. I think it could go after 1.0.0 since it should be fully backwards compatible?

@HDembinski
Member Author

True!
