View and Set N Jobs #1029
Thanks @DhruvSrikanth - the changes look good, but I might not have understood the motivation for them. As far as I can see, all functions that use multiprocessing have an n_jobs setting that lets you change the number of jobs used. A quick search through the repo only gave me results where the default is used as a function parameter default, so you should be able to override it in any case.
Am I missing something?
class MultiprocessingDistributorTestCase(TestCase):
    def test_n_jobs(self):
Could you make sure that the default setting (as it was before running this test) is restored after the test, even if the test fails?
Just to be safe in case any other test assumes the default value is still set.
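One way to get the reset requested above is `unittest`'s `addCleanup`, which runs after the test whether it passed, failed, or raised. The sketch below is only an illustration: `get_n_jobs`, `set_n_jobs`, and the `_defaults` dictionary are hypothetical stand-ins, not the names used in this PR.

```python
import unittest

# Hypothetical stand-in for the module-level default touched by the PR.
_defaults = {"N_PROCESSES": 4}


def get_n_jobs():
    """Return the current default number of jobs (hypothetical helper)."""
    return _defaults["N_PROCESSES"]


def set_n_jobs(n_jobs):
    """Overwrite the default number of jobs (hypothetical helper)."""
    _defaults["N_PROCESSES"] = n_jobs


class NJobsResetTestCase(unittest.TestCase):
    def test_n_jobs(self):
        # Register the restore step before changing anything. Cleanups run
        # after the test method regardless of its outcome.
        self.addCleanup(set_n_jobs, get_n_jobs())

        set_n_jobs(1)
        self.assertEqual(get_n_jobs(), 1)
```

A `try`/`finally` block inside the test would work as well; `addCleanup` just keeps the restore step next to the line that captures the original value.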
Yup, I can do that. Should I make the changes and open a new PR?
You can just add the changes to this PR as they belong together :)
I was not able to pass the n_jobs parameter to some of the feature extractors, such as mean and median. I wanted to profile specific feature extractors, so I thought more control over the default number of processes would be useful. Happy to provide a more concrete example as well.
Yes, I might need a more concrete example, because a specific feature extractor like mean does not do any multiprocessing itself - multiprocessing is only used to process multiple time series in parallel. If there is only a single time series, changing the number of jobs will not help :)
I see. So each feature extractor uses a single process to perform the computation. What about the case where multiple features need to be extracted from the same time series? Would a single process handle all of the feature extractors or would
Irrespective of that, I believe it would be helpful to have a function to globally set the number of jobs to be used in any computation. I don't see it being more or less helpful than having
P.S. - Apologies for the late response.
The parallelization is only over the time series. A single time series is never processed in parallel, so multiple features for a single time series are calculated in the same process. Sure, we can still have the changes in - I just wanted to make sure you actually get what you expected :) After my comment is addressed, we can merge!
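The point above can be sketched with the standard library alone. This is not tsfresh's distributor, only a minimal illustration under assumed names (`FEATURES`, `features_for_one_series`, `extract`): the unit of work handed to a worker is one whole time series, and all features for that series are computed sequentially inside that worker.

```python
from multiprocessing import Pool
from statistics import mean, median

# Hypothetical feature set; each calculator is an ordinary single-process function.
FEATURES = {"mean": mean, "median": median, "maximum": max}


def features_for_one_series(item):
    """Compute every feature for one time series, sequentially, in one process."""
    series_id, values = item
    return series_id, {name: func(values) for name, func in FEATURES.items()}


def extract(series_by_id, n_jobs):
    """Distribute whole time series over up to n_jobs worker processes."""
    items = list(series_by_id.items())
    if n_jobs <= 1 or len(items) <= 1:
        # A single series leaves nothing to distribute, whatever n_jobs is.
        return dict(map(features_for_one_series, items))
    with Pool(processes=n_jobs) as pool:
        return dict(pool.map(features_for_one_series, items))
```

With one input series the pool is never created, which mirrors the comment above: raising the number of jobs only helps when there are several time series to spread across workers. When calling `extract` with `n_jobs` greater than 1 from a script, place the call under an `if __name__ == "__main__":` guard.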
8b2fdfb to 79ccfd1
I've incorporated the following changes:
Are there any others I should incorporate? @nils-braun
Thanks! Looks good!
Codecov Report
Patch and project coverage have no change.
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1029   +/-   ##
=======================================
  Coverage   93.43%   93.43%
=======================================
  Files          20       20
  Lines        1888     1888
  Branches      372      372
=======================================
  Hits         1764     1764
  Misses         85       85
  Partials       39       39
Thanks! How can I check if the PR has been merged into
I have just merged it. You can see it mentioned on this PR page.
Awesome! Thanks!!
Reason for PR:
When profiling specific functions, it is often helpful to have more fine-grained control over the N_PROCESSES constant used in multiprocessing.
PR Description:
Getter and setter functions for the N_PROCESSES constant found with the other default constants.
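A getter/setter pair of this kind might look roughly like the sketch below. All names here (`_Defaults`, `get_n_jobs`, `set_n_jobs`, `resolve_n_jobs`) and the way the default value is computed are assumptions made for illustration; they are not taken from the PR's diff.

```python
import os


class _Defaults:
    # Hypothetical stand-in for a defaults module; the value is illustrative.
    N_PROCESSES = max(1, (os.cpu_count() or 1) // 2)


def get_n_jobs():
    """Return the current default number of worker processes."""
    return _Defaults.N_PROCESSES


def set_n_jobs(n_jobs):
    """Set the default number of worker processes (non-negative integer)."""
    if not isinstance(n_jobs, int) or n_jobs < 0:
        raise ValueError("n_jobs must be a non-negative integer")
    _Defaults.N_PROCESSES = n_jobs


def resolve_n_jobs(n_jobs=None):
    """Look the default up at call time so a later set_n_jobs() is honoured."""
    return get_n_jobs() if n_jobs is None else n_jobs
```

One design point worth keeping in mind: in Python, a signature such as `def f(n_jobs=N_PROCESSES)` captures the value when the function is defined, so a setter called later only affects code that reads the default at call time, as `resolve_n_jobs` does here.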