[DOC] Speed up fit computation with parallel processing #4108
Conversation
ymzayek
commented
Nov 13, 2023
- Closes Examples with long build time #3980
👋 @ymzayek Thanks for creating a PR! Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a GitHub draft. Please make sure it is compliant with our contributing guidelines. In particular, be sure it checks the boxes listed below.
For new features:
For bug fixes:
We will review it as quickly as possible. Feel free to ping us with questions if needed.
Codecov Report: all modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##             main    #4108      +/-   ##
==========================================
+ Coverage   91.63%   91.69%   +0.06%
==========================================
  Files         143      144       +1
  Lines       16115    16150      +35
  Branches     3353     3359       +6
==========================================
+ Hits        14767    14809      +42
+ Misses        800      796       -4
+ Partials      548      545       -3
```
Can you assess the impact on build time?
In the CI, plot_haxby_frem.html now builds in a bit over 2 minutes, and plot_compare_decomposition.html in about 4 minutes. The time depends on whether the data is downloaded or the cache is used; for this reference build the cache was not used. Locally, this change shaves about 2-3 minutes off each example.
Good with me.
Out of curiosity: how many CPUs can we use in the GitHub CI?
Are there other examples that could benefit from this to further reduce doc generation time? If yes, we should do it in this PR.
Good point. I'll check and update |
I set the examples to use the maximum CPUs available with n_jobs=-1. It seems each CPU is dual-core, so we end up with 4 processes.
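For reference, joblib (which nilearn uses under the hood for `n_jobs`) resolves `n_jobs=-1` to the number of CPUs visible to the process. A minimal sketch, using only the standard library, to inspect what a CI runner would give us (note that `os.cpu_count()` may not account for container CPU limits on some runners):

```python
import os

# Number of logical CPUs visible to this process; this is roughly what
# joblib resolves n_jobs=-1 to (container/cgroup limits may reduce it)
n_cpus = os.cpu_count()
print(f"n_jobs=-1 would use about {n_cpus} worker(s) here")
```

On the GitHub-hosted runners discussed above, this reports 4, matching the "dual-core, 4 processes" observation.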
that's luxurious! 💲
Please do not leave |
@bthirion yes, I thought about that but wanted to check the limit. It's probably safest to leave n_jobs=2 so as not to have people run into memory issues when running the examples as they are.
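The point of the fixed `n_jobs=2` (rather than `-1`) is that it caps how many fits run concurrently, which bounds peak memory on any machine a reader runs the example on. A thread-based stand-in for the worker processes joblib would spawn, sketched with the standard library (`fit_one` is a hypothetical placeholder for a single model fit):

```python
from concurrent.futures import ThreadPoolExecutor

def fit_one(n):
    # Hypothetical stand-in for fitting one model
    return sum(i * i for i in range(n))

# Analogue of n_jobs=2: at most two "fits" run at once, bounding
# peak memory regardless of how many CPUs the machine has
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(fit_one, [10, 20, 30, 40]))
print(results)
```

In the examples themselves this is just the `n_jobs=2` argument passed to the estimator's constructor or `fit` call; joblib then manages the worker pool.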
The build with n_jobs=2 (https://github.com/nilearn/nilearn/actions/runs/6892690674/job/18750572159) now takes ~1h10m, which is quite a speedup. This build used the data cache.
Should we apply that to other examples too?
@bthirion it's possible I've missed a few. I'll double-check.
LGTM!
Thanks, merging.