[DOC] Speed up fit computation with parallel processing #4108

Merged
merged 7 commits into nilearn:main on Nov 20, 2023

ymzayek
Member

@ymzayek ymzayek commented Nov 13, 2023

Contributor

👋 @ymzayek Thanks for creating a PR!

Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a GitHub draft.

Please make sure it is compliant with our contributing guidelines. In particular, be sure it checks the boxes listed below.

  • PR has an interpretable title.
  • PR links to GitHub issue with mention Closes #XXXX (see our documentation on PR structure)
  • Code is PEP8-compliant (see our documentation on coding style)
  • Changelog or what's new entry in doc/changes/latest.rst (see our documentation on PR structure)

For new features:

  • There is at least one unit test per new function / class (see our documentation on testing)
  • The new feature is demoed in at least one relevant example.

For bug fixes:

  • There is at least one test that would fail under the original bug conditions.

We will review it as quickly as possible; feel free to ping us with questions if needed.


codecov bot commented Nov 13, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (aa9c914) 91.63% compared to head (1bd04b5) 91.69%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4108      +/-   ##
==========================================
+ Coverage   91.63%   91.69%   +0.06%     
==========================================
  Files         143      144       +1     
  Lines       16115    16150      +35     
  Branches     3353     3359       +6     
==========================================
+ Hits        14767    14809      +42     
+ Misses        800      796       -4     
+ Partials      548      545       -3     
Flag Coverage Δ
macos-latest_3.10 91.60% <ø> (+0.06%) ⬆️
macos-latest_3.11 ?
macos-latest_3.12 ?
macos-latest_3.9 ?
ubuntu-latest_3.10 91.60% <ø> (+0.06%) ⬆️
ubuntu-latest_3.11 91.60% <ø> (+0.06%) ⬆️
ubuntu-latest_3.12 91.60% <ø> (+0.06%) ⬆️
ubuntu-latest_3.9 91.57% <ø> (+0.06%) ⬆️
windows-latest_3.10 91.56% <ø> (+0.06%) ⬆️
windows-latest_3.11 91.56% <ø> (+0.06%) ⬆️
windows-latest_3.12 91.56% <ø> (+0.06%) ⬆️
windows-latest_3.9 91.53% <ø> (+0.06%) ⬆️


@bthirion
Member

Can you assess the impact on build time?

@ymzayek
Member Author

ymzayek commented Nov 14, 2023

In the CI, plot_haxby_frem.html now builds in a bit over 2 minutes, and plot_compare_decomposition.html in about 4 minutes. These times depend on whether the data is downloaded or read from the cache; this reference build did not use the cache. Locally, this change shaves about 2-3 minutes off each example.

Collaborator

@Remi-Gau Remi-Gau left a comment

Good with me.
Out of curiosity: how many CPUs can we use in GitHub CI?

@bthirion
Member

Are there other examples that could benefit from this to further reduce doc generation time? If yes, we should do it in this PR.

@ymzayek
Member Author

ymzayek commented Nov 16, 2023

Good point. I'll check and update.

@ymzayek
Member Author

ymzayek commented Nov 16, 2023

I set the examples to use all available CPUs with n_jobs=-1. It seems each CPU is dual-core, so we end up with 4 processes.
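
For readers unfamiliar with the convention, here is a minimal sketch of how a negative `n_jobs` is usually turned into a worker count. It follows the rule documented by joblib (negative values count back from the number of CPUs), but `resolve_n_jobs` is a hypothetical helper written for illustration, not a function from nilearn or joblib.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Hypothetical helper: map an n_jobs value to a concrete worker count.

    Mirrors the convention documented by joblib: -1 means all CPUs,
    -2 means all but one, and so on. Positive values are used as given.
    """
    if n_cpus is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # Count back from the CPU total, but never go below one worker.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


# On a 4-CPU runner, n_jobs=-1 resolves to 4 workers and n_jobs=2 stays at 2.
workers_all = resolve_n_jobs(-1, n_cpus=4)
workers_two = resolve_n_jobs(2, n_cpus=4)
```

Under this convention, the 4 processes observed in CI would correspond to a runner reporting 4 CPUs.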

@Remi-Gau
Collaborator

that's luxurious! 💲

@bthirion
Member

Please do not leave n_jobs=-1 in the examples.

@ymzayek
Member Author

ymzayek commented Nov 16, 2023

@bthirion Yes, I thought about that but wanted to check the limit. It's probably safest to leave n_jobs=2 so that people don't run into memory issues when running the examples as they are.
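
As a rough illustration of what a fixed `n_jobs=2` means in practice, the sketch below runs independent "fits" with at most two workers at a time. It uses only the standard library and threads as a stand-in; nilearn delegates parallelism to joblib, so this is not the actual mechanism, and `fit_fold` and `fit_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_fold(fold):
    """Hypothetical stand-in for one independent fit: the mean of a fold."""
    return sum(fold) / len(fold)


def fit_all(folds, n_jobs=2):
    """Run fit_fold on every fold with at most n_jobs concurrent workers."""
    # max_workers caps concurrency, so only n_jobs folds are in flight
    # at once regardless of how many CPUs the machine has.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map preserves the input order of the folds.
        return list(pool.map(fit_fold, folds))


folds = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
scores = fit_all(folds, n_jobs=2)
```

Capping the worker count bounds how many copies of the data are being processed at once, which is the memory concern raised above.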

@ymzayek
Member Author

ymzayek commented Nov 17, 2023

The build with n_jobs=2 (https://github.com/nilearn/nilearn/actions/runs/6892690674/job/18750572159) now takes ~1h10m, which is quite a speedup. This is with the data cache in use.

@bthirion
Member

Should we apply that to other examples too?

@ymzayek
Member Author

ymzayek commented Nov 17, 2023

@bthirion It's possible I've missed a few. I'll double-check.

Member

@bthirion bthirion left a comment

LGTM!

@ymzayek
Member Author

ymzayek commented Nov 20, 2023

Thanks, merging.

@ymzayek ymzayek merged commit f4b03d4 into nilearn:main Nov 20, 2023
29 checks passed
@ymzayek ymzayek deleted the add-njobs branch November 20, 2023 08:14
Development

Successfully merging this pull request may close these issues.

Examples with long build time
3 participants