add support for running shell commands asynchronously with run_shell_cmd #4444

Merged
merged 8 commits into from Feb 21, 2024


boegel
Member

@boegel boegel commented Jan 19, 2024

TODO:

  • add support in run_shell_cmd for using asynchronous=True
  • use run_shell_cmd with asynchronous=True in EasyBlock.skip_extensions_parallel
  • use run_shell_cmd with asynchronous=True in Extension.async_cmd_start
  • test with R easyconfig and --parallel-extensions-install (and --skip)
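
As a rough illustration of the idea behind the first TODO item, here is a minimal sketch of starting several shell commands in the background and collecting their results afterwards. It uses only the Python standard library. The names `run_cmd`, `run_cmds_async` and `max_workers` are hypothetical and do not come from EasyBuild; the actual `run_shell_cmd` signature and return value may well differ.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_cmd(cmd):
    """Run a shell command to completion; return (exit_code, output)."""
    proc = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.returncode, proc.stdout


def run_cmds_async(cmds, max_workers=4):
    """Start all commands in background threads; return {name: (exit_code, output)}.

    cmds is a dict mapping a (hypothetical) label to a shell command string.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately, so the commands run concurrently
        futures = {name: pool.submit(run_cmd, cmd) for name, cmd in cmds.items()}
        # result() blocks until the corresponding command has finished
        return {name: future.result() for name, future in futures.items()}
```

Calling `run_cmds_async({'first': 'echo hello', 'second': 'exit 3'})` would run both commands concurrently and return their exit codes and combined stdout/stderr output keyed by label.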

@boegel boegel added this to the 5.0 milestone Jan 19, 2024
Member Author

boegel commented Jan 19, 2024

Quick test:

  • eb R-4.2.2-foss-2022b.eb --skip --force (serial) took ~28m30s
  • eb R-4.2.2-foss-2022b.eb --skip --parallel-ext --experimental -f with 24 cores (RHEL8, AMD Rome) took ~16m55s (with EasyBuild 4.9.x it takes ~18m20s)

And that's only the skipping part being done in parallel...

@boegel boegel changed the title from "implement initial support for running shell commands asynchronously using run_shell_cmd (WIP)" to "add support for running shell commands asynchronously with run_shell_cmd (WIP)" on Jan 22, 2024
Member Author

boegel commented Feb 7, 2024

I've tested this with R-4.2.1-foss-2022a.eb, and saw a significant speedup with --parallel-extensions-install --experimental on 24 cores (RHEL8, AMD Rome):

  • with sequential installation of extensions (default):
    • 6h25min total time
    • 5h50min for extensions step
  • with --parallel-extensions-install --experimental:
    • 1h54min total time => 3.37x speedup
    • 1h21min for extensions step => 4.32x speedup for this specific step
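
For reference, the speedup figures can be recomputed from the timings listed above. This is only a sanity-check sketch; the helper names `to_minutes` and `speedup` are made up for illustration. Because the timings are rounded to whole minutes, the total comes out at about 3.38x here, in line with the 3.37x quoted above, which was presumably derived from more precise timings.

```python
def to_minutes(hours, minutes):
    """Convert an 'XhYYmin' timing to minutes."""
    return 60 * hours + minutes


def speedup(before, after):
    """Ratio of the sequential time to the parallel time."""
    return before / after


total = speedup(to_minutes(6, 25), to_minutes(1, 54))      # 385 min / 114 min
extensions = speedup(to_minutes(5, 50), to_minutes(1, 21))  # 350 min / 81 min

print(f"total: {total:.2f}x, extensions step: {extensions:.2f}x")
# prints "total: 3.38x, extensions step: 4.32x"
```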

The scaling is not spectacular, but I think that's mainly determined by the order in which extensions are considered, not so much by the implementation of running the extension installations in parallel in the background.
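
To illustrate why the order of consideration can limit scaling, here is a toy model. It is not how install_extensions_parallel works, just an assumed greedy scheduler with made-up task names, durations and dependencies. Tasks are considered in list order and started as soon as a worker is free and all their dependencies are done.

```python
def makespan(tasks, order, workers):
    """Simulate greedy in-order scheduling; return the total elapsed time.

    tasks: {name: (duration, [dependencies])}  (hypothetical data)
    order: list of task names, in the order in which they are considered
    """
    pending = list(order)
    running = {}  # task name -> time at which it finishes
    done = set()
    now = 0
    while pending or running:
        # start every ready task, in list order, while workers are free
        for name in list(pending):
            if len(running) >= workers:
                break
            duration, deps = tasks[name]
            if all(dep in done for dep in deps):
                running[name] = now + duration
                pending.remove(name)
        if not running:
            raise ValueError("remaining tasks have unsatisfiable dependencies")
        # advance the clock to the next task that finishes
        now = min(running.values())
        for name in [n for n, end in running.items() if end == now]:
            del running[name]
            done.add(name)
    return now


# Q depends on P; X and Y are short and independent
tasks = {'X': (1, []), 'Y': (1, []), 'P': (3, []), 'Q': (3, ['P'])}
print(makespan(tasks, ['X', 'Y', 'P', 'Q'], workers=2))  # prints 7
print(makespan(tasks, ['P', 'X', 'Y', 'Q'], workers=2))  # prints 6
```

With the same two workers and the same tasks, considering the long task that others depend on first shortens the total time, without any change to how the tasks themselves are run.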

So, I think this is ready for review/merge.

Do note that these changes require a corresponding change in easyblocks, see:

I prefer keeping the support for --parallel-extensions-install experimental for now, because I would like to revisit install_extensions_parallel later and improve things a bit.

There's a small additional change needed to make parallel installation of extensions work for easyconfigs like R-bundle-Bioconductor or R-bundle-CRAN, but I'll make that change in a separate PR; I don't want to bury it in this one.

@boegel boegel marked this pull request as ready for review February 7, 2024 09:53
@boegel boegel changed the title from "add support for running shell commands asynchronously with run_shell_cmd (WIP)" to "add support for running shell commands asynchronously with run_shell_cmd" on Feb 7, 2024
@easybuilders easybuilders deleted a comment from boegelbot Feb 21, 2024
@boegel boegel requested a review from lexming February 21, 2024 18:46
Contributor

@lexming lexming left a comment

LGTM

@lexming lexming merged commit c14a130 into easybuilders:5.0.x Feb 21, 2024
32 checks passed
@boegel boegel deleted the run_shell_cmd_async branch February 22, 2024 06:44