significantly speed up robot_find_subtoolchain_for_dep by specifying whether dep is resolved via an external module #2697
I noticed that the easyconfigs test suite became significantly slower after #2690 got merged, so I did a profile run to figure out where most time was being spent...
It turns out that a whopping 75% of time was spent in the `mod_exist_via_show` inner function of `ModulesTool.exist`, which is only used as a fallback check in case a (visible) module is not found via the list produced by `module avail`. This is required because a module name can be partial, in the case of external modules (cfr. https://easybuild.readthedocs.io/en/latest/Using_external_modules.html).
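
To illustrate the fallback described above, here is a minimal sketch of the lookup order. It is a toy stand-in, not EasyBuild's actual code: `AVAILABLE`, `fake_module_show` and `exist` are hypothetical names, and the "partial name" rule (a module name given without its version) is a simplifying assumption.

```python
# Toy model of the lookup order; NOT EasyBuild's actual implementation.
# Full module names, as they would be listed by `module avail` (assumed data).
AVAILABLE = ["GCC/4.9.2", "cray-mpich/7.0.4"]


def fake_module_show(name):
    """Pretend to run `module show <name>` (the expensive call in real life).

    Assumption: it succeeds for full names and for partial names,
    where a partial name is the part before the '/'.
    """
    return any(full == name or full.split("/")[0] == name for full in AVAILABLE)


def exist(names):
    """Check the `module avail` list first, fall back to `module show`."""
    results = []
    for name in names:
        if name in AVAILABLE:
            results.append(True)
        else:
            # Fallback: needed for partial names, but also triggered
            # for every name that simply does not exist.
            results.append(fake_module_show(name))
    return results
```

With this toy data, `exist(["GCC/4.9.2", "cray-mpich", "foo/1.0"])` finds the first name via the list, the second only via the fallback, and pays for a fallback call on the third without finding anything.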
`mod_exist_via_show` is expensive since it involves actually calling out to the modules tool (e.g. Lmod) every time. We already have a "show cache", which helps quite a bit (it cuts down the number of actual `module show` calls by about 75% in the case of the easyconfigs test suite), but we still call `module show` once for every module name that is not found in the output of `module avail`.

Since
`robot_find_subtoolchain_for_dep` knows when a dependency is resolved via an external module or not, it can pass down this information to `ModulesTool.exist` to avoid all these needless `module show` calls when we are certain that the module name is not partial (since that can only occur with dependencies marked as external modules).

It's worth mentioning that
`ModulesTool.exist` already handles hidden module names separately, and so a fallback to `show` is already in place for module names for which the actual module file starts with a `.`.

For modules that are hidden via Lmod's feature to hide modules via
`.modulerc`, we're also safe, since we run `module --show-hidden avail` with Lmod, and so hidden modules will be listed.

This change has a dramatic effect on the time needed for the easyconfigs test suite: it was cut down from ~2550s to ~550s (on my laptop).
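
The idea behind the change can be sketched as follows. This is a simplified model under stated assumptions, not the code in this PR: `ToyModulesTool` is a made-up class, and the `maybe_partial` keyword is a hypothetical name for the flag the caller passes down to say whether a module name could be partial.

```python
class ToyModulesTool:
    """Toy model with a show cache and a call counter; NOT EasyBuild's class."""

    def __init__(self, available):
        self.available = set(available)  # names listed by `module avail`
        self.show_cache = {}             # module name -> result of `module show`
        self.show_calls = 0              # number of (simulated) expensive calls

    def _show(self, name):
        """Simulate `module show <name>`, consulting the cache first."""
        if name not in self.show_cache:
            self.show_calls += 1
            # Assumption: a partial name is the part before the '/'.
            self.show_cache[name] = any(
                full == name or full.split("/")[0] == name
                for full in self.available
            )
        return self.show_cache[name]

    def exist(self, names, maybe_partial=True):
        """Check names; only fall back to `show` if names may be partial."""
        results = []
        for name in names:
            found = name in self.available
            if not found and maybe_partial:
                found = self._show(name)
            results.append(found)
        return results


tool = ToyModulesTool(["GCC/4.9.2", "cray-mpich/7.0.4"])

# Regular dependencies: names are never partial, so no fallback is needed.
regular = tool.exist(["foo/1.0", "bar/2.0"], maybe_partial=False)

# Dependency resolved via an external module: the name may be partial.
external = tool.exist(["cray-mpich"], maybe_partial=True)
```

In this sketch, the two missing regular module names cost no `show` calls at all, while the external module name still gets resolved through the fallback (and is cached for later lookups).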