-
-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
{mpich,openmpi}: optionally link against slurm's libpmi2 #283071
base: master
Are you sure you want to change the base?
Conversation
pmixSupport ? false, | ||
withSlurm ? true, | ||
slurm, | ||
}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ended up just reformatting with nixfmt
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general comment - the reformatting and reordering of code and makes the PR quite difficult to read. If you reformat a package, please put it at least in a different commit. @markuskowa
Note: now in a separate commit
Closes NixOS#280406 (cherry picked from commit 3e98731)
9031f39
to
e767576
Compare
# PMIX support is likely incompatible with process managers (`--with-pm`) | ||
# https://github.com/NixOS/nixpkgs/pull/274804#discussion_r1432601476 | ||
pmixSupport ? false, | ||
withPmilib ? "slurm", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this now overlaps with pmixSupport
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, there's also a --with-pmi
, which could also take "slurm"
. On one hand, mirroring the upstream configuration options is a sane default. OTOH, upstream seems to maintain a variety of options that do the same or related things and potentially conflict with each other. We should expose the bare minimum that we can truly guarantee to work, plus a way to disable all of our heuristics and do the ad hoc job in an override
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably, when we pass both no there aren'tpmix
and slurm.lib
to buildInputs
there are two paths that provide libpmi2.so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Force-pushed again. Now explicitly mirroring upstream and erroring out on any surprising inputs by means of broken = true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased on master after the split-pmix PR
4a13c64
to
736e228
Compare
# https://www.open-mpi.org/faq/?category=buildcuda | ||
(lib.withFeature true "ucc") | ||
(lib.withFeature true "ucx") | ||
|
||
(lib.enableFeature (!cudaSupport) "mca-dso") | ||
(lib.withFeatureAs cudaSupport "cuda" cudaPackages.cuda_cudart) | ||
(lib.enableFeature cudaSupport "dlopen") | ||
|
||
(lib.withFeatureAs stdenv.isLinux "pmix" (lib.getDev pmix)) | ||
(lib.withFeatureAs stdenv.isLinux "pmix-libdir" "${lib.getLib pmix}/lib") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Motivation: prevents autoconfigure from guessing stuff.
Didn't migrate all of the flags because out of scope. The ones migrated I had already touched for other reasons
# PMIX support is likely incompatible with process managers (`--with-pm`) | ||
# https://github.com/NixOS/nixpkgs/pull/274804#discussion_r1432601476 | ||
enablePmix ? false, | ||
enablePmi1 ? false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damn... those are internal variables, not options I guess:
configure: WARNING: unrecognized options: --disable-pmix, --disable-pmi1, --enable-pmi2
6959731
to
b5108e5
Compare
] ++ lib.optionals (withPmi == "pmi1") [ | ||
"gforker" | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a failure saying something like "gforker needs pmi1 but you chose pmi2". Forgetting now, I should check again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking at pmodels/mpich#6662 and trying to think what could be the callPackage
interface that we could fix now and reuse once we update to 4.2.x
? The --with-pmi{1,2}
options are claimed to expect paths, except I think in case of pmix
their values are optional and just dropping stuff in buildInputs
still "works". There's no explicit option to mirror for slurm's pmi support, because --with-pmilib=slurm
was a temporary hack? And I don't think we want packages or strings with store paths as arguments because that might be bad for splicing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b5108e5
to
3dd880b
Compare
(cherry picked from commit c920deb49a83cf504b251588e9e2210a674ff407)
3dd880b
to
206d757
Compare
Description of changes
Closes #280406. Partially addresses #274584 (memory leaks are still there, but segfaults seem to be gone, at least on aalto's triton)
Things done
nix.conf
? (See Nix manual)sandbox = relaxed
sandbox = true
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)Add a 馃憤 reaction to pull requests you find important.