add missing int def header, fix install_name on mac, restore tests #102

Merged
minrk merged 10 commits into conda-forge:main from fix-header-indices
Dec 24, 2023

Conversation

minrk
Member

@minrk minrk commented Dec 19, 2023

  • fix int size to default of 32 on Windows
  • actually run mpi tests, which would have caught this
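One way to confirm which integer width an installed MUMPS was built with is to inspect its int-definition header. The sketch below assumes the header is the `mumps_int_def.h` that MUMPS builds generate and that it records the width as a `MUMPS_INTSIZE32` or `MUMPS_INTSIZE64` define; neither name appears in this PR's text, and `mumps_int_size` is a hypothetical helper.

```shell
# Report the integer width recorded in a MUMPS int-definition header.
# Assumption: the header contains "#define MUMPS_INTSIZE32" or
# "#define MUMPS_INTSIZE64"; check your own install before relying on this.
mumps_int_size() {
  if grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+MUMPS_INTSIZE64' "$1"; then
    echo 64
  elif grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+MUMPS_INTSIZE32' "$1"; then
    echo 32
  else
    # Neither define was found, so the width cannot be determined here.
    echo unknown
    return 1
  fi
}
```

For example, `mumps_int_size "$PREFIX/include/mumps_int_def.h"` (the path is an assumption) should report 32 for a build with the default integer size.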

closes #100

closes #103 because the install_name was wrong (revealed by the re-enabled tests)
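A test step could catch this kind of problem by reading each library's install name with `otool -D`. The following is a minimal sketch, not the check used in this PR: the helper names are hypothetical, and the expectation that the install name equals `@rpath/<file name>` is an assumption about the desired convention.

```shell
# Print the install name of a macOS dynamic library.
# `otool -D` prints the file name on the first line and the install name
# on the second, so the last line is the one of interest.
dylib_install_name() {
  otool -D "$1" | tail -n 1
}

# Succeed only if the install name is @rpath/<basename of the file>.
# Assumption: @rpath/<basename> is the wanted convention.
check_install_name() {
  expected="@rpath/$(basename "$1")"
  actual="$(dylib_install_name "$1")"
  if [ "$actual" != "$expected" ]; then
    echo "unexpected install name for $1: $actual (expected $expected)" >&2
    return 1
  fi
}
```

If a library fails the check, `install_name_tool -id <new name> <library>` is the usual way to rewrite its install name.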

- fix default int size to 32 on Windows
- actually run mpi tests
@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro
Contributor

Thanks @minrk! Should we mark the previous Windows build with 64-bit indices as broken? Fortunately conda-forge/conda-forge-pinning-feedstock#5274 was not merged, so I think we do not have any packages built against the 5.6.2 mumps packages.

@minrk
Member Author

minrk commented Dec 19, 2023

@traversaro yeah, that would probably be prudent

@minrk
Member Author

minrk commented Dec 19, 2023

linux mpich cross-compile builds are failing because of conda-forge/mpich-feedstock#86

linux openmpi builds are failing with:

+ /tmp/tmpmppakv2p/info/recipe/parent/mpiexec.sh -n 2 ./ssimpletest
--------------------------------------------------------------------------
There was a problem while initializing support for the CUDA reduction operations.
hostname:   9e20deb675ab
priority:   78
collective: scan
--------------------------------------------------------------------------

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

which I don't understand. I don't get why anything CUDA-related is being loaded, which obviously won't work.

@jakirkham
Member

cc @leofang (in case you have any ideas on the question above)

@@ -7,6 +7,7 @@ if [[ "$mpi" == "mpich" ]]; then
elif [[ "$mpi" == "openmpi" ]]; then
export OMPI_MCA_plm=isolated
export OMPI_MCA_rmaps_base_oversubscribe=yes
export OMPI_MCA_btl=tcp,self

Clueless here too. Just wondering: what happens if you set sm,self instead?

@dalcinl
Contributor

dalcinl commented Dec 21, 2023

@minrk @leofang The sm btl was available in v3, then they renamed it to vader in v4, and in v5 it is again named sm with vader as an alias scheduled for removal. Disclaimer: I'm not an Open MPI user, so I may have gotten any of these facts wrong.
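As a rough sketch of that naming history, a script could pick the shared-memory BTL name from the Open MPI major version. The mapping below simply encodes the description in this comment (v3: `sm`, v4: `vader`, v5: `sm`), which comes with its own disclaimer; `shared_memory_btl` is a hypothetical helper and takes the major version as an argument rather than detecting it.

```shell
# Map an Open MPI major version to the shared-memory BTL name.
# Assumption: v4 uses "vader" and every other version uses "sm",
# as described in the comment above (not verified here).
shared_memory_btl() {
  case "$1" in
    4) echo vader ;;
    *) echo sm ;;
  esac
}
```

It could be used as `export OMPI_MCA_btl="$(shared_memory_btl 4),self"`, which sets the variable to `vader,self`.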

Regarding the weird CUDA failure, maybe the way to go is to add --mca opal_cuda_support 0 in the mpiexec invocation within mpiexec.sh?
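A minimal sketch of that suggestion, assuming the wrapper forwards all of its arguments to `mpiexec`. The function name and the `MPIEXEC` override are hypothetical (the override only exists so the command line can be dry-run without Open MPI installed); the actual contents of mpiexec.sh are not shown in this thread.

```shell
# Run mpiexec with CUDA support explicitly disabled.
# MPIEXEC is a hypothetical override; it defaults to plain "mpiexec".
mpiexec_no_cuda() {
  "${MPIEXEC:-mpiexec}" --mca opal_cuda_support 0 "$@"
}
```

The failing test from the log would then be launched as `mpiexec_no_cuda -n 2 ./ssimpletest`. Open MPI also reads MCA parameters from `OMPI_MCA_`-prefixed environment variables, so `export OMPI_MCA_opal_cuda_support=0` would be the environment-variable form of the same setting.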

@leofang
Member

leofang commented Dec 21, 2023

Regarding the weird CUDA failure, maybe the way to go is to add --mca opal_cuda_support 0 in the mpiexec invocation within mpiexec.sh?

Yeah let's try that, though if it works then I don't know why the default we set (CUDA off) did not kick in.

Co-authored-by: Joseph Capriotti <josephrcapriotti@gmail.com>
@minrk minrk changed the title from "add missing int def header" to "add missing int def header, fix install_name on mac, restore tests" Dec 23, 2023
@minrk
Member Author

minrk commented Dec 23, 2023

Disabling CUDA works, but I realized vader was never actually tested on its own. I'm running that test now, and then I think this is ready to go. If vader fails, I don't understand the implications of why CUDA is being loaded here but apparently not in other openmpi runs.

@minrk
Member Author

minrk commented Dec 23, 2023

Switching to vader seems to have fixed it. No idea why using tcp, which we've used for ages, triggers a failure to load CUDA. That's bizarre.
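For reference, the Open MPI settings after that switch might look like the sketch below. The first two exports are copied from the diff context earlier in the thread; the `vader,self` value is inferred from this comment and the earlier `sm,self` suggestion, so the merged file may differ. `configure_openmpi_env` is a hypothetical wrapper used only to keep the sketch self-contained.

```shell
# Environment for the Open MPI test runs with the BTL switched from tcp to vader.
# Assumption: the merged change replaced "tcp,self" with "vader,self".
configure_openmpi_env() {
  export OMPI_MCA_plm=isolated
  export OMPI_MCA_rmaps_base_oversubscribe=yes
  export OMPI_MCA_btl=vader,self
}
```

Calling `configure_openmpi_env` before `mpiexec` applies these settings to every rank launched from the same shell.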

@minrk minrk merged commit d7796cf into conda-forge:main Dec 24, 2023
18 checks passed
@minrk minrk deleted the fix-header-indices branch December 24, 2023 06:35
@minrk minrk mentioned this pull request Jan 18, 2024
5 tasks
Successfully merging this pull request may close these issues.

incorrect install name in osx mumps libraries
Missing header file
6 participants