Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SuperLU_DIST] New Build #4672

Merged
merged 17 commits into from May 23, 2022
Merged

[SuperLU_DIST] New Build #4672

merged 17 commits into from May 23, 2022

Conversation

rayegun
Copy link
Contributor

@rayegun rayegun commented Mar 23, 2022

This isn't ready yet, I need to build the 64 bit XSDK_INDEX_SIZE, and split them. I'm going to attempt to do what I did with PETSc by renaming filename and SONAME in subdirectories. Hopefully the PETSc build system is resilient to those renamings (I believe it is, you can do --with-lib=...).

We do, also, have a Windows issue with PARMETIS that I need to investigate.

@rayegun
Copy link
Contributor Author

rayegun commented Mar 23, 2022

That's a new error. Hmm. I assume this is from the warning about libmetis

@rayegun
Copy link
Contributor Author

rayegun commented Mar 23, 2022

That's better, now I just need to fix the Windows build (msmpi should work) (I don't think they recommend using Parmetis there, which is good), and then build the 64-bit version in a different directory

@giordano
Copy link
Member

For the PowerPC error

[19:35:26] /workspace/destdir/lib/libopenblas.so: undefined reference to `GOMP_parallel@GOMP_4.0'
[19:35:26] /workspace/destdir/lib/libopenblas.so: undefined reference to `omp_get_num_places@OMP_4.5'

you need GCC 5 or 6, I don't remember exactly which one (I think the minimum that'd work is 5)

rayegun and others added 2 commits April 6, 2022 10:41
Co-authored-by: Boris Kaus <61824822+boriskaus@users.noreply.github.com>
@rayegun
Copy link
Contributor Author

rayegun commented Apr 6, 2022

What in the heck. Oh it's a libgfortran3 thing.

@vchuravy
Copy link
Member

vchuravy commented Apr 6, 2022

Want to turn off parametis for now?

@rayegun
Copy link
Contributor Author

rayegun commented Apr 6, 2022

Well I think we can just turn it off for Windows, the SuperLU_Dist docs show how. I won't be able to do that until this evening. Strangely ParMETIS is not listed in the optional deps section of SuperLU_Dist, so I'm not sure what Windows will lose in the process.

Then the last thing for SuperLU_Dist is to build the 64-bit version and rename it.

@ViralBShah
Copy link
Contributor

ViralBShah commented Apr 19, 2022

Maybe @xiaoyeli can help here. Sherry - I believe it should be possible to build SuperLU_Dist without partitioning support. I think eventually we want parmetis, but it may not necessarily be useful for all problems to do the nested dissection ordering.

@xiaoyeli
Copy link

You can disable (par)metis during installation with CMake:
-DHAVE_PARMEETIS=FALSE

The default ordering is serial Metis. Since you are not linking with Metis, in your code, you need to set ColPerm option as something else. A good alternative is:
options->ColPerm = MMD_AT_PLUS_A
(the default is METIS_AT_PLUS_A)

@rayegun
Copy link
Contributor Author

rayegun commented Apr 19, 2022

Thanks a bunch!

@ViralBShah
Copy link
Contributor

IMO, it should be ok to exclude Windows from MPI libraries.

@xiaoyeli
Copy link

One of the users did a Window port of superlu_dist, with MPI support. If you are savvy with CMake, you can find the Window related setup in top level CMakeLists.txt and SRC/CMakeLists.txt.

@rayegun
Copy link
Contributor Author

rayegun commented Apr 29, 2022

IMO, it should be ok to exclude Windows from MPI libraries.

Unfortunately this project is specifically at the request of someone looking for Windows support for a class (in the Fall I think?). For the remaining xSDK builders we need I don't really plan to build for Windows for this reason.

One of the users did a Window port of superlu_dist, with MPI support. If you are savvy with CMake, you can find the Window related setup in top level CMakeLists.txt and SRC/CMakeLists.txt.

The current problem seems to be CMake just not picking up MSMPI using the find_package command. I may need to get a CMake guru to help with that in the end. I've also tried manually specifying the required variables without much luck.

I'm still working on this problem, but I switched to work on some SuiteSparse (and SparseArrays) stuff this week that I'm almost done with.

@ViralBShah
Copy link
Contributor

I am pretty sure most MPI packages are never actually tested on Windows. @xiaoyeli Do you know if SuperLU-Dist works on windows with MSMPI?

I would love to learn more about this class and why do they want MPI on windows on the student's laptops.

@boriskaus
Copy link
Contributor

I am pretty sure most MPI packages are never actually tested on Windows. @xiaoyeli Do you know if SuperLU-Dist works on windows with MSMPI?

I would love to learn more about this class and why do they want MPI on windows on the student's laptops.

that would be me; I'm teaching a modelling class in the geosciences where students use our MPI-parallel/PETSc-based code LaMEM to perform simulations. Approximately 90% of the students have windows machines and there is nearly no-one with linux experience. PETSc_jll already works on windows in a basic version as does a BinaryBuilder version of our code LaMEM_jll that employs the PETSc_jll . Unfortunately for most simulations we need a PETSc_jll version that includes parallel direct solvers (SuperLU_Dist and MUMPS). That is something that has been pending since some time.

MSMPI itself works ok under windows. See, e.g., MAGEMin_jll, which we tested on various machines.

@ViralBShah
Copy link
Contributor

Thanks @boriskaus - that is very valuable. Does MPI parallel Petsc really make a huge difference on most laptops? Wouldn't you get sufficient parallelism from BLAS threads? Is using sequential Petsc for teaching, and a parallel one on linux clusters for scaling a good solution?

We'll naturally try to see if we can find a solution here - but just wondering if you still have a path if it is tough.

@ViralBShah
Copy link
Contributor

@Wimmerer One suggestion would be to get this merged without windows, so that upstream packages can be built, while windows support is worked on in a separate PR.

@boriskaus
Copy link
Contributor

boriskaus commented Apr 29, 2022

Most students try their setup locally before sending this off to a cluster, so if possible it would be nice to have this on windows. Yet, if that is too difficult, it is not a disaster (SuiteSparse is an option that could work as well). In fact, using the Windows Subsystem for Linux (WSL) would allow running this in parallel on modern windows laptops.

What would be important, though, is having PETSc_jll versions that are compiled with MUMPS & SuperLU_Dist (& perhaps SuiteSparse) for linux and mac. This PR does this, but hasn't been merged (most likely because the Yggdrasil team prefers to have separate and independent builds of SuperLU_Dist & MUMPS rather than building all through PETSc).

@rayegun
Copy link
Contributor Author

rayegun commented Apr 29, 2022

I'm working on the SuiteSparse PR next. We can drop the SuperLU_dist dependency on Windows maybe? And work on Windows further at a later date. I'll rebuild the PRs this evening in light of that plan.


# These are the platforms we will build for by default, unless further
# platforms are passed in on the command line
platforms = expand_gfortran_versions(supported_platforms())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why expanding the libgfortran version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Wimmerer bump

@giordano
Copy link
Member

I don't think SuperLU_DIST can be possibly compiled for Windows with MinGW: https://github.com/xiaoyeli/superlu_dist/blob/3810da3a85baf516bfeafc316f1833acfee5b625/SRC/sec_structs.c#L134 getline is not a standard function and is not provided by MinGW. I don't see how this can work (of course short of manually patching up an implementation of getline, which I'm not going to do).

giordano and others added 2 commits May 1, 2022 00:19
Note: this still doesn't work because the source code uses the non-standard
`getline`, not provided by MinGW.
@xiaoyeli
Copy link

xiaoyeli commented May 2, 2022

What compiler are you using?

I consulted a successful user, the following is his cmake script. He uses LLVM / clang compiler.
He said they are using MSMPI, without problem.
(His script was using an older version 5.4.0, but should work with the latest version.

'/winsame/user/dev2/contrib-llvm_vs2017/cmake-3.20.3-ser/bin/cmake.exe'
-DCMAKE_INSTALL_PREFIX:PATH=D:/winsame/user/dev2/contrib-llvm_vs2017/superlu_dist-5.4.0-par
-DCMAKE_BUILD_TYPE:STRING=Release
-DCMAKE_COLOR_MAKEFILE:BOOL=FALSE
-DCMAKE_VERBOSE_MAKEFILE:BOOL=TRUE
-DBUILD_SHARED_LIBS:BOOL=OFF
-DCMAKE_C_COMPILER:FILEPATH='C:/Program Files/LLVM/bin/clang-cl.exe'
-DCMAKE_C_FLAGS:STRING='/W3 -std=c99'
-DXSDK_ENABLE_Fortran:BOOL='OFF'
-Denable_blaslib:BOOL='OFF'
-Denable_parmetislib:BOOL='OFF'
-DCMAKE_C_FLAGS_RELEASE:STRING='/O2 /Ob2 /DNDEBUG'
-DCMAKE_CXX_FLAGS_RELEASE:STRING='/O2 /Ob2 /DNDEBUG'
-Denable_openmp:BOOL='FALSE'
-G 'NMake Makefiles JOM'
D:/winsame/user/vsimall-dev2/builds/superlu_dist-5.4.0

@giordano
Copy link
Member

giordano commented May 2, 2022

The problem is with GCC + MinGW. See for example https://stackoverflow.com/q/27381903/2442087

@xiaoyeli
Copy link

xiaoyeli commented May 3, 2022

@piyush314
'getline' is used only in getFreq() routine, which in turn is not so useful. Can we remove that? It causes trouble in gcc + MinGW (Windows)

@piyush314
Copy link

piyush314 commented May 3, 2022 via email

@xiaoyeli
Copy link

xiaoyeli commented May 3, 2022

@piyush314
Can you update the master branch to remove that part?

@xiaoyeli
Copy link

xiaoyeli commented May 4, 2022

@piyush314 has resolved this issue in master branch.

@giordano
Copy link
Member

giordano commented May 4, 2022

Applying this upstream patch we can now build for Windows, thanks!

This now looks good to me, except for the fact I don't know why this is expanding the libgfortran version, I believe we're building multiple duplicates. Input from @Wimmerer about this choice would be useful.

@ViralBShah
Copy link
Contributor

We should just remove the gfortran expansion in here. I think most people are not sure when to do it or copy a recipe from another package.

@ViralBShah
Copy link
Contributor

@xiaoyeli Do you have plans to tag a new release of superlu_dist with this fix included? If so, we can just use that instead of carrying the patch here.

@rayegun
Copy link
Contributor Author

rayegun commented May 5, 2022

I believe there was a reason, I did not have it on initial commit I don't think. But locally it no longer seems necessary, so I've removed.

I added 64-bit index builds so that's the only thing left to be checked I think. I think PETSc should happily consume these libraries despite the different names, though.

@xiaoyeli
Copy link

xiaoyeli commented May 5, 2022

@ViralBShah The patch is already in github master. I will not do a BugFix tag for the moment, because next week, we will have a new release with new functionality.

@xiaoyeli Do you have plans to tag a new release of superlu_dist with this fix included? If so, we can just use that instead of carrying the patch here.

@ViralBShah
Copy link
Contributor

Yes - we can definitely wait till the next release since it is quite soon.

@rayegun rayegun marked this pull request as ready for review May 23, 2022 16:18
@ViralBShah
Copy link
Contributor

I think this should be good to merge. We will only find out issues once we can try out these binaries which will happen post merging.

@giordano giordano enabled auto-merge (squash) May 23, 2022 20:15
@giordano giordano merged commit 9c17ee7 into JuliaPackaging:master May 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants