Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using Homebrew MUMPS #2

Open
dpo opened this issue Apr 1, 2014 · 28 comments
Open

Using Homebrew MUMPS #2

dpo opened this issue Apr 1, 2014 · 28 comments

Comments

@dpo
Copy link

dpo commented Apr 1, 2014

I tried building MUMPS.jl against the shared MUMPS library built via Homebrew. An advantage is that we can call brew to discover the location of the library, so there's no need to package the MUMPS source code. Here's a sample Makefile:

prefix = $(shell brew --prefix mumps)
libdir = $(prefix)/lib
incdir = $(prefix)/include

SRC = mumps_cmplx_p.f90  mumps_p.f90 call_mumps_cmplx_p.f90 call_mumps_p.f90
LIB = -L$(libdir) -ldmumps -lzmumps -lmumps_common -lpord

%.o: %.f90
    gfortran -I$(incdir) -fPIC -fcray-pointer -c $<

all: ${SRC:.f90=.o}
    libtool -dynamic -undefined dynamic_lookup ${SRC:.f90=.o} -o ../lib/MUMPS $(LIB)

clean:
    rm ../lib/MUMPS

This Makefile assumes that the parallel MUMPS library was built. If the sequential library was built (with brew install mumps --without-mpi), incdir becomes $(prefix)/libexec/include and we must add -lmpiseq to LIB.

This compiles but it seems something needs to be updated in the interface:

$ julia ../tests/testTwoSystem.jl
Factorize complex matrix
WARNING: convert{T}(p::Type{Ptr{T}},a::Array) is deprecated, use convert(p,pointer(a)) instead.

(sorry I'm too new to Julia to know what that message means). I'm also getting a bunch of messages of the form:

 ** On entry to ZGEMM  parameter number  4 had an illegal value
@tkelman
Copy link
Contributor

tkelman commented Apr 10, 2014

Parameter number 4 to ZGEMM should be N, so this is probably the same issue as jump-dev/Ipopt.jl#1 (comment) - Mumps typically uses an LP64 blas, but Julia uses ILP64 OpenBlas by default. It looks like Mumps is being set up in this package to link to a shared-library blas, so zgemm_ is shadowed.

@lruthotto
Copy link
Contributor

Thanks @dpo for pointing into this very interesting direction. I agree that installing MUMPS using homebrew would be a great improvement! Not only more elegant but also giving access to MUMPS' parallel version.

The warning you mentioned seems to be independent from the MUMPS installation. At least I got the same warning, when running the tests on my machine using the current version of Julia. Unfortunately, I do not see either where it comes from.

Did the comment @tkelman made help you? I cannot try it on my machine yet as I have trouble installing MUMPS using homebrew.

@dpo
Copy link
Author

dpo commented Apr 23, 2014

I haven't had a chance to play with this. You might want to post the error message in the Julia user's forum to find out what it means. Please open an issue on homebrew/science if you have trouble installing Mumps.

@lruthotto
Copy link
Contributor

Hi Dominique,

sorry that it took me that long, but I now managed to install mumps using brew (seems like missing Xcode command line tools have caused my problems). I tried to use your makefile, but it failed with many errors. Here is the start:

gfortran -I/usr/local/opt/mumps/include -fPIC -fcray-pointer -c mumps_cmplx_p.f90
mumps_cmplx_p.f90:24.23:

TYPE (ZMUMPSbrew_STRUC),intent(inout):: mumps_par
                       1
Error: Derived type 'zmumpsbrew_struc' at (1) is being used before it is defined
mumps_cmplx_p.f90:28.13:

   mumps_par%SYM = sym  ! =0 unsymmetric, =1 symm. pos def, =2 general symm.
             1
Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
mumps_cmplx_p.f90:29.13:

   mumps_par%PAR = 1
             1
Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
mumps_cmplx_p.f90:31.13:

   mumps_par%JOB = -1  ! initialize

I'm leaving out some more errors of that type... here is how it ends:


Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
mumps_cmplx_p.f90:144.13:

   mumps_par%JOB = 2  !  factorization
             1
Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
Fatal Error: Error count reached limit of 25.
make: *** [mumps_cmplx_p.o] Error 1

@dpo : Do you have a clue what's going on? Is there anything I need to modify in the Fortran code?

Cheers,
Lars

@dpo
Copy link
Author

dpo commented Jul 17, 2014

Hi Lars. Sorry for the slow reply. I just pulled the git repo and it builds just fine against my homebrew Mumps with this Makefile (which is similar to the one I posted above). Does it not build for you?

The tests don't run for me though because it seems MPI isn't initialized. I quickly hacked mumps_p.f90 (here) but there's an error related to the LAPACK function DSYR:

 ** On entry to DSYR   parameter number  7 had an illegal value

You can probably tell more quickly than I what the problem is.

@lruthotto
Copy link
Contributor

Hey Dominique.

I did some extensive testing and asked around, but could not come to a solution. First of all, your makefile now works for me and I could test mumps with and without MPI as well as with openblas and with my Mac's blas.

Unfortunately, I could not reproduce your error message. Instead I am getting Segmentation faults!

This one when I link against brew's MUMPS without MPI support (for both BLAS versions):

include("testMUMPS.jl")
test MUMPS
Test for real SPD matrix: one rhs

signal (11): Segmentation fault: 11
??? at ???:110110743
Segmentation fault: 11

And this one when I link against brew's MUMPS with MPI (for both BLAS versions):

include("testDivGrad.jl")
Test for real SPD matrix: one rhs

signal (11): Segmentation fault: 11
??? at ???:110110743
[dhcp-128-189-89-68:81342] *** Process received signal ***
[dhcp-128-189-89-68:81342] Signal: Segmentation fault: 11 (11)
[dhcp-128-189-89-68:81342] Signal code:  (0)
[dhcp-128-189-89-68:81342] Failing at address: 0x0
[dhcp-128-189-89-68:81342] [ 0] 0   libsystem_c.dylib                   0x00007fff8723b90a _sigtramp + 26
[dhcp-128-189-89-68:81342] [ 1] 0   libdmumps.dylib                     0x000000011ad5a9b7 __dmumps_load_MOD_dmumps_467 + 82
[dhcp-128-189-89-68:81342] [ 2] 0   libopenblas.dylib                   0x000000010580e9ce dsyr_ + 302
[dhcp-128-189-89-68:81342] [ 3] 0   libdmumps.dylib                     0x000000011ad13cb0 dmumps_230_ + 142
[dhcp-128-189-89-68:81342] *** End of error message ***

signal (11): Segmentation fault: 11
??? at ???:110110743
Segmentation fault: 11

It also points to the DSYR function. People here have the feeling it might be some issue with linking BLAS dynamically (Julia may link to a different BLAS as well). But this is just a guess.

At this point we have no idea how to fix it. Sorry!

@dpo
Copy link
Author

dpo commented Aug 11, 2014

Hi Lars. I forked your repository yesterday and worked on it for a while. I found a way to refer to the library in a generic way, provided it's in the user's LD_LIBRARY_PATH. But I'm getting the same segmentation fault as you. Mumps and Julia use different BLAS libraries, but I'm not sure why that would be a problem. I'll contact some Julia folks to see if they have any idea.

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

Is Homebrew OpenBLAS built with 32 bit or 64 bit integers?

@dpo
Copy link
Author

dpo commented Aug 12, 2014

Mine is build with 64bit integers. But how does that matter? Mumps uses its own BLAS.

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

Right, and what does it expect as the integer size there? Unless it's linking to a static-library BLAS, the integer sizes will all have to be the same. Shared libraries will shadow each other. We have this problem all the time with Ipopt.

@dpo
Copy link
Author

dpo commented Aug 12, 2014

Aha. The BLAS used by MUMPS is the default brew OpenBLAS, so I presume it uses 32bit integers. In my fork, I carefully changed all integer data to 32bit because MUMPS uses default FORTRAN INTEGERs, which are 32bit. Are you saying MUMPS should link the Julia's OpenBLAS?

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

Either Mumps will need to link a static-library Blas that uses 32-bit integers, or you'll need to configure Mumps to use 64-bit integers in all Blas calls (this might be doable with -fdefault-integer=8 or some such, but I've never tried it)

Or, you can tell Julia and Blas and Mumps to use 32-bit integers everywhere (USE_BLAS64=0 I think it is), but the default binaries are not built this way AFAIK.

@dpo
Copy link
Author

dpo commented Aug 12, 2014

I see. In the second scenario, what if a user built Julia with a 32bit OpenBLAS?

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

Sorry yeah edited in a 3rd option - you'll need Julia and Mumps to be consistent in the integer size they expect from Blas, unless you link statically.

@dpo
Copy link
Author

dpo commented Aug 12, 2014

Ok, will give that a try. Unless there's a plan to include MUMPS in Julia at some point, I think we'll have to go with the static linking. I'm still not sure I totally understand the issue because MUMPS's BLAS lib is libopenblas while Julia's lib is libopenblas64. How can there be confusion?

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

Oh do you have openblas building under a different library name when it uses 64 bit integers? That's probably a good idea. I think the confusion and cross-talk is from having multiple libraries dynamically loaded that have the same function names, though I'm not an expert in how dynamic linking works I just have seen the consequences and ways that I've been able to fix it.

We did static linking (to netlib blas, but a better static blas should also work) for the purposes of Mumps within the Ipopt.jl package, see my original link from April.

I think the eventual best thing to do here is prefix all the function names for Julia's 64-bit-integer openblas to avoid this problem. That's the plan in the cross-referenced Julia issue 4923, it just hasn't been done yet.

@dpo
Copy link
Author

dpo commented Aug 12, 2014

That's tricky. Thanks a lot for the tip! That sounds like the likely culprit.

@dpo
Copy link
Author

dpo commented Aug 12, 2014

Well I just rebuilt MUMPS against a static 32bit OpenBLAS but I'm still getting the same segfault. Now MUMPS depends on Scalapack, which also links to its own BLAS (dylib for now). It's not realistic to require that everybody be linked statically.

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

Perhaps not. Until someone figures out the proper way to use objcopy as part of Julia's build system and writes the requisite macros around all blas ccalls to only add the prefixes when necessary, the package will only be usable when Julia is configured for 32-bit-integer blas.

@dpo
Copy link
Author

dpo commented Aug 12, 2014

Confirm: The MUMPS interface works with Julia32. Thanks @tkelman!

@dpo
Copy link
Author

dpo commented Aug 12, 2014

There's still a segfault when I try to solve two systems in a row. Somehow memory isn't cleared up properly and/or MUMPS isn't reinitialized properly.

@tkelman
Copy link
Contributor

tkelman commented Aug 12, 2014

@dpo if it works the first time but not the second, I suspect that's a bug in memory handling as you said, and probably has more to do with the package binding code here in MUMPS.jl than anything about the Julia build configuration.

@dpo
Copy link
Author

dpo commented Aug 16, 2014

Well I was not able to make this work, so I rewrote my own MUMPS interface (in C this time!) See https://github.com/dpo/MUMPS.jl. Deallocating / reallocating works fine (at least for me). It's real-only at the moment, but adding complex arithmetic is simple. The C interface also seems far simpler and easier to maintain.

@lruthotto
Copy link
Contributor

thanks! this looks really interesting. looking forward to testing it soon!

@dpo
Copy link
Author

dpo commented Aug 17, 2014

ps: the new interface also runs in parallel!

@lruthotto
Copy link
Contributor

hi dominique. I like your interface a lot better than our fortran interface (especially the access to all control parameters). I also agree it seems simpler to maintain.
However, did I understand it correctly, that is still is limited to Julia's 32 bit version? For our applications that would not be feasible. Are there new ideas how to combine Julia 64 bit and MUMPS?

@dpo
Copy link
Author

dpo commented Aug 18, 2014

Hi Lars. The 32 bit thing is only because my BLAS is 32 bit. If you can manage to build Julia against a 64 bit BLAS and not have a 32 bit BLAS shadow the functions of the first, you should be good to go. However, MUMPS uses default Fortran INTEGERs, which are 32 bit unless you compile with -fdefault-integer-8. In the latter case, the MUMPS/Julia interface will require minor tweaks. I'll put that on the list.

@tkelman
Copy link
Contributor

tkelman commented Aug 18, 2014

If you can manage to build Julia against a 64 bit BLAS and not have a 32 bit BLAS shadow the functions of the first

@lruthotto, @dpo, this is being tracked at JuliaLang/julia#4923. If you could speak up there that this is causing problems in packages like this one, and more importantly if you have any suggestions that could work on Linux and Mac (I already have a solution that should work for Windows), please let us know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants