scalapack: query lwork from ppocon #5915

Merged
merged 1 commit into from
Feb 16, 2018

davydden
Contributor

@davydden davydden commented Feb 16, 2018

I was trying to fix scalapack_05 on some testers https://cdash.kyomu.43-1.org/testSummary.php?project=1&name=scalapack%2Fscalapack_05.mpirun%3D4.debug&date=2018-02-15
and was digging into the code.

It looks like lwork is wrong. IBM states the requirement as:

 lwork ≥ 2*np0 + 2*nq0 + max(2, max(nb*max(1, iceil(nprow-1, npcol)), nq0 + nb*max(1, iceil(npcol-1, nprow))))

mb = MB_A
nb = NB_A
iroff = mod(ia-1, mb)
icoff = mod(ja-1, nb)
iarow = mod(RSRC_A + (ia-1)/mb, nprow)
iacol = mod(CSRC_A + (ja-1)/nb, npcol)
np0 = NUMROC(n+iroff, mb, myrow, iarow, nprow)
nq0 = NUMROC(n+icoff, nb, mycol, iacol, npcol)

and netlib-scalapack as

       LWORK >= 2*LOCr(N+MOD(IA-1,MB_A)) + 2*LOCc(N+MOD(JA-1,NB_A))+
             MAX( 2, MAX(NB_A*CEIL(NPROW-1,NPCOL),LOCc(N+MOD(JA-1,NB_A)) +
             NB_A*CEIL(NPCOL-1,NPROW)) ).

In our case IA=JA=1, so things are simpler, but still there was no max or the like in

int lwork = 2 * n_local_rows + 3 * n_local_columns + column_block_size;
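
For comparison, the IBM bound quoted above can be evaluated directly for IA=JA=1. The following is only a rough sketch, not the code in deal.II: the helper names `numroc`, `iceil`, `ppocon_min_lwork` and `old_lwork_estimate` are made up for illustration, `numroc` is meant to mirror the logic of ScaLAPACK's NUMROC tool routine, and the source process is assumed to be (0,0), i.e. RSRC_A=CSRC_A=0.

```cpp
#include <algorithm>
#include <cassert>

// Number of rows or columns of a block-cyclically distributed dimension of
// global size n that end up on process iproc (same logic as NUMROC).
inline int numroc(int n, int nb, int iproc, int isrcproc, int nprocs)
{
  assert(n >= 0 && nb > 0 && nprocs > 0);
  const int mydist    = (nprocs + iproc - isrcproc) % nprocs;
  const int nblocks   = n / nb;
  const int extrablks = nblocks % nprocs;
  int       result    = (nblocks / nprocs) * nb;
  if (mydist < extrablks)
    result += nb;
  else if (mydist == extrablks)
    result += n % nb;
  return result;
}

// Integer ceiling of num/denom for num >= 0, denom > 0.
inline int iceil(int num, int denom)
{
  return (num + denom - 1) / denom;
}

// IBM lower bound on lwork, specialised to IA = JA = 1 (iroff = icoff = 0).
// Assumption: the source process is (0,0), so iarow = iacol = 0.
inline int ppocon_min_lwork(int n, int mb, int nb,
                            int myrow, int mycol, int nprow, int npcol)
{
  const int np0 = numroc(n, mb, myrow, 0, nprow);
  const int nq0 = numroc(n, nb, mycol, 0, npcol);
  return 2 * np0 + 2 * nq0 +
         std::max(2,
                  std::max(nb * std::max(1, iceil(nprow - 1, npcol)),
                           nq0 + nb * std::max(1, iceil(npcol - 1, nprow))));
}

// The hand-written estimate quoted in this PR description.
inline int old_lwork_estimate(int n_local_rows, int n_local_columns,
                              int column_block_size)
{
  return 2 * n_local_rows + 3 * n_local_columns + column_block_size;
}
```

Under these assumptions, a 32x32 matrix with block size 32 on a 1x1 grid gives 192 from both formulas. On a hypothetical 4x1 grid, the process in row 0 would need 224 according to the IBM bound, while the old estimate still yields 192, because the nb*iceil(nprow-1, npcol) term is not covered by it.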

To be on the safe side we can query the right number from ppocon.
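
The query follows the usual LAPACK/ScaLAPACK convention: call the routine once with lwork = liwork = -1, read the required sizes from the first entries of the work arrays, then allocate and call again. Below is a minimal sketch of that two-phase pattern. It does not call ScaLAPACK itself: `Routine` stands for a hypothetical wrapper that has already bound all other ppocon arguments, and `fake_routine` with its sizes 192 and 32 is a made-up stand-in so that the snippet is self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical wrapper signature: (work, lwork, iwork, liwork) -> info.
using Routine = std::function<int(double *, int, int *, int)>;

// Phase 1: workspace query with lwork = liwork = -1.
// Phase 2: resize the work arrays and call again for the actual computation.
inline int call_with_queried_workspace(const Routine       &routine,
                                       std::vector<double> &work,
                                       std::vector<int>    &iwork)
{
  work.assign(1, 0.0);
  iwork.assign(1, 0);
  const int query_info = routine(work.data(), -1, iwork.data(), -1);
  if (query_info != 0)
    return query_info;

  // The sizes are reported in the first entry of each work array.
  const int lwork  = static_cast<int>(std::ceil(work[0]));
  const int liwork = iwork[0];
  assert(lwork >= 1 && liwork >= 1);

  work.resize(lwork);
  iwork.resize(liwork);
  return routine(work.data(), lwork, iwork.data(), liwork);
}

// Stand-in for illustration only: answers a query with made-up sizes and
// rejects work arrays that are too small with a nonzero info.
inline int fake_routine(double *work, int lwork, int *iwork, int liwork)
{
  const int needed_work  = 192;
  const int needed_iwork = 32;
  if (lwork == -1 || liwork == -1)
    {
      work[0]  = needed_work;
      iwork[0] = needed_iwork;
      return 0;
    }
  if (lwork < needed_work || liwork < needed_iwork)
    return -1;
  return 0;
}
```

The advantage over a hand-written formula is that the size comes from the library that will actually use the workspace, so it stays correct if the requirement differs between implementations.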

P.S. This does not fix the float test for me: it still hangs in the MPI=4 run with a 32x32 matrix and block size 32 (the process grid consists of one core only here). double works just fine, and so does float with 1 MPI core...

@davydden
Contributor Author

/run-tests

@bangerth
Member

OK to merge once the tester is happy.

@davydden davydden merged commit 06b959b into dealii:master Feb 16, 2018
@davydden
Contributor Author

davydden commented Feb 17, 2018

For the record, I know what's going on with that scalapack_05 test. It's another Fortran symbol conflict due to faulty pArpack. The culprit should be pslamch (I submitted a patch and will find out soon if it helps). As soon as I disable pArpack in deal.II, the test runs fine. @BenBrands, who extended the ScaLAPACK wrappers to run with float and modified that test, did not see this issue as he is not building deal.II with pArpack.

@bangerth
Member

Oh, are you saying that both ScaLAPACK and PArpack have functions called pslamch and that the linker chooses one or the other randomly? If so, ouch!

Can you post a link to your patch, if it's in some public repo, for posterity?

@bangerth
Member

Ah, found it: opencollab/arpack-ng#50

@davydden
Contributor Author

davydden commented Feb 19, 2018

> Oh, are you saying that both ScaLAPACK and PArpack have functions called pslamch and that the linker chooses one or the other randomly? If so, ouch!

Yep, Arpack used to have pdlamch as well, but I fixed that in opencollab/arpack-ng#21.

The fix for float is opencollab/arpack-ng#85.
