Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crippled Lapack Declaration on Windows #207

Closed
kthohr opened this issue Mar 22, 2018 · 43 comments
Closed

Crippled Lapack Declaration on Windows #207

kthohr opened this issue Mar 22, 2018 · 43 comments

Comments

@kthohr
Copy link
Contributor

kthohr commented Mar 22, 2018

Greetings,

A lot of Lapack routines were added with R version 3.3.0 (zgges, etc.), and the package configure script for Unix-alike systems correctly identifies this. However, for Windows systems, the assumption in RcppArmadilloConfig.h is that these routines are still missing. With R v3.5 being released in the next month or so, is now a good time to (safely) change this default behavior and assume most users have updated?

Best,
Keith

@eddelbuettel
Copy link
Member

No, there isn't, sadly, as one @baptiste 's package still failed over a complex decomposition. As long as the (partial) LAPACK copy that ships with R and is used by default on Windows, we are bound to have these issues. I fought this twice already.

@eddelbuettel
Copy link
Member

What is your underlying question? Are you missing a particular routine? Over the years it was generally just the somewhat more esoteric complex functions that were seen missing, and R Core helped twice and added them.

In general, until the R build changes and always uses an external (complete) LAPACK I fear that is little we can do better. Whenever we have left the guard down and said "we're good" we ended up being bitten by something we actually did not have.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 22, 2018

For sure. I guess what I should have asked is: can we update the list of unavailable Lapack symbols on the Armadillo side? (What gets ifdef'd out by ARMA_CRIPPLED_LAPACK.) For example, I've tested qz() in the complex case on Windows (vanilla R v4.3) and there are no issues.

Is @baptiste 's package calling routines that weren't covered by the updates?

@eddelbuettel
Copy link
Member

It think when @baptiste was bitten last it was due to a change on the Armadillo side when @conradsnicta invoked a different (sparse?) solver. I am a little fuzzy on the details of that conversation -- may have been mostly on the r-package-devel or rcpp-devel mailing lists.

As for

there are no issues

that tends to be my sentiment too. As of today, we do have 458 packages on CRAN that use RcppArmadillo. In general, things just work.

This README in the R sources has a good list of what is included. Again, for RcppArmadillo, ex ante we do not know which library we are facing hence the need to dial down. So we set ARMA_CRIPPLED_LAPACK if we're on Windows, or if we see Rlapack.* being used. That #define then affects Conrad's code.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 22, 2018

Sorry I wasn't clear with that example. Right now qz() for complex-valued matrices is ifdef'd out on the Armadillo side, as zgges was missing until R version 3.3.0. However, I've tested RcppArmadillo on Windows with that guard removed and it works fine. So my feeling is that it should be okay to start removing some of the guards based on the 3.3 update.

@eddelbuettel
Copy link
Member

eddelbuettel commented Mar 22, 2018

And again, to be crystal clear, that guard is in the Armadillo code, correct?

We could test this in a local patch we could try to noodle through CRAN. If it all works we can discuss with Conrad if/where/how the #ifdef is still needed. A quick recursive grep reveals that it is only used in
inst/include/armadillo_bits/auxlib_meat.hpp but I fear we have to test each and every use case against different R versions and their Rlapack.*. (Or at least 3.3.0 and later...)

That is quite a bit of work. Are you volunteering?

@kthohr
Copy link
Contributor Author

kthohr commented Mar 22, 2018

Yep, that guard is on the Armadillo side.

I'm happy to do the heavy lifting on this---I'm familiar with Rlapack and the internal Armadillo code. Would you prefer to start with the single qz case and then branch out to other functionality?

@eddelbuettel
Copy link
Member

eddelbuettel commented Mar 22, 2018

I really appreciate this. And the four-eyes principle is good. So how about something like this:

  • we look at the different functions affected by the #ifdef guard
  • we try to make sure each one is tickled by at least one unit test function
  • we use win-builder and/or r-hub to test 'both sides' ie external LAPACK as well as cases where Rlapack is used

How does that sound?

@kthohr
Copy link
Contributor Author

kthohr commented Mar 22, 2018

Sounds good to me.

@baptiste
Copy link

baptiste commented Mar 22, 2018

For the record since my cda package was mentioned: the package actually broke for Linux on CRAN, which apparently also rely on the internal R Lapack (I thought it did only for Windows). The package itself does not use any unusual routine, it just calls solve() on a standard complex linear system, but Armadillo recently introduced a new special-case for banded matrices that would require, if it was ever called, a Lapack routine that R doesn't provide. Linux CRAN checks find this routine in the headers missing, and break (in contrast Windows is always flagged as crippled). Someone kindly provided a workaround (to basically flag Lapack as crippled) on the mailing list so that it would pass CRAN checks like on Windows, but I never found the time to test this (when I tried on my Linux box an external Lapack was used, and when I tried to switch that didn't seem to affect anything; maybe I'd need to compile R with some flags for the purpose of this test).

Bottom line: cda is now archived on CRAN; I hope to return to it someday.

It sounds like we're facing opposing problems: you want to tell CRAN not to flag 'crippled' now that a routine is available, while I'd need to enforce the flag manually because some routine is missing.

@eddelbuettel
Copy link
Member

eddelbuettel commented Mar 22, 2018

I thought it did only for Windows

It's a compile time option. On the non-Windows systems, if you do configure --help you get (inter alia)

  --with-blas           use system BLAS library (if available), or specify it [no]
  --with-lapack         use system LAPACK library (if available), or specify it [no]

so the default may actually be to use the 'crippled' lapack (I think my local r-devel builds do that as I don't override it). But most distribution builds will probably use the external BLAS.

I supposed one can alter it on Windows too which must be how Microsoft injects the MKL.

Bottom line: "it's complicated" and when we build RcppArmadillo we have to track it.

@baptiste
Copy link

baptiste commented Mar 25, 2018

@conradsnicta I think a suitably edited version of your question would be a useful discussion to have on R-devel. I don't know the reasons myself, but my guess would be that R has always been very conservative with respect to numerical accuracy, and R-core may have found it necessary to manually patch/review some routines on a case-by-case basis. Or it could be that the effort of writing all the appropriate wrappers and ensuring their correctness has always been the limiting factor, leading to this cherry-picked, add-only-when-necessary situation.

Here's a link to the last time R added a bunch of Lapack routines; I'm trying to figure out from the (large) diff what was involved in the process.

@baptiste
Copy link

@conradsnicta as I said I don't know the reasons (and therefore should not have replied); I made a wild guess based on my recollection of discussions around flavours of Blas and Lapack (cf openBLAS and other alternatives), with R favouring a conservative (inefficient but more reliably accurate) choice by default. It may very well be completely irrelevant here; R-devel would have much more pertinent things to say.

@eddelbuettel
Copy link
Member

This is a 30 year old discussion. Statistician prefer eg the QR solver with pivoting which is why a modified LINPACK gets used in the standard lm(). But these days with double precision you have to work hard for finding actual differences. The example fastLm() in RcppEigen has a worked example thanks to @dmbates , the one in for fastLm() in RcppArmadillo does not but at least I put a note in the documentation.

And for 99.95% of all users the shipped LAPACK subset is actually sufficient. In all those years the only user I encountered who had issues was @baptiste . Now, his uses were all legit so we tried (and managed) to get this improved -- which can be done. But running around, screaming murder and calling other people names rarely does.

@eddelbuettel
Copy link
Member

The part I have control over -- how R is built and shipped for Debian and hence Debian derivatives -- does that: full external Lapack. Other parts and builds I cannot control, but I invite you to engage with R Core and argue for change there.

@baptiste
Copy link

I've emailed R-devel.

@eddelbuettel
Copy link
Member

Thanks. In the meantime we can try refining the tests.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 26, 2018

Thanks for starting the discussion, @baptiste . This is a bane of mine, too.

On the CRIPPLED_LAPACK side, I've identified which guards can be removed; I'm just figuring out an intelligent way to test this to unit.tests. Dirk, would you prefer I add a new set of tests (say runit.Rlapack)?

@eddelbuettel
Copy link
Member

Awesome news @kthohr !

Maybe we can discuss a bit more. What we currently have is blunt: one test, one variable, being used in a good half dozen instances inside that 'meat' file. Do you think we should have variables, or just do away with the ones you identified as being available?

As for new tests, if it is a new topic, a new top-level R file inst/unitTests/runit.$name.R with a new matching cpp/$name.cpp seems the best. Rlapack works for me as $name.

@eddelbuettel
Copy link
Member

Oh, and should we remove the guard or refine the test to use when we know it works? Where do you think is the cutoff? R 3.3.0* ? That would be old enough not to worry.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 26, 2018

I think we should remove the guards entirely, where possible. Here's a list of affected functions (with cutoffs):

eig_pair() for complex matrices
Lapack: zggev.  Added: R 3.3.0

qz() complex matrices
Lapack: zgges.  Added: R 3.3.0

svd_dc() and svd_dc_econ() for complex matrices
Lapack: zgesdd. Added: R 3.1.0

solve_square_refine() for complex matrices
Lapack: zgesvx. Added: R 3.3.0

solve_approx_svd() for complex matrices
Lapack: zgelsd. Added: R 3.1.0

schur() for complex matrices
Lapack: zgees.  Added: R 3.3.0

So some are related to changes made in R version 3.1.0, but most are from version 3.3.0. The general CRIPPLED_LAPACK guard will remain for some decompositions of band matrices (see chol_band and solve_band_fast).

@coatless
Copy link
Contributor

@kthohr this is great stuff! Thank you for compiling this list.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 26, 2018

So my quick and dirty changes can be found here:

devtools:::install_github("kthohr/RcppArmadillo",ref="lapack-checks")

(Tests are in RcppArmadillo.cpp instead of a separate unit test file, while I will of course change for the PR.) I have checked this on a Windows machine with our beloved Rlapack. ARMA_EXTRA_DEBUG is enabled, so calling

library(RcppArmadillo)

set.seed(123)
n <- 5

norm(cx_eig_pair_test(n),"2")
norm(cx_qz_test(n),"2")
cx_rank_test(n) - n
norm(cx_solve_test(n),"2")
norm(cx_pinv_test(n),"2") - 1
norm(cx_schur_test(n),"2")

will show the internal pings of the Lapack functions. Any volunteers with broken Lapack distributions: please check this if you can.

Would the following make most sense?

  • Create a pull request with @conradsnicta to remove the relevant guards on the Armadillo side.
  • Then a pull request to RcppArmadillo with the runit.Rlapack tests.

@eddelbuettel
Copy link
Member

Good plan. I agree that we should just remove the relevant test on the side of @conradsnicta / Arma.

I think I will play with your snippet above and submit this to the various winbuilders and friends. Maybe also add a simple

// [Rcpp::export]]
int whatAreWe() {
#if defined(ARMA_CRIPPLED_LAPACK)
   return 1;
#else 
   return 0;
#endif;
}

to know where we are at.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 26, 2018

Great! I just added a armadillo_Rlapack() function to cover your suggestion.

On my end, I tested the code with fresh installations of R---versions 3.3.0 and 3.4.3---on Windows. I also checked against R-devel on my main system (macOS), but that uses a system supplied Lapack (the Accelerate library).

@eddelbuettel
Copy link
Member

Are you set up for R-hub? That should give you a wide variety of different systems? (And I am tied up with something else...)

@kthohr
Copy link
Contributor Author

kthohr commented Mar 27, 2018

Nope, never worked with R-hub. And no rush on my side.

@eddelbuettel
Copy link
Member

It is the awesomest thing ever. Brew a fresh cup and look into it:

image

That is the webview at https://builder.r-hub.io/; I tend to use rhub package...

@eddelbuettel
Copy link
Member

It's a little different. Release candidates do not go to CRAN. I only upload to CRAN once every two months -- it is too involved otherwise. Details in the ChangeLog and NEWS at the package page.

I do roll up interim releases, and do eg the reverse-depends checking I did today. But the systems used for those have full LAPACK anyway.

So I would suggest the following. We really only need two blocks commented out, so I suggest that for the upcoming release we keep things mostly as they are but change

#if defined(ARMA_CRIPPLED_LAPACK)    

to

#if defined(ARMA_CRIPPLED_LAPACK) || defined(ARMA_CRIPPLED_LAPACK_BANDED)   

and I define that new variable, but not the old one. That gives us the 2 out of 7 we need. If that passes, we clean up for the next release on remove the old define.

Thoughts?

@kthohr
Copy link
Contributor Author

kthohr commented Mar 27, 2018

@eddelbuettel - I don't think I follow the logic of that... where are we placing ARMA_CRIPPLED_LAPACK_BANDED?

For reference, this is what I was going to submit on the Armadillo side:
kthohr/armadillo-code@1ac3e5d

Are we still keeping the guards around those blocks? Sorry for my confusion...

@eddelbuettel
Copy link
Member

What I had in mind would simply be less invasive as a start. We simply add a new "or" with the new define in the two spots where banded is used.

@coatless
Copy link
Contributor

In essence, this means that RcppArmadillo will need to depend on an R version greater than or equal to R 3.3.0 (per @kthohr's analysis). If the petition to r-devel is successful for R 3.5.*, then two years from now the dependency could increase to R 3.5.0 (unlikely due to the release schedule).

@eddelbuettel
Copy link
Member

eddelbuettel commented Mar 27, 2018

  1. R 3.5.0 will come out next month, nothing of this sort (ie changing Rlapack) will go into it now.
  2. @kthohr did a fine analysis, we can easily depend on R (>= 3.3.0) as we have similar / worse depends on a new enough compiler with (Rcpp)Armadillo already.
  3. It is not just Windows. It is every possible R build where external lapack is not switched on.
  4. It is in fact '(i) proper lapack, (iii) less borked lapack' as we are down to two instances of banded matrices.
  5. So it is clearly a simplification of the existing setup.

@eddelbuettel
Copy link
Member

So I took the version of auxlib_meat.hpp from @kthor and created a new minor test release in a branch.

That passes on the four default R Hub instances invoked from rhub::check_for_cran(). In that version, we are down to four uses of the #define. I think this could go into Armadillo 8.500 as is.

@kthohr
Copy link
Contributor Author

kthohr commented Mar 27, 2018

Thanks for taking the time to do that, @eddelbuettel. I'll pass a clean unit test to RcppArmadillo later today.

@eddelbuettel
Copy link
Member

And if anyone wants to play along, this is now in this RcppArmadillo branch. Might even fold that into master.

(@kthohr I used your school email for the ChangeLog as you list it on you GH profile. We can adjust if you'd prefer a different one.)

@kthohr
Copy link
Contributor Author

kthohr commented Mar 28, 2018

Thanks, Dirk! I created a PR to push a unit test into that branch.

(I've also changed my e-mail on my GH profile.)

@eddelbuettel
Copy link
Member

This is now merged in the master branch (as well as upstream for the actual change), so we could close this ticket -- or keep it open as a reminder of the four remaining #if defined tests.

Thoughts?

@kthohr
Copy link
Contributor Author

kthohr commented Mar 28, 2018

Great! I think we can close the ticket now.

@eddelbuettel
Copy link
Member

I think you are right -- this was after all about reflecting what R 3.3.0 brought us -- so thanks again for all your work on this.

@coatless
Copy link
Contributor

Need to add an R depends on R 3.3.0

MIA from:

#211
#212

@eddelbuettel
Copy link
Member

eddelbuettel commented Mar 28, 2018

Sure, but we haven't released to CRAN (or even a drat repo...) either. On the TODO list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants