Update call to RcppArmadillo_fastLm_impl #56
Conversation
- RcppArmadillo's underlying implementation of fastLm was renamed to fastLm_impl in RcppCore/RcppArmadillo#164 (mid-August 2017)
- Call the RcppAttributes-generated C++ function directly
inst/examples/lmBenchmark.R (outdated diff)

```diff
@@ -37,7 +37,7 @@ exprs$QR <- expression(.Call("RcppEigen_fastLm_Impl", mm, y, 1L, PACKAGE="RcppEigen"))
 exprs$LLt <- expression(.Call("RcppEigen_fastLm_Impl", mm, y, 3L, PACKAGE="RcppEigen"))

 if (suppressMessages(require("RcppArmadillo", character=TRUE, quietly=TRUE))) {
-    exprs$arma <- expression(.Call("RcppArmadillo_fastLm", mm, y, PACKAGE="RcppArmadillo"))
+    exprs$arma <- expression(.Call("_RcppArmadillo_fastLm_impl", mm, y, PACKAGE="RcppArmadillo"))
```
Bigger question is why we even call this via `.Call()` anymore ...
Minimizing non-C++ overhead?
Happy to rewrite everything to call Rcpp{Eigen,Armadillo,GSL}::fastLmPure if you'd rather that.
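(A sketch of what that rewrite could look like for the Armadillo entry, assuming RcppArmadillo is installed; the Eigen and GSL backends would follow the same pattern:)

```r
## Sketch: benchmark expression using the exported R wrapper instead of .Call()
if (suppressMessages(require("RcppArmadillo", character.only = TRUE, quietly = TRUE))) {
    exprs$arma <- expression(RcppArmadillo::fastLmPure(mm, y))
}
```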
I was mostly just thinking out loud, but would you mind running a quick check? Now that we have registration on, there should be little overhead. And yes, I would take an enlarged PR ...
I took a quick look at this. The results from calling `fastLmPure` are basically the same, and might even be a bit faster than the current approach. From playing around with it, the `fastLmPure` approach appears to be slightly noisier (since it does a bit more), but the additional noise is on the same order of magnitude as the noise in the benchmarks themselves. Unless it's important to remove all additional sources of noise from the benchmarks, I'd go with `fastLmPure`. If getting optimal benchmarks is important, using the higher-precision timers from `microbenchmark` or `bench` might be a good idea.
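(A sketch of such a higher-precision comparison with microbenchmark, assuming `mm` and `y` are the model matrix and response already built by the benchmark script, and that `fastLmPure`'s third argument selects the decomposition, matching the `.Call()` usage in the diff above:)

```r
## Sketch: time the direct .Call() against the fastLmPure wrapper with
## microbenchmark's high-resolution timer
library(microbenchmark)
res <- microbenchmark(
    LLt  = .Call("RcppEigen_fastLm_Impl", mm, y, 3L, PACKAGE = "RcppEigen"),
    LLt2 = RcppEigen::fastLmPure(mm, y, 3L),
    times = 100L
)
summary(res)
```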
Here are three runs (100 replicates per run) of just the two Cholesky approaches:
> source("lmBenchmark-short.R")
lm benchmark for n = 100000 and p = 40: nrep = 100
test relative elapsed user.self sys.self
2 LDLt2 1.000 1.683 1.683 0.000
4 LLt2 1.011 1.701 1.701 0.000
1 LDLt 1.014 1.706 1.706 0.000
3 LLt 1.017 1.712 1.708 0.004
> source("lmBenchmark-short.R")
lm benchmark for n = 100000 and p = 40: nrep = 100
test relative elapsed user.self sys.self
2 LDLt2 1.000 1.580 1.544 0.036
4 LLt2 1.001 1.582 1.546 0.036
3 LLt 1.005 1.588 1.553 0.035
1 LDLt 1.007 1.591 1.560 0.032
> source("lmBenchmark-short.R")
lm benchmark for n = 100000 and p = 40: nrep = 100
test relative elapsed user.self sys.self
3 LLt 1.000 1.710 1.643 0.066
1 LDLt 1.019 1.742 1.738 0.004
2 LDLt2 1.037 1.774 1.773 0.000
4 LLt2 1.039 1.777 1.778 0.000
The "2" versions are the ones calling the appropriate fastLmPure
instead of .Call
directly.
Here are the results of three runs on all methods:
> source("lmBenchmark-mod.R")
lm benchmark for n = 100000 and p = 40: nrep = 20
test relative elapsed user.self sys.self
4 LDLt 1.000 0.325 0.324 0.000
15 LLt2 1.006 0.327 0.327 0.000
14 LLt 1.012 0.329 0.329 0.000
5 LDLt2 1.015 0.330 0.331 0.000
11 SymmEig2 2.222 0.722 0.723 0.000
10 SymmEig 2.255 0.733 0.723 0.009
13 QR2 7.163 2.328 2.328 0.000
12 QR 7.172 2.331 2.331 0.000
2 PivQR 7.455 2.423 2.413 0.011
3 PivQR2 7.458 2.424 2.419 0.005
1 lm.fit 14.597 4.744 4.641 0.102
17 arma2 17.588 5.716 5.716 0.000
16 arma 17.649 5.736 5.735 0.001
6 GESDD 31.440 10.218 9.974 0.243
7 GESDD2 31.658 10.289 10.034 0.254
8 SVD 54.985 17.870 17.356 0.511
9 SVD2 55.018 17.881 17.366 0.513
19 GSL2 108.926 35.401 35.273 0.124
18 GSL 111.711 36.306 36.186 0.117
> source("lmBenchmark-mod.R")
lm benchmark for n = 100000 and p = 40: nrep = 20
test relative elapsed user.self sys.self
15 LLt2 1.000 0.312 0.311 0.001
14 LLt 1.006 0.314 0.313 0.000
5 LDLt2 1.026 0.320 0.320 0.000
4 LDLt 1.048 0.327 0.327 0.000
11 SymmEig2 2.237 0.698 0.696 0.002
10 SymmEig 2.263 0.706 0.697 0.009
13 QR2 7.288 2.274 2.272 0.001
12 QR 7.304 2.279 2.276 0.002
2 PivQR 7.667 2.392 2.392 0.000
3 PivQR2 7.679 2.396 2.396 0.000
1 lm.fit 14.019 4.374 4.368 0.006
17 arma2 16.888 5.269 5.267 0.002
16 arma 17.119 5.341 5.338 0.003
7 GESDD2 30.394 9.483 9.448 0.034
6 GESDD 30.532 9.526 9.491 0.034
8 SVD 56.388 17.593 17.309 0.283
9 SVD2 59.721 18.633 18.320 0.312
19 GSL2 110.465 34.465 34.346 0.117
18 GSL 111.256 34.712 34.588 0.120
> source("lmBenchmark-mod.R")
lm benchmark for n = 100000 and p = 40: nrep = 20
test relative elapsed user.self sys.self
4 LDLt 1.000 0.308 0.306 0.002
5 LDLt2 1.010 0.311 0.311 0.000
14 LLt 1.019 0.314 0.314 0.000
15 LLt2 1.036 0.319 0.319 0.000
10 SymmEig 2.221 0.684 0.683 0.001
11 SymmEig2 2.237 0.689 0.688 0.000
12 QR 7.292 2.246 2.246 0.000
13 QR2 7.315 2.253 2.253 0.000
2 PivQR 7.500 2.310 2.307 0.002
3 PivQR2 7.506 2.312 2.311 0.001
1 lm.fit 14.221 4.380 4.352 0.028
16 arma 17.237 5.309 5.308 0.001
17 arma2 17.354 5.345 5.344 0.000
6 GESDD 30.114 9.275 9.268 0.007
7 GESDD2 30.253 9.318 9.315 0.002
8 SVD 55.571 17.116 16.872 0.242
9 SVD2 57.045 17.570 17.320 0.249
18 GSL 111.227 34.258 34.139 0.115
19 GSL2 114.195 35.172 35.049 0.120
Pushed a new version which i) uses `microbenchmark` instead of `rbenchmark`, ii) reports relative and absolute timings, and iii) calls the `fastLmPure` functions instead of the C++ entry points directly.
- Use `microbenchmark` instead of `rbenchmark`; report relative and absolute timings (sketched below)
- Use `fastLmPure` functions instead of calling C++ directly
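(A sketch of reporting both relative and absolute timings from a microbenchmark result, assuming `exprs` is the named list of benchmark expressions built earlier in the script:)

```r
## Sketch: absolute medians in seconds plus a derived relative column
library(microbenchmark)
res <- summary(microbenchmark(list = exprs, times = 100L), unit = "s")
res$relative <- res$median / min(res$median)
res[order(res$relative), c("expr", "median", "relative")]
```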
inst/examples/lmBenchmark.R (outdated diff)

```diff
@@ -5,7 +5,7 @@
 ## This file is part of RcppEigen.

 require("stats", character=TRUE, quietly=TRUE)
-require("rbenchmark", character=TRUE, quietly=TRUE)
+require("microbenchmark", character=TRUE, quietly=TRUE)
```
We need to add it to Suggests, and we cannot "just" `require()` it. Please make the whole block below conditional on microbenchmark being present.
Are you sure? `rbenchmark` wasn't `require()`'d previously, and the scripts below weren't conditional. Even though this is in `inst/examples`, `R CMD check` doesn't register it as an "example" (in the sense of getting checked automatically), so this wasn't a problem. I'm happy to add it; I just want to make sure you're ok with adding another soft dependency first.
I am not; I just noticed it was not in `DESCRIPTION`, but then again `inst/examples/` may not get parsed.
But I also used to use this wrong code idiom of just `require()`, which does NOT reliably fail or skip in the absence of the tested package. I have since changed this in some places.
So with that, I would still like you to condition on `microbenchmark` being present, as it is a conditional dependency via `Suggests`. See what we do here in `tests/doRUnit.R`:

```r
if (require("RUnit", quietly = TRUE)) {
    # ...
}
```

That is the only correct way for `Suggests`.
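(Applied to the benchmark script, a sketch of that idiom would be:)

```r
## Sketch: run the benchmark only when the suggested package is available
if (require("microbenchmark", quietly = TRUE)) {
    ## ... define exprs and run the timings ...
} else {
    message("This benchmark requires the 'microbenchmark' package.")
}
```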
Ok, done. I left `RcppArmadillo` and `RcppGSL` out of "Suggests" unless you want them included also.
Thank you! Almost there -- the Travis build now fails because microbenchmark is not installed there.
Now you must add it to the Travis configuration as well.
- Move benchmark script into an `if (require("microbenchmark"))` block so the script works even if microbenchmark is not installed
- Add microbenchmark to `Suggests`
- Add r-cran-microbenchmark to .travis.yml
Sorry about that -- I didn't get a build-failed email from Travis. Done now -- Travis passes.
Yes, they are constants. In the setup I prefer on Travis, vignettes are skipped, and you get an idea about directory sizes.
The RcppEigen benchmark script calls a C++ function which is no longer provided by RcppArmadillo (since mid-August 2017). This updates the benchmark script to call the RcppAttributes-generated entry point `_RcppArmadillo_fastLm_impl`.
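(As a closing sanity sketch, assuming RcppArmadillo is installed: the exported `fastLmPure` wrapper, which routes through the renamed entry point, should agree with `lm.fit` on the coefficients:)

```r
## Sketch: check that fastLmPure and lm.fit produce the same coefficients
set.seed(42)
mm <- cbind(1, matrix(rnorm(100 * 4), 100, 4))   # small design matrix with intercept
y  <- rnorm(100)
all.equal(drop(RcppArmadillo::fastLmPure(mm, y)$coefficients),
          unname(lm.fit(mm, y)$coefficients))
```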