Description
Short version
step_ica with options = list(method = 'C') is always faster than the default method = 'R', with no downsides.
I don't know what the policy is on changing external package defaults in recipes, but it's a 'free' speed gain.
Longer version
step_ica uses fastICA::fastICA, which has a method argument. Quoting the fastICA documentation:
if method == "R" then computations are done exclusively in R (default). The code allows the interested R user to see exactly what the algorithm does. if method == "C" then C code is used to perform most of the computations, which makes the algorithm run faster. During compilation the C code is linked to an optimized BLAS library if present, otherwise stand-alone BLAS routines are compiled.
So in almost all cases, you want method = 'C'. The only use for 'R' seems to be the very specific case where you want to see the exact calculations, but then you would have to look in the documentation for step_ica anyway (and would see the changed default).
So I propose to change the default option of step_ica to use method = 'C'. I can make a pull request (at some point) if you want.
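Until the default changes, the C implementation can already be selected per step through the options argument. The snippet below is a minimal sketch of that, not a benchmark: the toy data frame `dat` and the choice of `num_comp = 2` are made up for illustration, and it assumes the recipes and fastICA packages are installed.

```r
library(recipes)

set.seed(1)
# Hypothetical toy data: 200 rows, 5 numeric predictors (illustration only)
dat <- as.data.frame(matrix(runif(1000), ncol = 5))

rec <- recipe(~ ., data = dat) %>%
  # Center and scale the predictors before ICA
  step_center(all_predictors()) %>%
  step_scale(all_predictors()) %>%
  # `options` is passed on to fastICA::fastICA, so method = "C" picks the C code path
  step_ica(all_predictors(), num_comp = 2, options = list(method = "C"))

prepped <- prep(rec, training = dat)
ica_scores <- bake(prepped, new_data = dat)

# The five original predictors are replaced by two independent components
dim(ica_scores)
```

With the proposed change, leaving out options would give the same code path, and anyone who wants the pure R implementation could still request it with options = list(method = "R").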
Below is a small example of the speed gain. The size of the gain varies between runs, but method = 'C' has been consistently faster.
library('fastICA')
library('tictoc')
set.seed(1)
X <- matrix(runif(5e6), 5e6 / 20, 20)
tic()
a <- fastICA(X, 20, method = "R")
toc()
#> 4.177 sec elapsed
tic()
b <- fastICA(X, 20, method = "C")
toc()
#> 2.629 sec elapsed
Created on 2020-06-02 by the reprex package (v0.3.0)