Sugar functions - rowSums, colSums, rowMeans, colMeans #549
Comments
I originally added that code as it was supposed to be faster than the macro version, so I'm pretty surprised that's not the case here! FWIW I think that these optimizations only kick in when compiling in C++11 mode (since we condition on
When compiling in C++98 / the regular mode, we just delegate to |
FWIW, this code: Rcpp/inst/include/Rcpp/traits/is_na.h Lines 40 to 43 in 3c09586
could likely be improved to only check for Firstly, R's Secondly, it's often not clear what a particular R API does for missing values; I usually have to jump back to the sources to remind myself whether a particular routine checks:
(note that the C version of the API is not the same as the R version of the API; I outlined a bit of that here: http://stackoverflow.com/questions/26241085/rcpp-function-check-if-missing-value/26262984#26262984) |
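To make the NA / NaN distinction above concrete, here is a minimal standalone sketch (plain C++, no R headers). It assumes R's convention that NA_real_ is a NaN whose low 32-bit word holds 1954; the helper names (make_na, is_na_or_nan, is_na_only, is_nan_only) are hypothetical and only approximate the behaviour of the C-level ISNAN, R_IsNA, and R_IsNaN checks. It is not Rcpp's or R's actual code.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a 64-bit pattern as a double (memcpy avoids aliasing issues).
inline double from_bits(std::uint64_t bits) {
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Reinterpret a double as its 64-bit pattern.
inline std::uint64_t to_bits(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Assumption: an R-style NA is a NaN whose low 32-bit word is 1954 (0x7A2).
inline double make_na() { return from_bits(0x7FF00000000007A2ULL); }

// An ordinary quiet NaN, as produced by 0.0 / 0.0 on IEEE 754 platforms.
inline double make_nan() { return std::numeric_limits<double>::quiet_NaN(); }

// ISNAN-like check: true for NA and for NaN.
inline bool is_na_or_nan(double x) { return std::isnan(x); }

// R_IsNA-like check: true only when the NaN carries the NA payload.
inline bool is_na_only(double x) {
    return std::isnan(x) && (to_bits(x) & 0xFFFFFFFFULL) == 1954u;
}

// R_IsNaN-like check: true for a NaN that is not NA.
inline bool is_nan_only(double x) {
    return std::isnan(x) && !is_na_only(x);
}
```

Under these assumptions, is_na_or_nan(make_na()) and is_na_or_nan(make_nan()) are both true, while is_na_only separates the two. The payload comparison is the extra work that a plain NaN test avoids.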
Hmm well now I'm starting to wonder if the performance difference is platform-related. In a separate file I compiled
which is strangely even slower than the C++98 traits version. Running this test on Windows 7 I see the same pattern; macro version < C++98 traits version < C++11 traits version. At any rate, I can see why the C++98 version of Circling back to my original question, I just want to make sure that it is okay to use |
|
Now that I think about it, we should probably just delegate to |
What about falling back on
#include <Rcpp.h>

// [[Rcpp::export]]
bool is_nan(double x) {
    return R_isnancpp(x);
}

/*** R
is_nan(NA_real_)
# [1] TRUE
is_nan(NaN)
# [1] TRUE
is_nan(pi)
# [1] FALSE
*/
It seems like someone already did the hard work of making this portable. |
Sounds good to me; I'd be open to accepting a PR if that change was made. Honestly, the code I added for 'optimizing' NA checks in hindsight just looks dangerous / weird (i.e., the code is too clever for its own good); I'd be open to simplifying things and calling pre-established APIs. |
@nathan-russell Do you consider the |
@eddelbuettel No it's not urgent, especially since the code hasn't been vetted yet. Please feel free to proceed with releasing 0.12.7 without this addition. |
@kevinushey Re: the |
row/col-Sums/Means and unit tests - for #549
This should now be taken care of. |
I put together sugar functions for rowSums, colSums, rowMeans, and colMeans (for use with matrices, not data.frames) which all seem to be working as expected. However, early on in the process I switched from using Rcpp::traits::is_na<> to the R macro ISNAN to check for NAs / NaNs in numeric matrices because I noticed a large difference in performance. As an example,

I believe the difference is due to the memcmp call here. @kevinushey It looks like you authored this, so can you comment on whether or not it is safe to be using ISNAN in place of traits::is_na? I read the explanatory comment, and it made sense to me, but despite the fact that I am on a 64-bit machine with unsigned long long support, I was not able to reproduce the situation described. If I'm only using ISNAN to check values on input objects, is there any risk in it giving erroneous results? If so, is there some way to test this directly?
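For context, the kind of loop in which the missing-value check becomes the hot path can be sketched as follows. This is a minimal standalone illustration, not the sugar implementation from the PR: row_sums and its parameters are hypothetical, the matrix is assumed to be stored column-major (as R stores it), and std::isnan stands in for ISNAN.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: row sums of an nrow x ncol column-major matrix.
// With na_rm = true, NA / NaN entries are skipped; otherwise they
// propagate into the result through ordinary floating-point addition.
std::vector<double> row_sums(const std::vector<double>& data,
                             std::size_t nrow, std::size_t ncol,
                             bool na_rm = false) {
    std::vector<double> out(nrow, 0.0);
    // Walk column by column so memory is accessed sequentially.
    for (std::size_t j = 0; j < ncol; ++j) {
        for (std::size_t i = 0; i < nrow; ++i) {
            const double v = data[i + j * nrow];
            // One missing-value check per element: this is the call
            // whose cost dominates when the check itself is slow.
            if (na_rm && std::isnan(v)) continue;
            out[i] += v;
        }
    }
    return out;
}
```

Because the check runs once per element, a check that costs several instructions more than a plain NaN test (a byte-wise comparison, for instance) would be expected to show up directly in a benchmark of this loop.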