sort case-insensitive to make RcppExports more deterministic #878
A number of packages that have required locale-independent sorting have implemented a helper for it. We could potentially do something similar in Rcpp: explicitly set LC_COLLATE, sort the list of files, and then restore the old LC_COLLATE.
But if setting |
Codecov Report
@@           Coverage Diff           @@
##           master     #878   +/-   ##
=======================================
  Coverage   90.09%   90.09%
=======================================
  Files          71       71
  Lines        3271     3271
=======================================
  Hits         2947     2947
  Misses        324      324
I'll leave it to your judgement: I agree this is a quick win which, at least for me, covers my fairly vanilla Ubuntu and Mac setups, which were producing different RcppExports code. I'm sure users on case-insensitive Windows would also run into bad interactions when also developing on case-sensitive operating systems.
This is an easy win. We could always look into setting the locale to C, sorting, then resetting as @kevinushey suggested. But until then this works for me.
Could we please change the locale here? That will also ensure consistency from system to system.
Quoting @jjallaire in the same thread:
That link is to this issue? The problem with only ignoring the case is that you will still get a different default ordering in locales that don't order letters the same way as English. This means that collaborators with different locales will still generate different RcppExports.
@jackwasey or @kevinushey Do you see a way to change this so that we can get completely stable ordering across all systems and locales?
We could set the locale to C temporarily and then sort the discovered files, e.g.

cppFiles <- list.files(srcDir, pattern = "\\.((c(c|pp)?)|(h(pp)?))$", ignore.case = TRUE)
locale <- Sys.getlocale(category = "LC_COLLATE")
Sys.setlocale(category = "LC_COLLATE", locale = "C")
cppFiles <- sort(cppFiles)
Sys.setlocale(category = "LC_COLLATE", locale = locale)
@kevinushey this looks good to me.
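For reference, a minimal sketch of how the snippet above could be wrapped so that LC_COLLATE is restored even if sorting fails partway. The helper name sortFilesC is hypothetical and not part of Rcpp; it only uses base R calls (Sys.getlocale, Sys.setlocale, on.exit, sort).

```r
# Hypothetical helper, not part of Rcpp: sort file names under the C collation
# and put the caller's LC_COLLATE back afterwards.
sortFilesC <- function(files) {
    oldCollate <- Sys.getlocale(category = "LC_COLLATE")
    # on.exit runs when the function returns or errors, so the original
    # collation is restored in both cases.
    on.exit(Sys.setlocale(category = "LC_COLLATE", locale = oldCollate), add = TRUE)
    Sys.setlocale(category = "LC_COLLATE", locale = "C")
    sort(files)
}

# Made-up file names for illustration.
sorted <- sortFilesC(c("b.cpp", "A.cpp", "a.cpp", "B.cpp"))
print(sorted)
# [1] "A.cpp" "B.cpp" "a.cpp" "b.cpp"
```

compileAttributes could then call something like sortFilesC(cppFiles) right after list.files; that wiring is an assumption rather than something confirmed in this thread.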
compileAttributes uses list.files, which gives a locale-dependent ordering of the matched files. This differs between C and en_US.UTF-8, with C ordering putting upper-case file names first. This makes the code output to RcppExports.* different depending on the locale.

It was not obvious to me how to make this locale independent using base R, but at least ignoring case when calling list.files eliminates some of the discrepancies produced with these two widely used collations. People calling compileAttributes() from locales which are not ordered A-Z will see differences in the output order (and thus order of functions in the generated code).
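One base R option that may address the locale question: sort() and order() accept method = "radix", which the R documentation describes as collating character vectors in the C locale regardless of the session's LC_COLLATE. A minimal sketch with made-up file names, offered as an assumption about what could work here rather than what this PR implements:

```r
# Made-up file names standing in for the result of list.files(srcDir, ...).
cppFiles <- c("b.cpp", "A.cpp", "a.cpp", "B.cpp")

# method = "radix" uses C-locale collation for character vectors,
# so the result does not depend on the session's LC_COLLATE.
stableFiles <- sort(cppFiles, method = "radix")
print(stableFiles)
# [1] "A.cpp" "B.cpp" "a.cpp" "b.cpp"
```

Upper-case names come first, matching the C ordering described above, and no locale needs to be set or restored.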