test.data.table() fails on non-English locale #3039

Closed
MichaelChirico opened this issue Sep 7, 2018 · 11 comments · Fixed by #3553

Comments

@MichaelChirico MichaelChirico commented Sep 7, 2018

We might be able to fix this with a gettext approach if I can ever figure out how that works.

Closely related to #630 (not quite sure whether it's a duplicate).

This terminal's sessionInfo():

> R
R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

R は、自由なソフトウェアであり、「完全に無保証」です。 
一定の条件に従えば、自由にこれを再配布することができます。 
配布条件の詳細に関しては、'license()' あるいは 'licence()' と入力してください。 

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

'demo()' と入力すればデモをみることができます。 
'help()' とすればオンラインヘルプが出ます。 
'help.start()' で HTML ブラウザによるヘルプがみられます。 
'q()' と入力すれば R を終了します。 

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.4.3

(given the locale I'm not sure why everything comes up in Japanese on my Mac terminal...)

Output of test.data.table() on current master

Running /Library/Frameworks/R.framework/Versions/3.4/Resources/library/data.table/tests/tests.Rraw 
 要求されたパッケージ methods をロード中です 
Running test id 147      Test 147 didn't produce the correct error:
Expected:  unused argument 
Observed:   使われていない引数 (MySum = sum(v))  
Running test id 330      Test 330 didn't produce the correct warning:
Expected:  object 'd' not found 
Observed:   オブジェクト 'd' がありません  
Running test id 380      Test 380 didn't produce the correct error:
Expected:  locked binding 
Observed:   '.SD' に対するロックされたバインディングの値は変更できません  
Running test id 381      Test 381 didn't produce the correct error:
Expected:  locked binding 
Observed:   '.SD' に対するロックされたバインディングの値は変更できません  
Running test id 573      Test 573 didn't produce the correct error:
Expected:  object ' a' not found 
Observed:   オブジェクト ' a' がありません  
Running test id 715      Test 715 didn't produce the correct error:
Expected:  could not find function.*J 
Observed:   関数 "J" を見つけることができませんでした  
Running test id 1079.1      Test 1079.1 didn't produce the correct warning:
Expected:  row names.*discarded 
Observed:   列名は短い変数から見つけられ、捨て去られました  
Running test id 1137.7      Test 1137.7 didn't produce the correct error:
Expected:  invalid argument to unary 
Observed:   単項演算子に対する引数が不正です  
Running test id 1251.9      Test 1251.9 didn't produce the correct error:
Expected:  argument lengths differ 
Observed:   引数の長さが異なります  
Running test id 1291.1      Test 1291.1 didn't produce the correct warning:
Expected:  no non-missing arguments to max 
Observed:   max の引数に有限な値がありません: -Inf を返します  
Running test id 1294.4      Test 1294.4 didn't produce the correct warning:
Expected:  NAs introduced by coercion 
Observed:   強制変換により NA が生成されました  
Running test id 1294.11      Test 1294.11 didn't produce the correct warning:
Expected:  NAs introduced by coercion 
Observed:   強制変換により NA が生成されました  
Running test id 1439      Test 1439 didn't produce the correct error:
Expected:  nchar(dec) == 1L is not TRUE 
Observed:   nchar(dec) == 1L は TRUE ではありません  
Running test id 1536      Test 1536 didn't produce the correct error:
Expected:  argument 'incomparables != FALSE' 
Observed:   引数 'incomparables != FALSE' は (まだ) 使用されていません  
Running test id 1658.23      Test 1658.23 didn't produce the correct error:
Expected:  is.character\(file\).*not TRUE 
Observed:   is.character(file) && length(file) == 1L && !is.na(file) は TRUE ではありません  
Running test id 1671      Test 1671 didn't produce the correct error:
Expected:  invalid type/length (closure/1) 
Observed:   ベクトル割り当てにおいて、型か長さ (closure か 1) が不正です  
Running test id 1676.1      Test 1676.1 didn't produce the correct error:
Expected:  is not TRUE 
Observed:   length(na) == 1L は TRUE ではありません  
Running test id 1724.1      Test 1724.1 didn't produce the correct warning:
Expected:  NAs introduced by coercion 
Observed:   強制変換により NA が生成されました  
Running test id 1724.2      Test 1724.2 didn't produce the correct warning:
Expected:  NAs introduced by coercion 
Observed:   強制変換により NA が生成されました  
Running test id 1733.1      Test 1733.1 didn't produce the correct error:
Expected:  dec != sep is not TRUE 
Observed:   dec != sep は TRUE ではありません  
Running test id 1750.11      Test 1750.11 didn't produce the correct error:
Expected:  invalid 'type' (character) of argument 
Observed:   引数 'type' (character) が不正です  
Running test id 1750.12      Test 1750.12 didn't produce the correct error:
Expected:  not meaningful for factors 
Observed:   ‘sum’ は因子に対しては無意味です  
Running test id 1750.2      Test 1750.2 didn't produce the correct error:
Expected:  object 'stat' not found 
Observed:   オブジェクト 'stat' がありません  
Running test id 1750.22      Test 1750.22 didn't produce the correct error:
Expected:  object 'a' not found 
Observed:   オブジェクト 'a' がありません  
Running test id 1840.3      Test 1840.3 didn't produce the correct error:
Expected:  !is.na(sep) is not TRUE 
Observed:   !is.na(sep) は TRUE ではありません  
Running test id 1840.4      Test 1840.4 didn't produce the correct error:
Expected:  !is.na(sep) is not TRUE 
Observed:   !is.na(sep) は TRUE ではありません  
Running test id 1872.7      Test 1872.7 didn't produce the correct error:
Expected:  should be one of 
Observed:   'arg' は “weeks”, “months”, “quarters”, “years” の一つでなければなりません:  
Running test id 1918.1      Test 1918.1 didn't produce the correct error:
Expected:  not meaningful for factors 
Observed:   ‘min’ は因子に対しては無意味です  
Running test id 1918.2      Test 1918.2 didn't produce the correct error:
Expected:  not meaningful for factors 
Observed:   ‘max’ は因子に対しては無意味です  
Running test id 1924.1      Test 1924.1 didn't produce the correct error:
Expected:  not found\. Perhaps you intended.*varname 
Observed:   オブジェクト 'var_name' がありません  
Running test id 1924.2      Test 1924.2 didn't produce the correct error:
Expected:  Object.*not found among 
Observed:   オブジェクト 'variable' がありません  
Running test id 1924.3      Test 1924.3 didn't produce the correct error:
Expected:  non-numeric argument 
Observed:   二項演算子の引数が数値ではありません  
Running test id 1924.4      Test 1924.4 didn't produce the correct error:
Expected:  Perhaps you intended.*varname, VAR_NAME 
Observed:   オブジェクト 'var_name' がありません  
Running test id 1924.5      Test 1924.5 didn't produce the correct error:
Expected:  Perhaps you intended.*V1.*V5 or 45 more 
Observed:   オブジェクト 'V' がありません  
Running test id 1940         
10 longest running tests took 19s (45% of 42s)
      ID  time nTest
 1: 1874 3.684     5
 2: 1438 2.686   738
 3: 1648 2.218    91
 4: 1650 2.192    91
 5: 1652 2.180    91
 6: 1642 1.346    91
 7: 1646 1.292    91
 8: 1835 1.276     1
 9: 1644 1.245    91
10: 1739 1.216     5
 eval(exprs[i], envir) でエラー: 
  34 errors out of 7841 in 42.2sec on Fri Sep  7 12:10:04 2018. [endian==little, sizeof(long double)==16, sizeof(pointer)==8, TZ=Asia/Singapore, locale='C/UTF-8/C/C/C/C', l10n_info()='MBCS=TRUE; UTF-8=TRUE; Latin-1=FALSE']. Search inst/tests/tests.Rraw for test numbers: 147, 330, 380, 381, 573, 715, 1079.1, 1137.7, 1251.9, 1291.1, 1294.4, 1294.11, 1439, 1536, 1658.23, 1671, 1676.1, 1724.1, 1724.2, 1733.1, 1750.11, 1750.12, 1750.2, 1750.22, 1840.3, 1840.4, 1872.7, 1918.1, 1918.2, 1924.1, 1924.2, 1924.3, 1924.4, 1924.5.

@mattdowle mattdowle commented Sep 26, 2018

Is there a way to stop translation of error and warnings in R, so they appear as English? Just for during the test suite.

@MichaelChirico MichaelChirico commented Sep 26, 2018

@mattdowle there's supposed to be a way to do it with gettext -- all of the errors cited above, I think, are coming from base R errors which have translations, e.g.

‘max’ は因子に対しては無意味です

is literally the translation of

'max' is not meaningful for factors

which R has stored somewhere in its internals.

So in principle for these errors we should be able to replace the English-specified not meaningful for factors with something from gettext, but I've never been able to figure out how to do it.
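A minimal sketch of what that gettext-based comparison might look like, assuming the message lives in the `R-base` domain under the msgid `"%s not meaningful for factors"` (the form quoted from `R-ja.po` later in this thread). The variable names are illustrative only, and nothing here is data.table's actual `test()` code.

```r
# msgid as it appears in base R's message catalogue (assumed domain: "R-base")
msgid <- "%s not meaningful for factors"

# Translate the template into the session language; gettext() returns the
# input unchanged when no translation is available.
localized <- gettext(msgid, domain = "R-base")

# Drop the %s placeholder and surrounding whitespace to get a fixed substring
expected <- trimws(sub("%s", "", localized, fixed = TRUE))

# Trigger the base R error and capture its (possibly translated) message
observed <- tryCatch(max(factor("a")), error = function(e) conditionMessage(e))

# Compare translated expectation against the observed message
matches <- grepl(expected, observed, fixed = TRUE)
print(matches)
```

This only works when the expected string is an exact msgid from R's catalogue; the regular-expression fragments used in many tests (e.g. `could not find function.*J`) would not translate this way.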

@mattdowle mattdowle commented Sep 26, 2018

But gettext translates English input to Japanese. You're proposing, iiuc, to add gettext to our test() internals to convert error= and warning= strings that appear in tests.Rraw to Japanese and for test() to compare the two Japanese strings? I'm suggesting the other way around : temporarily stopping R from translating its messages to Japanese, so the R session outputs base R errors and warnings in English, just for the duration of test.data.table.
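A rough sketch of the "turn translation off temporarily" idea, assuming the running session honours the `LANGUAGE` environment variable. `with_english_messages` is a hypothetical helper name, not something in data.table, and whether a mid-session change to `LANGUAGE` takes effect may depend on platform and R version.

```r
# Hypothetical helper: evaluate `expr` with LANGUAGE set to "en", then put
# the previous value (or the unset state) back, even if `expr` errors.
with_english_messages <- function(expr) {
  old <- Sys.getenv("LANGUAGE", unset = NA_character_)
  Sys.setenv(LANGUAGE = "en")
  on.exit(
    if (is.na(old)) Sys.unsetenv("LANGUAGE") else Sys.setenv(LANGUAGE = old),
    add = TRUE
  )
  # `expr` is a lazy promise, so it is only evaluated here, after the switch
  force(expr)
}

# Capture a base R error message while the override is active
msg <- with_english_messages(
  tryCatch(max(factor("a")), error = function(e) conditionMessage(e))
)
print(msg)
```

Restoring through `on.exit()` matters here because the setting applies to the whole R session, not just to the test run.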

@MichaelChirico MichaelChirico commented Sep 26, 2018

Oh, I see. That should just (?) be a matter of setting/resetting locale variables? Though I'm not sure if a restart is required for these env settings to take effect...

@MichaelChirico MichaelChirico commented Sep 26, 2018

And yes that's what I was suggesting -- using gettext to match the R-produced translated errors to whatever the user's local settings are. Would be more non-English friendly this way but also more of a pain I think.

@mattdowle mattdowle commented Sep 26, 2018

How would that be more non-English friendly?

@MichaelChirico MichaelChirico commented Sep 26, 2018

Non-English user runs test.data.table() and sees one of the above errors (obviously they shouldn't see any error from test.data.table(), but supposing), e.g.

Running test id 1918.2      Test 1918.2 didn't produce the correct error:
Expected:  'max' not meaningful for factors 
Observed:   ‘min’ は因子に対しては無意味です  

Probably they can infer what's going wrong & most users of data.table probably have some basic competency in English to help them along, but (IINM) what I'm proposing would instead have resulted in this output for them:

Running test id 1918.2      Test 1918.2 didn't produce the correct error:
Expected:  'max' は因子に対しては無意味です
Observed:   ‘min’ は因子に対しては無意味です  

i.e. the whole error is in their native script/vocab & so friendlier for them to parse.

The flip side, though, is the question of what they should do with this information, since it may make it harder for them to communicate such errors to us meaningfully... it's murky territory.

@mattdowle mattdowle commented Sep 26, 2018

But what I'm proposing would result in :

Running test id 1918.2      Test 1918.2 didn't produce the correct error:
Expected: `max` not meaningful for factors 
Observed: `min` not meaningful for factors

i.e. English compared to English, even in Japanese locale. I see what you mean though : they might not be able to understand the difference because it is in English.
It's just base R errors and warnings that get translated, right? When users see data.table warnings and errors, they are in English aren't they? There's a mechanism for translations that can be added to a package I believe but I never looked into it.
Since test.data.table() is currently failing when users run locally, we should just get something working -- either way -- to solve the main problem.

@mattdowle mattdowle commented Sep 26, 2018

Sys.setlocale() is used in test 1590. A restart isn't needed. Changing the locale applies to the whole currently running R session so it has to be carefully set back afterwards. I'd hope there is a more direct way to turn off message translations though. I looked for a domain option but couldn't see one. domain seems to be the parameter gettext() uses. Somehow do we need to turn the default from NULL to NA?
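For reference, a small sketch of how `domain` behaves in `gettext()`, assuming the C-level msgid `"object '%s' not found"` is held in the `"R"` domain. As far as I know, `domain` is a per-call argument of `gettext()`, `stop()` and `warning()` rather than a session-wide option, so it would not by itself switch off translation of errors that base R raises. It can, however, be used to check whether the session is translating.

```r
# A C-level base R msgid; domain "R" is assumed to be the catalogue holding it
msgid <- "object '%s' not found"

# domain = NA asks gettext() not to translate at all
untranslated <- gettext(msgid, domain = NA)

# With an explicit domain, the result differs from the msgid only if a
# translation is active in this session
session_translates <- !identical(gettext(msgid, domain = "R"), msgid)

print(untranslated)
print(session_translates)
```

A test harness could use a flag like `session_translates` to decide whether comparing against English error strings makes sense at all; that is one possible design, not necessarily what was adopted.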

@MichaelChirico MichaelChirico commented May 9, 2019

"When users see data.table warnings and errors, they are in English aren't they?"

This is correct. There's no internal translation mechanism; translations are hard-coded, e.g. I see the error above in src/library/base/po/R-ja.po

msgid "%s not meaningful for factors"
msgstr " %s は因子に対しては無意味です "

@MichaelChirico MichaelChirico commented May 9, 2019

Searching around a bit, seems like testthat didn't really find a gettext solution either:

r-lib/testthat#565 (comment)
