rbind unmatched columns crashes R session (Windows, dev version) #4402

Closed
shrektan opened this issue Apr 27, 2020 · 26 comments · Fixed by #4531
Labels: dev, platform-specific, translation (issues/PRs related to message translation projects)

@shrektan
Member

shrektan commented Apr 27, 2020

Test 450 will crash R session with dev version of data.table (but not with CRAN version) on Windows.

R version: 3.6.2
Platform: Windows 10

library(data.table)
DT = data.table(a=1:3,b=4:6)
rbind(DT,list(c=4L,a=7L))
@shrektan shrektan changed the title rbind float with integer crashes R session (Windows, dev version) rbind unmatched columns crashes R session (Windows, dev version) Apr 27, 2020
@jangorecki jangorecki added the dev label Apr 27, 2020
@jangorecki
Member

jangorecki commented Apr 27, 2020

Thanks for reporting.
On linux it is

Error in rbindlist(l, use.names, fill, idcol) : 
  Column 1 ['c'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.

@jangorecki jangorecki added this to the 1.12.9 milestone Apr 27, 2020
@shrektan
Member Author

This line leads to the crash.

data.table/src/rbindlist.c

Lines 210 to 211 in dd7609e

snprintf(buff, 1000, _("Column %d ['%s'] of item %d is missing in item %d. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.%s"),
w2+1, str, i+1, missi+1, extra );

@shrektan shrektan self-assigned this Apr 29, 2020
@shrektan
Member Author

shrektan commented Apr 29, 2020

I can confirm it's due to the Chinese translation because when I set the language to en, it works well.

@jangorecki jangorecki added the translation issues/PRs related to message translation projects label Apr 29, 2020
@jangorecki
Member

FYI @MichaelChirico

@shrektan
Member Author

Looks like we can't use _() with snprintf() on Windows, because when I change the code from

data.table/src/rbindlist.c

Lines 210 to 212 in dd7609e

snprintf(buff, 1000, _("Column %d ['%s'] of item %d is missing in item %d. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.%s"),
w2+1, str, i+1, missi+1, extra );
if (usenames==TRUE) error(buff);

to

if (usenames==TRUE) error( _("Column %d ['%s'] of item %d is missing in item %d. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.%s"),
              w2+1, str, i+1, missi+1, extra);

The crash will no longer happen.

If that's the case, we may have to check every place where snprintf() is combined with _().

@shrektan
Member Author

I'm almost sure that snprintf() can't be used with _() on Windows but I don't know the reason.

The following code (another place that uses snprintf() with _()) shows this: on Windows, unlike on Mac, it doesn't crash the R session, but it doesn't interpolate the values like %3$d.

library(data.table)
DT = data.table(a=1:3,b=2)
rbind(DT,list(b=5,a=3), use.names = F)

Mac: [screenshot of the error message]

Windows: [screenshot of the error message]

@MichaelChirico
Member

MichaelChirico commented Apr 30, 2020

Thanks for diving into this @shrektan. I have asked about this on r-devel, since it's a bit above my pay grade.

https://stat.ethz.ch/pipermail/r-devel/2020-April/079397.html

We would drop translation for such messages if there are no suggestions there... I also didn't find anything when googling for "snprintf" gettext windows...

In the meantime, I tried to replicate your error on Appveyor on the snprintf-windows-test branch, but a few different attempts at getting Appveyor into the Chinese locale didn't work. If you have an idea how to do that, could you give it a try?

@jangorecki
Member

jangorecki commented Apr 30, 2020

What if the size of Chinese characters is so big that it exceeds the size allocated for buff?
Actually that should not be an issue, because buff keeps our English message, unless gettext converts the text in buff in place, which is probably not happening.

@MichaelChirico
Member

I believe gettext is intercepting the text between when it's produced in C and when it's surfaced to the user, so the buffer would receive the char array as usual, then the .mo file is used to do a lookup & convert before showing to user.

I am no expert here at all but I don't see how it could work otherwise b/c of the buffer overflow issue. (FWIW Chinese messages will usually be "narrower" than the English translation but maybe not in edge cases)

@shrektan
Member Author

What if the size of Chinese characters is so big that it exceeds the size allocated for buff?

@jangorecki Yes, as you said, the buff size is not the cause. I actually tried increasing the buff size and, as expected, it doesn't fix this issue.
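For reference on the buffer-size question: C99 snprintf never writes more than the size it is given, and it returns the length the complete message would have needed, so a too-small buffer shows up as truncation rather than an overflow. A minimal sketch of that check follows; `format_item_message` and its shortened message are hypothetical and are not taken from rbindlist.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of data.table: formats a short message into
 * buf and reports whether it fit. Returns 1 if the whole message was written,
 * 0 if it was truncated or snprintf reported an error. */
int format_item_message(char *buf, size_t n, int column, const char *name)
{
    assert(buf != NULL && name != NULL && n > 0);
    /* snprintf writes at most n-1 characters plus the terminating NUL and
     * returns the number of characters the complete message needs. */
    int needed = snprintf(buf, n, "Column %d ['%s'] is missing", column, name);
    return needed >= 0 && (size_t)needed < n;
}
```

With a 100-byte buffer the helper returns 1; with a 10-byte buffer it returns 0 and the buffer holds only the first 9 characters.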

In the meantime, I tried to replicate your error on Appveyor on the snprintf-windows-test branch, but a few different attempts at getting Appveyor into the Chinese locale didn't work. If you have an idea how to do that, could you give it a try?

@MichaelChirico If the virtual machine can't be set to the Chinese locale, you may try adding a line language=cn (I only use "en" so I'm not sure if it's "cn") to ~/.Renviron.

@jangorecki jangorecki modified the milestones: 1.12.11, 1.12.9 May 26, 2020
@mattdowle
Member

mattdowle commented Jun 2, 2020

I tried to reproduce this locally, but no luck so far. I built R-devel with ASAN, which usually catches crashes like this. @shrektan when you say 'crash', is there any message at all, or does R.exe simply stop running and exit? Can you try R.exe at the prompt outside RStudio and see if that makes any difference or gets us any sort of error? Sounds like you're able to compile and try changes out ... if you change the snprintf call to sprintf (no n) and remove the 1000 argument, does it then work?

it doesn't interpolate the values like %3$d

That comment seems significant. The order of the arguments in that string (number, string, number, number, string) is different in the translation, which seems to be what the n$ prefix is for. If that's not working then a mixup in arguments to specifiers (a number where a string is expected) is just the sort of thing that can cause these hard to reproduce crashes.
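To make the n$ point concrete, here is a minimal sketch of one argument list being formatted by two different format strings, the way a translation is swapped in for the English text. It assumes a C library that supports POSIX positional specifiers (glibc and macOS do). The helper name and both format strings are hypothetical, not the actual data.table messages.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only. Formats the same three
 * arguments (int, string, int) with whichever format string is supplied. */
int format_message(char *buf, size_t n, const char *fmt,
                   int column, const char *name, int item)
{
    assert(buf != NULL && fmt != NULL && name != NULL && n > 0);
    return snprintf(buf, n, fmt, column, name, item);
}

/* English-style order: arguments are consumed left to right. */
const char *const FMT_SEQUENTIAL = "Column %d ['%s'] of item %d";
/* Translation-style order: the item comes first, selected with n$. */
const char *const FMT_POSITIONAL = "Item %3$d, column %1$d ['%2$s']";
```

If the C runtime does not understand n$, the second format string cannot pick its arguments by number, and the behaviour is no longer defined by the C standard.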

Anyway, here's what I get, see CRAN_Release.cmd for how I configured R-devel.

$ Rdevel-strict-gcc CMD INSTALL data.table_1.12.9.tar.gz
* installing to library ‘/home/mdowle/build/R-devel-strict-gcc/library’
* installing *source* package ‘data.table’ ...
** using staged installation
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer 9.3.0
zlib 1.2.11 is available ok
OpenMP supported
** libs
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c assign.c -o assign.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c between.c -o between.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c bmerge.c -o bmerge.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c chmatch.c -o chmatch.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c cj.c -o cj.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c coalesce.c -o coalesce.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c dogroups.c -o dogroups.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fastmean.c -o fastmean.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fcast.c -o fcast.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fifelse.c -o fifelse.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fmelt.c -o fmelt.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c forder.c -o forder.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c frank.c -o frank.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fread.c -o fread.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c freadR.c -o freadR.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c froll.c -o froll.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c frollR.c -o frollR.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c frolladaptive.c -o frolladaptive.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fsort.c -o fsort.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fwrite.c -o fwrite.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c fwriteR.c -o fwriteR.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c gsumm.c -o gsumm.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c ijoin.c -o ijoin.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c init.c -o init.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c inrange.c -o inrange.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c nafill.c -o nafill.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c nqrecreateindices.c -o nqrecreateindices.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c openmp-utils.c -o openmp-utils.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c quickselect.c -o quickselect.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c rbindlist.c -o rbindlist.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c reorder.c -o reorder.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c shift.c -o shift.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c subset.c -o subset.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c transpose.c -o transpose.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c types.c -o types.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c uniqlist.c -o uniqlist.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c utils.c -o utils.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c vecseq.c -o vecseq.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel-strict-gcc/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c wrappers.c -o wrappers.o
gcc -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -shared -L/usr/local/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o ijoin.o init.o inrange.o nafill.o nqrecreateindices.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -lz
if [ "data.table.so" != "datatable.so" ]; then mv data.table.so datatable.so; fi
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id datatable.so datatable.so; fi
installing to /home/mdowle/build/R-devel-strict-gcc/library/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)
$
$ LANGUAGE=zh_CN Rdevel-strict-gcc

R Under development (unstable) (2020-06-01 r78624) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R是自由软件,不带任何担保。
在某些条件下你可以将其自由散布。
用'license()'或'licence()'来看散布的详细条件。

R是个合作计划,有许多人为之做出了贡献.
用'contributors()'来看合作者的详细情况
用'citation()'会告诉你如何在出版物中正确地引用R或R程序包。

用'demo()'来看一些示范程序,用'help()'来阅读在线帮助文件,或
用'help.start()'通过HTML浏览器来看帮助文件。
用'q()'退出R.

> library(data.table)
data.table 1.12.9 IN DEVELOPMENT built 2020-06-02 20:26:08.262 UTC; mdowle using 1 threads (see ?getDTthreads).  Latest news: r-datatable.com
**********
用中文运行data.table。软件包只提供英语支持。当在在线搜索帮助时,也要确保检查英语错误信息。这个可以通过查看软件包源文件中的po/R-zh_CN.po和po/zh_CN.po文件获得,这个文件可以并排找到母语和英语错误信息。
**********
**********
This installation of data.table has not detected OpenMP support. It should still work but in single-threaded mode. If this is a Mac, please ensure you are using R>=3.4.0 and have followed our Mac instructions here: https://github.com/Rdatatable/data.table/wiki/Installation. This warning message should not occur on Windows or Linux. If it does, please file a GitHub issue.
**********
> DT = data.table(a=1:3, b=4:6)
> rbind(DT, list(c=4L,a=7L))
Error in rbindlist(l, use.names, fill, idcol) : 
  第 2 项的第 1 列 ['c'] 在第 1 项中并不存在。请使用 fill=TRUE 以用NA (或 NULL 若该列为列表(list))填充,或使用 use.names=FALSE 以忽略列名。
> for (i in 1:100) try(rbind(DT,list(c=4L,a=7L)))
Error in rbindlist(l, use.names, fill, idcol) : 
  第 2 项的第 1 列 ['c'] 在第 1 项中并不存在。请使用 fill=TRUE 以用NA (或 NULL 若该列为列表(list))填充,或使用 use.names=FALSE 以忽略列名。
Error in rbindlist(l, use.names, fill, idcol) : 
  第 2 项的第 1 列 ['c'] 在第 1 项中并不存在。请使用 fill=TRUE 以用NA (或 NULL 若该列为列表(list))填充,或使用 use.names=FALSE 以忽略列名。
... snip
>

I was hoping to see some sort of memory exception printed out by ASAN with a line number to the .c source. That's what happens in other cases.

Or maybe it truly is a Windows-only problem.

@mattdowle
Member

@shrektan are you using 32bit or 64bit R on Windows?

@shrektan
Member Author

shrektan commented Jun 3, 2020

@mattdowle

Yes, it's definitely a Windows-platform specific issue, in my opinion. Just reproduced it on another Windows computer:

  • I can confirm it's the same with or without RStudio.
  • Using the R.exe, the R window hangs for 1 or 2 seconds and then disappears silently without any extra message.
  • Both 32bit and 64bit R share the same issue.
  • This time I'm using R 4.0.0 Patched r78465 but previously this was also reproduced on R 4.0.0 released and R 3.6.2.

The session info:

image

My guess

In addition, I suspect the problem is related to encoding, because R has to convert the translated message to the proper native encoding in order to display it correctly. snprintf() is a C function rather than an R internal function, so it may not handle the string encoding conversion properly.

(I actually have no idea how the function _() is implemented and I can't find the documentation, so this is just a random guess.)

@jangorecki
Member

Don't know if that is related, but I see Rterm in your screenshot.
According to https://developer.r-project.org/Blog/public/2020/05/02/utf-8-support-on-windows/index.html

RTerm is a Windows application not using Unicode, like most of R it is implemented using the standard C library assuming that the encoding-specific operations will work according to the C locale. In R 4.0 and earlier, RTerm cannot handle non-representable characters.

@mattdowle
Member

mattdowle commented Jun 3, 2020

Looks like numbered argument specifiers (%n$), as used in the translation of this particular error, are a POSIX extension.

https://stackoverflow.com/questions/19327441/gcc-dollar-sign-in-printf-format-string
https://stackoverflow.com/questions/44543540/index-specification-in-printf-format-specifier

That would explain why it works on Linux and Mac, but we see the crash on Windows. This is consistent with @shrektan's output and what he wrote above already.

On Windows, looks like we can use _sprintf_p instead. Seems like a drop-in replacement for snprintf. So one way which springs to mind is to simply #define snprintf _sprintf_p on Windows.

https://docs.microsoft.com/en-us/cpp/c-runtime-library/printf-p-positional-parameters?view=vs-2019
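A rough sketch of that idea, assuming a toolchain whose C runtime actually exports `_sprintf_p` (not verified here for MinGW). `MSG_SNPRINTF` and `format_swapped` are hypothetical names used only for this illustration. The sketch uses a separate macro rather than redefining snprintf itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* MSG_SNPRINTF is a hypothetical name used only in this sketch. */
#ifdef _WIN32
  /* Microsoft's CRT documents _sprintf_p(buffer, sizeOfBuffer, format, ...)
   * as the variant that accepts positional parameters such as %2$s. */
  #define MSG_SNPRINTF _sprintf_p
#else
  /* POSIX C libraries accept positional parameters in snprintf itself. */
  #define MSG_SNPRINTF snprintf
#endif

/* Writes "<text>=<number>" using positional specifiers, so the arguments
 * are passed in the opposite order to how they appear in the output. */
int format_swapped(char *buf, size_t n, int number, const char *text)
{
    assert(buf != NULL && text != NULL && n > 0);
    return MSG_SNPRINTF(buf, n, "%2$s=%1$d", number, text);
}
```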

@MichaelChirico
Member

Nice finds! Feels like it should be documented in base R, I'll try & file a patch...

@mattdowle
Member

mattdowle commented Jun 4, 2020

All I really did in the end in that merged PR (#4523) was define _XOPEN_SOURCE 1 for MinGW to include POSIX extensions, I think. That's because, despite several attempts, MinGW compiled the call to _sprintf_p ok but failed to link to it. The fact the PR passes doesn't mean anything yet. @shrektan and @MichaelChirico could you test please? I merged it to master to make it easier for @shrektan to grab latest dev.

https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html

R itself seems to set _XOPEN_SOURCE via _GNU_SOURCE.
See https://raw.githubusercontent.com/wch/r-source/trunk/configure. That refers to gettext which makes me think we're barking up the right tree:

 ## We call AC_GNU_SOURCE early (it is a prerequisite for the gettext
 ## macros), so all the C compiling makes use of that.

R does seem to have code to interpret %n$ positional too, but that code seems to be for the R level sprintf function to support positional specifiers. When it comes to C level, at the moment, my guess is that R is using POSIX extensions of MinGW.

@MichaelChirico
Member

Another nice find on _GNU_SOURCE, I had tried to find _XOPEN_SOURCE to no avail.

My own reading per #4523 (comment) was that the trioremap.h header was the workhorse but certainly the _GNU_SOURCE would trump that, I wonder if/why both are necessary...

Sorry for the unforeseen headaches of introducing translations!

@shrektan
Member Author

shrektan commented Jun 4, 2020

@mattdowle Thanks for all these hard investigations. Unfortunately, the dev version of data.table doesn't fix this issue on my computer. The issue is the same: R hangs for 1 or 2 seconds before the window gets closed silently.

I've tried it on x64 / i386 with Rgui.exe / R.exe. All behave the same...

@MichaelChirico
Member

Hmm at this point, we might consider just removing translation for snprintf messages with >1 formatter. I count 17 of them in current src. Let me know and I will file PR.

We can return to getting those translated later...

@mattdowle
Member

mattdowle commented Jun 5, 2020

@shrektan Please re-test master with #4524 now merged. Thanks to you getting the Chinese locale to work on AppVeyor, that test is passing now; fingers crossed! Will leave this issue open until you confirm.

To do before closing:

  • shrektan confirm
  • Michael spotted %1$ could match to %%1$, fix and add test
  • add error in unlikely event that strlen(fmt) > n (fmt is written to dest as tmp)
  • check that strlen(buff) + strlen(fmt) < n. If not, error or truncate? error easier than exact truncate to n (and easier for us is ok in this instance)
  • clang compile warning Michael shows below
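The %1$ versus %%1$ item can be illustrated with a small scanner that treats an escaped percent as a unit before looking for a positional specifier. This is only a sketch of the matching rule, not the code from the PR. `has_positional_spec` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not data.table's actual code: returns true if fmt
 * contains a positional conversion such as %1$d. A literal percent written
 * as %% is skipped as a unit, so "%%1$d" is not reported. */
bool has_positional_spec(const char *fmt)
{
    assert(fmt != NULL);
    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p != '%') continue;
        if (p[1] == '%') { ++p; continue; }    /* escaped percent: skip both */
        const char *q = p + 1;
        while (isdigit((unsigned char)*q)) ++q; /* digits of the position */
        if (q > p + 1 && *q == '$') return true;
    }
    return false;
}
```

A plain substring search for "%1$" would wrongly report a match inside "%%1$d", which is the case the to-do item describes.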

@shrektan
Member Author

shrektan commented Jun 5, 2020

@mattdowle I confirm the issue is fixed: no crash, and the Chinese error message appears as expected. Thanks for all of this!

Can't believe it turned out to be such a complicated issue... and it's really amazing that you found the root cause and fixed it in such a short time.

Thanks again :D!

@MichaelChirico
Member

I'm also getting this compile warning on current master:

#define sprintf USE_SNPRINTF_NOT_SPRINTF  // prevent use of sprintf in data.table source; force us to use n always
        ^
/usr/include/secure/_stdio.h:46:9: note: previous definition is here
#define sprintf(str, ...) \
        ^

@jangorecki
Member

I am not getting warnings like this... gcc 7.5.0 here.
We could undefine it before defining our own macro.

@MichaelChirico
Member

clang for me. I guess we just need to add:

#ifdef sprintf
#undef sprintf
#endif
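Put together with the existing define, the guard might look like the sketch below. Only the macro name USE_SNPRINTF_NOT_SPRINTF comes from the warning quoted above. `format_count` is a hypothetical function added to show that snprintf is unaffected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Some platform headers already define sprintf as a macro, so drop any
 * earlier definition before replacing it; this avoids the redefinition
 * warning. */
#ifdef sprintf
#undef sprintf
#endif
/* Any later use of sprintf now expands to an undeclared identifier and
 * fails to compile, which forces the bounded snprintf to be used instead. */
#define sprintf USE_SNPRINTF_NOT_SPRINTF

/* Hypothetical example function: snprintf is a different token and is
 * not touched by the macro above. */
int format_count(char *buf, size_t n, int count)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%d rows", count);
}
```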
