fread("") segfault on empty input in R-devel #2499

Closed
mattdowle opened this issue Nov 23, 2017 · 1 comment

mattdowle commented Nov 23, 2017

Test 885.1 fails with a segfault instead of the graceful error about empty input:
test(885.1, fread(""), error="empty")
This looks to me like a temporary glitch in R-devel related to CHAR() on R_BlankString and/or R_BlankScalarString, but I could be wrong. I have asked Luke Tierney.

$ Rdevel CMD INSTALL data.table_1.10.5.tar.gz
$ Rdevel
> require(data.table)
> fread("")

Correct behavior:
Error in fread("") : 
  Input is either empty or fully whitespace after the skip or autostart. Run again with verbose=TRUE.

Observed:
$ Rdevel CMD INSTALL data.table_1.10.5.tar.gz 
* installing to library ‘/home/mdowle/build/R-devel/library’
* installing *source* package ‘data.table’ ...
** libs
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c assign.c -o assign.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c between.c -o between.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c bmerge.c -o bmerge.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c chmatch.c -o chmatch.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c dogroups.c -o dogroups.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fastmean.c -o fastmean.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fcast.c -o fcast.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fmelt.c -o fmelt.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c forder.c -o forder.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c frank.c -o frank.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fread.c -o fread.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c freadR.c -o freadR.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fsort.c -o fsort.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fwrite.c -o fwrite.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c fwriteR.c -o fwriteR.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c gsumm.c -o gsumm.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c ijoin.c -o ijoin.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c init.c -o init.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c inrange.c -o inrange.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c nqrecreateindices.c -o nqrecreateindices.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c openmp-utils.c -o openmp-utils.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c quickselect.c -o quickselect.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c rbindlist.c -o rbindlist.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c reorder.c -o reorder.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c shift.c -o shift.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c subset.c -o subset.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c transpose.c -o transpose.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c uniqlist.c -o uniqlist.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c vecseq.c -o vecseq.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -I"/home/mdowle/build/R-devel/include" -DNDEBUG   -I/usr/local/include   -fpic  -Og -g -c wrappers.c -o wrappers.o
clang-5.0 -fsanitize=undefined,address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer -shared -L/usr/local/lib -o data.table.so assign.o between.o bmerge.o chmatch.o dogroups.o fastmean.o fcast.o fmelt.o forder.o frank.o fread.o freadR.o fsort.o fwrite.o fwriteR.o gsumm.o ijoin.o init.o inrange.o nqrecreateindices.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o subset.o transpose.o uniqlist.o vecseq.o wrappers.o
mv data.table.so datatable.so
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id datatable.so datatable.so; fi
installing to /home/mdowle/build/R-devel/library/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (data.table)
mdowle@MattsMac:~/GitHub/data.table$ Rdevel

R Under development (unstable) (2017-11-21 r73768) -- "Unsuffered Consequences"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> require(data.table)
Loading required package: data.table
data.table 1.10.5 IN DEVELOPMENT built 2017-11-22 20:46:01.155 UTC; mdowle
**********
This installation of data.table has not detected OpenMP support. It should still work but in single-threaded mode. If this is a Mac, please ensure you are using R>=3.4.0 and have installed the MacOS binary package from CRAN: see ?install.packages, the 'type=' argument and the 'Binary packages' section. If you compiled from source, please reinstall and precisely follow the installation instructions on the data.table homepage. This warning message should not occur on Windows or Linux. If it does and you've followed the installation instructions on the data.table homepage, please file a GitHub issue.
**********
  The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
  Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
  Release notes, videos and slides: http://r-datatable.com
> fread("")
=================================================================
==4838==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000002259bbf at pc 0x7f23a20d945e bp 0x7ffc2bdbd0d0 sp 0x7ffc2bdbd0c8
READ of size 1 at 0x000002259bbf thread T0
    #0 0x7f23a20d945d  (/home/mdowle/build/R-devel/library/data.table/libs/datatable.so+0xf445d)
    #1 0x7f23a20f4790  (/home/mdowle/build/R-devel/library/data.table/libs/datatable.so+0x10f790)
    #2 0x70513d  (/home/mdowle/build/R-devel/bin/exec/R+0x70513d)
    #3 0x76cd17  (/home/mdowle/build/R-devel/bin/exec/R+0x76cd17)
    #4 0x8c10e5  (/home/mdowle/build/R-devel/bin/exec/R+0x8c10e5)
    #5 0x884bf3  (/home/mdowle/build/R-devel/bin/exec/R+0x884bf3)
    #6 0x8e3c57  (/home/mdowle/build/R-devel/bin/exec/R+0x8e3c57)
    #7 0x88592d  (/home/mdowle/build/R-devel/bin/exec/R+0x88592d)
    #8 0x97aa9b  (/home/mdowle/build/R-devel/bin/exec/R+0x97aa9b)
    #9 0x97f510  (/home/mdowle/build/R-devel/bin/exec/R+0x97f510)
    #10 0x97f311  (/home/mdowle/build/R-devel/bin/exec/R+0x97f311)
    #11 0x5209ba  (/home/mdowle/build/R-devel/bin/exec/R+0x5209ba)
    #12 0x7f23a906c3f0  (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
    #13 0x42c859  (/home/mdowle/build/R-devel/bin/exec/R+0x42c859)

0x000002259bbf is located 1 bytes to the left of global variable 'newFileName' defined in 'sys-unix.c:108:13' (0x2259bc0) of size 4096
0x000002259bbf is located 59 bytes to the right of global variable 'num_initialized' defined in 'system.c:153:12' (0x2259b80) of size 4
SUMMARY: AddressSanitizer: global-buffer-overflow (/home/mdowle/build/R-devel/library/data.table/libs/datatable.so+0xf445d) 
Shadow bytes around the buggy address:
  0x000080443320: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x000080443330: 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x000080443340: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
  0x000080443350: 04 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
  0x000080443360: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
=>0x000080443370: 04 f9 f9 f9 f9 f9 f9[f9]00 00 00 00 00 00 00 00
  0x000080443380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080443390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000804433a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000804433b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000804433c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==4838==ABORTING

It's this line, reached when inputArg is the blank string "":
https://github.com/Rdatatable/data.table/blob/master/src/freadR.c#L87

>
> 86        ch = ch2 = (const char *)CHAR(STRING_ELT(inputArg,0));
> 87        while (*ch2!='\n' && *ch2!='\r' && *ch2!='\0') ch2++;
> (gdb) p ch
> $1 = 0x28 <error: Cannot access memory at address 0x28>

CHAR() seems to be misbehaving on R_BlankString, which is very odd.
Everything else looks correct:

> (gdb) p Rf_PrintValue(inputArg)
> [1] ""
> $2 = void
> (gdb) p Rf_PrintValue(STRING_ELT(inputArg,0))
> <CHARSXP: "">
> $3 = void
> (gdb) p inputArg
> $4 = (SEXP) 0x625008b21be0
> (gdb) p R_BlankScalarString
> $5 = (SEXP) 0x6250000046c8
> (gdb) p STRING_ELT(inputArg,0)
> $6 = (struct SEXPREC *) 0x625000004738
> (gdb) p STRING_ELT(R_BlankScalarString,0)
> $7 = (struct SEXPREC *) 0x625000004738
> (gdb) p LENGTH(R_BlankScalarString)
> $8 = 1
> (gdb) p R_BlankString
> $9 = (SEXP) 0x625000004738
> (gdb) p LENGTH(R_BlankString)
> $10 = 0
> (gdb) p Rf_PrintValue(R_BlankString)
> <CHARSXP: "">
> $11 = void

mattdowle added this to the v1.10.6 milestone Nov 23, 2017

mattdowle commented Dec 7, 2017

In the end, it wasn't R-devel at all.
The "Cannot access memory" error from gdb was a red herring.
Turning on verbose=TRUE showed that it was getting further, and that the real problem was the eof[-1] access. (I had compiled both the package and R itself with -g and -Og, so I don't see why ASAN didn't report the line number.)

Thanks to Luke Tierney for solving it.
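
For anyone reading this later, the eof[-1] failure mode can be sketched in isolation. This is only a minimal illustration, not the code in fread.c and not the fix that was applied: the helper names ends_with_newline and string_ends_with_newline are hypothetical, and the sketch assumes sof points at the first input byte and eof one past the last. With empty input, sof == eof, so an unguarded eof[-1] reads the byte just before the buffer, which is consistent with the 1-byte READ in the ASan report above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration; not data.table's actual code.
   sof: first byte of the input. eof: one past the last byte of the input. */
static bool ends_with_newline(const char *sof, const char *eof)
{
    assert(sof != NULL && eof != NULL && sof <= eof);
    /* Empty input: eof[-1] would be sof[-1], one byte before the buffer,
       so return before touching it. */
    if (eof == sof) return false;
    return eof[-1] == '\n' || eof[-1] == '\r';
}

/* Convenience wrapper for a NUL-terminated string, such as the pointer
   CHAR() returns for a CHARSXP. */
static bool string_ends_with_newline(const char *s)
{
    return ends_with_newline(s, s + strlen(s));
}
```

Calling string_ends_with_newline("") returns false without reading outside the string, whereas the same check without the eof == sof guard would read one byte to the left of the buffer.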
