Getting Segfault writing feather #68

Closed
shapenaji opened this issue Mar 30, 2016 · 9 comments

@shapenaji

OS: OS X El Capitan
R 3.2.4

Started with a csv: FileName.csv - 64 columns 213k rows

library(data.table)
library(feather)

z <- fread('FileName.csv')

Read 213246 rows and 64 (of 64) columns from 0.680 GB file in 00:00:08
Warning message:
In fread("FileName.csv") :
C function strtod() returned ERANGE for one or more fields. The first was string input '4.57829471736681e-314'. It was read using (double)strtold() as numeric value 4.5782947173668142E-314 (displayed here using %.16E); loss of accuracy likely occurred. This message is designed to tell you exactly what has been done by fread's C code, so you can search yourself online for many references about double precision accuracy and these specific C functions. You may wish to use colClasses to read the column as character instead and then coerce that column using the Rmpfr package for greater accuracy.

# Switch it into data.frame just to be sure
df <- as.data.frame(z)

> write_feather(df,'testfeath')

*** caught segfault ***
address 0x10, cause 'memory not mapped'

Traceback:
1: .Call("feather_writeFeather", PACKAGE = "feather", df, path)
2: writeFeather(x, path)
3: write_feather(df, "testfeath")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:

Any idea what this could be?

@hadley
Collaborator

hadley commented Mar 31, 2016

Can you please provide a reproducible example, preferably without any data.table? (to ensure that isn't the root cause)

@shapenaji
Author

I'll see what I can do. There are a lot of random characters in this df, and for proprietary reasons I unfortunately can't include it. I'll see if I can find a subset that still triggers the crash; maybe I can randomly sample the characters in my fields...
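One way to approach the "randomly sample the characters" idea is sketched below. It replaces every character in a character column with a random lowercase letter while keeping string lengths, empty strings, and NA values in place. `scramble_strings` and `scramble_df` are hypothetical helper names, not part of data.table or feather, and there is no guarantee the scrambled data still triggers the crash.

```r
# Hypothetical helper: replace each character with a random letter,
# preserving string length, empty strings, and NA positions.
scramble_strings <- function(x) {
  stopifnot(is.character(x))
  out <- x
  keep <- !is.na(x)
  out[keep] <- vapply(x[keep], function(s) {
    n <- nchar(s, type = "chars")
    # sample() with size 0 yields character(0), which pastes to ""
    paste(sample(letters, n, replace = TRUE), collapse = "")
  }, character(1), USE.NAMES = FALSE)
  out
}

# Apply the scrambler to every character column of a data frame,
# leaving all other columns untouched.
scramble_df <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.character(col)) scramble_strings(col) else col
  })
  df
}
```

Calling `scramble_df(df)` on the problem data frame would give a shareable stand-in with the same shape, assuming the contents of the strings are the only sensitive part.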

@shapenaji
Author

Reproduced it. It's coming from fread; read.csv does not trigger it, but I can't seem to reproduce it with random data. Trying to figure out which one it is.

@hadley
Collaborator

hadley commented Apr 1, 2016

Try narrowing down to a specific column. The most likely culprit is a character column, given that read.csv() doesn't exhibit the problem.
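Since a segfault can take down the R session, it may be easier to inspect the columns first and only then try writing the suspicious ones individually. The following is a minimal sketch of such an inspection; `summarise_columns` is a hypothetical helper, and the properties it reports (class, NA count, empty-string count) are just a guess at what might be worth looking at.

```r
# Hypothetical helper: one row per column, so unusual columns stand out.
summarise_columns <- function(df) {
  data.frame(
    column = names(df),
    class = vapply(df, function(col) class(col)[1], character(1)),
    n_na = vapply(df, function(col) sum(is.na(col)), integer(1)),
    n_empty = vapply(df, function(col) {
      # Only string-like columns can contain empty strings.
      if (is.character(col) || is.factor(col)) {
        sum(as.character(col) == "", na.rm = TRUE)
      } else {
        0L
      }
    }, integer(1)),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}
```

A candidate column can then be tested on its own with something like `write_feather(df["some_column"], "test.feather")`, where `some_column` stands in for whichever name the summary points to.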

@shapenaji
Author

Sorry for the delay. Got it: it seems to have to do with empty strings.

Here we go:

library(data.table)
library(feather)
write.csv(data.table(x = rep('',5)),'test.csv')
z <- fread('test.csv', data.table = FALSE)
write_feather(z,  'testfeath')

@hadley
Collaborator

hadley commented Apr 1, 2016

Doesn't seem to be anything to do with data.table:

write_feather(data.frame(x = rep('',5)), "test.feather")

@wesm Am I doing something wrong here? Here's the backtrace:

* thread #1: tid = 0x177ea17, 0x000000010a8f9e07 feather.so`chrToPrimitiveArray(SEXPREC*) [inlined] feather::Buffer::data(this=<unavailable>) const + 7 at buffer.h:51, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
  * frame #0: 0x000000010a8f9e07 feather.so`chrToPrimitiveArray(SEXPREC*) [inlined] feather::Buffer::data(this=<unavailable>) const + 7 at buffer.h:51
    frame #1: 0x000000010a8f9e00 feather.so`chrToPrimitiveArray(x=0x00007fff5fbfcdc0) + 704 at feather-write.cpp:221
    frame #2: 0x000000010a8fa5b7 feather.so`addCategoryColumn(table=0x00007fff5fbfcef8, name="x", x=0x000000010588dd28) + 119 at feather-write.cpp:265
    frame #3: 0x000000010a8fabfb feather.so`addColumn(table=0x00007fff5fbfcef8, name="x", x=0x000000010588dd28) + 59 at feather-write.cpp:314
    frame #4: 0x000000010a8faeda feather.so`writeFeather(df=Rcpp::DataFrame @ 0x00007fff5fbfcfa0, path=<unavailable>) + 538 at feather-write.cpp:340

I think the problem is that because there are only empty strings, the size_ of the BufferBuilder is 0 and data_ is a nullptr.
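If that reading is right, the trigger would be a string column whose strings add up to zero bytes. Below is a minimal R sketch, under that assumption, for flagging such columns before calling write_feather() on an affected version. `zero_length_string_columns` is a hypothetical helper, not part of feather. Treating factor levels as the serialised strings is an inference from the addCategoryColumn frame in the backtrace.

```r
# Hypothetical helper: names of columns whose strings total zero bytes.
zero_length_string_columns <- function(df) {
  is_suspect <- vapply(df, function(col) {
    if (is.factor(col)) {
      # Assumption: for factors, the levels are what gets written as strings.
      strings <- levels(col)
    } else if (is.character(col)) {
      strings <- col
    } else {
      return(FALSE)
    }
    # Drop NA first: nchar(NA_character_, type = "bytes") is 2, not 0.
    strings <- strings[!is.na(strings)]
    length(strings) > 0 && sum(nchar(strings, type = "bytes")) == 0
  }, logical(1))
  names(df)[is_suspect]
}
```

For the one-column example above, `zero_length_string_columns(data.frame(x = rep('', 5)))` would flag `x` whether the column ends up as a factor or as character.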

@wesm
Owner

wesm commented Apr 1, 2016

I'm able to get a core dump in Python, too. Patch incoming.

@wesm
Owner

wesm commented Apr 1, 2016

Can you confirm #86 fixes the bug in R, too?

@shapenaji
Author

Works for me! Thank you!

@wesm closed this as completed in 890a1e3 on Apr 1, 2016