fread segfault #4644

Closed
renkun-ken opened this issue Jul 26, 2020 · 6 comments · Fixed by #4724

@renkun-ken
Member

renkun-ken commented Jul 26, 2020

I'm using the latest data.table release, and the following code causes a segfault:

library(data.table)
md <- fread("~/data/20200724.sample.log.gz",
  col.names = c("localtime", "event", "exchange_id", "instrument_id", "exchange_time",
    "pre_close", "last_price", "volume", "turnover",
    "bid_price", "ask_price", "bid_volume", "ask_volume"
  ),
  colClasses = c(
    "character", "factor", "factor", "character", "character",
    "double", "double", "double", "double",
    "double", "double", "integer", "integer"
  )
)
 *** caught segfault ***
address 0x5c, cause 'memory not mapped'

The gzipped data is attached.

20200724.sample.log.gz

Both the original file and the gzipped file cause the segfault. If I read only the top rows (e.g. nrows = 100), it does not crash.

If I don't use colClasses, it does not segfault either.
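
Given that observation, one possible stopgap is to let fread detect the column types itself and convert the columns afterwards. This is only a sketch and is not confirmed anywhere in this thread: the inline `txt` input and the column names `a`, `b`, `c` are hypothetical stand-ins for the attached log file, and the approach assumes the crash is tied to passing colClasses.

```r
library(data.table)

# Hypothetical small input standing in for the attached log file.
txt <- "a,b,c\nx,1.5,0.0\ny,2.5,1.0\n"

# Read with fread's own type detection, i.e. without colClasses.
dt <- fread(text = txt)

# Convert the columns to the desired types after reading.
dt[, a := as.factor(a)]
dt[, c := as.integer(c)]

str(dt)
```

Converting after the read costs an extra pass over the affected columns, and as.integer truncates any fractional part, so it is worth checking that the values really are whole numbers first.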

@mattdowle
Member

mattdowle commented Jul 26, 2020

Thanks. Confirmed.

mattdowle added this to the 1.13.1 milestone Jul 26, 2020
mattdowle added the bug label Jul 26, 2020
@hongyuanjia

hongyuanjia commented Aug 2, 2020

Hi @mattdowle. Is there any update on this issue? Most of my CI runs are failing because of this bug. I would like to know of any workaround before I submit a new version to CRAN. Many thanks!

@mattdowle
Member

mattdowle commented Aug 5, 2020

@hongyuanjia Sorry for the impact. I'm working on the valgrind issue now, then this segfault, then the [[ by-group slowdown, and then I will release. Hopefully by the end of the week, but no promises. I'm not aware of a workaround in the meantime other than going back to v1.12.8.

@st-pasha
Contributor

st-pasha commented Aug 5, 2020

MRE is

data.table:::fread("~/Downloads/20200724.sample.log.gz", colClasses=c(V13="integer"), verbose=TRUE)

@st-pasha
Contributor

st-pasha commented Aug 5, 2020

or even this:

data.table:::fread("0.0\n", colClasses="integer")

@hongyuanjia

hongyuanjia commented Aug 6, 2020

@mattdowle Thanks for letting me know the timeline. Take your time. I will keep track of this issue.
