Internal error: length(irows)!=length(o__) #3062

Closed
renkun-ken opened this issue Sep 22, 2018 · 1 comment

@renkun-ken renkun-ken commented Sep 22, 2018

After upgrading data.table from 1.11.4 to the latest release (1.11.6), the following code produces an error:

library(data.table)

set.seed(123)
n <- 100
dt <- data.table(group = rbinom(n, 5, 0.5), x = rnorm(n), flag = rbinom(n, 1, 0.9))

f1 <- function(data, group, x) { NULL }

dt[flag == 1 & group == 1, f1(.SD, group, x)] # (1)
dt[flag == 1, f1(.SD, group, x), keyby = group] # (2)

Running (2) after (1) results in the following error:

Error in `[.data.table`(dt, flag == 1, f1(.SD, group, x), keyby = group) : 
  Internal error: length(irows)!=length(o__)

If (1) is not run first, (2) works.
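
The order dependence can be checked in isolation with a small helper. This is only a sketch: `run_case` and its `run_first` argument are hypothetical names, not part of data.table, and each call rebuilds `dt` from scratch so that no state carries over between the two cases. It reports whether (2) completes instead of assuming an outcome, since that depends on the installed data.table version.

```r
library(data.table)

f1 <- function(data, group, x) { NULL }

# Hypothetical helper: rebuild the table, optionally run (1), then try (2).
run_case <- function(run_first) {
  set.seed(123)
  n <- 100
  dt <- data.table(group = rbinom(n, 5, 0.5), x = rnorm(n), flag = rbinom(n, 1, 0.9))
  if (run_first) dt[flag == 1 & group == 1, f1(.SD, group, x)]  # (1)
  tryCatch({
    dt[flag == 1, f1(.SD, group, x), keyby = group]             # (2)
    TRUE
  }, error = function(e) {
    # Only R-level errors are caught here; a segfault would still kill the session.
    message("(2) failed: ", conditionMessage(e))
    FALSE
  })
}

run_case(run_first = FALSE)  # (2) on a fresh table
run_case(run_first = TRUE)   # (2) after (1)
```

On the affected release the second call would be expected to return `FALSE` with the internal error shown above. On a version without the bug, both calls should return `TRUE`.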

The actual code in production ends in a segfault:

 *** caught segfault ***
address (nil), cause 'unknown'

Traceback:
 1: uniqlist(byval, order = o__)
 2: ...

The example above is a reduced form of the production code. I have not yet been able to reproduce the segfault with simple code, so I'm not sure whether the two failures are caused by the same bug. In production, however, (2) also works when code of the form (1) is not run before it, just as in the example above.

My session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.11.6

loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1   

The error can be reproduced on both R 3.4 and R 3.5, so I guess it is not caused by ALTREP, which was only introduced in R 3.5.

@renkun-ken renkun-ken commented Sep 23, 2018

A minimal example is

library(data.table)

set.seed(123)
n <- 10
dt <- data.table(group = rbinom(n, 5, 0.5), x = rnorm(n), flag = rbinom(n, 1, 0.9))
dt[flag == 1 & group == 1, 1] # (1)
dt[flag == 1, 1, keyby = group] # (2)

If the indices are removed with attr(dt, "index") <- NULL before calling (2), there is no error.

It looks like the problem lies in (2) using the existing index that (1) created.
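
A sketch of how this could be checked and worked around, under some assumptions: it uses `indices()` to list whatever secondary indices are attached after (1), and `setindex(dt, NULL)` to drop them, which is swapped in here as the documented equivalent of clearing the `"index"` attribute by hand. The variable names `idx_after_1`, `idx_after_drop` and `res` are illustrative only, and which index (if any) appears after (1) depends on the data.table version and on the auto-indexing options.

```r
library(data.table)

set.seed(123)
n <- 10
dt <- data.table(group = rbinom(n, 5, 0.5), x = rnorm(n), flag = rbinom(n, 1, 0.9))

dt[flag == 1 & group == 1, 1]        # (1) may leave an automatic index behind
idx_after_1 <- indices(dt)           # names of indices attached to dt, or NULL
print(idx_after_1)

setindex(dt, NULL)                   # drop all secondary indices
idx_after_drop <- indices(dt)        # should now be NULL

res <- dt[flag == 1, 1, keyby = group]  # (2), run without a pre-existing index
print(res)
```

If dropping indices after every subset is impractical, setting `options(datatable.auto.index = FALSE)` before (1) stops such indices from being created in the first place, and `options(datatable.use.index = FALSE)` stops existing ones from being used. Whether either option avoids the production segfault has not been verified here.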
