NA as a factor level malformed by rbindlist #3915

Closed
sindribaldur opened this issue Sep 26, 2019 · 1 comment · Fixed by #3909

@sindribaldur commented Sep 26, 2019

data.table(V1 = factor(as.character(c(NA, 1:100, NA)), exclude = NULL))
# Error in as.character.factor(x) : malformed factor
data.table(V1 = factor(as.character(c(NA, 1:3, NA)), exclude = NULL))
#      V1
# 1: <NA>
# 2:    1
# 3:    2
# 4:    3
# 5: <NA>

Is this the expected behaviour?

Another example here: https://stackoverflow.com/q/58103098/4552295

I'm getting this result with data.table_1.12.2 and R version 3.6.1.

@jangorecki (Member) commented Sep 26, 2019

Same in devel. Thanks for reporting. This happens when printing a data.table with more than 100 rows, because print then shows only the head and tail of the table.

The root cause is the rbindlist call in this line of the print code:

toprint = rbindlist(list(head(x, topn), tail(x, topn)), use.names=FALSE) # no need to match names because head and tail of same x, and #3306
This call produces a malformed factor, so the issue is actually in rbindlist. FYI @mattdowle

Minimal example:

x = data.table(V1 = factor(as.character(c(NA, 1:3, NA)), exclude = NULL))
rbindlist(list(x), use.names=FALSE)$V1
# Error in as.character.factor(x) : malformed factor
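
For readers unfamiliar with the error message, here is a minimal base R sketch of how a factor with NA as a level is stored, and of one way a "malformed factor" error can arise. The `bad` object is a hypothetical, hand-built corruption (codes pointing past the end of the levels); the thread does not state that this is exactly what rbindlist produced.

```r
# Base R only; data.table is not needed for this illustration.
f <- factor(c(NA, "1", "2", NA), exclude = NULL)

# With exclude = NULL, NA is kept as a real level rather than dropped.
print(levels(f))

# The NA elements carry an ordinary integer code that points at the NA level.
print(as.integer(f))

# Hypothetical corruption (an assumption for illustration only):
# the codes still reference a third level, but only two levels remain.
bad <- structure(c(3L, 1L, 2L, 3L), levels = c("1", "2"), class = "factor")

# as.character() on such an object fails; capture the message instead of stopping.
msg <- tryCatch(as.character(bad), error = function(e) conditionMessage(e))
print(msg)
```

The point of the sketch is only that a factor's integer codes and its levels must stay consistent; any operation that rewrites one without the other can leave an object that base R refuses to convert.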

@jangorecki self-assigned this Sep 26, 2019
@jangorecki added this to the 1.12.4 milestone Sep 26, 2019
@jangorecki assigned mattdowle and unassigned jangorecki Sep 26, 2019
@jangorecki changed the title from "NA as a factor level in a data.table column" to "NA as a factor level malformed by rbindlist" Sep 26, 2019
@mattdowle mentioned this issue Sep 26, 2019