'memory not mapped' in setdiff when NA present in a factor column (dev version)' #1526

Closed
shntnu opened this issue Nov 11, 2015 · 5 comments

shntnu commented Nov 11, 2015

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
> packageVersion("dplyr")
[1] ‘0.4.3.9000’
>
> dplyr::setdiff(data.frame(var = c(NA, "a"), stringsAsFactors = F), data.frame(var = c("a"), stringsAsFactors = F))
   var
1 <NA>
> 
> dplyr::setdiff(data.frame(var = c(NA, "a"), stringsAsFactors = T), data.frame(var = c("a"), stringsAsFactors = T))

 *** caught segfault ***
address 0x7f9f9ef31028, cause 'memory not mapped'

Traceback:
 1: .Call("dplyr_setdiff_data_frame", PACKAGE = "dplyr", x, y)
 2: setdiff_data_frame(x, y)
 3: setdiff.data.frame(data.frame(var = c(NA, "a"), stringsAsFactors = T),     data.frame(var = c("a"), stringsAsFactors = T))
 4: dplyr::setdiff(data.frame(var = c(NA, "a"), stringsAsFactors = T),     data.frame(var = c("a"), stringsAsFactors = T))
shntnu changed the title from "memory not mapped on select/setdiff (dev version)" to "'memory not mapped' in setdiff when NA present in a factor column (dev version)'" Nov 11, 2015
hadley (Member) commented Mar 1, 2016

Can you please provide a reproducible example? i.e. something I can copy and paste into R directly.

hadley added the reprex label Mar 1, 2016

shntnu (Author) commented Mar 1, 2016

dplyr::setdiff(data.frame(var = c(NA, "a"), stringsAsFactors = F), data.frame(var = c("a"), stringsAsFactors = F))
# var
# 1 <NA>

dplyr::setdiff(data.frame(var = c(NA, "a"), stringsAsFactors = T), data.frame(var = c("a"), stringsAsFactors = T))
# *** caught segfault ***
#   address 0x7fbd8b3ff058, cause 'memory not mapped'
# 
# Traceback:
# 1: .Call("dplyr_setdiff_data_frame", PACKAGE = "dplyr", x, y)
# 2: setdiff_data_frame(x, y)
# 3: setdiff.data.frame(data.frame(var = c(NA, "a"), stringsAsFactors = T),     data.frame(var = c("a"), stringsAsFactors = T))
# 4: dplyr::setdiff(data.frame(var = c(NA, "a"), stringsAsFactors = T),     data.frame(var = c("a"), stringsAsFactors = T))
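Since the character-column call above returns a result while the factor-column call segfaults, one possible stopgap on an affected build is to convert factor columns to character before calling setdiff. A minimal base-R sketch; `factors_to_character` is a hypothetical helper name used for illustration, not part of dplyr:

```r
# Hypothetical helper (not part of dplyr): convert every factor column
# of a data frame to character, leaving all other columns untouched.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

# Same inputs as the crashing call, with TRUE spelled out.
x <- data.frame(var = c(NA, "a"), stringsAsFactors = TRUE)
y <- data.frame(var = "a", stringsAsFactors = TRUE)

x_chr <- factors_to_character(x)
y_chr <- factors_to_character(y)

# Inspect the column classes after conversion.
sapply(x_chr, class)
sapply(y_chr, class)
```

The converted frames can then be passed to `dplyr::setdiff(x_chr, y_chr)`, which corresponds to the `stringsAsFactors = F` call that returned `<NA>`. The result holds a character column, so wrap it in `factor()` afterwards if a factor is needed.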

hadley removed the reprex label Mar 1, 2016

hadley (Member) commented Mar 1, 2016

Minimal reprex:

library(dplyr)

df1 <- data_frame(x = factor(c(NA, "a")))
df2 <- data_frame(x = factor("a"))

setdiff(df1, df2)

But it works for me in the dev version.

hadley (Member) commented Apr 19, 2016

That sha was spurious, but the problem does seem to have come back. @romainfrancois can you please take a look?

romainfrancois (Member) commented Apr 30, 2016

I went a little beyond the crash fix, as setdiff was coercing factors to character even when the levels were identical. Not anymore:

> df1 <- data_frame(x = factor(c(NA, "a")))
> df2 <- data_frame(x = factor("a"))
>
> setdiff(df1, df2)
Source: local data frame [1 x 1]

       x
  <fctr>
1     NA
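To check the same behaviour on a local install, the reprex can be turned into a small assertion script. This is only a sketch: it assumes a dplyr build that includes the fix, and it uses `data.frame()` rather than `data_frame()` so that it does not rely on a constructor that was later deprecated.

```r
library(dplyr)

# Same data as the reprex: both factor columns have the single level "a".
df1 <- data.frame(x = factor(c(NA, "a")))
df2 <- data.frame(x = factor("a"))

res <- setdiff(df1, df2)

# On a fixed build the call should not crash, should leave one row,
# and should keep the column as a factor holding NA.
stopifnot(
  nrow(res) == 1,
  is.factor(res$x),
  is.na(res$x[1])
)
```

On a build that still has the bug, the `setdiff()` call itself would be expected to crash the session before the assertions run, so it is safer to try this in a throwaway R process.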

sicarul added a commit to sicarul/dplyr that referenced this issue May 4, 2016
lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018
3 participants