mutate with multiple list columns cause crashes, a selection of errors #1231

Closed
richierocks opened this issue Jun 23, 2015 · 12 comments

@richierocks richierocks commented Jun 23, 2015

Calling mutate_ with multiple list columns fails using dplyr 0.4.2.

To reproduce:

library(dplyr)

get_random_letters <- function(n)
{
  replicate(n, sample(letters, 100, replace = TRUE), simplify = FALSE)
}

n <- 1e3
d <- mutate_(
  data.frame(id = seq_len(n)),
  x = ~ get_random_letters(n),
  y = ~ get_random_letters(n),
  z = ~ get_random_letters(n) 
)

# OK to here; next line fails

d <- mutate_(
  d,
  x = ~ lapply(x, sort),
  y = ~ lapply(x, sort),
  z = ~ lapply(x, sort)
)

I get a variety of responses, seemingly at random. In decreasing order of frequency:

  • R crashes
  • Error: data_frames can only contain 1d atomic vectors and lists
  • Error in as.list.data.frame(X) :
    unimplemented type (29) in 'lazy_duplicate'
  • As above, but with 11, 12 or 28 as the unimplemented type number.
  • Error: Columns are not all same length
  • Error in structure(.Call(C_objectSize, x), class = "object_size") : unimplemented type (29) in 'object.size'

Tested under Windows and Linux. It worked fine when only one or two columns were mutated.
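
For comparison, here is a minimal base-R sketch of the transformation the failing call appears to be aiming for, with a sanity check on the result. It is an illustration under assumptions, not a confirmed workaround from this thread: `check_list_column()` is a hypothetical helper, and each column is sorted from its own contents, whereas the report above computes `y` and `z` from `x`.

```r
# Base-R sketch: build three list columns, sort every element, then verify.
get_random_letters <- function(n) {
  replicate(n, sample(letters, 100, replace = TRUE), simplify = FALSE)
}

# Hypothetical helper (not part of dplyr): TRUE if `col` is a list of sorted
# character vectors that all have the expected length.
check_list_column <- function(col, expected_length = 100) {
  is.list(col) &&
    all(vapply(col, is.character, logical(1))) &&
    all(lengths(col) == expected_length) &&
    !any(vapply(col, is.unsorted, logical(1)))
}

set.seed(1)
n <- 1e3
d <- data.frame(id = seq_len(n))
d$x <- get_random_letters(n)
d$y <- get_random_letters(n)
d$z <- get_random_letters(n)

# Sort every element of each list column, one column at a time.
d$x <- lapply(d$x, sort)
d$y <- lapply(d$y, sort)
d$z <- lapply(d$z, sort)

# One logical per list column; any FALSE would indicate a corrupted column.
ok <- vapply(d[c("x", "y", "z")], check_list_column, logical(1))
print(ok)
```

The same check could be applied to the output of `mutate_` on a dplyr version where the call completes, to confirm that the list columns came back intact.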

Possibly related to #1228

Member

@hadley hadley commented Jun 23, 2015

@romainfrancois can you take a look at this too please?

@john-sandall john-sandall commented Jun 23, 2015

I'm seeing a similar problem. I posted it on SO (http://stackoverflow.com/questions/31010713/combining-dplyrmutate-with-lubridateymd-hms-in-r-randomly-causes-segfault), but if it's definitely the same bug I'll close the SO post and reference this issue.

I'm seeing the same variety of crashes that @richierocks mentioned, in the same order of frequency. In my case, though, the issue occurs with the regular mutate() function (not the standard-evaluation mutate_()), and only when I call ymd_hms from lubridate within the mutation. Sample code to replicate is on SO or in this gist: https://gist.github.com/john-sandall/05c3abb24fc738ddc2ad

Update: Spurred by this thread, I uninstalled dplyr 0.4.2 and installed 0.4.1, and the issue is resolved. It definitely seems to be some kind of bug only present in 0.4.2.
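
A small sketch of how one might check for the affected release before deciding to downgrade. It assumes, as reported in this thread, that 0.4.2 is the only affected version; `is_affected()` is a hypothetical helper, and the install command is only printed as a suggestion, not run.

```r
# Assumption: 0.4.2 is the only affected release, as reported in this thread.
affected <- package_version("0.4.2")

# Hypothetical helper: TRUE if the given version string is the affected one.
is_affected <- function(version_string) {
  package_version(version_string) == affected
}

if (requireNamespace("dplyr", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("dplyr"))
  if (is_affected(installed)) {
    message("dplyr ", installed, " is installed; consider pinning 0.4.1, e.g.:")
    message('  devtools::install_version("dplyr", version = "0.4.1")')
  } else {
    message("dplyr ", installed, " is installed.")
  }
}
```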

@HarlanH HarlanH commented Jun 24, 2015

I'm also seeing hard crashes and weird behavior in 0.4.2, including factors getting converted to numeric. No simple test cases yet...

@amin04 amin04 commented Jun 24, 2015

I second that. Hard crashes & weird behaviour with columns. Had to switch back to 0.4.1.

@daltonhance daltonhance commented Jul 3, 2015

Chiming in here. I'm also getting hard crashes, and R Markdown fails to compile, when using mutate with a call to lubridate.

%>% mutate(Mark_Time_Value = ymd_hms(Mark_Time_Value),
           Release_Time_Value = ymd_hms(Release_Time_Value),
           Obs_Time_Value = ymd_hms(Obs_Time_Value)) %>%
  mutate(TravelTime = difftime(Obs_Time_Value, Release_Time_Value, units = "days"))

The error also occurred when I performed all these steps in a single mutate call (I split them to see if that would fix it). I thought the problem was lubridate before finding this thread, because this pipe and other pipes with mutate calls otherwise work just fine.

I'll try switching back to 0.4.1 to see if that fixes it.

EDIT: Yep, switching back fixed it.

If it helps, I did get this error message on one attempted run of R Markdown: unimplemented type (31) in 'duplicate'

Member

@hadley hadley commented Jul 4, 2015

No need to add additional examples, we're aware of the problem.

Member

@romainfrancois romainfrancois commented Jul 6, 2015

This should be fixed now. @richierocks can you try it please? This was a sneaky protection problem.
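
The thread does not elaborate on "protection problem"; presumably it refers to an R object in compiled code that was not protected from the garbage collector, which would fit the intermittent crashes and "unimplemented type" errors reported above. Under that assumption, base R's `gctorture()` is one way to make such bugs surface more reliably when testing a fix. The wrapper `with_gctorture()` below is a hypothetical helper, and the sorted list is only stand-in input.

```r
# Hypothetical helper: evaluate `expr` with GC torture switched on, then
# restore the previous setting. gctorture(TRUE) triggers a garbage collection
# on (nearly) every allocation, so it is very slow; keep the input small.
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)
  on.exit(gctorture(old), add = TRUE)
  force(expr)  # `expr` is a lazy promise, so it is evaluated here
}

set.seed(1)
small <- replicate(5, sample(letters, 10, replace = TRUE), simplify = FALSE)

# Stand-in workload; the failing mutate_ call from the report could be
# wrapped the same way on a much smaller data frame.
sorted <- with_gctorture(lapply(small, sort))
print(lengths(sorted))
```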

Member

@romainfrancois romainfrancois commented Jul 6, 2015

@john-sandall, @daltonhance @amin04 this should also fix the things you mention. Please test.

@daltonhance daltonhance commented Jul 6, 2015

@romainfrancois The development version seems to work for me. But I first checked the version available on CRAN as of this morning and ran into some of the same problems.

Author

@richierocks richierocks commented Jul 7, 2015

@romainfrancois The problem seems to be solved. My code that was broken is now working again, and I haven't been able to reproduce those errors at all.

Member

@romainfrancois romainfrancois commented Jul 7, 2015

Nice. Thanks for the heads up. That's a close then. 😎

@lgautier lgautier commented Aug 4, 2015

That's a pretty serious issue. Is there an ETA for a bugfix release (for example dplyr 0.4.3)?

@lock lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018