Character encoding problem leading to errors in distinct(), write_tsv(), unique() ... ? #2971
I've been trying to reproduce this error reliably, but I'm having difficulties. Please bear with me. Code to reproduce appears below!
I have a file with a few columns, which gets read in via
The error I get is:
To make this reproducible, I created a gist from the file. Even so, the error sometimes 'magically' disappears, and sometimes I can reproduce it.
Here's the code that should reproduce it:
The reason I figured it might have something to do with the encoding is that
However, as mentioned above
I've tried the same code on another machine (OSX instead of Linux), and can reproduce the error if it's run from a fresh R session. I've managed to resolve the error (strangely, only sometimes) by splitting up
If there's anything I can do or try on my end, please let me know.
Relevant session info:
FWIW, after downgrading to
Have you looked into nested tibbles? Try this:
geneBed <- bed %>%
  group_by(interval.id) %>%
  mutate(
    min.start = min(start),
    max.end = max(end),
    dist.to.start = start - min.start,
    exon.len = end - start,
    cds.start = min.start,
    cds.end = max.end
  ) %>%
  nest(dist.to.start, exon.len)
Your original problem seems to be caused by the grouped mutate that assigns a string. Looks like a protection error to me. Simpler reprex:
library(dplyr)
set.seed(20170715L)
df <- data_frame(x = 1:10000) %>%
  group_by(x) %>%
  mutate(y = as.character(runif(1L)), z = as.character(runif(1L)))
df %>% distinct(x, .keep_all = TRUE)
#> Error in distinct_impl(dist$data, dist$vars, dist$keep): Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'list'
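If the grouped mutate really is the trigger, one possible way to sidestep it until a fix lands is to build the character columns without grouping. The sketch below is only an assumption about a workaround, not something confirmed in this thread: it replaces the per-group `runif(1L)` with a vectorised `runif(n())`, which has the same shape here only because every group in the reprex has exactly one row. Whether it avoids the error on an affected dplyr version is untested.

```r
library(dplyr)

set.seed(20170715L)

# Same shape as the reprex above, but the character columns are created
# with an ungrouped, vectorised mutate (assumed workaround, not a fix).
df <- tibble::tibble(x = 1:10000) %>%
  mutate(
    y = as.character(runif(n())),
    z = as.character(runif(n()))
  )

# x is already unique, so distinct() should keep every row.
res <- df %>% distinct(x, .keep_all = TRUE)
```

This does not apply when groups have more than one row and each group needs a single shared value; in that case the grouped computation would have to be done some other way, for example with a summarise followed by a join.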