I ran into a problem with separate yesterday. While tidying, I gathered a series of columns into a single column that looks like this:
df <- data.frame(yrqtr = rep("X1996.04", times = 1000000))
I wanted to separate this column on "." but the following call caused RStudio to hang and then crash:
df <- separate(df, col = yrqtr, into = c("year", "quarter"), sep = ".")
After some investigation I realized my mistake: in a regular expression, an unescaped "." matches any single character. So I was breaking separate by handing it the worst possible regex, one that treats every character in the string as a separator. Changing that line to sep = "\\." eliminates the problem.
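The difference between the two patterns can be seen on a single string. The following is only a small sketch using base R's strsplit rather than separate itself, on the assumption that it interprets these patterns the same way; the variable names are made up for illustration.

```r
x <- "X1996.04"

# An unescaped "." is a regex wildcard: every character counts as a
# separator, so only empty pieces are left over.
wild <- strsplit(x, ".")[[1]]

# Escaping the dot, or wrapping it in a character class, matches a
# literal period instead.
escaped <- strsplit(x, "\\.")[[1]]
bracket <- strsplit(x, "[.]")[[1]]

# fixed = TRUE (a strsplit option) switches off regex interpretation.
literal <- strsplit(x, ".", fixed = TRUE)[[1]]

print(wild)
print(escaped)
```

Either "\\." or "[.]" should be usable as the sep value; the character class form avoids the double backslash.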
I imagine other people will run into this too by following their intuition for the separator argument. Had the column been "X1996@04", sep = "@" would have worked, which is exactly how I initially expected sep = "." to behave.
Is sep = "." a pain point worth throwing a warning about?
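To make the question concrete, here is a rough sketch of the kind of check I have in mind. check_sep is a hypothetical helper written for this issue, not an existing tidyr function, and the warning text is only a suggestion.

```r
# Hypothetical helper (not part of tidyr): warn when the separator is a
# lone, unescaped regex dot.
check_sep <- function(sep) {
  if (is.character(sep) && identical(sep, ".")) {
    warning(
      'sep = "." is a regular expression that matches any character; ',
      'use "\\\\." or "[.]" to split on a literal period.',
      call. = FALSE
    )
  }
  invisible(sep)
}

# Passes silently for an ordinary separator.
check_sep("@")
```

A check like this would only catch the exact string "."; other patterns that happen to contain a wildcard would still go through unchanged.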
Interestingly, combining sep = "." with extra = "merge" also "solves" the crash, but only in a way that hides the mistake in my regex string.