Add nest_string()/unnest_string() functions #69
Hmmmm, that's only one type of nesting - the other is when you have something like: dplyr::data_frame(x = c(1, 2), y = list(1:3, 9:10)) So maybe a more specific name like |
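To make the two kinds of nesting concrete, here is a small base-R sketch. The variable names are illustrative assumptions, not taken from the PR: one data frame packs several values into a delimited string per cell, the other holds them in a list column, and both flatten to the same values.

```r
# String nesting: several values packed into one delimited string per row.
string_nested <- data.frame(
  x = c(1, 2),
  y = c("1,2,3", "9,10"),
  stringsAsFactors = FALSE
)

# List nesting: each cell of y holds a whole vector.
list_nested <- data.frame(x = c(1, 2))
list_nested$y <- list(1:3, 9:10)

# Flattening either form yields the same underlying values.
from_strings <- as.integer(unlist(strsplit(string_nested$y, ",", fixed = TRUE)))
from_lists <- unlist(list_nested$y)
```

The string form needs a splitting step (and a type conversion) before it can be flattened, which is the part the proposed function is meant to take over.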
Ah, good call. But which type of nesting do you think is more common? If it's string nesting then maybe |
Thinking about it for the past 15 seconds, Are you envisioning |
To me, |
#'   stringsAsFactors = FALSE
#' )
#' unnest_string(df, y)
unnest_string <- function(data, col = NULL, sep = "[^[:alnum:]]+", ...) {
Doesn't this always need a col?
Could you PTAL at the build failure?
Thanks for the comments. I see your point about Any interest in extending |
@@ -17,6 +17,8 @@ Imports:
dplyr (>= 0.2),
stringi,
lazyeval
Suggets:
Was this deliberate?
I meant to add data.table to the suggested packages to address this build failure, but listing it under a redundant/misspelled key was just a silly mistake (insert embarrassed emoji here).
Ah I just discovered that too and fixed it (in a slightly different way). Could you please merge/rebase?
Would you mind including a couple of unit tests too please?
Sure, I'll rebase and add a few tests.
Force-pushed from 3c3e044 to 5938ed1.
I wonder if this should be |
Yeah, I think that makes sense. Should I update the PR?
Yes, that would be great! I think maybe it should have It would also be useful give this function, |
From a readability perspective I definitely prefer However, I think an argument can be made that this function is conceptually more similar to |
It's half-way between both, but I think people are more likely to look for it after discovering that |
* Preserves grouping
* Avoid modifying grouped variable
* Convert works as expected
#' @export
separate_rows_.grouped_df <- function(data, cols, sep = "[^[:alnum:].]+",
                                      convert = FALSE) {
Should separate_rows() prevent modification of grouped variables, @hadley?
It shouldn't - you can copy the approach from separate_:
#' @export
separate_.grouped_df <- function(data, col, into, sep = "[^[:alnum:]]+",
remove = TRUE, convert = FALSE,
extra = "warn", fill = "warn", ...) {
regroup(NextMethod(), data, if (remove) col)
}
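From the caller's side, the behaviour under discussion looks roughly like the sketch below. It assumes installed dplyr and tidyr releases that export `separate_rows()` and keep grouping intact. The data frame and column names are made up for illustration, and the internal `regroup()` helper is deliberately not called.

```r
library(dplyr)
library(tidyr)

# Illustrative data: g is the grouping variable, y holds delimited strings.
df <- tibble(g = c("a", "b"), y = c("1,2", "3"))

out <- df %>%
  group_by(g) %>%
  separate_rows(y, sep = ",", convert = TRUE)

# The result should still be grouped by g, with one row per split value.
group_vars(out)
```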
Cool, this should be ready for review.
Thanks!
This PR slightly modifies unnest() so the transform step from the example isn't necessary and can just be written as:

I'm often doing these kinds of unnesting operations and just wanted to save myself some typing by letting unnest() handle the string splitting.

This also adds a nest() function so it's possible to round trip:
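A rough base-R illustration of such a round trip follows. The helpers `split_rows()` and `collapse_rows()` are hypothetical names invented for this sketch, and it is not the PR's implementation.

```r
# Hypothetical helper: split a delimited string column into one row per value.
split_rows <- function(data, col, sep = "[^[:alnum:].]+") {
  pieces <- strsplit(as.character(data[[col]]), sep)
  # Repeat each row once per piece it contributes.
  out <- data[rep(seq_len(nrow(data)), lengths(pieces)), , drop = FALSE]
  out[[col]] <- unlist(pieces)
  rownames(out) <- NULL
  out
}

# Hypothetical helper: collapse the values back into one string per key.
collapse_rows <- function(data, col, key, sep = ",") {
  collapsed <- tapply(data[[col]], data[[key]], paste, collapse = sep)
  out <- data.frame(names(collapsed), as.vector(collapsed),
                    stringsAsFactors = FALSE)
  names(out) <- c(key, col)
  out
}

df <- data.frame(
  x = c("a", "b", "c"),
  y = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)

long <- split_rows(df, "y")           # one row per value of y
back <- collapse_rows(long, "y", "x") # one row per key again
```

Note that `tapply()` returns groups in sorted key order, so the round trip reproduces the original row order here only because the keys in `x` are already sorted.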