-
Notifications
You must be signed in to change notification settings - Fork 415
Sketch implementation of new separate functions #1304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Since the set of allowed characters is different to R
#Conflicts: # DESCRIPTION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly lots of small things. Overall looks really nice
I had a tough time knowing how to review the "progressively relax the matches" part of str_separate_wider_regex()
because I'm not super familiar with regex
Co-authored-by: Davis Vaughan <davis@rstudio.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the PR looks great. I like the naming of the functions and the arguments. I've added some comments throughout but they're mostly minor.
One thing that I think would be beneficial is a table like https://github.com/tidyverse/tidyups/blob/9fbe5d48ec9c8dbf3bcd5db5995d905fba9532a7/002-tidyr-stringr.Rmd#L108-L114. I wonder if that's more of a vignette-thing than a function docs thing, but I think it would be good to expose a version of that with the new functions somewhere in the tidyverse docs.
A minor / potentially nitpicky comment is about styling of function arguments in documentation. For example, we should use "code text" (backticks) for the names_repair options in the pack.R documentation (
Lines 91 to 94 in f6b9509
#' * "minimal": no name repair or checks, beyond basic existence, | |
#' * "unique": make sure names are unique and not empty, | |
#' * "check_unique": (the default), no name repair, but check they are unique, | |
#' * "universal": make the names unique and syntactic |
"minimal"
. I think it's also worthwhile to make a similar change to the extra
and fill
options for separate()
as well. (Note: I checked other similar arguments and .groups
for summarize()
doesn't use code text but method
for geom_smooth()
does. I think it's good to consistently distinguish character strings that are function arguments with styling.)
#' @return A data frame based on `data`. It has the same columns, but different | ||
#' rows. | ||
#' @examples | ||
#' df <- tibble(id = 1:4, x = c("x", "x y", "x y z", NA)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of these examples would be easier to follow if the df
s are defined with tribble()
instead of tibble()
, but I realize that takes space...
#' `separate_rows()` has been superseded in favour of [separate_longer_delim()] | ||
#' because it has a more consistent API with other separate functions. | ||
#' Superseded functions will not go away, but will only receive critical bug | ||
#' fixes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a general comment about this sort of statement, I think it makes more sense above the description, as the first thing in the documentation for the function. I often find myself regretting having spent mental energy deciphering the description of the function just to find out it's been superseded/deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting; I put them below in the belief that most people would want to know what a function does before knowing what replaces it. I can move them.
Co-authored-by: Mine Cetinkaya-Rundel <cetinkaya.mine@gmail.com>
#Conflicts: # NEWS.md
TODO:
extra
andfill
need prefixeslist2df()
be baked intounnest_wider()
or provided in stringr? — Doesn't really matter; leave in tidyr for this PR.separate_wider_fixed()
&separate_wider_regex()
names_repair
in*_wider
map_unpack()
+map_unchop()
?separate_longer()
that extracts group from multiple matchesseparate_wider_regex()
uses parens (i.e. check number of columns in match)str_separate_wider_fixed
should be able to create arbitrary number of columns tooseparate_wider_regex()
Think about how to incorporate intonames_pattern
etc inpivot_longer()
pivot_longer(names_pattern)
sep
instead ofdelim
separate_regex_wider
diagnose = TRUE
NA
values in problems