Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: fct_match with validation #127

Merged
merged 11 commits into from Feb 16, 2019

Conversation

Projects
None yet
2 participants
@jonocarroll
Copy link
Contributor

jonocarroll commented Apr 20, 2018

Addresses #126.

Redesign of tidyverse/dplyr#3514, with improvements:

  • distinguishes between levels(f) and unique(f) using allow_missing.
  • full test coverage.
  • cleaner API.
@hadley

This comment has been minimized.

Copy link
Member

hadley commented Jan 4, 2019

Are you still interested in this? I think the basic approach is ok, but there are quite a few style issues..

@jonocarroll

This comment has been minimized.

Copy link
Contributor Author

jonocarroll commented Jan 5, 2019

Absolutely. I'm happy to put in some effort to bring it into line with acceptable style.

Show resolved Hide resolved DESCRIPTION Outdated
R/match.R Outdated
#' @md
#' @rdname fct_match
#' @param f A factor (or character vector).
#' @param inverse (logical) if `TRUE`, perform a "not match" operation.

This comment has been minimized.

@hadley

hadley Jan 7, 2019

Member

This argument and fct_exclude() do not seem compelling to me as they're neither more expression or more succinct than !fct_match()

This comment has been minimized.

@jonocarroll

jonocarroll Jan 13, 2019

Author Contributor

Removed.

This comment has been minimized.

@jonocarroll

jonocarroll Jan 13, 2019

Author Contributor

Counterpoint: using this within a pipe, e.g.

gss_cat$marital %>% 
  fct_exclude("No answer") %>% 
  table()

vs

gss_cat$marital %>% 
  fct_match("No answer") %>% 
  `!` %>% 
  table()
R/match.R Outdated
#' @param inverse (logical) if `TRUE`, perform a "not match" operation.
#' @param allow_missing (logical) if `TRUE`, allow matching to empty levels in
#' f
#' @param ... A series of characters for which the test "are these levels in

This comment has been minimized.

@hadley

hadley Jan 7, 2019

Member

Are you sure this should be ... and not a single vector? Using ... saves typing but it makes it feel a bit weird to me.

This comment has been minimized.

@jonocarroll

jonocarroll Jan 13, 2019

Author Contributor

Not having to create a vector felt more natural, but this is fine too.

Show resolved Hide resolved R/match.R Outdated
Show resolved Hide resolved R/match.R Outdated
Show resolved Hide resolved R/match.R Outdated
Show resolved Hide resolved R/match.R Outdated
R/match.R Outdated
#' @noRd
validate_level_inputs <- function(factor, check) {
levels_present <- check %in% factor
if (!all(levels_present)) {

This comment has been minimized.

@hadley

hadley Jan 7, 2019

Member

An early return would make this more readable:

if (all(levels_present))  {
  return(TRUE)
}

This comment has been minimized.

@jonocarroll

jonocarroll Jan 13, 2019

Author Contributor

Done. I was previously returning NULL (as the output is not used) but TRUE makes more sense.

R/match.R Outdated
stop(
paste0(
"Level(s) not present in factor: \"",
toString(check[which(!levels_present)]),

This comment has been minimized.

@hadley

hadley Jan 7, 2019

Member

Can you please use encodeString() here? (Since I actually know what it does because we use it elsewhere in the tidyverse)

This comment has been minimized.

@jonocarroll

jonocarroll Jan 13, 2019

Author Contributor

Done. I was not aware of that one, but I use toString to convert vectors to comma-separated strings. I've seen it in base code, I'm sure. encodeString seems to additionally do some escaping.

f <- factor(c("a", "b", "c d"), levels = c("a", "b", "c d", "e"))

expect_equal(fct_match(f, "a", "b"), c(TRUE, TRUE, FALSE))
expect_error(fct_match(f, "a", "x"), regexp = 'not present.*"x"')

This comment has been minimized.

@hadley

hadley Jan 7, 2019

Member

Please don't name the second argument here.

This comment has been minimized.

@jonocarroll

jonocarroll Jan 13, 2019

Author Contributor

Done. I assumed that would be helpful to distinguish it from the tested expression, but no worries if not.

jonocarroll added some commits Jan 13, 2019

rebase onto master
Merge remote-tracking branch 'upstream/master'

# Conflicts:
#	NEWS.md
@jonocarroll

This comment has been minimized.

Copy link
Contributor Author

jonocarroll commented Jan 13, 2019

Tests and checks all passing. I'm happy to make more changes if you have suggestions.

jonocarroll and others added some commits Jan 13, 2019

@hadley

This comment has been minimized.

Copy link
Member

hadley commented Feb 16, 2019

Thanks! I simplified the code and docs a bit, and also handled NAs to match %in%.

@hadley hadley merged commit f66a9ef into tidyverse:master Feb 16, 2019

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.