Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add weighting to fct_lump #70

Merged
merged 1 commit into from Feb 10, 2018
Merged

Add weighting to fct_lump #70

merged 1 commit into from Feb 10, 2018

Conversation

@wilkox
Copy link
Contributor

@wilkox wilkox commented Dec 29, 2016

Adds a 'weights' argument to fct_lump that allows frequencies to be weighted. Issue #68.

Copy link
Member

@hadley hadley left a comment

Thanks for this!

Overall the strategy looks good, but there are some code style fixes needed.

R/lump.R Outdated
ties.method = c("min", "average", "first", "last", "random", "max")) {
f <- check_factor(f)
ties.method <- match.arg(ties.method)

if (! missing(weights)) {
Copy link
Member

@hadley hadley Dec 30, 2016

Please remove the spaces after !, and use != not ! x == y

R/lump.R Outdated
if (! is.numeric(weights)) {
stop("`weights` must be a numeric vector")
} else if (! length(f) == length(weights)) {
stop("different lengths of `f` and `weights`")
Copy link
Member

@hadley hadley Dec 30, 2016

Error messages should start with a capital letter and always use call. = FALSE

R/lump.R Outdated
if (missing(weights)) {
count <- table(f)
} else {
count <- tapply(weights, f, FUN=sum)
Copy link
Member

@hadley hadley Dec 30, 2016

Need spaces around =

@@ -2,44 +2,76 @@ context("fct_lump")

test_that("positive values keeps most commmon", {
f <- c("a", "a", "a", "b", "b", "c", "d", "e", "f", "g")
w <- c( 1, 1, 1, 1, 1, 4, 2, 1, 1, 1) # Sum = 14
Copy link
Member

@hadley hadley Dec 30, 2016

I think the tests would be clearer if you tested weighting in its own block. I think you only need a couple of tests to verify that weighted and the equivalent unweighted return the same values. The code is quite simple so it doesn't need extensive tests.

You might want to add one other test that checks that bad weights generate error messages.

@hadley
Copy link
Member

@hadley hadley commented Dec 30, 2016

Also in one of the commit message please include Fixes #68 that way the issue will be automatically closed when the pull is merged.

@wilkox
Copy link
Contributor Author

@wilkox wilkox commented Dec 30, 2016

Tidied up the style issues, simplified the tests and added tests for errors, and rebased.

@wilkox
Copy link
Contributor Author

@wilkox wilkox commented Jul 12, 2017

Thanks @instantkaffee – fixed the error and rebased.

@andwilson
Copy link

@andwilson andwilson commented Nov 20, 2017

Is this going to merge into the master? This is extremely useful functionality.

@tiernanmartin
Copy link

@tiernanmartin tiernanmartin commented Jan 6, 2018

I'd also like to see this feature merged into the master - thanks!

R/lump.R Outdated
#' # Use ties.method to control how tied factors are collapsed
#' fct_lump(x, n = 6)
#' fct_lump(x, n = 6, ties.method = "max")
#'
fct_lump <- function(f, n, prop, other_level = "Other",
fct_lump <- function(f, n, prop, weights, other_level = "Other",
Copy link
Member

@hadley hadley Feb 10, 2018

I realise now that weights should have a default value since it's option. @wilkox are you still interested in this PR? If not, I'm happy to make the change, but since you've put so much time into it, I thought you might like to carry it home.

Copy link
Contributor Author

@wilkox wilkox Feb 10, 2018

Still interested in seeing this merged 😃 Added a default weights = NULL, which seems less confusing than weights = 1.

f2 <- c("a", rep("b", 4), rep("c", 6), rep("d", 4), rep("e", 2),
rep("f", 2), "g")

expect_equal(levels(fct_lump(f, weights = w)),
Copy link
Member

@hadley hadley Feb 10, 2018

Can you please use the style described at http://style.tidyverse.org/syntax.html#long-lines ?

@hadley
Copy link
Member

@hadley hadley commented Feb 10, 2018

One last change before I’ll merge - can you please add a bullet to news following http://style.tidyverse.org/news.html#during-development

* Add tests for weighted fct_lump
* Update fct_lump documentation
@wilkox
Copy link
Contributor Author

@wilkox wilkox commented Feb 10, 2018

Added the bullet to NEWS.md, rebased.

@hadley hadley merged commit 81f01c0 into tidyverse:master Feb 10, 2018
3 checks passed
@hadley
Copy link
Member

@hadley hadley commented Feb 10, 2018

Thanks for bearing with me through this long process!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants