cume_dist() and percent_rank() react strangely to NA value inside mutate() #4427

Closed
clanker opened this issue Jun 17, 2019 · 1 comment

clanker commented Jun 17, 2019

Values computed by cume_dist() and percent_rank() inside mutate() do not ignore NA values

I expect the values shown in the correct column when these two functions are applied to the mutate-generated gain column, just as the result is correct when computed directly on b - a. The ecdf and correct columns should not differ. I include pct_rank to show that percent_rank() is affected as well.

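library(tidyverse)
set.seed(0)

# Two columns of uniform draws; negative values are replaced with NA
df <- tibble(a = runif(1000, -1, 1), b = runif(1000, -1, 1))
df[df < 0] <- NA

# gain is NA wherever a or b is NA
df <- df %>%
  mutate(gain = b - a)

# ecdf and pct_rank use the precomputed column; correct uses the expression
df <- df %>%
  mutate(ecdf = cume_dist(gain),
         correct = cume_dist(b - a),
         pct_rank = percent_rank(gain))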
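
As a point of comparison, here is a minimal base-R sketch of the NA-ignoring behaviour expected above. The helper names ref_cume_dist and ref_percent_rank are hypothetical, and the formulas are an assumption about the intended definitions (rank among non-missing values, divided by the number of non-missing values), not a statement of how dplyr implements them.

```r
# Hypothetical reference helpers (not part of dplyr): rank only the
# non-missing values and scale by the count of non-missing values,
# so that NA inputs stay NA and do not affect the other results.
ref_cume_dist <- function(x) {
  rank(x, ties.method = "max", na.last = "keep") / sum(!is.na(x))
}

ref_percent_rank <- function(x) {
  (rank(x, ties.method = "min", na.last = "keep") - 1) / (sum(!is.na(x)) - 1)
}

# Small vector with one missing value
x <- c(0.5, NA, -0.2, 1.0)
ref_cume_dist(x)
ref_percent_rank(x)
```

Comparing these against the columns from the example, for instance with all.equal(df$ecdf, ref_cume_dist(df$gain)), may help show where the mutate() results diverge.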

lock bot commented Dec 15, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

lock bot locked and limited conversation to collaborators Dec 15, 2019