cume_dist() and percent_rank() react strangely to NA value inside mutate() #4427

@clanker

Values computed by cume_dist() and percent_rank() inside mutate() do not ignore NA values.

When these two functions are applied to gain, a column generated by mutate(), I expect the values shown in the column correct, which is what cume_dist() returns when computed directly on b - a. I don't think the ecdf and correct columns should differ. I include pct_rank to show that percent_rank() is also affected.

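library(tidyverse)
set.seed(0)
df <- tibble(a = runif(1000, -1, 1), b = runif(1000, -1, 1))
# Replace negative values with NA so that gain contains missing values
df[df < 0] <- NA
df <- df %>%
  mutate(gain = b - a)
# ecdf and correct should be identical; pct_rank uses the same gain column
df <- df %>%
  mutate(ecdf = cume_dist(gain),
         correct = cume_dist(b - a),
         pct_rank = percent_rank(gain))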
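
For comparison, the NA-ignoring values expected here can be sketched in base R on a small vector. The helper names ref_cume_dist and ref_percent_rank are hypothetical, and the formulas are an assumption about the intended definitions (rank over the non-missing values, with NA kept as NA), not a statement about dplyr's internals.

```r
# Hypothetical reference helpers (not part of dplyr) that skip NA values.
ref_cume_dist <- function(x) {
  # Share of non-missing values less than or equal to each value.
  rank(x, ties.method = "max", na.last = "keep") / sum(!is.na(x))
}

ref_percent_rank <- function(x) {
  # Minimum rank rescaled to [0, 1] over the non-missing values.
  (rank(x, ties.method = "min", na.last = "keep") - 1) / (sum(!is.na(x)) - 1)
}

x <- c(0.5, NA, -0.2, 1.3)
ref_cume_dist(x)     # 0.6666667 NA 0.3333333 1.0000000
ref_percent_rank(x)  # 0.5 NA 0.0 1.0
```

Comparing the ecdf and pct_rank columns against these helpers applied to df$gain would show which rows are affected by the missing values.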

Labels: bug (an unexpected problem or unintended behavior)
