# ntile behaviour #2564

Closed
opened this issue Mar 23, 2017 · 1 comment

### MikeBadescu commented Mar 23, 2017

The result of `ntile` depends on the proportion of `NA`s in its argument `x`. This is because we divide by the length of `x`, regardless of how many missing values are in `row_number(x)`. Ideally, `NA`s should be ignored altogether.

```r
# this is the dplyr function
# ntile <- function (x, n) {
#   floor((n * (row_number(x) - 1)/length(x)) + 1)
# }

# proposed changes (illustrative)
ntile2 <- function (x, n) {
  len <- sum(!is.na(x))
  if (len == 0) return(x)
  floor((n * (dplyr::row_number(x) - 1)/len) + 1)
}

x00 <- numeric()
x04 <- rep(NA_real_, 4)
x40 <- c(1,2,3,4)
x44 <- c(1,2,3,4, rep(NA_real_, 4))
x48 <- c(1,2,3,4, rep(NA_real_, 8))

dplyr::ntile(x00, 0)
#> numeric(0)
dplyr::ntile(x04, 4)
#> [1] NA NA NA NA
dplyr::ntile(x40, 4)
#> [1] 1 2 3 4
dplyr::ntile(x44, 4)
#> [1]  1  1  2  2 NA NA NA NA
dplyr::ntile(x48, 4)
#> [1]  1  1  1  2 NA NA NA NA NA NA NA NA

ntile2(x00, 4)
#> numeric(0)
ntile2(x04, 4)
#> [1] NA NA NA NA
ntile2(x40, 4)
#> [1] 1 2 3 4
ntile2(x44, 4)
#> [1]  1  2  3  4 NA NA NA NA
ntile2(x48, 4)
#> [1]  1  2  3  4 NA NA NA NA NA NA NA NA
```
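
To make the arithmetic behind the `x44` case explicit, here is a minimal base-R sketch that evaluates the formula with both denominators side by side. The helper `row_number_base` is a hypothetical stand-in for `dplyr::row_number()`, assumed to behave the same way for numeric input (ties broken by position, `NA`s kept as `NA`); it is not part of dplyr.

```r
# Hypothetical base-R stand-in for dplyr::row_number() (assumption:
# equivalent for numeric vectors; NA inputs stay NA).
row_number_base <- function(x) rank(x, ties.method = "first", na.last = "keep")

x44 <- c(1, 2, 3, 4, rep(NA_real_, 4))
n <- 4

r <- row_number_base(x44)  # ranks of the non-missing values, NA elsewhere

# Denominator counts every element, including the NAs
by_length <- floor(n * (r - 1) / length(x44) + 1)

# Denominator counts only the non-missing elements
by_non_na <- floor(n * (r - 1) / sum(!is.na(x44)) + 1)

print(rbind(rank = r, by_length = by_length, by_non_na = by_non_na))
```

With four values and four `NA`s, dividing by `length(x44)` (8) halves each scaled rank, so the four values land in only two buckets; dividing by the non-missing count (4) spreads them over all four.
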
closed this in `611ae3e` Mar 23, 2017

### MikeBadescu commented Mar 23, 2017

 Thank you!