Consider adding a drop_na verb #194

Closed
jankatins opened this Issue May 27, 2016 · 1 comment

jankatins commented May 27, 2016

[forwarded from https://github.com/tidyverse/dplyr/issues/1797 as per @hadley: "This feels more like a tidyr verb to me."]

pandas has pandas.DataFrame.dropna(), which drops rows that contain NAs and also lets you specify a subset of columns (the subset argument) to check for NAs.

A similar dplyr verb would be:

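drop_na <- function(data, ...) {
    if (missing(...)) {
        # No columns given: check every column for NAs
        f <- complete.cases(data)
    } else {
        # Columns given: check only the selected columns for NAs
        f <- complete.cases(select_(data, .dots = lazyeval::lazy_dots(...)))
    }
    # Keep only the rows without NAs in the checked columns
    filter(data, f)
}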

Examples:

> df <- data.frame(a=c(1,2,3,4,NA), b=c(NA,1,2,3,4), ac=c(1,2,NA,3,4))
> df %>% drop_na(a,b)
  a b ac
1 2 1  2
2 3 2 NA
3 4 3  3
> df %>% drop_na(starts_with("a"))
  a  b ac
1 1 NA  1
2 2  1  2
3 4  3  3
> df %>% drop_na()
  a b ac
1 2 1  2
2 4 3  3

I can clean the above up and submit a PR if such a verb would be considered.
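
For comparison, here is a minimal base-R sketch of the same row filter, without dplyr or lazyeval. The helper name drop_na_base and its cols argument are hypothetical, made up for illustration; they are not part of dplyr or tidyr. Columns are passed as a character vector here, so select helpers such as starts_with() are not supported.

```r
# Hypothetical helper (not part of dplyr or tidyr): keep only the rows that
# have no NA in the columns named in `cols`; all columns are checked by default.
drop_na_base <- function(data, cols = names(data)) {
  # complete.cases() returns TRUE for rows without any NA in the given columns
  keep <- complete.cases(data[, cols, drop = FALSE])
  data[keep, , drop = FALSE]
}

df <- data.frame(a = c(1, 2, 3, 4, NA), b = c(NA, 1, 2, 3, 4), ac = c(1, 2, NA, 3, 4))
res_ab  <- drop_na_base(df, c("a", "b"))  # checks only columns a and b
res_all <- drop_na_base(df)               # checks every column
```

Note that base subsetting keeps the original row names, whereas the filter() output in the examples above is renumbered from 1.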

hadley commented Jun 8, 2016

Yeah, I'd be happy to review a PR.
