Consider adding a drop_na verb #194

Closed
jankatins opened this issue May 27, 2016 · 1 comment

@jankatins jankatins commented May 27, 2016

[forwarded from https://github.com/tidyverse/dplyr/issues/1797 as per @hadley: "This feels more like a tidyr verb to me."]

pandas has pandas.DataFrame.dropna(), which drops rows that contain NAs and also lets you restrict the NA check to a subset of columns (via the subset argument).

A similar dplyr verb would be:

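drop_na <- function(data, ...) {
    if (missing(...)) {
        # No columns given: check every column for NAs
        f <- complete.cases(data)
    } else {
        # Check only the selected columns (select helpers such as starts_with() work)
        f <- complete.cases(select_(data, .dots = lazyeval::lazy_dots(...)))
    }
    # Keep only the rows without NAs in the checked columns
    filter(data, f)
}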

Examples:

> df <- data.frame(a=c(1,2,3,4,NA), b=c(NA,1,2,3,4), ac=c(1,2,NA,3,4))
> df %>% drop_na(a,b)
  a b ac
1 2 1  2
2 3 2 NA
3 4 3  3
> df %>% drop_na(starts_with("a"))
  a  b ac
1 1 NA  1
2 2  1  2
3 4  3  3
> df %>% drop_na()
  a b ac
1 2 1  2
2 4 3  3
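
For comparison, the same row filter can be sketched in base R with complete.cases() alone. This is only an illustration under stated assumptions: the helper name drop_na_base is hypothetical (it is not part of dplyr or tidyr), and columns are passed as a character vector rather than through select helpers.

```r
# Hypothetical helper (not part of dplyr or tidyr): drop rows that have an NA
# in the given columns, or in any column when `cols` is NULL.
drop_na_base <- function(data, cols = NULL) {
  if (is.null(cols)) {
    cols <- names(data)
  }
  # complete.cases() is TRUE for rows with no NA in the selected columns
  keep <- complete.cases(data[, cols, drop = FALSE])
  data[keep, , drop = FALSE]
}

df <- data.frame(a = c(1, 2, 3, 4, NA), b = c(NA, 1, 2, 3, 4), ac = c(1, 2, NA, 3, 4))

drop_na_base(df, c("a", "b"))                          # check only a and b
drop_na_base(df, grep("^a", names(df), value = TRUE))  # columns starting with "a"
drop_na_base(df)                                       # check every column
```

The three calls keep the same rows as the examples above. Base subsetting preserves the original row names, so the printed row labels are the original row numbers rather than 1, 2, 3.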

I can clean the above up and submit a PR if such a verb would be considered.

@hadley hadley (Member) commented Jun 8, 2016

Yeah, I'd be happy to review a PR.
