select(): negation of non-match gives no vars, instead of all vars #1176

Closed
gunapemmaraju opened this issue May 28, 2015 · 2 comments

@gunapemmaraju

In dplyr, I want to exclude columns whose names contain the word "junk", but there may not be any such column. In that case, select() should return all columns. Instead, it returns none. See the reproducible example below.

library(dplyr)

df <- data.frame(name = paste("name", 1:5), age = 1:5)
str(df)
#> 'data.frame': 5 obs. of  2 variables:
#>  $ name: Factor w/ 5 levels "name 1","name 2",..: 1 2 3 4 5
#>  $ age : int  1 2 3 4 5

# No column name contains "junk", so nothing should be dropped
df1 <- df %>% select(-contains("junk"))
str(df1)
#> 'data.frame': 5 obs. of  0 variables

More at

http://stackoverflow.com/questions/30496930/in-dplyr-select-with-a-drop-does-not-work

@blasern

blasern commented May 28, 2015

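This happens because contains() returns integer(0) when it finds no match, and -integer(0) is still integer(0), so select() receives an empty selection. An easy fix would be to return -seq_along(vars) in that case instead: negating it yields every column position, while using it un-negated still selects nothing.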

contains <- function(vars, match, ignore.case = TRUE) {
  # is.string() is assumed to be assertthat::is.string()
  stopifnot(is.string(match), nchar(match) > 0)
  if (ignore.case) {
    vars <- tolower(vars)
    match <- tolower(match)
  }
  res <- grep(match, vars, fixed = TRUE)
  # No match: return "exclude everything" so that negation selects all vars
  if (length(res) == 0) res <- -seq_along(vars)
  res
}
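
The root cause can be reproduced without dplyr. The following is a minimal base R sketch, not dplyr's actual code; the variable names (vars, res, neg, fallback, all_positions) are made up for illustration. It shows that negating an empty match is still empty, and that the fallback proposed above turns a negated non-match into "all positions".

```r
# Base R only; no dplyr needed to see the root cause.
vars <- c("name", "age")

# grep() returns integer(0) when nothing matches.
res <- grep("junk", vars, fixed = TRUE)

# Negating an empty integer vector leaves it empty, so "drop nothing"
# cannot be told apart from "select nothing".
neg <- -res

# With the proposed fallback, a non-match is encoded as "exclude everything",
# so negating it yields every position.
fallback <- if (length(res) == 0) -seq_along(vars) else res
all_positions <- -fallback
```

Here neg has length zero, while all_positions covers both columns.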

The same problem also occurs in starts_with, ends_with, and matches.

@drewgendreau

Just wanted to add that I tried doing this earlier today and was surprised to find that

df %>% select(-contains("this_string_is_definitely_not_in_any_of_the_column_names"))

returns an empty data.frame. The use case is that I'm running a program where I don't know all the column names ahead of time, but if there is one with a particular string I'd like to drop it.
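
For this use case, one possible workaround that avoids negating an index vector is logical indexing in base R. This is only a sketch: drop_cols_containing is a hypothetical helper, not part of dplyr, and the junk_col column is invented for the example. Unlike contains(), which ignores case by default, this version is case-sensitive.

```r
# Hypothetical helper, not part of dplyr: drop columns whose names contain
# a fixed string, keeping every column when nothing matches.
drop_cols_containing <- function(df, pattern) {
  # grepl() returns one TRUE/FALSE per column name, so a non-match
  # gives all FALSE and the negation keeps every column.
  keep <- !grepl(pattern, names(df), fixed = TRUE)
  df[, keep, drop = FALSE]
}

df <- data.frame(name = paste("name", 1:5), age = 1:5, junk_col = 0)

with_match <- names(drop_cols_containing(df, "junk"))
without_match <- names(drop_cols_containing(df, "no_such_string"))
```

Because a logical vector always has one entry per column, the "no match" case needs no special handling.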

hadley added the "feature" label (a feature request or enhancement) Aug 24, 2015
hadley added this to the 0.5 milestone Aug 24, 2015
hadley changed the title from "in dplyr select with a drop does not work, when no matches" to "select(): negation of non-match gives no vars, instead of all vars" Oct 22, 2015
hadley closed this as completed in 4ab2f0e Mar 8, 2016
lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018