select(): negation of non-match gives no vars, instead of all vars #1176

Closed
gunapemmaraju opened this issue May 28, 2015 · 2 comments

@gunapemmaraju gunapemmaraju commented May 28, 2015

In dplyr, I want to exclude columns whose names contain the word "junk", but there may not be any such column. In that case, select() should return all columns. Instead, it returns none. See the test case below.

library(dplyr)

df <- data.frame(name = paste("name", 1:5), age = 1:5)
str(df)
#> 'data.frame': 5 obs. of  2 variables:
#>  $ name: Factor w/ 5 levels "name 1","name 2",..: 1 2 3 4 5
#>  $ age : int  1 2 3 4 5

df1 <- df %>% select(-contains("junk"))
str(df1)
#> 'data.frame': 5 obs. of  0 variables

More at

http://stackoverflow.com/questions/30496930/in-dplyr-select-with-a-drop-does-not-work

@blasern blasern (Contributor) commented May 28, 2015

This is because contains() returns integer(0) when it finds no match, and -integer(0) is still integer(0), so the negated selection is empty. An easy fix would be to return -seq_along(vars) in that case:

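contains <- function(vars, match, ignore.case = TRUE) {
  # is.string() comes from the assertthat package
  stopifnot(is.string(match), nchar(match) > 0)
  if (ignore.case) {
    vars <- tolower(vars)
    match <- tolower(match)
  }
  res <- grep(match, vars, fixed = TRUE)
  # No match: return every position negated, so that select(-contains(...))
  # negates it back to "all columns"
  if (length(res) == 0) res <- -seq_along(vars)
  res
}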
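
To make the mechanism concrete, here is a minimal base R sketch (no dplyr needed) of why negating an empty match drops everything. The names `vars`, `matched`, `neg` and `all_neg` are only illustrative, not dplyr internals.

```r
vars <- c("name", "age")

# grep() returns integer(0) when nothing matches
matched <- grep("junk", vars, fixed = TRUE)
length(matched)             # 0

# Negating an empty integer vector still gives an empty integer vector
neg <- -matched
identical(neg, integer(0))  # TRUE

# An empty index selects nothing, so "drop no columns" turns into "keep no columns"
vars[neg]                   # character(0)

# Returning all positions negated means a second negation keeps everything
all_neg <- -seq_along(vars)
vars[-all_neg]              # "name" "age"
```

This only illustrates the indexing behaviour; how select() combines the helper results internally may differ.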

The same problem also occurs in starts_with, ends_with and matches, not just contains.


@drewgendreau drewgendreau commented Jun 17, 2015

Just wanted to add that I tried doing this earlier today and was surprised to find that

df %>% select(-contains("this_string_is_definitely_not_in_any_of_the_column_names"))

returns an empty data.frame. The use case is that I'm running a program where I don't know all the column names ahead of time, but if there is one with a particular string I'd like to drop it.
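
For this use case, one possible workaround is to avoid negated integer positions entirely and index with a logical mask in base R. This is only a sketch, not the fix adopted in dplyr. The column name `junk_col` is made up for illustration, and unlike contains(), grepl() with fixed = TRUE is case-sensitive.

```r
df <- data.frame(name = paste("name", 1:5), age = 1:5)

# Logical mask: TRUE for columns whose name does NOT contain the pattern
keep <- !grepl("junk", names(df), fixed = TRUE)

# With no match the mask is all TRUE, so every column is kept
df1 <- df[, keep, drop = FALSE]
names(df1)  # "name" "age"

# With a matching column present, only that column is dropped
df$junk_col <- 0
df2 <- df[, !grepl("junk", names(df), fixed = TRUE), drop = FALSE]
names(df2)  # "name" "age"
```

A logical mask has no "empty means nothing" edge case, because the no-match case is all TRUE rather than a zero-length index.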

@hadley hadley added the feature label Aug 24, 2015
@hadley hadley added this to the 0.5 milestone Aug 24, 2015
@hadley hadley changed the title from "in dplyr select with a drop does not work, when no matches" to "select(): negation of non-match gives no vars, instead of all vars" Oct 22, 2015
@hadley hadley closed this in 4ab2f0e Mar 8, 2016
@lock lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018