-
Notifications
You must be signed in to change notification settings - Fork 20
-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataTable.select warns if no columns match #322
Comments
@dsherry Thinking of this generally and ignoring what pandas does for the time being, why would you not want this warning as a user? I was originally thinking that it might be nice to explicitly tell the user that they tried to select something that wasn't in the data, rather than just silently giving them nothing back, but maybe it isn't useful and goes against the behavior most users would expect to see. |
@thehomebrewnerd that's a good question, and yes I agree we shouldn't simply mirror pandas behavior for the sake of mirroring pandas behavior, lol. Consider the use-case in evalml which prompted me to file this: we want to select all natural_language features and pass those down the stack to the pipeline creation code etc. If there are no text features, that's fine for us, we just won't add a text featurizer component to the pipeline. So the warning doesn't add value in our current use-case; in fact it adds confusion when users run automl search, because they don't have the right context to understand the warning. That use-case aside, my opinion is that our APIs should allow people to get into "bad" situations like having an empty set of columns selected. I'm sure there are some situations where a warning here would be helpful, but we don't know which type of situation the user is in, and I'd rather trust the user to handle this case. What do you think? |
@dsherry Thanks for your thoughts. As I think about this more, I'm convincing myself that your approach of allowing people to get into "bad" situations is a reasonable approach. We can't anticipate all the "bad" situations anyway, and trying to handle them makes the code base bigger and harder to maintain. It's probably best to not implement these checks unless we really are certain they are needed. With that, I'm in favor of removing this particular warning and returning no columns in this case. |
The following code
currently emits this warning on the first run:
I would expect this to emit no warning and to return an DataTable containing 0 columns. That's pandas' behavior:
emits a dataframe with the same index as the original (569 rows), but with 0 columns.
Encountered in https://github.com/alteryx/evalml/pull/1062/files#r512896715
The text was updated successfully, but these errors were encountered: