[MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform on unseen labels #9816

Merged
merged 2 commits into from Sep 21, 2017

@ghost ghost commented Sep 21, 2017

Reference Issue

Fix #9812

Member

@amueller amueller left a comment

looks good to me.

@@ -148,8 +149,9 @@ def inverse_transform(self, y):
         check_is_fitted(self, 'classes_')
 
         diff = np.setdiff1d(y, np.arange(len(self.classes_)))
-        if diff:
-            raise ValueError("y contains new labels: %s" % str(diff))
+        if len(diff):

amueller Sep 21, 2017
Member

This len is the fix, right?

ghost Sep 21, 2017
Author

Correct - that's all that was needed to produce a sensible error 😄
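To illustrate why the `len` call matters, here is a minimal sketch outside scikit-learn, assuming NumPy is installed. `check_labels_buggy` and `check_labels_fixed` are hypothetical helpers written for this illustration, not library functions, and the error text in the fixed helper is modelled on the message asserted in this PR's test.

```python
import numpy as np


def check_labels_buggy(y, n_classes):
    """Hypothetical helper mirroring the old check (`if diff:`)."""
    diff = np.setdiff1d(y, np.arange(n_classes))
    # Truth-testing an array with more than one element raises NumPy's
    # own "truth value ... is ambiguous" ValueError before we get to ours.
    if diff:
        raise ValueError("y contains new labels: %s" % str(diff))


def check_labels_fixed(y, n_classes):
    """Hypothetical helper mirroring the new check (`if len(diff):`)."""
    diff = np.setdiff1d(y, np.arange(n_classes))
    # len() is well defined for any 1-D array, including an empty one.
    if len(diff):
        raise ValueError(
            "y contains previously unseen labels: %s" % str(diff))


for check in (check_labels_buggy, check_labels_fixed):
    try:
        check([5, 6], 3)  # two labels outside the valid range 0..2
    except ValueError as exc:
        print(check.__name__, "->", exc)
```

With a single unseen label the old `if diff:` still produced the intended error, since a one-element array has a well-defined truth value; the confusing message appears once two or more unseen labels are passed.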

assert_raises(ValueError, le.inverse_transform, [-1])
le.fit([1, 2, 3, -1, 1])
msg = "contains previously unseen labels"
assert_raise_message(ValueError, msg, le.inverse_transform, [-2])

amueller Sep 21, 2017
Member

I find the organization of the tests a bit weird, but that's not your fault. The test that checks it actually works when the labels are present is way at the top of the file.

ghost Sep 21, 2017
Author

Happy to reorganise tomorrow if you are able to give me some pointers - I'm not very familiar with the testing structure of sklearn as this is my first issue.

amueller Sep 21, 2017
Member

it's fine, I think.
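For readers who want to try the behaviour the new test asserts without a scikit-learn checkout, here is a rough sketch assuming only NumPy. `TinyLabelEncoder` and `assert_raises_with_message` are hypothetical stand-ins written for this illustration; they are not scikit-learn's `LabelEncoder` or its `assert_raise_message` helper, and only imitate the parts discussed in this thread.

```python
import numpy as np


class TinyLabelEncoder:
    """Hypothetical stand-in for illustration; not scikit-learn's class."""

    def fit(self, y):
        # Sorted unique labels; position in this array is the encoded value.
        self.classes_ = np.unique(y)
        return self

    def inverse_transform(self, y):
        y = np.asarray(y)
        # Encoded values must lie in 0 .. len(classes_) - 1.
        diff = np.setdiff1d(y, np.arange(len(self.classes_)))
        if len(diff):
            raise ValueError(
                "y contains previously unseen labels: %s" % str(diff))
        return self.classes_[y]


def assert_raises_with_message(exc_type, message, func, *args):
    """Hypothetical helper: check the exception type and a message substring."""
    try:
        func(*args)
    except exc_type as exc:
        assert message in str(exc), "unexpected message: %s" % exc
    else:
        raise AssertionError("%s was not raised" % exc_type.__name__)


le = TinyLabelEncoder().fit([1, 2, 3, -1, 1])
msg = "contains previously unseen labels"
# One unseen encoded value, as in the PR's test.
assert_raises_with_message(ValueError, msg, le.inverse_transform, [-2])
# Several unseen values: the case that used to give the ambiguous error.
assert_raises_with_message(ValueError, msg, le.inverse_transform, [-2, 7])
# Valid encoded values still decode to the fitted labels.
decoded = le.inverse_transform([0, 1, 2, 3])
```

Checking both the single-value and the multi-value case is a design choice of this sketch; the test shown in the diff uses a single value.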

@amueller amueller changed the title Fix ValueError in LabelEncoder when using inverse_transform on unseen labels [MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform on unseen labels Sep 21, 2017
Member

@lesteve lesteve commented Sep 21, 2017

LGTM, merging, thanks a lot @newey01c!

@lesteve lesteve merged commit c554aad into scikit-learn:master Sep 21, 2017
6 checks passed
ci/circleci Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.17%)
codecov/project 96.17% (+<.01%) compared to a78b66f
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
lgtm analysis: Python No alert changes

Member

@jnothman jnothman commented Sep 24, 2017

This is missing a whats_new entry. I'll pull it into my 0.19.1 branch and write an entry there.

@vdaita vdaita commented Feb 3, 2018

The issue appears to persist. I am using LabelEncoder, and here is my stack trace:

  File "ann.py", line 71, in <module>
    X_train, X_test, y_train, y_test = get_dataset("Churn_Modelling.csv", 3, 13, 13)
  File "ann.py", line 28, in get_dataset
    encoder.fit(labels)
  File "/home/yolopc/.local/lib/python3.5/site-packages/sklearn/preprocessing/label.py", line 96, in fit
    self.classes_ = np.unique(y)
  File "/home/yolopc/.local/lib/python3.5/site-packages/numpy/lib/arraysetops.py", line 210, in unique
    return _unique1d(ar, return_index, return_inverse, return_counts)
  File "/home/yolopc/.local/lib/python3.5/site-packages/numpy/lib/arraysetops.py", line 277, in _unique1d
    ar.sort()
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Do you have any suggestions?

Member

@jnothman jnothman commented Feb 3, 2018

As noted in #10552, this was accidentally not included in the 0.19.1 release. Using the development version of scikit-learn will make it work.
