[MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform on unseen labels #9816

Merged (2 commits) on Sep 21, 2017

newey01c commented Sep 21, 2017

Reference Issue

Fix #9812

@amueller

looks good to me.

@@ -148,8 +149,9 @@ def inverse_transform(self, y):
         check_is_fitted(self, 'classes_')
         diff = np.setdiff1d(y, np.arange(len(self.classes_)))
-        if diff:
-            raise ValueError("y contains new labels: %s" % str(diff))
+        if len(diff):

@amueller

amueller Sep 21, 2017

Member

This len is the fix, right?

@newey01c

newey01c Sep 21, 2017

Contributor

Correct - that's all that was needed to produce a sensible error 😄
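
For readers wondering why a bare `if diff:` misbehaves, here is a minimal sketch using only NumPy. The values and the helper `truthiness` are illustrative assumptions, not code from the PR. It suggests that truth-testing the result of `np.setdiff1d` only works when exactly one element is present, whereas `len(diff)` is well defined for any number of unseen labels.

```python
import numpy as np

# Encoded labels that would be valid after fitting on four distinct classes
# (illustrative values; mirrors np.arange(len(self.classes_)) in the diff).
valid_codes = np.arange(4)


def truthiness(arr):
    """Hypothetical helper: report what `if arr:` does for a NumPy array."""
    try:
        return "truthy" if arr else "falsy"
    except ValueError as exc:
        return "ValueError: %s" % exc


one_unseen = np.setdiff1d([-2], valid_codes)     # one unseen code
two_unseen = np.setdiff1d([-2, 7], valid_codes)  # two unseen codes
none_unseen = np.setdiff1d([0, 3], valid_codes)  # nothing unseen: empty array

print(truthiness(one_unseen))  # a single non-zero element can be truth-tested
print(truthiness(two_unseen))  # several elements raise the "ambiguous" ValueError
# Lengths are defined in every case, which is what `if len(diff):` relies on.
print(len(one_unseen), len(two_unseen), len(none_unseen))
```

The empty case is deliberately not truth-tested here, because how NumPy treats `bool()` of an empty array has changed across versions.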

assert_raises(ValueError, le.inverse_transform, [-1])
le.fit([1, 2, 3, -1, 1])
msg = "contains previously unseen labels"
assert_raise_message(ValueError, msg, le.inverse_transform, [-2])
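
As a rough illustration of what this test checks, the sketch below re-creates the fixed check in a standalone function. `inverse_transform_sketch` and `error_message` are hypothetical names, and the full error string is an assumption built around the fragment "contains previously unseen labels" that the test matches. This is not scikit-learn's implementation.

```python
import numpy as np


def inverse_transform_sketch(classes, y):
    """Hypothetical stand-in for LabelEncoder.inverse_transform."""
    classes = np.asarray(classes)
    y = np.asarray(y)
    diff = np.setdiff1d(y, np.arange(len(classes)))
    if len(diff):  # length check works for zero, one or many unseen codes
        # Assumed wording; the test only matches the middle of the message.
        raise ValueError("y contains previously unseen labels: %s" % str(diff))
    return classes[y]


# Sorted unique labels, as stored by fit via np.unique(y)
classes = np.unique([1, 2, 3, -1, 1])


def error_message(y):
    """Return the ValueError text for y, or None if decoding succeeds."""
    try:
        inverse_transform_sketch(classes, y)
    except ValueError as exc:
        return str(exc)
    return None


print(error_message([-2]))          # one unseen code
print(error_message([-2, -3, -4]))  # several unseen codes, same readable error
print(inverse_transform_sketch(classes, [0, 3]))  # valid codes decode normally
```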

@amueller

amueller Sep 21, 2017

Member

I find the organization of the tests a bit weird but not your fault. The test that it actually works if they are present is way at the top of the file.

@newey01c

newey01c Sep 21, 2017

Contributor

Happy to reorganise tomorrow if you are able to give me some pointers - I'm not very familiar with the testing structure of sklearn as this is my first issue.

@amueller

amueller Sep 21, 2017

Member

it's fine, I think.

@amueller amueller changed the title from Fix ValueError in LabelEncoder when using inverse_transform on unseen labels to [MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform on unseen labels Sep 21, 2017

@lesteve

lesteve Sep 21, 2017

Member

LGTM, merging, thanks a lot @newey01c!

@lesteve lesteve merged commit c554aad into scikit-learn:master Sep 21, 2017

6 checks passed

ci/circleci Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.17%)
codecov/project 96.17% (+<.01%) compared to a78b66f
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
lgtm analysis: Python No alert changes
@jnothman

jnothman Sep 24, 2017

Member

This is missing a whats_new entry. I'll pull it into my 0.19.1 branch and write an entry there

@vdaita

vdaita Feb 3, 2018

The issue appears to be persistent - I am using LabelEncoder. Here is my stack trace:

  File "ann.py", line 71, in <module>
    X_train, X_test, y_train, y_test = get_dataset("Churn_Modelling.csv", 3, 13, 13)
  File "ann.py", line 28, in get_dataset
    encoder.fit(labels)
  File "/home/yolopc/.local/lib/python3.5/site-packages/sklearn/preprocessing/label.py", line 96, in fit
    self.classes_ = np.unique(y)
  File "/home/yolopc/.local/lib/python3.5/site-packages/numpy/lib/arraysetops.py", line 210, in unique
    return _unique1d(ar, return_index, return_inverse, return_counts)
  File "/home/yolopc/.local/lib/python3.5/site-packages/numpy/lib/arraysetops.py", line 277, in _unique1d
    ar.sort()
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Do you have any suggestions?

@jnothman

jnothman Feb 3, 2018

Member

As noted in #10552, this was accidentally not included in the 0.19.1 release. Using the development version of scikit-learn will make it work.
