Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC added example to permutation importance #16460

Merged

Conversation

magda-zielinska
Copy link
Contributor

Introduced suggestions made by @glemaitre to the example of permutation importance

#16223

hope it works out this time. looking foreword to feedback. all the best!

together with @fraboeni

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure that you have is because of the need to synchronize your branch with the upstream/master branch. You can merge upstream/master into your branch and it should be fixed.

LogisticRegression(...)

>>> result = permutation_importance(clf, X, y, n_repeats=10,
... random_state=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you align this line such that random_state would be under clf of the previous line

--------
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.inspection import permutation_importance

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can remove all blank lines between each >>>. They will create some blocks on the html page (cf. https://93833-843222-gh.circle-artifacts.com/0/doc/modules/generated/sklearn.inspection.permutation_importance.html)

>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.inspection import permutation_importance

>>> X = [[1,9,9],[1,9,9],[1,9,9],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add spaces after each comma

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apart of the last one :)


>>> X = [[1,9,9],[1,9,9],[1,9,9],
... [0,9,9],[0,9,9],[0,9,9]]
>>> y = [1,1,1,0,0,0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
>>> y = [1,1,1,0,0,0]
>>> y = [1, 1, 1, 0, 0, 0]

>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.inspection import permutation_importance

>>> X = [[1,9,9],[1,9,9],[1,9,9],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
>>> X = [[1,9,9],[1,9,9],[1,9,9],
>>> X = [[1, 9, 9], [1, 9, 9], [1, 9, 9],

>>> from sklearn.inspection import permutation_importance

>>> X = [[1,9,9],[1,9,9],[1,9,9],
... [0,9,9],[0,9,9],[0,9,9]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
... [0,9,9],[0,9,9],[0,9,9]]
... [0, 9, 9], [0, 9, 9], [0, 9, 9]]

array([0.5, 0. , 0. ])

>>> result.importances_std
array([0.2236068, 0. , 0. ])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it weird that the number here is different from the one on the dictionary couple of line above?

'importances_std': array([0.16666667, 0. , 0. ]),
'importances': array([[0.33333333, 0.66666667],
[0. , 0. ],
[0. , 0. ]])}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full dictionary output should not be displayed as the results are stored in the result variable. I think showing the values of result.importances_mean and result.importances_std is enough.

@magda-zielinska
Copy link
Contributor Author

@glemaitre dear Guillaume, i did the changes that you and @ogrisel Olivier suggested. Thanks a lot for your comments!

Managed to synchronize my branch with the help of @adrinjalali, big thanks for that!

@adrinjalali
Copy link
Member

adrinjalali commented Feb 26, 2020

You have some linter issues (line too long). Related to linting, if you haven't, you may find pep8 useful.

@glemaitre
Copy link
Member

glemaitre commented Feb 26, 2020 via email

@magda-zielinska
Copy link
Contributor Author

@glemaitre Good morning, hmmm, now the checks from before are passing, but there new ones failing.
The logs don't tell me much. Could you help? All the best!

@adrinjalali
Copy link
Member

I think this is the part of the error message relevant to this PR:

____________ ERROR collecting inspection/_permutation_importance.py ____________
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:932: in find
    self._find(tests, obj, name, module, source_lines, globs, {})
/usr/share/miniconda/envs/testvenv/lib/python3.8/site-packages/_pytest/doctest.py:459: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:994: in _find
    self._find(tests, val, valname, module, source_lines,
/usr/share/miniconda/envs/testvenv/lib/python3.8/site-packages/_pytest/doctest.py:459: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:982: in _find
    test = self._get_test(obj, name, module, globs, source_lines)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:1066: in _get_test
    return self._parser.get_doctest(docstring, globs, name,
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:668: in get_doctest
    return DocTest(self.get_examples(string, name), globs,
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:682: in get_examples
    return [x for x in self.parse(string, name)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:644: in parse
    self._parse_example(m, name, lineno)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:703: in _parse_example
    self._check_prefix(source_lines[1:], ' '*indent + '.', name, lineno)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:798: in _check_prefix
    raise ValueError('line %r of the docstring for %s has '
E   ValueError: line 72 of the docstring for sklearn.inspection._permutation_importance.permutation_importance has inconsistent leading whitespace: '                                    ... random_state=0)'
____________ ERROR collecting inspection/_permutation_importance.py ____________
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:932: in find
    self._find(tests, obj, name, module, source_lines, globs, {})
/usr/share/miniconda/envs/testvenv/lib/python3.8/site-packages/_pytest/doctest.py:459: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:994: in _find
    self._find(tests, val, valname, module, source_lines,
/usr/share/miniconda/envs/testvenv/lib/python3.8/site-packages/_pytest/doctest.py:459: in _find
    doctest.DocTestFinder._find(  # type: ignore
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:982: in _find
    test = self._get_test(obj, name, module, globs, source_lines)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:1066: in _get_test
    return self._parser.get_doctest(docstring, globs, name,
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:668: in get_doctest
    return DocTest(self.get_examples(string, name), globs,
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:682: in get_examples
    return [x for x in self.parse(string, name)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:644: in parse
    self._parse_example(m, name, lineno)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:703: in _parse_example
    self._check_prefix(source_lines[1:], ' '*indent + '.', name, lineno)
/usr/share/miniconda/envs/testvenv/lib/python3.8/doctest.py:798: in _check_prefix
    raise ValueError('line %r of the docstring for %s has '
E   ValueError: line 72 of the docstring for sklearn.inspection._permutation_importance.permutation_importance has inconsistent leading whitespace: '                                    ... random_state=0)'
--------- generated xml file: /home/vsts/work/tmp_folder/test-data.xml ---------

which means there's an issue with the whitespaces at the begining of the line you're changing/adding.

... [0, 9, 9],[0, 9, 9],[0, 9, 9]]
>>> y = [1, 1, 1, 0, 0, 0]
>>> clf = LogisticRegression().fit(X, y)
LogisticRegression(...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LogisticRegression(...)

Tentative solution for the failed check... hope it helps...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmarmo Hi Chiara, thanks a lot! All the best!

@magda-zielinska
Copy link
Contributor Author

@glemaitre Hi Guillaume, @adrinjalali Hi Adrin, seems like the tests are passing! wow. cool.
Thanks a lot for your lots of suggestions and help. xxx Magda

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM, thanks @magda-zielinska

>>> result = permutation_importance(clf, X, y, n_repeats=10,
... random_state=0)
>>> result.importances_mean
array([0.46666667, 0. , 0. ])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use ellipsis to avoid precision issues on different platforms:

Suggested change
array([0.46666667, 0. , 0. ])
array([0.4666..., 0. , 0. ])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrinjalali all right, I changed it! thanks. the local tests passed. pushing it here

>>> result.importances_mean
array([0.46666667, 0. , 0. ])
>>> result.importances_std
array([0.22110832, 0. , 0. ])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well

Suggested change
array([0.22110832, 0. , 0. ])
array([0.2211..., 0. , 0. ])

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan changed the title [MRG] DOC added example to permutation importance DOC added example to permutation importance Apr 10, 2020
@thomasjpfan thomasjpfan merged commit 0447dd9 into scikit-learn:master Apr 10, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants