qinhanmin2014 and amueller [MRG+2] ENH&BUG Add pos_label parameter and fix a bug in average_precision_score (#9980)

#### Reference Issues/PRs
part of #9829

#### What does this implement/fix? Explain your changes.
(1) Add a pos_label parameter to average_precision_score. Although we finally decided not to introduce pos_label in roc_auc_score, I think we need it here, because there is no relationship between the results if we reverse the true labels. Also, precision and recall both support pos_label.
(2) Fix a bug where average_precision_score sometimes returns nan when sample_weight contains 0:

    import numpy as np
    from sklearn.metrics import average_precision_score

    y_true = np.array([0, 0, 0, 1, 1, 1])
    y_score = np.array([0.1, 0.4, 0.85, 0.35, 0.8, 0.9])
    average_precision_score(y_true, y_score, sample_weight=[1, 1, 0, 1, 1, 0])
    # output before this fix: nan

I fix it here because of (3).
(3) Move the average_precision scores out of METRIC_UNDEFINED_BINARY (this should contain the regression test for (1) and (2)).
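To make points (1) and (2) concrete, here is a minimal pure-Python sketch of a step-wise average precision. It is not scikit-learn's implementation: the function name `average_precision_sketch` and the `drop_zero_weight` flag are hypothetical, and the sketch assumes (the description above does not say so) that the nan comes from a 0/0 precision at thresholds where only zero-weight samples have been seen so far.

```python
import math


def average_precision_sketch(y_true, y_score, sample_weight=None,
                             pos_label=1, drop_zero_weight=True):
    """Sum over thresholds of (R_n - R_{n-1}) * P_n, with sample weights.

    Hypothetical helper for illustration only, not the scikit-learn code.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    rows = list(zip(y_score, y_true, sample_weight))
    if drop_zero_weight:
        # Zero-weight samples should not influence the result, so skip them.
        rows = [row for row in rows if row[2] != 0]
    # Visit samples from the highest score to the lowest.
    rows.sort(key=lambda row: row[0], reverse=True)

    total_pos = sum(w for _, y, w in rows if y == pos_label)
    if total_pos == 0:
        raise ValueError("no positive sample with non-zero weight")

    tp = fp = 0.0
    prev_recall = 0.0
    ap = 0.0
    i = 0
    while i < len(rows):
        # Consume every sample tied at the current threshold.
        threshold = rows[i][0]
        while i < len(rows) and rows[i][0] == threshold:
            _, y, w = rows[i]
            if y == pos_label:
                tp += w
            else:
                fp += w
            i += 1
        # Precision is 0/0 while only zero-weight samples have been counted.
        precision = tp / (tp + fp) if tp + fp > 0 else math.nan
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


y_true = [0, 0, 0, 1, 1, 1]
y_score = [0.1, 0.4, 0.85, 0.35, 0.8, 0.9]
weights = [1, 1, 0, 1, 1, 0]

# Keeping the zero-weight samples reproduces a nan in this sketch.
print(average_precision_sketch(y_true, y_score, weights, drop_zero_weight=False))  # nan
# Dropping them gives a finite value.
print(round(average_precision_sketch(y_true, y_score, weights), 4))  # 0.8333
# Treating label 0 as positive gives a value that is not 1 - 0.8333.
print(average_precision_sketch(y_true, y_score, weights, pos_label=0))  # 0.5
```

The last two calls illustrate why an explicit pos_label matters for this metric: swapping which label counts as positive yields a score that cannot be derived from the original one.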

Some comments:
(1) For the underlying method (precision_recall_curve), the default value of pos_label is None, but I set the default to 1 here because that is what P/R/F do. What's more, the meaning of pos_label=None is unclear even within scikit-learn itself (see #10010).
(2) I slightly modified the common test. Currently, the part I modified only applies to brier_score_loss (I'm doing the same thing in #9562). I think this is right because a common test should not force metrics to accept string y_true without pos_label.

#### Any other comments?
cc @jnothman Could you please take some time to review or at least judge whether this is the right way to go? Thanks a lot :) 

Latest commit dd69361 Jul 16, 2018
.circleci MAINT: Deploy doc even if broken on Python2 (#11529) Jul 15, 2018
benchmarks MAINT Complete 0.20 deprecations (#9570) Jun 24, 2018
build_tools MNT CI show full traceback on sphinx-build exception (#11386) Jul 9, 2018
doc [MRG+2] ENH&BUG Add pos_label parameter and fix a bug in average_prec… Jul 16, 2018
examples [MRG+1] MissingIndicator transformer (#8075) Jul 16, 2018
sklearn [MRG+2] ENH&BUG Add pos_label parameter and fix a bug in average_prec… Jul 16, 2018
.codecov.yml Turn off codecov comments (#10146) Nov 15, 2017
.coveragerc coverall added Oct 8, 2013
.gitattributes MAINT remove .c files from .gitattributes Nov 21, 2016
.gitignore MAINT Complete 0.20 deprecations (#9570) Jun 24, 2018
.landscape.yml make much more useful Mar 10, 2015
.mailmap Fix mailmap format (#9620) Aug 24, 2017
.travis.yml TRAVIS fix condition for testing scipy-dev build (#11402) Jul 2, 2018
AUTHORS.rst DOC mention new maintainers in AUTHORS.rst (#11491) Jul 13, 2018 DOC Link to dev doc in Mar 29, 2018
COPYING MAINT Update copyright year 2018 (#10456) Jan 11, 2018 add python parameter to issue template for better code rendering Apr 14, 2017 MAINT Include binary_tree.pxi in source distribution Jul 4, 2014
Makefile [MRG] fix command for make test-coverage (#10188) Nov 23, 2017 DOC Encourage contributors to use keywords to close issue automatical… Oct 20, 2017
README.rst DOC Add and changelog links (#11298) Jun 18, 2018
appveyor.yml [MRG] Appveyor version upgrade (#11425) Jul 4, 2018 TST: only run doctests on numpy 1.14. (#10835) Mar 27, 2018
setup.cfg MAINT cleanup setup.cfg Jul 15, 2018 MAINT Use install_requires for numpy and scipy (#10402) Jan 14, 2018
site.cfg Remove obsolete info. Feb 8, 2011




scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. See the AUTHORS.rst file for a complete list of contributors.

It is currently maintained by a team of volunteers.




scikit-learn requires:

  • Python (>= 2.7 or >= 3.4)
  • NumPy (>= 1.8.2)
  • SciPy (>= 0.13.3)

For running the examples Matplotlib >= 1.3.1 is required. A few examples require scikit-image >= 0.9.3 and a few examples require pandas >= 0.13.1.
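As a quick way to compare an existing environment against the minimum versions listed above, the following sketch may help. The helper names (`version_tuple`, `meets_minimum`, `check_dependencies`) are hypothetical and not part of scikit-learn; the version parsing is deliberately simple and only reads the leading numeric components of each version string.

```python
from importlib import import_module

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {"numpy": "1.8.2", "scipy": "0.13.3"}


def version_tuple(version):
    """Turn '1.8.2' into (1, 8, 2); stop at suffixes such as 'rc1' or 'dev0'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare numerically, so that '1.14.5' counts as newer than '1.8.2'."""
    return version_tuple(installed) >= version_tuple(required)


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return {name: (installed_version_or_None, ok)} for each dependency."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = import_module(name).__version__
        except ImportError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, required))
    return report


for name, (installed, ok) in sorted(check_dependencies().items()):
    print(name, installed, "OK" if ok else "missing or too old")
```

Comparing tuples of integers rather than raw strings avoids the common mistake of ranking "1.14" below "1.8".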

scikit-learn also uses CBLAS, the C interface to the Basic Linear Algebra Subprograms library. scikit-learn comes with a reference implementation, but the system CBLAS will be detected by the build system and used if present. CBLAS exists in many implementations; see Linear algebra libraries for known issues.

User installation

If you already have a working installation of numpy and scipy, the easiest way to install scikit-learn is using pip:

pip install -U scikit-learn

or conda:

conda install scikit-learn

The documentation includes more detailed installation instructions.


See the changelog for a history of notable changes to scikit-learn.


We welcome new contributors of all experience levels. The scikit-learn community goals are to be helpful, welcoming, and effective. The Development Guide has detailed information about contributing code, documentation, tests, and more. We've included some basic information in this README.

Important links

Source code

You can check the latest sources with the command:

git clone

Setting up a development environment

Quick tutorial on how to go about setting up your environment to contribute to scikit-learn:


After installation, you can launch the test suite from outside the source directory (you will need to have the pytest package installed):

pytest sklearn

See the web page for more information.

Random number generation can be controlled during testing by setting the SKLEARN_SEED environment variable.
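For example, a fixed seed can be passed to the test run from a small Python wrapper. This is only a sketch: the helper names `seeded_test_env` and `run_tests` are hypothetical, the seed value 42 is arbitrary, and it assumes the `pytest` executable is on the PATH and that the seed is an integer.

```python
import os
import subprocess


def seeded_test_env(seed, base_env=None):
    """Return a copy of the environment with SKLEARN_SEED set to `seed`."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment variables are strings; int() rejects non-numeric input.
    env["SKLEARN_SEED"] = str(int(seed))
    return env


def run_tests(seed, target="sklearn"):
    """Run `pytest <target>` in a child process with a fixed SKLEARN_SEED."""
    completed = subprocess.run(["pytest", target], env=seeded_test_env(seed))
    return completed.returncode


# Usage, from outside the source directory:
#     exit_code = run_tests(42)
```

Building the environment as a copy leaves the parent process untouched, so repeated runs with different seeds do not interfere with each other.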

Submitting a Pull Request

Before opening a Pull Request, have a look at the full Contributing page to make sure your code complies with our guidelines:

Project History

The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. See the AUTHORS.rst file for a complete list of contributors.

The project is currently maintained by a team of volunteers.

Note: scikit-learn was previously referred to as scikits.learn.

Help and Support




If you use scikit-learn in a scientific publication, we would appreciate citations: