Adding Thresholder and RejectOptionClassifier #997

Open
wants to merge 58 commits into
base: main
Changes from 49 commits
6883b75
Add _thresholder.py and change __init__
bramreinders97 Nov 25, 2021
6747563
Add first version of unit test Thresholder
bramreinders97 Nov 25, 2021
9d75c48
include test with regression problem (predict_method=predict)
bramreinders97 Nov 26, 2021
2144645
test prefit=False
bramreinders97 Nov 28, 2021
901e89b
fix flake8 issues
bramreinders97 Nov 29, 2021
abb5ea4
implement other comments from roman
bramreinders97 Nov 29, 2021
e939596
Include way to deal with multiple sensitive features
bramreinders97 Dec 2, 2021
ed070d9
include docstring
bramreinders97 Dec 4, 2021
8aee535
fix flake8
bramreinders97 Dec 4, 2021
bd8b44f
improve warning about unsees sensitive features
bramreinders97 Dec 4, 2021
5b8a6bc
fix flake8
bramreinders97 Dec 4, 2021
369aed8
include test multiple sf
bramreinders97 Dec 6, 2021
0c7b856
Also allow for threshold operations the other way around
bramreinders97 Dec 15, 2021
d276880
add simple user example for docs
bramreinders97 Dec 15, 2021
a07f8ed
improved docs
bramreinders97 Dec 15, 2021
cd0ea0d
add default_threshold to constructor, sensitive_features to fit
bramreinders97 Dec 16, 2021
dabee57
create _make_predictions in order to reduce amount of duplicate code
bramreinders97 Dec 17, 2021
0241e14
fix flake8
bramreinders97 Dec 17, 2021
6079850
Merge branch 'fairlearn:main' into create_reject
bramreinders97 Jan 18, 2022
0a4acbd
create user guide Thresholder
bramreinders97 Jan 18, 2022
c901231
remove conflicts?
bramreinders97 Jan 18, 2022
a596905
remove conflicts!
bramreinders97 Jan 18, 2022
93015c7
move to mitigation and minor fixes
bramreinders97 Jan 19, 2022
8d4ab22
add plot functions to ._plotting.py
bramreinders97 Jan 20, 2022
aa6dba6
fix flake8
bramreinders97 Jan 20, 2022
0df53f8
move dataset description to dataset user guide
bramreinders97 Feb 7, 2022
fb1393f
minor change to example in API docs
bramreinders97 Feb 7, 2022
8a0e416
Add Thresholder and plot functions to version guide
bramreinders97 Feb 10, 2022
2ad0694
Add Thresholder and plot functions to version guide, but now with cor…
bramreinders97 Feb 10, 2022
2730360
minor fixes to user guide and Thresholder
bramreinders97 Feb 10, 2022
3dd8af3
change A to sf in user guide
bramreinders97 Feb 10, 2022
bb701ab
add RejectOptionClassifier
bramreinders97 Feb 24, 2022
b7838ed
fix comments miro
bramreinders97 Apr 8, 2022
d4d0f1b
resolve conflicts?
bramreinders97 Apr 8, 2022
62594f5
resolve conflicts
bramreinders97 Apr 8, 2022
3abb15a
create more rigorous tests
bramreinders97 Apr 10, 2022
e21f783
resolve flake8 and no_matplotlib error
bramreinders97 Apr 11, 2022
39be13b
resolve majority of doctest failures
bramreinders97 Apr 11, 2022
852119d
resolve .fit() doctest failures
bramreinders97 Apr 11, 2022
5819513
fix .fit() errors in build doctest 2.0
bramreinders97 Apr 11, 2022
300b467
output fixes doctest
bramreinders97 Apr 11, 2022
c5bca6d
before merge
bramreinders97 May 22, 2022
abd8846
test
bramreinders97 May 22, 2022
e9bda30
resolve comments hilde
bramreinders97 May 22, 2022
2c62401
temporary commit
bramreinders97 May 22, 2022
0df0809
resolve comments hilde commit try 2
bramreinders97 May 22, 2022
abd5271
Merge branch 'create_reject' of https://github.com/bramreinders97/fai…
bramreinders97 May 22, 2022
0e7af8d
remove HEAD from init
bramreinders97 May 22, 2022
2450d0a
fix doctest fail
bramreinders97 May 22, 2022
22a2084
Update docs/user_guide/mitigation.rst
hildeweerts Jul 12, 2022
f3f8c19
Update docs/user_guide/mitigation.rst
hildeweerts Jul 12, 2022
26020d3
Update docs/user_guide/mitigation.rst
hildeweerts Jul 12, 2022
6ec25e8
Update docs/user_guide/mitigation.rst
hildeweerts Jul 12, 2022
ed47e4f
Update docs/user_guide/mitigation.rst
hildeweerts Jul 12, 2022
c9353d9
in order to switch branches
bramreinders97 Jul 15, 2022
2d509e9
Merge branch 'main' into copy_reject_2
bramreinders97 Jul 15, 2022
3101ced
resolve comments hile and roman
bramreinders97 Jul 15, 2022
0960d0b
resolve comments hile and roman
bramreinders97 Jul 15, 2022
10 changes: 10 additions & 0 deletions docs/refs.bib
@@ -28,6 +28,16 @@ @inproceedings{agarwal2019fair
url = {http://proceedings.mlr.press/v97/agarwal19d.html}
}

@inproceedings{kamiran2012rejectoptionclassifier,
author={Kamiran, Faisal and Karim, Asim and Zhang, Xiangliang},
booktitle={2012 IEEE 12th International Conference on Data Mining},
title={Decision Theory for Discrimination-Aware Classification},
year={2012},
volume={},
number={},
pages={924-929},
doi={10.1109/ICDM.2012.45}}

@inproceedings{hardt2016equality,
author = {Moritz Hardt and
Eric Price and
14 changes: 13 additions & 1 deletion docs/user_guide/datasets/boston_housing_data.rst
@@ -4,7 +4,6 @@ Revisiting the Boston Housing Dataset

Introduction
^^^^^^^^^^^^^^^^^

The Boston Housing dataset is one of the datasets currently callable in :mod:`fairlearn.datasets` module.
In the past, it has commonly been used for benchmarking in popular machine learning libraries,
including `scikit-learn <https://scikit-learn.org/>`_ and `OpenML <https://www.openml.org/>`_.
@@ -447,3 +446,16 @@ you pause about using it in the future.

.. [#11] Kimberlé Crenshaw, Mapping the margins: Intersectionality, identity politics, and violence against women of color,
Stanford Law Review, 1993, 43(6), 1241-1299.

.. _hospital_readmissions_dataset:
Contributor

I don't think this should be in this file and it probably should also not be part of this PR (see #1066). Instead, I would introduce the dataset briefly when it is used in the user guide.

Contributor Author

At first I briefly introduced the dataset in the user guide, but moved it as I did because @romanlutz said

> I'm not opposed to either one of these, but I'm not a fan of introducing the dataset in between the mitigation subsections.

in this comment.

I agree that it does not make a lot of sense to have this dataset described in the Boston housing dataset file. Shall I create a new file, hospital_readmissions_dataset.rst, and open a separate PR that documents this dataset in the same way as is done in dataset_x.rst, for someone else to finish (as roman also suggested in the comment just mentioned)?

Member

That seems reasonable. Seems related to #1086, or is that yet another dataset?

Contributor

That is the same dataset, so I would encourage you not to write it again :).

Member

Oh great! I'm glad we figured that part out. Thanks @rensoostenbach for confirming!


Hospital readmissions dataset
------------------------------
This is a clinical dataset of hospital readmissions over a ten-year period (1999-2008)
for diabetic patients across 130 different hospitals in the US. Each record
represents a hospital admission of a patient diagnosed with
diabetes whose stay lasted one to fourteen days. We would like to develop a
classification model that decides whether a patient should be recommended
to their primary care physician for enrollment in the high-risk care
management program. A positive prediction means a recommendation for the
care program.
2 changes: 1 addition & 1 deletion docs/user_guide/index.rst
@@ -11,4 +11,4 @@ User Guide
mitigation
datasets/index
installation_and_version_guide/index
further_resources
further_resources
bramreinders97 marked this conversation as resolved.
25 changes: 25 additions & 0 deletions docs/user_guide/installation_and_version_guide/v0.7.1.rst
@@ -0,0 +1,25 @@
v0.7.1
Contributor

Is this supposed to be part of this PR?

Contributor Author

I added this here because of this comment. Considering the recent changes about including/not including this PR in v.8 milestones, maybe it should be changed to v0.8.0.rst or removed completely? Wdyt @hildeweerts @romanlutz @adrinjalali

======

.. note::

    v0.7.1 is not yet released. This page reflects changes on the current
    :code:`main` branch that will eventually be a part of v0.7.1.

* Relaxed checks made on :code:`X` in :code:`_validate_and_reformat_input()`
  since that is the concern of the underlying estimator and not Fairlearn.
* Added support for Python 3.9.
* Added error handling in :code:`MetricFrame`. Methods :code:`group_max`, :code:`group_min`,
  :code:`difference` and :code:`ratio` now accept :code:`errors` as a parameter,
  which can be either :code:`raise` or :code:`coerce`.
* Fixed a bug whereby passing a custom :code:`grid` object to a :code:`GridSearch`
  reduction would result in a :code:`KeyError` if the column names were not ordered
  integers.
* :class:`~fairlearn.preprocessing.CorrelationRemover` now exposes
  :code:`n_features_in_` and :code:`feature_names_in_`.
* Added the ACSIncome dataset and corresponding documentation.
* Added :class:`~fairlearn.postprocessing.Thresholder`, including the corresponding
  documentation and a user guide.
* Added :func:`~fairlearn.postprocessing.plot_histograms_per_group`,
  :func:`~fairlearn.postprocessing.plot_positive_predictions`,
  :func:`~fairlearn.postprocessing.plot_proba_distribution` to :code:`fairlearn.postprocessing`.
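
For readers unfamiliar with the idea behind a thresholder, the sketch below shows group-specific thresholding in plain Python. It is only an illustration and does not reproduce the `Thresholder` API from this PR: the function `apply_group_thresholds` and its arguments are hypothetical, the name `default_threshold` is borrowed from the commit message above, and the strict `>` comparison is an assumption.

```python
def apply_group_thresholds(scores, sensitive_features, thresholds,
                           default_threshold=0.5):
    """Turn scores into 0/1 predictions using a per-group cutoff.

    Hypothetical helper for illustration; not the API added in this PR.
    """
    predictions = []
    for score, group in zip(scores, sensitive_features):
        # Groups without an explicit entry fall back to the default cutoff.
        cutoff = thresholds.get(group, default_threshold)
        # Assumption: a score strictly above the cutoff is a positive prediction.
        predictions.append(1 if score > cutoff else 0)
    return predictions


scores = [0.30, 0.45, 0.55, 0.70, 0.45, 0.65]
groups = ["a", "a", "a", "b", "b", "c"]
thresholds = {"a": 0.4, "b": 0.6}  # group "c" has no entry

predictions = apply_group_thresholds(scores, groups, thresholds)
print(predictions)
```

The same score (0.45) yields a positive prediction for group "a" and a negative one for group "b", which is the effect a group-specific threshold is meant to have. Group "c" has no entry in the dictionary and is evaluated against the default cutoff.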