scikit-learn-compatible API #134

Merged
merged 61 commits into master on Feb 19, 2020

hoffmansc
Collaborator

Progress towards #58

Old API:

  • Minor changes to old API mostly to aid reproducibility.
  • Fixed a bug in CalibratedEqOddsPostprocessing: the generalized false negative rate (GFNR) is now weighted by base_rate, not 1 - base_rate
  • New Sphinx docs layout

    new docs pages
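
To make the CalibratedEqOddsPostprocessing fix above concrete, here is a minimal standalone sketch of a weighted cost in which the generalized false positive rate is weighted by 1 - base_rate and the generalized false negative rate by base_rate. It is not the AIF360 code: the function names, signatures, and the `fp_weight`/`fn_weight` parameters are assumptions chosen for illustration, and it assumes the generalized rates are means of predicted scores over the true negatives and true positives respectively.

```python
def base_rate(y_true, pos_label=1):
    """Fraction of samples whose true label is the favorable one."""
    return sum(1 for y in y_true if y == pos_label) / len(y_true)


def generalized_fpr(y_true, probas, pos_label=1):
    """Mean predicted score over the truly negative samples."""
    neg = [p for y, p in zip(y_true, probas) if y != pos_label]
    return sum(neg) / len(neg)


def generalized_fnr(y_true, probas, pos_label=1):
    """Mean of (1 - predicted score) over the truly positive samples."""
    pos = [1 - p for y, p in zip(y_true, probas) if y == pos_label]
    return sum(pos) / len(pos)


def weighted_cost(y_true, probas, fp_weight, fn_weight, pos_label=1):
    """Hypothetical helper: GFPR is weighted by the share of negatives,
    GFNR by the share of positives (the base rate)."""
    br = base_rate(y_true, pos_label)
    return (fp_weight * generalized_fpr(y_true, probas, pos_label) * (1 - br)
            + fn_weight * generalized_fnr(y_true, probas, pos_label) * br)


# Toy data: three positives and one negative, so base_rate is 0.75.
y_true = [1, 1, 1, 0]
probas = [0.5, 0.5, 0.5, 0.25]
cost = weighted_cost(y_true, probas, fp_weight=1, fn_weight=1)
```

Weighting GFNR by 1 - base_rate instead would scale the false negative term by the share of negatives, which understates it whenever positives are the majority, as in this toy data.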

New API:

  • 4 datasets (Adult, German, Bank, Compas) in DataFrame format with protected attributes in the index
  • 6 group fairness metrics as functions (statistical_parity_difference, disparate_impact_ratio, equal_opportunity_difference, average_odds_difference, average_odds_error, between_group_generalized_entropy_error)
  • 2 individual fairness metrics as functions (generalized_entropy_index and its variants, consistency_score)
  • 5 additional metrics as functions (specificity_score, base_rate, selection_rate, generalized_fpr, generalized_fnr)
  • 3 algorithms (Reweighing, AdversarialDebiasing, CalibratedEqualizedOdds)
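
As an illustration of what two of the group fairness metrics listed above measure, here is a minimal pure-Python sketch. It does not reproduce the signatures in `aif360.sklearn.metrics`: the arguments `y_pred`, `groups`, `priv_group`, and `pos_label`, and the helper `_split_by_group`, are assumptions for this example, and sample weights are ignored.

```python
def selection_rate(y_pred, pos_label=1):
    """Fraction of predictions equal to the favorable label."""
    return sum(1 for y in y_pred if y == pos_label) / len(y_pred)


def _split_by_group(y_pred, groups, priv_group):
    """Hypothetical helper: split predictions into privileged/unprivileged."""
    priv = [y for y, g in zip(y_pred, groups) if g == priv_group]
    unpriv = [y for y, g in zip(y_pred, groups) if g != priv_group]
    return priv, unpriv


def statistical_parity_difference(y_pred, groups, priv_group, pos_label=1):
    """Unprivileged selection rate minus privileged selection rate."""
    priv, unpriv = _split_by_group(y_pred, groups, priv_group)
    return selection_rate(unpriv, pos_label) - selection_rate(priv, pos_label)


def disparate_impact_ratio(y_pred, groups, priv_group, pos_label=1):
    """Unprivileged selection rate divided by privileged selection rate."""
    priv, unpriv = _split_by_group(y_pred, groups, priv_group)
    return selection_rate(unpriv, pos_label) / selection_rate(priv, pos_label)


# Toy data: 3 of 4 'Male' rows and 1 of 4 'Female' rows are predicted favorable.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ['Male'] * 4 + ['Female'] * 4

spd = statistical_parity_difference(y_pred, groups, priv_group='Male')
dir_ = disparate_impact_ratio(y_pred, groups, priv_group='Male')
```

A difference of 0 or a ratio of 1 indicates equal selection rates across the two groups; here the difference is negative and the ratio is below 1, meaning the unprivileged group is selected less often.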

Samuel Hoffman and others added 30 commits December 19, 2019 12:49
* dataset loading is more similar to sklearn.datasets
* label binarization is now done outside standardize_dataset
* metrics use 'groups' and 'priv_group' to signify priv/unpriv split
removed Reweighing.sample_weight_ attribute
* fixed German 'age' from being dropped
* renamed two_year_recid labels to 'Survived' and 'Recidivated' to match ProPublica article
* reordered COMPAS categories to 'Male' < 'Female'
* added 'foreign_worker' protected attribute for German
@animeshsingh
Collaborator

Thanks @hoffmansc. Maybe you can schedule a 30-minute review with @Tomcli and myself?

@hoffmansc
Collaborator Author

Sure. Sent an invite.

aif360/datasets/adult_dataset.py (resolved)
aif360/sklearn/datasets/openml_datasets.py (resolved)
aif360/sklearn/utils.py (resolved)
aif360/sklearn/metrics/metrics.py (outdated, resolved)
aif360/sklearn/metrics/metrics.py (outdated, resolved)
aif360/sklearn/postprocessing/calibrated_equalized_odds.py (outdated, resolved)
aif360/sklearn/postprocessing/__init__.py (resolved)
docs/Makefile (resolved)
examples/sklearn/demo_new_features.ipynb (resolved)
tests/sklearn/test_adversarial_debiasing.py (resolved)
@animeshsingh
Collaborator

cc @adrinjalali

Adrin, this PR originates from the original request in #58.

It would be great to get your feedback and review on this, as well as your thoughts on how we can target it toward the scikit-learn community.

@animeshsingh
Collaborator

@hoffmansc @nrkarthikeyan it would be best to get this merged if the remaining issues are non-blockers; we can then come back with refinements.

* added one-hot encoding example and random_states to demo notebook
* added 'prefit' option to PostProcessingMeta
* multiple fixes to docstring wordings
* added additional links/disclaimers in docstrings
* renamed CalibratedEqualizedOdds args to X and y
@hoffmansc merged commit 1002610 into master on Feb 19, 2020
Successfully merging this pull request may close these issues.

Use OpenML and sklearn.datasets.fetch_openml for datasets.