Added algorithms and test file for Celis et al. paper. #21
Conversation
Would you mind adding the MetaFairClassifier to inprocessing/__init__.py, similarly to the other examples there? This allows it to be imported like `from aif360.algorithms.inprocessing import MetaFairClassifier`, consistent with the other classes.
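For reference, a minimal sketch of the entry that would go in `inprocessing/__init__.py` (assuming the module file is named `meta_fair_classifier.py`, as the PR description says):

```python
# aif360/algorithms/inprocessing/__init__.py
from aif360.algorithms.inprocessing.meta_fair_classifier import MetaFairClassifier
```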
Additionally, I see you have a demo notebook, which is great! Could you add a couple of assert statements testing that the relevant metrics are debiased? You can see an example here: https://github.com/IBM/AIF360/blob/master/examples/demo_optim_data_preproc.ipynb (the cells marked `### Testing`). Then we can add this to the unit test framework with a corresponding file like https://github.com/IBM/AIF360/blob/master/tests/test_demo_optim_data_preproc.py (all you have to do is change the name and path).
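A rough sketch of what such a testing cell could look like for this classifier (the dataset variable names, group definitions, and threshold here are illustrative assumptions, not the notebook's actual code):

```python
from aif360.metrics import ClassificationMetric

# Hypothetical group definitions; adjust to the notebook's setup.
privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]

metric = ClassificationMetric(dataset_test, dataset_pred,
                              unprivileged_groups=unprivileged_groups,
                              privileged_groups=privileged_groups)

### Testing
# Assert the debiased classifier's disparate impact is reasonably close to parity.
assert metric.disparate_impact() > 0.8
```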
Thanks for your contribution!
```python
x_train = dataset.features
y_train = np.array([1 if y == [1] else -1 for y in dataset.labels])
```
I think this should be `y == [dataset.favorable_label]`, since we assume the general case where the favorable label may not be 1.
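Applied to the line above, the suggested fix would read (a sketch of the reviewer's suggestion, not the merged code):

```python
y_train = np.array([1 if y == [dataset.favorable_label] else -1
                    for y in dataset.labels])
```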
```python
        return pred_dataset

    def _runTrain(self, x_train, y_train, x_control_train):
        obj = FalseDiscovery()
```
Do you want to make this a parameter in `__init__` to handle the StatisticalRate case as well?
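One way this could look, as a sketch only: the metric objective becomes a constructor argument. The `type` flag, default values, and import paths below are assumptions based on the PR's folder layout, not the final API:

```python
from aif360.algorithms import Transformer
from aif360.algorithms.inprocessing.celisMeta.FalseDiscovery import FalseDiscovery
from aif360.algorithms.inprocessing.celisMeta.StatisticalRate import StatisticalRate

class MetaFairClassifier(Transformer):
    def __init__(self, tau=0.8, sensitive_attr='', type='fdr'):
        # ... existing init code ...
        # Pick the optimization objective from the requested fairness metric.
        if type == 'fdr':
            self.obj = FalseDiscovery()
        elif type == 'sr':
            self.obj = StatisticalRate()
        else:
            raise ValueError("type must be 'fdr' or 'sr'")
```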
```python
        return pred_dataset

    def _runTrain(self, x_train, y_train, x_control_train):
```
If these helper functions aren't strictly necessary, maybe we can move the code inside `fit` or `predict`? Or rename them to something more descriptive? I don't want users to accidentally do `_runTrain` when they mean to do `fit`, etc.
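For illustration, a sketch of the inlining option; the attribute access and the objective's `getModel` method are hypothetical names, not the PR's actual helpers:

```python
def fit(self, dataset):
    # The logic formerly in _runTrain moves directly into fit, so the
    # public Transformer API is the only entry point users see.
    x_train = dataset.features
    y_train = np.array([1 if y == [dataset.favorable_label] else -1
                        for y in dataset.labels])
    x_control_train = dataset.protected_attributes.ravel()  # assumed source of the sensitive attribute
    self.model = self.obj.getModel(x_train, y_train, x_control_train)  # hypothetical objective method
    return self
```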
It looks like the demo notebook is taking too long to run. We could up the time limit, but I don't think we should have such a lengthy unit test anyway. I propose we do a few things:
Also, a few other things:
Thanks for staying patient!
Thanks a lot for the review. I made the suggested changes and created a new test file in place of the earlier notebook-based one. It's passing the tests now. Please let me know if there are any other changes required before it can be merged.
* Added algorithms and test file for Celis et al. paper.
* Expanded example and changed stats function
* Made small changes to main file and example and added unit test
* Fixed bugs in notebook and added unbiased classifier for comparison
* Modified test file and made small clean-up changes to main files. Changed README to include link to Celis et al paper.
* Small change to check build
As per discussion with Sameep Mehta, I have added support for the Fair Classification meta-algorithm that we developed. The implementation is based on the following work:
Classification with Fairness Constraints: A Meta-Algorithm with Provable Guarantees - https://arxiv.org/abs/1806.06055.
The algorithm we suggest in the paper takes the fairness metric as input, and the optimization problem solved depends on the metric specified. In this implementation, we provide code for optimizing two fairness metrics: Statistical Rate and False Discovery Rate.
The main implementation file is meta_fair_classifier.py. The algorithm is implemented in the folder celisMeta. I have added a Jupyter test notebook for this algorithm as well - demo_meta_classifier.ipynb.
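For a sense of the intended workflow, here is a minimal usage sketch; the constructor arguments, metric flag, and dataset choice are illustrative assumptions rather than the final API:

```python
from aif360.algorithms.inprocessing import MetaFairClassifier
from aif360.datasets import AdultDataset

# Load a standard benchmark dataset and split into train/test.
dataset = AdultDataset()
train, test = dataset.split([0.7], shuffle=True)

# Train the meta-algorithm with an assumed fairness-metric flag ('fdr' here)
# and an assumed sensitive attribute; both are sketch-level guesses.
clf = MetaFairClassifier(tau=0.8, sensitive_attr='sex', type='fdr')
clf.fit(train)
pred = clf.predict(test)
```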
I have tried to loosely follow the coding style used for the other implemented algorithms. Any comments on the code/algorithm would be very helpful; please let me know of any changes I should make before this can be merged.