Executing the Credit Risk notebook does not generate a de-biased dataset #51

biosopher · 2018-11-14T05:39:56Z

Executing the Credit Risk notebook does not generate a de-biased dataset. The results below are from a brand-new git pull of the AIF360 repo. As shown at the end, the new "debiased" model is now more than twice as biased as the original model:

adrinjalali · 2018-11-16T17:20:44Z

I [almost] concur. I get 0.11 vs 0.21. Is the input data changing? What's the source of these different results?

It may be a good idea to have some of these as tests or doctests in the docstrings (as examples).
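
As a rough illustration of the doctest idea, here is a minimal sketch. The helper `parity_gap` is hypothetical and written only for this example (it is not AIF360 code); it assumes binary labels (1 = favorable outcome) and a binary group indicator (1 = privileged). The point is that the expected metric value lives next to the code and fails loudly if behavior changes.

```python
import doctest


def parity_gap(labels, groups):
    """Return P(y=1 | unprivileged) - P(y=1 | privileged).

    `labels` holds 1 for the favorable outcome and 0 otherwise;
    `groups` holds 1 for the privileged group and 0 for the unprivileged one.

    >>> parity_gap([1, 1, 1, 0, 1, 0, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0])
    -0.5
    >>> parity_gap([1, 0, 1, 0], [1, 1, 0, 0])
    0.0
    """
    def favorable_rate(group):
        # Share of favorable labels among the members of one group.
        selected = [y for y, g in zip(labels, groups) if g == group]
        return sum(selected) / len(selected)

    return favorable_rate(0) - favorable_rate(1)


if __name__ == "__main__":
    # Runs the examples in the docstring; prints nothing when they all pass.
    doctest.testmod()
```

A doctest like this only pins down deterministic behavior, so any randomized step would need a fixed seed before its output could be checked this way.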

biosopher · 2018-11-26T04:59:34Z

@nrkarthikeyan Can you advise on this? Just sent an email cc'ing the AIOS team. Need this fixed so we can show integration.

biosopher · 2018-11-26T07:09:08Z

I know the German credit risk dataset is small, but that doesn't explain the odd behavior of AIF360's notebook. E.g. I looped through a hundred different splits of that dataset looking for one that would de-bias properly.

In the process I found that for EVERY split of test/train, AIF360 actually generates a more biased dataset than the initial one. Something else is wrong here.

Here's my notebook showing this bug: tutorial_credit_scoring_merged.ipynb.zip
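
For anyone who wants to repeat the split sweep without opening the notebook, the loop can be sketched roughly as follows. Everything here is an assumption made for illustration: the rows are made-up `(group, label)` pairs rather than the German credit data, `parity_gap` and `sweep_splits` are hypothetical helpers, and `mitigate` is a placeholder for whichever pre-processing step is under test. Seeding each split makes every run reproducible, which also helps rule out changing input data as the cause of differing numbers.

```python
import random


def parity_gap(rows):
    """P(y=1 | unprivileged) - P(y=1 | privileged) for (group, label) rows."""
    def favorable_rate(group):
        selected = [y for g, y in rows if g == group]
        return sum(selected) / len(selected)

    return favorable_rate(0) - favorable_rate(1)


def sweep_splits(rows, mitigate, n_splits=100, train_fraction=0.7):
    """Return (seed, gap_before, gap_after) for each seeded train split."""
    results = []
    for seed in range(n_splits):
        shuffled = rows[:]
        random.Random(seed).shuffle(shuffled)  # same seed -> same split
        train = shuffled[: int(train_fraction * len(shuffled))]
        results.append((seed, parity_gap(train), parity_gap(mitigate(train, seed))))
    return results


# Made-up data: group 1 = privileged, label 1 = favorable.
rows = [(1, 1)] * 30 + [(1, 0)] * 20 + [(0, 1)] * 20 + [(0, 0)] * 30

# Placeholder mitigation that returns the training rows unchanged.
results = sweep_splits(rows, mitigate=lambda train, seed: train)

# Seeds for which the "mitigated" data is further from parity than the input.
worse = [seed for seed, before, after in results if abs(after) > abs(before)]
print(len(worse), "of", len(results), "splits got worse")
```

Swapping the placeholder for a real mitigation step and listing the seeds in `worse` gives a concrete, shareable set of failing splits.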

adrinjalali · 2018-11-26T09:39:05Z

The warning shown in the notebook might also be significant. The preprocessing sets the privileged and unprivileged groups (gender and race, I guess), so when the user later tries to set the privileged/unprivileged groups in the notebook, that setting is ignored. That could change the results considerably.

nrkarthikeyan · 2018-11-26T13:50:27Z

Hi all, the optimized pre-processing (used in the original credit scoring tutorial) has a lot of randomness built into it, which will create issues with small datasets. I suggest that we use re-weighing pre-processing to circumvent this issue. Please see the attached notebook:
tutorial_credit_scoring_reweighing.ipynb.zip

The key thing to keep in mind is that reweighing pre-processing works by changing the instance-level weights (available in dataset.instance_weights), so the classifier trained on the debiased data must be able to handle instance-level weights. Let me know what you folks think.
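
To make the weighting idea concrete, here is a minimal sketch of the standard reweighing formula, w(g, y) = P(g) * P(y) / P(g, y), in plain Python. It is a simplified illustration and not the notebook's or the library's code: the helpers `reweighing_weights` and `weighted_favorable_rate` and the toy data are assumptions made for this example. It shows that the labels and features stay untouched and only the per-instance weights change.

```python
from collections import Counter


def reweighing_weights(labels, groups):
    """Per-instance weights w(g, y) = P(g) * P(y) / P(g, y)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_favorable_rate(labels, groups, weights, group):
    """Weighted share of favorable labels (y == 1) within one group."""
    favorable = sum(
        w for y, g, w in zip(labels, groups, weights) if g == group and y == 1
    )
    total = sum(w for g, w in zip(groups, weights) if g == group)
    return favorable / total


# Toy data, made up for illustration: 1 = favorable label / privileged group.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

uniform = [1.0] * len(labels)
weights = reweighing_weights(labels, groups)

gap_before = (weighted_favorable_rate(labels, groups, uniform, 0)
              - weighted_favorable_rate(labels, groups, uniform, 1))
gap_after = (weighted_favorable_rate(labels, groups, weights, 0)
             - weighted_favorable_rate(labels, groups, weights, 1))
print("gap before:", gap_before, "gap after:", gap_after)
```

The gap only closes when the weights are actually used. With a scikit-learn style estimator, for example, that would mean passing them through `fit(X, y, sample_weight=...)`; a classifier that ignores weights sees the original, unmitigated data.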

hoffmansc · 2018-12-05T16:30:47Z

Alternatively, we could run it on the Adult dataset which seems to be much more stable and effective.
tutorial_credit_scoring_adult.ipynb.zip

scottdangelo · 2018-12-11T18:16:24Z

Any update on fixing this? We've a code pattern on developer.ibm.com that uses this notebook, and it shows that using AIF360 makes fairness worse.

nrkarthikeyan · 2019-01-20T21:47:40Z

We have modified the tutorial and this issue is fixed.

michaelhind changed the title ~~Executing the Credit Risk notebook does generate a de-biased dataset~~ Executing the Credit Risk notebook does not generate a de-biased dataset Nov 16, 2018

scottdangelo mentioned this issue Dec 11, 2018

Difference between the privileged and the unprivileged groups increased after running the mitigation IBM/ensure-loan-fairness-aif360#25

Closed

hoffmansc mentioned this issue Jan 15, 2019

Credit tutorial reweighing #62

Merged

nrkarthikeyan closed this as completed Jan 20, 2019

scottdangelo added a commit to IBM/ensure-loan-fairness-aif360 that referenced this issue Feb 26, 2019

Updates to notebook.

06a675c

Port changes from Trusted-AI/AIF360#51. Switch to Reweighing pre-processing algorithm. Closes: #25

scottdangelo added a commit to IBM/ensure-loan-fairness-aif360 that referenced this issue Feb 26, 2019

Updates to notebook.

6a23e8b

Port changes from Trusted-AI/AIF360#51. Switch to Reweighing pre-processing algorithm. Closes: #25

scottdangelo mentioned this issue Feb 26, 2019

Updates to notebook. IBM/ensure-loan-fairness-aif360#32

Merged
