
Fairness Aware Counterfactuals for Subgroups #457

Merged: 53 commits into Trusted-AI:main on Mar 11, 2024

Conversation

@phantom-duck (Contributor) commented Jun 29, 2023

@hoffmansc

Hello! We would like to contribute the implementation for the paper "Fairness Aware Counterfactuals for Subgroups".

Fairness Aware Counterfactuals for Subgroups (FACTS) is a framework for auditing subgroup fairness through counterfactual explanations. We aim to (a) formulate different aspects of the difficulty individuals in certain subgroups face in achieving recourse, i.e., receiving the desired outcome, either at the micro level (considering members of the subgroup individually) or at the macro level (considering the subgroup as a whole), and (b) introduce notions of subgroup fairness that are robust, if not entirely oblivious, to the cost of achieving recourse.

For the implementation, we expose to the user a set of functions that perform the main steps of our algorithm; these are grouped in the wrapper class FairnessAwareSubgroupCounterfactuals in the facts submodule's __init__.
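As a rough illustration of the intended usage (the module path, constructor arguments, and toy data below are indicative only, not the final API):

```python
# Indicative usage sketch; module path, constructor arguments, and data
# are illustrative, not the final API.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from aif360.sklearn.detectors.facts import FairnessAwareSubgroupCounterfactuals

# Toy data: two features plus a binary protected attribute "sex".
X = pd.DataFrame({"age": [25, 40, 33, 51, 29, 62],
                  "income": [20, 60, 35, 80, 25, 90],
                  "sex": [0, 1, 0, 1, 0, 1]})
y = [0, 1, 0, 1, 0, 1]

clf = LogisticRegression().fit(X, y)

detector = FairnessAwareSubgroupCounterfactuals(clf=clf, prot_attr="sex")
detector.fit(X)                                   # mine subgroups and candidate actions
detector.bias_scan(metric="equal-effectiveness",  # score subgroups with a chosen metric
                   viewpoint="macro")
detector.print_recourse_report()                  # textual explanation of the scores
```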

Closes #456

phantom-duck and others added 18 commits June 29, 2023 15:55
@phantom-duck (Contributor, Author)

Amended the last commit; I had forgotten to sign off.

We are looking forward to your feedback!

@phantom-duck (Contributor, Author) commented Aug 29, 2023

@hoffmansc (code review request)

Hello again! I would just like to remind you, when you get a chance, to take a look at the pull request and let us know of any corrections or improvements that may be needed, or to merge it if you believe it is adequate. In any case, we look forward to your feedback!

@hoffmansc (Collaborator)

Hi. First of all, thanks for contributing; this is great stuff! I have a couple of comments before we get into the specifics of the code.

I see you’ve placed the code under algorithms.postprocessing. This subpackage is for bias mitigation algorithms which transform predictions and return fairer predictions. To me, this seems different from what FACTS is doing. I think there are three places where the contributions from this work should get exposed: sklearn.detectors, sklearn.metrics, and sklearn.explainers (as a side note, we can focus on the aif360.sklearn subpackage, which will be much more straightforward for this work).

explainers is the clearest match in my mind. Currently, there is one other PR under review that will also go here (#450). Basically, your whole reporting function can go here. The idea here is to offer textual or visual explanations for bias measurements/metrics.

Correspondingly, I think there should be a function under metrics which just outputs the unfairness score as a number, i.e., contribution (b) you mention. This could be either the highest score overall or the score for a given subgroup. This way, we can easily compare classifiers directly.

This brings me to the final location, detectors. We currently have one detector, MDSS, and the idea behind this class is to discover subgroups which are maximally advantaged/disadvantaged. From my understanding of FACTS, you similarly generate subgroups and further divide them according to a protected attribute. The outputs here would simply be the subgroup and score, not the entire explanation.

Essentially, I’m trying to think about the various ways a user might want to utilize this work as well as how they would discover it according to the taxonomy we have in AIF360.

Feel free to disagree if I’m misunderstanding anything. Once we settle these high-level questions, I’ll take a closer look at the code and make specific comments.
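To make that split concrete, the kind of layout I'm imagining (purely illustrative; none of these names exist yet):

```python
# Purely illustrative: one hypothetical entry point per subpackage.
from aif360.sklearn.detectors import facts_bias_scan         # subgroup + score
from aif360.sklearn.metrics import facts_unfairness          # single comparable number
from aif360.sklearn.explainers import facts_recourse_report  # textual explanation

subgroup, score = facts_bias_scan(X, clf, prot_attr="sex")   # who is most disadvantaged
worst_score = facts_unfairness(X, clf, prot_attr="sex")      # one number per classifier
print(facts_recourse_report(X, clf, prot_attr="sex"))        # why that score
```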

@phantom-duck (Contributor, Author) commented Sep 4, 2023

Note: we have edited the original comment; we thought that some phrases were a poor choice of words for conveying our thoughts.

Hello again, and thank you very much for the reply!

Firstly, we definitely agree that FACTS is not a good fit for algorithms.postprocessing.

That being said, on a higher level, we think of FACTS primarily as a detector. Although it is tightly connected to the metrics and the respective explanations, the primary goal of our work is to discover subgroups of the population where the model exhibits bias. Our framework implements six detectors:

  • Equal Effectiveness
  • Equal Choice for Recourse
  • Equal Effectiveness within Budget
  • Equal Cost of Effectiveness
  • Fair Effectiveness-Cost Trade-Off
  • Equal (Conditional) Mean Recourse

When someone chooses a detector, the metric of that detector is used to score the unfairness of all available subgroups, which are then ranked in decreasing order of unfairness (more unfair subgroups appear higher in the ranking). On the explanation side, the report explains why each subgroup received its specific unfairness score under the detector's metric; it is there for the interpretability and transparency of our algorithm's results. Thus, each detector has its own metric that scores subgroups and its own explainer that explains the unfairness based on those scores.

Looking forward to your view on the matter! In the meantime, we will make sure to rework all of this before the more detailed review you mentioned.

Credits to @dsachar.
@rahulnair23 left a comment:

Hi @phantom-duck - thanks! This is looking good. Please see some general comments.

Review threads (resolved) on aif360/sklearn/detectors/facts/formatting.py, examples/demo_FACTS.ipynb, and aif360/sklearn/detectors/facts/__init__.py.
On the bias_scan signature:

```python
def bias_scan(
    self,
    metric: str = "equal-effectiveness",
    viewpoint: str = "macro",
```

What does viewpoint mean?

@phantom-duck (Contributor, Author) replied:
It refers to the notions of "macro viewpoint" and "micro viewpoint" defined in Section 2.2 of the paper. As a short explanation, consider a set of actions A and a subgroup (cohort / set of individuals) G. Metrics with the macro-viewpoint interpretation are constrained to always apply one action from A to the entire G, while metrics with the micro interpretation are allowed to give each individual in G the minimum-cost action from A that changes that individual's class.
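As a toy sketch of the distinction (all names here are hypothetical, not from our implementation):

```python
# Toy sketch only, not the actual implementation. A is a list of candidate
# actions; G is the subgroup; flips(a, x) says whether applying action a to
# individual x changes the model's prediction; cost(a) is the action's cost.

def macro_effectiveness(A, G, flips):
    # macro: a single action from A must be applied to the entire subgroup,
    # so we take the best single action's flip rate over G
    return max(sum(flips(a, x) for x in G) / len(G) for a in A)

def micro_effectiveness(A, G, flips, cost, budget):
    # micro: each individual may pick the cheapest action in A that works
    # for them; we count how many achieve recourse within the budget
    def min_recourse_cost(x):
        costs = [cost(a) for a in A if flips(a, x)]
        return min(costs) if costs else float("inf")
    return sum(min_recourse_cost(x) <= budget for x in G) / len(G)
```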

I should mention, though, that not all combinations of metric and viewpoint are valid; e.g., "Equal Choice for Recourse" only has a macro interpretation. Do you think this should be reflected in the code somehow?

Collaborator reply:

Could you just add a sentence to this effect in the docstring?

@hoffmansc (Collaborator)

Thanks for reviewing, @rahulnair23.

One other thing, @phantom-duck: would it be possible to combine the fit(), bias_scan(), and print_recourse_report() methods into a single function, without the need for a class? This would better match the detectors API. Right now, I don't really see the point of making three separate calls when only one has outputs. Is it likely that users would run multiple metrics after fitting? The function could have a verbose flag to show the progress bars and a print_recourse_report flag to show the report; otherwise it would just output the highest-scoring subgroup and score, like MDSS. If it's easier, you could also keep the class and implement a wrapper function which does this.
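Roughly, something like this (a sketch only; all names and signatures are indicative, assuming the wrapper class is importable as FACTS):

```python
def FACTS_bias_scan(X, clf, prot_attr, metric="equal-effectiveness",
                    viewpoint="macro", verbose=False,
                    print_recourse_report=False, **facts_params):
    # Sketch: one call that fits, scans, and optionally prints the report.
    detector = FACTS(clf=clf, prot_attr=prot_attr, **facts_params)
    detector.fit(X, verbose=verbose)           # verbose toggles the progress bars
    detector.bias_scan(metric=metric, viewpoint=viewpoint)
    if print_recourse_report:
        detector.print_recourse_report()
    return detector.most_biased_subgroups()    # hypothetical accessor: subgroup(s) + score, like MDSS
```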

@phantom-duck (Contributor, Author) commented Nov 17, 2023

Hello to both! Thank you for the comments, and sorry for the delayed response.

@rahulnair23 Mainly, thank you very much for the detailed review! I have replied to the comments individually, with the exception of the demo notebook, where the points you raise are interesting but seem trickier to me (especially the second), so I will get back to you on that soon.

@hoffmansc Ok, thanks for this! However, it is not perfectly clear to me what the structure of the detectors API is exactly. For example, I do not see how the signature of the MDSS function sklearn.detectors.detectors.bias_scan() could be applied in our case. The way I see it, the function you describe would mainly need to take as arguments a dataset X and the trained model clf, plus some other more specific parameters. Would something along these lines be okay for what you are thinking?

More importantly, yes, exactly: in our mind, it is entirely possible that the user would run multiple metrics after fitting! Firstly, because fitting takes much longer to run, while the metrics run almost instantaneously. Secondly, because there are several metrics, each with a few extra parameters; we would expect (at least some) users to choose a few different overall parameterizations, based on the metric descriptions we give in the demo (or in the paper) and on domain intuition, and inspect their results. These could very well yield different unfair subgroups.
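For example, reusing one fitted detector across metrics (the metric-name strings below are only indicative, patterned after the "equal-effectiveness" default):

```python
detector.fit(X)  # the expensive step, run once
for metric in ("equal-effectiveness",
               "equal-cost-of-effectiveness",
               "fair-effectiveness-cost-trade-off"):
    # scoring is near-instantaneous compared to fit()
    detector.bias_scan(metric=metric, viewpoint="macro")
    detector.print_recourse_report()
```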

Because of this, I think it best to keep the class and, of course, implement an additional wrapper function as you suggest.

Thank you both again very much!

I will get back to you soon with the modifications and/or further comments. Meanwhile, it goes without saying that we are always available for further discussion.

@hoffmansc (Collaborator)

> Ok, thanks for this! However, it is not perfectly clear to me what the structure of the detectors API is exactly. For example, I do not see how the signature of the MDSS function sklearn.detectors.detectors.bias_scan() could be applied in our case. The way I see it, the function you describe would mainly need to take as arguments a dataset X and the trained model clf, plus some other more specific parameters. Would something along these lines be okay for what you are thinking?

Whatever you need as inputs is fine. I was more referring to the output and function vs. class method(s) paradigm matching MDSS.

As a separate note, is there a more specific/descriptive name than bias_scan? Just to differentiate it from the existing detector. Or we could rename that one to MDSS_bias_scan and this one to FACTS_bias_scan or similar, I suppose.

> More importantly, yes, exactly: in our mind, it is entirely possible that the user would run multiple metrics after fitting! Firstly, because fitting takes much longer to run, while the metrics run almost instantaneously. Secondly, because there are several metrics, each with a few extra parameters; we would expect (at least some) users to choose a few different overall parameterizations, based on the metric descriptions we give in the demo (or in the paper) and on domain intuition, and inspect their results. These could very well yield different unfair subgroups.
>
> Because of this, I think it best to keep the class and, of course, implement an additional wrapper function as you suggest.

This makes sense, thanks.

@phantom-duck (Contributor, Author) commented Nov 22, 2023

@hoffmansc Hello again!

Ok, that is great, then I will add the wrapper function and get back to you!

As for the name of bias_scan, I do not have anything much better in mind. The original intention was to match the name of the MDSS function, but in a separate class. Perhaps find_most_biased_subgroups or something along those lines would be more informative? I will think about it a bit more. Nonetheless, FACTS_bias_scan also seems perfectly fine to me; I leave it up to you.

They define at one point a global warning filter, which then influences all DeprecationWarnings.

easy2hard_name_map was a dictionary of aliases for metric names, used in the wrapper class of the method, FACTS. Being unnecessary and slightly confusing, it was removed, and the names were changed accordingly.

Added the function FACTS_bias_scan as an alternative API, closer to the API of the other detectors.
@phantom-duck (Contributor, Author)

@hoffmansc The wrapper function is now implemented! I have also added a small example invocation in the FACTS_demo notebook. Check it out when you get the chance, and let me know of any other changes that need to be made.

PS: for now I have named it FACTS_bias_scan, but, as I said, other suggestions are more than welcome.
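For reference, an invocation looks roughly like this (the import path and exact argument names are indicative; see the notebook for the actual snippet):

```python
from aif360.sklearn.detectors import FACTS_bias_scan  # import path indicative

# Indicative call; argument names may differ from the merged API.
FACTS_bias_scan(X=X, clf=clf, prot_attr="sex",
                metric="equal-effectiveness", viewpoint="macro",
                verbose=True, print_recourse_report=True)
```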

Several points of the final output of the method were not clear, e.g. why the unfairness score equals a certain value.
@phantom-duck (Contributor, Author) commented Feb 8, 2024

@hoffmansc @rahulnair23 Hello again to both! Just a quick reminder, whenever you get a chance, to review the latest updates to the pull request and let me know of any necessary changes. We remain at your disposal for anything further.

PS: albeit a little late, have a healthy, happy, and fruitful 2024! :)

@rahulnair23
Looks good to me.

@rahulnair23
Hi @hoffmansc - is this good to merge?

@hoffmansc (Collaborator) left a comment:

Other than these tiny things, it looks good to me too.

Two review threads (resolved) on aif360/sklearn/detectors/__init__.py.
Specifically, re-exported the function from sklearn.detectors.__init__.
@phantom-duck (Contributor, Author)

@hoffmansc

Hello! That is great, thank you!

I have addressed your small comments. I believe everything should be okay, but, as I mentioned above for the third one, perhaps you should take a look to make sure it is sufficient.

Other than that, I have one last remark I would like to ask you about: the new wrapper function FACTS_bias_scan currently lacks any documentation. Until now I assumed this was OK, because it simply wraps the FACTS class and its methods. Is this true, or should we add a little something?

@hoffmansc (Collaborator)

> Other than that, I have one last remark I would like to ask you about: the new wrapper function FACTS_bias_scan currently lacks any documentation. Until now I assumed this was OK, because it simply wraps the FACTS class and its methods. Is this true, or should we add a little something?

Yes, that's a good point. It would be great if you can add that. It should basically be copy-paste. Thanks!

@phantom-duck (Contributor, Author) commented Mar 10, 2024

@hoffmansc Hello again!

This is also done now, so I believe everything should be in order. We look forward to seeing our contribution added to the aif360 code base! Of course, we remain at your disposal for anything further in the meantime.

@hoffmansc merged commit a4d18f2 into Trusted-AI:main on Mar 11, 2024. 9 checks passed.
@phantom-duck (Contributor, Author) commented Mar 15, 2024

@hoffmansc @rahulnair23

Hello again. Firstly, thank you both very much for your guidance and cooperation in bringing our contribution to the toolkit!

We also have one question: we would like to be able to link to the FACTS section from the official aif360 documentation. It will be added there eventually, right? Could you give an estimate of when this will happen?

Thank you again very much for everything! :)

@phantom-duck deleted the master branch April 1, 2024.