
ENH added agg argument to equalized odds difference and ratio to support "average odds" #960

Status: Open. Wants to merge 7 commits into base main.
40 changes: 40 additions & 0 deletions fairlearn/metrics/_disparities.py
@@ -7,6 +7,46 @@
from ._metric_frame import MetricFrame


def average_odds_difference(
@MiroDudik (Member) commented on Oct 1, 2021:
I'm fine including the two metrics: the goal, and even the behavior, is very similar to our existing equalized_odds_difference and equalized_odds_ratio, in the sense that average_odds_difference == 0 or average_odds_ratio == 1 indicates that there is no disparity according to the equalized odds criterion.

But I think that the name is confusing for two reasons:

  • In statistics, "odds" is defined as the ratio P(event) / (1 − P(event)), and we are definitely not averaging such odds.
  • It turns out that AIF360's average_odds_difference calculates something different from this function! And it's not just the usual distinction due to their use of privileged/unprivileged groups; it's also the fact that AIF360's definition allows the signs of the two differences to "cancel out", whereas we take the absolute value of the differences before averaging.
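The cancellation concern can be illustrated with made-up rates (the numbers below are hypothetical, not taken from either library):

```python
# Hypothetical per-group rates chosen to show sign cancellation.
tpr_priv, tpr_unpriv = 1.0, 0.75   # TPR difference: -0.25
fpr_priv, fpr_unpriv = 0.25, 0.5   # FPR difference: +0.25

# AIF360-style signed average: the two disparities cancel out.
signed_avg = ((tpr_unpriv - tpr_priv) + (fpr_unpriv - fpr_priv)) / 2
print(signed_avg)  # 0.0, despite clear disparity in both rates

# Averaging the absolute differences (this PR's approach) does not cancel.
abs_avg = (abs(tpr_unpriv - tpr_priv) + abs(fpr_unpriv - fpr_priv)) / 2
print(abs_avg)  # 0.25
```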

So my suggestion would be not to create a new function, but instead to add a new optional argument to our existing equalized_odds_difference and equalized_odds_ratio. For example, we could add an optional parameter called aggregation_method, with a default value of worst_case and a new possible value of average. The aggregation_method would simply specify what happens with the FPR_difference and FNR_difference: whether we take the worst case or the average.

Thoughts?

[tagging @romanlutz @hildeweerts]

A Contributor commented:

I'm not too bothered by the name with respect to the normal usage of the word "odds"; I think we can blame the person who came up with equalized odds for that. I am very surprised that AIF360 would allow for "cancelling out" the differences... isn't that the whole point of considering the FPR and TPR separately instead of looking at overall accuracy?

I like the proposed solution, @MiroDudik. In particular, the use of worst_case would result in fewer mistakes compared to min and max, I think.

A Member commented:

Also, if folks are in favor of just adding an optional argument to equalized_odds_{difference,ratio}, I'd be open to having something less verbose, e.g., we could follow pandas conventions and use agg={worst_case,mean}.

A Contributor commented:

Given the number of thumbs up, I think we can move forward with @MiroDudik's suggestion.

@IanEisenberg would you like to implement the suggestion? If you don't have the time/interest that is of course perfectly fine (in that case we will just open a new issue).

The Author (@IanEisenberg) commented:

I'm happy to implement Miro's suggestion. I'll get to it ASAP!

A Contributor commented:

Awesome, @IanEisenberg! Feel free to ask questions if anything is unclear.

A Member commented:

Since @IanEisenberg couldn't continue on this, I've picked it up today and continued along the lines discussed in this thread. Let me know if you have any concerns! @fairlearn/fairlearn-maintainers

        y_true,
        y_pred,
        *,
        sensitive_features,
        method='between_groups',
        sample_weight=None) -> float:
    """Calculate the average odds difference.

    The average of two metrics: ``true_positive_rate_difference`` and
    ``false_positive_rate_difference``. The former is the difference between the
    largest and smallest of :math:`P[h(X)=1 | A=a, Y=1]`, across all values :math:`a`
    of the sensitive feature(s). The latter is defined similarly, but for
    :math:`P[h(X)=1 | A=a, Y=0]`.
    An average odds difference of 0 means that all groups have the same
    true positive, true negative, false positive, and false negative rates.

    Parameters
    ----------
    y_true : array-like
        Ground truth (correct) labels.
    y_pred : array-like
        Predicted labels :math:`h(X)` returned by the classifier.
    sensitive_features : array-like
        The sensitive features over which the average odds difference
        should be assessed.
    method : str
        How to compute the differences. See
        :func:`fairlearn.metrics.MetricFrame.difference` for details.
    sample_weight : array-like
        The sample weights.

    Returns
    -------
    float
        The average odds difference.
    """
    eo = _get_eo_frame(y_true, y_pred, sensitive_features, sample_weight)

    return sum(eo.difference(method=method)) / 2
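For illustration, the quantity this function computes can be reproduced by hand for two groups. The following is a self-contained sketch with toy data that does not call fairlearn; the helper name `rate` is made up for this example:

```python
import numpy as np

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0])
sensitive = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

def rate(group, label):
    """Selection rate P[h(X)=1 | A=group, Y=label]."""
    mask = (sensitive == group) & (y_true == label)
    return y_pred[mask].mean()

# Per-group true positive and false positive rates.
tpr = {g: rate(g, 1) for g in ("a", "b")}   # a: 0.5, b: 1.0
fpr = {g: rate(g, 0) for g in ("a", "b")}   # a: 0.5, b: 0.0

# Between-groups differences (max minus min), then averaged.
tpr_diff = max(tpr.values()) - min(tpr.values())  # 0.5
fpr_diff = max(fpr.values()) - min(fpr.values())  # 0.5
average_odds_diff = (tpr_diff + fpr_diff) / 2
print(average_odds_diff)  # 0.5
```

Note that with absolute between-groups differences, this matches the "average of absolute differences" semantics discussed in the review thread, as opposed to a signed average that could cancel.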


def demographic_parity_difference(
y_true,
y_pred,
Expand Down