---
author: Christian Kaestner and Claire Le Goues
title: "MLiP: Measuring Fairness"
semester: Spring 2023
footer: Machine Learning in Production/AI Engineering • Christian Kaestner, Carnegie Mellon University • Spring 2023
license: Creative Commons Attribution 4.0 International (CC BY 4.0)
---

Machine Learning in Production

Measuring Fairness


Diving into Fairness...

Overview of course content


Reading

Required:

Recommended:


Learning Goals

  • Understand different definitions of fairness
  • Discuss methods for measuring fairness
  • Outline interventions to improve fairness at the model level

Fairness: Definitions

How do we measure fairness of an ML model?


Fairness is still an actively studied & disputed concept!

Source: Moritz Hardt, https://fairmlclass.github.io/


Fairness: Definitions

  • Anti-classification (fairness through blindness)
  • Group fairness (independence)
  • Equalized odds (separation)
  • ...and numerous others and variations!

Running Example: Mortgage Applications

  • Large loans repaid over long periods
  • Home ownership is key path to build generational wealth
  • Past decisions often discriminatory (redlining)
  • Replace biased human decisions with an accurate ML model
    • income, other debt, home value
    • past debt and payment behavior (credit score)

Recall: What is fair?

Fairness discourse asks questions about how to treat people and whether treating different groups of people differently is ethical. If two groups of people are systematically treated differently, this is often considered unfair.


Recall: What is fair?

Contrasting equality, equity, and justice


What is fair in mortgage applications?

  1. Distribute loans equally across all groups of protected attribute(s) (e.g., ethnicity)
  2. Prioritize those who are more likely to pay back (e.g., higher income, good credit history)

Note: slack vote


Caveat on Intersectionality

Individuals can and do fall into multiple groups!

Subgroup fairness gets extremely technically complicated quickly.

We therefore focus on the simple cases for the purposes of the material in this class.


Redlining

Redlining

Withhold services (e.g., mortgages, education, retail) from people in neighborhoods deemed "risky"

Map of Philadelphia, 1936, Home Owners' Loan Corps. (HOLC)

  • Classification based on estimated "riskiness" of loans

Past bias, different starting positions

Severe disparities in median income and net worth between white and Black households

Source: Federal Reserve’s Survey of Consumer Finances

Notes: much of fairness discourse here is trying to account for unequal starting positions


Fairness: Definitions

  • Anti-classification (fairness through blindness)
  • Group fairness (independence)
  • Equalized odds (separation)
  • ...and numerous others and variations!

Anti-Classification

  • Also called fairness through blindness or fairness through unawareness
  • Ignore certain sensitive attributes when making a decision
  • Example: Remove gender and race from mortgage model

Anti-Classification: Example

appraisal

"After Ms. Horton removed all signs of Blackness, a second appraisal valued a Jacksonville home owned by her and her husband, Alex Horton, at 40 percent higher."

https://www.nytimes.com/2022/03/21/realestate/remote-home-appraisals-racial-bias.html


Anti-Classification

Easy to implement, but any limitations?


Recall: Proxies

Features correlate with protected attributes


Also, recall: Not all discrimination is harmful

  • Loan lending: Gender discrimination is illegal.
  • Medical diagnosis: Gender-specific diagnosis may be desirable.
  • ML models discriminate based on input data by construction.
  • The problem is unjustified differentiation; i.e., discriminating on factors that should not matter
  • Discrimination is a domain-specific concept

Anti-Classification

  • Ignore certain sensitive attributes when making a decision
  • Advantage: Easy to implement and test
  • Limitations
    • Sensitive attributes may be correlated with other features
    • Some ML tasks need sensitive attributes (e.g., medical diagnosis)

Ensuring Anti-Classification

How to train models that are fair w.r.t. anti-classification?


Ensuring Anti-Classification

How to train models that are fair w.r.t. anti-classification?

--> Simply remove features for protected attributes from training and inference data

--> Null/randomize protected attribute during inference

(does not account for correlated attributes, is not required to)
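A minimal sketch of both options, assuming a small synthetic pandas DataFrame with a hypothetical protected column `gender` and a scikit-learn logistic regression (names and data are purely illustrative, not part of the mortgage example above):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical synthetic mortgage-style data; "gender" stands in for the protected attribute
rng = np.random.default_rng(0)
n = 1000
data = pd.DataFrame({
    "income": rng.normal(60, 15, n),
    "credit_score": rng.normal(650, 80, n),
    "gender": rng.integers(0, 2, n),
})
label = (data["income"] + 0.1 * data["credit_score"] > 125).astype(int)

# Option 1: remove the protected attribute from training and inference data
X_blind = data.drop(columns=["gender"])
model = LogisticRegression().fit(X_blind, label)
pred = model.predict(X_blind)

# Option 2: keep the column, but null/randomize it at inference time
model_full = LogisticRegression().fit(data, label)
data_rand = data.copy()
data_rand["gender"] = rng.permutation(data_rand["gender"].to_numpy())
pred_rand = model_full.predict(data_rand)
```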


Testing Anti-Classification

How do we test that a classifier achieves anti-classification?


Testing Anti-Classification

Straightforward invariant for classifier $f$ and protected attribute $p$:

$\forall x. f(x[p\leftarrow 0]) = f(x[p\leftarrow 1])$

(does not account for correlated attributes, is not required to)

Test with any test data, e.g., purely random data or existing test data

Any single inconsistency shows that the protected attribute was used. Can also report percentage of inconsistencies.

See for example: Galhotra, Sainyam, Yuriy Brun, and Alexandra Meliou. "Fairness testing: testing software for discrimination." In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pp. 498-510. 2017.
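A minimal test sketch of this invariant, assuming a binary protected attribute and any scikit-learn-style model with a `predict` function (the `model_full` and `data` names in the commented usage refer to the hypothetical setup sketched earlier):

```python
import pandas as pd

def anticlassification_violations(model, X: pd.DataFrame, protected: str = "gender"):
    """Flip the protected attribute for every row and compare predictions.
    Any inconsistency shows that the protected attribute influenced the decision."""
    X0 = X.copy()
    X0[protected] = 0
    X1 = X.copy()
    X1[protected] = 1
    inconsistent = model.predict(X0) != model.predict(X1)
    return inconsistent.mean()  # fraction of inconsistent predictions

# With the model trained including "gender" from the earlier sketch:
# rate = anticlassification_violations(model_full, data)   # typically > 0
```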


Anti-Classification Discussion

Testing of anti-classification is barely needed, because it is easy to ensure by construction during training or inference!

Anti-classification is a good starting point to think about protected attributes

Useful baseline for comparison

Easy to implement, but only effective if (1) no proxies among features and (2) protected attributes add no predictive power


Fairness: Definitions

  • Anti-classification (fairness through blindness)
  • Group fairness (independence)
  • Equalized odds (separation)
  • ...and numerous others and variations!

Group fairness

Key idea: Outcomes matter, not accuracy!

Compare outcomes across two groups

  • Similar rates of accepted loans across racial/gender groups?
  • Similar chance of being hired/promoted between gender groups?
  • Similar rates of (predicted) recidivism across racial groups?

Disparate impact vs. disparate treatment

Disparate treatment: Practices or rules that treat certain protected groups differently from others

  • e.g., Apply different mortgage rules for people from different backgrounds

Disparate impact: Neutral rules, but outcome is worse for one or more protected groups

  • Same rules are applied, but certain groups have a harder time obtaining a mortgage in particular neighborhoods

Group fairness in discrimination law

Relates to disparate impact and the four-fifths rule

Can sue organizations for discrimination if they

  • mostly reject job applications from one minority group (identified by protected classes) and hire mostly from another
  • reject most loans from one minority group and more frequently accept applicants from another
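For example, if 30% of applicants from one group are hired but only 20% from another, the selection-rate ratio 0.20/0.30 ≈ 0.67 falls below the four-fifths (0.8) threshold and is commonly treated as evidence of disparate impact.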

Notation

  • $X$: Feature set (e.g., age, race, education, region, income, etc.)
  • $A \in X$: Sensitive attribute (e.g., gender)
  • $R$: Regression score (e.g., predicted likelihood of on-time loan payment)
  • $Y'$: Classifier output
    • $Y' = 1$ if and only if $R > T$ for some threshold $T$
    • e.g., Grant the loan ($Y' = 1$) if the likelihood of paying back > 80%
  • $Y$: Target variable being predicted ($Y = 1$ if the person actually pays back on time)

Setting classification thresholds: Loan lending example


Group Fairness

$P[Y' = 1 | A = a] = P[Y' = 1 | A = b]$

  • Also called independence or demographic parity
  • Mathematically, $Y' \perp A$
    • Prediction ($Y'$) must be independent of the sensitive attribute ($A$)
  • Examples:
    • The predicted rate of recidivism is the same across all races
    • Women and men have an equal probability of being promoted
      • i.e., P[promote = 1 | gender = M] = P[promote = 1 | gender = F]

Notes: probability is the same across all groups


Group Fairness Limitations

What are limitations of group fairness?


Group Fairness Limitations

  • Ignores possible correlation between $Y$ and $A$
    • Rules out perfect predictor $Y' = Y$ when $Y$ & $A$ are correlated!
  • Permits abuse and laziness: Can be satisfied by randomly assigning a positive outcome ($Y' = 1$) to protected groups
    • e.g., Randomly promote people (regardless of their job performance) to match the rate across all groups

Notes: firing practices


Adjusting Thresholds for Group Fairness

Select different classification thresholds ($t_0$, $t_1$) for different groups (A = 0, A = 1) to achieve group fairness, such that $P[R > t_0 | A = 0] = P[R > t_1 | A = 1]$

Example: Mortgage application

  • R: Likelihood of paying back the loan on time
  • Suppose: With a uniform threshold (T = 0.8), group fairness is not achieved
    • P[R > 0.8 | A = 0] = 0.4, P[R > 0.8 | A = 1] = 0.7
  • Adjust thresholds to achieve group fairness
    • P[R > 0.6 | A = 0] = P[R > 0.8 | A = 1]
  • Wouldn't group A = 1 argue it's unfair? When does this type of adjustment make sense?
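One way to compute such group-specific thresholds is to match quantiles of the score distributions; a minimal sketch, assuming hypothetical score arrays for the two groups (the beta distributions are made up for illustration):

```python
import numpy as np

def matching_threshold(scores_a, scores_b, t_b=0.8):
    """Pick a threshold t_a for group A=0 so that its positive rate
    matches group A=1 classified at threshold t_b (group fairness)."""
    target_rate = np.mean(scores_b > t_b)        # P[R > t_b | A = 1]
    return np.quantile(scores_a, 1 - target_rate)

# Hypothetical score distributions where group A=0 scores lower on average
rng = np.random.default_rng(1)
scores_a = rng.beta(4, 3, 10_000)   # A = 0
scores_b = rng.beta(6, 2, 10_000)   # A = 1
t_a = matching_threshold(scores_a, scores_b)
print(np.mean(scores_a > t_a), np.mean(scores_b > 0.8))  # approximately equal rates
```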

Testing Group Fairness

How would you test whether a classifier achieves group fairness?


Testing Group Fairness

Collect realistic, representative data (not randomly generated!)

  • Use existing validation/test data
  • Monitor production data
  • (Somehow) generate realistic test data, e.g., from the probability distribution of the population

Separately measure the rate of positive predictions

  • e.g., P[promoted = 1 | gender = M], P[promoted = 1 | gender = F] = ?

Report issue if the rates differ beyond some threshold $\epsilon$ across groups
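A minimal measurement sketch, assuming predictions and group labels as arrays or pandas Series (the commented usage reuses the hypothetical `pred` and `data` from the earlier anti-classification sketch; the epsilon of 0.05 is an arbitrary illustration):

```python
import pandas as pd

def positive_rate_gap(y_pred, group):
    """Rate of positive predictions per group and the largest pairwise gap."""
    rates = pd.DataFrame({"pred": y_pred, "group": group}).groupby("group")["pred"].mean()
    return rates, rates.max() - rates.min()

# rates, gap = positive_rate_gap(pred, data["gender"])
# if gap > 0.05:           # epsilon chosen per context
#     print("group fairness violation:", rates.to_dict())
```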


Equalized odds

  • Anti-classification (fairness through blindness)
  • Group fairness (independence)
  • Equalized odds (separation)
  • ...and numerous others and variations!

Equalized odds

Key idea: Focus on accuracy (not outcomes) across two groups

  • Similar default rates on accepted loans across racial/gender groups?
  • Similar rate of "bad hires" and "missed stars" between gender groups?
  • Similar accuracy of predicted recidivism vs actual recidivism across racial groups?

Accuracy matters, not outcomes!


Equalized odds in discrimination law

Relates to disparate treatment

Typically, lawsuits claim that protected attributes (e.g., race, gender) were used in decisions even though they were irrelevant

  • e.g., fired over a complaint because of being Latino, whereas White employees with similar complaints were not fired

Must prove that the defendant had intention to discriminate

  • Often difficult: Relying on shifting justifications, inconsistent application of rules, or explicit remarks overheard or documented

Equalized odds

$P[Y'=1 | Y=0, A=a] = P[Y'=1 | Y=0, A=b]$

$P[Y'=0 | Y=1, A=a] = P[Y'=0 | Y=1, A=b]$

Statistical property of separation: $Y' \perp A | Y$

  • Prediction must be independent of the sensitive attribute conditional on the target variable

Review: Confusion Matrix

Can we explain separation in terms of model errors?

  • $P[Y'=1 | Y=0, A=a] = P[Y'=1 | Y=0, A=b]$
  • $P[Y'=0 | Y=1, A=a] = P[Y'=0 | Y=1, A=b]$

Separation

$P[Y'=1 | Y=0, A=a] = P[Y'=1 | Y=0, A=b]$ (FPR parity)

$P[Y'=0 | Y=1, A=a] = P[Y'=0 | Y=1, A=b]$ (FNR parity)

  • $Y' \perp A | Y$: Prediction must be independent of the sensitive attribute conditional on the target variable
  • i.e., All groups are susceptible to the same false positive/negative rates
  • Example: Y': Promotion decision, A: Gender of applicant, Y: Actual job performance

Testing Separation

Requires realistic, representative test data (e.g., production telemetry or a representative test set, not randomly generated data)

Separately measure false positive and false negative rates

  • e.g., for FNR, compare P[promoted = 0 | female, good employee] vs. P[promoted = 0 | male, good employee]

How is this different from testing group fairness?

Notes: needs ground-truth labels, which are hard to obtain in some applications
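A minimal sketch of this per-group error-rate comparison, assuming ground-truth labels are available (the variable names in the commented usage refer to the hypothetical setup from the earlier sketches):

```python
import numpy as np
import pandas as pd

def error_rates_by_group(y_true, y_pred, group):
    """False positive and false negative rates per group (separation check)."""
    df = pd.DataFrame({"y": np.asarray(y_true), "pred": np.asarray(y_pred),
                       "group": np.asarray(group)})
    rows = {}
    for g, sub in df.groupby("group"):
        rows[g] = {
            "FPR": np.mean(sub.loc[sub["y"] == 0, "pred"] == 1),
            "FNR": np.mean(sub.loc[sub["y"] == 1, "pred"] == 0),
        }
    return pd.DataFrame(rows).T

# print(error_rates_by_group(label, pred, data["gender"]))
```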


Breakout: Cancer Prognosis

In groups, post to #lecture tagging members:

  • Does the model meet anti-classification fairness w.r.t. gender?
  • Does the model meet group fairness?
  • Does the model meet equalized odds?
  • Is the model fair enough to use?

Notes: prob cancer male vs female


Other fairness measures

  • Anti-classification (fairness through blindness)
  • Group fairness (independence)
  • Equalized odds (separation)
  • ...and numerous others and variations!


Many measures

Many measures proposed

Some specialized for tasks (e.g., ranking, NLP)

Some consider downstream utility of various outcomes

Most are similar to the three discussed

  • Comparing different measures in the error matrix (e.g., false positive rate, lift)

Outlook: Building Fair ML-Based Products

Next lecture: Fairness is a system-wide concern

  • Identifying and negotiating fairness requirements
  • Fairness beyond model predictions (product design, mitigations, data collection)
  • Fairness in process and teamwork, barriers and responsibilities
  • Documenting fairness at the interface
  • Monitoring
  • Promoting best practices

Summary

  • Three definitions of fairness: Anti-classification, group fairness, equalized odds
  • Tradeoffs between fairness criteria
    • What is the goal?
    • Key: how to deal with unequal starting positions
  • Improving fairness of a model
    • In all pipeline stages: data collection, data cleaning, training, inference, evaluation

Further Readings