### Disparate Impact  
[towardsdatascience blog](https://towardsdatascience.com/ai-fairness-explanation-of-disparate-impact-remover-ce0da59451f1)  
Disparate Impact is a metric to evaluate fairness. It compares the proportion of individuals that receive a positive output for two groups: an unprivileged group and a privileged group.  
$$\frac{Pr(Y=1|D=\text{unprivileged})}{Pr(Y=1|D=\text{privileged})}$$  
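
As a rough illustration of the ratio above, the sketch below computes disparate impact from a list of binary predictions and a list of group labels. The function name, the toy data, and the group labels `"u"` and `"p"` are assumptions made for this example, not taken from the referenced blog.

```python
def disparate_impact(y_pred, group, unprivileged, privileged):
    """Ratio of positive-outcome rates: unprivileged over privileged."""
    def positive_rate(g):
        outcomes = [y for y, d in zip(y_pred, group) if d == g]
        if not outcomes:
            raise ValueError(f"no samples for group {g!r}")
        return sum(1 for y in outcomes if y == 1) / len(outcomes)

    privileged_rate = positive_rate(privileged)
    if privileged_rate == 0:
        raise ZeroDivisionError("privileged group has no positive outcomes")
    return positive_rate(unprivileged) / privileged_rate


# Toy data: 1 = positive outcome; "u" / "p" are hypothetical group labels.
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["u", "u", "u", "u", "p", "p", "p", "p"]
di = disparate_impact(y_pred, group, unprivileged="u", privileged="p")
```

A value of 1 means both groups receive positive outputs at the same rate; in this toy data the unprivileged rate (1/4) is one third of the privileged rate (3/4).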

__Disparate Impact Remover__  
Disparate Impact Remover is a pre-processing technique that edits feature values to increase fairness between the groups.  
When a feature is distributed differently in each group, a model can use it to infer group membership; Disparate Impact Remover aims to remove this ability to distinguish between groups.  

[Certifying and removing disparate impact (pdf)](https://arxiv.org/pdf/1412.3756.pdf)  

The method assumes the protected feature is binary.  
For the multi-class case, it assumes that one value of the multivalued attribute is designated as the "default" or majority class, and compares each of the other values pairwise to this default class.  

###  Adversarial Machine Learning  
[floydhub blog](https://blog.floydhub.com/introduction-to-adversarial-machine-learning/)  
Adversarial Machine Learning is a collection of techniques for training neural networks to spot intentionally misleading data or behaviors  

The idea is to expect attacks on the model and to defend against them  

Attack:  
* __Black Box Attack__  
  The attacker has no information about the model, or has no access to the gradients/parameters of the model  
* __White Box Attack__  
  The attacker has complete access to the parameters and the gradients of the model  


* __Targeted Attack__  
  The attacker perturbs the input image in a way such that the model predicts a specific target class  
* __Untargeted Attack__  
  The attacker perturbs the input image such as to make the model predict any class other than the true class  

Defense:  
* __Adversarial Training__  
  During training, we also generate adversarial images using the attack we want to defend against, and train the model on those adversarial images along with the regular images  
* __Random Resizing and Padding__  
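
To make the white-box, untargeted case concrete, here is a minimal sketch using the fast gradient sign method (FGSM) on a tiny logistic regression model. FGSM is not named in the notes above and is swapped in purely as an example; the weights, input, and `eps` value are hypothetical.

```python
import math


def predict_proba(w, b, x):
    """P(y=1 | x) for a logistic regression model with weights w and bias b."""
    logit = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-logit))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y_true, eps):
    """Move each input coordinate by eps in the direction that raises the loss."""
    p = predict_proba(w, b, x)
    # For logistic regression, d(cross-entropy)/dx_i = (p - y_true) * w_i.
    grad = [(p - y_true) * wi for wi in w]
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


# Hypothetical model parameters and input; the true class is 1.
w, b = [2.0, -1.0], 0.0
x, y_true = [0.5, 0.2], 1
x_adv = fgsm(w, b, x, y_true, eps=0.5)

clean_label = int(predict_proba(w, b, x) >= 0.5)    # classified as 1
adv_label = int(predict_proba(w, b, x_adv) >= 0.5)  # classified as 0
```

The attack needs the model's gradients, which is what makes it white-box. Adversarial training would add inputs like `x_adv`, paired with their true labels, to the training set.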

## 1. Existing Measures of Bias and Related Techniques

### Paper 1  
[A Survey on Bias and Fairness in Machine Learning (pdf)](https://arxiv.org/pdf/1908.09635.pdf)  
Introduces 23 types of bias  
Related types of bias:
* __Measurement Bias__  
  Measurement bias happens from the way we choose, utilize, and measure a particular feature  
* __Evaluation Bias__  
  Evaluation bias happens during model evaluation  
* __Aggregation Bias__  
  Aggregation bias happens when false conclusions are drawn for a subgroup based on observing other different subgroups or generally when false assumptions about a population affect the model’s outcome and definition  
* __Sampling Bias__  
  Sampling bias arises due to non-random sampling of subgroups  
* __Algorithmic Bias__  
  Algorithmic bias is when the bias is not present in the input data and is added purely by the algorithm  
* __Omitted Variable Bias__  
  Omitted variable bias occurs when one or more important variables are left out of the model  

Introduces 6 types of discrimination  
1. __Direct Discrimination__  
   Direct discrimination happens when protected attributes of individuals explicitly result in non-favorable outcomes toward them  
2. __Indirect Discrimination__  
   In Indirect discrimination, individuals appear to be treated based on seemingly neutral and non-protected attributes; however, protected groups or individuals still get to be treated unjustly as a result of implicit effects from their protected attributes  
3. __Systemic Discrimination__  
   Systemic discrimination refers to policies, customs, or behaviors that are a part of the culture or structure of an organization that may perpetuate discrimination against certain subgroups of the population  
4. __Statistical Discrimination__  
   Statistical discrimination is a phenomenon where decision-makers use average group statistics to judge an individual belonging to that group  
5. __Explainable Discrimination__  
   Differences in treatment and outcomes amongst different groups can be justified and explained via some attributes in some cases. In situations where these differences are justified and explained, it is not considered to be illegal discrimination and hence called explainable  
6. __Unexplainable Discrimination__  
   In contrast to explainable discrimination, there is unexplainable discrimination in which the discrimination toward a group is unjustified and therefore considered illegal  


Introduces 10 definitions of fairness  
1. __Equalized Odds__  
   A predictor $\hat{Y}$ satisfies equalized odds with respect to protected attribute $A$ and outcome $Y$, if $\hat{Y}$ and $A$ are independent conditional on $Y$. P($\hat{Y}$=1|$A$=0,$Y$=y) = P($\hat{Y}$=1|$A$=1,$Y$=y) , y$\in${0,1}  
   The equalized odds definition states that the protected and unprotected groups should have equal rates for true positives and false positives  
2. __Equal Opportunity__  
   A binary predictor $\hat{Y}$ satisfies equal opportunity with respect to $A$ and $Y$ if P($\hat{Y}$=1|$A$=0,$Y$=1) = P($\hat{Y}$=1|$A$=1,$Y$=1)  
   The equal opportunity definition states that the protected and unprotected groups should have equal true positive rates  
3. __Demographic Parity (Statistical Parity)__  
   A predictor $\hat{Y}$ satisfies demographic parity if P($\hat{Y}$|$A$=0) = P($\hat{Y}$|$A$=1)  
   The likelihood of a positive outcome should be the same regardless of whether the person is in the protected (e.g., female) group  
4. __Fairness Through Awareness__  
   An algorithm is fair if it gives similar predictions to similar individuals  
5. __Fairness Through Unawareness__  
   An algorithm is fair as long as any protected attributes $A$ are not explicitly used in the decision-making process  
6. __Treatment Equality__  
   Treatment equality is achieved when the ratio of false negatives and false positives is the same for both protected group categories  
7. __Test Fairness__  
   A score S = S($x$) is test-fair (well-calibrated) if it reflects the same likelihood of recidivism irrespective of the individual’s group membership, $R$. That is, if for all values of $s$, P($Y$=1|$S$=$s$,$R$=$b$) = P($Y$=1|$S$=$s$,$R$=$w$)  
   The test fairness definition states that for any predicted probability score $S$, people in both protected and unprotected (female and male) groups must have equal probability of correctly belonging to the positive class  
8. __Counterfactual Fairness__  
   Predictor $\hat{Y}$ is counterfactually fair if under any context $X$=$x$ and $A$=$a$, P($\hat{Y}_{A\leftarrow{a}}$($U$)=$y$|$X$=$x$,$A$=$a$)=P($\hat{Y}_{A\leftarrow{a'}}$($U$)=$y$|$X$=$x$,$A$=$a$), for all $y$ and for any value $a'$ attainable by $A$  
   Based on intuition that a decision is fair towards an individual if it is the same in both the actual world and a counterfactual world where the individual belonged to a different demographic group  
9. __Fairness in Relational Domains__  
   A notion of fairness that is able to capture the relational structure in a domain -- not only by taking attributes of individuals into consideration but by taking into account the social, organizational, and other connections between individuals  
10. __Conditional Statistical Parity__  
    For a set of legitimate factors $L$, predictor $\hat{Y}$ satisfies conditional statistical parity if P($\hat{Y}$|$L$=1,$A$=0) = P($\hat{Y}$|$L$=1,$A$=1)  
    Conditional statistical parity states that people in both protected and unprotected (female and male) groups should have equal probability of being assigned to a positive outcome given a set of legitimate factors $L$  
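
The three group-level definitions above (demographic parity, equal opportunity, equalized odds) can be checked empirically by comparing rates between the two groups. The sketch below is one way to do that, assuming binary labels, binary predictions, a binary protected attribute, and at least one sample in every (group, label) cell. Reporting equalized odds as the larger of the true-positive and false-positive gaps is a choice made for this example, not something the survey prescribes.

```python
def positive_rate(y_pred, mask):
    """Share of positive predictions among the rows where mask is True."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    return sum(selected) / len(selected)


def group_fairness_gaps(y_true, y_pred, a):
    """Absolute gaps between groups A=0 and A=1; 0 means the definition holds."""
    def gap(keep_label):
        rates = [
            positive_rate(y_pred, [ai == g and keep_label(yi)
                                   for yi, ai in zip(y_true, a)])
            for g in (0, 1)
        ]
        return abs(rates[0] - rates[1])

    tpr_gap = gap(lambda y: y == 1)  # P(Y_hat=1 | A, Y=1)
    fpr_gap = gap(lambda y: y == 0)  # P(Y_hat=1 | A, Y=0)
    return {
        "demographic_parity": gap(lambda y: True),
        "equal_opportunity": tpr_gap,
        "equalized_odds": max(tpr_gap, fpr_gap),
    }


# Toy data: four samples per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
a = [0, 0, 0, 0, 1, 1, 1, 1]
gaps = group_fairness_gaps(y_true, y_pred, a)
```

In this toy data every gap is 0.5, so none of the three definitions is satisfied.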

Fairness definitions fall under 3 types: **Individual Fairness**, **Group Fairness**, and **Subgroup Fairness**  

Classification task related papers:  
* [Non-discriminatory machine learning through convex fairness criteria](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16476/16633)  
* [Fairness-aware classifier with prejudice remover regularizer](https://link.springer.com/content/pdf/10.1007%2F978-3-642-33486-3_3.pdf)  
* [Adaptive Sensitive Reweighting to Mitigate Bias in Fairness-aware Classification](https://dl.acm.org/doi/pdf/10.1145/3178876.3186133)  
* [The cost of fairness in binary classification](http://proceedings.mlr.press/v81/menon18a/menon18a.pdf)  
* [Fairness without Harm: Decoupled Classifiers with Preference Guarantees](http://proceedings.mlr.press/v97/ustun19a/ustun19a.pdf)  
* [Equality of opportunity in supervised learning](http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning.pdf)  
* [Fairness constraints: Mechanisms for fair classification](http://proceedings.mlr.press/v54/zafar17a/zafar17a.pdf)  
* [Learning non-discriminatory predictors](https://arxiv.org/pdf/1702.06081.pdf)  
* [Stable and Fair Classification](https://arxiv.org/pdf/1902.07823.pdf)  
* [Three naive Bayes approaches for discrimination-free classification](https://link.springer.com/content/pdf/10.1007/s10618-010-0190-x.pdf)  
* [Fairness-aware Classification: Criterion, Convexity, and Bounds](https://arxiv.org/pdf/1809.04737.pdf)  

### Paper 2  
[Non-Discriminatory Machine Learning through Convex Fairness Criteria (pdf)](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16476/16633)  

__World Bias Matrix__  
The world bias matrix is a $2\times{2}$ matrix $W$, where $W_{ij}=W[i][j]$ is the probability of the true class of a data sample being $i\in\{+,-\}$, given that the value of the protected attribute is $j\in\{0, 1\}$.  
The world bias matrix represents the inherent real-world bias in the true class of a test sample for different values of the protected attribute.  

__Propositions:__  
1. If the world is not biased, any classifier with arbitrary confusion matrix and satisfying equalized odds, is also non-discriminatory  
2. If the world is biased, a classifier with identity confusion matrix and satisfying only equalized odds, can not be non-discriminatory  
3. If the world is biased, a classifier satisfying equalized odds is non-discriminatory if and only if $C[+][+] = C[+][-]$ ($C$ is the confusion matrix of the classifier)  

__Impossibility of Perfect Non-Discrimination__ (from Proposition 3)  
Any practically useful classifier satisfying equalized odds in a biased world can’t be non-discriminatory  
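
A small numeric check can make Proposition 3 more tangible. The sketch below rests on two assumptions that should be verified against the paper: that $C[+][i]$ is read as $P(\hat{y}=+|y=i)$, and that "non-discriminatory" means equal positive-prediction rates for both values of the protected attribute. The world bias matrix and both confusion matrices are made-up numbers.

```python
def positive_rate(C, W, z):
    """P(y_hat=+ | z) = sum over i of P(y_hat=+ | y=i) * P(y=i | z)."""
    return sum(C["+"][i] * W[i][z] for i in ("+", "-"))


# Hypothetical biased world: the true class is + more often when z = 1.
W = {"+": {0: 0.3, 1: 0.6}, "-": {0: 0.7, 1: 0.4}}

# C["+"][i] = P(y_hat=+ | y=i), shared by both groups (equalized odds).
useful = {"+": {"+": 0.9, "-": 0.2}}
uninformative = {"+": {"+": 0.5, "-": 0.5}}

useful_rates = [positive_rate(useful, W, z) for z in (0, 1)]
flat_rates = [positive_rate(uninformative, W, z) for z in (0, 1)]
```

Under these assumptions the useful classifier predicts + at different rates for the two groups (about 0.41 versus 0.62), while the classifier with $C[+][+] = C[+][-]$ predicts + at the same rate for both but ignores the true class entirely.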

__p-rule__  
$$\min\Big(\frac{P(\hat{y}=+|z=1)}{P(\hat{y}=+|z=0)},\frac{P(\hat{y}=+|z=0)}{P(\hat{y}=+|z=1)}\Big)\geq{p}$$  
where $P(\hat{y}=+|z=1)$ is the probability of the classifier predicting the class of a test sample as +, given that the value of the protected attribute is 1  
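
The left-hand side of the p-rule can be computed directly from predictions. The following is a minimal sketch; the handling of zero rates and the threshold $p = 0.8$ used at the end are assumptions chosen for this example.

```python
def p_rule(y_pred, z, positive="+"):
    """Smaller of the two ratios between the groups' positive-prediction rates."""
    def positive_rate(group):
        preds = [y for y, zi in zip(y_pred, z) if zi == group]
        return sum(1 for y in preds if y == positive) / len(preds)

    r0, r1 = positive_rate(0), positive_rate(1)
    if r0 == r1:
        return 1.0  # assumption: identical rates (even both zero) count as 1
    if r0 == 0 or r1 == 0:
        return 0.0  # assumption: one group is never predicted positive
    return min(r1 / r0, r0 / r1)


# Toy data: four samples per value of the protected attribute.
y_pred = ["+", "+", "-", "-", "+", "-", "-", "-"]
z = [1, 1, 1, 1, 0, 0, 0, 0]
score = p_rule(y_pred, z)
satisfies_p_rule = score >= 0.8  # example threshold p = 0.8
```

Here the positive rates are 0.5 for $z=1$ and 0.25 for $z=0$, so the score is 0.5 and the example threshold is not met.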

__Weighted Sum of Logs Technique (see pdf)__  

__Fairness__  
A machine learning classifier $f$ is said to be *__proportionally fair__* if for any other allowed classifier $u$:  
$$\sum^N_{i=1}\frac{(P_u)^+_i-(P_f)^+_i}{(P_f)^+_i}\leq0$$  
where $(P_c)^+_i$ is the probability of classifier $c$ favoring $i$  

A classifier $f$ is said to be *__weighted proportionally fair__* if for any other allowed classifier $u$:  
$$\sum^N_{i=1}w_i\frac{(P_u)^+_i-(P_f)^+_i}{(P_f)^+_i}\leq0$$  
Here $w_i$'s are interpreted as costs paid by different individuals in history  
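
Both conditions can be checked numerically against a finite set of alternative classifiers. The sketch below assumes each classifier is summarised by a list of per-individual favorable probabilities $(P_c)^+_i$; the candidate lists are hypothetical, and a finite candidate set only approximates "any other allowed classifier".

```python
def proportional_fairness_sum(p_f, p_u, weights=None):
    """Weighted sum of relative changes in favorable probability from f to u."""
    if weights is None:
        weights = [1.0] * len(p_f)
    return sum(w * (pu - pf) / pf for w, pf, pu in zip(weights, p_f, p_u))


def is_proportionally_fair(p_f, alternatives, weights=None, tol=1e-12):
    """True if no listed alternative yields a positive sum (up to tol)."""
    return all(proportional_fairness_sum(p_f, p_u, weights) <= tol
               for p_u in alternatives)


# Hypothetical favorable probabilities for two individuals.
even = [0.5, 0.5]
skewed = [0.9, 0.1]

even_is_fair = is_proportionally_fair(even, [[0.6, 0.4], skewed])
skewed_is_fair = is_proportionally_fair(skewed, [even])
```

Moving from `skewed` to `even` costs the first individual a small relative loss but gives the second a large relative gain, so the sum is positive and `skewed` fails the check.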

### Paper 3  
[Fairness-aware classifier with prejudice remover regularizer](https://link.springer.com/content/pdf/10.1007%2F978-3-642-33486-3_3.pdf)  

__Three Causes of Unfairness:__  
1. __prejudice__  
   1. _direct prejudice_  
      The use of a sensitive variable in a prediction model  
   2. _indirect prejudice_  
      Statistical dependence between a sensitive variable and a target variable  
   3. _latent prejudice_  
      Statistical dependence between a sensitive variable, $S$, and a non-sensitive variable, $X$  
2. __underestimation__  
   Underestimation is the state in which a learned model is not fully converged due to the finiteness of the size of a training data set  
3. __negative legacy__  
   Negative legacy is unfair sampling or labeling in the training data  

__Prejudice Removal Techniques (see pdf)__  

### Paper 4  
[The cost of fairness in binary classification](http://proceedings.mlr.press/v81/menon18a/menon18a.pdf)  

__Fairness-aware learning:__  
* Perfect fairness  
  demographic parity (DP), equality of opportunity (EO)  
* Approximate fairness  
  disparate impact (DI), mean difference (MD)  

Informally, fairness-aware learning involves finding a randomised classifier $f:\chi\to[0,1]$ so that $Y$ (target feature) is well predicted, but $\bar{Y}$ (sensitive feature) is not  

$D$: Distribution $P(X,Y)$  
$\bar{D}_{\text{DP}}$: Distribution $P(X,\bar{Y})$  
$\bar{D}_{\text{EO}}$: Distribution $P(X,\bar{Y}|Y=1)$  

_performance measure_: $R_{\text{perf}}(\cdot;D): R^\chi\to R$  
_fairness measure_: $R_{\text{fair}}(\cdot;\bar{D}): R^\chi\to R$  

For tradeoff $\lambda\in R$, minimise the fairness-aware objective:  
$$R_{\text{perf}}(f;D)-\lambda\cdot R_{\text{fair}}(f;\bar{D})$$  

__cost-sensitive risk__  
$$\text{CS}(f;c)\doteq\pi\cdot(1-c)\cdot\text{FNR}(f)+(1-\pi)\cdot{c}\cdot\text{FPR}(f)$$  
where cost parameter $c\in(0,1)$, and $\pi\doteq{P(Y=1)}$  

__balanced cost-sensitive risk__  
$$\text{CS}_\text{bal}(f;c)\doteq(1-c)\cdot\text{FNR}(f)+c\cdot\text{FPR}(f)$$  
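
For intuition, both risks can be evaluated on a finite sample. The sketch below assumes hard 0/1 predictions in place of the randomised classifier $f$ and estimates $\pi$, FNR, and FPR from the sample; the data are made up.

```python
def fnr_fpr(y_true, y_pred):
    """Empirical false-negative and false-positive rates."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return fn / pos, fp / neg


def cost_sensitive_risk(y_true, y_pred, c, balanced=False):
    """CS(f; c), or CS_bal(f; c) when balanced=True."""
    fnr, fpr = fnr_fpr(y_true, y_pred)
    if balanced:
        return (1 - c) * fnr + c * fpr
    pi = sum(1 for t in y_true if t == 1) / len(y_true)  # estimate of P(Y=1)
    return pi * (1 - c) * fnr + (1 - pi) * c * fpr


# Toy data: 2 positives, 6 negatives, one error of each kind.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0]
cs = cost_sensitive_risk(y_true, y_pred, c=0.5)
cs_bal = cost_sensitive_risk(y_true, y_pred, c=0.5, balanced=True)
```

With $c = 0.5$ the cost-sensitive risk is half the error rate (0.125 here), while the balanced version weights FNR and FPR equally regardless of how rare the positive class is.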

__A plugin approach to the fairness problem (see pdf)__  

__The fairness frontier (accuracy-fairness tradeoff) (see pdf)__  

### Paper 5  
[Fairness constraints: Mechanisms for fair classification](http://proceedings.mlr.press/v54/zafar17a/zafar17a.pdf)  

__Decision Boundary Covariance__  
Decision boundary (un)fairness is defined as the covariance between the users’ sensitive attributes, $\{z_i\}^N_{i=1}$, and the signed distance from the users’ feature vectors to the decision boundary, $\{d_\theta(x_i)\}^N_{i=1}$  
$$\text{Cov}(z,d_\theta(x))=E[(z-\bar{z})d_\theta(x)]-E[(z-\bar{z})]\bar{d}_\theta(x)\approx\frac{1}{N}\sum^N_{i=1}(z_i-\bar{z})d_\theta(x_i)$$  
In linear models for classification, such as logistic regression or linear SVMs, the decision boundary is simply the hyperplane defined by $\theta^Tx=0$, so $\text{Cov}(z,d_\theta(x))$ reduces to $\frac{1}{N}\sum^N_{i=1}(z_i-\bar{z})\theta^Tx_i$  
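
The linear-model form of the covariance is straightforward to compute. This sketch uses plain lists; the parameter vectors, feature vectors, and sensitive attribute values are hypothetical.

```python
def boundary_covariance(theta, X, z):
    """(1/N) * sum_i (z_i - z_bar) * theta^T x_i for a linear classifier."""
    n = len(X)
    z_bar = sum(z) / n
    distances = [sum(t * xi for t, xi in zip(theta, x)) for x in X]
    return sum((zi - z_bar) * d for zi, d in zip(z, distances)) / n


# Toy data: the first feature separates the groups, the second does not.
X = [[2.0, 0.0], [1.0, 1.0], [-1.0, 0.0], [-2.0, 1.0]]
z = [1, 1, 0, 0]

cov_first = boundary_covariance([1.0, 0.0], X, z)   # boundary uses feature 1
cov_second = boundary_covariance([0.0, 1.0], X, z)  # boundary uses feature 2
```

A boundary built on the first feature has covariance 0.75 with the sensitive attribute, while one built on the second feature has covariance 0, so only the second would satisfy a small covariance threshold.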

__Maximizing Accuracy Under Fairness Constraints__  
minimize $L(\theta)$, subject to  
* $\frac{1}{N}\sum^N_{i=1}(z_i-\bar{z})d_\theta(x_i)\leq c$  
* $\frac{1}{N}\sum^N_{i=1}(z_i-\bar{z})d_\theta(x_i)\geq -c$  

where $c$ is the covariance threshold  

__Maximizing Fairness Under Accuracy Constraints__  
minimize $|\frac{1}{N}\sum^N_{i=1}(z_i-\bar{z})d_\theta(x_i)|$,  
subject to $L(\theta)\leq(1+\gamma)L(\theta^*)$  
where $L(\theta^*)$ denotes the optimal loss over the training set provided by the unconstrained classifier and $\gamma\geq0$ specifies the maximum additional loss with respect to the loss provided by the unconstrained classifier. 

### Paper 6  
[Stable and Fair Classification](https://arxiv.org/pdf/1902.07823.pdf)  

__Stability measure__  
1. __Uniform stability__  
   Given an integer $N$, a real-valued classification algorithm $A$ is $\beta_N$-uniformly stable with respect to the loss function $L(\cdot,\cdot)$ if the following holds: for all $i\in[N]$ and $S$, $S^i\in{D^N}$,  
   $$\|L(A_S,\cdot)-L(A_{S^i},\cdot)\|_\infty\leq\beta_N$$  
   i.e., for any training set $S$, $S^i\in{D^N}$ , the $l_\infty$-distance between the risks of $A_S$ and $A_{S^i}$ is at most $\beta_N$  
2. __Prediction stability__  
   Given an integer $N$, a real-valued classification algorithm $A$ is $\beta_N$-prediction stable if the following holds: for all $i\in[N]$,  
   $$\mathop{\text{Pr}}_{S,S^i\in{D^N},X\sim\gimel}[I[A_S(X)\geq0]\neq I[A_{S^i}(X)\geq0]]\leq\beta_N$$  
   i.e., given two training sets $S$, $S^i\in D^N$ that differ by a single sample, the probability that $A_S$ and $A_{S^i}$ predict differently is at most $\beta_N$  

__The stable and fair optimization problem (see pdf)__  

## 2. Other Class Balancing Techniques

[A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data](https://dl.acm.org/doi/pdf/10.1145/1007730.1007735)  

10 methods of under and over sampling:  
1. __Random over-sampling__  
   Aim to balance class distribution through the random replication of minority class examples  
2. __Random under-sampling__  
   Aim to balance class distribution through the random elimination of majority class examples  
3. __Tomek links__  
   Let $E_i$ and $E_j$ be two examples belonging to different classes, and let $d(E_i,E_j)$ be the distance between them. The pair $(E_i,E_j)$ is called a Tomek link if there is no example $E_l$ such that $d(E_i,E_l)<d(E_i,E_j)$ or $d(E_j,E_l)<d(E_i,E_j)$  
   As an under-sampling method, only examples belonging to the majority class are eliminated, and as a data cleaning method, examples of both classes are removed  
4. __Condensed Nearest Neighbor Rule (CNN)__  
   Used to find a consistent subset of examples  
   A subset $\hat{E}\subseteq{E}$ is consistent with $E$ if using a 1-nearest neighbor, $\hat{E}$ correctly classifies the examples in $E$  
   Algorithm:  
   1. Randomly draw one majority class example and all examples from the minority class and put these examples in $\hat{E}$  
   2. Use a 1-NN over the examples in $\hat{E}$ to classify the examples in $E$. Every misclassified example from $E$ is moved to $\hat{E}$  
   
   This procedure does not find the smallest consistent subset from $E$  
5. __One-sided selection (OSS)__  
   An under-sampling method resulting from the application of Tomek links followed by the application of CNN  
   Tomek links are used as an under-sampling method and remove noisy and borderline majority class examples  
   CNN aims to remove examples from the majority class that are distant from the decision border  
6. __CNN + Tomek links__  
   A method proposed in the paper, competitive with OSS  
7. __Neighborhood Cleaning Rule (NCL)__  
   Use the _Wilson’s Edited Nearest Neighbor Rule_ (ENN) to remove majority class examples  
   ENN removes any example whose class label differs from the class of at least two of its three nearest neighbors  
   NCL modifies the ENN in order to increase the data cleaning  
   Algorithm on a two-class problem:  
   1. For each example $E_i$ in the training set, find its three nearest neighbors  
   2. If $E_i$ belongs to the majority class and the classification given by its three nearest neighbors contradicts the original class of $E_i$, then $E_i$ is removed  
   3. If $E_i$ belongs to the minority class and its three nearest neighbors misclassify $E_i$, then the nearest neighbors that belong to the majority class are removed  
8. __Synthetic Minority Over-sampling Technique (Smote)__  
   Main idea is to form new minority class examples by interpolating between several minority class examples that lie together  
9. __Smote + Tomek links__  
   Method proposed in the paper, applying Tomek links to the over-sampled training set as a data cleaning method  
   Instead of removing only the majority class examples that form Tomek links, examples from both classes are removed  
10. __Smote + ENN__  
    Method proposed in the paper  
    ENN tends to remove more examples than Tomek links do, so it is expected to provide a more in-depth data cleaning  
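
The Tomek link definition in item 3 translates almost literally into code. The sketch below is a brute-force version for small data sets, assuming Euclidean distance; it is not how imbalanced-learn implements it, and the points and labels are made up.

```python
import math


def tomek_links(X, y):
    """Index pairs (i, j), i < j, of opposite-class examples forming Tomek links."""
    n = len(X)
    links = []
    for i in range(n):
        for j in range(i + 1, n):
            if y[i] == y[j]:
                continue
            d_ij = math.dist(X[i], X[j])
            # A third example closer to either end breaks the link.
            closer_exists = any(
                math.dist(X[i], X[k]) < d_ij or math.dist(X[j], X[k]) < d_ij
                for k in range(n) if k not in (i, j)
            )
            if not closer_exists:
                links.append((i, j))
    return links


def undersample_with_tomek(X, y, majority):
    """Drop only the majority-class member of each Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair if y[k] == majority}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.2, 3.0), (6.0, 6.0)]
y = ["maj", "maj", "maj", "maj", "min", "min"]
links = tomek_links(X, y)
X_clean, y_clean = undersample_with_tomek(X, y, majority="maj")
```

Only the majority point at `(3.0, 3.0)` and the minority point at `(3.2, 3.0)` form a link, so under-sampling removes that one majority example. Removing both members instead would give the data cleaning variant.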

[imbalanced-learn (python toolkit) API](https://imbalanced-learn.readthedocs.io/en/stable/api.html)  

1. __Under Sampling__  
   * _ClusterCentroids_  
     Method that under samples the majority class by replacing a cluster of majority samples by the cluster centroid of a KMeans algorithm  
   * _CondensedNearestNeighbour_ ([Algo](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.294.6968&rep=rep1&type=pdf))  
   * _EditedNearestNeighbours_ ([Paper](https://ieeexplore.ieee.org/abstract/document/4309137))  
   * _RepeatedEditedNearestNeighbours_ ([Paper](https://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=PASCAL7630301131))  
   * _AllKNN_ ([Paper](https://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=PASCAL7630301131))  
   * _InstanceHardnessThreshold_ ([Paper](https://link.springer.com/article/10.1007/s10994-013-5422-z))  
   * _NearMiss_ ([Paper](https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf))  
   * _NeighbourhoodCleaningRule_ ([Paper](https://link.springer.com/chapter/10.1007/3-540-48229-6_9))  
   * _OneSidedSelection_ ([Paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.43.4487&rep=rep1&type=pdf))  
   * _RandomUnderSampler_  
   * _TomekLinks_ ([Paper](https://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=PASCAL7730180944))  
2. __Over Sampling__  
   * _ADASYN_ ([Paper](https://ieeexplore.ieee.org/abstract/document/4633969))<br/>
     Using Adaptive Synthetic (ADASYN) sampling approach  
   * _BorderlineSMOTE_ ([Paper](https://link.springer.com/chapter/10.1007/11538059_91))<br/>
     Borderline samples will be detected and used to generate new synthetic samples  
   * _KMeansSMOTE_ ([Paper](https://arxiv.org/abs/1711.00837))  
   * _RandomOverSampler_  
   * _SMOTE_ ([Paper](https://www.jair.org/index.php/jair/article/view/10302))  
   * _SMOTENC_ ([Paper](https://www.jair.org/index.php/jair/article/view/10302))  
     Unlike SMOTE, SMOTE-NC handles datasets containing both continuous and categorical features  
   * _SVMSMOTE_ ([Paper](https://www.inderscienceonline.com/doi/abs/10.1504/IJKESDP.2011.039875))  
3. __Under + Over Sampling__  
   * _SMOTEENN_ ([Paper](https://dl.acm.org/doi/pdf/10.1145/1007730.1007735))<br/>
     Perform over-sampling using SMOTE and cleaning using Edited Nearest Neighbours (ENN)  
   * _SMOTETomek_ ([Paper](https://www.inf.ufrgs.br/maslab/pergamus/pubs/balancing-training-data-for.pdf))<br/>
     Perform over-sampling using SMOTE and cleaning using Tomek links  
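
Several of the classes listed above build on the SMOTE interpolation step. The sketch below shows that step in plain Python so the idea is visible without the library. It is not imbalanced-learn's implementation, and the function name, `k`, and the toy points are assumptions for this example.

```python
import math
import random


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new points on segments between minority examples and neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # k nearest minority neighbours of the base example
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 1.0), (1.2, 1.8)]
new_points = smote_like_samples(minority, n_new=5)
```

Each synthetic point lies on a line segment between two existing minority examples, so it stays inside the region the minority class already occupies.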

## 3. AI FAIRNESS 360

[API](https://aif360.readthedocs.io/en/latest/) | [Python Code](https://github.com/Trusted-AI/AIF360)  

1. __Preprocessing__  
   * _DisparateImpactRemover_ ([Paper](https://dl.acm.org/doi/pdf/10.1145/2783258.2783311))<br/>
     Disparate impact remover is a pre-processing technique that edits feature values to increase group fairness while preserving rank-ordering within groups  
   * _LFR_ ([Paper](http://proceedings.mlr.press/v28/zemel13.pdf))<br/>
     Learning fair representations is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes  
   * _OptimPreproc_ ([Paper](http://papers.nips.cc/paper/6988-optimized-pre-processing-for-discrimination-prevention.pdf))<br/>
     Optimized preprocessing is a pre-processing technique that learns a probabilistic transformation that edits the features and labels in the data with group fairness, individual distortion, and data fidelity constraints and objectives  
   * _Reweighing_ ([Paper](https://link.springer.com/article/10.1007/s10115-011-0463-8))<br/>
     Reweighing is a pre-processing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification  
2. __Inprocessing__  
   * _AdversarialDebiasing_ ([Paper](https://dl.acm.org/doi/pdf/10.1145/3278721.3278779))<br/>
     Adversarial debiasing is an in-processing technique that learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary’s ability to determine the protected attribute from the predictions. This approach leads to a fair classifier as the predictions cannot carry any group discrimination information that the adversary can exploit  
   * _ARTClassifier_  
     A Classifier object from the adversarial-robustness-toolbox  
   * _GerryFairClassifier_ ([Paper1](http://proceedings.mlr.press/v80/kearns18a.html) [Paper2](https://dl.acm.org/doi/pdf/10.1145/3287560.3287592))<br/>
     An algorithm for learning classifiers that are fair with respect to rich subgroups  
     Rich subgroups are defined by (linear) functions over the sensitive attributes  
     This implementation uses a max of two regressions as a cost-sensitive classification oracle  
   * _MetaFairClassifier_ ([Paper](https://dl.acm.org/doi/pdf/10.1145/3287560.3287586))<br/>
     The meta algorithm here takes the fairness metric as part of the input and returns a classifier optimized w.r.t. that fairness metric  
   * _PrejudiceRemover_ ([Paper](https://link.springer.com/chapter/10.1007/978-3-642-33486-3_3))<br/>
     Prejudice remover is an in-processing technique that adds a discrimination-aware regularization term to the learning objective  
3. __Postprocessing__  
   * _CalibratedEqOddsPostprocessing_ ([Paper](http://papers.nips.cc/paper/7151-on-fairness-and-calibration.pdf))<br/>
     Calibrated equalized odds postprocessing is a post-processing technique that optimizes over calibrated classifier score outputs to find probabilities with which to change output labels with an equalized odds objective  
   * _EqOddsPostprocessing_ ([Paper](http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning.pdf))<br/>
     Equalized odds postprocessing is a post-processing technique that solves a linear program to find probabilities with which to change output labels to optimize equalized odds  
   * _RejectOptionClassification_ ([Paper](https://ieeexplore.ieee.org/abstract/document/6413831))<br/>
     Reject option classification is a post-processing technique that gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups in a confidence band around the decision boundary with the highest uncertainty  
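
To illustrate the Reweighing idea from the preprocessing list, the sketch below computes one weight per (group, label) cell as $P(\text{group})\cdot P(\text{label})/P(\text{group},\text{label})$, which is the weighting scheme as I understand it from the referenced paper. It is a plain-Python illustration rather than the AIF360 API, and the group names and labels are made up.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) cell: expected share over observed share."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    cell_counts = Counter(zip(groups, labels))
    return {
        (g, lab): (group_counts[g] * label_counts[lab]) / (n * count)
        for (g, lab), count in cell_counts.items()
    }


def weighted_positive_rate(groups, labels, weights, group):
    """Share of weighted examples in a group that carry label 1."""
    num = sum(weights[(g, lab)] for g, lab in zip(groups, labels)
              if g == group and lab == 1)
    den = sum(weights[(g, lab)] for g, lab in zip(groups, labels) if g == group)
    return num / den


groups = ["u", "u", "u", "u", "p", "p", "p", "p"]
labels = [1, 0, 0, 0, 1, 1, 1, 0]
weights = reweighing_weights(groups, labels)
rate_u = weighted_positive_rate(groups, labels, weights, "u")
rate_p = weighted_positive_rate(groups, labels, weights, "p")
```

The under-represented cells (`("u", 1)` and `("p", 0)`) get weight 2, the others get 2/3, and after weighting both groups have the same positive rate of 0.5.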

## 4. Machine Bias in Data 

$$\text{Bias} = |\text{FPR}_A - \text{FPR}_B| + |\text{FNR}_A - \text{FNR}_B|$$
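
The bias measure above can be computed from per-group confusion counts. A minimal sketch, with made-up counts for groups A and B:

```python
def error_rates(tp, fp, tn, fn):
    """Return (FPR, FNR) from confusion counts."""
    return fp / (fp + tn), fn / (fn + tp)


def bias(counts_a, counts_b):
    """|FPR_A - FPR_B| + |FNR_A - FNR_B|."""
    fpr_a, fnr_a = error_rates(**counts_a)
    fpr_b, fnr_b = error_rates(**counts_b)
    return abs(fpr_a - fpr_b) + abs(fnr_a - fnr_b)


# Hypothetical confusion counts for the two groups.
group_a = {"tp": 40, "fp": 10, "tn": 40, "fn": 10}  # FPR 0.2, FNR 0.2
group_b = {"tp": 30, "fp": 5, "tn": 45, "fn": 20}   # FPR 0.1, FNR 0.4
b = bias(group_a, group_b)
```

Here the FPR gap is 0.1 and the FNR gap is 0.2, giving a bias of about 0.3.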

__Why can removing feature X give a large drop in bias but not in accuracy?__  

1. X plays a big role in predicting the target, but is also highly correlated with the protected attribute. Another feature Y can replace X with little reduction in accuracy  
2. Bias was originally high, meaning FPR or FNR differed between the groups, so accuracy was not perfect  
   After removal, the difference is reduced while accuracy changes little  
   -> FPR (FNR) increases for one group and decreases for the other  
   
   

## 5. Bias Accuracy Tradeoff  

If a classifier is perfect, then there will be no bias, which means that accuracy can potentially go up while bias goes down  
If a classifier is the worst possible, then there will also be no bias, which means that accuracy can potentially go down together with bias  

A tradeoff is necessary because we are using a data imputation method to reduce bias  

$\text{Bias} = |\text{FPR}_A - \text{FPR}_B| + |\text{FNR}_A - \text{FNR}_B|
= \Big|\frac{\text{FP}_A}{\text{N}_A} - \frac{\text{FP}_B}{\text{N}_B}\Big| + \Big|\frac{\text{FN}_A}{\text{P}_A} - \frac{\text{FN}_B}{\text{P}_B}\Big|$  
where $\text{N}_A$ is the number of data points in group A that are actually negative, and  
$\text{FP}_A$ is the number of data points in group A that are actually negative but predicted as positive  


Assume the total number of data points is $t$, with $\text{P} = \text{N} = \frac{t}{2}$ and $\text{P}_A = \text{P}_B = \text{N}_A = \text{N}_B = \frac{t}{4}$  

Under this assumption, the sum of the four error rates is  
$\frac{\text{FP}_A}{\text{N}_A} + \frac{\text{FP}_B}{\text{N}_B} + \frac{\text{FN}_A}{\text{P}_A} + \frac{\text{FN}_B}{\text{P}_B}
= \frac{\text{FP}_A + \text{FP}_B + \text{FN}_A + \text{FN}_B}{t / 4}
= \frac{\text{Number of data points predicted wrong}}{\text{1/4 of total number of data points}}$  
which is linearly related to accuracy. Since $|a-b|\leq a+b$ for non-negative $a$ and $b$, Bias is at most this sum  
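
The identity above can be checked numerically. The sketch below uses made-up confusion counts that satisfy the balance assumption ($t = 100$, 25 actual positives and 25 actual negatives in each group) and compares the sum of the four error rates with $4\cdot(1-\text{accuracy})$.

```python
def rate_sum_and_accuracy(groups):
    """groups: list of dicts with tp, fp, tn, fn counts, one per group."""
    total = sum(sum(g.values()) for g in groups)
    errors = sum(g["fp"] + g["fn"] for g in groups)
    rate_sum = sum(
        g["fp"] / (g["fp"] + g["tn"]) + g["fn"] / (g["fn"] + g["tp"])
        for g in groups
    )
    return rate_sum, 1 - errors / total


# Hypothetical balanced data: P_A = P_B = N_A = N_B = 25.
group_a = {"tp": 20, "fn": 5, "tn": 22, "fp": 3}
group_b = {"tp": 15, "fn": 10, "tn": 24, "fp": 1}
rate_sum, accuracy = rate_sum_and_accuracy([group_a, group_b])
```

With 19 errors out of 100, accuracy is 0.81 and the rate sum is 19/25 = 0.76, which matches $4\cdot(1-0.81)$. The identity relies on the balance assumption and does not hold for arbitrary group sizes.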

