# Fairness in Classification

- 📺 **Video:** [https://youtu.be/N4f2-S19LME](https://youtu.be/N4f2-S19LME)

## Overview
This interlude addresses the ethics of classification models, focusing on fairness and bias. The central concern is that NLP classifiers (and ML models in general) can inadvertently become biased against certain groups when the training data reflect societal biases.

```python
import os, random
random.seed(0)
CI = os.environ.get('CI') == 'true'
```

## Key ideas
- Fairness criteria formalize what it means for a classifier to treat groups equitably, for example by requiring equal accuracy or equal false positive rates across demographic groups.
- The lecture references Hutchinson & Mitchell (2018), which surveys 50 years of research on test fairness. The survey emphasizes that concerns about biased decision-making long predate modern ML and that many definitions of fairness exist; no single metric captures everything.
- A well-known concrete case is an AI recruiting tool at Amazon that was found to be biased against women. The tool learned from historical hiring data (mostly from male applicants) and started down-ranking resumes that contained the word “women's” (as in “women's chess club”) or that came from women's colleges.
- By mentioning this example, the video highlights how a seemingly innocuous model can perpetuate historical discrimination if not carefully designed.
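
The two criteria named above (equal accuracy and equal false positive rates across groups) can be checked with a few lines of code. The following is a minimal sketch, not a method taken from the lecture: the helper `group_metrics`, the group names `A` and `B`, and the toy labels are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict

def group_metrics(y_true, y_pred, groups):
    """Per-group accuracy and false positive rate for binary labels (1 = positive)."""
    counts = defaultdict(lambda: {"correct": 0, "total": 0, "fp": 0, "neg": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["total"] += 1
        c["correct"] += int(t == p)
        if t == 0:                      # only true negatives can yield false positives
            c["neg"] += 1
            c["fp"] += int(p == 1)
    return {
        g: {
            "accuracy": c["correct"] / c["total"],
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
        for g, c in counts.items()
    }

# Hypothetical toy data: the classifier is perfect on group A and poor on group B.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

metrics = group_metrics(y_true, y_pred, groups)
accuracy_gap = abs(metrics["A"]["accuracy"] - metrics["B"]["accuracy"])
fpr_gap = abs(metrics["A"]["fpr"] - metrics["B"]["fpr"])

for g, m in sorted(metrics.items()):
    print(g, m)
print("accuracy gap:", accuracy_gap, "FPR gap:", fpr_gap)
```

A gap of zero on one metric does not imply a gap of zero on the other, which is one reason no single metric is treated as sufficient.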

## Demo

```python
print('Try the exercises below and follow the linked materials.')
```
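
As a starting point for the exercises, here is a toy illustration of how a bag-of-words scorer can absorb bias from historical labels. It is a sketch under stated assumptions, not a reconstruction of Amazon's system: the six "resumes", their labels, and the helper `token_log_odds` are invented for this example, and the scoring is a simple smoothed log-odds ratio in the style of Naive Bayes.

```python
import math
from collections import Counter

# Hypothetical "historical hiring" data: (resume text, hired?).
# The labels are deliberately skewed so that resumes mentioning "women's" were rejected.
resumes = [
    ("captain chess club python", 1),
    ("python java developer", 1),
    ("java chess club", 1),
    ("captain women's chess club python", 0),
    ("women's college java developer", 0),
    ("python developer", 1),
]

def token_log_odds(data, alpha=1.0):
    """Smoothed log-odds of each token appearing in hired vs. rejected resumes."""
    pos, neg = Counter(), Counter()
    for text, label in data:
        (pos if label == 1 else neg).update(text.split())
    vocab = set(pos) | set(neg)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    v = len(vocab)
    return {
        w: math.log((pos[w] + alpha) / (n_pos + alpha * v))
        - math.log((neg[w] + alpha) / (n_neg + alpha * v))
        for w in vocab
    }

scores = token_log_odds(resumes)

# Print tokens from most negative to most positive score.
for word, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{word:10s} {score:+.2f}")
```

In this toy data the token “women's” receives the most negative score even though it says nothing about skills: the scorer simply reproduces the pattern in the labels. Changing the labels or adding counter-examples to `resumes` is an easy way to see how the scores shift.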

## Try it
- Modify the demo
- Add a tiny dataset or counter-example


## References
- [Eisenstein 4.2](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Multiclass lecture note](https://www.cs.utexas.edu/~gdurrett/courses/online-course/multiclass.pdf)
- [A large annotated corpus for learning natural language inference](https://www.aclweb.org/anthology/D15-1075/)
- [Authorship Attribution of Micro-Messages](https://www.aclweb.org/anthology/D13-1193/)
- [50 Years of Test (Un)fairness: Lessons for Machine Learning](https://arxiv.org/pdf/1811.10104.pdf)
- [[Article] Amazon scraps secret AI recruiting tool that showed bias against women](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G)
- [[Blog] Neural Networks, Manifolds, and Topology](http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/)
- [Eisenstein Chapter 3.1-3.3](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Dropout: a simple way to prevent neural networks from overfitting](https://dl.acm.org/doi/10.5555/2627435.2670313)
- [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](https://arxiv.org/abs/1502.03167)
- [Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980)
- [The Marginal Value of Adaptive Gradient Methods in Machine Learning](https://papers.nips.cc/paper/2017/hash/81b3833e2504647f9d794f7d7b9bf341-Abstract.html)


*Links only; we do not redistribute slides or papers.*