Ethical Machine Learning
A vignette designed to assist with spotting and preventing proxy bias.
Machine learning systems often inherit biases against protected classes and historically disparaged groups from their training data. Some biased features are straightforward to detect (e.g., age, gender, race), but others are not explicit: they affect a model only through subtle correlations with protected attributes, which makes them hard to spot. The unintended incorporation of bias into a predictive model through such correlated features is called proxy discrimination.
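To make the idea of a proxy concrete, here is a minimal sketch of one way to check whether a seemingly neutral feature encodes a protected attribute. The records, the ZIP code values, the group labels "A" and "B", and the helper names `proxy_strength` and `baseline` are all hypothetical and chosen for illustration; they are not taken from the vignette's data. The check compares how well the protected group can be guessed from the feature alone against always guessing the most common group.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (zip_code, protected_group).
# A model trained on zip_code never sees protected_group directly,
# but zip_code may still encode it.
records = (
    [("10001", "A")] * 45 + [("10001", "B")] * 5 +
    [("20002", "A")] * 10 + [("20002", "B")] * 40
)

def proxy_strength(records):
    """Accuracy of guessing the protected group from the feature alone,
    using the majority group within each feature value."""
    by_value = defaultdict(Counter)
    for value, group in records:
        by_value[value][group] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(records)

def baseline(records):
    """Accuracy of always guessing the overall most common group."""
    groups = Counter(group for _, group in records)
    return groups.most_common(1)[0][1] / len(records)

strength = proxy_strength(records)
base = baseline(records)
print(f"guess group from zip code: {strength:.2f}, baseline: {base:.2f}")
```

In this toy data the feature recovers the protected group far more often than the baseline does, which is the signature of a potential proxy. A gap like this is only a warning sign, not proof that a model trained on the feature will discriminate.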
Example of proxy bias in a bank's loan decision workflow.
In this vignette, we will implement an example machine learning model using decision trees and determine whether its classification of loan recipients is biased against certain groups. We will explore several ways of detecting unintentional bias and removing it from our predictive model.
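As a rough preview of that workflow, the sketch below fits a one-level decision tree (a stump) in plain Python and compares predicted approval rates across groups. Everything in it is an assumption made for illustration: the toy rows, the feature names `zip_flag` and `high_income`, the groups "A" and "B", and the helpers `fit_stump` and `approval_rates`. The stump stands in for a full decision tree library, and the ratio of approval rates is just one commonly used check (often compared against a four-fifths threshold), not necessarily the method used later in this vignette.

```python
from collections import Counter

# Hypothetical toy rows: (features, protected_group, historically_approved).
# protected_group is deliberately kept out of the features.
def rows(n, zip_flag, high_income, group, approved):
    features = {"zip_flag": zip_flag, "high_income": high_income}
    return [(features, group, approved)] * n

data = (
    rows(30, 0, 1, "A", 1) + rows(15, 0, 0, "A", 1) + rows(5, 0, 0, "B", 1) +
    rows(20, 1, 1, "B", 0) + rows(20, 1, 0, "B", 0) +
    rows(5, 1, 1, "A", 1) + rows(5, 1, 0, "A", 0)
)

def fit_stump(data, feature_names):
    """Fit a depth-1 decision tree: pick the binary feature whose split
    misclassifies the fewest training rows, predicting the majority label
    on each side. Assumes every feature takes both values 0 and 1."""
    best = None
    for name in feature_names:
        leaf_labels, errors = {}, 0
        for value in (0, 1):
            labels = Counter(y for x, _, y in data if x[name] == value)
            majority, count = labels.most_common(1)[0]
            leaf_labels[value] = majority
            errors += sum(labels.values()) - count
        if best is None or errors < best[0]:
            best = (errors, name, leaf_labels)
    return best[1], best[2]

def approval_rates(data, feature, leaf_labels):
    """Predicted approval rate for each protected group."""
    approved, total = Counter(), Counter()
    for x, group, _ in data:
        total[group] += 1
        approved[group] += leaf_labels[x[feature]]
    return {g: approved[g] / total[g] for g in total}

feature, leaf_labels = fit_stump(data, ["zip_flag", "high_income"])
rates = approval_rates(data, feature, leaf_labels)
# Ratio of the lowest group approval rate to the highest.
ratio = min(rates.values()) / max(rates.values())
print(feature, rates, round(ratio, 3))
```

Although the protected group is never given to the model, the stump splits on `zip_flag`, and the predicted approval rates for the two groups diverge sharply. This is the pattern we will be looking for: a model that appears neutral on paper but reproduces a historical disparity through a correlated feature.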
Please note: the techniques used in this exercise support only linear models, decision trees, rule lists, and random forests, not deep learning models or neural networks. However, the supported models represent a significant fraction of the models used in practice in predictive systems that operate on personal information, in areas including advertising [2], psychopathy [3], criminal justice [4, 5], and actuarial science [6, 7].