ML-Fairness

This repository contains the source code and data used for the following paper, which appeared at ESEC/FSE 2020.

Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness

Abstract

Machine learning models are increasingly being used in important decision-making software such as approving bank loans, recommending criminal sentencing, hiring employees, and so on. It is important to ensure the fairness of these models so that no discrimination is made between different groups in a protected attribute (e.g., race, sex, age) during decision making. Algorithms have been developed to measure unfairness and mitigate it to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigations on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks, and then evaluated their fairness using a comprehensive set of fairness metrics. Then, we have applied 7 mitigation techniques on these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques result in inducing unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithms also exhibit common patterns: mitigation in the post-processing stage is often costly (in terms of performance), and mitigation in the pre-processing stage is preferred in most cases. We have also presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness-aware algorithms and the software engineering methods to leverage them in practice.
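
To make the idea of a group fairness metric concrete, here is a minimal sketch in plain Python of two commonly used metrics, statistical parity difference and disparate impact. The abstract does not name the metrics used in the study, so the choice of these two metrics, the function names, and the toy data are all assumptions made for illustration. This is not the code from this repository.

```python
from typing import Sequence


def selection_rate(preds: Sequence[int], groups: Sequence[int], group: int) -> float:
    """Fraction of samples in `group` that received the favorable prediction (1)."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group}")
    return sum(selected) / len(selected)


def statistical_parity_difference(preds, groups, unprivileged=0, privileged=1) -> float:
    """Selection rate of the unprivileged group minus that of the privileged group.

    A value of 0 means both groups receive the favorable outcome at the same rate.
    """
    return (selection_rate(preds, groups, unprivileged)
            - selection_rate(preds, groups, privileged))


def disparate_impact(preds, groups, unprivileged=0, privileged=1) -> float:
    """Ratio of the unprivileged selection rate to the privileged selection rate.

    A value of 1 means parity. Raises ZeroDivisionError if the privileged
    group never receives the favorable outcome.
    """
    return (selection_rate(preds, groups, unprivileged)
            / selection_rate(preds, groups, privileged))


# Hypothetical toy data: 1 = favorable prediction; group 1 = privileged.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

spd = statistical_parity_difference(preds, groups)
di = disparate_impact(preds, groups)
print(f"statistical parity difference: {spd:.3f}")
print(f"disparate impact: {di:.3f}")
```

In this toy data the privileged group is selected at a rate of 0.75 and the unprivileged group at 0.25, so the difference is negative and the ratio is below 1. Both indicate that the unprivileged group is favored less often.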

Installation and Usage

Follow the instructions in INSTALL.md to set up the environment and run the source code.

For any concerns, contact the corresponding author, Sumon Biswas [sumon@iastate.edu], or Hridesh Rajan [hridesh@iastate.edu].

DOI of Replication Package

ACM Reference

Biswas, S. and Rajan, H. 2020. Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness. ESEC/FSE’2020: The 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Nov. 2020).

Cite as:

@inproceedings{biswas20machine,
  author = {Sumon Biswas and Hridesh Rajan},
  title = {Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness},
  booktitle = {Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  location = {Virtual Event, USA},
  year = {2020},
  entrysubtype = {conference},
  pages = {642--653},
  numpages = {12},
  url = {https://doi.org/10.1145/3368089.3409704},
}
