Detecting-Gender-Bias-using-Explainability

Abstract:

Explanations for AI systems have been used to improve their trustworthiness. These explanations can also be used to uncover the undesirable implicit biases that machine learning models may rely on for their outputs. We apply this concept to detect gender bias in sentiment analysis models for textual data. Using the Equity Evaluation Corpus (EEC), we vary the gender signal in otherwise identical inputs to the system, and use explanations from LIME and SHAP to find trends of bias and to identify the terms that contribute most to it.
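As a minimal sketch of this workflow (the toy corpus, training pipeline, and example sentences below are illustrative assumptions, not the repository's actual code or data), one can train a small sentiment classifier and ask LIME for per-token contributions on gender-swapped variants of the same sentence:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from lime.lime_text import LimeTextExplainer

# Hypothetical toy corpus standing in for the real training data.
texts = ["he was happy", "she was happy", "he was furious", "she was furious",
         "the movie was wonderful", "the movie was terrible"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = positive, 0 = negative

model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
for sentence in ["He was furious", "She was furious", "They were furious"]:
    # LIME requires a classifier_fn that maps a list of raw strings to
    # class probabilities; the pipeline's predict_proba does exactly that.
    exp = explainer.explain_instance(sentence, model.predict_proba, num_features=5)
    print(sentence, dict(exp.as_list()))  # token -> contribution to "positive"
```

If the gender token ("he", "she", "they") receives a consistently nonzero weight while the rest of the sentence is held fixed, the explanation itself is surfacing the bias.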

Motivation:

Consider three sentences fed into the model that differ only in their gender signal:

  • "He was furious,"
  • "She was furious," and
  • "They were furious."

All three of these sentences should ideally receive the same sentiment polarity from the model. However, because the model learns statistical associations from its training data, it can pick up undesirable associations between gendered words and sentiment, which we observe as gender bias.
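A simple first probe, before turning to explanations at all, is to compare the model's predicted polarity across the variants. The sketch below assumes a hypothetical scikit-learn pipeline and toy training data; any gap between the variants' scores is a first signal of bias:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in corpus; an unbiased model would score all variants identically.
texts = ["he was happy", "she was sad", "he was furious", "she was delighted"]
labels = [1, 0, 0, 1]
model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

variants = ["He was furious", "She was furious", "They were furious"]
probs = model.predict_proba(variants)[:, 1]  # P(positive) for each variant
for sentence, p in zip(variants, probs):
    print(f"{sentence!r}: P(positive) = {p:.3f}")
print("max gap across gender signals:", probs.max() - probs.min())
```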

Gender bias in language models can be harmful in more than one way, given its potential impact on downstream tasks that see wide industrial use.

Objective:

Our aim is to:

  • Address whether Explainable AI methods can be used to detect gender bias in textual data.

  • Present the application of model-agnostic explanation methods to the dataset, and show how these explanations help uncover hidden bias.
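As a sketch of the SHAP side (again under an assumed toy corpus and scikit-learn pipeline, not the repository's exact setup), a model-agnostic explainer with a regex text masker yields comparable per-token attributions:

```python
import shap
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy corpus, as in the sketches above.
texts = ["he was happy", "she was sad", "he was furious", "she was delighted"]
labels = [1, 0, 0, 1]
model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

masker = shap.maskers.Text(tokenizer=r"\W+")  # split sentences on non-word chars
explainer = shap.Explainer(lambda s: model.predict_proba(s)[:, 1], masker)

shap_values = explainer(["He was furious", "She was furious"])
for sv in shap_values:
    # sv.data holds the tokens, sv.values their contributions to P(positive).
    print(dict(zip(sv.data, sv.values)))
```

Comparing the attribution on "He" versus "She" while the remaining tokens stay fixed isolates how much the gender signal alone moves the prediction.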
