Stereotyping_ROCDS

Code and files related to the ROC Data Science Meetup on 12 Nov 2020:
https://www.meetup.com/ROC-Data-Science/events/274237489/

Explores fairness metrics and Shapley explanation techniques for detecting stereotyping and feature bias in models. Shows that fairness metrics do not distinguish stereotyping from decisions based on reasonable factors, demonstrates that the features driving group differences can be isolated with Shapley techniques, and suggests additional tests for analyzing the causes of those differences.

Uses an xgboost model via h2o; currently this is supported only on Linux. To use a random forest model instead (which will work on Windows), change the value of kModelType near the top of 02_models.R.
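For illustration, the switch might look like the sketch below; kModelType is named above, but the exact accepted values are an assumption, so check the comments in 02_models.R itself.

```r
# 02_models.R (near the top) -- illustrative sketch only; the accepted values
# for kModelType are an assumption, so check the file's own comments.
kModelType <- "xgboost"         # h2o XGBoost; currently Linux-only
# kModelType <- "randomforest"  # swap to a random forest model to run on Windows
```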

To run the code, do the following:

  1. Install h2o (see http://h2o-release.s3.amazonaws.com/h2o/rel-zermelo/1/index.html)
  2. Open the project 202010_fairness.Rproj in RStudio (if not using RStudio, set your working directory to the folder containing the project)
  3. Edit the file 00_setup.R, setting kOutputDir to a writeable directory on your machine
  4. Run the file 00_run_all.R (a sketch of steps 3 and 4 follows this list)
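
As a sketch of steps 3 and 4: kOutputDir comes from 00_setup.R, and the path shown is only a placeholder.

```r
# In 00_setup.R, point the output directory at a writeable location (placeholder path):
kOutputDir <- "~/fairness_output"

# Then, with the working directory set to the project folder, run the full pipeline:
source("00_run_all.R")
```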

Because exact Shapley values are calculated, runtimes are long. The number of samples analyzed can be reduced to speed up the scripts.
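
The README does not name the sampling parameter, so the snippet below is only a hypothetical illustration of the idea: explain a smaller sample of rows so the exact Shapley step finishes faster.

```r
# Hypothetical illustration only -- the names here are placeholders, not the repo's.
scoring_set <- data.frame(x1 = rnorm(5000), x2 = rnorm(5000))    # stand-in data
n_explain   <- 200                                               # rows to explain; lower = faster
explain_set <- scoring_set[sample(nrow(scoring_set), n_explain), ]
```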

Towards Data Science Article Code

For "Fairness Metrics Won't Save You from Stereotyping", you only need to run scripts 00, 01, 02, 03, and 05.

For "No Free Lunch with Feature Bias" and "How to Fix Feature Bias", run scripts 00, 01, 02, 04, 06, and 07.
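
If you want a convenience wrapper, something like the sketch below selects scripts by their two-digit numeric prefix. This helper is not part of the repository and assumes each script's file name begins with its number.

```r
# Sketch only -- run_scripts() is not part of the repository. It sources the
# numbered scripts matching the given prefixes, skipping the 00_run_all.R driver.
run_scripts <- function(prefixes) {
  files <- setdiff(list.files(pattern = "^[0-9]{2}_.*\\.R$"), "00_run_all.R")
  keep  <- sort(files[substr(files, 1, 2) %in% prefixes])
  for (f in keep) source(f)
}

# "Fairness Metrics Won't Save You from Stereotyping"
run_scripts(c("00", "01", "02", "03", "05"))

# "No Free Lunch with Feature Bias" / "How to Fix Feature Bias"
# run_scripts(c("00", "01", "02", "04", "06", "07"))
```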
