sivansabato/unfairness

unfairness_inferences

This archive includes code and data for the manuscript: "Fairness and Unfairness in Binary and Multiclass Classification: Quantifying, Calculating, and Bounding" by Sivan Sabato, Eran Treister and Elad Yom-Tov.

The code is released under the MIT Open Source license.

*.m files are Matlab code files, which we ran using Matlab R2021b. *.py files are Python 3.9 code files.

List of provided files

Implementation files:

  • README - this file
  • find_lb.m - the implementation of Alg. 1 from the paper. Type "help find_lb" for input and output arguments.

Experiment files for binary classification

The following scripts use the data file USCensus1990raw.data.mat, which can be downloaded from the following link: https://archive.ics.uci.edu/ml/datasets/US+Census+Data+(1990)

  • census_commands.m - the script for running the experiments for binary classification with beta = 1 on the US Census data set.
  • census_commands_beta.m - the script for running the experiments for binary classification with variable beta on the US Census data set.
  • cancer.m - a script for running the experiments on the classifiers generated from search-engine data.
  • poll_run.m - a script for running the poll experiments. This script requires downloading the relevant data; see the explanations on how to do this in the comment at the top of the script.
  • cancer_mortality.m - a script for running the cancer mortality experiments. This script requires downloading the relevant data; see the explanations on how to do this in the comment at the top of the script.
  • simpleClassifierAnalysis.m - an auxiliary file
  • generate_compas_classifier.py - generates the classifier for the COMPAS dataset and saves the information into a csv file.
  • compas.m - generates output for the COMPAS experiments
  • adultRunExp.m - generates output for the Adult experiments

Data files for binary classification

  • cancer_data.m - the statistics of the classifiers generated by the search-engine data in the first reported experiment. Used by the cancer.m script

Auxiliary files used by the files above:

  • calcunfairness.m
  • calculate_classifier.m
  • find_best_alphas.m

Data files used to generate paper graphs:

The following files contain the experiment results that were used to generate the graphs in the paper.

  • census_commands_tree.csv: results of census_commands.m with decision tree classifiers
  • census_commands_linear.csv: results of census_commands.m with linear classifiers
  • discdata_tree.csv: results of census_commands_beta.m with decision tree classifiers
  • discdata_linear.csv: results of census_commands_beta.m with linear classifiers
  • cancer.csv: results of cancer.m
  • polls.csv: results of poll_run.m
  • cancer_mortality.csv: results of cancer_mortality.m
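
The column layout of these result files is not documented in this README, so a quick inspection step can help before plotting. The following is a minimal sketch, assuming the files sit in the current directory; the helper names `summarize_csv` and `summarize_results` are hypothetical and not part of the archive, and whether the first row of each file is a header is an assumption left for the reader to check.

```python
import csv
from pathlib import Path

# File names as listed above; their column layout is not documented here.
RESULT_FILES = [
    "census_commands_tree.csv",
    "census_commands_linear.csv",
    "discdata_tree.csv",
    "discdata_linear.csv",
    "cancer.csv",
    "polls.csv",
    "cancer_mortality.csv",
]


def summarize_csv(path):
    """Return (first_row, number_of_remaining_rows) for one CSV file.

    Hypothetical helper: the first row is returned as-is, because it is
    an assumption that it holds column names rather than data.
    """
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return [], 0
    return rows[0], len(rows) - 1


def summarize_results(directory="."):
    """Summarize every listed results file that exists in `directory`."""
    summaries = {}
    for name in RESULT_FILES:
        path = Path(directory) / name
        if path.is_file():  # silently skip files that were not downloaded
            summaries[name] = summarize_csv(path)
    return summaries


if __name__ == "__main__":
    for name, (first_row, n_rows) in summarize_results().items():
        print(f"{name}: first row {first_row}, {n_rows} further rows")
```

Running the script prints one line per file found, which is enough to decide whether the first row should be treated as a header when loading the data for plotting.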

Experiment files for multiclass classification:

  • runexps.py - the main python script for running the experiments. It uses the modules in the following files:
    • localmin.py
    • solve_large.py
    • load_mat_params.py
  • run_census_multiclass.m - the script for generating the multiclass classifiers for the US Census multiclass experiments. This script uses the data file USCensus1990raw.data.mat (see above on how to obtain it).
  • Files for generating the input for the Natality data set experiments:
    • runner_read_births_data.m - reads the relevant data from the input data set file. This file can be downloaded here: https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/DVS/natality/Nat2017us.zip
    • runner_test_train_split_births.m - splits the data into train and test
    • runner_model_births.m - generates a classifier from the variables generated by the previous scripts. The classifier type is determined by the variable 'classifier_type'.
    • get_labor_params.m - generates a data file for the experiments from the classifier file generated by the previous script.
  • education.py - reads the US Education data file (downloaded from here: https://data.ers.usda.gov/reports.aspx?ID=17829) and saves a data file to be used in the experiments.
  • ukelections.py - reads the UK elections data file (downloaded from here: https://commonslibrary.parliament.uk/research-briefings/cbp-8647/) and saves a data file to be used in the experiments.
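
The four Natality files listed above are described as a sequence, each step consuming the output of the previous one. As a rough sketch of that ordering, the Python snippet below builds one `matlab -batch` command per step and, by default, only prints them. It assumes that `matlab` is on the PATH, that Nat2017us.zip has already been downloaded and unpacked where the first script expects it, and that every step can be invoked by name without arguments. That last point is an assumption, in particular for get_labor_params.m, which may expect inputs. The names `build_commands` and `run_pipeline` are hypothetical helpers, not part of the archive.

```python
import shutil
import subprocess

# Order taken from the file list above. Running each step as a
# no-argument script is an assumption; get_labor_params.m in
# particular may expect arguments.
NATALITY_STEPS = [
    "runner_read_births_data",
    "runner_test_train_split_births",
    "runner_model_births",
    "get_labor_params",
]


def build_commands(steps=NATALITY_STEPS, matlab="matlab"):
    """Build one `matlab -batch <name>` command per step, in order."""
    return [[matlab, "-batch", step] for step in steps]


def run_pipeline(dry_run=True):
    """Print the commands; execute them only when dry_run is False."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            if shutil.which(command[0]) is None:
                raise RuntimeError("matlab executable not found on PATH")
            # check=True stops the pipeline at the first failing step
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Remember to set the variable 'classifier_type' in runner_model_births.m before running the third step, since it determines which classifier is generated.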

Auxiliary files used by the files above:

  • census_commands_multiclass.m - a script used by run_census_multiclass.m.
  • calculate_classifier_multiclass.m - a script used by run_census_multiclass.m.
