# Examples of how to use `DiscriminationMitigator`

In [1]:
import json
import numpy as np
import pandas as pd
import tensorflow as tf
from DiscriminationMitigation import *
from sklearn.model_selection import train_test_split
pd.set_option('display.max_rows', 20)
pd.set_option('display.max_columns', 100)

In [2]:
def simple_synth(n=10000, class_probab=0.5, gamma0=4, gamma1=6, alpha0=2, alpha1=1, beta0=1, beta1=1):

    np.random.seed(123)

    # Protected class variable
    c1 = np.random.binomial(1, p=class_probab, size=n) # group 1
    c0 = 1-c1 # group 0

    # Other covariate
    w = gamma0*c0 + gamma1*c1 + np.random.normal(0, 0.5, size=n) # linear function of class & shock

    # Outcome variable
    y = alpha0*c0 + alpha1*c1 + beta0*c0*w + beta1*c1*w + np.random.normal(0, 0.5, size=n)

    return pd.DataFrame([y, c0, c1, w]).T.rename(columns={0:'y', 1: 'c0', 2: 'c1', 3: 'w'})

#### Instantiate some synthetic data

In [3]:
synth = simple_synth()
synth['z'] = np.random.randint(low=1, high=5, size=len(synth)) # add higher-dimensional protected class
synth['a'] = np.random.randint(low=1, high=2, size=len(synth)) # constant 1 (high is exclusive); 'b' and 'c' below are random, uncorrelated features
synth['b'] = np.random.randint(low=1, high=15, size=len(synth))
synth['c'] = np.random.randint(low=5, high=20, size=len(synth))
print(synth.head())
print("\n", synth.shape)

          y   c0   c1         w  z  a   b   c
0  7.383773  0.0  1.0  6.479200  2  1  10  18
1  6.255114  1.0  0.0  4.230080  3  1  11  14
2  5.614841  1.0  0.0  3.773609  4  1   4  19
3  7.692184  0.0  1.0  6.553467  4  1  13  19
4  7.440835  0.0  1.0  6.139432  1  1  12  19

 (10000, 8)


#### Get example configuration files

In [4]:
with open('example_config.json') as j:
    config = json.load(j)

print("Example configuration dictionary: \n", config)

Example configuration dictionary: 
 {'protected_class_features': ['c0', 'c1', 'z'], 'target_feature': ['y']}


#### Get example marginal weights

In [5]:
with open('example_weights.json') as j:
    weights = json.load(j)
print(weights)

{'z': {'1': 0.9, '2': 0.02, '3': 0.04, '4': 0.04}}


The `weights` parameter lets users supply a dictionary of custom marginal distributions for each protected class feature. In the example above, suppose you want to know how predictions would change if the share of individuals coded 1 in protected class feature 'z' were 0.9 rather than ~0.25. You would simply change the values in the weights dictionary. Importantly, the marginals for each feature must sum to 1.

Further, in this example 'c0' and 'c1' are one-hot vectors for a binary random variable. If you alter the share of one group in 'c0', you must also apply the complementary share to 'c1' so that the marginals reflect mutual exclusivity. If `DiscriminationMitigator` detects that two variables may be one-hot vectors (i.e. they are extremely correlated), it raises a warning, but it *will not* enforce that the marginals of adjacent one-hot vectors are indeed complements.
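Because the tool does not enforce these constraints, it can help to check a weights dictionary before passing it in. The following is a minimal sketch, assuming a hypothetical helper `check_marginals` that is not part of `DiscriminationMitigation`; the `one_hot_pairs` argument is likewise an assumption made for illustration.

```python
import math


def check_marginals(weights, one_hot_pairs=(), tol=1e-9):
    """Hypothetical helper (not part of DiscriminationMitigation).

    Returns a list of human-readable problems found in `weights`.
    """
    problems = []

    # Each feature's marginal distribution must sum to 1.
    for feature, marginals in weights.items():
        total = sum(marginals.values())
        if not math.isclose(total, 1.0, abs_tol=tol):
            problems.append(f"{feature}: marginals sum to {total}, not 1")

    # For a pair of one-hot columns, the share coded 1 in the first
    # must equal the share coded 0 in the second.
    for first, second in one_hot_pairs:
        if first in weights and second in weights:
            # Keys may be '1', 1 or 1.0, so normalise them to float first.
            w_first = {float(k): v for k, v in weights[first].items()}
            w_second = {float(k): v for k, v in weights[second].items()}
            if not math.isclose(w_first.get(1.0, 0.0), w_second.get(0.0, 0.0), abs_tol=tol):
                problems.append(f"{first}/{second}: marginals are not complementary")

    return problems


good = {'c0': {0: 0.1, 1: 0.9}, 'c1': {0: 0.9, 1: 0.1},
        'z': {'1': 0.9, '2': 0.02, '3': 0.04, '4': 0.04}}
bad = {'c0': {0: 0.1, 1: 0.9}, 'c1': {0: 0.5, 1: 0.5}}

good_problems = check_marginals(good, one_hot_pairs=[('c0', 'c1')])
bad_problems = check_marginals(bad, one_hot_pairs=[('c0', 'c1')])
```

Here `good_problems` is empty, while `bad_problems` flags the 'c0'/'c1' pair, whose marginals each sum to 1 but are not complementary.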

Additionally, JSON requires all keys to be strings, both feature names and category values (such as '1' above), so make sure yours are. `DiscriminationMitigator` converts the category values back to their correct numeric format, so you need not do so yourself.
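The string-key behaviour is easy to see with the standard `json` module. The sketch below is only an illustration: `numeric_keys` is a hypothetical helper showing one way to restore numeric keys, not the conversion `DiscriminationMitigator` performs internally.

```python
import json


def numeric_keys(weights):
    """Hypothetical helper: turn category keys such as '1' back into floats."""
    return {feature: {float(k): v for k, v in marginals.items()}
            for feature, marginals in weights.items()}


original = {'z': {1: 0.9, 2: 0.02, 3: 0.04, 4: 0.04}}

# Serialising and parsing turns the integer category keys into strings.
round_tripped = json.loads(json.dumps(original))

# Converting the keys back gives a dictionary equal to the original,
# because 1.0 and 1 compare (and hash) equal in Python.
restored = numeric_keys(round_tripped)
```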

#### Split the data into train, validation, and test sets

In [6]:
# Train (and val) / test split
X_train, X_test, y_train, y_test = train_test_split(synth.loc[:, ~synth.columns.isin(config['target_feature'])],
                                                    synth[config['target_feature']], random_state=123,
                                                    test_size=500)

# Train / val split
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, random_state=123, test_size=0.2)

for x in X_train, X_val, X_test:
    print(x.shape)

(7600, 7)
(1900, 7)
(500, 7)


#### Train a TensorFlow Keras Sequential deep learning model

In [7]:
tf.keras.backend.clear_session()
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(7,)))  # 7 input features; shape must be a tuple
model.add(tf.keras.layers.Dense(8))
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.Dense(16))
model.add(tf.keras.layers.Dropout(0.1))
model.add(tf.keras.layers.Dense(1))

model.compile(optimizer='adam', loss='mse')

print(model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 8)                 64        
_________________________________________________________________
dropout (Dropout)            (None, 8)                 0         
_________________________________________________________________
dense_1 (Dense)              (None, 16)                144       
_________________________________________________________________
dropout_1 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 17        
Total params: 225
Trainable params: 225
Non-trainable params: 0
_________________________________________________________________
None


In [8]:
model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(X_val, y_val))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x180593a7048>

## Discrimination mitigation tool

#### Example 1: Use just the marginals from `df`

In [9]:
ex1 = DiscriminationMitigator(df=X_test, model=model, config=config, train=None, weights=None).predictions()

Across the following examples, the unadjusted predictions (`unadj_pred`) and the uniformly weighted predictions (`unif_wts`) stay the same. Population weights (`pop_wts`) in this case are the marginal distributions of each protected class feature in `df`.
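To make "marginal distribution" concrete, the sketch below computes the share of each category in a toy column. It uses only the standard library and made-up values; `marginals` is a hypothetical function name, not something exported by `DiscriminationMitigation`.

```python
from collections import Counter


def marginals(values):
    """Share of each category in `values`, keyed by category."""
    counts = Counter(values)
    n = len(values)
    return {category: count / n for category, count in sorted(counts.items())}


# Toy protected class column with four categories (made-up values).
z = [2, 3, 4, 4, 1]
z_marginals = marginals(z)
```

With pandas, the equivalent for a real column would be `X_test['z'].value_counts(normalize=True)`.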

In [10]:
print("Dataframe of predictions: \n", ex1.reset_index(drop=True).head())
print("\nStatistical moments: \n", ex1.describe())
print("\nCorrelation matrix of predictions: \n", ex1.corr())

Dataframe of predictions: 
    unadj_pred  unif_wts   pop_wts
0    6.661998  6.676531  6.653672
1    6.271629  6.202929  6.207815
2    5.450657  5.519357  5.514553
3    6.695879  6.654924  6.650561
4    6.580443  6.594976  6.572117

Statistical moments: 
        unadj_pred    unif_wts     pop_wts
count  500.000000  500.000000  500.000000
mean     6.141468    6.141298    6.141468
std      0.559403    0.533642    0.526824
min      4.807796    4.876495    4.871692
25%      5.643587    5.679643    5.690591
50%      6.155550    6.154496    6.148515
75%      6.611161    6.593558    6.572833
max      7.422514    7.353815    7.358699

Correlation matrix of predictions: 
             unadj_pred  unif_wts   pop_wts
unadj_pred    1.000000  0.998369  0.998899
unif_wts      0.998369  1.000000  0.999762
pop_wts       0.998899  0.999762  1.000000


#### Example 2: Use the marginals from another dataset (`train`)

In [11]:
ex2 = DiscriminationMitigator(df=X_test, model=model, config=config, train=X_train, weights=None).predictions()

When the training set (or another dataset) is considerably larger and potentially more representative of the population than `df`, you may want to reweight the adjusted predictions in `pop_wts` to the marginals of all protected class features in this other dataset.

In [12]:
print("Dataframe of predictions: \n", ex2.reset_index(drop=True).head())
print("\nStatistical moments: \n", ex2.describe())
print("\nCorrelation matrix of predictions: \n", ex2.corr())

Dataframe of predictions: 
    unadj_pred  unif_wts   pop_wts
0    6.661998  6.676531  6.654327
1    6.271629  6.202929  6.208470
2    5.450657  5.519357  5.515208
3    6.695879  6.654924  6.651216
4    6.580443  6.594976  6.572772

Statistical moments: 
        unadj_pred    unif_wts     pop_wts
count  500.000000  500.000000  500.000000
mean     6.141468    6.141298    6.142123
std      0.559403    0.533642    0.526824
min      4.807796    4.876495    4.872347
25%      5.643587    5.679643    5.691246
50%      6.155550    6.154496    6.149170
75%      6.611161    6.593558    6.573488
max      7.422514    7.353815    7.359354

Correlation matrix of predictions: 
             unadj_pred  unif_wts   pop_wts
unadj_pred    1.000000  0.998369  0.998899
unif_wts      0.998369  1.000000  0.999762
pop_wts       0.998899  0.999762  1.000000


In [13]:
compare_pop = pd.concat([ex1['pop_wts'].rename('df'), ex2['pop_wts'].rename('train')], axis=1)
print("Compare population weights from 'df' vs. 'train':")
print(compare_pop.describe())

Compare population weights from 'df' vs. 'train':
               df       train
count  500.000000  500.000000
mean     6.141468    6.142123
std      0.526824    0.526824
min      4.871692    4.872347
25%      5.690591    5.691246
50%      6.148515    6.149170
75%      6.572833    6.573488
max      7.358699    7.359354


#### Example 3: Use the marginals from another dataset and use custom weights

In [14]:
ex3 = DiscriminationMitigator(df=X_test, model=model, config=config, train=X_train, weights=weights).predictions()


If no category is omitted, users must ensure custom marginal weights for one-hot vectors align correctly.


You may also want to reweight predictions to ask 'what-if' questions: i.e. what if the share of group *x* were different from its observed share in the data? Providing a dictionary of marginal distributions to `weights` allows for this.
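A toy calculation can show why changing the marginals changes a prediction. The sketch below assumes that an adjusted prediction is a marginal-weighted average of counterfactual predictions, one per category of the protected feature. That assumption, the function name `reweighted_prediction`, and all numbers are illustrative and are not taken from the library's source.

```python
def reweighted_prediction(preds_by_category, marginals):
    """Weighted average of one person's counterfactual predictions.

    `preds_by_category` maps each category to the prediction obtained
    when the protected feature is set to that category (assumed setup).
    """
    return sum(marginals[c] * preds_by_category[c] for c in preds_by_category)


# Made-up predictions for one person if 'z' were set to 1, 2, 3 or 4.
preds = {1: 6.0, 2: 6.2, 3: 6.4, 4: 6.6}

uniform = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
custom = {1: 0.9, 2: 0.02, 3: 0.04, 4: 0.04}

pred_uniform = reweighted_prediction(preds, uniform)  # about 6.3
pred_custom = reweighted_prediction(preds, custom)    # about 6.044
```

Putting 90% of the weight on category 1 pulls the weighted prediction toward that category's counterfactual value.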

In [15]:
print("Custom weights:", weights, "\n")
print("Dataframe of predictions: \n", ex3.reset_index(drop=True).head())
print("\nStatistical moments: \n", ex3.describe())
print("\nCorrelation matrix of predictions: \n", ex3.corr())

Custom weights: {'z': {'1': 0.9, '2': 0.02, '3': 0.04, '4': 0.04}} 

Dataframe of predictions: 
    unadj_pred  unif_wts   pop_wts  cust_wts
0    6.661998  6.676531  6.654327  6.674205
1    6.271629  6.202929  6.208470  6.117371
2    5.450657  5.519357  5.515208  5.462864
3    6.695879  6.654924  6.651216  6.597110
4    6.580443  6.594976  6.572772  6.592651

Statistical moments: 
        unadj_pred    unif_wts     pop_wts    cust_wts
count  500.000000  500.000000  500.000000  500.000000
mean     6.141468    6.141298    6.142123    6.069887
std      0.559403    0.533642    0.526824    0.555547
min      4.807796    4.876495    4.872347    4.820003
25%      5.643587    5.679643    5.691246    5.589363
50%      6.155550    6.154496    6.149170    6.094118
75%      6.611161    6.593558    6.573488    6.558893
max      7.422514    7.353815    7.359354    7.268256

Correlation matrix of predictions: 
             unadj_pred  unif_wts   pop_wts  cust_wts
unadj_pred    1.000000  0.998369  0.99

#### Example 4: Reweighting multiple features at once - avoid this!

In [16]:
new_weights = {'c0': {0.0: 0.1, 1.0: 0.9}, 'c1': {0.0: 0.9, 1.0: 0.1}, 'z': {1.0: 0.9, 2.0: 0.02, 3.0: 0.04, 4.0: 0.04}}
ex4 = DiscriminationMitigator(df=X_test, model=model, config=config, train=X_train, weights=new_weights).predictions()


If no category is omitted, users must ensure custom marginal weights for one-hot vectors align correctly.


Though `DiscriminationMitigator` does not forbid it, we discourage users from reweighting multiple protected class features at the same time, because doing so attenuates the effect of each individual reweighted feature. Each reweighted feature produces an *N* x 1 vector, so the preceding call creates 3 counterfactual vectors, which are then averaged for each person. Even though the marginals are weighted differently, this averaging may produce very similar predictions in `pop_wts` and `cust_wts`, so it is best avoided. In the output below, for example, `cust_wts` has the same standard deviation as `pop_wts` (0.526824) and a mean only about 0.05 lower.
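The dilution caused by the averaging step can be illustrated with a toy example. This sketch only mimics the averaging described above with made-up numbers; `average_across_features` is a hypothetical name and the code is not the library's implementation.

```python
def average_across_features(vectors):
    """Element-wise mean of equal-length per-feature vectors."""
    n_features = len(vectors)
    return [sum(vals) / n_features for vals in zip(*vectors)]


# Made-up counterfactual vectors for three people, one vector per
# protected class feature, under population weights.
pop = {'c0': [6.0, 5.0, 7.0], 'c1': [6.0, 5.0, 7.0], 'z': [6.0, 5.0, 7.0]}

# Suppose custom weights lower the 'z' vector by 0.3 for everyone.
cust = dict(pop, z=[v - 0.3 for v in pop['z']])

pop_wts = average_across_features(list(pop.values()))
cust_wts = average_across_features(list(cust.values()))

# After averaging over 3 features, the 0.3 shift shrinks to about 0.1.
shift = [c - p for c, p in zip(cust_wts, pop_wts)]
```

A change confined to one feature's vector is divided by the number of vectors being averaged, which is one way the final columns can end up close together.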

In [17]:
print("Custom weights:", new_weights, "\n")
print("Dataframe of predictions: \n", ex4.reset_index(drop=True).head())
print("\nStatistical moments: \n", ex4.describe())
print("\nCorrelation matrix of predictions: \n", ex4.corr())

Custom weights: {'c0': {0.0: 0.1, 1.0: 0.9}, 'c1': {0.0: 0.9, 1.0: 0.1}, 'z': {1.0: 0.9, 2.0: 0.02, 3.0: 0.04, 4.0: 0.04}} 

Dataframe of predictions: 
    unadj_pred  unif_wts   pop_wts  cust_wts
0    6.661998  6.676531  6.654327  6.601067
1    6.271629  6.202929  6.208470  6.155210
2    5.450657  5.519357  5.515208  5.461948
3    6.695879  6.654924  6.651216  6.597956
4    6.580443  6.594976  6.572772  6.519513

Statistical moments: 
        unadj_pred    unif_wts     pop_wts    cust_wts
count  500.000000  500.000000  500.000000  500.000000
mean     6.141468    6.141298    6.142123    6.088863
std      0.559403    0.533642    0.526824    0.526824
min      4.807796    4.876495    4.872347    4.819087
25%      5.643587    5.679643    5.691246    5.637986
50%      6.155550    6.154496    6.149170    6.095911
75%      6.611161    6.593558    6.573488    6.520228
max      7.422514    7.353815    7.359354    7.306095

Correlation matrix of predictions: 
             unadj_pred  unif_wts   