<a href="https://colab.research.google.com/github/viguardieiro/moopt_fairness/blob/master/Fairness_with_hyperparameter_tunning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
cd /content/drive/My Drive/Vitória - TCC/Notebooks

/content/drive/My Drive/Vitória - TCC/Notebooks


# Fairness by reweighing with sample_weight in Sklearn

In [3]:
!apt-get install coinor-cbc

Reading package lists... Done
Building dependency tree       
Reading state information... Done
coinor-cbc is already the newest version (2.9.9+repack1-1).
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 59 not upgraded.


In [4]:
!pip install git+https://github.com/viguardieiro/moopt
!pip install sklego
!pip install mip
!pip install optuna
!pip install line_profiler

Collecting git+https://github.com/viguardieiro/moopt
  Cloning https://github.com/viguardieiro/moopt to /tmp/pip-req-build-xg68onr5
  Running command git clone -q https://github.com/viguardieiro/moopt /tmp/pip-req-build-xg68onr5
Building wheels for collected packages: moopt
  Building wheel for moopt (setup.py) ... done
  Created wheel for moopt: filename=moopt-0.0.1-cp36-none-any.whl size=28693 sha256=b4fd70c5d1f16e5460619f8b15c710692ff4224b7a09a2412e5c06d73619dc65
  Stored in directory: /tmp/pip-ephem-wheel-cache-n53g7shd/wheels/20/6d/9f/d08c62ac9635e87e332fb12c8077ae4044ff5dc84cf1d9253f
Successfully built moopt


In [5]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklego.metrics import equal_opportunity_score
from sklego.metrics import p_percent_score

In [6]:
%load_ext autoreload
%autoreload 2
%load_ext line_profiler

In [7]:
from sklearn.metrics import log_loss
from sklearn.utils.extmath import squared_norm
from moopt.scalarization_interface import scalar_interface, single_interface, w_interface
from moopt import monise
import numpy as np

class FairScalarization(w_interface, single_interface, scalar_interface):
    def __init__(self, X, y, fair_feature):
        self.fair_feature = fair_feature
        self.fair_att = sorted(X[fair_feature].unique())
        self.__M = len(self.fair_att)+1
        self.X, self.y = X, y

    @property
    def M(self):
        return self.__M

    @property
    def feasible(self):
        return True

    @property
    def optimum(self):
        return True

    @property
    def objs(self):
        return self.__objs

    @property
    def x(self):
        return self.__x

    @property
    def w(self):
        return self.__w

    def optimize(self, w):
        """Solve one weighted-sum scalarization for the weight vector w."""
        if type(w) is int:
            # An integer selects a single objective (one-hot weight vector).
            self.__w = np.zeros(self.M)
            self.__w[w] = 1
        elif type(w) is np.ndarray and w.ndim==1 and w.size==self.M:
            self.__w = w
        else:
            raise ValueError('w is in the wrong format')
        #print('w', self.__w)
            
        if self.__w[-1]==0:
            lambd=10**-20
        elif self.__w[-1]==1:
            lambd=10**20
        else:
            lambd = self.__w[-1]/(1-self.__w[-1])
        fair_weight = self.__w[:-1]*(1+lambd)
        
        sample_weight = self.X[self.fair_feature].replace({ff:fw for ff, fw in zip(self.fair_att,fair_weight)})
        #sample_weight = self.X[self.fair_feature].replace({ff:fw/sum(X[self.fair_feature]==ff) for ff, fw in zip(self.fair_att,fair_weight)})
        reg = LogisticRegression(multi_class='multinomial', solver='lbfgs',
                                 penalty='l2', max_iter=10**6, tol=10**-6, 
                                 C=1/lambd).fit(self.X, self.y, sample_weight=sample_weight)
        
        y_pred = reg.predict_proba(self.X)
        
        self.__objs = np.zeros(len(self.fair_att)+1)
        # Objective i is the summed log loss over the samples of group i.
        for i, feat in enumerate(self.fair_att):
            fair_weight = np.zeros(len(self.fair_att))
            fair_weight[i] = 1
            sample_weight = self.X[self.fair_feature].replace({ff:fw for ff, fw in zip(self.fair_att,fair_weight)})
            self.__objs[i] = log_loss(self.y, y_pred, sample_weight=sample_weight)*sum(self.X[self.fair_feature]==feat)
            #self.__objs[i] = log_loss(y, y_pred, sample_weight=sample_weight)
        
        self.__objs[-1] = squared_norm(reg.coef_)
        self.__x = reg
        #print('objs', self.__objs)
        return self
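
The arithmetic that turns the weight vector `w` into scikit-learn parameters in `optimize` can be read as follows, assuming scikit-learn's documented L2 objective `C * sum_i s_i * loss_i + 0.5 * ||theta||^2`. With `lambd = w[-1] / (1 - w[-1])`, `C = 1 / lambd` and per-group sample weights `w[:-1] * (1 + lambd)`, multiplying that objective by `w[-1]` gives `sum_g w[g] * L_g + 0.5 * w[-1] * ||theta||^2`, where `L_g` is the summed log loss of group `g`. A minimal sketch of this mapping is shown below; `weights_to_sklearn_params` is a hypothetical helper for illustration and is not part of `moopt`.

```python
import numpy as np

def weights_to_sklearn_params(w):
    """Map a scalarization vector w to (per-group sample weights, C).

    Hypothetical helper mirroring the arithmetic in FairScalarization.optimize.
    The last entry of w weights the L2 penalty; the others weight the groups.
    """
    w = np.asarray(w, dtype=float)
    w_reg = w[-1]
    if w_reg == 0:
        lambd = 1e-20   # (almost) no regularization
    elif w_reg == 1:
        lambd = 1e20    # (almost) only regularization
    else:
        lambd = w_reg / (1 - w_reg)
    group_weights = w[:-1] * (1 + lambd)
    C = 1 / lambd
    return group_weights, C

group_weights, C = weights_to_sklearn_params([0.3, 0.2, 0.5])
# Rescaling sklearn's data term (C * sample weights) by w[-1] recovers w[:-1].
recovered = group_weights * C * 0.5
print(recovered)
```

The edge cases `w[-1] == 0` and `w[-1] == 1` are clipped to very small and very large `lambd`, as in the class above, so that `C` stays finite and positive.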

In [8]:
mydata= pd.read_csv("Datasets/german_credit_data.csv")

Credit application data. This is one of the most widely used datasets in fairness tutorials, for example in the [$aif360$](https://github.com/IBM/AIF360/blob/master/examples/README.md) library. The original dataset is available [here](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)), but I used [this version](https://www.kaggle.com/kabure/german-credit-data-with-risk) because it comes as a CSV file with headers, although it omits some information from the original dataset.

The dataset originally has 1000 rows, but many of them contain NaN values; 522 rows remain after dropping the rows with missing information.

In [9]:
mydata.head()

Unnamed: 0.1,Unnamed: 0,Age,Sex,Job,Housing,Saving accounts,Checking account,Credit amount,Duration,Purpose,Risk
0,0,67,male,2,own,,little,1169,6,radio/TV,good
1,1,22,female,2,own,little,moderate,5951,48,radio/TV,bad
2,2,49,male,1,own,little,,2096,12,education,good
3,3,45,male,2,free,little,little,7882,42,furniture/equipment,good
4,4,53,male,2,free,little,little,4870,24,car,bad


In [10]:
mydata = mydata.drop(['Unnamed: 0', 'Purpose'], axis=1)

In [11]:
mydata = mydata.dropna()

In [12]:
mapping_Sex = {'male': 0, 'female': 1}
mapping_Housing = {'free': 1, 'rent': 2, 'own': 3}
mapping_Savings = {'little': 1, 'moderate': 2, 'quite rich': 3, 'rich': 4}
mapping_Checking = {'little': 1, 'moderate': 2, 'rich': 3}
mapping_Risk = {"bad": -1, "good": 1}

numerical_data = mydata.replace({'Sex': mapping_Sex, 'Housing': mapping_Housing, 'Saving accounts': mapping_Savings,
                'Checking account':mapping_Checking, 'Risk': mapping_Risk})

In [13]:
X = numerical_data.drop(['Risk'], axis=1)

In [14]:
y = numerical_data['Risk']

In [15]:
X_tv, X_test, y_tv, y_test = train_test_split(X, y, test_size=100)
X_train, X_val, y_train, y_val = train_test_split(X_tv, y_tv, test_size=100)

In [16]:
import optuna, sklearn, sklearn.metrics

In [17]:
class MOOLogisticRegression():
    def __init__(self, X_train, y_train, X_val, y_val, metric='accuracy'):
        self.X_train = X_train
        self.y_train = y_train
        self.X_val = X_val
        self.y_val = y_val
        self.best_perf = 0
        self.best_model = None
        self.metric = metric

    def tune(self):
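        # NOTE: both scalarizations are built from the notebook-level X and y (the full dataset),
        # not from self.X_train / self.y_train, so the fitted models also see validation and test rows.
        moo_ = monise(weightedScalar=FairScalarization(X, y, 'Sex'), singleScalar=FairScalarization(X, y, 'Sex'),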
                      nodeTimeLimit=2, targetSize=150,
                      targetGap=0, nodeGap=0.01, norm=False)
        moo_.optimize()
        for solution in moo_.solutionsList:
            y_pred = solution.x.predict(self.X_val)
            
            # Skip degenerate solutions where any of the three scores is zero.
            if (sklearn.metrics.accuracy_score(self.y_val, y_pred)==0 or
                equal_opportunity_score(sensitive_column="Sex")(solution.x, self.X_val, self.y_val)==0 or
                p_percent_score(sensitive_column="Sex")(solution.x, self.X_val)==0):
                continue
            
            if self.metric=='accuracy':
                perf = sklearn.metrics.accuracy_score(self.y_val, y_pred)
            elif self.metric=='equal_opportunity':
                perf = equal_opportunity_score(sensitive_column="Sex")(solution.x, self.X_val, self.y_val)
            elif self.metric=='p_percent':
                perf = p_percent_score(sensitive_column="Sex")(solution.x, self.X_val)
            
            if perf>self.best_perf:
                self.best_perf = perf
                self.best_model = solution.x
        return self.best_model
        
class FindCLogisticRegression():
    def __init__(self, X_train, y_train, X_val, y_val, sample_weight=None, metric='accuracy'):
        self.X_train = X_train
        self.y_train = y_train
        self.X_val = X_val
        self.y_val = y_val
        self.best_perf = 0
        self.best_model = None
        self.sample_weight = sample_weight
        self.metric = metric

    def objective(self, trial):
        C = trial.suggest_loguniform('C', 1e-10, 1e10)
        model = LogisticRegression(C=C, max_iter=10**3, tol=10**-6)

        model.fit(self.X_train, self.y_train, sample_weight=self.sample_weight)
        y_pred = model.predict(self.X_val)

        # Reject degenerate models where any of the three scores is zero.
        if (sklearn.metrics.accuracy_score(self.y_val, y_pred)==0 or
            equal_opportunity_score(sensitive_column="Sex")(model, self.X_val, self.y_val)==0 or
            p_percent_score(sensitive_column="Sex")(model, self.X_val)==0):
            return float('inf')
        
        if self.metric=='accuracy':
            perf = sklearn.metrics.accuracy_score(self.y_val, y_pred)
        elif self.metric=='equal_opportunity':
            perf = equal_opportunity_score(sensitive_column="Sex")(model, self.X_val, self.y_val)
        elif self.metric=='p_percent':
            perf = p_percent_score(sensitive_column="Sex")(model, self.X_val)
        
        if perf>self.best_perf:
            self.best_perf = perf
            self.best_model = model
        
        error = 1-perf

        return error  # An objective value linked with the Trial object.
    def tune(self):
        optuna.logging.set_verbosity(optuna.logging.CRITICAL)
        study = optuna.create_study()  # Create a new study.
        study.optimize(self.objective, n_trials=100)
        
        return self.best_model
    
class FindCCLogisticRegression():
    def __init__(self, X_train, y_train, X_val, y_val, sample_weight=None, metric='accuracy', base_model='demographic'):
        self.X_train = X_train
        self.y_train = y_train
        self.X_val = X_val
        self.y_val = y_val
        self.best_perf = 0
        self.best_model = None
        self.sample_weight = sample_weight
        self.metric = metric
        self.base_model = base_model

    def objective(self, trial):
        C = trial.suggest_loguniform('C', 1e-5, 1e5)
        c = trial.suggest_loguniform('c', 1e-5, 1e5)
        try:
            if self.base_model=='equal':
                model = EqualOpportunityClassifier(sensitive_cols="Sex", positive_target=True, covariance_threshold=c, C=C, max_iter=10**3)
            else:
                model = DemographicParityClassifier(sensitive_cols="Sex", covariance_threshold=c, C=C, max_iter=10**3)
        except Exception:
            return float('inf')

        model.fit(self.X_train, self.y_train)
        y_pred = model.predict(self.X_val)
        
        # Reject degenerate models where any of the three scores is zero.
        if (sklearn.metrics.accuracy_score(self.y_val, y_pred)==0 or
            equal_opportunity_score(sensitive_column="Sex")(model, self.X_val, self.y_val)==0 or
            p_percent_score(sensitive_column="Sex")(model, self.X_val)==0):
            return float('inf')

        
        if self.metric=='accuracy':
            perf = sklearn.metrics.accuracy_score(self.y_val, y_pred)
        elif self.metric=='equal_opportunity':
            perf = equal_opportunity_score(sensitive_column="Sex")(model, self.X_val, self.y_val)
        elif self.metric=='p_percent':
            perf = p_percent_score(sensitive_column="Sex")(model, self.X_val)
        
        if perf>self.best_perf:
            self.best_perf = perf
            self.best_model = model
        
        error = 1-perf

        return error  # An objective value linked with the Trial object.
    def tune(self):
        optuna.logging.set_verbosity(optuna.logging.CRITICAL)
        study = optuna.create_study()  # Create a new study.
        study.optimize(self.objective, n_trials=100)
        
        return self.best_model

I decided to use two metrics, the $\text{p% score}$ and $\text{equality of opportunity}$, defined as:

$\text{p% score}=\min(\frac{P(\hat{y}=1|z=1)}{P(\hat{y}=1|z=0)},\frac{P(\hat{y}=1|z=0)}{P(\hat{y}=1|z=1)})$

Membership in a protected class should have no correlation with the decision.

$\text{equality of opportunity}=\min(\frac{P(\hat{y}=1|z=1,y=1)}{P(\hat{y}=1|z=0,y=1)},\frac{P(\hat{y}=1|z=0,y=1)}{P(\hat{y}=1|z=1,y=1)})$
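
The two definitions above can be checked by hand on a small example. The sketch below computes both ratios directly with NumPy; `p_percent` and `equality_of_opportunity` are illustrative helpers and not the `sklego` scorers used in this notebook. Returning 0 when a group receives no positive prediction is an assumption chosen to match the warning that `sklego` prints in that case.

```python
import numpy as np

def p_percent(y_pred, z):
    """Smaller of the two ratios between the groups' positive-prediction rates."""
    y_pred, z = np.asarray(y_pred), np.asarray(z)
    rate_1 = np.mean(y_pred[z == 1] == 1)   # P(y_hat = 1 | z = 1)
    rate_0 = np.mean(y_pred[z == 0] == 1)   # P(y_hat = 1 | z = 0)
    if rate_0 == 0 or rate_1 == 0:
        return 0.0   # assumption: score is 0 if a group gets no positive prediction
    return float(min(rate_1 / rate_0, rate_0 / rate_1))

def equality_of_opportunity(y_true, y_pred, z):
    """Same ratio, restricted to the samples whose true label is positive."""
    positives = np.asarray(y_true) == 1
    return p_percent(np.asarray(y_pred)[positives], np.asarray(z)[positives])

# Toy data: 4 samples with z = 0 followed by 4 samples with z = 1.
z      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_true = np.array([1, 1, -1, -1, 1, 1, 1, -1])
y_pred = np.array([1, -1, 1, 1, 1, 1, -1, -1])

pp = p_percent(y_pred, z)
eo = equality_of_opportunity(y_true, y_pred, z)
print(pp, eo)
```

Both scores lie in $[0, 1]$, and a value of 1 means that the two groups receive positive predictions at the same rate (overall, or among the truly positive samples).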

In [18]:
metric = 'equal_opportunity'

In [19]:
reg = FindCLogisticRegression(X_train, y_train, X_val, y_val, metric=metric).tune()

In [20]:
print('accuracy', reg.score(X_test, y_test))
print('equal_opportunity', equal_opportunity_score(sensitive_column="Sex")(reg, X_test, y_test))
print('p_percent:', p_percent_score(sensitive_column="Sex")(reg, X_test))

accuracy 0.48
equal_opportunity 1.0
p_percent: 1.0


In [21]:
reg = MOOLogisticRegression(X_train, y_train, X_val, y_val, metric=metric).tune()


invalid value encountered in double_scalars


No samples with y_hat == 1 for Sex == 1, returning 0



In [22]:
print('accuracy', reg.score(X_val, y_val))
print('equal_opportunity', equal_opportunity_score(sensitive_column="Sex")(reg, X_val, y_val))
print('p_percent:', p_percent_score(sensitive_column="Sex")(reg, X_val))

accuracy 0.59
equal_opportunity 1.0
p_percent: 0.9478787878787879


### Model 1: Reweighing
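
Reweighing gives each (group, label) cell the weight $W(s, y) = \frac{N_s N_y}{N \, N_{s,y}}$, that is, the cell count expected if group and label were independent divided by the observed cell count. A vectorized sketch of the same formula is given here under the assumption that every (group, label) combination occurs at least once; `reweighing_weights` is a hypothetical name used only for illustration.

```python
import numpy as np
import pandas as pd

def reweighing_weights(s, y):
    """Per-row weight (N_s * N_y) / (N * N_sy) for group s and label y."""
    df = pd.DataFrame({"s": np.asarray(s), "y": np.asarray(y)})
    n = len(df)
    n_s = df.groupby("s")["y"].transform("count")          # size of the row's group
    n_y = df.groupby("y")["s"].transform("count")          # size of the row's label class
    n_sy = df.groupby(["s", "y"])["y"].transform("count")  # size of the row's (group, label) cell
    return (n_s * n_y / (n * n_sy)).to_numpy()

# Toy data: group 0 is mostly positive, group 1 is mostly negative.
s = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y = np.array([1, 1, 1, 1, -1, -1, 1, -1, -1, -1])
weights = reweighing_weights(s, y)

# After reweighing, the weighted positive rate is the same in both groups.
for group in (0, 1):
    mask = s == group
    print(group, weights[mask & (y == 1)].sum() / weights[mask].sum())
```

The weights sum to the number of rows, so the effective sample size seen by `sample_weight` is unchanged.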

In [23]:
def calc_reweight(X, y):
    W = {}
    W[0] = {}
    W[1] = {}

    D = len(X)
    len_men = X.groupby('Sex').count()['Age'][0]
    len_women = X.groupby('Sex').count()['Age'][1]
    len_neg = sum(y==-1)
    len_pos = sum(y==1)
    len_men_pos = len(X[(X.Sex == 0) & (y == 1)])
    len_men_neg = len(X[(X.Sex == 0) & (y == -1)])
    len_women_pos = len(X[(X.Sex == 1) & (y == 1)])
    len_women_neg = len(X[(X.Sex == 1) & (y == -1)])

    W[0][1] = (len_men*len_pos)/(D*len_men_pos)
    W[0][-1] = (len_men*len_neg)/(D*len_men_neg)

    W[1][1] = (len_women*len_pos)/(D*len_women_pos)
    W[1][-1] = (len_women*len_neg)/(D*len_women_neg)
    
    sample_weight = []
    for i in range(X.shape[0]):
        sample_weight.append(W[X.iloc[i]['Sex']][y.iloc[i]])

    return sample_weight

In [24]:
sample_weight = calc_reweight(X_train, y_train)
reg = FindCLogisticRegression(X_train, y_train, X_val, y_val, metric=metric, sample_weight=sample_weight).tune()

In [25]:
print('accuracy', reg.score(X_test, y_test))
print('equal_opportunity', equal_opportunity_score(sensitive_column="Sex")(reg, X_test, y_test))
print('p_percent:', p_percent_score(sensitive_column="Sex")(reg, X_test))

accuracy 0.48
equal_opportunity 1.0
p_percent: 1.0


### Model 2: Information Filter
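
The idea behind an information filter can be sketched as a projection: each remaining column has its component along the sensitive column removed, and the sensitive column itself is dropped. The code below is a simplified NumPy illustration for a single sensitive column and is not the `sklego` implementation; `filter_out` is a hypothetical helper.

```python
import numpy as np

def filter_out(X, sensitive_idx):
    """Drop the sensitive column and remove its projection from all other columns."""
    X = np.asarray(X, dtype=float)
    v = X[:, sensitive_idx]                   # sensitive column
    rest = np.delete(X, sensitive_idx, axis=1)
    coeffs = rest.T @ v / (v @ v)             # projection coefficient of each column on v
    return rest - np.outer(v, coeffs)

rng = np.random.default_rng(0)
sex = rng.integers(0, 2, size=50).astype(float)
sex[0] = 1.0                                  # make sure the column is not all zeros
age = 30 + 10 * sex + rng.normal(size=50)     # feature that depends on the sensitive column
amount = rng.normal(size=50)
X_toy = np.column_stack([sex, age, amount])

X_filtered = filter_out(X_toy, sensitive_idx=0)
print(X_filtered.shape)
print(X_filtered.T @ sex)                     # dot products with the sensitive column
```

After filtering, every remaining column has a zero dot product with the sensitive column, so a linear model can no longer recover it from a linear combination of the filtered features.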

In [26]:
from sklego.preprocessing import InformationFilter

infoTransf = InformationFilter(["Sex"])
infoTransf.fit(X_train)
X_tr_fair = infoTransf.transform(X_train)
X_tr_fair = pd.DataFrame(X_tr_fair, columns=[n for n in X_train.columns if n not in ['Sex']])
X_vl_fair = infoTransf.transform(X_val)
X_vl_fair = pd.DataFrame(X_vl_fair, columns=[n for n in X_val.columns if n not in ['Sex']])
X_te_fair = infoTransf.transform(X_test)
X_te_fair = pd.DataFrame(X_te_fair, columns=[n for n in X_test.columns if n not in ['Sex']])

In [27]:
#reg = FindCLogisticRegression(X_tr_fair, y_train, X_vl_fair, y_val, metric=metric).tune()

In [28]:
#print('accuracy', reg.score(X_te_fair, y_test))
#print('equal_opportunity_score', equal_opportunity_score(sensitive_column="Sex")(reg, X_te_fair, y_test))
#print('p_percent_score:', p_percent_score(sensitive_column="Sex")(reg, X_te_fair))

### Model 3: DemographicParityClassifier
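
The `covariance_threshold` tuned for this model bounds the covariance between the sensitive attribute and the signed distance to the decision boundary. The sketch below only computes that quantity for a given linear model, to show what is being constrained; it assumes a linear decision function `X @ coef + intercept`, and `boundary_covariance` is a hypothetical helper, not a `sklego` function.

```python
import numpy as np

def boundary_covariance(X, z, coef, intercept=0.0):
    """Covariance between sensitive attribute z and the decision function X @ coef + intercept."""
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    distance = X @ np.asarray(coef, dtype=float) + intercept
    return float(np.mean((z - z.mean()) * distance))

z = np.array([0, 0, 1, 1])
X_toy = np.column_stack([z, [1, -1, 1, -1]])   # column 0 copies z, column 1 is unrelated to z

cov_dependent = boundary_covariance(X_toy, z, coef=[2.0, 0.0])    # model relies on z
cov_independent = boundary_covariance(X_toy, z, coef=[0.0, 3.0])  # model ignores z
print(cov_dependent, cov_independent)
```

A smaller threshold forces the fitted coefficients toward decision functions like the second one, whose output does not vary with the sensitive attribute.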

In [29]:
from sklego.linear_model import DemographicParityClassifier
from sklego.linear_model import EqualOpportunityClassifier
from sklearn.linear_model import LogisticRegression

from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV


The sklearn.linear_model.base module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.linear_model. Anything that cannot be imported from sklearn.linear_model is now part of the private API.



In [30]:
reg = FindCCLogisticRegression(X_train, y_train, X_val, y_val, metric=metric, base_model='demographic').tune()

In [31]:
print('accuracy', reg.score(X_test, y_test))
print('equal_opportunity_score', equal_opportunity_score(sensitive_column="Sex")(reg, X_test, y_test))
print('p_percent_score:', p_percent_score(sensitive_column="Sex")(reg, X_test))

accuracy 0.48
equal_opportunity_score 1.0
p_percent_score: 1.0


### Model 4: Equal opportunity classifier

In [32]:
reg = FindCCLogisticRegression(X_train, y_train, X_val, y_val, metric=metric, base_model='equal').tune()

In [33]:
print('accuracy', reg.score(X_test, y_test))
print('equal_opportunity_score', equal_opportunity_score(sensitive_column="Sex")(reg, X_test, y_test))
print('p_percent_score:', p_percent_score(sensitive_column="Sex")(reg, X_test))

accuracy 0.48
equal_opportunity_score 1.0
p_percent_score: 1.0
