## Detecting and mitigating age and sex bias in credit decisions

### Importing the Required Libraries 

In [78]:
import numpy as np
from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

### Load dataset, specifying protected attribute, and split dataset into train and test


In [79]:
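dataset_orig = GermanDataset(
    protected_attribute_names=['age'],            # treat age as the only protected attribute
    privileged_classes=[lambda x: x >= 25],       # age >= 25 is considered privileged
    features_to_drop=['personal_status', 'sex']   # ignore sex-related attributes
)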

dataset_orig_train, dataset_orig_test = dataset_orig.split([0.7], shuffle=True)

In [80]:
privileged_groups = [{'age': 1}]
unprivileged_groups = [{'age': 0}]

### Compute fairness metric on original training dataset

In [81]:
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())

Difference in mean outcomes between unprivileged and privileged groups = -0.154574
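
For intuition, the metric can be reproduced by hand on a toy example. The sketch below assumes that `mean_difference()` is the rate of favorable outcomes in the unprivileged group minus the rate in the privileged group, so a negative value means the privileged group is favored. The records and the helper names `favorable_rate` and `mean_difference` are made up for illustration and are not AIF360 code.

```python
# Toy records as (group, label): group 1 = privileged, 0 = unprivileged;
# label 1 = favorable outcome, 0 = unfavorable. All values are made up.
toy_records = [(1, 1)] * 6 + [(1, 0)] * 2 + [(0, 1)] * 2 + [(0, 0)] * 2


def favorable_rate(records, group):
    """Share of records in `group` that received the favorable label."""
    labels = [label for g, label in records if g == group]
    return sum(labels) / len(labels)


def mean_difference(records):
    """Unprivileged favorable rate minus privileged favorable rate."""
    return favorable_rate(records, 0) - favorable_rate(records, 1)


# 2 of 4 unprivileged vs. 6 of 8 privileged records are favorable: 0.50 - 0.75
print("%f" % mean_difference(toy_records))  # prints -0.250000
```

Read this way, the value of about -0.15 printed above says that the younger group receives the favorable outcome roughly 15 percentage points less often than the older group in this training split.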


### Mitigate bias by transforming the original dataset

In [82]:
RW = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
dataset_transf_train = RW.fit_transform(dataset_orig_train) 

### Compute fairness metric on transformed dataset

In [83]:
metric_transf_train = BinaryLabelDatasetMetric(dataset_transf_train, 
                                               unprivileged_groups=unprivileged_groups,
                                               privileged_groups=privileged_groups)

print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_transf_train.mean_difference())

Difference in mean outcomes between unprivileged and privileged groups = -0.000000


### The cells above detected bias in age, and we successfully addressed the problem by treating the bias
Now that we have a transformed dataset, we used the same measure as for the original training dataset to check whether the bias was reduced: the `mean_difference` method of the `BinaryLabelDatasetMetric` class. The mitigation step appears to have been quite successful, since the difference in mean outcomes is now 0.0. So, in terms of mean outcome, we went from an advantage of about 15 percentage points for the privileged group to equality.

### Now we use sex as the protected attribute for detecting and mitigating bias

We load the dataset again, this time setting the protected attribute to sex and dropping age, which is not required here. The original dataset is then split into training and testing datasets. Finally, we set the two variables used for detecting and mitigating bias: the privileged group (1), which is male in our case, and the unprivileged group (0), which is female.
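
To make the 1/0 encoding concrete, here is a minimal sketch of how a callable privileged class could turn raw attribute values into group codes. It only illustrates the idea; the helper `encode_protected` is hypothetical and is not how AIF360 is implemented internally.

```python
def encode_protected(values, is_privileged):
    """Map each raw value to 1.0 (privileged) or 0.0 (unprivileged)."""
    return [1.0 if is_privileged(value) else 0.0 for value in values]


# Same predicates as the privileged_classes arguments used in this notebook,
# applied to made-up raw values.
sex_codes = encode_protected(['male', 'female', 'male'], lambda x: x == 'male')
age_codes = encode_protected([19, 25, 40], lambda x: x >= 25)

print(sex_codes)  # [1.0, 0.0, 1.0]
print(age_codes)  # [0.0, 1.0, 1.0]
```

These codes are what the group dictionaries such as `{'sex': 1}` and `{'sex': 0}` refer to.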

In [92]:
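dataset_orig_s = GermanDataset(
    protected_attribute_names=['sex'],            # treat sex as the only protected attribute
    privileged_classes=[lambda x: x == 'male'],   # male is considered privileged
    features_to_drop=['personal_status', 'age']   # drop personal_status and age
)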

In [93]:
dataset_orig_train_s, dataset_orig_test_s = dataset_orig_s.split([0.7], shuffle=True)

In [94]:
privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]

In [95]:
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train_s, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)

print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())

Difference in mean outcomes between unprivileged and privileged groups = -0.058012


The previous step showed that the privileged group was getting about 6 percentage points more positive outcomes in the training dataset (a mean difference of -0.058). In other words, the data favors the male group. Since this is not desirable, we are going to try to mitigate this bias in the training dataset itself, which is called pre-processing mitigation.

### Mitigate bias by transforming the original dataset

To mitigate the effects of the gender bias in our original dataset, we can transform the dataset using a pre-processing technique called reweighing. It assigns a weight to each example in the training data, depending on its group and its outcome, so that the weighted outcomes are balanced across groups.

In [96]:
RW = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
dataset_transf_train = RW.fit_transform(dataset_orig_train_s) 

The algorithm we used is the Reweighing algorithm, a pre-processing method, which means the bias is mitigated before the model is built. It changes the instance weights rather than the features or labels, so that positive outcomes are distributed more equitably between the privileged and unprivileged groups of the protected attribute.
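
The weighting idea can be sketched on the same kind of toy data. The code below assumes the scheme from Kamiran and Calders, on which `Reweighing` is based: every (group, label) combination gets the weight it would need for group and label to look statistically independent. The records and the helper names are made up for illustration; this is not the AIF360 implementation.

```python
from collections import Counter

# Toy records as (group, label): group 1 = privileged, 0 = unprivileged;
# label 1 = favorable outcome, 0 = unfavorable. All values are made up.
toy_records = [(1, 1)] * 6 + [(1, 0)] * 2 + [(0, 1)] * 2 + [(0, 0)] * 2


def reweighing_weights(records):
    """Weight per (group, label) cell: N_group * N_label / (N * N_cell)."""
    n = len(records)
    group_counts = Counter(group for group, _ in records)
    label_counts = Counter(label for _, label in records)
    cell_counts = Counter(records)
    return {
        (group, label): group_counts[group] * label_counts[label] / (n * count)
        for (group, label), count in cell_counts.items()
    }


def weighted_favorable_rate(records, weights, group):
    """Favorable rate in `group` when each record counts with its weight."""
    total = sum(weights[(g, y)] for g, y in records if g == group)
    favorable = sum(weights[(g, y)] for g, y in records if g == group and y == 1)
    return favorable / total


weights = reweighing_weights(toy_records)
difference = (weighted_favorable_rate(toy_records, weights, 0)
              - weighted_favorable_rate(toy_records, weights, 1))

# Over-represented cells (privileged and favorable) get a weight below 1,
# under-represented cells (unprivileged and favorable) a weight above 1.
print(weights)
print(abs(difference) < 1e-9)  # the weighted rates agree up to rounding
```

Before weighting, the toy data has a mean difference of -0.25; with the weights applied, both groups have the same weighted favorable rate.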

### Compute fairness metric on transformed dataset

In [97]:
metric_transf_train = BinaryLabelDatasetMetric(dataset_transf_train, 
                                               unprivileged_groups=unprivileged_groups,
                                               privileged_groups=privileged_groups)

print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_transf_train.mean_difference())

Difference in mean outcomes between unprivileged and privileged groups = -0.000000


Now that we have a transformed dataset, we used the same measure as for the original training dataset to see how effective the transformation was in reducing bias. The mitigation step appears to have been quite successful, since the difference in mean outcomes is now 0.0. So, in terms of mean outcome, we went from an advantage of about 6 percentage points for the privileged group to equality.
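
A difference of exactly zero is what the reweighing scheme is expected to produce on the data it was fitted on. As a sketch, assume the weights follow the Kamiran and Calders formula, with $N$ the number of training examples, $N_g$ the number in group $g$, $N_y$ the number with label $y$ (1 = favorable, 0 = unfavorable), and $N_{g,y}$ the number in group $g$ with label $y$:

$$
W(g, y) = \frac{N_g \, N_y}{N \, N_{g,y}}
$$

The weighted favorable rate in group $g$ is then

$$
\frac{W(g,1)\,N_{g,1}}{W(g,1)\,N_{g,1} + W(g,0)\,N_{g,0}}
= \frac{N_g N_1 / N}{N_g N_1 / N + N_g N_0 / N}
= \frac{N_1}{N_1 + N_0}
= \frac{N_1}{N}
$$

This value does not depend on $g$, so both groups have the same weighted rate and their difference is zero up to floating-point rounding.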

We've seen how a dataset with historical bias might lead to unjust outcomes when models are built on it. In our scenario, males would receive more favorable credit decisions, since they have historically been more likely to be accepted. This happens because typical machine learning approaches optimize for accuracy rather than fairness. We've also shown how a basic bias mitigation strategy, reweighing, can remove the measured bias from the training data, so that models trained on it can score better on fairness measures. These strategies for detecting and mitigating bias are critical for any organization looking to automate decision-making on populations with protected characteristics.

### Understanding of Mitigation and Bias

We try to be rational when making decisions, weighing the potential benefits and drawbacks of the options available, but in practice we are often more inclined to follow our instincts. Whatever approach we take, there is a good chance we have made some poor choices in our lives. While most people assume that decision-making is a rational process, research has shown that implicit bias may steer us toward particular conclusions without our knowledge. This has ramifications for learning leaders throughout a company.

Implicit bias, by its very nature, acts in the subconscious, making it difficult for learning leaders to overcome. Often, you aren't even aware that your objectivity and impartiality are being harmed by bias. 

A similar situation arises when handling an imbalanced dataset, where one class of the target variable dominates. We handle it with certain ML methods that produce a more balanced training dataset, which reduces the bias in our model.
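
One common way to do this is inverse-frequency class weighting. The sketch below is only an illustration of that idea, under the assumption that this is the kind of method meant here; the helper `balanced_class_weights` and the labels are made up.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Made-up target with a 70/30 split between the majority and minority class.
labels = [1] * 7 + [0] * 3
class_weights = balanced_class_weights(labels)

# After weighting, both classes carry the same total weight.
majority_total = class_weights[1] * labels.count(1)
minority_total = class_weights[0] * labels.count(0)
print(class_weights)
print(abs(majority_total - minority_total) < 1e-9)
```

The parallel with reweighing is that both methods leave the examples untouched and only change how much each one counts.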



##### Issues Faced 
- When choosing 'sex' as a second protected column, male was assumed to be the privileged class. This assumption is itself a case of bias: I assumed that male is the more privileged class, and it was not obvious why female should not be treated as the positively biased group instead.
- On the technical side, the AI Fairness 360 library that was imported needed TensorFlow in order to function properly.
- Reading the dataset well enough to make these choices was difficult, so I referred to the following: [https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)]
- While loading the dataset, the data files needed to be in the dataset folder of the conda environment, which was not the case. I downloaded the dataset and placed it in the specified path.
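
The last issue can be checked before loading the dataset. The sketch below assumes that AIF360 looks for the raw files `german.data` and `german.doc` in a `data/raw/german` folder inside the installed package, which is what its error message indicated in my setup; the helper names are hypothetical, and the folder layout may differ between versions.

```python
import importlib.util
from pathlib import Path

# File names assumed from the AIF360 error message; adjust if yours differ.
REQUIRED_FILES = ("german.data", "german.doc")


def german_data_dir(package_dir):
    """Folder where the raw German credit files are assumed to live."""
    return Path(package_dir) / "data" / "raw" / "german"


def missing_german_files(package_dir):
    """Names of the required files that are not present in the data folder."""
    data_dir = german_data_dir(package_dir)
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]


# Locate the installed package without importing it.
spec = importlib.util.find_spec("aif360")
if spec is not None and spec.origin is not None:
    package_dir = Path(spec.origin).parent
    print("Expected folder:", german_data_dir(package_dir))
    print("Missing files:", missing_german_files(package_dir))
else:
    print("aif360 is not installed in this environment")
```

If the list of missing files is not empty, download them from the UCI page linked above and copy them into the printed folder.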

Sources:

- https://analyticsindiamag.com/guide-to-ai-fairness-360-an-open-source-toolkit-for-detection-and-mitigation-of-bias-in-ml-models/ 
- https://www.chieflearningofficer.com/2020/10/22/mitigating-the-effects-of-implicit-bias/
- https://ambiata.com/blog/2019-12-13-bias-detection-and-mitigation/
- https://nbviewer.org/github/IBM/AIF360/blob/master/examples/tutorial_credit_scoring.ipynb
