In [1]:
using Pkg
Pkg.activate("..")

using Fairness

[32m[1m Activating[22m[39m environment at `~/Desktop/Fairness/Project.toml`


### Creating Fairness Tensor

The ```fair_tensor``` function creates a fairness tensor. It accepts three arguments:
 - ŷ : Predicted Class
 - y : Ground truth
 - grp : Protected attribute value

In [2]:
using CategoricalArrays

ŷ = categorical([1, 0, 1, 1, 0])
y = categorical([1, 1, 0, 1, 0])
grp = categorical(["African", "American", "Indian", "American", "African"])

ft = fair_tensor(ŷ, y, grp)

Fairness.FairTensor{3}([1 0; 1 1; 0 0]

[0 1; 0 0; 1 0], ["African", "American", "Indian"])
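To make the printed tensor easier to read, here is a minimal sketch in plain Julia that rebuilds the same counts by hand. The helper `count_tensor` is hypothetical and not part of Fairness.jl. The layout it uses (group along the first dimension, then predicted class, then true class, with class `1` treated as the positive label) is an assumption inferred from the output above, not a documented guarantee.

```julia
# Hypothetical helper for illustration; not part of Fairness.jl.
# Assumed layout: mat[g, 1, 1] = TP, mat[g, 1, 2] = FP,
#                 mat[g, 2, 1] = FN, mat[g, 2, 2] = TN for group g.
function count_tensor(ŷ, y, grp; positive=1)
    labels = sort(unique(grp))
    index = Dict(label => i for (i, label) in enumerate(labels))
    mat = zeros(Int, length(labels), 2, 2)
    for (pred, truth, g) in zip(ŷ, y, grp)
        row = pred == positive ? 1 : 2    # predicted positive / negative
        col = truth == positive ? 1 : 2   # actually positive / negative
        mat[index[g], row, col] += 1
    end
    return mat, labels
end

ŷ = [1, 0, 1, 1, 0]
y = [1, 1, 0, 1, 0]
grp = ["African", "American", "Indian", "American", "African"]

mat, labels = count_tensor(ŷ, y, grp)
println(mat[:, :, 1])   # per-group [TP FN]
println(mat[:, :, 2])   # per-group [FP TN]
println(labels)
```

Under these assumptions the two printed slices agree with the `FairTensor` shown above: each row is one group, the first slice holds the true positives and false negatives, and the second holds the false positives and true negatives.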

Next we use a toy dataset of jobs containing 10 rows. The cell below obtains its fairness tensor with the ```@load_toyfairtensor``` macro, and the rest of this discussion uses this same tensor.

In [3]:
ft = @load_toyfairtensor

println(ft.mat)
println()
println(ft.labels)

[2 2; 0 0; 0 2]

[0 0; 2 1; 1 0]

["Board", "Education", "Healthcare"]


## Metrics

All metrics are callable structs. Once a metric is instantiated, it is called with the fairness tensor as its argument.

The general form is ```metric(ft::FairTensor; grp=nothing)```

A metric can have several aliases, as shown below:

In [4]:
tp = TruePositiveRate()
println(tp(ft))

println(true_positive_rate(ft))
println(truepositive_rate(ft))
println(tpr(ft))
println(TPR()(ft))

0.3333333333333334
0.3333333333333334
0.3333333333333334
0.3333333333333334
0.3333333333333334
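For readers new to the pattern, the following is a minimal sketch of how a callable struct with aliases can be written in Julia. All names here (`ToyTruePositiveRate`, `toy_tpr`, `ToyTPR`) are hypothetical stand-ins, and the sketch takes raw counts rather than a fairness tensor. It shows the language feature only and makes no claim about how Fairness.jl defines its own metrics or aliases.

```julia
# Hypothetical stand-in for illustration; not the Fairness.jl definition.
struct ToyTruePositiveRate end

# Defining a method on instances of the type makes the struct callable.
(::ToyTruePositiveRate)(tp::Integer, fn::Integer) = tp / (tp + fn)

# Two kinds of alias: a ready-made instance and a shorter name for the type.
const toy_tpr = ToyTruePositiveRate()
const ToyTPR = ToyTruePositiveRate

println(ToyTruePositiveRate()(2, 4))   # instantiate, then call
println(toy_tpr(2, 4))                 # call the instance alias directly
println(ToyTPR()(2, 4))                # instantiate through the type alias
```

All three calls run the same method, so they print the same value, 2 / (2 + 4).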


### Group Specific Metric Calculation

Metrics can be calculated for a specific group by passing the `grp` keyword argument.

In [5]:
println(true_positive_rate(ft; grp="Board"))

println(ppv(ft; grp="Education"))

println(TruePositive()(ft; grp="Healthcare"))

0.5000000000000001
0.9999999999999994
0
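As a rough cross-check, a group-specific value can be read straight from that group's row of the tensor. The sketch below copies the counts from the printed `ft.mat` and assumes the layout `mat[g, 1, 1] = TP` and `mat[g, 2, 1] = FN`, which is inferred from the printed output rather than taken from the library's documentation.

```julia
# Counts copied from the printed ft.mat and ft.labels above.
# Assumed layout: mat[g, 1, 1] = TP and mat[g, 2, 1] = FN for group g.
mat = cat([2 2; 0 0; 0 2], [0 0; 2 1; 1 0]; dims=3)
labels = ["Board", "Education", "Healthcare"]

board = findfirst(==("Board"), labels)
healthcare = findfirst(==("Healthcare"), labels)

# True positive rate for one group: TP / (TP + FN), using that group's row only.
tpr_board = mat[board, 1, 1] / (mat[board, 1, 1] + mat[board, 2, 1])
tp_healthcare = mat[healthcare, 1, 1]

println(tpr_board)       # 2 / (2 + 2)
println(tp_healthcare)
```

The hand-computed rate for `"Board"` is 0.5, which agrees with the library's 0.5000000000000001 up to a tiny floating-point difference, and the true positive count for `"Healthcare"` is 0.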


### Various Calc-Metrics

These metrics return numerical values.

The optional `grp` argument can also be passed to them to get the group-specific value.

In [6]:
println("True Positive : ", truepositive(ft))
println("True Negative : ", truenegative(ft))
println("False Positive : ", falsepositive(ft))
println("False Negative : ", falsenegative(ft))
println("True Positive Rate : ", truepositive_rate(ft))
println("True Negative Rate : ", truenegative_rate(ft))
println("False Positive Rate : ", falsepositive_rate(ft))
println("False Negative Rate : ", falsenegative_rate(ft))
println("False Discovery Rate : ", falsediscovery_rate(ft))
println("Positive Predictive Value : ", positive_predictive_value(ft))
println("Negative Predictive Value : ", negative_predictive_value(ft))

True Positive : 2
True Negative : 1
False Positive : 3
False Negative : 4
True Positive Rate : 0.3333333333333334
True Negative Rate : 0.2500000000000002
False Positive Rate : 0.7499999999999998
False Negative Rate : 0.6666666666666665
False Discovery Rate : 0.40000000000000013
Positive Predictive Value : 0.5999999999999999
Negative Predictive Value : 0.20000000000000018
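The four counts and the four rates above can be approximated by summing the toy tensor over its groups. This is a sketch under the assumed layout `mat[g, 1, 1] = TP`, `mat[g, 1, 2] = FP`, `mat[g, 2, 1] = FN`, `mat[g, 2, 2] = TN`. It uses the textbook rate formulas, which need not be how the library computes them internally.

```julia
# Counts copied from the printed ft.mat of the toy fairness tensor.
mat = cat([2 2; 0 0; 0 2], [0 0; 2 1; 1 0]; dims=3)

# Sum each confusion-matrix cell over all groups (assumed layout, see above).
tp = sum(mat[:, 1, 1])
fp = sum(mat[:, 1, 2])
fn = sum(mat[:, 2, 1])
tn = sum(mat[:, 2, 2])

# Textbook definitions of the four rates.
tpr = tp / (tp + fn)
tnr = tn / (tn + fp)
fpr = fp / (fp + tn)
fnr = fn / (fn + tp)

println((tp, tn, fp, fn))
println((tpr, tnr, fpr, fnr))
```

The counts come out as 2, 1, 3 and 4, and the rates as 1/3, 1/4, 3/4 and 2/3, matching the printed values up to small floating-point differences.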


## Disparity

`disparity(M, ft; refGrp=nothing)`

`M` is an array of fairness metrics, `ft` is the fairness tensor, and `refGrp` is the reference group.

`disparity` computes the disparity for fairness tensor `ft` with respect to each metric in `M` and returns a DataFrame of the disparities of these metrics.

For any group A and reference group B, disparity = metric(A)/metric(B).

Note that division by 0 results in NaN.

In [7]:
M = [false_positive_rate, true_negative_rate, ppv, npv ]

df = disparity(M, ft; refGrp="Education")

| | labels | false_positive_rate_disparity | true_negative_rate_disparity | positive_predictive_value_disparity |
|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 |
| 1 | Board | 0.0 | 3.0 | 0.0 |
| 2 | Education | 1.0 | 1.0 | 1.0 |
| 3 | Healthcare | 1.5 | 3e-15 | 1.0 |
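To see where a single entry of this table comes from, the ratio can be worked out by hand for the false positive rate. The sketch below uses per-group counts read from the toy tensor (assuming `FP = mat[g, 1, 2]` and `TN = mat[g, 2, 2]`), and the helper name `false_positive_rate_from_counts` is hypothetical. It covers only the `"Healthcare"` and `"Education"` rows.

```julia
# Hypothetical helper: textbook false positive rate from raw counts.
false_positive_rate_from_counts(fp, tn) = fp / (fp + tn)

# Per-group counts read from the printed toy tensor (assumed layout).
fpr_education = false_positive_rate_from_counts(2, 1)    # reference group
fpr_healthcare = false_positive_rate_from_counts(1, 0)

# disparity = metric(A) / metric(B), with B the reference group.
disparity_healthcare = fpr_healthcare / fpr_education
disparity_education = fpr_education / fpr_education

println(disparity_healthcare)
println(disparity_education)
```

The `"Healthcare"` ratio is (1/1) / (2/3), approximately 1.5, and the reference group's ratio with itself is 1.0, in line with the `false_positive_rate_disparity` column.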


### Boolean Metrics

These are the metrics that return Boolean values.

These metrics are callable structs. 
The struct has fields ```A``` and ```C```. ```A``` corresponds to the matrix on the left-hand side of the equality check A*z = 0, Equation 3 in the paper https://arxiv.org/pdf/2004.03424.pdf. In the paper it is a 1D array, but a 2D matrix is used here to handle fairness across multiple groups.

A newly instantiated metric holds ```0``` and ```[]``` as the values of ```C``` and ```A```. Calling it on a fairness tensor changes these values, as shown below. This allows the same instance to be reused: on later calls the matrix ```A``` does not need to be generated again because it stays the same, which makes repeated calls faster.

In [8]:
dp = DemographicParity()

println("Initial values in struct DemographicParity : ")
println("A : ", dp.A)
println("C : ", dp.C)

ft = @load_toyfairtensor

println()
println("Demographic Parity : ", dp(ft))
println()

println("New values in dp (instance of DemographicParity)")
println("A : ", dp.A)
println("C : ", dp.C)

Initial values in struct DemographicParity : 
A : Any[]
C : 0

Demographic Parity : false

New values in dp (instance of DemographicParity)
A : [4 0 4 0 -3 0 -3 0; 3 0 3 0 -3 0 -3 0; 0 0 0 0 0 0 0 0]
C : 3
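The `false` result can be sanity-checked against the usual definition of demographic parity, namely that every group receives positive predictions at the same rate. The sketch below is not the library's construction of ```A``` and ```C```. It is a plain-Julia check under the assumed layout `mat[g, 1, 1] = TP`, `mat[g, 1, 2] = FP`, written in cross-multiplied form so that parity corresponds to a set of linear expressions all being zero.

```julia
# Counts copied from the printed ft.mat of the toy fairness tensor.
mat = cat([2 2; 0 0; 0 2], [0 0; 2 1; 1 0]; dims=3)

predicted_positive = mat[:, 1, 1] .+ mat[:, 1, 2]   # TP + FP per group
group_size = vec(sum(mat; dims=(2, 3)))             # all four cells per group

# Compare every group with the first one without dividing:
# p_i / n_i == p_1 / n_1  <=>  p_i * n_1 - p_1 * n_i == 0.
residuals = predicted_positive .* group_size[1] .- predicted_positive[1] .* group_size
parity_holds = all(iszero, residuals)

println(residuals)
println(parity_holds)
```

With positive-prediction rates of 2/4, 2/3 and 1/3 for the three groups, the residuals are not all zero, so the check reports `false`, consistent with the output of `dp(ft)`.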
