In [1]:
using Pkg
Pkg.activate("..")

using MLJFair
using MLJBase

[32m[1m Activating[22m[39m environment at `~/Desktop/MLJFair/Project.toml`


### Creating Fairness Tensor

The ```fair_tensor``` function creates the fairness tensor. It accepts three arguments:
 - ŷ : Predicted Class
 - y : Ground truth
 - grp : Protected attribute value

In [2]:
ŷ = categorical([1, 0, 1, 1, 0])
y = categorical([1, 1, 0, 1, 0])
grp = categorical(["African", "American", "Indian", "American", "African"])

ft = fair_tensor(ŷ, y, grp)

MLJFair.FairTensor{3}([1 0; 1 1; 0 0]

[0 1; 0 0; 1 0], ["African", "American", "Indian"])
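To make the layout of the printed tensor concrete, here is a minimal plain-Julia sketch that rebuilds the same counts by hand. It assumes the tensor is indexed as (group, predicted class, true class) with the positive class first, which is inferred from the output above rather than taken from the MLJFair source. ```count_tensor``` is a hypothetical helper, not part of the package.

```julia
# Hypothetical helper (not MLJFair API): count outcomes per group.
# Assumed layout: mat[group, predicted, truth], with index 1 = positive class.
function count_tensor(ŷ, y, grp; positive = 1)
    labels = sort(unique(grp))
    mat = zeros(Int, length(labels), 2, 2)
    for (p, t, g) in zip(ŷ, y, grp)
        i = findfirst(==(g), labels)   # group index
        j = p == positive ? 1 : 2      # predicted class index
        k = t == positive ? 1 : 2      # true class index
        mat[i, j, k] += 1
    end
    return mat, labels
end

mat, labels = count_tensor([1, 0, 1, 1, 0], [1, 1, 0, 1, 0],
                           ["African", "American", "Indian", "American", "African"])
println(mat[:, :, 1])   # counts where the true class is positive
println(mat[:, :, 2])   # counts where the true class is negative
println(labels)
```

Under this assumed layout, each group's 2×2 slice reads [TP FP; FN TN].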

Next, we use a toy jobs dataset containing 10 rows. The ```job_fairtensor``` function from data.jl returns its fairness tensor, which is used throughout the rest of this discussion.

In [3]:
include("../test/data/data.jl")

ft = job_fairtensor();

println(ft.mat)
println()
println(ft.labels)

[2 2; 0 0; 0 2]

[0 0; 2 1; 1 0]

["Board", "Education", "Healthcare"]


## Metrics

All metrics are callable structs: instantiate the metric, then call the instance on a fairness tensor.

A metric can have several aliases, as follows:

In [4]:
tp = TruePositiveRate()
println(tp(ft))

println(true_positive_rate(ft))
println(truepositive_rate(ft))
println(tpr(ft))
println(TPR()(ft))

0.3333333333333333
0.3333333333333333
0.3333333333333333
0.3333333333333333
0.3333333333333333
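The callable-struct pattern and its aliases can be sketched in plain Julia. This is only an illustration of the general technique, not MLJFair's implementation: the ```Toy...``` names are hypothetical, and the tensor layout (group, predicted, truth, positive class first) is an assumption inferred from the printed tensors.

```julia
# Hypothetical stand-in for a metric type (not MLJFair's TruePositiveRate).
struct ToyTruePositiveRate end

# Defining a method on the type makes its instances callable.
# TPR = TP / (TP + FN), with counts summed over all groups.
function (::ToyTruePositiveRate)(mat::Array{Int,3})
    tp = sum(mat[:, 1, 1])   # predicted positive, truly positive
    fn = sum(mat[:, 2, 1])   # predicted negative, truly positive
    return tp / (tp + fn)
end

const ToyTPR = ToyTruePositiveRate        # alias for the type
const toy_tpr = ToyTruePositiveRate()     # alias bound to an instance

# Same counts as the jobs tensor printed earlier.
mat = cat([2 2; 0 0; 0 2], [0 0; 2 1; 1 0]; dims = 3)

println(ToyTruePositiveRate()(mat))
println(ToyTPR()(mat))
println(toy_tpr(mat))
```

Binding a constant to an instance is what lets a lowercase alias be called like an ordinary function.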


### Various Calc-Metrics

These metrics return numerical values.

In [5]:
println("True Positive : ", truepositive(ft))
println("True Negative : ", truenegative(ft))
println("False Positive : ", falsepositive(ft))
println("False Negative : ", falsenegative(ft))
println("True Positive Rate : ", truepositive_rate(ft))
println("True Negative Rate : ", truenegative_rate(ft))
println("False Positive Rate : ", falsepositive_rate(ft))
println("False Negative Rate : ", falsenegative_rate(ft))
println("False Discovery Rate : ", falsediscovery_rate(ft))
println("Positive Predictive Value : ", positive_predictive_value(ft))
println("Negative Predictive Value : ", negative_predictive_value(ft))

True Positive : 2
True Negative : 1
False Positive : 3
False Negative : 4
True Positive Rate : 0.3333333333333333
True Negative Rate : 0.25
False Positive Rate : 0.75
False Negative Rate : 0.6666666666666667
False Discovery Rate : 0.4
Positive Predictive Value : 0.6
Negative Predictive Value : 0.2
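The rates above follow from the four aggregate counts. As a cross-check, this plain-Julia sketch recomputes them from the textbook definitions (the variable names are illustrative, not MLJFair API). Under these definitions the positive predictive value TP / (TP + FP) comes out as 0.4 and the false discovery rate FP / (FP + TP) as 0.6, the reverse of the two values printed above, so the printed pair may reflect how the library version used here defines or labels them.

```julia
# Aggregate counts taken from the output above.
tp, tn, fp, fn = 2, 1, 3, 4

tpr = tp / (tp + fn)   # true positive rate
tnr = tn / (tn + fp)   # true negative rate
fpr = fp / (fp + tn)   # false positive rate
fnr = fn / (fn + tp)   # false negative rate
ppv = tp / (tp + fp)   # positive predictive value (textbook definition)
fdr = fp / (fp + tp)   # false discovery rate (textbook definition), equals 1 - ppv
npv = tn / (tn + fn)   # negative predictive value

println((tpr = tpr, tnr = tnr, fpr = fpr, fnr = fnr, ppv = ppv, fdr = fdr, npv = npv))
```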


### Boolean Metrics

These metrics return Boolean values.

These metrics are also callable structs.
Each struct has the fields ```A``` and ```C```. ```A``` corresponds to the matrix on the left-hand side of the equality check A*z = 0 (Equation 3 of the paper https://arxiv.org/pdf/2004.03424.pdf). In the paper it is a 1D array, but a 2D matrix is used here to handle fairness across multiple groups.

Initially, the instantiated metric holds ```0``` and ```[]``` as the values of ```C``` and ```A```. After it is called on a fairness tensor, both values change, as shown below. The same instance can then be reused: the matrix ```A``` does not need to be generated again because it stays the same, which makes repeated calls faster.
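The idea of a linear equality check with a cached matrix can be sketched in plain Julia. This is a simplified illustration under stated assumptions, not MLJFair's implementation: ```ToyDemographicParity``` is hypothetical, the tensor layout (group, predicted, truth, positive class first) is inferred from the printed tensors, the last group is taken as the reference group ```C```, and the column layout of ```A``` is chosen for this sketch and need not match the library's.

```julia
# Hypothetical stand-in (not MLJFair's DemographicParity).
mutable struct ToyDemographicParity
    A::Matrix{Int}   # one constraint row per group
    C::Int           # index of the reference group
end
ToyDemographicParity() = ToyDemographicParity(zeros(Int, 0, 0), 0)

function (m::ToyDemographicParity)(mat::Array{Int,3})
    G = size(mat, 1)
    # z stacks [TP, FN, FP, TN] for group 1, then group 2, and so on.
    z = vec(permutedims(reshape(mat, G, 4)))
    if isempty(m.A)                      # build A only on the first call
        m.C = G                          # assumption: last group is the reference
        n = [sum(mat[g, :, :]) for g in 1:G]
        m.A = zeros(Int, G, 4 * G)
        for i in 1:G
            # Row i encodes n_C * (TP_i + FP_i) - n_i * (TP_C + FP_C) = 0.
            for offset in (1, 3)         # positions of TP and FP within a group
                m.A[i, 4 * (i - 1) + offset] += n[m.C]
                m.A[i, 4 * (m.C - 1) + offset] -= n[i]
            end
        end
    end
    return all(iszero, m.A * z)
end

job = cat([2 2; 0 0; 0 2], [0 0; 2 1; 1 0]; dims = 3)   # jobs tensor counts
dp_toy = ToyDemographicParity()
println(dp_toy(job))         # first call builds and caches A
println(size(dp_toy.A))      # later calls on the same tensor reuse it
```

In this sketch the cached ```A``` depends on the group sizes, so a reused instance is only valid for tensors with the same group sizes.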

In [6]:
dp = DemographicParity()

println("Initial values in struct DemographicParity : ")
println("A : ", dp.A)
println("C : ", dp.C)

ft = job_fairtensor()

println()
println("Demographic Parity : ", dp(ft))
println()

println("New values in dp (instance of DemographicParity)")
println("A : ", dp.A)
println("C : ", dp.C)

Initial values in struct DemographicParity : 
A : Any[]
C : 0

Demographic Parity : false

New values in dp (instance of DemographicParity)
A : [4 0 4 0 -3 0 -3 0; 3 0 3 0 -3 0 -3 0; 0 0 0 0 0 0 0 0]
C : 3
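To see why the result is false, the per-group positive prediction rates can be computed directly from the tensor counts. This is a plain-Julia sketch that assumes the layout (group, predicted, truth) with the positive class first, inferred from the printed tensors. Demographic parity requires these rates to be equal across groups.

```julia
# Counts of the jobs tensor printed earlier, assumed layout mat[group, predicted, truth].
mat = cat([2 2; 0 0; 0 2], [0 0; 2 1; 1 0]; dims = 3)
labels = ["Board", "Education", "Healthcare"]

# Fraction of each group that received a positive prediction.
rates = [sum(mat[g, 1, :]) / sum(mat[g, :, :]) for g in 1:size(mat, 1)]

for (label, rate) in zip(labels, rates)
    println(label, " : ", rate)
end
println("All equal : ", all(≈(rates[1]), rates))
```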
