# Basics

The `fairness` package does a few nice things for you. For example, it provides standardized access to different ways of preprocessing a dataset:

In [None]:
from fairness.data.objects.Ricci import Ricci
from fairness.data.objects.ProcessedData import ProcessedData
ricci = ProcessedData(Ricci())

In [None]:
ricci.get_dataframe("original")

In [None]:
ricci.get_dataframe("numerical-binsensitive")
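
To give a feel for what a tag like `"numerical-binsensitive"` suggests, here is a minimal pure-Python sketch: categorical columns become 0/1 indicator columns, and the sensitive attribute is collapsed to a binary privileged/unprivileged flag. This is an illustration under assumptions, not the package's actual implementation; the toy rows, the column names, the choice of `"W"` as the privileged value, and the helpers `binarize_sensitive` and `one_hot` are all hypothetical.

```python
# Toy rows; column names and values are hypothetical, not the real Ricci data.
rows = [
    {"Position": "Captain", "Race": "W", "Score": 80.5},
    {"Position": "Lieutenant", "Race": "B", "Score": 71.0},
    {"Position": "Captain", "Race": "H", "Score": 68.2},
]


def binarize_sensitive(rows, attr, privileged):
    """Return copies of rows with `attr` mapped to 1 (privileged) or 0 (everyone else)."""
    return [{**row, attr: 1 if row[attr] == privileged else 0} for row in rows]


def one_hot(rows, attr):
    """Replace categorical column `attr` with one 0/1 indicator column per distinct value."""
    values = sorted({row[attr] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != attr}
        for value in values:
            new_row[f"{attr}_{value}"] = 1 if row[attr] == value else 0
        encoded.append(new_row)
    return encoded


# Assumption: "W" is treated as the privileged group for this illustration.
processed = one_hot(binarize_sensitive(rows, "Race", privileged="W"), "Position")
for row in processed:
    print(row)
```

After this step every column is numeric and the sensitive attribute takes only two values, which is the shape many fairness-aware algorithms expect as input.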

## Necessary imports
We'll be using these throughout the tutorial.

In [None]:
# We'll suppress warnings because both altair and sklearn are
# emitting lots of them, and they're annoying in a demo setting.

import warnings
warnings.filterwarnings("ignore")

import altair as alt
# Ask Altair to produce output that renders in Jupyter Notebook
alt.renderers.enable('notebook')

In [None]:
import fairness
import fairness.benchmark

To run specific algorithms on specific datasets, pass them as parameters to `fairness.benchmark.run`. If you omit these parameters, `fairness` will run every available algorithm on every available dataset, which will take a *very* long time (about a week on a single processor on our machines)!

In [None]:
fairness.benchmark.run(algorithm=["LR", "Feldman-LR"], dataset=["ricci"])

You can then access the results per dataset, sensitive attribute, and preprocessing option:

In [None]:
ricci_Race = fairness.get_dataset_by_name("ricci").get_results_data_frame("Race", "numerical-binsensitive")

In [None]:
# So many measures!
list(ricci_Race.columns.values)

As a final example of what you can get out of `fairness`, we produce a simple plot of accuracy vs. (binarized) disparate impact, where color differentiates the algorithms:

In [None]:
ricci_Race = ricci_Race[ricci_Race.algorithm.str.contains("LR")]

In [None]:
alt.Chart(ricci_Race).mark_point().encode(
    x='accuracy',
    y='DIbinary',
    color='algorithm'
)
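
For readers unfamiliar with the two axes in the plot above, the following sketch computes both quantities by hand on toy data. It uses the common definition of disparate impact: the rate of positive predictions in the unprivileged group divided by the rate in the privileged group, with the sensitive attribute already binarized. Whether the `DIbinary` column is computed in exactly this way is an assumption; the function names and the data here are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def disparate_impact(y_pred, sensitive, privileged=1, positive=1):
    """Positive-prediction rate of the unprivileged group over that of the privileged group.

    Raises ZeroDivisionError if a group is empty or the privileged group
    receives no positive predictions.
    """
    def positive_rate(want_privileged):
        preds = [p for p, s in zip(y_pred, sensitive)
                 if (s == privileged) == want_privileged]
        return sum(p == positive for p in preds) / len(preds)

    return positive_rate(False) / positive_rate(True)


# Toy data: the first four individuals are privileged (1), the last four are not (0).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
sensitive = [1, 1, 1, 1, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
di = disparate_impact(y_pred, sensitive)
print(acc, di)
```

A disparate impact of 1 means both groups receive positive predictions at the same rate; values below 1 mean the unprivileged group receives them less often.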