# Quality Measures
For this example, a toy dataset is used.

In [1]:
# Toy dataset creation
import pandas as pd
X = pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 2, 1]})
y_true = pd.Series([1, 0, 1])
y_pred = pd.Series([1, 1, 1])

## Fairlearn Quality measures
All the Fairlearn metrics can be used as quality measures. See the Fairlearn documentation [here](https://fairlearn.github.io/v0.6.0/api_reference/fairlearn.metrics.html).<br/>
**The predefined Fairlearn metrics are:**

|Metric name | Description
|:-----|:-----
| demographic_parity_difference | Defined as the absolute value of the difference in the **selection rates** between a subgroup and its negation.
| demographic_parity_ratio | Defined as the ratio between the smaller and the larger **selection rate** of a subgroup and its negation.
| equalized_odds_difference | The greater of two metrics: true_positive_rate_difference and false_positive_rate_difference. The former is the difference between the TPRs of a subgroup and its negation. The latter is defined similarly, but for FPRs.
| equalized_odds_ratio | The smaller of two metrics: true_positive_rate_ratio and false_positive_rate_ratio. The former is the ratio between the TPRs of a subgroup and its negation. The latter is defined similarly, but for FPRs.
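
To make the definitions in the table concrete, the following sketch computes the four quantities by hand for one subgroup and its negation. It is plain Python and does not call Fairlearn; the data and the helper names (`rate`, `group_rates`, `difference`, `ratio`) are made up for illustration, and Fairlearn's own implementation may treat edge cases such as empty groups or zero denominators differently.

```python
# Hypothetical labels and predictions for eight individuals.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 1, 0]
# True marks members of the subgroup; False marks its negation.
in_subgroup = [True, True, True, True, False, False, False, False]


def rate(preds, keep):
    """Share of positive predictions among the rows selected by `keep`."""
    chosen = [p for p, k in zip(preds, keep) if k]
    return sum(chosen) / len(chosen)


def group_rates(condition):
    """Positive-prediction rate in the subgroup and in its negation,
    restricted to rows whose true label satisfies `condition`."""
    inside = [g and condition(t) for g, t in zip(in_subgroup, y_true)]
    outside = [(not g) and condition(t) for g, t in zip(in_subgroup, y_true)]
    return rate(y_pred, inside), rate(y_pred, outside)


selection = group_rates(lambda t: True)  # selection rates
tpr = group_rates(lambda t: t == 1)      # true positive rates
fpr = group_rates(lambda t: t == 0)      # false positive rates


def difference(pair):
    return abs(pair[0] - pair[1])


def ratio(pair):
    # Assumes the larger rate is non-zero.
    return min(pair) / max(pair)


dp_difference = difference(selection)
dp_ratio = ratio(selection)
eo_difference = max(difference(tpr), difference(fpr))
eo_ratio = min(ratio(tpr), ratio(fpr))
```

On this data the selection rates are 0.75 and 0.5, the TPRs are 1.0 and 0.5, and the FPRs are both 0.5, so the equalized odds values are driven entirely by the TPR gap.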

We can initialize a `SubgroupDiscoveryTask` object as follows:

In [2]:
import fairsd as fsd
from fairlearn.metrics import demographic_parity_ratio
task = fsd.SubgroupDiscoveryTask(X, y_true, y_pred, demographic_parity_ratio)

Or, more concisely, we can pass the same metric as a string:

In [3]:
task = fsd.SubgroupDiscoveryTask(X, y_true, y_pred, 'demographic_parity_ratio')

Since version 0.6.0, Fairlearn also offers the possibility to "create a scalar returning metric function based on aggregation of a disaggregated metric" through `make_derived_metric`.<br/>
**Example:**

In [4]:
from sklearn.metrics import accuracy_score
from fairlearn.metrics import make_derived_metric
derived_metric = make_derived_metric(metric=accuracy_score, transform='difference')
task = fsd.SubgroupDiscoveryTask(X, y_true, y_pred, derived_metric)

For more details see the Fairlearn documentation.
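
As a rough illustration of what such a derived metric computes, the sketch below disaggregates accuracy by group and then collapses the per-group values into a single number with a "difference" aggregation (largest minus smallest). It is plain Python on made-up data with hypothetical helper names; it only mirrors the idea and is not Fairlearn's implementation.

```python
from collections import defaultdict

# Hypothetical data: true labels, predictions and a binary sensitive feature.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0]
sensitive = ["a", "a", "a", "b", "b", "b"]


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def by_group(metric, labels, preds, groups):
    """Disaggregated metric: one value per distinct group."""
    buckets = defaultdict(lambda: ([], []))
    for l, p, g in zip(labels, preds, groups):
        buckets[g][0].append(l)
        buckets[g][1].append(p)
    return {g: metric(ls, ps) for g, (ls, ps) in buckets.items()}


def derived_difference(metric):
    """Return a scalar function: max minus min of the per-group values."""
    def scalar_metric(labels, preds, sensitive_features):
        values = by_group(metric, labels, preds, sensitive_features).values()
        return max(values) - min(values)
    return scalar_metric


accuracy_difference = derived_difference(accuracy)
per_group = by_group(accuracy, y_true, y_pred, sensitive)
result = accuracy_difference(y_true, y_pred, sensitive_features=sensitive)
```

Here group "a" has accuracy 2/3 and group "b" has accuracy 1/3, so the derived scalar is their gap of 1/3.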

## Customized Quality Measures
It is possible to create a quality measure by extending the class [QualityFunction](https://github.com/MaurizioPulizzi/fairsd/blob/main/fairsd/qualitymeasures.py#L3):

In [5]:
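class MyQualityMeasure(fsd.QualityFunction):
    def evaluate(self, y_true=None, y_pred=None, sensitive_features=None):
        # Constant placeholder score; a real measure would use the arguments.
        return 0.5

# Pass the method of an instance so that `self` is bound.
task = fsd.SubgroupDiscoveryTask(X, y_true, y_pred, MyQualityMeasure().evaluate)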
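
A constant measure is only useful for checking that everything is wired together. As a slightly more realistic sketch, the class below scores a subgroup by the absolute accuracy gap between the subgroup and its negation. To stay runnable without fairsd installed, it subclasses a stand-in abstract class named `QualityFunction` that is assumed to mirror `fsd.QualityFunction`. It also assumes that `sensitive_features` arrives as a boolean sequence marking subgroup membership, which should be checked against the fairsd source before relying on it.

```python
from abc import ABC, abstractmethod


class QualityFunction(ABC):
    """Stand-in for fsd.QualityFunction (assumed interface)."""

    @abstractmethod
    def evaluate(self, y_true=None, y_pred=None, sensitive_features=None):
        ...


class AccuracyGap(QualityFunction):
    """Absolute accuracy difference between a subgroup and its negation."""

    def evaluate(self, y_true=None, y_pred=None, sensitive_features=None):
        hits = {True: [], False: []}
        for label, pred, member in zip(y_true, y_pred, sensitive_features):
            hits[bool(member)].append(label == pred)
        if not hits[True] or not hits[False]:
            return 0.0  # one side is empty, so no gap can be measured
        inside = sum(hits[True]) / len(hits[True])
        outside = sum(hits[False]) / len(hits[False])
        return abs(inside - outside)


# Hypothetical data; the mask marks which rows belong to the subgroup.
labels = [1, 0, 1, 0]
predictions = [1, 1, 1, 0]
mask = [True, True, False, False]

quality = AccuracyGap().evaluate(labels, predictions, sensitive_features=mask)
```

In this toy case the subgroup is classified correctly half of the time and its negation always, so the measure evaluates to 0.5.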

# Descriptions and Quality Measures
A subgroup description is formed by the conjunction of zero or more descriptors.<br/>
A descriptor is a statement in the form "attribute_name = attribute_value" for nominal attributes or "attribute_name = range" for numerical attributes.<br/>
Example of a description: "sex = 'Male' AND age = (18, 30]".<br/>
The top-k subgroup discovery task in this package returns the descriptions of the k subgroups that exhibit the greatest disparity.<br/>
There is no single definition of subgroup disparity; its meaning depends on the quality measure used.
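
The structure of a description can be sketched in a few lines of plain Python. The representation below (descriptors as small predicate functions, rows as dictionaries) is hypothetical and chosen for readability; it is not how fairsd stores descriptions internally.

```python
def nominal(attribute, value):
    """Descriptor "attribute = value" for a nominal attribute."""
    return lambda row: row[attribute] == value


def interval(attribute, low, high):
    """Descriptor "attribute = (low, high]" for a numerical attribute."""
    return lambda row: low < row[attribute] <= high


def covers(description, row):
    """A description is a conjunction: every descriptor must hold.
    An empty description covers every row."""
    return all(descriptor(row) for descriptor in description)


# sex = 'Male' AND age = (18, 30]
description = [nominal("sex", "Male"), interval("age", 18, 30)]

rows = [
    {"sex": "Male", "age": 25},
    {"sex": "Male", "age": 18},    # excluded: the interval is open at 18
    {"sex": "Female", "age": 25},  # excluded: the nominal descriptor fails
    {"sex": "Male", "age": 30},    # included: the interval is closed at 30
]

membership = [covers(description, row) for row in rows]
```

The resulting boolean mask is the kind of subgroup indicator that a quality measure can then compare against its negation.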

**All metrics in the [fairlearn.metrics](https://fairlearn.github.io/v0.6.0/api_reference/fairlearn.metrics.html) module are symmetrical:** they always return a value between 0 and 1 and do not distinguish whether a subgroup is "positively" or "negatively" dissimilar. For example the descriptions "married = True" and "married = False" will always have the same quality.
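
The symmetry can be checked with a small hand-rolled example. The sketch below uses a plain-Python selection-rate difference (not the Fairlearn function) on made-up data, and evaluates it once for the mask of "married = True" and once for its complement.

```python
def selection_rate_difference(y_pred, member):
    """Absolute gap in positive-prediction rate between members and non-members."""
    inside = [p for p, m in zip(y_pred, member) if m]
    outside = [p for p, m in zip(y_pred, member) if not m]
    return abs(sum(inside) / len(inside) - sum(outside) / len(outside))


# Hypothetical predictions and marital status for five individuals.
y_pred = [1, 1, 0, 1, 0]
married = [True, True, True, False, False]
not_married = [not m for m in married]

quality_married = selection_rate_difference(y_pred, married)
quality_not_married = selection_rate_difference(y_pred, not_married)
```

Swapping the subgroup with its negation only swaps the two rates inside the absolute value, so both calls return the same number.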
