### Dataset Loading

The `Adult` class is a wrapper for the `adult` dataset. It provides methods that load the training and test splits as `numpy.ndarray` objects.

In [None]:
from data.adult import Adult

adult = Adult()
train_X, train_y = adult.get_train_data()
test_X, test_y = adult.get_test_data()

### Task Preparation

The `BinaryLogisticClassificationTask` class performs binary logistic classification using `sklearn.linear_model.LogisticRegression`. It provides methods to train and evaluate the model, as well as a helper method that returns thresholded predictions. See [tasks.py](./ferm_ge/tasks.py) for more details.

In [None]:
from ferm_ge.tasks import BinaryLogisticClassificationTask

task = BinaryLogisticClassificationTask()
task.train(train_X, train_y)
task.test(test_X, test_y)
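
The thresholded predictions mentioned above can be illustrated with a small, self-contained sketch. It is not the code in `tasks.py`: the helper name `predict_with_threshold`, the `>=` comparison, and the `[tn, fp, fn, tp]` ordering are assumptions chosen to mirror how the confusion matrix is unpacked later in this notebook.

```python
import numpy as np


def predict_with_threshold(proba, y_true, thr):
    """Hypothetical helper: threshold positive-class probabilities.

    Returns the 0/1 predictions and the counts [tn, fp, fn, tp].
    """
    proba = np.asarray(proba, dtype=float)
    y_true = np.asarray(y_true, dtype=int)

    # Assumption: predict the positive class when the probability
    # reaches the threshold.
    y_pred = (proba >= thr).astype(int)

    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    return y_pred, np.array([tn, fp, fn, tp])


# Toy probabilities and labels, for illustration only.
toy_proba = [0.1, 0.4, 0.6, 0.9]
toy_labels = [0, 1, 0, 1]
toy_pred, toy_confmat = predict_with_threshold(toy_proba, toy_labels, thr=0.5)
```

Sweeping `thr` over a grid of candidates yields one confusion matrix per threshold, which is the kind of input the later cells iterate over.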

### Parameter Definition

In [None]:
import numpy as np
from typing import List

alpha = 1.0
a = 5
c = 7.5
gamma = 0.04
nu = 0.01
lambda_max = 20.0

thr_candidates: List[float] = np.linspace(0.0, 1.0, 200).tolist()

### (Implementation Hack) Helper Variables Calculation

Our implementation uses precomputed `I_alpha` and `err` values for faster computation.
These values differ for each `alpha` and `r` value, so the caches below are built for one fixed parameter setting, with one entry per threshold candidate.

In [None]:
from ferm_ge.algorithm_ge import ge_confmat

I_alpha_cache = []
err_cache = []

for thr in thr_candidates:
    _, confmat = task.predict_train_with_threshold(thr)
    tn, fp, fn, tp = confmat.astype(float)
    err: float = (fp + fn) / (tn + fp + fn + tp)
    err_cache.append(err)
    I_alpha: float = ge_confmat(alpha, c, a, tn, fp, fn, tp)
    I_alpha_cache.append(I_alpha)
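
To give a feel for what a cached `I_alpha` value might represent, here is a minimal sketch of a generalized entropy index computed from confusion-matrix counts. It is not the `ge_confmat` implementation. In particular, the benefit function `b = a * (y_pred - y_true) + c` is an assumption made for illustration (it requires `c > a` so that every benefit is positive), and the function name `ge_from_confmat` is hypothetical. Check `ferm_ge.algorithm_ge` for the definition the library actually uses.

```python
import math


def ge_from_confmat(alpha, c, a, tn, fp, fn, tp):
    """Hypothetical sketch of a generalized entropy index.

    Assumed benefit per sample: b = a * (y_pred - y_true) + c, so
    correct predictions get c, false positives c + a, and false
    negatives c - a.
    """
    n = tn + fp + fn + tp
    # (benefit value, number of samples with that benefit)
    groups = [(c, tn + tp), (c + a, fp), (c - a, fn)]
    mu = sum(b * k for b, k in groups) / n  # mean benefit

    if alpha == 1:
        # Limit alpha -> 1 (Theil index).
        total = sum(k * (b / mu) * math.log(b / mu) for b, k in groups)
        return total / n
    if alpha == 0:
        # Limit alpha -> 0 (mean log deviation).
        total = sum(k * math.log(b / mu) for b, k in groups)
        return -total / n
    total = sum(k * ((b / mu) ** alpha - 1.0) for b, k in groups)
    return total / (n * alpha * (alpha - 1.0))


# A classifier with no errors gives every sample the same benefit,
# so the index is zero; errors make it positive.
ge_perfect = ge_from_confmat(1.0, 7.5, 5, tn=10, fp=0, fn=0, tp=10)
ge_mixed = ge_from_confmat(1.0, 7.5, 5, tn=1, fp=1, fn=1, tp=1)
```

Because the index depends only on the four counts and the fixed parameters, it can be evaluated once per threshold and stored, which is what makes the caching above worthwhile.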

### FERM-GE Solver Initialization

In [None]:
from ferm_ge.algorithm_gefair import GEFairSolverC

lib_path = GEFairSolverC.compile_gefair()
solver = GEFairSolverC(lib_path)

### Problem Solving

For memory efficiency, the `I_alpha` and `err` histories are not collected during training by default. Here, to demonstrate the API, we collect them by setting `collect_ge_history=True`.

In [None]:
gefair_result = solver.solve_gefair(
    thr_candidates,
    I_alpha_cache,
    err_cache,
    alpha,
    lambda_max,
    nu,
    c,
    a,
    gamma,
    collect_ge_history=True,
)

### Metrics Calculation

Our implementation uses a `frozenset`, called a `frozenkey`, to represent a set of parameters. A `frozenkey` can be converted to a dictionary with `ferm_ge.utils.frozenkey_to_paramdict`, and a dictionary can be converted back with `ferm_ge.utils.paramdict_to_frozenkey`. All built-in metric calculation and plotting functions work with `frozenkey` and `paramdict`, so the results of multiple experiments can be passed at once.
Note that there are two kinds of metrics: `I_alpha` and `err`.

In [None]:
from ferm_ge.utils import paramdict_to_frozenkey
from ferm_ge.metrics import calc_metrics

key = paramdict_to_frozenkey(
    {
        "alpha": alpha,
        "c": c,
        "a": a,
        "gamma": gamma,
        "nu": nu,
        "lambda_max": lambda_max,
    }
)

gefair_metrics = calc_metrics(
    {key: gefair_result},
    task,
    repeat=0,
)

print("I_alpha", gefair_metrics[key].I_alpha)
print("err", gefair_metrics[key].err)
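
The `frozenkey` idea described above can be sketched in plain Python. A `dict` is not hashable, but a `frozenset` of its items is, so it can serve as a dictionary key. The helper names `to_frozenkey` and `to_paramdict` are hypothetical stand-ins, and the functions in `ferm_ge.utils` may differ in detail.

```python
def to_frozenkey(paramdict):
    """Hypothetical stand-in: turn a parameter dict into a hashable key."""
    return frozenset(paramdict.items())


def to_paramdict(frozenkey):
    """Hypothetical stand-in: recover the parameter dict from the key."""
    return dict(frozenkey)


demo_params = {"alpha": 1.0, "c": 7.5, "a": 5, "gamma": 0.04}
demo_key = to_frozenkey(demo_params)

# The key can index a results dictionary holding several experiments.
demo_results = {demo_key: "result for this configuration"}

# The insertion order of the parameters does not affect the key.
reordered_key = to_frozenkey({"gamma": 0.04, "a": 5, "c": 7.5, "alpha": 1.0})
demo_lookup = demo_results[reordered_key]
```

This only works when every parameter value is itself hashable (numbers, strings, tuples), which holds for the scalar parameters used in this notebook.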

### Convergence Plotting

In [None]:
from ferm_ge.plotting import plot_convergence

plot_convergence({key: gefair_result}, "I_alpha");
plot_convergence({key: gefair_result}, "err");