# Fuzzy Inference System

## Takagi-Sugeno (TS) Fuzzy Model
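We use a Takagi-Sugeno (TS) fuzzy model because we want a crisp output score for how likely it is that the patient has Alzheimer's disease, rather than a fuzzy set output as with the Mamdani model. Since the target is binary (AD vs Non-AD), this score is later thresholded into a class label. A zero-order TS model would be enough for a binary target, but the consequents fitted in "Fit the TS Consequents" are linear (first-order) so that they can also use the features left out of the rule antecedents.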
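
As a minimal sketch of how a TS system turns rule activations into one crisp score, the snippet below takes the weighted average of the per-rule outputs. The function name `ts_output`, the firing strengths and the rule outputs are all illustrative assumptions, not values from this notebook's data.

```python
import numpy as np

def ts_output(firing_strengths, rule_outputs, default=0.5):
    """Weighted average of rule outputs; `default` is returned if no rule fires."""
    w = np.asarray(firing_strengths, dtype=float)
    y = np.asarray(rule_outputs, dtype=float)
    total = w.sum()
    if total == 0.0:
        return default
    return float(np.dot(w, y) / total)

# Hypothetical example: three rules, each with a crisp output value
score = ts_output([0.6, 0.2, 0.0], [0.9, 0.1, 0.5])
print(round(score, 3))  # 0.7
```

In a zero-order model each rule output is a constant; in a first-order model it is the value of a linear function of the inputs, but the weighted average is the same.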

## Preparing Fuzzy Sets

### Imports

In [266]:
from pathlib import Path
import pandas as pd
import numpy as np
from simpful import *
from sklearn.cluster import KMeans
from collections import Counter, defaultdict
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    fbeta_score,
    roc_auc_score,
)
from sklearn.model_selection import StratifiedKFold, train_test_split
from itertools import product

### Load Data

In [267]:
# Folder path
path = Path("../data/selected")

# Load data (already split)
X_train = pd.read_csv(path / "X_train_selected_fis.csv")
X_test = pd.read_csv(path / "X_test_selected_fis.csv")
y_train = pd.read_csv(path / "y_train.csv")
y_test = pd.read_csv(path / "y_test.csv")

# Drop the unwanted index-like column
if "Unnamed: 0" in X_train.columns:
    X_train = X_train.drop(columns=["Unnamed: 0"])
if "Unnamed: 0" in X_test.columns:
    X_test = X_test.drop(columns=["Unnamed: 0"])
y_train = y_train["Diagnosis"]
y_test = y_test["Diagnosis"]

print(X_train.shape)
print(y_train.shape)
print(X_train.columns)

(1934, 11)
(1934,)
Index(['BehavioralProblems', 'Diabetes', 'EducationLevel', 'MemoryComplaints',
       'ADL', 'AlcoholConsumption', 'CholesterolHDL', 'FunctionalAssessment',
       'MMSE', 'SleepQuality', 'SystolicBP'],
      dtype='object')


### Select Features

The features for this model are selected as follows. Of the features identified during feature engineering, only the numerical ones are considered, because the categorical features have too few classes to create meaningful fuzzy sets. Of these 7 numerical features, we select 5 that provide both cognitive/functional and vascular/metabolic context:
- MMSE (Mini-Mental State Examination) - Standard global cognitive screening test used in dementia and Alzheimer’s diagnosis and staging
- FunctionalAssessment - Functional decline (managing finances, medication, daily tasks) is core to dementia diagnosis and staging, not just a side measure
- ADL (Activities of Daily Living) - ADL scores (bathing, dressing, feeding, toileting, etc.) are a classic way to measure the impact of dementia on basic autonomy
- CholesterolHDL - Vascular risk factors (lipids, hypertension, diabetes) are important for vascular dementia and mixed dementia, and they also interact with Alzheimer’s pathology
- SystolicBP - Hypertension (especially midlife) is a key risk factor for later-life cognitive impairment and dementia

In [268]:
fis_features = [
    "MMSE",
    "FunctionalAssessment",
    "ADL",
    "CholesterolHDL",
    "SystolicBP",
]

X_train_fis = X_train[fis_features].copy()
X_test_fis = X_test[fis_features].copy()
print(X_train_fis.head(5))

        MMSE  FunctionalAssessment       ADL  CholesterolHDL  SystolicBP
0  12.292725              6.751583  9.061350       50.421329         104
1   3.731915              8.136757  5.886320       51.891374         173
2  26.980845              4.803596  0.123688       36.567192         120
3   2.313023              2.952020  9.017740       58.584837         149
4  19.739526              6.931809  1.927703       52.130322         129


### Create Fuzzy Set Membership Functions

- MMSE - https://muhc.ca/sites/default/files/micro/m-PT-OT/OT/Mini-Mental-State-Exam-%28MMSE%29.pdf
    - 24-30: No cognitive impairment 
    - 18-23: Mild cognitive impairment 
    - 0-17: Severe cognitive impairment 
- Functional Assessment
    - No standardised cut-offs, so generic sets over the 0-10 range are used
- Activities of daily living (ADL)
    - No standardised cut-offs, so generic sets over the 0-10 range are used
- Cholesterol HDL - https://www.heartuk.org.uk/cholesterol/understanding-your-cholesterol-test-results-
    - The threshold differs for men and women:
    - above 39 for a man is healthy
    - above 46 for a woman is healthy
    - With fuzzy sets we can blur this boundary, so the two sets overlap between 39 and 46
- Systolic BP - https://www.heart.org/en/health-topics/high-blood-pressure/understanding-blood-pressure-readings
    - \<120: Normal
    - 120-129: Elevated
    - 130-139: Stage 1 Hypertension
    - 140-180: Stage 2 Hypertension
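
To illustrate how overlapping triangular sets blur these cut-offs, here is a small standalone sketch. The `triangular` function is a plain-Python stand-in written for this example (it is not Simpful's implementation, whose edge-case behaviour may differ), and the MMSE parameters are the ones used in the code cell that follows.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x > b:
        return (c - x) / (c - b)
    return 1.0  # x == b; also covers the shoulder cases a == b and b == c

# MMSE sets as parameterised in this notebook: (a, b, c)
mmse_sets = {"Severe": (0, 0, 19), "Mild": (16, 20, 25), "Normal": (22, 30, 30)}

mmse = 17
memberships = {term: round(triangular(mmse, *abc), 3) for term, abc in mmse_sets.items()}
print(memberships)  # {'Severe': 0.105, 'Mild': 0.25, 'Normal': 0.0}
```

A score of 17 therefore belongs partly to "Severe" and partly to "Mild" instead of falling on one side of a hard cut-off.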

In [269]:
FS = FuzzySystem(show_banner=False)

# Input: MMSE
MMSE_S_1 = FuzzySet(function=Triangular_MF(a=0, b=0, c=19), term="Severe")
MMSE_S_2 = FuzzySet(function=Triangular_MF(a=16, b=20, c=25), term="Mild")
MMSE_S_3 = FuzzySet(function=Triangular_MF(a=22, b=30, c=30), term="Normal")
MMSE = LinguisticVariable([MMSE_S_1, MMSE_S_2, MMSE_S_3], universe_of_discourse=[0,30])
FS.add_linguistic_variable("MMSE", MMSE)

# Input: Functional Assessment
FUNC_S_1 = FuzzySet(function=Triangular_MF(a=0, b=0, c=4), term="SevereImpairment")
FUNC_S_2 = FuzzySet(function=Triangular_MF(a=2, b=5, c=8), term="ModerateImpairment")
FUNC_S_3 = FuzzySet(function=Triangular_MF(a=6, b=10, c=10), term="NoImpairment")
FUNC = LinguisticVariable([FUNC_S_1, FUNC_S_2, FUNC_S_3], universe_of_discourse=[0,10])
FS.add_linguistic_variable("FunctionalAssessment", FUNC)

# Input: ADL
ADL_S_1 = FuzzySet(function=Triangular_MF(a=0, b=0, c=4), term="SevereImpairment")
ADL_S_2 = FuzzySet(function=Triangular_MF(a=2, b=5, c=8), term="ModerateImpairment")
ADL_S_3 = FuzzySet(function=Triangular_MF(a=6, b=10, c=10), term="NoImpairment")
ADL = LinguisticVariable([ADL_S_1, ADL_S_2, ADL_S_3], universe_of_discourse=[0,10])
FS.add_linguistic_variable("ADL", ADL)

# Input: Cholesterol HDL
HDL_S_1 = FuzzySet(function=Triangular_MF(a=20, b=20, c=46), term="Low")
HDL_S_2 = FuzzySet(function=Triangular_MF(a=39, b=100, c=100), term="Normal")
HDL = LinguisticVariable([HDL_S_1, HDL_S_2], universe_of_discourse=[20,100])
FS.add_linguistic_variable("CholesterolHDL", HDL)

# Input: Systolic BP
SBP_S_1 = FuzzySet(function=Triangular_MF(a=90,  b=90,  c=120), term="Normal")
SBP_S_2 = FuzzySet(function=Triangular_MF(a=115, b=125, c=135), term="Elevated")
SBP_S_3 = FuzzySet(function=Triangular_MF(a=130, b=160, c=180), term="Hypertensive")
SBP = LinguisticVariable([SBP_S_1, SBP_S_2, SBP_S_3], universe_of_discourse=[90,180])
FS.add_linguistic_variable("SystolicBP", SBP)

## Initialise the Rule Base

There are two options for determining the rule base:
- Grid rule base - not ideal here, since the full grid contains 162 possible rules (3x3x3x2x3), which is too many to populate reliably from the available data
- Data-driven rule extraction - works well for this case

DECISION: With all 5 features there are too many possible rules, so we build the rule antecedents from the cognitive features only and incorporate the remaining features in the consequents. (This was also found to improve model performance on the validation set.)
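
The rule counts can be checked by enumerating the term combinations. This is a quick sketch; the dictionary `n_terms` simply restates the number of fuzzy sets defined per variable above.

```python
from itertools import product

# Number of linguistic terms per input, as defined in the membership functions
n_terms = {
    "MMSE": 3,
    "FunctionalAssessment": 3,
    "ADL": 3,
    "CholesterolHDL": 2,
    "SystolicBP": 3,
}
cognitive = ["MMSE", "FunctionalAssessment", "ADL"]

# Every possible antecedent is one choice of term index per variable
full_grid = list(product(*(range(n) for n in n_terms.values())))
cognitive_grid = list(product(*(range(n_terms[v]) for v in cognitive)))

print(len(full_grid), len(cognitive_grid))  # 162 27
```

Restricting the antecedents to the three cognitive features reduces the candidate rule space from 162 to 27 combinations.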

### Prepare dataframe and input and output columns

In [None]:
input_features = [
    "MMSE",
    "FunctionalAssessment",
    "ADL",
    "CholesterolHDL",
    "SystolicBP",
]

input_features_rules = [
    "MMSE",
    "FunctionalAssessment",
    "ADL",
]

output_col = "diagnosis"

df_train = X_train_fis.copy()
df_train[output_col] = y_train.values

### Get the linguistic terms for each feature for each sample

In [None]:


def get_max_membership_term(FS, var_name, x):
    """
    For a given crisp value x of feature var_name,
    return the linguistic term with highest membership degree.
    """
    fuzzy_sets = FS.get_fuzzy_sets(var_name)
    best_term = None
    best_mu = -1.0

    for fs in fuzzy_sets:
        term = fs.get_term()
        mu = fs.get_value(x)
        if mu > best_mu:
            best_mu = mu
            best_term = term

    return best_term, best_mu

def compute_max_term_combinations(FS, df, input_vars=input_features):
    """
    For each row in df:
      - find the max-membership term for each input variable
      - create a tuple combo of these terms
    Returns:
      df_with_terms: original df with *_term columns and 'combo_key'
    """
    df = df.copy()

    combo_keys = []

    for idx, row in df.iterrows():
        combo_terms = []
        for var in input_vars:
            x = row[var]
            term, mu = get_max_membership_term(FS, var, x)
            df.loc[idx, f"{var}_term"] = term
            combo_terms.append(term)

        combo_key = tuple(combo_terms)
        combo_keys.append(combo_key)

    df["combo_key"] = combo_keys
    return df

# df_terms = compute_max_term_combinations(FS, df_train, input_features_rules)
# df_terms.head()



### Aggregate diagnoses for term combinations

In [272]:
# df_terms["combo_key"].value_counts().head(10)
# print("Unique combos:", df_terms["combo_key"].nunique())

# df_terms[
#     [
#         "MMSE", "FunctionalAssessment", "ADL",
#         "MMSE_term", "FunctionalAssessment_term",
#         "ADL_term", "combo_key"
#     ]
# ].head(10)

# df_terms[
#     [
#         "MMSE", "FunctionalAssessment", "ADL",
#         "CholesterolHDL", "SystolicBP",
#         "MMSE_term", "FunctionalAssessment_term",
#         "ADL_term", "CholesterolHDL_term",
#         "SystolicBP_term", "combo_key"
#     ]
# ].head(10)

In [273]:
def aggregate_combinations(df_terms, input_vars=input_features, label_col=output_col):
    """
    Build a table of:
      - each unique combo_key
      - total count
      - AD / Non-AD counts
      - AD_ratio
    Returns:
      combo_df: data frame with one row per combination
    """
    records = []

    # Group by combo_key
    grouped = df_terms.groupby("combo_key")
    print(len(grouped))

    for combo, group in grouped:
        total = len(group)
        # binary label 0/1 and 1 = AD
        ad_count = group[label_col].sum()
        nonad_count = total - ad_count
        ad_ratio = ad_count / total if total > 0 else 0.0

        # unpack terms into separate columns for readability
        combo_dict = {"combo_key": combo,
                      "total": total,
                      "AD_count": ad_count,
                      "NonAD_count": nonad_count,
                      "AD_ratio": ad_ratio}

        for var, term in zip(input_vars, combo):
            combo_dict[f"{var}_term"] = term

        records.append(combo_dict)

    combo_df = pd.DataFrame.from_records(records)
    return combo_df

# combo_df = aggregate_combinations(df_terms, input_features_rules, output_col)
# combo_df.sort_values("total", ascending=False).head(10)


### Select Rule Antecedents

min_support and min_purity are hyperparameters that control how many term combinations become rules. We tune them with cross-validation in the Hyperparameter Optimisation section.
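
The effect of the two thresholds can be sketched on a toy combination table. The counts below and the helper name `filter_combos` are invented for illustration; the notebook's own selection function follows in the next cell.

```python
import pandas as pd

# Hypothetical combination table (counts invented for illustration)
combo_df = pd.DataFrame({
    "combo": ["A", "B", "C", "D"],
    "total": [120, 60, 20, 80],
    "AD_ratio": [0.95, 0.10, 0.90, 0.55],
})

def filter_combos(df, min_support, min_purity):
    """Keep combos with enough samples and a majority-class share >= min_purity."""
    # Purity is the share of the dominant class: AD_ratio or 1 - AD_ratio
    purity = df["AD_ratio"].where(df["AD_ratio"] >= 0.5, 1 - df["AD_ratio"])
    keep = (df["total"] >= min_support) & (purity >= min_purity)
    return df[keep]

kept = filter_combos(combo_df, min_support=45, min_purity=0.8)
print(kept["combo"].tolist())  # ['A', 'B']
```

Combination C is pure enough but too rare, and D is frequent but too mixed, so both are dropped.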


In [None]:
def select_rule_antecedents(df, input_vars=input_features_rules, min_support=45, min_purity=0.8):
    """
    Select combinations that will become rule antecedents, based on:
      - min_support: minimum number of samples for that combination
      - min_purity: minimum share of the majority class
        (e.g. with min_purity=0.8: AD_ratio >= 0.8 or AD_ratio <= 0.2)

    Returns:
      rules: list of dicts like
             {'antecedent': { 'MMSE': 'Severe', 'FunctionalAssessment': 'SevereImpairment', ... },
              'total': ...,
              'AD_ratio': ...}
    """
    rules = []

    for _, row in df.iterrows():
        total = row["total"]
        ad_ratio = row["AD_ratio"]

        if total < min_support:
            continue

        # purity: either strongly AD or strongly Non-AD
        if (ad_ratio >= min_purity) or (ad_ratio <= 1 - min_purity):
            antecedent = {}
            for var in input_vars:
                antecedent[var] = row[f"{var}_term"]

            rule_info = {
                "antecedent": antecedent,
                "total": int(total),
                "AD_ratio": float(ad_ratio),
            }
            rules.append(rule_info)

    return rules

# rules = select_rule_antecedents(combo_df, input_features_rules)
# len(rules), rules[:3]


### Compute firing strengths for each rule

In [275]:
def get_membership(FS, var_name, term_name, x):
    """
    Return μ_{term_name}(x) for the given variable.
    """
    fuzzy_sets = FS.get_fuzzy_sets(var_name)
    for fs in fuzzy_sets:
        if fs.get_term() == term_name:
            return fs.get_value(x)
    raise ValueError(f"Term '{term_name}' not found for variable '{var_name}'")

def firing_strength_for_rule(FS, rule_antecedent, sample_row):
    """
    Compute the firing strength of ONE rule for ONE sample.
    
    rule_antecedent: dict like { 'MMSE': 'Severe', 'FunctionalAssessment': 'SevereImpairment', ... }
    sample_row: a pandas Series with crisp input values for the same variables
    """
    strength = 1.0   # product t-norm for fuzzy AND

    for var_name, term_name in rule_antecedent.items():
        x = sample_row[var_name]
        mu = get_membership(FS, var_name, term_name, x)
        strength *= mu

        # early exit if strength drops to zero
        if strength == 0.0:
            break

    return strength

def compute_firing_strength_matrix(FS, rules, X_fis):
    """
    FS    : Simpful FuzzySystem with the 5 input variables.
    rules : list of rule dicts, each with an 'antecedent' key as above.
    X_fis : DataFrame of raw inputs used for FIS
            (columns: 'MMSE', 'FunctionalAssessment', 'ADL',
                      'CholesterolHDL', 'SystolicBP')
    
    Returns:
        W: numpy array of shape (N_samples, N_rules)
    """
    N = len(X_fis)
    R = len(rules)
    W = np.zeros((N, R), dtype=float)

    for i, (idx, row) in enumerate(X_fis.iterrows()):
        for j, rule in enumerate(rules):
            antecedent = rule["antecedent"]
            W[i, j] = firing_strength_for_rule(FS, antecedent, row)

    return W

# # X_train_fis must be the RAW data (not scaled) with the FIS column names
# W_train = compute_firing_strength_matrix(FS, rules, X_train_fis)

# print(W_train.shape)  # (N_samples, 11) for your current ~11 rules
# print(W_train[:5])    # first 5 samples' firing strengths

# rule_names = [f"rule_{k}" for k in range(len(rules))]
# W_train_df = pd.DataFrame(W_train, index=X_train_fis.index, columns=rule_names)

# row_sums = W_train.sum(axis=1)
# covered = np.mean(row_sums > 0)
# print("Fraction of training samples covered by at least one rule:", covered)

# for j, rule in enumerate(rules):
#     nz = np.count_nonzero(W_train[:, j] > 0)
#     print(f"Rule {j}: nonzero activations = {nz}")

## Training the TS Model

### Fit the TS Consequents

DECISION: We opt for linear consequents, where the rule antecedent describes a region of the input space and the linear model describes the effect of the variables within that region. This lets us include the effect of the features that were excluded from the rule antecedents.
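
The fitting step can be sketched on synthetic data: each rule contributes a block of columns equal to its normalised firing strength times `[1, x1, ..., xd]`, and one least-squares solve recovers all rule parameters at once. Everything below (sizes, random firing strengths, `theta_true`) is made up for the example, and the design matrix is built with broadcasting rather than the explicit loops used in the next cell.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: N samples, d features, R rules
N, d, R = 200, 2, 3
X = rng.uniform(0, 1, size=(N, d))
W = rng.uniform(0.1, 1, size=(N, R))        # made-up firing strengths
phi = W / W.sum(axis=1, keepdims=True)      # normalised firing strengths

# Ground-truth per-rule parameters [bias, w1, w2], used only to generate targets
theta_true = rng.normal(size=(R, d + 1))
X1 = np.hstack([np.ones((N, 1)), X])        # prepend bias column -> (N, d+1)
y = np.einsum("nr,rk,nk->n", phi, theta_true, X1)

# Design matrix: block r holds phi_r * [1, x1, ..., xd]
Z = (phi[:, :, None] * X1[:, None, :]).reshape(N, R * (d + 1))
theta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
theta_hat = theta_hat.reshape(R, d + 1)

print(np.allclose(theta_hat, theta_true, atol=1e-6))
```

Because the synthetic targets are noise-free, the recovered parameters should match the generating ones up to numerical precision; with real 0/1 labels the fit is only a least-squares approximation, so predictions can fall outside [0, 1].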

In [276]:
def fit_ts_linear_consequents(W_train, X_fis, y_train):
    """
    Learn linear TS consequents:
        y_r(x) = a_{r0} + a_{r1}*x1 + ... + a_{rd}*xd

    using global least squares on the Takagi–Sugeno structure:
        y_hat(x) = sum_r phi_r(x) * y_r(x)

    Parameters
    ----------
    W_train : array (N, R)
        Firing strengths for each sample and rule.
    X_fis : DataFrame (N, d)
        Raw input features used in the consequents (here: MMSE, Func, ADL, HDL, SBP).
    y_train : array-like (N,)
        Binary labels (0/1).

    Returns
    -------
    consequents : list of dicts
        Each dict has keys 'bias' and 'weights' (length d).
    """
    W = np.asarray(W_train, dtype=float)
    X = np.asarray(X_fis.values, dtype=float)
    y = np.asarray(y_train, dtype=float)

    N, R = W.shape
    _, d = X.shape

    # 1) Normalise firing strengths -> phi
    sum_w = W.sum(axis=1, keepdims=True)  # (N,1)
    phi = np.zeros_like(W)
    nonzero = sum_w[:, 0] > 0
    phi[nonzero, :] = W[nonzero, :] / sum_w[nonzero, :]

    # 2) Build design matrix Z (N x (R*(d+1)))
    # For each rule r, we have (1, x1,...,xd) scaled by phi_r.
    Z = np.zeros((N, R * (d + 1)), dtype=float)

    for n in range(N):
        for r in range(R):
            coeff = phi[n, r]
            if coeff == 0.0:
                continue
            start = r * (d + 1)
            Z[n, start] = coeff                # bias term
            Z[n, start + 1:start + 1 + d] = coeff * X[n, :]

    # 3) Solve least squares: Z @ theta ≈ y
    # (you could add regularisation if needed, but plain LS is fine here)
    theta, *_ = np.linalg.lstsq(Z, y, rcond=None)

    # 4) Unpack into per-rule parameters
    consequents = []
    for r in range(R):
        start = r * (d + 1)
        a0 = theta[start]
        a = theta[start + 1:start + 1 + d]
        consequents.append({
            "bias": float(a0),
            "weights": a,   # numpy array of length d
        })

    return consequents

# consequents = fit_ts_linear_consequents(W_train, X_train_fis, y_train)
# len(consequents), consequents[0]


### Predict with the trained TS model

In [277]:
def ts_predict_linear(FS, rules, consequents, X_fis, global_default=0.5):
    """
    Make predictions with a trained linear TS model.

    Parameters
    ----------
    FS : FuzzySystem
        Your Simpful fuzzy system with defined input variables and sets.
    rules : list of dicts
        Rule antecedents as before.
    consequents : list of dicts
        Output of fit_ts_linear_consequents.
    X_fis : DataFrame
        Raw input features (same columns/order as used in training).
    global_default : float
        Fallback prediction if a sample has zero firing strength for all rules.

    Returns
    -------
    y_hat : array (N,)
        Continuous outputs (you can threshold at 0.5 for classification).
    """
    X = np.asarray(X_fis.values, dtype=float)
    N = len(X_fis)
    R = len(rules)
    d = X.shape[1]

    y_hat = np.zeros(N, dtype=float)

    for i, (idx, row) in enumerate(X_fis.iterrows()):
        # 1) Compute firing strengths for this sample
        w = np.zeros(R, dtype=float)
        for r, rule in enumerate(rules):
            antecedent = rule["antecedent"]
            w[r] = firing_strength_for_rule(FS, antecedent, row)

        sum_w = w.sum()
        if sum_w == 0.0:
            # No rule fires: fallback to global default (e.g. mean of y_train)
            y_hat[i] = global_default
            continue

        # 2) Normalise to phi
        phi = w / sum_w

        # 3) Combine rule outputs
        out = 0.0
        x_i = X[i, :]
        for r in range(R):
            a0 = consequents[r]["bias"]
            a = consequents[r]["weights"]
            y_r = a0 + np.dot(a, x_i)
            out += phi[r] * y_r

        y_hat[i] = out

    return y_hat

# # Global default = mean label (good fallback if no rules fire)
# global_default = float(np.mean(y_train))

# y_train_hat = ts_predict_linear(FS, rules, consequents, X_train_fis,
#                                 global_default=global_default)

# # For binary classification, threshold at 0.5 (or tune threshold)
# y_train_pred = (y_train_hat >= 0.5).astype(int)


### Hyperparameter Optimisation

There are 3 hyperparameters:
- min_support - the minimum number of training samples that must share a term combination for it to enter the rule base
- min_purity - the minimum purity of a term combination, i.e. the proportion of its samples that belong to the majority class
- n_rules_max - the maximum number of rules kept after ranking the candidates by support x |AD_ratio - 0.5|
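
As a small sketch of how n_rules_max interacts with the ranking used in the cross-validation code, the snippet below sorts candidate rules by support x |AD_ratio - 0.5| and keeps the top entries. The (total, AD_ratio) pairs are rounded from the final rule listing printed later in this notebook; the names `r_a` to `r_d` and the helper `rank_rules` are hypothetical labels for the example.

```python
# (support, AD_ratio) pairs rounded from the notebook's final rule listing
candidates = [
    {"name": "r_a", "total": 87, "AD_ratio": 0.115},
    {"name": "r_b", "total": 107, "AD_ratio": 0.963},
    {"name": "r_c", "total": 76, "AD_ratio": 0.039},
    {"name": "r_d", "total": 146, "AD_ratio": 0.192},
]

def rank_rules(rules, n_rules_max):
    """Sort by support * |AD_ratio - 0.5| (descending) and keep the top n_rules_max."""
    ranked = sorted(
        rules,
        key=lambda r: r["total"] * abs(r["AD_ratio"] - 0.5),
        reverse=True,
    )
    return ranked[:n_rules_max]

top = rank_rules(candidates, n_rules_max=2)
print([r["name"] for r in top])  # ['r_b', 'r_d']
```

The score favours rules that are both frequent and far from a 50/50 class split, so a very pure but rare combination can rank below a larger, slightly less pure one.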

In [278]:
def cv_score_fis(
    FS,
    X_train_full,
    y_train,
    fis_features,
    rule_input_vars,
    min_support,
    min_purity,
    n_rules_max,
    n_splits=5,
    random_state=42,
):
    """
    Compute mean ROC AUC over stratified k-fold CV for one FIS hyperparameter setting.
    """
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    aucs = []

    # Make sure y_train is a 1D array / Series
    y_train = np.asarray(y_train).ravel()

    for train_idx, val_idx in skf.split(X_train_full, y_train):
        # Split into CV train/val
        X_tr_full = X_train_full.iloc[train_idx]
        X_val_full = X_train_full.iloc[val_idx]
        y_tr = y_train[train_idx]
        y_val = y_train[val_idx]

        # FIS uses the selected 5 continuous inputs
        X_tr_fis = X_tr_full[fis_features].copy()
        X_val_fis = X_val_full[fis_features].copy()

        # Build df_tr with diagnosis for rule mining
        df_tr = X_tr_fis.copy()
        df_tr["diagnosis"] = y_tr

        # 1) Max-membership combos per sample (for rule inputs only)
        df_terms_fold = compute_max_term_combinations(FS, df_tr, rule_input_vars)

        # 2) Aggregate combinations
        combo_df_fold = aggregate_combinations(df_terms_fold, rule_input_vars, "diagnosis")

        # 3) Select rule antecedents with current min_support / min_purity
        rules_raw = select_rule_antecedents(
            combo_df_fold,
            input_vars=rule_input_vars,
            min_support=min_support,
            min_purity=min_purity,
        )

        # If no rules survive, this configuration is useless
        if len(rules_raw) == 0:
            # Assign a very poor score
            return 0.5

        # 4) Rank rules by informativeness and keep top n_rules_max
        #    score = support * |AD_ratio - 0.5|
        rules_sorted = sorted(
            rules_raw,
            key=lambda r: r["total"] * abs(r["AD_ratio"] - 0.5),
            reverse=True,
        )
        rules = rules_sorted[:n_rules_max]

        # 5) Compute firing strengths on CV-train
        W_tr = compute_firing_strength_matrix(FS, rules, X_tr_fis)

        # 6) Fit TS linear consequents
        consequents = fit_ts_linear_consequents(W_tr, X_tr_fis, y_tr)

        # 7) Predict scores on CV-val
        global_default = float(y_tr.mean())
        y_val_scores = ts_predict_linear(FS, rules, consequents, X_val_fis,
                                         global_default=global_default)

        # 8) ROC AUC on this fold
        try:
            auc = roc_auc_score(y_val, y_val_scores)
        except ValueError:
            # In case only one class appears in the fold (unlikely with stratified CV)
            auc = 0.5
        aucs.append(auc)

    return float(np.mean(aucs)) if aucs else 0.5



In [None]:
# Full training data that FIS sees
X_train_full = X_train_fis.copy()   
# y_train is already a Series here; the DataFrame branch uses the CSV column name "Diagnosis"
y_train_full = y_train["Diagnosis"] if isinstance(y_train, pd.DataFrame) else y_train

# Antecedent variables
rule_input_vars = input_features_rules 

# Hyperparameter grid (≤ 20 combos)
param_grid = {
    'min_support': [30, 35, 40],
    'min_purity': [0.75, 0.8, 0.85],
    'n_rules_max': [6, 8]
}

param_combos = [
    {"min_support": ms, "min_purity": mp, "n_rules_max": nr}
    for ms, mp, nr in product(
        param_grid["min_support"],
        param_grid["min_purity"],
        param_grid["n_rules_max"],
    )
]

best_auc = -np.inf
best_params = None

for params in param_combos:
    auc = cv_score_fis(
        FS=FS,
        X_train_full=X_train_full,
        y_train=y_train_full,
        fis_features=fis_features,
        rule_input_vars=rule_input_vars,
        min_support=params["min_support"],
        min_purity=params["min_purity"],
        n_rules_max=params["n_rules_max"],
        n_splits=5,
        random_state=42,
    )
    print(params, "→ mean CV ROC AUC:", auc)

    if auc > best_auc:
        best_auc = auc
        best_params = params

print("\nBest params:", best_params, "with mean CV ROC AUC:", best_auc)

27
27
27
27
27
{'min_support': 30, 'min_purity': 0.75, 'n_rules_max': 6} → mean CV ROC AUC: 0.7489200515242593
27
27
27
27
27
{'min_support': 30, 'min_purity': 0.75, 'n_rules_max': 8} → mean CV ROC AUC: 0.76524259338772
27
27
27
27
27
{'min_support': 30, 'min_purity': 0.8, 'n_rules_max': 6} → mean CV ROC AUC: 0.748897745813654
27
27
27
27
27
{'min_support': 30, 'min_purity': 0.8, 'n_rules_max': 8} → mean CV ROC AUC: 0.7679331902103907
27
27
27
27
27
{'min_support': 30, 'min_purity': 0.85, 'n_rules_max': 6} → mean CV ROC AUC: 0.7520290253327607
27
27
27
27
27
{'min_support': 30, 'min_purity': 0.85, 'n_rules_max': 8} → mean CV ROC AUC: 0.7658146414770288
27
27
27
27
27
{'min_support': 35, 'min_purity': 0.75, 'n_rules_max': 6} → mean CV ROC AUC: 0.7489200515242593
27
27
27
27
27
{'min_support': 35, 'min_purity': 0.75, 'n_rules_max': 8} → mean CV ROC AUC: 0.7659958780592528
27
27
27
27
27
{'min_support': 35, 'min_purity': 0.8, 'n_rules_max': 6} → mean CV ROC AUC: 0.7483897166165736
27
27
2

In [None]:
# Final FIS training on full training set using best hyperparameters

min_support = best_params["min_support"]
min_purity = best_params["min_purity"]
n_rules_max = best_params["n_rules_max"]

# Build df_terms on full train
df_train_full = X_train_full.copy()
df_train_full["diagnosis"] = y_train_full

df_terms_full = compute_max_term_combinations(FS, df_train_full, rule_input_vars)
combo_df_full = aggregate_combinations(df_terms_full, rule_input_vars, "diagnosis")

# Raw rules from full train
rules_raw_full = select_rule_antecedents(
    combo_df_full,
    input_vars=rule_input_vars,
    min_support=min_support,
    min_purity=min_purity,
)

# Rank and keep top n_rules_max
rules_sorted_full = sorted(
    rules_raw_full,
    key=lambda r: r["total"] * abs(r["AD_ratio"] - 0.5),
    reverse=True,
)
rules_final = rules_sorted_full[:n_rules_max]

print("Number of final rules:", len(rules_final))
for r in rules_final:
    print(r)

# Firing strengths and consequents on full train
W_train_final = compute_firing_strength_matrix(FS, rules_final, X_train_full)
consequents_final = fit_ts_linear_consequents(W_train_final, X_train_full, y_train_full)

# Evaluate on TRAIN (just to see)
global_default_final = float(np.mean(y_train_full))
y_train_score = ts_predict_linear(
    FS, rules_final, consequents_final, X_train_full, global_default=global_default_final
)
train_auc = roc_auc_score(y_train_full, y_train_score)
print("Final FIS TRAIN ROC AUC:", train_auc)


27
Number of final rules: 8
{'antecedent': {'MMSE': 'Severe', 'FunctionalAssessment': 'SevereImpairment', 'ADL': 'SevereImpairment'}, 'total': 107, 'AD_ratio': 0.9626168224299065}
{'antecedent': {'MMSE': 'Severe', 'FunctionalAssessment': 'NoImpairment', 'ADL': 'ModerateImpairment'}, 'total': 146, 'AD_ratio': 0.1917808219178082}
{'antecedent': {'MMSE': 'Normal', 'FunctionalAssessment': 'ModerateImpairment', 'ADL': 'ModerateImpairment'}, 'total': 76, 'AD_ratio': 0.039473684210526314}
{'antecedent': {'MMSE': 'Severe', 'FunctionalAssessment': 'NoImpairment', 'ADL': 'NoImpairment'}, 'total': 87, 'AD_ratio': 0.11494252873563218}
{'antecedent': {'MMSE': 'Mild', 'FunctionalAssessment': 'NoImpairment', 'ADL': 'NoImpairment'}, 'total': 60, 'AD_ratio': 0.11666666666666667}
{'antecedent': {'MMSE': 'Normal', 'FunctionalAssessment': 'SevereImpairment', 'ADL': 'ModerateImpairment'}, 'total': 46, 'AD_ratio': 0.043478260869565216}
{'antecedent': {'MMSE': 'Normal', 'FunctionalAssessment': 'ModerateImpai

## Threshold optimisation

DECISION: Use the F2 score to choose the decision threshold, since it weights recall more heavily than precision while still penalising false positives.
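
To show why F2 favours recall, the sketch below evaluates the standard F-beta formula for two hypothetical classifiers whose precision and recall are mirrored. The helper `fbeta` and the precision/recall pairs are illustrative assumptions; the notebook itself uses `sklearn.metrics.fbeta_score`.

```python
def fbeta(precision, recall, beta):
    """F-beta score: recall is weighted beta times as heavily as precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Two hypothetical classifiers with mirrored (precision, recall)
high_recall = (0.5, 0.9)
high_precision = (0.9, 0.5)

for name, (p, r) in [("high recall", high_recall), ("high precision", high_precision)]:
    print(name, "F1:", round(fbeta(p, r, 1), 3), "F2:", round(fbeta(p, r, 2), 3))
# high recall F1: 0.643 F2: 0.776
# high precision F1: 0.643 F2: 0.549
```

F1 cannot tell the two apart, while F2 clearly prefers the high-recall classifier, which is the behaviour wanted when a missed AD case is costlier than a false alarm.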

In [None]:
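# Validation split and threshold optimisation (using F2)
# Note: rules_final and consequents_final were fitted on all of X_train_full,
# so X_val is a subset of the fitting data rather than an unseen hold-out set.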
X_train_inner, X_val, y_train_inner, y_val = train_test_split(
    X_train_full,
    y_train_full,
    test_size=0.2,
    stratify=y_train_full,
    random_state=42,
)

X_val_fis = X_val[fis_features].copy()
y_val_scores = ts_predict_linear(
    FS, rules_final, consequents_final, X_val_fis, global_default=global_default_final
)

threshold_grid = np.linspace(0.0, 1.0, 101)

best_thresh = 0.5
best_f2 = -1.0
best_acc = -1.0

for thr in threshold_grid:
    y_val_pred = (y_val_scores >= thr).astype(int)

    f2 = fbeta_score(y_val, y_val_pred, beta=2)
    acc = accuracy_score(y_val, y_val_pred)

    if f2 > best_f2:
        best_f2 = f2
        best_acc = acc
        best_thresh = thr

print(f"Best threshold on validation set (F2): {best_thresh:.3f}")
print(f"Validation F2 at best threshold: {best_f2:.3f}")
print(f"Validation accuracy at best threshold: {best_acc:.3f}")


Best threshold on validation set (F2): 0.190
Validation F2 at best threshold: 0.783
Validation accuracy at best threshold: 0.566


## Evaluation on Test Set

In [None]:

# Test evaluation

X_test_fis = X_test[fis_features].copy()
# y_test is already a Series here; the DataFrame branch uses the CSV column name "Diagnosis"
y_test_full = y_test["Diagnosis"] if isinstance(y_test, pd.DataFrame) else y_test

y_test_score = ts_predict_linear(
    FS, rules_final, consequents_final, X_test_fis, global_default=global_default_final
)
test_auc = roc_auc_score(y_test_full, y_test_score)
print("Final FIS TEST ROC AUC:", test_auc)

# Apply the optimised threshold from the validation set
y_test_pred = (y_test_score >= best_thresh).astype(int)

print(f"\nTest metrics at threshold = {best_thresh:.3f}")
print("Confusion matrix:")
print(confusion_matrix(y_test_full, y_test_pred))

print("\nClassification report:")
print(classification_report(y_test_full, y_test_pred, digits=3))

test_f2 = fbeta_score(y_test_full, y_test_pred, beta=2)
print(f"Test F2 at threshold = {best_thresh:.3f}: {test_f2:.3f}")



Final FIS TEST ROC AUC: 0.8079326012873911

Test metrics at threshold = 0.190
Confusion matrix:
[[57 82]
 [ 3 73]]

Classification report:
              precision    recall  f1-score   support

           0      0.950     0.410     0.573       139
           1      0.471     0.961     0.632        76

    accuracy                          0.605       215
   macro avg      0.710     0.685     0.602       215
weighted avg      0.781     0.605     0.594       215

Test F2 at threshold = 0.190: 0.795
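
The headline numbers in the report can be recomputed directly from the confusion matrix above. This is a short sketch that assumes scikit-learn's layout (rows are true classes, columns are predicted classes, with class 0 first).

```python
import numpy as np

# Confusion matrix from the test output: rows = true class, columns = predicted class
cm = np.array([[57, 82],
               [3, 73]])
tn, fp, fn, tp = cm.ravel()

recall = tp / (tp + fn)          # sensitivity for the AD class
precision = tp / (tp + fp)
specificity = tn / (tn + fp)     # recall for the Non-AD class
accuracy = (tp + tn) / cm.sum()

print(f"recall={recall:.3f} precision={precision:.3f} "
      f"specificity={specificity:.3f} accuracy={accuracy:.3f}")
# recall=0.961 precision=0.471 specificity=0.410 accuracy=0.605
```

At the 0.190 threshold the model misses only 3 of 76 AD cases, at the cost of flagging 82 of 139 Non-AD patients, which is the trade-off that optimising for F2 selects.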
