# FastWoe Multiclass Example

Author: https://www.github.com/xRiskLab

This notebook demonstrates how to fit FastWoe on a multiclass target when the data is in row-level format (one row per observation). The target has three classes:
- `0`: No default
- `1`: UTP (unlikely to pay) default
- `2`: DPD (days past due) default

In [14]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

from fastwoe import FastWoe

In [15]:
# Create synthetic row-level data from the contingency table.
# Each tuple is (evidence category, class label, number of observations).
data = []
rows = [
    ("Delinquent > 5 days", 2, 50),
    ("Delinquent > 5 days", 1, 30),
    ("Delinquent > 5 days", 0, 100),
    ("Delinquent ≤ 5 days", 2, 5),
    ("Delinquent ≤ 5 days", 1, 15),
    ("Delinquent ≤ 5 days", 0, 500),
]

for evidence, label, count in rows:
    data.extend([[evidence, label]] * count)

df = pd.DataFrame(data, columns=["evidence", "target"])
df["target"] = df["target"].astype(int)

# Show class distribution
print(df["target"].value_counts().sort_index())

target
0    600
1     45
2     55
Name: count, dtype: int64
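
As a quick sanity check, the row-level frame can be folded back into the contingency table it was built from. The sketch below rebuilds the same data and uses `pd.crosstab`; the variable name `table` is introduced here for illustration only.

```python
import pandas as pd

# Same (evidence, class label, count) tuples as in the cell above
rows = [
    ("Delinquent > 5 days", 2, 50),
    ("Delinquent > 5 days", 1, 30),
    ("Delinquent > 5 days", 0, 100),
    ("Delinquent ≤ 5 days", 2, 5),
    ("Delinquent ≤ 5 days", 1, 15),
    ("Delinquent ≤ 5 days", 0, 500),
]

# Expand each tuple into `count` identical rows
data = []
for evidence, label, count in rows:
    data.extend([[evidence, label]] * count)
df = pd.DataFrame(data, columns=["evidence", "target"])

# Rows are evidence categories, columns are class labels, cells are counts
table = pd.crosstab(df["evidence"], df["target"])
print(table)
```

Each cell of `table` should equal the count in the corresponding tuple, and the column sums should match the class distribution printed above.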


In [16]:
# Split the data
X = df[["evidence"]]
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
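
Because classes `1` and `2` are rare, `stratify=y` matters here: it keeps the class proportions the same in both splits. The sketch below rebuilds the data and repeats the split to show the resulting per-class counts; `train_counts` and `test_counts` are names introduced for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Rebuild the synthetic data: (evidence, class label, count)
rows = [
    ("Delinquent > 5 days", 2, 50),
    ("Delinquent > 5 days", 1, 30),
    ("Delinquent > 5 days", 0, 100),
    ("Delinquent ≤ 5 days", 2, 5),
    ("Delinquent ≤ 5 days", 1, 15),
    ("Delinquent ≤ 5 days", 0, 500),
]
data = []
for evidence, label, count in rows:
    data.extend([[evidence, label]] * count)
df = pd.DataFrame(data, columns=["evidence", "target"])

X_train, X_test, y_train, y_test = train_test_split(
    df[["evidence"]], df["target"], test_size=0.2, random_state=42, stratify=df["target"]
)

# Per-class counts in each split, ordered by class label
train_counts = y_train.value_counts().sort_index()
test_counts = y_test.value_counts().sort_index()
print(train_counts / len(y_train))
print(test_counts / len(y_test))
```

With 700 rows and an 80/20 split, every class divides evenly, so the train and test proportions coincide with the full-data proportions.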

In [17]:
# Fit FastWoe with multiclass support
woe = FastWoe()
woe.fit(X_train, y_train)

# Show mappings for each class
for class_label in sorted(y.unique()):
    print(f"\nMapping for class {class_label}:")
    display(woe.get_mapping("evidence", class_label=class_label))


Mapping for class 0:


| | category | count | count_pct | good_count | bad_count | event_rate | woe | woe_se | woe_ci_lower | woe_ci_upper |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Delinquent > 5 days | 142 | 25.357143 | 63 | 79 | 0.556338 | -1.565446 | 0.168912 | -1.896508 | -1.234385 |
| 1 | Delinquent ≤ 5 days | 418 | 74.642857 | 17 | 401 | 0.95933 | 1.368989 | 0.247623 | 0.883656 | 1.854321 |



Mapping for class 1:


| | category | count | count_pct | good_count | bad_count | event_rate | woe | woe_se | woe_ci_lower | woe_ci_upper |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Delinquent > 5 days | 142 | 25.357143 | 119 | 23 | 0.161972 | 1.034343 | 0.227775 | 0.587912 | 1.480775 |
| 1 | Delinquent ≤ 5 days | 418 | 74.642857 | 405 | 13 | 0.0311 | -0.760965 | 0.281766 | -1.313217 | -0.208713 |



Mapping for class 2:


| | category | count | count_pct | good_count | bad_count | event_rate | woe | woe_se | woe_ci_lower | woe_ci_upper |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Delinquent > 5 days | 142 | 25.357143 | 102 | 40 | 0.28169 | 1.525824 | 0.186558 | 1.160177 | 1.891471 |
| 1 | Delinquent ≤ 5 days | 418 | 74.642857 | 414 | 4 | 0.009569 | -2.177654 | 0.50241 | -3.162359 | -1.192949 |
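
The three mappings read like one-vs-rest tables: in each, `bad_count` is the number of training rows in the class being mapped and `good_count` is the number of rows in all other classes. The sketch below recomputes `woe` and `woe_se` by hand under that reading. The formulas (category log-odds minus prior log-odds, and a standard error of `sqrt(1/events + 1/non_events)`) are assumptions inferred from the printed numbers, not taken from FastWoe's source, and `one_vs_rest_woe` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Training-set counts per category (rows) and class (columns),
# read off the bad_count column of each mapping above
counts = pd.DataFrame(
    {0: [79, 401], 1: [23, 13], 2: [40, 4]},
    index=["Delinquent > 5 days", "Delinquent ≤ 5 days"],
)


def one_vs_rest_woe(counts: pd.DataFrame, class_label: int) -> pd.DataFrame:
    """WOE of each category for `class_label` against all other classes."""
    events = counts[class_label]
    non_events = counts.sum(axis=1) - events
    # Log-odds of the class within each category
    category_log_odds = np.log(events / non_events)
    # Log-odds of the class in the whole training set
    prior_log_odds = np.log(events.sum() / non_events.sum())
    woe = category_log_odds - prior_log_odds
    # Assumed standard error of the category log-odds
    woe_se = np.sqrt(1.0 / events + 1.0 / non_events)
    return pd.DataFrame({"woe": woe, "woe_se": woe_se})


manual = {label: one_vs_rest_woe(counts, label) for label in counts.columns}
for label, mapping in manual.items():
    print(f"\nManual mapping for class {label}:")
    print(mapping.round(6))
```

The values agree with the `woe` and `woe_se` columns above to the printed precision, which supports the one-vs-rest reading for this dataset.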


In [18]:
# Compare predictions with observed frequencies on the training split
X_ = X_train.copy()
y_ = y_train.copy()

# Average predicted probability per class
display(pd.DataFrame(woe.predict_proba(X_)).mean())

# Observed share of each class in y_train
for i in sorted(y_.unique()):
    print(f"Mean of class {i} in y_train: {np.mean(y_ == i)}")

0    0.857143
1    0.064286
2    0.078571
dtype: float64

Mean of class 0 in y_train: 0.8571428571428571
Mean of class 1 in y_train: 0.06428571428571428
Mean of class 2 in y_train: 0.07857142857142857
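
The average predicted probabilities match the observed class shares of the training split exactly. One way to see why is to turn the WOE values back into probabilities with `sigmoid(woe + prior log-odds)`. The sketch below does this with the values from the mappings above. Whether `predict_proba` computes probabilities this way internally is an assumption; with a single feature, the calculation reproduces the reported means.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping log-odds to probabilities."""
    return 1.0 / (1.0 + np.exp(-x))


# Class priors in the training split (480, 36 and 44 rows out of 560)
priors = np.array([480, 36, 44]) / 560
prior_log_odds = np.log(priors / (1.0 - priors))

# WOE per category (rows) and class (columns), copied from the mappings above
woe = np.array([
    [-1.565446, 1.034343, 1.525824],   # Delinquent > 5 days
    [1.368989, -0.760965, -2.177654],  # Delinquent ≤ 5 days
])

# Posterior log-odds = prior log-odds + WOE, then back to a probability
proba = sigmoid(woe + prior_log_odds)

# Share of training rows in each category (142 and 418 out of 560)
weights = np.array([142, 418]) / 560

# Row-weighted average probability per class
mean_proba = weights @ proba
print(proba.round(6))
print(mean_proba.round(6))
```

Each row of `proba` equals the `event_rate` column of the corresponding category, so the rows sum to one and their weighted average recovers the class priors.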
