# <span style="font-family: Arial, sans-serif; color:#97f788">xbooster</span>

## <span style="font-family: Arial, sans-serif; color:navyblue">SHAP-Based Scorecard Construction Examples</span>

<span style="font-family: Arial, sans-serif; color:navyblue">Repo: <a href="https://github.com/xRiskLab/xBooster" title="GitHub link">https://github.com/xRiskLab/xBooster</a></span>

This notebook demonstrates how to use SHAP values for scorecard construction with XGBoost, LightGBM, and CatBoost models.

**Important:**

- SHAP values are computed on demand, only when `predict_score(method="shap")` or `predict_scores(method="shap")` is called.
- No SHAP values are stored in the scorecard DataFrames.
- No SHAP computation runs during scorecard construction.

The implementation follows the single responsibility principle: scorecards handle traditional scoring, and SHAP is a separate, optional feature that users can opt into when needed.

**Key Features:**

- Native SHAP extraction (no external `shap` package needed)
- SHAP values are computed at scoring time; `construct_scorecard()` does not add them to the scorecard
- Use `predict_score(method="shap")` for SHAP-based scoring (no binning table needed)
- Use `predict_scores(method="shap")` for feature-level score decomposition
- Particularly useful for models with `max_depth > 1` where interpretability is challenging
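The feature-level decomposition relies on the additivity property of SHAP values: a base value plus the per-feature contributions reproduces the model output. The sketch below illustrates that property with brute-force Shapley values on a toy two-feature model. The model, its leaf values, and the background rows are all hypothetical, and this is not how xbooster or the boosting libraries compute SHAP internally (tree-specific algorithms are far more efficient).

```python
from itertools import combinations
from math import factorial

def model(row):
    # Toy depth-2 "tree" returning a log-odds margin (hypothetical leaf values).
    if row["low_income"]:
        return 1.5 if row["high_debt"] else 0.4
    return -0.8 if row["high_debt"] else -2.0

# Hypothetical background data used to average out "missing" features.
background = [
    {"low_income": 0, "high_debt": 0},
    {"low_income": 0, "high_debt": 1},
    {"low_income": 1, "high_debt": 0},
    {"low_income": 1, "high_debt": 1},
]

def value(x, subset):
    # Expected output when features in `subset` take x's values and the
    # remaining features are taken from each background row in turn.
    total = 0.0
    for ref in background:
        mixed = {k: (x[k] if k in subset else ref[k]) for k in x}
        total += model(mixed)
    return total / len(background)

def shapley_values(x):
    features = list(x)
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        contrib = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_f = value(x, set(subset) | {f})
                without_f = value(x, set(subset))
                contrib += weight * (with_f - without_f)
        phi[f] = contrib
    return phi

x = {"low_income": 1, "high_debt": 1}
phi = shapley_values(x)
base_value = value(x, set())

# Additivity: base value + contributions equals the model output for x.
print(base_value, phi, model(x))
```

Because the contributions add up to the margin, each one can be converted to points separately, which is what makes a per-feature score breakdown possible.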


In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Import xbooster constructors
from xbooster.xgb_constructor import XGBScorecardConstructor
from xbooster.lgb_constructor import LGBScorecardConstructor
from xbooster.cb_constructor import CatBoostScorecardConstructor

# Import model libraries
import xgboost as xgb
import lightgbm as lgb
from catboost import CatBoostClassifier, Pool

## Generate Sample Data

We'll create a synthetic credit risk dataset for demonstration.


In [2]:
# Generate synthetic credit risk data
np.random.seed(42)
n_samples = 1000

X = pd.DataFrame(
    {
        "age": np.random.randint(18, 80, n_samples),
        "income": np.random.randint(20000, 150000, n_samples),
        "credit_history": np.random.randint(0, 10, n_samples),
        "debt_ratio": np.random.uniform(0.1, 0.8, n_samples),
        "employment_years": np.random.randint(0, 30, n_samples),
    }
)

# Create target with some relationship to features
y = (
    (
        (X["age"] < 30).astype(int) * 0.3
        + (X["income"] < 40000).astype(int) * 0.4
        + (X["debt_ratio"] > 0.6).astype(int) * 0.3
        + np.random.random(n_samples) * 0.2
    )
    .round()
    .astype(int)
)

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
print(f"Default rate: {y.mean():.2%}")

Training set: 800 samples
Test set: 200 samples
Default rate: 17.60%
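To see which risk flags actually drive the synthetic target, the sketch below enumerates every flag combination from the formula above and checks whether the rounded sum can reach 1. Weights are expressed in tenths to avoid floating-point noise; the helper name `outcome` and the labels are illustrative only.

```python
from itertools import product

# Weights from the target formula, in tenths (0.3 -> 3, 0.4 -> 4).
WEIGHTS = {"age<30": 3, "income<40000": 4, "debt_ratio>0.6": 3}
NOISE_MAX = 2   # noise is uniform on [0, 0.2): upper bound is exclusive
THRESHOLD = 5   # .round() gives 1 only when the sum exceeds 0.5

def outcome(flags):
    """Classify a flag combination as 'never', 'sometimes' or 'always' a default."""
    base = sum(w for name, w in WEIGHTS.items() if name in flags)
    if base > THRESHOLD:
        return "always"
    if base + NOISE_MAX <= THRESHOLD:
        return "never"
    return "sometimes"

results = {}
for mask in product([False, True], repeat=len(WEIGHTS)):
    flags = frozenset(name for name, on in zip(WEIGHTS, mask) if on)
    results[flags] = outcome(flags)

for flags, label in results.items():
    print(sorted(flags) or ["(no flags)"], "->", label)
```

Under this reading, a single age or debt flag never produces a default, low income alone does so only when the noise term is large enough, and any two flags together always do. The target is therefore almost deterministic, which helps explain the very high Gini values reported later.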


## Example 1: XGBoost with SHAP


In [3]:
# Train XGBoost model with depth > 1
xgb_model = xgb.XGBClassifier(max_depth=3, n_estimators=50, learning_rate=0.1, random_state=42)
xgb_model.fit(X_train, y_train)

# Evaluate model
xgb_pred = xgb_model.predict_proba(X_test)[:, 1]
gini_xgb = roc_auc_score(y_test, xgb_pred) * 2 - 1
print(f"XGBoost Gini: {gini_xgb:.4f}")

XGBoost Gini: 0.9796
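The Gini coefficient used above is a rescaling of AUC, `Gini = 2 * AUC - 1`. As a minimal sketch of what that measures, the pure-Python version below computes AUC as the share of (event, non-event) pairs that the model ranks correctly. The function names and toy data are illustrative; in practice `roc_auc_score` does the same job more efficiently.

```python
def auc_pairwise(y_true, y_score):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(y_true, y_score):
    """Gini coefficient: 0 for random ranking, 1 for perfect separation."""
    return 2 * auc_pairwise(y_true, y_score) - 1

# Toy labels and predicted probabilities (hypothetical values).
y_toy = [0, 0, 1, 0, 1, 1]
p_toy = [0.10, 0.40, 0.35, 0.20, 0.80, 0.90]

print(f"AUC:  {auc_pairwise(y_toy, p_toy):.4f}")
print(f"Gini: {gini(y_toy, p_toy):.4f}")
```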


In [4]:
# Create scorecard constructor
xgb_constructor = XGBScorecardConstructor(xgb_model, X_train, y_train)

# Construct scorecard (SHAP is NOT stored in scorecard - computed on-demand only)
xgb_scorecard = xgb_constructor.construct_scorecard()

print("Scorecard columns:", xgb_scorecard.columns.tolist())
print(f"\nScorecard shape: {xgb_scorecard.shape}")
print(
    "\nNote: SHAP values are NOT stored in the scorecard. They are computed on-demand when using predict_score(method='shap')"
)
print("\nFirst few rows of scorecard:")
display(xgb_scorecard[["Tree", "Node", "Feature", "XAddEvidence", "Count", "EventRate"]].head(10))

Scorecard columns: ['Tree', 'Node', 'Feature', 'Sign', 'Split', 'Count', 'CountPct', 'NonEvents', 'Events', 'EventRate', 'WOE', 'IV', 'XAddEvidence', 'DetailedSplit']

Scorecard shape: (308, 14)

Note: SHAP values are NOT stored in the scorecard. They are computed on-demand when using predict_score(method='shap')

First few rows of scorecard:


| | Tree | Node | Feature | XAddEvidence | Count | EventRate |
|---|---|---|---|---|---|---|
| 0 | 0 | 4 | debt_ratio | 0.455527 | 65.0 | 0.907692 |
| 1 | 0 | 6 | age | -0.11987 | 541.0 | 0.0 |
| 2 | 0 | 7 | debt_ratio | 0.292852 | 50.0 | 0.66 |
| 3 | 0 | 8 | debt_ratio | 0.067897 | 23.0 | 0.304348 |
| 4 | 0 | 9 | debt_ratio | -0.103846 | 80.0 | 0.0125 |
| 5 | 0 | 10 | debt_ratio | 0.48577 | 41.0 | 1.0 |
| 6 | 1 | 4 | debt_ratio | 0.319853 | 67.0 | 0.895522 |
| 7 | 1 | 6 | age | -0.117355 | 539.0 | 0.0 |
| 8 | 1 | 7 | age | 0.336131 | 15.0 | 1.0 |
| 9 | 1 | 8 | age | 0.117086 | 59.0 | 0.423729 |


In [5]:
# Predict scores using SHAP method (no binning table needed)
xgb_scores_shap = xgb_constructor.predict_score(X_test, method="shap")
xgb_scores_leafs = xgb_constructor.predict_score(X_test)  # Leaf-based scorecard (default)

print("=== XGBoost Score Prediction Comparison ===")
print(
    f"SHAP-based scores - Mean: {xgb_scores_shap.mean():.2f}, Range: {xgb_scores_shap.min():.1f} to {xgb_scores_shap.max():.1f}"
)
print(
    f"Leaf-based scores - Mean: {xgb_scores_leafs.mean():.2f}, Range: {xgb_scores_leafs.min():.1f} to {xgb_scores_leafs.max():.1f}"
)

# Compare with actual model predictions
xgb_predictions = xgb_model.predict_proba(X_test)[:, 1]
print(f"\nModel predictions - Mean: {xgb_predictions.mean():.4f}")

# Show sample predictions
xgb_comparison_df = pd.DataFrame(
    {
        "SHAP_Score": xgb_scores_shap.head(10),
        "XAddEvidence_Score": xgb_scores_leafs.head(10),
        "Model_Prob": xgb_predictions[:10],
    }
)
print("\nSample predictions (first 10):")
display(xgb_comparison_df)

=== XGBoost Score Prediction Comparison ===
SHAP-based scores - Mean: 554.36, Range: -32.0 to 708.0
Leaf-based scores - Mean: 645.95, Range: 69.0 to 799.0

Model predictions - Mean: 0.1643

Sample predictions (first 10):


| | SHAP_Score | XAddEvidence_Score | Model_Prob |
|---|---|---|---|
| 0 | 684 | 776 | 0.002929 |
| 1 | 666 | 758 | 0.003752 |
| 2 | 77 | 173 | 0.930465 |
| 3 | 26 | 123 | 0.964173 |
| 4 | 671 | 762 | 0.003498 |
| 5 | 691 | 782 | 0.002663 |
| 6 | 699 | 790 | 0.002376 |
| 7 | -6 | 82 | 0.976878 |
| 8 | 588 | 679 | 0.011058 |
| 9 | 675 | 767 | 0.003311 |


In [6]:
# Pearson correlation across the 10 sampled rows only (not the full test set)
xgb_comparison_df.corr()

| | SHAP_Score | XAddEvidence_Score | Model_Prob |
|---|---|---|---|
| SHAP_Score | 1.0 | 0.999969 | -0.994927 |
| XAddEvidence_Score | 0.999969 | 1.0 | -0.994602 |
| Model_Prob | -0.994927 | -0.994602 | 1.0 |
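Pearson correlation understates the agreement here, because scores relate to probabilities through a nonlinear (log-odds style) transform. A rank-based check is a useful complement. The sketch below computes Spearman's rank correlation for the ten XGBoost sample rows printed above, using the classic no-ties formula; the helper names are illustrative.

```python
# Values copied from the "Sample predictions (first 10)" output for XGBoost.
shap_score = [684, 666, 77, 26, 671, 691, 699, -6, 588, 675]
model_prob = [0.002929, 0.003752, 0.930465, 0.964173, 0.003498,
              0.002663, 0.002376, 0.976878, 0.011058, 0.003311]

def ranks(values):
    """Rank from 1 (smallest) to n; valid here because there are no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(a, b):
    """Spearman rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(a)
    d2 = sum((ra - rb) ** 2 for ra, rb in zip(ranks(a), ranks(b)))
    return 1 - 6 * d2 / (n * (n * n - 1))

rho = spearman(shap_score, model_prob)
print(rho)  # -1.0: the SHAP score orders these rows exactly opposite to the probability
```

On these ten rows the SHAP-based score ranks customers in exactly the reverse order of the predicted default probability, so higher scores consistently mean lower risk.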


In [7]:
# Decompose scores by feature using SHAP method
xgb_scores_decomposed = xgb_constructor.predict_scores(X_test, method="shap")
print("=== XGBoost SHAP Score Decomposition ===")
print(f"Feature-level decomposition shape: {xgb_scores_decomposed.shape}")
print(f"Columns: {xgb_scores_decomposed.columns.tolist()}")
print("\nFirst 5 rows (showing feature contributions and total score):")
display(xgb_scores_decomposed.head())

=== XGBoost SHAP Score Decomposition ===
Feature-level decomposition shape: (200, 6)
Columns: ['age_score', 'income_score', 'credit_history_score', 'debt_ratio_score', 'employment_years_score', 'score']

First 5 rows (showing feature contributions and total score):


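| | age_score | income_score | credit_history_score | debt_ratio_score | employment_years_score | score |
|---|---|---|---|---|---|---|
| 0 | 46 | 198 | 2 | 67 | -7 | 684 |
| 1 | 51 | 209 | 3 | 35 | -11 | 666 |
| 2 | 39 | -237 | 1 | -113 | 8 | 77 |
| 3 | 44 | -264 | -5 | -130 | 2 | 26 |
| 4 | 50 | 204 | -1 | 40 | 0 | 671 |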
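Note that the feature columns do not sum to `score` on their own. A quick check on the five rows above shows a nearly constant remainder, which is consistent with a base score (the SHAP expected value converted to points) being added on top of the feature contributions. That interpretation is an assumption; the arithmetic itself comes straight from the printed output.

```python
# Rows copied from the XGBoost decomposition output:
# (age, income, credit_history, debt_ratio, employment_years, score)
rows = [
    (46, 198, 2, 67, -7, 684),
    (51, 209, 3, 35, -11, 666),
    (39, -237, 1, -113, 8, 77),
    (44, -264, -5, -130, 2, 26),
    (50, 204, -1, 40, 0, 671),
]

# Remainder of the total score after subtracting all feature contributions.
offsets = [score - sum(parts) for *parts, score in rows]
print(offsets)  # [378, 379, 379, 379, 378]
```

The one-point wobble is what integer rounding of each column would produce.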


## Example 2: LightGBM with SHAP


In [8]:
# Train LightGBM model with depth > 1
lgb_model = lgb.LGBMClassifier(
    max_depth=3, n_estimators=50, learning_rate=0.1, random_state=42, verbose=-1
)
lgb_model.fit(X_train, y_train)

# Evaluate model
lgb_pred = lgb_model.predict_proba(X_test)[:, 1]
gini_lgb = roc_auc_score(y_test, lgb_pred) * 2 - 1
print(f"LightGBM Gini: {gini_lgb:.4f}")

LightGBM Gini: 0.9794


In [9]:
# Create scorecard constructor
lgb_constructor = LGBScorecardConstructor(lgb_model, X_train, y_train)

# Construct scorecard (SHAP is NOT stored in scorecard - computed on-demand only)
lgb_scorecard = lgb_constructor.construct_scorecard()

print("Scorecard columns:", lgb_scorecard.columns.tolist())
print(f"\nScorecard shape: {lgb_scorecard.shape}")
print(
    "\nNote: SHAP values are NOT stored in the scorecard. They are computed on-demand when using predict_score(method='shap')"
)
print("\nFirst few rows of scorecard:")
display(lgb_scorecard[["Tree", "Node", "Feature", "XAddEvidence", "Count", "EventRate"]].head(10))

Scorecard columns: ['Tree', 'Node', 'Feature', 'Sign', 'Split', 'Count', 'CountPct', 'NonEvents', 'Events', 'EventRate', 'WOE', 'IV', 'XAddEvidence']

Scorecard shape: (341, 13)

Note: SHAP values are NOT stored in the scorecard. They are computed on-demand when using predict_score(method='shap')

First few rows of scorecard:


| | Tree | Node | Feature | XAddEvidence | Count | EventRate |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | age | -0.974588 | 17.0 | 0.176471 |
| 1 | 0 | 1 | debt_ratio | -1.66336 | 65.0 | 0.076923 |
| 2 | 0 | 2 | age | -1.66336 | 17.0 | 0.117647 |
| 3 | 0 | 3 | debt_ratio | -0.974588 | 37.0 | 0.189189 |
| 4 | 0 | 4 | debt_ratio | -0.974588 | 29.0 | 0.241379 |
| 5 | 0 | 5 | age | -1.318974 | 61.0 | 0.262295 |
| 6 | 0 | 6 | age | -1.66336 | 414.0 | 0.181159 |
| 7 | 1 | 0 | debt_ratio | 0.250653 | 40.0 | 0.3 |
| 8 | 1 | 1 | debt_ratio | -0.11895 | 65.0 | 0.076923 |
| 9 | 1 | 2 | age | -0.11895 | 356.0 | 0.168539 |


In [10]:
# Predict scores using SHAP method (no binning table needed)
lgb_scores_shap = lgb_constructor.predict_score(X_test, method="shap")
lgb_scores_leafs = lgb_constructor.predict_score(X_test)  # Leaf-based scorecard (default)

print("=== LightGBM Score Prediction Comparison ===")
print(
    f"SHAP-based scores - Mean: {lgb_scores_shap.mean():.2f}, Range: {lgb_scores_shap.min():.1f} to {lgb_scores_shap.max():.1f}"
)
print(
    f"Leaf-based scores - Mean: {lgb_scores_leafs.mean():.2f}, Range: {lgb_scores_leafs.min():.1f} to {lgb_scores_leafs.max():.1f}"
)

# Compare with actual model predictions
lgb_predictions = lgb_model.predict_proba(X_test)[:, 1]
print(f"\nModel predictions - Mean: {lgb_predictions.mean():.4f}")

# Show sample predictions
lgb_comparison_df = pd.DataFrame(
    {
        "SHAP_Score": lgb_scores_shap.head(10),
        "XAddEvidence_Score": lgb_scores_leafs.head(10),
        "Model_Prob": lgb_predictions[:10],
    }
)
print("\nSample predictions (first 10):")
display(lgb_comparison_df)

=== LightGBM Score Prediction Comparison ===
SHAP-based scores - Mean: 696.00, Range: 10.0 to 881.0
Leaf-based scores - Mean: 660.99, Range: -28.0 to 847.0

Model predictions - Mean: 0.1635

Sample predictions (first 10):


| | SHAP_Score | XAddEvidence_Score | Model_Prob |
|---|---|---|---|
| 0 | 813 | 778.0 | 0.002745 |
| 1 | 814 | 779.0 | 0.002707 |
| 2 | 191 | 152.0 | 0.938611 |
| 3 | 134 | 96.0 | 0.971278 |
| 4 | 814 | 779.0 | 0.002712 |
| 5 | 818 | 783.0 | 0.002549 |
| 6 | 822 | 787.0 | 0.002433 |
| 7 | 40 | 1.0 | 0.991951 |
| 8 | 838 | 805.0 | 0.001935 |
| 9 | 818 | 783.0 | 0.002549 |
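The two scoring methods differ in level but move together. Taking the ten LightGBM sample rows above, the gap between the SHAP-based and leaf-based score stays within a few points across a score range of roughly 800 points, which is why the two columns can have different means and still be almost perfectly correlated.

```python
# Values copied from the LightGBM "Sample predictions (first 10)" output.
shap_score = [813, 814, 191, 134, 814, 818, 822, 40, 838, 818]
leaf_score = [778, 779, 152, 96, 779, 783, 787, 1, 805, 783]

# Per-row gap between the two scoring methods.
gaps = [s - l for s, l in zip(shap_score, leaf_score)]
print(gaps)  # [35, 35, 39, 38, 35, 35, 35, 39, 33, 35]
print(f"gap range: {min(gaps)} to {max(gaps)}")
print(f"score range: {min(shap_score)} to {max(shap_score)}")
```

The gap of about 35 points also matches the difference between the two full-test-set means reported above (696.00 versus 660.99).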


In [11]:
# Pearson correlation across the 10 sampled rows only (not the full test set)
lgb_comparison_df.corr()

| | SHAP_Score | XAddEvidence_Score | Model_Prob |
|---|---|---|---|
| SHAP_Score | 1.0 | 0.999998 | -0.996566 |
| XAddEvidence_Score | 0.999998 | 1.0 | -0.996564 |
| Model_Prob | -0.996566 | -0.996564 | 1.0 |


In [12]:
# Decompose scores by feature using SHAP method
lgb_scores_decomposed = lgb_constructor.predict_scores(X_test, method="shap")
print("=== LightGBM SHAP Score Decomposition ===")
print(f"Feature-level decomposition shape: {lgb_scores_decomposed.shape}")
print(f"Columns: {lgb_scores_decomposed.columns.tolist()}")
print("\nFirst 5 rows (showing feature contributions and total score):")
display(lgb_scores_decomposed.head())

=== LightGBM SHAP Score Decomposition ===
Feature-level decomposition shape: (200, 6)
Columns: ['age_score', 'income_score', 'credit_history_score', 'debt_ratio_score', 'employment_years_score', 'score']

First 5 rows (showing feature contributions and total score):


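| | age_score | income_score | credit_history_score | debt_ratio_score | employment_years_score | score |
|---|---|---|---|---|---|---|
| 0 | 14 | 83 | -2 | 34 | -2 | 813 |
| 1 | 20 | 85 | 4 | 22 | -3 | 814 |
| 2 | 37 | -438 | 9 | -105 | 2 | 191 |
| 3 | 42 | -474 | -6 | -120 | 5 | 134 |
| 4 | 19 | 82 | -2 | 26 | 2 | 814 |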


## Example 3: CatBoost with SHAP


In [13]:
# Train CatBoost model with depth > 1
cb_model = CatBoostClassifier(
    max_depth=3, n_estimators=50, learning_rate=0.1, random_state=42, verbose=False
)

# Create Pool for CatBoost
train_pool = Pool(X_train, y_train)
test_pool = Pool(X_test, y_test)

cb_model.fit(train_pool)

# Evaluate model
cb_pred = cb_model.predict_proba(test_pool)[:, 1]
cb_auc = roc_auc_score(y_test, cb_pred)
print(f"CatBoost AUC: {cb_auc:.4f}")

CatBoost AUC: 0.9946
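This cell reports AUC, whereas the XGBoost and LightGBM cells report Gini. To put the three models on the same scale, the reported values can be converted in either direction with the identity `Gini = 2 * AUC - 1`. The helper names below are illustrative, and the inputs are the rounded figures printed in this notebook.

```python
def auc_to_gini(auc):
    """Convert AUC to the Gini coefficient."""
    return 2 * auc - 1

def gini_to_auc(gini):
    """Convert the Gini coefficient back to AUC."""
    return (gini + 1) / 2

# Rounded metrics as printed by the three evaluation cells.
cb_gini = auc_to_gini(0.9946)
xgb_auc = gini_to_auc(0.9796)
lgb_auc = gini_to_auc(0.9794)

print(f"CatBoost Gini: {cb_gini:.4f}")
print(f"XGBoost AUC:   {xgb_auc:.4f}")
print(f"LightGBM AUC:  {lgb_auc:.4f}")
```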


In [14]:
# Create scorecard constructor
cb_constructor = CatBoostScorecardConstructor(cb_model, train_pool)

# Construct scorecard (SHAP is NOT stored in scorecard - computed on-demand only)
cb_scorecard = cb_constructor.construct_scorecard()

print("Scorecard columns:", cb_scorecard.columns.tolist())
print(f"\nScorecard shape: {cb_scorecard.shape}")
print(
    "\nNote: SHAP values are NOT stored in the scorecard. They are computed on-demand when using predict_score(method='shap')"
)
print("\nFirst few rows of scorecard:")
display(
    cb_scorecard[["Tree", "LeafIndex", "Feature", "XAddEvidence", "Count", "EventRate"]].head(10)
)

Scorecard columns: ['Tree', 'LeafIndex', 'Feature', 'Sign', 'Split', 'CountPct', 'Count', 'NonEvents', 'Events', 'EventRate', 'XAddEvidence', 'WOE', 'IV', 'DetailedSplit']

Scorecard shape: (400, 14)

Note: SHAP values are NOT stored in the scorecard. They are computed on-demand when using predict_score(method='shap')

First few rows of scorecard:


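| | Tree | LeafIndex | Feature | XAddEvidence | Count | EventRate |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | income | 0.097 | 62.0 | 0.790323 |
| 1 | 0 | 1 | income | 0.0 | 0.0 | 0.17625 |
| 2 | 0 | 2 | income | -0.076 | 17.0 | 0.176471 |
| 3 | 0 | 3 | income | -0.141 | 306.0 | 0.133987 |
| 4 | 0 | 4 | income | 0.047 | 69.0 | 0.637681 |
| 5 | 0 | 5 | income | 0.0 | 0.0 | 0.17625 |
| 6 | 0 | 6 | income | -0.086 | 23.0 | 0.173913 |
| 7 | 0 | 7 | income | -0.193 | 323.0 | 0.0 |
| 8 | 1 | 0 | income | 0.087 | 10.0 | 1.0 |
| 9 | 1 | 1 | income | -0.131 | 28.0 | 0.0 |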
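The scorecard sizes follow from the tree structure. CatBoost grows symmetric trees by default, so a depth-3 tree always has 2^3 = 8 leaves, including leaves that receive no training rows (the `Count` of 0.0 above). Assuming one scorecard row per leaf, 50 trees give exactly the 400 rows reported, while XGBoost and LightGBM trees may stop splitting early and yield fewer rows. The sketch below reproduces that arithmetic from the printed shapes.

```python
n_trees, depth = 50, 3
max_leaves_per_tree = 2 ** depth          # 8 leaves for a full depth-3 tree
max_rows = n_trees * max_leaves_per_tree  # upper bound on scorecard rows

# Row counts taken from the "Scorecard shape" outputs in this notebook.
reported_rows = {"XGBoost": 308, "LightGBM": 341, "CatBoost": 400}

for name, rows in reported_rows.items():
    avg_leaves = rows / n_trees
    print(f"{name}: {rows} rows, {avg_leaves:.2f} leaves per tree (max {max_leaves_per_tree})")

# The EventRate shown for empty CatBoost leaves (0.17625) equals 141 / 800,
# which would match an overall training event rate used as a fallback.
fallback_rate = 141 / 800
print(fallback_rate)
```

The fallback interpretation for empty leaves is an inference from the numbers, not something the output states.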


In [15]:
# Predict scores using SHAP method (no binning table needed)
cb_scores_shap = cb_constructor.predict_score(X_test, method="shap")
cb_scores_leafs = cb_constructor.predict_score(
    X_test, method="pdo"
)  # Leaf-based scorecard (default)

print("=== CatBoost Score Prediction Comparison ===")
print(
    f"SHAP-based scores - Mean: {cb_scores_shap.mean():.2f}, Range: {cb_scores_shap.min():.1f} to {cb_scores_shap.max():.1f}"
)
print(
    f"Leaf-based scores - Mean: {cb_scores_leafs.mean():.2f}, Range: {cb_scores_leafs.min():.1f} to {cb_scores_leafs.max():.1f}"
)

# Compare with actual model predictions
cb_predictions = cb_model.predict_proba(test_pool)[:, 1]
print(f"\nModel predictions - Mean: {cb_predictions.mean():.4f}")

# Show sample predictions
cb_comparison_df = pd.DataFrame(
    {
        "SHAP_Score": cb_scores_shap.head(10),
        "XAddEvidence_Score": cb_scores_leafs.head(10),
        "Model_Prob": cb_predictions[:10],
    }
)
print("\nSample predictions (first 10):")
display(cb_comparison_df)

=== CatBoost Score Prediction Comparison ===
SHAP-based scores - Mean: 598.10, Range: 194.0 to 712.0
Leaf-based scores - Mean: 749.11, Range: 142.0 to 895.0

Model predictions - Mean: 0.1669

Sample predictions (first 10):


| | SHAP_Score | XAddEvidence_Score | Model_Prob |
|---|---|---|---|
| 0 | 699 | 872.0 | 0.013236 |
| 1 | 700 | 888.0 | 0.01292 |
| 2 | 215 | 329.0 | 0.916289 |
| 3 | 230 | 354.0 | 0.899126 |
| 4 | 670 | 829.0 | 0.01959 |
| 5 | 706 | 885.0 | 0.011901 |
| 6 | 704 | 884.0 | 0.012361 |
| 7 | 317 | 428.0 | 0.728069 |
| 8 | 646 | 812.0 | 0.026979 |
| 9 | 704 | 884.0 | 0.012319 |
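The `method="pdo"` name suggests the standard points-to-double-the-odds (PDO) scaling used in credit scorecards, which would explain why scores fall as the predicted default probability rises. The sketch below shows the textbook form of that scaling. The function name and every parameter value (`target_points=600`, `target_odds=50`, `pdo=20`) are hypothetical choices for illustration, not xbooster's defaults or its exact formula.

```python
import math

def pdo_score(prob_default, target_points=600, target_odds=50, pdo=20):
    """Map a default probability to points; higher points mean lower risk.

    A score of `target_points` corresponds to good:bad odds of `target_odds`,
    and every `pdo` extra points doubles those odds. All defaults are
    illustrative assumptions.
    """
    if not 0 < prob_default < 1:
        raise ValueError("prob_default must be strictly between 0 and 1")
    odds_good = (1 - prob_default) / prob_default
    factor = pdo / math.log(2)
    offset = target_points - factor * math.log(target_odds)
    return offset + factor * math.log(odds_good)

# Low-risk and high-risk probabilities taken from the sample rows above.
for p in (0.013236, 0.728069, 0.916289):
    print(f"p={p:.6f} -> {pdo_score(p):.0f} points")
```

Because the mapping is linear in log-odds, additive log-odds contributions (leaf values or SHAP values) translate into additive points, which is what allows a per-feature breakdown of the final score.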


In [16]:
# Decompose scores by feature using SHAP method
cb_scores_decomposed = cb_constructor.predict_scores(X_test, method="shap")
print("=== CatBoost SHAP Score Decomposition ===")
print(f"Feature-level decomposition shape: {cb_scores_decomposed.shape}")
print(f"Columns: {cb_scores_decomposed.columns.tolist()}")
print("\nFirst 5 rows (showing feature contributions and total score):")
display(cb_scores_decomposed.head())

=== CatBoost SHAP Score Decomposition ===
Feature-level decomposition shape: (200, 6)
Columns: ['age_score', 'income_score', 'credit_history_score', 'debt_ratio_score', 'employment_years_score', 'score']

First 5 rows (showing feature contributions and total score):


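| | age_score | income_score | credit_history_score | debt_ratio_score | employment_years_score | score |
|---|---|---|---|---|---|---|
| 0 | 18 | 60 | -1 | 30 | 1 | 699 |
| 1 | 26 | 58 | -4 | 30 | -1 | 700 |
| 2 | 23 | -312 | 2 | -91 | 3 | 215 |
| 3 | 28 | -314 | -1 | -73 | 0 | 230 |
| 4 | 26 | 20 | 2 | 32 | 0 | 670 |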
