Mismatches between ScoreCardPoints object and calibrate_to_master_scale scores #68

idellang · 2021-10-29T00:32:22Z

Please excuse the way I'm reporting this; it is my first GitHub issue. I get different results from the ScoreCardPoints object: the scores from calibrate_to_master_scale applied to proba_train differ from the scores returned by scp.transform(X_train). I believe the calibrate_to_master_scale scores are the correct ones.

EDIT: I tried following the last tutorial example, 'Scorecard Model', and ran into the same problem. In the tutorial, the coefficients from scorecard.get_stats() are negative and the values from scorecard.woe_transform(X_test) are positive, whereas I get positive coefficients and negative values from scorecard.woe_transform(X_test).

See the following images. In this example, I used a single categorical variable, educational attainment, against the default rate. Thank you!

timvink · 2021-11-03T10:06:50Z

Hey, thanks for reporting. I happen to know the maintainer of this project is on paternity leave :) @sbjelogr, perhaps you can have a look?

idellang · 2021-11-03T11:46:48Z

Sure! I think you just need to add a negative sign somewhere in the equation. I followed the code and added a negative sign when multiplying the WoE and the coefficient, and I was able to reproduce the results in the example. I'm not really familiar with OOP yet, so I just wrote a custom function.
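The effect of that sign can be checked without skorecard. The following is a minimal sketch, assuming the logistic regression predicts the probability of the event (default) and that the master scale is defined on the odds of a non-event, which is a common scorecard convention. The WoE values are the ones from the EDUCATION bucket table in this thread; the intercept and coefficient are hypothetical, and nothing here is taken from the skorecard source.

```python
import numpy as np

# Scaling parameters used in the example in this thread.
pdo, ref_score, ref_odds = 25, 400, 20
factor = pdo / np.log(2)
offset = ref_score - factor * np.log(ref_odds)

# Hypothetical single-feature model: logit(p_event) = intercept + coef * woe
intercept, coef = -1.2, -0.9
woe = np.array([1.373, -0.149, 0.259, -0.175])

logit = intercept + coef * woe
proba = 1.0 / (1.0 + np.exp(-logit))  # predicted probability of the event

# Rescale the probability: the score rises with the odds of a non-event.
score_from_proba = offset + factor * np.log((1.0 - proba) / proba)

# Points built from WoE and coefficient, with and without the minus sign.
points_with_minus = offset - factor * (intercept + coef * woe)
points_without_minus = offset + factor * (intercept + coef * woe)

print(np.allclose(score_from_proba, points_with_minus))
print(np.allclose(score_from_proba, points_without_minus))
```

Under these assumptions only the version with the minus sign reproduces the rescaled probabilities, because ln((1 - p) / p) is the negative of the model's logit.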

orchardbirds · 2021-11-17T16:20:58Z

Thanks for this issue. I think the ScoreCardPoints is actually quite broken and I propose to remove it.

Looking at a minimal example, we see that the woe_dict cannot deal with the "Other" and "Missing" categories, so using the encoder to calculate the WoE for these cases doesn't work. This produces lots of missing values later on and breaks the point mapping.

@sbjelogr @timvink It seems the ScoreCardPoints method can just be replaced with the calibrate_to_master_scale function, or am I missing another use for it? I propose to remove it.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from skorecard import datasets
from skorecard import Skorecard
from skorecard.bucketers import OrdinalCategoricalBucketer
from skorecard.rescale import calibrate_to_master_scale, ScoreCardPoints

X, y = datasets.load_uci_credit_card(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X[["EDUCATION", "MARRIAGE"]], y)

o = OrdinalCategoricalBucketer(variables=["EDUCATION"])

sc = Skorecard(
    bucketing=o,
    variables=["EDUCATION"],
    calculate_stats=True,
)
sc.fit(X_train, y_train)

scp = ScoreCardPoints(skorecard_model=sc, pdo=25, ref_score=400, ref_odds=20)

sc.bucket_table("EDUCATION")

bucket	label	Count	Count (%)	Non-event	Event	Event Rate	WoE	IV
-2	Other	57.0	1.27	53.0	4.0	0.070175	1.373	0.016
-1	Missing	0.0	0.00	0.0	0.0	NaN	0.000	0.000
0	2.0	2026.0	45.02	1505.0	521.0	0.257157	-0.149	0.010
1	1.0	1662.0	36.93	1351.0	311.0	0.187124	0.259	0.023
2	3.0	755.0	16.78	557.0	198.0	0.262252	-0.175	0.005
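The sign convention of the WoE column can be checked from the counts alone. The sketch below assumes WoE = ln(share of non-events / share of events) per bucket; that formula is an assumption tested against the table, not something read from the skorecard source. The counts and reported WoE values are copied from the table, with the empty Missing bucket left out.

```python
import numpy as np
import pandas as pd

# Counts copied from the EDUCATION bucket table (empty Missing bucket omitted).
table = pd.DataFrame(
    {
        "label": ["Other", "2.0", "1.0", "3.0"],
        "non_event": [53.0, 1505.0, 1351.0, 557.0],
        "event": [4.0, 521.0, 311.0, 198.0],
        "reported_woe": [1.373, -0.149, 0.259, -0.175],
    },
    index=pd.Index([-2, 0, 1, 2], name="bucket"),
)

# Share of all non-events and of all events that fall in each bucket.
pct_non_event = table["non_event"] / table["non_event"].sum()
pct_event = table["event"] / table["event"].sum()

# Assumed convention: WoE = ln(% non-event / % event).
table["woe"] = np.log(pct_non_event / pct_event)

print(table[["label", "reported_woe", "woe"]].round(3))
```

The recomputed values agree with the reported column to within a few thousandths, so in this table a positive WoE marks a bucket with a lower event rate than average.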

woe_enc = scp.skorecard_model.pipeline_.named_steps["encoder"]
woe_dict = woe_enc.mapping
woe_dict['EDUCATION']

EDUCATION
1 -0.258126
2 0.148666
3 0.177157
4 -1.171335
-1 0.000000
-2 0.000000
dtype: float64

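Note that the WoE for -1 and -2 is bad: both keys map to 0 in the encoder mapping, although the bucket table reports a WoE of 1.373 for bucket -2 ("Other").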

sbjelogr · 2021-11-17T17:36:08Z

@orchardbirds, they are not exactly the same.

calibrate_to_master_scale just takes the predicted probabilities and rescales them.

ScoreCardPoints does the same via the transformer.
However, ScoreCardPoints takes a Skorecard model as input and applies only the features selected within that model (otherwise the calculation of the coefficients is wrong, because the points get distributed among more features).

In addition, it provides a tabular representation of the points per feature per bucket.
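To make the "points per feature per bucket" idea concrete, here is a rough sketch of how such a table could be built by hand. It assumes the offset and the intercept are split evenly across the model's features and that the score rises with the odds of a non-event; `points_table`, `FEATURE_B`, the coefficients, and the intercept are all hypothetical, and this is not the ScoreCardPoints implementation. The EDUCATION WoE values come from the bucket table earlier in the thread.

```python
import numpy as np
import pandas as pd


def points_table(woe_by_feature, coefs, intercept, *, pdo, ref_score, ref_odds):
    """Return one row per (feature, bucket) with the points that bucket contributes."""
    factor = pdo / np.log(2)
    offset = ref_score - factor * np.log(ref_odds)
    n = len(woe_by_feature)  # offset and intercept are shared evenly across features
    rows = []
    for feature, woe_by_bucket in woe_by_feature.items():
        for bucket, woe in woe_by_bucket.items():
            points = offset / n - factor * (coefs[feature] * woe + intercept / n)
            rows.append({"feature": feature, "bucket": bucket, "woe": woe, "points": points})
    return pd.DataFrame(rows)


woe_by_feature = {
    "EDUCATION": {-2: 1.373, -1: 0.0, 0: -0.149, 1: 0.259, 2: -0.175},
    "FEATURE_B": {0: -0.30, 1: 0.40},  # hypothetical second feature
}
coefs = {"EDUCATION": -0.9, "FEATURE_B": -0.7}  # hypothetical coefficients
intercept = -1.2  # hypothetical intercept

table = points_table(woe_by_feature, coefs, intercept, pdo=25, ref_score=400, ref_odds=20)
print(table.round(1))

# Total score for one applicant: EDUCATION bucket 1 and FEATURE_B bucket 0.
picked = table[
    ((table["feature"] == "EDUCATION") & (table["bucket"] == 1))
    | ((table["feature"] == "FEATURE_B") & (table["bucket"] == 0))
]
total_points = picked["points"].sum()
print(round(total_points, 1))
```

Because the offset and intercept are divided by the number of features, the per-bucket points only add up to the rescaled model score when the table is built from exactly the features the model uses, which is the point made above about the selected features.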

sbjelogr · 2021-11-17T17:40:26Z

@idellang, I will investigate this issue in the coming days. I'll keep you posted.

anilkumarpanda mentioned this issue Nov 3, 2021: Feature co-efficient signs are inverted between version 0.7.1 and 1.4.0 #69 (Closed)

sbjelogr self-assigned this Nov 17, 2021
