# Global and local interpretations

No feature engineering is needed here: the special values of MSinceMostRecentInqexcl7days seen in the global intuitive explanation are picked up by the rules emitted below.

Had we used another, more linear algorithm that is not based on decision trees, we might have needed to split the special values of MSinceMostRecentInqexcl7days into separate features.
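As a rough sketch of what such a split could look like for a linear model: the helper below, `split_special_values`, is hypothetical and not part of this notebook, and it assumes that -9, -8 and -7 are the special codes of the dataset. It adds one indicator column per code and replaces the codes in the original column with the median of the regular values.

```python
import pandas as pd

# Assumption: -9, -8 and -7 are the special codes used in the dataset.
SPECIAL_VALUES = (-9, -8, -7)

def split_special_values(frame, column, special_values=SPECIAL_VALUES):
    """Hypothetical helper: one 0/1 indicator per special code, plus a cleaned numeric column."""
    out = frame.copy()
    is_special = out[column].isin(special_values)
    for value in special_values:
        # e.g. MSinceMostRecentInqexcl7days_special_7 flags the code -7
        out['{}_special_{}'.format(column, -value)] = (out[column] == value).astype(int)
    # Replace the codes with the median of the regular values, so that a
    # linear term on this column is not distorted by them
    regular_median = out.loc[~is_special, column].median()
    out[column] = out[column].where(~is_special, regular_median)
    return out

# Toy data, made up for illustration
toy = pd.DataFrame({'MSinceMostRecentInqexcl7days': [0, 3, -7, -8, 12, -9]})
encoded = split_special_values(toy, 'MSinceMostRecentInqexcl7days')
print(encoded)
```

With the rule-based model used here this step is not required, since a rule threshold can separate the codes from the regular values directly.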

In [1]:
import pandas as pd
from rulefitcustom import RuleFitCustom

df = pd.read_csv("heloc_dataset_v1.csv")

codes = {'Bad':1, 'Good':0}
df['RiskPerformance'] = df['RiskPerformance'].map(codes)


X, y = df.iloc[:,1:], df.iloc[:,0]

features = X.columns
X_mat = X.values

# As part of the evaluation of RuleFitCustom, we know these parameters 
# will yield an AUC of around 0.79 under 5-fold cross-validation
rule_fit = RuleFitCustom(model_type = 'r', simple_rules=True)
rule_fit.fit(X_mat, y, feature_names=features)

rules = rule_fit.rule_ensemble.rules



In [2]:
# Post-processing: keep the rules with a non-zero coefficient and add helper
# columns used further down for additional rule de-duplication

output_rules = []
for i, rule in enumerate(rules):
    coef = rule_fit.coef_[i]
    if coef == 0:
        continue  # skip rules that received a zero coefficient
    features_used = [condition.feature_name for condition in rule.conditions]
    # The next three fields are only filled in for single-condition rules
    single_feature_threshold = ""
    single_feature_name = ""
    direction_str = ""
    if len(rule.conditions) == 1:
        condition = next(iter(rule.conditions))
        single_feature_threshold = condition.threshold
        single_feature_name = condition.feature_name
        # Sign of the operator times sign of the coefficient: does the score
        # go up or down when the feature crosses the threshold upwards?
        direction = (1 if condition.operator == '>' else -1) * (1 if coef > 0 else -1)
        direction_str = 'increasing' if direction == 1 else 'decreasing'
    output_rules.append((i, str(rule), coef / len(features_used), rule.support, features_used,
                         len(rule.conditions), single_feature_threshold, single_feature_name, direction_str))

rules = pd.DataFrame(output_rules, columns=["index", "rule", "coef", "support", "features_used", "rule_length", "single_feature_threshold", "single_feature_name", "direction"])

rules = rules.sort_values(['rule_length', 'single_feature_name', 'single_feature_threshold'])


## Rules: The global explanation

The special values of MSinceMostRecentInqexcl7days are handled by a dedicated rule condition, 'MSinceMostRecentInqexcl7days <= -7.5'.
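To illustrate how a tree-based learner can arrive at a threshold such as -7.5, here is a minimal sketch of a one-split search over midpoints between neighbouring values. The function `best_split` and the toy data are made up for illustration; the internals of RuleFitCustom are not shown in this notebook and may differ.

```python
import numpy as np

def best_split(x, y):
    """Return the midpoint threshold that minimises the summed squared error of a one-split stump."""
    values = np.unique(x)
    # Candidate thresholds lie halfway between neighbouring distinct values
    candidates = (values[:-1] + values[1:]) / 2.0
    best_threshold, best_error = None, np.inf
    for threshold in candidates:
        left, right = y[x <= threshold], y[x > threshold]
        error = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold

# Toy data (made up): the codes -9 and -8 carry a different outcome than
# the code -7 and the regular month counts
x = np.array([-9, -8, -8, -8, -7, -7, 0, 1, 2, 5, 12, 24], dtype=float)
y = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)

threshold = best_split(x, y)
print(threshold)
```

On this toy data the best split falls halfway between -8 and -7, so the special codes end up in their own branch without any manual encoding.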

In [3]:
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_colwidth', 1000)

rules[['rule', 'coef', 'direction']].to_csv('global_rules.csv')
rules[['rule', 'coef', 'direction']]

                            rule       coef   direction
 1        AverageMInFile <= 59.5   0.002913  decreasing
67         AverageMInFile > 59.5  -0.025867  decreasing
65        AverageMInFile <= 63.5   0.005612  decreasing
24        AverageMInFile <= 69.5   0.000571  decreasing
18        AverageMInFile <= 75.5   0.026121  decreasing
72         AverageMInFile > 75.5  -0.003252  decreasing
26        AverageMInFile <= 81.5   0.015094  decreasing
31         AverageMInFile > 95.5  -0.001074  decreasing
55        AverageMInFile <= 95.5   0.000088  decreasing
58  ExternalRiskEstimate <= 63.5   0.026423  decreasing


#### Monotonicity of the simple rule sets

In [4]:
monotonicity_simple_rules_sets = rules.query('rule_length == 1')[['single_feature_name', 'direction']].groupby('single_feature_name').nunique()['direction'].apply(lambda x: 1 if x == 1 else 0)
monotonicity_number = monotonicity_simple_rules_sets.sum()
print(monotonicity_simple_rules_sets)

single_feature_name
AverageMInFile                  1
ExternalRiskEstimate            1
MSinceMostRecentDelq            0
MSinceMostRecentInqexcl7days    0
MSinceOldestTradeOpen           1
MaxDelq2PublicRecLast12M        1
NetFractionRevolvingBurden      1
NumInqLast6M                    1
NumRevolvingTradesWBalance      1
NumSatisfactoryTrades           1
PercentInstallTrades            1
PercentTradesNeverDelq          1
PercentTradesWBalance           1
Name: direction, dtype: int64
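The direction logic behind this table can be restated on its own. The sketch below mirrors the expression used in the post-processing cell; the function names `rule_direction` and `is_monotonic` are introduced here for illustration only. A feature is flagged with 1 when all of its single-condition rules point in the same direction.

```python
def rule_direction(operator, coef):
    """Direction of a single-condition rule: sign of the operator times sign of the coefficient."""
    operator_sign = 1 if operator == '>' else -1
    coef_sign = 1 if coef > 0 else -1
    return 'increasing' if operator_sign * coef_sign == 1 else 'decreasing'

def is_monotonic(directions):
    """A feature is used monotonically if all of its rules share one direction."""
    return len(set(directions)) == 1

# Two rules taken from the global explanation above
d1 = rule_direction('<=', 0.002913)   # AverageMInFile <= 59.5
d2 = rule_direction('>', -0.025867)   # AverageMInFile > 59.5
print(d1, d2, is_monotonic([d1, d2]))
```

Both rules say the same thing in different ways: a positive weight below the threshold and a negative weight above it each mean that the score falls as AverageMInFile grows.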


#### Statistics of the rules

In [5]:

distinct_feature_used_number = len(set([item for sublist in rules['features_used'].tolist() for item in sublist]))
rule_number = rules.shape[0]
rule_conditions_number = rules['rule_length'].sum()
single_rules = rules.query('rule_length == 1')
other_rules = rules.query('rule_length != 1')

improveable_single_rule_conditions = len(single_rules[['single_feature_name', 'single_feature_threshold', 'direction']].apply(lambda x: '{}{}{}'.format(x[0], x[1], x[2]), axis=1).unique())

others_rule_conditions = other_rules['rule_length'].sum()
others_rule_number = other_rules.shape[0]

improveable_rule_number = improveable_single_rule_conditions + others_rule_number
improveable_rule_conditions_number = improveable_single_rule_conditions + others_rule_conditions

print('{} features used, {} of which are used in a monotonic fashion in single-condition rule sets.'.format(distinct_feature_used_number, monotonicity_number))
print('{} rules (improvable to {} rules), {} of which deal with feature interactions.'.format(rule_number, improveable_rule_number, others_rule_number))
print('{} rule conditions (improvable to {} rule conditions).'.format(rule_conditions_number, improveable_rule_conditions_number))



13 features used, 11 of which are used in a monotonic fashion in single-condition rule sets.
81 rules (improvable to 63 rules), 8 of which deal with feature interactions.
89 rule conditions (improvable to 71 rule conditions).
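One way to read the reduced rule count: single-condition rules that share feature, threshold and direction are counted once, because a pair such as `x <= t` and `x > t` can be folded into a single rule plus a shift of the intercept. The sketch below checks this identity numerically. It assumes the rules act as plain 0/1 indicators, which may not match how RuleFitCustom weighs them, and `merge_complementary` is a hypothetical helper.

```python
import numpy as np

def merge_complementary(coef_le, coef_gt):
    """Fold the rule `x > t` into `x <= t`; returns (intercept_shift, merged_coef)."""
    # coef_le * [x <= t] + coef_gt * [x > t]
    #   = coef_gt + (coef_le - coef_gt) * [x <= t]
    return coef_gt, coef_le - coef_gt

# Coefficients of 'AverageMInFile <= 59.5' and 'AverageMInFile > 59.5' from the table above
coef_le, coef_gt = 0.002913, -0.025867
t = 59.5
x = np.array([40.0, 59.0, 60.0, 84.0])   # toy feature values

two_rules = coef_le * (x <= t) + coef_gt * (x > t)
shift, merged = merge_complementary(coef_le, coef_gt)
one_rule = shift + merged * (x <= t)

print(np.allclose(two_rules, one_rule))
```

The merged rule keeps the direction of the original pair, so this kind of de-duplication does not affect the monotonicity table.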


In [6]:
import numpy as np

def print_local_explanation(instance_index):
    """Print the additive breakdown of the model output for one row of X_mat."""
    # Coefficients of the rule terms (the last len(rules) entries of coef_)
    rule_coefs = rule_fit.coef_[-len(rule_fit.rule_ensemble.rules):]
    X_rules = rule_fit.rule_ensemble.transform(X_mat, coefs=rule_coefs, weigh_rules=True)

    intercept_ = rule_fit.lscv.intercept_

    # Contribution of each rule for this instance; the output is the intercept plus their sum
    full_attributions = np.multiply(X_rules[instance_index,:], rule_coefs)
    prediction = intercept_ + full_attributions.sum()
    
    single_feature_attributions = dict.fromkeys(monotonicity_simple_rules_sets.index.tolist(), 0)
    complex_feature_attributions = dict()
    
    for nz_index in full_attributions.nonzero()[0]:
        attribution = full_attributions[nz_index]
        rule_row = rules[rules["index"] == nz_index]
        feature_str = rule_row["single_feature_name"].item()
        rule_str = rule_row["rule"].item()
        
        if (rule_row["rule_length"].item() == 1):
            single_feature_attributions[feature_str] += attribution
        else:
            complex_feature_attributions[rule_str] = attribution


    simple_rule_str = ""
    complex_rule_str = ""

    for k in sorted(single_feature_attributions, key=single_feature_attributions.get, reverse=True):
        simple_rule_str += '\n{:+7.4f} : {}'.format(single_feature_attributions[k], k)
            
    
    for k in sorted(complex_feature_attributions, key=complex_feature_attributions.get, reverse=True):
        complex_rule_str += '\n{:+7.4f} : {}'.format(complex_feature_attributions[k], k)
        
    print('Local explanation')
    print('----')
    print('Attributions to single features:')
    print(simple_rule_str)
    print('')
    print('----')
    print('Attributions to feature interactions:')
    print(complex_rule_str)
    print('')
    print('----')
    print('Intercept')
    print('{:+7.4f}'.format(intercept_))
    print('')
    print('----')
    print('Total:')
    print('{:7.4f}'.format(prediction))
    print('')
    print('------------')


## Applying the rules: The local explanations

In [7]:
instance_index = 0

print_local_explanation(instance_index)
X.iloc[instance_index, :]

Local explanation
----
Attributions to single features:

+0.1438 : ExternalRiskEstimate
+0.0224 : PercentTradesNeverDelq
+0.0129 : MSinceOldestTradeOpen
+0.0121 : MaxDelq2PublicRecLast12M
+0.0070 : MSinceMostRecentInqexcl7days
+0.0056 : NumRevolvingTradesWBalance
+0.0050 : PercentTradesWBalance
+0.0019 : NumSatisfactoryTrades
+0.0007 : MSinceMostRecentDelq
+0.0000 : NumInqLast6M
-0.0290 : AverageMInFile
-0.0341 : NetFractionRevolvingBurden
-0.0343 : PercentInstallTrades

----
Attributions to feature interactions:

+0.0687 : MSinceMostRecentInqexcl7days <= 0.5 & MSinceMostRecentInqexcl7days > -7.5
+0.0323 : MSinceMostRecentInqexcl7days > -7.5 & MSinceMostRecentInqexcl7days <= 1.5

----
Intercept
+0.5010

----
Total:
 0.7160

------------


ExternalRiskEstimate                   55
MSinceOldestTradeOpen                 144
MSinceMostRecentTradeOpen               4
AverageMInFile                         84
NumSatisfactoryTrades                  20
NumTrades60Ever2DerogPubRec             3
NumTrades90Ever2DerogPubRec             0
PercentTradesNeverDelq                 83
MSinceMostRecentDelq                    2
MaxDelq2PublicRecLast12M                3
MaxDelqEver                             5
NumTotalTrades                         23
NumTradesOpeninLast12M                  1
PercentInstallTrades                   43
MSinceMostRecentInqexcl7days            0
NumInqLast6M                            0
NumInqLast6Mexcl7days                   0
NetFractionRevolvingBurden             33
NetFractionInstallBurden               -8
NumRevolvingTradesWBalance              8
NumInstallTradesWBalance                1
NumBank2NatlTradesWHighUtilization      1
PercentTradesWBalance                  69
Name: 0, dtype: int64
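The local explanation is additive, so the total can be recomputed from the printed parts. The sketch below copies the rounded numbers from the output above (no model is needed to run it) and sums them; the dictionary names are introduced here for illustration.

```python
# Values copied from the printed local explanation for instance 0
single_feature_attributions = {
    'ExternalRiskEstimate': 0.1438,
    'PercentTradesNeverDelq': 0.0224,
    'MSinceOldestTradeOpen': 0.0129,
    'MaxDelq2PublicRecLast12M': 0.0121,
    'MSinceMostRecentInqexcl7days': 0.0070,
    'NumRevolvingTradesWBalance': 0.0056,
    'PercentTradesWBalance': 0.0050,
    'NumSatisfactoryTrades': 0.0019,
    'MSinceMostRecentDelq': 0.0007,
    'NumInqLast6M': 0.0000,
    'AverageMInFile': -0.0290,
    'NetFractionRevolvingBurden': -0.0341,
    'PercentInstallTrades': -0.0343,
}
interaction_attributions = {
    'MSinceMostRecentInqexcl7days <= 0.5 & MSinceMostRecentInqexcl7days > -7.5': 0.0687,
    'MSinceMostRecentInqexcl7days > -7.5 & MSinceMostRecentInqexcl7days <= 1.5': 0.0323,
}
intercept = 0.5010

# Intercept plus all rule attributions gives the model output
total = (intercept
         + sum(single_feature_attributions.values())
         + sum(interaction_attributions.values()))
print('{:7.4f}'.format(total))
```

The sum matches the printed total of 0.7160, with ExternalRiskEstimate (value 55 for this instance) contributing the largest single share above the intercept.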