## K-Lime & Shapley Scoring Pipeline

Reference (download the MLI Python Scoring package):
- scoring-pipeline/example.py: sample program for scoring predictions and Shapley values
- scoring-pipeline-mli/example.py: sample program for K-Lime scoring
- scoring-pipeline-mli/example_shapley.py: sample program comparing K-Lime and Shapley

In [1]:
import pandas as pd
import numpy as np
from numpy import nan
from scipy.special import expit  # import from the public module rather than the private _ufuncs

---

### K-Lime Scoring

`pip install scoring-pipeline-mli/scoring_mli_experiment_eb096b2c_4fcd_11eb_9924_0242ac110002-1.0.0-py3-none-any.whl`

In [3]:
from scoring_mli_experiment_eb096b2c_4fcd_11eb_9924_0242ac110002 import KLimeScorer

In [33]:
# Create a KLimeScorer instance.
# Instantiate it only once. Even when scoring multiple times, it is recommended to call the
# scoring methods (score_reason_codes(), score_reason_codes_batch()) repeatedly on that same instance.
mli_scorer = KLimeScorer()
mli_scorer

models/GLM/klime_glm_cluster0-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster1-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster2-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster3-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster4-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster5-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster6-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster7-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_0242ac110002.hex_model_1609901593811_1_model_3
models/GLM/klime_glm_cluster8-Grid_KLime_mli_eb096b2c_4fcd_11eb_9924_024

<scoring_mli_experiment_eb096b2c_4fcd_11eb_9924_0242ac110002.klime_scorer.KLimeScorer at 0x7f5224b362e8>

Input data:

| Name | Type    |
| ---- | ------- | 
| x3   | float64 |
| x1   | float64 | 
| x4   | float64 |

In [34]:
# Check the input column names
mli_scorer.get_column_names()

['x3', 'x1', 'x4']

In [36]:
# Check the names of the returned values
mli_scorer.get_reason_code_column_names()

['x3', 'x1', 'x4', 'Intercept']

**Scoring a single row**

The last value in the result (a list) is the Intercept.

In [37]:
mli_scorer.score_reason_codes([
    '-2.7997',  # x3
    '-2.4307',  # x1
    '-2.3697',  # x4
])

[-0.39610478260245907,
 -0.5276176194651054,
 3.415107055976632,
 1.233942981112774]

In [38]:
mli_scorer.score_reason_codes([
    '0',  # x3
    '4.8',  # x1
    '-6.3',  # x4
])

[-0.0, 5.08845501846021, 2.1897282740067365, 0.5037904019017483]
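
Because the returned list follows the order of `get_reason_code_column_names()`, each value can be paired with its name. The following is a minimal sketch that hard-codes the names and values printed above so it runs without the scorer package installed. `label_reason_codes` is a hypothetical helper and is not part of the scoring package.

```python
import math

# Names and values copied from the outputs above (hard-coded so the sketch
# runs without the scorer wheel installed).
reason_code_names = ['x3', 'x1', 'x4', 'Intercept']
reason_codes = [-0.39610478260245907, -0.5276176194651054,
                3.415107055976632, 1.233942981112774]


def label_reason_codes(names, values):
    """Pair each reason code with its column name (hypothetical helper)."""
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    return dict(zip(names, values))


labeled = label_reason_codes(reason_code_names, reason_codes)

# The K-Lime total is the sum of the contributions plus the Intercept.
klime_total = math.fsum(labeled.values())

print(labeled['Intercept'])
print(round(klime_total, 6))
```

Looking values up by name avoids relying on the fact that the Intercept happens to be the last element of the list.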

**Batch scoring from a data table**

In [39]:
# Data to score
columns = [
    pd.Series(['-2.7997', '-2.5746', '-2.6275', '-2.7208', '0'], name='x3', dtype='float64'),
    pd.Series(['-2.4307', '-2.9838', '-2.4492', '-2.4307', '4.8'], name='x1', dtype='float64'),
    pd.Series(['-2.3697', '-3.4233', '-2.4827', '-3.4233', '-6.3'], name='x4', dtype='float64'),
]
df = pd.concat(columns, axis=1)
print(type(df))
df

<class 'pandas.core.frame.DataFrame'>


|   | x3      | x1      | x4      |
| - | ------- | ------- | ------- |
| 0 | -2.7997 | -2.4307 | -2.3697 |
| 1 | -2.5746 | -2.9838 | -3.4233 |
| 2 | -2.6275 | -2.4492 | -2.4827 |
| 3 | -2.7208 | -2.4307 | -3.4233 |
| 4 | 0.0     | 4.8     | -6.3    |


In [40]:
# K-Lime scoring
mli_scorer.score_reason_codes_batch(df)

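|   | x3        | x1        | x4       | Intercept |
| - | --------- | --------- | -------- | --------- |
| 0 | -0.396105 | -0.527618 | 3.415107 | 1.233943  |
| 1 | 2.07571   | -1.466986 | 0.969845 | 0.896459  |
| 2 | -0.371742 | -0.531633 | 3.577958 | 1.233943  |
| 3 | 2.19358   | -1.195055 | 0.969845 | 0.896459  |
| 4 | -0.0      | 5.088455  | 2.189728 | 0.50379   |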


For consistency with Driverless AI, using datatable is recommended.

In [41]:
import datatable as dt

# Input data
df_dt = dt.Frame(df)
print(type(df_dt))
df_dt

<class 'datatable.Frame'>


|   | x3      | x1      | x4      |
| - | ------- | ------- | ------- |
| 0 | -2.7997 | -2.4307 | -2.3697 |
| 1 | -2.5746 | -2.9838 | -3.4233 |
| 2 | -2.6275 | -2.4492 | -2.4827 |
| 3 | -2.7208 | -2.4307 | -3.4233 |
| 4 | 0       | 4.8     | -6.3    |


In [42]:
preds_df = mli_scorer.score_reason_codes_batch(df_dt) 
print(type(preds_df))    # The result is returned as a pandas.DataFrame
preds_df

<class 'pandas.core.frame.DataFrame'>


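|   | x3        | x1        | x4       | Intercept |
| - | --------- | --------- | -------- | --------- |
| 0 | -0.396105 | -0.527618 | 3.415107 | 1.233943  |
| 1 | 2.07571   | -1.466986 | 0.969845 | 0.896459  |
| 2 | -0.371742 | -0.531633 | 3.577958 | 1.233943  |
| 3 | 2.19358   | -1.195055 | 0.969845 | 0.896459  |
| 4 | -0.0      | 5.088455  | 2.189728 | 0.50379   |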


In [43]:
# Convert to a datatable.Frame
dt.Frame(preds_df)

|   | x3        | x1        | x4       | Intercept |
| - | --------- | --------- | -------- | --------- |
| 0 | -0.396105 | -0.527618 | 3.41511  | 1.23394   |
| 1 | 2.07571   | -1.46699  | 0.969845 | 0.896459  |
| 2 | -0.371742 | -0.531633 | 3.57796  | 1.23394   |
| 3 | 2.19358   | -1.19505  | 0.969845 | 0.896459  |
| 4 | -0        | 5.08846   | 2.18973  | 0.50379   |


---

### Shapley Scoring

`pip install scoring-pipeline-mli/scoring_h2oai_experiment_3a80fcea_4fcb_11eb_9924_0242ac110002-1.0.0-py3-none-any.whl`

In [48]:
from scoring_h2oai_experiment_3a80fcea_4fcb_11eb_9924_0242ac110002 import Scorer

In [50]:
scorer = Scorer()
scorer

2021-01-08 00:04:07,485 C:  6% D:202.9GB M:30.4GB  NODE:SERVER      2548   INFO   | Starting H2O server for recipes.  url: None, ip: None, port: 50351, name: DAI-H2O-RECIPES-1.9.0., threads: 8
Checking whether there is an H2O instance running at http://localhost:50351 . connected.


| Property | Value |
| -------- | ----- |
| H2O_cluster_uptime: | 06 secs |
| H2O_cluster_timezone: | Etc/UTC |
| H2O_data_parsing_timezone: | UTC |
| H2O_cluster_version: | 3.30.0.3 |
| H2O_cluster_version_age: | 7 months and 25 days !!! |
| H2O_cluster_name: | DAI-H2O-RECIPES-1.9.0. |
| H2O_cluster_total_nodes: | 1 |
| H2O_cluster_free_memory: | 6.952 Gb |
| H2O_cluster_total_cores: | 8 |
| H2O_cluster_allowed_cores: | 8 |


2021-01-08 00:04:08,520 C:  0% D:202.9GB M:30.4GB  NODE:SERVER      2548   INFO   | RECIPE H2O-3 server started
2021-01-08 00:04:08,522 C:  0% D:202.9GB M:30.4GB  NODE:SERVER      2548   INFO   | Started H2O version 3.30.0.3 at http://localhost:50351


<scoring_h2oai_experiment_3a80fcea_4fcb_11eb_9924_0242ac110002.scorer.Scorer at 0x7f51f7fcaf60>

In [58]:
# Check the input column names
scorer.get_column_names()

('x1', 'x3', 'x4')

In [63]:
# Column names after Driverless AI feature engineering
scorer.get_transformed_column_names()

['0_x1', '2_x3', '3_x4']

In [59]:
# Sample data for scoring
# pd.np was removed from pandas, so use the numpy import (np) directly
mli_df = pd.DataFrame(np.array([['-2.7997', '-2.4307', '-2.3697']]), columns=mli_scorer.get_column_names())
mli_df

|   | x3      | x1      | x4      |
| - | ------- | ------- | ------- |
| 0 | -2.7997 | -2.4307 | -2.3697 |


In [60]:
# Reorder the columns to match the input order expected by the Python Scoring Pipeline (Scorer),
# so that the row is compatible with the DAI model input
dai_df = mli_df.reindex(scorer.get_column_names(), axis=1)
dai_df

|   | x1      | x3      | x4      |
| - | ------- | ------- | ------- |
| 0 | -2.4307 | -2.7997 | -2.3697 |
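
The two scorers report their input columns in different orders (`['x3', 'x1', 'x4']` for K-Lime, `('x1', 'x3', 'x4')` for the Scorer), which is why the row is reordered before scoring. The following pandas-free sketch illustrates the same reordering on a plain list. `reorder_row` is a hypothetical helper, and the column orders are hard-coded from the outputs above.

```python
# Column orders copied from the outputs above.
klime_columns = ['x3', 'x1', 'x4']   # from mli_scorer.get_column_names()
dai_columns = ('x1', 'x3', 'x4')     # from scorer.get_column_names()

row = ['-2.7997', '-2.4307', '-2.3697']  # values in K-Lime column order


def reorder_row(values, source_columns, target_columns):
    """Reorder one row from source to target column order (hypothetical helper)."""
    if set(source_columns) != set(target_columns):
        raise ValueError("source and target must contain the same columns")
    by_name = dict(zip(source_columns, values))
    return [by_name[name] for name in target_columns]


dai_row = reorder_row(row, klime_columns, dai_columns)
print(dai_row)
```

Unlike `DataFrame.reindex`, which fills columns missing from the source with NaN, this sketch raises an error when the two column sets differ, so a mismatch between the two experiments is caught early.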


In [61]:
# Score the prediction with the Python Scoring Pipeline
dai_score = scorer.score_batch(dai_df)
dai_score

2021-01-08 00:10:41,469 C:  1% D:202.9GB M:29.8GB  NODE:SERVER      2548   INFO   | Submitted    0 and Completed    0 non-identity feature engineering tasks out of    3 total tasks (including    3 identity)


|   | y        | y.lower  | y.upper  |
| - | -------- | -------- | -------- |
| 0 | 2.633075 | 0.259097 | 5.167462 |


In [72]:
# Compute the Shapley values
dai_reason_codes = scorer.score_batch(dai_df, pred_contribs=True)
print(type(dai_reason_codes))    # Returned as a pandas.DataFrame
dai_reason_codes

2021-01-08 00:15:40,381 C:  0% D:202.9GB M:29.7GB  NODE:SERVER      2548   INFO   | Submitted    0 and Completed    0 non-identity feature engineering tasks out of    3 total tasks (including    3 identity)
<class 'pandas.core.frame.DataFrame'>


|   | contrib_0_x1 | contrib_2_x3 | contrib_3_x4 | contrib_bias |
| - | ------------ | ------------ | ------------ | ------------ |
| 0 | -58650.125   | 95974.257812 | 72283.265625 | 466291.90625 |


In [76]:
# Sum of the values (contrib_0_x1 + contrib_2_x3 + contrib_3_x4 + contrib_bias)
# Because a target transformation was applied, the sum does not match the prediction
dai_reason_codes.iloc[0,:].sum()

575899.3
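
The mismatch can be checked numerically. The sketch below hard-codes the contributions and the prediction printed above, so it runs without the scorer package. It only confirms that the two numbers differ; the notebook does not show which target transformer this experiment used, so no attempt is made to invert it here.

```python
import math

# Values copied from the outputs above.
contribs = {
    'contrib_0_x1': -58650.125,
    'contrib_2_x3': 95974.257812,
    'contrib_3_x4': 72283.265625,
    'contrib_bias': 466291.90625,
}
prediction = 2.633075  # column y from scorer.score_batch(dai_df)

# Sum of all contributions, including the bias term.
contrib_total = math.fsum(contribs.values())

# Compare the total against the prediction with a small relative tolerance.
matches = math.isclose(contrib_total, prediction, rel_tol=1e-6)

print(round(contrib_total, 1))
print(matches)
```

Since the totals are on different scales, the Shapley values from this experiment are best read as relative contributions rather than as amounts that add up to the value of `y`.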

---

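### Comparing K-Lime and Shapley

If the sum of the K-Lime reason codes differs substantially from the prediction, the K-Lime model is judged not to be working well as a surrogate model. In that case, use the Shapley values instead.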

In [77]:
mli_df

|   | x3      | x1      | x4      |
| - | ------- | ------- | ------- |
| 0 | -2.7997 | -2.4307 | -2.3697 |


In [78]:
# Compute the K-Lime reason codes
klime_reason_codes = mli_scorer.score_reason_codes_batch(mli_df)
klime_reason_codes

|   | x3        | x1        | x4       | Intercept |
| - | --------- | --------- | -------- | --------- |
| 0 | -0.396105 | -0.527618 | 3.415107 | 1.233943  |


In [81]:
# Sum of the values (x3 + x1 + x4 + Intercept from above)
klime_score = mli_scorer.score_reason_codes_batch(mli_df).sum(axis=1)
klime_score

0    3.725328
dtype: float64

In [96]:
THRESHOLD = 0.2  # Allowed difference between the Python Scoring Pipeline prediction and the K-Lime total

if abs(dai_score.values[0][0] - klime_score.values[0]) > THRESHOLD:   # The difference exceeds THRESHOLD
    print(" KLIME score vs DAI score difference is higher than threshold. Reporting Reason codes from DAI model:")
    display(dai_reason_codes)   # Shapley
else:
    print(" KLIME score within threshold difference of DAI model. Reporting Reason codes from KLIME:")
    display(klime_reason_codes)   # K-Lime

 KLIME score vs DAI score difference is higher than threshold. Reporting Reason codes from DAI model:


|   | contrib_0_x1 | contrib_2_x3 | contrib_3_x4 | contrib_bias |
| - | ------------ | ------------ | ------------ | ------------ |
| 0 | -58650.125   | 95974.257812 | 72283.265625 | 466291.90625 |


In [90]:
dai_score.values[0][0]    # Prediction from the Python Scoring Pipeline

2.6330752

In [91]:
klime_score.values[0]    # Sum of the K-Lime reason codes

3.7253276350218414
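
The threshold rule can also be wrapped in a small function so that it is easy to reuse and test. This is a minimal sketch: `select_reason_code_source` is a hypothetical helper, its default threshold mirrors the `THRESHOLD` value above, and the two inputs are hard-coded from the printed outputs.

```python
def select_reason_code_source(dai_prediction, klime_total, threshold=0.2):
    """Return 'shapley' when K-Lime strays from the DAI prediction, else 'klime'.

    Hypothetical helper; the default threshold mirrors THRESHOLD above.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    gap = abs(dai_prediction - klime_total)
    return 'shapley' if gap > threshold else 'klime'


# Values copied from the outputs above.
dai_prediction = 2.6330752
klime_total = 3.7253276350218414

source = select_reason_code_source(dai_prediction, klime_total)
print(source)
```

The threshold is an absolute difference, so a suitable value depends on the scale of the target. For targets with a wide range, a relative tolerance may be a better fit, though that is a design choice outside what this notebook shows.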