## Shapley Scoring

MLOps Model Deployments (Driverless AI Model - MOJO)
- Artifact Type: DAI MOJO Pipeline
- Runtime: DAI MOJO Scorer (Shapley original only)

<img src="image/MLOps_Shaply.png" width=600px>

Sending the HTTP request  
Add `"requestShapleyValueType": "ORIGINAL"` to the request data.

Document: https://docs.h2o.ai/mlops/model-scoring/shapley-values-support

```bash
curl -X POST -H "Content-Type: application/json" -d @- https://model.internal.dedicated.h2o.ai/30cbdf4e-be01-40f7-88ba-8ea05557073a/model/score << EOF
 {
   "fields": [
   "EDUCATION","MARRIAGE","AGE","PAY_1","PAY_2","PAY_3","PAY_4","PAY_5","PAY_6","BILL_AMT1","BILL_AMT2","BILL_AMT3","BILL_AMT4","BILL_AMT5","BILL_AMT6","PAY_AMT1","PAY_AMT2","PAY_AMT3","PAY_AMT4","PAY_AMT5","PAY_AMT6"
   ],
   "rows": [
      [
         "text","text","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"
      ]
   ], 
   "requestShapleyValueType": "ORIGINAL"
} 
EOF
```

Response from MLOps

```json
{
    "id":"dfb0ac84-4634-11ef-b0ad-127940c2494e",
    "fields":["default_payment_next_month.0","default_payment_next_month.1"],
    "score":[["0.6999526659713154","0.30004733402868455"]],
    "featureShapleyContributions":{"features":["contrib_EDUCATION","contrib_MARRIAGE","contrib_AGE","contrib_PAY_1","contrib_PAY_2","contrib_PAY_3","contrib_PAY_4","contrib_PAY_5","contrib_PAY_6","contrib_BILL_AMT1","contrib_BILL_AMT2","contrib_BILL_AMT3","contrib_BILL_AMT4","contrib_BILL_AMT5","contrib_BILL_AMT6","contrib_PAY_AMT1","contrib_PAY_AMT2","contrib_PAY_AMT3","contrib_PAY_AMT4","contrib_PAY_AMT5","contrib_PAY_AMT6","contrib_bias"],
    "contributionGroups":[
            {"contributions":[["0.018954395175485188","0.02064209376526759","0.01739020325138992","-0.09536292450482012","-0.011726511778063973","0.034657258684815145","0.028547428357985582","0.004374551135780689","-0.03227419095513172","0.10147395234465731","-0.11532813084830755","-0.0799173529898052","-0.03746330075203831","0.0064375793850206305","-0.04033921631545275","0.13440659277986233","0.13308447886276945","0.12261666806500486","0.09766839374193757","0.08133276094386327","0.09352165490099","-1.329768897488515"]]
        }
    ]
}
```
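
One way to sanity-check this response is to add up the contributions. In this particular response the 22 values (21 features plus `contrib_bias`) sum to about -0.847, and passing that sum through the logistic function gives roughly 0.300, the score reported for `default_payment_next_month.1`. The sketch below reproduces that arithmetic with the values copied from the response. That the contributions are expressed in log-odds space is an observation about this output, not a guarantee stated by the source.

```python
import math

# Values copied verbatim from the response above (21 features + contrib_bias).
contributions = [
    "0.018954395175485188", "0.02064209376526759", "0.01739020325138992",
    "-0.09536292450482012", "-0.011726511778063973", "0.034657258684815145",
    "0.028547428357985582", "0.004374551135780689", "-0.03227419095513172",
    "0.10147395234465731", "-0.11532813084830755", "-0.0799173529898052",
    "-0.03746330075203831", "0.0064375793850206305", "-0.04033921631545275",
    "0.13440659277986233", "0.13308447886276945", "0.12261666806500486",
    "0.09766839374193757", "0.08133276094386327", "0.09352165490099",
    "-1.329768897488515",
]
# Reported score for default_payment_next_month.1
score_class_1 = float("0.30004733402868455")

# Assumption: contributions are additive in log-odds (logit) space.
total = sum(float(c) for c in contributions)
reconstructed = 1.0 / (1.0 + math.exp(-total))  # logistic function

print(f"sum of contributions: {total:.6f}")
print(f"sigmoid(sum):         {reconstructed:.6f}")
print(f"reported score:       {score_class_1:.6f}")
```

If the two last numbers disagree for another model, the contributions are probably expressed on a different scale, and this check does not apply.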

### Using Python ([Requests](https://requests-docs-ja.readthedocs.io/en/latest/#))

In [42]:
import requests
import json
import pandas as pd
import numpy as np

In [11]:
dict_data = {
  "fields":["EDUCATION","MARRIAGE","AGE","PAY_1","PAY_2","PAY_3","PAY_4","PAY_5","PAY_6","BILL_AMT1","BILL_AMT2","BILL_AMT3","BILL_AMT4","BILL_AMT5","BILL_AMT6","PAY_AMT1","PAY_AMT2","PAY_AMT3","PAY_AMT4","PAY_AMT5","PAY_AMT6"],
  "rows": [
    ["text","text","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"]
  ],
    "requestShapleyValueType":"ORIGINAL",
}

In [12]:
response = requests.post(url="https://model.internal.dedicated.h2o.ai/30cbdf4e-be01-40f7-88ba-8ea05557073a/model/score", 
                         headers= {'content-type': 'application/json'}, 
                         data=json.dumps(dict_data))
response

<Response [200]>
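
The call above assumes the endpoint always answers with a 200. A slightly more defensive variant, sketched here, sets a timeout and raises on HTTP errors before decoding the body. `score_rows` is a hypothetical helper, not part of MLOps or Requests. It takes the posting function as an argument so that `requests.post` can be passed in unchanged.

```python
def score_rows(post, url, payload, timeout=10.0):
    """POST a scoring payload and return the decoded JSON body.

    `post` is any callable with the same signature as `requests.post`.
    """
    # json= lets Requests serialise the dict and set the
    # Content-Type: application/json header itself.
    response = post(url, json=payload, timeout=timeout)
    # Raise on 4xx/5xx instead of trying to parse an error body.
    response.raise_for_status()
    return response.json()


# Usage with the objects defined above (needs network access),
# where `url` is the scoring endpoint used in the previous cell:
#   import requests
#   results = score_rows(requests.post, url, dict_data)
```

The 10-second timeout is an arbitrary choice for illustration. Without any timeout, Requests waits indefinitely for a response.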

In [13]:
response.text

'{"id":"dfb0ac84-4634-11ef-b0ad-127940c2494e","fields":["default_payment_next_month.0","default_payment_next_month.1"],"score":[["0.6999526659713154","0.30004733402868455"]],"featureShapleyContributions":{"features":["contrib_EDUCATION","contrib_MARRIAGE","contrib_AGE","contrib_PAY_1","contrib_PAY_2","contrib_PAY_3","contrib_PAY_4","contrib_PAY_5","contrib_PAY_6","contrib_BILL_AMT1","contrib_BILL_AMT2","contrib_BILL_AMT3","contrib_BILL_AMT4","contrib_BILL_AMT5","contrib_BILL_AMT6","contrib_PAY_AMT1","contrib_PAY_AMT2","contrib_PAY_AMT3","contrib_PAY_AMT4","contrib_PAY_AMT5","contrib_PAY_AMT6","contrib_bias"],"contributionGroups":[{"contributions":[["0.018954395175485188","0.02064209376526759","0.01739020325138992","-0.09536292450482012","-0.011726511778063973","0.034657258684815145","0.028547428357985582","0.004374551135780689","-0.03227419095513172","0.10147395234465731","-0.11532813084830755","-0.0799173529898052","-0.03746330075203831","0.0064375793850206305","-0.04033921631545275",

In [20]:
results = response.json()
results

{'id': 'dfb0ac84-4634-11ef-b0ad-127940c2494e',
 'fields': ['default_payment_next_month.0', 'default_payment_next_month.1'],
 'score': [['0.6999526659713154', '0.30004733402868455']],
 'featureShapleyContributions': {'features': ['contrib_EDUCATION',
   'contrib_MARRIAGE',
   'contrib_AGE',
   'contrib_PAY_1',
   'contrib_PAY_2',
   'contrib_PAY_3',
   'contrib_PAY_4',
   'contrib_PAY_5',
   'contrib_PAY_6',
   'contrib_BILL_AMT1',
   'contrib_BILL_AMT2',
   'contrib_BILL_AMT3',
   'contrib_BILL_AMT4',
   'contrib_BILL_AMT5',
   'contrib_BILL_AMT6',
   'contrib_PAY_AMT1',
   'contrib_PAY_AMT2',
   'contrib_PAY_AMT3',
   'contrib_PAY_AMT4',
   'contrib_PAY_AMT5',
   'contrib_PAY_AMT6',
   'contrib_bias'],
  'contributionGroups': [{'contributions': [['0.018954395175485188',
      '0.02064209376526759',
      '0.01739020325138992',
      '-0.09536292450482012',
      '-0.011726511778063973',
      '0.034657258684815145',
      '0.028547428357985582',
      '0.004374551135780689',
      '-0

In [21]:
results.keys()

dict_keys(['id', 'fields', 'score', 'featureShapleyContributions'])

In [27]:
results['fields'][1], results['score'][0][1]

('default_payment_next_month.1', '0.30004733402868455')
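
Scores come back as strings, one list per input row, in the same order as `fields`. A small sketch that pairs each class label with its probability as a float follows. The `results` dict here is a trimmed copy of the response above, and `scores_by_class` is a hypothetical helper name.

```python
def scores_by_class(results, row=0):
    """Map each output field to its score for one row, converted to float."""
    return {
        field: float(value)
        for field, value in zip(results["fields"], results["score"][row])
    }


# Trimmed copy of the response shown above.
results = {
    "fields": ["default_payment_next_month.0", "default_payment_next_month.1"],
    "score": [["0.6999526659713154", "0.30004733402868455"]],
}
probs = scores_by_class(results)
print(probs["default_payment_next_month.1"])
```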

In [29]:
results['featureShapleyContributions'].keys()

dict_keys(['features', 'contributionGroups'])

In [30]:
results['featureShapleyContributions']['features']

['contrib_EDUCATION',
 'contrib_MARRIAGE',
 'contrib_AGE',
 'contrib_PAY_1',
 'contrib_PAY_2',
 'contrib_PAY_3',
 'contrib_PAY_4',
 'contrib_PAY_5',
 'contrib_PAY_6',
 'contrib_BILL_AMT1',
 'contrib_BILL_AMT2',
 'contrib_BILL_AMT3',
 'contrib_BILL_AMT4',
 'contrib_BILL_AMT5',
 'contrib_BILL_AMT6',
 'contrib_PAY_AMT1',
 'contrib_PAY_AMT2',
 'contrib_PAY_AMT3',
 'contrib_PAY_AMT4',
 'contrib_PAY_AMT5',
 'contrib_PAY_AMT6',
 'contrib_bias']

In [35]:
results['featureShapleyContributions']['contributionGroups'][0]['contributions']

[['0.018954395175485188',
  '0.02064209376526759',
  '0.01739020325138992',
  '-0.09536292450482012',
  '-0.011726511778063973',
  '0.034657258684815145',
  '0.028547428357985582',
  '0.004374551135780689',
  '-0.03227419095513172',
  '0.10147395234465731',
  '-0.11532813084830755',
  '-0.0799173529898052',
  '-0.03746330075203831',
  '0.0064375793850206305',
  '-0.04033921631545275',
  '0.13440659277986233',
  '0.13308447886276945',
  '0.12261666806500486',
  '0.09766839374193757',
  '0.08133276094386327',
  '0.09352165490099',
  '-1.329768897488515']]

In [39]:
df_contrib = pd.DataFrame({
    'features':results['featureShapleyContributions']['features'],
    'contributions':results['featureShapleyContributions']['contributionGroups'][0]['contributions'][0],
})
df_contrib

| | features | contributions |
|---|---|---|
| 0 | contrib_EDUCATION | 0.0189543951754851 |
| 1 | contrib_MARRIAGE | 0.0206420937652675 |
| 2 | contrib_AGE | 0.0173902032513899 |
| 3 | contrib_PAY_1 | -0.0953629245048201 |
| 4 | contrib_PAY_2 | -0.0117265117780639 |
| 5 | contrib_PAY_3 | 0.0346572586848151 |
| 6 | contrib_PAY_4 | 0.0285474283579855 |
| 7 | contrib_PAY_5 | 0.0043745511357806 |
| 8 | contrib_PAY_6 | -0.0322741909551317 |
| 9 | contrib_BILL_AMT1 | 0.1014739523446573 |


In [40]:
df_contrib.dtypes

features         object
contributions    object
dtype: object

In [44]:
df_contrib['contributions'] = df_contrib['contributions'].astype(np.float64)

In [45]:
df_contrib.dtypes

features          object
contributions    float64
dtype: object

In [51]:
df_contrib['features'] = [s.replace('contrib_', '') for s in df_contrib['features']]

In [52]:
df_contrib

| | features | contributions |
|---|---|---|
| 0 | EDUCATION | 0.018954 |
| 1 | MARRIAGE | 0.020642 |
| 2 | AGE | 0.01739 |
| 3 | PAY_1 | -0.095363 |
| 4 | PAY_2 | -0.011727 |
| 5 | PAY_3 | 0.034657 |
| 6 | PAY_4 | 0.028547 |
| 7 | PAY_5 | 0.004375 |
| 8 | PAY_6 | -0.032274 |
| 9 | BILL_AMT1 | 0.101474 |
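
With numeric contributions, a natural next step is to rank features by the magnitude of their effect. The sketch below does this in plain Python on the values from the response, setting `contrib_bias` aside because it is the baseline term rather than a feature. `rank_contributions` is a hypothetical helper. On the DataFrame itself, the sorting step could likely be written as `df_contrib.sort_values('contributions', key=abs, ascending=False)`, which needs pandas 1.1 or later for the `key` argument.

```python
def rank_contributions(features, contributions, top=5):
    """Return (bias, [(feature, value), ...]) sorted by absolute contribution."""
    pairs = [(name.replace("contrib_", "", 1), float(value))
             for name, value in zip(features, contributions)]
    # The bias is the baseline term, so keep it out of the ranking.
    bias = next(value for name, value in pairs if name == "bias")
    ranked = sorted((p for p in pairs if p[0] != "bias"),
                    key=lambda p: abs(p[1]), reverse=True)
    return bias, ranked[:top]


# Feature names and values copied from the response above.
features = [
    "contrib_EDUCATION", "contrib_MARRIAGE", "contrib_AGE",
    "contrib_PAY_1", "contrib_PAY_2", "contrib_PAY_3",
    "contrib_PAY_4", "contrib_PAY_5", "contrib_PAY_6",
    "contrib_BILL_AMT1", "contrib_BILL_AMT2", "contrib_BILL_AMT3",
    "contrib_BILL_AMT4", "contrib_BILL_AMT5", "contrib_BILL_AMT6",
    "contrib_PAY_AMT1", "contrib_PAY_AMT2", "contrib_PAY_AMT3",
    "contrib_PAY_AMT4", "contrib_PAY_AMT5", "contrib_PAY_AMT6",
    "contrib_bias",
]
contributions = [
    "0.018954395175485188", "0.02064209376526759", "0.01739020325138992",
    "-0.09536292450482012", "-0.011726511778063973", "0.034657258684815145",
    "0.028547428357985582", "0.004374551135780689", "-0.03227419095513172",
    "0.10147395234465731", "-0.11532813084830755", "-0.0799173529898052",
    "-0.03746330075203831", "0.0064375793850206305", "-0.04033921631545275",
    "0.13440659277986233", "0.13308447886276945", "0.12261666806500486",
    "0.09766839374193757", "0.08133276094386327", "0.09352165490099",
    "-1.329768897488515",
]

bias, top5 = rank_contributions(features, contributions)
for name, value in top5:
    print(f"{name:10s} {value:+.6f}")
```

For this row, the five largest effects are `PAY_AMT1`, `PAY_AMT2`, `PAY_AMT3`, `BILL_AMT2` and `BILL_AMT1`, in that order.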


#### From CSV (pandas)

In [17]:
df = pd.read_csv('data/UCI_Credit_Card3_sample5.csv')
df

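| | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_1 | PAY_2 | PAY_3 | PAY_4 | ... | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | default_payment_next_month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 20000 | M | university | married | 24 | 2 | 2 | -1 | -1 | ... | 0 | 0 | 0 | 0 | 689 | 0 | 0 | 0 | 0 | 1 |
| 1 | 2 | 120000 | M | university | single | 26 | -1 | 2 | 0 | 0 | ... | 3272 | 3455 | 3261 | 0 | 1000 | 1000 | 1000 | 0 | 2000 | 1 |
| 2 | 3 | 90000 | M | university | single | 34 | 0 | 0 | 0 | 0 | ... | 14331 | 14948 | 15549 | 1518 | 1500 | 1000 | 1000 | 1000 | 5000 | 0 |
| 3 | 4 | 50000 | M | university | married | 37 | 0 | 0 | 0 | 0 | ... | 28314 | 28959 | 29547 | 2000 | 2019 | 1200 | 1100 | 1069 | 1000 | 0 |
| 4 | 5 | 50000 | F | university | married | 57 | -1 | 0 | -1 | 0 | ... | 20940 | 19146 | 19131 | 2000 | 36681 | 10000 | 9000 | 689 | 679 | 0 |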


In [54]:
df.dtypes

ID                             int64
LIMIT_BAL                      int64
SEX                           object
EDUCATION                     object
MARRIAGE                      object
AGE                            int64
PAY_1                          int64
PAY_2                          int64
PAY_3                          int64
PAY_4                          int64
PAY_5                          int64
PAY_6                          int64
BILL_AMT1                      int64
BILL_AMT2                      int64
BILL_AMT3                      int64
BILL_AMT4                      int64
BILL_AMT5                      int64
BILL_AMT6                      int64
PAY_AMT1                       int64
PAY_AMT2                       int64
PAY_AMT3                       int64
PAY_AMT4                       int64
PAY_AMT5                       int64
PAY_AMT6                       int64
default_payment_next_month     int64
dtype: object

In [56]:
# Convert every column to string, since the scoring API takes string values
df_str = df.astype(str)

In [57]:
# Convert the pandas.DataFrame to JSON, then load it back as a dict
json_data = df_str.to_json(orient="split")
json_data = json.loads(json_data)
json_data

{'columns': ['ID',
  'LIMIT_BAL',
  'SEX',
  'EDUCATION',
  'MARRIAGE',
  'AGE',
  'PAY_1',
  'PAY_2',
  'PAY_3',
  'PAY_4',
  'PAY_5',
  'PAY_6',
  'BILL_AMT1',
  'BILL_AMT2',
  'BILL_AMT3',
  'BILL_AMT4',
  'BILL_AMT5',
  'BILL_AMT6',
  'PAY_AMT1',
  'PAY_AMT2',
  'PAY_AMT3',
  'PAY_AMT4',
  'PAY_AMT5',
  'PAY_AMT6',
  'default_payment_next_month'],
 'index': [0, 1, 2, 3, 4],
 'data': [['1',
   '20000',
   'M',
   'university',
   'married',
   '24',
   '2',
   '2',
   '-1',
   '-1',
   '-2',
   '-2',
   '3913',
   '3102',
   '689',
   '0',
   '0',
   '0',
   '0',
   '689',
   '0',
   '0',
   '0',
   '0',
   '1'],
  ['2',
   '120000',
   'M',
   'university',
   'single',
   '26',
   '-1',
   '2',
   '0',
   '0',
   '0',
   '2',
   '2682',
   '1725',
   '2682',
   '3272',
   '3455',
   '3261',
   '0',
   '1000',
   '1000',
   '1000',
   '0',
   '2000',
   '1'],
  ['3',
   '90000',
   'M',
   'university',
   'single',
   '34',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '29239

In [59]:
json_data.keys()

dict_keys(['columns', 'index', 'data'])

In [60]:
# Rename the keys: columns -> fields, data -> rows
json_data['fields'] = json_data['columns']
del json_data['columns']
json_data['rows'] = json_data['data']
del json_data['data']

In [65]:
# Add the Shapley scoring directive
json_data['requestShapleyValueType'] = 'ORIGINAL'

In [67]:
json_data.keys()

dict_keys(['index', 'fields', 'rows', 'requestShapleyValueType'])
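
The payload above still carries the `index` key produced by `orient="split"`, as well as columns such as `ID` and the target that are not among the 21 fields used in the earlier requests. The endpoint accepted it in this run, but a stricter payload can be built by selecting only the expected columns. The following is a minimal sketch, assuming the model's inputs are exactly the fields listed in the first request. `build_payload` and `MODEL_FIELDS` are hypothetical names, and the helper works on the dict returned by `json.loads(df_str.to_json(orient="split"))` before any keys are renamed.

```python
# Assumption: these are the model's input columns (taken from the first request).
MODEL_FIELDS = [
    "EDUCATION", "MARRIAGE", "AGE",
    "PAY_1", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6",
    "BILL_AMT1", "BILL_AMT2", "BILL_AMT3", "BILL_AMT4", "BILL_AMT5", "BILL_AMT6",
    "PAY_AMT1", "PAY_AMT2", "PAY_AMT3", "PAY_AMT4", "PAY_AMT5", "PAY_AMT6",
]


def build_payload(split, fields=MODEL_FIELDS, shapley_type="ORIGINAL"):
    """Build a scoring payload from a pandas 'split'-oriented dict."""
    # list.index raises ValueError if an expected column is missing.
    positions = [split["columns"].index(name) for name in fields]
    # Keep only the selected columns, as strings, in the order of `fields`.
    rows = [[str(row[i]) for i in positions] for row in split["data"]]
    return {
        "fields": list(fields),
        "rows": rows,
        "requestShapleyValueType": shapley_type,
    }


# Tiny stand-in for json.loads(df_str.to_json(orient="split")).
split = {
    "columns": ["ID", "EDUCATION", "AGE"],
    "index": [0, 1],
    "data": [["1", "university", "24"], ["2", "university", "26"]],
}
payload = build_payload(split, fields=["EDUCATION", "AGE"])
print(payload)
```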

In [68]:
response = requests.post(url='https://model.internal.dedicated.h2o.ai/30cbdf4e-be01-40f7-88ba-8ea05557073a/model/score', 
                         headers={'content-type': 'application/json'}, 
                         data=json.dumps(json_data))
response.json()

{'id': 'dfb0ac84-4634-11ef-b0ad-127940c2494e',
 'fields': ['default_payment_next_month.0', 'default_payment_next_month.1'],
 'score': [['0.3025260726918264', '0.6974739273081736'],
  ['0.5915855520636213', '0.40841444793637877'],
  ['0.8672090651155677', '0.13279093488443225'],
  ['0.8746903236209524', '0.1253096763790475'],
  ['0.9050907569177213', '0.0949092430822786']],
 'featureShapleyContributions': {'features': ['contrib_EDUCATION',
   'contrib_MARRIAGE',
   'contrib_AGE',
   'contrib_PAY_1',
   'contrib_PAY_2',
   'contrib_PAY_3',
   'contrib_PAY_4',
   'contrib_PAY_5',
   'contrib_PAY_6',
   'contrib_BILL_AMT1',
   'contrib_BILL_AMT2',
   'contrib_BILL_AMT3',
   'contrib_BILL_AMT4',
   'contrib_BILL_AMT5',
   'contrib_BILL_AMT6',
   'contrib_PAY_AMT1',
   'contrib_PAY_AMT2',
   'contrib_PAY_AMT3',
   'contrib_PAY_AMT4',
   'contrib_PAY_AMT5',
   'contrib_PAY_AMT6',
   'contrib_bias'],
  'contributionGroups': [{'contributions': [['0.09608278421794794',
      '0.08722179091945213

In [72]:
response.json().keys()

dict_keys(['id', 'fields', 'score', 'featureShapleyContributions'])
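
For a multi-row request, each row needs to be matched with its own contribution vector. The notebook output above is truncated, so the sketch below assumes that `contributions` holds one inner list per input row, in request order, just as `score` does. The sample values are made up for illustration, and `contributions_per_row` is a hypothetical helper.

```python
def contributions_per_row(results, group=0):
    """Return one {feature: contribution} dict per scored row."""
    shap = results["featureShapleyContributions"]
    # Strip the "contrib_" prefix so keys match the input column names.
    names = [n.replace("contrib_", "", 1) for n in shap["features"]]
    # Assumption: one inner list per input row, in request order.
    rows = shap["contributionGroups"][group]["contributions"]
    return [dict(zip(names, map(float, row))) for row in rows]


# Made-up two-row response with the same layout as the output above.
sample = {
    "score": [["0.3", "0.7"], ["0.6", "0.4"]],
    "featureShapleyContributions": {
        "features": ["contrib_AGE", "contrib_PAY_1", "contrib_bias"],
        "contributionGroups": [
            {"contributions": [["0.1", "0.5", "-1.0"], ["-0.2", "0.3", "-1.0"]]}
        ],
    },
}
per_row = contributions_per_row(sample)
print(per_row[1]["PAY_1"])
```

The resulting list can be passed to `pd.DataFrame(per_row)` to get one row of contributions per scored record.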