# Factor Exploration with the Azure Machine Learning Interpretability SDK

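A machine learning model that predicts quality makes it possible to predict the quality of manufactured products from manufacturing process data. Beyond prediction, understanding the structure of the model helps identify the explanatory variables (factors) that influence defects, which in turn helps locate their causes. This notebook uses **Factory.csv** to build a machine learning model that predicts quality from manufacturing process data, and then uses the **Azure Machine Learning Interpretability SDK** to analyze how strongly each factor influences quality.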

## 1. Import the Python SDK
Import the Python SDK for Azure Machine Learning service.

In [1]:
import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun
import os

W0723 10:31:16.210340 139779322103552 deprecation_wrapper.py:119] From /anaconda/envs/py36/lib/python3.6/site-packages/azureml/automl/core/_vendor/automl/client/core/common/tf_wrappers.py:36: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

W0723 10:31:16.211839 139779322103552 deprecation_wrapper.py:119] From /anaconda/envs/py36/lib/python3.6/site-packages/azureml/automl/core/_vendor/automl/client/core/common/tf_wrappers.py:36: The name tf.logging.ERROR is deprecated. Please use tf.compat.v1.logging.ERROR instead.



In [2]:
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.0.48


### Connect to the Azure ML workspace
Connect to Azure Machine Learning service. Authentication to Azure is required.

In [3]:
ws = Workspace.from_config()
print(ws.name, ws.location, ws.resource_group, sep='\t')

azureml	japaneast	test
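
`Workspace.from_config()` reads the workspace coordinates from a `config.json` file, typically placed in the working directory or in a `.azureml` subfolder. The following is a minimal sketch of writing such a file with the standard library only. The workspace name and resource group are taken from the output above; the subscription ID is a placeholder, and `write_workspace_config` is a hypothetical helper, not part of the SDK.

```python
import json
import os
import tempfile

# Workspace name and resource group come from the cell output above;
# the subscription ID is a placeholder you must replace with your own.
workspace_config = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "test",
    "workspace_name": "azureml",
}


def write_workspace_config(directory, config):
    """Hypothetical helper: write config.json into `directory` and return its path."""
    path = os.path.join(directory, "config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return path


# A temporary directory is used here so the sketch has no side effects.
config_dir = tempfile.mkdtemp()
config_path = write_workspace_config(config_dir, workspace_config)

# Read the file back to confirm it contains the three expected keys.
with open(config_path, encoding="utf-8") as f:
    loaded_config = json.load(f)
print(sorted(loaded_config))
```

In practice the file would be written next to the notebook rather than to a temporary directory, so that `Workspace.from_config()` can find it.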


### Set the experiment name

In [4]:
experiment=Experiment(ws, "automlQC_explain")

## 2. Prepare the training data

In [5]:
import pandas as pd
#os.makedirs("./outputs", exist_ok=True)
df = pd.read_csv('Factory.csv')

In [6]:
df.tail(10)

Unnamed: 0,ID,Quality,ProcessA-Pressure,ProcessA-Humidity,ProcessA-Vibration,ProcessB-Light,ProcessB-Skill,ProcessB-Temp,ProcessB-Rotation,ProcessC-Density,ProcessC-PH,ProcessC-skewness,ProcessC-Time
4888,4889,0,6.8,0.22,0.36,1.2,0.05,38.0,127.0,0.99,3.04,0.54,9.2
4889,4890,0,4.9,0.23,0.27,11.75,0.03,34.0,118.0,1.0,3.07,0.5,9.4
4890,4891,0,6.1,0.34,0.29,2.2,0.04,25.0,100.0,0.99,3.06,0.44,11.8
4891,4892,0,5.7,0.21,0.32,0.9,0.04,38.0,121.0,0.99,3.24,0.46,10.6
4892,4893,0,6.5,0.23,0.38,1.3,0.03,29.0,112.0,0.99,3.29,0.54,9.7
4893,4894,0,6.2,0.21,0.29,1.6,0.04,24.0,92.0,0.99,3.27,0.5,11.2
4894,4895,0,6.6,0.32,0.36,8.0,0.05,57.0,168.0,0.99,3.15,0.46,9.6
4895,4896,0,6.5,0.24,0.19,1.2,0.04,30.0,111.0,0.99,2.99,0.46,9.4
4896,4897,1,5.5,0.29,0.3,1.1,0.02,20.0,110.0,0.99,3.34,0.38,12.8
4897,4898,0,6.0,0.21,0.38,0.8,0.02,22.0,98.0,0.99,3.26,0.32,11.8


In [7]:
from sklearn.model_selection import train_test_split

# Drop the target (Quality) and the row identifier (ID) from the features.
X = df.drop(columns=["Quality", "ID"])
y = df["Quality"].values

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.1,random_state=100,stratify=y)
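
The `stratify=y` argument matters here because defective items (`Quality` = 1) appear to be rare in the data. The sketch below illustrates the idea of a stratified split in plain Python: each class is split separately so that the train and test sets keep the same defect rate. It is an illustration of the concept, not scikit-learn's implementation, and the labels are hypothetical toy data.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size, seed):
    """Return (train_idx, test_idx), splitting each class in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels: 900 good items (0) and 100 defective items (1).
labels = [0] * 900 + [1] * 100
train_idx, test_idx = stratified_split(labels, test_size=0.1, seed=100)

train_defect_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_defect_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_defect_rate, test_defect_rate)
```

Without stratification, a small test set could by chance contain very few defective items, which would make the evaluation metric unstable.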

## 3. Configuration (Automated Machine Learning)

In [8]:
Automl_config = AutoMLConfig(task = 'classification',
                             primary_metric = 'AUC_weighted',
                             iteration_timeout_minutes = 10,
                             iterations = 10,
                             X = X_train,
                             y = y_train,
                             n_cross_validations = 3)
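
The primary metric `AUC_weighted` averages the one-vs-rest AUC of each class, weighted by the number of samples in that class. As a rough illustration of what is being optimized, the sketch below computes AUC as the fraction of (positive, negative) pairs that the scores rank correctly, then applies the weighted average. The function names and the toy data are assumptions for illustration; this is not the code Automated ML runs internally.

```python
def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_ovr_auc(y_true, proba_by_class):
    """Average the one-vs-rest AUC of each class, weighted by class frequency."""
    total = len(y_true)
    result = 0.0
    for cls, proba in proba_by_class.items():
        one_vs_rest = [1 if y == cls else 0 for y in y_true]
        support = sum(one_vs_rest)
        result += (support / total) * binary_auc(one_vs_rest, proba)
    return result


# Hypothetical toy data: 1 = defective, 0 = good.
y_true = [0, 0, 0, 0, 1, 1]
p_defect = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
p_good = [1.0 - p for p in p_defect]

auc = binary_auc(y_true, p_defect)
auc_weighted = weighted_ovr_auc(y_true, {1: p_defect, 0: p_good})
print(auc, auc_weighted)
```

For a two-class problem such as this one, both one-vs-rest AUCs coincide, so the weighted value equals the ordinary binary AUC. The weighting only changes the result when there are three or more classes.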

## 4. Run and review the results

In [9]:
local_run = experiment.submit(Automl_config, show_output=True)

Running on local machine
Parent Run ID: AutoML_84073eaf-27e9-485a-b003-e7ee1a86b78d
Current status: DatasetCrossValidationSplit. Generating CV splits.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   PIPELINE                                       DURATION      METRIC      BEST
         0   StandardScalerWrapper SGD                      0:00:08       0.7817    0.7817
         1   StandardScalerWrapper SGD                      0:00:08       0.7841    0.7841
         2   MinMaxScaler LightGBM                      

In [10]:
from azureml.widgets import RunDetails
RunDetails(local_run).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

In [11]:
best_run, fitted_model = local_run.get_output()
best_run

Experiment,Id,Type,Status,Details Page,Docs Page
automlQC_explain,AutoML_84073eaf-27e9-485a-b003-e7ee1a86b78d_5,,Completed,Link to Azure Portal,Link to Documentation


In [12]:
fitted_model

Pipeline(memory=None,
     steps=[('StandardScalerWrapper', <automl.client.core.common.model_wrappers.StandardScalerWrapper object at 0x7f2030762828>), ('LightGBMClassifier', LightGBMClassifier(boosting_type='gbdt', class_weight=None,
          colsample_bytree=0.6933333333333332, importance_type='split',
          learning_r..., subsample=0.3963157894736842,
          subsample_for_bin=200000, subsample_freq=0, verbose=-10))])

## 5. Azure Machine Learning Interpretability SDK

The [Azure Machine Learning Interpretability SDK](https://docs.microsoft.com/en-US/azure/machine-learning/service/machine-learning-interpretability-explainability?view=azuremgmtcompute-fluent-1.0.0) is a model interpretability framework that combines libraries from Microsoft and major third parties (LIME, SHAP, and others) behind a unified API.  
<img src="https://docs.microsoft.com/en-US/azure/machine-learning/service/media/machine-learning-interpretability-explainability/interpretability-architecture.png#lightbox" width=800 align=left>
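
SHAP, one of the libraries named above, attributes a prediction to individual features using Shapley values. To make the idea concrete, the sketch below computes exact Shapley values for a tiny hypothetical model by averaging each feature's marginal contribution over all feature orderings. The model, the instance, and the baseline are invented for illustration (only the feature names come from Factory.csv). Real explainers rely on approximations and background datasets rather than brute-force enumeration against a single baseline.

```python
from itertools import permutations


def toy_defect_score(features):
    """Hypothetical model: a defect score with one interaction term."""
    pressure = features["ProcessA-Pressure"]
    temp = features["ProcessB-Temp"]
    ph = features["ProcessC-PH"]
    return 0.5 * pressure + 0.02 * temp + 0.1 * pressure * ph


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)  # start from the baseline input
        previous_output = model(current)
        for name in ordering:
            current[name] = instance[name]  # switch one feature to its actual value
            output = model(current)
            totals[name] += output - previous_output
            previous_output = output
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical item to explain and a hypothetical reference ("typical") item.
instance = {"ProcessA-Pressure": 6.0, "ProcessB-Temp": 30.0, "ProcessC-PH": 3.0}
baseline = {"ProcessA-Pressure": 5.0, "ProcessB-Temp": 25.0, "ProcessC-PH": 3.2}

contributions = shapley_values(toy_defect_score, instance, baseline)
gap = toy_defect_score(instance) - toy_defect_score(baseline)
print(contributions, gap)
```

The contributions add up to the difference between the model output for the item and for the baseline, which is the property that makes these values usable as a per-feature breakdown of a single prediction.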

In [13]:
!pip install azureml-sdk[contrib]

mxnet-model-server 1.0.1 requires model-archiver, which is not installed.
autovizwidget 0.12.7 has requirement plotly<3.0,>=1.10.0, but you'll have plotly 3.6.1 which is incompatible.
flake8 3.7.5 has requirement pycodestyle<2.6.0,>=2.5.0, but you'll have pycodestyle 2.4.0 which is incompatible.
chainermn 1.3.1 has requirement chainer<5.0,>=3.5.0, but you'll have chainer 5.2.0 which is incompatible.
botocore 1.12.93 has requirement urllib3<1.25,>=1.20; python_version >= "3.4", but you'll have urllib3 1.25.3 which is incompatible.
mxnet 1.3.0 has requirement requests<2.19.0,>=2.18.4, but you'll have requests 2.22.0 which is incompatible.
azureml-contrib-opendatasets 1.0.33 has requirement azureml-telemetry==1.0.33.*, but you'll have azureml-telemetry 1.0.48 which is incompatible.
blobxfer 1.6.0 has requirement requests~=2.21.0, but you'll have requests 2.22.0 which is incompatible.
[31mblobxfer 1.6.0 has requiremen

In [14]:
!jupyter nbextension install --py --sys-prefix azureml.contrib.explain.model.visualize
!jupyter nbextension enable --py --sys-prefix azureml.contrib.explain.model.visualize

Installing /data/anaconda/envs/py35/lib/python3.5/site-packages/azureml/contrib/explain/model/visualize/static -> microsoft-mli-widget
Up to date: /data/anaconda/envs/py35/share/jupyter/nbextensions/microsoft-mli-widget/index.js.map
Up to date: /data/anaconda/envs/py35/share/jupyter/nbextensions/microsoft-mli-widget/index.js
Up to date: /data/anaconda/envs/py35/share/jupyter/nbextensions/microsoft-mli-widget/extension.js.map
Up to date: /data/anaconda/envs/py35/share/jupyter/nbextensions/microsoft-mli-widget/extension.js
- Validating: OK

    To initialize this nbextension in the browser every time the notebook (or other app) loads:
    
          jupyter nbextension enable azureml.contrib.explain.model.visualize --py --sys-prefix
    
Enabling notebook extension microsoft-mli-widget/extension...
      - Validating: OK


In [15]:
from azureml.explain.model.tabular_explainer import TabularExplainer
classes = ["false","true"]
tabular_explainer = TabularExplainer(fitted_model, X_train, features=X_train.columns, classes=classes)

In [16]:
global_explanation = tabular_explainer.explain_global(X_train[:100])

100%|██████████| 100/100 [00:15<00:00,  6.54it/s]
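
`explain_global` produces one importance ranking for the whole sample by aggregating the per-row (local) importance values. A minimal sketch of one common aggregation, the mean absolute local importance per feature, is shown below. That the SDK aggregates in exactly this way is an assumption here, and the numbers are hypothetical rather than taken from this run.

```python
def global_importance(local_values, feature_names):
    """Rank features by the mean absolute value of their local importances."""
    n_rows = len(local_values)
    means = []
    for col, name in enumerate(feature_names):
        mean_abs = sum(abs(row[col]) for row in local_values) / n_rows
        means.append((name, mean_abs))
    return sorted(means, key=lambda item: item[1], reverse=True)


feature_names = ["ProcessA-Pressure", "ProcessB-Temp", "ProcessC-Time"]
# Hypothetical local importances: one row per item, one column per feature.
local_values = [
    [0.02, -0.01, 0.10],
    [-0.04, 0.00, -0.20],
    [0.03, 0.02, 0.15],
]

ranking = global_importance(local_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging keeps a feature that pushes some items toward "defective" and others toward "good" from cancelling out to zero.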


In [21]:
from azureml.contrib.explain.model.visualize import ExplanationDashboard
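# Pass the same rows that were given to explain_global so the dashboard data matches the explanation.
ExplanationDashboard(global_explanation, fitted_model, X_train[:100])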

ExplanationWidget(value={'localExplanations': [[[-0.00019272351599272652, -0.018569533525989744, 0.0, 0.010266…

<azureml.contrib.explain.model.visualize.ExplanationDashboard.ExplanationDashboard at 0x7f1fd2ace668>