# Automated Machine Learning による設備メンテナンス
本Notebookでは、自動機械学習 Automated Machine Learning を利用して、設備故障の予兆検知モデルを構築します。

## 1. Python SDK のインポート
Azure Machine Learning service の Python SDKをインポートします。

In [1]:
from azureml.core import Workspace, Experiment
from azureml.train.automl import AutoMLConfig

W0808 01:02:58.101434 4481119680 deprecation_wrapper.py:119] From /Users/konabuta/miniconda3/envs/myenv/lib/python3.6/site-packages/azureml/automl/core/_vendor/automl/client/core/common/tf_wrappers.py:36: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

W0808 01:02:58.102660 4481119680 deprecation_wrapper.py:119] From /Users/konabuta/miniconda3/envs/myenv/lib/python3.6/site-packages/azureml/automl/core/_vendor/automl/client/core/common/tf_wrappers.py:36: The name tf.logging.ERROR is deprecated. Please use tf.compat.v1.logging.ERROR instead.



In [2]:
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.0.55


### Azure ML workspace との接続
Azure Machine Learning service との接続を行います。Azure に対する認証が必要です。

In [3]:
ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')

Workspace name: azureml
Azure region: japaneast
Subscription id: 9c0f91b8-eb2f-484c-979c-15848c098a6b
Resource group: azureml-ja


### 実験名の設定

In [4]:
experiment = Experiment(workspace = ws, name = "automl-classification-pm")

## 2. 学習データ準備

In [5]:
import pandas as pd 

train_df = pd.read_csv('./data/train.csv')
test_df = pd.read_csv('./data/test.csv')

In [6]:
train_df.head(10)

Unnamed: 0,id,cycle,setting1,setting2,setting3,s1,s2,s3,s4,s5,...,s16,s17,s18,s19,s20,s21,RUL,label1,label2,cycle_norm
0,1,1,0.46,0.17,0.0,0.0,0.18,0.41,0.31,0.0,...,0.0,0.33,0.0,0.0,0.71,0.72,191,0,0,0.0
1,1,2,0.61,0.25,0.0,0.0,0.28,0.45,0.35,0.0,...,0.0,0.33,0.0,0.0,0.67,0.73,190,0,0,0.0
2,1,3,0.25,0.75,0.0,0.0,0.34,0.37,0.37,0.0,...,0.0,0.17,0.0,0.0,0.63,0.62,189,0,0,0.01
3,1,4,0.54,0.5,0.0,0.0,0.34,0.26,0.33,0.0,...,0.0,0.33,0.0,0.0,0.57,0.66,188,0,0,0.01
4,1,5,0.39,0.33,0.0,0.0,0.35,0.26,0.4,0.0,...,0.0,0.42,0.0,0.0,0.59,0.7,187,0,0,0.01
5,1,6,0.25,0.42,0.0,0.0,0.27,0.29,0.27,0.0,...,0.0,0.25,0.0,0.0,0.65,0.65,186,0,0,0.01
6,1,7,0.56,0.58,0.0,0.0,0.38,0.46,0.26,0.0,...,0.0,0.33,0.0,0.0,0.74,0.67,185,0,0,0.02
7,1,8,0.3,0.75,0.0,0.0,0.41,0.26,0.32,0.0,...,0.0,0.25,0.0,0.0,0.64,0.57,184,0,0,0.02
8,1,9,0.55,0.58,0.0,0.0,0.27,0.43,0.21,0.0,...,0.0,0.33,0.0,0.0,0.71,0.71,183,0,0,0.02
9,1,10,0.31,0.58,0.0,0.0,0.15,0.44,0.31,0.0,...,0.0,0.42,0.0,0.0,0.63,0.79,182,0,0,0.02


In [7]:
#　特徴量となる列名を抽出
sensor_cols = ['s' + str(i) for i in range(1,22)]
sequence_cols = ['setting1', 'setting2', 'setting3', 'cycle']
sequence_cols.extend(sensor_cols)
print(sequence_cols)

['setting1', 'setting2', 'setting3', 'cycle', 's1', 's2', 's3', 's4', 's5', 's6', 's7', 's8', 's9', 's10', 's11', 's12', 's13', 's14', 's15', 's16', 's17', 's18', 's19', 's20', 's21']


In [8]:
X = train_df[sequence_cols]
y = train_df['label1'].values

In [9]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1,random_state=100,stratify=y)

# 3. 事前設定 (Automated Machine Learning)

In [10]:
automl_config = AutoMLConfig(task = 'classification',
                             iteration_timeout_minutes = 10,
                             iterations = 10,
                             primary_metric = 'AUC_weighted',
                             n_cross_validations = 3,
                             X = train_df[sequence_cols], 
                             y = train_df['label1'].values
                             )

## 4. 実行と結果確認

In [11]:
local_run = experiment.submit(automl_config, show_output=True)

Running on local machine
Parent Run ID: AutoML_07e79484-e977-4c6a-8d4c-2f151ce86c97
Current status: DatasetCrossValidationSplit. Generating CV splits.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   PIPELINE                                       DURATION      METRIC      BEST
         0   StandardScalerWrapper SGD                      0:00:10       0.9871    0.9871
         1   StandardScalerWrapper SGD                      0:00:11       0.9895    0.9895
         2   MinMaxScaler LightGBM                      

In [12]:
from azureml.widgets import RunDetails
RunDetails(local_run).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

In [13]:
best_run, fitted_model = local_run.get_output()
best_run

Experiment,Id,Type,Status,Details Page,Docs Page
automl-classification-pm,AutoML_07e79484-e977-4c6a-8d4c-2f151ce86c97_8,,Completed,Link to Azure Portal,Link to Documentation


In [14]:
fitted_model

Pipeline(memory=None,
     steps=[('prefittedsoftvotingclassifier', PreFittedSoftVotingClassifier(classification_labels=None,
               estimators=[('6', Pipeline(memory=None,
     steps=[('StandardScalerWrapper', <automl.client.core.common.model_wrappers.StandardScalerWrapper object at 0x133bbc208>), ('LightGBMClassifie...))],
               flatten_transform=None,
               weights=[0.4, 0.2, 0.1, 0.1, 0.1, 0.1]))])

# 5. Azure Machine Learning Interpretability SDK

[Azure Machine Learning Interpretability SDK](https://docs.microsoft.com/en-US/azure/machine-learning/service/machine-learning-interpretability-explainability?view=azuremgmtcompute-fluent-1.0.0) は、Microsoftと主要な3rd Partyのライブラリ(LIME,SHAP etc)で構成されたモデル解釈のフレームワークで、統合APIをご提供しています。  
<img src="https://docs.microsoft.com/en-US/azure/machine-learning/service/media/machine-learning-interpretability-explainability/interpretability-architecture.png#lightbox" width=800 align=left>

In [15]:
from azureml.explain.model.tabular_explainer import TabularExplainer
classes = ["false","true"]
tabular_explainer = TabularExplainer(fitted_model, X_train, features=X_train.columns, classes=classes)

In [16]:
global_explanation = tabular_explainer.explain_global(X_test[:100])

100%|██████████| 100/100 [00:28<00:00,  3.45it/s]


In [17]:
from azureml.contrib.explain.model.visualize import ExplanationDashboard
ExplanationDashboard(global_explanation, fitted_model, X_test[:100])

ExplanationWidget(value={'predictedY': [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1…

<azureml.contrib.explain.model.visualize.ExplanationDashboard.ExplanationDashboard at 0x133bb9d68>