Install the Azure Machine Learning SDK with AutoML.

For more details, see https://docs.microsoft.com/en-us/azure/machine-learning/how-to-configure-databricks-automl-environment#add-the-azure-ml-sdk-with-automl-to-databricks

In [0]:
%pip install --upgrade --force-reinstall -r https://aka.ms/automl_linux_requirements.txt

Install the Azure ML Interpretability library.

In [0]:
%pip install azureml-interpret

Install support for the explainability dashboard.

In [0]:
%pip install interpret-community[visualization]

Provide information to connect to the Azure Machine Learning workspace.

In [0]:
from azureml.core import Workspace, Experiment, Datastore, Dataset
from azureml.train.automl.run import AutoMLRun
import azureml.train.automl
import azureml.train.automl.runtime
import numpy as np
import pandas as pd
import matplotlib.pylab as plt 

#Provide the Subscription ID of your existing Azure subscription
subscription_id = "" # <- subscription you are using for this hands-on lab

#Replace the name below with the name of your resource group 
resource_group = "MCW-Machine-Learning"

#Replace the name below with the name of your Azure Machine Learning workspace
workspace_name = "mcwmachinelearning"

experiment_name = 'Battery-Cycles-Forecast'
dataset_name = 'daily-battery-time-series'
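
Because `subscription_id` is left empty by default, it can help to check the connection settings before using them. The following is a minimal sketch, not part of the lab itself: `find_missing_settings` and `example_settings` are hypothetical names introduced here for illustration.

```python
def find_missing_settings(settings):
    """Return the names of settings that are empty or still hold a <placeholder>."""
    missing = []
    for name, value in settings.items():
        text = (value or "").strip()
        if not text or (text.startswith("<") and text.endswith(">")):
            missing.append(name)
    return missing

# Example values only; in the notebook you would pass the variables defined above.
example_settings = {
    "subscription_id": "",
    "resource_group": "MCW-Machine-Learning",
    "workspace_name": "mcwmachinelearning",
}
print(find_missing_settings(example_settings))
```

If the returned list is not empty, fill in the named values before running the cells that follow.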

## Connect to the Azure Machine Learning workspace

**Important note for workspace authentication**

If you have access to multiple tenants, you may need to import the `InteractiveLoginAuthentication` class and explicitly define which tenant you are targeting. To force interactive authentication, uncomment the first two lines of the cell below and pass the resulting object to `Workspace` through its `auth` parameter.

For more details, see https://docs.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication.

In [0]:
#from azureml.core.authentication import InteractiveLoginAuthentication
#interactive_auth = InteractiveLoginAuthentication(tenant_id="<azure_ad_tenant_id>")

# Connect to the Azure ML Workspace
# (if you enabled interactive authentication above, also pass auth=interactive_auth)
ws = Workspace(subscription_id, resource_group, workspace_name)

# Get default datastore to upload prepared data
datastore = ws.get_default_datastore()

Connect to the AutoML experiment that you created manually earlier, and list the runs associated with it.

In [0]:
# If you previously used a different name for the Azure Machine Learning experiment, replace the name below with that name.
experiment = Experiment(ws, experiment_name)

list(experiment.get_runs())

Identify the latest run with a status of `Completed`, and use its `Id` to get the best model.

In [0]:
# Replace `<automl_run_id>` below with the `Id` of the latest completed run
automl_run = AutoMLRun(experiment, run_id='<automl_run_id>')

best_run, best_model = automl_run.get_output()
best_model
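
Instead of copying the `Id` by hand, the latest completed run could also be picked programmatically. The sketch below assumes that `experiment.get_runs()` yields runs newest first (which is how the SDK documentation describes it, but worth confirming for your SDK version) and that each run exposes `get_status()` and `id`. `latest_completed_run` and `StubRun` are hypothetical names; the stub only stands in for real run objects so the helper can be tried without a workspace.

```python
def latest_completed_run(runs):
    """Return the first run whose status is 'Completed', or None if there is none."""
    for run in runs:
        if run.get_status() == "Completed":
            return run
    return None


class StubRun:
    """Hypothetical stand-in for a run object, used only to exercise the helper."""

    def __init__(self, run_id, status):
        self.id = run_id
        self._status = status

    def get_status(self):
        return self._status


# Made-up run ids, listed newest first.
stub_runs = [
    StubRun("AutoML_c", "Failed"),
    StubRun("AutoML_b", "Completed"),
    StubRun("AutoML_a", "Completed"),
]
chosen = latest_completed_run(stub_runs)
print(chosen.id)
```

In the notebook, the equivalent call would be `latest_completed_run(experiment.get_runs())`, with the `id` of the result passed to `AutoMLRun`.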

Load the dataset used to train the model with AutoML.

In [0]:
car_battery_ds = Dataset.get_by_name(ws, dataset_name)
car_battery_df = car_battery_ds.to_pandas_dataframe()
car_battery_df.tail(10)

## Generate feature importance values

Use `automl_setup_model_explanations` to configure the process of getting feature importance values.

In [0]:
from azureml.train.automl.runtime.automl_explain_utilities import automl_setup_model_explanations

automl_explainer_setup_obj = automl_setup_model_explanations(best_model, X=car_battery_df, 
                                                             y=car_battery_df['Daily_Cycles_Used'],
                                                             task='forecasting')

To generate an explanation for AutoML models, use the `MimicWrapper` class. You can initialize `MimicWrapper` with these parameters:

- The explainer setup object
- Your workspace
- A surrogate model to explain the `best_model` automated ML model

In [0]:
from azureml.interpret import MimicWrapper

# Initialize the Mimic Explainer
explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator,
                         explainable_model=automl_explainer_setup_obj.surrogate_model, 
                         init_dataset=automl_explainer_setup_obj.X_transform, run=automl_run,
                         features=automl_explainer_setup_obj.engineered_feature_names, 
                         feature_maps=[automl_explainer_setup_obj.feature_map],
                         classes=automl_explainer_setup_obj.classes,
                         explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params)
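
The idea behind a mimic explainer is to train a simple, interpretable surrogate model on the predictions of an opaque model and then read feature importance from the surrogate. The following sketch illustrates that idea without the SDK. It is not how `MimicWrapper` is implemented: `black_box` is a hypothetical stand-in for the AutoML model, and the surrogate here is a two-feature linear fit, chosen only to keep the example short.

```python
def black_box(x1, x2):
    # Hypothetical opaque model; stands in for the AutoML pipeline.
    return 3.0 * x1 + 0.5 * x2 + 0.1 * x1 * x2


def fit_linear_surrogate(rows, targets):
    """Least-squares fit of y ~ w1*x1 + w2*x2 + b for two features."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    my = sum(targets) / n
    # Centered sums of squares and cross-products (the normal equations).
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (t - my) for r, t in zip(rows, targets))
    s2y = sum((r[1] - m2) * (t - my) for r, t in zip(rows, targets))
    det = s11 * s22 - s12 ** 2
    w1 = (s1y * s22 - s2y * s12) / det
    w2 = (s2y * s11 - s1y * s12) / det
    b = my - w1 * m1 - w2 * m2
    return w1, w2, b


# Query the black box on a small grid, then fit the surrogate to its outputs.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
rows = [(x1, x2) for x1 in grid for x2 in grid]
targets = [black_box(x1, x2) for x1, x2 in rows]
w1, w2, b = fit_linear_surrogate(rows, targets)
print(f"w1={w1:.3f}, w2={w2:.3f}, b={b:.3f}")
```

Because both features share the same scale here, the absolute surrogate weights can be read as a rough global importance ranking: the first feature matters far more than the second.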

Call the `explain()` method of `MimicWrapper` with the transformed test samples to get feature importance values for the engineered features. You can also use `ExplanationDashboard` to visualize the importance of the engineered features that the automated ML featurizers generated.

Notice the relative importance values for the most relevant features.

In [0]:
engineered_explanations = explainer.explain(['local', 'global'], eval_dataset=automl_explainer_setup_obj.X_test_transform)
print(engineered_explanations.get_feature_importance_dict())
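
The printed dictionary can be long, so ranking it makes the most relevant features easier to spot. This is a minimal sketch that assumes the dictionary maps feature names to numeric importance values. `top_features` is a hypothetical helper, and `sample_importances` holds made-up values, not output from the lab's model.

```python
def top_features(importance_by_feature, n=5):
    """Return the n (feature, importance) pairs with the largest absolute importance."""
    ranked = sorted(
        importance_by_feature.items(),
        key=lambda item: abs(item[1]),
        reverse=True,
    )
    return ranked[:n]


# Made-up values for illustration; not output from the lab's model.
sample_importances = {"feature_a": 0.02, "feature_b": 1.35, "feature_c": 0.40}
for name, value in top_features(sample_importances, n=2):
    print(f"{name}: {value:.3f}")
```

In the notebook, you could pass `engineered_explanations.get_feature_importance_dict()` in place of the sample dictionary.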

Open the Explanation Dashboard to view feature importance.

Once the dashboard is opened, select the **Aggregate Feature Importance** tab to see the top features ordered by importance.

In [0]:
from interpret_community.widget import ExplanationDashboard

ExplanationDashboard(engineered_explanations, best_model, datasetX=automl_explainer_setup_obj.X_test_transform)