## Testing the Automated ML model

In this notebook you will load the best machine learning model trained with Automated ML and use it to assign clusters to a series of new COVID-19 articles.

### Loading the latest model trained with Automated ML

We'll start by importing the necessary modules and checking the Azure ML SDK version.

In [9]:
import matplotlib.pyplot as plt
#from sklearn.metrics import plot_confusion_matrix

from azureml.core import Workspace, Experiment, Dataset, VERSION
from azureml.train.automl.run import AutoMLRun
from azureml.interpret import ExplanationClient

from azureml.widgets import RunDetails
from raiwidgets import ExplanationDashboard

print("Azure ML SDK Version: ", VERSION)



Azure ML SDK Version:  1.48.0


We first need to load our workspace and use it to retrieve our Automated ML experiment.

In [10]:
# Load the workspace from a configuration file
ws = Workspace.from_config()

# Get a reference to our automated ml experiment
exp = Experiment(ws, 'COVID19_Classification')

We now need to retrieve our latest Automated ML run and its corresponding best model.

In [11]:
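# Retrieve a list of all the experiment's runs
# (get_runs() yields them in reverse chronological order, newest first)
runs = list(exp.get_runs())

# Pick the latest run, which is the first element of the list
raw_run = runs[0]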

# Convert it to an AutoMLRun object in order to retrieve its best model
automl_run = AutoMLRun(exp, raw_run.id)

# Retrieve the best run and its corresponding model
best_run, best_model = automl_run.get_output()

### Analyzing the metrics calculated while training the model

After having retrieved the best performing run, let's examine some of its metrics using the SDK's `RunDetails` widget. Analyze the various metrics of your model, including *Precision-Recall*, *ROC*, *Lift Curve*, *Gain Curve*, and *Calibration Curve*.

Analyze the *Confusion Matrix* and see which clusters are correctly identified by the model, and which have a higher likelihood of being misclassified.

Experiment with the *Feature Importance* and analyze the relative importance of the top K features.

In [12]:
RunDetails(best_run).show()

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…
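
To make the "top K features" idea concrete, here is a minimal sketch of ranking features by an importance score and keeping the K largest. The feature names and scores are made up for illustration; they are not taken from this model, and `top_k_features` is a hypothetical helper rather than part of the Azure ML SDK.

```python
# Hypothetical importance scores keyed by feature name (illustrative values only)
feature_importance = {"12": 0.31, "47": 0.07, "3": 0.22, "101": 0.15, "88": 0.02}


def top_k_features(importance, k):
    """Return the k (feature, score) pairs with the largest scores, highest first."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Keep only the three most important features
top_3 = top_k_features(feature_importance, 3)
for name, score in top_3:
    print(f"feature {name}: {score:.2f}")
```

Raising or lowering K in the widget does the same kind of truncation on the model's own importance values.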

INFO:interpret_community.common.explanation_utils:Using default datastore for uploads

### Running the model on a new dataset

First we'll load the dataset we had previously prepared for testing, and convert it to a Pandas data frame.

In [13]:
# Retrieve the dataset from the workspace
test_ds = Dataset.get_by_name(ws, 'COVID19Articles_Test_Vectors')

# Convert it to a standard pandas data frame
test_df = test_ds.to_pandas_dataframe()

# Examine a sample of 5 documents
test_df.sample(5)



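|  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 119 | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | cluster |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 3.64 | -1.18 | 0.02 | 0.61 | -3.09 | 0.89 | -3.06 | 1.61 | -0.17 | 0.22 | ... | 0.25 | -2.5 | -1.6 | -0.67 | 1.41 | 0.05 | -0.91 | 1.54 | 1.92 | 7 |
| 9 | 0.38 | -1.6 | -1.07 | 0.64 | -0.59 | 2.49 | 0.33 | 2.3 | -1.19 | -0.47 | ... | -1.27 | 0.93 | 1.57 | -0.81 | 3.47 | 1.89 | -1.49 | 0.34 | 0.57 | 3 |
| 69 | 0.41 | 0.43 | -1.35 | -0.22 | -1.17 | 0.76 | -0.17 | 1.4 | 0.21 | 0.69 | ... | 0.59 | 0.32 | 0.56 | -0.25 | 1.11 | -0.39 | -0.57 | 0.03 | 0.63 | 0 |
| 13 | -0.23 | -0.77 | -2.93 | 0.51 | -1.09 | 0.94 | -0.38 | -0.21 | -1.41 | 1.21 | ... | 2.17 | -0.76 | 2.03 | -0.51 | 1.41 | 0.75 | -0.39 | -0.71 | 1.85 | 5 |
| 83 | 1.17 | -0.85 | -0.79 | 0.25 | -1.63 | 1.47 | 0.64 | 1.91 | 0.36 | 1.85 | ... | 1.4 | -0.58 | 1.1 | -0.17 | 1.16 | -1.87 | -0.29 | -1.35 | 0.23 | 3 |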




Now we can use the *best_model* to assign clusters to the test documents.

In [14]:
# Save the true values of the clusters
true_clusters = test_df['cluster']

# Keep all features except the label column
features_df = test_df.drop(columns=['cluster'])

# Predict the clusters for each document and display them
best_model.predict(features_df)

TransformException: TransformException:
	Message: Failed while applying learned transformations.
	InnerException: AttributeError: ['0']: 'SimpleImputer' object has no attribute '_fit_dtype'
	ErrorResponse 
{
    "error": {
        "code": "SystemError",
        "message": "Encountered an internal AutoML error. Error Message/Code: Failed while applying learned transformations.. Additional Info: TransformException:\n\tMessage: Failed while applying learned transformations.\n\tInnerException: None\n\tErrorResponse \n{\n    \"error\": {\n        \"message\": \"Failed while applying learned transformations.\",\n        \"target\": \"DataTransformer.transform\",\n        \"reference_code\": \"e5bbc77b-2264-418d-bdb3-39ea0f09eae8\"\n    }\n}",
        "details_uri": "https://aka.ms/automltroubleshoot",
        "target": "DataTransformer.transform",
        "inner_error": {
            "code": "ClientError",
            "inner_error": {
                "code": "AutoMLInternal"
            }
        },
        "reference_code": "e5bbc77b-2264-418d-bdb3-39ea0f09eae8"
    }
}
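
An `AttributeError` on a fitted scikit-learn object such as `SimpleImputer` usually points to a version mismatch: the model was serialized with one scikit-learn release and is being loaded with another. The sketch below compares the installed version against the version used for training. `TRAINING_SKLEARN_VERSION` is a placeholder, not a value taken from this run, and matching on major and minor numbers is only a rough heuristic, since scikit-learn does not guarantee that pickled models load across versions.

```python
import re
from importlib import metadata


def major_minor(version):
    """Return the (major, minor) part of a version string such as '1.2.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(installed, expected):
    """Heuristic: treat versions as compatible when major and minor match."""
    return major_minor(installed) == major_minor(expected)


# Placeholder (assumption): replace with the version listed in the
# training run's environment definition.
TRAINING_SKLEARN_VERSION = "0.22.1"

try:
    installed_version = metadata.version("scikit-learn")
except metadata.PackageNotFoundError:
    installed_version = None

if installed_version is None:
    print("scikit-learn is not installed in this environment")
elif versions_compatible(installed_version, TRAINING_SKLEARN_VERSION):
    print(f"scikit-learn {installed_version} matches the training version")
else:
    print(
        f"scikit-learn {installed_version} differs from the training version "
        f"{TRAINING_SKLEARN_VERSION}; consider installing the matching release"
    )
```

If the versions differ, installing the release used for training in the notebook's kernel is the usual way to get `predict` working again.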

We can compare the true clusters with the predicted ones using a confusion matrix. Notice the true positive values on the diagonal.

In [None]:
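# plot_confusion_matrix was removed in scikit-learn 1.2; ConfusionMatrixDisplay (1.0+) replaces it
from sklearn.metrics import ConfusionMatrixDisplay
ConfusionMatrixDisplay.from_predictions(true_clusters, best_model.predict(features_df))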
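
To see why correct predictions land on the diagonal, it helps to build a small confusion matrix by hand. The sketch below uses made-up labels for three clusters (not this notebook's data); `confusion_counts` is a hypothetical helper in which rows are true clusters and columns are predicted clusters.

```python
def confusion_counts(true_labels, predicted_labels, n_classes):
    """Count predictions per (true, predicted) pair; rows are true labels."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[truth][guess] += 1
    return matrix


# Illustrative labels only
true_example = [0, 0, 1, 1, 2, 2, 2]
pred_example = [0, 1, 1, 1, 2, 0, 2]

matrix = confusion_counts(true_example, pred_example, 3)

# Entries on the diagonal are documents whose predicted cluster equals the true one
correct = sum(matrix[i][i] for i in range(3))
accuracy = correct / len(true_example)

for row in matrix:
    print(row)
print(f"correct: {correct} of {len(true_example)}")
```

Every off-diagonal entry is a misclassification: the row tells you the true cluster and the column tells you which cluster the model picked instead.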

### Interpreting and explaining the model

By default, Automated ML also explains the machine learning models it trains. We will download and examine the explanations for our *best_model*.

In [None]:
# Use an ExplanationClient for accessing the best run's model explanations
client = ExplanationClient.from_run(best_run)

# Download the engineered explanations in their raw form
engineered_explanations = client.download_model_explanation(raw=True)

# Retrieve the dataset used for training the model - it will be needed when visualizing the explanations
training_df = Dataset.get_by_name(ws, 'COVID19Articles_Train').to_pandas_dataframe()

We will use an `ExplanationDashboard` to visualize the engineered explanations. For best results it needs to be presented with the same dataset used for training the model.

Analyze the *Aggregate Feature Importance* to identify the top predictive features. Select a feature and analyze how individual values of that feature impact prediction results. Switch to the *Individual Feature Importance & What-If* view and explore the feature importance plots for individual points.

In [None]:
from raiwidgets import ExplanationDashboard
ExplanationDashboard(engineered_explanations, best_model, dataset=training_df.drop(columns='cluster'), true_y=training_df['cluster'])
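
The relationship between individual and aggregate feature importance can be illustrated with a small sketch. One common way to aggregate is to take the mean absolute value of each feature's per-document importance; whether the dashboard uses exactly this formula is an assumption here, and the numbers below are made up for illustration.

```python
# Hypothetical per-document (local) importances: one row per document,
# one column per feature. Signs show the direction of the effect.
local_importances = [
    [0.40, -0.10, 0.05],
    [-0.20, 0.30, 0.00],
    [0.30, -0.20, -0.10],
]
feature_names = ["0", "1", "2"]


def aggregate_importance(rows):
    """Mean absolute importance per feature across all rows."""
    n_rows = len(rows)
    n_features = len(rows[0])
    return [sum(abs(row[j]) for row in rows) / n_rows for j in range(n_features)]


global_importance = aggregate_importance(local_importances)

# Rank features from most to least important overall
ranking = sorted(
    zip(feature_names, global_importance), key=lambda pair: pair[1], reverse=True
)
for name, score in ranking:
    print(f"feature {name}: {score:.3f}")
```

Taking absolute values before averaging keeps positive and negative effects from cancelling out, so a feature that pushes predictions strongly in both directions still ranks as important.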