## Retrieve metrics, logs and other outputs from experiment runs

As seen before, an experiment can be run multiple times, and each run can log different results. To monitor these runs, we need to be able to retrieve the information that was logged during them. That is the topic of this notebook.

In [1]:
from azureml.core import Workspace

ws = Workspace.from_config()

print(f"Working in Azure ML Workspace {ws.name}")

Working in Azure ML Workspace azml-sdk


### List all experiments in the Workspace

In [2]:
print(ws.experiments)

{'diabetes-local': Experiment(Name: diabetes-local,
Workspace: azml-sdk), 'diabetes-local-mlflow': Experiment(Name: diabetes-local-mlflow,
Workspace: azml-sdk)}


### List the run history of a given experiment

In [10]:
diabetes_local_exp = ws.experiments['diabetes-local']
runs = diabetes_local_exp.get_runs()  # a generator

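# Print the first 6 runs (the loop stops once i reaches 5)
for i, run in enumerate(runs):
    print(run)
    if i == 5:
        break

# Show more details for a particular run: next() resumes the generator,
# so this is the run that follows the six printed above
run_instance = next(runs)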
run_instance

Run(Experiment: diabetes-local,
Id: diabetes-local_1623274658_cc67aebc,
Type: azureml.scriptrun,
Status: Completed)
Run(Experiment: diabetes-local,
Id: 0e84a2f5-5c34-417e-9d8b-bb203b058fbd,
Type: None,
Status: Completed)
Run(Experiment: diabetes-local,
Id: 5d8e651e-8956-47e8-a076-856eed36d74e,
Type: None,
Status: Canceled)
Run(Experiment: diabetes-local,
Id: d8311fb3-c5f9-4eca-a157-8c7af795566d,
Type: None,
Status: Canceled)
Run(Experiment: diabetes-local,
Id: 046b0eab-6ea3-4af0-8c04-5a22c78fcce2,
Type: None,
Status: Canceled)
Run(Experiment: diabetes-local,
Id: diabetes-local_1623215754_45eb5142,
Type: azureml.scriptrun,
Status: Completed)


| Experiment | Id | Type | Status | Details Page | Docs Page |
|---|---|---|---|---|---|
| diabetes-local | 937f8983-df50-41fc-a147-619bd7379d19 | | Completed | Link to Azure Machine Learning studio | Link to Documentation |
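
Because `get_runs()` returns a generator, every run that has been iterated over is consumed and the next call to `next()` continues from there. One way to take a fixed number of runs without tracking a loop index is `itertools.islice`. The sketch below uses a hypothetical stand-in generator, `fake_runs`, instead of a real experiment so that it runs without a workspace; the same pattern should apply to the generator returned by `get_runs()`.

```python
from itertools import islice


def fake_runs():
    """Hypothetical stand-in for diabetes_local_exp.get_runs()."""
    for i in range(10):
        yield f"run-{i}"


runs = fake_runs()

# Take exactly the first 5 items; islice stops without pulling a sixth
first_five = list(islice(runs, 5))

# The generator resumes where islice stopped
next_run = next(runs)

print(first_five)
print(next_run)
```

If the same runs need to be visited more than once, materialising them with `list(...)` first is usually simpler than calling `get_runs()` again.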


### List all metrics from a specific run

In [14]:
run_metrics = run_instance.get_metrics()
for key, val in run_metrics.items():
    print(f"{key}: {val}\n")

observations: 10000

data columns: ['PatientID', 'Pregnancies', 'PlasmaGlucose', 'DiastolicBloodPressure', 'TricepsThickness', 'SerumInsulin', 'BMI', 'DiabetesPedigree', 'Age', 'Diabetic']

categorical columns: Pregnancies

numerical columns: ['PlasmaGlucose', 'DiastolicBloodPressure', 'TricepsThickness', 'SerumInsulin', 'BMI']

PlasmaGlucose: {'stat': ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'], 'value': [10000.0, 107.8502, 31.92090936056554, 44.0, 84.0, 105.0, 129.0, 192.0]}

DiastolicBloodPressure: {'stat': ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'], 'value': [10000.0, 71.2075, 16.80147828964082, 24.0, 58.0, 72.0, 85.0, 117.0]}

TricepsThickness: {'stat': ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'], 'value': [10000.0, 28.8176, 14.506480415228355, 7.0, 15.0, 31.0, 41.0, 92.0]}

SerumInsulin: {'stat': ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'], 'value': [10000.0, 139.2436, 133.77791937465324, 14.0, 39.0, 85.0, 197.0, 79
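
The output above mixes several kinds of values: single values, lists, and tables stored as a dictionary of columns. A minimal sketch of how the dictionary could be split by kind is shown below. The helper `split_metrics` is hypothetical, and the sample dictionary is a hand-copied subset of the output above rather than a live call to `get_metrics()`.

```python
# Hand-copied subset of the metrics printed above (not a live API call)
run_metrics = {
    "observations": 10000,
    "categorical columns": "Pregnancies",
    "numerical columns": ["PlasmaGlucose", "DiastolicBloodPressure"],
    "PlasmaGlucose": {
        "stat": ["count", "mean", "std", "min", "25%", "50%", "75%", "max"],
        "value": [10000.0, 107.8502, 31.92090936056554, 44.0, 84.0, 105.0, 129.0, 192.0],
    },
}


def split_metrics(metrics):
    """Separate table-like metrics (dict of columns) from all other values."""
    tables, others = {}, {}
    for name, value in metrics.items():
        if isinstance(value, dict):
            tables[name] = value
        else:
            others[name] = value
    return tables, others


tables, others = split_metrics(run_metrics)
print("Table metrics:", list(tables))
print("Other metrics:", list(others))
```

Separating the table-like entries first makes it easier to decide which metrics can be turned into a DataFrame.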

### Reconstruct pandas DataFrame from the logged rows

In [18]:
import pandas as pd

pd.DataFrame(data=run_metrics["PlasmaGlucose"])

| | stat | value |
|---|---|---|
| 0 | count | 10000.0 |
| 1 | mean | 107.8502 |
| 2 | std | 31.920909 |
| 3 | min | 44.0 |
| 4 | 25% | 84.0 |
| 5 | 50% | 105.0 |
| 6 | 75% | 129.0 |
| 7 | max | 192.0 |
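
The same reconstruction can be extended to several columns at once, which gives a single summary table similar to what `DataFrame.describe()` would produce. The sketch below is one possible approach: the helper `metrics_to_summary` is hypothetical, and the sample dictionary is copied from the metrics printed earlier rather than fetched from a run.

```python
import pandas as pd

STATS = ["count", "mean", "std", "min", "25%", "50%", "75%", "max"]

# Two table metrics copied from the output of get_metrics() shown earlier
run_metrics = {
    "PlasmaGlucose": {
        "stat": STATS,
        "value": [10000.0, 107.8502, 31.92090936056554, 44.0, 84.0, 105.0, 129.0, 192.0],
    },
    "DiastolicBloodPressure": {
        "stat": STATS,
        "value": [10000.0, 71.2075, 16.80147828964082, 24.0, 58.0, 72.0, 85.0, 117.0],
    },
}


def metrics_to_summary(metrics, columns):
    """Build one DataFrame with a row per statistic and a column per metric."""
    series = {
        name: pd.DataFrame(metrics[name]).set_index("stat")["value"]
        for name in columns
    }
    return pd.DataFrame(series)


summary = metrics_to_summary(run_metrics, ["PlasmaGlucose", "DiastolicBloodPressure"])
print(summary)
```

Using `stat` as the index aligns the statistics across columns, so a value can be looked up with `summary.loc["mean", "PlasmaGlucose"]`.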
