# Experiment Analytics with SageMaker Experiments

<img align="left" width="130" src="https://raw.githubusercontent.com/PacktPublishing/Amazon-SageMaker-Cookbook/master/Extra/cover-small-padded.png"/>

This notebook contains the code for one of the recipes in the book [Machine Learning with Amazon SageMaker Cookbook: 80 proven recipes for data scientists and developers to perform ML experiments and deployments](https://www.amazon.com/Machine-Learning-Amazon-SageMaker-Cookbook/dp/1800567030).

### How to do it...

In [None]:
%store -r experiment_name
experiment_name

In [None]:
from sagemaker.analytics import ExperimentAnalytics

In [None]:
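import boto3
import sagemaker

# Wrap a boto3 session (default credentials and region) for the SageMaker SDK
session = boto3.Session()
sagemaker_session = sagemaker.Session(boto_session=session)

# Collect the trial components of the experiment into a pandas DataFrame
experiment_analytics = ExperimentAnalytics(
    sagemaker_session=sagemaker_session,
    experiment_name=experiment_name,
)

experiment_details_df = experiment_analytics.dataframe()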

In [None]:
import pandas as pd
from IPython.display import display

pd.options.display.max_columns = None
display(experiment_details_df)

In [None]:
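from time import sleep

metric = "validation:error - Avg"

# Poll until the metric column shows up in the analytics dataframe.
# dataframe() caches its result, so force_refresh=True is needed to fetch new data.
while metric not in experiment_details_df.columns:
    print("Not yet ready. Sleeping for 10 seconds")
    sleep(10)
    experiment_details_df = experiment_analytics.dataframe(force_refresh=True)

print("Ready")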
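
The polling loop above has no upper bound on how long it waits. A minimal sketch of a bounded variant follows; `wait_for_column` and its `fetch_columns` argument are hypothetical names introduced here for illustration, and the fake fetcher stands in for a real call to the analytics object.

```python
import time


def wait_for_column(fetch_columns, column, timeout=600, interval=10, sleep=time.sleep):
    """Poll fetch_columns() until `column` is present or `timeout` seconds have been slept.

    fetch_columns is a hypothetical zero-argument callable that returns the
    current column names. Returns True if the column appeared, False otherwise.
    """
    waited = 0
    while True:
        if column in fetch_columns():
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


# Stand-in for the real fetch: the metric column appears on the third call.
responses = iter([["eta"], ["eta"], ["eta", "validation:error - Avg"]])
found = wait_for_column(
    lambda: next(responses),
    "validation:error - Avg",
    sleep=lambda seconds: None,  # skip real sleeping in this demo
)
print(found)
```

In this notebook, the fetcher could plausibly be `lambda: experiment_analytics.dataframe(force_refresh=True).columns`, assuming the analytics object from the earlier cell is still in scope.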

In [None]:
target_fields = [
    "TrialComponentName",
    "DisplayName",
    "eta",
    "gamma",
    "max_depth",
    "min_child_weight",
    "num_round",
    "objective",
    "subsample",
    "validation:error - Avg",
    "train:error - Avg",
    "Trials",
    "Experiments",
]

experiment_summary_df = experiment_details_df[target_fields]
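
Selecting columns with a list of labels raises a `KeyError` in recent pandas versions if any label is absent, which can happen when a hyperparameter or metric was not recorded for the experiment. The sketch below shows one way to check for missing columns first; the toy dataframe and its values are made up for illustration and are not output from a real experiment.

```python
import pandas as pd

# Toy stand-in for experiment_details_df; values are invented for this example.
details_df = pd.DataFrame({
    "DisplayName": ["trial-a", "trial-b"],
    "eta": [0.2, 0.5],
    "train:error - Avg": [0.12, 0.08],
})

wanted = ["DisplayName", "eta", "gamma", "train:error - Avg"]

# Split the requested labels into those that exist and those that do not,
# preserving the requested order.
present = [name for name in wanted if name in details_df.columns]
missing = [name for name in wanted if name not in details_df.columns]

summary_df = details_df[present]
print("missing columns:", missing)
```

Reporting the missing labels rather than silently dropping them makes it easier to spot a misspelled column name.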

In [None]:
display(experiment_summary_df)

In [None]:
def remove_nan_rows(df, column="train:error - Avg"):
    # Keep only the rows that have a value for the given metric column
    return df.dropna(subset=[column])

experiment_summary_df = remove_nan_rows(experiment_summary_df)

experiment_summary_df
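
To make the effect of this filtering step concrete, here is a small sketch on a toy dataframe. The trial names and error values are invented for illustration. It shows the row filter written two equivalent ways: with `dropna` and with a boolean mask.

```python
import pandas as pd

# Toy data: the second row has no recorded training error.
toy_df = pd.DataFrame({
    "DisplayName": ["trial-a", "trial-b", "trial-c"],
    "train:error - Avg": [0.12, float("nan"), 0.08],
})

# Drop rows whose metric value is missing.
filtered = toy_df.dropna(subset=["train:error - Avg"])

# Equivalent boolean-mask form.
masked = toy_df[toy_df["train:error - Avg"].notna()]

print(filtered["DisplayName"].tolist())
```

Both forms keep the original row index, so the surviving rows can still be matched back to the unfiltered dataframe.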

In [None]:
sorted_df = experiment_summary_df.sort_values('train:error - Avg', ascending=True)
sorted_df

In [None]:
final_df = sorted_df[["DisplayName", "train:error - Avg"]]
final_df

In [None]:
final_df.plot(kind='barh', x="DisplayName", fontsize=8)