# Autopilot Experiment with SageMaker Studio UI

This notebook works well with the `Python 3 (Data Science)` kernel on SageMaker Studio.

---

## Contents

1. [Introduction](#Introduction)
1. [Setup](#Setup)
1. [AutoPilot Experiment using SageMaker Studio UI](#AutoPilot-Experiment-using-SageMaker-Studio-UI)
 * [Open Amazon SageMaker Studio](#Open-Amazon-SageMaker-Studio)
 * [Create Autopilot Experiment](#Create-Autopilot-Experiment-Job)
 * [Enter information for the AutoPilot Job](#Enter-information-for-the-AutoPilot-Job)
 * [View Autopilot Experiment Job](#View-Autopilot-Experiment-Job)
 * [Test Deployed Model](#Test-Deployed-Model)
1. [Cleanup](#Cleanup)

## Setup

Retrieve the shared variables created by the [01_sagemaker_autopilot_setup.ipynb](./01_sagemaker_autopilot_setup.ipynb) notebook and list the S3 URIs needed to set up the Autopilot experiment.

In [6]:
#cell - 01
%store -r train_data_s3_path
%store -r test_file_with_label
%store -r lab_ap_prefix
%store -r bucket

try:
    train_data_s3_path
except NameError:
    raise ValueError("Training dataset S3 URI is missing, please execute the data preparation notebook!")
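The cell above only checks `train_data_s3_path`, although four variables are restored. A minimal sketch of checking all of them at once is shown below. The helper name `find_missing` is hypothetical, and the sample namespace stands in for the notebook's `globals()`.

```python
def find_missing(names, namespace):
    """Return the entries of `names` that are not defined in `namespace`."""
    return [name for name in names if name not in namespace]


# Names restored with %store -r in the cell above.
required = ["train_data_s3_path", "test_file_with_label", "lab_ap_prefix", "bucket"]

# Illustrative namespace only; in the notebook you would pass globals().
sample_namespace = {"train_data_s3_path": "s3://example-bucket/train.csv", "bucket": "example-bucket"}

missing = find_missing(required, sample_namespace)
if missing:
    print(f"Missing stored variables: {missing}. Run the data preparation notebook first.")
```

In the notebook itself, `find_missing(required, globals())` would report every variable that `%store -r` failed to restore rather than only the first one.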

> **__NOTE__** Note down the following variables:
* `train_data_s3_path` for the training data input.
* `using_studio_ui_output_path` for the Autopilot experiment output location.

In [4]:
#cell - 02
train_data_s3_path

's3://sagemaker-ap-southeast-2-452533547478/mlu-workshop/direct-marketing/autopilot/train/train_data.csv'

In [5]:
#cell - 03
using_studio_ui_output_path = f"s3://{bucket}/{lab_ap_prefix}/using-studio-ui-output"
using_studio_ui_output_path

's3://sagemaker-ap-southeast-2-452533547478/mlu-workshop/direct-marketing/autopilot/using-studio-ui-output'
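If you need the bucket name and object key separately, for example to look the location up in the S3 console, the URI can be split with the standard library. This is a small sketch; `split_s3_uri` is a hypothetical helper and the URI below is a made-up example, not one of the paths above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # The path starts with '/', which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, key = split_s3_uri("s3://example-bucket/some/prefix/train_data.csv")
print(bucket_name, key)
```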

## Test Deployed Model

> Please ensure that you've deployed the Autopilot trained model while creating the experiment.
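One way to confirm the deployment before sending requests is to look at the endpoint status. The sketch below assumes `sm_client` is a `boto3.client('sagemaker')` instance; the helper name `endpoint_in_service` is hypothetical.

```python
def endpoint_in_service(sm_client, name):
    """Return True if the named endpoint reports the 'InService' status.

    describe_endpoint raises a ClientError if the endpoint does not exist,
    so a missing endpoint surfaces as an exception rather than False.
    """
    status = sm_client.describe_endpoint(EndpointName=name)["EndpointStatus"]
    return status == "InService"


# Usage in the notebook (requires AWS credentials), shown as comments:
# import boto3
# sm_client = boto3.client('sagemaker')
# endpoint_in_service(sm_client, 'dm-autopilot-experiment')
```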

View the split test dataset with label.

In [8]:
#cell - 04
import pandas as pd

column_label = 'y'

test_data = pd.read_csv(test_file_with_label)
columns = test_data.columns.tolist()
columns.remove(column_label)
columns.insert(0, column_label)

# List the label as the first column so the prediction result is easier to verify.
test_data[columns]

Unnamed: 0,y,age,job,marital,education,default,housing,loan,contact,month,...,duration,campaign,pdays,previous,poutcome,emp.var.rate,cons.price.idx,cons.conf.idx,euribor3m,nr.employed
0,no,37,services,married,high.school,no,yes,no,telephone,may,...,226,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0
1,no,40,admin.,married,basic.6y,no,no,no,telephone,may,...,151,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0
2,no,56,services,married,high.school,no,no,yes,telephone,may,...,307,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0
3,no,45,services,married,basic.9y,unknown,no,no,telephone,may,...,198,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0
4,no,46,blue-collar,married,basic.6y,unknown,yes,yes,telephone,may,...,440,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8233,yes,31,admin.,single,university.degree,no,yes,no,cellular,nov,...,353,1,999,0,nonexistent,-1.1,94.767,-50.8,1.031,4963.6
8234,no,38,housemaid,divorced,high.school,no,yes,yes,cellular,nov,...,360,1,999,0,nonexistent,-1.1,94.767,-50.8,1.031,4963.6
8235,no,64,retired,divorced,professional.course,no,yes,no,cellular,nov,...,151,3,999,0,nonexistent,-1.1,94.767,-50.8,1.028,4963.6
8236,yes,37,admin.,married,university.degree,no,yes,no,cellular,nov,...,281,1,999,0,nonexistent,-1.1,94.767,-50.8,1.028,4963.6


In [9]:
#cell - 05
# Keep only the feature values (label column dropped), one row per test record.
X_test_numpy = test_data.drop(["y"], axis=1).values

Set the endpoint name. If you deployed the model under a name other than `dm-autopilot-experiment`, update the value below:

In [10]:
#cell - 06
endpoint_name = 'dm-autopilot-experiment'

In [None]:
#cell - 07
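import boto3

# Runtime client used to send inference requests to the deployed endpoint.
runtime = boto3.client('sagemaker-runtime')


def predict(payload):
    """Send one CSV row to the endpoint and return (prediction, probability)."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='text/csv',
        Body=payload,
    )
    print(response)
    # The response body is expected to be a single "label,probability" line.
    result = response['Body'].read().decode('utf-8').strip()
    pred, pred_probability = result.split(',')
    print(f"Prediction result: {pred} with probability: {pred_probability}")
    return pred, pred_probability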
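The `predict` function assumes the response body contains exactly two comma-separated fields. Depending on how the endpoint's inference output is configured, the body may contain only the label, in which case the two-value unpacking fails. A more tolerant parser could look like the sketch below; `parse_prediction` is a hypothetical helper and the sample strings are made up.

```python
def parse_prediction(result):
    """Parse a 'label' or 'label,probability' response line.

    Returns (label, probability), with probability set to None when the
    response carries only the label.
    """
    fields = [field.strip() for field in result.strip().split(",")]
    label = fields[0]
    probability = float(fields[1]) if len(fields) > 1 else None
    return label, probability


print(parse_prediction("no,0.93\n"))
print(parse_prediction("yes"))
```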

In [None]:
#cell - 08
# Update the index to test a different row of the test dataset.
index = 15

label = test_data.iloc[index]['y']
print(f"Row index {index} with Label: {label}")

payload = ','.join(X_test_numpy[index].astype(str).tolist())
predict(payload)
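Joining values with `','` works as long as no feature value itself contains a comma or a quote. As a hedged alternative, the standard `csv` module can build the row and quote fields where needed. The helper name `to_csv_row` and the sample values are illustrative assumptions.

```python
import csv
import io


def to_csv_row(values):
    """Serialize one record as a single CSV line, quoting fields when needed."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(values)
    # The writer appends a line terminator, which the payload does not need.
    return buffer.getvalue().rstrip("\r\n")


print(to_csv_row([37, "services", "married", "high.school"]))
print(to_csv_row([40, "admin.", "a,b"]))
```

In the cell above, `payload = to_csv_row(X_test_numpy[index].tolist())` would be the equivalent call.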

## Cleanup

It is generally good practice to delete endpoints that are no longer in use, since a running endpoint keeps incurring charges.

Uncomment the following lines and run the cell to delete the endpoint created earlier.

In [6]:
# sm_client = boto3.client('sagemaker')
# sm_client.delete_endpoint(EndpointName=endpoint_name)
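Deleting the endpoint leaves its endpoint configuration and model in the account. If you also want to remove those, a sketch along the following lines could be used. It assumes `sm_client` is a `boto3.client('sagemaker')` instance; the helper name `delete_endpoint_resources` is hypothetical.

```python
def delete_endpoint_resources(sm_client, name):
    """Delete an endpoint together with its endpoint config and models.

    Returns the config name and model names that were deleted.
    """
    # Look up the related resources before deleting anything.
    endpoint = sm_client.describe_endpoint(EndpointName=name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return config_name, model_names


# Usage in the notebook (requires AWS credentials), shown as comments:
# sm_client = boto3.client('sagemaker')
# delete_endpoint_resources(sm_client, endpoint_name)
```

Only run this if the model is not shared with another endpoint, since the model itself is deleted as well.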